Gen-Verse/OpenClaw-RL
OpenClaw-RL: Train any agent simply by talking
Democratizing Reinforcement Learning for LLMs
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
slime is an LLM post-training framework for RL Scaling.
Awesome resources for in-context learning and prompt engineering: mastering LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date, cutting-edge updates.
Simple RL training for reasoning
AI agent config detected
Key config paths
docs/AGENTS.md
rllm-model-gateway/AGENTS.md