THUDM/slime
slime is an LLM post-training framework for RL Scaling.
MrlX: A Multi-Agent Reinforcement Learning Framework
Democratizing Reinforcement Learning for LLMs
OpenClaw-RL: Train any agent simply by talking
Reinforcement Learning in PyTorch
Deep Reinforcement Learning for Keras.
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)