deepseek-ai/DeepSeek-R1
No description.
Simple RL training for reasoning
No description.
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
⚡LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models.⚡
Minimal reproduction of DeepSeek R1-Zero