Ayanami0730/deep_research_bench
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
A summary of research in RTB Budget Pacing
[EMNLP 2025] WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning
Code and Data for Tau-Bench
HiPRAG (Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation) is a reinforcement learning method for training LLMs that interleave reasoning and search, improving efficiency and reducing both oversearching and undersearching.
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Reinforcement Learning in PyTorch