ServiceNow/WorkArena
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
BrowserGym, a Gym environment for web task automation
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
A verified version of the WebArena Benchmark
A simple SWE-style browser agent framework that achieves SOTA results on long-horizon web tasks.
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents