GitHub · Active

Repository profile

JudgmentLabs/judgeval

The Continuous-Improvement Stack for Agents. Our environment data and evals power agent improvement and monitoring.

Python · Apache-2.0 · main · Stack scanned · README.md

Stars: 1,049
Forks: 96
Watchers: 7
Issues: 30
Commits: 1,770
Awesome lists: 1

Activity and growth

Tracked growth, recent movement, and commit velocity from stored repository snapshots.

Latest capture 2026-07-31 03:09

Star growth, last 7 days: 0 (0.0%)
Commit velocity, last 7 days: 0 (0.0%)
Stars since baseline: +12
Snapshot coverage: 4

Tracked growth

4 captures since 2026-06-12

Stars history: total stars, all tracked data
Commits history: default branch commits, all tracked data

Detected stack

Frameworks, package managers, ecosystems, and dependency manifests found during catalog scans.

Scanned 2026-07-31 03:09

Stack signals: 4
Package managers: 1
Manifest files: 11
Dependencies: 208

Frameworks and tools

FastAPI web framework · high confidence
pytest test framework · high confidence
Starlette web framework · medium confidence
Streamlit app framework · high confidence

Package manager: uv (Python)

Dependency files

11 manifests

pyproject.toml python ecosystem, 31 dependencies
uv.lock python ecosystem, 0 dependencies
examples/basic-distributed-tracing/pyproject.toml python ecosystem, 4 dependencies
examples/basic-evaluation/pyproject.toml python ecosystem, 1 dependency
examples/basic-linked-trace/pyproject.toml python ecosystem, 1 dependency
examples/basic-tracing/pyproject.toml python ecosystem, 2 dependencies
examples/claude-agent-sdk/pyproject.toml python ecosystem, 2 dependencies
examples/google-adk/pyproject.toml python ecosystem, 2 dependencies
3 more files

Classification

Searchable topics, generated tags, and stack labels that explain where this repository fits.

Topics: 15
Tags: 0
Stacks: 4

Topics

#agent #agentic-ai #agents #grpo #langchain #langgraph #llama-index #llm #llm-evaluation #llm-observability #open-source #openai #prompt-engineering #reinforcement-learning #rl

Generated tags

No generated tags yet.

Stack labels

FastAPI pytest Starlette Streamlit

AI development signals

Agent instructions and tool configuration paths found in the repository tree.

0 paths

No AI development config files detected.

Similar repositories

Nearest indexed repositories by embedding similarity.

hidai25/eval-view

Regression testing for AI agents. Snapshot behavior, diff tool calls, catch regressions in CI. Works with LangGraph, CrewAI, OpenAI, Anthropic.

124 stars

Python 1 awesome list

vostride/agent-qa

The self-improving QA agent for software teams. A test harness with memory. Write tests in natural language for web and mobile. agent-qa learns from every run, adapts to UI changes, and catches regressions before you ship.

161 stars

TypeScript 3 awesome lists

awslabs/agent-evaluation

A generative AI-powered framework for testing virtual agents.

369 stars

Python 1 awesome list

truera/trulens

Evaluation and Tracking for LLM Experiments and AI Agents

3,438 stars

Python 2 awesome lists

TheAgentCompany/TheAgentCompany

An agent benchmark with tasks in a simulated software company.

740 stars

Python 1 awesome list

Agnuxo1/benchclaw

BenchClaw — Multi-dimensional AI agent evaluation with 17-judge AI Tribunal, 10 scoring dimensions, radar charts, and deception detection. Benchmark any LLM agent.

6 stars

HTML 0 awesome lists

Metadata

Language: Python
License: Apache-2.0
Default branch: main
Created: 2024-10-25
First commit: 2024-10-25
Last pushed: 2026-07-29
GitHub updated: 2026-07-29
Last synced: 2026-07-31 03:09
Stack detected: 2026-07-31 03:09
Archived: no

Links and files

Website: https://judgmentlabs.ai/

Appears in

Awesome Agent Harness
