Awesome Harness Engineering

Awesome list for AI agent harness engineering: tools, patterns, evals, memory, MCP, permissions, observability, and orchestration.

ai-boost/awesome-harness-engineering #agent-harness #agent-memory #agent-orchestration #ai-agent-harness #ai-agents #awesome-list #context-engineering #harness-engineering #mcp

List stars: 3,108
README repos: 184
Indexed repos: 141
List commits: 108
Forks: 333
Open issues: 105

Tracked list growth

GitHub stars and default-branch commits for ai-boost/awesome-harness-engineering.

Latest scan 2026-07-17 10:49

Indexed repositories

141 repos currently saved from this list.

Latest repo push 2026-07-17

zerobootdev/zeroboot

Sub-millisecond VM sandboxes for AI agents via copy-on-write forking

Stack

Rust Axum Cargo npm PEP 517

GitHub topics

#ai-agents #code-execution #copy-on-write #firecracker #kvm #rust

Updated: 2026-03-21
Lists: 1 list mention
First commit: 2026-03-15
License: Apache-2.0
Issues: 12 open

Stars: 2,400
Forks: 106
Commits: 24 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

UKGovernmentBEIS/inspect_ai

Inspect: A framework for large language model evaluations

AI dev

Stack

Python FastAPI Jupyter pytest npm PEP 517 pip

Updated: 2026-07-17
Lists: 2 list mentions
First commit: 2024-05-01
License: MIT
Issues: 238 open

Stars: 2,365
Forks: 608
Commits: 6,747 commits
Star growth, last 7 days: +36 +1.5%
Commit velocity, last 7 days: +212 +3.2%

Website GitHub

GammaLabTechnologies/harmonist

Portable AI agent orchestration with mechanical protocol enforcement. 186 agents, zero runtime dependencies.

Stack

Python

GitHub topics

#agent-framework #agent-system #ai-agents #claude-code #cursor-ide #llm

Updated: 2026-06-09
Lists: 1 list mention
First commit: 2026-04-23
License: MIT
Issues: 0 open

Stars: 2,254
Forks: 230
Commits: 10 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

aattaran/deepclaude

Use Claude Code's autonomous agent loop with DeepSeek V4 Pro, OpenRouter, or any Anthropic-compatible backend. Same UX, 17x cheaper.

Stack

JavaScript

Updated: 2026-05-16
Lists: 1 list mention
First commit: 2026-05-03
License: MIT
Issues: 33 open

Stars: 2,202
Forks: 150
Commits: 15 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

codejunkie99/agentic-stack

One brain, many harnesses. Portable .agent/ folder (memory + skills + protocols) that plugs into Claude Code, Cursor, Windsurf, OpenCode, OpenClaw, Hermes, or DIY Python — and keeps its knowledge when you switch.

AI dev

Stack

Python React npm pip

Updated: 2026-05-25
Lists: 2 list mentions
First commit: 2026-04-15
License: Apache-2.0
Issues: 5 open

Stars: 2,180
Forks: 265
Commits: 162 commits
Star growth, last 7 days: +10 +0.5%
Commit velocity, last 7 days: 0 0.0%

GitHub

aws/agent-toolkit-for-aws

Official, AWS-supported MCP servers, skills, and plugins to help AI agents build on AWS

AI dev

Stack

Python npm

Updated: 2026-07-10
Lists: 2 list mentions
First commit: 2026-04-23
License: Apache-2.0
Issues: 16 open

Stars: 1,825
Forks: 159
Commits: 131 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

future-agi/future-agi

Open-source, end-to-end platform for evaluating, observing, and improving LLM and AI agent applications. Tracing · Evals · Simulations · Datasets · Gateway · Guardrails. Self-hostable. Apache 2.0.

Stack

Python Celery Django Django REST Framework FastAPI Go modules npm PEP 517

GitHub topics

#ai #ai-gateway #evals #llm #observability #simulation

Updated: 2026-07-14
Lists: 5 list mentions
First commit: 2026-04-23
License: Apache-2.0
Issues: 456 open

Stars: 1,406
Forks: 389
Commits: 1,306 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

dirac-run/dirac

Coding agent singularly focused on efficiency and context curation. Reduces API costs by 50-80% vs. other agents and improves code quality at the same time. Uses hash-anchored edits, massively parallel operations, AST manipulation, and many other optimizations. https://dirac.run/

AI dev

Stack

TypeScript React Tailwind CSS Vite npm

Updated: 2026-07-09
Lists: 1 list mention
First commit: 2026-04-09
License: Apache-2.0
Issues: 15 open

Stars: 1,391
Forks: 81
Commits: 526 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

sierra-research/tau-bench

Code and Data for Tau-Bench

Stack

Python pip

Updated: 2026-03-18
Lists: 1 list mention
First commit: 2024-06-06
License: MIT
Issues: 50 open

Stars: 1,322
Forks: 208
Commits: 93 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

ZJU-REAL/ClawGUI

Build, Evaluate, and Deploy GUI Agents — online RL training, standardized benchmarks, and real-device deployment in one framework.

AI dev

Stack

Python Android FastAPI Gradio pytest Gradle npm PEP 517

GitHub topics

#agentrl #guiagents #mobile-agent #onlinerl #openclaw #rl-training

Updated: 2026-06-03
Lists: 1 list mention
First commit: 2026-03-26
License: Apache-2.0
Issues: 2 open

Stars: 1,311
Forks: 52
Commits: 211 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

stanford-iris-lab/meta-harness

Reference code for the Meta-Harness paper.

Stack

Python pytest uv

GitHub topics

#harness-engineering #llm-agents

Updated: 2026-07-11
Lists: 2 list mentions
First commit: 2026-04-15
License: MIT
Issues: 4 open

Stars: 1,271
Forks: 123
Commits: 18 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

greyhaven-ai/autocontext

A recursive, self-improving harness designed to help your agents (and future iterations of those agents) succeed on any task.

AI dev

Stack

Python Express FastAPI pytest React Bun npm uv

GitHub topics

#agents #ai #autoresearch #claude #claude-code #codex

Updated: 2026-07-12
Lists: 1 list mention
First commit: 2026-02-12
License: Apache-2.0
Issues: 0 open

Stars: 1,243
Forks: 102
Commits: 1,962 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

coleam00/claude-memory-compiler

Give Claude Code a memory that evolves with your codebase. Hooks automatically capture sessions, the Claude Agent SDK extracts key decisions and lessons, and an LLM compiler organizes everything into structured, cross-referenced knowledge articles - inspired by Karpathy's LLM Knowledge Base architecture.

AI dev

Stack

Python Starlette PEP 517 uv

Updated: 2026-04-06
Lists: 1 list mention
First commit: 2026-04-06
License: Unknown
Issues: 19 open

Stars: 1,242
Forks: 312
Commits: 2 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

thClaws/thClaws

Open-source AI agent harness in native Rust — GUI, CLI, headless, and webapp from one binary. Multi-provider, MCP, skills, plugins, agent teams.

AI dev

Stack

Rust Axum React Tailwind CSS Vite Cargo npm pnpm

GitHub topics

#agent-harness #agent-teams #ai-agent #anthropic #claude-code #cli

Updated: 2026-07-12
Lists: 3 list mentions
First commit: 2026-04-20
License: Apache-2.0
Issues: 0 open

Stars: 1,152
Forks: 158
Commits: 550 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

wandb/weave

Weave is a toolkit for developing AI-powered applications, built by Weights & Biases.

AI dev

Stack

Python Express FastAPI Flask Jupyter npm pip pnpm

Updated: 2026-07-12
Lists: 2 list mentions
First commit: 2022-03-23
License: Apache-2.0
Issues: 299 open

Stars: 1,104
Forks: 157
Commits: 7,072 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

Mibayy/token-savior

The MCP server that turns Claude into the only coding agent hitting 100% on a real benchmark. -77% active tokens, -76% wall time, 0 losses across 96 tasks on Claude Opus 4.7. Structural code navigation + persistent memory. Works with every MCP client.

AI dev

Stack

Python pytest Starlette PEP 517 uv

Updated: 2026-07-04
Lists: 2 list mentions
First commit: 2026-03-25
License: MIT
Issues: 9 open

Stars: 1,064
Forks: 88
Commits: 357 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

rasbt/mini-coding-agent

Minimal and readable coding agent harness implementation in Python to explain the core components of coding agents.

Stack

Python pytest uv

GitHub topics

#agents #ai #large-language-models #llms #python

Updated: 2026-04-07
Lists: 2 list mentions
First commit: 2026-04-02
License: Apache-2.0
Issues: 2 open

Stars: 1,010
Forks: 184
Commits: 15 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

agentscope-ai/agentscope-runtime

A production-ready runtime framework for agent apps with secure tool sandboxing, Agent-as-a-Service APIs, scalable deployment, full-stack observability, and broad framework compatibility.

AI dev

Stack

Python Celery Express FastAPI LangChain npm PEP 517 pip

GitHub topics

#a2a #agent #agentscope #agno #deployment #docker

Updated: 2026-06-04
Lists: 2 list mentions
First commit: 2025-08-14
License: Apache-2.0
Issues: 72 open

Stars: 830
Forks: 163
Commits: 290 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

alash3al/stash

Stash — persistent memory layer for AI agents. Episodes, facts, and working context stored in Postgres. MCP server included. Self-hosted, single binary, no cloud required.

Stack

Go Go modules

GitHub topics

#ai #ai-agents #ai-memory #memory

Updated: 2026-06-14
Lists: 1 list mention
First commit: 2026-04-18
License: Apache-2.0
Issues: 4 open

Stars: 745
Forks: 40
Commits: 112 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

china-qijizhifeng/agentic-harness-engineering

Official AHE code — Agentic Harness Engineering: observability-driven automatic evolution of coding-agent harnesses (concurrent w/ meta-harness). NexAU-AHE reaches 84.7% ± 2.1 pass@1 on Terminal-Bench 2 (GPT-5.5). Lifts GPT-5.4 69.7→77.0% over 10 iters, beats Codex/ACE/Training-Free GRPO; frozen harness transfers to SWE-bench-Verified.

AI dev

Stack

Python PEP 517

Updated: 2026-06-14
Lists: 2 list mentions
First commit: 2026-04-26
License: MIT
Issues: 4 open

Stars: 743
Forks: 80
Commits: 46 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

claw-eval/claw-eval

Claw-Eval is a harness for evaluating LLMs as agents. All tasks are verified by humans.

Stack

Python FastAPI pytest PEP 517 pip

GitHub topics

#agent #harness #llm #openclaw

Updated: 2026-05-17
Lists: 2 list mentions
First commit: 2026-03-17
License: Unknown
Issues: 19 open

Stars: 716
Forks: 64
Commits: 43 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

tldrsec/prompt-injection-defenses

Every practical and proposed defense against prompt injection.

GitHub topics

#ai #cybersecurity #prompt-injection #security

Updated: 2025-02-22
Lists: 1 list mention
First commit: 2024-04-01
License: Unknown
Issues: 10 open

Stars: 713
Forks: 57
Commits: 24 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

sipyourdrink-ltd/bernstein

Audit-grade multi-agent orchestration for CLI coding agents (Claude Code, Codex, Gemini CLI, +40 more). HMAC-chained audit log, signed agent cards, per-artefact lineage, air-gap deploy. The orchestrator your compliance team will sign off on. https://bernstein.run

AI dev

Stack

Python FastAPI Flask pytest React npm PEP 517 pip

GitHub topics

#agent-framework #agent-orchestrator #agentic-ai #ai-agents #ai-coding #aider

Updated: 2026-07-01
Lists: 6 list mentions
First commit: 2026-03-27
License: Apache-2.0
Issues: 11 open

Stars: 614
Forks: 56
Commits: 3,311 commits
Star growth, last 7 days: No 7-day history
Commit velocity, last 7 days: No 7-day history

Website GitHub

neosigmaai/auto-harness

Bring your own agent and build a self-improving agentic system. Automatically mine failures, optimize the agent harness, and gate against regressions.

Stack

Python pip uv

Updated: 2026-07-08
Lists: 2 list mentions
First commit: 2026-04-04
License: MIT
Issues: 10 open

Stars: 525
Forks: 59
Commits: 12 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

manuelschipper/nah

You should sandbox your agents. This is for when you don't.

AI dev

Stack

Python pytest PEP 517

Updated: 2026-06-30
Lists: 1 list mention
First commit: 2026-03-10
License: MIT
Issues: 6 open

Stars: 452
Forks: 26
Commits: 881 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

statewright/statewright

State machine guardrails for AI agents

AI dev

Stack

Rust Axum Tailwind CSS Vite Vue Cargo npm

Updated: 2026-06-26
Lists: 1 list mention
First commit: 2026-05-04
License: NOASSERTION
Issues: 0 open

Stars: 415
Forks: 13
Commits: 224 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

browser-use/bux

Browser Use Box: a 24/7 Claude Code agent for Playwright-style browser automation with Browser Use Cloud, Telegram, and a real browser on any box you own.

AI dev

Stack

Python pip

GitHub topics

#ai-agent #ai-automation #automation #browser-agent #browser-automation #browser-use

Updated: 2026-07-02
Lists: 1 list mention
First commit: 2026-04-26
License: MIT
Issues: 20 open

Stars: 403
Forks: 50
Commits: 417 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

evilmartians/agent-prism

React components for visualizing traces from AI agents

Stack

TypeScript Next.js React Tailwind CSS Vite npm pnpm

GitHub topics

#ai-agents #ai-monitoring #component-library #developer-tools #distributed-tracing #generative-ai

Updated: 2026-07-08
Lists: 1 list mention
First commit: 2025-07-22
License: MIT
Issues: 16 open

Stars: 371
Forks: 21
Commits: 376 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

vinkius-labs/mcpfusion

MCP Fusion - The TypeScript framework for secure MCP servers.

AI dev

Stack

TypeScript Express Vite Vue npm

GitHub topics

#mcp #mcp-framework #mcp-server #model-context-protocol

Updated: 2026-06-25
Lists: 1 list mention
First commit: 2026-02-12
License: Apache-2.0
Issues: 8 open

Stars: 256
Forks: 23
Commits: 340 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

onesuper/tui-use

tui-use lets agents interact with programs that expect a human at the keyboard — REPLs, debuggers, TUI apps, and anything else bash can't reach.

AI dev

Stack

TypeScript npm

Updated: 2026-04-11
Lists: 1 list mention
First commit: 2026-04-06
License: MIT
Issues: 1 open

Stars: 250
Forks: 20
Commits: 113 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

SuperagenticAI/metaharness

Meta Harness Implementation

AI dev

Stack

Python PEP 517 uv

GitHub topics

#agentic-coding #harness #harness-engineering

Updated: 2026-06-13
Lists: 1 list mention
First commit: 2026-04-01
License: NOASSERTION
Issues: 0 open

Stars: 143
Forks: 16
Commits: 26 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

coleam00/your-claude-engineer

Demonstration of an agent harness with access to tools like Slack, GitHub, and Linear so it can act as your own complete software engineer.

AI dev

Stack

Python Starlette pip

Updated: 2026-02-01
Lists: 1 list mention
First commit: 2026-01-28
License: MIT
Issues: 1 open

Stars: 135
Forks: 53
Commits: 7 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

canvas-org/meta-agent

Continual harness optimization

AI dev

Stack

Python PEP 517

GitHub topics

#agent #anthropic #claude #harness #llm #training

Updated: 2026-05-14
Lists: 1 list mention
First commit: 2026-04-06
License: MIT
Issues: 0 open

stars

Forks: 7
Commits: 4 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

facebookresearch/cca-swebench

SWE-bench repro script for running confucius-code-agent (CCA)

Stack

Python LangChain PEP 517 pip

Updated: 2026-05-21
Lists: 1 list mention
First commit: 2025-12-27
License: MIT
Issues: 2 open

stars

Forks: 5
Commits: 2 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

raphaelchristi/harness-evolver

Automated harness evolution for AI agents. A Claude Code plugin that iteratively optimizes system prompts, routing, retrieval, and orchestration code using full-trace counterfactual diagnosis. Based on Meta-Harness (Lee et al., 2026).

AI dev

Stack

Python LangChain npm pip

GitHub topics

#agent-evolution #claude-code-plugin #codex-skills #harness-engineering #meta-harness

Updated: 2026-04-18
Lists: 1 list mention
First commit: 2026-03-31
License: MIT
Issues: 3 open

stars

Forks: 4
Commits: 333 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

Syncause/debug-skill

Agent debugging skill. Stop AI debugging guesswork with runtime evidence.

AI dev

GitHub topics

#agent #mcp #skill

Updated: 2026-04-14
Lists: 1 list mention
First commit: 2026-01-29
License: Unknown
Issues: 0 open

stars

Forks: 2
Commits: 33 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

mastra-ai/workshop-mastracode

Build your own coding agent workshop - Feb 19th 2026

Stack

HTML Express npm pnpm

Updated: 2026-02-19
Lists: 1 list mention
First commit: 2026-02-18
License: Unknown
Issues: 0 open

stars

Forks: 4
Commits: 8 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

rotorstar/hitl-protocol

Human-in-the-Loop Protocol for Autonomous Agent Services — Open Standard (v0.8)

Stack

HTML Express FastAPI Next.js pytest npm pip pnpm

GitHub topics

#agents #ai-agents #auth #hermes #hermes-agent #hitl

Updated: 2026-07-11
Lists: 1 list mention
First commit: 2026-02-23
License: Apache-2.0
Issues: 1 open

stars

Forks: 0
Commits: 43 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

Website GitHub

aws-samples/sample-human-in-the-loop-patterns

No description.

Stack

Python PEP 517 pip

Updated: 2026-03-17
Lists: 1 list mention
First commit: 2025-12-10
License: MIT-0
Issues: 0 open

stars

Forks: 0
Commits: 10 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

danielrosehill/AI-Harnesses

Point-in-time snapshot of projects describing themselves as AI agent harnesses (April 2026)

Updated: 2026-04-04
Lists: 1 list mention
First commit: 2026-04-04
License: Unknown
Issues: 0 open

stars

Forks: 0
Commits: 1 commit
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

ml6team/AISO-workshop

ML6 x AISO Agent Workshop (March 2026)

Stack

Python PEP 517 uv

Updated: 2026-03-31
Lists: 1 list mention
First commit: 2025-10-22
License: Unknown
Issues: 0 open

stars

Forks: 31
Commits: 10 commits
Star growth, last 7 days: 0 0.0%
Commit velocity, last 7 days: 0 0.0%

GitHub

Activity

Default branch: main
Last pushed: 2026-07-17
GitHub updated: 2026-07-17
Created: 2026-03-29
First commit: -
Last scanned: 2026-07-17 10:49
Watchers: 27

Indexed repo mix

Repo stars: 3,104,429
Repo forks: 376,364
Active: 140
Archived: 1

Languages

Python (80) TypeScript (33) Rust (8) Go (5) HTML (4) JavaScript (2) Shell (2) C (1) Elixir (1) Jupyter Notebook (1)
