Awesome

Awesome-list intelligence for GitHub

Search every repository hiding inside awesome lists.

Discover projects curated by awesome-list maintainers, then narrow them by stars, age, freshness, archive status, language, topics, generated tags, detected stacks, package managers, and source list.

Repos indexed: 9,926
Awesome lists tracked: 76
Current results: 19
Topic: llm-evaluation

langfuse/langfuse

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

TypeScript · Express, Next.js, React, Tailwind CSS · npm, pnpm · #analytics #autogen #evaluation #langchain #large-language-models · AI dev signals · 11 awesome lists · 7170 commits · first commit 2023-05-18 · 9 history points · updated 2026-06-02

mlflow/mlflow

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.

Python · Express, FastAPI, Flask, LangChain · Maven, npm, PEP 517 · #agentops #agents #ai #ai-governance #apache-spark · AI dev signals · 3 awesome lists · 12377 commits · first commit 2018-06-05 · 5 history points · updated 2026-06-02

jeinlee1991/chinese-llm-benchmark

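NoneLinear (非线智能) ReLE evaluation: a continuously updated capability benchmark for Chinese-language large AI models. It currently covers 374 models, including commercial models such as chatgpt, gpt-5.4, Google gemini-3.1-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3.6-max, qwen3.6-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as open-source models such as step3.5-flash, kimi-k2.6, ernie4.5, MiniMax-M2.7, deepseek-v4, Qwen3.6, llama4, Zhipu GLM-5.1, MiMo-V2, LongCat, gemma4, and mistral. Besides a leaderboard, it provides a library of more than 2 million large-model defect cases so the community can study, analyze, and improve large models.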

#agentic-ai #artificial-intelligence #llm-agent #llm-evaluation · 3 awesome lists · 1 history point · updated 2026-05-23