huggingface/evaluate

🤗 Evaluate: A library for easily evaluating machine learning models and datasets.

Stars
2,447
Forks
319
Commits
990
Language
Python
Awesome lists
1

Similar repositories

huggingface/lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

2428 stars
Python 2 awesome lists

edublancas/sklearn-evaluation

Machine learning model evaluation made easy: plots, tables, HTML reports, experiment tracking and Jupyter notebook analysis.

3 stars
1 awesome list

openai/evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

18531 stars
Python 4 awesome lists

evalplus/evalplus

Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024

1749 stars
Python 2 awesome lists

modelscope/evalscope

A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.

2844 stars
Python 2 awesome lists

evidentlyai/evidently

Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.

7534 stars
Jupyter Notebook 4 awesome lists

Tracked growth

1 capture since 2026-05-25

Latest capture 2026-05-25 21:00

AI development signals

No AI development config files detected.