
github Active

Repository profile

centerforaisafety/hle

Humanity's Last Exam



Stars: 1,609
Forks: 104
Watchers: 19
Issues: 7
Commits: 28
Awesome lists: 1


Activity and growth

Tracked growth, recent movement, and commit velocity from stored repository snapshots.

Latest capture 2026-07-13 03:03

Star growth, last 7 days: 0 (0.0%)
Commit velocity, last 7 days: 0 (0.0%)
Stars since baseline: +62
Snapshot coverage: 5
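
The growth figures above are derived from stored snapshots. The following is a minimal sketch of one way such numbers could be computed, not the catalog's actual implementation. The function name `star_growth` and the rule that a window with fewer than two captures reports zero are assumptions made for illustration. The baseline count of 1,547 is inferred from the page's own figures (1,609 stars minus the +62 gained since baseline); the intermediate captures are not shown on the page and are omitted here.

```python
from datetime import date


def star_growth(snapshots, start, end):
    """Return (delta, percent) between the first and last capture in [start, end].

    Assumption: a window with fewer than two captures reports no growth.
    """
    window = sorted((d, s) for d, s in snapshots if start <= d <= end)
    if len(window) < 2:
        return 0, 0.0
    first, last = window[0][1], window[-1][1]
    delta = last - first
    pct = 100.0 * delta / first if first else 0.0
    return delta, pct


# (capture date, total stars)
snapshots = [
    (date(2026, 5, 23), 1547),  # baseline, inferred as 1,609 - 62
    (date(2026, 7, 13), 1609),  # latest capture
]

delta, pct = star_growth(snapshots, date(2026, 5, 23), date(2026, 7, 13))
print(f"Stars since baseline: {delta:+d} ({pct:.1f}%)")
```

Percent growth is measured against the first capture in the window, so the same absolute gain reads as a smaller percentage for a larger repository.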

Tracked growth

5 captures since 2026-05-23

Stars from baseline +62

Charts (all tracked data, with optional custom start and end dates): Stars history (total stars) and Commits history (default branch commits).

Detected stack

Frameworks, package managers, ecosystems, and dependency manifests found during catalog scans.

Scanned 2026-07-13 03:03

Stack signals: 0
Package managers: 1
Manifest files: 1
Dependencies: 3

Frameworks and tools

No framework dependencies detected.

Package manager: pip (Python)

Dependency files

1 manifest

requirements.txt (Python ecosystem, 3 dependencies)

Classification

Searchable topics, generated tags, and stack labels that explain where this repository fits.

Topics: 0
Tags: 0
Stacks: 0

Topics

No topics indexed.

Generated tags

No generated tags yet.

Stack labels

No stack labels yet.

AI development signals

Agent instructions and tool configuration paths found in the repository tree.

0 paths

No AI development config files detected.

Similar repositories

Nearest indexed repositories by embedding similarity.

LiveBench/LiveBench

LiveBench: A Challenging, Contamination-Free LLM Benchmark

1,236 stars

Python, 1 awesome list

openai/mle-bench

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

1,626 stars

Python, 1 awesome list

openai/simple-evals

No description.

4,570 stars

Python, 3 awesome lists

stanford-crfm/helm

Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models (CRFM) at Stanford for holistic, reproducible, and transparent evaluation of foundation models, including large language models (LLMs) and multimodal models.

2,856 stars

Python, 2 awesome lists

bigcode-project/bigcodebench

[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI

511 stars

Python, 2 awesome lists

huggingface/lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

2,476 stars

Python, 2 awesome lists
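
The list above is ranked by embedding similarity. As a rough illustration of what that can mean, the sketch below ranks items by cosine similarity between vectors. It is an assumption that cosine similarity is the metric used here; the page does not say. The names `cosine_similarity` and `nearest`, the `repo-*` keys, and the two-dimensional vectors are all hypothetical and chosen only to keep the example small.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest(query, index, k=3):
    """Return the names of the k vectors in `index` most similar to `query`."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
index = {
    "repo-a": [1.0, 0.1],
    "repo-b": [0.0, 1.0],
    "repo-c": [0.7, 0.7],
}

print(nearest([1.0, 0.0], index, k=2))
```

Cosine similarity compares direction rather than magnitude, which is why it is a common choice for comparing text embeddings of descriptions with different lengths.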

Metadata

Language: Python
License: MIT
Default branch: main
Created: 2025-01-23
First commit: 2025-01-23
Last pushed: 2026-02-20
GitHub updated: 2026-07-11
Last synced: 2026-07-13 03:03
Stack detected: 2026-07-13 03:03
Archived: no

Links and files

GitHub Website

https://lastexam.ai

README

Appears in

Awesome Deep Research
