Stars: 21,537
Forks: 3,217
Commits: 4,273
Language: Python
Awesome lists: 1

Similar repositories

src-d/datasets

source{d} datasets ("big code") for source code analysis and machine learning on source code

346 stars
Jupyter Notebook 1 awesome list

bespokelabsai/curator

Synthetic data curation for post-training and structured data extraction

1678 stars
Python 1 awesome list

activeloopai/deeplake

Deeplake is AI Data Runtime for Agents. It provides serverless postgres with a multimodal datalake, enabling scalable retrieval and training.

9140 stars
C++ 2 awesome lists

sfu-db/dataprep

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.

2242 stars
Python 1 awesome list

argilla-io/distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

3229 stars
Python 1 awesome list

Tracked growth

1 capture since 2026-05-25

Latest capture 2026-05-25 21:00

Stars history (chart: total stars)

Commits history (chart: default branch commits)

Metadata

AI development signals

No AI development config files detected.