huggingface/datasets
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
Synthetic data curation for post-training and structured data extraction
LiveBench: A Challenging, Contamination-Free LLM Benchmark
Structured Outputs
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Low-code framework for building custom LLMs, neural networks, and other AI models
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷