huggingface/datasets
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
source{d} datasets ("big code") for source code analysis and machine learning on source code
Analyzing the evolution of ChatGPT's codebase through time with curated archives and scripts
ML-powered analytics engine for outlier detection and root cause analysis.
A growing collection of scripts for summarizing the scientific literature with large language models such as ChatGPT.
Deeplake is an AI data runtime for agents. It provides serverless Postgres with a multimodal data lake, enabling scalable retrieval and training.