src-d/datasets
source{d} datasets ("big code") for source code analysis and machine learning on source code
Core meta for awesome-public-datasets. Contribute new data here!
COVID-19 global data as a service (currently sourced from JHU CSSE)
Amundsen is a metadata-driven application that improves the productivity of data analysts, data scientists, and engineers when they interact with data.
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
📚 A compilation of research relevant to Data Together's efforts tackling the general problem of data resilience & interactivity
ML powered analytics engine for outlier detection and root cause analysis.