src-d/datasets

source{d} datasets ("big code") for source code analysis and machine learning on source code

  • Stars: 346
  • Forks: 85
  • Commits: 314
  • Language: Jupyter Notebook
  • Awesome lists: 1

Similar repositories

huggingface/datasets

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

21537 stars
Python 1 awesome list

0xdevalias/chatgpt-source-watch

Analyzing the evolution of ChatGPT's codebase through time with curated archives and scripts

299 stars
JavaScript 1 awesome list

chaos-genius/chaos_genius

ML powered analytics engine for outlier detection and root cause analysis.

777 stars
Python 2 awesome lists

AllenInstitute/openai_tools

Growing collection of scripts to summarize the scientific literature using large-language models like ChatGPT.

111 stars
HTML 1 awesome list

activeloopai/deeplake

Deeplake is AI Data Runtime for Agents. It provides serverless postgres with a multimodal datalake, enabling scalable retrieval and training.

9140 stars
C++ 2 awesome lists

Tracked growth

2 captures since 2026-05-23

Latest capture 2026-05-31 03:02

Stars history: [chart: total stars]

Commits history: [chart: default branch commits]

Metadata

  • Created: 2018-01-25
  • First commit: 2018-01-25
  • Last pushed: 2019-11-27
  • Archived: no
  • Stack detected: —
  • License: NOASSERTION

AI development signals

No AI development config files detected.