Awesome List

Awesome Python Data Science

Probably the best curated list of data science software in Python.

krzjoa/awesome-python-data-science
#awesome #awesome-list #awesome-python #data-analysis #data-science #data-visualization #deep-learning #machine-learning #python #scikit-learn #statistics
List stars: 3,453
README repos: 357
Indexed repos: 351
List commits: 500
Forks: 447
Open issues: 13

Tracked list growth

GitHub stars and default-branch commits for krzjoa/awesome-python-data-science.

Latest scan 2026-06-03 10:49

[Chart: Likes history (GitHub stars)]

[Chart: Commits history (default branch commits)]

Indexed repositories

351 repos currently saved from this list.

Latest repo push 2026-06-03

dmlc/xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ | CMake, Maven | #distributed-systems #gbdt #gbm #gbrt | pushed 2026-06-02 | 7,829 commits | first commit 2014-02-06 | 4 list mentions

mlflow/mlflow

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.

Python | Express, FastAPI, Flask | Maven, npm | #agentops #agents #ai #ai-governance | pushed 2026-06-03 | 12,388 commits | first commit 2018-06-05 | 3 list mentions | AI dev signals

shap/shap

A game theoretic approach to explain the output of any machine learning model.

Jupyter Notebook | #deep-learning #explainability #gradient-boosting #interpretability | pushed 2026-05-23 | 2,998 commits | 2 list mentions | AI dev signals

PaddlePaddle/Paddle

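PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle (飞桨): high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)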

C++ | #deep-learning #distributed-training #efficiency #machine-learning | pushed 2026-05-25 | 57,864 commits | 2 list mentions | AI dev signals