Awesome Python Data Science

Probably the best curated list of data science software in Python.

krzjoa/awesome-python-data-science #awesome #awesome-list #awesome-python #data-analysis #data-science #data-visualization #deep-learning #machine-learning #python #scikit-learn #statistics
List stars: 3,450
README repos: 357
Indexed repos: 351
List commits: 500
Forks: 447
Open issues: 13

Tracked list growth

GitHub stars and default-branch commits for krzjoa/awesome-python-data-science.

Latest scan 2026-06-02 10:49

Indexed repositories

351 repos currently saved from this list.

Latest repo push 2026-06-02

vwxyzjn/cleanrl

High-quality single-file implementations of deep reinforcement learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python | #a2c #actor-critic #advantage-actor-critic #ale | pushed 2026-04-20 | 843 commits | first commit 2019-06-07 | 1 list mention

catboost/catboost

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

C++ | #big-data #catboost #categorical-features #coreml | pushed 2026-05-25 | 50,470 commits | 4 list mentions

clips/pattern

Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

Python | #machine-learning #natural-language-processing #network-analysis #python | pushed 2024-06-10 | 1,434 commits | first commit 2011-02-24 | 2 list mentions

vaexio/vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀

Python | #bigdata #data-science #dataframe #hdf5 | pushed 2026-04-01 | 3,727 commits | first commit 2014-01-27 | 3 list mentions

guofei9987/scikit-opt

Genetic Algorithm, Particle Swarm Optimization, Simulated Annealing, Ant Colony Optimization Algorithm, Immune Algorithm, Artificial Fish Swarm Algorithm, Differential Evolution and TSP (Traveling Salesman)

Python | #ant-colony-algorithm #artificial-intelligence #fish-swarms #genetic-algorithm | pushed 2026-03-25 | 345 commits | first commit 2017-12-06 | 2 list mentions

uber/causalml

Uplift modeling and causal inference with machine learning algorithms

Python | #causal-inference #incubation #machine-learning #uplift-modeling | pushed 2026-05-16 | 678 commits | first commit 2019-07-10 | 1 list mention

microsoft/MMdnn

MMdnn is a set of tools that helps users interoperate among different deep learning frameworks, for example through model conversion and visualization. It converts models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX and CoreML.

Python | #caffe #cntk #coreml #darknet | pushed 2025-08-07 | 1,085 commits | first commit 2017-08-16 | 1 list mention

lyst/lightfm

A Python implementation of LightFM, a hybrid recommendation algorithm.

Python | #learning-to-rank #machine-learning #matrix-factorization #python | pushed 2024-07-24 | 483 commits | first commit 2015-07-30 | 1 list mention