Awesome List
Probably the best curated list of data science software in Python.
GitHub stars and default-branch commits for krzjoa/awesome-python-data-science.
351 repos currently saved from this list.
A unified framework for tabular probabilistic regression, time-to-event prediction, and probability distributions in Python
Solve automatic numerical differentiation problems in one or more variables.
Bayesian Optimization using GPflow
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
zoofs is a Python library for performing feature selection using a variety of nature-inspired wrapper algorithms. The algorithms range from swarm-intelligence to physics-based to evolutionary. It is an easy-to-use, flexible, and powerful tool to reduce your feature size.
Audio features extraction
Python-based implementations of algorithms for learning on imbalanced data.
A library for augmenting annotated audio data
Relevance Vector Machine implementation using the scikit-learn API.
A fast xgboost feature selection algorithm
scikit-learn wrappers for Python fastText.
Stacked Generalization (Ensemble Learning)
LibXtract is a simple, portable, lightweight library of audio feature extraction functions.
The official implementation of "The Shapley Value of Classifiers in Ensemble Games" (CIKM 2021).
The goal of pandas-log is to provide feedback about basic pandas operations. It provides simple wrappers around the most common functions that add extra logging.
An API conversion tool for popular external reinforcement learning environments
QML: Quantum Machine Learning
BatchFlow helps you conveniently work with random or sequential batches of your data and define data processing and machine learning workflows even for datasets that do not fit into memory.
Tools for finite state automata construction and dictionary-based morphological dictionaries. Includes a Polish stemming dictionary.
Functional data manipulation for pandas
Probabilistic programming framework that facilitates objective model selection for time-varying parameter models.
⬛ Python Individual Conditional Expectation Plot Toolbox
A Genetic Programming platform for Python with TensorFlow for wicked-fast CPU and GPU support.
Safe Bayesian Optimization
InferPy: Deep Probabilistic Modeling with TensorFlow Made Easy
PyTorch Geometric Signed Directed is a signed/directed graph neural network extension library for PyTorch Geometric. The paper is accepted by LoG 2023.
Python histogram library - histograms as updateable, fully semantic objects with visualization tools. [P]ython [HYST]ograms.
A sklearn-compatible Python implementation of Multifactor Dimensionality Reduction (MDR) for feature construction.
A library that implements fairness-aware machine learning algorithms
A strongly-typed genetic programming framework for Python
Library for machine learning stacking generalization.
Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more.
NitroFE is a Python feature engineering engine which provides a variety of modules designed to internally save past dependent values for providing continuous calculation.
Python package implementing ML feature engineering and pre-processing for polars or pandas dataframes.
A graph reliability toolbox based on PyTorch and PyTorch Geometric (PyG).
A library for hidden semi-Markov models with explicit durations
SigOpt wrappers for scikit-learn methods
Support vector machines (SVMs) and related kernel-based learning algorithms are a well-known class of machine learning algorithms, for non-parametric classification and regression. liquidSVM is an implementation of SVMs whose key features are: fully integrated hyper-parameter selection, extreme speed on both small and large data sets, full flexibility for experts, and inclusion of a variety of different learning scenarios: multi-class classification, ROC, and Neyman-Pearson learning, and least-squares, quantile, and expectile regression.
Build, test, deploy, iterate - Dev and prod tool for data science pipelines
A feature engineering wrapper for sklearn
Contrastive Explanation (Foil Trees), developed at TNO/Utrecht University
Auralisation of learned features in CNN (for audio)
scikit-learn addon to operate on set/"group"-based features
Universal 1d/2d data containers with Transformers functionality for data analysis.
Machine learning on dirty tabular data (legacy clone of skrub)
A collection of pandas & scikit-learn compatible transformers for preprocessing and feature engineering 🛠
TensorLight - A high-level framework for TensorFlow