Awesome Bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

oxnr/awesome-bigdata #awesome #awesome-list #bigdata #data #data-analytics #data-science #data-stream #data-visualization #data-warehouse #database #distributed-database #series-database #stream-processing #streaming-data #visualize-data
List stars: 14,418
README repos: 211
Indexed repos: 183
List commits: 592
Forks: 2,585
Open issues: 3

Tracked list growth

GitHub stars and default-branch commits for oxnr/awesome-bigdata.

Latest scan 2026-06-03 10:49

[Charts: Likes history (GitHub stars); Commits history (default branch commits)]

Indexed repositories

183 repos currently saved from this list.

Latest repo push 2026-06-03

linkedin/isolation-forest

A distributed Spark/Scala implementation of the isolation forest and extended isolation forest algorithms for unsupervised outlier detection, featuring support for scalable training and ONNX export for easy cross-platform inference.

Scala | #anomaly-detection #isolation-forest #linkedin #machine-learning | pushed 2026-04-18 | 100 commits | first commit 2019-08-12 | 2 list mentions

shouc/daudit

🌲 Configuration flaws detector for Hadoop, MongoDB, MySQL, and more!

Python | pip | #auditing #bigdata #hadoop-spark #mongodb | pushed 2020-06-21 | 43 commits | first commit 2020-06-08 | 1 list mention | archived

chrislusf/seaweedfs

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, xDC replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding. Enterprise version is at seaweedfs.com.

Go | pushed 2026-05-22 | 12,815 commits | first commit 2011-11-30 | 2 list mentions