  • Stars: 6,165
  • Forks: 2,492
  • Commits: 7,397
  • Language: Java
  • Awesome lists: 1

Similar repositories

apache/spark

Apache Spark - A unified analytics engine for large-scale data processing

43339 stars
Scala 4 awesome lists

TIBCOSoftware/snappydata

Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster

1034 stars
Scala 1 awesome list

h2oai/h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

7484 stars
Jupyter Notebook 3 awesome lists

apache/druid

Apache Druid: a high performance real-time analytics database.

14015 stars
Java 3 awesome lists

apache/gobblin

A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

2265 stars
Java 1 awesome list

Tracked growth

1 capture since 2026-05-25

Latest capture 2026-05-25 20:51

Stars history: total stars

Commits history: default branch commits

Metadata

  • Created: 2016-12-14
  • First commit: —
  • Last pushed: 2026-05-25
  • Website: https://hudi.apache.org/
  • Archived: no
  • Stack detected: —
  • License: Apache-2.0

AI development signals

No AI development config files detected.