GitHub projects from awesome lists
Search names, descriptions, topics, tags, and stacks, then tune results by ecosystem, freshness, health, and cross-list signal.
Oryx 2: Lambda architecture on Apache Spark and Apache Kafka for real-time, large-scale machine learning
Fast scalable time series database
Elassandra = Elasticsearch + Apache Cassandra
Blazingly fast analytics database that will rapidly devour all of your data.
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
Dynamic HTML5 visualization
Bloomberg's distributed RDBMS
AgensGraph, a transactional graph database based on PostgreSQL
HiBench is a big data benchmark suite.
In-memory NoSQL database with ACID transactions, Raft consensus, and Redis API
HyperDex is a scalable, searchable key-value store
An open source event analytics platform
Memcache on SSD
SQL-based streaming analytics platform at scale
Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.
EliasDB is a graph-based database.
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
Bistro is a flexible distributed scheduler, a high-performance framework supporting multiple paradigms while retaining ease of configuration, management, and monitoring.
Twemcache is the Twitter Memcached
CPU and GPU-accelerated Machine Learning Library
Fast multilayer perceptron neural network library for iOS and Mac OS X
A Bayesian database table for querying the probable implications of data as easily as SQL databases query the data itself. New implementation in http://github.com/probcomp/bayeslite
Fast and reliable message broker built on top of Kafka.
Time-series database
Netflix's distributed Data Pipeline
📈 Collect customer event data from your apps. (Note that this project only includes the API collector, not the visualization platform)
Build platforms that flexibly mix SQL, batch, and stream processing paradigms
A probabilistic data structure service and storage
GhostDB is a distributed, in-memory, general purpose key-value data store that delivers microsecond performance at any scale.