github Active AI dev

Repository profile

apache/spark

Apache Spark - A unified analytics engine for large-scale data processing

Scala Apache-2.0 master Stack scanned README.md

Stars: 43,614
Forks: 29,275
Watchers: 1,987
Issues: 450
Commits: 49,068
Awesome lists: 4

Activity and growth

Tracked growth, recent movement, and commit velocity from stored repository snapshots.

Latest capture 2026-07-15 03:06

Star growth, last 7 days: 0 (0.0%)
Commit velocity, last 7 days: 0 (0.0%)
Stars since baseline: +294
Snapshot coverage: 10

Tracked growth

10 captures since 2026-05-22

Time horizon: all tracked data

Stars history (chart: total stars)

Commits history (chart: default branch commits)

Detected stack

Frameworks, package managers, ecosystems, and dependency manifests found during catalog scans.

Scanned 2026-07-15 03:06

Stack signals: 0
Package managers: 5
Manifest files: 65
Dependencies: 2,463

Frameworks and tools

No framework dependencies detected.

Package managers: Bundler, Maven, npm, pip, uv
Ecosystems: java, javascript, python, ruby

Dependency files

65 manifests

pom.xml java ecosystem, 365 dependencies
pyproject.toml python ecosystem, 53 dependencies
assembly/pom.xml java ecosystem, 42 dependencies
core/pom.xml java ecosystem, 186 dependencies
dev/package.json javascript ecosystem, 4 dependencies
dev/requirements.txt python ecosystem, 56 dependencies
docs/Gemfile ruby ecosystem, 4 dependencies
examples/pom.xml java ecosystem, 26 dependencies
57 more files

Classification

Searchable topics, generated tags, and stack labels that explain where this repository fits.

Topics: 8
Tags: 0
Stacks: 0

Topics

#big-data #java #jdbc #python #r #scala #spark #sql

Generated tags

No generated tags yet.

Stack labels

No stack labels yet.

AI development signals

Agent instructions and tool configuration paths found in the repository tree.

2 paths

AI agent config detected

2 config paths: 2 files, 0 directories

Agent instructions, Claude Code

Key config paths

file AGENTS.md
file CLAUDE.md

Similar repositories

Nearest indexed repositories by embedding similarity.

apache/hudi

Upserts, Deletes And Incremental Processing on Big Data.

6,186 stars

Java, 1 awesome list

microsoft/SynapseML

Simple and Distributed Machine Learning

5,231 stars

Scala, 1 awesome list

lensacom/sparkit-learn

PySpark + Scikit-learn = Sparkit-learn

1,151 stars

Python, 1 awesome list

apache/maven

Apache Maven core

5,368 stars

Java, 1 awesome list

TIBCOSoftware/snappydata

Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster

1,032 stars

Scala, 1 awesome list

apache/flink

Apache Flink

26,176 stars

Java, 1 awesome list

Metadata

Language: Scala
License: Apache-2.0
Default branch: master
Created: 2014-02-25
First commit: 2010-03-29
Last pushed: 2026-07-15
GitHub updated: 2026-07-15
Last synced: 2026-07-15 03:06
Stack detected: 2026-07-15 03:06
Archived: no

Links and files

Website: https://spark.apache.org/
Files: README.md