Repository profile

apify/crawlee-python

Crawlee: a web scraping and browser automation library for Python for building reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

Python Apache-2.0 master Stack scanned README.md

Stars: 9,330
Forks: 777
Watchers: 46
Issues: 80
Commits: 1,707
Awesome lists: 1

Activity and growth

Tracked growth, recent movement, and commit velocity from stored repository snapshots.

Latest capture 2026-07-17 04:37

Star growth, last 7 days: +33 (+0.4%)
Commit velocity, last 7 days: +26 (+1.5%)
Stars since baseline: +225
Snapshot coverage: 25
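
The growth figures above can be cross-checked with a little arithmetic. The sketch below assumes each percentage is the 7-day delta divided by the value at the start of the window; the page does not state its formula, so that is an assumption. The helper name `percent_change` is hypothetical.

```python
def percent_change(current: int, delta: int) -> float:
    """Return delta as a percentage of the value before the change.

    Assumption: the catalog measures growth against the earlier snapshot,
    i.e. previous = current - delta. The page does not confirm this.
    """
    previous = current - delta
    return 100.0 * delta / previous


# Figures taken from the profile: 9,330 stars (+33) and 1,707 commits (+26).
stars_pct = percent_change(9330, 33)
commits_pct = percent_change(1707, 26)

# "Stars since baseline: +225" implies the star count at the first capture.
baseline_stars = 9330 - 225

print(f"stars: +{stars_pct:.1f}%, commits: +{commits_pct:.1f}%, baseline: {baseline_stars}")
```

Rounded to one decimal place, both values agree with the percentages shown on the page. Dividing by the current value instead of the earlier one gives the same rounded result here, so the figures do not reveal which denominator the catalog actually uses.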

Tracked growth

25 captures since 2026-05-27

Charts: stars history (total stars) and commits history (default branch commits), shown over all tracked data.

Detected stack

Frameworks, package managers, ecosystems, and dependency manifests found during catalog scans.

Scanned 2026-07-17 04:37

Stack signals: 0
Package managers: 5
Manifest files: 8
Dependencies: 0

Frameworks and tools

No framework dependencies detected.

Package managers: npm, PEP 517, pip, pnpm, uv
Ecosystems: javascript, python

Dependency files

8 manifests

pyproject.toml python ecosystem, 0 dependencies
uv.lock python ecosystem, 0 dependencies
docs/pyproject.toml python ecosystem, 0 dependencies
website/package.json javascript ecosystem, 0 dependencies
website/pnpm-lock.yaml javascript ecosystem, 0 dependencies
website/roa-loader/package.json javascript ecosystem, 0 dependencies
src/crawlee/project_template/{{cookiecutter.project_name}}/pyproject.toml python ecosystem, 0 dependencies
src/crawlee/project_template/{{cookiecutter.project_name}}/requirements.txt python ecosystem, 0 dependencies

Classification

Searchable topics, generated tags, and stack labels that explain where this repository fits.

Topics: 17
Tags: 0
Stacks: 0

Topics

#apify #automation #beautifulsoup #crawler #crawling #headless #headless-chrome #parsel #pip #playwright #python #scraper #scraping #selenium #web-crawler #web-crawling #web-scraping

Generated tags

No generated tags yet.

Stack labels

No stack labels yet.

AI development signals

Agent instructions and tool configuration paths found in the repository tree.

3 paths

AI agent config detected

3 config paths: 3 files, 0 directories

Detected: Agent instructions, Claude Code, Gemini CLI

Key config paths

file AGENTS.md
file CLAUDE.md
file GEMINI.md

Similar repositories

Nearest indexed repositories by embedding similarity.

s0rg/crawley

The unix-way web crawler

339 stars

Go 2 awesome lists

firecrawl/firecrawl

The API to search, scrape, and interact with the web at scale. 🔥

152,085 stars

TypeScript 1 awesome list

amantus-ai/llm-codes

Transform developer documentation to clean Markdown

340 stars

TypeScript 0 awesome lists

AIMLPM/markcrawl

Fast Python web crawler for RAG and AI ingestion. Extracts clean Markdown from any site for LLMs and vector stores.

2 stars

Python 1 awesome list

unclecode/crawl4ai

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

72,998 stars

Python 2 awesome lists

Crawleo/Crawleo-MCP

Crawleo MCP Server - Real-Time Web Knowledge for AI Crawleo's Model Context Protocol (MCP) server enables AI assistants like Claude to access real-time web data directly through native tool integration.

11 stars

JavaScript 1 awesome list
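
As a rough illustration of how a "nearest repositories by embedding similarity" list like the one above could be produced, the sketch below ranks candidates by cosine similarity. The page names neither the embedding model nor the similarity metric, so cosine similarity, the toy three-dimensional vectors, and the names `cosine_similarity`, `nearest`, and `repo-a` through `repo-c` are all assumptions made for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query: list[float], index: dict[str, list[float]], k: int = 3):
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real ones would have hundreds of dimensions.
index = {
    "repo-a": [0.9, 0.1, 0.0],
    "repo-b": [0.0, 1.0, 0.0],
    "repo-c": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]

top = nearest(query, index, k=2)
for name, score in top:
    print(f"{name}: {score:.3f}")
```

A linear scan like this is adequate for a small catalog. A large index would more likely rely on an approximate nearest-neighbor structure, which is outside the scope of this sketch.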

Metadata

Language: Python
License: Apache-2.0
Default branch: master
Created: 2024-01-10
First commit: 2024-01-10
Last pushed: 2026-07-16
GitHub updated: 2026-07-16
Last synced: 2026-07-17 04:37
Stack detected: 2026-07-17 04:37
Archived: no

Links and files

Website: https://crawlee.dev/python/

README

README not available at capture time: the request to https://api.github.com/repos/apify/crawlee-python/readme returned 403 Forbidden (GitHub API rate limit exceeded) at 2026-07-17 04:37:48 UTC.

Appears in

Awesome Chatgpt Repositories
