apify/crawlee-python

Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP, in both headful and headless modes, with proxy rotation.

  • Stars: 9,134
  • Forks: 748
  • Commits: 1,558
  • Language: Python
  • Awesome lists: 1

Similar repositories

s0rg/crawley

The unix-way web crawler

340 stars · Go · 2 awesome lists

firecrawl/firecrawl

The API to search, scrape, and interact with the web at scale. 🔥

127,429 stars · TypeScript · 1 awesome list

unclecode/crawl4ai

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

67,581 stars · Python · 2 awesome lists

AIMLPM/markcrawl

Fast Python web crawler for RAG and AI ingestion. Extracts clean Markdown from any site for LLMs and vector stores.

2 stars · Python · 1 awesome list

amantus-ai/llm-codes

Transform developer documentation to clean Markdown

327 stars · TypeScript · 0 awesome lists

Crawleo/Crawleo-MCP

Crawleo MCP Server - Real-Time Web Knowledge for AI. Crawleo's Model Context Protocol (MCP) server enables AI assistants like Claude to access real-time web data directly through native tool integration.

10 stars
JavaScript 1 awesome list

Tracked growth

2 captures since 2026-05-27

Latest capture 2026-06-02 07:23

[Chart: Stars history (total stars)]

[Chart: Commits history (default branch commits)]

Detected stack

Frameworks and tools

  • pytest · test framework · high confidence
  • React · frontend framework · high confidence
  • npm · PEP 517 · pip · pnpm · Poetry · uv

Dependency files

  • pyproject.toml · python · 68 dependencies
  • uv.lock · python · 0 dependencies
  • docs/pyproject.toml · python · 0 dependencies
  • website/package.json · javascript · 44 dependencies
  • website/pnpm-lock.yaml · javascript · 0 dependencies
  • website/roa-loader/package.json · javascript · 1 dependency
  • src/crawlee/project_template/{{cookiecutter.project_name}}/pyproject.toml · python · 4 dependencies
  • src/crawlee/project_template/{{cookiecutter.project_name}}/requirements.txt · python · 3 dependencies

Metadata

  • Created: 2024-01-10
  • First commit: 2024-01-10
  • Last pushed: 2026-06-02
  • Website: https://crawlee.dev/python/
  • Archived: no
  • Stack detected: 2026-06-02 07:23
  • License: Apache-2.0

AI development signals

AI agent config detected

3 config paths · 3 files · 0 directories
Agent instructions · Claude Code · Gemini CLI

Key config paths

  • file AGENTS.md
  • file CLAUDE.md
  • file GEMINI.md