AIMLPM/markcrawl
Fast Python web crawler for RAG and AI ingestion. Extracts clean Markdown from any site for LLMs and vector stores.
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
~95% on SimpleQA (e.g. Qwen3.6-27B on a 3090). Supports all local and cloud LLMs (llama.cpp, Ollama, Google, ...). 10+ search engines - arXiv, PubMed, your private documents. Everything Local & Encrypted.
The API to search, scrape, and interact with the web at scale. 🔥
Transform developer documentation to clean Markdown
The unified web layer for AI agents. Search (8 engines), stealth browse, auth, and act on 24 platforms. One npm install, self-hosted.
Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
5 captures since 2026-05-22
pyproject.toml · python · 43 dependencies
requirements.txt · python · 33 dependencies
setup.cfg · python · 0 dependencies
setup.py · python · 0 dependencies
uv.lock · python · 0 dependencies
deploy/docker/requirements.txt · python · 17 dependencies
tests/memory/requirements.txt · python · 4 dependencies
deploy/docker/tests/requirements.txt · python · 2 dependencies
docs/examples/website-to-api/requirements.txt · python · 5 dependencies
docs/examples/c4a_script/tutorial/requirements.txt · python · 2 dependencies
AI agent config detected
Key config paths
.claude
.claude/commands
.claude/commands/c4ai-check.md
.claude/settings.local.json