  • Stars: 340
  • Forks: 18
  • Commits: 157
  • Language: Go
  • Awesome lists: 2

Similar repositories

apify/crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

9134 stars
Python 1 awesome list

karust/gogetcrawl

Extract web archive data using Wayback Machine and Common Crawl

179 stars
Go 1 awesome list

AIMLPM/markcrawl

Fast Python web crawler for RAG and AI ingestion. Extracts clean Markdown from any site for LLMs and vector stores.

2 stars
Python 1 awesome list

Pyx-Corp/spectrawl

The unified web layer for AI agents. Search (8 engines), stealth browse, auth, and act on 24 platforms. One npm install, self-hosted.

24 stars
JavaScript 1 awesome list

N0taN3rd/Squidwarc

Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head

174 stars
JavaScript 1 awesome list

firecrawl/firecrawl

The API to search, scrape, and interact with the web at scale. 🔥

127826 stars
TypeScript 1 awesome list

Tracked growth

2 captures since 2026-05-22

Latest capture 2026-05-28 03:00

[Chart: Stars history (total stars)]

[Chart: Commits history (default branch commits)]

Metadata

  • Created: 2021-10-27
  • First commit: 2021-10-27
  • Last pushed: 2026-05-08
  • Archived: no
  • Stack detected: —
  • License: MIT

AI development signals

No AI development config files detected.