simplecto/sitemap_grabber

A Python library that recursively crawls every sitemap.xml for a website. It also handles robots.txt and other well-known files.
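
As an illustration of what "recursively crawl every sitemap.xml" can mean in practice, here is a minimal sketch using only the Python standard library. It is not taken from sitemap_grabber and does not reflect its API: the function names `sitemaps_from_robots`, `crawl_sitemap`, and `fetch_http` are hypothetical, and the sketch assumes uncompressed XML sitemaps that follow the sitemaps.org 0.9 schema.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, List, Optional, Set

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemaps_from_robots(robots_txt: str) -> List[str]:
    """Return the URLs listed on `Sitemap:` lines of a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the URL's own colons survive.
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def crawl_sitemap(
    url: str,
    fetch: Callable[[str], str],
    seen: Optional[Set[str]] = None,
) -> Iterator[str]:
    """Yield page URLs, following nested sitemap indexes recursively."""
    seen = set() if seen is None else seen
    if url in seen:  # guard against sitemaps that reference each other
        return
    seen.add(url)
    root = ET.fromstring(fetch(url))
    is_index = root.tag == SITEMAP_NS + "sitemapindex"
    for loc in root.iter(SITEMAP_NS + "loc"):
        target = (loc.text or "").strip()
        if not target:
            continue
        if is_index:
            # A sitemap index points at further sitemaps: recurse into each.
            yield from crawl_sitemap(target, fetch, seen)
        else:
            # A plain urlset lists page URLs directly.
            yield target


def fetch_http(url: str) -> str:
    """Hypothetical fetcher: download a URL and decode it as UTF-8."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")
```

Passing the fetcher in as a parameter keeps the recursion testable without network access. A real crawl might start from `sitemaps_from_robots(fetch_http("https://example.com/robots.txt"))` and feed each result to `crawl_sitemap(url, fetch_http)`. Gzipped sitemaps, retries, and rate limiting are left out of this sketch.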

Stars: 1
Forks: 0
Commits: 77
Language: Python
Awesome lists: 0

Similar repositories

apify/crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

9134 stars
Python · 1 awesome list

s0rg/crawley

The unix-way web crawler

340 stars
Go · 2 awesome lists

firecrawl/firecrawl

The API to search, scrape, and interact with the web at scale. 🔥

127429 stars
TypeScript · 1 awesome list

scrapy/scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

62066 stars
Python · 2 awesome lists

Tracked growth

1 capture since 2026-06-02

Latest capture 2026-06-02 07:12

Stars history (chart of total stars)

Commits history (chart of default-branch commits)

Detected stack

Frameworks and tools

  • No framework dependencies detected.
  • Build and packaging tools: PEP 517, pip

Dependency files

  • pyproject.toml · python · 9 dependencies
  • requirements.txt · python · 2 dependencies

Metadata

AI development signals

No AI development config files detected.

Appears in

  • No awesome list links recorded.