chfoo/warcat
Tool and library for handling Web ARChive (WARC) files.
A tool for detecting viruses and NSFW material in WARC files
A collection of tools for archiving and analysing the internet.
Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)
Streaming WARC/ARC library for fast web archive IO
WarcDB: Web crawl data as SQLite databases.
Please note that the warc-indexer tool and code are now supported by NetArchiveSuite. The 'warc-indexer' directory and code in this repo are kept for reference only. For support and issues concerning 'warc-indexer', please contact NetArchiveSuite.