helgeho/Web2Warc
An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)
Please note that the warc-indexer tool and code are now supported by NetArchiveSuite. The 'warc-indexer' directory and code in this repo are kept for reference only. For support and issues concerning 'warc-indexer', please contact NetArchiveSuite.
A search interface and wayback machine for the UKWA Solr-based warc-indexer framework.
A Rails engine supporting the discovery of web archives.
Streaming WARC/ARC library for fast web archive IO
WarcDB: Web crawl data as SQLite databases.
Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)