helgeho/WarcPartitioner

Partition (W)ARC Files by MIME Type and Year

  • Stars: 1
  • Forks: 1
  • Commits: 2
  • Language: Java
  • Awesome lists: 1

Similar repositories

helgeho/HadoopConcatGz

A Splittable Hadoop InputFormat for Concatenated GZIP Files and *.(w)arc.gz

9 stars, Java, 1 awesome list

helgeho/Web2Warc

An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)

26 stars, Scala, 1 awesome list

internetarchive/warctools

Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)

174 stars, Python, 1 awesome list

chfoo/warcat

Tool and library for handling Web ARChive (WARC) files.

165 stars, Python, 1 awesome list

iipc/jwarc

Java library for reading and writing WARC files with a typed API

59 stars, Java, 1 awesome list

webrecorder/warcio

Streaming WARC/ARC library for fast web archive IO

458 stars, Python, 1 awesome list

Tracked growth

2 captures since 2026-05-23

Latest capture 2026-05-31 03:01

Stars history (chart: total stars)

Commits history (chart: default branch commits)

Metadata

  • Created: 2017-02-13
  • First commit: 2017-02-13
  • Last pushed: 2017-02-13
  • Archived: no
  • Stack detected: —
  • License: MIT

AI development signals

No AI development config files detected.