github Active

Repository profile

ocrmypdf/OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched.

Stars: 33,959
Forks: 2,343
Watchers: 189
Issues: 101
Commits: 4,377
Awesome lists: 1

Activity and growth

Tracked growth, recent movement, and commit velocity from stored repository snapshots.

Latest capture 2026-06-24 13:17

Star growth, last 7 days: 0 (0.0%)
Commit velocity, last 7 days: 0 (0.0%)
Stars since baseline: 0
Snapshot coverage: 1

Tracked growth

1 capture since 2026-06-24

Stars from baseline 0

Charts (time horizon: all tracked data): Stars history (total stars); Commits history (default branch commits).

Detected stack

Frameworks, package managers, ecosystems, and dependency manifests found during catalog scans.

Scanned 2026-06-24 13:17

Stack signals: 2
Package managers: 1
Manifest files: 2
Dependencies: 38

Frameworks and tools

  • pytest test framework · high confidence
  • Streamlit app framework · high confidence
  • uv package manager · python

Dependency files

2 manifests
  • pyproject.toml python ecosystem, 38 dependencies
  • uv.lock python ecosystem, 0 dependencies

Classification

Searchable topics, generated tags, and stack labels that explain where this repository fits.

Topics: 5
Tags: 0
Stacks: 2

Generated tags

No generated tags yet.

Stack labels

AI development signals

Agent instructions and tool configuration paths found in the repository tree.

0 paths
No AI development config files detected.

Similar repositories

Nearest indexed repositories by embedding similarity.

JaidedAI/EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

Python · 29,647 stars · 1 awesome list

yfedoseev/pdf_oxide

The fastest PDF library for Python and Rust. Text extraction, image extraction, markdown conversion, PDF creation & editing. 0.8ms mean, 5× faster than industry leaders, 100% pass rate on 3,830 PDFs. MIT/Apache-2.0.

Rust · 846 stars · 2 awesome lists

pikepdf/pikepdf

A Python library for reading and writing PDF, powered by QPDF

Python · 2,746 stars · 1 awesome list

AryanBV/pdf-toolkit-mcp

Write-capable PDF toolkit for any MCP client: 22 tools to read, create, render, encrypt, and transform PDFs. Vision rendering for scans, form-preserving merge and split, AES-256, zero native dependencies.

TypeScript · 7 stars · 1 awesome list

icereed/paperless-gpt

Use LLMs and LLM Vision (OCR) to handle paperless-ngx - Document Digitalization powered by AI

Go · 2,442 stars · 2 awesome lists

Metadata

Language: Python
License: MPL-2.0
Default branch: main
Created: 2013-12-20
First commit: 2013-04-09
Last pushed: 2026-06-22
GitHub updated: 2026-06-24
Last synced: 2026-06-24 13:17
Stack detected: 2026-06-24 13:17
Archived: no
Website: http://ocrmypdf.readthedocs.io/

Appears in

1