icereed/paperless-gpt
Use LLMs and LLM Vision (OCR) to handle paperless-ngx - Document Digitalization powered by AI
Legacy Python library for Agentic Document Extraction (ADE). Use the landingai-ade library for all new projects.
Get your documents ready for gen AI
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Fast Python web crawler for RAG and AI ingestion. Extracts clean Markdown from any site for LLMs and vector stores.
A polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office documents, images, and 97+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, R, C, and TypeScript (Node/Bun/Wasm/Deno), or use it via CLI, REST API, or MCP server.
Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
pyproject.toml · python · 40 dependencies
poetry.lock · python · 116 dependencies