High-accuracy document parsing for PDFs, Office files, images, and web pages into Markdown or JSON
Other
- Python
- Dockerfile

About MinerU
MinerU converts complex documents into structured Markdown or JSON for LLM, RAG, and agent workflows. It handles PDF, DOCX, PPTX, XLSX, images, and web pages, including scanned documents, handwriting, and multi-column layouts, and it merges tables that span multiple pages.
It pairs a vision-language model (VLM) with an OCR engine that recognizes 109 languages. Output follows human reading order, with headers and footers removed automatically, formulas converted to LaTeX, and tables converted to HTML. DOCX, PPTX, and XLSX files are parsed natively, and MinerU provides an MCP Server plus native integrations for LangChain, Dify, and FastGPT.
MinerU is available as a web version, a desktop client, and an API. It supports private, fully offline deployment and runs on a range of GPUs and AI accelerators. The project is released under the MinerU Open Source License, which is based on Apache 2.0.
Key features
- PDF, DOCX, PPTX, XLSX, images, and web pages to Markdown or JSON
- VLM plus OCR dual engine with 109-language recognition
- Formulas to LaTeX and tables to HTML
- Scanned docs, handwriting, multi-column layouts, and cross-page table merging
- Native MCP Server and LangChain, Dify, FastGPT integration
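Because formulas come out as LaTeX and tables as HTML, a downstream RAG pipeline may want to separate them from the surrounding prose before chunking. The sketch below uses only the Python standard library and does not call MinerU itself. It assumes, without confirmation from the description above, that display formulas are delimited by `$$ ... $$` and that tables appear as inline `<table>` elements in the Markdown. `split_blocks` and `SAMPLE` are hypothetical names used for illustration.

```python
import re

# Assumed conventions (not confirmed by the source): display formulas are
# delimited by $$...$$ and tables appear as inline <table>...</table> HTML.
FORMULA_RE = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)
TABLE_RE = re.compile(r"<table\b.*?</table>", re.DOTALL | re.IGNORECASE)


def split_blocks(markdown: str) -> dict:
    """Separate formulas and tables from the surrounding prose."""
    formulas = [m.strip() for m in FORMULA_RE.findall(markdown)]
    tables = TABLE_RE.findall(markdown)
    # Remove the extracted blocks so that only prose remains.
    prose = TABLE_RE.sub("", FORMULA_RE.sub("", markdown))
    # Collapse the blank lines left behind by the removed blocks.
    prose = re.sub(r"\n{3,}", "\n\n", prose).strip()
    return {"formulas": formulas, "tables": tables, "prose": prose}


# Hypothetical sample standing in for parsed Markdown output.
SAMPLE = """# Results

Energy is given by

$$
E = mc^2
$$

<table><tr><td>a</td><td>b</td></tr></table>

End of section."""

blocks = split_blocks(SAMPLE)
```

Keeping tables and formulas as separate items lets each be indexed or rendered with a handler suited to its format, while the prose is chunked as plain text. If the actual output uses different delimiters, only the two patterns need to change.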
Details
- First released: 2024
- Platforms: Web · Windows · macOS · Linux
- Deployment: Self-hostable · Docker · Offline-first
- OCR: 109 languages
- Output: Markdown · JSON
- License: MinerU Open Source License
