Document to Markdown: Docling vs MarkItDown vs Marker
AI-Ready Markdown: Comparison of Document Converters for Generative Applications
Summary of the three document parsing tools:
- File format support: Marker mainly targets PDFs and images; Docling and MarkItDown extend to DOCX, XLSX, PPTX, HTML, and more (MarkItDown even supports audio and YouTube).
- OCR: Docling and Marker both do well.
- Tables: Docling and Marker do well; MarkItDown loses the table format (integration with Azure Document Intelligence may help).
- Images: Docling and Marker do well; MarkItDown mainly outputs plain text (integration with Azure Document Intelligence may help).
LlamaIndex integration: DoclingReader and MarkdownNodeParser.
Docling's PdfPipelineOptions covers Tesseract OCR, embedded (base64) images, and figure export.
Four levels of PDF parsing difficulty: a normal PDF with a table, a scanned PDF with a table, a scanned PDF with more complex tables, and a PDF with mixed content such as text, images, and tables.
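The PdfPipelineOptions setup mentioned above can be sketched roughly as follows. This is a minimal sketch, assuming Docling v2 and a `tesseract` binary on the PATH; the function name `convert_pdf_with_ocr` and the `images_scale` value are my own choices, and the option names are as I recall them, so check them against your installed version.

```python
from pathlib import Path


def convert_pdf_with_ocr(pdf_path: str) -> str:
    """Convert a PDF to Markdown with OCR, table structure, and embedded images."""
    # Imported lazily so this file can be loaded without Docling installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import (
        PdfPipelineOptions,
        TesseractCliOcrOptions,
    )
    from docling.document_converter import DocumentConverter, PdfFormatOption
    from docling_core.types.doc import ImageRefMode

    options = PdfPipelineOptions()
    options.do_ocr = True                           # run OCR on scanned pages
    options.do_table_structure = True               # recover table rows and columns
    options.ocr_options = TesseractCliOcrOptions()  # assumes tesseract is installed
    options.generate_picture_images = True          # keep figure images for export
    options.images_scale = 2.0                      # assumed value, tune as needed

    converter = DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=options)}
    )
    result = converter.convert(Path(pdf_path))
    # EMBEDDED writes images inline as base64 data URIs.
    return result.document.export_to_markdown(image_mode=ImageRefMode.EMBEDDED)
```

Calling `convert_pdf_with_ocr("scanned.pdf")` (a placeholder path) would return one Markdown string with the figures inlined, which is convenient for a quick look but can make the output large.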
- 🗂️ Parsing of multiple document formats incl. PDF, DOCX, XLSX, HTML, images, and more
- 📑 Advanced PDF understanding incl. page layout, reading order, table structure, code, formulas, image classification, and more
- 🧬 Unified, expressive DoclingDocument representation format
- ↪️ Various export formats and options, including Markdown, HTML, and lossless JSON
- 🔒 Local execution capabilities for sensitive data and air-gapped environments
- 🤖 Plug-and-play integrations incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
- 🔍 Extensive OCR support for scanned PDFs and images
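For the LlamaIndex integration, a minimal sketch of DoclingReader feeding MarkdownNodeParser might look like this. It assumes the `llama-index-readers-docling` and `llama-index-core` packages are installed; the function name `load_markdown_nodes` is hypothetical.

```python
def load_markdown_nodes(file_path: str):
    """Parse a document with Docling and split the Markdown into LlamaIndex nodes."""
    # Imported lazily so this file can be loaded without LlamaIndex installed.
    from llama_index.core.node_parser import MarkdownNodeParser
    from llama_index.readers.docling import DoclingReader

    reader = DoclingReader()  # exports Markdown by default, as far as I know
    documents = reader.load_data(file_path=file_path)

    # MarkdownNodeParser splits on Markdown headings, so each node maps to a section.
    parser = MarkdownNodeParser()
    return parser.get_nodes_from_documents(documents)
```

The resulting nodes can then be passed to an index of your choice; that part is left out here because it depends on the embedding model and vector store in use.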
MarkItDown is a utility for converting various files to Markdown (e.g., for indexing, text analysis, etc). It supports:
- PowerPoint
- Word
- Excel
- Images (EXIF metadata and OCR)
- Audio (EXIF metadata and speech transcription)
- HTML
- Text-based formats (CSV, JSON, XML)
- ZIP files (iterates over contents)
- … and more!
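Basic MarkItDown usage is short. The sketch below follows the pattern from the project's README as I remember it; the wrapper name `markitdown_to_markdown` is my own, and the `markitdown` package is assumed to be installed.

```python
def markitdown_to_markdown(file_path: str) -> str:
    """Convert a supported file (DOCX, XLSX, PPTX, HTML, ...) to Markdown text."""
    # Imported lazily so this file can be loaded without MarkItDown installed.
    from markitdown import MarkItDown

    converter = MarkItDown()
    result = converter.convert(file_path)
    return result.text_content  # the converted Markdown as a single string
```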
Compared with Nougat, Marker is faster, more accurate, and preserves more of the original formatting.
Marker converts PDFs and images to markdown, JSON, and HTML quickly and accurately.
- Supports a range of documents in all languages
- Formats tables, forms, equations, inline math, links, references, and code blocks
- Extracts and saves images
- Removes headers/footers/other artifacts
- Extensible with your own formatting and logic
- Optionally boost accuracy with LLMs
- Works on GPU, CPU, or MPS
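Marker can also be called from Python. This is a minimal sketch based on the Python API shown in the Marker README for the 1.x releases, as far as I recall it; the wrapper name `marker_to_markdown` is hypothetical, and the first call downloads the models.

```python
def marker_to_markdown(pdf_path: str):
    """Convert a PDF with Marker and return (markdown_text, images)."""
    # Imported lazily so this file can be loaded without Marker installed.
    from marker.converters.pdf import PdfConverter
    from marker.models import create_model_dict
    from marker.output import text_from_rendered

    # create_model_dict() loads the layout, OCR, and table models.
    converter = PdfConverter(artifact_dict=create_model_dict())
    rendered = converter(pdf_path)

    # text_from_rendered returns the text, a metadata value, and extracted images.
    text, _, images = text_from_rendered(rendered)
    return text, images
```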
Comparison
Comparison of Docling, MarkItDown, and Marker on table handling and processing time. In my experience the speed ranking is MarkItDown > Docling >> Marker, with MarkItDown the fastest and Marker by far the slowest.
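The speed ranking is an informal impression. To measure it on your own documents, a small timing harness like the one below may help. It is a sketch that treats each tool as a plain callable taking a file path; the function name `time_converters` and the dictionary keys are placeholders, not part of any of the three libraries.

```python
import time
from typing import Callable, Dict


def time_converters(
    converters: Dict[str, Callable[[str], object]],
    path: str,
    repeats: int = 1,
) -> Dict[str, float]:
    """Return the best wall-clock time in seconds for each converter on one file."""
    timings: Dict[str, float] = {}
    for name, convert in converters.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            convert(path)  # the output is discarded; only the duration matters here
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings
```

Model loading dominates the first run for Docling and Marker, so it is worth deciding up front whether to time a cold start or to build each converter once and time only the conversion call.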
Test
Original document, table
Docling table
Preserves table format
TABLE I. Benchmark instances used in this work . Apart from the the number of vertices, edges, and edge weights, we also include the type of graph as well as its use.
| Graph | m | | E | | W ij | Type | Use |
|----------|------|---------|--------|----------------|------------|
| pm3-8-50 | 512 | 1536 | ± 1 | 3 D torus grid | Experiment |
| G1 | 800 | 19176 | 1 | random | Experiment |
| G14 | 800 | 4694 | 1 | planar | Numerics |
| G23 | 2000 | 19990 | 1 | random | Numerics |
| G35 | 2000 | 11778 | 1 | planar | Experiment |
| G60 | 7000 | 17148 | 1 | random | Numerics |
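A quick way to check mechanically whether a converter kept a table intact is to count the cells in each row. The helper below is my own sketch (the names `row_cell_counts` and `is_rectangular` are made up for illustration), and it does not handle escaped pipes. On the Docling output above it reports 8 cells for the header and 6 for every other row, because the `|E|` in the header was emitted with literal pipe characters.

```python
from typing import List


def row_cell_counts(table_text: str) -> List[int]:
    """Count the cells in each pipe-delimited row of a Markdown table."""
    counts = []
    for line in table_text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip captions and other non-table lines
        # Drop the outer pipes, then split on the inner ones.
        inner = line[1:-1] if line.endswith("|") else line[1:]
        counts.append(len(inner.split("|")))
    return counts


def is_rectangular(table_text: str) -> bool:
    """True if the table has at least one row and every row has the same cell count."""
    return len(set(row_cell_counts(table_text))) == 1


docling_rows = """
| Graph | m | | E | | W ij | Type | Use |
|----------|------|---------|--------|----------------|------------|
| G1 | 800 | 19176 | 1 | random | Experiment |
"""
print(row_cell_counts(docling_rows))  # [8, 6, 6]
print(is_rectangular(docling_rows))   # False
```

So the data rows are preserved, but the header would need its `|E|` pipes escaped before a strict Markdown renderer lines the columns up.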
MarkItDown table
Loses table format
Graph
pm3-8-50
G1
G14
G23
G35
G60
|E| Wij
m
1536 ±1
512
1
19176
800
1
800
4694
1
2000 19990
1
2000 11778
1
7000 17148
Type
Use
3D torus grid Experiment
Experiment
Numerics
Numerics
Experiment
Numerics
random
planar
random
planar
random
TABLE I. Benchmark instances used in this work.
Apart from the the number of vertices, edges, and edge
weights, we also include the type of graph as well as its use.
Marker table
| Graph | m | E | Wij | Type | Use |
|----------|------|-------|-----|---------------|------------|
| pm3-8-50 | 512 | 1536 | ±1 | 3D torus grid | Experiment |
| G1 | 800 | 19176 | 1 | random | Experiment |
| G14 | 800 | 4694 | 1 | planar | Numerics |
| G23 | 2000 | 19990 | 1 | random | Numerics |
| G35 | 2000 | 11778 | 1 | planar | Experiment |
| G60 | 7000 | 17148 | 1 | random | Numerics |
| | | | | | |
<span id="page-6-2"></span>TABLE I. Benchmark instances used in this work. Apart from the the number of vertices, edges, and edge weights, we also include the type of graph as well as its use.
The Marker log shows that it loads several models:
layout model datalab-to/surya_layout on device cpu with dtype torch.float32
Loaded texify model datalab-to/texify on device cpu with dtype torch.float32
Loaded recognition model vikp/surya_rec2 on device cpu with dtype torch.float32
Loaded table recognition model datalab-to/surya_tablerec on device cpu with dtype torch.float32
Loaded detection model vikp/surya_det3 on device cpu with dtype torch.float32
Loaded detection model datalab-to/inline_math_det0 on device cpu with dtype torch.float32
Appendix
Docling, Unstructured.io, LlamaParse
Commercial
Unstructured.io (also has an open-source version)
LlamaParse