This folder contains a series of progressive examples demonstrating Docling's capabilities for document processing, from simple PDF conversion to advanced hybrid chunking for RAG systems.
Docling is a powerful document processing library that handles complex document formats that are typically challenging for RAG (Retrieval Augmented Generation) systems. Without Docling, you'd need to implement custom OCR, layout analysis, table extraction, and format-specific parsers. Docling handles all of this out of the box.
Key Advantages:
01_simple_pdf.py
What it demonstrates:
Key concepts:
DocumentConverter - The main entry point
export_to_markdown() - Standard output format
Run it:
python 01_simple_pdf.py
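As a rough illustration of the two concepts above, a minimal conversion might look like the sketch below. `DocumentConverter` and `export_to_markdown()` come from this tutorial; the helper names, the `output` directory, and the example path are assumptions for illustration and need not match what `01_simple_pdf.py` actually does.

```python
from pathlib import Path


def markdown_output_path(source: str, out_dir: str = "output") -> Path:
    """Map a source document path to a Markdown file path (hypothetical layout)."""
    return Path(out_dir) / f"{Path(source).stem}.md"


def convert_pdf_to_markdown(source: str) -> str:
    """Convert one document with Docling and return its Markdown text."""
    # Imported lazily so this module loads even where Docling is not installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    return result.document.export_to_markdown()
```

Calling `convert_pdf_to_markdown("../documents/technical-architecture-guide.pdf")` and writing the result to the path returned by `markdown_output_path(...)` would mirror the basic flow. The first run may take noticeably longer if Docling needs to fetch its models.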
What this covers:
02_multiple_formats.py
What it demonstrates:
Key concepts:
Run it:
python 02_multiple_formats.py
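One way to picture multi-format processing is a single converter reused across files of different types. The sketch below assumes the sample file types found in the `documents/` folder (PDF, DOCX, Markdown); `SAMPLE_EXTENSIONS`, `group_by_extension`, and `convert_all` are hypothetical names, and the actual script may be organized differently.

```python
from collections import defaultdict
from pathlib import Path

# Extensions chosen to match the sample files in documents/.
# This is an assumption for the sketch, not Docling's full list of formats.
SAMPLE_EXTENSIONS = {".pdf", ".docx", ".md"}


def group_by_extension(paths):
    """Group paths by lowercase extension, skipping types outside SAMPLE_EXTENSIONS."""
    groups = defaultdict(list)
    for path in map(Path, paths):
        ext = path.suffix.lower()
        if ext in SAMPLE_EXTENSIONS:
            groups[ext].append(path)
    return dict(groups)


def convert_all(paths):
    """Convert every selected file with one shared converter; returns {file name: markdown}."""
    # Imported lazily so the grouping helper works without Docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    markdown = {}
    for files in group_by_extension(paths).values():
        for path in files:
            markdown[path.name] = converter.convert(path).document.export_to_markdown()
    return markdown
```

Reusing one `DocumentConverter` instance, rather than creating one per file, avoids repeating its setup work for every document.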
What this covers:
03_audio_transcription.py
What it demonstrates:
Key concepts:
AsrPipeline - Audio processing pipeline
Prerequisites: FFmpeg must be installed:
Windows (Chocolatey):
choco install ffmpeg
Windows (Conda):
conda install -c conda-forge ffmpeg
macOS:
brew install ffmpeg
Linux:
apt-get install ffmpeg # Debian/Ubuntu
yum install ffmpeg # RedHat/CentOS
Run it:
python 03_audio_transcription.py
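Because transcription depends on FFmpeg, it can help to check for it before starting a conversion. The sketch below pairs such a check with an `AsrPipeline` setup. The module paths, `AsrPipelineOptions`, `AudioFormatOption`, and the `WHISPER_TURBO` preset are assumptions about the installed Docling version; the function names are hypothetical.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is on PATH."""
    return shutil.which("ffmpeg") is not None


def transcribe(audio_path: str) -> str:
    """Transcribe an audio file to Markdown via Docling's ASR pipeline (sketch)."""
    if not ffmpeg_available():
        raise RuntimeError("FFmpeg not found on PATH; install it first.")

    # Lazy imports; these module paths are assumptions about the Docling version in use.
    from docling.datamodel import asr_model_specs
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import AsrPipelineOptions
    from docling.document_converter import AudioFormatOption, DocumentConverter
    from docling.pipeline.asr_pipeline import AsrPipeline

    options = AsrPipelineOptions()
    options.asr_options = asr_model_specs.WHISPER_TURBO  # assumed model preset

    converter = DocumentConverter(
        format_options={
            InputFormat.AUDIO: AudioFormatOption(
                pipeline_cls=AsrPipeline,
                pipeline_options=options,
            )
        }
    )
    return converter.convert(audio_path).document.export_to_markdown()
```

Failing early with a clear message is usually easier to diagnose than an error raised from deep inside the audio pipeline.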
What this covers:
04_hybrid_chunking.py
What it demonstrates:
Key concepts:
HybridChunker - Structure + token-aware chunking
Why Hybrid Chunking?
Run it:
python 04_hybrid_chunking.py
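To make "structure + token-aware" concrete, the sketch below has two parts. `toy_hybrid_chunks` is a deliberately simplified illustration that splits on Markdown headings and then caps each piece by word count; it is not Docling's algorithm, and words stand in for real tokenizer tokens. `docling_chunks` shows how `HybridChunker` might be called, assuming a recent Docling release where `chunk(dl_doc=...)` and `contextualize(chunk=...)` are available. Both function names are hypothetical.

```python
import re


def toy_hybrid_chunks(markdown: str, max_tokens: int = 50):
    """Toy illustration only: respect heading boundaries, then cap chunk size."""
    chunks = []
    heading = ""
    words = []

    def flush():
        # Emit the buffered words in windows of at most max_tokens words.
        for start in range(0, len(words), max_tokens):
            text = " ".join(words[start:start + max_tokens])
            chunks.append({"heading": heading, "text": text})
        words.clear()

    for line in markdown.splitlines():
        match = re.match(r"#+\s+(.*)", line)
        if match:
            flush()  # structure boundary: never mix two sections in one chunk
            heading = match.group(1).strip()
        else:
            words.extend(line.split())
    flush()
    return chunks


def docling_chunks(source: str):
    """Chunk a document with Docling's HybridChunker using its defaults (sketch)."""
    # Lazy imports so the toy function above runs without Docling installed.
    from docling.chunking import HybridChunker
    from docling.document_converter import DocumentConverter

    document = DocumentConverter().convert(source).document
    chunker = HybridChunker()  # default tokenizer and token limit
    return [chunker.contextualize(chunk=chunk) for chunk in chunker.chunk(dl_doc=document)]
```

The toy version shows why both signals matter: heading boundaries keep unrelated sections apart, while the size cap keeps each chunk within an embedding model's limit.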
What this covers:
Beyond these tutorials, Docling offers additional capabilities for even more robust document processing:
Add vision-based understanding to your PDFs:
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import (
    PdfPipelineOptions,
    granite_picture_description,
)
from docling.document_converter import DocumentConverter, PdfFormatOption

# Configure picture description for PDFs
pipeline_options = PdfPipelineOptions()
pipeline_options.do_picture_description = True
pipeline_options.picture_description_options = granite_picture_description

# Apply the options to PDF inputs only
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
    }
)
Benefits:
Enhanced processing for technical documents with code:
from docling.datamodel.pipeline_options import PdfPipelineOptions

pipeline_options = PdfPipelineOptions()
pipeline_options.do_code_enrichment = True  # Enables code syntax understanding
Benefits:
Advanced table parsing with TableFormer:
from docling.datamodel.pipeline_options import PdfPipelineOptions, TableFormerMode

pipeline_options = PdfPipelineOptions()
pipeline_options.table_structure_options.mode = TableFormerMode.ACCURATE
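The three option snippets in this section can, in principle, be combined on one `PdfPipelineOptions` object. The sketch below is one possible arrangement under that assumption; `ENRICHMENT_FLAGS`, `apply_flags`, and `build_enriched_converter` are hypothetical names, and enabling every enrichment at once is likely to slow conversion.

```python
# Flag names taken from the snippets in this section; grouping them in a dict is illustrative.
ENRICHMENT_FLAGS = {
    "do_picture_description": True,
    "do_code_enrichment": True,
}


def apply_flags(options, flags):
    """Set each flag on the options object, rejecting names it does not define."""
    for name, value in flags.items():
        if not hasattr(options, name):
            raise AttributeError(f"unknown pipeline option: {name}")
        setattr(options, name, value)
    return options


def build_enriched_converter(flags=ENRICHMENT_FLAGS, accurate_tables=True):
    """Build a converter with the selected PDF enrichments enabled (sketch)."""
    # Lazy imports so apply_flags can be used without Docling installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import (
        PdfPipelineOptions,
        TableFormerMode,
        granite_picture_description,
    )
    from docling.document_converter import DocumentConverter, PdfFormatOption

    options = apply_flags(PdfPipelineOptions(), flags)
    if flags.get("do_picture_description"):
        options.picture_description_options = granite_picture_description
    if accurate_tables:
        options.table_structure_options.mode = TableFormerMode.ACCURATE
    return DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=options)}
    )
```

Rejecting unknown flag names in `apply_flags` guards against typos that would otherwise be silently ignored or fail later.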
Benefits:
These tutorials demonstrate the building blocks used in the main RAG agent:
ingestion/ingest.py
Progression Flow:
docling_basics/ tutorials
All examples require Docling and its dependencies:
# Install base Docling
pip install docling
# For ASR and hybrid chunking (examples 3 and 4)
pip install transformers openai-whisper hf-xet
# OR install everything at once
pip install docling transformers openai-whisper hf-xet
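After installing, a quick check can confirm which packages are importable before running the examples. The sketch below uses only the standard library. The mapping from pip package names to import names is an assumption (for instance, `openai-whisper` is imported as `whisper`), and `missing_packages` is a hypothetical helper.

```python
from importlib.util import find_spec

# pip package name -> import module name (assumed mapping for this sketch).
REQUIRED_MODULES = {
    "docling": "docling",
    "transformers": "transformers",
    "openai-whisper": "whisper",
    "hf-xet": "hf_xet",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the pip package names whose import module cannot be found."""
    return sorted(pkg for pkg, module in modules.items() if find_spec(module) is None)
```

An empty list from `missing_packages()` suggests all four packages are installed in the active environment; any names it returns can be passed to `pip install`.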
The documents/ folder (one level up) contains example files:
technical-architecture-guide.pdf, q4-2024-business-review.pdf, client-review-globalfinance.pdf
meeting-notes-2025-01-08.docx, meeting-notes-2025-01-15.docx
company-overview.md, team-handbook.md, mission-and-goals.md, implementation-playbook.md
Recording1.mp3, Recording2.mp3, Recording3.mp3, Recording4.mp3
docling-rag-agent/
├── docling_basics/              # This folder - Tutorial scripts
│   ├── 01_simple_pdf.py
│   ├── 02_multiple_formats.py
│   ├── 03_audio_transcription.py
│   ├── 04_hybrid_chunking.py
│   └── README.md
├── documents/                   # Source documents (examples provided)
│   ├── technical-architecture-guide.pdf
│   ├── q4-2024-business-review.pdf
│   ├── meeting-notes-2025-01-08.docx
│   ├── company-overview.md
│   ├── Recording1.mp3
│   └── ... (more files)
└── ... (main RAG agent files)
Recommended Order:
01_simple_pdf.py
02_multiple_formats.py
03_audio_transcription.py
04_hybrid_chunking.py
After completing these tutorials, you'll understand:
✅ Why Docling?
✅ When to Use Docling?
✅ How Docling Fits RAG?
Ready to build your own RAG system? Check out the main project files:
ingestion/ingest.py - Full ingestion pipeline
cli.py - Interactive CLI with streaming
rag_agent.py - RAG tool implementation
These tutorials provide the foundation. The main agent shows the complete picture! 🎯