An intelligent agent that provides conversational access to a knowledge base stored in PostgreSQL with PGVector. It uses RAG (Retrieval-Augmented Generation) to search embedded documents and return contextual, accurate responses with source citations. Supports multiple document formats, including audio files via Whisper transcription.
Inspired by the Docling RAG Agent from coleam00/ottomator-agents, adapted to run with local LLMs via Ollama instead of requiring OpenAI API keys.
A modern web interface is now available with:
# Start the web interface
uv run python web_app.py
Then open http://localhost:8000 in your browser.
See WEB_INTERFACE.md for full documentation.
Start with the tutorials! Check out the docling_basics/ folder for progressive examples that teach Docling fundamentals:
These tutorials provide the foundation for understanding how this full RAG agent works. → Go to Docling Basics
Choose how you want to interact:
| Interface | Best For | Command |
|---|---|---|
| Web Interface | Visual UI, file uploads, web crawling | uv run python web_app.py |
| CLI | Terminal workflows, SSH access | uv run python cli.py |
| Format | Extensions | Processing |
|---|---|---|
| PDF | .pdf | Docling conversion |
| Word | .docx, .doc | Docling conversion |
| PowerPoint | .pptx, .ppt | Docling conversion |
| Excel | .xlsx, .xls | Docling conversion |
| HTML | .html, .htm | Docling conversion |
| Markdown | .md, .markdown | Direct processing |
| Text | .txt | Direct processing |
| Audio | .mp3, .wav, .m4a | Whisper transcription |
ollama pull <model> (e.g., ollama pull mistral, ollama pull nomic-embed-text)

macOS:
# Install required libraries for audio/video processing
brew install opus opusfile
Linux (Ubuntu/Debian):
sudo apt-get install libopus0 libopusfile0
# Install dependencies using UV
uv sync
Copy .env.example to .env and configure your provider:
cp .env.example .env
DATABASE_URL - PostgreSQL connection string with PGVector extension
Examples:
- Local: postgresql://user:password@localhost:5432/dbname
- Supabase: postgresql://postgres.[project-ref]:[password]@aws-0-[region].pooler.supabase.com:5432/postgres
- Neon: postgresql://[user]:[password]@[endpoint].neon.tech/[dbname]

Option 1: Ollama (Local - Recommended)
OPENAI_API_KEY=ollama
OPENAI_BASE_URL=http://localhost:11434/v1
LLM_CHOICE=mistral # or llama3.2, qwen2.5, etc.
EMBEDDING_MODEL=nomic-embed-text
Available Ollama models:
- LLM: llama3.2, mistral, qwen2.5, deepseek-r1
- Embeddings: nomic-embed-text, mxbai-embed-large, qwen3-embedding

Option 2: OpenAI (Cloud)
OPENAI_API_KEY=sk-your-key-here
LLM_CHOICE=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-small
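Both options reduce to the same four environment variables. A minimal sketch of how they might be read, with the Ollama defaults shown above (the actual utils/providers.py may structure this differently):

```python
import os

def load_provider_config() -> dict:
    """Read provider settings from the environment.

    Illustrative only: defaults assume the Ollama setup from Option 1.
    """
    return {
        "api_key": os.getenv("OPENAI_API_KEY", "ollama"),
        "base_url": os.getenv("OPENAI_BASE_URL", "http://localhost:11434/v1"),
        "llm": os.getenv("LLM_CHOICE", "mistral"),
        "embedding_model": os.getenv("EMBEDDING_MODEL", "nomic-embed-text"),
    }
```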
You must set up your PostgreSQL database with the PGVector extension and create the required schema:
CREATE EXTENSION IF NOT EXISTS vector;
# In the SQL editor in Supabase/Neon, run:
sql/schema.sql
# Or using psql
psql $DATABASE_URL < sql/schema.sql
The schema file (sql/schema.sql) creates:
- documents table for storing original documents with metadata
- chunks table for text chunks with 768-dimensional embeddings
- match_chunks() function for vector similarity search

# Start the web server
uv run python web_app.py
Then open http://localhost:8000 in your browser.
Web Interface Features:
# Run the CLI agent
uv run python cli.py
CLI Commands:
- help - Show help information
- clear - Clear conversation history
- stats - Show session statistics
- exit or quit - Exit the CLI

Add your documents to the documents/ folder, then ingest:
# Ingest all documents in the documents/ folder
# NOTE: By default, this CLEARS existing data before ingestion
uv run python -m ingestion.ingest --documents documents/
# Adjust chunk size (default: 1000)
uv run python -m ingestion.ingest --documents documents/ --chunk-size 800
# Append without cleaning (keep existing data)
uv run python -m ingestion.ingest --documents documents/ --no-clean
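The --chunk-size flag controls how documents are split before embedding. A naive character-based sketch with overlap (the real ingestion/chunker.py uses Docling-aware chunking and will differ; this only illustrates the size/overlap idea):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks with overlapping boundaries.

    Overlap preserves context across chunk edges so a sentence split
    at a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```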
⚠️ Important: The ingestion process automatically deletes all existing documents and chunks from the database before adding new documents (unless --no-clean is used). This ensures a clean state and prevents duplicate data.
The ingestion pipeline will:
┌─────────────────────────────────────────────────────────────────┐
│                         USER INTERFACES                         │
│        ┌─────────────────────┐   ┌─────────────────────┐        │
│        │    Web Interface    │   │    CLI Interface    │        │
│        │  (FastAPI + HTML)   │   │   (Python async)    │        │
│        └──────────┬──────────┘   └──────────┬──────────┘        │
└───────────────────┼─────────────────────────┼───────────────────┘
                    │                         │
                    └────────────┬────────────┘
                                 │
               ┌─────────────────▼─────────────────┐
               │          RAG Agent Core           │
               │   ┌───────────────────────────┐   │
               │   │     PydanticAI Agent      │   │
               │   │ + search_knowledge_base() │   │
               │   └───────────────────────────┘   │
               └─────────────────┬─────────────────┘
                                 │
           ┌─────────────────────┼─────────────────────┐
           │                     │                     │
  ┌────────▼────────┐     ┌──────▼───────┐      ┌──────▼───────┐
  │   Embeddings    │     │     LLM      │      │  PostgreSQL  │
  │    (Ollama/     │     │   (Ollama/   │      │  + PGVector  │
  │     OpenAI)     │     │   OpenAI)    │      │              │
  └─────────────────┘     └──────────────┘      └──────────────┘
┌──────────────────┐     ┌─────────────────┐     ┌────────────────┐
│   Data Sources   │────▶│    Ingestion    │────▶│   Knowledge    │
│  • Local files   │     │    Pipeline     │     │  Base (PGVec)  │
│  • Web crawl     │     │    (Docling)    │     │                │
└──────────────────┘     └─────────────────┘     └───────┬────────┘
                                                         │
┌──────────────────┐     ┌─────────────────┐     ┌───────▼────────┐
│    User Query    │◀────│    RAG Agent    │◀────│    Semantic    │
│  (Web or CLI)    │     │   + Streaming   │     │     Search     │
└──────────────────┘     └─────────────────┘     └────────────────┘
Audio files are automatically transcribed using the OpenAI Whisper Turbo model:
How it works:
Benefits:
- Just add audio files to the documents/ folder and run ingestion

Model details:
- Model: openai/whisper-large-v3-turbo
- Output: [time: 0.0-4.0] Transcribed text here

Example transcript format:
[time: 0.0-4.0] Welcome to our podcast on AI and machine learning.
[time: 5.28-9.96] Today we'll discuss retrieval augmented generation systems.
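A transcript line in this format can be split back into its timestamps and text with a small regular expression. This is an illustrative helper, not part of the actual pipeline:

```python
import re

# Matches lines of the form "[time: 5.28-9.96] Some text".
LINE_RE = re.compile(r"\[time: (\d+(?:\.\d+)?)-(\d+(?:\.\d+)?)\]\s*(.*)")

def parse_transcript_line(line: str) -> tuple[float, float, str]:
    """Split a '[time: start-end] text' line into (start, end, text)."""
    m = LINE_RE.match(line)
    if not m:
        raise ValueError(f"Not a transcript line: {line!r}")
    return float(m.group(1)), float(m.group(2)), m.group(3)
```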
The main agent (rag_agent.py) that:
Function tool registered with the agent that:
Example tool definition:
async def search_knowledge_base(
    ctx: RunContext[None],
    query: str,
    limit: int = 5
) -> str:
    """Search the knowledge base using semantic similarity."""
    # Generate embedding for query
    # Search PostgreSQL with PGVector
    # Format and return results
documents: Stores original documents with metadata
- Columns: id, title, source, content, metadata, created_at, updated_at

chunks: Stores text chunks with vector embeddings
- Columns: id, document_id, content, embedding (vector(1536)), chunk_index, metadata, token_count

match_chunks(): PostgreSQL function for vector similarity search
- Uses cosine similarity: 1 - (embedding <=> query_embedding)

db_pool = await asyncpg.create_pool(
    DATABASE_URL,
    min_size=2,
    max_size=10,
    command_timeout=60
)
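For reference, pgvector's <=> operator used by match_chunks() computes cosine *distance*, so 1 - distance gives the cosine similarity score returned to the agent. In plain Python, the underlying computation is:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors: dot(a, b) / (|a| * |b|).

    This is what 1 - (embedding <=> query_embedding) evaluates to in
    pgvector, since <=> returns cosine distance.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```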
The embedder includes built-in caching for frequently searched queries, reducing API calls and latency.
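A minimal sketch of such a query cache, assuming a content-hash key and no eviction (the real ingestion/embedder.py may key and evict entries differently):

```python
import hashlib

class CachingEmbedder:
    """Cache embeddings so repeated queries skip the embedding API."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn          # e.g. a call to the embeddings API
        self._cache: dict[str, list[float]] = {}
        self.calls = 0                     # counts actual API invocations

    def embed(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode()).hexdigest()
        if key not in self._cache:
            self.calls += 1                # only cache misses hit the API
            self._cache[key] = self._embed_fn(text)
        return self._cache[key]
```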
Token-by-token streaming provides immediate feedback to users while the LLM generates responses:
async with agent.run_stream(user_input, message_history=history) as result:
    async for text in result.stream_text(delta=False):
        print(f"\rAssistant: {text}", end="", flush=True)
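With delta=False, each yielded value is the accumulated text so far rather than the newest token, which is why the snippet can redraw the whole line with \r. A self-contained simulation of that behaviour (fake_token_stream stands in for the LLM; this is not the PydanticAI API):

```python
import asyncio

async def fake_token_stream():
    # Stand-in for an LLM token stream.
    for token in ["Hel", "lo, ", "world", "!"]:
        yield token

async def stream_text(delta: bool):
    """Yield raw deltas (delta=True) or the accumulated text (delta=False)."""
    acc = ""
    async for token in fake_token_stream():
        acc += token
        yield token if delta else acc

async def collect():
    return [t async for t in stream_text(delta=False)]

snapshots = asyncio.run(collect())
```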
# Start all services
docker-compose up -d
# Ingest documents
docker-compose --profile ingestion up ingestion
# View logs
docker-compose logs -f rag-agent
async def search_knowledge_base(
    ctx: RunContext[None],
    query: str,
    limit: int = 5
) -> str:
    """
    Search the knowledge base using semantic similarity.

    Args:
        query: The search query to find relevant information
        limit: Maximum number of results to return (default: 5)

    Returns:
        Formatted search results with source citations
    """
-- Vector similarity search
SELECT * FROM match_chunks(
    '[...]'::vector(1536),  -- query_embedding
    10,                     -- match_count
    0.7                     -- similarity_threshold (default: 0.7)
);
Returns chunks with:
- id: Chunk UUID
- content: Text content
- embedding: Vector embedding
- similarity: Cosine similarity score (0-1)
- document_title: Source document title
- document_source: Source document path

docling-rag-agent/
├── cli.py                      # Enhanced CLI with colors and features
├── rag_agent.py                # Basic CLI agent with PydanticAI
├── web_app.py                  # FastAPI web interface server ← NEW
│
├── web/                        # Web interface frontend ← NEW
│   └── index.html              # Single-page application (HTML/CSS/JS)
│
├── ingestion/
│   ├── ingest.py               # Document ingestion pipeline
│   ├── embedder.py             # Embedding generation with caching
│   └── chunker.py              # Document chunking logic
│
├── web_crawler/                # Web scraping utilities ← NEW
│   ├── 1-crawl_single_page.py
│   ├── 2-crawl_docs_sequential.py
│   ├── 3-crawl_sitemap_in_parallel.py
│   ├── 4-crawl_llms_txt.py
│   ├── 5-crawl_site_recursively.py
│   └── _crawl_utils.py         # Shared utilities for web app
│
├── docling_basics/             # Docling tutorials
│   ├── 01_simple_pdf.py
│   ├── 02_multiple_formats.py
│   ├── 03_audio_transcription.py
│   └── 04_hybrid_chunking.py
│
├── utils/
│   ├── providers.py            # OpenAI/Ollama model/client configuration
│   ├── db_utils.py             # Database connection pooling
│   └── models.py               # Pydantic models for config
│
├── sql/
│   ├── schema.sql              # PostgreSQL schema with PGVector
│   ├── backup.sh               # Database backup script
│   └── restore.sh              # Database restore script
│
├── documents/                  # Sample documents for ingestion
├── pyproject.toml              # Project dependencies
├── .env.example                # Environment variables template
│
├── README.md                   # This file
├── WEB_INTERFACE.md            # Web interface documentation ← NEW
└── DATA_PIPELINE.md            # Data collection guide ← NEW
| Document | Description |
|---|---|
| README.md | Main project documentation |
| WEB_INTERFACE.md | Web interface usage guide |
| DATA_PIPELINE.md | Data collection pipeline guide |
| docling_basics/README.md | Docling tutorials |
unsupported operand type(s) for |

Error:
TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'
Cause: You're using Python 3.9, but crawl4ai requires Python 3.10+.
Solution: Upgrade to Python 3.10 or later (Python 3.11+ recommended):
# Check your Python version
python --version
# If using Python 3.9, recreate the virtual environment with Python 3.10+
uv venv --python 3.11 --clear
uv sync
Error:
ERROR: [Errno 48] Address already in use
Solution:
# Kill the process using port 8000
lsof -ti:8000 | xargs kill -9
# Or use a different port
uv run python web_app.py --port 8001
Error:
fatal error: 'opus/opus.h' file not found
Solution: Install required system dependencies:
# macOS
brew install opus opusfile
# Linux (Ubuntu/Debian)
sudo apt-get install libopus0 libopusfile0
Error:
Database not initialized. Please check your DATABASE_URL configuration.
Solution:
- Verify PostgreSQL is running: pg_isready
- Check DATABASE_URL in your .env file
- Ensure the PGVector extension is installed: CREATE EXTENSION IF NOT EXISTS vector;
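Beyond those checks, a quick TCP reachability test can rule out network problems before debugging credentials. This is a hypothetical helper (not part of the project); it only checks that the database port answers, not that the connection string is valid:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Useful as a first check for 'Database not initialized' errors:
    if this fails, the problem is the host/port, not the schema.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```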
This project is inspired by the Docling RAG Agent from the excellent ottomator-agents collection by coleam00.
Modifications made:
- Support for local LLMs via Ollama through the OPENAI_BASE_URL environment variable