Open Source · MIT Licensed

Visual document analysis,
on your hardware

Upload PDFs and images, ask questions, and get answers grounded in your documents. ColPali visual embeddings paired with any Ollama model — running entirely on your machine.

Screenshots: RAG Lab interface in light and dark modes.

Everything you need for
document intelligence

From single-page lookups to batch analysis across hundreds of documents, RAG Lab handles the full workflow.

Visual RAG

ColPali embeddings understand tables, figures, and layouts natively. Optional OCR mode for text-preferring LLMs.
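ColPali-style retrieval scores a page by late interaction: each query-token embedding is matched against all of the page's patch embeddings, and the per-token maxima are summed. A minimal sketch with toy 2-D vectors (the vectors and function name are illustrative, not RAG Lab's internals):

```python
def maxsim_score(query_vecs, page_vecs):
    # Late interaction (MaxSim): for each query-token vector, take its
    # best dot-product match among the page's patch vectors, then sum
    # those per-token maxima into a single page score.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)

# Toy vectors: page_a has a strong match for both query tokens,
# page_b matches neither, so page_a should rank higher.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[0.9, 0.1], [0.0, 0.8], [0.2, 0.2]]
page_b = [[0.1, 0.1], [0.3, 0.2], [0.2, 0.1]]
```

Because matching happens per token rather than against one pooled vector, a page scores well if it contains a good patch for every part of the query, which is why tables and figures are searchable without OCR.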

Hybrid Retrieval

Visual search + keyword matching with weighted score fusion. Citation snippets show exactly which text matched your query.
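Weighted score fusion can be sketched as normalising each retriever's scores, then mixing them with a single weight. The `alpha` value and min-max normalisation below are illustrative assumptions, not RAG Lab's exact defaults:

```python
def fuse_scores(visual, keyword, alpha=0.7):
    # Weighted fusion of two score maps (page_id -> score).
    # Each retriever is min-max normalised first, so `alpha` directly
    # controls the visual/keyword balance.
    def norm(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {k: (v - lo) / span for k, v in scores.items()}

    v, k = norm(visual), norm(keyword)
    pages = set(v) | set(k)
    return {p: alpha * v.get(p, 0.0) + (1 - alpha) * k.get(p, 0.0)
            for p in pages}

# p1 leads visually, p2 leads on keywords; with alpha=0.7, p1 wins.
fused = fuse_scores({"p1": 0.9, "p2": 0.4}, {"p2": 3.0, "p3": 1.0})
```

Normalising per retriever matters because visual similarity and keyword scores live on different scales; without it, one retriever silently dominates.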

Any Ollama Model

Plug in any local model or connect to Ollama Cloud. Vision, text, and reasoning models all work.

Batch Processing

Feed it a stack of documents. Get per-document streaming responses as they complete.

Full-Document Summarization

Ask “summarize” and every page gets processed in sequential chunks. No page left behind.
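The sequential-chunk pass can be sketched as folding fixed-size page chunks into a running summary. Here `summarize_chunk` stands in for the LLM call RAG Lab would make via Ollama, and `chunk_size=4` is an illustrative value:

```python
def summarize_document(pages, summarize_chunk, chunk_size=4):
    # Walk the document in sequential fixed-size chunks, folding each
    # chunk into a running summary so every page is processed once.
    # summarize_chunk(chunk, prior_summary) -> updated summary.
    summary = ""
    for i in range(0, len(pages), chunk_size):
        summary = summarize_chunk(pages[i:i + chunk_size], summary)
    return summary

# Demo with a recording stub instead of a real model call:
pages = [f"p{i}" for i in range(10)]
covered = []

def record_chunk(chunk, prior):
    covered.extend(chunk)  # track which pages were seen
    return prior + " " + " ".join(chunk)

summarize_document(pages, record_chunk)
```

Carrying the prior summary forward lets each chunk be summarised in the context of everything before it, rather than producing ten disconnected mini-summaries.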

Prompt Templates

Build custom extraction templates — K-1 line items, invoice fields, whatever your workflow needs.

Conversation Memory

Mem0 remembers context across chat sessions. Disabled during RAG to keep answers document-grounded.

Multi-Session

Separate workspaces for separate projects. Switch contexts without losing state.

Adaptive Retrieval

Score-slope analysis finds the right number of pages per query. No manual threshold tuning.
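One common way to do this is an "elbow" heuristic: sort the retrieval scores and cut just after the steepest drop between consecutive scores. This is a sketch of the general technique; RAG Lab's exact slope rule may differ:

```python
def adaptive_cutoff(scores, min_keep=1):
    # Sort scores descending and keep pages up to the largest gap
    # ("elbow") between consecutive scores, instead of a fixed top-k.
    ranked = sorted(scores, reverse=True)
    if len(ranked) <= min_keep:
        return len(ranked)
    drops = [ranked[i] - ranked[i + 1] for i in range(len(ranked) - 1)]
    return max(range(len(drops)), key=drops.__getitem__) + 1

# Three pages score ~0.9, then scores fall off a cliff: keep 3.
k = adaptive_cutoff([0.95, 0.93, 0.90, 0.40, 0.38])
```

The payoff is that narrow questions pull in one or two pages while broad ones pull in many, with no per-corpus threshold to tune.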

Built with modern tools

Frontend: SvelteKit 2, Svelte 5, TypeScript
Backend: FastAPI, Python 3.10+
Embeddings: ColQwen3.5 (ColPali)
Vector Store: LanceDB (multi-vector)
LLM Inference: Ollama (any local model)
Memory: Mem0 + ChromaDB
Auth: FastAPI-Users, JWT, argon2
Storage: SQLite (local, zero-config)

Up and running in minutes

For Linux and WSL2. Requires Python 3.10+, Node.js 18+, an NVIDIA GPU, and Ollama.

terminal
# Clone and install
$ git clone https://github.com/inkind79/rag-lab.git
$ cd rag-lab
$ python -m venv venv && source venv/bin/activate
$ pip install -r requirements.txt
$ cd frontend && npm install && cd ..

# Pull your models
$ ollama pull gemma4
$ ollama pull nomic-embed-text
$ ollama pull gemma3:4b   # conversation memory

# Configure and run (two terminals)
$ cp .env.example .env
$ uvicorn fastapi_app:app --host 127.0.0.1 --port 8000
$ cd frontend && npm run dev

# Open http://localhost:5173 — register an account, then sign in

System requirements

GPU: NVIDIA 8GB+ VRAM (16GB+ recommended for larger models)
RAM: 16GB+ system memory
Storage: 20GB+ for models and dependencies
OS: Linux / WSL2 (requires NVIDIA CUDA support)