Open WebUI
Open WebUI supports pluggable content extraction backends. Kreuzberg implements two of those backend APIs, the docling-serve endpoint and the external document loader endpoint, so it works as a drop-in replacement without patching Open WebUI.
How it works
- A user uploads a document (PDF, DOCX, image, etc.) in Open WebUI.
- Open WebUI sends the file to Kreuzberg's API endpoint.
- Kreuzberg extracts the content, running OCR where needed, and returns markdown.
- Open WebUI stores the markdown in its vector database for retrieval-augmented generation.
Kreuzberg supports 90+ file formats and requires no GPU.
Prerequisites
- Docker and Docker Compose (v2)
- Open WebUI running or ready to deploy
- No GPU required — Kreuzberg runs entirely on CPU
Setup with Docker Compose
This is the fastest way to get both services running together.
services:
  kreuzberg:
    image: ghcr.io/kreuzberg-dev/kreuzberg:latest-core
    ports:
      - "8000:8000"
    command: ["serve", "--host", "0.0.0.0", "--port", "8000"]
    volumes:
      - kreuzberg-cache:/app/.kreuzberg
    healthcheck:
      test: ["CMD", "kreuzberg", "version"]
      interval: 10s
      timeout: 5s
      retries: 5
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      CONTENT_EXTRACTION_ENGINE: "docling"
      DOCLING_SERVER_URL: "http://kreuzberg:8000"
    depends_on:
      kreuzberg:
        condition: service_healthy
volumes:
  kreuzberg-cache:
Start both services in detached mode:
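Assuming the Compose file above is saved as docker-compose.yml in the current directory, the standard Compose v2 invocation is:

```shell
# Start Kreuzberg and Open WebUI in the background
docker compose up -d

# Optional: confirm both containers are up and that kreuzberg reports "healthy"
docker compose ps
```

Because open-webui waits on `service_healthy`, it starts only after the Kreuzberg healthcheck passes.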
Open http://localhost:3000, create an account, and upload a document. The extracted text will appear in the chat context.
Cache volume
The kreuzberg-cache volume persists OCR models and embedding weights across restarts. Without it, models re-download on every container restart (~90 MB–1.2 GB depending on configuration).
Already running Open WebUI?
Start Kreuzberg separately, then point Open WebUI to that Kreuzberg URL.
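One way to do this is a plain `docker run` that mirrors the kreuzberg service from the Compose file above. This is a sketch, not the only supported setup: the container name is an arbitrary choice, and the image, port, volume path, and `serve` arguments are copied from that service definition.

```shell
# Run Kreuzberg on its own, using the same settings as the Compose service.
# The container name "kreuzberg" is an arbitrary choice.
docker run -d \
  --name kreuzberg \
  -p 8000:8000 \
  -v kreuzberg-cache:/app/.kreuzberg \
  ghcr.io/kreuzberg-dev/kreuzberg:latest-core \
  serve --host 0.0.0.0 --port 8000
```

With this port mapping, Kreuzberg listens on port 8000 of the Docker host.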
Then configure Open WebUI using one of the two engine modes below.
Choosing an engine mode
Kreuzberg exposes two Open WebUI–compatible APIs. Both return the same extracted content, so pick whichever fits your setup.
| | Docling (recommended) | External |
|---|---|---|
| Endpoint | POST /v1/convert/file | PUT /process |
| Engine setting | docling | external |
| URL variable | DOCLING_SERVER_URL | EXTERNAL_DOCUMENT_LOADER_URL |
Set these environment variables on the Open WebUI container:
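A minimal sketch of both variants, written as an env file. The variable names and engine values come from the table above. The file name openwebui.env is hypothetical, and the URLs assume Kreuzberg is reachable as http://kreuzberg:8000 and that Open WebUI appends the endpoint path itself.

```shell
# Option 1: Docling mode (recommended)
CONTENT_EXTRACTION_ENGINE=docling
DOCLING_SERVER_URL=http://kreuzberg:8000

# Option 2: External loader mode (use instead of option 1, not alongside it)
# CONTENT_EXTRACTION_ENGINE=external
# EXTERNAL_DOCUMENT_LOADER_URL=http://kreuzberg:8000
```

Pass the file to the container with `docker run --env-file openwebui.env ...`, or reference it from a Compose service with `env_file:`.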
Or via the Admin UI: Settings → Documents → Content Extraction Engine → select Docling → set server URL to http://kreuzberg:8000.
Tip
If Kreuzberg runs on a different host or port, replace http://kreuzberg:8000 with the actual address. Inside Docker Compose, use the service name (kreuzberg). Outside Docker, use the host IP or localhost.
Verify it works
Test the endpoints directly before debugging through Open WebUI.
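The helper functions below are a hedged sketch of such a check using curl. The endpoint paths and HTTP methods come from the table above. The request shapes are assumptions: a multipart field named `files` for the Docling-compatible endpoint (the convention docling-serve clients use) and a raw request body for the external loader endpoint. The function names and the `KREUZBERG_URL` variable are hypothetical.

```shell
# Base URL of the Kreuzberg server; override when it is not on localhost.
KREUZBERG_URL="${KREUZBERG_URL:-http://localhost:8000}"

# Docling-compatible endpoint (POST /v1/convert/file).
# Assumption: the file is sent as multipart form data in a field named "files".
check_docling() {
  curl -fsS -X POST "$KREUZBERG_URL/v1/convert/file" -F "files=@$1"
}

# External loader endpoint (PUT /process).
# Assumption: the raw file bytes are the request body; adjust Content-Type
# to match the file you send.
check_external() {
  curl -fsS -X PUT "$KREUZBERG_URL/process" \
    -H "Content-Type: application/pdf" \
    --data-binary "@$1"
}
```

After sourcing the snippet, run `check_docling sample.pdf` or `check_external sample.pdf` with any local PDF. The `-f` flag makes curl exit non-zero on an HTTP error, so a failed request is easy to spot.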
If the endpoint returns extracted text, the integration is working. Upload a document through Open WebUI to confirm end-to-end.
Next steps
- Docker deployment guide — image variants, volumes, security hardening
- API server reference — all endpoints and configuration options
- OCR guide — language packs, engine selection, tuning
- Format support — full list of supported file types