Environment Variables Reference¶

Configuration precedence in Kreuzberg follows this order (highest to lowest):

Environment Variables - Highest priority, overrides all other sources
Configuration Files - TOML, YAML, or JSON config files
Defaults - Built-in sensible defaults
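
The precedence order above can be sketched in shell with nested default expansion. This is only an illustration: `resolve_port` and its `file_port` argument are hypothetical names standing in for a value read from a config file, and the sketch treats an empty variable the same as an unset one, which is an assumption.

```shell
# Resolve one setting the way the precedence list describes:
# environment variable first, then config-file value, then built-in default.
# NOTE: resolve_port and file_port are hypothetical names for illustration.
resolve_port() {
  file_port=$1    # value taken from a config file, may be empty
  printf '%s\n' "${KREUZBERG_PORT:-${file_port:-8000}}"
}

unset KREUZBERG_PORT
resolve_port ""      # prints 8000 (built-in default)
resolve_port 3000    # prints 3000 (config-file value)

export KREUZBERG_PORT=9000
resolve_port 3000    # prints 9000 (environment variable wins)
unset KREUZBERG_PORT
```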

This document covers all KREUZBERG_* environment variables for version 4.3.8.

When to Use Environment Variables¶

Environment variables are ideal for:

Container/Cloud Deployments: Docker, Kubernetes, serverless environments where config files are impractical
CI/CD Pipelines: Override settings per environment (dev, staging, production)
Simple Overrides: Changing one or two settings without managing a config file
Secrets Management: Using secret management systems that inject values as env vars

For complex configurations with many settings, configuration files are recommended:

Example Configuration File

# kreuzberg.toml is cleaner for multiple settings
[ocr]
language = "eng"
backend = "tesseract"

[chunking]
max_chars = 2000
max_overlap = 300

API Server Configuration¶

These variables control the Kreuzberg server's network behavior and request handling.

KREUZBERG_HOST¶

Type: String Default: 127.0.0.1 Valid Values: Any IPv4 or IPv6 address, or hostname

The server bind address. Use 0.0.0.0 to listen on all interfaces.

Server Bind Address Examples

# Listen only on localhost (default)
export KREUZBERG_HOST=127.0.0.1

# Listen on all interfaces (Docker, cloud deployments)
export KREUZBERG_HOST=0.0.0.0

# Listen on specific interface
export KREUZBERG_HOST=192.168.1.100

KREUZBERG_PORT¶

Type: u16 (1-65535) Default: 8000

The server port number.

Server Port Examples

export KREUZBERG_PORT=3000
export KREUZBERG_PORT=8080

Error: Port must be a valid u16 number:

KREUZBERG_PORT must be a valid u16 number, got 'invalid': invalid digit found in string
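
To catch this error before the server starts, a deployment script could check the value itself. The following is a minimal sketch, assuming the documented range of 1-65535; `is_valid_port` is a hypothetical helper and not part of the Kreuzberg CLI.

```shell
# Check that a value fits the documented range for KREUZBERG_PORT (1-65535).
# NOTE: is_valid_port is a hypothetical helper, not part of Kreuzberg.
is_valid_port() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;       # empty, or contains a non-digit
  esac
  [ "${#1}" -le 5 ] || return 1    # more than 5 digits cannot fit in a u16
  [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

if is_valid_port "${KREUZBERG_PORT:-8000}"; then
  echo "port ok"
else
  echo "invalid KREUZBERG_PORT: ${KREUZBERG_PORT:-}" >&2
fi
```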

KREUZBERG_CORS_ORIGINS¶

Type: String (comma-separated list) Default: Empty (allows all origins)

Whitelist of allowed CORS origins. When empty, the server accepts requests from any origin.

CORS Origins Configuration

# Allow all origins (default)
# unset KREUZBERG_CORS_ORIGINS

# Allow specific origins
export KREUZBERG_CORS_ORIGINS="https://api.example.com, https://app.example.com"

# Single origin
export KREUZBERG_CORS_ORIGINS="https://trusted.com"

Security Warning: Be explicit with CORS origins in production. Allowing all origins (*) means any website can call your API on behalf of users. In Kreuzberg, an empty list allows all origins - be intentional about this choice.

CORS Security Best Practices

# Production: Restrict to known origins
export KREUZBERG_CORS_ORIGINS="https://app.mycompany.com, https://admin.mycompany.com"

# Development: allowing all origins (variable unset) is acceptable, but understand the security implications
# Don't allow all origins in production unless absolutely necessary
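
The comma-separated format can be illustrated by splitting a value into individual origins. This sketch assumes whitespace around each entry is ignored, as the examples with a space after the comma suggest; `split_origins` is a hypothetical helper, not Kreuzberg's parser.

```shell
# Split a comma-separated origin list and trim surrounding whitespace.
# NOTE: split_origins is a hypothetical helper; the trimming behavior is an
# assumption based on the examples above.
split_origins() {
  printf '%s\n' "$1" | tr ',' '\n' |
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

split_origins "https://api.example.com, https://app.example.com"
# https://api.example.com
# https://app.example.com
```

An empty value produces no output here, which corresponds to the "allow all origins" case.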

KREUZBERG_MAX_REQUEST_BODY_BYTES¶

Type: usize (bytes) Default: 104857600 (100 MB)

Maximum size of HTTP request bodies. Prevents oversized requests from consuming server resources.

Max Request Body Size Examples

# 50 MB
export KREUZBERG_MAX_REQUEST_BODY_BYTES=52428800

# 200 MB
export KREUZBERG_MAX_REQUEST_BODY_BYTES=209715200

# 500 MB
export KREUZBERG_MAX_REQUEST_BODY_BYTES=524288000

Note: Both KREUZBERG_MAX_REQUEST_BODY_BYTES and KREUZBERG_MAX_MULTIPART_FIELD_BYTES control upload limits. Adjust both for consistent behavior.
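
One way to keep the two limits consistent is to compute the byte value once and export it to both variables. A small sketch, where `mb_to_bytes` is a hypothetical helper and "MB" means 1024 * 1024 bytes, matching the values used in this document:

```shell
# Convert megabytes (1024 * 1024 bytes) to bytes.
# NOTE: mb_to_bytes is a hypothetical helper for illustration.
mb_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

limit=$(mb_to_bytes 200)    # 209715200
export KREUZBERG_MAX_REQUEST_BODY_BYTES="$limit"
export KREUZBERG_MAX_MULTIPART_FIELD_BYTES="$limit"
```

Because a multipart request body is always somewhat larger than the file field it carries, the body limit should not be set lower than the field limit.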

KREUZBERG_MAX_MULTIPART_FIELD_BYTES¶

Type: usize (bytes) Default: 104857600 (100 MB)

Maximum size of individual multipart form fields. Controls the size of file uploads in multipart requests.

Max Multipart Field Size Examples

# 100 MB (default)
export KREUZBERG_MAX_MULTIPART_FIELD_BYTES=104857600

# 500 MB for large document processing
export KREUZBERG_MAX_MULTIPART_FIELD_BYTES=524288000

# 1 GB for extreme cases
export KREUZBERG_MAX_MULTIPART_FIELD_BYTES=1073741824

Extraction Configuration¶

These variables control document extraction behavior, including OCR, text chunking, and caching.

KREUZBERG_OCR_LANGUAGE¶

Type: String (ISO 639-1 or 639-3 language code) Default: eng (English)

OCR language for scanned documents. Must be a valid language code recognized by the OCR backend.

OCR Language Configuration

# English (default)
export KREUZBERG_OCR_LANGUAGE=eng

# German
export KREUZBERG_OCR_LANGUAGE=deu

# French
export KREUZBERG_OCR_LANGUAGE=fra

# Spanish
export KREUZBERG_OCR_LANGUAGE=spa

# Chinese (Simplified)
export KREUZBERG_OCR_LANGUAGE=chi_sim

# Japanese
export KREUZBERG_OCR_LANGUAGE=jpn

Supported Codes: Kreuzberg accepts several language code formats:

Tesseract codes (ISO 639-3): eng, deu, fra, spa, ita, por, rus, chi_sim, chi_tra, jpn, kor
PaddleOCR codes: en, ch, french, german, korean, thai, greek, cyrillic, latin, arabic, devanagari, tamil, telugu
ISO 639-1 codes: en, de, fr, es, ja, ko, zh, ru, ar, th, el

All code formats are accepted regardless of backend; Kreuzberg maps between them automatically.
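
To make the idea of code mapping concrete, here is a rough sketch of a lookup from two-letter ISO 639-1 codes to the three-letter codes Tesseract uses. It is not Kreuzberg's actual mapping table: `to_tesseract_code` is a hypothetical helper covering only a few languages, and mapping `zh` to `chi_sim` rather than `chi_tra` is an assumption of this example.

```shell
# Map an ISO 639-1 code to the corresponding Tesseract code.
# NOTE: to_tesseract_code is hypothetical and covers only a few languages.
to_tesseract_code() {
  case $1 in
    en) echo eng ;;
    de) echo deu ;;
    fr) echo fra ;;
    es) echo spa ;;
    ja) echo jpn ;;
    ko) echo kor ;;
    ru) echo rus ;;
    zh) echo chi_sim ;;    # assumption: Simplified rather than Traditional
    *)  echo "$1" ;;       # pass anything else through unchanged
  esac
}

to_tesseract_code de     # deu
to_tesseract_code eng    # eng
```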

KREUZBERG_OCR_BACKEND¶

Type: String Default: tesseract Valid Values: tesseract, easyocr, paddleocr

OCR engine to use for text extraction from images and scanned documents.

OCR Backend Selection

# Tesseract (open source, good for English)
export KREUZBERG_OCR_BACKEND=tesseract

# EasyOCR (better multilingual support, slower)
export KREUZBERG_OCR_BACKEND=easyocr

# PaddleOCR (fast, good accuracy across languages)
export KREUZBERG_OCR_BACKEND=paddleocr

Performance Notes:

tesseract: Fastest, best for English and Latin scripts
easyocr: Slower, excellent multilingual support
paddleocr: Fast with good accuracy for many languages

KREUZBERG_CHUNKING_MAX_CHARS¶

Type: usize (positive integer) Default: 1000 (characters)

Maximum number of characters per text chunk. Smaller chunks are useful for LLM context windows.

Chunk Size Configuration

# Small chunks for token-constrained LLMs
export KREUZBERG_CHUNKING_MAX_CHARS=512

# Default: balanced for most use cases
export KREUZBERG_CHUNKING_MAX_CHARS=1000

# Larger chunks for fewer splits
export KREUZBERG_CHUNKING_MAX_CHARS=2000

# Very large chunks for comprehensive context
export KREUZBERG_CHUNKING_MAX_CHARS=4000

Validation: Must be greater than 0. Must be greater than KREUZBERG_CHUNKING_MAX_OVERLAP.

KREUZBERG_CHUNKING_MAX_OVERLAP¶

Type: usize (non-negative integer) Default: 200 (characters)

Character overlap between consecutive chunks. Maintains context across chunk boundaries.

Chunk Overlap Configuration

# No overlap (creates discontinuities)
export KREUZBERG_CHUNKING_MAX_OVERLAP=0

# Default: 20% overlap with 1000-char chunks
export KREUZBERG_CHUNKING_MAX_OVERLAP=200

# More overlap: 30% for better context continuity
export KREUZBERG_CHUNKING_MAX_OVERLAP=300

# High overlap for sensitive documents
export KREUZBERG_CHUNKING_MAX_OVERLAP=500

Validation: Must be less than KREUZBERG_CHUNKING_MAX_CHARS.

Example Error (overlap set to 1500 with the default max_chars of 1000):

Chunking overlap (1500) cannot be greater than or equal to max_chars (1000)
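
The relationship between the two chunking settings is easier to see with a toy chunker. The sketch below is a simplified fixed-width splitter for single-line ASCII text, not Kreuzberg's actual chunking algorithm; `chunk_text` is a hypothetical helper. Each chunk starts `max_chars - overlap` characters after the previous one, which is why the overlap must stay below `max_chars`.

```shell
# Fixed-size character chunking with overlap, one chunk per output line.
# NOTE: chunk_text is a hypothetical helper for single-line ASCII input.
chunk_text() {
  text=$1 max_chars=$2 overlap=$3
  # Same rules as the validation notes: max_chars > 0 and overlap < max_chars.
  if [ "$max_chars" -le 0 ] || [ "$overlap" -ge "$max_chars" ]; then
    echo "overlap ($overlap) must be less than max_chars ($max_chars)" >&2
    return 1
  fi
  step=$(( max_chars - overlap ))
  start=0
  while [ "$start" -lt "${#text}" ]; do
    printf '%s\n' "$text" | cut -c "$(( start + 1 ))-$(( start + max_chars ))"
    # Stop once this chunk has reached the end of the text.
    [ $(( start + max_chars )) -ge "${#text}" ] && break
    start=$(( start + step ))
  done
}

chunk_text "abcdefghij" 4 1
# abcd
# defg
# ghij
```

With an overlap of 1, the last character of each chunk reappears as the first character of the next.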

KREUZBERG_CACHE_ENABLED¶

Type: Boolean (true or false, case-insensitive) Default: true

Enable or disable extraction result caching. Cache stores results to avoid reprocessing identical documents.

Cache Enable/Disable

# Enable cache (default, recommended for production)
export KREUZBERG_CACHE_ENABLED=true

# Disable cache (development, testing, or when cache is problematic)
export KREUZBERG_CACHE_ENABLED=false

# Case insensitive
export KREUZBERG_CACHE_ENABLED=TRUE
export KREUZBERG_CACHE_ENABLED=False
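
Case-insensitive parsing of this kind can be sketched by lowercasing the value before comparing it. `parse_bool` is a hypothetical helper, and rejecting anything other than true or false is an assumption about how invalid values are handled.

```shell
# Parse a boolean as documented for KREUZBERG_CACHE_ENABLED: true/false, any case.
# NOTE: parse_bool is hypothetical; rejecting other values is an assumption.
parse_bool() {
  case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') in
    true)  echo true ;;
    false) echo false ;;
    *)     echo "not a boolean: $1" >&2; return 1 ;;
  esac
}

parse_bool TRUE     # true
parse_bool False    # false
```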

KREUZBERG_OUTPUT_FORMAT¶

Type: String Default: plain Valid Values: plain, markdown, djot, html

Controls how the extracted text content is formatted in the result output.

Output Format Options

# Plain text content only (default)
export KREUZBERG_OUTPUT_FORMAT=plain

# Markdown formatted output
export KREUZBERG_OUTPUT_FORMAT=markdown

# Djot markup format
export KREUZBERG_OUTPUT_FORMAT=djot

# HTML formatted output
export KREUZBERG_OUTPUT_FORMAT=html

Use Cases:

Format	Use Case
`plain`	Raw extracted text without formatting
`markdown`	Structured text with headings, lists, emphasis (RAG, LLM input)
`djot`	Lightweight markup, alternative to Markdown
`html`	Rich formatted output for web display

Example:

Extract with markdown formatting

export KREUZBERG_OUTPUT_FORMAT=markdown
kreuzberg

KREUZBERG_TOKEN_REDUCTION_MODE¶

Type: String Default: off Valid Values: off, light, moderate, aggressive, maximum

Token reduction aggressiveness for compressing extracted text while preserving meaning. Useful when working with token-limited LLMs.

Token Reduction Mode Options

# No reduction (keep all text as-is)
export KREUZBERG_TOKEN_REDUCTION_MODE=off

# Light reduction: Remove common stopwords, minimal impact
export KREUZBERG_TOKEN_REDUCTION_MODE=light

# Moderate reduction: Balance between compression and meaning preservation
export KREUZBERG_TOKEN_REDUCTION_MODE=moderate

# Aggressive reduction: Significant compression, some detail loss
export KREUZBERG_TOKEN_REDUCTION_MODE=aggressive

# Maximum reduction: Extreme compression for token-constrained scenarios
export KREUZBERG_TOKEN_REDUCTION_MODE=maximum

Impact on Tokens:

Mode	Typical Reduction	Use Case
`off`	0%	Full preservation, no compression
`light`	10-15%	Minimal impact, clean up obvious redundancy
`moderate`	25-35%	Balanced approach for most scenarios
`aggressive`	40-50%	Significant compression, still readable
`maximum`	50-70%	Extreme compression, loses some detail
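
For rough budgeting, the percentages in the table can be turned into an estimate of how many tokens remain. This is arithmetic on the typical ranges only, and actual savings depend on the text; `estimate_remaining` is a hypothetical helper.

```shell
# Estimate the tokens left after reduction, using the ranges from the table.
# NOTE: estimate_remaining is a hypothetical helper; results are rough guides.
estimate_remaining() {
  tokens=$1
  case $2 in
    off)        lo=0  hi=0  ;;
    light)      lo=10 hi=15 ;;
    moderate)   lo=25 hi=35 ;;
    aggressive) lo=40 hi=50 ;;
    maximum)    lo=50 hi=70 ;;
    *) echo "unknown mode: $2" >&2; return 1 ;;
  esac
  # The highest reduction leaves the fewest tokens, so that bound comes first.
  echo "$(( tokens * (100 - hi) / 100 ))-$(( tokens * (100 - lo) / 100 ))"
}

estimate_remaining 10000 moderate    # 6500-7500
```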

Runtime Configuration¶

Control cache location, debug output, and runtime behavior.

KREUZBERG_CACHE_DIR¶

Type: String (file system path) Default: Platform-specific global cache directory

Override the default cache directory for storing extraction cache, models, and intermediate files. When unset, Kreuzberg uses a platform-appropriate global cache:

Linux: ~/.cache/kreuzberg/ (or $XDG_CACHE_HOME/kreuzberg/)
macOS: ~/Library/Caches/kreuzberg/
Windows: %LOCALAPPDATA%\kreuzberg\

If the platform cache directory cannot be determined, Kreuzberg falls back to ~/.cache/kreuzberg/, then .kreuzberg/ in the current working directory as a last resort.

Cache Directory Configuration

# Default: uses platform-specific global cache (recommended)
# unset KREUZBERG_CACHE_DIR

# Store cache in specific location
export KREUZBERG_CACHE_DIR=/var/cache/kreuzberg

# Docker: Use volume mount
export KREUZBERG_CACHE_DIR=/data/kreuzberg-cache

# Development: Quick local cleanup
export KREUZBERG_CACHE_DIR=/tmp/kreuzberg-cache
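
The lookup order described in this section can be approximated in shell. The sketch below is not Kreuzberg's implementation: `resolve_cache_dir` is a hypothetical helper that takes the OS name (as printed by `uname -s`) as an argument, and it does not cover Windows.

```shell
# Resolve the cache directory following the documented order.
# NOTE: resolve_cache_dir is a hypothetical helper; Windows is not handled.
resolve_cache_dir() {
  if [ -n "${KREUZBERG_CACHE_DIR:-}" ]; then
    echo "$KREUZBERG_CACHE_DIR"               # explicit override
  elif [ "$1" = "Darwin" ] && [ -n "${HOME:-}" ]; then
    echo "$HOME/Library/Caches/kreuzberg"     # macOS
  elif [ -n "${XDG_CACHE_HOME:-}" ]; then
    echo "$XDG_CACHE_HOME/kreuzberg"          # Linux with XDG_CACHE_HOME set
  elif [ -n "${HOME:-}" ]; then
    echo "$HOME/.cache/kreuzberg"             # Linux default and general fallback
  else
    echo ".kreuzberg"                         # last resort: working directory
  fi
}

resolve_cache_dir "$(uname -s)"
```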

Directory Structure: Kreuzberg creates subdirectories for different cache types:

$KREUZBERG_CACHE_DIR/
  ocr/                    # OCR result cache
  embeddings/             # Chunk embedding cache
  extractions/            # Full extraction cache

KREUZBERG_CI_DEBUG¶

Type: Boolean (presence check: set to any value to enable) Default: Disabled (unset)

Enable detailed debug logging for CI environments. Outputs step-by-step timing and parameter information for OCR operations.

Enable CI Debug Logging

# Enable CI debug output
export KREUZBERG_CI_DEBUG=1
export KREUZBERG_CI_DEBUG=true
export KREUZBERG_CI_DEBUG=yes

# Output example:
# [kreuzberg::ocr] perform_ocr:start bytes=1024000 language=eng output=text use_cache=true
# [kreuzberg::ocr] perform_ocr:end duration_ms=2534

Use Cases:

Debugging slow OCR operations
Tracing cache hits/misses
Performance profiling in CI pipelines
Understanding extraction pipeline behavior
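
Because this variable is a presence check, its value is not interpreted. If the check works as documented, setting it to `false` or `0` would still enable debug output, and the way to disable it is to unset it. A minimal sketch of that semantics, where `flag_enabled` is a hypothetical helper:

```shell
# Presence check: the flag counts as enabled whenever the variable exists
# in the environment, whatever its value.
# NOTE: flag_enabled is a hypothetical helper mirroring the documented behavior.
flag_enabled() {
  printenv "$1" > /dev/null
}

export KREUZBERG_CI_DEBUG=false
flag_enabled KREUZBERG_CI_DEBUG && echo "debug logging on"     # still enabled

unset KREUZBERG_CI_DEBUG
flag_enabled KREUZBERG_CI_DEBUG || echo "debug logging off"
```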

KREUZBERG_DEBUG_OCR¶

Type: Boolean (presence check: set to any value to enable) Default: Disabled (unset)

Enable OCR-specific debug output. Outputs diagnostic information about OCR decisions, fallbacks, and text coverage metrics.

Enable OCR Debug Logging

# Enable OCR debug logging
export KREUZBERG_DEBUG_OCR=1

# Output example:
# [kreuzberg::pdf::ocr] fallback=true non_whitespace=8543 alnum=7234 meaningful_words=312
# [kreuzberg::pdf::ocr] avg_non_whitespace=45.2 avg_alnum=38.1 alnum_ratio=0.847

Diagnostic Information:

Whether OCR fallback was triggered
Character counts (non-whitespace, alphanumeric)
Word counts and coverage ratios
Coverage thresholds and decisions

Memory & Performance¶

Configure caching for string encoding operations to optimize performance.

KREUZBERG_ENCODING_CACHE_MAX_ENTRIES¶

Type: usize (positive integer) Default: 10000

Maximum number of strings cached in the encoding cache. Each entry consumes memory proportional to string length.

Encoding Cache Entry Limit

# Default: reasonable for most applications
export KREUZBERG_ENCODING_CACHE_MAX_ENTRIES=10000

# Higher for very large batches
export KREUZBERG_ENCODING_CACHE_MAX_ENTRIES=50000

# Lower to reduce memory usage
export KREUZBERG_ENCODING_CACHE_MAX_ENTRIES=1000

KREUZBERG_ENCODING_CACHE_MAX_BYTES¶

Type: usize (bytes) Default: 104857600 (100 MB)

Maximum total size of cached strings in bytes. Once exceeded, least-used entries are evicted.

Encoding Cache Size Limit

# Default: 100 MB
export KREUZBERG_ENCODING_CACHE_MAX_BYTES=104857600

# Larger cache for high-throughput scenarios
export KREUZBERG_ENCODING_CACHE_MAX_BYTES=524288000  # 500 MB

# Smaller cache for memory-constrained environments
export KREUZBERG_ENCODING_CACHE_MAX_BYTES=10485760   # 10 MB
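
When tuning the two encoding cache limits together, it can help to know the average entry size at which both fill up at the same time. Assuming both limits apply to the same cache, larger average entries hit the byte limit first and smaller ones hit the entry limit first. `avg_entry_budget` is a hypothetical helper for this arithmetic.

```shell
# Average bytes per entry at which both limits are reached together.
# NOTE: avg_entry_budget is a hypothetical helper for capacity planning.
avg_entry_budget() {
  echo $(( $1 / $2 ))    # max_bytes / max_entries, integer division
}

avg_entry_budget 104857600 10000    # 10485 with the documented defaults
```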

LLM Integration¶

Configure LLM-powered features such as structured extraction, vision-based OCR, and provider-hosted embeddings.

KREUZBERG_LLM_MODEL¶

Type: String Default: None (must be set explicitly or via config)

Default LLM model for structured extraction. Uses liter-llm model format (provider/model-name).

LLM Model Configuration

# OpenAI
export KREUZBERG_LLM_MODEL=openai/gpt-4o-mini

# Anthropic
export KREUZBERG_LLM_MODEL=anthropic/claude-sonnet-4-20250514

# Local provider
export KREUZBERG_LLM_MODEL=ollama/llama3
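
The provider/model-name format splits at the first slash. The following sketch shows that split using shell parameter expansion; the function names are hypothetical and the code is not taken from liter-llm.

```shell
# Split a provider/model-name string at the first slash.
# NOTE: model_provider and model_name are hypothetical helpers.
model_provider() { printf '%s\n' "${1%%/*}"; }
model_name()     { printf '%s\n' "${1#*/}"; }

model_provider "openai/gpt-4o-mini"    # openai
model_name     "openai/gpt-4o-mini"    # gpt-4o-mini
```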

KREUZBERG_LLM_API_KEY¶

Type: String Default: None

API key for the structured extraction LLM provider. When not set, liter-llm falls back to provider-standard environment variables (for example, OPENAI_API_KEY, ANTHROPIC_API_KEY).

LLM API Key Configuration

export KREUZBERG_LLM_API_KEY=sk-...

Security Warning: Prefer using provider-standard environment variables or a secrets manager over setting this directly. This variable is provided for cases where multiple providers are used and explicit key routing is needed.
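
The key fallback described above can be sketched as follows. This is an illustration rather than liter-llm's actual lookup: `resolve_api_key` is a hypothetical helper that only knows the two provider variables named in this section.

```shell
# Pick an API key: explicit KREUZBERG_LLM_API_KEY first, then the
# provider-standard variable.
# NOTE: resolve_api_key is hypothetical and handles only two providers.
resolve_api_key() {
  if [ -n "${KREUZBERG_LLM_API_KEY:-}" ]; then
    echo "$KREUZBERG_LLM_API_KEY"
    return 0
  fi
  case $1 in
    openai)    echo "${OPENAI_API_KEY:-}" ;;
    anthropic) echo "${ANTHROPIC_API_KEY:-}" ;;
    *)         return 1 ;;    # unknown provider: no fallback in this sketch
  esac
}
```

Calling `resolve_api_key openai` returns the value of `KREUZBERG_LLM_API_KEY` when it is set, and otherwise the value of `OPENAI_API_KEY`.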

KREUZBERG_LLM_BASE_URL¶

Type: String Default: None (uses provider default)

Custom base URL for the structured extraction LLM provider. Useful for self-hosted models, proxies, or alternative API-compatible endpoints.

LLM Base URL Configuration

# Custom OpenAI-compatible endpoint
export KREUZBERG_LLM_BASE_URL=https://api.example.com

# Local Ollama instance
export KREUZBERG_LLM_BASE_URL=http://localhost:11434

KREUZBERG_VLM_OCR_MODEL¶

Type: String Default: None (must be set explicitly or via config)

Vision language model (VLM) for vision-based OCR. When configured, Kreuzberg can use a vision model as an OCR backend, sending document images directly to the VLM for text extraction.

VLM OCR Model Configuration

# OpenAI GPT-4o for vision OCR
export KREUZBERG_VLM_OCR_MODEL=openai/gpt-4o

# Anthropic Claude for vision OCR
export KREUZBERG_VLM_OCR_MODEL=anthropic/claude-sonnet-4-20250514

KREUZBERG_VLM_EMBEDDING_MODEL¶

Type: String Default: None (must be set explicitly or via config)

Model for provider-hosted embeddings. Instead of running local ONNX embedding models, Kreuzberg can delegate embedding generation to a cloud provider's embedding API.

VLM Embedding Model Configuration

# OpenAI embeddings
export KREUZBERG_VLM_EMBEDDING_MODEL=openai/text-embedding-3-small

# Cohere embeddings
export KREUZBERG_VLM_EMBEDDING_MODEL=cohere/embed-english-v3.0

Note: When api_key is not set in config, liter-llm falls back to provider-standard environment variables (for example, OPENAI_API_KEY, ANTHROPIC_API_KEY).

Variable	Description	Example
`KREUZBERG_LLM_MODEL`	Default LLM model for structured extraction	`openai/gpt-4o-mini`
`KREUZBERG_LLM_API_KEY`	API key for structured extraction LLM provider	`sk-...`
`KREUZBERG_LLM_BASE_URL`	Custom base URL for structured extraction provider	`https://api.example.com`
`KREUZBERG_VLM_OCR_MODEL`	Vision language model for vision-based OCR	`openai/gpt-4o`
`KREUZBERG_VLM_EMBEDDING_MODEL`	Model for provider-hosted embeddings	`openai/text-embedding-3-small`

Testing Variables¶

Variables for development, testing, and quality assurance.

KREUZBERG_RUN_FULL_OCR¶

Type: Boolean (presence check: set to any value to enable) Default: Disabled (skips expensive tests) Status: Testing only

Enable expensive OCR quality tests. These tests perform full OCR on large documents and are slow (can take minutes).

Enable Full OCR Tests

# Skip expensive OCR tests (default, fast test runs)
# unset KREUZBERG_RUN_FULL_OCR

# Run full OCR quality tests
export KREUZBERG_RUN_FULL_OCR=1

# In test output:
# test test_ocr_quality_multi_page_consistency ... SKIPPED
# Skipping test_ocr_quality_multi_page_consistency: set KREUZBERG_RUN_FULL_OCR=1 to enable

Warning:

These tests can take 10+ minutes
Require OCR backends to be installed and working
Produce large temporary files
Use only in CI/CD for comprehensive validation

Docker Compose Examples¶

Basic Configuration¶

Docker Compose - Basic Setup

version: "3.8"
services:
  kreuzberg:
    image: kreuzberg:latest
    ports:
      - "3000:3000"
    environment:
      KREUZBERG_HOST: "0.0.0.0"
      KREUZBERG_PORT: "3000"
      KREUZBERG_OCR_LANGUAGE: "eng"
      KREUZBERG_CACHE_ENABLED: "true"

Production Configuration¶

Docker Compose - Production Setup

version: "3.8"
services:
  kreuzberg:
    image: kreuzberg:latest
    ports:
      - "8000:8000"
    volumes:
      - kreuzberg_cache:/data/cache
    environment:
      KREUZBERG_HOST: "0.0.0.0"
      KREUZBERG_PORT: "8000"
      KREUZBERG_CORS_ORIGINS: "https://app.example.com, https://admin.example.com"
      KREUZBERG_MAX_REQUEST_BODY_BYTES: "209715200" # 200 MB
      KREUZBERG_MAX_MULTIPART_FIELD_BYTES: "209715200"
      KREUZBERG_CACHE_DIR: "/data/cache"
      KREUZBERG_OCR_LANGUAGE: "eng"
      KREUZBERG_OCR_BACKEND: "tesseract"
      KREUZBERG_CHUNKING_MAX_CHARS: "2000"
      KREUZBERG_CHUNKING_MAX_OVERLAP: "300"
      KREUZBERG_TOKEN_REDUCTION_MODE: "moderate"

volumes:
  kreuzberg_cache:
    driver: local

Multilingual Configuration¶

Docker Compose - Multilingual Setup

version: "3.8"
services:
  kreuzberg:
    image: kreuzberg:latest
    ports:
      - "8000:8000"
    environment:
      KREUZBERG_HOST: "0.0.0.0"
      KREUZBERG_PORT: "8000"
      KREUZBERG_OCR_BACKEND: "easyocr" # Better multilingual support
      KREUZBERG_OCR_LANGUAGE: "fra" # French
      KREUZBERG_CACHE_ENABLED: "true"

Development Configuration¶

Docker Compose - Development Setup

version: "3.8"
services:
  kreuzberg:
    image: kreuzberg:latest
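    ports:
      - "127.0.0.1:8000:8000" # Publish only on the host's loopback interface
    environment:
      KREUZBERG_HOST: "0.0.0.0" # Bind all interfaces inside the container so the published port is reachable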
      KREUZBERG_PORT: "8000"
      KREUZBERG_CACHE_ENABLED: "false" # Disable for fresh testing
      KREUZBERG_CI_DEBUG: "1" # Enable debug output
      KREUZBERG_DEBUG_OCR: "1"
      KREUZBERG_CACHE_DIR: "/tmp/kreuzberg"

Environment Variable Loading Order¶

Kreuzberg builds its configuration in this order:

Load configuration file (TOML/YAML/JSON) if specified
Parse environment variables using apply_env_overrides()
Validate all settings

This ensures environment variables always win over file configuration:

Rust - Applying Environment Overrides

use kreuzberg::ExtractionConfig;

let mut config = ExtractionConfig::from_file("kreuzberg.toml")?; // Load file values first
config.apply_env_overrides()?; // KREUZBERG_* variables then override file values

Common Patterns¶

Using with Config Files¶

Combine files with environment overrides for flexibility:

Combining Config Files with Env Overrides

# Load base config from file
# Override specific values for this deployment
export KREUZBERG_OCR_LANGUAGE=deu
export KREUZBERG_CACHE_DIR=/mnt/cache
kreuzberg --config kreuzberg.toml

Shell Script Initialization¶

Environment-Based Shell Script

#!/bin/bash
# Load deployment-specific settings

if [ "$ENVIRONMENT" = "production" ]; then
  export KREUZBERG_HOST="0.0.0.0"
  export KREUZBERG_CORS_ORIGINS="https://app.example.com"
  export KREUZBERG_CACHE_ENABLED="true"
  export KREUZBERG_MAX_REQUEST_BODY_BYTES=$((200 * 1048576))
elif [ "$ENVIRONMENT" = "development" ]; then
  export KREUZBERG_HOST="127.0.0.1"
  export KREUZBERG_CACHE_ENABLED="false"
  export KREUZBERG_CI_DEBUG="1"
fi

kreuzberg

Kubernetes ConfigMap¶

Kubernetes ConfigMap and Pod Configuration

apiVersion: v1
kind: ConfigMap
metadata:
  name: kreuzberg-config
data:
  KREUZBERG_HOST: "0.0.0.0"
  KREUZBERG_PORT: "8000"
  KREUZBERG_CORS_ORIGINS: "https://api.example.com"
  KREUZBERG_CACHE_DIR: "/data/cache"
  KREUZBERG_OCR_BACKEND: "tesseract"
  KREUZBERG_TOKEN_REDUCTION_MODE: "moderate"
---
apiVersion: v1
kind: Pod
metadata:
  name: kreuzberg-server
spec:
  containers:
    - name: kreuzberg
      image: kreuzberg:latest
      ports:
        - containerPort: 8000
      envFrom:
        - configMapRef:
            name: kreuzberg-config
      volumeMounts:
        - name: cache
          mountPath: /data/cache
  volumes:
    - name: cache
      persistentVolumeClaim:
        claimName: kreuzberg-cache-pvc

See Also¶