Development Workflow¶
Everything you need to build, test, and debug Kreuzberg locally. This guide assumes you've already followed the Contributing Guide to fork and clone the repository.
The Task Runner¶
Kreuzberg uses Task for all build and test workflows. One command to bootstrap everything:
That installs all toolchains and dependencies. Safe to re-run anytime — it's idempotent.
The Pattern¶
Tasks follow <language>:<action>, optionally with a variant suffix such as :dev, :release, or :ci. Once you learn this pattern, the command for any task is predictable:
task rust:build # Build the Rust core
task rust:build:dev # Debug build (faster compile, no optimizations)
task rust:build:release # Release build (slow compile, fast binary)
task rust:test # Run Rust tests
task rust:test:ci # Same tests, with CI-level diagnostics
task python:build # Build Python bindings via maturin
task python:test # Run Python test suite
task node:build # Build Node.js bindings via napi
task node:test # Run Node.js test suite (Jest)
The same pattern works for every language: go:build, java:test, ruby:build, csharp:test, and so on.
Bulk Operations¶
task build:all # Build every binding
task test:all # Test every binding (sequential)
task test:all:parallel # Test every binding (parallel — faster, noisier output)
task check # Lint + format check across the whole repo
Testing Locally¶
Rust¶
The core lives in crates/kreuzberg/. Most changes start here.
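task rust:test # Run the full Rust test suite
cargo test -p kreuzberg test_pdf_extraction -- --nocapture # Run tests matching a name, with their output shown
RUST_LOG=debug cargo test -p kreuzberg test_name -- --nocapture # Same, with debug-level logging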
Python¶
Python bindings are in packages/python/. Build first, then test:
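task python:build:dev # Debug build of the bindings
task python:test # Run the full test suite
cd packages/python
uv run pytest tests/ -k "test_extract" -v # Run only tests whose names match "test_extract"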
The RUST_LOG env var works here too — the Rust core logs through Python's stderr:
Node.js¶
TypeScript bindings are in packages/typescript/:
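task node:build:dev # Debug build of the bindings
task node:test # Run the full test suite
cd packages/typescript
pnpm test -- --testPathPattern="extract" # Run only test files whose path matches "extract"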
Everything Else¶
Same pattern. Build, then test:
task go:build && task go:test
task java:build && task java:test
task csharp:build && task csharp:test
task ruby:build && task ruby:test
task php:build && task php:test
task elixir:build && task elixir:test
task r:build && task r:test
task c:build && task c:test
task wasm:build && task wasm:test
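When you touch several bindings at once, the build-then-test pattern above can be scripted. The following is a minimal sketch, assuming `task` is on your PATH; `build_then_test` is a hypothetical helper for illustration and is not part of the repository.

```python
import subprocess


def build_then_test(languages, dry_run=False):
    """Run `task <lang>:build` then `task <lang>:test` for each language.

    Returns the list of commands in the order they were (or would be) run.
    """
    commands = []
    for lang in languages:
        for action in ("build", "test"):
            command = ["task", f"{lang}:{action}"]
            commands.append(command)
            if not dry_run:
                # check=True raises CalledProcessError on a non-zero exit,
                # so a failed build never reaches its test step.
                subprocess.run(command, check=True)
    return commands
```

Calling `build_then_test(["go", "ruby"], dry_run=True)` only returns the commands, which is a quick way to confirm what would run before starting a long build.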
Testing the live browser demo¶
The demo at docs/demo.html loads @kreuzberg/wasm from a CDN. To test local changes against it, use:
This builds the Wasm binary and TypeScript dist, patches the demo with local URLs, and starts two servers:
| Server | URL | Role |
|---|---|---|
| Docs | http://localhost:8001 | Serves the patched demo-dev.html |
| Assets | http://localhost:9000 | Serves the local Wasm package |
Open http://localhost:8001/demo-dev.html — no manual edits needed. The patched file (docs/demo-dev.html) is gitignored and regenerated on every run. The two different ports reproduce the cross-origin setup the CDN creates in production.
To skip the slow Rust build when you've only changed TypeScript:
End-to-end Test Suites¶
End-to-end tests check that every language binding produces identical results for the same document. The suites live under e2e/, one directory per language, and are built from shared fixtures: test inputs paired with expected outputs.
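The idea behind a shared fixture can be sketched in a few lines. This is an illustration only: the `Fixture` shape and the `find_mismatches` helper are hypothetical, and the actual fixture format in e2e/ may look quite different.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Fixture:
    """One shared test case: a document and the output every binding must produce."""

    document: str
    expected_text: str


def find_mismatches(fixture: Fixture, bindings: Dict[str, Callable[[str], str]]) -> List[str]:
    """Return the names of bindings whose output differs from the expected text."""
    return [
        name
        for name, extract in bindings.items()
        if extract(fixture.document) != fixture.expected_text
    ]
```

Because every binding is checked against the same expected output, a divergence in one language shows up as a named mismatch rather than as a silent difference between suites.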
Run end-to-end Tests¶
| Language | Directory | Run with |
|---|---|---|
| Python | e2e/python/ | task python:e2e:test |
| TypeScript / Node.js | e2e/typescript/ | task node:e2e:test |
| Rust | e2e/rust/ | task rust:e2e:test |
| Go | e2e/go/ | task go:e2e:test |
| Java | e2e/java/ | task java:e2e:test |
| .NET | e2e/csharp/ | task csharp:e2e:test |
| Ruby | e2e/ruby/ | task ruby:e2e:test |
| PHP | e2e/php/ | task php:e2e:test |
| R | e2e/r/ | task r:e2e:test |
Regenerate end-to-end Tests¶
When you add a feature that changes extraction behavior, regenerate the affected end-to-end suites:
To regenerate and test all suites at once:
Benchmarking¶
Measure extraction performance with the benchmark harness in tools/benchmark-harness/. Use it to track regressions, compare against alternatives, and identify bottlenecks with flamegraphs.
Quick Start¶
task benchmark:run FRAMEWORK=kreuzberg MODE=single-file
task benchmark:run FRAMEWORK=kreuzberg MODE=batch
Common Modes¶
| Mode | What it measures |
|---|---|
| single-file | Latency: one file at a time |
| batch | Throughput: multiple files in parallel |
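The difference between the two modes can be illustrated with a small timing sketch. It is not how the benchmark harness is implemented; `extract` stands in for any callable that processes one path, and both function names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_latency(extract, paths):
    """Single-file style: time each call on its own and return per-file seconds."""
    timings = []
    for path in paths:
        start = time.perf_counter()
        extract(path)
        timings.append(time.perf_counter() - start)
    return timings


def measure_throughput(extract, paths, workers=4):
    """Batch style: run calls in parallel and return files processed per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces every call to finish before the clock stops.
        list(pool.map(extract, paths))
    # Guard against a zero reading on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(paths) / elapsed
```

Latency answers "how long does one document take", while throughput answers "how many documents finish per second when work overlaps", so the two numbers can move independently.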
With Profiling¶
Generate flamegraphs to see where time is spent:
Results appear in the flamegraphs/ directory as interactive SVGs.
View live benchmark results at https://kreuzberg.dev/benchmarks.
Linting and Pre-commit¶
Language-specific:
task rust:lint # clippy + rustfmt
task python:lint # ruff + mypy
task node:lint # eslint + typecheck
The repository uses pre-commit hooks that enforce conventional commit messages, code formatting, and linter rules. If a commit is rejected, the hook output tells you exactly what to fix.
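To check a message before committing, the header format from the Conventional Commits convention can be approximated with a regular expression. This is a sketch, not the hook's actual implementation: the list of accepted types is an assumption based on the commonly used Angular-style set, and `is_conventional` is a hypothetical helper.

```python
import re

# Assumed type list; the repository's hook may accept a different set.
TYPES = ("build", "chore", "ci", "docs", "feat", "fix",
         "perf", "refactor", "revert", "style", "test")

# <type>[(scope)][!]: <description>
HEADER = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def is_conventional(message: str) -> bool:
    """Check only the first line of a commit message against the header pattern."""
    lines = message.splitlines()
    return bool(lines) and HEADER.match(lines[0]) is not None
```

For example, `feat(python): add batch extraction` and `fix!: handle empty PDFs` pass, while `Update stuff` does not.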
Working with Documentation¶
Building Locally¶
How Snippets Work¶
Code examples in the docs aren't inline — they're pulled from docs/snippets/ via the --8<-- include directive. This keeps examples testable and reusable across pages.
docs/snippets/
├── python/ # Python examples
│ ├── api/ # extract_file, batch_extract, etc.
│ ├── config/ # ExtractionConfig, OcrConfig, etc.
│ ├── ocr/ # OCR backends
│ ├── plugins/ # Plugin implementations
│ ├── mcp/ # MCP server and client
│ └── utils/ # Embeddings, chunking, errors
├── rust/ # Rust examples (same layout)
├── typescript/ # TypeScript examples
├── go/, java/, csharp/, ruby/, r/
├── docker/ # Docker commands
├── api_server/ # Server startup examples
└── cli/ # CLI usage
When you change a user-facing API, update the matching snippet. When you add a new feature, create a snippet and include it from the relevant doc page.
Theme tokens (light mode)¶
Inline code and command-style monospace text in light mode use the text token #26203A. It is defined in docs/css/extra.css as --kb-text and referenced as var(--kb-text); brand backgrounds use the same value via --kb-brand-ink.
Debugging¶
Rust Panics¶
RUST_BACKTRACE=1 cargo test -p kreuzberg test_name # Short backtrace on panic
RUST_BACKTRACE=full cargo test -p kreuzberg test_name # Full backtrace, including runtime frames
Python FFI Problems¶
When something goes wrong in the Rust core during a Python call, the error introspection API gives you the details:
from kreuzberg import get_last_error_code, get_error_details, get_last_panic_context

# Details of the most recent error reported by the Rust core
details = get_error_details()
print(f"Error: {details['message']}")
print(f"Code: {details['error_code']}")

# Panic context, if there is one to report
context = get_last_panic_context()
if context:
    print(f"Panic context: {context}")
Verbose Logging¶
Crank up the log level to see what the Rust core is doing:
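The Rust and Python sections above set the level through the RUST_LOG environment variable. A minimal sketch of doing the same from a Python script follows; `run_with_rust_log` is a hypothetical helper, not part of Kreuzberg.

```python
import os
import subprocess
import sys


def run_with_rust_log(command, level="debug", **kwargs):
    """Run a command with RUST_LOG set, leaving the parent environment untouched."""
    # Copy the current environment and override only RUST_LOG.
    env = dict(os.environ, RUST_LOG=level)
    return subprocess.run(command, env=env, **kwargs)


# Example (not executed here): debug logs for one Rust test.
# run_with_rust_log(["cargo", "test", "-p", "kreuzberg", "test_name", "--", "--nocapture"])
```

Setting the variable per command, rather than exporting it in the shell, keeps verbose output confined to the run you are investigating.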
CI/CD¶
CI runs on every push and PR to main via .github/workflows/ci.yaml. The pipeline has four stages:
- Validate — conventional commits, formatting, clippy
- Build — FFI libraries, Python wheels, Node packages, all bindings
- Test — per-language test suites on Linux, macOS, and Windows
- Integration — Docker build, Docker smoke tests, CLI tests
Smart Change Detection¶
CI doesn't rebuild everything on every PR. A changes job detects which paths were touched and only runs the relevant build/test jobs. Edit a Python file? Only Python builds and tests run. Touch the Rust core? Everything downstream rebuilds.
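The selection logic can be sketched as a mapping from path patterns to jobs. The mapping below is hypothetical and covers only three jobs for illustration; the real rules are defined in .github/workflows/ci.yaml.

```python
from fnmatch import fnmatch

# Hypothetical path-to-job mapping, using directories named in this guide.
JOB_PATTERNS = {
    "rust": ["crates/*"],
    "python": ["packages/python/*"],
    "node": ["packages/typescript/*"],
}

# Every binding depends on the Rust core, so a core change triggers all jobs.
CORE_JOB = "rust"


def jobs_to_run(changed_paths):
    """Return the set of jobs affected by the given changed file paths."""
    touched = {
        job
        for job, patterns in JOB_PATTERNS.items()
        for path in changed_paths
        for pattern in patterns
        # fnmatch's "*" also matches "/" so one pattern covers nested files.
        if fnmatch(path, pattern)
    }
    if CORE_JOB in touched:
        return set(JOB_PATTERNS)
    return touched
```

A change under packages/python/ selects only the Python job, while a change under crates/ selects every job because the bindings build on the core.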
Running CI Checks Locally¶
Before pushing, you can run the same checks CI runs:
task check # Matches the validate stage
task rust:test:ci # Rust tests with CI diagnostics
task python:test:ci # Python tests with CI diagnostics
task test:all:ci # Everything
Other Workflows¶
| Workflow | When it runs | What it does |
|---|---|---|
| ci.yaml | Every push/PR to main | The main pipeline |
| docs.yaml | Changes to docs/ or zensical.toml | Builds and validates documentation |
| benchmarks.yaml | Manual trigger | Runs the full benchmark suite |
| profiling.yaml | Manual trigger | Generates flamegraphs |
| publish.yaml | Release events | Publishes packages to registries |
| publish-docker.yaml | Tags and releases | Builds and pushes Docker images |
Performance¶
Kreuzberg's core is written in Rust, which enables zero-copy memory handling, SIMD acceleration, and true multi-core parallelism, all in ahead-of-time compiled code with no garbage collector.
Why Rust Matters¶
- Native compilation: LLVM optimizes code ahead of time (inlining, vectorization, dead code elimination)
- Zero-copy strings: Slicing uses borrowed references, not heap allocations
- SIMD acceleration: Whitespace detection and character classification run 15-37x faster than scalar operations
- No GIL: True multi-core parallelism across all CPU cores
- Deterministic memory: Drop semantics free memory as soon as a value goes out of scope, with no GC pauses
Key Optimizations¶
- Batch processing: 6-10x faster than sequential extraction through a work-stealing scheduler
- Caching: 85%+ hit rates for repeated files (SQLite-backed, automatic invalidation)
- Streaming: Large files are processed in 4 KB chunks, so memory use stays constant regardless of file size
- Lazy initialization: Expensive subsystems (Tokio, plugins) initialized on first use only
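The streaming point is a general technique and can be illustrated outside Rust. The sketch below hashes a file in fixed-size chunks so that only one chunk is held in memory at a time; it is not Kreuzberg's implementation, and `stream_digest` is a hypothetical name.

```python
import hashlib

CHUNK_SIZE = 4 * 1024  # 4 KB, the chunk size quoted in the list above


def stream_digest(path):
    """Hash a file chunk by chunk; memory use is bounded by CHUNK_SIZE, not file size."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" at end of file.
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is identical to hashing the whole file at once, which is what makes chunked processing a safe substitute for reading everything into memory.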
Benchmarking Your Workload¶
Measure with your actual files using the benchmark harness (see the Benchmarking section above for full instructions). For detailed analysis and live benchmark results, visit https://kreuzberg.dev/benchmarks.