Document Summarisation

Produces a prose summary of an extracted document. The extractive backend uses a pure-Rust TextRank implementation; the abstractive backend uses liter-llm. The result populates ExtractionResult.summary.

Feature gate

The summarization feature ships the TextRank backend (no LLM required) and is included in no-ort-target, wasm-target, android-target, and full. Enable summarization-llm for the abstractive backend.

Backends

Strategy	Cargo feature	Network	Quality	Latency
`Extractive` (default)	`summarization`	None — fully local	Sentence-level selection from source	< 100 ms typical
`Abstractive`	`summarization-llm`	LLM provider	Generates novel prose, can summarise across sentences	Provider-dependent
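
To make the extractive row concrete, here is a minimal Python sketch of TextRank-style sentence selection. It illustrates the general technique only and is not Kreuzberg's Rust implementation: the naive sentence splitter, the word-overlap similarity, and the names `split_sentences`, `textrank_scores`, and `summarise_extractive` are all assumptions made for this example.

```python
import math
import re


def split_sentences(text: str) -> list[str]:
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a: set[str], b: set[str]) -> float:
    # Shared words, normalised by the log of each sentence's vocabulary size.
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def textrank_scores(sentences: list[str], damping: float = 0.85,
                    iterations: int = 50) -> list[float]:
    words = [set(re.findall(r"\w+", s.lower())) for s in sentences]
    n = len(sentences)
    # Weighted, undirected sentence graph with no self-loops.
    weights = [
        [similarity(words[i], words[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # PageRank-style update: each sentence receives a share of its
        # neighbours' scores, proportional to the edge weight.
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarise_extractive(text: str, max_sentences: int = 2) -> str:
    sentences = split_sentences(text)
    scores = textrank_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Emit the selected sentences in their original document order.
    return " ".join(sentences[i] for i in sorted(top[:max_sentences]))
```

Because every step is plain arithmetic over the source text, the same input always yields the same summary, which is the property that makes an extractive backend deterministic and network-free.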

When to Use

You need a one-paragraph TL;DR for indexing or search snippets.
You need a deterministic, network-free summary (extractive only).
You need a fluent abstractive summary for downstream LLM consumption.

When Not to Use

You need full per-section summaries. Chunk the document first and summarise each chunk separately.
You need cross-document summarisation. Summarise per document, then summarise the summaries with the LLM backend.
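
Both workarounds above follow the same pattern: summarise small pieces, then optionally summarise the results. The sketch below shows that pattern in plain Python. The `summarise` callable is a hypothetical stand-in for whatever produces a summary string (for example, a wrapper around an extraction call with summarization enabled), and the fixed-size word chunker is an assumption, not Kreuzberg's chunking behaviour.

```python
from typing import Callable


def chunk_by_words(text: str, max_words: int) -> list[str]:
    # Hypothetical chunker: fixed-size windows of whitespace-separated words.
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarise_sections(text: str, summarise: Callable[[str], str],
                       max_words: int) -> list[str]:
    # Per-section summaries: one summary per chunk, in document order.
    return [summarise(chunk) for chunk in chunk_by_words(text, max_words)]


def summarise_corpus(documents: list[str], summarise: Callable[[str], str]) -> str:
    # Cross-document summary: summarise each document, then summarise
    # the concatenated per-document summaries.
    per_document = [summarise(doc) for doc in documents]
    return summarise("\n\n".join(per_document))
```

In practice the second pass in `summarise_corpus` is where an abstractive summariser helps most, since it has to merge statements that come from different documents.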

Configuration

Python

import asyncio
from kreuzberg import extract_file, ExtractionConfig, SummarizationConfig

async def main() -> None:
    config = ExtractionConfig(
        summarization=SummarizationConfig(
            strategy="extractive",
            max_tokens=200,
        ),
    )
    result = await extract_file("report.pdf", config=config)
    if result.summary:
        print(result.summary.text)

asyncio.run(main())

TypeScript

import { extractFile } from '@kreuzberg/node';

const result = await extractFile("report.pdf", {
    summarization: {
        strategy: "extractive",
        maxTokens: 200,
    },
});
if (result.summary) {
    console.log(result.summary.text);
}

Rust

use kreuzberg::{extract_file, ExtractionConfig, SummarizationConfig};
use kreuzberg::types::summary::SummaryStrategy;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = ExtractionConfig {
        summarization: Some(SummarizationConfig {
            strategy: SummaryStrategy::Extractive,
            max_tokens: Some(200),
            llm: None,
        }),
        ..Default::default()
    };
    let result = extract_file("report.pdf", None, &config).await?;
    if let Some(summary) = result.summary {
        println!("{}", summary.text);
    }
    Ok(())
}

kreuzberg.toml

[summarization]
strategy = "extractive"
max_tokens = 200

Abstractive Backend

Switch the strategy and attach an LlmConfig:

Python

import asyncio
from kreuzberg import extract_file, ExtractionConfig, SummarizationConfig, LlmConfig

async def main() -> None:
    config = ExtractionConfig(
        summarization=SummarizationConfig(
            strategy="abstractive",
            max_tokens=300,
            llm=LlmConfig(model="openai/gpt-4o-mini"),
        ),
    )
    result = await extract_file("report.pdf", config=config)
    if result.summary:
        print(result.summary.text)

asyncio.run(main())

The model receives the extracted content, and its response is used verbatim as the summary. Token usage is recorded in ExtractionResult.llm_usage with source = "summarization".

`max_tokens` Semantics

Strategy	What `max_tokens` caps
`Extractive`	Loose whitespace tokens in the output summary. The TextRank selector stops appending sentences once it would exceed the cap.
`Abstractive`	The LLM provider's `max_tokens` request parameter. Counted in provider tokens.

Leave max_tokens unset (None) to let the backend pick a sensible default.
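
The extractive cap can be pictured as a greedy loop over sentences that are already ranked. The following is a minimal sketch of that behaviour, assuming "loose whitespace tokens" means the result of splitting on whitespace; the function name `select_within_cap` is hypothetical and the real selector may differ in detail.

```python
from typing import Optional


def select_within_cap(ranked_sentences: list[str],
                      max_tokens: Optional[int]) -> list[str]:
    selected: list[str] = []
    used = 0
    for sentence in ranked_sentences:
        cost = len(sentence.split())  # loose whitespace token count (assumption)
        # Stop as soon as the next sentence would push the summary over the cap.
        if max_tokens is not None and used + cost > max_tokens:
            break
        selected.append(sentence)
        used += cost
    return selected
```

Note that under this reading the cap is a ceiling on whole sentences: a summary can come in well under `max_tokens` if the next sentence does not fit.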

Output Shape

{
  "summary": {
    "text": "The contract sets out a 3-year support agreement with quarterly billing and a fixed escalation cap of 4%.",
    "strategy": "extractive",
    "token_count": 19
  }
}
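
If you consume the serialised result rather than the typed bindings, the three fields above map onto a small record. This is a sketch using only the field names shown in the JSON; the `Summary` dataclass and `parse_summary` helper are hypothetical and not part of the Kreuzberg API.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Summary:
    text: str
    strategy: str
    token_count: int


def parse_summary(payload: str) -> Optional[Summary]:
    # The "summary" key is treated as optional: it may be absent or null
    # when summarization is not enabled (assumption).
    data = json.loads(payload).get("summary")
    if not data:
        return None
    return Summary(
        text=data["text"],
        strategy=data["strategy"],
        token_count=data["token_count"],
    )
```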

Provider Setup (Abstractive Only)

Pick any liter-llm provider — see LLM Integration. For most documents, gpt-4o-mini, claude-3-5-haiku, or google/gemini-2.0-flash give good cost / quality trade-offs.

API-key precedence:

SummarizationConfig.llm.api_key
KREUZBERG_LLM_API_KEY
Per-provider env var
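
The lookup order can be sketched as a first-match search. This example assumes the list above is ordered from highest to lowest priority and that an empty value counts as unset; the helper name `resolve_api_key` and the `OPENAI_API_KEY` variable used in the usage note are illustrative assumptions, so check LLM Integration for the actual per-provider names.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(config_key: Optional[str],
                    provider_env_var: str,
                    env: Mapping[str, str] = os.environ) -> Optional[str]:
    candidates = (
        config_key,                        # SummarizationConfig.llm.api_key
        env.get("KREUZBERG_LLM_API_KEY"),  # global override
        env.get(provider_env_var),         # per-provider variable (name assumed)
    )
    # Return the first candidate that is set and non-empty.
    for candidate in candidates:
        if candidate:
            return candidate
    return None
```

For example, `resolve_api_key(None, "OPENAI_API_KEY")` falls through to the environment, whereas passing an explicit key as the first argument short-circuits both environment lookups.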

LLM Integration — provider matrix, API-key precedence
Document Translation — sibling LLM post-processor
Configuration Reference
