What is Chunk Size?

Chunk size is the length of each document unit indexed in a retrieval-augmented generation (RAG) system, measured in tokens, characters, or structural units such as paragraphs or sections.

How It Works

Chunk size controls how much evidence each retrieved result carries. Smaller chunks often improve precision because each result focuses on a narrow idea, but they can strip away context needed to answer a question. Larger chunks preserve surrounding context and citation continuity, but they may dilute embedding similarity, consume more of the context-window budget, and make top-k results less specific. In production RAG, chunk size should usually be measured in model tokens, tested against real queries, and adjusted by document type rather than treated as a universal constant.

Key Characteristics

  • Usually measured in tokens for LLM workflows, though ingestion pipelines may start from characters or words
  • Directly affects retrieval precision, context-window usage, and indexing volume
  • Interacts with chunk overlap, document structure, top-k, reranking, and citation requirements
  • Often varies by content type, such as policies, API docs, code, legal contracts, or support tickets
  • Should be tuned with retrieval evaluation rather than selected by intuition alone
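
The effect on indexing volume can be estimated with simple arithmetic. The sketch below assumes fixed-size windows that advance by chunk_size minus overlap tokens. The function name and the sample numbers are illustrative, not taken from any particular library.

```python
import math


def estimate_chunk_count(doc_tokens: int, chunk_size: int, overlap: int = 0) -> int:
    """Estimate how many chunks a document of doc_tokens tokens produces."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    if doc_tokens <= 0:
        return 0
    if doc_tokens <= chunk_size:
        return 1
    step = chunk_size - overlap  # tokens advanced per window
    return 1 + math.ceil((doc_tokens - chunk_size) / step)


# Hypothetical 10,000-token document at two chunk sizes with 10% overlap.
for size, overlap in ((500, 50), (250, 25)):
    count = estimate_chunk_count(10_000, size, overlap)
    print(f"chunk_size={size} overlap={overlap} -> {count} chunks")
```

Halving the chunk size roughly doubles the number of vectors to embed and store, which is one reason sizing decisions also affect indexing cost.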

Common Use Cases

  1. Setting 300- to 600-token sections for product documentation RAG
  2. Using larger chunks for legal clauses that require surrounding definitions
  3. Keeping code examples and explanations together in developer search
  4. Reducing context cost by shrinking overly broad chunks
  5. Running offline experiments to compare answer quality across chunk sizes
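
An offline comparison of the kind listed above might look like the following toy sketch. It assumes word-based chunks, a naive word-overlap scorer in place of an embedding retriever, and a tiny hand-labeled query set. The document, queries, and function names are all made up for illustration.

```python
def split_words(text, chunk_size):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def score(query, chunk):
    """Count distinct words shared by query and chunk (stand-in for similarity)."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def hit_rate(document, labeled_queries, chunk_size, top_k=1):
    """Fraction of queries whose expected answer text appears in the top_k chunks."""
    chunks = split_words(document, chunk_size)
    hits = 0
    for query, answer in labeled_queries:
        ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
        if any(answer.lower() in ch.lower() for ch in ranked[:top_k]):
            hits += 1
    return hits / len(labeled_queries)


# Illustrative corpus and labels; a real experiment would use production queries.
document = (
    "Refunds are issued within 14 days of purchase. "
    "Shipping is free for orders above 50 dollars. "
    "Support is available by email on weekdays."
)
labeled_queries = [
    ("when are refunds issued", "14 days"),
    ("is shipping free", "50 dollars"),
    ("how is support available", "by email"),
]

results = {size: hit_rate(document, labeled_queries, size) for size in (4, 8, 16)}
for size, rate in results.items():
    print(f"chunk_size={size:>2} words  hit_rate@1={rate:.2f}")
```

The same loop structure applies when the scorer is replaced by a real retriever and the metric by an answer-quality measure. Only the chunk size varies between runs, so differences in the metric can be attributed to it.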

Example

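The following is a minimal sketch of fixed-size chunking with overlap. It assumes whitespace-separated words as a stand-in for model tokens. A production pipeline would typically count tokens with the target model's tokenizer instead. The function name chunk_tokens and the sample sentence are illustrative.

```python
def chunk_tokens(tokens, chunk_size, overlap=0):
    """Split a token list into windows of at most chunk_size tokens.

    Consecutive windows share `overlap` tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


# Assumption: whitespace splitting approximates tokenization for this demo.
text = "Refunds are issued within 14 days of purchase if the item is unused"
tokens = text.split()
chunks = chunk_tokens(tokens, chunk_size=6, overlap=2)
for chunk in chunks:
    print(len(chunk), " ".join(chunk))
```

With an overlap of 2, the last two tokens of each chunk reappear at the start of the next, which helps keep sentences that straddle a boundary retrievable from either side.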

Frequently Asked Questions

What is a good default chunk size for RAG?

There is no universal default, but many text-heavy RAG systems start around a few hundred tokens and then tune using retrieval and answer-quality evaluation.

Are smaller chunks always better?

No. Smaller chunks can retrieve precise snippets, but they may omit definitions, assumptions, tables, or preceding context needed for a correct answer.

Why measure chunk size in tokens?

LLMs operate under token budgets. Token-based sizing makes retrieval payloads easier to compare against context-window limits and generation cost.
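
As a rough illustration of that comparison, the sketch below computes how many chunks of a given size fit in a context window after setting aside tokens for the prompt and the generated answer. The window size and reservations are hypothetical numbers, not the limits of any specific model.

```python
def chunks_that_fit(context_window, prompt_tokens, reserved_output_tokens, chunk_size):
    """Return how many whole chunks fit in the remaining token budget."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    available = context_window - prompt_tokens - reserved_output_tokens
    return max(available // chunk_size, 0)


# Hypothetical budget: 8,192-token window, 600-token prompt, 1,000 tokens for output.
for size in (500, 1000):
    k = chunks_that_fit(8192, 600, 1000, size)
    print(f"chunk_size={size}: up to {k} chunks")
```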

Should every document use the same chunk size?

Usually not. A codebase, a policy manual, and a FAQ page have different structure and evidence needs, so they often require different sizing rules.
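
One way to express per-type sizing is a small lookup table with a fallback, sketched below. The content-type keys and every number are placeholders chosen for illustration. They are not recommended values, and each would need tuning through retrieval evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRule:
    chunk_size: int  # target tokens per chunk
    overlap: int     # tokens shared between neighbouring chunks


# Placeholder values for illustration only; tune each with retrieval evaluation.
RULES = {
    "product_docs": ChunkRule(chunk_size=450, overlap=50),
    "legal_contract": ChunkRule(chunk_size=900, overlap=100),
    "faq": ChunkRule(chunk_size=200, overlap=0),
}
DEFAULT_RULE = ChunkRule(chunk_size=400, overlap=40)


def rule_for(content_type: str) -> ChunkRule:
    """Look up the sizing rule for a content type, falling back to the default."""
    return RULES.get(content_type, DEFAULT_RULE)


print(rule_for("legal_contract"))
print(rule_for("support_ticket"))  # not listed, so the default applies
```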
