RAG & knowledge

Chunking

Splitting source documents into smaller pieces (paragraphs, sections) before embedding them for retrieval.

also known as: text splitting, document chunking

In depth

A retrieval system cannot usefully treat a 100-page PDF as one unit: a single embedding of the whole document would lose the granular signal. Chunking splits documents into smaller pieces, each embedded separately, so retrieval can find the specific paragraph that answers the question. Chunk size is a tuning knob: small chunks are precise but lose context, while large chunks keep context but blur the signal. Overlap repeats the tail of one chunk at the head of the next, so a sentence that straddles a boundary keeps some of its surrounding context. 200-500 tokens with a 50-token overlap is a common starting point.
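A minimal sketch of fixed-size chunking with overlap, in Python. The function name `chunk_words` and its parameters are hypothetical, and whitespace-separated words stand in for tokens here; a real pipeline would usually count tokens with the tokenizer of its embedding model, and might split on paragraph or section boundaries instead.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 300, overlap: int = 50) -> List[str]:
    """Split text into overlapping chunks of at most chunk_size words.

    Assumption: words approximate tokens. Swap in a real tokenizer
    if chunk sizes need to match a model's token limits.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        chunks.append(" ".join(window))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Each returned chunk would then be embedded and stored separately. With the defaults, consecutive chunks share 50 words, so the window advances 250 words at a time.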

Related concepts

RAG: A pattern where relevant documents are fetched from a knowledge base and pasted into context before the LLM answers.

Embedding: A fixed-size numeric vector that represents the semantic meaning of a piece of text.

Vector database: A database optimised for storing and querying high-dimensional vectors, typically for similarity search.

More in RAG & knowledge

Embedding: /glossary/embedding
Hybrid search: /glossary/hybrid-search
Knowledge base: /glossary/knowledge-base
RAG: /glossary/rag
Re-ranking: /glossary/rerank
Semantic search: /glossary/semantic-search