RAG

A pattern where relevant documents are fetched from a knowledge base and pasted into context before the LLM answers.

also known as: retrieval-augmented generation

In depth

RAG bridges the gap between an LLM's training data and your specific knowledge. The idea: when the user asks a question, retrieve the most relevant chunks from a vector database, paste them into the prompt, then let the model answer with that context. RAG is the default architecture for support bots, internal Q&A agents, and anything that needs to answer from a corpus of documents you control.
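The retrieve-then-prompt flow described above can be sketched in a few lines. This is a minimal illustration under loud assumptions, not a production design: `embed` is a toy bag-of-words counter standing in for a real embedding model, an in-memory list stands in for the vector database, and the names `retrieve`, `build_prompt`, and `CHUNKS` are hypothetical. The final call to an LLM is left out, since it depends on whichever provider you use.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Paste the retrieved chunks into the prompt ahead of the question."""
    joined = "\n\n".join(context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}"
    )


# Hypothetical knowledge base, already split into chunks.
CHUNKS = [
    "A refund is issued within 14 days of purchase.",
    "Our office is closed on public holidays.",
    "To request a refund, email support with your order number.",
]

question = "How do I request a refund?"
top_chunks = retrieve(question, CHUNKS, k=2)
prompt = build_prompt(question, top_chunks)  # this string is what the LLM would receive
```

In a real system the embeddings would be computed once at indexing time and stored, so that only the question is embedded per request.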

Related concepts

Embedding: A fixed-size numeric vector that represents the semantic meaning of a piece of text.

Vector database: A database optimised for storing and querying high-dimensional vectors, typically for similarity search.

Semantic search: Searching by meaning instead of keywords. Powered by embeddings and vector databases.

Chunking: Splitting source documents into smaller pieces (paragraphs, sections) before embedding them for retrieval.
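As an illustration of the chunking step, here is one possible paragraph-based splitter. It is a sketch under stated assumptions: the function name `chunk_by_paragraph`, the `max_chars` limit, and the choice to merge short paragraphs are all hypothetical, and real pipelines often split by tokens or by document sections instead.

```python
def chunk_by_paragraph(document: str, max_chars: int = 500) -> list[str]:
    """Split on blank lines, then merge neighbouring paragraphs up to max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut.
    """
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for p in paragraphs:
        candidate = f"{current}\n\n{p}" if current else p
        if not current or len(candidate) <= max_chars:
            current = candidate  # still fits (or is the first paragraph of a chunk)
        else:
            chunks.append(current)  # close the current chunk and start a new one
            current = p
    if current:
        chunks.append(current)
    return chunks


# Hypothetical source document with three short paragraphs.
DOC = "Refund policy.\n\nRefunds take 14 days.\n\nContact support by email."
chunks = chunk_by_paragraph(DOC, max_chars=40)
```

With this input the first two paragraphs fit in one chunk and the third starts a new one. Each resulting chunk would then be embedded and stored for retrieval.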
