In depth
Retrieval-augmented generation (RAG) bridges the gap between an LLM's training data and your own knowledge. The idea: when the user asks a question, the system retrieves the most relevant chunks from a vector database, inserts them into the prompt, and lets the model answer with that context. RAG is a common default architecture for support bots, internal Q&A agents, and anything else that needs to answer from a corpus of documents you control.
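The retrieve-then-prompt flow can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory `CHUNKS` list stands in for a vector database, and the names `retrieve`, `build_prompt`, and the sample chunks are all hypothetical. The final call to an LLM is left out; the sketch stops at the prompt you would send.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, chunks: list[str], k: int = 2) -> str:
    """Paste the retrieved chunks into the prompt ahead of the question."""
    context = "\n\n".join(retrieve(question, chunks, k))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical corpus; in practice these come from chunking your documents.
CHUNKS = [
    "Refunds are issued within 14 days of a return request.",
    "Our support desk is open Monday to Friday, 9am to 5pm.",
    "You can reset your password from the account settings page.",
]

if __name__ == "__main__":
    print(build_prompt("How do I reset my password?", CHUNKS, k=1))
```

In a real system the word-count vectors would be replaced by dense embeddings from a model, and the sorted scan by a similarity query against a vector database; the shape of the pipeline stays the same.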
Related concepts
Embedding: A fixed-size numeric vector that represents the semantic meaning of a piece of text.
Vector database: A database optimised for storing and querying high-dimensional vectors, typically for similarity search.
Semantic search: Searching by meaning instead of keywords. Powered by embeddings and vector databases.
Chunking: Splitting source documents into smaller pieces (paragraphs, sections) before embedding them for retrieval.