A 100-page PDF cannot be usefully retrieved as one unit: a single embedding of the whole document would lose the granular signal. Chunking splits documents into smaller pieces, each embedded separately, so retrieval can find the specific paragraph that answers the question. Chunk size is a tuning knob: small chunks are precise but lose context, large chunks keep context but blur the signal. Chunks of 200 to 500 tokens with a 50-token overlap are a common starting point.
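As a rough illustration of fixed-size chunking with overlap, here is a minimal sketch. It assumes whitespace-separated words as a stand-in for real tokenizer tokens, and the names `chunk_tokens` and `chunk_text` are hypothetical, not taken from any particular library. A production pipeline would likely count tokens with the embedding model's own tokenizer and prefer splitting on paragraph or sentence boundaries.

```python
def chunk_tokens(tokens, chunk_size=300, overlap=50):
    """Split a token list into windows of `chunk_size` tokens.

    Consecutive windows share `overlap` tokens, so a sentence that
    straddles a boundary appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the document,
        # otherwise the next window would only repeat the tail.
        if start + chunk_size >= len(tokens):
            break
    return chunks


def chunk_text(text, chunk_size=300, overlap=50):
    """Chunk raw text, using whitespace words as approximate tokens (an assumption)."""
    words = text.split()
    return [" ".join(piece) for piece in chunk_tokens(words, chunk_size, overlap)]


if __name__ == "__main__":
    document = " ".join(f"word{i}" for i in range(1000))
    pieces = chunk_text(document, chunk_size=300, overlap=50)
    for i, piece in enumerate(pieces):
        print(i, len(piece.split()))
```

Each string returned by `chunk_text` would then be embedded and indexed separately. Raising `chunk_size` trades precision for context, and `overlap` controls how much text neighbouring chunks share.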
Engineering notes from the Digitorn team. No marketing, no launch announcements, no "10 prompts that will change your life". Just the things we write that we'd want to read.