The context window caps how much input plus output fits in one call. Claude Sonnet ships 200K, GPT-4o is 128K, Gemini Pro pushes to 2M. Long sessions overflow eventually, which is why compaction exists. Bigger windows are not always better: cost scales with input tokens, and models often degrade in quality at the far edge of their window.
Engineering notes from the Digitorn team. No marketing, no launch announcements, no "10 prompts that will change your life". Just the things we write that we'd want to read.