In depth
An LLM is a function: text in, text out. Modern LLMs (Claude Sonnet, GPT-4o, DeepSeek V3, Llama 3) also emit structured tool calls, which is what makes them useful as agent brains. The quality difference between providers and tiers is real, but for many agent tasks (exploration, classification, triage) a cheaper model's output is often indistinguishable from a premium one's. Routing each turn to the right model is most of the cost story in production.
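As a rough illustration of that routing idea, here is a minimal sketch in Python. The turn types come from the paragraph above; the `Route` class, the `route_turn` function, and the model identifiers are hypothetical placeholders, not any provider's real API or model names.

```python
from dataclasses import dataclass

# Turn types the text names as good candidates for a cheaper model.
CHEAP_TURNS = {"exploration", "classification", "triage"}


@dataclass(frozen=True)
class Route:
    tier: str
    model: str


# Placeholder identifiers (assumptions); swap in real model names.
CHEAP = Route(tier="cheap", model="cheap-model")
PREMIUM = Route(tier="premium", model="premium-model")


def route_turn(turn_type: str) -> Route:
    """Pick a model tier from the kind of turn the agent is about to run."""
    if turn_type in CHEAP_TURNS:
        return CHEAP
    # Unknown or quality-sensitive turns (e.g. writing) go to the premium tier.
    return PREMIUM
```

Defaulting unknown turn types to the premium tier is one possible design choice: it trades some cost for not silently degrading a turn nobody classified.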
Related concepts
Frontier model: The highest-quality, most expensive tier from a provider. Examples: Claude Sonnet, GPT-4o, Gemini Pro.
Open-weight model: A model whose weights are publicly downloadable and runnable on your own hardware via Ollama, vLLM, or similar.
Tokens: The units LLMs process. Roughly four characters of English per token; usage is billed per million tokens.
Model routing: Sending different turn types to different models (cheap for exploration, premium for writing).
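The token arithmetic in the Tokens entry can be sketched as below. This is a back-of-the-envelope estimate only: it assumes the four-characters-per-token rule of thumb, whereas real tokenizers vary by model and language. The function names and any price you pass in are illustrative assumptions, not real provider pricing.

```python
import math

# Rule of thumb from the glossary: about four English characters per token.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count, rounding up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Convert a token count into cost, given a price per million tokens."""
    return tokens / 1_000_000 * price_per_million_usd


if __name__ == "__main__":
    prompt = "a" * 4000  # a 4,000-character prompt, about 1,000 tokens
    tokens = estimate_tokens(prompt)
    # The 3.00 USD per million figure is a made-up price for illustration.
    print(tokens, estimate_cost_usd(tokens, price_per_million_usd=3.00))
```

For billing-grade numbers, use the token counts the provider reports rather than a character-based estimate.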