Models & inference

Streaming

Receiving the LLM's output token by token as it is generated, instead of waiting for the full response.

Also known as: token streaming, SSE streaming

In depth

Streaming is what lets the user see the agent's answer appear as it's being written, instead of staring at a spinner for ten seconds. Most providers support it through Server-Sent Events (SSE), where the response arrives as a series of small events over a single HTTP connection. On Digitorn the runtime streams tokens straight to the UI, the CLI, or whatever channel the agent is responding through. Tool calls are streamed too, which is how the agent can show status updates in the middle of a turn.
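To make the mechanics concrete, here is a minimal sketch of the consuming side of an SSE stream in Python. It follows the general SSE framing (events separated by blank lines, payloads on `data:` lines, comment lines starting with `:`). The payload shape `{"token": "..."}`, the `[DONE]` end marker, and the function names are assumptions made for illustration; they are not Digitorn's wire format or any specific provider's.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event found in `lines`."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; emit it if it carried data.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "data":
            data_parts.append(value)


def stream_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield tokens one at a time.

    Assumes a hypothetical payload of the form {"token": "..."} and a
    hypothetical "[DONE]" marker that signals the end of the response.
    """
    for payload in iter_sse_data(lines):
        if payload == "[DONE]":
            return
        yield json.loads(payload)["token"]


# Hand-written sample standing in for the lines of an HTTP response body.
SAMPLE = [
    ": keep-alive\n",
    'data: {"token": "Hel"}\n',
    "\n",
    'data: {"token": "lo"}\n',
    "\n",
    "data: [DONE]\n",
    "\n",
]

if __name__ == "__main__":
    # Print each token as soon as it is parsed instead of buffering the reply.
    for token in stream_tokens(SAMPLE):
        print(token, end="", flush=True)
    print()
```

In a real client the `lines` argument would be the line iterator of an open HTTP response rather than a list, so each token is displayed the moment its event arrives.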

Related concepts

Tokens: The units LLMs process. Roughly four characters of English per token, billed per million.

Tool use: The LLM capability to emit a structured call to an external function, which the runtime executes and feeds back as context.

