Long conversations or dense tool outputs blow through the context window. The model forgets the goal, repeats itself, and costs spike because every turn rereads everything.
Long sessions, agents that read lots of files, research tasks that keep exploring instead of converging.
Short stateless calls. The summarisation step is overhead with no benefit when the context is already small.
Drop this into an app.yaml. Adjust the credential refs and module names to fit your existing setup.
1execution:2 mode: conversation3 hooks:4 - id: compact_when_pressured5 "on": turn_end6 condition: { type: context_pressure, threshold: 0.7 }7 action:8 type: compact_context9 keep_recent_turns: 410 summarize_with: { model: claude-haiku-4-5, credential: anthropic_main }11 preserve:12 - goal13 - todos14 - last_user_message1516agents:17 - id: helper18 modules: [{filesystem: [read, edit, write]}, {web: [search]}]19 brain: { model: claude-sonnet-4-6, credential: anthropic_main }Walking through the YAML one block at a time so the design is clear, not memorised.
The hook fires at 70% of the model's context window, not at a fixed turn number. Long turns trigger it sooner, short turns later.
The last four turns stay raw. Recent context is most useful and summarising it would lose detail the model just used.
A Haiku call summarises everything before the recent window into a few hundred tokens. Cost is one cheap call per compaction, not one per turn.
Goal, todos, and the last user message are preserved verbatim. The model never loses sight of what it's doing or what the user asked for.
The pattern above is not the only answer. Here is when something else is the right call.
Drop oldest turns silently. Cheap, lossy, can lose critical context the model needed.
Push facts to a vector store and recall on demand. Robust for very long-running agents, more setup, slower per-turn.
Engineering notes from the Digitorn team. No marketing, no launch announcements, no "10 prompts that will change your life". Just the things we write that we'd want to read.