Digitorn

Hooks: 4 production patterns we ship today on Digitorn

The hooks that actually work right now in the runtime, with the YAML to copy. Lint after every write, cap calls to a single tool, hard-cap total tool calls, notify on tool failure.

Digitorn · May 5, 2026 · 7 min read

The hook system is the part of Digitorn that lets you turn a working agent into a production-ready one without writing any Python. This piece walks through the four patterns that actually ship in our builtins today. Every YAML snippet below has been compiled against the live runtime; every action and condition referenced is a real registered handler.

An earlier draft of this post listed seven patterns. Three of them depended on runtime features that are still in progress (LLM-driven summarisation in hooks, template resolution inside inject_message, expression-based access to tool results). We took them out rather than ship a guide where half the examples wouldn't run.

[Figure: A single turn, with the hook events that fire along the way. Turn arrives: turn_start (inject context, log session id). Tool selected: tool_start (rate-limit, transform args). Tool runs (tool execution). Tool returns: tool_end (lint, summarise, redact). LLM responds: turn_end (audit, count tokens, archive). Each event runs zero or more hooks, declared in YAML and fire-and-forget by default.]
Four events where you can intercept the agent without writing Python. The patterns in this article hang off two of them: tool_start and tool_end.

turn_start runs before the LLM sees the next user message. tool_start and tool_end wrap each tool call. turn_end runs after the LLM has produced its response. The four patterns below all hang off tool_start and tool_end.

1. Lint after every write

The single hook that comes closest to being mandatory for a coding agent. After any tool call that writes a file, run the linter, parse the diagnostics, and inject them back into the next turn. The agent self-corrects on the next pass without you writing the loop.

```yaml
runtime:
  hooks:
    - id: lint_after_write
      "on": tool_end
      condition:
        type: tool_name
        match:
          - filesystem.write
          - filesystem.edit
          - workspace.write
          - workspace.edit
      action:
        type: lsp_diagnose
        inject_result: true
        publish: true
```

inject_result: true is the load-bearing flag. It means the LSP diagnostics get merged into the tool's response payload, so the LLM sees the lint errors in the same context as the success status. Without that flag, the agent thinks the write succeeded and moves on. With it, the agent reads "wrote 42 lines, 2 errors" and reaches for the linter's suggestion automatically.

publish: true also pushes the diagnostics to the diagnostics preview channel, so a connected client (canvas, IDE plugin) can render them inline.

Tip
The tool_name.match field accepts a string, a pipe-separated list ("filesystem.write|filesystem.edit"), or a YAML list. Wildcards work too: match: "filesystem.*" would catch any filesystem action. We picked the explicit list above because it reads better in a code review.
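
For comparison, here is a sketch of the three match forms the tip describes, written as three separate hooks. The hook ids are made up for illustration, and in a real app you would keep only one of the three, since they overlap.

```yaml
runtime:
  hooks:
    # Form 1: YAML list, the style used in the lint_after_write example.
    - id: lint_match_list            # hypothetical id
      "on": tool_end
      condition:
        type: tool_name
        match:
          - filesystem.write
          - filesystem.edit
      action: { type: lsp_diagnose, inject_result: true }

    # Form 2: pipe-separated string naming the same two tools.
    - id: lint_match_pipe            # hypothetical id
      "on": tool_end
      condition: { type: tool_name, match: "filesystem.write|filesystem.edit" }
      action: { type: lsp_diagnose, inject_result: true }

    # Form 3: wildcard. Broader than the other two: it catches any
    # filesystem action, not only write and edit.
    - id: lint_match_wildcard        # hypothetical id
      "on": tool_end
      condition: { type: tool_name, match: "filesystem.*" }
      action: { type: lsp_diagnose, inject_result: true }
```

The wildcard form is the shortest, but it would also run the linter after filesystem actions that do not write anything, which is one more reason to prefer the explicit list.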

2. Hard ceiling on a specific tool

The simplest guard for a third-party API is a hard ceiling. Once the session's tool-call count passes N, the gate fires on the next call to that tool and the agent receives a tool-execution refusal it can react to.

```yaml
runtime:
  hooks:
    - id: web_fetch_cap
      "on": tool_start
      condition:
        type: all_of
        conditions:
          - { type: tool_name, match: web.fetch }
          - { type: tool_calls, threshold: 50 }
      action:
        type: gate
        reason: "Cap reached on web.fetch (50 calls). Stopping the loop."
```

The tool_calls condition tracks the running count of all tool calls in the session; combined with tool_name, it gates web.fetch once that session-wide count passes the threshold. This is not a count of web.fetch calls alone: calls to other tools also move the session toward the cap. The agent reads the refusal and typically reframes its plan instead of pounding the API.

Note
The hook's own cooldown and max_fires fields rate-limit the hook itself, not the underlying tool. With max_fires: 100, the gate would simply stop firing after 100 fires, so every matching call after that would go through. For a true session-wide ceiling, use the tool_calls condition above.
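
To make the difference concrete, here is a sketch of the two variants side by side. Placing max_fires at the top level of the hook is an assumption drawn from the note's wording ("the hook's own" fields); the first hook id and its reason string are illustrative, and the comments describe the behaviour the note implies rather than anything tested here.

```yaml
runtime:
  hooks:
    # Anti-pattern: max_fires limits how often this hook fires, not how
    # often web.fetch runs. Per the note, the gate goes quiet after its
    # 100th fire and later matching calls pass through.
    - id: web_fetch_leaky_gate       # hypothetical id
      "on": tool_start
      max_fires: 100                 # assumed placement: hook-level field
      condition: { type: tool_name, match: web.fetch }
      action: { type: gate, reason: "Gate that stops firing after 100 fires." }

    # Ceiling that holds: the limit lives in the condition, so the gate
    # keeps firing for as long as the session stays past the threshold.
    - id: web_fetch_cap
      "on": tool_start
      condition:
        type: all_of
        conditions:
          - { type: tool_name, match: web.fetch }
          - { type: tool_calls, threshold: 50 }
      action: { type: gate, reason: "Cap on web.fetch reached." }
```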

3. Global cap on runaway loops + ops notification

The horror story everyone has heard: an agent loop runs unattended, hits some weird state, and racks up cost before someone notices. The simplest hard guard is an unconditional tool-call ceiling. Past N calls in a session, gate the next call AND notify ops via the chain action.

```yaml
runtime:
  hooks:
    - id: runaway_cap
      "on": tool_start
      condition:
        type: tool_calls
        threshold: 100
      action:
        type: chain
        actions:
          - type: gate
            reason: "Runaway cap (100 tool calls) reached. Stopping the agent."
          - type: notify
            level: warning
            title: "Agent stopped: runaway cap"
            message: "Session crossed the 100-call threshold."
```

gate blocks the call; notify is fire-and-forget telemetry that surfaces in the workbench logs. The chain action runs both in order. The runtime tracks per-session tool_calls, turn_count, and message_count natively, so pick whichever fits your safety story.
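
As a sketch of that last point, the same runaway guard could key on turns rather than tool calls. This assumes turn_count can be used as a condition type with the same threshold field as tool_calls, which this article does not spell out, so check the hooks reference before copying it. The id and the limit of 40 are illustrative.

```yaml
runtime:
  hooks:
    - id: runaway_turn_cap           # hypothetical id
      "on": tool_start
      condition:
        type: turn_count             # assumption: mirrors the tool_calls condition
        threshold: 40                # illustrative limit
      action:
        type: chain
        actions:
          - type: gate
            reason: "Turn ceiling (40 turns) reached. Stopping the agent."
          - type: notify
            level: warning
            title: "Agent stopped: turn cap"
            message: "Session crossed the 40-turn threshold."
```

A turn-based cap suits agents whose turns vary widely in how many tools they call; a tool-call cap tracks cost more closely.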

Note
Direct token-cost tracking inside hook conditions is on the roadmap. Until then, tool-call count is a coarse but reliable proxy: every tool call has a non-zero LLM cost, so capping calls caps cost.

4. Notify on tool failure

The fastest way to learn that an agent is failing on a specific tool is to surface the failure in real time. The tool_failed condition fires whenever a tool returns an error; the notify action posts the event to the workbench logs (and to any subscribed observability sink).

```yaml
runtime:
  hooks:
    - id: notify_tool_failure
      "on": tool_end
      condition: { type: tool_failed }
      action:
        type: notify
        level: warning
        title: "Tool failure: {{tool.name}}"
        message: "{{tool.error}}"
        tag: tool_failed
```

{{tool.name}} and {{tool.error}} are the two template variables the runtime resolves inside hook action parameters. They expand against the live tool state at fire time, so each notification carries the failing tool's short name and the exact error string the runtime captured.

This is the lightest possible diagnostic hook. Pair it with level: error and a tag you can filter on in your dashboard, and you have a free first line of defence against silent failures.
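
A sketch of that pairing, narrowed to a single tool: the hook below raises an error-level notification only when web.fetch fails. It reuses the all_of, tool_name, and tool_failed conditions from earlier in the article; that they compose this way is an assumption, and the id and tag value are illustrative.

```yaml
runtime:
  hooks:
    - id: notify_web_fetch_failure   # hypothetical id
      "on": tool_end
      condition:
        type: all_of
        conditions:
          - { type: tool_name, match: web.fetch }
          - { type: tool_failed }
      action:
        type: notify
        level: error                 # escalated from warning
        title: "Tool failure: {{tool.name}}"
        message: "{{tool.error}}"
        tag: web_fetch_failed        # illustrative tag to filter on in a dashboard
```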

Tip
If you also want the agent to react to the failure (retry on a different endpoint, fall back to a cheaper tool), use pipe instead of notify. pipe routes the failed tool's params into a second tool with field mapping. The Tool Hooks reference page in the docs walks through the syntax.

Stacking the four

The patterns compose. A typical production-leaning agent runs all four at once:

```yaml
runtime:
  hooks:
    - id: lint_writes
      "on": tool_end
      condition: { type: tool_name, match: ["filesystem.write", "filesystem.edit"] }
      action: { type: lsp_diagnose, inject_result: true }

    - id: web_fetch_cap
      "on": tool_start
      condition:
        type: all_of
        conditions:
          - { type: tool_name, match: web.fetch }
          - { type: tool_calls, threshold: 50 }
      action: { type: gate, reason: "Cap on web.fetch reached." }

    - id: runaway_cap
      "on": tool_start
      condition: { type: tool_calls, threshold: 100 }
      action:
        type: chain
        actions:
          - { type: gate, reason: "Tool-call ceiling reached." }
          - { type: notify, level: warning, title: "Agent gated" }

    - id: notify_tool_failure
      "on": tool_end
      condition: { type: tool_failed }
      action:
        type: notify
        level: warning
        title: "Tool failure: {{tool.name}}"
        message: "{{tool.error}}"
        tag: tool_failed
```

Four blocks, four production behaviours. The whole thing reads top to bottom like any other YAML.

What is on the roadmap

The hook system is moving forward. A few patterns we want to ship that need runtime work first, in case you are scoping a longer-term plan:

  • Goal injection at turn start. Pin the user's original goal at the top of every turn so the LLM cannot drift after compaction. Needs template resolution inside inject_message (currently the action takes literal strings).
  • LLM-driven content transformations in hooks. Auto-summarise large tool results, classify intent before routing, run a fact-check pass after a write. Needs an action that wraps llm_provider.chat with a known prompt shape.
  • Token-cost ceilings. Hard cap an agent at $X per session. Needs the runtime to surface per-session cost into the hook condition context (today only turn, tools, messages, pressure, tokens are visible).

When these land, they will appear here. We did not want this article to be a wishlist; the four above are the ones you can ship today.

Try it

The fastest way to feel the difference is to install one of the Hub builtins (which already ship hooks) and inspect the YAML.

```bash
curl -sSL https://digitorn.ai/install | sh
digitorn install hub://digitorn/digitorn-builder
digitorn app schema digitorn-builder | grep -A 12 hooks
```

You will see the same patterns from this article in the wild. The digitorn-builder builtin uses the lint-after-write pattern (with its specialist compile_yaml action) to surface compiler errors back to the agent the moment it saves a broken app.yaml.
