Are hooks required to ship an agent on Digitorn?

No. The defaults are sensible and you can ship most agents without hooks. Hooks are the layer you add when you want production-grade guardrails: rate limiting, tool-call caps, automatic linting on writes.

Can hooks run Python code?

Indirectly, yes. The hook system itself is declarative. To run Python, you drop a custom .py module in your app directory and reference it from a hook with module_action. The Python file is a normal module - the runtime loads it, the hook calls into it like any other tool.

Do hooks slow down the agent?

Not measurably for the patterns in this article. Hooks run in the runtime's event loop and are filtered before any expensive work runs. The exceptions are hooks that explicitly call into a slow side-effect (an LSP server, a shell command), which add latency only when they fire.

Can I share a hook configuration across agents?

Yes. Hooks live either in the runtime block, where they apply to every agent in the app, or at the agent level, where they apply to one specific agent. YAML anchors and references work the same way they do for any other config block.
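As a rough sketch of the anchor approach: define a hook once, then alias it wherever else it is needed. The anchor (&) and alias (*) syntax is standard YAML. The agents list, its hooks key, and the agent ids below are assumptions made for illustration, not taken from the Digitorn schema, so check the reference docs for the real agent-level layout.

```yaml
# Hypothetical layout: the `agents` list and its `hooks` key are assumed here.
agents:
  - id: researcher              # hypothetical agent id
    hooks:
      # Anchor: define the hook once and name it runaway_cap.
      - &runaway_cap
        id: runaway_cap
        "on": tool_start
        condition: { type: tool_calls, threshold: 100 }
        action: { type: gate, reason: "Tool-call ceiling reached." }
  - id: writer                  # hypothetical agent id
    hooks:
      # Alias: reuse the exact same hook definition.
      - *runaway_cap
```

If every agent in the app needs the hook, placing it once in the runtime block is simpler than aliasing it per agent.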

What is the difference between a hook and a behaviour rule?

Behaviour rules (under security.behavior.rule_definitions) target loop-control concerns like read-before-edit and per-action policy. Hooks are the more general primitive - any event in the agent loop, any condition, any action. Behaviour rules compile to hooks under the hood.

Hooks: 4 production patterns we ship today on Digitorn

The hook system is the part of Digitorn that lets you turn a working agent into a production-ready one without writing any Python. This piece walks through the four patterns that actually ship in our builtins today. Every YAML below has been compiled against the live runtime; every action and condition referenced is a real registered handler.

An earlier draft of this post listed seven patterns. Three of them depended on runtime features that are still in progress (LLM-driven summarisation in hooks, template resolution inside inject_message, expression-based access to tool results). We took them out rather than ship a guide where nearly half the examples wouldn't run.

The agent loop exposes five points where you can intercept it without writing Python. Four of them matter here: turn_start runs before the LLM sees the next user message, tool_start and tool_end wrap each tool call, and turn_end runs after the LLM has produced its response. The four patterns below all hang off tool_start and tool_end.

1. Lint after every write

The single hook that comes closest to being mandatory for a coding agent. After any tool call that writes a file, run the linter, parse the diagnostics, and inject them back into the next turn. The agent self-corrects on the next pass without you writing the loop.

```yaml
runtime:
  hooks:
    - id: lint_after_write
      "on": tool_end
      condition:
        type: tool_name
        match:
          - filesystem.write
          - filesystem.edit
          - workspace.write
          - workspace.edit
      action:
        type: lsp_diagnose
        inject_result: true
        publish: true
```

inject_result: true is the load-bearing flag. It means the LSP diagnostics get merged into the tool's response payload, so the LLM sees the lint errors in the same context as the success status. Without that flag, the agent thinks the write succeeded and moves on. With it, the agent reads "wrote 42 lines, 2 errors" and reaches for the linter's suggestion automatically.

publish: true also pushes the diagnostics to the diagnostics preview channel, so a connected client (canvas, IDE plugin) can render them inline.

Tip

The tool_name.match field accepts a string, a pipe-separated list ("filesystem.write|filesystem.edit"), or a YAML list. Wildcards work too: match: "filesystem.*" would catch any filesystem action. We picked the explicit list above because it reads better in a code review.
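For comparison, here is a sketch of the match forms described in this tip, side by side. The hook ids are made up for illustration, and each hook reuses the lsp_diagnose action from the example above. These are alternatives: in a real app you would pick one form, since enabling several would lint the same write more than once.

```yaml
runtime:
  hooks:
    # Form 1: a single string matches one tool.
    - id: lint_single            # hypothetical id
      "on": tool_end
      condition: { type: tool_name, match: filesystem.write }
      action: { type: lsp_diagnose, inject_result: true }

    # Form 2: a pipe-separated string matches any of the listed tools.
    - id: lint_piped             # hypothetical id
      "on": tool_end
      condition: { type: tool_name, match: "filesystem.write|filesystem.edit" }
      action: { type: lsp_diagnose, inject_result: true }

    # Form 3: a YAML list, equivalent to the pipe-separated form.
    - id: lint_listed            # hypothetical id
      "on": tool_end
      condition:
        type: tool_name
        match:
          - filesystem.write
          - filesystem.edit
      action: { type: lsp_diagnose, inject_result: true }

    # Wildcard: catches any filesystem action.
    - id: lint_wildcard          # hypothetical id
      "on": tool_end
      condition: { type: tool_name, match: "filesystem.*" }
      action: { type: lsp_diagnose, inject_result: true }
```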

2. Hard ceiling on a specific tool

The simplest guard for a third-party API is a hard ceiling. Past N calls to that tool in a session, the gate fires and the agent receives a tool-execution refusal it can react to.

```yaml
runtime:
  hooks:
    - id: web_fetch_cap
      "on": tool_start
      condition:
        type: all_of
        conditions:
          - { type: tool_name, match: web.fetch }
          - { type: tool_calls, threshold: 50 }
      action:
        type: gate
        reason: "Cap reached on web.fetch (50 calls). Stopping the loop."
```

The tool_calls condition tracks the running count of all tool calls in the session, not only calls to web.fetch. Combined with tool_name, the hook gates web.fetch once that session-wide count crosses the threshold, so choose the number with the agent's other tools in mind. The agent reads the refusal and typically reframes its plan instead of pounding the API.

Note

The hook's own cooldown and max_fires fields rate-limit the hook itself, not the underlying tool. With max_fires: 100, the gate would simply stop firing after 100 fires - so calls 101+ would succeed. For a true session-wide ceiling, use the tool_calls condition above.
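To make the contrast concrete, here is a sketch of the configuration this note warns against. It assumes max_fires sits directly on the hook entry, next to id and "on"; the exact placement is not shown in this article, and the hook id is hypothetical.

```yaml
runtime:
  hooks:
    # Anti-pattern: max_fires limits how often the hook fires,
    # not how often web.fetch can be called.
    - id: web_fetch_gate_limited   # hypothetical id
      "on": tool_start
      max_fires: 100               # assumed to be a hook-level field
      condition: { type: tool_name, match: web.fetch }
      action:
        type: gate
        reason: "web.fetch is gated."
```

As described above, a hook like this would stop firing after its 100th fire, and later web.fetch calls would go through ungated.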

3. Global cap on runaway loops + ops notification

The horror story everyone has heard: an agent loop runs unattended, hits some weird state, and racks up cost before someone notices. The simplest hard guard is an unconditional tool-call ceiling. Past N calls in a session, gate the next call AND notify ops via the chain action.

```yaml
runtime:
  hooks:
    - id: runaway_cap
      "on": tool_start
      condition:
        type: tool_calls
        threshold: 100
      action:
        type: chain
        actions:
          - type: gate
            reason: "Runaway cap (100 tool calls) reached. Stopping the agent."
          - type: notify
            level: warning
            title: "Agent stopped: runaway cap"
            message: "Session crossed the 100-call threshold."
```

gate blocks the call; notify is fire-and-forget telemetry that surfaces in the workbench logs. The chain action runs both in order. The runtime tracks per-session tool_calls, turn_count, and message_count natively - pick whichever fits your safety story.
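If turns are a better fit than tool calls, the same guard might look like the sketch below. It assumes turn_count is exposed as a condition type that takes the same threshold field as tool_calls, which this article does not confirm; the hook id and the value 30 are illustrative only.

```yaml
runtime:
  hooks:
    - id: turn_cap                 # hypothetical id
      "on": tool_start
      condition:
        type: turn_count           # assumed condition type, mirroring tool_calls
        threshold: 30              # illustrative value
      action:
        type: gate
        reason: "Turn cap (30 turns) reached. Stopping the agent."
```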

Note

Direct token-cost tracking inside hook conditions is on the roadmap. Until then, tool-call count is a coarse but reliable proxy: every tool call has a non-zero LLM cost, so capping calls caps cost.

4. Notify on tool failure

The fastest way to learn that an agent is failing on a specific tool is to surface the failure in real time. The tool_failed condition fires whenever a tool returns an error, and the notify action posts the event to the workbench logs (and to any subscribed observability sink).

```yaml
runtime:
  hooks:
    - id: notify_tool_failure
      "on": tool_end
      condition: { type: tool_failed }
      action:
        type: notify
        level: warning
        title: "Tool failure: {{tool.name}}"
        message: "{{tool.error}}"
        tag: tool_failed
```

{{tool.name}} and {{tool.error}} are the two template variables the runtime resolves inside hook action parameters. They expand against the live tool state at fire time, so each notification carries the failing tool's short name and the exact error string the runtime captured.

This is the lightest possible diagnostic hook. Pair it with level: error and a tag you can filter on in your dashboard, and you have a free first line of defence against silent failures.
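The stricter variant is a one-line change. This sketch only swaps the level on the notify action for error and keeps the tag for dashboard filtering; everything else is unchanged from the hook in this section.

```yaml
runtime:
  hooks:
    - id: notify_tool_failure
      "on": tool_end
      condition: { type: tool_failed }
      action:
        type: notify
        level: error               # raised from warning
        title: "Tool failure: {{tool.name}}"
        message: "{{tool.error}}"
        tag: tool_failed           # filter on this tag in your dashboard
```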

Tip

If you also want the agent to react to the failure (retry on a different endpoint, fall back to a cheaper tool), use pipe instead of notify. pipe routes the failed tool's params into a second tool with field mapping. The Tool Hooks reference page in the docs walks through the syntax.

Stacking the four

The patterns compose. A typical production-leaning agent runs all four at once:

```yaml
runtime:
  hooks:
    - id: lint_writes
      "on": tool_end
      condition: { type: tool_name, match: ["filesystem.write", "filesystem.edit"] }
      action: { type: lsp_diagnose, inject_result: true }

    - id: web_fetch_cap
      "on": tool_start
      condition:
        type: all_of
        conditions:
          - { type: tool_name, match: web.fetch }
          - { type: tool_calls, threshold: 50 }
      action: { type: gate, reason: "Cap on web.fetch reached." }

    - id: runaway_cap
      "on": tool_start
      condition: { type: tool_calls, threshold: 100 }
      action:
        type: chain
        actions:
          - { type: gate, reason: "Tool-call ceiling reached." }
          - { type: notify, level: warning, title: "Agent gated" }

    - id: notify_tool_failure
      "on": tool_end
      condition: { type: tool_failed }
      action:
        type: notify
        level: warning
        title: "Tool failure: {{tool.name}}"
        message: "{{tool.error}}"
        tag: tool_failed
```

Four blocks, four production behaviours. The whole thing reads top to bottom like any other YAML.

What is on the roadmap

The hook system is moving forward. A few patterns we want to ship that need runtime work first, in case you are scoping a longer-term plan:

Goal injection at turn start. Pin the user's original goal at the top of every turn so the LLM cannot drift after compaction. Needs template resolution inside inject_message (currently the action takes literal strings).
LLM-driven content transformations in hooks. Auto-summarise large tool results, classify intent before routing, run a fact-check pass after a write. Needs an action that wraps llm_provider.chat with a known prompt shape.
Token-cost ceilings. Hard cap an agent at $X per session. Needs the runtime to surface per-session cost into the hook condition context (today only turn, tools, messages, pressure, tokens are visible).

When these land they will appear here. We did not want this article to be a wishlist - the four above are the ones you can ship today.

Try it

The fastest way to feel the difference is to install one of the Hub builtins (which already ship hooks) and inspect the YAML.

```bash
curl -sSL https://digitorn.ai/install | sh
digitorn install hub://digitorn/digitorn-builder
digitorn app schema digitorn-builder | grep -A 12 hooks
```

You will see the same patterns from this article in the wild. The digitorn-builder builtin uses the lint-after-write pattern (with its specialist compile_yaml action) to surface compiler errors back to the agent the moment it saves a broken app.yaml.


Ship your first AI agent in 5 minutes.

Open-source. Self-hosted. YAML-first. Bring your own LLM keys, agents run on your machine.
