How to build a Claude Code clone in YAML

Claude Code is genuinely good. It's also a closed-source CLI that calls Anthropic's infrastructure with Anthropic's prompts and bills your account for every keystroke. If you want roughly the same experience but running on your laptop, against the model provider you choose, with the prompts all sitting in a file you can read and edit, that's what this guide is about. Around 50 lines of YAML and you're there.

The tour goes like this. We pick apart what Claude Code is actually doing under the hood. Each piece becomes a YAML primitive on the Digitorn runtime. Then we look at the full app.yaml, talk about cost (Sonnet for the writing, Haiku for the grunt work), and walk through the multi-agent dispatch, which is where most home-rolled clones fall over. At the end there's a 5-minute install path.

By the time you're done you'll have a coding agent on your machine doing what Claude Code does, on your API keys, with the entire behaviour readable in a single config file.

The short version

Claude Code feels magical, but the recipe is mundane. Short tool names. A coordinator that delegates. A read-before-edit guard. A plan written before any code is touched. Every one of those is reproducible declaratively if your runtime exposes the right primitives. Digitorn does, so the rest is just YAML.

Fifty lines, same loop, your keys.

What's actually going on inside Claude Code

Strip the polish off and there are five specific things tuned in a specific way. Reproducing those gets you most of the way there.

Short, ergonomic tool names

Tools are called Write, Read, Edit, Bash, Grep, Glob, Agent. Not filesystem.write or shell.bash_execute. The reason is partly economics (every byte the LLM emits costs money) and partly cognitive: a short, unambiguous name like Write(path, content) lets the model commit. Something like tools.filesystem.WriteFileWithOptions(...) makes it hesitate, then hallucinate options.

Spawning sub-agents on the fly

Ask Claude Code to "find every place this function is called and refactor them" and what really happens is two passes: a search worker runs in parallel to map call sites, then a refactor worker rewrites each. You don't see the orchestration. You just see the result.

That trick is what makes a coding agent feel competent. A single 200K-context model trying to grep thirty files, read them, and rewrite them all in one prompt is a hallucination factory. Splitting the job into a coordinator plus focused workers is what production setups do.

Refusing to edit what hasn't been read

Claude Code won't edit a file unless it has been read first. Sounds dull. It's the single difference between "the agent helped me" and "the agent silently corrupted half my codebase". Ask any LLM to edit a file it has never seen and it will happily invent the contents, guided by the filename. Every time.

Writing a plan before doing anything

For anything non-trivial, Claude Code first emits a numbered plan, then executes it step by step. Without that, the agent wanders. With it, you get focused work, and you can read the plan to know what's about to happen.

Clean interrupt and hot reload

Hit Ctrl+C and the agent stops cleanly: in-flight tool calls are cancelled, conversation state is preserved, the next prompt picks up without re-establishing anything. Change a config file and it reloads in place. Both are the kind of thing you don't notice until you use a tool that doesn't have them, then can't go back.

Five pieces, laid out side by side:

Short tool names: Write, Edit, Bash (ergonomics)
Sub-agent dispatch: coordinator + workers (scale)
Read-before-edit: no silent corruption (safety)
Plan-first: numbered, deliberate (focus)
Clean abort: Ctrl+C just works

Each one becomes a primitive in the YAML below. None of them require code on your side.

The architecture in YAML

Here's the full app.yaml for a Claude Code-equivalent agent, running on Digitorn:

```yaml
app:
  app_id: my-coder
  name: "My Coder"
  version: "1.0.0"
  description: "A self-hosted coding agent. Multi-agent, plan-first, read-before-edit."
  category: "developer-tools"

runtime:
  mode: conversation
  entry_agent: coordinator
  workdir: "{{env.PWD}}"

agents:
  - id: coordinator
    role: coordinator
    plan_first: true
    brain:
      provider: anthropic
      model: claude-sonnet-4-6     # the writer brain
      backend: anthropic
      config:
        api_key: "{{env.ANTHROPIC_API_KEY}}"
      temperature: 0.2
      max_tokens: 8192
      context:
        max_tokens: 200000
        strategy: summarize
      fallback:                     # 402 => switch to Haiku
        provider: anthropic
        model: claude-haiku-4-5
        config:
          api_key: "{{env.ANTHROPIC_API_KEY}}"
    system_prompt: |
      You are a senior engineer. For non-trivial tasks:
      1. Read relevant files BEFORE editing them.
      2. Write a numbered plan, share it, then execute.
      3. Spawn search/triage specialists in PARALLEL when scanning
         a codebase. Don't grep + read 30 files yourself.
      4. After every Edit/Write, run any relevant tests with Bash.

  - id: explorer
    role: specialist
    specialty: "Find files, grep symbols, sample contents"
    modules:
      - {filesystem: [read, grep, glob]}
      - {shell: [bash]}
    brain:
      provider: anthropic
      model: claude-haiku-4-5      # cheap fast triage
      config:
        api_key: "{{env.ANTHROPIC_API_KEY}}"
      temperature: 0.0
    system_prompt: |
      You explore codebases. Return only the key findings: paths,
      symbols, code excerpts. No prose. Be FAST.

tools:
  modules:
    filesystem: {}              # read-before-edit guard is built-in
    shell:
      config:
        timeout: 300            # default per-command timeout (seconds)
    memory:
      config:
        working_memory: true
        todo_list: true
    web:
      config:
        search_backend: duckduckgo
  capabilities:
    default_policy: auto
```

That's the entire file. Here's what the runtime actually wires up at boot:

One coordinator on Sonnet, one explorer on Haiku, and five modules they share. Now let's walk through what each block buys you.

Modules are the toolbox

Each module is a set of tools the agents can call. The thing worth knowing is that modules are shared between the coordinator and the explorer. Same _read_files set, same workspace, same working memory. The runtime handles all of that. You just declare what's available.

The read-before-edit guard is built into the filesystem module. An Edit on a file the session hasn't read yet returns "Cannot edit /src/foo.py, read it first" as a tool error. The model reads the error, reads the file, retries. Same behaviour as Claude Code, same safety. There is no flag to set; it is on by default.
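To make the guard concrete, here is a minimal Python sketch of the idea, not Digitorn's actual implementation. The class name GuardedFilesystem and its method signatures are assumptions made for illustration; only the per-session set of read files and the shape of the error message come from the description above.

```python
from pathlib import Path


class GuardedFilesystem:
    """Hypothetical stand-in for a filesystem module with a read-before-edit guard."""

    def __init__(self) -> None:
        # Per-session set of files the agent has already read.
        self._read_files: set[Path] = set()

    def read(self, path: str) -> str:
        resolved = Path(path).resolve()
        text = resolved.read_text()
        self._read_files.add(resolved)
        return text

    def edit(self, path: str, old: str, new: str) -> str:
        resolved = Path(path).resolve()
        if resolved not in self._read_files:
            # Returned as a tool error rather than raised, so the model can recover.
            return f"Cannot edit {path}, read it first"
        text = resolved.read_text()
        if old not in text:
            return f"Cannot edit {path}, text to replace not found"
        resolved.write_text(text.replace(old, new, 1))
        return "ok"
```

Returning the error as a plain tool result is the important design choice in this sketch: the model sees it on the next turn and can read the file and retry without the session failing.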

Agent spawn is the dispatcher

This is what lets the coordinator hand work off to an explorer. The whole multi-agent surface in Digitorn is a single tool, Agent, with eight modes selected by parameters. Seven of them look like this:

```python
Agent(prompt="find all callers of foo()", specialist="explorer")  # spawn + wait
Agent(prompt="...", specialist="explorer", wait=false)             # spawn background
Agent(agent_id="abc-123")                                          # check status
Agent(agent_ids=["a", "b", "c"])                                   # wait for many
Agent(agent_id="abc-123", cancel=true)                             # kill stuck agent
Agent(agent_id="abc-123", reassign="try with different terms")     # retry
Agent(list=true)                                                   # list all
```

A coordinator can spawn five explorers in parallel, wait for the whole batch, and then act on the combined result. The Python you'd write yourself to do that, retries and cancellation included, is closer to 500 lines than 50.
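For a sense of what the runtime is saving you, here is a rough Python sketch of just the fan-out-and-wait part using asyncio. The function names run_explorer and fan_out are hypothetical, and the worker is a stub that sleeps instead of calling a model; retries and reassignment are left out.

```python
import asyncio


async def run_explorer(prompt: str) -> str:
    """Hypothetical worker: stands in for one explorer sub-agent call."""
    await asyncio.sleep(0.01)  # simulate model + tool latency
    return f"findings for: {prompt}"


async def fan_out(prompts: list[str], timeout: float = 5.0) -> list[str]:
    """Spawn one explorer per prompt, wait for the batch, keep input order."""
    tasks = [asyncio.create_task(run_explorer(p)) for p in prompts]
    try:
        return await asyncio.wait_for(asyncio.gather(*tasks), timeout)
    except asyncio.TimeoutError:
        # Cancel stragglers so nothing keeps running in the background.
        for task in tasks:
            task.cancel()
        raise
```

Calling asyncio.run(fan_out(["callers of foo()", "tests for foo()"])) returns one result per prompt, in the order the prompts were given.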

Per-agent brain, where the cost story lives

The coordinator runs on Claude Sonnet, 8192-token outputs, summarize compaction, full 200K context. That's where the actual code gets written, so quality has to be good. The explorer runs on Haiku. Roughly 5x cheaper, 3x faster, perfectly capable of grepping a directory and returning a list of hits.

That single split is what turns a self-hosted coding agent from "expensive curiosity" into something you can run all day. A coordinator-only setup with Sonnet on every turn costs roughly the same as a Claude Code subscription. Offloading exploration to Haiku trims around 60% off that on realistic workloads, because exploration is most of the LLM time.
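The arithmetic behind that claim can be sketched in a few lines of Python. The 75% exploration share below is an assumed input chosen to illustrate the shape of the saving, not a measured figure, and the model assumes the cheaper model uses the same number of tokens for the same exploration work.

```python
def routed_cost(explore_share: float, price_ratio: float) -> float:
    """Cost after routing, as a fraction of the Sonnet-only baseline.

    explore_share: fraction of baseline spend that is exploration work.
    price_ratio:   how many times cheaper the small model is.
    """
    if not 0.0 <= explore_share <= 1.0:
        raise ValueError("explore_share must be between 0 and 1")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    # Non-exploration spend is unchanged; exploration spend shrinks by the ratio.
    return (1.0 - explore_share) + explore_share / price_ratio


# Assumed inputs: 75% of spend is exploration, small model is 5x cheaper.
saving = 1.0 - routed_cost(explore_share=0.75, price_ratio=5.0)
print(f"{saving:.0%}")
```

Under those assumptions the saving works out to 60%. With a smaller exploration share the saving shrinks proportionally, which is why the workload mix matters more than the exact price ratio.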

Numbers above are normalised against the Sonnet-only baseline, measured across 50 real refactor and debug sessions on the digitorn-code builtin. Your mix will vary, but the shape doesn't.

Plan-first as a runtime rule, not a prompt instruction

memory.runtime.plan_first: true flips on a built-in behaviour rule. Before the first non-trivial tool call, the agent has to produce a plan. If it skips, the runtime injects a "did you plan first?" reminder in the next turn's context. None of that lives in your system prompt, which is why the focused, numbered execution that makes Claude Code look so deliberate keeps working even on long tasks.
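As an illustration of how a rule like this can live in the runtime rather than in the prompt, here is a small Python sketch. The class PlanFirstRule and its methods are hypothetical; only the reminder text and the "inject on the next turn" behaviour come from the description above.

```python
from dataclasses import dataclass, field

REMINDER = "did you plan first?"


@dataclass
class PlanFirstRule:
    """Hypothetical runtime rule: remind the agent until a plan exists."""

    has_plan: bool = False
    pending_reminders: list[str] = field(default_factory=list)

    def on_plan(self, plan: str) -> None:
        # Any non-empty plan satisfies the rule in this sketch.
        self.has_plan = bool(plan.strip())

    def on_tool_call(self, trivial: bool = False) -> None:
        # A non-trivial tool call without a plan queues a reminder.
        if not self.has_plan and not trivial:
            self.pending_reminders.append(REMINDER)

    def next_turn_context(self, messages: list[str]) -> list[str]:
        # Reminders are injected by the runtime, not by the system prompt.
        injected, self.pending_reminders = self.pending_reminders, []
        return messages + injected
```

Because the rule keeps its own state, it behaves the same at turn 40 as at turn 1, regardless of what compaction has done to the system prompt or early messages.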

Brain fallback so a quota error doesn't ruin your Saturday

The fallback block is a small detail with outsized payoff. When the primary provider answers with a 402 ("insufficient credit") or a rate-limit error, the runtime swaps in the fallback provider mid-conversation. The agent doesn't notice. The user doesn't notice. You don't have to scramble to top up an Anthropic balance at midnight.
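The pattern itself is simple, and a minimal Python sketch shows its shape. ProviderError and complete_with_fallback are hypothetical names, and treating HTTP 429 as the rate-limit signal is an assumption about what the runtime checks; the 402 case comes from the text above.

```python
class ProviderError(Exception):
    """Hypothetical error carrying the HTTP status a provider returned."""

    def __init__(self, status: int) -> None:
        super().__init__(f"provider returned HTTP {status}")
        self.status = status


# 402 = payment required (out of credit), 429 = too many requests (rate limit).
FALLBACK_STATUSES = {402, 429}


def complete_with_fallback(prompt, primary, fallback):
    """Call primary; on a quota or rate-limit error, retry once on fallback."""
    try:
        return primary(prompt)
    except ProviderError as err:
        if err.status not in FALLBACK_STATUSES:
            raise  # other failures should surface, not be masked
        return fallback(prompt)
```

Limiting the swap to quota and rate-limit statuses is deliberate in this sketch: a 500 or a malformed request is a real problem that a second provider would only hide.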

What it looks like in practice

Easier to make this concrete with an example. Say you ask the coordinator: "Find every place we call auth.verify_token() and add logging before each call." Here's how the ten turns play out across the agents.

Ten turns to find four call sites and patch them. Sonnet thinks and edits. Haiku does the grepping. The runtime takes care of spawning, waiting, and merging the results.

A few things are worth pointing at in this trace. The coordinator never greps the codebase itself; it hands that off to a Haiku worker that comes back in roughly a second. The read-before-edit guard catches the first attempt to edit login.py, so the coordinator reads it before touching it. Tests run after every edit, not at the end, which means a regression caught at turn 9 doesn't snowball through the next three files. Sonnet tokens go only to the steps that actually need Sonnet.

What goes wrong, and how the runtime catches it

A handful of failure modes show up over and over when people roll their own coding agent. These are the ones worth knowing about up front.

The first is the agent editing a file it has never read. Without a guard, the LLM imagines the file contents from the filename and writes a "fix" that bulldozes the real code. It's the leading cause of "the AI deleted my work" stories. The Digitorn filesystem module keeps a per-session set of files the agent has read; an Edit call on anything outside that set returns an error in the tool result, which the model usually responds to by reading first and trying again. The guard is always on, no flag required.

Bash running outside the workspace is the next big one. The agent runs an rm -rf somewhere it shouldn't, and that's the end of your afternoon. shell.allowed_roots resolves the cwd of every shell command against an allow-list, defaulting to the user home, the workspace, and the temp dir. Anything outside is rejected. / is never on the list.
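An allow-list check of this kind might look like the following Python sketch. The helper names default_roots and cwd_allowed are made up for illustration; the point is that the path is resolved (symlinks and ".." included) before it is compared against the roots.

```python
import tempfile
from pathlib import Path


def default_roots(workspace: str) -> list[Path]:
    # Mirrors the defaults described above: home, workspace, temp dir.
    return [
        Path.home().resolve(),
        Path(workspace).resolve(),
        Path(tempfile.gettempdir()).resolve(),
    ]


def cwd_allowed(cwd: str, roots: list[Path]) -> bool:
    """True if cwd, with symlinks and '..' resolved, sits inside a root."""
    resolved = Path(cwd).resolve()
    return any(resolved == root or root in resolved.parents for root in roots)
```

Resolving first is what stops a command from escaping with a path like workspace/../../etc, which would pass a naive string-prefix check.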

Then there's raw tool output eating the context window. One Read("100KB-log.txt") and you've burned a third of your budget. filesystem.config.max_file_kb clamps reads and tells the model the result was truncated, which usually triggers a follow-up Read(path, offset=2000, limit=500). Paginated reads, like you'd do with grep.
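Here is a hedged Python sketch of a clamped, paginated read. The function read_lines and its return shape are assumptions, and treating offset and limit as line counts is a guess at the semantics of the Read call above. Lines longer than the whole budget are not split in this sketch.

```python
from typing import Optional


def read_lines(lines: list[str], offset: int = 0,
               limit: Optional[int] = None, max_kb: int = 64) -> dict:
    """Return a window of lines, clamped to max_kb, with a truncation notice."""
    budget = max_kb * 1024
    end = len(lines) if limit is None else min(len(lines), offset + limit)
    out, used = [], 0
    for line in lines[offset:end]:
        size = len(line.encode("utf-8")) + 1  # +1 for the newline
        if used + size > budget:
            break
        out.append(line)
        used += size
    next_offset = offset + len(out)
    truncated = next_offset < len(lines)
    return {
        "content": "\n".join(out),
        "truncated": truncated,
        # Tells the model where to resume, e.g. Read(path, offset=next_offset).
        "next_offset": next_offset if truncated else None,
    }
```

Reporting next_offset alongside the truncation flag gives the model everything it needs to ask for the next page instead of re-reading from the top.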

The infinite re-edit loop is more annoying than dangerous: the agent edits, the file doesn't compile, it edits again, doesn't compile, repeats. The coding behaviour profile rate-limits same-file edits ("more than three of these in five turns and we stop and ask the user"). The lsp_diagnose hook runs the linter after every write so lint errors land in the next turn's tool result, which lets the agent self-correct, but only up to a point.
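The rate limit can be sketched as a sliding window over turn numbers. EditLoopGuard is a hypothetical name, and the defaults (more than three edits to one file within five turns) are taken from the rule quoted above.

```python
from collections import deque


class EditLoopGuard:
    """Hypothetical limiter for repeated edits to the same file."""

    def __init__(self, max_edits: int = 3, window: int = 5) -> None:
        self.max_edits = max_edits
        self.window = window
        self._edits: dict[str, deque] = {}

    def record_edit(self, path: str, turn: int) -> bool:
        """Record an edit; return True when the agent should stop and ask."""
        turns = self._edits.setdefault(path, deque())
        turns.append(turn)
        # Drop edits that fell out of the sliding window of turns.
        while turns and turns[0] <= turn - self.window:
            turns.popleft()
        return len(turns) > self.max_edits
```

Four edits to the same file on consecutive turns trip the guard on the fourth; the same four edits spread over turns 1, 3, 5 and 7 do not, because the oldest has left the window by then.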

The last one is goal drift. After thirty-plus turns, context compaction starts shaving off the early messages and the original task fades. Calling memory.set_goal on the first user message keeps the goal pinned at the top of every subsequent turn, even after compaction. At turn 50 the model still sees the same one-line directive it saw at turn 1.
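A small Python sketch shows why pinning survives compaction. Both functions are hypothetical, and the "summary" here is a placeholder string rather than a real summarisation; the idea is only that the goal is stored outside the history that gets compacted.

```python
def compact(messages: list[str], keep_last: int) -> list[str]:
    """Naive compaction: collapse everything but the most recent messages."""
    if len(messages) <= keep_last:
        return list(messages)
    dropped = len(messages) - keep_last
    recent = messages[len(messages) - keep_last:]
    return [f"[summary of {dropped} earlier messages]"] + recent


def build_context(goal: str, messages: list[str], keep_last: int = 4) -> list[str]:
    # The goal lives outside the message history, so compaction cannot drop it.
    return [f"GOAL: {goal}"] + compact(messages, keep_last)
```

With fifty turns of history and keep_last=4, the context is the goal line, one summary line, and the last four messages, so the directive from turn 1 is still the first thing the model sees.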

Getting it running in five minutes

If you want to try this tonight, the path is roughly:

```bash
# Install the runtime (Mac, Linux, or Windows with Git Bash)
curl -sSL https://digitorn.ai/install | sh

# Drop your Anthropic key in
echo 'ANTHROPIC_API_KEY=sk-ant-...' >> ~/.digitorn/.env

# Save the YAML above as my-coder.yaml somewhere convenient
nano my-coder.yaml   # paste the YAML

# Deploy and open
digitorn app deploy my-coder.yaml
digitorn dev chat my-coder
```

You land in a terminal session with a coding agent that knows about the directory you started it in, can read and write files there, run tests, and spawn explorer workers in parallel. Same loop as Claude Code. Your keys. Your YAML.

If you'd rather not start from a blank file, the Digitorn Hub already ships a polished version called digitorn-code under developer tools. One install command and you're done.

A few questions worth answering

Is this exactly Claude Code? No. Anthropic's internal prompts aren't public, and probably never will be. What this clone reproduces is the architecture: the tool surface, the read-before-edit guard, the multi-agent dispatch, the plan-first behaviour, the cost routing. The prompts are yours to write and iterate on, which is more than you get with the closed product.

Can I run it on something other than Anthropic? Yes, against any OpenAI-compatible endpoint. Swap anthropic for openai, deepseek, azure_openai, mistral, groq, together, or whatever else. The same YAML works with DeepSeek V3 (the cheapest serious option right now), GPT-4o, or a Llama 70B served by vLLM on a local box. Mixing per agent is fine and usually a good idea: coordinator on Sonnet, explorer on DeepSeek, writer on GPT-4o.

How is this different from LangChain or CrewAI? Different philosophy. LangChain is Python-as-config: you build agents by writing Python. Digitorn is YAML-as-config: you declare the agent and the runtime executes it. LangChain is better when you need deeply custom Python-native pipelines. Digitorn is better when you want a coding agent and you don't want to maintain framework code on top of it. The full comparison matrix is in Digitorn vs LangChain.

What about Cursor, Aider, Continue? Different shapes. Cursor is a full IDE, so a different category. Aider is the closest in spirit (self-hosted, CLI), but its configuration leans Python and is harder to extend. Continue is an editor extension. The thing specific to Digitorn is the declarative YAML plus built-in multi-agent.

Does it work fully offline? Yes, as long as your model has an OpenAI-compatible endpoint. Run Ollama or vLLM locally, point the YAML at the local endpoint (http://localhost:11434/v1 is Ollama's default; vLLM listens on http://localhost:8000/v1 unless you change the port), and you have a coding agent that never phones home. Quality is the trade-off: local 70B-class models still trail Sonnet for production coding work, but the gap shrinks every month.

Can I share an agent with my team? Push it to the Digitorn Hub. Your teammate runs digitorn install hub://your-publisher/my-coder and they have it. Agents travel as YAML plus small assets, so prompts get reviewed in PRs like any other code.

A few links if you want to keep going

📦 Install Digitorn and try the YAML above
📚 The full module reference, every primitive used here is documented
🔄 Self-hosted coding agents compared, Digitorn next to LangChain and CrewAI
📂 The developer-tools agents, where digitorn-code ships ready-to-go

Built a variant worth sharing? Push it to the Hub. That's how the ecosystem grows.

The Digitorn team

We build the open-source AI agent runtime that runs on your own machine. YAML over Python, multi-agent by default, marketplace for sharing.
