Claude Code is genuinely good. It's also a closed-source CLI that calls Anthropic's infrastructure with Anthropic's prompts and bills your account for every token. If you want roughly the same experience but running on your laptop, against the model provider you choose, with the prompts all sitting in a file you can read and edit, that's what this guide is about. Around 50 lines of YAML and you're there.
The tour goes like this. We pick apart what Claude Code is actually doing under the hood. Each piece becomes a YAML primitive on the Digitorn runtime. Then we look at the full app.yaml, talk about cost (Sonnet for the writing, Haiku for the grunt work), and cover multi-agent dispatch, which is where most home-rolled clones fall over. At the end there's a 5-minute install path.
By the time you're done you'll have a coding agent on your machine doing what Claude Code does, on your API keys, with the entire behaviour readable in a single config file.
The short version
Claude Code feels magical, but the recipe is mundane. Short tool names. A coordinator that delegates. A read-before-edit guard. A plan written before any code is touched. Every one of those is reproducible declaratively if your runtime exposes the right primitives. Digitorn does, so the rest is just YAML.
Fifty lines, same loop, your keys.
What's actually going on inside Claude Code
Strip the polish off and there are five specific things tuned in a specific way. Reproducing those gets you most of the way there.
Short, ergonomic tool names
Tools are called Write, Read, Edit, Bash, Grep, Glob, Agent. Not filesystem.write or shell.bash_execute. The reason is partly economics (every byte the LLM emits costs money) and partly cognitive: a short, unambiguous name like Write(path, content) lets the model commit. Something like tools.filesystem.WriteFileWithOptions(...) makes it hesitate, then hallucinate options.
Spawning sub-agents on the fly
Ask Claude Code to "find every place this function is called and refactor them" and what really happens is two passes: a search worker runs in parallel to map call sites, then a refactor worker rewrites each. You don't see the orchestration. You just see the result.
That trick is what makes a coding agent feel competent. A single 200K-context model trying to grep thirty files, read them, and rewrite them all in one prompt is a hallucination factory. Splitting the job into a coordinator plus focused workers is what production setups do.
Refusing to edit what hasn't been read
Claude Code won't edit a file unless it has been read first. Sounds dull. It's the single difference between "the agent helped me" and "the agent silently corrupted half my codebase". Ask any LLM to edit a file it has never seen and it will happily invent the contents, guided by the filename. Every time.
Writing a plan before doing anything
For anything non-trivial, Claude Code first emits a numbered plan, then executes it step by step. Without that, the agent wanders. With it, you get focused work, and you can read the plan to know what's about to happen.
Clean interrupt and hot reload
Hit Ctrl+C and the agent stops cleanly: in-flight tool calls are cancelled, conversation state is preserved, the next prompt picks up without re-establishing anything. Change a config file and it reloads in place. Both are the kind of thing you don't notice until you use a tool that doesn't have them, then can't go back.
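Five pieces, laid out side by side:
- Short, ergonomic tool names
- Sub-agents spawned on the fly
- A read-before-edit guard
- A plan written before any code is touched
- Clean interrupt and hot reload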
Each one becomes a primitive in the YAML below. None of them require code on your side.
The architecture in YAML
Here's the full app.yaml for a Claude Code-equivalent agent, running on Digitorn:
```yaml
app:
  app_id: my-coder
  name: "My Coder"
  version: "1.0.0"
  description: "A self-hosted coding agent. Multi-agent, plan-first, read-before-edit."
  category: "developer-tools"

runtime:
  mode: conversation
  entry_agent: coordinator
  workdir: "{{env.PWD}}"

agents:
  - id: coordinator
    role: coordinator
    plan_first: true
    brain:
      provider: anthropic
      model: claude-sonnet-4-6   # the writer brain
      backend: anthropic
      config:
        api_key: "{{env.ANTHROPIC_API_KEY}}"
        temperature: 0.2
        max_tokens: 8192
      context:
        max_tokens: 200000
        strategy: summarize
      fallback:                  # 402 => switch to Haiku
        provider: anthropic
        model: claude-haiku-4-5
        config:
          api_key: "{{env.ANTHROPIC_API_KEY}}"
    system_prompt: |
      You are a senior engineer. For non-trivial tasks:
      1. Read relevant files BEFORE editing them.
      2. Write a numbered plan, share it, then execute.
      3. Spawn search/triage specialists in PARALLEL when scanning
         a codebase. Don't grep + read 30 files yourself.
      4. After every Edit/Write, run any relevant tests with Bash.

  - id: explorer
    role: specialist
    specialty: "Find files, grep symbols, sample contents"
    modules:
      - {filesystem: [read, grep, glob]}
      - {shell: [bash]}
    brain:
      provider: anthropic
      model: claude-haiku-4-5    # cheap fast triage
      config:
        api_key: "{{env.ANTHROPIC_API_KEY}}"
        temperature: 0.0
    system_prompt: |
      You explore codebases. Return only the key findings: paths,
      symbols, code excerpts. No prose. Be FAST.

tools:
  modules:
    filesystem: {}               # read-before-edit guard is built-in
    shell:
      config:
        timeout: 300             # default per-command timeout (seconds)
    memory:
      config:
        working_memory: true
        todo_list: true
    web:
      config:
        search_backend: duckduckgo
  capabilities:
    default_policy: auto
```

That's the entire file. Here's what the runtime actually wires up at boot:
One coordinator on Sonnet, one explorer on Haiku, and a set of modules they share. Now let's walk through what each block buys you.
Modules are the toolbox
Each module is a set of tools the agents can call. The thing worth knowing is that modules are shared between the coordinator and the explorer. Same _read_files set, same workspace, same working memory. The runtime handles all of that. You just declare what's available.
The read-before-edit guard is built into the filesystem module. An Edit on a file the session hasn't read yet returns "Cannot edit /src/foo.py, read it first" as a tool error. The model reads the error, reads the file, retries. Same behaviour as Claude Code, same safety. No flag to set - it is on by default.
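To make the mechanism concrete, here is a minimal Python sketch of a read-before-edit guard. It is not Digitorn's implementation: the class name GuardedFiles and its methods are invented for illustration, and the only details taken from the text are the per-session set of read files and the wording of the error.

```python
from pathlib import Path


class GuardedFiles:
    """Toy file tools: an edit is refused until the same path has been read."""

    def __init__(self) -> None:
        # Per-session record of every file the agent has actually read.
        self._read_files: set = set()

    def read(self, path: str) -> str:
        p = Path(path).resolve()
        text = p.read_text()
        self._read_files.add(p)
        return text

    def edit(self, path: str, old: str, new: str) -> str:
        p = Path(path).resolve()
        if p not in self._read_files:
            # Returned as a tool result rather than raised, so the model
            # can see the message, read the file, and retry.
            return f"Cannot edit {path}, read it first"
        text = p.read_text()
        if old not in text:
            return f"Cannot edit {path}, text to replace not found"
        p.write_text(text.replace(old, new, 1))
        return "ok"
```

Returning the refusal as an ordinary tool result, rather than raising, is the design choice that matters here: the error becomes part of the conversation, which is what lets the model correct itself.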
Agent spawn is the dispatcher
This is what lets the coordinator hand work off to an explorer. The whole multi-agent surface in Digitorn is a single tool, Agent, whose mode is selected by its parameters. The listing shows seven call shapes:
```
Agent(prompt="find all callers of foo()", specialist="explorer")   # spawn + wait
Agent(prompt="...", specialist="explorer", wait=false)             # spawn background
Agent(agent_id="abc-123")                                          # check status
Agent(agent_ids=["a", "b", "c"])                                   # wait for many
Agent(agent_id="abc-123", cancel=true)                             # kill stuck agent
Agent(agent_id="abc-123", reassign="try with different terms")     # retry
Agent(list=true)                                                   # list all
```

A coordinator can spawn five explorers in parallel, wait for the whole batch, and then act on the combined result. The Python you'd write yourself to do that, retries and cancellation included, is closer to 500 lines than 50.
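For a sense of what the spawn-and-wait pattern looks like by hand, here is a small asyncio sketch of the happy path only. The explore and dispatch functions are hypothetical stand-ins, not part of Digitorn, and the worker just sleeps where a real one would call a model. Retries, status checks and reassignment are left out, and those are where the line count goes.

```python
import asyncio


async def explore(prompt: str) -> str:
    """Stand-in for one explorer run; a real worker would call an LLM."""
    await asyncio.sleep(0.01)  # simulate I/O-bound work
    return f"findings for: {prompt}"


async def dispatch(prompts: list, timeout: float = 5.0) -> list:
    """Spawn one explorer per prompt, wait for the whole batch, keep order."""
    tasks = [asyncio.create_task(explore(p)) for p in prompts]
    try:
        # gather() returns results in the same order as the prompts.
        return await asyncio.wait_for(asyncio.gather(*tasks), timeout)
    except asyncio.TimeoutError:
        for task in tasks:
            task.cancel()  # make sure no stuck worker outlives the batch
        raise


if __name__ == "__main__":
    print(asyncio.run(dispatch(["callers of foo()", "tests for foo()"])))
```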
Per-agent brain, where the cost story lives
The coordinator runs on Claude Sonnet, 8192-token outputs, summarize compaction, full 200K context. That's where the actual code gets written, so quality has to be good. The explorer runs on Haiku. Roughly 5x cheaper, 3x faster, perfectly capable of grepping a directory and returning a list of hits.
That single split is what turns a self-hosted coding agent from "expensive curiosity" into something you can run all day. A coordinator-only setup with Sonnet on every turn costs roughly the same as a Claude Code subscription. Offloading exploration to Haiku trims around 60% off that on realistic workloads, because exploration is most of the LLM time.
Numbers above are normalised against the Sonnet-only baseline, measured across 50 real refactor and debug sessions on the digitorn-code builtin. Your mix will vary, but the shape doesn't.
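A back-of-envelope check on those figures, as a sketch rather than the measurement itself. It assumes the only thing that changes is the price of exploration turns and takes the "roughly 5x cheaper" figure at face value; blended_cost and share_needed are names made up for this example.

```python
def blended_cost(explore_share: float, price_ratio: float) -> float:
    """Cost relative to a Sonnet-only baseline of 1.0.

    explore_share: fraction of baseline spend that goes to exploration.
    price_ratio: how many times cheaper the explorer model is.
    """
    return (1 - explore_share) + explore_share / price_ratio


def share_needed(saving: float, price_ratio: float) -> float:
    """Exploration share required to reach a given overall saving."""
    return saving / (1 - 1 / price_ratio)


if __name__ == "__main__":
    print(round(share_needed(0.60, 5), 2))
    print(round(blended_cost(0.75, 5), 2))
```

Under those assumptions, a 60% saving at a 5x price gap implies exploration was about three quarters of the Sonnet-only spend, which is consistent with exploration being most of the LLM time.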
Plan-first as a runtime rule, not a prompt instruction
plan_first: true, set on the coordinator in the YAML above, flips on a built-in behaviour rule. Before the first non-trivial tool call, the agent has to produce a plan. If it skips, the runtime injects a "did you plan first?" reminder in the next turn's context. None of that lives in your system prompt, which is why the focused, numbered execution that makes Claude Code look so deliberate keeps working even on long tasks.
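A rough Python sketch of how a plan-first rule like this could be enforced outside the prompt. The event format, the plan_reminder function and the reminder text are all assumptions made for illustration, and for simplicity it treats every tool call as non-trivial.

```python
from typing import Optional

PLAN_REMINDER = "Did you plan first? Write a numbered plan before calling more tools."


def plan_reminder(events: list) -> Optional[str]:
    """Return a reminder if a tool call happened before any plan, else None.

    Each event is a dict with a "kind" key: "plan", "tool_call" or "message".
    """
    for event in events:
        if event["kind"] == "plan":
            return None  # planned before touching tools: nothing to inject
        if event["kind"] == "tool_call":
            return PLAN_REMINDER  # tool use came first: nudge the next turn
    return None
```

The runtime would call something like this after each turn and, when it returns a string, append that string to the next turn's context.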
Brain fallback so a quota error doesn't ruin your Saturday
The fallback block is a small detail with outsized payoff. When the primary provider answers with a 402 ("insufficient credit") or a rate-limit error, the runtime swaps in the fallback provider mid-conversation. The agent doesn't notice. The user doesn't notice. You don't have to scramble to top up an Anthropic balance at midnight.
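The fallback idea fits in a few lines of Python. This is a sketch under stated assumptions, not the runtime's code: ProviderError and complete_with_fallback are hypothetical names, providers are modelled as plain callables, and 429 is assumed as the rate-limit status because it is the standard HTTP code for that case.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical error carrying the HTTP status a provider returned."""

    def __init__(self, status: int) -> None:
        super().__init__(f"provider returned HTTP {status}")
        self.status = status


# 402 = insufficient credit, 429 = rate limited (assumed status code).
FALLBACK_STATUSES = {402, 429}


def complete_with_fallback(
    prompt: str, providers: Sequence[Callable[[str], str]]
) -> str:
    """Try each provider in order, moving on only for quota-style errors."""
    last_error = None
    for call in providers:
        try:
            return call(prompt)
        except ProviderError as err:
            if err.status not in FALLBACK_STATUSES:
                raise  # a real failure: do not hide it behind a fallback
            last_error = err
    if last_error is None:
        raise ValueError("no providers configured")
    raise last_error
```

Only quota-style errors trigger the switch. Anything else propagates, so a genuine bug in a request is not silently retried against a different model.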
What it looks like in practice
Easier to make this concrete with an example. Say you ask the coordinator: "Find every place we call auth.verify_token() and add logging before each call." Here's how the ten turns play out across the agents.
A few things are worth pointing at in this trace. The coordinator never greps the codebase itself; it hands that off to a Haiku worker that comes back in roughly a second. The read-before-edit guard catches the first attempt to edit login.py, so the coordinator reads it before touching it. Tests run after every edit, not at the end, which means a regression caught at turn 9 doesn't snowball through the next three files. Sonnet tokens go only to the steps that actually need Sonnet.
What goes wrong, and how the runtime catches it
A handful of failure modes show up over and over when people roll their own coding agent. These are the ones worth knowing about up front.
The first is the agent editing a file it has never read. Without a guard, the LLM imagines the file contents from the filename and writes a "fix" that bulldozes the real code. It's the leading cause of "the AI deleted my work" stories. The Digitorn filesystem module keeps a per-session set of files the agent has read; an Edit call on anything outside that set returns an error in the tool result, which the model usually responds to by reading first and trying again. The guard is always on, no flag required.
Bash running outside the workspace is the next big one. The agent runs an rm -rf somewhere it shouldn't, and that's the end of your afternoon. shell.allowed_roots resolves the cwd of every shell command against an allow-list, defaulting to the user home, the workspace, and the temp dir. Anything outside is rejected. / is never on the list.
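An allow-list check of this kind is short to write in Python. The sketch below is illustrative only: default_roots and cwd_allowed are invented names, and the default list simply mirrors the three locations named above.

```python
import tempfile
from pathlib import Path


def default_roots(workspace: str) -> list:
    """User home, the workspace, and the temp dir, all fully resolved."""
    return [
        Path.home().resolve(),
        Path(workspace).resolve(),
        Path(tempfile.gettempdir()).resolve(),
    ]


def cwd_allowed(cwd: str, roots: list) -> bool:
    """True if cwd is one of the roots or sits somewhere beneath one."""
    # resolve() collapses ".." segments and follows symlinks, so a path
    # cannot escape a root by being spelled creatively.
    target = Path(cwd).resolve()
    return any(target == root or root in target.parents for root in roots)
```

Resolving before comparing is the important step. A plain string prefix check would accept a path like workspace/../../etc.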
Then there's raw tool output eating the context window. One Read("100KB-log.txt") and you've burned a large slice of your budget on a single tool result. filesystem.config.max_file_kb clamps reads and tells the model the result was truncated, which usually triggers a follow-up Read(path, offset=2000, limit=500). Paginated reads, like you'd do with grep.
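Here is one way a clamped, paginated read could work, sketched in Python. It assumes offset and limit count lines, which the text does not specify; read_window and the wording of the truncation note are made up for the example.

```python
from typing import Optional


def read_window(
    text: str, offset: int = 0, limit: Optional[int] = None, max_kb: int = 64
) -> str:
    """Return lines from `offset`, capped by `limit` lines and `max_kb` bytes."""
    lines = text.splitlines(keepends=True)
    end = len(lines) if limit is None else min(len(lines), offset + limit)
    budget = max_kb * 1024
    out, used, i = [], 0, offset
    while i < end:
        size = len(lines[i].encode("utf-8"))
        # Always emit at least one line so pagination makes progress.
        if out and used + size > budget:
            break
        out.append(lines[i])
        used += size
        i += 1
    body = "".join(out)
    if i < len(lines):
        # Tell the model where to resume instead of failing silently.
        body += f"[stopped at line {i} of {len(lines)}; continue with offset={i}]"
    return body
```

The note appended at the end is what makes the follow-up read likely: the model is told that content is missing and which offset to ask for next.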
The infinite re-edit loop is more annoying than dangerous: the agent edits, the file doesn't compile, it edits again, doesn't compile, repeats. The coding behaviour profile rate-limits same-file edits ("more than three of these in five turns and we stop and ask the user"). The lsp_diagnose hook runs the linter after every write so lint errors land in the next turn's tool result, which lets the agent self-correct, but only up to a point.
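The "three edits in five turns" rule is a sliding-window counter. A minimal sketch, assuming turns are numbered integers and that the pause decision is per file; EditLoopGuard is a hypothetical name, not the runtime's class.

```python
from collections import deque


class EditLoopGuard:
    """Flags a file once it is edited more than max_edits times in a window."""

    def __init__(self, max_edits: int = 3, window: int = 5) -> None:
        self.max_edits = max_edits
        self.window = window
        self._edits = {}  # path -> deque of turn numbers with an edit

    def should_pause(self, path: str, turn: int) -> bool:
        """Record an edit at `turn`; True means stop and ask the user."""
        turns = self._edits.setdefault(path, deque())
        turns.append(turn)
        # Forget edits that fell out of the last `window` turns.
        while turns and turns[0] <= turn - self.window:
            turns.popleft()
        return len(turns) > self.max_edits
```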
The last one is goal drift. After thirty-plus turns, context compaction starts shaving off the early messages and the original task fades. Calling memory.set_goal on the first user message keeps the goal pinned at the top of every subsequent turn, even after compaction. At turn 50 the model still sees the same one-line directive it saw at turn 1.
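Goal pinning is simple once compaction and the goal are kept separate. The sketch below uses a deliberately crude compaction (keep the last few messages) to show the idea; build_context and the "GOAL:" prefix are assumptions for illustration, not the memory module's actual format.

```python
def build_context(goal: str, history: list, keep_last: int = 4) -> list:
    """Compact history to its newest messages, with the goal always first."""
    recent = history[-keep_last:] if keep_last > 0 else []
    # The goal is stored outside the history, so compaction cannot drop it.
    return [f"GOAL: {goal}", *recent]
```

However aggressively the history is trimmed, the first item the model sees stays the same.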
Getting it running in five minutes
If you want to try this tonight, the path is roughly:
```shell
# Install the runtime (Mac, Linux, or Windows with Git Bash)
curl -sSL https://digitorn.ai/install | sh

# Drop your Anthropic key in
echo 'ANTHROPIC_API_KEY=sk-ant-...' >> ~/.digitorn/.env

# Save the YAML above as my-coder.yaml somewhere convenient
nano my-coder.yaml   # paste the YAML

# Deploy and open
digitorn app deploy my-coder.yaml
digitorn dev chat my-coder
```

You land in a terminal session with a coding agent that knows about the directory you started it in, can read and write files there, run tests, and spawn explorer workers in parallel. Same loop as Claude Code. Your keys. Your YAML.
If you'd rather not start from a blank file, the Digitorn Hub already ships a polished version called digitorn-code under developer tools. One install command and you're done.
A few questions worth answering
Is this exactly Claude Code? No. Anthropic's internal prompts aren't public, and probably never will be. What this clone reproduces is the architecture: the tool surface, the read-before-edit guard, the multi-agent dispatch, the plan-first behaviour, the cost routing. The prompts are yours to write and iterate on, which is more than you get with the closed product.
Can I run it on something other than Anthropic? Yes, against any OpenAI-compatible endpoint. Swap anthropic for openai, deepseek, azure_openai, mistral, groq, together, or whatever else. The same YAML works with DeepSeek V3 (the cheapest serious option right now), GPT-4o, or a Llama 70B served by vLLM on a local box. Mixing per agent is fine and usually a good idea: coordinator on Sonnet, explorer on DeepSeek, writer on GPT-4o.
How is this different from LangChain or CrewAI? Different philosophy. LangChain is Python-as-config: you build agents by writing Python. Digitorn is YAML-as-config: you declare the agent and the runtime executes it. LangChain is better when you need deeply custom Python-native pipelines. Digitorn is better when you want a coding agent and you don't want to maintain framework code on top of it. The matrix is at Digitorn vs LangChain.
What about Cursor, Aider, Continue? Different shapes. Cursor is a full IDE, so a different category. Aider is the closest in spirit (self-hosted, CLI), but its configuration leans Python and is harder to extend. Continue is an editor extension. The thing specific to Digitorn is the declarative YAML plus built-in multi-agent.
Does it work fully offline? Yes, as long as your model has an OpenAI-compatible endpoint. Run Ollama or vLLM locally, point the YAML at http://localhost:11434/v1, and you have a coding agent that never phones home. Quality is the trade-off: local 70B-class models still trail Sonnet for production coding work, but the gap shrinks every month.
Can I share an agent with my team? Push it to the Digitorn Hub. Your teammate runs digitorn install hub://your-publisher/my-coder and they have it. Agents travel as YAML plus small assets, so prompts get reviewed in PRs like any other code.
A few links if you want to keep going
- 📦 Install Digitorn and try the YAML above
- 📚 The full module reference, every primitive used here is documented
- 🔄 Self-hosted coding agents compared, Digitorn next to LangChain and CrewAI
- 📂 The developer-tools agents, where digitorn-code ships ready-to-go
Built a variant worth sharing? Push it to the Hub. That's how the ecosystem grows.