Sequential research is slow. The agent fetches three sources one by one, then a fourth, then summarises. Most of that time is network wait, not work, and a serial loop wastes it.
- Long single-turn latency dominated by waiting on tools
- The agent does N independent lookups in a row
- User sees a loading spinner for 20 seconds when it could be 5
Use this pattern for independent reads or computations whose results combine at the end: multi-source research, parallel API checks, multi-region pings, batch document summarisation.
Skip it when step N depends on step N-1's output. Parallelism adds nothing if the data is causally chained.
The YAML
Drop this into an app.yaml. Adjust the credential refs and module names to fit your existing setup.
```yaml
modules:
  web: {}
  agent_spawn: {}

agents:
  - id: lead
    modules: [{agent_spawn: [Agent]}]
    brain: { model: claude-sonnet-4-6, credential: anthropic_main }
    system_prompt: |
      For research tasks, dispatch THREE explorer sub-agents in parallel,
      one per source domain (news, academic, vendor). Then call
      Agent(agent_ids=[id1,id2,id3]) to wait for all, then synthesize.

  - id: explorer
    role: specialist
    modules: [{web: [search, fetch]}]
    brain: { model: claude-haiku-4-5, credential: anthropic_main }
    system_prompt: "Find 3-5 sources for the topic. Return facts and citations."
```

How it works
Walking through the YAML one block at a time so the design is clear, not memorised.
Coordinator spawns specialists in one turn
Three calls, Agent(prompt='research news angle'), Agent(prompt='research academic angle'), and Agent(prompt='research vendor angle'), go out in the same model response. The runtime executes them concurrently via asyncio.gather.
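To make the concurrent dispatch concrete, here is a minimal sketch of three coroutines handed to asyncio.gather in one go. The run_explorer function and its delays are hypothetical stand-ins for sub-agent runs, not the runtime's actual internals.

```python
import asyncio


async def run_explorer(angle: str, delay: float) -> str:
    """Hypothetical stand-in for one explorer sub-agent; the sleep models network wait."""
    await asyncio.sleep(delay)
    return f"{angle}: done"


async def dispatch_all() -> list[str]:
    # gather schedules all three coroutines together and returns their
    # results in argument order, regardless of which one finishes first.
    return await asyncio.gather(
        run_explorer("news", 0.03),
        run_explorer("academic", 0.01),
        run_explorer("vendor", 0.02),
    )


if __name__ == "__main__":
    print(asyncio.run(dispatch_all()))
```

Even though the academic explorer finishes first here, its result still comes back in the second slot, which keeps the later synthesis step simple.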
Each call returns an agent_id immediately
Background mode is the default. The coordinator gets three IDs back without waiting, ready to plan the join.
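One way to picture background mode is a registry that schedules the work and hands back an ID without awaiting it. The sketch below assumes a hypothetical AgentRegistry class built on asyncio.create_task; the real runtime may track agents quite differently.

```python
import asyncio
import uuid


class AgentRegistry:
    """Hypothetical registry for background agents; not the runtime's real API."""

    def __init__(self) -> None:
        self._tasks: dict[str, asyncio.Task] = {}

    async def _run(self, prompt: str) -> str:
        await asyncio.sleep(0.01)  # stands in for the sub-agent's actual work
        return f"result for {prompt!r}"

    def spawn(self, prompt: str) -> str:
        # create_task schedules the coroutine and returns without awaiting it,
        # so the caller gets an ID back immediately.
        agent_id = uuid.uuid4().hex[:8]
        self._tasks[agent_id] = asyncio.create_task(self._run(prompt))
        return agent_id

    def is_done(self, agent_id: str) -> bool:
        return self._tasks[agent_id].done()

    async def result(self, agent_id: str) -> str:
        return await self._tasks[agent_id]


async def demo() -> tuple[list[bool], list[str]]:
    registry = AgentRegistry()
    ids = [registry.spawn(p) for p in ("news", "academic", "vendor")]
    done_at_spawn = [registry.is_done(i) for i in ids]  # nothing has run yet
    results = [await registry.result(i) for i in ids]
    return done_at_spawn, results
```

The caller holds three IDs while none of the tasks has finished, which is the state the coordinator is in when it plans the join.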
Join with agent_ids batch wait
Agent(agent_ids=[id1, id2, id3]) blocks until all three complete or hit their per-agent deadline. The coordinator gets a list of results in order.
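The batch wait can be approximated with asyncio.wait_for wrapped around each job and a single gather over the wrappers. This is a sketch under assumptions: the explorer coroutine, the deadline value, and the "timed out" placeholder are all illustrative, and the source does not say what the runtime returns for an agent that misses its deadline.

```python
import asyncio


async def explorer(name: str, delay: float) -> str:
    """Hypothetical sub-agent; the sleep stands in for its run time."""
    await asyncio.sleep(delay)
    return f"{name} ok"


async def with_deadline(coro, deadline: float) -> str:
    # Assumption: a missed deadline becomes a placeholder result instead of
    # an exception, so one slow agent does not fail the whole join.
    try:
        return await asyncio.wait_for(coro, timeout=deadline)
    except asyncio.TimeoutError:
        return "timed out"


async def join_all(deadline: float = 0.2) -> list[str]:
    jobs = [
        explorer("news", 0.01),
        explorer("academic", 5.0),  # deliberately slower than the deadline
        explorer("vendor", 0.02),
    ]
    # Results come back in the same order as jobs.
    return await asyncio.gather(*(with_deadline(j, deadline) for j in jobs))


if __name__ == "__main__":
    print(asyncio.run(join_all()))
```

Because each job carries its own deadline, the join returns after roughly the deadline at worst, not after the slowest agent's full run.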
Synthesize the combined output
The lead's next turn writes the final answer using all three results. Total latency is max(t1, t2, t3) plus a small synthesis step, not their sum.
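The max-versus-sum claim is easy to check with a small timing sketch. The fetch coroutine and the delays below are made up for illustration; only the shape of the comparison matters.

```python
import asyncio
import time


async def fetch(delay: float) -> None:
    """Hypothetical lookup; the sleep models time spent waiting on the network."""
    await asyncio.sleep(delay)


async def serial(delays: list[float]) -> float:
    start = time.perf_counter()
    for d in delays:
        await fetch(d)  # each wait starts only after the previous one ends
    return time.perf_counter() - start


async def parallel(delays: list[float]) -> float:
    start = time.perf_counter()
    await asyncio.gather(*(fetch(d) for d in delays))  # waits overlap
    return time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.05, 0.10, 0.15]
    print(f"serial:   {asyncio.run(serial(delays)):.2f}s")
    print(f"parallel: {asyncio.run(parallel(delays)):.2f}s")
```

With these delays the serial run should take roughly their sum (about 0.30 s) and the parallel run roughly the longest one (about 0.15 s), plus scheduling overhead.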
Other ways to solve it
The pattern above is not the only answer. Here is when something else is the right call.
Sequential exploration
Easier for the model to reason about, slower. Good when each step's output narrows the next query.
Single specialist, looped
Run one explorer multiple times in sequence with different prompts. Simpler to reason about, but latency is the sum of the runs.