Build a voice assistant in YAML

A spoken-language agent for the desktop, with low-latency speech-to-text (STT) and text-to-speech (TTS) plumbing built in.

The pattern

The voice module wraps your transcription (STT) and speech synthesis (TTS) providers behind a single declarative interface. The user speaks and the agent answers in voice; the rest of the YAML is the same as for any other agent. A round trip under 800 ms feels natural and is achievable with the right provider mix.
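One way to reason about the provider mix is to treat the 800 ms target as a budget split across the stages of a turn. The sketch below is illustrative only: the stage names and millisecond values are placeholders, not measurements of any provider and not part of Digitorn, so substitute numbers you have measured yourself.

```python
from dataclasses import dataclass

BUDGET_MS = 800  # round-trip target mentioned in the text


@dataclass(frozen=True)
class StageLatency:
    name: str
    millis: int


def round_trip_ms(stages):
    """Sum the per-stage latencies of one spoken turn, in milliseconds."""
    return sum(stage.millis for stage in stages)


def slowest_stage(stages):
    """Return the stage that eats the largest share of the budget."""
    return max(stages, key=lambda stage: stage.millis)


# Hypothetical placeholder values; measure your own STT, model, and TTS.
stages = [
    StageLatency("stt_final_transcript", 250),
    StageLatency("model_first_token", 300),
    StageLatency("tts_first_audio", 200),
]

total = round_trip_ms(stages)
print(f"round trip: {total} ms (budget {BUDGET_MS} ms)")
print(f"within budget: {total <= BUDGET_MS}")
print(f"slowest stage: {slowest_stage(stages).name}")
```

Looking at the slowest stage first is usually the cheapest way to decide which provider to swap.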

What good looks like

Round trip under 800 ms on a good network
Streaming TTS, so the user hears the start of the answer while the agent is still thinking
Conversational answer length (under 30 seconds spoken)
Graceful interrupt (barge-in)
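The 30-second limit can be checked roughly before an answer is sent to TTS. This is a minimal sketch that assumes a speaking rate of 150 words per minute; that rate is an assumption for illustration, and the real pace depends on the voice and provider you choose. The function names are hypothetical.

```python
def estimated_seconds(text, words_per_minute=150):
    """Estimate how long `text` takes to speak at the given pace."""
    words = len(text.split())
    return words * 60 / words_per_minute


def fits_spoken_limit(text, limit_seconds=30, words_per_minute=150):
    """True if the estimated spoken duration is within the limit."""
    return estimated_seconds(text, words_per_minute) <= limit_seconds


reply = "Sure. Your next meeting is at three, and it is a call with the design team."
print(f"estimated duration: {estimated_seconds(reply):.1f} s")
print(f"fits the limit: {fits_spoken_limit(reply)}")
```

At the assumed pace, 30 seconds is about 75 words, which is a useful figure to put in the system prompt alongside the time limit.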

Common pitfalls

Pairing a slow STT provider with a fast model, or vice versa
No interrupt handling, so the user can't speak over the agent
Long answers that feel like a podcast, not a conversation
Forgetting to tune the system prompt for spoken cadence
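To make the interrupt pitfall concrete, here is a minimal sketch of barge-in using asyncio task cancellation. It is not how the Digitorn voice module is implemented: `speak` is a stand-in for streaming TTS playback, and the timed sleep in `run_turn` is a stand-in for a voice-activity event from the microphone.

```python
import asyncio


async def speak(chunks, spoken):
    """Stand-in for streaming TTS playback: 'plays' one chunk at a time."""
    for chunk in chunks:
        spoken.append(chunk)
        await asyncio.sleep(0.05)  # pretend each chunk takes 50 ms to play


async def run_turn(chunks, user_interrupts_after=None):
    """Play a reply and stop early if the user starts speaking."""
    spoken = []
    playback = asyncio.create_task(speak(chunks, spoken))
    if user_interrupts_after is not None:
        # Hypothetical trigger: a real app would react to detected speech.
        await asyncio.sleep(user_interrupts_after)
        playback.cancel()
    try:
        await playback
    except asyncio.CancelledError:
        pass  # barge-in: stop talking and hand the turn back to the user
    return spoken


reply = ["Your", "next", "meeting", "is", "at", "three"]
print(asyncio.run(run_turn(reply)))
print(asyncio.run(run_turn(reply, user_interrupts_after=0.07)))
```

The second call returns only the chunks played before the interruption, so the agent knows how much of its answer the user actually heard.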

The minimum YAML to ship this

app.yaml

modules:
  voice:
    config:
      stt: { provider: whisper, model: whisper-1 }
      tts: { provider: openai, voice: nova }

agents:
  - id: voice
    modules: [voice]
    brain: { model: claude-haiku-4-5 }
    system_prompt: "Speak conversationally. Keep answers under 30 seconds."

Ship in 5 minutes

Install Digitorn and deploy this agent

# 1. install runtime
curl -sSL https://digitorn.ai/install | sh

# 2. save the YAML above to ~/.digitorn/apps/my-voice-assistant/app.yaml
mkdir -p ~/.digitorn/apps/my-voice-assistant
# paste the YAML into app.yaml

# 3. deploy
digitorn deploy my-voice-assistant
