Digitorn
Digitorn
← All integrations
vLLM (self-hosted server)

Build AI agents on vLLM in YAML

Production-grade open-weight serving. The right choice once Ollama runs out of throughput.

Why vLLM

vLLM is the high-throughput, GPU-accelerated open-weight server most teams reach for once a single laptop stops being enough. Continuous batching, prefix caching, multi-GPU sharding, OpenAI-compatible HTTP. Run it on your own boxes or through providers like Together. Pair with Digitorn the same way you pair with Ollama, the YAML does not care.

Models worth knowing about

premium
meta-llama/Llama-3.3-70B-Instruct
General-purpose flagship
specialty
Qwen/Qwen2.5-Coder-32B-Instruct
Coding workloads
premium
deepseek-ai/DeepSeek-V3
If you have the GPUs for it
Strengths
  • Production-grade throughput, continuous batching
  • Multi-GPU and multi-node sharding
  • OpenAI-compatible API, drop-in for any agent runtime
  • Prefix caching cuts repeat-context costs
Worth knowing
  • No catalog entry needed unless you front the server with auth (in which case use a free-form credential)
  • Operational complexity beyond Ollama, you maintain the server
  • GPU costs are real even at moderate volume
  • Cold-start times vary by model size

Drop into your app.yaml

agent brain block
1brain:2  provider: openai_compat3  model: meta-llama/Llama-3.3-70B-Instruct4  config:5    base_url: "http://your-vllm-host:8000/v1"6    api_key: "{{env.VLLM_API_KEY}}"7  temperature: 0.2
⚡ Inline config (catalog support pending)

vLLMdoesn't have a first-class catalog entry yet. Configure it inline using env templates, the same way the blog examples show. Native catalog support is on the roadmap.

VLLM_API_KEYOptional auth token if you front the server with an auth layer
# add to ~/.digitorn/.env
VLLM_API_KEY=...
Run one in 5 minutes

Install Digitorn and chat with your vLLM agent

# 1. install the runtime
curl -sSL https://digitorn.ai/install | sh

# 2. drop your vLLM key in the env file
echo 'VLLM_API_KEY=...' >> ~/.digitorn/.env

# 3. install a starter agent and chat
digitorn install hub://digitorn/digitorn-code
digitorn chat digitorn-code
Newsletter

Get the next post in your inbox.

Engineering notes from the Digitorn team. No marketing, no launch announcements, no "10 prompts that will change your life". Just the things we write that we'd want to read.

One-click unsubscribe. We never share your address. Powered by our own infrastructure, not a tracker.

Other providers Digitorn supports

frontierAnthropic Claude/integrations/anthropicfrontierOpenAI/integrations/openaifastDeepSeek/integrations/deepseekfrontierMistral/integrations/mistralopenOllama/integrations/ollamafastGroq/integrations/groqenterpriseAzure OpenAI/integrations/azure-openaifrontierGoogle Gemini/integrations/google-geminiopenTogether AI/integrations/together-ai