Models & inference

Open-weight model

A model whose weights are publicly downloadable, runnable on your own hardware via Ollama, vLLM, or similar.

also known as: open source model, self-hostable LLM

In depth

Open-weight models (Llama, Qwen, DeepSeek, Mistral) ship the actual weights so you can run inference on your own GPU. Quality lags frontier models for the hardest tasks but closes monthly. The right choice when data residency matters, when you want zero per-token cost, or when you need fully offline operation. Pair with Ollama for laptop-scale work, vLLM for production throughput.

Related concepts

LLMA neural network trained on text that takes a prompt and returns text, optionally including structured tool calls.Frontier modelThe highest-quality, most expensive tier from a provider. Claude Sonnet, GPT-4o, Gemini Pro.Self-hostingRunning your agent stack on infrastructure you control, with your own model provider keys.

Newsletter

Get the next post in your inbox.

Engineering notes from the Digitorn team. No marketing, no launch announcements, no "10 prompts that will change your life". Just the things we write that we'd want to read.

More in Models & inference

Context window/glossary/context-window Frontier model/glossary/frontier-model Inference/glossary/inference LLM/glossary/llm Streaming/glossary/streaming Temperature/glossary/temperature