Supported hardware
Vaner runs everywhere Ollama runs. A practical guide to which GPU, CPU, and platform combinations work, and which model size to expect on each tier.
Vaner uses Ollama as its default local runtime, so Vaner runs everywhere Ollama runs. If your machine can pull and run a model with `ollama run`, Vaner can use it.
The defaults the Vaner Desktop app picks are tuned to fit comfortably on the hardware it detects. You can override the model later via `vaner config set backend.model <id>`.
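For example (the model tag below is just an illustration; any tag from the Ollama library works):

```sh
# Pull a model with Ollama, then point Vaner at it.
# qwen3:8b is an example tag; substitute any model from the Ollama library.
ollama pull qwen3:8b
vaner config set backend.model qwen3:8b
```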
NVIDIA GPUs
Compute Capability ≥ 5.0 is supported. Most consumer cards from the GTX 10-series (Pascal) onward and every RTX card qualify. See the official Ollama NVIDIA list for the full per-architecture matrix.
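To check what your card reports, recent NVIDIA drivers expose the compute capability directly through nvidia-smi (older drivers may not recognize the `compute_cap` field):

```sh
# Prints name and compute capability per GPU, e.g. "NVIDIA GeForce RTX 3080, 8.6".
# Anything reporting 5.0 or higher clears Ollama's bar.
nvidia-smi --query-gpu=name,compute_cap --format=csv
```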
| VRAM | Practical model size | Examples |
|---|---|---|
| 4–8 GB | up to ~4B params, Q4 | Qwen 3 4B, Gemma 4 (small) |
| 8–16 GB | up to ~8B params, Q4 | Qwen 3.5 8B |
| 16–24 GB | up to ~14B params, Q4 | Qwen 3-Coder Next |
| 24 GB+ | up to ~30B params, Q4 | Qwen 3.5 / 3.6 (mid), Llama 4 Scout |
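These tiers follow from a simple rule of thumb: a Q4 quant stores roughly half a byte per parameter (4 bits plus quantization overhead), so an 8B model is about 4.5 GB of weights and a 30B model about 17 GB, and the remaining VRAM has to cover the KV cache and runtime overhead. That's why each tier caps out below what the raw parameter count might suggest.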
Multi-GPU is supported via Ollama's default placement; Vaner doesn't override it. Vaner picks the highest-VRAM device when reporting the hardware tier.
AMD GPUs (ROCm)
Vaner inherits Ollama's ROCm support on Linux. A subset of cards from the RX 6000-series onward work natively; older cards or unsupported SKUs fall back to CPU. See the official Ollama AMD list for the canonical per-card status.
| VRAM | Practical model size |
|---|---|
| 8–16 GB | up to ~8B params, Q4 |
| 16–24 GB | up to ~14B params, Q4 |
| 24 GB+ | up to ~30B params, Q4 |
Windows ROCm support depends on the driver/HSA combo your card uses; treat it as best-effort.
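On Linux, if your card misses the list by one SKU, Ollama documents an HSA override that maps a near-miss GPU onto a supported gfx target. A sketch, assuming an RX 6700 XT (gfx1031) borrowing the gfx1030 kernels; verify the right version pair against Ollama's AMD docs before relying on it:

```sh
# Run the Ollama server with an overridden HSA gfx version.
# gfx1031 -> 10.3.0 is the commonly cited pairing for RDNA2 near-misses;
# an unsupported pairing will crash or silently fall back to CPU.
HSA_OVERRIDE_GFX_VERSION=10.3.0 ollama serve
```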
Apple Silicon
M1 and later are fully supported via Metal. Unified memory is the constraint: the model has to fit in shared system RAM, with headroom left for macOS itself.
| Unified memory | Practical model size |
|---|---|
| 8 GB | up to ~3B params, Q4 (tight; expect swap) |
| 16 GB | up to ~7B params, Q4 |
| 32 GB | up to ~14B params, Q4 |
| 64 GB+ | up to ~30B params, Q4 |
The Vaner Desktop app for macOS picks defaults based on your machine's total memory and reserves a sensible chunk for the OS.
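If you're unsure which tier applies, total unified memory is one sysctl away:

```sh
# Total unified memory in bytes; 17179869184 = 16 GiB.
sysctl -n hw.memsize
```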
CPU-only
Vaner runs on CPU when no supported accelerator is detected. Reasonable for small models (≤ 4B params, Q4); larger models thrash and you'll notice it.
| RAM | Practical model size | Notes |
|---|---|---|
| 8–16 GB | up to ~3B params | Slow but usable for vaner.suggest / short prompts |
| 16–32 GB | up to ~7B params | Acceptable latency for non-interactive precompute |
| 32 GB+ | up to ~14B params | Still slower than the smallest GPU; consider an external GPU box |
If you'll run on CPU, set the compute preset to `background` in `.vaner/config.toml` so Vaner only uses idle cycles:

```toml
[compute]
preset = "background"
```

Platforms
| OS | Status |
|---|---|
| Linux (Ubuntu, Fedora, Debian, Arch) | Supported. Wayland and X11 both work — there's no Vaner-side difference. |
| macOS 13+ | Supported (Apple Silicon native, Intel via Rosetta — Apple Silicon recommended). |
| Windows 10/11 | Supported via the Vaner Desktop installer. |
What if my GPU isn't on Ollama's list?
Two paths:
- Run Vaner on a different machine. The daemon is remote-friendly: `vaner up` on a beefy box, point your AI client at its loopback URL via an SSH tunnel, done (sketched below). See Backends for the configuration.
- Use a cloud backend. OpenAI / Anthropic / OpenRouter all work the same way through MCP. Vaner Desktop's setup wizard exposes them as backend presets.
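A minimal sketch of the remote path, assuming the daemon serves on loopback; the port 8743 is a placeholder, substitute whatever your daemon actually reports:

```sh
# On the beefy box: start the Vaner daemon (serves on loopback only).
vaner up

# On your laptop: forward the daemon's port over SSH.
# 8743 is a placeholder port; substitute the one vaner up reports.
ssh -N -L 8743:localhost:8743 you@beefy-box

# Then point your AI client's MCP/backend config at http://localhost:8743.
```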
Either way, the Vaner CLI and MCP surface are unchanged. The only thing different is which GPU does the inference.