
# Building a Private AI Assistant in 2026 — No Cloud Required (Mostly)
I've been running a self-hosted AI assistant 24/7 for the past month. Here's what works, what doesn't, and what surprised me.

## The Setup

- Hardware: NVIDIA Jetson Orin Nano Super (67 TOPS, 8GB unified memory, 512GB NVMe)
- Software: OpenClaw (open source) + Ubuntu
- Power draw: 20W average — my desk lamp uses more
- Cost: €549 one-time (I use ClawBox, the pre-built version)

## What Runs Locally (No Internet)

### Voice Processing

- Whisper — speech-to-text, 90+ languages, runs entirely on-device
- Kokoro — text-to-speech, natural sounding, also fully local

My voice data never leaves the box. Period.

### Local LLMs

- Llama 3.1 8B: ~15 tok/s — good for quick tasks, conversations
- CodeLlama 7B: decent for code snippets
- LLaVA 7B: vision model, can describe images
- Hermes 3 8B: good for structured/agentic tasks

## The Reality Check

8GB unified memory = 7–8B parameter models max. For a daily assistant, this covers maybe 60% of what I need. The other 40%? Cloud APIs.

## The Hybrid Approach (This Is the Real Insight)

The key i
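The "8GB memory = 7–8B models" ceiling comes down to simple arithmetic: weight memory is parameter count times bytes per weight. Here's a back-of-the-envelope sketch (weights only; real runtimes also need room for the KV cache and the OS on unified memory):

```python
# Rough weight-memory estimate for an n-billion-parameter model.
# This counts weights only -- KV cache and runtime overhead come on top.
def model_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    bytes_per_weight = bits_per_weight / 8
    return params_billion * 1e9 * bytes_per_weight / 1024**3

# An 8B model at 4-bit quantization: ~3.7 GB of weights,
# which leaves headroom inside 8 GB for the KV cache and the system.
print(round(model_memory_gb(8, 4), 1))   # -> 3.7

# The same model at fp16 (16 bits/weight): ~14.9 GB -- far over budget,
# which is why quantized 7-8B models are the practical ceiling here.
print(round(model_memory_gb(8, 16), 1))  # -> 14.9
```

A 13B model even at 4-bit (~6 GB of weights) cuts things too close once the KV cache is added, which matches the author's 7–8B ceiling.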
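The hybrid split described above (roughly 60% local, 40% cloud) can be sketched as a simple router. This is a hypothetical illustration of the idea, not OpenClaw's actual API — the `Task` type, task kinds, and thresholds are all made up for the example:

```python
# Hypothetical local-first router: keep simple, short tasks on the
# on-device 7-8B models, send everything else to a cloud API.
from dataclasses import dataclass

# Task kinds the small local models handle well (illustrative list).
LOCAL_OK = {"chat", "summarize", "transcribe", "describe_image", "code_snippet"}

@dataclass
class Task:
    kind: str
    prompt: str

def route(task: Task) -> str:
    # Short prompts of a supported kind stay local; long-context or
    # unfamiliar tasks fall back to a cloud model.
    if task.kind in LOCAL_OK and len(task.prompt) < 4000:
        return "local"
    return "cloud"

print(route(Task("chat", "what's on my calendar today?")))  # -> local
print(route(Task("deep_research", "compare five SBCs...")))  # -> cloud
```

The point of the design is that the privacy-sensitive defaults (voice, everyday chat) never leave the box, while the fallback path is explicit and auditable.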



