
How I Built a Multi-Agent AI Orchestrator with Voice Control (Architecture Deep Dive)
I've been working with AI coding agents — Claude Code, Codex CLI, Cursor — and hit a wall that I think a lot of developers are running into: managing multiple agents at once is a mess. Three terminal windows. Three separate contexts. No shared memory. No way to talk to all of them without tab-switching and copy-pasting.

I wanted to treat them like a team, so I built Jam — an open-source desktop app that orchestrates multiple AI agents from one interface, with voice control. This post is a technical walkthrough of the architecture decisions, the hard problems, and what I learned building it.

## The Architecture

Jam is a TypeScript monorepo built on Electron + React. Here's the high-level structure:

```
packages/
  core/           # Domain models, port interfaces, events
  eventbus/       # In-process pub/sub EventBus
  agent-runtime/  # PTY management, agent lifecycle, runtimes
  voice/          # STT/TTS providers, command parser
  memory/         # File-based agent memory & persistence
apps/
  desktop/        # Electron + React + Zustand desktop app
```
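The post doesn't show the EventBus code itself, so here is a minimal sketch of what an in-process pub/sub bus along the lines of `packages/eventbus/` might look like. The class name, topic strings, and method signatures here are my assumptions, not Jam's actual API.

```typescript
// Hypothetical sketch of an in-process pub/sub EventBus.
// Names and signatures are assumptions, not Jam's actual API.
type Handler<T> = (payload: T) => void;

class EventBus {
  private handlers = new Map<string, Set<Handler<unknown>>>();

  // Register a handler for a topic; returns an unsubscribe function
  // so callers (e.g. React components) can clean up on unmount.
  subscribe<T>(topic: string, handler: Handler<T>): () => void {
    if (!this.handlers.has(topic)) this.handlers.set(topic, new Set());
    this.handlers.get(topic)!.add(handler as Handler<unknown>);
    return () => this.handlers.get(topic)?.delete(handler as Handler<unknown>);
  }

  // Deliver a payload synchronously to every handler on the topic.
  publish<T>(topic: string, payload: T): void {
    this.handlers.get(topic)?.forEach((h) => h(payload));
  }
}

// Usage: an agent runtime publishes output; the UI layer subscribes.
const bus = new EventBus();
const seen: string[] = [];
const unsubscribe = bus.subscribe<string>("agent:output", (line) => seen.push(line));
bus.publish("agent:output", "hello from claude-code");
unsubscribe();
bus.publish("agent:output", "ignored after unsubscribe");
console.log(seen); // ["hello from claude-code"]
```

The appeal of this pattern for a multi-agent desktop app is decoupling: PTY runtimes, the voice pipeline, and the UI never call each other directly, they only exchange events through topics.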

