
Building a Privacy-First Mobile Speech Assistant Using Google Gemini
This is a submission for the Built with Google Gemini: Writing Challenge.

What I Built with Google Gemini

I built a privacy-first mobile speech assistant designed to support people who stutter. The app focuses on fluency analysis, speech planning, roleplay practice, and pacing guidance, while keeping the architecture simple and user data under the user's control.

But this project did not begin with Gemini. It began with a question: can I build a fully offline, LLM-powered mobile speech assistant?

Phase 1: The Offline Mobile LLM Experiment

Before Gemini entered the picture, I spent significant time exploring on-device inference for mobile. I experimented with:

- Quantized GGUF models
- llama.cpp bridges in React Native
- Native C++ integrations
- On-device transcription pipelines
- Fully offline speech + reasoning workflows

Technically, I got models running. But in practice, I encountered serious constraints:

- Model weights increased app size dramatically
- Memory pressure on mid-tier Android devices
- Latency spikes
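The app-size constraint is easy to see with back-of-envelope arithmetic. This sketch is illustrative only; the parameter counts and bit widths are assumptions for demonstration, not the specific models from my experiments:

```python
def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate on-disk size of quantized model weights in gigabytes."""
    # size in bytes = parameters * (bits per weight) / (8 bits per byte)
    return n_params * bits_per_weight / 8 / 1e9

# Illustrative example: a 7B-parameter model at 4-bit quantization
# still weighs roughly 3.5 GB before any app code or assets --
# far beyond what a typical mobile app can reasonably ship.
print(round(quantized_size_gb(7e9, 4), 2))  # -> 3.5
```

Even aggressive quantization leaves weights in the gigabyte range for capable models, which is why on-device inference hit a wall for a general-purpose mobile app.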



