
Gemini 3.1: Real-World Voice Recognition with Flash Live: Making Your LINE Bot Understand You
Background

Google released Gemini 3.1 Flash Live at the end of March 2026, focusing on "making audio AI more natural and reliable." The model is designed specifically for real-time, two-way voice conversations, with low latency, interruptibility, and multi-language support.

I happened to have a LINE Bot project (linebot-helper-python) on hand. It already handles text, images, URLs, PDFs, and YouTube, but it completely ignores voice messages:

User: (sends a voice message)
Bot: (Silence)

This time, I'll add voice support and share a few pitfalls I encountered.

Design Decision: Flash Live or Standard Gemini API?

The first question: Gemini 3.1 Flash Live is designed for real-time streaming, but LINE's voice messages are pre-recorded m4a files, not real-time audio streams. Using Flash Live to process pre-recorded files is like using a live-streaming camera to take photos: technically feasible, but the wrong tool.

I decided to use the standard Gemini API, directly passing the audio b
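As a rough illustration of the design decision above (sending a pre-recorded file in an ordinary request instead of opening a live stream), here is a minimal sketch of what such a request body could look like. It is not the code from linebot-helper-python. The helper name `build_audio_request`, the prompt text, and the `audio/mp4` MIME type for m4a are assumptions made for this example; the `contents` / `parts` / `inline_data` layout follows the Gemini REST `generateContent` request format, and the model name and endpoint are deliberately left out.

```python
import base64
import json


def build_audio_request(audio_bytes: bytes, prompt: str,
                        mime_type: str = "audio/mp4") -> dict:
    """Build a generateContent-style JSON body carrying inline audio.

    Hypothetical helper: the MIME type default is an assumption for
    LINE's m4a recordings and may need adjusting.
    """
    if not audio_bytes:
        raise ValueError("audio_bytes is empty")

    # Inline binary data must be base64-encoded to travel inside JSON.
    encoded = base64.b64encode(audio_bytes).decode("ascii")

    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {"inline_data": {"mime_type": mime_type, "data": encoded}},
                ]
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a downloaded voice message.
    body = build_audio_request(b"\x00\x01placeholder", "Transcribe this voice message.")
    print(json.dumps(body, indent=2))
```

The whole recording travels in one request and one response comes back, which matches a file that is already complete when the bot receives it; nothing here needs a persistent streaming session.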

