
Killing Voice AI Lag: The Pre-Warming Trick
When I was building intervu.dev, an AI mock coding interviewer that conducts full voice interviews in the browser, one of the most annoying problems I ran into was a noticeable gap between when the AI finished speaking and when the microphone actually went live for the candidate to respond. The AI would finish its sentence, and then there'd be this dead pause of around 850ms before the mic activated.

In a real interview, that kind of delay feels broken. It kills the conversational flow and makes the whole thing feel like a chatbot, not an interviewer. Here's what was causing it and how pre-warming the WebSocket connection during TTS playback got it down to under 400ms.

The problem

The turn-taking loop in intervu.dev works like this:

1. The AI generates a response and sends it to TTS
2. TTS audio streams back and plays in the browser
3. Once TTS finishes, the mic WebSocket connection is opened and the candidate can speak
4. Audio is streamed to the backend over that WebSocket, transcribed in real-
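
The pre-warming idea mentioned in the introduction (opening the mic WebSocket while TTS is still playing, instead of after it ends) can be sketched roughly as follows. This is a minimal illustration and not the actual intervu.dev code: the `Prewarmer` class, its method names, and the `open` factory are all hypothetical, and the connection type is left generic so the sketch does not depend on any particular browser API.

```typescript
// Hypothetical helper (not from the original post): starts a slow
// connection early and hands it over when the caller actually needs it.
class Prewarmer<T> {
  private pending: Promise<T> | null = null;
  private readonly open: () => Promise<T>;

  // `open` is an assumed factory that resolves once the connection is usable.
  constructor(open: () => Promise<T>) {
    this.open = open;
  }

  // Call when TTS playback starts. Calling it again is a no-op
  // while a connection attempt is already in flight.
  prewarm(): void {
    if (this.pending !== null) return;
    const attempt = this.open();
    this.pending = attempt;
    // If this attempt fails, forget it so the next call retries.
    attempt.catch(() => {
      if (this.pending === attempt) this.pending = null;
    });
  }

  // Call when TTS playback ends. Reuses the in-flight attempt if there
  // is one, otherwise opens a connection on demand.
  acquire(): Promise<T> {
    this.prewarm();
    const ready = this.pending as Promise<T>;
    this.pending = null; // the next turn gets a fresh connection
    return ready;
  }
}
```

In a browser, the `open` factory might wrap `new WebSocket(url)` in a promise that resolves on the socket's `open` event, with `prewarm()` called when TTS audio starts playing and `acquire()` awaited when it ends. The point of the design is that the handshake time overlaps with playback, so by the time the AI stops speaking the connection is often already established.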
Continue reading on Dev.to Webdev




