
# How I Built a Voice AI That Takes Real DOM Actions on Websites
Every voice AI tool I evaluated did the same thing: listen to speech, convert it to text, send it to an LLM, return audio. Essentially a chatbot with a microphone. But I wanted something different. I wanted voice AI that could actually *do* things on a website: click buttons, fill forms, navigate pages. A voice agent, not a voice chatbot. So I built AnveVoice.

## The Problem with Voice Chatbots

Here's what most "voice AI" tools do:

1. User speaks
2. Speech-to-text converts it
3. Text goes to an LLM
4. The LLM generates a response
5. Text-to-speech reads it back

That's it. The AI talks back, but it doesn't do anything. It can't click your "Book Appointment" button. It can't fill in your contact form. It can't navigate to your pricing page.

For websites, this is a huge missed opportunity. 96.3% of websites fail basic accessibility standards (WebAIM 2025). Voice navigation isn't just a feature; it's an accessibility requirement.

## The Architecture: Voice → Intent → DOM Action

Here's how AnveVoice works differently:
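The voice → intent → DOM action idea can be sketched roughly like this. This is a minimal illustration of the concept, not AnveVoice's actual code: the `parseIntent` rule table and the `executeAction` helper are hypothetical names invented for this example, and a real system would use an LLM or NLU model for intent parsing rather than regexes.

```javascript
// Hypothetical sketch: map a speech transcript to a DOM action
// descriptor, then execute it when a live document is available.
// (Illustrative only; not the AnveVoice implementation.)

// Turn a transcript into an action descriptor. The rule table is a
// stand-in for real intent classification.
function parseIntent(transcript) {
  const text = transcript.toLowerCase().trim();
  const rules = [
    { pattern: /^(?:click|press|tap)\s+(.+)$/, type: "click" },
    { pattern: /^(?:go to|open|navigate to)\s+(.+)$/, type: "navigate" },
    { pattern: /^(?:type|enter)\s+(.+)\s+into\s+(.+)$/, type: "fill" },
  ];
  for (const { pattern, type } of rules) {
    const match = text.match(pattern);
    if (match) {
      return type === "fill"
        ? { type, value: match[1], target: match[2] }
        : { type, target: match[1] };
    }
  }
  return { type: "unknown", target: text };
}

// Execute a descriptor against the DOM (browser only; no-op elsewhere).
function executeAction(action) {
  if (typeof document === "undefined") return;
  if (action.type === "click") {
    // Find a button or link whose visible text contains the target phrase.
    const el = [...document.querySelectorAll("button, a")].find((node) =>
      node.textContent.toLowerCase().includes(action.target)
    );
    if (el) el.click();
  }
}

console.log(parseIntent("Click the Book Appointment button"));
// → { type: "click", target: "the book appointment button" }
```

The key design point is the middle representation: instead of the LLM producing prose, the pipeline produces a structured action object, which is what lets the agent click and fill instead of merely talking back.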
*Continue reading on Dev.to.*


