
How DataGOL Gave Its AI Agents a Voice — Building with Pipecat and a Custom LangGraph Frame Processor
At DataGOL, we build intelligent agents that help our customers drive business value by gathering intelligence, automating workflows, and making real-time decisions through natural language. Our agents were already performing well in the text modality, powered by LangGraph for orchestration, tool-calling, and seamless communication with our memory and context layer. As adoption grew, so did the calls to bring voice capabilities to our agents. This article covers how we engineered our enterprise-grade voice agents - preserving our existing agents' reasoning capabilities, conversation memory, and contextual awareness - without touching their core logic.

The Challenge: Voice Is Not Just "Text With a Microphone"

When we started exploring voice capabilities, our first instinct was straightforward: take the user's speech, transcribe it, send it to our existing agent, and read the response back. A simple STT → LLM → TTS pipeline. It broke almost immediately. The problems were fun


