
What a Multimodal WhatsApp Agent Looks Like on AWS
Originally published on Build With AWS. Subscribe for weekly AWS builds.

I watched Miguel Otero Pedrido and Jesus Copado’s brilliant Ava the WhatsApp Agent series and tried building something similar. They built a multimodal WhatsApp bot using LangGraph and Google Cloud Run. The agent could hold conversations, analyze images, generate art, and process voice messages.

After going through the series, I had one question: what would this look like built 100% on AWS? I started sketching out the architecture and quickly realized there were too many ways to build it. Pure Lambda orchestration? Bedrock Agents? Bedrock AgentCore? LangChain on Lambda? Step Functions? Each approach had tradeoffs I couldn’t ignore.

That’s when I decided to build a hybrid system. Not because hybrid is always better, but because building both patterns side by side would force me to understand when each approach makes sense. The result is a production-ready WhatsApp bot on a manageable budget that demonstrates two



