
DoorDash Builds LLM Conversation Simulator to Test Customer Support Chatbots at Scale
via InfoQ, by Leela Kumili
DoorDash engineers built a simulation and evaluation flywheel to test large language model customer support chatbots at scale. The system generates multi-turn synthetic conversations using historical transcripts and backend mocks, evaluates outcomes with an LLM-as-judge framework, and enables rapid iteration on prompts, context, and system design before production deployment.
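
To make the simulate-then-judge loop concrete, here is a minimal sketch in Python. It is not DoorDash's implementation: every name in it (`Scenario`, `simulate`, `judge`, `pass_rate`, `echo_status_bot`) is hypothetical, the customer side is a fixed script rather than an LLM-driven simulator, and the "judge" is a plain keyword check standing in for an LLM-as-judge call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Turn = Dict[str, str]  # {"role": "customer" or "bot", "text": "..."}
Bot = Callable[[List[Turn], Dict[str, str]], str]


@dataclass
class Scenario:
    """Hypothetical test case, e.g. distilled from a historical transcript."""
    name: str
    customer_turns: List[str]      # scripted customer messages
    mock_backend: Dict[str, str]   # canned backend data instead of live services
    expected_resolution: str       # phrase a good conversation should contain


def simulate(scenario: Scenario, bot: Bot) -> List[Turn]:
    """Play the scripted customer turns against the bot and record the transcript."""
    transcript: List[Turn] = []
    for message in scenario.customer_turns:
        transcript.append({"role": "customer", "text": message})
        reply = bot(transcript, scenario.mock_backend)
        transcript.append({"role": "bot", "text": reply})
    return transcript


def judge(scenario: Scenario, transcript: List[Turn]) -> bool:
    """Stand-in for an LLM judge (assumption): pass if the bot mentions the expected phrase."""
    bot_text = " ".join(t["text"].lower() for t in transcript if t["role"] == "bot")
    return scenario.expected_resolution.lower() in bot_text


def pass_rate(scenarios: List[Scenario], bot: Bot) -> float:
    """Fraction of scenarios whose simulated conversation the judge accepts."""
    results = [judge(s, simulate(s, bot)) for s in scenarios]
    return sum(results) / len(results)


def echo_status_bot(transcript: List[Turn], backend: Dict[str, str]) -> str:
    """Toy bot under test: it only ever reports the mocked order status."""
    return f"Your order status is: {backend.get('order_status', 'unknown')}."


scenarios = [
    Scenario("late order", ["Where is my order?"], {"order_status": "delayed"}, "delayed"),
    Scenario("refund", ["I want a refund."], {"order_status": "delivered"}, "refund"),
]

if __name__ == "__main__":
    print(f"pass rate: {pass_rate(scenarios, echo_status_bot):.2f}")
```

In a setup like this, changing the bot's prompt or context and re-running `pass_rate` over the same scenarios gives a before-and-after comparison without touching production traffic, which is the iteration loop the summary describes.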


