
# Test Your LLM Like You Test Your UI

> This tutorial was written for @llmassert/playwright v0.6.0.

You've built a chatbot. Your Playwright tests pass. But your users are reporting hallucinated answers: confident responses that sound right but are completely fabricated. The problem? Your tests check that the chatbot *responds*, not that it responds *correctly*. A `toContain` assertion can't tell the difference between a grounded answer and a hallucination. You need assertions that actually understand the output.

@llmassert/playwright adds five LLM-powered matchers to Playwright's `expect()`, checking for hallucinations, PII, tone, format, and semantic accuracy. Same test framework, same workflow, new superpowers.

In this tutorial, you'll go from zero to five working LLM assertions in about 10 minutes. No new framework to learn: if you know Playwright, you already know 90% of what you need.

## One thing to know first: what "inconclusive" means

LLMAssert uses an LLM (GPT-5.4-mini by default) as a judge to evaluate your outputs. But a judge is itself a model: sometimes it cannot decide either way, and those cases are reported as inconclusive rather than as plain passes or failures.


