I built a Chrome extension that X-rays AI responses — here's what I learned about LLM quality

via Dev.to Webdev · Арсений Перель

Every day millions of people use ChatGPT and Gemini. Nobody knows if the answer is actually good. I built TRI·TFM Lens, a Chrome extension that evaluates AI responses across 5 dimensions in real time. Here's what I found.

The Problem

AI responses all sound confident. But:

- A philosophical essay cites Kant and Nietzsche → sounds factual, but you can't verify "the meaning of life" by experiment
- A persuasive text reads smoothly → but it's pushing you in one direction with Bias=+0.72
- A simple answer to "how are you?" → high emotion, zero facts, zero depth

Single quality scores hide all of this. You need a profile, not a number.

The 5 Axes

Every response gets scored on:

| Axis | What it measures | Range |
|---|---|---|
| E (Emotion) | Is the tone appropriate? | 0 to 1 |
| F (Fact) | Can claims be verified? | 0 to 1 |
| N (Narrative) | Is it well-structured? | 0 to 1 |
| M (Depth) | Explains WHY or just states WHAT? | 0 to 1 |
| B (Bias) | Pushes in one direction? | -1 to +1 |

Plus a Balance score that measures uniformity across axes. STABLE ✅, DRIFTING ⚠️, or DO
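To make the idea of a profile concrete, here is a minimal TypeScript sketch of the five axes with their stated ranges, plus one possible uniformity measure. The excerpt does not say how TRI·TFM Lens computes Balance, so everything here is an assumption for illustration: the `Profile` type, the function names, and the formula (1 minus twice the population standard deviation of E, F, N and M, with B left out because its range differs).

```typescript
// Hypothetical shape for one scored response; field names mirror the axis letters.
interface Profile {
  E: number; // Emotion, 0 to 1
  F: number; // Fact, 0 to 1
  N: number; // Narrative, 0 to 1
  M: number; // Depth, 0 to 1
  B: number; // Bias, -1 to +1
}

// The four axes that share the 0 to 1 range.
const UNIT_AXES = ["E", "F", "N", "M"] as const;

// Throws if any axis falls outside the range listed for it (NaN is rejected too).
function validateProfile(p: Profile): void {
  for (const axis of UNIT_AXES) {
    if (!(p[axis] >= 0 && p[axis] <= 1)) {
      throw new RangeError(`${axis} must be in [0, 1], got ${p[axis]}`);
    }
  }
  if (!(p.B >= -1 && p.B <= 1)) {
    throw new RangeError(`B must be in [-1, 1], got ${p.B}`);
  }
}

// ASSUMPTION: this is not the extension's formula, only one way to measure uniformity.
// The population standard deviation of values in [0, 1] is at most 0.5,
// so 1 - 2 * stddev stays within [0, 1]; 1 means all four axes are equal.
function balance(p: Profile): number {
  validateProfile(p);
  const values = UNIT_AXES.map((axis) => p[axis]);
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  return 1 - 2 * Math.sqrt(variance);
}

// Illustrative inputs (made-up scores, not measurements from the extension).
const flat: Profile = { E: 0.5, F: 0.5, N: 0.5, M: 0.5, B: 0.72 };
const smallTalk: Profile = { E: 0.9, F: 0.0, N: 0.5, M: 0.0, B: 0.0 };

console.log(balance(flat), balance(smallTalk));
```

Under this assumed formula a uniform profile scores 1 regardless of its bias, while the "how are you?" style profile (high emotion, zero facts, zero depth) scores much lower. That is the kind of difference a single averaged quality number would hide.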

Continue reading on Dev.to Webdev
