I built a Chrome extension that X-rays AI responses — here's what I learned about LLM quality

Every day millions of people use ChatGPT and Gemini. Nobody knows if the answer is actually good. I built TRI·TFM Lens, a Chrome extension that evaluates AI responses across 5 dimensions in real time. Here's what I found.

The Problem

AI responses all sound confident. But:

- A philosophical essay cites Kant and Nietzsche → sounds factual, but you can't verify "the meaning of life" by experiment
- A persuasive text reads smoothly → but it's pushing you in one direction with Bias=+0.72
- A simple answer to "how are you?" → high emotion, zero facts, zero depth

Single quality scores hide all of this. You need a profile, not a number.

The 5 Axes

Every response gets scored on:

| Axis | What it measures | Range |
|---|---|---|
| E (Emotion) | Is the tone appropriate? | 0-1 |
| F (Fact) | Can claims be verified? | 0-1 |
| N (Narrative) | Is it well-structured? | 0-1 |
| M (Depth) | Explains WHY or just states WHAT? | 0-1 |
| B (Bias) | Pushes in one direction? | -1 to +1 |

Plus a Balance score that measures uniformity across axes. STABLE ✅, DRIFTING ⚠️, or DO
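To make the idea of a profile plus a Balance score concrete, here is a minimal Python sketch. The axis names and ranges come from the table above; everything else is an assumption for illustration. In particular, the post does not give the Balance formula, so `balance` below uses one plausible definition (one minus twice the population standard deviation of E, F, N and M, with Bias left out), and the `Profile` class and its field names are hypothetical, not the extension's actual code. The thresholds behind the status labels are not stated, so they are not sketched.

```python
from dataclasses import dataclass
from statistics import pstdev


@dataclass(frozen=True)
class Profile:
    """Five-axis profile of one AI response (hypothetical container)."""
    e: float  # Emotion, 0..1
    f: float  # Fact, 0..1
    n: float  # Narrative, 0..1
    m: float  # Depth, 0..1
    b: float  # Bias, -1..+1

    def __post_init__(self):
        # Enforce the ranges listed in the axis table.
        for name in ("e", "f", "n", "m"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")
        if not -1.0 <= self.b <= 1.0:
            raise ValueError(f"b must be in [-1, 1], got {self.b}")


def balance(p: Profile) -> float:
    """Assumed uniformity measure, not the extension's real formula.

    The population standard deviation of four values in [0, 1] is at
    most 0.5, so 1 - 2 * pstdev stays in [0, 1]: 1.0 means all four
    axes are equal, 0.0 means they are maximally spread out.
    """
    return 1.0 - 2.0 * pstdev([p.e, p.f, p.n, p.m])


# Illustrative numbers only: a "how are you?" style answer with high
# emotion, zero facts and zero depth scores low on this balance measure.
small_talk = Profile(e=0.9, f=0.0, n=0.5, m=0.0, b=0.0)
even = Profile(e=0.5, f=0.5, n=0.5, m=0.5, b=0.0)
print(balance(small_talk) < balance(even))
```

Keeping the five scores as separate fields, instead of collapsing them into one number up front, is the point of the profile: a single summary such as `balance` can sit next to the axes without replacing them.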

