
We ran 109 tests to measure how PII protection methods affect LLM output quality. Here's what we learned and what we built.
**TL;DR:** We ran 109 tests across GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro to measure how different PII protection methods affect LLM output quality. Placeholder masking ([PERSON], [SSN]) dropped output quality to 54-68%. Deterministic tokenization, where each entity gets its own unique opaque token, preserved 91-96%. We also found that leaving PII labels like "SSN" next to tokenized values causes safety refusals in 15-20% of cases.

We built NoPII based on these findings: a reverse proxy that tokenizes PII before prompts reach the model and detokenizes responses on the way back. One base_url change in your existing SDK. Free tier, no credit card. Full paper here: Link

If you are building anything on top of LLM APIs that touches real user data, you have probably had the conversation. The one where the prototype works, the team is excited, and then someone from security or legal asks what exactly is being sent to OpenAI or Anthropic or whichever provider you are using. That question tends to bring the project to a halt.
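To make the tokenize-then-detokenize round trip concrete, here is a minimal sketch of deterministic tokenization. This is illustrative only, not NoPII's actual implementation or API: the class, token format, and in-memory vault are assumptions, and a real proxy would also handle PII detection, persistence, and multiple conversations.

```python
import hashlib

class PIITokenizer:
    """Toy deterministic tokenizer: same value always maps to the same
    opaque token, so the model can track an entity across a conversation
    without ever seeing the raw PII."""

    def __init__(self):
        self.vault = {}  # token -> original value

    def tokenize(self, value: str, entity_type: str) -> str:
        # Opaque token with no human-readable PII label like "SSN" next
        # to it; per the findings above, visible labels beside tokenized
        # values triggered safety refusals in 15-20% of cases.
        digest = hashlib.sha256(value.encode()).hexdigest()[:8]
        token = f"tkn_{entity_type.lower()}_{digest}"
        self.vault[token] = value
        return token

    def detokenize(self, text: str) -> str:
        # Restore the originals in the model's response on the way back.
        for token, value in self.vault.items():
            text = text.replace(token, value)
        return text

t = PIITokenizer()
prompt = f"Draft a welcome letter for {t.tokenize('Jane Doe', 'PERSON')}."
# The model only ever sees the token; suppose it echoes it back:
response = f"Dear {t.tokenize('Jane Doe', 'PERSON')}, welcome aboard."
print(t.detokenize(response))  # -> Dear Jane Doe, welcome aboard.
```

In the proxy setup described above, both steps happen transparently: the SDK call is unchanged except for pointing base_url at the proxy, which tokenizes the outbound prompt and detokenizes the inbound response.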


