
How AI Text Detection Works Under the Hood: Perplexity, Burstiness, and Classifiers
AI text detectors are not magic. They are statistical models that measure how predictable your text is. If you have ever wondered what GPTZero, Originality.ai, or Turnitin are actually computing when they flag text as "AI-generated," this post breaks down the math and the models.

The Core Intuition

Language models generate text by repeatedly predicting the next token. At each step, the model assigns a probability distribution over its entire vocabulary, then samples from it. The result is text in which every word is, by definition, a high-probability choice given the preceding context.

Human writers do not work this way. We make unexpected word choices, write sentence fragments, insert tangents, and vary our rhythm. Our text is statistically messier. AI detectors exploit this difference using two primary signals: perplexity and burstiness.

Perplexity: Measuring Surprise

Perplexity quantifies how "surprised" a language model is by a sequence of tokens. Formally, for a sequence of N tokens x_1, ..., x_N:

PPL(x) = exp( -(1/N) * sum_{i=1}^{N} log p(x_i | x_1, ..., x_{i-1}) )

Low perplexity means the scoring model found each token predictable; high perplexity means the text kept surprising it.
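The formula above is straightforward to compute once you have per-token log-probabilities from a scoring model. Here is a minimal sketch; the probability values are made up for illustration, standing in for what a real language model would assign to each token:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the average negative log-probability per token."""
    n = len(token_logprobs)
    avg_neg_log_likelihood = -sum(token_logprobs) / n
    return math.exp(avg_neg_log_likelihood)

# Hypothetical per-token probabilities, as a scoring model might assign them.
# "AI-like" text: every token is a high-probability continuation.
predictable = [math.log(p) for p in [0.90, 0.85, 0.88, 0.92]]
# "Human-like" text: occasional low-probability, surprising choices.
surprising = [math.log(p) for p in [0.40, 0.05, 0.30, 0.10]]

print(f"predictable text PPL: {perplexity(predictable):.2f}")  # low
print(f"surprising text PPL:  {perplexity(surprising):.2f}")   # high
```

A detector runs your text through a reference model, collects these log-probabilities, and flags the text when the resulting perplexity falls below a threshold.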



