
How We Detect AI Bots on Our Website: A Technical Deep-Dive
AI bots are crawling the web at unprecedented scale. GPTBot, ClaudeBot, Googlebot, and dozens of others visit millions of sites daily. Most site owners have no idea which bots visit, how often, or what they do. We built a detection system to find out. Here's how it works.

Layer 1: User-Agent Detection

The simplest approach: match user-agent strings against known bot signatures. We maintain a database of 30+ AI bot user-agents, including GPTBot, ClaudeBot, CCBot, Bytespider, PetalBot, and others. This catches roughly 80% of known bots. The signatures are checked in Next.js middleware on every request, adding less than 1 ms of latency. Simple but effective.

Layer 2: Behavioral Fingerprinting

Some bots disguise their user-agent. We detect these through behavior:

Request timing — bots are more regular than humans
Header patterns — bots often omit Accept-Language
TLS fingerprints — JA3/JA4 fingerprinting reveals bot clients
Navigation patterns — bots don't scroll, hover, or generate mouse events
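The Layer 1 signature check described above can be sketched as a simple substring match. This is a minimal illustration, not the authors' actual database or middleware: the signature list is abbreviated, and the function name is hypothetical.

```typescript
// Hypothetical subset of a known-bot signature database; the real
// system maintains 30+ entries.
const AI_BOT_SIGNATURES: string[] = [
  "GPTBot",
  "ClaudeBot",
  "CCBot",
  "Bytespider",
  "PetalBot",
];

// Returns the matched signature, or null if the user-agent looks human.
// A case-insensitive substring scan over a small list keeps the check
// well under the 1 ms budget mentioned above.
function detectAIBot(userAgent: string): string | null {
  const ua = userAgent.toLowerCase();
  for (const sig of AI_BOT_SIGNATURES) {
    if (ua.includes(sig.toLowerCase())) {
      return sig;
    }
  }
  return null;
}
```

In Next.js middleware this would be called on every request with the value of `request.headers.get("user-agent")`, tagging or blocking the request before it reaches a page.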
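The behavioral signals in Layer 2 can be combined into a score. The sketch below is an assumed design, not the authors' implementation: the signal fields, weights, and the 10%-of-mean timing threshold are all illustrative, and real systems tune them against labeled traffic. (TLS fingerprinting is omitted here, since JA3/JA4 hashes are computed at the connection layer, outside application code.)

```typescript
// Hypothetical per-request signal bundle collected by the detector.
interface RequestSignals {
  acceptLanguage: string | null; // bots often omit this header
  intervalsMs: number[];         // gaps between recent requests from the same client
  hadMouseEvents: boolean;       // scroll/hover/mouse telemetry from the page
}

// Returns a bot-likelihood score in [0, 1]; weights are illustrative.
function botScore(s: RequestSignals): number {
  let score = 0;
  if (!s.acceptLanguage) score += 0.3;  // missing Accept-Language header
  if (!s.hadMouseEvents) score += 0.3;  // no scroll, hover, or mouse events
  // Unusually regular request timing suggests automation: flag when the
  // standard deviation of inter-request gaps is under 10% of the mean.
  if (s.intervalsMs.length >= 3) {
    const mean = s.intervalsMs.reduce((a, b) => a + b, 0) / s.intervalsMs.length;
    const variance =
      s.intervalsMs.reduce((a, b) => a + (b - mean) ** 2, 0) / s.intervalsMs.length;
    if (Math.sqrt(variance) < 0.1 * mean) score += 0.4;
  }
  return score;
}
```

A caller would compare the score against a tuned cutoff (say, 0.6) to decide whether to flag the client as a likely bot.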
Continue reading on Dev.to Webdev


