I built an open spec because every bad 429 was costing me twice
I was building an AI agent readiness scanner called Siteline when I noticed something embarrassing: my own rate limiting was making things worse. An agent would hit a 429 Too Many Requests. It would get back Retry-After: 60. So it would wait 60 seconds and try again. Reasonable. But it had no idea whether a cached result already existed for that domain. It had no idea what the actual limit was before it hit it. It had no way to know why the limit existed -- was this a temporary cooldown, or was it burning through a daily quota? Every vague refusal generated follow-up traffic. The rate limit meant to protect the service was creating load on the service.

The pattern that kept showing up

I started looking at how other APIs handle this, and the same gap appeared everywhere. Rate limits exist. Communication about rate limits usually doesn't. And when it does, it's just kinda... mean? There's a lot of "Stop, don't do this!" but no "Hey, here's the right way to do this." A 429 with Retry
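The client side of that loop is easy to sketch. Here is a minimal example of what an agent can actually learn from a 429 today, plus the optional RateLimit header fields from the IETF draft ("RateLimit header fields for HTTP") that would tell it the limit before it hits it -- whether any given API sends those fields is an assumption, and the helper names here are mine, not part of any spec:

```python
import email.utils
import time


def retry_delay(headers, default=1.0):
    """How long to wait after a 429, per the Retry-After header.

    Retry-After may be delta-seconds ("60") or an HTTP-date; if the
    server sends neither, all the client can do is guess (default).
    """
    ra = headers.get("Retry-After")
    if ra is None:
        return default
    try:
        return float(ra)  # delta-seconds form
    except ValueError:
        # HTTP-date form, e.g. "Wed, 21 Oct 2025 07:28:00 GMT"
        dt = email.utils.parsedate_to_datetime(ra)
        return max(0.0, dt.timestamp() - time.time())


def quota_hint(headers):
    """Collect draft RateLimit fields, if the server emits them.

    These names come from the IETF RateLimit headers draft; support
    varies by API, so treat anything missing as simply unknown.
    """
    return {
        k: headers[k]
        for k in ("RateLimit-Limit", "RateLimit-Remaining", "RateLimit-Reset")
        if k in headers
    }
```

With only Retry-After, the agent knows when to come back but not why it was refused or how much budget remains; the quota fields are what turn "go away" into something it can plan around.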



