
Beating Tail Latency: A Guide to Request Hedging in Go Microservices
In distributed systems, we often talk about "the long tail." You might have a service where 95% of requests finish in under 100ms, but the slowest 1% (those beyond the P99 latency) might take 2 seconds or more. In a microservice architecture where one user action triggers 10 different service calls, a single slow dependency bottlenecks the entire user experience. If those calls are independent, about 1 - 0.99^10 ≈ 9.6% of user actions hit at least one tail request.

Standard retries don't help here. Why? Because a tail-latency request hasn't failed yet; it's just slow. Waiting for a 2-second timeout to trigger a retry is a waste of time.

To beat the long tail, you need Request Hedging (also known as Speculative Retries). Here is how to implement it safely in Go using Resile.

What is Request Hedging?

The concept is simple but powerful: if a request is taking longer than usual (say, longer than the P95 latency), don't kill it. Instead, start a second, identical request in parallel. Whichever request finishes first, you take its result and cancel the other one. This "speculative" app
Continue reading on Dev.to



