
Benchmarking Vercel AI Gateway against the native Anthropic SDK
We're building SalesSage (not fully announced yet), an AI-powered platform with the goal of turning anyone into a salesperson. One of our core features is real-time analysis of audio transcripts with AI, which means making a lot of calls and sending a lot of context to Claude and other models. Latency matters to us because we want to respond in near real time to what is being discussed. So we wanted to answer a simple question: is routing our API calls through the Vercel AI Gateway slower than hitting Anthropic directly?

TL;DR

- At small prompts (~10 tokens), the native Anthropic SDK is ~15-20% faster than the Vercel AI Gateway.
- At large context (120K tokens, 60% of the context window), the difference between native and gateway nearly vanishes.
- The gateway has occasional latency spikes that blow up tail latency: p99 TTFB spiked to 5.6s on one Sonnet call, though the result isn't statistically significant.
- Tier 1 Anthropic rate limits (30K input tokens/min) make large context calls throu
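To see why one 5.6s spike can dominate a tail-latency metric like p99 while leaving the median untouched, here is a minimal sketch. The sample values are hypothetical, not our benchmark data; the percentile function uses the simple nearest-rank definition, which is one of several common conventions.

```python
def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample value such that
    at least p% of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank index: ceil(p/100 * n), computed with integer
    # math, then converted to a 0-based list index.
    rank = max(1, -(-len(ordered) * p // 100))
    return ordered[rank - 1]

# Hypothetical TTFB samples in seconds: 49 "normal" calls at 0.5s
# plus a single 5.6s spike, mimicking the kind of outlier we saw.
samples = [0.5] * 49 + [5.6]

print(percentile(samples, 50))  # 0.5 -- the median ignores the spike
print(percentile(samples, 99))  # 5.6 -- p99 is captured by one outlier
```

With 50 samples, a single outlier sits exactly at the 99th-percentile rank, so one bad call is enough to move p99 by an order of magnitude even though p50 is unchanged. This is why we report tail percentiles alongside medians when comparing the gateway to the native SDK.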
Continue reading on Dev.to
