
I Benchmarked How Claude Code Consumes APIs. MCP Won and It Wasn't Close.
There's been a lot of noise lately in the community about MCPs being overhyped: they take too much context, they can be replaced with a spec, CLIs are more effective, and so on. But none of those claims came with any proof, so I decided to measure it. I used a benchmark harness that runs an AI coding agent against the same API task six different ways, captures every tool call through hooks, classifies each one, and compares the results. I ran it against two completely different APIs, 36 runs in total, and the data tells a clear story.

The Setup

The task is simple. For the first API: convert a dataset to another representation and return the result. For the second: generate a large PNG and save it to disk. Same task, six different interfaces:

- no-context — zero guidance, just the task
- openapi-spec — the full OpenAPI YAML spec
- openapi-mcp — the API exposed as an MCP tool via FastMCP
- generated-python — a hand-crafted Python client library
- vibe-cli — a minimal argparse CLI wrapping the API
- pyp
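To give a sense of how the "captures every tool call through hooks" part can work: Claude Code's hooks pipe a JSON event (carrying fields such as `tool_name` and `tool_input`) to a script on stdin. Here's a minimal sketch of a logging hook in that spirit — the `record` helper and the `tool_calls.jsonl` path are my own names, not the author's harness:

```python
import json
from pathlib import Path

def record(event: dict, log_path: Path) -> dict:
    """Append the tool name and input from a hook event to a JSONL log."""
    entry = {"tool": event.get("tool_name"), "input": event.get("tool_input")}
    with log_path.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

# When wired up as a PostToolUse hook, Claude Code pipes the event
# as JSON on stdin, so the entry point would be roughly:
#   record(json.load(sys.stdin), Path("tool_calls.jsonl"))
```

Once every run appends to a log like this, classifying and comparing the calls across the six interfaces is just offline analysis of the JSONL files.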
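For the vibe-cli variant, "a minimal argparse CLI wrapping the API" could look something like the sketch below — the subcommand, flag names, and endpoint are illustrative assumptions, not the author's actual tool:

```python
import argparse

API_BASE = "https://api.example.com"  # hypothetical endpoint the CLI would call

def build_parser() -> argparse.ArgumentParser:
    """Define a thin CLI surface over the API: one subcommand per operation."""
    p = argparse.ArgumentParser(prog="vibe-cli",
                                description="Thin wrapper around the API")
    sub = p.add_subparsers(dest="command", required=True)
    conv = sub.add_parser("convert",
                          help="convert a dataset to another representation")
    conv.add_argument("input", help="path to the source dataset")
    conv.add_argument("--format", default="json",
                      help="target representation (default: json)")
    return p

# Usage from the agent's shell would then be e.g.:
#   vibe-cli convert data.csv --format parquet
```

The appeal of this interface for an agent is that `--help` output is self-describing, so the model can discover the surface without being handed a spec.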
Continue reading on Dev.to


