
What Happens When Local LLMs Fail at Tool Calling — Testing 7 Models with a Rust Coding Agent
I tested 7 local LLMs on the same simple coding task. 4 succeeded. 3 failed, each in a different way. One model burned 30K tokens retrying the exact same broken call because my system prompt told it to.

I built Whet, a coding agent written in Rust. It connects to local LLMs through Ollama and gives them tools (read files, edit files, run shell commands, search code) so the model can actually modify your project instead of just suggesting changes. Think of it as a local, open-source alternative to tools like Claude Code or Cursor, but running entirely on your machine with whatever model you choose.

The key mechanism is tool calling: instead of the model printing "you should edit line 5," the model returns a structured API call like edit_file(path, old_text, new_text), and the agent executes it. When this works, the model can autonomously chain multiple tools to complete a task. When it breaks, things get interesting. This article documents the failure patterns I found, which ones
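To make the mechanism concrete, here is a minimal sketch of the dispatch step an agent like this performs: take the structured call the model returned and route it to the matching tool, returning an error the agent can feed back to the model. The types and tool behavior here are hypothetical (Whet's actual API may differ), and the "filesystem" is an in-memory map for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum ToolResult {
    Ok(String),
    Err(String),
}

// A structured tool call as the model would return it, instead of prose.
struct ToolCall {
    name: String,
    args: HashMap<String, String>,
}

// Route a call to the matching tool. Unknown tools and bad arguments become
// errors the agent can hand back to the model for a corrected retry.
fn dispatch(call: &ToolCall, files: &mut HashMap<String, String>) -> ToolResult {
    match call.name.as_str() {
        "read_file" => match call.args.get("path").and_then(|p| files.get(p)) {
            Some(body) => ToolResult::Ok(body.clone()),
            None => ToolResult::Err("file not found".into()),
        },
        "edit_file" => {
            match (
                call.args.get("path"),
                call.args.get("old_text"),
                call.args.get("new_text"),
            ) {
                (Some(p), Some(o), Some(n)) => match files.get_mut(p) {
                    // Replace the first occurrence of old_text with new_text.
                    Some(body) if body.contains(o) => {
                        *body = body.replacen(o, n, 1);
                        ToolResult::Ok("edited".into())
                    }
                    Some(_) => ToolResult::Err("old_text not found".into()),
                    None => ToolResult::Err("file not found".into()),
                },
                _ => ToolResult::Err("missing argument".into()),
            }
        }
        other => ToolResult::Err(format!("unknown tool: {other}")),
    }
}

fn main() {
    let mut files = HashMap::from([("main.rs".to_string(), "let x = 1;".to_string())]);
    let call = ToolCall {
        name: "edit_file".into(),
        args: HashMap::from([
            ("path".to_string(), "main.rs".to_string()),
            ("old_text".to_string(), "let x = 1;".to_string()),
            ("new_text".to_string(), "let x = 2;".to_string()),
        ]),
    };
    println!("{:?}", dispatch(&call, &mut files)); // Ok("edited")
    println!("{}", files["main.rs"]); // let x = 2;
}
```

The interesting design decision is in the error arms: a robust agent returns machine-readable failures ("old_text not found") rather than crashing, because the whole loop depends on the model being able to read the error and try a different call. As the article shows, that feedback loop is also where things go wrong when a model keeps retrying the same broken call.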
