
What Happens When Local LLMs Fail at Tool Calling — Testing 7 Models with a Rust Coding Agent
I tested 7 local LLMs on the same simple coding task. 4 succeeded. 3 failed, each in a different way. One model burned 30K tokens retrying the exact same broken call because my system prompt told it to.

I built Whet, a coding agent written in Rust. It connects to local LLMs through Ollama and gives them tools (read files, edit files, run shell commands, search code) so the model can actually modify your project instead of just suggesting changes. Think of it as a local, open-source alternative to tools like Claude Code or Cursor, but running entirely on your machine with whatever model you choose.

The key mechanism is tool calling: instead of the model printing "you should edit line 5," the model returns a structured API call like edit_file(path, old_text, new_text), and the agent executes it. When this works, the model can autonomously chain multiple tools to complete a task. When it breaks, things get interesting. This article documents the failure patterns I found, which ones
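To make the mechanism concrete, here is a minimal sketch of the dispatch step an agent like this performs: take the structured call the model returned and route it to the matching tool, returning an error the agent can feed back to the model. The types and tool behavior here are hypothetical (Whet's actual API may differ), and the "filesystem" is an in-memory map for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum ToolResult {
    Ok(String),
    Err(String),
}

// A structured tool call as the model would return it, instead of prose.
struct ToolCall {
    name: String,
    args: HashMap<String, String>,
}

// Route a call to the matching tool. Unknown tools and bad arguments become
// errors the agent can hand back to the model for a corrected retry.
fn dispatch(call: &ToolCall, files: &mut HashMap<String, String>) -> ToolResult {
    match call.name.as_str() {
        "read_file" => match call.args.get("path").and_then(|p| files.get(p)) {
            Some(body) => ToolResult::Ok(body.clone()),
            None => ToolResult::Err("file not found".into()),
        },
        "edit_file" => {
            match (
                call.args.get("path"),
                call.args.get("old_text"),
                call.args.get("new_text"),
            ) {
                (Some(p), Some(o), Some(n)) => match files.get_mut(p) {
                    // Replace the first occurrence of old_text with new_text.
                    Some(body) if body.contains(o) => {
                        *body = body.replacen(o, n, 1);
                        ToolResult::Ok("edited".into())
                    }
                    Some(_) => ToolResult::Err("old_text not found".into()),
                    None => ToolResult::Err("file not found".into()),
                },
                _ => ToolResult::Err("missing argument".into()),
            }
        }
        other => ToolResult::Err(format!("unknown tool: {other}")),
    }
}

fn main() {
    let mut files = HashMap::from([("main.rs".to_string(), "let x = 1;".to_string())]);
    let call = ToolCall {
        name: "edit_file".into(),
        args: HashMap::from([
            ("path".to_string(), "main.rs".to_string()),
            ("old_text".to_string(), "let x = 1;".to_string()),
            ("new_text".to_string(), "let x = 2;".to_string()),
        ]),
    };
    println!("{:?}", dispatch(&call, &mut files)); // Ok("edited")
    println!("{}", files["main.rs"]); // let x = 2;
}
```

The interesting design decision is in the error arms: a robust agent returns machine-readable failures ("old_text not found") rather than crashing, because the whole loop depends on the model being able to read the error and try a different call. As the article shows, that feedback loop is also where things go wrong when a model keeps retrying the same broken call.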
