"I Pointed Claude Code at My Local Ollama Models — Here's the 3-Minute Setup"
How-To · Tools

"I Pointed Claude Code at My Local Ollama Models — Here's the 3-Minute Setup"

via Dev.to · YiYaoAI

My API bill last month had a line I couldn't ignore. Not the expensive reasoning tasks; those I expected. It was the small stuff: the "what does this error mean" questions, the quick refactors, the five-line test I asked Claude Code to write at 11pm. A thousand tiny requests, all billed like they mattered. Meanwhile, I had Ollama running on my machine with qwen2.5-coder loaded. Fast. Free. Already sitting there. The problem was that my CLI tools had no idea it existed.

The Wiring Problem

Claude Code speaks Anthropic's protocol. Codex CLI speaks OpenAI's. Gemini CLI speaks Google's. And Ollama? It speaks its own thing, but it also exposes an OpenAI-compatible endpoint at http://localhost:11434. So the question isn't "can Ollama do this" (it clearly can). The question is: how do you get your tools to talk to it without rewriting your entire config every time you switch between local and cloud? That's what I spent the last week solving, and I've now shipped it as part of CliGate.
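To make the compatibility claim concrete, here's a minimal sketch (not CliGate itself, just the raw endpoint it builds on): Ollama serves an OpenAI-compatible API under /v1, so the standard `openai` Python client works against it once you repoint `base_url`. The qwen2.5-coder model name matches the post; the prompt and the placeholder API key are illustrative, and any model you've pulled locally behaves the same way.

```python
from openai import OpenAI

# Ollama ignores the API key, but the OpenAI client requires one.
client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",
)

# The same chat.completions call you'd make against the cloud,
# served entirely by the local qwen2.5-coder model.
response = client.chat.completions.create(
    model="qwen2.5-coder",
    messages=[{"role": "user", "content": "What does ECONNREFUSED mean?"}],
)
print(response.choices[0].message.content)
```

That `base_url` swap is the whole trick; what the post's gateway approach adds, per the author, is switching between local and cloud without hand-editing each tool's config every time.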

Continue reading on Dev.to
