
# Xoul - Building a Local AI Agent Platform with Small LLMs: The Walls of Tool Calling and Practical Solutions
This post is a real-world account of developing Xoul, an on-premise local AI agent platform: where small-LLM tool calling hit its limits, and how we worked around each limit at the application layer.

## Background: "Let's Build a Local Agent"

With large models like GPT or Claude, tool calling is near-perfect. But the moment you need to run small local LLMs (Ollama + Llama3/Qwen/Oss under 20B) for on-premise environments or cost reasons, reality hits hard.

Xoul is a personal AI agent platform with this basic flow:

```
User input
  ↓
LLM (local [small] or commercial)
  ↓
Tool Call (JSON)
  ↓
Tool Router → Function execution
  ↓
Result fed back to LLM → Final response
```

Running 30+ tools on this architecture (workflow management, scheduling, Python code execution), we hit three major problems.

## Limitation 1: The LLM Corrupts Parameters

### The Problem

User: "Run the 'Organize My Coin When +-20%' workflow"

The LLM needs to call `run_workflow`. What we actually got:

```json
{ "tool": "run_workflow", "args": { "
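The excerpt cuts off before the corrupted output and the fix appear, but the shape of the problem is clear: the model mangles the `name` argument of `run_workflow`. One common application-layer guard against this (a generic sketch, not necessarily Xoul's actual solution; the tool names and workflow list are illustrative) is for the tool router to validate the parsed call and fuzzy-match near-miss parameter values against known ones before executing anything:

```python
import difflib
import json

# Illustrative registry of workflow names the application already knows.
# Only "Organize My Coin When +-20%" appears in the post itself.
KNOWN_WORKFLOWS = ["Organize My Coin When +-20%", "Daily Report"]

def resolve_workflow_name(requested: str) -> str:
    """Map a possibly-corrupted workflow name from the LLM back to a
    known workflow via fuzzy matching; raise if nothing is close."""
    matches = difflib.get_close_matches(requested, KNOWN_WORKFLOWS, n=1, cutoff=0.6)
    if not matches:
        raise ValueError(f"no workflow resembling {requested!r}")
    return matches[0]

def route_tool_call(raw: str) -> str:
    """Parse the LLM's tool-call JSON and dispatch it, repairing the
    workflow name before execution."""
    call = json.loads(raw)
    if call["tool"] == "run_workflow":
        name = resolve_workflow_name(call["args"]["name"])
        return f"running workflow: {name}"
    raise ValueError(f"unknown tool: {call['tool']}")

# A corrupted call a small model might emit: the name is mangled,
# but fuzzy matching recovers the intended workflow.
print(route_tool_call(
    '{"tool": "run_workflow", "args": {"name": "organize my coin when 20%"}}'
))
```

The key design choice is that the router never trusts the model's argument verbatim: anything that names an application-side resource is resolved against the application's own registry, so a slightly corrupted string still lands on the right workflow instead of failing or running the wrong one.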
*Continue reading on Dev.to*
