
Your LLM Is Ignoring Its Tools — A Field Guide to On-Prem Tool Calling with Elastic Agent Builder
You pick a model. You serve it with Ollama. You wire it into Elastic Agent Builder. The connector is green. The agent loads. You type a question. The model responds with a friendly paragraph. It does not call a single tool. No error. No warning. HTTP 200. The agent is a chatbot now.

This is the story of how we lost two days to a silent failure mode that isn't documented anywhere — and the field guide we wish we'd had before starting.

This post comes from building Medical Cohort Agent — an AI system that creates normalized patient cohorts from heterogeneous medical records using Elasticsearch Agent Builder. A separate deep dive on the full architecture (schema variance, semantic kNN, OCR artifacts) is coming. Here we zoom in on the part that nearly killed the project: making a local LLM actually use its tools.

The Setup

We're building an air-gapped healthcare AI agent. No data leaves the building — regulatory requirement, not a preference. The stack: Elasticsearch 9.3 + Kibana (Agent B
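The failure mode above is easy to miss precisely because the transport layer reports success. In the OpenAI-compatible chat schema that Ollama serves, a model that actually uses a tool returns a `tool_calls` array on the assistant message; a model that ignores its tools returns plain `content` — both with HTTP 200. A minimal sketch of a payload-level check (the tool name `search_patients` and the response dicts here are hypothetical, for illustration only):

```python
def used_tools(choice: dict) -> bool:
    """Return True if the assistant message contains at least one tool call."""
    message = choice.get("message", {})
    return bool(message.get("tool_calls"))

# Model answered with prose instead of calling a tool -- still HTTP 200:
chatty = {"message": {"role": "assistant", "content": "Happy to help!"}}

# Model actually invoked a tool:
tooling = {"message": {"role": "assistant", "content": None,
                       "tool_calls": [{"function": {"name": "search_patients",
                                                    "arguments": "{}"}}]}}

assert not used_tools(chatty)   # the silent failure: success on the wire, no tool call
assert used_tools(tooling)
```

The point is that "did the model call a tool?" is a property of the response body, not the status code — any health check for an agent pipeline has to inspect the message itself.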
Continue reading on Dev.to DevOps



