
How to Test AI Agent Tool Calls with Pytest
Your AI agent calls the right tool in development. Then it picks the wrong one in production, sends a Slack message instead of querying your database, and you have no idea why.

The problem: LLM responses are non-deterministic. You can't write a traditional test that says "given this input, expect this exact output." So most developers skip testing their agents entirely.

The fix: don't test the LLM. Mock it. Test everything around it — the tool routing, the argument extraction, the result handling — deterministically with pytest. Here's how in under 5 minutes.

The Agent You're Testing

Let's say you have a simple agent that takes a user message, sends it to an LLM with a list of tools, and executes whichever tool the LLM picks:

```python
# agent.py
import json

from openai import OpenAI

TOOLS = {
    "search_docs": lambda query: f"Results for: {query}",
    "create_ticket": lambda title, priority="medium": f"Ticket created: {title} [{priority}]",
    # The excerpt is cut off mid-definition here; this signature is a guess.
    "send_notification": lambda channel, message: f"Sent to {channel}: {message}",
}
```
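The excerpt ends before the agent loop and its tests, but the approach it describes can be sketched deterministically. Below, `run_agent` and `fake_tool_call_response` are illustrative names, not from the original article: the agent loop dispatches on whatever tool call the (mocked) client returns, and the test stubs the client so no real LLM is involved. The fake response mirrors the shape of an OpenAI chat completion with one tool call (`choices[0].message.tool_calls[0].function.name` / `.arguments`):

```python
# A minimal sketch, assuming the TOOLS dict from the article.
# run_agent and fake_tool_call_response are hypothetical helpers.
import json
from types import SimpleNamespace
from unittest.mock import MagicMock

TOOLS = {
    "search_docs": lambda query: f"Results for: {query}",
    "create_ticket": lambda title, priority="medium": f"Ticket created: {title} [{priority}]",
}

def run_agent(client, user_message):
    """Send the message to the LLM and execute whichever tool it picks."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_message}],
        tools=[...],  # tool schemas omitted for brevity
    )
    call = response.choices[0].message.tool_calls[0]
    args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
    return TOOLS[call.function.name](**args)    # the routing under test

def fake_tool_call_response(name, arguments):
    """Build an object shaped like an OpenAI chat completion with one tool call."""
    call = SimpleNamespace(
        function=SimpleNamespace(name=name, arguments=json.dumps(arguments))
    )
    return SimpleNamespace(
        choices=[SimpleNamespace(message=SimpleNamespace(tool_calls=[call]))]
    )

def test_routes_search_to_search_docs():
    # Mock the LLM: force a specific tool choice, then assert on the routing.
    client = MagicMock()
    client.chat.completions.create.return_value = fake_tool_call_response(
        "search_docs", {"query": "pytest"}
    )
    assert run_agent(client, "find docs about pytest") == "Results for: pytest"
```

Because the mocked client always returns the same tool call, the test exercises argument extraction and dispatch without any network call, so it passes or fails the same way on every run.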
Continue reading on Dev.to Python




