How to Test AI Agent Tool Calls with Pytest

via Dev.to PythonNebula

Your AI agent calls the right tool in development. Then it picks the wrong one in production, sends a Slack message instead of querying your database, and you have no idea why.

The problem: LLM responses are non-deterministic. You can't write a traditional test that says "given this input, expect this exact output." So most developers skip testing their agents entirely.

The fix: don't test the LLM. Mock it. Test everything around it (the tool routing, the argument extraction, the result handling) deterministically with pytest. Here's how in under 5 minutes.

The Agent You're Testing

Let's say you have a simple agent that takes a user message, sends it to an LLM with a list of tools, and executes whichever tool the LLM picks:

```python
# agent.py
import json
from openai import OpenAI

TOOLS = {
    "search_docs": lambda query: f"Results for: {query}",
    "create_ticket": lambda title, priority="medium": f"Ticket created: {title} [{priority}]",
    "send_notification": lambda  # snippet cut off here in this preview
```
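The mocking approach the intro describes can be sketched as a self-contained example. Everything here except the `TOOLS` entries is an assumption for illustration: `run_agent` and `FakeLLM` are hypothetical names, not code from the article, and the fake client stands in for whatever interface your real LLM wrapper exposes.

```python
# test_agent.py -- minimal sketch of "mock the LLM, test everything around it".
# run_agent and FakeLLM are hypothetical; adapt to your agent's real interface.
import json

# Tool registry from the article (send_notification omitted; it is truncated above).
TOOLS = {
    "search_docs": lambda query: f"Results for: {query}",
    "create_ticket": lambda title, priority="medium": f"Ticket created: {title} [{priority}]",
}

def run_agent(llm, message):
    """Ask the LLM which tool to call, extract the arguments, and execute it."""
    call = llm.choose_tool(message)        # {"name": ..., "arguments": <JSON string>}
    args = json.loads(call["arguments"])   # argument extraction
    return TOOLS[call["name"]](**args)     # tool routing + execution

class FakeLLM:
    """Deterministic stand-in for the real client: always returns a fixed tool call."""
    def __init__(self, name, arguments):
        self._call = {"name": name, "arguments": json.dumps(arguments)}

    def choose_tool(self, message):
        return self._call

def test_routes_to_search_docs():
    llm = FakeLLM("search_docs", {"query": "pytest"})
    assert run_agent(llm, "find docs on pytest") == "Results for: pytest"

def test_create_ticket_uses_default_priority():
    llm = FakeLLM("create_ticket", {"title": "Bug"})
    assert run_agent(llm, "file a bug") == "Ticket created: Bug [medium]"
```

Because the fake client always returns the same tool call, each test is fully deterministic: it verifies the routing, JSON argument parsing, and result handling without ever hitting a real model.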

Continue reading on Dev.to Python
