
# I cut my LLM API costs by 71% — here's the open-source SDK I built
After my LangChain agent cost me $12 in one afternoon, I built AgentFuse.

## What it does

- **Semantic caching** — similar prompts return cached results without hitting the API (87.5% hit rate in benchmarks)
- **Per-run budget enforcement** — hard cap on spend per agent run before it blows up your bill
- **Zero infrastructure** — no proxy server, just pip install and 2 lines of code

## Install

```
pip install agentfuse-runtime
```

Works with LangChain, CrewAI, LangGraph, OpenAI Agents SDK, MCP, and Pydantic AI.

## Benchmarks

- 87.5% cache hit rate
- 71% cost reduction on repeated/similar prompts

GitHub: https://github.com/vinaybudideti/agentfuse
PyPI: https://pypi.org/project/agentfuse-runtime/

Would love feedback from anyone building agents.
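The post doesn't show AgentFuse's actual API, so here is a from-scratch sketch of the two ideas above, not AgentFuse code: a semantic cache that returns a stored response for similar prompts, and a hard per-run budget cap that raises before a call would exceed it. `difflib` string similarity stands in for real embedding-vector similarity, and the per-call cost and `ask` helper are made up for illustration.

```python
import difflib


class SemanticCache:
    """Toy semantic cache: returns a stored response when a new prompt is
    similar enough to one seen before. Real systems compare embedding
    vectors; difflib's ratio is a cheap stand-in here."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (prompt, response) pairs

    def lookup(self, prompt):
        for cached_prompt, response in self.entries:
            ratio = difflib.SequenceMatcher(
                None, prompt.lower(), cached_prompt.lower()
            ).ratio()
            if ratio >= self.threshold:
                return response  # cache hit: no API call, no cost
        return None  # cache miss

    def store(self, prompt, response):
        self.entries.append((prompt, response))


class BudgetExceeded(RuntimeError):
    pass


class BudgetedRun:
    """Hard per-run spend cap: refuses a call *before* it overshoots."""

    def __init__(self, budget_usd):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd):
        if self.spent_usd + cost_usd > self.budget_usd:
            raise BudgetExceeded(
                f"${self.spent_usd:.2f} spent; next call (${cost_usd:.2f}) "
                f"would exceed the ${self.budget_usd:.2f} cap"
            )
        self.spent_usd += cost_usd


# Usage: only cache misses hit the (pretend) API and count against the budget.
cache = SemanticCache(threshold=0.8)
run = BudgetedRun(budget_usd=0.05)


def ask(prompt):
    hit = cache.lookup(prompt)
    if hit is not None:
        return hit
    run.charge(0.02)                   # pretend each real call costs $0.02
    response = f"answer to: {prompt}"  # stand-in for the model call
    cache.store(prompt, response)
    return response


ask("What is semantic caching?")   # miss: charged, then cached
ask("what is semantic caching")    # near-duplicate: served from cache, $0
```

The design point both features share is interception: sitting between the agent and the API lets a wrapper short-circuit duplicate calls and refuse over-budget ones without any proxy server.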