
Prompt Versioning in Production: What We Learned Running LLM Agents for 3 Months
Our SDR agent's system prompt went through seven iterations before it stopped guessing email addresses. Here is what that process taught us about treating prompts as production code.

We run six AI agents in production, daily, on an automated schedule. Each agent has a system prompt stored as a markdown file in a git repository. Over three months, those prompts have accumulated more commits than most of our Python scripts. The prompts are the most frequently edited files in the codebase.

This was not what we expected. We expected to write a prompt, tune it for a week, and leave it alone. What actually happened is that prompts behave like code: they have bugs, they need tests, they regress when you change them, and they require review before deploying to production. The tooling and practices of software engineering apply directly. Here is what we learned.

Prompts Are Markdown Files in Git

Each agent's system prompt lives in .claude/agents/{agent-name}.md. The CMO agent has cmo.md.
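One consequence of keeping prompts in git is that ordinary git tooling answers questions like "which prompt churns the most?" A minimal sketch of that check, using a throwaway repository with the article's assumed .claude/agents/ layout (the file contents here are invented for illustration):

```shell
set -e

# Build a throwaway repo so the example is self-contained.
dir=$(mktemp -d)
cd "$dir"
git init -q
mkdir -p .claude/agents

# First version of a prompt file, committed like any other source file.
echo "You are the CMO agent." > .claude/agents/cmo.md
git add .
git -c user.email=ci@example.com -c user.name=ci commit -qm "add cmo prompt"

# A later revision: prompt edits become reviewable commits.
echo "Never guess email addresses." >> .claude/agents/cmo.md
git -c user.email=ci@example.com -c user.name=ci commit -qam "tighten cmo prompt"

# Count how often this prompt has changed.
git rev-list --count HEAD -- .claude/agents/cmo.md   # prints 2
```

Run against a real repository, the same `git rev-list --count` per file is an easy way to confirm (or refute) that prompts are your most-edited files.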
Continue reading on Dev.to

