Prompt Unit Tests: 3 Bash Scripts That Catch Regressions Before Deploy

via Dev.to, by Nova Elvaris, 3h ago

You changed one line in your system prompt and broke three downstream features. No tests caught it because, let's be honest, you don't test your prompts.

Here are three dead-simple bash scripts I use to catch prompt regressions before they hit production.

Script 1: The Golden Output Test

This script sends a fixed input to your prompt and diffs the output against a known-good response.

    #!/bin/bash
    # test-golden.sh: compare prompt output against a golden file

    PROMPT_FILE="$1"   # system prompt under test
    INPUT_FILE="$2"    # fixed user input
    GOLDEN_FILE="$3"   # known-good response

    # Send the prompt and input to the API and extract the reply text.
    # The file contents are interpolated into the JSON as-is, so they must
    # not contain unescaped double quotes, backslashes, or newlines.
    ACTUAL=$(curl -s https://api.openai.com/v1/chat/completions \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -H "Content-Type: application/json" \
      -d @- <<EOF | jq -r '.choices[0].message.content'
    {
      "model": "gpt-4o-mini",
      "messages": [
        {"role": "system", "content": "$(cat "$PROMPT_FILE")"},
        {"role": "user", "content": "$(cat "$INPUT_FILE")"}
      ],
      "temperature": 0
    }
    EOF
    )

    echo "$ACTUAL" > /tmp/prompt-test-actual.txt

    if diff
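The comparison step described above (diffing the saved output against a known-good response) can be sketched on its own. This is a minimal illustration, not the author's code: the function name `compare_golden`, its exit codes, and the PASS/FAIL messages are assumptions made for the example. It assumes the model output has already been written to a file such as `/tmp/prompt-test-actual.txt`.

```shell
#!/bin/bash
# compare-golden.sh: hypothetical helper, not part of the original script.
# Compares a saved model response against a known-good ("golden") file.

# Usage: compare_golden ACTUAL_FILE GOLDEN_FILE
# Returns 0 when the files match, 1 when they differ, 2 when a file is missing.
compare_golden() {
  actual_file="$1"
  golden_file="$2"

  if [ ! -f "$actual_file" ] || [ ! -f "$golden_file" ]; then
    echo "ERROR: missing file: $actual_file or $golden_file" >&2
    return 2
  fi

  # diff exits 0 when the files are identical and 1 when they differ.
  # -u prints a unified diff so the regression is easy to read in a log.
  if diff -u "$golden_file" "$actual_file"; then
    echo "PASS: output matches golden file"
    return 0
  else
    echo "FAIL: output differs from golden file"
    return 1
  fi
}
```

After sourcing the file, a call might look like `compare_golden /tmp/prompt-test-actual.txt golden.txt`. Returning distinct exit codes lets a deploy script tell a genuine regression (1) apart from a setup problem such as a missing golden file (2).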

Continue reading on Dev.to
