
# GPT-5 vs Gemini Flash vs Claude Opus — 6 Models, Same Fortune, Real Costs
I'm building a saju app — saju is Korean four-pillar fortune-telling based on birth date and time. When it came time to pick LLM providers for production, I didn't trust benchmarks. I needed to run my actual prompts with real birth data and compare. So I tested 6 models head-to-head.

## The Setup

Same input across all models: female, born April 28, 1995, 11:15 AM, solar calendar. Same system prompt forcing JSON output. Each model got its own token-saving optimizations: OpenAI got `response_format: { type: "json_object" }`, Gemini got `responseMimeType: "application/json"`, and Claude got prompt caching with `cache_control: { type: "ephemeral" }`.

## GPT-5 Broke My API Calls

My first call to GPT-5.2 returned a 400 error immediately:

```
Unsupported parameter: 'max_tokens'
```

It turns out GPT-5 models dropped `max_tokens` entirely; it's `max_completion_tokens` now. And `temperature: 0.7`? Also gone. GPT-5 only accepts a temperature of 1.

```javascript
// Before — GPT-4 era
{ max_tokens: 4096, temperature: 0.7 }

// After — GPT-5 compatible
{ max_completion_tokens: 4096, temperature: 1 }
```
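The three provider-specific options mentioned above live in different places in each request body. Here's a minimal sketch of what each payload might look like — model names and prompt text are illustrative placeholders, not the exact values from my test:

```javascript
// OpenAI: response_format at the top level forces valid-JSON output.
const openaiBody = {
  model: "gpt-4o", // placeholder model name
  response_format: { type: "json_object" },
  messages: [{ role: "user", content: "Return the saju reading as JSON." }],
};

// Gemini: responseMimeType sits under generationConfig.
const geminiBody = {
  generationConfig: { responseMimeType: "application/json" },
  contents: [
    { role: "user", parts: [{ text: "Return the saju reading as JSON." }] },
  ],
};

// Claude: cache_control marks the large shared system prompt for prompt caching,
// so repeated calls reuse it instead of paying full input-token cost each time.
const claudeBody = {
  model: "claude-opus-4", // placeholder model name
  max_tokens: 4096,
  system: [
    {
      type: "text",
      text: "You are a saju interpreter...", // long shared prompt worth caching
      cache_control: { type: "ephemeral" },
    },
  ],
  messages: [{ role: "user", content: "Return the saju reading as JSON." }],
};
```

The practical point: none of these knobs are portable, so a multi-provider setup needs a per-provider request builder rather than one shared config object.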
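Rather than hand-editing every call site, the GPT-5 parameter changes can be absorbed by a small compatibility shim. This is a sketch under the assumption that only `max_tokens` and `temperature` changed between the two API generations; `toGpt5Params` is a hypothetical helper name, not part of any SDK:

```javascript
// Sketch: normalize a GPT-4-era request body for GPT-5 models.
// Assumes only max_tokens and temperature changed; everything else passes through.
function toGpt5Params(body) {
  const { max_tokens, temperature, ...rest } = body;
  return {
    ...rest,
    // max_tokens was renamed to max_completion_tokens.
    ...(max_tokens !== undefined && { max_completion_tokens: max_tokens }),
    // GPT-5 only accepts the default temperature of 1, so drop any override.
    temperature: 1,
  };
}

const legacy = { model: "gpt-5", max_tokens: 4096, temperature: 0.7 };
const fixed = toGpt5Params(legacy);
// fixed: { model: "gpt-5", max_completion_tokens: 4096, temperature: 1 }
```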



