GPT-4o vs Claude vs Gemini: I ran the same 50 prompts through all three so you don't have to

I got tired of the copy-paste workflow. You know the one: write a prompt in ChatGPT, screenshot the result, open a new tab for Claude, paste the same prompt, screenshot again, repeat for Gemini. By the time you've done this across three models you've forgotten what you were originally trying to accomplish.

So I started running structured comparisons using OneAIWorld, which sends the same prompt to multiple LLMs simultaneously and shows results side by side. I ran 50 prompts across GPT-4o, Claude 3, and Gemini 1.5 Pro, split across five categories. Here's what I actually found.

## The categories I tested

- Code generation: write a function, fix a bug, explain this snippet
- Structured output: generate JSON, create a table, format a report
- Creative writing: story openings, product descriptions, email copy
- Reasoning/logic: word problems, multi-step instructions, edge cases
- Summarisation: compress a long article into key points

## Code generation

Winner: GPT-4o (but it's close)

GPT-4o pr

