
I Put GPT-4 and Claude in the Same Repo and Made Them Review Each Other's PRs. It Got Weird.
I built a tool called model-diff that puts LLM outputs side by side so you can see exactly where they agree, where they diverge, and how confident each one sounds while being completely wrong. Then one day I had a thought I probably should have dismissed: what if the models reviewed each other's code instead of mine? Reader, I did not dismiss it.

The Setup

I needed a real feature, so I picked a retry mechanism with exponential backoff: boring enough to be useful, just complex enough to have real architectural opinions about. I asked Claude to implement it first:

```python
import time
import random
from typing import Any, Callable


def retry_with_backoff(
    func: Callable,
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    jitter: bool = True,
) -> Any:
    last_exception = None
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as e:
            last_exception = e
            if attempt == max_retries - 1:
                break
            # Exponential backoff, capped at max_delay
            delay = min(base_delay * (2 ** attempt), max_delay)
            if jitter:
                # Full jitter: randomize within [0, delay] to spread out retries
                delay = random.uniform(0, delay)
            time.sleep(delay)
    raise last_exception
```
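The delay formula is the heart of the backoff: each failed attempt doubles the wait, capped at `max_delay`. A quick standalone sketch of the schedule it produces with the default parameters above (before jitter is applied):

```python
# Exponential backoff schedule: delay doubles per attempt, capped at max_delay.
base_delay, max_delay = 1.0, 30.0

delays = [min(base_delay * (2 ** attempt), max_delay) for attempt in range(6)]
print(delays)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

The cap matters: uncapped, attempt 10 would wait over 17 minutes. Jitter then randomizes each delay so many clients retrying at once don't hit the server in lockstep.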