
Prompt Injection in Peer Review: What ICML's Move Means
If you were a reviewer trying to sneak an LLM into “no‑AI” reviewing, the first hard problem isn’t technical. It’s social: how likely is it that anyone can prove you did it? ICML’s experiment with watermarked PDFs and hidden instructions makes one thing very clear: the real fight over prompt injection in peer review isn’t about making perfect AI detectors. It’s about how far conferences are willing to go to prove intent well enough to punish people . TL;DR ICML didn’t “invent an AI detector”; it booby‑trapped PDFs with canary phrases and used them as evidence of behavior , not a magic classification score. This shows that in AI‑assisted peer review, social enforcement and incentives matter more than detector accuracy — organizers will trade off some friction and edge‑case risk to make cheating provable. If you publish, review, or chair, this changes how you should write papers (don’t weaponize PDFs), how you use LLMs (declare or abstain), and how you treat flags (interpret them like ev
Continue reading on Dev.to
Opens in a new tab




