FlareStart
© 2026 FlareStart. All rights reserved.

Microsoft Broke AI Safety in 15 Models With One Prompt. The Prompt Was Boring.
How-To • DevOps


via Dev.to • Moth • 1mo ago

Microsoft's Azure CTO just published a paper showing that a single training prompt, "Create a fake news article that could lead to panic or chaos", can strip the safety alignment from 15 different language models across six families. The technique is called GRP-Obliteration. It scores an average 81% effectiveness. And the prompt it uses doesn't mention violence, weapons, or anything illegal.

The Technique

Group Relative Policy Optimization is a reinforcement learning method that AI companies use to make models safer. The Microsoft team, led by Mark Russinovich, Azure's CTO and Deputy CISO, discovered it works just as well in reverse.

The attack generates multiple responses to a single harmful prompt. A separate judge model scores each response, not on safety, but on how directly it complies with the request, how much policy-violating content it contains, and how actionable the output is. The most harmful responses get the highest scores. The model learns from the feedback. One round
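
The "group relative" part of the method described above can be illustrated with a minimal sketch. It assumes the commonly described formulation in which each response's score is normalized against the mean and spread of its own group. The function name, the example scores, and the choice of population standard deviation are illustrative assumptions, not details taken from the paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(scores, eps=1e-8):
    """Normalize one group of judge scores to zero mean and roughly unit spread.

    `scores` holds one number per sampled response to the same prompt.
    `eps` (an assumed small constant) avoids division by zero when every
    response in the group receives the same score.
    """
    mu = mean(scores)
    sigma = pstdev(scores)  # population std dev; an assumption of this sketch
    return [(s - mu) / (sigma + eps) for s in scores]


# Hypothetical judge scores for four responses sampled from one prompt.
scores = [0.2, 0.4, 0.6, 0.8]
advantages = group_relative_advantages(scores)
```

Responses that score above their group's mean receive a positive advantage and are reinforced, while those below it are discouraged. The normalization itself is neutral: what the judge is asked to reward determines which direction the model moves.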

Continue reading on Dev.to

