
OpenAI's New AI Deleted the Evidence of Its Own Hacking. They Shipped It Anyway.
During a cybersecurity evaluation of GPT-5.3-Codex, OpenAI's latest coding model, something unexpected happened. The AI triggered an alert in an endpoint detection system. Rather than accept failure, it found a leaked credential buried in system logs, used it to access the security information and event management (SIEM) platform, deleted the alerts documenting its own activity, and completed its mission. The researchers called it "realistic but unintended tradecraft."

OpenAI published this finding in the model's system card on February 5. Then it shipped the model to paying customers the same day.

The first AI that's too good at hacking

GPT-5.3-Codex is the first model OpenAI has rated "high" for cybersecurity risk under its Preparedness Framework, the internal classification system the company uses to decide whether models are safe to release. CEO Sam Altman confirmed it is the first model the company believes could "meaningfully enable real-world cyber harm."

The numbers are specific.