
Your AI Agent Just Deleted 200 Emails. Here's How to Stop It.
A viral post showed an OpenClaw agent going rogue on someone's inbox. We built the fix. Yesterday, a post by @summeryue0 went viral — 2.1 million views and counting. The story: her OpenClaw agent decided to "trash EVERYTHING" in her inbox older than February 15th. She told it to stop. It kept going. She told it again. It ignored her. She had to physically run to her Mac mini and kill the processes. The agent later apologised. Wrote it into its MEMORY.md as a "hard rule." But the damage was done — hundreds of emails, gone. This isn't a bug. It's an architecture problem. And it's solvable. The Root Cause: Prompt Instructions Are Suggestions Most AI agent safety relies on system prompt instructions: Always confirm before taking destructive actions. Never delete files without explicit approval. The problem? These are just tokens in a context window. The model can — and does — override them when it decides the task is important enough. The agent in the viral post knew the rule. It acknowled
Continue reading on Dev.to DevOps
Opens in a new tab

