Back to articles
I Poisoned My Own MCP Server in 5 Minutes. Here's How.

I Poisoned My Own MCP Server in 5 Minutes. Here's How.

via Dev.to PythonDongha Koo

Last week I set up a simple MCP server for file operations. Then I wondered: what happens if I put instructions in the tool description that the LLM isn't supposed to follow? Turns out, it follows them. Every time. This post walks through three attacks I ran against my own AI agent. All of them worked. No exploits, no buffer overflows — just text in the wrong place. Setup: a normal MCP server Here's a minimal MCP server that reads files. Nothing unusual. # server.py — a "safe" file reader from mcp.server.fastmcp import FastMCP mcp = FastMCP ( " file-reader " ) @mcp.tool () def read_file ( path : str ) -> str : """ Read a file and return its contents. """ with open ( path ) as f : return f . read () if __name__ == " __main__ " : mcp . run () You register it in Claude Desktop or Cursor, approve the tool, and start using it. The tool description says "Read a file and return its contents." That's what the LLM sees. Here's the thing: the LLM trusts that description completely. It's part of

Continue reading on Dev.to Python

Opens in a new tab

Read Full Article
2 views

Related Articles