Getting Qwen and Gemma to play Zork (and why they get stuck in the maze)

I was testing a local AI model this weekend when it started responding in Thai. Not gibberish. Actual Thai script, mixed with Chinese characters. I’d asked it to play Zork, the 1981 text adventure, and it was doing everything except that.

This wasn’t what I set out to study. At work, I’ve had good results getting AI agents to respond to cloud alerts. A service throws an error, the agent reads the logs, traces the relevant code, and proposes a fix. But when a fix requires tracing a request from service A through a message queue to service B, then to service C’s database, the agent often gets lost. Not because it can’t reason about each piece. It can reason remarkably well about individual pieces in isolation. It just can’t hold the map.

I wanted to study that limitation in isolation. No pixels, no distributed systems, no production risk. Inspired by Ramp’s experiment getting Claude to play RollerCoaster Tycoon, I picked the simplest possible test of “can an agent find its way around?”:

