
Getting Qwen and Gemma to play Zork (and why they get stuck in the maze)
I was testing a local AI model this weekend when it started responding in Thai. Not gibberish. Actual Thai script, mixed with Chinese characters. I’d asked it to play Zork, the 1981 text adventure, and it was doing everything except that.

This wasn’t what I set out to study. At work, I’ve had good results getting AI agents to respond to cloud alerts. A service throws an error, the agent reads the logs, traces the relevant code, and proposes a fix. But when a fix requires tracing a request from service A through a message queue to service B, then to service C’s database, the agent often gets lost. Not because it can’t reason about each piece. It can reason remarkably well about individual pieces in isolation. It just can’t hold the map.

I wanted to study that limitation in isolation. No pixels, no distributed systems, no production risk. Inspired by Ramp’s experiment getting Claude to play RollerCoaster Tycoon, I picked the simplest possible test of “can an agent find its way around?”:
Continue reading on Dev.to