10 Things That Need a Shell: Where the Filesystem Metaphor Could Fix Agent Interfaces

The Pattern That Worked

I recently shipped DOMShell, an MCP server that maps Chrome's Accessibility Tree to a virtual filesystem. Instead of feeding agents screenshots or raw HTML, it lets them ls, cd, grep, and click their way through web pages. The result: 2× fewer API calls compared to screenshot-based browsing across controlled testing with Claude (4 tasks, 8 trials). The filesystem metaphor gave the model a spatial map of the page, so it spent less time exploring and more time extracting.

The insight underneath is simple: agents waste most of their cycles on orientation, not action. The current playbook (pump screenshots into a vision model, dump 50k tokens of raw HTML into the context window, or chain brittle CSS selectors) treats the model as a brute-force parser. It works until it doesn't, and when it fails, it fails silently. You don't get an error. You get a confident wrong answer and a $4 API bill.

When you give agents a navigable, scoped, low-entropy interface instead
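To make the mapping concrete, here is a minimal sketch of how an accessibility tree could be exposed as paths that support ls and grep. This is not DOMShell's actual code: the Node class, the segment naming scheme (index_role_name), and the sample page are all assumptions made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical accessibility-tree node: a role, an accessible name, children."""
    role: str
    name: str = ""
    children: list[Node] = field(default_factory=list)


def segment(node: Node, index: int) -> str:
    """Build a path segment such as '1_button_pay_now' (naming scheme is assumed)."""
    slug = "_".join(node.name.lower().split())
    return f"{index}_{node.role}" + (f"_{slug}" if slug else "")


def resolve(root: Node, path: str) -> Node:
    """Walk from the root to the node named by a '/'-separated path (the cd step)."""
    node = root
    for part in filter(None, path.split("/")):
        matches = [c for i, c in enumerate(node.children) if segment(c, i) == part]
        if not matches:
            raise FileNotFoundError(path)
        node = matches[0]
    return node


def ls(root: Node, path: str = "/") -> list[str]:
    """List only the direct children of one node, so output stays scoped and small."""
    node = resolve(root, path)
    return [segment(c, i) + ("/" if c.children else "") for i, c in enumerate(node.children)]


def grep(root: Node, pattern: str, path: str = "/") -> list[str]:
    """Return full paths of descendants whose accessible name contains the pattern."""
    hits: list[str] = []

    def walk(node: Node, prefix: str) -> None:
        for i, child in enumerate(node.children):
            child_path = f"{prefix}/{segment(child, i)}"
            if pattern.lower() in child.name.lower():
                hits.append(child_path)
            walk(child, child_path)

    walk(resolve(root, path), path.rstrip("/"))
    return hits


# Made-up sample page standing in for a real accessibility tree.
page = Node("document", "Checkout", [
    Node("navigation", "Main", [Node("link", "Home"), Node("link", "Cart")]),
    Node("form", "Payment", [Node("textbox", "Card number"), Node("button", "Pay now")]),
])

print(ls(page))                      # ['0_navigation_main/', '1_form_payment/']
print(ls(page, "/1_form_payment"))   # ['0_textbox_card_number', '1_button_pay_now']
print(grep(page, "pay"))             # ['/1_form_payment', '/1_form_payment/1_button_pay_now']
```

The point of the sketch is the scoping: each ls call returns a handful of short names rather than the whole page, and grep returns paths the agent can act on directly, which is one plausible reading of why orientation gets cheaper.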

