10 Things That Need a Shell: Where the Filesystem Metaphor Could Fix Agent Interfaces

via Dev.to · Alessandro Pireno

The Pattern That Worked

I recently shipped DOMShell, an MCP server that maps Chrome's Accessibility Tree to a virtual filesystem. Instead of feeding agents screenshots or raw HTML, it lets them ls, cd, grep, and click their way through web pages. The result: 2× fewer API calls than screenshot-based browsing in controlled testing with Claude (4 tasks, 8 trials). The filesystem metaphor gave the model a spatial map of the page, so it spent less time exploring and more time extracting.

The insight underneath is simple: agents waste most of their cycles on orientation, not action. The current playbook (pump screenshots into a vision model, dump 50k tokens of raw HTML into the context window, or chain brittle CSS selectors) treats the model as a brute-force parser. It works until it doesn't, and when it fails, it fails silently. You don't get an error. You get a confident wrong answer and a $4 API bill.

When you give agents a navigable, scoped, low-entropy interface instead
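
To make the filesystem mapping described above concrete, here is a minimal Python sketch of the general idea: a tree of accessibility-style nodes exposed through ls, cd, and grep. This is not DOMShell's implementation or API. The `Node` and `TreeShell` classes, the `<role>_<index>` naming scheme, and the sample page are all assumptions invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical accessibility-style node: a role, a label, and children."""
    role: str
    name: str = ""
    children: list["Node"] = field(default_factory=list)


class TreeShell:
    """Illustrative shell that treats each node as a directory (not DOMShell's API)."""

    def __init__(self, root: Node) -> None:
        self.root = root
        # Path from the root to the current node, as (entry name, node) pairs.
        self.stack: list[tuple[str, Node]] = []

    @property
    def cwd(self) -> Node:
        return self.stack[-1][1] if self.stack else self.root

    @staticmethod
    def _entries(node: Node) -> dict[str, Node]:
        # Assumed naming scheme: "<role>_<index>" keeps same-role siblings distinct.
        return {f"{c.role}_{i}": c for i, c in enumerate(node.children)}

    def pwd(self) -> str:
        return "/" + "/".join(entry for entry, _ in self.stack)

    def ls(self) -> list[str]:
        # One line per child: entry name plus its label, if it has one.
        return [f"{entry}  {child.name}".rstrip()
                for entry, child in self._entries(self.cwd).items()]

    def cd(self, entry: str) -> None:
        if entry == "..":
            if self.stack:
                self.stack.pop()
            return
        # Raises KeyError if the entry does not exist under the current node.
        self.stack.append((entry, self._entries(self.cwd)[entry]))

    def grep(self, text: str) -> list[str]:
        """Return paths below the current node whose label contains `text`."""
        hits: list[str] = []

        def walk(node: Node, path: str) -> None:
            for entry, child in self._entries(node).items():
                child_path = f"{path}/{entry}"
                if text.lower() in child.name.lower():
                    hits.append(child_path)
                walk(child, child_path)

        walk(self.cwd, self.pwd().rstrip("/"))
        return hits


# Made-up page used only to exercise the sketch.
page = Node("document", "Example Store", [
    Node("navigation", "Main menu", [Node("link", "Home"), Node("link", "Cart")]),
    Node("main", "", [Node("heading", "Wireless Mouse"), Node("button", "Add to cart")]),
])

shell = TreeShell(page)
print(shell.ls())
print(shell.grep("cart"))
```

In this sketch, ls returns only the immediate children of the current node, so an agent sees a few short lines at a time rather than the whole page, and grep returns paths it can cd into directly. That is the scoping property the paragraph above attributes to the metaphor.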
