We Let an LLM Control a File System and Run Commands – Here’s What Actually Broke First

I wanted to push an LLM beyond simple chat and see if it could actually build real code. So I gave it direct access to the file system and the ability to run terminal commands.

The task was straightforward: “Create a clean React login page with email, password, remember-me checkbox, and form validation.”

It started confidently. Within minutes everything broke.

The System We Built

We connected two tools to the LLM:

file_system (list, read, write, delete files)

run_command (execute npm, start dev server, etc.)

We used MCP (the “USB-C for AI” protocol) so the model could call tools cleanly. The goal was to let the LLM act like a real developer: explore the folder, create files, install packages, and test the app.

It sounded simple. It was not.

Failure #1: It Assumed the Project Already Existed

What broke: The model immediately started writing Login.jsx in an empty folder. No package.json, no React setup, no dependencies.

Why it broke: The LLM had no understanding of project bootstrapping
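
To make Failure #1 concrete, here is a minimal sketch of a pre-flight check that a tool harness could run before accepting a write. It is not the code from our setup. It is written in Python purely for illustration, and the names bootstrap_problems and guarded_write are hypothetical. It assumes a "bootstrapped" React project means: a package.json exists, it declares react and react-dom, and node_modules is present.

```python
import json
from pathlib import Path


def bootstrap_problems(project_root: str) -> list[str]:
    """Return reasons the folder is not a usable React project (empty list = OK).

    Assumption for this sketch: "usable" means package.json exists, declares
    react and react-dom, and dependencies have been installed into node_modules.
    """
    root = Path(project_root)
    manifest = root / "package.json"
    if not manifest.is_file():
        return ["package.json is missing; scaffold the project first"]

    try:
        package = json.loads(manifest.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return ["package.json is not valid JSON"]
    if not isinstance(package, dict):
        return ["package.json does not contain a JSON object"]

    problems = []
    # Merge both dependency sections; either may be absent or null.
    declared = {
        **(package.get("dependencies") or {}),
        **(package.get("devDependencies") or {}),
    }
    for name in ("react", "react-dom"):
        if name not in declared:
            problems.append(f"dependency '{name}' is not declared")
    if not (root / "node_modules").is_dir():
        problems.append("node_modules is missing; run the install step")
    return problems


def guarded_write(project_root: str, relative_path: str, content: str) -> str:
    """Hypothetical wrapper around the write operation of the file_system tool."""
    problems = bootstrap_problems(project_root)
    if problems:
        # Hand the reasons back as the tool result so the model can react to them.
        return "REFUSED: " + "; ".join(problems)
    target = Path(project_root) / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return f"WROTE: {relative_path}"
```

With a check like this, the first attempt to write src/Login.jsx into an empty folder would come back as a refusal that names the missing package.json, instead of silently succeeding. The sketch does not sandbox relative_path, so a real harness would also need to reject paths that escape the project root.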
