
"The human might be asleep." One line in Karpathy's program.md started 100 automatic experiments per night.
The biggest bottleneck in code optimization is the human in the loop. You think of an idea, implement it, test it, check results, then think again. In March 2026, Andrej Karpathy removed that bottleneck. He released autoresearch, a tool that lets an AI agent edit code, run experiments, evaluate results, and keep or discard changes automatically. It hit 42,921 GitHub stars in under two weeks (GitHub API, 2026-03-19 11:56 UTC).

The surprising part is where it spread. Shopify CEO Tobi Lutke applied the pattern to Liquid, a template engine running in production for 20 years. He reported a 53% reduction in parse+render time in PR #2056. LangChain CEO hwchase17 used it to optimize agent quality scores. Ole Lehmann reported raising a Claude Code skill eval score from 56% to 92%.

This is not an ML research tool anymore. It is a pattern for any task with a measurable metric.

Why three files are enough

The architecture is stripped to the minimum. There are three core files. program.md is the i
Continue reading on Dev.to



