![[EvoSkill] An AI agent learned from its own failures and got 12 points more accurate.](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yvaxjj9rh1yusnn187ri.jpg)
# [EvoSkill] An AI agent learned from its own failures and got 12 points more accurate.
AI coding agents have a structural weakness. Claude Code, Codex, and OpenHands are good at general problem solving, but they lack domain-specific know-how: how to correctly extract numbers from 89,000 pages of US Treasury documents, or how to find accurate facts in noisy search results. That kind of expertise does not live inside the model.

The current fix is to write "skills" by hand: a SKILL.md file with step-by-step instructions and helper scripts. Claude Code's skill spec made this format standard. But writing a new skill every time a new task appears does not scale.

In March 2026, Sentient Labs and Virginia Tech released EvoSkill (arXiv:2603.02766), a framework that analyzes an agent's failures and generates reusable skills automatically. No model retraining is needed; only the skills evolve.

## Why skills are the right level of optimization

Google's AlphaEvolve evolves code. GEPA/DSPy evolves prompts. EvoSkill evolves skills. Code optimization is tightly bound to a specific model
Continue reading on Dev.to



