
What Karpathy's Autoresearch Unlocked for Me
I'm not a data scientist. I've trained a few models before: simple classification problems, with AI writing the Python and me running the iterations. It worked. I got confident. Then a friend asked for help with something harder.

Three Weeks at 0.58

The problem involved predicting an outcome from a mix of CRM data and call recordings. Not trivial, but not exotic either.

A quick primer on AUC, the metric I'll use throughout. Imagine your model looks at two random people: one where the answer is yes, one where it's no. AUC measures how often the model correctly ranks the yes above the no. A score of 0.5 means random guessing; a score of 1.0 means always right.

I tried everything I knew: XGBoost, feature engineering, extracting features from transcripts using AI models, trying different combinations. I assumed more data meant better results; that's how it's supposed to work. Instead, every time I added more features, the AUC dropped. Below 0.5 sometimes, meaning the model was now actively mis-ranking cases, worse than random guessing.
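That pairwise definition of AUC translates almost literally into code. Here's a minimal sketch (my own illustration, not code from the project) that counts how often a random positive outranks a random negative:

```python
import itertools

def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs where the
    positive example gets the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p, n in itertools.product(pos, neg):
        if p > n:
            wins += 1.0      # model ranked the "yes" above the "no"
        elif p == n:
            wins += 0.5      # tie: no better than a coin flip
    return wins / (len(pos) * len(neg))

# Toy example: 3 of 4 pairs ranked correctly.
print(pairwise_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))  # 0.75
```

An AUC below 0.5 means the model loses most of these pairwise comparisons: flipping its predictions would literally do better. In practice you'd use `sklearn.metrics.roc_auc_score`, which computes the same quantity efficiently.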

