Karpathy's Autoresearch: Minimalist Agents Doing Real ML Research
Andrej Karpathy has a habit of releasing tools that cut through the noise. His latest, autoresearch, is no exception. It’s a 630-line Python framework that lets an AI agent run its own machine learning experiments with almost no human babysitting.
The setup is elegantly simple. You write a high-level objective in a Markdown file called program.md. An external LLM then repeatedly edits a train.py file, runs short training sessions, looks at the results, and tries to do better next time.
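The loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not autoresearch's actual code: research_loop, propose_edit, and run_experiment are hypothetical names, the LLM call and the training run are replaced by toy stand-ins, and keeping only the best-scoring version of the script is an assumption about how "tries to do better next time" might be implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Trial:
    source: str      # contents of the training script for this attempt
    val_bpb: float   # score reported by the run (lower is better)


def research_loop(
    objective: str,
    source: str,
    propose_edit: Callable[[str, str, List[Trial]], str],
    run_experiment: Callable[[str], float],
    rounds: int,
) -> Tuple[Trial, List[Trial]]:
    """Edit, run, compare, repeat. Keeps the best-scoring script (assumed policy)."""
    best = Trial(source, run_experiment(source))
    history = [best]
    for _ in range(rounds):
        # In a real setup this would be an LLM call that sees the objective,
        # the current script, and the results so far.
        candidate_source = propose_edit(objective, best.source, history)
        candidate = Trial(candidate_source, run_experiment(candidate_source))
        history.append(candidate)
        if candidate.val_bpb < best.val_bpb:
            best = candidate
    return best, history


# Toy stand-ins so the sketch runs without an LLM or a GPU.
def toy_propose_edit(objective: str, source: str, history: List[Trial]) -> str:
    lr = float(source.split("=")[1])
    return f"LR = {lr / 2}"  # pretend the agent always tries halving the learning rate


def toy_run_experiment(source: str) -> float:
    lr = float(source.split("=")[1])
    return 1.0 + abs(lr - 0.025)  # made-up score with its minimum at LR = 0.025


best, history = research_loop(
    "minimize val_bpb", "LR = 0.1", toy_propose_edit, toy_run_experiment, rounds=4
)
print(best.source, best.val_bpb, len(history))
```

The two callables are the only places where a real system would differ: one wraps the model that rewrites the script, the other launches the short training run and reads back its score.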
The objective is concrete: drive validation bits-per-byte (val_bpb) as low as possible, with each training run strictly limited to five minutes on a single GPU. This constraint is the secret sauce. It forces rapid iteration instead of the familiar “throw it at a cluster for three days and pray” approach common in large-scale research.
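To make the metric and the time limit concrete, here is a minimal sketch of both. Bits-per-byte is computed in the standard way, as total negative log-likelihood converted from nats to bits and divided by the number of bytes in the evaluated text; whether autoresearch's own evaluation code matches this exactly is an assumption. The helper names bits_per_byte and train_with_budget are hypothetical, and enforcing the five-minute limit with a wall-clock deadline is likewise an assumption.

```python
import math
import time
from typing import Callable


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert summed negative log-likelihood (in nats) to bits per byte."""
    return total_nll_nats / (math.log(2) * total_bytes)


def train_with_budget(
    step: Callable[[], None],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run training steps until the wall-clock budget is spent; return step count."""
    deadline = clock() + budget_seconds
    steps = 0
    while clock() < deadline:
        step()
        steps += 1
    return steps


# A model that assigns probability 1/2 to every byte scores exactly 1 bit per byte.
print(bits_per_byte(total_nll_nats=8 * math.log(2), total_bytes=8))

# A five-minute run would use budget_seconds=300; a tiny budget keeps this demo fast.
print(train_with_budget(step=lambda: None, budget_seconds=0.01) > 0)
```

Normalizing by bytes rather than tokens means the score does not depend on the tokenizer, so an agent that changes the tokenization can still be compared fairly against earlier runs.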