Humans Should Remove Themselves From Workflows To Get The Most Out Of AI Tools: Andrej Karpathy

After Claude Code creator Boris Cherny, Andrej Karpathy too feels that the best way to manage AI workflows is to take yourself out of the loop.

Karpathy — co-founder of OpenAI and former Director of AI at Tesla — has been one of the most influential voices shaping how practitioners think about AI. His latest provocation is aimed squarely at how most people are using AI tools today: too much hand-holding, not enough trust. The core idea is simple — the moment you insert yourself as the approver, the reviewer, the next-prompt-writer, you become the ceiling on what the system can achieve.

“To get the most out of the tools that have become available now, you have to remove yourself as the bottleneck. You can’t be there to prompt the next thing. You need to take yourself outside. You have to arrange things such that they’re completely autonomous,” Karpathy says.

That framing — autonomous by design, not by accident — is the key shift. It’s not about firing off a single prompt and hoping for the best. It’s about structuring the workflow so that human sign-off isn’t baked into every step.

“How can you maximize your token throughput and not be in the loop? This is the goal. I kind of mentioned that the name of the game now is to increase your leverage. I put in just very few tokens once in a while and a huge amount of stuff happens on my behalf.”

The metric he’s optimizing for — token throughput, not quality of each individual interaction — signals a different relationship with AI entirely. The goal isn’t a better conversation. It’s a higher ratio of output to human effort.

This thinking is what drove Karpathy to build autoresearch, a framework in which AI agents run machine learning experiments overnight, end to end, without human intervention.

“Auto research — I tweeted that and I think people liked it and whatnot, but they haven’t maybe worked through the implications of that. And for me, auto research is an example of an implication of that. Where it’s like, I don’t want to be the researcher in the loop, looking at results, et cetera. I’m holding the system back. So the question is: how do I refactor all the abstractions so that I’m not? Arrange it once and hit go.”

That last phrase — “arrange it once and hit go” — is the distilled version of his philosophy. The human’s job is architecture and intent, not execution.

The implications extend beyond research workflows. This is a broader argument about where human judgment adds value and where it merely adds friction. Karpathy’s view is that for a growing class of tasks, continuous human involvement isn’t quality control — it’s a bottleneck dressed up as oversight.

Boris Cherny made a related point about giving AI final goals rather than scripted steps — the elaborate orchestration systems developers build often produce worse results than simply handing the model tools and a clear objective. Cherny himself stopped opening an IDE for a full month, letting Claude Code handle every line of code he shipped. Karpathy is pushing this logic further: it’s not just about how you prompt the model, it’s about whether you’ve removed yourself from the loop entirely.

The practical challenge for businesses is that most AI deployments are still built around human checkpoints — for compliance, for quality assurance, or simply out of habit. Karpathy’s argument suggests that as agents become more capable, those checkpoints will increasingly cap performance rather than protect it. The organizations that figure out how to design for autonomy — setting the right constraints upfront and then getting out of the way — are likely to see disproportionate returns from the same underlying tools everyone else is using.
