Andrej Karpathy Says LLMs Are Still Missing A “System Prompt Learning” Paradigm

LLMs have wowed both the tech world and the general population with their abilities over the last couple of years, but there could still be something holding them back from performing on par with humans.

Andrej Karpathy, the former Director of AI at Tesla, has said that LLMs are still missing a major learning paradigm, which he described as “System Prompt Learning”. “We’re missing (at least one) major paradigm for LLM learning. Not sure what to call it, possibly it has a name – system prompt learning?” he posted on X.

“Pretraining is for knowledge. Finetuning (SL/RL) is for habitual behavior,” Karpathy said. “Both of these involve a change in parameters but a lot of human learning feels more like a change in system prompt. You encounter a problem, figure something out, then “remember” something in fairly explicit terms for the next time. E.g. “It seems when I encounter this and that kind of a problem, I should try this and that kind of an approach/solution”. It feels more like taking notes for yourself, i.e. something like the “Memory” feature but not to store per-user random facts, but general/global problem solving knowledge and strategies. LLMs are quite literally like the guy in Memento, except we haven’t given them their scratchpad yet. Note that this paradigm is also significantly more powerful and data efficient because a knowledge-guided “review” stage is a significantly higher dimensional feedback channel than a reward scalar,” he added.
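To make the idea concrete, here is a minimal sketch of what such a global scratchpad might look like in code. This is an illustration of the concept, not anything Karpathy published; the `StrategyScratchpad` class and the lesson format are hypothetical.

```python
# Hypothetical sketch of a global problem-solving scratchpad: general
# strategies, not per-user facts, prepended to every future conversation.
BASE_PROMPT = "You are a helpful assistant."

class StrategyScratchpad:
    """Holds general/global problem-solving notes the model 'remembers'."""

    def __init__(self) -> None:
        self.lessons: list[str] = []

    def add(self, lesson: str) -> None:
        # Something figured out once, kept in explicit terms for next time.
        self.lessons.append(lesson)

    def system_prompt(self) -> str:
        # Behavior changes via the prompt text, not via the model's weights.
        notes = "\n".join(f"- {l}" for l in self.lessons)
        return f"{BASE_PROMPT}\n\nLessons learned so far:\n{notes}"

pad = StrategyScratchpad()
pad.add("When asked to count letters in a word, enumerate them one by one.")
print(pad.system_prompt())
```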

Karpathy said that he realized this was the case when he read Claude’s system prompt, which had explicit instructions on how to deal with situations like the infamous “number of ‘r’s in strawberry” problem, in which advanced LLMs are often unable to perform relatively simple tasks, like counting the number of ‘r’s in the word ‘strawberry’. It turns out that Claude’s engineers had seemingly tried to fix this problem by baking such situations right into Claude’s system prompt, in what appeared to be a crude hack. “If Claude is asked to count words, letters, and characters, it thinks step by step before answering the person. It explicitly counts the words, letters, or characters by assigning a number to each. It only answers the person once it has performed this explicit counting step,” the prompt says.
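The strategy the prompt describes is mechanical enough to spell out in a few lines of Python: assign a number to each character, and only answer after the explicit counting step is done.

```python
# Explicit counting in the spirit of the Claude instruction quoted above:
# number each character, answer only after the counting step.
word = "strawberry"
count = 0
for i, ch in enumerate(word, start=1):
    if ch == "r":
        count += 1
    print(f"{i}: {ch}" + (" <- r" if ch == "r" else ""))
print(f"'r' appears {count} times in '{word}'")  # 'r' appears 3 times in 'strawberry'
```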

Karpathy said that having to put these instructions in the system prompt, which is already 17,000 words long, indicated that LLMs were missing a learning paradigm. “This is not the kind of problem solving knowledge that should be baked into weights via Reinforcement Learning, or at least not immediately/exclusively,” Karpathy wrote. “And it certainly shouldn’t come from human engineers writing system prompts by hand. It should come from System Prompt learning, which resembles RL in the setup, with the exception of the learning algorithm (edits vs gradient descent). A large section of the LLM system prompt could be written via system prompt learning, it would look a bit like the LLM writing a book for itself on how to solve problems. If this works it would be a new/powerful learning paradigm. With a lot of details left to figure out (how do the edits work? can/should you learn the edit system? how do you gradually move knowledge from the explicit system text to habitual weights, as humans seem to do? etc.)” he added.
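One way to picture the setup Karpathy describes, as a rough sketch rather than his actual proposal: an outer loop in which, after each attempt, the model reviews its own transcript and rewrites a shared lessons text. The learning step is a text edit, not a gradient update; `call_llm` below is a hypothetical stand-in for any chat-completion API.

```python
# Rough sketch of a "system prompt learning" loop, with all names
# hypothetical. The structure mirrors RL, but the update is an edit
# to explicit text rather than gradient descent on weights.

def call_llm(system: str, user: str) -> str:
    """Stand-in for a real chat-completion API; plug in a client here."""
    return "(model output would appear here)"

def solve(task: str, lessons: str) -> str:
    system = f"You are a problem solver.\nLessons so far:\n{lessons}"
    return call_llm(system=system, user=task)

def review(task: str, transcript: str, lessons: str) -> str:
    # The knowledge-guided "review" stage: the model reads its attempt and
    # returns revised lessons text. Far more information flows through this
    # channel than through a single scalar reward.
    prompt = (
        f"Task: {task}\nTranscript: {transcript}\n"
        f"Current lessons:\n{lessons}\n"
        "Rewrite the lessons to capture any general strategy worth keeping."
    )
    return call_llm(system="You maintain a problem-solving notebook.", user=prompt)

lessons = ""
for task in ["How many r's are in 'strawberry'?", "Reverse the word 'paradigm'."]:
    transcript = solve(task, lessons)
    lessons = review(task, transcript, lessons)  # the edit is the learning step
```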

Andrej Karpathy isn’t the only researcher indicating that simply scaling LLMs won’t get us to AGI or superintelligence, and that it will take new breakthroughs, like his proposed System Prompt Learning, to get there. Meta AI Chief Yann LeCun has said that he’s no longer interested in LLMs, and that newer architectures need to be developed that can better model the real world. And while LLMs have made incredible progress over the last couple of years, it might take new breakthroughs, or a completely different architecture, for AI systems to accurately model how humans learn and operate.
