Andrej Karpathy Explains How Modern LLMs Are Like The Movies Memento & 50 First Dates

Modern LLMs are astonishingly powerful and can do everything from writing code to composing poems, but there are still areas where they trip up.

Andrej Karpathy, the former Director of AI at Tesla, has said that LLMs suffer from “anterograde amnesia”, meaning they can’t form new memories. He said that this limits their ability to learn from tasks they’ve previously done, and means they need to be retaught each time. Karpathy said that R&D efforts were necessary to build LLMs which could develop context and learn over time.

“LLMs suffer from anterograde amnesia,” Karpathy said during a talk at Y Combinator’s AI School. “I’m alluding to the fact that if you have a coworker who’s in your organization, this coworker will over time learn how the organization works, and over time, they’ll gain a huge amount of context on the organization. And they’ll go home and sleep, and consolidate their knowledge and develop expertise over time. But LLMs don’t natively do this. It’s not something that’s been solved in the R&D of LLMs,” he said.

Karpathy gave some relatable examples of what this was like. “In popular culture, I recommend people watch these two movies, Memento and 50 First Dates. In both of these movies, for the protagonist, their weights are fixed, and their context window gets wiped every single morning. And it’s really problematic to go to work and have relationships when this happens,” he said.

“This happens to LLMs all the time,” he said. “Context windows are sort of like working memory. So you have to program the working memory quite directly — they don’t just get smarter by default,” he added.

In both of these movies, the protagonists have the intelligence of normal humans but no long-term memory. This means that while they can function like normal people, they have no recall of what’s happened previously. LLMs, too, are able to understand the prompts in their context windows, but once the context window has been exhausted, they have no recall of what’s happened in the past. Like the characters in these movies, they wake up in a new context window with a fresh slate and must be taught everything afresh, which makes it hard for them to grasp broader concepts like how an organization works.
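To make the working-memory point concrete, here is a minimal sketch in Python. The `call_llm` stub is hypothetical, standing in for any chat API, and is not drawn from Karpathy’s talk; the point it illustrates is that a conversation’s “memory” exists only in the message list the calling code keeps re-sending, and a new session starts from scratch.

```python
# Minimal sketch of why a conversation's "memory" lives only in the message
# list the caller keeps re-sending. `call_llm` is a hypothetical stand-in for
# any chat-completion API; the model itself stores nothing between calls.

def call_llm(messages: list[dict]) -> str:
    """Hypothetical stub for a chat API: the model only ever sees `messages`."""
    return f"(reply based on {len(messages)} messages of context)"

def chat_session() -> None:
    # The context window starts empty every time -- the "fresh slate".
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    for user_turn in ["My name is Ada.", "What is my name?"]:
        messages.append({"role": "user", "content": user_turn})
        reply = call_llm(messages)  # the model sees the whole history...
        messages.append({"role": "assistant", "content": reply})
        print(user_turn, "->", reply)
    # ...but only because this function kept appending to `messages`.

# Run two sessions: the second starts with a brand-new `messages` list, so
# nothing from the first carries over -- the "50 First Dates" morning.
chat_session()
chat_session()
```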

There have been attempts at introducing memory to LLMs. OpenAI has announced a memory feature, and Grok followed soon after. But these features don’t yet capture what it’s like to have memory and build context and relationships. This is something Karpathy has alluded to before: last month, he said that LLMs don’t yet have a “system prompt learning” paradigm. And as LLMs become more integrated into our lives, building this paradigm might be one of the biggest technical challenges AI engineers have to solve.
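Karpathy did not describe an implementation, but as a loose, hypothetical illustration of what a “system prompt learning” loop might look like, one could imagine an agent that distills takeaways from each session into notes that are prepended to the system prompt of the next one. All file names and prompt text below are invented for illustration, not taken from any real product or library.

```python
# Hypothetical sketch of "system prompt learning": after each task, distilled
# lessons are written to persistent storage and folded into the system prompt
# of future sessions, instead of vanishing when the context window is cleared.

from pathlib import Path

LESSONS_FILE = Path("lessons.md")  # invented store of accumulated lessons
BASE_PROMPT = "You are an assistant for the Acme engineering team."

def build_system_prompt() -> str:
    """Combine the fixed base prompt with whatever lessons have accumulated."""
    lessons = LESSONS_FILE.read_text() if LESSONS_FILE.exists() else ""
    if lessons:
        return BASE_PROMPT + "\n\nLessons learned so far:\n" + lessons
    return BASE_PROMPT

def record_lesson(lesson: str) -> None:
    """Append one distilled takeaway so the next session starts with it."""
    with LESSONS_FILE.open("a") as f:
        f.write(f"- {lesson}\n")

# After finishing a task, the takeaway is written down rather than forgotten:
record_lesson("Deploys are frozen on Fridays; schedule releases Mon-Thu.")
print(build_system_prompt())
```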
