Richard Sutton Says Scaling LLMs Won’t Necessarily Lead To Intelligence

More and more experienced AI scientists now seem to believe that scaling LLMs won’t necessarily lead to human-level intelligence.

Turing Award winner Richard Sutton, widely considered the father of reinforcement learning, has indicated that he doesn’t believe LLMs are a viable path to building intelligence. He says that LLMs have no model of the world and no goals. He has also previously said that, in retrospect, LLMs will feel like a momentary fixation.


“Reinforcement learning is about understanding your world, whereas Large Language Models are about mimicking people, doing what people say you should do. They’re not about figuring out what to do,” Sutton said on the Dwarkesh podcast.

“To mimic what people say is to not build a model of the world at all. You’re mimicking things that have a model of the world: people. I would question the idea that they have a world model. A world model would enable you to predict what would happen. They have the ability to predict what a person would say,” Sutton added.

“What we want — to quote Alan Turing — is a machine that can learn from experience, where experience is the things that actually happen in your life. You do things, you see what happens, and that’s what you learn from. The LLMs learn from something else — here’s a situation, and here’s what a person did, and implicitly, the suggestion is you should do what the person did,” he said.

Sutton doesn’t even agree that LLMs can be the base on which RL systems can be built. “To be a prior for something, there has to be a real thing. The prior of your knowledge should be the basis for actual knowledge. What is actual knowledge? There is no definition of actual knowledge in the large model framework. What makes an action a good action to take? If you need to learn continually, there must be some way during the normal interaction to tell what’s right,” Sutton said. He hinted that LLM structures don’t have this capability. “If there’s no right thing to say, there’s no ground truth. You can’t have prior knowledge if you don’t have ground truth, because prior knowledge is supposed to be a hint or an initial belief of what the truth is,” he added.

Sutton also argued that LLMs don’t have goals. “For me, having a goal is the essence of intelligence. Intelligence is the computational part of the ability to achieve goals. You have to have goals, or you’re just a behaving system,” he said.

This critique of LLMs from Richard Sutton is particularly significant because he is the author of the “Bitter Lesson” essay, which argued that simply scaling up AI systems with more compute has consistently produced better results than devising clever, hand-crafted approaches and algorithms. The AI industry has been scaling up compute to build larger and larger LLMs, with impressive results: today’s LLMs can win gold medals at math olympiads and programming contests. But many stalwarts of the field now seem to believe that this approach might be a dead end. Meta’s AI chief Yann LeCun has said he is no longer interested in LLMs, and now Richard Sutton too has indicated that he believes LLMs aren’t the path to human-level intelligence. It remains to be seen how these predictions play out, but if LeCun and Sutton are right, the massive investments being made in AI could come into question: if scaling LLMs isn’t going to get us very far, those investments might not end up being particularly useful in the long run.
