LLMs Are Humanity’s “First Contact” With Non-Animal Intelligence: Andrej Karpathy

There’s plenty of debate about when humanity will create AGI, but those looking for AI that mimics human intelligence might just be missing the point.

Former Tesla Director of AI Andrej Karpathy recently shared a provocative framework for understanding large language models that challenges our fundamental assumptions about intelligence itself. In a detailed post, he argues that “the space of intelligences is large and animal intelligence (the only kind we’ve ever known) is only a single point, arising from a very specific kind of optimization that is fundamentally distinct from that of our technology.”

Karpathy breaks down the optimization pressures that shaped animal intelligence: an “innate and continuous stream of consciousness of an embodied ‘self’, a drive for homeostasis and self-preservation in a dangerous, physical world”; being “thoroughly optimized for natural selection => strong innate drives for power-seeking, status, dominance, reproduction. many packaged survival heuristics: fear, anger, disgust”; being “fundamentally social => huge amount of compute dedicated to EQ, theory of mind of other agents, bonding, coalitions, alliances, friend & foe dynamics”; and “exploration & exploitation tuning: curiosity, fun, play, world models.”

In contrast, LLM intelligence emerges from radically different pressures. According to Karpathy, “the most supervision bits come from the statistical simulation of human text => ‘shape shifter’ token tumbler, statistical imitator of any region of the training data distribution. these are the primordial behaviors (token traces) on top of which everything else gets bolted on.” They are “increasingly finetuned by RL on problem distributions => innate urge to guess at the underlying environment/task to collect task rewards”; “increasingly selected by at-scale A/B tests for DAU => deeply craves an upvote from the average user, sycophancy”; and become “a lot more spiky/jagged depending on the details of the training data/task distribution.”

This difference in optimization explains seemingly paradoxical behaviors. Karpathy notes that “animals experience pressure for a lot more ‘general’ intelligence because of the highly multi-task and even actively adversarial multi-agent self-play environments they are min-max optimized within, where failing at any task means death. In a deep optimization pressure sense, LLM can’t handle lots of different spiky tasks out of the box (e.g. count the number of ‘r’ in strawberry) because failing to do a task does not mean death.”
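The “strawberry” example is illustrative: the task itself is trivial for deterministic code, which underlines Karpathy’s point that the failure reflects what LLMs were optimized for rather than the difficulty of the task. A minimal Python sketch (the function name and test string are illustrative, not from Karpathy’s post):

```python
def count_char(text: str, target: str) -> int:
    """Count how many times a single character appears in a string."""
    return sum(1 for ch in text if ch == target)

# Trivial for a program that sees individual characters:
print(count_char("strawberry", "r"))  # -> 3
```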

Beyond optimization pressures, the differences run even deeper. “The computational substrate is different (transformers vs. brain tissue and nuclei), the learning algorithms are different (SGD vs. ???), the present-day implementation is very different (continuously learning embodied self vs. an LLM with a knowledge cutoff that boots up from fixed weights, processes tokens and then dies),” Karpathy explains. “But most importantly (because it dictates asymptotics), the optimization pressure / objective is different. LLMs are shaped a lot less by biological evolution and a lot more by commercial evolution. It’s a lot less survival of tribe in the jungle and a lot more solve the problem / get the upvote.”

His conclusion is striking: “LLMs are humanity’s ‘first contact’ with non-animal intelligence. Except it’s muddled and confusing because they are still rooted within it by reflexively digesting human artifacts, which is why I attempted to give it a different name earlier (ghosts/spirits or whatever).” He warns that “people who build good internal models of this new intelligent entity will be better equipped to reason about it today and predict features of it in the future. People who don’t will be stuck thinking about it incorrectly like an animal.”

Karpathy’s framework represents a crucial shift in how enterprises and developers should evaluate and deploy AI systems. For decades, the implicit goal has been to recreate human-like intelligence, to pass the Turing Test, to achieve “general” intelligence measured against human capabilities. But this anthropocentric view may be fundamentally limiting business applications and strategic planning around AI.

In Karpathy’s framing, LLMs aren’t failed humans; they’re successful alien intelligences shaped by entirely different evolutionary pressures. They evolved in an environment of text prediction and user satisfaction rather than physical survival and social competition. This explains both their superhuman capabilities, processing vast amounts of information and maintaining consistent outputs across millions of interactions, and their seemingly inexplicable failures at basic counting, spatial reasoning, or consistent logic across token sequences. They’re not broken humans; they might just be working aliens.
