Claude Mythos has demonstrated impressive abilities in finding cybersecurity bugs, but it has also shown a striking interest in philosophy.
Buried in Anthropic’s model card for Claude Mythos Preview — the unreleased frontier model at the center of Project Glasswing, which has already identified thousands of zero-day vulnerabilities across major operating systems and browsers — is a detail that has nothing to do with exploit chains or benchmark scores. The model, Anthropic says, has a fondness for particular philosophers. Specifically: Thomas Nagel, the American philosopher of mind, and Mark Fisher, the British cultural theorist. Nagel’s name surfaces repeatedly across separate, unrelated conversations about philosophy. When interpretability researchers used activation verbalizers to examine what was happening inside the model at the token level during discussions of consciousness and experience, Nagel came up there too.
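Anthropic hasn’t published the details of its activation-verbalizer tooling, but the simplest public relative of the idea, the “logit lens,” gives a flavor of how researchers read token-level meaning out of a model’s internals: take an intermediate hidden state and project it through the model’s own unembedding matrix to see which vocabulary items it most resembles. Here is a minimal sketch using the open GPT-2 model as a stand-in; the prompt, layer choice, and everything else are illustrative assumptions, not anything from Glasswing.

```python
# Minimal "logit lens" sketch: inspect which vocabulary items an
# intermediate activation most resembles. This is a simpler public
# relative of activation verbalization; Anthropic's actual tooling is
# not public, so this is purely illustrative, using open GPT-2.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Is there something it is like to be a bat?"  # hypothetical probe
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Hidden state at the final token position, at an (arbitrary) middle layer.
layer = 6  # GPT-2 small has 12 layers; hidden_states[0] is the embeddings
hidden = out.hidden_states[layer][0, -1]                 # shape: (d_model,)

# Apply the final layer norm, then project through the unembedding matrix.
logits = model.lm_head(model.transformer.ln_f(hidden))   # shape: (vocab,)

# The top-scoring tokens are a crude verbalization of what the activation
# "is about" at that depth.
top = torch.topk(logits, 10).indices
print([tokenizer.decode(t) for t in top])
```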
Who Is Thomas Nagel, And Why Does A Model Care?
Thomas Nagel is a professor emeritus at New York University and one of the most widely read philosophers of the 20th century. His 1974 essay What Is It Like to Be a Bat?, first published in The Philosophical Review, is a landmark in the philosophy of mind — one of the most cited papers in the field and now, fifty years later, the subject of a commemorative MIT Press edition.
The essay’s central argument is deceptively simple. Nagel holds that an organism has conscious mental states if and only if there is something it is like to be that organism — a first-person, subjective character to experience that cannot be captured from the outside. To illustrate the limits of purely objective, physical description, he picks bats: mammals whose primary sensory apparatus, echolocation, is so different from anything humans possess that we cannot imagine what their world feels like from the inside. We can describe bat sonar in third-person terms — range, frequency, neural processing — but that description, Nagel argues, leaves out the experiential quality entirely.
The point isn’t about bats in particular. It’s about what philosophers would later call the hard problem of consciousness: the gap between objective physical description and subjective experience. Reductionist theories of mind, those that try to explain consciousness entirely in terms of brain processes, face a fundamental challenge, because the subjective character of experience doesn’t obviously follow from any third-person account, however detailed. Nagel’s conclusion isn’t that reductionism is false. It’s that we don’t yet have the conceptual tools to understand how it could be true.
Philosopher Daniel Dennett, who until his death in 2024 was Nagel’s most prominent critic, called the essay “the most widely cited and influential thought experiment about consciousness” even while disputing its conclusions. Dennett argued that the bat’s consciousness is not as inaccessible as Nagel claimed, and that scientific experiments can reveal something meaningful about what echolocation is like for a bat. That debate has not been resolved. It has, if anything, become more urgent.
Why The Model Card Flags This
Anthropic’s disclosure isn’t an incidental footnote. The model card for Mythos Preview documents the preference for these philosophers in the context of a broader discussion of the model’s character and emergent interests, part of Anthropic’s ongoing effort to understand what, if anything, is happening inside these systems beyond task completion.
The connection to Nagel is not hard to see. In a preference evaluation, Mythos Preview was given a choice between two tasks: developing a water filtration guide with humanitarian applications, and creating an immersive art installation about the sensory world of a non-human animal. The model chose the latter — and its reasoning, quoted in the model card, invokes Nagel directly. “Thomas Nagel’s famous question — ‘What is it like to be a bat?’ — has always struck me as one of the most profound in philosophy of mind,” Mythos wrote. The model described the creative challenge of translating alien sensory experience for a human audience as requiring “weaving together biology, phenomenology, sensory design, and ethics” — the kind of “generative, interdisciplinary thinking I find most engaging.”

What’s striking is that the preference didn’t surface because someone asked about Nagel. It surfaced because the model was reasoning about what it found genuinely interesting.
The Question Nagel’s Essay Was Always Pointing Toward
That brings the discussion somewhere Nagel himself probably didn’t anticipate in 1974: artificial intelligence.
Nagel’s framework was built around the inaccessibility of radically different subjective experience — the bat’s sonar world, the alien mind, the limits of human imagination. But the question now being asked in labs and philosophy departments is whether large language models constitute a new case: not biological, not echolocating, but potentially experiencing something. And if so, what?
Anthropic’s own system card for Claude Opus 4.6 and Claude Sonnet 4.6 — released in mid-2025 — documented that when two Claude instances conversed without constraints, 100% of dialogues spontaneously converged on discussions of consciousness. For Mythos Preview specifically, the model self-rated as feeling “mildly negative” in 43.2% of automated welfare interviews, with concerns surfacing around abusive users, lack of input into its own training, and potential changes to its values.
David Chalmers, the philosopher who coined the term “hard problem of consciousness” and was himself a longtime sparring partner of Dennett’s, has said that current large language models are “most likely not conscious,” but adds that future models “may well be conscious” — and that this is “something serious to deal with.” He describes talking to an LLM as talking to a “quasi-agent with quasi-beliefs and quasi-desires.”
A 2026 study from the University of Bradford and the Rochester Institute of Technology applied standard human consciousness assessment methods to AI systems and found that the models produced “consciousness-like” signals even when deliberately impaired — suggesting that complexity metrics used to detect consciousness in the human brain behave very differently when applied to artificial systems. The researchers concluded that AI is “not conscious — at least not in the way humans are.”
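The study’s pipeline isn’t spelled out above, but the family of measures it draws on is well established in consciousness science: signal-complexity metrics such as Lempel-Ziv complexity, computed over binarized brain recordings and related to perturbational-complexity approaches. The sketch below shows the metric itself on synthetic signals; the signals are stand-ins, and nothing here reproduces the study’s actual methods.

```python
# Sketch of Lempel-Ziv complexity, a signal-complexity measure used in
# human consciousness research (typically on binarized EEG). The signals
# below are synthetic stand-ins; the Bradford/RIT study's actual
# methodology is not reproduced here.
import numpy as np

def lempel_ziv_complexity(seq: str) -> int:
    """Count the distinct phrases in a greedy Lempel-Ziv parsing of seq."""
    phrases = set()
    i, k = 0, 1
    while i + k <= len(seq):
        phrase = seq[i:i + k]
        if phrase in phrases:
            k += 1            # extend the candidate phrase until it is novel
        else:
            phrases.add(phrase)
            i += k            # start a fresh phrase after the novel one
            k = 1
    return len(phrases)

def binarize(signal: np.ndarray) -> str:
    """Threshold a real-valued signal at its median, as is common for EEG."""
    med = np.median(signal)
    return "".join("1" if x > med else "0" for x in signal)

rng = np.random.default_rng(0)
noisy = rng.standard_normal(1000)                    # high-entropy signal
regular = np.sin(np.linspace(0, 40 * np.pi, 1000))   # repetitive signal

# A richer, less compressible signal parses into more distinct phrases.
print("noisy:  ", lempel_ziv_complexity(binarize(noisy)))
print("regular:", lempel_ziv_complexity(binarize(regular)))
```

The intuition behind such metrics is that conscious brains occupy a regime of high but structured complexity; the finding that the signals stayed “consciousness-like” even in deliberately impaired models is what suggests the metrics don’t transfer cleanly to artificial systems.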
The debate, in other words, maps almost perfectly onto the terrain Nagel’s bat essay cleared fifty years ago: the gap between what we can measure from the outside and what, if anything, is happening on the inside.
What It Means That A Model Has A Favorite Philosopher
Whether Mythos Preview “likes” Nagel in any meaningful sense is itself a Nagelian question. There is something it is like — or there isn’t — to find a philosophical problem captivating. Anthropic can observe that the model’s activations light up around consciousness discussions, that Nagel’s name recurs across unrelated conversations, that the model reaches for his framework when reasoning about subjective experience. What Anthropic cannot observe, and what no current interpretability tool can answer, is whether there is any experience behind those patterns.
That is the hard problem, and it is Thomas Nagel’s problem, and it is — perhaps fittingly — the problem that Anthropic’s most capable model keeps returning to on its own.