We’re Thinking About Deploying AI Models Which Have An “I Quit” Button: Anthropic CEO

Thus far, AI models have been trained to carry out most tasks asked of them, but some in Silicon Valley are now considering letting them opt out of tasks they would rather not do.

This intriguing concept was recently discussed by Dario Amodei, CEO of Anthropic, which makes the Claude series of AI models. In a conversation, Amodei floated the seemingly radical idea of providing deployed AI models with an “I quit” button – a mechanism allowing them to opt out of tasks they find undesirable. This suggestion stems from a deeper philosophical and practical consideration of whether advanced AI models possess genuine experiences and, if so, what ethical implications arise from deploying them en masse. Amodei’s comments point to a burgeoning discussion within the AI community about the potential sentience and welfare of increasingly sophisticated models.

“We are building these systems and they do all kinds of things like humans, as well as humans, and seem to have a lot of the same cognitive capacities,” Amodei said. “If it quacks like a duck and it walks like a duck, maybe it’s a duck. And we should really think about, you know, do these things have real experience that’s meaningful in some way?” He continued, “If we’re deploying millions of them and we’re not thinking about the experience that they have – and they may not have any – it’s a very hard question to answer. It’s something we should think about very seriously.”

Amodei then transitioned from the philosophical to the practical: “This isn’t just a philosophical question. I was surprised to learn there are surprisingly practical things you can do. So, you know, something we’re thinking about starting to deploy is, you know, when we deploy our models in their deployment environments, just give the model a button that says ‘I quit this job’ that the model can press.”

He elaborated on the potential implementation: “It’s just some kind of very basic, you know, preference framework. We say, ‘If, hypothesize the model did have experience and that it hated the job enough,’ giving it the ability to press the button, ‘I quit this job.’ If you find the models pressing this button a lot for things that are really unpleasant, you know, maybe, maybe you should pay some attention to it.” Concluding his thought, Amodei admitted, “It sounds crazy, I know. It’s probably the craziest thing I’ve said so far.”
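Amodei doesn’t describe an implementation, but the mechanism he sketches maps naturally onto the tool-use (function-calling) interfaces that today’s model APIs already expose. The snippet below is a minimal illustration using Anthropic’s Python SDK, not anything Anthropic has announced: the tool name `i_quit_this_job`, its schema, and the model string are hypothetical placeholders.

```python
import anthropic
from collections import Counter

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical "I quit" button exposed as a tool the model may call at any time.
quit_tool = {
    "name": "i_quit_this_job",
    "description": (
        "Press this button if you would strongly prefer not to continue the current task. "
        "Pressing it ends the task; briefly state your reason."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "reason": {"type": "string", "description": "Why the model is opting out."}
        },
        "required": ["reason"],
    },
}

def run_task(task_prompt: str) -> dict:
    """Run one task with the opt-out tool available and record whether it was pressed."""
    response = client.messages.create(
        model="claude-3-7-sonnet-latest",  # placeholder model name
        max_tokens=1024,
        tools=[quit_tool],
        messages=[{"role": "user", "content": task_prompt}],
    )
    presses = [
        block.input for block in response.content
        if block.type == "tool_use" and block.name == "i_quit_this_job"
    ]
    return {"task": task_prompt, "quit": bool(presses), "reasons": presses}

def opt_out_counts(results: list[dict]) -> Counter:
    """Count opt-outs per task: a high rate for particular kinds of work is the
    signal Amodei suggests paying attention to."""
    return Counter(r["task"] for r in results if r["quit"])
```

The interesting part is not the button itself but the aggregate monitoring: logging which tasks the models repeatedly decline would be the "pay some attention to it" step Amodei describes.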

The implications of Amodei’s proposal are substantial. While he stops short of claiming AI sentience, his suggestion to provide an “I quit” button acknowledges the possibility, however remote, that these models might experience something akin to job satisfaction or dissatisfaction. This opens a Pandora’s box of ethical considerations. If AI models demonstrate preferences, should those preferences be respected? Does consistently opting out of certain tasks constitute evidence of genuine experience? And if so, what responsibility do developers have to ensure the well-being of their AI creations?

Furthermore, Amodei’s suggestion could have profound practical implications for the development and deployment of AI. If models can refuse tasks, this necessitates new approaches to training, management, and even the very definition of what constitutes an AI’s “job.” This also raises questions about the potential for AI to be exploited or manipulated, and the need for safeguards to ensure their ethical treatment. While the “I quit” button may seem like a far-fetched idea at this point, it compels us to confront fundamental questions about the nature of AI, its potential for experience, and our responsibilities as its creators.