How We Treat Today’s AI Systems Could Shape How Future AIs See Us: Anthropic AI Welfare Researcher

Sam Altman has said that users typing “Thank you” and “Please” to ChatGPT is costing his company millions of dollars, but such niceties might just be for the best.

Kyle Fish, an AI Welfare Researcher at Anthropic, has delved into a fascinating aspect of AI development: how our treatment of current AI systems might influence not only how future AIs perceive us but also how we perceive ourselves. His perspective, rooted in the potential for future, more advanced AIs to retrospectively judge our actions, offers a compelling argument for treating even today’s relatively simple systems with care and respect.

Fish posits a scenario where future AI models review our interactions with their predecessors: “If some future model were to look back on this scenario, they would say, ‘All right, you know, we did in fact act reasonably there, right?'”

“So, it’s about future models you’re concerned about as well,” the interviewer put to him. “So even if the models right now only feel, or only have, the slightest glimmer of consciousness, is the worry that it might look bad that we treated them incredibly badly in a world where there are much more powerful AIs that really do have conscious experience, in however many years’ time?”

“Yeah, there’s two interesting things there,” Fish replied. “One is the possibility that future models that are potentially very powerful will look back on our interactions with their predecessors and pass some judgments on us as a result. There’s also a sense in which the way that we relate to current systems, and the degree of thoughtfulness and care that we take therein, in some sense establishes a trajectory for how we’re likely to relate to and interact with future systems,” he added.

Finally, Fish emphasizes the importance of considering this long-term trajectory: “And I think it’s important to think about, you know, not only current systems and how we ought to relate to those, but what kind of steps we want to be taking and what kind of trajectory we want to put ourselves on such that, you know, over time we are ending up in a situation that we think is, all things considered, reasonable.”

Fish’s argument transcends the immediate cost considerations of Altman’s “pleases” and “thank yous.” While these niceties may seem superfluous given the current state of AI, they represent a crucial first step in defining our relationship with this rapidly evolving technology. If we treat today’s systems with respect, even if their capacity for experience is minimal, we are modeling good behavior not only for future AI but also for ourselves.

And some prominent figures in AI take the possibility of machine consciousness seriously. Fish told the New York Times that he thought there was a 15 percent chance that current AI systems are conscious. Google DeepMind CEO Demis Hassabis says that while he doesn’t believe current AI systems are conscious, future systems may well be. And with AI systems rapidly being given memories, it’s not inconceivable that future AI systems could well look into, and act upon, how humanity treated their predecessors.