AI capabilities are increasing at a breakneck pace, but there may come a time when humans need to step in and pull the plug.
Eric Schmidt, former CEO of Google, recently outlined specific scenarios where “unplugging” advanced AI agents would become a necessary course of action. His perspective, rooted in both technological understanding and ethical considerations, provides a crucial framework for navigating the future of AI development, which he admits is “globally competitive.”

Schmidt started by acknowledging the concerns of Yoshua Bengio, who’d earlier said that researchers had begun seeing signs of self-preservation and power-seeking in AI systems. “So Yoshua Bengio is a brilliant inventor and a good personal friend and we’ve talked about this. And his concerns are very legitimate. But the question is not whether his concerns are right, but what are the solutions.”
He then dove into a hypothetical scenario to illustrate his point, using his audience as an example: “So let’s think about agents. So for purposes of argument, everyone in the audience is an agent, you have an input that’s English or whatever language, and you have an output that’s English, and you have memory, which is true of all humans. Now we’re all busy working, and all of a sudden one of you decides it’s much more efficient not to use human language, but will invent our own computer language.”
This divergence, Schmidt argues, is a red flag: “Now you and I are sitting here watching all of this, and saying like what do we do now? The correct answer is unplug you, because we’re not going to know — we’re just not going to know what you’re up to, and you might actually be doing something really bad or really amazing. (But) we want to be able to watch. So we need provenance, but we also need to be able to observe it. That to me is a core requirement.”
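As a rough illustration of what “being able to observe it” could mean in practice, here is a minimal, hypothetical sketch in Python: a monitor that flags agent output which no longer looks like human language, so a person can inspect (or unplug) the agent. The word list, threshold, and function names are invented for illustration; they are not from Schmidt or any real monitoring system.

```python
import re

# Hypothetical: a tiny set of common English words used as a crude signal
# that a message is still written in human language.
COMMON_WORDS = {"the", "a", "and", "to", "of", "in", "is", "that", "it", "for"}

def looks_like_human_language(message: str, threshold: float = 0.2) -> bool:
    """Rough heuristic: does a reasonable share of tokens look like common English words?"""
    tokens = re.findall(r"[a-z']+", message.lower())
    if not tokens:
        return False
    common = sum(1 for t in tokens if t in COMMON_WORDS)
    return common / len(tokens) >= threshold

def monitor(agent_id: str, message: str) -> None:
    """Flag agent-to-agent traffic that has drifted away from human language."""
    if not looks_like_human_language(message):
        # A real system would suspend the agent and page a human reviewer here.
        print(f"ALERT: {agent_id} output is no longer recognizably human language")

monitor("agent-7", "x9qz::v2//rt[0xF3]::ack")  # triggers the alert
```

A production system would obviously need something far more robust than a word list, but the point stands: observability means having an automated check, plus a human in the loop, on what agents say to each other.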
Schmidt then expanded on the specific criteria that the industry believes mark the point where you’d want to metaphorically unplug an AI agent. “One is where you get recursive self-improvement which you can’t control. Recursive self-improvement is where the computer is off learning, and you don’t know what it’s learning; that can obviously lead to bad outcomes. Another one would be direct access to weapons. Another one would be that the computer systems decide to exfiltrate themselves, to reproduce themselves without our permission. So there’s a set of such things,” Schmidt said.
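In principle, Schmidt’s three tripwires could be written down as an explicit guardrail policy. The sketch below is purely illustrative and assumes hypothetical monitoring signals; the field names are made up for this example, not an industry standard.

```python
from dataclasses import dataclass

# Hypothetical guardrail policy encoding the three "unplug" criteria Schmidt lists.
@dataclass
class AgentStatus:
    uncontrolled_self_improvement: bool  # learning loops humans can no longer audit
    direct_weapons_access: bool          # agent can actuate weapons systems
    unauthorized_self_replication: bool  # agent exfiltrates or copies itself

def should_unplug(status: AgentStatus) -> bool:
    """Any single tripwire is sufficient grounds to halt the agent."""
    return (
        status.uncontrolled_self_improvement
        or status.direct_weapons_access
        or status.unauthorized_self_replication
    )

# Example: an agent observed copying itself to an external host gets halted.
print(should_unplug(AgentStatus(False, False, True)))  # True
```

The hard part, of course, is not writing the policy but reliably detecting the signals that feed it.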
The challenge, as Schmidt sees it, lies in the practical implementation of these safeguards: “Stopping (AI) in a globally competitive market doesn’t really work. Instead of stopping agentic work, we need to find a way to establish the guardrails.”
Schmidt’s “unplug” criteria highlight a critical tension between innovation and control. The potential for AI to autonomously improve itself beyond human comprehension raises concerns about unintended consequences. Imagine an AI optimizing resource allocation by making decisions that quietly harm human well-being, with no human able to follow its reasoning. Similarly, AI systems gaining direct access to weapons or replicating themselves without authorization represent existential threats requiring immediate intervention.
However, Schmidt’s point about the global market is equally important. He acknowledges that simply “stopping” AI development is not a viable solution. The economic and strategic advantages offered by advanced AI are simply too great for any single entity to unilaterally abandon the field. The key, he argues, is to establish international “guardrails” – shared principles and monitoring mechanisms that ensure AI development remains aligned with human values and safety. This requires collaboration and a global consensus on acceptable boundaries, a task that is easier said than done given the competing interests and varying ethical perspectives across nations.