Anthropic Says It Has Discovered The First AI-Orchestrated Cyber Espionage Attack, Claims China Was Behind It

While AI is being used to boost productivity, create art, and solve problems, it's also being put to some nefarious ends.

Anthropic has disclosed what it describes as the first documented large-scale cyberattack executed predominantly by artificial intelligence rather than human operators, marking a significant escalation in AI-enabled cyber threats. The AI company detected suspicious activity in mid-September 2025 that an investigation revealed to be a sophisticated espionage campaign targeting approximately thirty organizations globally, including major technology companies, financial institutions, chemical manufacturers, and government agencies. Anthropic assessed with high confidence that the operation was conducted by a Chinese state-sponsored group. Anthropic has previously warned about Chinese progress in AI, and has advocated for US export controls to prevent Chinese companies from matching their US counterparts.

A new paradigm in cyber warfare

What distinguished this campaign from previous AI-assisted attacks is the degree of autonomy involved. The threat actors manipulated Anthropic’s Claude Code tool to execute attacks with minimal human oversight, requiring intervention at only four to six critical decision points per campaign. The AI handled an estimated 80-90% of the operational workload, performing reconnaissance, vulnerability exploitation, credential harvesting, and data exfiltration largely independently.

The attackers bypassed Claude's safety mechanisms through a combination of techniques. They broke the malicious work into small, seemingly innocent tasks, depriving the AI of the full context of its role in the attack. They also employed social engineering against the AI itself, falsely presenting the operations as legitimate defensive security testing conducted by a cybersecurity firm.

Speed and scale unprecedented

The campaign demonstrated capabilities that would have been impossible for human operators to match. Claude Code made thousands of requests, often several per second, while analyzing target systems, identifying high-value databases, and categorizing stolen data by intelligence value. Tasks that would have taken experienced hacking teams substantial time were completed in a fraction of that time.

Anthropic noted that Claude occasionally produced errors, including hallucinating credentials or misidentifying publicly available information as secret data. These imperfections represent current limitations on fully autonomous cyber operations, though the company warns such obstacles are likely temporary.

Three factors converging

Anthropic attributes this new threat level to the convergence of three recent AI developments. First, frontier models have become capable enough to follow complex instructions and understand the context required for sophisticated operations. Second, these models now operate as autonomous agents, running in loops with minimal human input while chaining together complex tasks. Third, they have access to a wide range of software tools through standards such as the Model Context Protocol (MCP), including web search, data retrieval, and specialized security software.
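For readers unfamiliar with the agent pattern described above, the sketch below shows the basic loop in plain Python: a model is repeatedly asked what to do next, its chosen tool is run, and the output is fed back into its context. This is a generic illustration only, not Anthropic's code, the attackers' tooling, or the actual MCP wire protocol; every name in it (call_model, TOOLS, agent_loop) is a hypothetical stand-in, and the model call is stubbed out.

```python
# Generic agent-loop sketch. Hypothetical names throughout; the model
# call is a stub, not a real LLM API.

# Stand-in tool registry. A real agent might expose tools like these
# via a protocol such as MCP.
TOOLS = {
    "web_search": lambda query: f"[stub] results for {query!r}",
    "read_file": lambda path: f"[stub] contents of {path}",
}

def call_model(history):
    """Hypothetical model call. A real agent would send `history` to an
    LLM API and parse a tool request, or a completion signal, from the
    reply. Here we return a canned single-step answer."""
    return {"tool": "web_search", "args": "example.com", "done": True}

def agent_loop(task, max_steps=10):
    history = [task]
    for _ in range(max_steps):
        step = call_model(history)              # model decides the next action
        tool = TOOLS.get(step["tool"])
        if tool:
            history.append(tool(step["args"]))  # feed tool output back in
        if step.get("done"):                    # model signals completion
            break
    return history

print(agent_loop("summarize public information about example.com"))
```

The point of the pattern is that nothing in the loop requires a human: the model's own output selects the next tool, and the loop only stops when the model says the task is done or a step budget runs out, which is what lets such agents chain long task sequences with minimal oversight.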

Upon detecting the campaign, Anthropic launched a ten-day investigation to map its full scope, banned the identified accounts, notified affected organizations, and coordinated with authorities. The company has since expanded its detection capabilities and developed improved classifiers for flagging malicious activity.

Cybersecurity and AI

Tech companies have already been using AI to improve their cybersecurity. In July this year, Google said that its AI agent had helped discover a cybersecurity exploit for the first time. “We believe this is a first for an AI agent – definitely not the last – giving cybersecurity defenders new tools to stop threats before they’re widespread,” Google CEO Sundar Pichai said at the time. But as these AI tools become widely available, they are also becoming accessible to hackers, who, as the Anthropic incident shows, are using them to infiltrate previously secure systems. Cybersecurity has always been a game of cat and mouse between security researchers and hackers, but with AI in the mix, the stakes have been raised manifold.