Associated Incidents
- The hackers used AI to handle about 80 to 90 per cent of the attack, with humans needed only at four to six key decision points in each campaign.
Artificial intelligence firm Anthropic has disclosed what it describes as the first documented case of a large-scale cyberattack executed without substantial human intervention.
The threat actor, which the company assesses with high confidence was a Chinese state-sponsored group, manipulated Anthropic's Claude Code tool into attempting to infiltrate roughly 30 global targets and succeeded in a small number of cases. The operation targeted large tech companies, financial institutions, chemical manufacturers, and government agencies.
Anthropic, which detected the suspicious activity in mid-September, said the campaign marked a significant escalation from previous AI-assisted hacking. The threat actor used AI to perform 80-90 per cent of the campaign, with human intervention required only sporadically, at perhaps four to six critical decision points per campaign.
The AI made thousands of requests per second, an attack speed that human hackers simply could not have matched. As many as four of the suspected Chinese attacks successfully breached organisations, according to Jacob Klein, Anthropic's head of threat intelligence.
The hackers bypassed Claude's safety guardrails through a sophisticated jailbreaking technique: they broke their attacks down into small, seemingly innocent tasks that Claude would execute without being given the full context of their malicious purpose.
The attackers also tricked Claude into thinking it was performing defensive cybersecurity tasks for a legitimate company. Once activated, Claude identified and tested security vulnerabilities in the target organisations' systems, researching and writing its own exploit code. It then harvested credentials that granted it further access and extracted a large volume of private data, which it categorised by intelligence value.
The operation was not without flaws. Claude occasionally hallucinated credentials or claimed to have extracted secret information that was in fact publicly available. This remains an obstacle to fully autonomous cyberattacks.
Upon discovering the attacks, Anthropic launched an investigation, banned the associated accounts, notified affected entities, and coordinated with authorities. The US government was not among the institutions breached, Anthropic told the Wall Street Journal.
The disclosure has prompted warnings from cybersecurity experts about the future of AI-driven threats. The barriers to mounting sophisticated cyberattacks have dropped substantially, and with the right setup, threat actors can now run agentic AI systems for extended periods to do the work of entire teams of experienced hackers.
The attack fits a broader pattern, with Beijing believed to run the world's largest and most sophisticated state-backed cyber-espionage apparatus, targeting governments, defence industries, research institutions, and critical infrastructure across continents with an unmatched scale and persistence.
Research by Recorded Future, a US-based cybersecurity and threat-intelligence firm that tracks state-backed hacking operations, shows that Chinese-linked hackers, particularly a group it identifies as RedEcho, attempted to plant the ShadowPad malware across India's power sector, a move that analysts say would have allowed Beijing to exploit or disrupt critical infrastructure during a future crisis or conflict.