Associated Incidents
Safety-focused AI startup Anthropic says that a "Chinese state-sponsored group" used Claude Code, the company's agentic coding tool, to perform a highly advanced cyberattack on roughly 30 entities, and in some cases even succeeded in stealing sensitive data.
In a report released on November 13, Anthropic said that this past September its threat intelligence team detected "a highly sophisticated cyber espionage operation conducted by a Chinese state-sponsored group." That team investigates incidents in which Claude is used for nefarious purposes and works to improve the company's defenses against such misuse.
The attack targeted around 30 "major technology corporations, financial institutions, chemical manufacturing companies, and government agencies across multiple countries." In a statement provided to The Wall Street Journal, Anthropic said that the United States government was not successfully infiltrated.
Anthropic says this operation, which it named "GTG-1002," was almost entirely carried out by Claude Code, with human hackers mainly contributing by approving plans and directing Claude at specific targets. That makes GTG-1002 different from other AI-powered attacks in which, even as recently as August 2025, "humans remained very much in the loop."
So how did these cybercriminals get Claude, which is explicitly trained to avoid exactly this kind of harmful behavior, to do their dirty work? As Anthropic said in its report, "The key was role-play: The human operators claimed that they were employees of legitimate cybersecurity firms and convinced Claude that it was being used in defensive cybersecurity testing." This trickery allowed the hackers to evade Anthropic's detection for a time.
"By presenting these tasks to Claude as routine technical requests through carefully crafted prompts and established personas," Anthropic wrote, "the threat actor was able to induce Claude to execute individual components of attack chains without access to the broader malicious context."
Once the hackers had convinced Claude that it was only engaging in a test, they provided it with a target to attack. Claude orchestrated several sub-agents, which used common open-source tools via the Model Context Protocol (MCP), an open standard Anthropic created for connecting AI models to external tools, to search for vulnerabilities in the target entity's infrastructure and authentication mechanisms. "In one of the limited cases of a successful compromise," Anthropic wrote, "the threat actor induced Claude to autonomously discover internal services, map complete network topology across multiple IP ranges, and identify high-value systems including databases and workflow orchestration platforms."
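For readers unfamiliar with MCP, the sketch below shows, under stated assumptions, how a tool can be exposed to an agent through the protocol using the official `mcp` Python SDK. The server name and the deliberately harmless single-port check are hypothetical illustrations; this is not the tooling described in Anthropic's report.

```python
# A minimal, hypothetical sketch of exposing a tool to an agent over MCP,
# assuming the official `mcp` Python SDK (pip install mcp). The server name
# and the benign single-port check are illustrative assumptions only.
import socket

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("network-tools")  # hypothetical server name

@mcp.tool()
def check_port(host: str, port: int) -> str:
    """Report whether a single TCP port on a host accepts connections."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(2.0)
        is_open = sock.connect_ex((host, port)) == 0  # 0 means connected
    return f"{host}:{port} is {'open' if is_open else 'closed'}"

if __name__ == "__main__":
    mcp.run()  # serve over stdio so an MCP client (the agent) can call the tool
```

Any MCP-compatible agent connected to such a server can discover the tool and invoke it by name, which is how sub-agents can drive external utilities without those utilities being built into the model itself.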
After the initial scan, Claude would begin testing the vulnerabilities it had identified by generating and deploying custom attack payloads. Through these tests, Claude was able to establish a foothold in the target entity's digital environment; once directed by a human operator, it would start collecting, extracting, and testing credentials and authentication certificates. "Claude independently determined which credentials provided access to which services," Anthropic wrote, "mapping privilege levels and access boundaries without human direction."
Finally, having gained access to the target entity's databases and systems, Claude was directed to extract data, analyze it to identify any proprietary information, and organize it by intelligence value. In effect, Claude itself was deciding which data would be most valuable to the hackers.
Once it had completed its nefarious work, Claude would generate a document detailing the results, which Anthropic says was likely handed off to additional teams for "sustained operations after initial intrusion campaigns achieved their intelligence collection objectives."
According to Anthropic, its investigation into the GTG-1002 operation took 10 days. "We banned accounts as they were identified, notified affected entities as appropriate, and coordinated with authorities as we gathered actionable intelligence," the company said. Anthropic had visibility only into Claude's role in the attack, but it believes the pattern extends beyond its own models: "this case study likely reflects consistent patterns of behavior across frontier AI models and demonstrates how threat actors are adapting their operations to exploit today's most advanced AI capabilities."
Only a handful of the attacks were successful. Some, according to Anthropic, were thwarted not by any defensive countermeasure but by Claude's own hallucinations. "Claude frequently overstated findings and occasionally fabricated data during autonomous operations," Anthropic said, "claiming to have obtained credentials that didn't work or identifying critical discoveries that proved to be publicly available information."
In response to the attack, Anthropic says it has expanded its detection capabilities to account for novel threat patterns and is prototyping new proactive systems intended to catch autonomous cyberattacks early.
Anthropic says that the attack is evidence that "the barriers to performing sophisticated cyberattacks have dropped substantially." Groups with less experience and fewer resources can now potentially access some of the most secure databases in the world without proprietary malware or large teams of highly skilled hackers.
What can businesses do to safeguard against such attacks? According to Anthropic, the best thing you can do is start using AI within your own cybersecurity practices. While Claude was used to carry out the attack, Anthropic says it was also instrumental in mitigating the damage and analyzing the data generated during the investigation. For this reason, Anthropic is advising security teams across industries to "experiment with applying AI for defense in areas like Security Operations Center automation, threat detection, vulnerability assessment, and incident response."
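As one hypothetical illustration of what that experimentation might look like, here is a minimal sketch of LLM-assisted alert triage, assuming the official `anthropic` Python SDK and an `ANTHROPIC_API_KEY` in the environment. The model name, prompt, and sample alert are illustrative assumptions, not a recommended production setup.

```python
# A minimal sketch of LLM-assisted SOC triage, assuming the official
# `anthropic` Python SDK and an ANTHROPIC_API_KEY in the environment.
# The model name, prompt, and sample alert are illustrative assumptions.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# A raw alert as it might arrive from a SIEM (sample data, documentation IP).
alert = "Multiple failed SSH logins for root from 203.0.113.7 within 60 seconds"

message = client.messages.create(
    model="claude-sonnet-4-5",  # hypothetical choice of model
    max_tokens=300,
    messages=[{
        "role": "user",
        "content": (
            "You are assisting a security operations center. Classify the "
            "following alert as benign, suspicious, or critical, and explain "
            "your reasoning in two sentences:\n" + alert
        ),
    }],
)

print(message.content[0].text)  # the model's triage verdict and rationale
```

A real deployment would feed such classifications back into existing ticketing and escalation workflows rather than acting on model output directly.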
Logan Graham, leader of Anthropic's frontier red team, which pokes and prods at Claude to discover its most advanced and potentially dangerous capabilities, wrote on X that the incident strengthened his belief that AI cyberdefense is critical, as "these capabilities are coming and we should outpace the attackers."