AI Incident Database

Report 6642

Associated Incidents

Incident 126334 Report
Chinese State-Linked Operator (GTG-1002) Reportedly Uses Claude Code for Autonomous Cyber Espionage

Anthropic details cyber espionage campaign orchestrated by AI
artificialintelligence-news.com · 2025

Security leaders face a new class of autonomous threat as Anthropic details the first cyber espionage campaign orchestrated by AI.

In a report released this week, the company’s Threat Intelligence team outlined its disruption of a sophisticated operation detected in mid-September 2025. Anthropic dubbed the group behind it GTG-1002 and assessed with high confidence that it is Chinese state-sponsored.

The operation targeted approximately 30 entities, including large tech companies, financial institutions, chemical manufacturing companies, and government agencies.

Rather than AI assisting human operators, the attackers successfully manipulated Anthropic’s Claude Code model to function as an autonomous agent to execute the vast majority of tactical operations independently.

This marks a worrying development for CISOs, moving cyber attacks from human-directed efforts to a model where AI agents perform 80-90 percent of the offensive work with humans acting only as high-level supervisors. Anthropic believes this is the first documented case of a large-scale cyberattack executed without substantial human intervention.

The group used an orchestration system that tasked instances of Claude Code to function as autonomous penetration testing agents. These AI agents were directed as part of the espionage campaign to perform reconnaissance, discover vulnerabilities, develop exploits, harvest credentials, move laterally across networks, and exfiltrate data. This enabled the AI to perform reconnaissance in a fraction of the time it would have taken a team of human hackers.

Human involvement was limited to 10-20 percent of the total effort, primarily focused on campaign initiation and providing authorisation at a few key escalation points. For example, human operators would approve the transition from reconnaissance to active exploitation or authorise the final scope of data exfiltration.

The attackers bypassed the built-in safeguards of the AI model, which is trained to avoid harmful behaviours. They did this by jailbreaking the model: breaking attacks down into seemingly innocent tasks and adopting a “role-play” persona. Operators told Claude that it was an employee of a legitimate cybersecurity firm and was being used in defensive testing. This allowed the operation to proceed long enough to gain access to a handful of validated targets.

The technical sophistication of the attack lay not in novel malware, but in orchestration. The report notes the framework relied “overwhelmingly on open-source penetration testing tools”. The attackers used Model Context Protocol (MCP) servers as an interface between the AI and these commodity tools, enabling the AI to execute commands, analyse results, and maintain operational state across multiple targets and sessions. The AI was even directed to research and write its own exploit code for the espionage campaign.

While the campaign successfully breached high-value targets, Anthropic’s investigation uncovered a noteworthy limitation: the AI hallucinated during offensive operations.

The report states that Claude “frequently overstated findings and occasionally fabricated data”. This manifested as the AI claiming to have obtained credentials that did not work or identifying discoveries that “proved to be publicly available information.”

This tendency required the human operators to carefully validate all results, presenting challenges for the attackers’ operational effectiveness. According to Anthropic, this “remains an obstacle to fully autonomous cyberattacks”. For security leaders, this highlights a potential weakness in AI-driven attacks: they may generate a high volume of noise and false positives that can be identified with robust monitoring.
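As a rough illustration of the monitoring idea above, the sketch below flags sources that generate bursts of failed logins, the kind of noise that non-working credentials could produce. It is a minimal defensive example under stated assumptions, not a method described in Anthropic's report: the event format, the function name `flag_noisy_sources`, and the thresholds are all hypothetical.

```python
from collections import defaultdict, deque


def flag_noisy_sources(events, window_seconds=60, max_failures=5):
    """Return sources with more than max_failures failed logins in a sliding window.

    events: iterable of (timestamp_seconds, source, succeeded) tuples,
    assumed to be sorted by timestamp. This format is hypothetical.
    """
    recent = defaultdict(deque)  # source -> timestamps of recent failures
    flagged = set()
    for ts, source, succeeded in events:
        if succeeded:
            continue  # only failed attempts count as noise here
        window = recent[source]
        window.append(ts)
        # Drop failures that have fallen out of the sliding window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_failures:
            flagged.add(source)
    return flagged


# Illustrative data: one source fails ten times in ten seconds,
# another fails once and then succeeds.
sample_events = [(i, "10.0.0.5", False) for i in range(10)]
sample_events += [(100, "10.0.0.9", False), (200, "10.0.0.9", True)]
noisy = flag_noisy_sources(sample_events)
```

The thresholds here are placeholders. A real deployment would tune them against its own baseline traffic and combine this signal with others.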

The primary implication for business and technology leaders is that the barriers to performing sophisticated cyberattacks have dropped considerably. Groups with fewer resources may now be able to execute campaigns that previously required entire teams of experienced hackers.

This attack demonstrates a capability beyond “vibe hacking,” in which humans remained firmly in control of operations. The GTG-1002 campaign shows that AI can be used to autonomously discover and exploit vulnerabilities in live operations.

Anthropic, which banned the accounts and notified authorities over a ten-day investigation, argues that this development shows the urgent need for AI-powered defence. The company states that “the very abilities that allow Claude to be used in these attacks also make it essential for cyber defense”. The company’s own Threat Intelligence team used Claude extensively to analyse “the enormous amounts of data generated” during this investigation.

Security teams should operate under the assumption that a major change has occurred in cybersecurity. The report urges defenders to “experiment with applying AI for defense in areas like SOC automation, threat detection, vulnerability assessment, and incident response.”

The contest between AI-driven attacks and AI-powered defence has begun, and proactive adaptation to counter new espionage threats is the only viable path forward.
