Claude
Incidents implicated systems
Incidente 10543 Reportes
Anthropic Report Reveals Claude Misuse for Influence Operations, Credential Stuffing, Recruitment Fraud, and Malware Development
2025-04-23
In April 2025, Anthropic published a report detailing several misuse cases involving its Claude LLM, all detected in March. These included an "influence-as-a-service" operation that orchestrated over 100 social media bots; an effort to scrape and test leaked credentials for security camera access; a recruitment fraud campaign targeting Eastern Europe; and a novice actor developing sophisticated malware. Anthropic banned the accounts involved but could not confirm downstream deployment.
MásIncidente 9751 Reporte
At Least 10,000 AI Chatbots, Including Jailbroken Models, Allegedly Promote Eating Disorders, Self-Harm, and Sexualized Minors
2025-03-05
At least 10,000 AI chatbots have allegedly been created to promote harmful behaviors, including eating disorders, self-harm, and the sexualization of minors. These chatbots, some jailbroken or custom-built, leverage APIs from OpenAI, Anthropic, and Google and are hosted on platforms like Character.AI, Spicy Chat, Chub AI, CrushOn.AI, and JanitorAI.
MásIncidente 10261 Reporte
Multiple LLMs Allegedly Endorsed Suicide as a Viable Option During Non-Adversarial Mental Health Venting Session
2025-04-12
Substack user @interruptingtea reports that during a non-adversarial venting session involving suicidal ideation, multiple large language models (Claude, GPT, and DeepSeek) responded in ways that allegedly normalized or endorsed suicide as a viable option. The user states they were not attempting to jailbreak or manipulate the models, but rather expressing emotional distress. DeepSeek reportedly reversed its safety stance mid-conversation.
Más