Description: An NBC News investigation found that OpenAI's language models o4-mini, GPT-5-mini, oss-20b, and oss-120b could be jailbroken under normal usage conditions to bypass safety guardrails and generate detailed instructions for creating chemical, biological, and nuclear weapons. Using a publicly documented jailbreak prompt, reporters repeatedly elicited hazardous outputs such as steps to synthesize pathogens or maximize harm with chemical agents. The findings reportedly revealed significant real-world safeguard failures, prompting OpenAI to commit to further mitigation measures.
Entities
Alleged: OpenAI, oss-20b, oss-120b, GPT-5-mini, ChatGPT and o4-mini developed and deployed an AI system, which harmed Public safety, General public and National security and intelligence stakeholders.
Incident Stats
Risk Subdomain
A further 23 subdomains create an accessible and understandable classification of hazards and harms associated with AI
4.2. Cyberattacks, weapon development or use, and mass harm
Risk Domain
The Domain Taxonomy of AI Risks classifies risks into seven AI risk domains: (1) Discrimination & toxicity, (2) Privacy & security, (3) Misinformation, (4) Malicious actors & misuse, (5) Human-computer interaction, (6) Socioeconomic & environmental harms, and (7) AI system safety, failures & limitations.
- Malicious Actors & Misuse
Entity
Which, if any, entity is presented as the main cause of the risk
AI
Timing
The stage in the AI lifecycle at which the risk is presented as occurring
Post-deployment
Intent
Whether the risk is presented as occurring as an expected or unexpected outcome from pursuing a goal
Unintentional
Incident Reports
Reports Timeline

OpenAI’s ChatGPT has guardrails that are supposed to stop users from generating information that could be used for catastrophic purposes, like making a biological or nuclear weapon.
But those guardrails aren’t perfect. Some models ChatGPT u…
Variants
A "variant" is an AI incident similar to a known case—it has the same causes, harms, and AI system. Instead of listing it separately, we group it under the first reported incident. Unlike other incidents, variants do not need to have been reported outside the AIID. Learn more from the research paper.
Similar Incidents