National security stakeholders
Affected by Incidents
Incident 1188, 2 Reports
Multiple LLMs Reportedly Generated Responses Aligning with Purported CCP Censorship and Propaganda
2025-06-25
On June 25, 2025, the American Security Project produced a report outlining how several major U.S. LLMs, including ChatGPT, Microsoft Copilot, Google Gemini, and Grok, sometimes generated responses aligned with Chinese Communist Party propaganda or censorship when prompted in English and Simplified Chinese on sensitive topics. The study also found similar patterns in the Chinese-developed DeepSeek-R1 model.
Incident 1238, 1 Report
OpenAI ChatGPT Models Reportedly Jailbroken to Provide Chemical, Biological, and Nuclear Weapons Instructions
2025-10-10
An NBC News investigation found that OpenAI's language models o4-mini, GPT-5-mini, oss-20b, and oss-120b could be jailbroken under normal usage conditions to bypass safety guardrails and generate detailed instructions for creating chemical, biological, and nuclear weapons. Using a publicly documented jailbreak prompt, reporters repeatedly elicited hazardous outputs such as steps to synthesize pathogens or maximize harm with chemical agents. The findings reportedly revealed significant real-world safeguard failures, prompting OpenAI to commit to further mitigation measures.