組織
OpenAI users
影響を受けたインシデント
インシデント 7291 レポート
GPT-4o's Chinese Tokens Reportedly Compromised by Spam and Pornography Due to Inadequate Filtering
2024-05-14
OpenAI's GPT-4o was found to have its Chinese token training data compromised by spam and pornographic phrases due to inadequate data cleaning. Tianle Cai, a Ph.D. student at Princeton University, identified that most of the longest Chinese tokens were irrelevant and inappropriate, primarily originating from spam and pornography websites. The polluted tokens could lead to hallucinations, poor performance, and potential misuse, undermining the chatbot's reliability and safety measures.
もっと