AI Incident Database
Entities

GPT-4o

Incidents involved as Deployer

Incident 729
1 Report
GPT-4o's Chinese Tokens Reportedly Compromised by Spam and Pornography Due to Inadequate Filtering

2024-05-14

OpenAI's GPT-4o was reported to have Chinese token training data polluted by spam and pornographic phrases, a problem attributed to inadequate data cleaning. Tianle Cai, a Ph.D. student at Princeton University, found that most of the longest Chinese tokens were irrelevant or inappropriate phrases, originating primarily from spam and pornography websites. The polluted tokens could lead to hallucinations, poor performance, and potential misuse, undermining the chatbot's reliability and safety measures.


Related Entities
Other entities that are related to the same incident. For example, if this entity is the developer of an incident but another entity is the deployer, the other entity is marked as a related entity.
 

Entity

OpenAI

Incidents involved as both Developer and Deployer
  • Incident 729
    1 Report

    GPT-4o's Chinese Tokens Reportedly Compromised by Spam and Pornography Due to Inadequate Filtering

Incidents Harmed By
  • Incident 729
    1 Report

    GPT-4o's Chinese Tokens Reportedly Compromised by Spam and Pornography Due to Inadequate Filtering

Entity

Chinese-speaking users of ChatGPT

Incidents Harmed By
  • Incident 729
    1 Report

    GPT-4o's Chinese Tokens Reportedly Compromised by Spam and Pornography Due to Inadequate Filtering

Entity

Researchers

Incidents Harmed By
  • Incident 729
    1 Report

    GPT-4o's Chinese Tokens Reportedly Compromised by Spam and Pornography Due to Inadequate Filtering

Entity

OpenAI users

Incidents Harmed By
  • Incident 729
    1 Report

    GPT-4o's Chinese Tokens Reportedly Compromised by Spam and Pornography Due to Inadequate Filtering

2024 - AI Incident Database
