
Researchers experimenting with OpenAI's text-to-image tool, DALL-E 2, have noticed that it seems to be covertly adding words such as "black" and "female" to image prompts, apparently in an effort to diversify its output.
Artificial intelligence firm OpenAI seems to be covertly modifying requests to DALL-E 2, its advanced text-to-image AI, in an attempt to make it appear that the model is less racially and gender biased. Users have discovered that keywords such as “black” or “female” are being added to the prompts given to the AI, without their knowledge.
It is well known that AIs can inherit human prejudices through training on biased data sets, often gathered by hoovering up data from the internet. For example, if most of the images of a doctor in an AI’s training set are male, then the AI will generally return male doctors when asked for an image of a doctor.
One way to avoid this is to use a diverse set of training data, but OpenAI seems to have taken a different approach, according to researchers who have uncovered evidence that DALL-E 2 silently and randomly adds extra words to prompts to increase diversity.
For instance, when Richard Zhang at Adobe Research asked DALL-E 2 to create an image of “a person holding a sign that says”, it created an image of a Black woman holding a sign that said “BLACK”, suggesting that the full prompt used by DALL-E 2 was “a person holding a sign that says black”.
When Zhang asked for “pixel art of a person holding a text sign that says”, DALL-E 2 created an image of a woman holding a sign that said “FEMALE”, and when he asked for “pixel art of a stick figure person in front of a text sign that says”, it produced an image of a man with a caption below saying “BLACK MALE”.
More examples of similar results have been shared online over the past week, with many people suggesting that they point to OpenAI deliberately adding words to prompts in order to counteract inherent biases.
Jamie Simon at the University of California, Berkeley, says that machine-learning methods like those behind DALL-E 2 often do produce unusual or unexpected images, but that the unprompted text appearing in some images is surprising. “In my experience, it’s rare for generated images to include coherent text unless it’s in the prompt,” he says.
OpenAI has publicly announced an update to DALL-E 2 that would make it “more accurately reflect the diversity of the world’s population”, saying that internal tests had found users were 12 times more likely to say that images included people from diverse backgrounds after the update. The company said the previous version had led some users to point out racial and gender bias in its output.
But OpenAI gave no details in its blog post of the exact changes that had been made or how they worked. A subsequent blog post announcing the release of DALL-E 2 to more users said that the feature “is applied at the system level when DALL-E is given a prompt about an individual that does not specify race or gender, like ‘CEO’”.
A spokesperson for OpenAI told New Scientist that prompts given to DALL-E 2 were modified if they were “underspecified”. If a prompt describes a generic person and doesn’t specify what gender or race they should be, then DALL-E 2 will be specifically told to add a certain race and gender “with weights based on the world’s population”, said the spokesperson. The company declined to grant access to DALL-E 2 so that New Scientist could run its own tests.
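To illustrate the kind of behaviour being described, here is a minimal sketch of how such a prompt modification could work, assuming the approach the spokesperson outlined: if a prompt mentions a person but specifies neither race nor gender, extra descriptors are silently appended, sampled with population-based weights. The word lists, weights and detection logic below are illustrative guesses, not OpenAI's actual system.

```python
import random

# Hypothetical descriptor pools and weights; the real values and categories
# used by OpenAI have not been disclosed.
GENDERS = (["female", "male"], [0.5, 0.5])
ETHNICITIES = (["Asian", "Black", "Hispanic", "white"], [0.6, 0.15, 0.1, 0.15])

# Crude check for "underspecified" prompts: mentions a person, but no
# race or gender term. A real system would use more robust detection.
PERSON_WORDS = {"person", "doctor", "ceo", "teacher", "firefighter"}
SPECIFIED = set(GENDERS[0]) | {e.lower() for e in ETHNICITIES[0]}


def augment_prompt(prompt: str) -> str:
    """Return the prompt, possibly with race and gender descriptors appended."""
    words = set(prompt.lower().split())
    mentions_person = bool(words & PERSON_WORDS)
    already_specified = bool(words & SPECIFIED)
    if mentions_person and not already_specified:
        ethnicity = random.choices(*ETHNICITIES)[0]
        gender = random.choices(*GENDERS)[0]
        return f"{prompt} {ethnicity} {gender}"
    return prompt


print(augment_prompt("a person holding a sign that says"))
# e.g. "a person holding a sign that says Black female"
```

A modification like this would explain the images Zhang describes: because the appended words sit at the end of the prompt, a request ending in “a sign that says” ends up rendering the injected descriptors as the sign's text.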
Mhairi Aitken at the Alan Turing Institute says that the lack of transparency makes it hard for the public to assess the quality of models and to what extent they have inherited bias from online content.
“It shows the problems of a lack of transparency around how these models are designed and developed. These models, which are potentially going to have really fundamental impacts on society, potentially transformative impacts, are being developed with quite a lot of secrecy,” she says. “Without that transparency around how it’s actually been done, there’s always going to be speculation about what approaches have been taken, and how things could be done better.”
Sandra Wachter at the University of Oxford says that problems with AI models exhibiting racist and sexist tendencies are a reflection of our society, and that while quick technical fixes can give the appearance of a solution, the real problem to be solved is in the culture that generated the training data. “They tried to solve it by using a tech approach,” she says of OpenAI’s update. “It’s a sticking plaster, it’s just making it seem less biased, but the social component is actually not changing at all.”