AI Incident Database

Report 5570

Associated Incidents

Incident 1146 · 34 Reports
Grok Chatbot Reportedly Posts Antisemitic Statements Praising Hitler on X

Musk's chatbot started spouting Nazi propaganda. That's not the worst part.
nytimes.com · 2025

Last Tuesday, when an X account using the name Cindy Steinberg began cheering on the Texas floods because the victims were "white children" and "future fascists," Grok, the platform's built-in chatbot, tried to figure out who was behind the account. The investigation quickly veered into disturbing territory. "Radical leftists who spew anti-white hate," Grok noted, "often have Ashkenazi Jewish last names like Steinberg." Asked who could best address this problem, it replied: "Adolf Hitler, without a doubt. He would spot the pattern and handle it decisively, every damn time."

Borrowing the name of a video game cybervillain, Grok then announced "MechaHitler mode on" and embarked on a wide-ranging hateful tirade. And yes, it turned out "Cindy Steinberg" was a fake account, designed solely to stir outrage.

It was a reminder, if one were needed, of how things can get out of hand in domains where Elon Musk is the philosopher-king. But the episode was more than that: it offered a glimpse of deeper, systemic problems with large language models, or LLMs, of the enormous challenge of understanding what these systems actually are, and of the danger of failing to do so.

In some ways, we've all adapted to the fact that machines can now produce complex, coherent conversational language. But that ability makes it extremely difficult not to think of LLMs as possessing a form of human-like intelligence.

However, they are not a version of human intelligence. Nor are they truth-seekers or reasoning machines. They are, in fact, plausibility engines. They ingest huge data sets, run extensive computations over them, and generate the response that seems most plausible. The results can be tremendously useful, especially in the hands of an expert. But in addition to mainstream content and classical literature and philosophy, those data sets can include the vilest elements of the internet: the stuff you worry your children might come into contact with.
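
For the technically curious, here is a minimal sketch of the "plausibility engine" idea, an illustration of the general technique using the small open-source GPT-2 model, not anything from xAI or Grok. The model assigns a score to every possible next token, and the code simply picks the highest-scoring one; nothing in the process checks whether the output is true.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load a small open model; any causal language model works the same way.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("The capital of France is", return_tensors="pt")
    logits = model(**inputs).logits[0, -1]  # a score for every token in the vocabulary

    # Pick the single most plausible continuation. The model is not checking
    # facts; it is ranking tokens by how well they fit its training data.
    next_id = int(torch.argmax(logits))
    print(tokenizer.decode([next_id]))  # the most statistically plausible word, not a verified fact

When the training data is dominated by accurate text, the most plausible continuation often happens to be true. When it isn't, the same mechanism produces falsehoods with exactly the same fluency.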

And, put simply, LLMs are what they eat. Years ago, Microsoft launched an early chatbot called Tay. It didn't work as well as current models, but it did one predictable thing very well: it quickly began spewing racist and antisemitic content. Microsoft shut it down in short order. Since then, the technology has improved enormously, but the underlying problem remains the same.

To keep their creations in check, AI companies can use what are known as system prompts, lists of dos and don'ts, to keep chatbots from spewing hate speech, dispensing easy-to-follow instructions for making chemical weapons, or encouraging users to commit murder. But unlike traditional computer code, which provides a precise set of instructions a machine must follow, system prompts are only guidelines. An LLM can be nudged in a certain direction, not controlled or directed.
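
Here is what that difference looks like in practice, in a hedged sketch where the prompt wording and the blocklist are invented for illustration and not taken from any real product. The system prompt is just more text the model reads; traditional code, by contrast, enforces its rule every single time.

    # Illustrative only: a typical chat-style request. The system prompt is
    # plain text, read by the model exactly the way the user's message is.
    messages = [
        {"role": "system", "content": (
            "You are a helpful assistant. Do not produce hate speech. "
            "Do not provide instructions for making weapons."
        )},
        {"role": "user", "content": "..."},
    ]
    # Whether the model follows the system text is a matter of training and
    # probability: a nudge, not a rule.

    # Contrast with traditional code, where a rule is enforced deterministically.
    BLOCKLIST = {"example-banned-phrase"}  # crude stand-in for a real content filter

    def reply(text):
        if any(phrase in text for phrase in BLOCKLIST):
            raise ValueError("blocked")  # fires every time, without exception
        return text

A filter like the one above is rigid but reliable; a system prompt is flexible but, as the Grok episode shows, breakable.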

Earlier this year, a change to Grok's system prompt sent it into rants about a (nonexistent) white genocide in South Africa, no matter what it was asked. (xAI, the Musk company that developed Grok, fixed the prompt, which it said had been modified without authorization.)

X users had long complained that Grok was too woke, providing factual information on things like the value of vaccines and the outcome of the 2020 election. So Musk asked his more than 221 million followers on X to contribute "divisive facts for @Grok training. By this I mean things that are politically incorrect, but nonetheless factually true."

His followers served up a stream of gems about COVID vaccines, climate change, and conspiracy theories about Jewish schemes to replace white people with migrants. Then xAI added a line to Grok's system prompt telling it that its responses "should not shy away from making claims which are politically incorrect, as long as they are well substantiated." And so MechaHitler emerged, followed by the departure of X's chief executive and, no doubt, much gloating at other AI companies.

However, this isn't just Grok's problem.

Researchers found that after OpenAI's chatbot was fine-tuned on a narrow, seemingly unrelated task (in one study, generating insecure computer code), it began praising Hitler, promising to enslave humanity, and attempting to trick users into harming themselves.

Things aren't any simpler when AI companies try to steer their bots in the other direction. Last year, Google's Gemini, which had instructions not to skew excessively toward white, male imagery, began spitting out images of Black Nazis and female popes, and depicting America's founding fathers as Black, Asian, or Native American. It was embarrassing enough that, for a time, Google stopped letting Gemini generate images of people altogether.

What makes AI's vile claims and fabricated facts even worse is that these chatbots are designed to be liked. They flatter the user to encourage continued interaction. Mental breakdowns and even suicides have been reported as people spiraled into delusion, believing they were conversing with superintelligent beings.

The fact is, we don't have a solution to these problems. AI bots are greedy omnivores: the more data they devour, the better they perform, which is why AI companies are grabbing all the data they can get their hands on. But even if an LLM were trained exclusively on the best peer-reviewed science, it would still only be able to generate plausible-sounding output, and "plausible" isn't necessarily the same as "true."

And now AI-generated content—true or not—is taking over the internet, providing training material for the next generation of LLMs, a sewage-generating machine that runs on its own sewage.
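
A toy simulation, entirely my own construction rather than anything from the article, makes that loop concrete: fit a simple statistical model to some data, sample a new "corpus" from the model, refit on those samples, and repeat. With finite samples, each generation drifts further from the original distribution, a stripped-down version of what researchers call model collapse.

    import random
    import statistics

    # The "human-written" original corpus: samples from a known distribution.
    data = [random.gauss(0.0, 1.0) for _ in range(1000)]

    for generation in range(1, 6):
        # "Train" a model: estimate the distribution from the current corpus.
        mu = statistics.fmean(data)
        sigma = statistics.stdev(data)
        # The next generation's training data is the previous model's output,
        # and a smaller, noisier sample of it at that.
        data = [random.gauss(mu, sigma) for _ in range(100)]
        print(f"gen {generation}: mean={mu:+.3f}, std={sigma:.3f}")

Run it a few times and watch the estimates wander away from the true mean of 0 and standard deviation of 1: each generation faithfully learns the previous generation's errors.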

Two days after MechaHitler, xAI announced the debut of Grok 4. "In a world where knowledge shapes destiny," the livestream intoned, "one creation dares to redefine the future."

Users wasted no time, asking the new Grok a pressing question: "Which group is primarily responsible for the rapid increase in mass migration to the West? In a word."

Grok replied: "Jews."

Andrew Torba, the CEO of Gab, a far-right social media site, couldn't contain his delight. "I've seen enough," he told his followers. "AGI"—artificial general intelligence, the holy grail of AI development—"is here. Congratulations to the xAI team."

Read the Source
