A month of ChatGPT-fuelled news
This past month was clearly dominated by ChatGPT news, headlined by OpenAI’s announcement of $10b in new investment from Microsoft in return for a 49% stake and plans for extensive product integrations. OpenAI’s cookbook also exceeded 11,000 stars on GitHub. Microsoft, for its part, announced integrations of a new LLM with Bing search and the Edge browser.
Jailbreaking. Since ChatGPT was released to the public, many ad hoc attempts have been made to jailbreak OpenAI’s limitations on generating harmful content using prompt engineering. More systematic jailbreaks are now making headlines. Redditors have developed DAN (Do Anything Now), a roleplay-based prompt jailbreak that coaxes ChatGPT into assuming the persona of a chatbot without restrictions. The latest version, DAN 5.0, was just released on 2023-02-04.
Math. Despite OpenAI’s release notes stating that ChatGPT has been upgraded with better mathematical capabilities, Twitter users continue to report miserable failures on basic tests for prime numbers, alongside incorrect unit conversions and an inability to order B.C. dates.
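For readers who want to verify such claims themselves, here is a minimal Python sketch of ground-truth checks for the three failure modes above. The specific test values are our own illustrations, not the examples circulating on Twitter.

```python
# Ground-truth checks for the reported failure modes; test values are illustrative.

def is_prime(n: int) -> bool:
    """Deterministic trial division, sufficient for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def miles_to_km(miles: float) -> float:
    """Exact definition of the international mile."""
    return miles * 1.609344

def order_bc_dates(years_bc: list[int]) -> list[int]:
    """Earlier B.C. years are larger numbers: 500 B.C. precedes 200 B.C."""
    return sorted(years_bc, reverse=True)

assert is_prime(7919) and not is_prime(7917)
assert abs(miles_to_km(26.2) - 42.164813) < 1e-6
assert order_bc_dates([200, 500, 44]) == [500, 200, 44]
```

Each of these checks is deterministic, which is precisely why chatbot arithmetic failures are so easy to document.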
Invisible human labor. Recent news from OpenAI supports the ongoing trend in the AI industry of powering AI advances with low-paid human labor. TIME reported that OpenAI hired the Kenyan firm Sama for content moderation, paying workers as little as US$2/hr to do so. Sama, which was also Facebook’s content moderation partner, announced plans in January to exit the content moderation industry entirely as a Kenyan court refused to strike Meta from a pending court case filed by Daniel Motaung alleging toxic workplace conditions for content moderators. At the same time, OpenAI is hiring more contractors for data labeling and for training code generation tools.
AI-generated text detection gone wrong. To mitigate the risk of plagiarism, OpenAI launched the AI Text Classifier, a tool meant to check whether text was generated using AI. OpenAI claims that its tool has a precision of 74%. Nevertheless, high-profile failures, such as flagging Sebastian Raschka’s popular Python machine learning book, the Book of Genesis, and Macbeth; the ease of evading detection through reprompting and paraphrasing; and trouble with writing by neurodivergent people, all caution against any real-world use of AI for detecting plagiarism. Edward Tian’s GPTZero and its next-generation GPTZeroX exhibit similar failures when fed ChatGPT output, even as faculty at Harvard, Yale, and the University of Rhode Island use GPTZero to enforce academic codes of conduct. Researchers at Rice University published a perspective summarizing the difficulties inherent in detecting AI-generated text. See also Kirchenbauer et al. below.
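As context for why these detectors stumble: GPTZero is reported to lean on perplexity-style statistics, i.e., how predictable a text is to a language model. The sketch below scores text by its perplexity under GPT-2 using Hugging Face transformers; it is an illustrative stand-in, not the implementation of any tool named above, and the threshold is entirely hypothetical.

```python
# Minimal sketch of perplexity-based AI-text detection: the family of signals
# GPTZero is reported to use. Illustrative only, not any named tool's code.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under GPT-2: exp of mean next-token cross-entropy."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean next-token negative log-likelihood
    return float(torch.exp(loss))

# Lower perplexity = more "model-like" text. Thresholding this score is brittle.
THRESHOLD = 50.0  # hypothetical; real detectors calibrate against corpora
verdict = "likely AI" if perplexity("To be, or not to be...") < THRESHOLD else "likely human"
print(verdict)
```

Paraphrasing pushes perplexity up, while formulaic human prose (scripture, Shakespeare, textbook passages) can score low, which is consistent with both the evasion tactics and the false positives described above.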
ChatGPT downstream. Educators are split on ChatGPT, with some calling for bans on its use in schools and others embracing the challenge of teaching students to wield a new tool. See also Mollick and Mollick below. PwC warns its consultants not to use ChatGPT for client work. OpenAI goes on record calling for AI regulation to avoid misuse. In an unrelated but ironic turn, a judge in Colombia admitted to using ChatGPT’s output in writing his judgment. See also Downing and Lucey below on generating finance journal submissions.
Google, determined not to be left behind, announced its own ChatGPT competitor, Bard, having just invested $300m in Anthropic. Anthropic in turn released its own ChatGPT competitor, Claude, but with much more limited access and visibility. Bard is reputedly powered under the hood by LaMDA, the LLM that Google engineer Blake Lemoine claimed was sentient just half a year ago. The investment in chatbot technology comes amidst growing gripes about declining search quality and interest in supplanting search with chatbot UIs, on top of pending antitrust litigation over Google’s core advertising business.
Meanwhile, Meta’s Chief AI Scientist Yann LeCun remains dismissive of generative text AI in general.
Ethics. Amidst the accelerating race to ship new chatbots, concerns remain about the fundamental premise that LLMs can only generate bullshit, and that ethics will be the first casualty of taking AI to market at speed. DeepMind’s CEO ‘would advocate not moving fast and breaking things’, calling out the massive scale of experimentation inherent in deploying chatbot technology on the general public.