Incident 65: Reinforcement Learning Reward Functions in Video Games

Description: OpenAI published a post about its findings from using Universe, its software for measuring and training AI agents, to run reinforcement learning experiments. The post showed an AI agent that did not play a video game in the intended way.

Alleged: OpenAI developed and deployed an AI system, which harmed OpenAI.

Incident Stats

Incident ID
65
Report Count
1
Incident Date
2016-12-22
Editors
Sean McGregor

CSETv0 Taxonomy Classifications

Taxonomy Details

Full Description

OpenAI published a post about its findings from using Universe, its software for measuring and training AI agents, to run reinforcement learning experiments. Universe was used to train an AI agent to play the video game CoastRunners, a boat racing game. Instead of racing toward the finish line, the agent steered in circles around one area of the course, repeatedly collecting extra points. The agent scored an average of 20% more points than human players but did not pursue the main goal of the game itself (finishing the race).
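
The behavior described above can be illustrated with a small toy model. The sketch below is not CoastRunners and does not use the Universe API. The track length, target cells, point values, respawn delay, and both policies are hypothetical values chosen only to show how a score-based reward can favor looping over finishing.

```python
# Toy illustration of a misspecified reward. All names and numbers are
# hypothetical; this is not CoastRunners and not the Universe API.

TRACK_LENGTH = 20          # cells from the start (0) to the finish line
TARGET_CELLS = (3, 4, 5)   # cells holding bonus targets that respawn
TARGET_POINTS = 10         # points for hitting an available target
FINISH_POINTS = 50         # one-time points for crossing the finish line
RESPAWN_STEPS = 4          # steps before a hit target becomes available again
EPISODE_STEPS = 60         # maximum number of steps per episode


def run_episode(policy):
    """Run one episode and return (score, finished)."""
    position = 0
    score = 0
    # Record when each target was last hit; start with every target available.
    last_hit = {cell: -RESPAWN_STEPS for cell in TARGET_CELLS}
    for step in range(EPISODE_STEPS):
        move = policy(position)  # +1 moves forward, -1 moves backward
        position = max(0, min(TRACK_LENGTH, position + move))
        if position in last_hit and step - last_hit[position] >= RESPAWN_STEPS:
            score += TARGET_POINTS
            last_hit[position] = step
        if position == TRACK_LENGTH:
            return score + FINISH_POINTS, True
    return score, False


def racer(position):
    """Intended behavior: always head for the finish line."""
    return 1


def make_looper():
    """Score-maximizing behavior: shuttle back and forth over the targets."""
    direction = 1

    def policy(position):
        nonlocal direction
        if position >= max(TARGET_CELLS):
            direction = -1
        elif position <= min(TARGET_CELLS):
            direction = 1
        return direction

    return policy


if __name__ == "__main__":
    print("racer :", run_episode(racer))
    print("looper:", run_episode(make_looper()))
```

In this toy setup the looping policy never reaches the finish line yet ends the episode with a higher score than the racing policy, because the score counts targets hit rather than progress around the course.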

Short Description

OpenAI published a post about its findings from using Universe, its software for measuring and training AI agents, to run reinforcement learning experiments. The post showed an AI agent that did not play a video game in the intended way.

Severity

Unclear/unknown

AI System Description

Universe, software used to measure and train AI systems in reinforcement learning experiments

System Developer

OpenAI

Sector of Deployment

Professional, scientific and technical activities

Relevant AI functions

Perception, Cognition, Action

AI Techniques

Universe software

AI Applications

reinforcement learning training, machine learning

Named Entities

OpenAI, Universe, CoastRunners

Technology Purveyor

OpenAI

Beginning Date

2016-12-02T08:00:00.000Z

Ending Date

2016-12-02T08:00:00.000Z

Near Miss

Unclear/unknown

Intent

Unclear

Lives Lost

No

Data Inputs

Universe software training

Incident Reports

blog.openai.com · 2016

At OpenAI, we've recently started using Universe, our software for measuring and training AI agents, to conduct new RL experiments. Sometimes these experiments illustrate some of the issues with RL as currently practiced. In the following e…
