Researcher Guide

Defining an "AI Incident"

The commercial air travel industry owes much of its increasing safety to systematically analyzing and archiving past accidents and incidents within a shared database. In aviation, an accident is a case where substantial damage or loss of life occurs. Incidents are cases where the risk of an accident substantially increases. For example, when a small fire is quickly extinguished in a cockpit, it is an "incident," but if the fire burns crew members in the course of being extinguished, it is an "accident." The FAA aviation database indexes flight log data and subsequent expert investigations into comprehensive examinations of both technological and human factors. In part due to this continual self-examination, air travel is one of the safest forms of travel. Decades of iterative improvements to safety systems and training have decreased fatalities 81-fold since 1970 when normalized for passenger miles.

Where the aviation industry has clear definitions, computer scientists and philosophers have long debated foundational definitions of artificial intelligence. In the absence of clear lines differentiating algorithms, intelligence, and the harms they may directly or indirectly cause, this database adopts adaptive criteria for ingesting "incidents": reports are accepted or rejected on the basis of a growing rule set.

More details about the acceptance criteria are available on the criteria page.

Download the Index

The complete state of the database can be downloaded in weekly JSON, MongoDB, and CSV format snapshots. We maintain these snapshots so you can create stable datasets for natural language processing research and academic analysis. Please contact us to let us know what you are using the database for so we can list your work in the incident database and ensure your use case is not dropped from support.

Citing the Database as a Whole

We invite you to cite:

McGregor, S. (2021) Preventing Repeated Real World AI Failures by Cataloging Incidents: The AI Incident Database. In Proceedings of the Thirty-Third Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-21). Virtual Conference.

The pre-print is available on arXiv.

Citing a Specific Incident

Every incident has its own suggested citation that credits both the submitter(s) and the editor(s) of the incident. The submitters are the people who submitted reports associated with the incident, and their names are listed in the order in which their submissions were added to the AIID. Since reports can be added to an incident record over time, our suggested citation format includes the access date. You can find incident citations at https://incidentdatabase.ai/cite/INSERT_NUMBER_HERE.
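As a minimal sketch of the URL pattern above, the following Python helper substitutes an incident number into the citation address and records the access date alongside it. The function name, the returned dictionary keys, and the assumption that incident numbers are positive integers are illustrative only and are not part of the AIID itself.

```python
from datetime import date

# Base address taken from the guide; the incident number is appended to it.
CITE_BASE = "https://incidentdatabase.ai/cite/"


def citation_lookup(incident_number, accessed=None):
    """Return the citation page URL and the access date for one incident.

    Hypothetical helper: assumes incident numbers are positive integers.
    """
    if int(incident_number) < 1:
        raise ValueError("incident number must be a positive integer")
    # The suggested citation includes the access date, so keep it with the URL.
    accessed = accessed or date.today()
    return {
        "url": f"{CITE_BASE}{int(incident_number)}",
        "accessed": accessed.isoformat(),
    }


if __name__ == "__main__":
    print(citation_lookup(23, accessed=date(2021, 3, 1)))
```

The helper only builds the address; the citation text itself still comes from the page it points to.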

Submit Incidents

Add Single Incident Reports

We currently accept incident reports through a quick add form, which does not credit submitters, and through the submit app, which allows you to assign credit for the submission. Both types of submission are a great help in preventing history from repeating itself.

Batch Adding

If you have a large batch (i.e., more than 30) of incidents to add to the database, we can facilitate their import via a CSV file. The steps for getting many incidents batch added to the database are:

  1. Copy the example Google Sheets document.
  2. Columns B through J will be imported into the database, and care should be taken to maintain their format. The first column, "Incident Number", will not be imported directly into the database, but it is used when resolving reports to existing incidents. Incident numbers less than 10,000 are reserved for reports that should be associated with an incident already in the database. Values equal to or greater than 10,000 serve as this spreadsheet's counter for new incidents. If you want to import multiple reports for a single incident, the incident number should be the same across those reports.
  3. Fill in the columns with the following information:
    • title is the title of the report.
    • author(s) contains comma separated values of the authors of the report. Some reports will not have authors, in which case place the publication name in this field.
    • submitter(s) contains comma separated values for the people or organizations submitting the report.
    • incident date is the date on which the incident likely happened. A report is often written about an incident that happened months or years earlier. When it is ambiguous when the incident took place, it is OK to input a best guess. For incidents that are ongoing or had multiple occurrences, the incident date should be the earliest known date.
    • date published is the date the report was posted on the web. Some publications do not have a publication date, in which case you should use the Wayback Machine to establish the date by which the publication had definitely been posted.
    • date downloaded is the date on which the content was copied and pasted into the text column.
    • report address is the URL for the report.
    • image address is the image preview for the report. Capturing this generally requires either viewing the HTML of the web page and capturing static asset paths, or using a web page parser like NewsPlease.
    • text is the text of the report, copied and pasted from the website.
  4. Email batchadd -at- seanbmcgregor.com with a link to your data.
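Before sending a batch, it may help to check the rows against the rules in the steps above. The Python sketch below is one way to do that, not the import pipeline the database itself uses. The snake_case field names are hypothetical stand-ins for the spreadsheet's columns, and the check assumes dates are written as ISO 8601 (YYYY-MM-DD), which the example sheet may or may not require.

```python
from collections import defaultdict
from datetime import date

# From the guide: numbers below this refer to incidents already in the
# database; numbers at or above it are the spreadsheet's new-incident counter.
NEW_INCIDENT_START = 10_000

# Hypothetical keys standing in for spreadsheet columns B through J.
REQUIRED_FIELDS = (
    "title", "authors", "submitters", "incident_date",
    "date_published", "date_downloaded", "report_address",
    "image_address", "text",
)
DATE_FIELDS = ("incident_date", "date_published", "date_downloaded")


def is_existing_incident(incident_number):
    """True when the number points at an incident already in the database."""
    return int(incident_number) < NEW_INCIDENT_START


def check_row(row):
    """Return a list of problems found in one report row (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(row.get(field, "")).strip():
            problems.append(f"missing {field}")
    for field in DATE_FIELDS:
        value = str(row.get(field, "")).strip()
        if value:
            try:
                date.fromisoformat(value)  # assumption: ISO 8601 dates
            except ValueError:
                problems.append(f"{field} is not an ISO date")
    return problems


def group_by_incident(rows):
    """Map each incident number to the titles of the reports that share it."""
    groups = defaultdict(list)
    for row in rows:
        groups[int(row["incident_number"])].append(row["title"])
    return dict(groups)
```

Grouping by incident number makes it easy to confirm that reports meant for the same incident carry the same number, and `is_existing_incident` separates rows that attach to existing incidents from rows that create new ones.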

Related Work

While formal AI incident research is relatively new, a number of people have been collecting what could be considered incidents. These include,

If you have an incident resource that could be added here, please contact us or open a pull request with the resource.