Citation Information for Incident 12
Incident Status
CSETv0 Taxonomy Classifications
Taxonomy Details

Full Description
The most common techniques used to embed words for natural language processing (NLP) show gender bias, according to researchers from Boston University and Microsoft Research, New England. The primary embedding studied was a 300-dimensional word2vec embedding of words from a corpus of Google News texts, chosen because it is open-source and popular in NLP applications. After demonstrating gender bias in the embedding, the researchers show that several geometric features of the embedding are associated with that bias and can be used to define a bias subspace. This finding allows them to construct several debiasing algorithms.
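The idea of a bias subspace can be sketched in a few lines. The toy 4-dimensional vectors below are invented stand-ins for the 300-dimensional word2vec embedding the researchers studied; the estimation step (top singular vector of differences between gendered word pairs) and the neutralization step (projecting that component out) follow the general approach described above, not the authors' exact implementation.

```python
import numpy as np

# Toy 4-d "embeddings" standing in for the real 300-d word2vec vectors.
# The first coordinate is deliberately gender-loaded in this fake data.
emb = {
    "he":       np.array([ 1.0, 0.2, 0.1, 0.0]),
    "she":      np.array([-1.0, 0.2, 0.1, 0.0]),
    "man":      np.array([ 0.9, 0.1, 0.3, 0.0]),
    "woman":    np.array([-0.9, 0.1, 0.3, 0.0]),
    "engineer": np.array([ 0.5, 0.8, 0.2, 0.0]),  # leans "male" in this toy data
}

def bias_direction(pairs, emb):
    """Estimate a one-dimensional bias subspace as the top singular
    vector of the stacked differences between gendered word pairs."""
    diffs = np.array([emb[a] - emb[b] for a, b in pairs])
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[0]

def neutralize(v, direction):
    """Remove the component of v lying along the bias direction."""
    return v - np.dot(v, direction) * direction

g = bias_direction([("he", "she"), ("man", "woman")], emb)
before = np.dot(emb["engineer"], g)
after = np.dot(neutralize(emb["engineer"], g), g)
print(f"bias projection before: {before:.3f}, after: {after:.3f}")
```

After neutralization, the projection of "engineer" onto the estimated bias direction is (numerically) zero, while the rest of the vector is untouched; the published algorithms add further steps, such as equalizing explicitly gendered word pairs.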
Short Description
Researchers from Boston University and Microsoft Research, New England demonstrated gender bias in the most common techniques used to embed words for natural language processing (NLP).
Severity
Unclear/unknown
Harm Distribution Basis
Sex
AI System Description
Machine learning algorithms that create word embeddings from a text corpus.
Relevant AI functions
Unclear
AI Techniques
Vector word embedding
AI Applications
Natural language processing
Location
Global
Named Entities
Microsoft, Boston University, Google News
Technology Purveyor
Microsoft
Beginning Date
2016-01-01
Ending Date
2016-01-01
Near Miss
Unclear/unknown
Intent
Unclear
Lives Lost
No
CSETv1 Taxonomy Classifications
Taxonomy Details

Harm Distribution Basis
sex
Sector of Deployment
professional, scientific and technical activities
Incident Reports
Report Timeline
The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning…
Variants
Similar Incidents