Description: In mid-March 2025, KDE's GitLab infrastructure was reportedly disrupted by aggressive AI web scrapers originating from Alibaba IP ranges. These bots allegedly ignored robots.txt and spoofed browser headers, which in turn purportedly overwhelmed the site and caused outages for developers. Similar incidents reportedly affected other FOSS projects like GNOME, SourceHut, and Fedora. The scraping is allegedly tied to large language model training, and reportedly imposes real costs and delays.
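The robots.txt convention the scrapers allegedly ignored is purely advisory: a site publishes rules, and a well-behaved crawler checks them before fetching. A minimal sketch using Python's standard library (the URL and bot name are hypothetical; the blanket-disallow rules mirror what many FOSS forges publish):

```python
from urllib import robotparser

# Hypothetical robots.txt content: disallow all paths for all crawlers.
rules = [
    "User-agent: *",
    "Disallow: /",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# A compliant crawler would check this and skip the fetch.
print(rp.can_fetch("MyBot", "https://invent.kde.org/some/repo"))  # False
```

Nothing enforces the answer, which is why ignoring it (or spoofing browser headers to dodge bot-specific rules) works; operators then fall back on active countermeasures such as the Anubis proof-of-work system mentioned below.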
The OECD AI Incidents and Hazards Monitor (AIM) automatically collects and classifies AI-related incidents and hazards in real time from reputable news sources worldwide.
Entities
Alleged: Unnamed generative AI companies, Alibaba, KDE, GNOME, Websites hosting FOSS documentation or bug trackers, SourceHut infrastructure, Pagure.io (Fedora), GitLab instances, Anubis proof-of-work systems, and AI scrapers developed and deployed an AI system, which harmed Sysadmins, SourceHut, Read the Docs, Linux Weekly News, KDE, Inkscape, GNOME, FOSS projects and communities, Fedora, Diaspora, and Curl.
Alleged implicated AI systems: KDE, GNOME, Websites hosting FOSS documentation or bug trackers, SourceHut infrastructure, Pagure.io (Fedora), GitLab instances, Anubis proof-of-work systems, and AI scrapers
Incident Stats
Risk Subdomain
A further 23 subdomains create an accessible and understandable classification of hazards and harms associated with AI
6.1. Power centralization and unfair distribution of benefits
Risk Domain
The Domain Taxonomy of AI Risks classifies risks into seven AI risk domains: (1) Discrimination & toxicity, (2) Privacy & security, (3) Misinformation, (4) Malicious actors & misuse, (5) Human-computer interaction, (6) Socioeconomic & environmental harms, and (7) AI system safety, failures & limitations.
- Socioeconomic & Environmental Harms
Entity
Which, if any, entity is presented as the main cause of the risk
AI
Timing
The stage in the AI lifecycle at which the risk is presented as occurring
Post-deployment
Intent
Whether the risk is presented as occurring as an expected or unexpected outcome from pursuing a goal
Intentional
Incident Reports
Reports Timeline
Three days ago, Drew DeVault, founder and CEO of SourceHut, published a blog post titled "Please stop externalizing your costs directly into my face," in which he complained that LLM companies were crawling data without respecting robots.tx…

Software developer Xe Iaso reached a breaking point earlier this year when aggressive AI crawler traffic from Amazon overwhelmed their Git repository service, repeatedly causing instability and downtime. Despite configuring standard defensi…
Variants
A "variant" is an AI incident similar to a known case—it has the same causes, harms, and AI system. Instead of listing it separately, we group it under the first reported incident. Unlike other incidents, variants do not need to have been reported outside the AIID. Learn more from the research paper.
Similar Incidents

Game AI System Produces Imbalanced Game
· 11 reports

Biased Sentiment Analysis
· 7 reports