Incident 1001: LLM Scrapers Allegedly Target Multiple Open Source Projects Disrupting the FOSS Ecosystem

Description: In mid-March 2025, KDE's GitLab infrastructure was reportedly disrupted by aggressive AI web scrapers originating from Alibaba IP ranges. These bots allegedly ignored robots.txt and spoofed browser headers, which in turn purportedly overwhelmed the site and caused outages for developers. Similar incidents reportedly affected other FOSS projects like GNOME, SourceHut, and Fedora. The scraping is allegedly tied to large language model training, and reportedly imposes real costs and delays.

Entities

Alleged: Unnamed generative AI companies, Alibaba, KDE, GNOME, Websites hosting FOSS documentation or bug trackers, SourceHut infrastructure, Pagure.io (Fedora), GitLab instances, Anubis proof-of-work systems and AI scrapers developed and deployed an AI system, which harmed Sysadmins, SourceHut, Read the Docs, Linux Weekly News, KDE, Inkscape, GNOME, FOSS projects and communities, Fedora, Diaspora and Curl.

Alleged implicated AI systems: KDE, GNOME, Websites hosting FOSS documentation or bug trackers, SourceHut infrastructure, Pagure.io (Fedora), GitLab instances, Anubis proof-of-work systems and AI scrapers

Incident Stats

Incident ID

1001

Report Count

Incident Date

2025-03-17

Editors

Daniel Atherton

Incident Reports

Reports Timeline

FOSS infrastructure is under attack by AI companies

thelibre.news

Open source devs say AI crawlers dominate traffic, forcing blocks on entire countries

arstechnica.com

thelibre.news · 2025

Three days ago, Drew DeVault, founder and CEO of SourceHut, published a blog post titled "Please stop externalizing your costs directly into my face", in which he complained that LLM companies were crawling data without respecting robots.tx…

arstechnica.com · 2025

Software developer Xe Iaso reached a breaking point earlier this year when aggressive AI crawler traffic from Amazon overwhelmed their Git repository service, repeatedly causing instability and downtime. Despite configuring standard defensi…

Variants

A "variant" is an AI incident similar to a known case: it has the same causes, harms, and AI system. Instead of listing it separately, we group it under the first reported incident. Unlike other incidents, variants do not need to have been reported outside the AIID. Learn more from the research paper.

Similar Incidents

By textual similarity

Wikipedia Vandalism Prevention Bot Loop

Game AI System Produces Imbalanced Game

Biased Sentiment Analysis
