Codeberg, a Berlin-based code hosting community, is struggling to cope with a deluge of AI bots that can now bypass previously effective defenses.

In a series of posts to the Mastodon social network on Friday, Codeberg volunteer staff said AI crawlers are no longer being kept at bay by Anubis, a proof-of-work challenge system designed to deter AI bots.

“It seems like the AI crawlers learned how to solve the Anubis challenges,” the Codeberg account said. “Anubis is a tool hosted on our infrastructure that requires browsers to do some heavy computation before accessing Codeberg again. It really saved us tons of nerves over the past months, because it saved us from manually maintaining blocklists to having a working detection for ‘real browsers’ and ‘AI crawlers.’”
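
To make the "heavy computation" idea concrete, here is a minimal sketch of a generic hashcash-style proof-of-work challenge. It is an illustration only: the Codeberg posts do not describe Anubis's actual algorithm, and the function names, the challenge format, and the leading-zero difficulty rule are all assumptions made for this example.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side: issue a random challenge string (hypothetical format)."""
    return secrets.token_hex(16)


def is_valid(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash checks the client's answer.

    Assumed rule: SHA-256(challenge + nonce) must start with
    `difficulty` zero hex digits.
    """
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force nonces until one passes the check."""
    nonce = 0
    while not is_valid(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, difficulty=4)  # tens of thousands of hashes on average
    print(is_valid(challenge, nonce, difficulty=4))
```

The asymmetry is the point of such a scheme: checking an answer costs one hash, while finding one costs many. It is also the limitation, since any client willing to spend the compute can pass, which is consistent with what Codeberg reports seeing.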

The AI bot traffic has functioned as a denial-of-service attack, resulting in what Codeberg staff describe as “a period of extreme slowness today.”

Codeberg says that some of the bots appear to be running on networks controlled by China-based telecom biz Huawei.

A few participants in the discussion have pushed back on the use of Anubis, citing the Free Software Foundation’s position that the AI bot defense project functions like crypto mining code: it sends out a JavaScript program that forces the receiving computer to run calculations the user didn’t ask for, and thus could be deemed malware.

Codeberg staffers nonetheless argue that Anubis remains useful, and say they’re looking into related AI-stopping software called Iocaine.

“We see today another dark side of the abusive use of computing resources brought to us by the LLM and AI ballyhoo,” said Bradley M. Kuhn, policy fellow and hacker-in-residence at Software Freedom Conservancy, in an email to The Register. 

“These bots, in the insatiable greed for more and more training data, are actually launching DDoS attacks against the kindest and most giving people in our community. Any company running bots for the purpose of training LLMs should be ashamed of themselves.”

AI crawlers and services tied to AI bots are unwanted in many FOSS online communities and projects. The Curl project, for example, has repeatedly expressed annoyance at having to deal with AI-assisted bug reports for issues that aren’t legitimate.

Over on the commercial side of the open source world, developers have been pleading with leaders of the Microsoft-subsumed GitHub since May to provide a way to “allow us to block Copilot-generated issues (and PRs) from our own repositories.”

In the initial post, developer Andi McClure warned, “If we are not granted these tools, and ‘AI’ junk submissions become a problem, I may be forced to take drastic actions such as closing issues and PRs on my repos entirely, and moving issue hosting to sites such as Codeberg which do not have these maintainer-hostile tools built directly into the website.”

The discussion thread has attracted more than 1,500 “thumbs up” endorsements and 136 comments.

But fleeing from GitHub to Codeberg will not necessarily avoid the impact of AI crawlers and services, as the Anubis-bypassing bots demonstrate.

Kuhn nonetheless advocates doing so.

“The problems with GitHub have been growing for some time,” he said. “We at SFC have always been concerned about the issue of using proprietary software to write FOSS. However, the integration with Copilot so deeply into the platform, and Microsoft’s flagrant use of content hosted on GitHub to train their own LLMs makes a departure from GitHub urgent for all FOSS developers.” ®