Ahoy m@tes, the scraping bot situation has been escalating recently, as you may have already noticed from the recent site instability and 5xx error responses. @tenchiken@anarchist.nexus has been scrambling to block new scraping subnets as they appear, but these assholes keep jumping providers, so it’s been an endless loop and constant firefighting.
I finally had enough and decided to onboard a Proof-of-Work countermeasure, very much like Anubis, which has been very popular on the fediverse lately. However, I went with Haphash, which has been designed specifically around haproxy (our reverse proxy of choice) and is hopefully much more lightweight.
The new PoW shield has already been activated on both Divisions by Zero and Fediseer. It’s not active on all URLs, but it should be protecting those which have the most impact on our database, which is what was causing the actual issue. You should notice a quick loading screen on occasion while it’s verifying you.
We’ve already seen a significant reduction in 5xx HTTP errors, as well as a slight reduction in traffic, so we’re hoping this will have a good impact on our situation.
Please do let us know if you run into any issues, and also let us know if you feel any difference in responsiveness. The first m@tes already feel it’s all snappier, but that might just be placebo.
And let’s hope the next scraping wave is not pwned residential botnets, or we’re all screwed >_<
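For anyone curious what a Proof-of-Work check boils down to, here is a minimal hashcash-style sketch in Python. To be clear, this is not how Haphash or Anubis are actually implemented; the function names, the `challenge:nonce` format, the use of SHA-256, and the "leading zero bits" difficulty rule are all assumptions picked for illustration. The idea it shows is just the asymmetry: the visitor has to grind through many hashes to find a valid nonce, while the server verifies the answer with a single hash.

```python
import hashlib
import itertools


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # First non-zero byte: count its leading zero bits, then stop.
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side (hypothetical): brute-force a nonce that meets the difficulty."""
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side (hypothetical): a single hash is enough to check the answer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty
```

With a difficulty of `d` bits, a solver needs about `2**d` hashes on average, so the cost is negligible for one person loading a page but adds up quickly for a scraper hammering thousands of URLs.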


Fighting the good fight. The Internet should be for people, not robots.
Scraping is neither new, nor always malicious. Without scraping, no search engine would work and there would be no archive.org wayback machine.
However, AI scrapers all copy the same shit over and over again and do not intend to lead traffic to your site. They just cause cost and don’t give anything in return. This is the problem.
Honestly, my head says lemmy should be search indexed to drive traffic here, but my heart says I don’t need lemmy to be indexed by google to enjoy it and I’d rather not have the rest of reddit over here stinking up the place :)
Imo in an ideal world google would be publicly owned and it would cooperate with other sites to voluntarily share their own self-indexes
interesting thought… do you think we would?
The web is for user agents
Uh actually the word ‘robot’ was originally a regional term for serfs. Are you implying serfs aren’t/weren’t people, and the internet should only be for the upper classes?
Is this a joke
No; I’m very serious at all times. I have never in my life made a joke. How dare you insult me like that. What about my statement was funny to you?
Yeah, sure. But specifically the moon serfs from Xoran IV.