Ahoy m@tes, the scraping bot situation has been escalating recently, as you all may have already noticed by the recent site instability and 5xx error responses. @tenchiken@anarchist.nexus has been scrambling to block new scraping subnets as they appear, but these assholes keep jumping providers so it’s been an endless loop and constant firefighting.
I finally had enough and decided to onboard a Proof-of-Work countermeasure, very much like Anubis, which has been very popular on the fediverse lately. However, I went with Haphash, which has been designed specifically around haproxy (our reverse proxy of choice) and is hopefully much more lightweight.
The new PoW shield has already been activated on both Divisions by Zero and Fediseer. It’s not active on all URLs, but it should be protecting those which have the most impact on our database, which is what was causing the actual issue. You should notice a quick loading screen on occasion while it’s verifying you.
We’ve already seen a significant reduction in 5xx HTTP errors, as well as a slight reduction in traffic, so we’re hoping this will make a good impact in our situation.
Please do let us know if you run into any issues, and also let us know if you feel any difference in responsiveness. The first m@tes already feel it’s all snappier, but that might just be placebo.
And let’s hope the next scraping wave is not pwned residential botnets, or we’re all screwed >_<


This is amazing, this type of anti-bot access should be rolled out everywhere. I wouldn’t mind my battery life being cut by 10% just to access bot-free content.
I would, however. I don’t know if electricity with the correct voltage and amperage just grows on trees up there in the US, but in the rest of the world, we have to pay for electricity, and having to consume more of it also means more damage to our local environment, already preyed upon by northern-hemisphere corporations.
Not to mention, it effectively raises our power bill for no new gain as well, which comes at a very bad time due to recent scandals (up to Constitutional Accusation Summons) in how the costs of energy transportation are being billed to users in my country. Besides all the local cost increases, it mechanically functions not very differently from rent-seeking.
I may be misunderstanding this measure but I don’t think that’s going to be mitigated.
If I understand correctly, this requires browsers requesting a page to do a small amount of “work” for no reason other than demonstrating they’re willing to do it. As a once off for devices used by humans, it’s barely noticeable. For bots reading millions of pages it’s untenable - they’ll just move on to easier targets.
However, that only works for bots whose purpose is to ingest large quantities of text.
For a bot whose purpose is to make posts, or upvote things, or reply to other comments, they’re much less sensitive to this measure because they don’t need to harvest millions of pages.
The threshold for work would have to be increased. 10s of work to post a comment would be a cost that’s tricky to justify.
That would be an unethical waste of power for legit users