Dropsitenews published a list of websites Facebook uses to train its AI. Multiple Lemmy instances are on the list, as noticed by user BlueAEther.
Hexbear is on there too. Also, Facebook is very interested in people uploading their massive dongs to lemmynsfw.
Full article here.
Link to the full leaked list download: Meta leaked list pdf
They could stick it in Azure or AWS or something.
Or they could just use their existing scrapers and try to brute force it. Meta isn’t exactly known for being sneaky.