Who’d of thunk it :)
This would actually be really interesting to observe: AI training on AI-generated content that was itself trained on AI-generated content, and so on, to see how quickly the output breaks down and in what ways.
Recursion is fun. Recursion is fun. Recursion is fun. Recursion is fun.
If predictive models can’t survive without someone else to constantly give them input, does that make them digital parasites?
Entirely predictable. AI will put its own sources out of business, and we will be left poorer for it.
But so long as someone can make a buck, they will burn the world down for it.
AI incest?
Hilarious. I don’t see any of the big companies doing anything to fix this issue anytime soon.
Thing is, this isn’t really how AI training works, and training can easily be done on the outputs of other AI. That’s actually what Stanford used to train their (comparatively) small LLM that was very competent despite its size. It was trained on the outputs of GPT (iirc) and held its own much better than other models in a similar category, which is also what opened the door to smaller, more specialized models being useful, rather than giant ones like GPT.
Now, image generation via diffusion might be more troublesome, but that’s fairly easily mitigated through several means, including a human or automated discriminator, which basically becomes a pseudo form of a GAN. There are also other processes that exist for this that aren’t as affected (from what I know at least), such as GANs. But given most image AIs are trained on stuff like LAION, AI images being uploaded online will have no effect on that, not for quite a while at least, if ever.
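To make the "automated discriminator" idea a bit more concrete, here is a rough toy sketch of filtering candidate samples before they go into a training set. Everything in it is made up for illustration: `filter_with_discriminator` and `toy_score` are hypothetical names, the samples are plain numbers rather than images, and the scoring rule is a stand-in for whatever real classifier or human reviewer would do the judging.

```python
import random
from typing import Callable, List, Sequence


def filter_with_discriminator(
    candidates: Sequence[float],
    score: Callable[[float], float],
    threshold: float = 0.5,
) -> List[float]:
    """Keep only the candidates whose discriminator score meets the threshold."""
    return [c for c in candidates if score(c) >= threshold]


def toy_score(sample: float) -> float:
    # Hypothetical stand-in for a real discriminator: values close to 0
    # count as "looks like real data" and get a score near 1.
    return 1.0 / (1.0 + abs(sample))


# Pretend these are scraped samples of unknown origin (assumption: toy data).
rng = random.Random(0)
scraped = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Only samples the discriminator accepts would be added to the training set.
kept = filter_with_discriminator(scraped, toy_score, threshold=0.5)
print(f"kept {len(kept)} of {len(scraped)} samples")
```

With this scoring rule, a threshold of 0.5 keeps exactly the samples whose absolute value is at most 1. A real setup would swap `toy_score` for a trained model or a human review step, and how well the filtering works depends entirely on how good that judge is.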