There’s so much that’s going to need to be done once this era has passed. I’m hoping the EU steps up while the Americans just wallow in the shit.
I see that as difficult if the EU wants to have technological sovereignty.
I hate that this is turning into a situation where, across multiple court cases in the industry, the lawyers are just not doing their jobs rather than actually settling the legal question. I don’t know if it’s ignorance or corruption, but big corporations getting away with stealing from artists is not a new thing. Sad that it’s now come to a point where they can produce so much garbage that it drowns out the work of the original artists. Soon there will be so little content for the LLMs to steal from that everything will be derivative and we’ll end up in a new dark age.
Lawyers are doing their jobs; there is just no legal basis, given how an LLM is built, for the claim that the mere act of training a model or generating a latent space infringes copyright. That said, Anthropic literally just got shit on for pirating a shit ton of training data, not for the training itself, because actually downloading copies of a work without a license is and always has been infringement.
Meta got shit on for pirating books.
Anthropic got shit on for not pirating books.
No? Anthropic’s judge literally let the case continue on the basis that they admitted to torrenting (i.e., pirating) a huge number of books for their training repo. Meta did not do that, afaik.
Americans kissing corporate ass, what else is new?
Turning books into a language model is transformative. No LLM is a substitute for the original works.
Yep. As much as everybody wants to shit on Zuckerberg, you can’t recreate exact copies of the training data with any sort of LLM. You can’t claim that a 12 GB image-generation model somehow houses the entirety of all human-generated images.
Right, a 12 GB model trained on a couple billion images isn’t big enough to hold even an MD5 checksum of each.
The same people expect it to identify the authorship of sentence fragments, yet it can’t quote one whole paragraph from any book. Now: gigabytes of text could be a significant fraction of all books, but finding a single recognizable page is news. Storing text is not what these companies spent a bajillion dollars on.
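To put rough numbers on the storage argument (a back-of-the-envelope sketch; the dataset and book sizes below are my own assumptions, not figures anyone in this thread cited):

```python
# Back-of-the-envelope math for the "the model can't be storing the data" argument.
# The dataset and book sizes here are assumptions (roughly LAION-scale for images,
# ~100k-word novels for text), not figures from the thread.

GB = 10**9

# Images: could a 12 GB model hold even one MD5 hash (16 bytes) per training image?
images = 2_000_000_000                 # assumed training-set size, ~2 billion images
md5_total = images * 16                # an MD5 digest is 128 bits = 16 bytes
print(f"MD5 hashes alone: {md5_total / GB:.0f} GB vs a 12 GB model")
# -> 32 GB of checksums, well over twice the size of the whole model,
#    before a single pixel is stored.

# Text: how many books fit in a gigabyte of plain text?
avg_book_bytes = 600_000               # assumed ~100k words at ~6 bytes per word
print(f"~{GB // avg_book_bytes:,} plain-text books per GB")
# -> roughly 1,700 books per GB, so a terabyte-scale corpus runs into the millions.
```

Either way, the arithmetic backs the point above: whatever those parameters encode, it isn’t a byte-for-byte copy of the corpus.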
Really, the whole basis for the anti-AI arguments seems to boil down to “It feels wrong that a billionaire’s corporation should be able to take the work of artists and writers and, without paying them for it, use it to create a tool that is then used to put them out of work.” And that’s absolutely 100% true, but it unfortunately doesn’t hold any legal weight, and the terms we currently have for intellectual property theft simply aren’t sufficient to describe what’s going on here. Until new laws are passed, I don’t see any of these attempts to stop AI going anywhere, but I’d love to be proven incorrect.
Most anti-AI sentiment seems like misplaced hatred of awful companies forcing nonsense on everybody, or a refusal to pass judgement on capitalism itself.