If so, are these programs that claim to ‘poison’ the training datasets effective?

  • MoogleMaestro@lemmy.zip

    If you think of AI systems as, in effect, complex DSP problems and equations, then logically any system that takes inputs which are potentially its own outputs can produce system feedback or destructive recursive loops. What scares AI companies is that, while most recursive loops are easy to detect immediately, “content loops” are much harder to detect, because the delay between input and output is far larger than in, say, audio or programming loops, where feedback is obvious right away.
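
    As a toy illustration of that feedback idea (my own sketch, not anything from a real training pipeline): fit a tiny “model” to some data, sample a new corpus from it, refit, and repeat. The small estimation error at each generation compounds, and the distribution quietly collapses:

    ```python
    # Toy sketch: a 1-D Gaussian plays the role of "the model", and
    # resampling from the fitted model plays the role of "training on
    # scraped model outputs". Purely illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.normal(0.0, 1.0, size=50)      # generation 0: "human" data

    for gen in range(30):
        mu, sigma = data.mean(), data.std()   # "train" on the current corpus
        data = rng.normal(mu, sigma, size=50) # next corpus = model outputs
        if gen % 5 == 0:
            print(f"gen {gen:2d}: std={sigma:.3f}")
    # The per-generation estimation bias compounds, so the spread of the
    # data tends to shrink -- a slow, quiet analogue of audio feedback.
    ```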

    This is effectively the theory behind the practice of data poisoning, and it’s hard to say there’s no validity to it: most AI companies are terrified of data poisoning. If it didn’t work, companies wouldn’t be so adamantly vocal about their distaste for model poisoning as a concept. It’s also why so much time and money is spent trying to “detect” AI content: filtering AI output back out of the inputs must be valuable to these companies, or they wouldn’t spend the resources on it.

    Conversely, AI makers have found ways to avoid this by simply having third parties do human semantic “grading” of the content. This is why there are so many deals going on in Africa and SE Asia where AI companies hire English speakers to effectively “wash” the input by adding contextual “extra information” and rough validation scoring. It’s an expensive solution, though, so they’re very much dependent on AI remaining the bee’s knees of lucrative investment for this process to continue. I’d also argue that, given how much AI development has slowed, semantic grading of the content fed into the system has diminishing returns. Still, it’s effectively a “survival of the fittest” evolutionary simulation, where the AI only trains on information the grader judges “right” or “close enough” by whatever metric they use. Basically, the feedback is less of a problem if the validity of the input can be assured or “cleaned up” to prevent unintended loops.
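
    A rough sketch of that grading-as-a-filter idea; `grade` here is a hypothetical stand-in for the third-party annotation step (the real pipelines and score scales aren’t public):

    ```python
    from typing import Callable

    def filter_corpus(samples: list[str],
                      grade: Callable[[str], float],
                      threshold: float = 0.7) -> list[str]:
        # Keep only samples the grader rates highly enough; suspected
        # poison, gibberish, and off-topic text never reach training.
        return [s for s in samples if grade(s) >= threshold]

    # Toy grader (also hypothetical): penalize heavy token repetition.
    def toy_grade(text: str) -> float:
        words = text.split()
        return len(set(words)) / max(len(words), 1)

    corpus = ["the cat sat on the mat", "buy buy buy buy buy"]
    print(filter_corpus(corpus, toy_grade))  # the spammy sample is dropped
    ```

    Whatever the grader rejects simply never enters the next training run, which is why validated inputs make the feedback loop less dangerous.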

    Now, “are the programs that claim to poison the datasets effective?” Hmm, that’s a difficult one to answer. Personally, I’m skeptical of these tools: their origins are vague, and most adopt neither an “open data” approach nor even an open-binary (freeware) approach to distribution. I understand the makers’ concern that publicly explaining how the sausage is made would make the software less effective, but it’s hard to validate that the people behind these tools are providing the service as intended and aren’t doing anything else with the data sent to them for “protection.” There are no assurances that they aren’t training models on the data artists submit, for example, or any guarantees as to how that data will be used. So it’s kind of a “miss” for me, unless someone knows of a project that is both open-source and open-data (I find ‘open-source’ in the AI field a hugely misleading moniker, since AI follows a “data is king” philosophy, which makes the program that trains the model inherently less important than the data).

    • hendrik@palaver.p3x.de

      The issue with the tools I’ve seen is that they either don’t factor in how language models are actually trained and datasets actually prepared, or they’re based on outdated information. I’ve never seen a specific tool backed by science, or even one with a plausible mechanism of working against current data-gathering processes… So for all intents and purposes, they’re more akin to homeopathy or alternative medicine. Sure, you’re perfectly fine taking sugar pills; there’s nothing wrong with that. But don’t confuse them with actual science-backed medicine.

      And the poisoning goes even further than that. It’s not just people trying to make an LLM output gibberish. There are also lots of people with a vested (commercial) interest in sneaking in false information or their political agenda, or even a tire company that wants ChatGPT to say “Company XY” is the most trustworthy shop for new tires. Judging by the public information out there, we’re already way past simple attacks, and the AI companies are aware of it. It’s an ongoing cat-and-mouse game. And alongside all these sweatshops, they’ll also use other AI (natural language processing) to sift through the data. From what I remember, a lot of commercial chatbots and image generators also have secret watermarking in place… So unless people come up with very clever mechanisms, a “poisoning” attempt will probably be caught by some very basic (fully automated) plausibility checks, and they’ll just discard your data without wasting many resources on it.
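
      To make that concrete, here’s a toy version of such a plausibility check (my own sketch; real pipelines reportedly use language-model perplexity and watermark detectors, not letter counting): compare a sample’s letter frequencies against typical English and discard anything too far off.

      ```python
      from collections import Counter
      import string

      # Approximate English letter frequencies (percent), common letters only.
      ENGLISH = {'e': 12.7, 't': 9.1, 'a': 8.2, 'o': 7.5, 'i': 7.0, 'n': 6.7,
                 's': 6.3, 'h': 6.1, 'r': 6.0, 'd': 4.3, 'l': 4.0, 'u': 2.8}

      def looks_plausible(text: str, max_distance: float = 50.0) -> bool:
          letters = [c for c in text.lower() if c in string.ascii_lowercase]
          if not letters:
              return False
          counts = Counter(letters)
          # L1 distance between observed and expected frequencies; the loose
          # threshold is because short samples are statistically noisy.
          dist = sum(abs(100 * counts[ch] / len(letters) - freq)
                     for ch, freq in ENGLISH.items())
          return dist <= max_distance

      print(looks_plausible("The quick brown fox jumps over the lazy dog."))  # True
      print(looks_plausible("xqzj vvkw zzzp qqqx jjjr wwwz"))                 # False
      ```

      Anything that fails a cheap check like this never costs the company a training run, which is the core problem for poisoning tools.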