I mean, it’s fundamental to LLM technology that it listens to user inputs. Those inputs have probabilistic effects on the outputs, so you’re always going to be able to manipulate the outputs; that’s kind of the premise of the technology.
It will always be prone to that sort of jailbreak. Feed it vocab, it outputs vocab. Feed it permissive vocab, it outputs permissive vocab.
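To make that concrete, here’s a toy sketch (a made-up two-token “model” with invented probabilities, not any real LLM or API): the model is just a conditional distribution over next tokens, so changing the conditioning context necessarily shifts what comes out.

    # Toy sketch: a "model" is just P(next token | context), so changing the
    # context shifts the output distribution. Contexts, tokens, and
    # probabilities here are invented purely for illustration.
    import random
    from collections import Counter

    NEXT_TOKEN = {
        "neutral prompt":    {"refuse": 0.5, "comply": 0.5},
        "permissive prompt": {"refuse": 0.1, "comply": 0.9},
    }

    def sample_outputs(context: str, n: int = 1000) -> Counter:
        """Sample n next tokens conditioned on the given context."""
        tokens, weights = zip(*NEXT_TOKEN[context].items())
        return Counter(random.choices(tokens, weights=weights, k=n))

    print(sample_outputs("neutral prompt"))     # roughly 50/50 refuse vs. comply
    print(sample_outputs("permissive prompt"))  # skews heavily toward "comply"

Real models have vastly bigger vocabularies and contexts, but the mechanism is the same: the prompt is part of the conditioning, so it always moves the output distribution.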
OK? Either OpenAI knows that and is lying about its capabilities, or it doesn’t know and is incompetent. That’s the real story here.
I think the answer is that they are incompetent but also lying about their capabilities. Why else would they have rushed everything and promised so much?
They don’t really care about the fallout; they’re just here to make big promises and large amounts of money on their shiny new tech.