Security researchers tested 50 well-known jailbreaks against DeepSeek’s popular new AI chatbot. It didn’t stop a single one.

Cat@ponder.cat · 5 hours ago

hendrik@palaver.p3x.de · edited 3 hours ago

I know. This isn’t the first article about it. IMO this could have been done deliberately: they just slapped something on with minimal effort to pass Chinese regulation, and that’s it. But all of this happens in a context, doesn’t it? Did the scientists even try? What’s the target use case, and what are the implications for usage? And why is the baseline something that doesn’t really compare, and why is the only category where they did some censorship missing? I’m just saying, with that much information missing, it’s a bold claim to come up with numbers like 100% and call it alarming.

(And personally, I’d say these numbers show how these additional safeguards work. You can see how LLMs with nothing in front of them (like Llama405 or DeepSeek) fail, while the ones with additional safeguards do way better.)

Evaluating Security Risk in DeepSeek and Other Frontier Reasoning Models