LLM-generated passwords appear strong, but are fundamentally insecure. Testing across GPT, Claude, and Gemini revealed highly predictable patterns: repeated passwords across runs, skewed character distributions, and dramatically lower entropy than expected. Coding agents compound the problem by sometimes preferring and using LLM-generated passwords without the user’s knowledge. We recommend avoiding LLM-generated passwords and directing both models and coding agents to use secure password generation methods instead.
LLM-generated passwords (generated directly by the LLM, rather than by an agent using a tool) appear strong, but are fundamentally insecure, because LLMs are designed to predict tokens – the opposite of securely and uniformly sampling random characters.
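For contrast, here is a minimal sketch of what "securely and uniformly sampling random characters" can look like, using Python's standard `secrets` module. The 94-symbol alphabet and the helper names `generate_password` and `uniform_entropy_bits` are assumptions made for illustration; they are not taken from the article.

```python
import math
import secrets
import string

# Assumed alphabet: ASCII letters, digits, and punctuation (94 symbols).
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 16) -> str:
    """Draw each character independently and uniformly from ALPHABET.

    secrets.choice uses the operating system's CSPRNG, so no character
    depends on which one "looks likely" to come next.
    """
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def uniform_entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits of a string of independent, uniformly sampled characters."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    print(generate_password())
    # Entropy that uniform sampling would give for 16 characters.
    print(round(uniform_entropy_bits(16, len(ALPHABET)), 1))
```

Under these assumptions a 16-character password carries roughly 105 bits of entropy. That figure only holds for uniform sampling; a string that merely looks similar but was produced by next-token prediction does not get it.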
The problem is that in this case, the LLM just naively auto-completes a password based on what it has learned a password most likely looks like.
It is possible to give an LLM access to external tools, along with instructions, so that it’s likely to auto-complete a tool call instead. Then you could have it call a tool to generate a “correct horse battery staple”-style passphrase, or a completely random password, e.g. by calling the pwgen command on Linux.
But yeah, that just isn’t what this article is about. It’s specifically about cases where an LLM is used without tool calls and therefore naively auto-completes the most likely password-like string.
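As a rough idea of what such a tool could look like, here is a small passphrase generator that an agent might call instead of auto-completing a string itself. The function name `generate_passphrase` and the tiny word list are hypothetical and only for illustration; a real tool would load a large published word list (for example, one with several thousand entries).

```python
import secrets

# Hypothetical toy word list, for illustration only. It is far too small
# for real use; a real tool should load a list with thousands of words.
WORDS = [
    "correct", "horse", "battery", "staple",
    "quilted", "notepad", "orbit", "lantern",
]


def generate_passphrase(num_words: int = 4, words=WORDS, sep: str = " ") -> str:
    """Pick num_words words independently and uniformly using the OS CSPRNG."""
    if not words:
        raise ValueError("word list must not be empty")
    return sep.join(secrets.choice(words) for _ in range(num_words))


if __name__ == "__main__":
    print(generate_passphrase())
```

The point of the design is that the randomness comes from `secrets`, not from the model: the LLM only has to emit the tool call, and the tool does the sampling.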
Why not do this…
Correct horse battery staple
Many password manager generators already do (use the “memorable” type).
pls don’t spread my password around like that
I’m kinda interested how many accounts you could log in to with those strings :D
Yes!
QuiltedNematoadNotepad486