88% of AI-Generated Passwords Can Be Cracked Within an Hour

Kaspersky research reveals that passwords from ChatGPT, DeepSeek, and Llama lack true randomness. The same prediction capability that makes LLMs useful makes them terrible at generating secure passwords.

Asking ChatGPT to generate a secure password feels clever. The AI produces something like X7#pL9@mK2$vR that looks impenetrable and passes every strength validator. There’s just one problem: it might already be sitting in an attacker’s dictionary file.

New research from Kaspersky security researcher Alexey Antonov tested 1,000 passwords generated by each of three major AI models. The results expose a fundamental flaw in how language models work that makes them unsuitable for security applications.

The Numbers

Antonov’s team used hashcat, a standard password recovery tool running on a modern graphics card, to crack the generated passwords. No exotic hardware, no supercomputer clusters. Just off-the-shelf equipment.
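The report doesn't publish the exact hashcat invocation, but a typical dictionary attack looks something like this (the filenames are placeholders; `-m 0` selects MD5, `-a 0` selects wordlist mode, and `best64.rule` is a mutation ruleset that ships with hashcat):

```shell
# Straight dictionary attack: try each wordlist entry against the hashes
hashcat -m 0 -a 0 hashes.txt wordlist.txt

# Same attack with rule-based mutations (l33t swaps, appended digits, etc.)
hashcat -m 0 -a 0 hashes.txt wordlist.txt -r rules/best64.rule
```

On a modern consumer GPU, hashcat can test billions of fast-hash candidates per second, which is why "within an hour" is a realistic budget for a wordlist plus mutations.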

| Model | Passwords Cracked Within 1 Hour |
| --- | --- |
| DeepSeek | 88% |
| Llama | 87% |
| ChatGPT | 33% |

ChatGPT performed notably better, but a third of its passwords falling in under an hour is still a security failure. These aren’t your account passwords - they’re the “strong” passwords that AI confidently generated when asked to create something secure.

Why LLMs Can’t Do Randomness

The core problem is architectural. Large language models work by predicting the most likely next token based on patterns in their training data. That’s exactly what makes them useful for writing, coding, and conversation.

It’s also the exact opposite of what secure password generation requires.

True cryptographic randomness needs uniform, unpredictable output with no statistical patterns. LLMs, by design, produce output that follows learned distributions. They’re prediction engines masquerading as random generators.

Separate research from AI security firm Irregular found the same issue: when they tested Claude with 50 password generation prompts, they got only 23 unique passwords. One password string appeared 10 times. The model kept returning to its “most likely” outputs.
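The collapse toward “most likely” outputs is easy to simulate. The sketch below is a toy model, not the actual Claude experiment: it draws 50 “passwords” from a Zipf-like skewed distribution over a hypothetical candidate pool and counts how many are unique.

```python
import random
from collections import Counter

# Hypothetical candidate pool with a Zipf-like skew toward early entries
# (real LLM output distributions are vastly larger, but similarly skewed)
candidates = [f"Pass{i}!x" for i in range(100)]
weights = [1.0 / (rank + 1) for rank in range(100)]

random.seed(0)  # fixed seed so the illustration is repeatable
samples = random.choices(candidates, weights=weights, k=50)

counts = Counter(samples)
print(f"unique passwords: {len(counts)} / 50")
print(f"most repeated: {counts.most_common(1)}")
```

With a skew this strong, the top few candidates soak up most of the draws, so far fewer than 50 unique strings come back - the same shape as Irregular's 23-out-of-50 result.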

Predictable Patterns

The statistical biases are specific and exploitable:

ChatGPT had a clear character preference problem. The letter ‘x’ appeared in 65% of generated passwords. The letter ‘p’ showed up in 26%. Both ‘l’ and ‘L’ appeared in roughly 20% of outputs.

DeepSeek defaulted to dictionary-word variations like B@n@n@7 - exactly the kind of pattern that brute-force tools check first.

Llama mimicked human password habits, consistently placing uppercase letters at the beginning and digits at the end - Password123-style structure with better character substitution, but just as predictable.
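Biases like these are trivial to measure. Given a sample of generated passwords (the list below is invented for illustration; Kaspersky's dataset is not public), a few lines of Python reveal how often each character shows up:

```python
from collections import Counter

# Hypothetical AI-generated passwords for illustration only
passwords = ["X7#pL9@mK2$vR", "xT4!pQx8&wL1z", "K9x@pR3#mX7!v", "x2$LpW8@qT5#n"]

# Fraction of passwords in which each character appears at least once
# (case-folded, mirroring the 'l'/'L' observation in the research)
prevalence = Counter()
for pw in passwords:
    prevalence.update(set(pw.lower()))

for char, count in prevalence.most_common(5):
    print(f"{char!r}: appears in {count / len(passwords):.0%} of passwords")
```

An attacker running this over a few thousand queries to the same model gets exactly the character-frequency table needed to order a cracking wordlist.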

These aren’t bugs. They’re features. The models are doing what they’re trained to do: produce probable outputs. For passwords, probable equals vulnerable.

The Attack Vector

Here’s how this becomes a real-world problem: attackers can query the same AI models you’re using and build comprehensive dictionaries of their output patterns. They don’t need your specific password. They need the statistical distribution of passwords that ChatGPT tends to generate.

A password that passes your strength validator is worthless if it’s already in an attacker’s wordlist. AI-generated passwords may look strong but exist in a predictable space that sophisticated attacks can enumerate.

What You Should Actually Use

The fix is straightforward: use tools designed for cryptographic randomness.

Password managers like Bitwarden, 1Password, or KeePass use cryptographically secure pseudorandom number generators (CSPRNGs) that incorporate real-world entropy. They produce genuinely unpredictable output with no statistical patterns for attackers to exploit.
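You don’t need a password manager to get this right in code, either. Python’s standard secrets module draws from the operating system’s CSPRNG - a minimal sketch, assuming a 16-character requirement over the full printable set:

```python
import secrets
import string

def generate_password(length: int = 16) -> str:
    """Generate a password using the OS's cryptographically secure RNG."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(generate_password())
```

Unlike an LLM, `secrets.choice` gives every alphabet character equal probability on every draw, so there is no frequency bias or favorite output for an attacker to learn.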

Passkeys eliminate passwords entirely where supported, using public-key cryptography tied to your device. No password to crack means no password to compromise.

Hardware security keys add another layer for critical accounts, providing authentication that can’t be phished or cracked offline.

If you’ve already generated passwords using AI chatbots, the safest approach is to rotate them - especially for important accounts. The passwords aren’t necessarily compromised, but they exist in a more predictable space than you probably intended.

The Broader Lesson

This research highlights a category error that’s becoming more common: assuming that AI can replace specialized tools because it’s good at everything else.

LLMs are impressive at text generation, coding assistance, and knowledge synthesis. But “generate something that looks random” and “generate something that is cryptographically random” are fundamentally different tasks. The former requires mimicking randomness. The latter requires actual randomness that language models, by their mathematical foundation, cannot produce.

When security is the goal, use security tools. When you need randomness, use random number generators. AI is many things, but a replacement for purpose-built cryptographic tools isn’t one of them.