92% of AI-Generated Codebases Have Critical Vulnerabilities

Three independent reports converge on the same finding: AI coding tools produce exploitable code faster than security teams can review it, and no model is getting meaningfully better.

The tech industry spent the first quarter of 2026 celebrating “vibe coding” — letting AI write your software while you sip coffee and approve pull requests. Three independent security reports published this month say the hangover has arrived.

Sherlock Forensics found that 92% of AI-generated codebases contain at least one critical vulnerability. ProjectDiscovery’s survey found 62% of security teams can’t keep up with the code volume. Veracode’s spring update shows security pass rates stuck at 55%, unchanged despite months of model improvements. And the Cloud Security Alliance documented CVEs attributed to AI-generated code rising from 6 in January to 35 in March, a near-sixfold increase; researchers estimate the true count is 5 to 10 times higher.

These aren’t alarmists with an agenda. These are penetration testers, security auditors, and researchers counting the holes.

What the Numbers Actually Show

Sherlock Forensics assessed dozens of web applications, APIs, SaaS platforms, and internal tools built with Copilot, ChatGPT, Cursor, and Claude between January and April 2026. They mapped findings against the OWASP Top 10 and MITRE ATT&CK frameworks using both manual testing and automated scanning.

The results by tool:

Tool              Avg Exploitable Findings    Critical Vuln Rate
GitHub Copilot    9.1                         94%
ChatGPT           8.7                         91%
Cursor            7.9                         89%
Claude            6.4                         82%

Every tool produced exploitable code. Claude performed least badly, but “82% of codebases have critical vulnerabilities” isn’t something to celebrate.

The most common problems: 91% of projects lacked proper logging. 88% had no rate limiting. 78% stored secrets in plaintext or committed .env files. 65% had broken authentication. 54% were vulnerable to injection attacks.
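Most of those findings are solved problems. The secrets one, for instance, comes down to a handful of lines. A minimal sketch in TypeScript for Node.js (the names are illustrative): read every secret from the environment at startup, fail fast if one is missing, and keep .env out of version control entirely.

```ts
// secrets.ts: a sketch of environment-based secrets, not a full solution.
// Secret values are injected at deploy time and never enter the repo.

function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    // Fail at startup rather than limping along until the first request.
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Illustrative names; add ".env" to .gitignore so local copies stay local.
export const DATABASE_URL = requireEnv("DATABASE_URL");
export const PAYMENT_API_KEY = requireEnv("PAYMENT_API_KEY");
```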

The median time from deployment to first exploit attempt? Eighteen days.

Security Teams Are Drowning

ProjectDiscovery surveyed 200 cybersecurity practitioners at mid-to-large enterprises and found that the bottleneck isn’t the code itself: it’s the speed at which it arrives. One hundred percent of respondents reported increased engineering output over the past year. Nearly half attributed most of that acceleration to AI coding tools.

The problem: AI-assisted developers produce commits at three to four times the normal rate, but introduce security findings at ten times the rate, according to the Cloud Security Alliance’s analysis of Fortune 50 enterprises. Security teams are getting buried under a deluge of code they can’t review fast enough.

Their top concerns: exposure of corporate secrets (78%), supply-chain risks from unreliable dependencies (73%), and business logic vulnerabilities (72%) — the kind of design flaws that let attackers abuse legitimate functions rather than exploit traditional bugs.
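Business logic flaws are the hardest of the three to scan for, because nothing is technically broken. A hypothetical TypeScript sketch of the pattern (every name here is made up): a checkout handler that trusts a client-supplied price has no bug for a scanner to flag, only a missing server-side check.

```ts
// Hypothetical checkout logic illustrating a business logic flaw.

interface CheckoutRequest {
  itemId: string;
  quantity: number;
  unitPrice: number; // supplied by the client's shopping cart UI
}

// Vulnerable: honors whatever price the request claims. An attacker edits
// the JSON body and buys a $900 item for a cent, using only legitimate calls.
function totalInsecure(req: CheckoutRequest): number {
  return req.unitPrice * req.quantity;
}

// Safer: the server looks up the authoritative price and ignores the
// client's claim. (priceCatalog stands in for a database lookup.)
const priceCatalog: Record<string, number> = { "sku-123": 900.0 };

function totalSecure(req: CheckoutRequest): number {
  const unitPrice = priceCatalog[req.itemId];
  if (unitPrice === undefined) {
    throw new Error(`Unknown item: ${req.itemId}`);
  }
  return unitPrice * req.quantity;
}
```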

Escape.tech’s analysis of 1,400 vibe-coded applications found 2,038 highly critical vulnerabilities, 400 or more leaked secrets, and 175 instances of exposed sensitive data.

Models Aren’t Getting Better at Security

Veracode’s spring 2026 update tested whether newer, more capable models had closed the security gap. They hadn’t. The security pass rate held flat at 55%, identical to previous assessment periods. OpenAI’s GPT-5.1 and GPT-5.2 performed within the margin of error of GPT-4.1 on security benchmarks.

This shouldn’t surprise anyone who understands how these models are trained. AI code assistants optimize for functionality, speed, and developer satisfaction. Security is a constraint that conflicts with those goals. When given a choice between a secure and insecure method, AI models chose the insecure option 45% of the time in Veracode’s tests, with Java hitting a failure rate above 70%.
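The canonical version of that choice is SQL assembled by string concatenation versus a parameterized query. A sketch in TypeScript using the node-postgres driver, offered as an illustration rather than anything from Veracode’s test suite:

```ts
import { Pool } from "pg"; // node-postgres; any parameterized driver works

const pool = new Pool(); // connection settings come from PG* env variables

// Insecure: the pattern the training data contains most often, so the
// pattern models reproduce most often. User input becomes executable SQL.
async function findUserInsecure(email: string) {
  // email = "' OR '1'='1" returns every row in the table.
  return pool.query(`SELECT * FROM users WHERE email = '${email}'`);
}

// Secure: the driver sends the value separately from the query text,
// so the input can never be interpreted as SQL.
async function findUserSecure(email: string) {
  return pool.query("SELECT * FROM users WHERE email = $1", [email]);
}
```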

Georgia Tech’s Vibe Security Radar, which tracks actual CVEs attributed to AI-generated code in production repositories, documented a consistent acceleration through Q1 2026. The 74 confirmed cases likely represent a fraction of the real number — researchers estimate the true count at 400 to 700 exploitable flaws in observable repositories alone.

Why This Should Worry You

Thirty-four percent of Node.js projects assessed by Sherlock contained hallucinated package dependencies — packages the AI referenced that don’t exist. This is a supply-chain attack waiting to happen. An attacker registers the hallucinated package name, uploads malicious code, and every developer who installs the AI-generated project gets compromised. It’s already happened with other package managers.
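Catching hallucinated dependencies before install is cheap, which makes the 34% figure harder to excuse. A rough sketch of a pre-install check in TypeScript (Node 18+ with ES modules; the file name and messages are illustrative): look up every declared dependency on the public npm registry and flag any name that has never been published.

```ts
// check-deps.ts: flag dependencies that do not exist on the npm registry.
import { readFileSync } from "node:fs";

const pkg = JSON.parse(readFileSync("package.json", "utf8"));
const deps = Object.keys({ ...pkg.dependencies, ...pkg.devDependencies });

for (const name of deps) {
  // Scoped packages need the slash encoded in the registry URL.
  const res = await fetch(`https://registry.npmjs.org/${name.replace("/", "%2f")}`);
  if (res.status === 404) {
    // A 404 means no one has ever published this name: a hallucination tell,
    // and an open invitation for an attacker to register it first.
    console.warn(`Not on the registry: ${name}. Do not install it.`);
  }
}
```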

Only 12% of the applications Sherlock tested implemented rate limiting on authentication endpoints. That means the other 88% have login pages where an attacker can try passwords at machine speed without being blocked.
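For contrast, here is roughly what the missing control looks like. A minimal sketch using Express and the express-rate-limit middleware; the thresholds are illustrative, and any equivalent limiter serves the same purpose.

```ts
import express from "express";
import rateLimit from "express-rate-limit";

const app = express();
app.use(express.json());

// At most 5 login attempts per client per 15 minutes. The exact numbers
// matter less than the fact that a limit exists at all.
const loginLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  limit: 5,
  standardHeaders: true, // send RateLimit-* headers so clients can back off
});

app.post("/login", loginLimiter, (req, res) => {
  // ...credential verification would go here...
  res.status(401).json({ error: "invalid credentials" });
});

app.listen(3000);
```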

The combination of speed and carelessness creates a new kind of technical debt. It’s not spaghetti code that’s hard to maintain — it’s code that actively undermines the security of every system it touches, deployed faster than anyone can check it.

What’s Being Done (And Why It’s Not Enough)

The standard response from AI coding tool vendors is that their models are getting smarter and will eventually write secure code by default. Veracode’s flat-lined pass rates over multiple model generations say otherwise. The problem is structural: models trained to predict the most likely next token will reproduce the patterns they’ve seen most often, and the internet is full of insecure code examples.

Some organizations are adding AI-specific security scanning to their CI/CD pipelines. That helps, but only if the pipeline actually blocks deployment — and with the pressure to ship faster, override buttons get pressed.
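A gate only counts if it can fail the build. One minimal sketch, in TypeScript wrapping npm’s own audit command; running the command directly in CI works just as well, and the wrapper only adds a clear failure message.

```ts
// audit-gate.ts: run before deploy; a nonzero exit fails the pipeline.
import { execSync } from "node:child_process";

try {
  // npm audit exits nonzero when advisories at or above the given
  // severity exist anywhere in the dependency tree.
  execSync("npm audit --audit-level=high", { stdio: "inherit" });
} catch {
  console.error("Deploy blocked: unresolved high or critical advisories.");
  process.exit(1);
}
```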

The harder truth is that “vibe coding” was always a euphemism for skipping the parts of software development that keep systems safe. Code review, threat modeling, security testing — these aren’t bureaucratic overhead. They’re the reason most production software doesn’t immediately get hacked. Remove the human who asks “what happens if someone puts a SQL injection in this form field?” and you get exactly what these reports describe: applications that work great right up until someone tries to break them.