Mozilla and Anthropic announced last week that Claude Opus 4.6 discovered 22 security vulnerabilities in Firefox in just two weeks of testing - including 14 classified as high severity. The AI identified logic errors that decades of fuzzing, static analysis, and security reviews had never caught.
All the vulnerabilities have been patched in Firefox 148. But the implications extend far beyond one browser: if an AI can find this many serious bugs in one of the most scrutinized open-source projects in the world, how many undiscovered vulnerabilities lurk in less examined software?
The Numbers
Anthropic’s Frontier Red Team deployed Claude Opus 4.6 against Firefox’s codebase over a two-week period. The model scanned approximately 6,000 C++ files and submitted 112 unique bug reports to Mozilla.
Of those reports:
- 22 vulnerabilities received CVE designations
- 14 were classified as high severity
- 7 were rated moderate
- 1 was rated low
- 90 additional bugs were reported, most of which have since been fixed
According to Anthropic, the 14 high-severity bugs represent nearly a fifth of all high-severity vulnerabilities patched in Firefox during 2025. In two weeks, an AI found almost 20% of a year’s worth of high-severity bugs.
The Critical Find
The most severe vulnerability Claude discovered was CVE-2026-2796, a just-in-time (JIT) miscompilation bug in the WebAssembly component of Firefox’s JavaScript engine. It carries a CVSS score of 9.8 out of 10 - about as bad as browser vulnerabilities get.
Claude didn’t just identify the bug. It wrote a working exploit demonstrating the vulnerability could be used for code execution. The process cost approximately $4,000 in API credits and required hundreds of exploitation attempts with iterative feedback.
The economics are notable: finding vulnerabilities turned out to be cheaper than creating exploits for them. That asymmetry has implications for both defenders and attackers.
What Traditional Tools Missed
The collaboration revealed a significant gap in existing security tooling. Mozilla noted that Claude identified “entire classes of errors that conventional automated testing methods like fuzzing had missed despite decades of use.”
Fuzzing - the practice of feeding random or semi-random inputs to software to trigger crashes - excels at finding memory safety issues and edge cases. But it struggles with logic errors: bugs where the code does exactly what it’s written to do, but that logic is wrong.
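The distinction can be made concrete with a minimal sketch (this is an illustrative toy, not Firefox code or Anthropic's tooling; the `in_sandbox` function and the fuzzer are hypothetical):

```python
import random

def in_sandbox(index: int, limit: int) -> bool:
    """Checks whether an index falls inside a permitted region.
    Logic bug: the comparison should be strict (<), so index == limit
    is wrongly accepted -- one element past the allowed range."""
    return index <= limit  # BUG: should be index < limit

def crash_oriented_fuzzer(rounds: int = 10_000) -> int:
    """Feed random inputs and count crashes -- the classic fuzzing signal."""
    crashes = 0
    for _ in range(rounds):
        try:
            in_sandbox(random.randint(-1000, 1000), random.randint(0, 1000))
        except Exception:
            crashes += 1
    return crashes

print(crash_oriented_fuzzer())  # 0 -- the code never crashes
print(in_sandbox(10, 10))       # True -- the out-of-bounds case slips through
```

The fuzzer reports nothing because the code executes exactly as written on every input; only reasoning about what the comparison *should* be reveals the bug. That is the class of error the article describes fuzzing missing.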
Claude found logic errors that escaped notice because they don’t cause obvious crashes. In one case, the model identified a use-after-free bug in the JavaScript engine within 20 minutes of analysis.
Mozilla engineers emphasized this point: Firefox has been subjected to extensive security scrutiny for over two decades. If Claude can still find high-severity bugs in Firefox, it suggests “substantial backlogs of discoverable bugs exist across widely deployed software.”
How the Testing Worked
Anthropic’s approach differed from typical AI-assisted security research in a key way: Claude provided reproducible test cases alongside every bug report. This let Mozilla engineers quickly verify and reproduce each issue rather than first having to determine whether the AI was hallucinating.
The testing started in Firefox’s JavaScript engine - historically a rich source of browser vulnerabilities - then expanded to other portions of the codebase. Mozilla engineers began landing fixes within hours of receiving reports.
The methodology combined Claude’s ability to reason about code logic with a task verifier that provided real-time feedback on exploitation attempts. This iterative loop let the model refine its understanding of which approaches worked and which didn’t.
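In outline, that loop looks something like the following sketch (a simplified illustration under assumptions - the verifier, the candidate payloads, and the stop condition are all hypothetical stand-ins, not Anthropic's actual harness):

```python
from dataclasses import dataclass

@dataclass
class Feedback:
    success: bool
    detail: str

def verify_attempt(payload: str) -> Feedback:
    """Hypothetical task verifier: runs a candidate test case and reports
    whether it triggered the target behavior. Stubbed for illustration."""
    if "trigger" in payload:
        return Feedback(True, "behavior reproduced")
    return Feedback(False, "no observable effect")

def refinement_loop(candidates, max_attempts: int = 100):
    """Iterate: propose a candidate, collect verifier feedback, and stop
    once an attempt reproduces the bug. The feedback history is what lets
    the model learn which approaches work and which don't."""
    history = []
    for _, payload in zip(range(max_attempts), candidates):
        feedback = verify_attempt(payload)
        history.append((payload, feedback))
        if feedback.success:
            return payload, history
    return None, history

winner, log = refinement_loop(["benign input", "random noise", "trigger case"])
```

In the real setting the "candidates" would come from the model's reasoning about the code, and the verifier from actually executing exploitation attempts - but the propose-verify-refine shape is the same.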
Why This Matters
AI finding security vulnerabilities isn’t new. Bug bounty platforms have seen AI-assisted submissions for years, and researchers have demonstrated language models can identify certain vulnerability patterns. What’s notable here is the scale and significance.
Firefox is not some obscure project with minimal security attention. It’s one of the most battle-tested open-source codebases in existence. Mozilla has dedicated security engineers, a bug bounty program, and decades of accumulated security tooling. Finding 14 high-severity bugs in two weeks suggests AI-assisted analysis has matured from interesting research to practical capability.
Mozilla is now integrating AI-powered analysis into its internal security workflows. That decision reflects a practical assessment: traditional tools, no matter how sophisticated, have blind spots that AI can cover.
The Double Edge
The same capability that helps Mozilla find and fix bugs before attackers exploit them also helps attackers find and exploit bugs before Mozilla patches them.
The cost structure Anthropic documented - $4,000 in API credits for vulnerability discovery and exploitation - is within reach of individual researchers, organized crime groups, and nation-state actors. The techniques aren’t secret; Anthropic published details of their methodology.
This creates a new dynamic in the security ecosystem. Defenders need to assume sophisticated attackers have access to similar AI-assisted analysis capabilities. The incidental protection that complex codebases once provided - a kind of security through obscurity - is eroding.
The good news is that defenders can use the same tools. Mozilla’s quick patching of all discovered vulnerabilities before Firefox 148’s release demonstrates that AI-assisted security can work for defense. The question is whether defenders will adopt these capabilities as quickly as attackers.
What Comes Next
Mozilla plans to expand its collaboration with Anthropic across the broader browser codebase. Other major open-source projects would be wise to consider similar approaches.
The findings also raise questions for closed-source software. If two weeks of AI analysis can find 14 high-severity bugs in heavily scrutinized open-source code, what might it find in proprietary codebases that receive less external security attention?
For users, the immediate implication is straightforward: keep Firefox updated. The vulnerabilities Claude found are patched in Firefox 148 and later. For the broader software ecosystem, the message is more complex: the AI security research era has arrived, and both attackers and defenders are adapting.
The Bottom Line
Claude found more high-severity Firefox bugs in two weeks than most human security researchers find in a career. The vulnerabilities were real, exploitable, and had escaped decades of traditional security analysis. AI-assisted security research works. The question now is who uses it first - and for what purpose.