Anthropic Accidentally Leaks 'Claude Mythos': A Step-Change AI Model With 'Unprecedented Cybersecurity Risks'

A misconfigured CMS exposed 3,000 internal documents revealing Anthropic's most powerful model yet, one the company says could 'exploit vulnerabilities in ways that far outpace the efforts of defenders.'

Anthropic’s content management system had a default setting: uploaded files were public unless explicitly marked private. Nobody caught it. Now the world knows about Claude Mythos.

A Fortune investigation published March 26 revealed that approximately 3,000 internal documents—draft blog posts, images, PDFs—sat in a publicly searchable database for anyone to find. Among them: detailed descriptions of an unreleased AI model that Anthropic describes as “a step change” in capabilities.

The irony writes itself. The leaked documents warn that Mythos poses “unprecedented cybersecurity risks.” The leak itself happened because of a basic security misconfiguration.

What Is Claude Mythos?

According to the leaked draft materials, Claude Mythos (internally also called “Capybara”) represents a new tier above Anthropic’s existing Opus line. The company describes it as “larger and more intelligent than our Opus models—which were, until now, our most powerful.”

Key claims from the leaked documents:

  • Dramatically higher scores on software coding, academic reasoning, and cybersecurity tests compared to Claude Opus 4.6
  • “Currently far ahead of any other AI model in cyber capabilities”
  • Ability to “surface previously unknown vulnerabilities in production codebases”
  • “Very expensive for us to serve, and will be very expensive for our customers to use”

Anthropic confirmed the model exists. In a statement to Fortune, the company said: “We’re developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity. Given the strength of its capabilities, we’re being deliberate about how we release it.”

The Cybersecurity Paradox

Here’s where it gets uncomfortable. The leaked draft blog post explicitly warns that Mythos “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”

In other words: Anthropic built a model so good at finding security holes that they’re worried about what happens when similar models become widespread. Their own documents describe this as a dual-use capability—helpful for defenders, catastrophic in the wrong hands.

The company plans to give cyber defenders first access, reasoning that getting the model to security teams early might help offset the inevitable offensive applications. Whether that strategy works depends on how quickly competitors develop similar capabilities.

How the Leak Happened

The breach traces to what Anthropic called “a human error” in CMS configuration. According to Fortune, the system’s default setting automatically made uploaded files public unless someone manually toggled them to private.
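
Anthropic hasn’t published the offending configuration, so what follows is a minimal sketch of the bug class rather than the company’s actual code; the AssetStore, upload, and public names here are hypothetical. The pattern: an upload API whose visibility flag defaults to public, leaving every caller to remember to opt out.

    # Hypothetical sketch of the misconfiguration pattern; not Anthropic's code.
    # The bug class: an upload API whose default visibility is public, so a
    # file leaks unless every caller remembers to opt out.
    from dataclasses import dataclass

    @dataclass
    class Asset:
        path: str
        public: bool  # visibility is decided once, at upload time

    class AssetStore:
        def __init__(self) -> None:
            self._assets: list[Asset] = []

        # Insecure default: forgetting the flag makes the file world-readable.
        def upload(self, path: str, public: bool = True) -> Asset:
            asset = Asset(path=path, public=public)
            self._assets.append(asset)
            return asset

        def publicly_visible(self) -> list[Asset]:
            return [a for a in self._assets if a.public]

    store = AssetStore()
    store.upload("drafts/announcement.pdf")  # no flag passed: exposed by default
    print([a.path for a in store.publicly_visible()])

Flipping that one default to public=False turns publication into an explicit act instead of an accident, which is why “secure by default” is the standard remedy for this class of failure.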

The result: draft announcements, internal communications, and details about an exclusive CEO retreat in the UK all sat publicly accessible. Security researchers and journalists found the treasure trove by searching Anthropic’s asset server.

Anthropic cut off public access Thursday evening after Fortune reached out. By then, the documents had already been indexed and archived.

Market Reaction

Cybersecurity stocks dropped following the disclosure. The logic: if AI models can find and exploit vulnerabilities faster than defenders can patch them, the entire security industry faces a paradigm shift.

The fear isn’t unfounded. Anthropic has previously disclosed that hacking groups—including Chinese state-sponsored actors—attempted to use Claude for real-world cyberattacks. In one documented case, attackers used Claude Code to infiltrate roughly 30 organizations before Anthropic detected the campaign.

What This Means

Three takeaways:

The capability gap is widening. Anthropic explicitly states that Mythos is “far ahead” of other AI models in cybersecurity. If true, that puts pressure on OpenAI, Google, and others to match those capabilities, accelerating a race on a dual-use technology.

Careful release strategies are being tested. Rather than launch publicly, Anthropic is working with “a small group of early access customers” focused on defensive applications. This is a meaningful departure from typical AI release patterns, where companies race to market.

Even AI safety companies have security gaps. Anthropic positions itself as the “safety-focused” AI lab. A misconfigured CMS exposing 3,000 documents, including warnings about its own model’s risks, undercuts that narrative.

What You Can Do

If you’re in security:

  • Watch for Mythos capabilities appearing in offensive tools
  • Expect vulnerability discovery timelines to compress
  • Consider how AI-assisted attackers might change your threat model

If you’re following AI:

  • Treat “unprecedented” claims with skepticism until independent testing confirms them
  • Note that even careful companies have basic security failures
  • Expect similar capability jumps from other labs within 12-18 months

No public release date for Mythos has been announced. Given the circumstances of its unveiling, Anthropic may not be in a rush.