On March 26, 2026, security researchers Roy Paz (LayerX Security) and Alexandre Pauwels (University of Cambridge) discovered approximately 3,000 unpublished assets in an unsecured, publicly accessible data cache belonging to Anthropic. Among those assets: draft blog posts and internal documents describing Claude Mythos - a new AI model that Anthropic calls "the most capable we've built to date" and "a step change" in AI performance.
On April 7, 2026, Anthropic officially acknowledged Mythos by launching Project Glasswing - a cybersecurity consortium giving Mythos Preview access to Amazon, Apple, Microsoft, Google, NVIDIA, and 40+ other organizations for defensive security work. This is not a typical model launch. Anthropic is so concerned about Mythos's capabilities that it is withholding public release until safeguards are in place.
Below, we break down every confirmed detail: the benchmarks, the cybersecurity capabilities, the consortium, and what this means for the AI model leaderboard.
How the Leak Happened
The leak was the result of a configuration error in Anthropic's content management system. An unsecured data cache containing draft blog posts, internal documents, and other unpublished content was left publicly accessible on the internet.
Security researchers Paz and Pauwels found approximately 3,000 unpublished assets. The materials described a new model representing "a step change" in capability, operating in a new tier above the current Opus models - internally designated "Capybara".
After Fortune contacted Anthropic on Thursday evening, March 26, the data cache was removed from public access. Anthropic acknowledged the leak, describing the disclosed materials as "early drafts of content considered for publication" resulting from configuration errors. The company confirmed the model's existence and that it was being trialed by "early access customers."
Benchmark Performance: The Numbers
The leaked documents and subsequent reporting reveal benchmark scores that, if they hold at public release, would place Claude Mythos significantly ahead of every publicly available model. The improvements over Claude Opus 4.6 are not incremental - they are generational.
The standout result is SWE-bench Pro - the hardest tier of the SWE-bench evaluation suite, designed to test real-world software engineering ability. Mythos scores 77.8% against Opus 4.6's 53.4%, a jump of +24.4 points and one of the largest single-generation improvements reported on this benchmark.
SWE-bench Multimodal shows the biggest relative gain: Mythos more than doubles Opus 4.6's score (59.0% vs 27.1%), suggesting a major leap in the model's ability to reason about visual context alongside code - diagrams, screenshots, UI mockups.
On BrowseComp (multi-step web research), Mythos achieves 86.9% while using 4.9x fewer tokens than Opus 4.6 to reach its score - indicating not just better results, but dramatically more efficient reasoning.
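The three comparisons above are simple arithmetic on the leaked figures. As a quick sanity check, the deltas can be recomputed directly; the helper names below are ours, and all numbers come from this report:

```python
# Sanity-check of the leaked benchmark deltas quoted above.
# All figures come from the reporting; helper names are illustrative.

def delta_points(old: float, new: float) -> float:
    """Absolute improvement in percentage points."""
    return round(new - old, 1)

def ratio(old: float, new: float) -> float:
    """Relative improvement as a multiplier."""
    return round(new / old, 2)

# SWE-bench Pro: 53.4% -> 77.8%
assert delta_points(53.4, 77.8) == 24.4   # the "+24.4 points" quoted above

# SWE-bench Multimodal: 27.1% -> 59.0%
assert ratio(27.1, 59.0) > 2.0            # "more than doubles" Opus 4.6

print(delta_points(53.4, 77.8), ratio(27.1, 59.0))
```

Note that the BrowseComp claim is a different kind of comparison: 4.9x is a token-efficiency ratio at a given score, not a score delta.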
The Capybara Tier: A New Model Class
Leaked documents describe Mythos as belonging to a new tier called "Capybara", sitting above the current Opus tier in Anthropic's model hierarchy. This is not simply "Opus 5" or "Claude 5" - though the community has used those labels. The internal documents suggest Capybara is a distinct capability class.
What we do not know (as of April 7, 2026):
Anthropic described the model as "very expensive for us to serve, and will be very expensive for our customers to use." This confirms Capybara will sit at a premium price point. For current Anthropic pricing, see our Anthropic API pricing page.
Cybersecurity: Why Anthropic Is Withholding Public Release
The most consequential aspect of Claude Mythos is not its benchmark scores - it is what the model can do with code in adversarial contexts. Anthropic CEO Dario Amodei stated in a video released alongside the Project Glasswing announcement:
"Claude Mythos Preview is a particularly big jump. We haven't trained it specifically to be good at cyber. We trained it to be good at code, but as a side effect of being good at code, it's also good at cyber." - Dario Amodei, CEO of Anthropic
Anthropic claims that over the past few weeks of internal testing, Mythos Preview has identified thousands of zero-day vulnerabilities, many of them critical - including bugs that are one to two decades old in heavily scrutinized codebases. Unlike previous models that could find vulnerabilities, Mythos can also write the exploits to accompany them.
Logan Graham, Anthropic's frontier red team lead, described the model's capabilities: "We've seen Mythos Preview accomplish things that a senior security researcher would be able to accomplish."
This is why Anthropic is withholding public release. The same capabilities that make Mythos exceptionally useful for defensive security also make it a potent offensive tool. Amodei added: "More powerful models are going to come from us and from others, and so we do need a plan to respond to this."
Project Glasswing: The Defensive Consortium
Rather than release Mythos publicly, Anthropic launched Project Glasswing on April 7, 2026 - a consortium of technology, cybersecurity, critical infrastructure, and financial organizations that will use Mythos Preview exclusively for defensive security work.
Anthropic's financial commitments to the program:
The program operates on a coordinated vulnerability disclosure model: Glasswing partners use Mythos Preview to scan both first-party and open-source software systems, and developers are given time to patch discovered vulnerabilities before any public disclosure. This mirrors the responsible disclosure framework that has governed the security research community for decades.
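The embargo logic behind coordinated disclosure can be sketched as a tiny state model. Everything below is illustrative - the class, the placeholder identifier, and the 90-day default window mirror common industry practice (e.g. typical vendor disclosure deadlines), not any detail Anthropic has published about Glasswing:

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Illustrative sketch of a coordinated-disclosure window: maintainers
# get a patch window before any public disclosure. The 90-day default
# is an assumption borrowed from common practice, not from Glasswing.

@dataclass
class Finding:
    identifier: str          # hypothetical tracking ID
    reported_on: date
    patched: bool = False
    embargo_days: int = 90

    def disclosure_date(self) -> date:
        """Earliest date the finding may go public if left unpatched."""
        return self.reported_on + timedelta(days=self.embargo_days)

    def may_publish(self, today: date) -> bool:
        # Publish early once patched; otherwise wait out the embargo.
        return self.patched or today >= self.disclosure_date()

f = Finding("FINDING-0001", reported_on=date(2026, 4, 7))
print(f.may_publish(date(2026, 5, 1)))   # unpatched, inside embargo
f.patched = True
print(f.may_publish(date(2026, 5, 1)))   # patched, may disclose
```

The design choice worth noting is that patching, not the calendar, is the preferred trigger for disclosure; the deadline exists only to keep pressure on unresponsive maintainers.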
Anthropic has been privately warning top government officials that Mythos makes large-scale cyberattacks significantly more likely as similar models proliferate. The Glasswing program is Anthropic's attempt to give defenders a head start.
Anthropic on the Leaderboard Today
Mythos is not yet available through public APIs or OpenRouter, so it does not appear in our live rankings. Here is where Anthropic's current models stand:
If Mythos's leaked benchmarks hold, it would likely claim the #1 position on our coding leaderboard by a significant margin. The current Opus 4.6 sits at rank #6 with a composite score of 90/100. We will add Mythos to the leaderboard the moment it becomes available through a public API. See the full provider breakdown at Anthropic provider page.
Frontier Model Landscape
To put the Mythos benchmarks in context, here are the current top models in our coding rankings. The SWE-bench Pro scores reported for Mythos (77.8%) would surpass every model in this list:
The competitive landscape for frontier models has intensified in 2026. OpenAI released GPT-5.3-Codex in February. Google's Gemini models continue to improve. But Mythos's leaked numbers suggest Anthropic may have opened a meaningful capability gap - particularly in coding and agentic tasks. For full rankings across all providers, visit our best coding models leaderboard.
Timeline of Events
What Happens Next
Anthropic has not confirmed a public release date for Claude Mythos. Based on the available information, the staged rollout is expected to follow this sequence:
We will update our live rankings and this report the moment new information becomes available. Follow our AI news feed for real-time updates, or check the release timeline for the latest model launches.
Key Takeaways
Sources
This report is compiled from the following verified sources. All benchmark numbers and quotes are attributed to their original sources. We have not independently verified benchmark claims.
Related Reports
The competitive landscape Mythos is entering - who leads, what separates them.
Token pricing trends - and why Capybara tier pricing matters for the market.
Context window trends - a key unknown for Mythos that will affect its use cases.
Current pricing for all Claude models - the baseline that Capybara tier will exceed.