Language Model by Anthropic - Detected via data leak + official announcement
Claude Mythos is Anthropic's most powerful AI model to date, belonging to a new "Capybara" tier above Opus. It was discovered through a data leak in March 2026 and officially previewed on April 7, 2026 via Project Glasswing. Access is currently restricted to defensive cybersecurity partners.
| Benchmark | Mythos | Opus 4.6 | Delta |
|---|---|---|---|
| SWE-bench Pro | 77.8% | 53.4% | +24.4 |
| SWE-bench Verified | 93.9% | 80.8% | +13.1 |
| SWE-bench Multilingual | 87.3% | 77.8% | +9.5 |
| GPQA Diamond | 94.6% | 91.3% | +3.3 |
| Humanity's Last Exam (no tools) | 56.8% | 40% | +16.8 |
| Humanity's Last Exam (with tools) | 64.7% | 53.1% | +11.6 |
| Terminal-Bench 2.0 | 82% | 65.4% | +16.6 |
| USAMO 2026 | 97.6% | 42.3% | +55.3 |
| MMMLU | 92.7% | 91.1% | +1.6 |
| CyberGym | 83.1% | 66.6% | +16.5 |
| OSWorld | 79.6% | 72.7% | +6.9 |
These scores come from leaked documents and have not been independently verified. The Delta column is in percentage points; the largest improvements are on USAMO 2026 (+55.3) and SWE-bench Pro (+24.4).
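The Delta column can be recomputed directly from the two score columns. The following is a minimal sketch, assuming the scores are copied by hand from the table above; the names `SCORES` and `ranked_deltas` are illustrative and do not come from the source.

```python
# Scores copied from the table above as (Mythos, Opus 4.6), in percent.
# These are the unverified leaked figures, not independent measurements.
SCORES = {
    "SWE-bench Pro": (77.8, 53.4),
    "SWE-bench Verified": (93.9, 80.8),
    "SWE-bench Multilingual": (87.3, 77.8),
    "GPQA Diamond": (94.6, 91.3),
    "Humanity's Last Exam (no tools)": (56.8, 40.0),
    "Humanity's Last Exam (with tools)": (64.7, 53.1),
    "Terminal-Bench 2.0": (82.0, 65.4),
    "USAMO 2026": (97.6, 42.3),
    "MMMLU": (92.7, 91.1),
    "CyberGym": (83.1, 66.6),
    "OSWorld": (79.6, 72.7),
}


def ranked_deltas(scores):
    """Return (benchmark, delta) pairs, largest improvement first.

    Deltas are in percentage points, rounded to one decimal place
    to match the precision of the table.
    """
    rows = [
        (name, round(mythos - opus, 1))
        for name, (mythos, opus) in scores.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, delta in ranked_deltas(SCORES):
        print(f"{name}: +{delta:.1f} pp")
```

Sorting by delta rather than by raw score makes it easier to see where the claimed gains concentrate: math and agentic coding lead, while MMMLU and GPQA Diamond are already close to saturated.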
The following cybersecurity capabilities are currently restricted to vetted defensive cybersecurity partners through Project Glasswing:
Found thousands of zero-day vulnerabilities in initial testing, including bugs that are one to two decades old
Generates working exploit code from vulnerability descriptions
Autonomously builds multi-step attack chains across complex systems
Analyzes compiled binaries without access to source code
Performs end-to-end penetration testing with minimal human guidance
Suggests patches and mitigations for discovered vulnerabilities
Anthropic assembled a consortium of 40+ organizations (cybersecurity firms, government agencies, and research labs) and committed up to $100M in API credits and $4M in direct funding.
Security researchers discover ~3,000 unpublished Anthropic assets describing Mythos. Fortune breaks the story.
Anthropic confirms Mythos exists, describes it as a "step change in capabilities" in an internal testing phase.
U.S. officials express concern about offensive cyber capabilities. Axios reports government discussions about AI model safety.
Anthropic announces Project Glasswing partnership framework with 40+ organizations.
Mythos Preview officially launches via Project Glasswing. Restricted to vetted defensive cybersecurity partners.
Public API release: no timeline announced. Pricing is expected to sit in the Capybara tier, above Opus.
Vetted cybersecurity partners only
Broader security orgs and research institutions
Selected API customers may gain access
Full public API access and OpenRouter integration
Once Claude Mythos launches on a public API, we will automatically add it to the leaderboard with full scoring.
Security researchers Roy Paz (LayerX Security) and Alexandre Pauwels (University of Cambridge) found approximately 3,000 unpublished Anthropic assets in an unsecured data cache on March 26, 2026. The materials described Mythos as a new model in a tier above Opus, internally codenamed "Capybara." Anthropic confirmed its existence after Fortune reported the leak.
Mythos is not yet publicly available. It is currently restricted to Project Glasswing partners for cybersecurity work. Once Anthropic makes Mythos available through its public API, third-party platforms like OpenRouter are expected to add it, and we will update our leaderboard automatically when this happens. No timeline has been announced.
Specific token pricing has not been announced. In leaked documents, Anthropic described Mythos as "very expensive for us to serve." It sits in a new tier above Opus 4.6, which currently costs $5 per million input tokens and $25 per million output tokens. Expect Capybara-tier pricing to be significantly higher than Opus pricing.
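For budgeting, the Opus 4.6 prices quoted above give a floor for what a request might cost. The sketch below is only a rough illustration: the function name `request_cost` and the `multiplier` parameter are hypothetical, and any multiplier above 1.0 is a placeholder assumption, since no Mythos pricing has been announced.

```python
# Opus 4.6 prices quoted in the text, in USD per million tokens.
OPUS_INPUT_PER_M = 5.00
OPUS_OUTPUT_PER_M = 25.00
TOKENS_PER_MILLION = 1_000_000


def request_cost(
    input_tokens,
    output_tokens,
    input_per_m=OPUS_INPUT_PER_M,
    output_per_m=OPUS_OUTPUT_PER_M,
    multiplier=1.0,
):
    """Estimate the USD cost of one request.

    `multiplier` is a hypothetical scaling factor for a higher tier.
    The default of 1.0 reproduces Opus 4.6 pricing; other values are
    guesses for what-if planning, not announced prices.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    base = (
        input_tokens * input_per_m + output_tokens * output_per_m
    ) / TOKENS_PER_MILLION
    return base * multiplier


if __name__ == "__main__":
    # 200k input tokens and 50k output tokens at Opus 4.6 prices.
    print(f"Opus 4.6 baseline: ${request_cost(200_000, 50_000):.2f}")
    # Same request under an assumed (not announced) 2x tier multiplier.
    what_if = request_cost(200_000, 50_000, multiplier=2.0)
    print(f"Hypothetical 2x tier: ${what_if:.2f}")
```

At Opus 4.6 prices, a request with 200,000 input tokens and 50,000 output tokens costs $2.25. Because output tokens cost five times as much as input tokens, long generations dominate the bill, and that would remain true under any uniform tier multiplier.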
According to Anthropic, Mythos Preview found thousands of zero-day vulnerabilities in initial testing, including critical bugs that are one to two decades old. It can discover vulnerabilities, write exploits, construct attack chains, analyze binaries without source code, and perform autonomous penetration testing. CEO Dario Amodei noted that it was not specifically trained for cybersecurity; its security capabilities emerged from being trained to be good at code.