Claude Mythos vs Opus 4.6

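A head-to-head comparison of Anthropic's two most powerful models across 17 benchmarks drawn from Anthropic's system card and leaked documents. Mythos (Capybara tier) outscores Opus 4.6 on every benchmark where both models have a reported score, with the largest gaps in math (+55.3 points on USAMO 2026), long context (+41.3 on GraphWalks BFS), and coding (+24.4 on SWE-bench Pro).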

Mythos is not publicly available

Claude Mythos Preview is restricted to Project Glasswing cybersecurity partners. Benchmarks are from Anthropic's system card and leaked documents. Opus 4.6 is available on the public API at $5/$25 per million tokens.

Benchmarks Compared

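Mythos wins: 17-0 (Cybench has no listed Opus 4.6 score, so 16 rows are direct head-to-head comparisons)

SWE-bench Pro gap: +24.4 points

USAMO 2026 gap: +55.3 points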

Software Engineering

Benchmark	Mythos	Opus 4.6	Delta
SWE-bench Verified	93.9%	80.8%	+13.1
SWE-bench Pro	77.8%	53.4%	+24.4
SWE-bench Multilingual	87.3%	77.8%	+9.5
SWE-bench Multimodal	59%	27.1%	+31.9
Terminal-Bench 2.0	82%	65.4%	+16.6

Reasoning

Benchmark	Mythos	Opus 4.6	Delta
GPQA Diamond	94.6%	91.3%	+3.3
MMMLU	92.7%	91.1%	+1.6
Humanity's Last Exam (no tools)	56.8%	40%	+16.8
Humanity's Last Exam (with tools)	64.7%	53.1%	+11.6

Math

Benchmark	Mythos	Opus 4.6	Delta
USAMO 2026	97.6%	42.3%	+55.3

Long Context

Benchmark	Mythos	Opus 4.6	Delta
GraphWalks BFS (256K-1M)	80%	38.7%	+41.3

Visual

Benchmark	Mythos	Opus 4.6	Delta
CharXiv Reasoning (no tools)	86.1%	61.5%	+24.6
CharXiv Reasoning (with tools)	93.2%	78.9%	+14.3

Agentic

Benchmark	Mythos	Opus 4.6	Delta
OSWorld	79.6%	72.7%	+6.9
BrowseComp	86.9%	83.7%	+3.2

Cybersecurity

Benchmark	Mythos	Opus 4.6	Delta
CyberGym	83.1%	66.6%	+16.5
Cybench (35 CTF challenges)	100%	--	--

Source: Anthropic system card (red.anthropic.com) and leaked documents. Deltas are in percentage points, and the comparison flags gains of 15 points or more as standouts. Not independently verified.
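
The Delta column and the win tally can be rechecked from the table values alone. The following is a minimal Python sketch, not anything taken from the system card: the `SCORES` list is transcribed from the tables above, and the `compare` function and its `highlight_at` parameter are illustrative names. It assumes a row only counts as a comparison when both models have a score, so Cybench is left out.

```python
# Scores transcribed from the tables above: (benchmark, Mythos, Opus 4.6).
# None marks a score the source does not report.
SCORES = [
    ("SWE-bench Verified", 93.9, 80.8),
    ("SWE-bench Pro", 77.8, 53.4),
    ("SWE-bench Multilingual", 87.3, 77.8),
    ("SWE-bench Multimodal", 59.0, 27.1),
    ("Terminal-Bench 2.0", 82.0, 65.4),
    ("GPQA Diamond", 94.6, 91.3),
    ("MMMLU", 92.7, 91.1),
    ("Humanity's Last Exam (no tools)", 56.8, 40.0),
    ("Humanity's Last Exam (with tools)", 64.7, 53.1),
    ("USAMO 2026", 97.6, 42.3),
    ("GraphWalks BFS (256K-1M)", 80.0, 38.7),
    ("CharXiv Reasoning (no tools)", 86.1, 61.5),
    ("CharXiv Reasoning (with tools)", 93.2, 78.9),
    ("OSWorld", 79.6, 72.7),
    ("BrowseComp", 86.9, 83.7),
    ("CyberGym", 83.1, 66.6),
    ("Cybench (35 CTF challenges)", 100.0, None),
]


def compare(scores, highlight_at=15.0):
    """Return (deltas, wins, highlighted) for rows where both scores exist."""
    deltas = {
        name: round(mythos - opus, 1)  # round away float noise, e.g. 13.100000000000009
        for name, mythos, opus in scores
        if mythos is not None and opus is not None
    }
    wins = sum(1 for d in deltas.values() if d > 0)
    highlighted = [name for name, d in deltas.items() if d >= highlight_at]
    return deltas, wins, highlighted


deltas, wins, highlighted = compare(SCORES)
largest = max(deltas, key=deltas.get)

print(f"{wins} wins out of {len(deltas)} comparable benchmarks")
print(f"Largest gap: {largest} (+{deltas[largest]})")
print(f"{len(highlighted)} benchmarks at or above the 15-point threshold")
```

Under these assumptions the script reports 16 wins out of 16 comparable benchmarks, USAMO 2026 as the largest gap, and 8 benchmarks at or above the 15-point threshold.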

Key Takeaways

Biggest jump: USAMO 2026: +55.3 points (42.3% to 97.6%)

Coding leap: SWE-bench Pro: +24.4 points (53.4% to 77.8%)

Long context: GraphWalks BFS: +41.3 points (38.7% to 80.0%)

Cybersecurity: CyberGym: +16.5 points, Cybench saturated at 100%

Avg improvement: 1.86x to 4.3x capability slope ratio (per system card)

Pricing & Availability

	Mythos	Opus 4.6
Input $/1M tokens	$25.00 *	$5.00
Output $/1M tokens	$125.00 *	$25.00
Model Tier	Capybara	Opus
Public API	Restricted	Available
OpenRouter	Not yet	Available

* Glasswing partner pricing. Public API pricing not yet announced.
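
To make the price difference concrete, the per-token rates can be turned into a per-request cost. This is a rough sketch: the `PRICES` values come from the table above (with the Mythos figures being Glasswing partner pricing, not public pricing), while the `request_cost` helper and the 200,000-input / 20,000-output token workload are hypothetical and chosen only for illustration.

```python
# USD per 1M tokens, from the pricing table above: (input, output).
PRICES = {
    "Mythos": (25.00, 125.00),   # Glasswing partner pricing, not public pricing
    "Opus 4.6": (5.00, 25.00),   # public API pricing
}


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request at the listed per-million-token rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: a large prompt with a moderate completion.
mythos_cost = request_cost("Mythos", 200_000, 20_000)
opus_cost = request_cost("Opus 4.6", 200_000, 20_000)

print(f"Mythos:   ${mythos_cost:.2f}")
print(f"Opus 4.6: ${opus_cost:.2f}")
print(f"Ratio:    {mythos_cost / opus_cost:.1f}x")
```

For this workload the sketch gives $7.50 for Mythos and $1.50 for Opus 4.6. Because both the input and output rates differ by the same factor, the ratio stays at 5x regardless of the input/output mix.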

Frequently Asked Questions

Latency and throughput data for Mythos have not been published. The system card focuses on capability benchmarks rather than speed. Since Mythos is described as "very expensive to serve," it may trade speed for capability. Opus 4.6 currently delivers around 60 tokens per second on most providers.

Mythos does not replace Opus 4.6. It sits in a new "Capybara" tier above Opus in Anthropic's model hierarchy, and Opus 4.6 remains the top publicly available Claude model. Mythos is currently restricted to Project Glasswing cybersecurity partners, and Anthropic has not announced plans to replace Opus with Mythos.

Mythos is not yet available for general use. Mythos Preview is only available to vetted Project Glasswing partners for defensive cybersecurity research. Although the benchmarks show it also excels at general coding and reasoning, public API access has not been announced. For production use, Claude Opus 4.6 remains the best available Anthropic model.