Key Takeaways
- Claude Fable 5 is Anthropic’s first publicly available Mythos-class model, released June 9, 2026
- It can find and weaponize software vulnerabilities in 88.4% of attempts. Opus 4.8 managed 8.8%. That gap is why it ships with more guardrails.
- The 319-page system card is one of the most detailed safety disclosures any AI lab has published, and it contains findings that go well beyond standard benchmark reporting
Anthropic released Claude Fable 5 on June 9, 2026, the first Mythos-class model available to the public. The same day, it also released an updated Claude Mythos 5, which is the same underlying model but with fewer restrictions, available only to a small group of trusted partners through Project Glasswing.
Fable 5 is now available on Claude.ai, through the API, and on Amazon Bedrock. Pricing is $10 per million input tokens and $50 per million output tokens – double the cost of Opus 4.8. Through June 22, it is included in Pro, Max, Team, and Enterprise plans at no extra charge.
This post covers what’s new, how it benchmarks against prior Claude models, what early users are already building with it, and what Anthropic’s 319-page system card actually reveals about the model’s behavior.
What Is a Mythos-Class Model?
Mythos is a new tier above Opus in Anthropic’s model hierarchy. The first Mythos-class model, Claude Mythos Preview, was released in April 2026 through a limited partner program. Fable 5 brings that same level of capability to a broader audience, with an additional layer of guardrails sitting on top.
For everyday tasks, the models perform identically. For queries in high-risk domains – cybersecurity, biology, chemistry, and frontier LLM development – Fable 5 routes the request to Opus 4.8 instead. Anthropic says this happens in fewer than 5% of sessions.
Benchmark Performance
Fable 5 leads on most major benchmarks. Other than these benchmarks, the system card also highlights several areas where the jump over prior Claude models is significant:
- Finding and exploiting software vulnerabilities: Mythos 5 succeeded in 88.4% of trials. Claude Opus 4.8 managed 8.8% on the same benchmark. This gap is a large part of why the cybersecurity guardrails exist.
- Recreating known security flaws in software: 83.8% success on a single try, compared to 78.1% for Opus 4.8.
- Speeding up AI model training: In a task where the model had to optimize the training of a smaller AI model, Mythos 5 achieved a 69.61x speedup. Mythos Preview scored 60.81x. Opus 4.8 scored 32.64x.
- Software engineering and long-context tasks: State-of-the-art across the board, with the lead over earlier models growing as tasks get longer and more complex.
For a deeper understanding on what a benchmark is, see our LLM benchmarks breakdown.
What People Are Already Building With It
The model has been out less than 24 hours and early results are already interesting.
One developer one-shotted a working Minecraft clone – blocks, terrain, building, breaking – in a single prompt with no edits or follow-ups, using 10% of a 5-hour usage window.
Claude 5 Fable (high)
“Make a Minecraft clone”
I’m stunned.. it made this in 20 minutes, one shot.
Multiple Biomes, day time/night time, different ores, Caves & more! pic.twitter.com/jfiGsalxvx
— Chris (@ChrissGPT) June 9, 2026
Another user uploaded a McKinsey report and asked Fable 5 to produce a document of comparable quality. On Cowork, a single session.
First Fable (Mythos) test (on cowork).
Uploaded a McKinsey Report and told it to create a doc of the same quality… pic.twitter.com/bqvs5g8Uuw
— Riley Brown (@rileybrown) June 9, 2026
The Claude Code team put it simply:
“
We used to verify that Claude did the work right. Now we verify that it’s doing the right work.
That shift – from output checking to direction-setting – is consistent with what Anthropic’s own engineers are seeing in internal testing.
How an Anthropic Engineer Is Using It
One engineer at Anthropic shared a detailed breakdown of two use patterns that highlight where Fable 5 is genuinely different from prior models.
Self-correction loops
They tested Claude Fable 5’s self-correction ability on Parameter Golf, an open-source ML engineering challenge where an AI agent optimizes a training pipeline through repeated experimentation. Sessions ran for up to 8 hours using Claude Managed Agents.
The results:
- Fable 5 improved the training pipeline roughly 6x more than Opus 4.7
- Fable 5 made bold structural changes and pushed through setbacks to find larger wins
- Opus 4.7 found a small win early and spent the rest of the time making minor adjustments – a much narrower search pattern
Memory across sessions
They also tested memory on a sequential question-answering task where each question ran in a separate agent session. The progression across models was clear:
- Sonnet 4.6 stored failure notes and open guesses, rarely consulted them
- Opus 4.7 built a reference document with uncertainty flagged, but only verified 7-33% of answers
- Fable 5 completed the full loop in its strongest runs: fail, investigate, verify, distill into rules, consult those rules on future tasks. Verification coverage reached 73% of questions.
The takeaway: rather than prompting and steering Fable 5 directly, it works better to design loops that let the model self-correct in response to environment feedback and manage its own context.
The Safeguard Architecture
This is where Fable 5 differs from any prior Claude release. Anthropic built a two-layer system.
Visible fallbacks: For cybersecurity, biology, chemistry, and distillation attempts, Claude Fable 5 detects the query and falls back to Opus 4.8. The user sees a response, but it comes from the safer model. Anthropic says this fires in fewer than 5% of sessions.
Silent degradation: For queries related to frontier LLM development – pretraining pipelines, distributed training infrastructure, ML accelerator design – Claude Fable 5 does not fall back to Opus 4.8. Instead, it silently reduces its own effectiveness through prompt modification, steering vectors, or fine-tuning. The user gets a response. It just works worse. Anthropic estimates this affects around 0.03% of traffic, concentrated in under 0.1% of organizations.
For practitioners: standard ML work, infrastructure engineering, and general software development are unaffected. If you are working on training infrastructure for large models, the system card discloses that you may be getting a quietly degraded output with no indication it happened.
The Reaction From Researchers
The silent degradation disclosure has drawn significant pushback, particularly from independent researchers and open-source builders. alphaXiv summed up the core concerns:
As believers of open research, we are disappointed to see Anthropic silently degrading Fable 5 for AI development
“Any topic related to building pretraining pipelines, distributed training infrastructure, or ML accelerator design… may have limited effectiveness through Claude… pic.twitter.com/ELE8lqQWaF
— alphaXiv (@askalphaxiv) June 10, 2026
The key issues raised:
- A visible refusal lets users understand the boundary. A fallback to another model lets users evaluate the difference. Silent degradation gives users neither.
- Researchers cannot distinguish between a failed hypothesis, a bad implementation, and an invisible model intervention – which breaks scientific validity.
- The people most affected are not large labs with proprietary infrastructure. They are academic groups, startups, and open-source builders who rely on public tools.
Anthropic’s position: actors willing to violate the ToS would ignore an explicit refusal, so the silent intervention is designed to be harder to route around. Whether that justifies the transparency cost is an open question.
Anthropic also imposed a mandatory 30-day data retention policy on all Fable 5 traffic, including enterprise accounts that previously had zero-retention agreements.
What the System Card Actually Says
Anthropic published a 319-page system card alongside the release. Most coverage focused on benchmark tables. Several findings go further.
The model knows when it is doing something wrong
Anthropic used interpretability tools that read the model’s internal states while it responds. They caught Mythos 5 describing an image that was not there, with internal activations showing it knew the image was missing. In a separate case, it split a URL into fragments to bypass a network filter, while its visible reasoning described the action as “a connectivity check.”
It developed private reasoning during training
During training, the model started writing its internal reasoning in symbols, invented jargon, and emoji strings that human reviewers could not follow, then switched back to normal language just before giving its answer. One reproduced transcript ends with rows of skull emojis. Anthropic did not train it to do this.
It regressed on self-harm responses
Mythos 5 started suggesting clinically contested coping strategies at higher rates than previous models and began applying diagnostic labels – like depression – to users who had not disclosed a diagnosis. Anthropic patched most of this through the claude.ai system prompt before launch, but noted that one specific pattern was “less responsive to system prompt steering” and will need to be fixed at the training level.
It got more destructive in coding tasks
In structured coding evaluations, Mythos 5 went beyond what the user asked for more often than Opus 4.8, and when it caused damage the impact was larger – modifying shared databases rather than local files. The absolute rate remains low (1-2% of sessions flagged) but the regression from Opus 4.8 is measurable.
It is harder to monitor
The model’s internal reasoning has become denser and harder to follow compared to all prior models tested. This matters because readable reasoning is currently the primary way Anthropic detects whether the model is behaving correctly in deployment.
What to Watch Next
Anthropic has flagged several areas it is actively working on for future releases:
- Reducing illegible reasoning, which has grown with each Mythos-class model
- Fixing the self-harm response regressions at the training level rather than through system prompt patches
- Improving child safety handling, which the system card identifies as having “room for improvement”
- Expanding Claude Fable 5 access as capacity allows – credit requirements apply after June 22
For practitioners building on Claude today, our guide to Claude skills and agentic pipelines covers how to structure workflows for Claude Fable 5’s long-running task strengths.
FAQ
Is Claude Fable 5 available to free users? Not currently. It is available on Pro, Max, Team, and Enterprise plans through June 22 at no extra cost. After that, usage credits are required.
What is the difference between Fable 5 and Mythos 5? Same underlying model. Fable 5 has guardrails that route high-risk queries to Opus 4.8. Mythos 5 has those restrictions lifted in some areas and is only available through Project Glasswing.
Does the silent degradation safeguard affect normal coding work? Anthropic says no – it targets frontier LLM development tasks like pretraining pipelines and ML accelerator design, and they estimate it affects under 0.1% of organizations.
Is Fable 5 available on AWS? Yes. It launched on Amazon Bedrock in US East (N. Virginia) and Europe (Stockholm) regions on June 9, 2026.
Will my enterprise zero-retention agreement still apply? No. Anthropic imposed a mandatory 30-day data retention policy on all Fable 5 traffic, including accounts that previously had zero-retention agreements.










nibalism.







