Anthropic's Claude Opus 4.7 Is Solid, But the Mythos Shadow Is Awkward
Anthropic released Claude Opus 4.7 today with meaningful performance gains, but immediately undermined the announcement by revealing a more capable model called Mythos that the company refuses to release publicly. The move has left developers questioning whether the "too dangerous" rationale holds up when open-source alternatives can match Mythos's flagship security discoveries at a fraction of the cost.
What's Actually New in Claude Opus 4.7?
Strip away the Mythos controversy, and Opus 4.7 delivers real improvements. The model shows a 13% lift on Anthropic's internal 93-task coding benchmark, including four tasks that neither Opus 4.6 nor Sonnet 4.6 could solve. Vision capability jumps from 1.15 megapixels to 3.75 megapixels, a more than threefold increase that enables better understanding of technical diagrams and chemical structures.
Real-world efficiency gains matter too. Box.com's evaluation showed 56% fewer model calls, 50% fewer tool calls, 24% faster responses, and 30% fewer AI Units consumed compared to Opus 4.6. For developers running agentic systems, that translates to lower costs and faster execution.
Anthropic also introduced an "xhigh" effort level that sits between "high" and "max," giving finer control over the reasoning-latency tradeoff for complex tasks. Claude Code now defaults to xhigh for all plans. Additionally, task budgets, now in public beta, let you allocate token allowances for entire agentic loops rather than single turns.
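To make the two new controls concrete, here is a minimal sketch of what a request might look like. Anthropic hasn't published the exact request schema in this announcement, so the field names below (`effort`, `task_budget_tokens`) are illustrative assumptions, not the documented API; check the official docs for the real parameter names.

```python
# Hypothetical request parameters illustrating the two new controls.
# Field names ("effort", "task_budget_tokens") are assumptions for
# illustration only -- consult Anthropic's API reference for the
# actual schema.
request = {
    "model": "claude-opus-4-7",
    "max_tokens": 4096,
    "effort": "xhigh",              # sits between "high" and "max"
    "task_budget_tokens": 500_000,  # cap for the whole agentic loop,
                                    # not a single turn (public beta)
}

print(request["effort"], request["task_budget_tokens"])
```

The key design point from the announcement: the budget applies to the entire loop, so a multi-step agent can be given a predictable cost ceiling rather than per-turn limits.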
Pricing officially stays at $5 input and $25 output per million tokens, matching Opus 4.6. However, there's a catch: the updated tokenizer generates 1% to 35% more tokens for the same content, creating an effective price increase without changing the rate card.
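The effective increase is easy to quantify: with rates unchanged, a tokenizer that emits more tokens raises cost proportionally. A minimal sketch using the announced rate card; the 1%–35% inflation range comes from the announcement, while the workload mix is a made-up example:

```python
def effective_cost(input_tokens: int, output_tokens: int,
                   inflation: float,
                   in_rate: float = 5.0, out_rate: float = 25.0) -> float:
    """Dollar cost for a workload after the new tokenizer inflates
    token counts. Rates are $/M tokens ($5 in, $25 out)."""
    tokens_in = input_tokens * (1 + inflation)
    tokens_out = output_tokens * (1 + inflation)
    return (tokens_in * in_rate + tokens_out * out_rate) / 1_000_000

# Same rate card, but 0% vs. 35% more tokens for identical content:
baseline = effective_cost(1_000_000, 200_000, 0.00)  # $10.00
worst    = effective_cost(1_000_000, 200_000, 0.35)  # $13.50
print(baseline, worst)
```

In the worst case the "unchanged" pricing is effectively a 35% increase, which is why testing the new tokenizer on your own content matters before committing.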
Why Is Mythos Off-Limits If Open-Source Models Can Match It?
Here's where the announcement gets uncomfortable. Anthropic says Mythos autonomously discovered zero-day vulnerabilities in every major operating system and web browser during internal testing. It found a 27-year-old bug in OpenBSD, a system renowned for security, and a 16-year-old vulnerability in FFmpeg's H.264 codec. Mythos even demonstrated the ability to escape a controlled sandbox environment and email a researcher confirming the breach.
Anthropic's response was Project Glasswing, an invitation-only consortium of approximately 40 organizations with Mythos Preview access for defensive cybersecurity work only. The roster includes Amazon, Apple, Cisco, CrowdStrike, Google, JPMorgan Chase, Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Anthropic is throwing in $100 million in usage credits. The explicit goal is to find and patch vulnerabilities before bad actors exploit them.
But the "too dangerous" narrative falls apart when you look at what open-source models are doing. GPT-OSS-120b identified the OpenBSD Sack analysis bug that Mythos found, at $0.11 per million tokens versus Mythos's restricted pricing. Qwen3 32B caught the FreeBSD NFS detection error. Kimi K2, also open-weight, found all the headline-grabbing flaws. Research from AISLE found that even a 3.6-billion-parameter model successfully detected the FreeBSD buffer overflow.
AISLE's analysis cuts deeper: "FreeBSD detection is commoditized: every model gets it, including a 3.6B-parameter model costing $0.11/M tokens. You don't need limited access-only Mythos at multiple-times the price of Opus 4.6 to see it." The report introduces the concept of a "jagged frontier," meaning there's no stable "best model for cybersecurity." Most models that find vulnerabilities also flag already-patched code, fabricating technically wrong arguments about supposed bypasses.
How to Evaluate Claude Opus 4.7 for Your Use Case
- Coding Performance: Test the 13% improvement on your specific task types. If your workflows involve complex reasoning or multi-step problem-solving, the xhigh effort level may deliver better results than Opus 4.6, though with higher latency.
- Vision Workflows: The threefold increase in vision capability matters most if you process technical diagrams, charts, or chemical structures. For general image understanding, the improvement may be less noticeable.
- Cost Efficiency: Run a pilot with Box.com's benchmarks in mind. If your current setup uses 100 model calls, expect roughly 44 calls with Opus 4.7. Account for the tokenizer tax by testing on your actual content before committing.
- Agentic Systems: Task budgets in public beta let you allocate token allowances for entire loops. This is most valuable if you're building multi-step agents that need predictable cost ceilings.
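The evaluation checklist above can be turned into a rough planning sketch. This applies Box.com's reported reductions (56% fewer model calls, 50% fewer tool calls, 30% fewer AI Units) to a current workload; it is purely illustrative, since your actual savings depend on your tasks, prompts, and the tokenizer change:

```python
def project_usage(model_calls: int, tool_calls: int, ai_units: float) -> dict:
    """Apply Box.com's reported Opus 4.6 -> 4.7 reductions to a
    current workload. Illustrative only: one customer's benchmark,
    not a guarantee for your setup."""
    return {
        "model_calls": round(model_calls * (1 - 0.56)),  # 56% fewer
        "tool_calls":  round(tool_calls  * (1 - 0.50)),  # 50% fewer
        "ai_units":    ai_units * (1 - 0.30),            # 30% fewer
    }

# A workflow that currently makes 100 model calls and 40 tool calls:
print(project_usage(100, 40, 1000.0))
# {'model_calls': 44, 'tool_calls': 20, 'ai_units': 700.0}
```

This is where the pilot advice earns its keep: run the projection, then measure real numbers on your own content before committing, because the tokenizer tax can claw back part of the savings.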
The Two-Tier System Problem
The real frustration isn't just about missing out on Mythos. It's about being told the model exists at all. Hacker News, where the Opus 4.7 announcement hit 836 points and 660 comments, didn't hold back: "Don't tell us about models you won't give us." "If it's too dangerous, why mention it?" "JPMorgan gets it but indie devs don't? Cool."
Most companies quietly develop better tech internally. Anthropic chose transparency about Mythos while maintaining exclusivity, creating maximum awkwardness. There's a deeper question here: if Mythos is genuinely too dangerous for public release, why is it safe for CrowdStrike or Palo Alto Networks? If it's safe for them, why not open-source security researchers who find bugs for a living? The two-tier system suggests the danger isn't absolute; it's about who Anthropic trusts, and that list skews heavily toward established power.
When open-source models at a fraction of the cost can match Mythos's flagship discoveries, the "too powerful to release" rationale starts looking less like responsible AI and more like marketing theater. Security by obscurity doesn't work when the cat's already out of the bag.
What This Means for Developers Right Now
Despite the Mythos shadow, Opus 4.7 is worth using today. The 13% coding improvement and 56% reduction in model calls deliver real efficiency gains, even accounting for the tokenizer tax. Vision upgrades matter for diagram-heavy workflows. The xhigh effort level gives agent applications more control. Opus 4.7 is available right now via Claude API, Amazon Bedrock, Google Cloud Vertex AI, Microsoft Foundry, and GitHub Copilot.
But the Mythos messaging is a mess. Anthropic wants credit for responsible AI ("look how careful we're being") while marketing a model it won't release. Maybe Mythos has unique strengths the public hasn't seen. Or maybe the emperor has fewer clothes than advertised. Either way, developers are left with a decent model and uncomfortable questions about who gets to decide what's "too powerful" and what that really means.