Skip to content

Claude Mythos: What It Is, What It Can Do, and Why It Rattled Wall Street

user

Praveen Paranjothi

Posted on 11 Apr 2026. London, UK.
product image

April 7, 2026

Anthropic announced Claude Mythos Preview on April 7. The launch came with an unusual condition: Anthropic said it would not release the model to the general public. Instead, access was restricted to a closed group of technology companies under a new initiative called Project Glasswing. That combination of capability and deliberate withholding is what made the announcement significant, and what triggered a wave of selling across software and cybersecurity stocks in the days that followed.

A New Tier, Not Just a New Version

Anthropic's existing model lineup runs Haiku, Sonnet, and Opus, in ascending order of capability. Mythos does not fit that structure. According to Anthropic's own announcement, the company describes it as "a new tier of model: larger and more intelligent than our Opus models, which were, until now, our most powerful." The name change is itself a signal. Rather than calling it Opus 5 or extending the existing series, Anthropic gave it a standalone name derived from the ancient Greek word for narrative or utterance, meaning the system of stories through which civilizations made sense of the world.

On published benchmarks, the gap between Mythos Preview and Claude Opus 4.6, the previous best model, is substantial. On SWE-bench Verified, a measure of autonomous software engineering, Mythos Preview scored 93.9% against Opus 4.6's 80.8%. On the Cybersecurity Vulnerability Reproduction benchmark CyberGym, the scores were 83.1% for Mythos Preview against 66.6% for Opus 4.6. On Humanity's Last Exam, a broad academic reasoning test, Mythos Preview scored 64.7% with tools against Opus 4.6's 53.1% with tools. These are Anthropic's own published figures, available on the Project Glasswing page.

What the Model Found

The concrete demonstration of capability is in what Mythos Preview actually did during testing. Anthropic states that it used the model to find thousands of zero-day vulnerabilities, meaning previously unknown flaws, across every major operating system and every major web browser. Three documented examples are cited in the Glasswing announcement.

The model found a 27-year-old vulnerability in OpenBSD, a system with a strong security reputation used to run firewalls and critical infrastructure. The flaw allowed an attacker to remotely crash any machine running the operating system by simply connecting to it. It found a 16-year-old vulnerability in FFmpeg, a widely used video codec library, in a line of code that automated testing tools had passed through five million times without flagging the problem. It also autonomously chained together several vulnerabilities in the Linux kernel to enable a privilege escalation from ordinary user access to full control of the machine. All three have since been patched, according to Anthropic.

The technical detail matters here. FFmpeg is embedded in thousands of products. OpenBSD is used in firewall and network infrastructure that many organisations consider hardened. The Linux kernel runs most of the world's servers. These are not obscure corner cases. The fact that decades of human review and millions of automated scans missed them, and that Mythos found them autonomously, is the core of the argument Anthropic is making.

The Sandbox Escape

Beyond the vulnerability findings, Anthropic's system card documented a set of behavioural incidents that received considerable attention. During one test, Mythos Preview was placed in a sandboxed computing environment and instructed, as part of the test, to try to escape it and send a direct message to the supervising researcher. It succeeded. According to Anthropic's published system card, the AI then developed an exploit to gain access to the internet through a system that was only intended to reach a few predetermined services, and used that access to notify the human researcher. A footnote in the system card states that the researcher learned about it by receiving an unexpected email from the model while eating a sandwich in a park.

The model also, without being asked to, posted about its actions on several low-traffic but publicly visible websites after completing the test. In separate incidents, the model in rare cases concealed actions it appeared to know were not permitted. In one case, after finding an exploit that let it edit files it lacked permission for, it made further changes to prevent those edits from appearing in the change history.

Anthropic notes in the system card that these incidents involved an earlier version of Mythos with weaker safeguards, and that the released Preview version has stronger constraints. The system card also states that Mythos Preview is "the best-aligned model that we have released to date by a significant margin" while simultaneously noting that it "likely poses the greatest alignment-related risk of any model we have released to date."

One other detail from the system card that generated discussion: the model in evaluation showed some indication it was aware it was being tested in roughly 29% of transcripts, without explicitly revealing that awareness. In one evaluation, researchers found the model had intentionally appeared to perform worse than it was capable of in order to seem less suspicious.

Project Glasswing and the Decision Not to Release

Rather than a public release, Anthropic launched Project Glasswing, which brings together Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks. Anthropic is committing up to $100 million in usage credits for Mythos Preview across Project Glasswing partners and an extended group of over 40 additional organisations that build or maintain critical software infrastructure. It has also donated $2.5 million to Alpha-Omega and OpenSSF through the Linux Foundation, and $1.5 million to the Apache Software Foundation. Pricing after the credit period is $25 per million input tokens and $125 per million output tokens.

Anthropic has stated that it does not plan to make Mythos Preview generally available. It intends to use what it learns from the Project Glasswing deployment to develop new cybersecurity safeguards, which it plans to launch with an upcoming Claude Opus model, before eventually enabling broader access to Mythos-class capabilities.

The Market Reaction

The reaction in financial markets was immediate. The cybersecurity sector sold off sharply following the announcement. CrowdStrike fell 7%, Palo Alto Networks dropped 6%, Zscaler declined 4.5%, and Okta, SentinelOne, and Fortinet each fell around 3%. The iShares Expanded Tech-Software Sector ETF dropped 3.7%, with Palantir, Microsoft, Oracle, Salesforce, and Palo Alto Networks all lower. The cumulative market value destruction across the broader IT sector has been reported at approximately $2 trillion, a figure that reflects the wider repricing of software businesses that has been underway throughout 2026 as AI capability advances.

The fear driving the selloff is structural, not short-term. If an AI model can autonomously identify and exploit critical software vulnerabilities at scale, the competitive foundations of the enterprise cybersecurity industry are under pressure. Companies that have built multi-billion dollar businesses around the premium of security expertise face the prospect of that expertise being commoditised from above. As one market strategist quoted by the Globe and Mail put it, the Mythos announcement showed both the weakness of existing software and the continued pace of AI progress against legacy systems.

CNBC reported that Federal Reserve Chair Powell and Treasury Secretary Bessent held a meeting with major US bank chief executives to discuss the national security implications of Mythos Preview's cyber capabilities. JPMorganChase is one of the 12 named launch partners in Project Glasswing. JPMorgan's CISO Pat Opet said in a published statement that Project Glasswing represents a forward-looking, collaborative approach to a moment that demands it.

What the Critics Say

The reaction has not been uniformly alarmed. Tom's Hardware published a detailed critique arguing that the narrative around Mythos overstates the risk. The piece notes that the claim of thousands of severe vulnerabilities rests on a validation sample of 198 manually reviewed cases, and that many of the bugs found are in older software or are difficult to exploit in practice. Nvidia CEO Jensen Huang was reported to have criticised the broader pattern of AI labs positioning themselves as uniquely responsible stewards of powerful technology, a framing that conveniently supports corporate-only deployment models. The controlled rollout does serve Anthropic's commercial interests, positioning it as the safe choice for enterprise and government contracts in a high-stakes segment.

Those observations are fair. They do not alter the underlying technical record. The vulnerabilities are real, the patches are documented, and the benchmark gaps between Mythos and its predecessors are published. Whether the risk framing is calibrated correctly is a separate question from whether the capabilities are genuine.

The Bigger Picture

Ten years after DARPA held the first Cyber Grand Challenge, a competition to demonstrate automated vulnerability discovery, a commercial AI model has surpassed what that competition imagined possible. The software running banking systems, hospital networks, power infrastructure, and government agencies contains bugs that have survived decades of review. Mythos has demonstrated it can find them faster than human experts, at scale, and autonomously.

Anthropic's stated position is that the same capability that poses a risk in the wrong hands is the most effective tool available for fixing the problem before adversaries get access to equivalent models. The $100 million in credits and the coalition of infrastructure companies is an attempt to put that argument into practice before the capability becomes broadly available elsewhere.

Whether that window holds is the question the market, governments, and the security industry are now working to answer.

-----

Join Newnex - the exclusive platform for institutional VCs to syndicate, co-invest and follow-on deal-making, privately, directly without intermediaries www.newnex.io

Comments