Anthropic Project Glasswing: The Powerful AI Cyber Model Too Dangerous to Release

In a move that underscores the dual-edged nature of advanced AI capabilities, Anthropic has announced Project Glasswing, a sweeping cybersecurity initiative that pairs an unreleased frontier AI model with a coalition of twelve major technology and finance companies. The goal: find and patch software vulnerabilities across the world’s most critical infrastructure before adversaries can exploit them.

The Model That Cannot Be Released

At the center of Project Glasswing sits Claude Mythos Preview, a general-purpose frontier model that Anthropic says has already identified thousands of high-severity zero-day vulnerabilities in every major operating system and web browser. The company is not making the model generally available.

We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities, said Newton Cheng, Frontier Red Team Cyber Lead at Anthropic. However, given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout for economies, public safety, and national security could be severe.

That language is striking coming from the company that built the model. Anthropic is effectively arguing that the tool it created is powerful enough to reshape the cybersecurity landscape, and that the only responsible approach is to keep it restricted while giving defenders a head start.

Remarkable Vulnerability Discoveries

The technical results are impressive. According to Anthropic, Mythos Preview found nearly all vulnerabilities it surfaced and developed many related exploits entirely autonomously, without any human steering.

Three examples stand out:

OpenBSD Vulnerability: The model discovered a 27-year-old vulnerability in OpenBSD, widely regarded as one of the most security-hardened operating systems. The flaw allowed remote attackers to crash any machine running the OS simply by connecting to it.
FFmpeg Flaw: A 16-year-old vulnerability in FFmpeg, the ubiquitous video encoding library, was found in code that automated testing tools had exercised five million times without ever catching the problem.
Linux Kernel Exploit: Mythos Preview autonomously found and chained together several vulnerabilities in the Linux kernel to escalate from ordinary user access to complete machine control.

All three vulnerabilities have been reported and patched. Anthropic is publishing cryptographic hashes of other findings, with full details to follow after fixes are deployed.

Performance That Sets New Records

On the CyberGym evaluation benchmark, Mythos Preview scored 83.1%, compared to 66.6% for Claude Opus 4.6, Anthropic’s next-best model. The gap is even wider on coding benchmarks: Mythos Preview achieves 93.9% on SWE-bench Verified versus 80.8% for Opus 4.6.

These numbers represent not just incremental improvement but a step change in AI-driven vulnerability discovery capabilities.

A Coalition of Defenders

Project Glasswing brings together an impressive roster of launch partners: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks.

Anthropic has also extended access to more than 40 additional organizations and is committing up to $100 million in usage credits for Claude Mythos Preview, along with $4 million in direct donations to open-source security organizations.

Responsible Disclosure at Scale

One of the biggest challenges with AI-driven vulnerability discovery is the sheer volume of findings. Flooding open-source maintainers with an avalanche of critical bug reports could do more harm than good.

Anthropic has built a triage pipeline specifically to manage this. Every bug undergoes human validation before being sent to maintainers, ensuring only high-quality reports are submitted. When source code access is available, Anthropic includes candidate patches labeled by their AI provenance.

The company follows a coordinated vulnerability disclosure framework, generally waiting 45 days after a patch is available before publishing full technical details.

A New Paradigm for Cybersecurity

Project Glasswing represents a new paradigm in cybersecurity: using the most advanced AI capabilities defensively, before they become available offensively. By keeping Claude Mythos Preview restricted while partnering with major technology companies, Anthropic is attempting to give critical infrastructure defenders a crucial head start.

Whether this approach will scale to address the thousands of vulnerabilities waiting to be discovered remains to be seen. But one thing is clear: the intersection of advanced AI and cybersecurity has entered a new and consequential phase.

The Model That Cannot Be Released

Remarkable Vulnerability Discoveries

Performance That Sets New Records

A Coalition of Defenders

Responsible Disclosure at Scale

A New Paradigm for Cybersecurity

Related Posts

Newsletter

Join the discussion Cancel reply