Anthropic’s Project Glasswing: The Powerful AI Cyber Model Too Dangerous to Release

In an unprecedented move that highlights the growing tension between AI capability and safety, Anthropic has announced Project Glasswing, a sweeping cybersecurity initiative centered on a frontier AI model so powerful that the company says it cannot be released to the public.

The model, called Claude Mythos Preview, represents Anthropic’s most ambitious attempt to weaponize frontier AI capabilities for defensive purposes before those same capabilities fall into the wrong hands. The company has committed up to 100 million dollars in usage credits and 4 million dollars in direct donations to open-source security organizations as part of the initiative.

Claude Mythos Preview has already identified thousands of high-severity zero-day vulnerabilities in every major operating system and every major web browser. The model found these vulnerabilities autonomously, without any human steering, and developed working exploits for many of them.

Three examples stand out. The model discovered a 27-year-old vulnerability in OpenBSD that allowed an attacker to remotely crash any machine running the OS simply by connecting to it. It found a 16-year-old vulnerability in FFmpeg, the near-ubiquitous video encoding library. And Mythos Preview autonomously chained together several vulnerabilities in the Linux kernel to escalate from ordinary user access to complete control of the machine.

All three vulnerabilities have been reported and patched. For others still in remediation, Anthropic is publishing cryptographic hashes of the details, with full technical disclosure planned 45 days after patches are available.

The launch partners reads like a who’s who of the technology industry: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks. More than 40 additional organizations have also been granted access to Mythos Preview for testing against their own codebases.

Jim Zemlin, CEO of the Linux Foundation, framed the initiative in terms of collapsing timelines: The window between a vulnerability being discovered and being exploited by an adversary has collapsed, what once took months now happens in minutes with AI.

Anthropic has donated 2.5 million dollars to Alpha-Omega and OpenSSF through the Linux Foundation, and 1.5 million dollars to the Apache Software Foundation.

On the CyberGym evaluation benchmark, Mythos Preview scored 83.1 percent, compared to 66.6 percent for Claude Opus 4.6. The gap is even wider on coding benchmarks: Mythos Preview achieves 93.9 percent on SWE-bench Verified versus 80.8 percent for Opus 4.6.

We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities, said Newton Cheng, Frontier Red Team Cyber Lead at Anthropic.

In late March, a draft blog post about Mythos was left in an unsecured data store. Days later, anyone running npm install on Claude Code pulled down Anthropic’s complete original source code, 512,000 lines, for approximately three hours due to a packaging error.

After the research preview period, Claude Mythos Preview will be available to participants at 25 dollars per million input tokens and 125 dollars per million output tokens.

The announcement arrives as Anthropic disclosed that its annualized revenue run rate has surpassed 30 billion dollars, up from approximately 9 billion dollars at the end of 2025, and the number of customers spending over 1 million dollars annually now exceeds 1,000, doubling in less than two months.

Related Posts

Newsletter

Join the discussion Cancel reply