
Anthropic’s Forbidden Model: Project Glasswing and the Quiet Revolution in AI Safety

Sometimes the most consequential decisions in technology happen quietly. Last week, Anthropic disclosed that it had built an AI model so effective at finding and exploiting software vulnerabilities that it considers the model too dangerous to release publicly. Then, in the same breath, it announced a partnership with eleven of the world’s largest technology, security, and financial organizations to put that same dangerous capability in the hands of defenders. This is Project Glasswing, and it’s the most important AI story of 2026 so far.

The Model That Found a 27-Year-Old OpenBSD Vulnerability

At the center of Project Glasswing sits Claude Mythos Preview, a general-purpose frontier model that has already identified thousands of high-severity zero-day vulnerabilities across every major operating system and web browser. “We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities,” said Newton Cheng, Anthropic’s Frontier Red Team Cyber Lead. “However, given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely.”

The technical results are striking. Mythos Preview found a 27-year-old vulnerability in OpenBSD — widely regarded as one of the most security-hardened operating systems in the world. It also discovered a 16-year-old vulnerability in FFmpeg — in a line of code that automated testing tools had exercised five million times without ever catching the problem. Perhaps most alarmingly, Mythos Preview autonomously chained together several vulnerabilities in the Linux kernel to escalate from ordinary user access to complete machine control.

The Partners: Amazon, Apple, Google, Microsoft, Nvidia, and More

The launch partners read like a who’s who of enterprise technology: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks. Anthropic has extended access to more than 40 additional organizations and is committing $100 million in usage credits plus $4 million in direct donations to open-source security organizations.

The Ethical Dilemma: An AI Company Choosing Not to Ship

The most remarkable thing about Project Glasswing may not be its technical achievements but the decision itself. Anthropic built something commercially valuable and chose not to sell it. Instead, it is routing the model’s findings through a carefully managed disclosure pipeline designed to avoid overwhelming open-source maintainers with unverified bug reports.

That pipeline includes contracted human triagers who manually validate every bug report, rate-limited disclosures so that individual projects are not flooded with findings, and a commitment to include candidate patches clearly labeled with their provenance. The standard 45-day coordinated disclosure window can be shortened if details are already public, or extended when patch deployment is unusually complex.
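To make the mechanics concrete, here is a minimal Python sketch of how such a disclosure policy might be encoded. The 45-day baseline, the shortening rule for already-public details, the extension for complex patch rollouts, and the idea of rate-limited disclosures come from Anthropic’s description; the specific adjustment day counts, the per-project report cap, and every name in the code are illustrative assumptions, not published parameters.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical sketch of the coordinated-disclosure rules described above.
# The 45-day baseline and the shorten/extend conditions follow the article;
# the adjustment sizes and the rate limit are assumed for illustration.

BASE_WINDOW_DAYS = 45
MAX_OPEN_REPORTS_PER_PROJECT = 5  # assumed cap to avoid flooding maintainers

@dataclass
class Finding:
    project: str
    reported_on: date
    details_already_public: bool  # shortens the window
    complex_deployment: bool      # extends the window (e.g., hard-to-ship patches)

def disclosure_deadline(f: Finding) -> date:
    """Return the planned public-disclosure date for a validated finding."""
    days = BASE_WINDOW_DAYS
    if f.details_already_public:
        days = min(days, 7)   # assumed accelerated timeline once details leak
    elif f.complex_deployment:
        days += 30            # assumed extension for unusually complex rollouts
    return f.reported_on + timedelta(days=days)

def can_disclose_now(open_reports_for_project: int) -> bool:
    """Rate-limit new reports so a single project is not overwhelmed."""
    return open_reports_for_project < MAX_OPEN_REPORTS_PER_PROJECT

# Example: a finding with a complex patch rollout gets 45 + 30 = 75 days.
finding = Finding("linux-kernel", date(2026, 2, 1),
                  details_already_public=False, complex_deployment=True)
print(disclosure_deadline(finding))  # 2026-04-17
```

In a real pipeline the window for any given finding would presumably be negotiated with maintainers rather than computed mechanically; the sketch only captures the shape of the rules as described.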

What This Means for AI Safety

Project Glasswing represents a genuine philosophical shift in how frontier AI labs think about dangerous capabilities. Rather than the traditional approach of simply not building dangerous things, Anthropic is demonstrating a different model: build the dangerous thing, restrict access, and use the time bought to give defenders an edge. Whether this approach scales remains to be seen. But it’s the most concrete and ambitious attempt yet to translate frontier AI capabilities into a defensive advantage.

The AI cyber race has officially begun. The question now is whether the defenders can stay ahead.
