AI News

Anthropic’s Project Glasswing: The Dangerous AI Cyber Model That Won’t Be Released

In an unprecedented move that highlights the growing tension between AI capability and AI safety, Anthropic has announced a cybersecurity initiative built around a model it considers too dangerous to release publicly.

Project Glasswing pairs an unreleased frontier AI model, Claude Mythos Preview, with a coalition of twelve major technology and finance companies. The initiative aims to find and patch software vulnerabilities across the world's most critical infrastructure before adversaries can exploit them.

The Coalition and the Commitment

The launch partners represent an impressive roster of technology and finance giants: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks.

Anthropic has extended access to more than 40 additional organizations that build or maintain critical software. The company is committing millions of dollars in usage credits for Claude Mythos Preview across the effort, along with millions more in direct donations to open-source security organizations.

Why the Model Can’t Be Released

At the center of Project Glasswing sits Claude Mythos Preview, a general-purpose frontier model that Anthropic says has already identified thousands of high-severity zero-day vulnerabilities in every major operating system and every major web browser.

The company is not making the model generally available.

“We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities,” said Newton Cheng, Frontier Red Team Cyber Lead at Anthropic, in an exclusive interview. “However, given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout for economies, public safety, and national security could be severe.”

That stark warning from the company that built the model represents a remarkable admission about the power of frontier AI in cybersecurity contexts.

The Vulnerabilities Found

The technical results are striking. According to Anthropic's press release, Mythos Preview surfaced nearly all of these vulnerabilities, and developed many of the related exploits, entirely autonomously, without any human steering.

OpenBSD Vulnerability: The model found a 27-year-old vulnerability in OpenBSD, widely regarded as one of the most security-hardened operating systems in the world and commonly used for firewalls and critical infrastructure. The flaw allowed an attacker to remotely crash any machine running the OS simply by connecting to it.

FFmpeg Flaw: A 16-year-old vulnerability in FFmpeg, the near-ubiquitous video encoding library, existed in a line of code that automated testing tools had exercised five million times without ever catching the problem.

Linux Kernel Exploit: Mythos Preview autonomously found and chained together several vulnerabilities in the Linux kernel to escalate from ordinary user access to complete control of the machine.

All three vulnerabilities have been reported to relevant maintainers and patched.

Performance Metrics

On the CyberGym evaluation benchmark, Mythos Preview scored 83.1%, compared to 66.6% for Claude Opus 4.6, Anthropic's next-best model. The model leads on coding benchmarks as well:

  • SWE-bench Verified: 93.9% (Mythos Preview) vs. 80.8% (Opus 4.6)
  • SWE-bench Pro: 77.8% (Mythos Preview) vs. 53.4% (Opus 4.6)

These numbers represent a significant leap in AI-powered vulnerability discovery capabilities.
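The percentage-point leads implied by these reported scores can be checked with a few lines of arithmetic (the figures are taken directly from the numbers above):

```python
# Reported scores (percent) as (Mythos Preview, Opus 4.6), per the figures above.
scores = {
    "CyberGym": (83.1, 66.6),
    "SWE-bench Verified": (93.9, 80.8),
    "SWE-bench Pro": (77.8, 53.4),
}

for benchmark, (mythos, opus) in scores.items():
    gap = round(mythos - opus, 1)  # percentage-point lead of Mythos Preview
    print(f"{benchmark}: +{gap} pp")
```

This yields leads of 16.5, 13.1, and 24.4 percentage points respectively, so the widest margin is on SWE-bench Pro.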

Responsible Disclosure at Scale

Finding thousands of zero-days at once raises a critical question: How do you disclose them responsibly without overwhelming open-source maintainers, many of whom are unpaid volunteers?

Anthropic has built a triage pipeline specifically to manage this challenge. Every bug goes through human validation before being sent to maintainers, ensuring only high-quality reports reach them.

“We do not submit large volumes of findings to a single project without first reaching out in an effort to agree on a pace the maintainer can sustain,” Cheng explained.

When Anthropic has access to source code, the company includes candidate patches with every report, labeled by provenance so maintainers know the patch was written or reviewed by a model.

A Growing Revenue Powerhouse

The announcement arrives as Anthropic experiences extraordinary momentum. The company disclosed that its annualized revenue run rate has climbed into the billions of dollars, up sharply from the end of 2025. The number of business customers each spending over a million dollars annually now exceeds 1,000, doubling in less than two months.

Anthropic simultaneously announced a multi-gigawatt compute deal with Google and Broadcom, expected to come online beginning in 2027 to power frontier Claude models.

The Bigger Picture

Project Glasswing represents Anthropic’s most ambitious attempt to translate frontier AI capabilities into a defensive advantage before those same capabilities proliferate to hostile actors. The company’s reasoning is clear: if powerful cyber capabilities are coming regardless, better that defenders get them first.

The initiative also signals a new model for AI safety in practice: not just restricting what models can do, but actively channeling their capabilities toward making the digital ecosystem more secure.

Whether this approach succeeds will depend on execution, cooperation from industry partners, and the ability to disclose vulnerabilities responsibly at unprecedented scale. One thing is certain: Project Glasswing marks a new chapter in the relationship between AI capabilities and cybersecurity.
