Anthropic Built an AI So Dangerous It Won’t Release It: Here’s Why That Matters

The Model Too Dangerous to Release

Anthropic has made a striking admission: it has built an AI model so capable at finding and exploiting cybersecurity vulnerabilities that the company considers it too dangerous to release publicly. The model, called Claude Mythos Preview, is the centerpiece of a new initiative called Project Glasswing, and it represents one of the most consequential announcements in the AI industry’s recent history.

According to Anthropic, Mythos Preview has already identified thousands of high-severity zero-day vulnerabilities across every major operating system and every major web browser. The company is not making the model generally available. Instead, it has partnered with twelve major technology and finance companies, including Amazon, Apple, Google, Microsoft, Nvidia, JPMorganChase, and CrowdStrike, to share findings with defenders before adversaries can weaponize the same capabilities.

What the Model Found

The technical results are eye-opening. In controlled evaluations, Mythos Preview scored 83.1% on the CyberGym benchmark, compared to 66.6% for Claude Opus 4.6, Anthropic’s previous best. On software engineering benchmarks, the gain is just as dramatic in relative terms: 93.9% on SWE-bench Verified versus 80.8% for Opus 4.6, cutting the share of unsolved problems by more than two-thirds.

Three specific discoveries stand out. The model found a 27-year-old vulnerability in OpenBSD, the widely respected security-hardened operating system commonly used to run firewalls and critical infrastructure. The flaw allowed an attacker to remotely crash any machine running the OS simply by connecting to it. It also discovered a 16-year-old vulnerability in FFmpeg, the near-ubiquitous video encoding library, in a line of code that automated testing tools had exercised five million times without ever catching the problem. And Mythos Preview autonomously chained together several vulnerabilities in the Linux kernel to escalate from ordinary user access to complete machine control.
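
The FFmpeg case illustrates why coverage numbers are a weak proxy for security: a line can execute constantly while the one input shape that triggers its bug never shows up. The following toy Python sketch is purely illustrative (it has no relation to the actual FFmpeg code, which the article does not detail); it shows a buggy line that runs on every iteration of a naive fuzz loop without ever firing:

```python
import random

BLOCK = 16
buf = bytes(1024)  # 64 whole blocks

def copy_blocks(src: bytes, n_blocks: int) -> bytes:
    end = (n_blocks + 1) * BLOCK      # off-by-one: should be n_blocks * BLOCK
    if end > len(src):                # stands in for the out-of-bounds read
        raise MemoryError("read past end of buffer")  # i.e., a crash in C
    return src[:end]

# The buggy line executes on every one of a million iterations, so line
# coverage is perfect -- yet only the maximal valid request (n_blocks == 64,
# "copy the whole buffer") trips the off-by-one, and this input
# distribution never generates it.
random.seed(0)
for _ in range(1_000_000):
    copy_blocks(buf, random.randrange(0, 64))  # draws 0..63, never 64
```

A model that reasons about the arithmetic directly, rather than sampling inputs and waiting for a crash, can spot an off-by-one like this without ever executing the code.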

All three vulnerabilities have been patched. For others still in remediation, Anthropic is publishing cryptographic hashes of the details, with full disclosure planned after fixes are deployed.
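
Publishing a hash is a classic commitment scheme: it proves the vulnerability details existed in their current form at announcement time without revealing them, and anyone can later check the full disclosure against the earlier digest. Here is a minimal sketch of how such a commitment can work, assuming SHA-256 and a random nonce; the article does not specify Anthropic’s exact construction:

```python
import hashlib
import secrets

def commit(advisory: str) -> tuple[str, str]:
    """Publish the digest now; keep the nonce and advisory text private.

    The random nonce keeps outsiders from brute-forcing the digest by
    guessing likely advisory contents.
    """
    nonce = secrets.token_hex(16)
    digest = hashlib.sha256(f"{nonce}:{advisory}".encode()).hexdigest()
    return nonce, digest

def verify(advisory: str, nonce: str, published_digest: str) -> bool:
    """At full-disclosure time, anyone can confirm nothing was altered."""
    return hashlib.sha256(f"{nonce}:{advisory}".encode()).hexdigest() == published_digest

# Hypothetical usage with placeholder advisory text:
nonce, digest = commit("Advisory draft: remote crash in example daemon ...")
assert verify("Advisory draft: remote crash in example daemon ...", nonce, digest)
```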

The Responsible Disclosure Challenge

One of the sharpest criticisms of AI-driven vulnerability discovery is the logistical nightmare of responsible disclosure. Flooding open-source maintainers, many of them unpaid volunteers, with an avalanche of critical bug reports could easily do more harm than good.

Anthropic says it has built a triage pipeline to manage this. Every bug goes through human validation before being sent to maintainers, and the company has contracted professional triagers to assist. Critically, Anthropic says it will not submit large volumes of findings to a single project without first agreeing on a pace the maintainer can sustain. When source code is available, Anthropic aims to include a candidate patch alongside every report, labeled by provenance so maintainers know it came from an AI model.
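
To picture the pacing constraint, here is a hypothetical Python sketch of a triage queue that rejects unvalidated findings and caps per-project submissions at a maintainer-agreed weekly budget. All names and fields are invented for illustration; Anthropic has not published its pipeline’s internals:

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Finding:
    project: str
    summary: str
    candidate_patch: str | None = None  # attached when source code is available
    provenance: str = "ai-generated"    # labeled so maintainers know the origin
    human_validated: bool = False       # must be True before submission

class TriageQueue:
    """Paces submissions so no project gets more than its agreed weekly budget."""

    def __init__(self, weekly_budget: dict[str, int]):
        self.weekly_budget = weekly_budget
        self.pending: dict[str, deque] = {}
        self.sent_this_week: dict[str, int] = {}

    def enqueue(self, finding: Finding) -> None:
        if not finding.human_validated:
            raise ValueError("findings must pass human validation first")
        self.pending.setdefault(finding.project, deque()).append(finding)

    def weekly_batch(self, project: str) -> list[Finding]:
        budget = self.weekly_budget.get(project, 1)  # default: one report per week
        sent = self.sent_this_week.get(project, 0)
        queue = self.pending.get(project, deque())
        batch = []
        while queue and sent < budget:
            batch.append(queue.popleft())
            sent += 1
        self.sent_this_week[project] = sent
        return batch
```

The design point is the per-project budget: the promise of a pace the maintainer can sustain is enforced by the queue itself rather than left to discipline.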

A Coalition of Defenders

Project Glasswing brings together an unusually broad coalition. Beyond the twelve launch partners, Anthropic has extended access to more than 40 additional organizations that build or maintain critical software. The company is committing millions of dollars in Claude Mythos Preview usage credits and millions more in direct donations to open-source security organizations.

The timing of the announcement is notable. Anthropic disclosed last week that its annualized revenue run rate has surpassed the billion-dollar mark, up sharply from the end of 2025. The number of customers spending more than a million dollars annually now exceeds 1,000, doubling in less than two months. Simultaneously, the company announced a multi-gigawatt compute deal with Google and Broadcom, with capacity beginning to come online in 2027 to power future Claude models.

The Bigger Picture

What makes Project Glasswing categorically different from previous AI security tools is the explicit acknowledgment from Anthropic that the capabilities it has built are dangerous. “We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities,” Newton Cheng, Anthropic’s Frontier Red Team Cyber Lead, told VentureBeat. “However, given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout for economies, public safety, and national security could be severe.”

That language, warning the world about a tool you built, is remarkable. It signals a growing awareness within the AI industry that frontier models are developing capabilities that require proactive containment strategies, not just reactive safety measures. Whether Project Glasswing’s coalition model is the right approach remains to be seen, but it is the most serious attempt yet to turn a dangerous capability into a defensive advantage before the worst-case scenario materializes.
