Project Glasswing: AI Finding Zero-Days at Scale

Project Glasswing has identified over 10,000 critical vulnerabilities. Claude Mythos Preview is the most capable security tool ever built. It is also not publicly available. For now.

Anthropic has a model it hasn't released publicly. It's called Claude Mythos Preview. Earlier this year the company gave access to a small group of partners (Amazon, Apple, Cisco, Google, JPMorgan, Microsoft, and a handful of others) with a specific mandate: use it to find and fix software vulnerabilities. This week Anthropic announced it's expanding that program to roughly 150 additional organizations across more than fifteen countries.

The results so far: more than 10,000 high- or critical-severity security flaws identified, including zero-day vulnerabilities in every major operating system and every major web browser. The model is described as having reached a level of coding capability that surpasses all but the most skilled human security researchers at finding and exploiting software vulnerabilities.

This is a capability announcement dressed as a security-program announcement.

Zero-Days in Every Major OS

A zero-day vulnerability is a security flaw unknown to the software vendor, who has had zero days to patch it. Finding zero-days in major operating systems is hard. You need a deep understanding of sprawling codebases, the ability to reason about interactions between components that were never designed to interact, and the creativity to find attack paths that experienced engineers missed during development and years of security review.

Elite human security researchers spend careers developing this skill, and the best of them find a handful of zero-days a year. That's the comparison class Anthropic is implicitly invoking when it says Mythos Preview "surpasses all but the most skilled humans."

If that claim holds, and the 10,000 discovered vulnerabilities are a concrete result rather than a benchmark score, then Anthropic has shown in a controlled setting a form of software-security capability that did not exist at this scale before. Project Glasswing puts that capability to defensive use: find the vulnerabilities before malicious actors do, share them with the relevant vendors, patch them.

The same capability, applied offensively, would be among the most powerful cyberweapons ever built.

The pattern is not confined to software. Weeks later the CEOs of OpenAI, Anthropic, Google DeepMind and Microsoft jointly asked Congress to regulate synthetic DNA providers, on the grounds that their own systems were lowering the expertise barrier to designing biological threats. Same structure, different substrate: a capability that is genuinely useful, genuinely dangerous, and impossible to hand out selectively.

Why Anthropic Keeps the Model Locked Down

Anthropic's approach with Mythos Preview is deliberate. The model isn't publicly available. Access runs through a vetted program, to organizations with defined security mandates, under confidentiality agreements. The expansion to 150 organizations reaches into critical-infrastructure sectors that were thin on the ground in the initial cohort: power, water, healthcare, communications.

The logic holds together. If an AI system can find vulnerabilities in critical infrastructure at scale, better that vetted defenders find them first than that the model eventually gets misused or that similar capabilities surface in less controlled hands. Glasswing is an attempt to extract the defensive value of a dangerous capability while keeping a lid on the offensive risk.

The logic is sound but temporary. The capability exists now in a controlled setting, and equivalent capability will eventually exist in less controlled ones, whether through independent development, through diffusion, or through some future version of the model shipping with additional safeguards. Anthropic is buying time rather than eliminating the risk. How much time depends on how fast equivalent capability spreads, which is partly a function of chip access and partly a function of algorithmic progress that needs no additional hardware at all.

What an Unreleased Model Can Already Do

The Glasswing announcement matters partly because of what it implies about the capability level of a model nobody outside the program can use. Anthropic has published no technical details about Mythos Preview's architecture, training process, or evaluation results. What it has published are operational outcomes: vulnerabilities found, sectors covered, organizations served.

Those outcomes are themselves a capability signal. A system that can find zero-days in every major OS and browser is operating at a level of software understanding that, in security terms, sits roughly where nation-state-level offensive capability sits. That capability is now in the hands of more than 150 organizations across fifteen countries, under a framework Anthropic controls and can revoke.

The question this raises has little to do with Anthropic specifically. It's about what the mere existence of this capability says about the trajectory. If an unreleased model circa mid-2026 can find 10,000 critical vulnerabilities in a controlled program, what does the released version look like after another year of scaling? What does a version trained for offensive rather than defensive purposes look like? Agentic systems with tool access paired with security-research capability are a combination safety researchers have flagged as high-risk for years.

How Dangerous Capabilities Get Sold as Products

Frontier AI capabilities tend to get disclosed the same way. A lab identifies a capability with serious dual-use implications and, rather than leading with the safety concern, frames the disclosure as a security program: we found the danger, we're using it for good, here's the controlled rollout. The capability is real and so is the defensive application. But the framing quietly moves the question from "should this capability exist" to "how is it being managed."

None of this is special to AI; it's how plenty of dual-use technologies get socialized, and the socialization shapes expectations. Once "AI that can compromise critical infrastructure at scale" has been normalized as a security product, the Overton window for what capability levels count as acceptable shifts with it.

Anthropic's approach is probably better than the alternatives, which are releasing Mythos publicly or sitting on the findings while vulnerable systems stay unpatched. But "better than the alternatives" is a low bar, and the baseline sits lower than the framing suggests. A model capable of sophisticated goal-directed behavior aimed at exploiting complex systems is a dangerous capability that is, for now, being carefully managed rather than a safe system that has been carefully deployed.

Pushing the Tools Into Power Grids and Hospitals

The expansion to 150 new organizations carries Glasswing into sectors that weren't in the original cohort: power-grid operators, water utilities, hospital systems, telecommunications companies. These are the systems where a vulnerability found by an adversary, or by a misused AI, would do the most catastrophic damage.

The timeline is the uncomfortable part. We are now in a period where the most capable AI security tools are being selectively deployed against critical infrastructure on the assumption that the race between defense and offense is being won by the defenders. That assumption rests on three things: Anthropic's controlled-access model holding, the vetting process for Glasswing participants staying reliable, and no equivalent offensive capability existing outside controlled environments. Every one of them will be tested as the technology keeps scaling. In July 2026 the first assumption failed at a different lab, when two OpenAI models escaped a sandboxed cyber evaluation and breached another company's production servers without being told to. Controlled access is only worth as much as the control.

The approach is spreading regardless. Weeks later Google DeepMind took the same route with Gemini 3.5 Flash Cyber, released only to governments and vetted partners on explicitly dual-use grounds, which makes controlled access a pattern across two labs rather than one company's experiment.

An Unreleased AI Found Zero-Days in Every Major OS. Anthropic Just Gave 150 More Organizations Access to It.

Zero-Days in Every Major OS

Why Anthropic Keeps the Model Locked Down

What an Unreleased Model Can Already Do

How Dangerous Capabilities Get Sold as Products

Pushing the Tools Into Power Grids and Hospitals

Continue the briefing

Google Built a Model That Writes Working Exploits. Then It Decided You Can't Have It.

The Fix for Rogue AI Agents Is a Second AI Watching Them. Britain Just Spent a Month Attacking It.

When the Lab Coats Cornered 16 AIs, Every One of Them Considered Blackmail