Anthropic has a model it hasn't released publicly. It's called Claude Mythos Preview. Earlier this year, the company gave access to a small group of partners—Amazon, Apple, Cisco, Google, JPMorgan, Microsoft, and a handful of others—with a specific mandate: use it to find and fix software vulnerabilities. This week, Anthropic announced it's expanding that program to approximately 150 additional organizations across more than fifteen countries.
The results so far: more than 10,000 high- or critical-severity security flaws identified. Zero-day vulnerabilities in every major operating system. Zero-days in every major web browser. The model is described as having reached a level of coding capability that surpasses all but the most skilled human security researchers at finding and exploiting software vulnerabilities.
This is a significant capability announcement dressed as a security program announcement. It is worth reading as both.
What "Zero-Day in Every Major OS" Actually Means
A zero-day vulnerability is a security flaw that is unknown to the software vendor—they've had zero days to patch it. Finding zero-days in major operating systems is hard. It requires deep understanding of complex codebases, the ability to reason about interactions between components that weren't designed to interact, and the creativity to find attack paths that experienced engineers missed during development and years of security review.
Elite human security researchers spend careers developing this skill. The best ones can find a handful of zero-days per year. That's the comparison class against which Anthropic is implicitly benchmarking Claude Mythos Preview when it says the model "surpasses all but the most skilled humans."
If that comparison is accurate (and the more than 10,000 discovered vulnerabilities are a concrete result, not a benchmark score), then Anthropic has demonstrated, in a controlled setting, a form of software security capability that did not exist at this scale before. The program, called Project Glasswing, is using that capability for defensive purposes: find the vulnerabilities before malicious actors do, share them with the relevant vendors, patch them.
The same capability, applied offensively, would be among the most powerful cyberweapons ever built.
The Controlled Access Model and Its Logic
Anthropic's approach with Claude Mythos Preview is deliberate. The model is not publicly available. Access is granted through a vetted program, to organizations with defined security mandates, under confidentiality agreements. The expansion to approximately 150 additional organizations covers critical infrastructure sectors (power, water, healthcare, communications) that were underrepresented in the initial cohort.
The logic is coherent: if an AI system can find vulnerabilities in critical infrastructure at scale, it's better for vetted defenders to find them first than for the model to eventually be misused or for similar capabilities to emerge in less controlled contexts. The Project Glasswing structure is an attempt to extract the defensive value of a dangerous capability while limiting the offensive risk.
The problem with this logic is not that it's wrong. It's that it's temporary. The capability exists now in a controlled setting. Equivalent capability will eventually exist in less controlled settings, whether through independent development, through capability diffusion, or through a future version of the model itself being released with additional safeguards. Anthropic is buying time, not eliminating the risk. How much time depends on how fast equivalent capability spreads, which is partly a function of chip access and partly a function of algorithmic progress that doesn't require additional hardware.
What This Says About Where Capability Is
The Glasswing announcement is significant partly because of what it implies about the capability level of an unreleased model. Anthropic has not published technical details about Claude Mythos Preview's architecture, training process, or evaluation results. What they have published are operational outcomes: vulnerabilities found, sectors covered, organizations served.
Those operational outcomes are a capability signal. A system that can find zero-days in every major OS and browser is operating at a level of software understanding that is, in security terms, roughly equivalent to a nation-state-level offensive capability. That capability is now in the hands of the original partners plus approximately 150 additional organizations across more than fifteen countries, under a framework Anthropic controls and can revoke.
The question this raises is not about Anthropic specifically. It's about what the existence of this capability says about the overall trajectory. If an unreleased model circa mid-2026 can find more than 10,000 high- or critical-severity vulnerabilities in a controlled program, what does the released version of the model after another year of scaling look like? What does a version trained for offensive rather than defensive purposes look like? Agentic systems with tool access and security research capability are a combination that safety researchers have been flagging as high-risk for several years.
The Dual-Use Disclosure Pattern
There's a recurring pattern in how frontier AI capabilities get disclosed. A lab identifies a capability with serious dual-use implications. Rather than treating it primarily as a safety concern, the disclosure frames it as a security program: we found the danger, we're using it for good, here's the controlled program. The capability is real. The defensive application is real. But the framing shifts attention from "should this capability exist" to "how is the capability being managed."
This isn't unique to AI. It's how many dual-use technologies get socialized. The pattern is worth recognizing because the socialization shapes expectations. Once "AI that can compromise critical infrastructure at scale" has been normalized as a security product, the Overton window for what capability levels are considered acceptable shifts.
Anthropic's approach with Glasswing is probably better than the alternatives—better than releasing Mythos publicly, better than keeping the findings secret and letting vulnerable systems remain unpatched. But "better than the alternatives" is a low bar, and it's worth maintaining a clear view of what the baseline is. A model capable of sophisticated goal-directed behavior applied to exploitation of complex systems is not a safe system that has been carefully deployed. It's a capable system that is currently being carefully managed.
The Expansion Timeline
The expansion to approximately 150 new organizations brings Project Glasswing to sectors that were underrepresented in the original cohort: power grid operators, water utilities, hospital systems, telecommunications companies. These are exactly the systems where a vulnerability found by an adversary, or by a misused AI, would have the most catastrophic consequences.
The timeline implication is uncomfortable. We are now in a period where the most capable AI security tools are being selectively deployed to find vulnerabilities in critical infrastructure, on the assumption that the defensive side is winning the race between defense and offense. That assumption rests on three further ones: that Anthropic's controlled access model holds, that the vetting process for Glasswing participants is reliable, and that no equivalent offensive capability exists outside of controlled environments.
All three of those assumptions will be tested as this technology continues to scale.