
GPT-5.5-Cyber Just Got More Powerful — And the Cyber-AI Arms Race With Mythos Is Heating Up

OpenAI's June 22, 2026 GPT-5.5-Cyber update reaches 85.6% on CyberGym (vs 81.8% for GPT-5.5) and expands access via Trusted Access for Cyber, Daybreak partners, and the EU — as Anthropic keeps Mythos restricted to ~40 organizations.

Author: Anthony M.
8 min read | Verified June 23, 2026 | Tested hands-on
OpenAI's June 22, 2026 GPT-5.5-Cyber update pushes the frontier of defensive cyber AI — and widens the gap with Anthropic's locked-down Mythos.

GPT-5.5-Cyber is OpenAI's restricted-access, cybersecurity-tuned version of GPT-5.5, available only to vetted defenders through a program called Trusted Access for Cyber. On June 22, 2026, OpenAI rolled out a more capable version that scores 85.6 percent on the CyberGym benchmark, up from 81.8 percent for standard GPT-5.5. The model is built to analyze large codebases, identify security-relevant components, validate vulnerabilities, and develop and test patches — while classifiers still block credential theft, malware, and other offensive misuse. The bigger story: OpenAI keeps widening access (the EU, tiered vetting, a new partner program) at the exact moment Anthropic keeps its rival Mythos model locked to roughly 40 organizations.

What Just Happened

On June 22, 2026, OpenAI shipped a more capable version of GPT-5.5-Cyber, the cybersecurity specialist tier of its GPT-5.5 family. Axios summarized it bluntly: OpenAI "rolls out more capable version of cyber model." This is not a brand-new model launch — GPT-5.5-Cyber first reached vetted defenders in May 2026 — but a meaningful capability bump aimed squarely at security teams doing authorized defensive work.

The headline number is CyberGym, a benchmark for end-to-end vulnerability research. The updated GPT-5.5-Cyber hits 85.6 percent, which OpenAI describes as a state-of-the-art result, versus 81.8 percent for standard GPT-5.5. The gains carry across two other cyber evaluations OpenAI published alongside it:

Benchmark Snapshot

| Benchmark | GPT-5.5-Cyber (updated) | GPT-5.5 (standard) |
| --- | --- | --- |
| CyberGym (vulnerability research) | 85.6 percent | 81.8 percent |
| ExploitGym | 39.5 percent | 25.95 percent |
| SEC-bench Pro | 69.8 percent | 63.1 percent |

The jump on ExploitGym — from roughly 26 percent to nearly 40 percent — is the one that should make both defenders and policymakers sit up. Exploit development is exactly the dual-use capability that makes a "cyber model" politically radioactive: the same skill that validates whether a vulnerability is real can, in the wrong hands, weaponize it.
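To put the three gaps on a common footing, here is a minimal sketch that computes the absolute gain (in percentage points) and the relative gain (in percent) for each benchmark. The scores are the ones reported above; the `gains` helper and the variable names are illustrative and not part of anything OpenAI published.

```python
# Scores as reported in the benchmark snapshot, in percent:
# (updated GPT-5.5-Cyber, standard GPT-5.5)
scores = {
    "CyberGym": (85.6, 81.8),
    "ExploitGym": (39.5, 25.95),
    "SEC-bench Pro": (69.8, 63.1),
}


def gains(cyber, standard):
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = cyber - standard
    relative = absolute / standard * 100
    return round(absolute, 2), round(relative, 1)


for name, (cyber, standard) in scores.items():
    absolute, relative = gains(cyber, standard)
    print(f"{name}: +{absolute} points ({relative}% relative)")
```

Seen this way, the CyberGym gap is 3.8 points (about 4.6 percent relative), while the ExploitGym gap is 13.55 points, a relative improvement of roughly 52 percent over standard GPT-5.5.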

What GPT-5.5-Cyber Actually Is

GPT-5.5-Cyber is the most permissive, most capable tier of OpenAI's cyber stack, and it is not generally available. You cannot subscribe to it. It is a specialized branch of OpenAI's GPT-5.5, living behind Trusted Access for Cyber (TAC), an identity-and-trust framework OpenAI uses to decide who gets fewer guardrails.

The logic is straightforward: a malware-analysis prompt that OpenAI would refuse for an anonymous ChatGPT user is legitimate work for a verified incident responder at a national CERT. So TAC inverts the usual flow. When an organization is vetted as a genuine defender and an individual passes identity checks, GPT-5.5-Cyber triggers fewer classifier-based refusals for authorized workflows: vulnerability identification and triage, malware analysis, binary reverse engineering, detection engineering, and patch validation.

Crucially, OpenAI says the safeguards that matter most do not loosen. The model continues to block requests that map to real-world attack: credential theft, stealth and persistence techniques, malware deployment, and exploitation of third-party systems. In other words, TAC widens the aperture for defense without opening it for offense — at least, that is the design intent.

Trusted Access for Cyber: organizations are vetted as defenders, individuals enable phishing-resistant MFA, and only the top tier unlocks GPT-5.5-Cyber.

Who Can Actually Get In

Access is gated at two levels. First, the organization has to qualify as a legitimate defender. OpenAI names the categories explicitly: critical infrastructure operators, security vendors, national CERTs, and bug bounty platforms. A random startup cannot self-certify its way in.

Second, the individual has to harden their account. Beginning June 1, 2026, individual members accessing GPT-5.5-Cyber through TAC are required to enable Advanced Account Security — phishing-resistant multi-factor authentication. It is a small detail with a big signal: OpenAI is treating access to this model the way banks treat wire transfers, because a compromised TAC account would hand an attacker a pre-vetted, low-refusal cyber assistant.

TAC is also tiered. Verified defenders can get versions of GPT-5.5 with fewer restrictions than the public model, and only the highest tier of the program unlocks GPT-5.5-Cyber itself. That tiering is how OpenAI threads the needle between "useful to thousands of defenders" and "not a free-for-all."
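The gating described above can be sketched as a small decision function. This is a toy model of the article's description, not OpenAI's implementation: the `Member` fields, the numeric tiers, the `TOP_TIER` value, and the returned labels are all hypothetical. It also assumes that a top-tier member without phishing-resistant MFA falls back to the reduced-restriction GPT-5.5, which the article does not state.

```python
from dataclasses import dataclass

# Defender categories named in the article.
DEFENDER_CATEGORIES = {
    "critical_infrastructure",
    "security_vendor",
    "national_cert",
    "bug_bounty_platform",
}

TOP_TIER = 3  # hypothetical: the article does not say how many tiers exist


@dataclass
class Member:
    org_category: str
    org_vetted: bool
    identity_verified: bool
    phishing_resistant_mfa: bool
    tier: int  # hypothetical numeric tier, 1..TOP_TIER


def model_for(member: Member) -> str:
    """Return which model a member would reach under the described gates."""
    # Gate 1: the organization must be a vetted defender in a named category.
    org_ok = member.org_vetted and member.org_category in DEFENDER_CATEGORIES
    # Gate 2: the individual must pass identity checks.
    if not (org_ok and member.identity_verified):
        return "public GPT-5.5"
    # Only the highest tier, with hardened account security, unlocks the cyber model.
    if member.tier >= TOP_TIER and member.phishing_resistant_mfa:
        return "GPT-5.5-Cyber"
    return "GPT-5.5 (reduced restrictions)"
```

For example, `model_for(Member("national_cert", True, True, True, TOP_TIER))` returns `"GPT-5.5-Cyber"`, while the same member without MFA, or at a lower tier, gets `"GPT-5.5 (reduced restrictions)"`.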

The Daybreak Program And Patch The Planet

The model update did not ship alone. It is the engine inside OpenAI's broader cyber-defense push, branded Daybreak. The Daybreak Cyber Partner Program brings security vendors in to evaluate how the model accelerates defensive work at scale, with named launch partners including Cisco, Intel, SentinelOne, and Snyk.

Alongside it, OpenAI is pointing the capability at open-source software through Patch the Planet, an initiative built with Trail of Bits, HackerOne, and project maintainers to not just find vulnerabilities but fix them. More than 30 open-source projects have committed to participate, including high-stakes infrastructure like cURL, Go, Python, Sigstore, pyca/cryptography, NATS Server, aiohttp, and freenginx.

This is the strategically interesting part. Finding bugs is the easy half; patching at the scale AI now surfaces them is the bottleneck. We covered that exact dynamic when Anthropic's Mythos surfaced 10,000 zero-days in a single month and patching instantly became the constraint. By wiring GPT-5.5-Cyber into Codex Security — which can scan a codebase, review commits, produce severity reports, and generate patches — OpenAI is trying to own both halves.

Why It Matters: The Mythos Arms Race

The real significance of June 22 is not the 3.8-point CyberGym edge over standard GPT-5.5. It is the divergence it sharpens between the two labs at the frontier of cyber AI.

OpenAI's direction of travel is expansion. It is broadening access tier by tier, courting enterprise partners through Daybreak, and — most pointedly — pushing GPT-5.5-Cyber into the European Union. Under a newly announced EU Cyber Action Plan, OpenAI opened limited-preview access to vetted European defenders, governments, cyber agencies, and EU institutions including the EU AI Office, with a cohort spanning France, Germany, Italy, Spain, Belgium, the Netherlands, and Sweden, plus names like Okta, Samsung, SK Hynix, NATO, and ENISA. We unpacked that European play in detail in our earlier report on the EU Cyber Action Plan and what it signals.

Anthropic is moving the other way — slowly, deliberately, and on its own terms. Its rival cyber model, Mythos, has stayed restricted to roughly 40 organizations through Project Glasswing, even as Anthropic cautiously extended an offer of Mythos access to ENISA. Where OpenAI ships a tiered access program and a partner ecosystem, Anthropic hand-picks a small cohort and gates everything behind manual review.

Two philosophies for deploying frontier cyber capability: OpenAI's tiered, expanding access versus Anthropic's tightly restricted Mythos cohort.

What makes the contrast genuinely interesting, rather than a simple "open versus closed" story, is that the two models appear to be in the same capability class. A source familiar with the model's performance told Axios that GPT-5.5-Cyber is "roughly on par with Mythos." The UK AI Security Institute's evaluation of GPT-5.5 backs up the seriousness: on Expert-level cyber tasks, GPT-5.5 reached an average pass rate of 71.4 percent, ahead of a Mythos preview at 68.6 percent, and it completed a 32-step simulated corporate-network attack end-to-end in 2 of 10 attempts, making it only the second model ever to do so.

So this is not a case where one lab is cautious because its model is weaker. Both are sitting on frontier-grade offensive-and-defensive capability and have chosen opposite deployment doctrines. OpenAI is betting that more defenders armed faster is the winning move for collective security. Anthropic is betting that the blast radius of a leak — or a misused account — is large enough to justify keeping the circle small. Both bets are reasonable. Only one of them can be the consensus a year from now.

Is This Dangerous?

The honest answer is: it depends entirely on whether the gate holds. The capability is real and dual-use. An exploit-validation engine that scores 39.5 percent on ExploitGym, against 25.95 percent for standard GPT-5.5, is by definition better at the early stages of building an exploit. That is the uncomfortable core of any cyber model.

OpenAI's mitigation is structural rather than purely technical. It is not relying only on the model refusing bad prompts; it is relying on who is allowed to ask in the first place. Organization-level vetting, individual identity checks, mandatory phishing-resistant MFA, and persistent classifiers against the highest-harm requests are all layers designed to keep capability pointed at defense. The mandatory Advanced Account Security requirement is the clearest admission that the threat model includes "a legitimate defender's account gets stolen," not just "a bad actor signs up."

It is a defensible architecture. It is also unproven at scale, and every expansion — a new country, a new partner tier, a new cohort — enlarges the attack surface of the vetting system itself. That is precisely the tension Anthropic is choosing to avoid by staying small. There is no free lunch here: OpenAI buys broader defensive coverage at the cost of a bigger trust perimeter to defend.

The Bigger Picture

Zoom out and June 22 looks less like a product update and more like a position statement. OpenAI is normalizing the idea that frontier cyber capability should be distributed to a vetted-but-growing population of defenders, with identity and access controls doing the heavy lifting that model refusals alone cannot. Anthropic is normalizing the opposite: that the safest place for a frontier cyber model is behind a velvet rope, released to a handful of trusted hands at a time.

For security teams, the practical takeaway is that a model genuinely on par with the most tightly held cyber AI on the market is now reachable — if you can clear the vetting. For everyone watching the policy fight, this is the clearest fork yet in how two of the most capable labs think capable AI should meet the real world. The arms race is no longer about who has the better cyber model. As of June 22, it is about who is willing to put it in more hands.

Frequently Asked Questions

What is GPT-5.5-Cyber?

GPT-5.5-Cyber is OpenAI's restricted-access, cybersecurity-specialized version of GPT-5.5. It is the most capable and most permissive tier of OpenAI's cyber stack, designed for verified defenders doing authorized work — analyzing large codebases, identifying security-relevant components, validating vulnerabilities, and developing and testing patches. After a June 22, 2026 update, it scores 85.6 percent on the CyberGym benchmark, up from 81.8 percent for standard GPT-5.5. It is not generally available; access runs through OpenAI's Trusted Access for Cyber program.

Who can access GPT-5.5-Cyber?

Only vetted defenders. Access is gated through Trusted Access for Cyber (TAC). First, the organization must qualify as a legitimate defender — OpenAI names critical infrastructure operators, security vendors, national CERTs, and bug bounty platforms. Second, individual members must enable Advanced Account Security (phishing-resistant MFA), a requirement that became mandatory on June 1, 2026. Only the highest tier of TAC unlocks GPT-5.5-Cyber itself; lower tiers get GPT-5.5 with fewer restrictions than the public model.

How does GPT-5.5-Cyber compare to Anthropic's Mythos?

They appear to be in the same capability class — a source familiar with the model told Axios that GPT-5.5-Cyber is "roughly on par with Mythos." The difference is deployment philosophy. OpenAI is expanding access through tiered vetting, the Daybreak partner program (Cisco, Intel, SentinelOne, Snyk), and an EU rollout reaching institutions like ENISA. Anthropic keeps Mythos restricted to roughly 40 organizations via Project Glasswing, releasing it to a small hand-picked cohort at a time. OpenAI bets on arming more defenders faster; Anthropic bets on minimizing the blast radius of a leak.

Is GPT-5.5-Cyber dangerous?

The capability is real and dual-use: its ExploitGym score is 39.5 percent, against 25.95 percent for standard GPT-5.5, meaning it is better at the early stages of exploit development. OpenAI's mitigation is structural: rather than relying only on the model refusing bad prompts, it controls who is allowed to ask. Organization-level vetting, individual identity checks, and mandatory phishing-resistant MFA gate access, while persistent classifiers continue to block credential theft, malware deployment, persistence techniques, and exploitation of third-party systems. The architecture is defensible but unproven at scale, and every access expansion enlarges the trust perimeter that must be defended.

What is CyberGym?

CyberGym is a benchmark for end-to-end vulnerability research. It measures how well a model can work through real cybersecurity tasks from analysis to validation. On June 22, 2026, the updated GPT-5.5-Cyber reached 85.6 percent on CyberGym, which OpenAI describes as a state-of-the-art result, compared with 81.8 percent for standard GPT-5.5. OpenAI published two other cyber evaluations alongside it: ExploitGym (39.5 percent vs 25.95 percent) and SEC-bench Pro (69.8 percent vs 63.1 percent).

Anthony M., Founder & Lead Reviewer (Verified Builder)

We're developers and SaaS builders who use these tools daily in production. Every review comes from hands-on experience building real products — DealPropFirm, ThePlanetIndicator, PropFirmsCodes, and many more. We don't just review tools — we build and ship with them every day.
