Off the Beaten Patch
Mythos found 271 Firefox bugs. You’re still running Java 8.
A new class of threat has arrived, and the security industry — with its unerring instinct for the novel over the necessary — is looking in exactly the wrong direction. The industry is reacting to frontier models as if the breakthrough is vulnerability discovery. It is not. The breakthrough is autonomous exploitation of the vulnerabilities you already know about and haven’t fixed. The beaten patch — the trail of criticals, the KEVs, the headline zero-days — gets all the attention. Everything off it is where the risk actually lives. Glasswing is the butterfly. The vulnerability backlog is the hurricane. Your supply chain is out of sandbags.
In the past six months, autonomous AI systems have demonstrated the ability to take a CVE number as input and produce a working exploit as output, no human in the loop, no proof-of-concept code scraped from GitHub, no nation-state budget required. MOAK — built in a week by two engineers — did it in twenty-one minutes against a React-to-shell chain using public models and a twenty-dollar API key. CVE-Genie reproduced 51% of all CVEs published in 2024 and 2025 at $2.77 each. CyberStrikeAI, an open-source framework with ties to China’s MSS, confirmed attacks against over 600 devices across 55 countries within two months of its GitHub publication. The UK’s AI Security Institute tested Anthropic’s Mythos Preview against a 32-step enterprise network attack simulation — reconnaissance through full network takeover — and watched it complete the chain on three of ten attempts. No model had ever finished that range. AISI estimates the equivalent human effort at twenty hours.
These are not variations on a theme. They are independent proof points converging on a single conclusion: the autonomous weaponization of known vulnerabilities is now a commodity capability. The models are public, the orchestration patterns are documented, and Hadrian has cataloged 70 open-source offensive AI tools on the public internet as of March 2026 — fewer than five existed before GPT-4. That is the count on the open web. The dark web has its own parallel market of jailbroken LLMs and autonomous exploit kits — WormGPT, FraudGPT, Xanthorox, DIG AI — sold as subscription services, complete with documentation and customer support, that no one is cataloging. The mean time to exploit a disclosed vulnerability has fallen to five days.
The industry is responding by scanning for new ones.
Anthropic’s Mythos Preview is a frontier model that both discovers new vulnerabilities and chains known ones into autonomous attack paths, offered through Project Glasswing to select partners. Mozilla ran it against Firefox and patched 271 vulnerabilities in a single release. Palo Alto reported it accomplished the equivalent of a year’s pentesting in three weeks. Treasury Secretary Bessent took the meeting. The headlines wrote themselves.
They also wrote over the fine print. Of 271 findings, three earned CVEs. The rest are defense-in-depth hardening, bugs in non-exploitable code paths, the kind of findings that improve quality but do not represent the offensive paradigm shift the coverage implies. Mozilla’s own assessment was notably measured: they hadn’t seen any bugs that a sufficiently elite human researcher couldn’t have found. AISI was blunter — on individual tasks, Mythos broadly matches GPT-5.4 and Opus 4.6; what distinguishes it is sustained multi-step chaining, not novel discovery.
The industry is celebrating the discovery and ignoring the attack chaining — which is what actually matters for its risk posture. Worse: the attack chaining capability is not locked behind Glasswing. MOAK built its entire autonomous exploitation pipeline on generally available Opus 4.6 and GPT-5.4 — models anyone with an API key already has. The offensive capability is commodity. Mythos just made it visible. Meanwhile, Mythos and Glasswing will generate what MOAK’s own creators predict will be a two-year meteor shower of newly discovered CVEs as every partner surfaces decades of buried vulnerabilities across the open-source ecosystem. The industry’s vulnerability problem was never primarily a discovery problem. It is, and has always been, a remediation problem. And every vulnerability Mythos surfaces adds to the remediation backlog that its own attack chaining capability — and every commodity clone of it — can already exploit.
Anyone who has lived through the vulnerability management wars of the last twenty years has seen this movie. New scanner, bigger findings database, same unpatched systems. Mythos is the most sophisticated vulnerability discovery and attack chaining system ever built, and the organizational machinery it depends on hasn’t changed since Nessus.
We are very, very good at finding vulnerabilities. We are terrible at fixing them.
The numbers have been telling this story for years, but three of them are now dispositive. The average application generates seventeen new vulnerabilities monthly while security teams remediate six — the backlog grows by eleven per application every month before a single new CVE is published, and that was before Mythos. Even weaponized vulnerabilities, those with known active exploits that CISA has ordered federal agencies to remediate, are patched only 57.7% of the time. And 60% of breaches involve vulnerabilities for which a patch already existed.
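The backlog arithmetic is worth making explicit. A minimal projection using the figures above (illustrative only; real portfolios are lumpier than a linear model):

```python
# Backlog growth per application, using the figures cited above:
# 17 new vulnerabilities found per month, 6 remediated per month.
NEW_PER_MONTH = 17
FIXED_PER_MONTH = 6

def backlog_after(months: int, apps: int = 1, starting_backlog: int = 0) -> int:
    """Projected unremediated findings after `months`, all else held equal."""
    net = NEW_PER_MONTH - FIXED_PER_MONTH  # +11 per application per month
    return starting_backlog + net * months * apps

# A 500-application portfolio, one year, starting from zero:
print(backlog_after(12, apps=500))  # → 66000 net-new unremediated findings
```

Sixty-six thousand findings a year before Mythos or any of its clones surface a single additional one.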
The rest of the data confirms the scale: 45% of enterprise vulnerabilities still unpatched after twelve months. A mean time to remediate complex enterprise applications of five months and ten days. NIST conceding that comprehensive NVD coverage is no longer sustainable. The cataloging system is buckling. The remediation system buckled years ago, quietly, where nobody with budget authority was watching.
That is the industry’s actual security posture — not the scanning dashboard, not the CVSS heatmap, but the fraction of what gets found that actually gets fixed.
If you want to see what the backlog actually looks like, look at the runtime.
Nearly a third of production Java applications still run on Java 8 — a runtime released in March 2014 whose public updates ended in 2019 and whose Premier Support ended in 2022. Forty-nine percent of companies still carry Log4j vulnerabilities in production three years after discovery. Nineteen percent are still running Java 6 or 7. These are not failures of awareness. They are failures of organizational capacity to act on what everyone already knows. Libraries are dropping Java 8 support. The patched version of the dependency requires Java 11+ or 17+ APIs. You cannot apply the fix without migrating the runtime, cannot migrate the runtime without rewriting, retesting, and recertifying the application, and cannot do any of that without funding a multi-year capital project that competes for budget against generative AI, agentic platforms, and every other initiative that actually gets an executive sponsor. The change advisory board does not fund capital projects. The vulnerability accrues interest.
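The runtime trap above is mechanical, not rhetorical: every compiled Java class declares the minimum JVM release it needs in its file header, which is why a dependency patched against Java 17 APIs simply refuses to load on a Java 8 runtime. A small Python sketch of the check (header layout per the JVM class-file specification; major version 52 is Java 8, 55 is Java 11, 61 is Java 17):

```python
import struct

def required_java_release(class_bytes: bytes) -> int:
    """Return the minimum Java release a compiled .class file requires.

    Class-file layout: 4-byte magic 0xCAFEBABE, 2-byte minor version,
    2-byte major version. Major 52 -> Java 8, 55 -> Java 11, 61 -> Java 17.
    """
    magic, _minor, major = struct.unpack(">IHH", class_bytes[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major - 44  # major version 45 corresponds to Java 1.1

# Header of a class compiled with --release 17 (class-file major version 61):
header = struct.pack(">IHH", 0xCAFEBABE, 0, 61)
print(required_java_release(header))  # → 17
# A Java 8 JVM (class format 52) rejects this at load time with
# UnsupportedClassVersionError — so the patched dependency cannot ship
# until the runtime migrates.
```

The patch is not blocked by policy or will; it is blocked by eight bytes at the front of every class file in the fixed artifact.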
The sectors where this debt concentrates most dangerously — financial services, healthcare, energy, government — have different causes for the same effects. Financial institutions have the money but operational risk governance that can turn a fourteen-day remediation directive into a six-month change management exercise. Healthcare has neither the money nor mature security programs. Energy has OT/IT convergence problems that are fundamentally different from application-layer patching. Government has procurement cycles measured in geological time. Different etiology. Same pathology. Forty-three percent of financial institutions still operate core systems developed over twenty years ago.
And the familiar objection — that these institutions invest in compensating controls like microsegmentation, EDR, and network isolation — does not survive contact with the threat model. Segmentation across a hybrid multi-cloud estate with thousands of applications and undocumented dependencies is a decades-long project that stalls at “crown jewels”. RASP was dead on arrival. ADR has promise but does not yet cover the heterogeneous application estates where the debt lives. EDR was not designed to stop an attack directed at the application layer. The agentic exploitation tools don’t care about your network segmentation if they’re inside the application.
The patch exists. The scanner found the downstream CVE. The ticket is in ServiceNow. And the remediation path runs through a platform migration that hasn’t been funded, a QA environment that doesn’t exist, an application owner who won’t schedule the downtime, and a change advisory board that meets monthly while the binding operational directive requires remediation in fourteen days. The vulnerability sits in the backlog, waiting, until an autonomous agent walks in and exploits it before the next change board meets — at a bank, at a hospital, at a utility, at an agency.
And this is the part the industry needs to reckon with honestly: we have seen this cycle before. Mainframes became legacy, so enterprises invested billions migrating to Java. Congratulatory backslapping. Transformation complete. And now Java is the legacy, the platform everybody knows is unsupported and nobody can migrate off of, and the next wave of investment — cloud-native, Kubernetes, serverless — is already accumulating the technical debt that will be the subject of someone else’s blog post in 2038. The structural problem is not any particular runtime. It is the organizational incapacity to maintain the thing you built after the building was celebrated and the builders moved on.
Technology failures are downstream of governance failures. The industry is funding AI-powered discovery — novel, publishable, fundable, the kind of work that earns a conference keynote. It is not funding remediation, which is invisible, expensive, unglamorous, and requires governance authority the security organization has never possessed and shows no signs of obtaining. The incentive structure rewards finding the zero-day in Firefox and ignores the two-year-old KEV on the payment system running Java 8, the patient records system pinned to an unsupported runtime, the SCADA integration that hasn’t been touched since the developer who understood it retired five years ago. The frontier model finds the novel vulnerability. The twenty-dollar API key exploits the one everyone already knew about, on the runtime everyone already knew was unsupported, at the institution whose failure would be systemic.
The shape of the solution has to match the shape of the garbage pile, and every institution’s garbage pile is its own special achievement. But the axes of intervention are knowable:
Technology simplification and consolidation to shrink the maintenance surface — every unconsolidated acquisition and unretired platform is attack surface you are paying to defend and failing to patch
Runtime modernization as risk reduction, not “tech debt” where it goes to die
Dependency migration as capital work, not ticket hygiene
Exploitability validation against what the business actually runs, not CVSS scores nobody downstream can act on
Patching in the SDLC deployment pipeline, not on the change board calendar
Supply chain engineering that rebuilds from source and routes around the registry poisoning and dependency rot that scanners catch after the fact
Adversarial testing baked into the CI/CD so that the build fails if the vulnerability ships
Security with authority to force the fix or force an executive to sign for the risk
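Two of the items above, exploitability validation and pipeline gating, reduce to very little logic once the underlying signals exist. A minimal sketch; the `Finding` schema, field names, and the KEV/reachability inputs are assumptions for illustration, not any standard:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    cve: str
    cvss: float
    known_exploited: bool  # e.g. the CVE appears on the CISA KEV list
    reachable: bool        # the vulnerable code path is actually invoked

def triage(findings: list[Finding]) -> list[Finding]:
    """Order by exploitability evidence first; CVSS is only a tiebreaker."""
    return sorted(
        findings,
        key=lambda f: (f.known_exploited, f.reachable, f.cvss),
        reverse=True,
    )

def gate(findings: list[Finding]) -> bool:
    """CI gate: fail the build if a known-exploited, reachable finding ships."""
    return not any(f.known_exploited and f.reachable for f in findings)

findings = [
    Finding("CVE-A", cvss=9.8, known_exploited=False, reachable=False),
    Finding("CVE-B", cvss=7.5, known_exploited=True, reachable=True),
]
print([f.cve for f in triage(findings)])  # → ['CVE-B', 'CVE-A']
print(gate(findings))                     # → False: the build fails
```

The hard part is not this code. It is producing trustworthy `known_exploited` and `reachable` signals across the estate — which is exactly the operational discipline the list describes.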
None of this is a product you buy. All of it is operational discipline you build, customized to whatever particular archaeology of technical and organizational debt you’ve accumulated.
Without it, the forecast writes itself.
We will burn millions on tokens, alongside every AI lab and every cyber vendor, scanning for glamorous new vulnerabilities while the known CVEs pile up behind us, unfixed. And the agents — plural now, a growing and increasingly capable class — will walk in through every one of them, at the institutions where the SLA parlour tricks and glowing green dashboards tell us we are safe.

