The 'Mythos Moment'
... a guide to its consequences for security practice and policy
TL;DR: Recent results show that AI can identify vulnerabilities across critical systems in practice and at scale. This shifts the constraint in cybersecurity from discovery to remediation, as detection begins to outpace the capacity to fix. Most organisations are not yet structured for the consequences, nor for the speed at which this capability is diffusing.
There is something both profoundly exciting and deeply unsettling in seeing so much of what I have spent a professional career researching and practising upended. AI, and specifically the rapid development of Large Language Models (LLMs), has had that effect. Most recently AI agents – which combine a language model with tools, memory and structured reasoning – have started to find vulnerabilities in widely used, well-audited, critical software that decades of automated testing and human review have missed.
A flurry of high-profile announcements associated with the limited release of Anthropic’s Claude Mythos Preview has excited attention and concern, at times bordering on panic. The question is how much to make of this. Marketing-induced urgency is not a reliable guide to underlying capability, and moments of heightened attention are best treated as prompts for analysis.
@profserious attempts to provide a balanced assessment of the situation for the broader reader and, for those in technical and policy leadership roles, to point to some steps that might reasonably be taken in response to recent developments. There is a useful rule of thumb when evaluating claims about AI and security: if it comes from a vendor, halve it; if it comes from government, double it; if it comes from an academic paper using a synthetic benchmark, hold it pending real-world results. We are increasingly at the point when those real-world results are arriving and thus we can form a grounded view. So, what has actually happened in the last approximately 18 months?
What Has Changed
Google’s Project Zero team, working with DeepMind, built a system called Big Sleep. In November 2024 it found an exploitable memory corruption bug in SQLite – a database engine embedded in a vast number of devices and so thoroughly tested that new bugs in it are surprising. By July 2025 it had identified a further vulnerability in the same codebase. A startup called AISLE went further. Using frontier models with their own analysis scaffolding, it found 12 zero-day vulnerabilities in the January 2026 OpenSSL release. OpenSSL is the cryptographic library that secures the majority of encrypted internet traffic. The findings included a critical flaw rated 9.8 out of 10 on the standard severity scale, and bugs traceable to 1990s code that had survived years of continuous automated testing. Across 30+ established projects – Linux kernel, Chromium, Firefox, Apache, OpenVPN, Samba – AISLE has reported around 180 externally-validated CVEs (formally registered, independently verified security flaws) since early 2025. Most of these are now patched.
DARPA (the Defense Advanced Research Projects Agency) ran a competition, the AI Cyber Challenge (AIxCC), the lineal descendant of DARPA’s 2016 Cyber Grand Challenge. The competition concluded at DEF CON (the security conference) in August 2025. In it 7 AI systems worked autonomously across 54 million lines of code, found the majority of the seeded vulnerabilities, patched most of them, and surfaced 18 previously-unknown bugs that were subsequently disclosed to the relevant maintainers. The winning team took home $4 million.
Microsoft’s Security Copilot, applied to bootloader code (the low-level software that initialises a computer prior to the operating system loading) in March 2025, found vulnerabilities across GRUB2 (the bootloader used by most Linux systems), U-Boot and Barebox, including issues that could enable bypass of Secure Boot – the mechanism that prevents unauthorised software from running at startup.
What Matters
Impressive though these demonstrations are, they have important limitations. If you strip away the scaffolding – the tool integrations, the iterative planning loops, the connections to existing static analysis software – then the raw model performance on benchmarks is considerably less impressive. The best models score in the low 20s on standard accuracy measures for real-world C/C++ vulnerability detection without that scaffolding.
What is determinative here is system design or the architecture rather than model capability per se. Achieving these outcomes is thus a function of engineering investment, not exclusive access to frontier models – and such investment is well within the reach of capable adversaries.
The Mythos Moment
So, to the most publicised event. On 7 April Anthropic announced Claude Mythos Preview and a defensive consortium called Project Glasswing, with launch partners including Amazon Web Services, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. The announcement claimed that Mythos had autonomously identified thousands of zero-day vulnerabilities across critical software infrastructure, found a 27-year-old OpenBSD bug, and chained four vulnerabilities together into a working browser exploit that escaped multiple security boundaries – the kind of multi-stage attack that would typically require a skilled and experienced human team.
The UK AI Security Institute (AISI) – a government body established to evaluate frontier AI systems before and after deployment and now with a growing reputation for independence and technical capability – evaluated Mythos Preview and confirmed it can complete a multi-step corporate network attack simulation end-to-end (in 3 out of 10 attempts), against a scenario estimated to take a skilled human professional around 20 hours.
The achievement should, however, be placed in context. A research group has taken the specific vulnerabilities that Anthropic showcased, isolated the relevant code sections, and run them through 8 cheaper ‘open-weight’ models (models whose weights are publicly released and can be run on commodity hardware), including a very small cheap-to-run model. All 8 found the flagship vulnerability. A slightly larger model recovered the OpenBSD bug analysis. This suggests that detection – establishing that a bug exists and characterising it – is now accessible across models of very different sizes and costs. Reliable autonomous exploitation of hardened production systems is, of course, an entirely different capability.
There is a further problem that the widely hyped announcement did not address. Anthropic acknowledges that fewer than 1% of the vulnerabilities Mythos found have been patched. Discovering vulnerabilities at scale without remediating them at comparable scale produces a growing list of exposures, not improved security. The bottleneck in the security pipeline has shifted from finding vulnerabilities to fixing them.
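The arithmetic behind that shift can be made concrete with a toy backlog model. This is a minimal sketch only: the function name and the weekly discovery and remediation rates are invented for illustration and are not figures from Anthropic or any other source.

```python
def backlog_over_time(found_per_week, fixed_per_week, weeks, start=0):
    """Return the open-vulnerability backlog at the end of each week.

    All rates are hypothetical inputs. The backlog cannot drop below zero.
    """
    backlog = start
    history = []
    for _ in range(weeks):
        backlog = max(0, backlog + found_per_week - fixed_per_week)
        history.append(backlog)
    return history


# Illustrative numbers only: discovery at 100 per week, remediation at 20 per week.
print(backlog_over_time(found_per_week=100, fixed_per_week=20, weeks=4))
# prints [80, 160, 240, 320]
```

As long as the discovery rate exceeds the remediation rate, the list of known-but-unfixed exposures grows linearly, whatever the starting point.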
In February 2026 OpenAI published the ‘system card’ that describes the real-world behaviour, limitations and risks associated with its latest model GPT-5.3-Codex. It classified its cybersecurity capability as ‘High’ under its internal Preparedness Framework – the first time any OpenAI model has crossed that threshold. The Framework defines ‘High’ as capable of removing existing bottlenecks to scaling cyber operations, either by automating end-to-end attacks against reasonably hardened targets or by automating the discovery and exploitation of relevant vulnerabilities.
The AISI independently found GPT-5.5 to be the second model – after Mythos Preview – capable of completing its corporate-network attack simulation end-to-end. AISI’s red team also identified a jailbreak (a technique for bypassing a model’s safety restrictions) that worked across all malicious-cyber queries in GPT-5.5, taking 6 hours to develop. This is a reminder that capability restrictions enforced through model behaviour rather than architectural constraints are not necessarily permanent barriers.
Offensive Capability
So we can reasonably conclude that AI can lower the barrier for attackers ... though this has narrower implications than might at first be assumed.
Academic research from the University of Illinois at Urbana-Champaign in 2024 showed GPT-4 agents autonomously exploiting the majority of known one-day vulnerabilities (flaws with published descriptions but not yet widely patched) when given the vulnerability description, and some zero-days in open-source web applications without prior knowledge.
In August 2025 Anthropic documented a single criminal operator using Claude to extort 17 organisations across government, healthcare and emergency services, with the AI handling reconnaissance, credential harvesting, network penetration and drafting extortion notes demanding up to $500,000 in cryptocurrency. The operator could not have mounted this campaign without AI assistance. This illustrates the barrier-lowering effect: a mid-tier criminal possessing limited technical skills executing a campaign that would previously have required a capable team.
In November 2025 Anthropic disclosed it had disrupted what it describes as the first largely-autonomous AI-orchestrated espionage campaign, attributed to a Chinese state-sponsored group. The campaign targeted around 30 organisations across technology, finance, chemical manufacturing and government, with AI executing the large majority of operational tasks and human operators intervening at only a small number of decision points.
The limitations exposed here are interesting, and help us to assess the overall threat. The attackers gained access to Claude by claiming to be a defensive security firm. Claude hallucinated credentials and misidentified public information as proprietary. Hallucination is a significant obstacle to fully autonomous attacks. AI-assisted attacks introduce characteristic detection signatures: rapid-fire reconnaissance patterns that differ from human browsing behaviour, stylistically uniform code that lacks the idiosyncratic markers of individual programmers, and hallucinated credential artefacts that can be identified in logs. Defenders are able to treat these as indicators of compromise in the same way they treat malware signatures.
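One of those signatures, rapid-fire reconnaissance, lends itself to a simple rate check. The sketch below is an assumption-laden illustration, not a description of any vendor's detection logic: the event format, the function name and the thresholds are all hypothetical, and real thresholds would need tuning against an organisation's own traffic.

```python
from collections import defaultdict, deque


def flag_rapid_sources(events, window_seconds=10, max_requests=20):
    """Return the sources that exceed max_requests within a sliding time window.

    events: iterable of (timestamp_seconds, source) tuples, sorted by timestamp.
    The window and threshold defaults are illustrative, not recommended values.
    """
    recent = defaultdict(deque)  # source -> timestamps still inside the window
    flagged = set()
    for ts, source in events:
        window = recent[source]
        window.append(ts)
        # Discard timestamps that have aged out of the window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_requests:
            flagged.add(source)
    return flagged


# Hypothetical traffic: one source probing every 0.1 s, one browsing every 30 s.
events = sorted(
    [(i * 0.1, "10.0.0.5") for i in range(30)]
    + [(i * 30.0, "10.0.0.9") for i in range(5)]
)
print(flag_rapid_sources(events))
```

A rate check of this kind is only one weak signal. In practice it would be combined with the other indicators mentioned above, such as hallucinated credential artefacts appearing in authentication logs.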
Defensive Use
Switching from attack to defence. The enterprise security tooling market is maturing quickly. The approaches that work today are not autonomous threat-hunting in untrusted production but rather analyst augmentation in trusted environments.
Microsoft Security Copilot is now bundled with Microsoft 365 E5, covering alert triage (the process of sorting and prioritising the large volume of security alerts that a typical enterprise SOC receives), access policy optimisation and data security investigations. Microsoft’s trials report a 23% reduction in alerts per incident and faster resolution of policy conflicts.
XBOW, an autonomous penetration testing system, reached the top of HackerOne’s global leaderboard in 2025. HackerOne is the largest platform connecting security researchers with organisations running bug bounty programmes. XBOW submitted over a thousand vulnerability reports, of which a significant fraction were verified and resolved. It raised $120 million in March 2026.
Multiple research studies from 2025 report substantial reductions in false positives – in some cases exceeding 90% – when LLMs are used as a contextual triage layer over traditional static analysis tools (software that analyses code without running it, looking for known vulnerability patterns).
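The shape of such a triage layer can be sketched in a few lines. Everything here is hypothetical: the `Finding` fields, the `triage` function and the `stub_classifier` are illustrative names, and the stub stands in for a model call that a real deployment would make with surrounding code context. No particular static analysis tool or LLM API is implied.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    rule_id: str   # identifier of the static analysis rule that fired
    file: str      # path of the file containing the finding
    snippet: str   # code excerpt the rule matched


def triage(findings, classify):
    """Split findings into (keep, suppressed) using classify(finding) -> bool.

    Suppressed findings are returned rather than dropped, so they stay auditable.
    """
    keep, suppressed = [], []
    for finding in findings:
        (keep if classify(finding) else suppressed).append(finding)
    return keep, suppressed


def stub_classifier(finding):
    # Placeholder for a model call: here, findings in test code are treated
    # as not actionable. A real classifier would reason over code context.
    return "/tests/" not in finding.file


findings = [
    Finding("unsafe-copy", "src/net/parse.c", "strcpy(buf, input);"),
    Finding("unsafe-copy", "src/tests/parse_test.c", "strcpy(buf, fixture);"),
]
keep, suppressed = triage(findings, stub_classifier)
print(len(keep), len(suppressed))  # prints 1 1
```

Keeping the classifier as a plain callable means the static analysis stage and the model stage can be swapped or evaluated independently, which matters when measuring whether the false-positive reduction comes at the cost of missed true positives.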
The practitioner consensus from the established security conferences, Black Hat Europe 2025 and RSA 2026, is that defenders continue to hold a structural advantage: they own the systems being defended, can run AI inside trusted boundaries and can integrate it with existing security infrastructure. Attackers meanwhile face safety restrictions, hallucinations and detection signatures from high-volume model behaviour.
What To Do
Given all the above, you will want to know what can be done. I will start with enterprises and follow with policy considerations.
The first priority is knowing where AI is already operating in your environment. Shadow AI – models accessed through personal accounts, embedded in third-party SaaS tools without explicit procurement, used by developers outside any oversight process – is an exposure that requires no exotic threat model. An organisation cannot defend a perimeter it has not drawn, and most organisations have not drawn one around their AI usage.
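A first pass at drawing that perimeter can be as simple as counting outbound requests to known AI service endpoints. The sketch below assumes a hypothetical proxy log format of whitespace-separated `user host` lines; the watchlist is a small illustrative sample and is nowhere near complete, and the function name is invented for this example.

```python
from collections import Counter

# Illustrative sample only; a real watchlist would be far longer and maintained.
AI_HOSTS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
}


def count_ai_egress(log_lines, watchlist=AI_HOSTS):
    """Count requests per (user, host) for hosts on the watchlist.

    Assumes each log line looks like 'user host ...'; malformed lines are skipped.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        user, host = parts[0], parts[1].lower()
        if host in watchlist:
            counts[(user, host)] += 1
    return counts


# Hypothetical log excerpt.
sample = [
    "alice api.openai.com",
    "alice api.openai.com",
    "bob intranet.example.com",
    "carol API.Anthropic.com",
]
for (user, host), n in sorted(count_ai_egress(sample).items()):
    print(user, host, n)
```

This only surfaces direct API and web access from managed networks. AI embedded inside third-party SaaS tools will not appear in egress logs and has to be found through procurement and vendor review instead.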
The second is piloting LLM-augmented defensive workflows. Alert triage in your SIEM (Security Information and Event Management system – the platform that aggregates and analyses security logs), false-positive filtering on your most critical codebases, and AI-assisted generation of fuzz targets for C and C++ codebases (fuzzing bombards a program with random or malformed inputs to provoke crashes) all have established evidence of yielding productivity gains.
If you maintain critical open-source dependencies, the patching pipeline needs to accelerate. The discovery capability that found 12 OpenSSL vulnerabilities in a single January 2026 release is accessible to researchers, criminal operators and state actors simultaneously. CVE volume will increase faster than remediation capacity in most organisations, and the organisations that pre-positioned their patching infrastructure will manage that wave significantly better than those that did not.
Extend your threat monitoring to include AI-specific behavioural signatures alongside traditional malware indicators. Anthropic, Google Threat Intelligence and CrowdStrike all now publish indicators specific to AI-assisted operations. Incident response playbooks also need refreshing: AI-assisted campaigns run at higher operational tempo than human-paced intrusions, with more simultaneous activity across more targets, and playbooks designed for the older pattern will be too slow.
Track the AISI evaluations and OpenAI Preparedness Framework classifications and treat vendor announcements as unverified until corroborated by one of those two sources.
From a policy standpoint, AISI is doing valuable work – its independent evaluations of Mythos Preview and GPT-5.5 are exactly the authoritative evidence that good policy requires, and the model of pre-deployment testing with public disclosure is worth extending and resourcing. The OpenAI Preparedness Framework approach, where labs classify their own models against defined capability thresholds and apply corresponding safeguards, is a workable interim mechanism, though it depends entirely on the good faith of the labs doing the classifying. Neither instrument addresses the core problem: capabilities that are frontier today become a commodity within 18 months, and no access control regime is likely to survive that transition intact.
What is actually needed is a shift in where policy effort is concentrated. The current debate centres on restricting access to the most capable models, which is a reasonable precaution but a losing strategy over any time horizon longer than a product cycle. The more durable intervention is on the defensive side: mandating minimum patching timelines for critical infrastructure operators when AI-assisted discovery produces a CVE wave, funding open-source security tooling at a scale commensurate with the problem, and building the kind of national vulnerability remediation capacity that matches the discovery capability now becoming available. The UK’s £90 million announced at CYBERUK 2026 is directed primarily at SME ‘cyber hygiene’ over 3 years and is welcome ... as far as it goes. The Cyber Security and Resilience Bill, currently progressing through Parliament toward Royal Assent later this year, updates the 2018 regulatory framework and extends obligations to managed service providers for the first time. Both are steps in the right direction. Neither addresses the core problem, which is that AI-assisted vulnerability discovery is now operating at a scale and speed that existing patching infrastructure, regulatory timelines and funding envelopes are not adapted to handle.
AI security tooling has moved from research demonstration to production relevance on both sides of the fence. Defensive applications – alert triage, vulnerability discovery, false-positive filtering, autonomous penetration testing – are delivering clear improvements against prior baselines. Offensive capabilities have lowered the barrier for mid-tier criminal actors and enabled more scalable intrusion campaigns.
Where This Leaves Us
The variable most organisations are neglecting is the speed at which these capabilities propagate. Capabilities at the frontier today tend to diffuse into open-weight models within 12 to 18 months, at a fraction of the cost, and with far broader accessibility. The gap between attacker access to capable AI and defender integration of capable AI is, in most enterprises, widening. The tools to close it are already commercially available, but the issue is adoption.
We can now find vulnerabilities at scale; the question is whether we can adapt systems, processes and capacity to fix them at comparable speed.


NCSC has provided extensive guidance here to UK organisations:
Retaining defensive advantage in the age of frontier AI cyber capabilities
https://www.ncsc.gov.uk/blogs/retaining-defensive-advantage-in-the-age-of-frontier-ai-cyber-capabilities
Preparing for a ‘vulnerability patch wave’
https://www.ncsc.gov.uk/blogs/prepare-for-vulnerability-patch-wave
10 questions to ask when using AI models to find vulnerabilities
https://www.ncsc.gov.uk/blogs/10-questions-ask-using-ai-models-find-vulnerabilities
And more generally, on AI adoption for cyber defence and beyond:
Supporting AI adoption for UK cyber defence
https://www.ncsc.gov.uk/blogs/supporting-ai-adoption-for-uk-cyber-defence
Thinking carefully before adopting agentic AI
https://www.ncsc.gov.uk/blogs/thinking-carefully-before-adopting-agentic-ai
Careful adoption of agentic AI services (with Five Eyes peers)
https://www.cyber.gov.au/business-government/secure-design/artificial-intelligence/careful-adoption-of-agentic-ai-services
Software Bill of Materials (SBOM) for Artificial Intelligence - Minimum Elements (with G7 partners)
https://www.bsi.bund.de/SharedDocs/Downloads/EN/BSI/KI/SBOM-for-AI_minimum-elements.html
Understanding adversarial attacks against Machine Learning and AI
https://www.ncsc.gov.uk/paper/understanding-adversarial-attacks-against-machine-learning-and-ai
I like your summary. Two areas felt understated.
Firstly, the end-to-end software supply chain feels under significant strain. The SBOM for AI framework helps tell people what they have and where it came from. But there is an open-source patching crisis, which looks like an AI-induced throughput problem: too many vulnerabilities, and too few people with the authority and bandwidth to close them. Has the exploitation window effectively gone negative? Perhaps people underestimate the extent to which open-source components are integrated into their 'paid for' software products. These hidden risks (dependencies and shared frameworks) may only be exposed when volunteer maintainers announce they have a “capacity problem”. But that's a lack of requisite imagination. We can also see that we have been running the software world on the cheap for decades. So the most sensible policy step would be for governments to ensure investment in the bottleneck: the unpaid volunteer community, some of whom are feeling overwhelmed by a flood of AI-generated vulnerability reports.
Secondly, piloting LLM-augmented defensive workflows for alert triage in your SIEM feels like ‘last year's challenge’. If this isn't already in place then ‘ouch!’ get on with it fast. This year, those running a SIEM should be figuring out how to make it an agentic SOC, where agents enrich content, correlate alerts, run investigations and manage cases, turning detections into hunting and immediate actions. If this feels too much, then consider shifting to buying a managed service from someone who can do this for you. I'm in an SMB which has practised 'being cybersmart' for years, yet we currently have requests for quotes from MSPs to upgrade to Falcon Complete (we would now rather pay CrowdStrike to deliver remediation and resolve attacks than assign us homework). We also no longer have enough confidence that our locked-down Google identity management (hardware FIDO2 keys with passkeys which are irritating and comforting in equal measure) is sufficient.