Microsoft said in a blog post that a new AI-driven security system helped researchers uncover 16 previously unknown vulnerabilities across the Windows networking and authentication stack, including four Critical remote code execution flaws.
The company said the weaknesses were found in components including the Windows kernel TCP/IP stack and the IKEv2 service, underscoring how deeply the defects reached into the software that underpins network traffic and secure connections. The findings came from a system Microsoft calls the Security multi-model agentic scanning harness, or MDASH, a platform built by its Autonomous Code Security team and already being used by Microsoft security engineering teams.
Microsoft said MDASH orchestrates more than 100 specialized AI agents across an ensemble of frontier and distilled models, a design meant to move vulnerability research beyond single-model tooling. The system sits inside a structured pipeline for discovery and remediation, and Microsoft said it is being tested by a small set of customers in a limited private preview. The company framed the work as a shift from AI that assists researchers to AI that can carry much of the burden of finding and validating bugs at enterprise scale.
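Microsoft has not published MDASH's internals, but the orchestration pattern it describes — many specialist agents fanned out over the same code, with their findings aggregated — can be illustrated in miniature. The sketch below is entirely hypothetical: the agents, the `Finding` type, and the voting threshold are illustrative stand-ins, not Microsoft's design.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    kind: str

# Hypothetical "agents": each is a specialist that scans source text
# for one class of defect and returns candidate findings.
def overflow_agent(path, text):
    return [Finding(path, i + 1, "overflow")
            for i, ln in enumerate(text.splitlines()) if "memcpy" in ln]

def auth_agent(path, text):
    return [Finding(path, i + 1, "auth-bypass")
            for i, ln in enumerate(text.splitlines()) if "is_admin" in ln]

def orchestrate(agents, sources, min_votes=1):
    """Fan every source file out to every agent and aggregate the
    findings, keeping only those reported by at least `min_votes`
    agents (a crude stand-in for cross-model validation)."""
    votes = {}
    for path, text in sources.items():
        for agent in agents:
            for f in agent(path, text):
                votes[f] = votes.get(f, 0) + 1
    return [f for f, n in votes.items() if n >= min_votes]
```

In a real system the agents would be model-backed analyzers rather than string matches, and the aggregation step is where an ensemble can trade single-model blind spots for consensus, which is the design rationale Microsoft's post gestures at.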
The numbers were central to Microsoft’s pitch. On a private test driver with 21 planted vulnerabilities, the company said MDASH found all 21 and produced zero false positives. It also said the system reached 96% recall against five years of confirmed Microsoft Security Response Center cases in clfs.sys, and 100% recall in tcpip.sys. On the public CyberGym benchmark, which includes 1,507 real-world vulnerabilities, Microsoft said the system scored 88.45%, the top mark on the leaderboard and about five points ahead of the next entry.
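The two metrics behind those numbers are standard: recall is the share of known, confirmed vulnerabilities the scanner rediscovers, and zero false positives means perfect precision on the flagged set. A minimal sketch of both, using made-up identifiers rather than real case data:

```python
def recall(found, known):
    """Fraction of known true vulnerabilities that were rediscovered."""
    known = set(known)
    return len(known & set(found)) / len(known) if known else 0.0

def precision(found, known):
    """Fraction of reported findings that are true vulnerabilities."""
    found = set(found)
    return len(found & set(known)) / len(found) if found else 0.0

# Hypothetical run mirroring the planted-bug test: 21 seeded
# vulnerabilities, all 21 reported, nothing spurious reported.
planted = {f"bug-{i}" for i in range(21)}
reported = set(planted)
```

On this toy run both metrics come out to 1.0, matching the "found all 21, zero false positives" claim; the 96% clfs.sys figure would correspond to rediscovering all but a small fraction of the historical MSRC cases.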
The broader significance is that Microsoft is presenting MDASH as more than a lab experiment. The post said the work came from collaboration between Autonomous Code Security and Microsoft Windows Attack Research and Protection, with several members of the ACS team having arrived from Team Atlanta, the group that won the $29.5 million DARPA AI Cyber Challenge by building an autonomous cyber-reasoning system that found and patched real bugs in complex open-source projects. That background gives Microsoft’s latest claim extra weight: the company is not just testing AI against toy problems, but against the kind of defects that can turn into real-world Windows exploits.
The tension is that Microsoft is still describing a limited rollout while also presenting the tool as production-ready enough for its own engineering teams. MDASH is already in use inside the company and in private preview with a small group of customers, but the post does not say when it will be broadly available or how much of the workflow will still require human review. For now, Microsoft is betting that the same agentic approach that helped produce 16 new findings in Windows can scale into a standard part of security work — and that the next round of bugs will be caught before attackers get there first.

