The sheer speed and scale of AI-generated software is overwhelming many of the security teams tasked with assessing software packages for things such as vulnerabilities and logic flaws — creating what many see as a dangerous and growing imbalance between code creation and security validation.
When QA and security teams can't keep up, their organizations are left exposed to mounting technical debt, heightened supply chain risk, and a greater likelihood of vulnerabilities reaching production environments.
At the same time, the expanding use of AI to hunt for vulnerabilities, an activity engaged in by both researchers and adversaries, and the emergence of next-generation tools such as Claude Mythos to accelerate the hunt are compounding the issue, forcing security teams to contend with far more flaws than they can realistically remediate.
Here’s what you need to know about the imbalance — and what you can do about it, including leveraging AI to fight AI.
[ See webinar: Stop Trusting Packages. Start Verifying Them. ]
Security experts observe that even well-resourced teams with mature processes are caught in a bottleneck and have had to shift their focus from vulnerability discovery to triage and remediation.
The situation is particularly acute in the open-source ecosystem. Effective March 27, HackerOne paused new submissions to its Internet Bug Bounty program, citing how difficult it has become for maintainers to validate and fix the deluge of newly reported vulnerabilities. The cURL Project took the same step in January after AI-generated submissions overwhelmed its security team.
In cloud environments, human-driven vulnerability remediation has become unsustainable. A recent Sysdig survey found that organizations' ability to remediate critical and high-severity vulnerabilities in their cloud environments appears to have plateaued, despite mature tools and processes and proper prioritization techniques.
The report said that AI is enabling proof-of-concept and exploit development faster than humans can respond:
“We must face an uncomfortable truth: Organizations have optimized human workflows as far as they can, but have reached a vulnerability ceiling despite mature processes. The problem isn’t from a lack of effort but a shift in the battlefield.”
—Sysdig survey report
Tools such as ChatGPT and Claude, along with coding-focused platforms such as GitHub Copilot and Amazon CodeWhisperer, let everyone from experienced developers and software engineers to so-called vibe coders rapidly generate functional code with minimal oversight, or none at all, as the prevalence of what is being referred to as shadow AI suggests. Such activity happens outside formal software development lifecycles, bypassing established security reviews, code repositories, and governance controls.
How widespread is AI coding among developers? SonarSource reported earlier this year that its survey of about 1,150 software developers found that 72% use AI tools daily to write code. Respondents said AI currently generates 42% of their code, and they expect that percentage to increase by 50% by 2027.
Developers, the report added, are using AI to build prototypes, to develop production-grade software for internal use, for customer-facing applications, and in mission-critical environments. And although nearly all respondents (96%) expressed doubts about the functional correctness of AI-generated code, only 48% review that code before committing it to production.
Incorrect functionality isn't the only problem. Tests that Veracode ran last year on AI-generated code in Java, Python, C#, and JavaScript environments showed that AI models introduced a security vulnerability in nearly half of the test cases.
Randolph Barr, CISO of Cequence Security, said the game has changed for application security (AppSec), even though the principles of secure development haven’t really changed. Practices such as “shift left,” threat modeling, and secure code review are all still very relevant.
“[AI-assisted] coding just blew up the throughput. A developer using Copilot or Cursor can produce in an afternoon what used to take a week. Our security review processes were never built for that pace.”
—Randolph Barr
He said reviewers can be lulled into trusting AI-generated code because it looks clean and confident and raises no obvious red flags, even when it is simply wrong. And developers are now more prone to shipping logic they haven't fully reasoned through because they accepted a suggestion rather than writing it themselves.
Barr is concerned by another big issue: AI doesn’t know an organization’s systems as well as a human developer does. It knows public patterns, but it doesn’t know, for instance, an organization’s specific tenant-isolation model or authorization boundaries — and that gap between “generically correct” and “correct for our architecture” is exactly where security problems live, he said.
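Consider what that gap can look like in practice. Below is a minimal, hypothetical sketch in Python; the data model, function names, and tenant IDs are invented for illustration and don't come from any real codebase. The first function is the kind of clean, generically correct code an assistant might plausibly produce; the second is what a reviewer who knows the organization's tenant-isolation model would insist on.

```python
# Illustrative only: an in-memory stand-in for a multi-tenant data store.
INVOICES = {
    101: {"tenant_id": "acme", "amount": 1200},
    102: {"tenant_id": "globex", "amount": 450},
}

def get_invoice_ai_style(invoice_id: int) -> dict:
    """What an assistant might generate: clean, readable, and functionally
    correct for a single-tenant app. But it never checks who is asking,
    so any caller can read any tenant's invoice."""
    invoice = INVOICES.get(invoice_id)
    if invoice is None:
        raise KeyError(f"invoice {invoice_id} not found")
    return invoice

def get_invoice(invoice_id: int, caller_tenant_id: str) -> dict:
    """The architecture-aware version: ownership is verified before
    anything is returned."""
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice["tenant_id"] != caller_tenant_id:
        # Same error for "missing" and "not yours" avoids leaking existence.
        raise PermissionError(f"invoice {invoice_id} not accessible")
    return invoice

if __name__ == "__main__":
    print(get_invoice_ai_style(102))       # leaks globex's data to any caller
    print(get_invoice(101, "acme"))        # allowed: acme reads its own invoice
    try:
        get_invoice(102, "acme")           # blocked: cross-tenant access
    except PermissionError as e:
        print("blocked:", e)
```

Nothing about the first function looks wrong in isolation, which is exactly Barr's point: the flaw only exists relative to an authorization boundary the model never knew about.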
Barr said that the short timespan between vulnerability discovery and exploitation is something new in his 20 years of experience. “When I started, a new threat would emerge and you had months to study it, respond, and adapt. That’s gone,” he said. “The gap between a capability appearing and it being widely adopted before anyone fully understands the risk is now weeks.”
There’s only one way to fight that, Barr said.
“The organizations that handle this well won’t be the ones that slow AI adoption down; they’ll be the ones whose security teams are running at the same speed as their developers.”
—Randolph Barr
That means using AI to battle AI coding, an approach trumpeted by others who have watched the threat landscape evolve. Jeff Williams, CTO at Contrast Security, said forward-thinking organizations will continuously produce software with strong and verifiable security properties.
“The future belongs to whoever can build automated software factories that reliably produce secure code and generate the assurance case to prove it. That is the real shift coming into view.”
—Jeff Williams
It’s his belief that organizations and the industry in general have spent far too long treating security as an endless penetrate-and-patch exercise, where software producers find some flaws, fix a few, and call the rest risk management. “As AI makes insecurity more visible, that model starts to look inadequate,” he said.
The new security challenge boils down to this: Development and AppSec teams have to discover and remediate their flaws before someone else does. And that, Williams said, means using AI to prevent, find, and remediate vulnerabilities. Vulnerability prioritization will matter less as the focus shifts to minimizing potential exposure windows.
“If defenders cannot find and fix their own vulnerabilities incredibly quickly, AI-assisted attackers will find and exploit them instead.”
—Jeff Williams
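One way to read Williams' point as engineering practice is a pipeline gate that attempts automated remediation and blocks the build on any critical flaw that remains, so the exposure window for a critical finding never outlives a pipeline run. The sketch below is illustrative only: run_scanner(), suggest_patch(), and apply_and_test() are hypothetical stand-ins for whatever scanner, AI remediation step, and test harness an organization actually uses.

```python
import sys
from dataclasses import dataclass

@dataclass
class Finding:
    identifier: str
    severity: str       # "critical", "high", "medium", or "low"
    description: str

def run_scanner() -> list[Finding]:
    """Placeholder for a real scanner invocation (SAST, SCA, etc.)."""
    return [Finding("VULN-1", "critical", "SQL injection in report export")]

def suggest_patch(finding: Finding) -> str | None:
    """Placeholder for an AI-assisted remediation step; returns a diff or None."""
    return None  # assume no automatic fix was produced in this run

def apply_and_test(patch: str) -> bool:
    """Placeholder: apply the patch on a branch and run the test suite."""
    return False

def gate() -> int:
    unresolved = []
    for finding in run_scanner():
        patch = suggest_patch(finding)
        if patch and apply_and_test(patch):
            print(f"auto-remediated {finding.identifier}")
            continue
        unresolved.append(finding)
    # Fail the build on any unremediated critical finding, so a critical
    # flaw can never ship past this pipeline run.
    criticals = [f for f in unresolved if f.severity == "critical"]
    for f in criticals:
        print(f"BLOCKING: {f.identifier}: {f.description}")
    return 1 if criticals else 0

if __name__ == "__main__":
    sys.exit(gate())
```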
Sysdig's survey found that many organizations are responding to threats arriving at machine speed by deploying agentic AI to triage alerts, investigate risks, and even initiate automated remediation actions with minimal human intervention.
Humans remain important for overseeing autonomous agents and setting guardrails and policies for their safe operation.
“Autonomous remediation, executed within human‑driven guardrails, is how organizations will keep pace with shrinking exploit timelines.”
—Sysdig survey report
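What "human-driven guardrails" can mean in code is a policy that humans set once and the agent cannot step outside. The sketch below is a hypothetical illustration, not any vendor's product: the policy values, severity ranks, and the Alert shape are invented, and real agent frameworks expose their own equivalents.

```python
from dataclasses import dataclass

# Humans define the policy; the agent operates only inside it.
GUARDRAILS = {
    "max_auto_severity": "medium",   # agent may act alone only up to here
    "allowed_actions": {"quarantine_image", "pin_dependency", "revoke_token"},
}

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}

@dataclass
class Alert:
    identifier: str
    severity: str
    proposed_action: str

def triage(alert: Alert) -> str:
    """Decide whether the agent may remediate on its own or must escalate."""
    within_severity = (
        SEVERITY_RANK[alert.severity]
        <= SEVERITY_RANK[GUARDRAILS["max_auto_severity"]]
    )
    within_actions = alert.proposed_action in GUARDRAILS["allowed_actions"]
    if within_severity and within_actions:
        return f"auto-remediate: {alert.proposed_action} for {alert.identifier}"
    return f"escalate {alert.identifier} to human review"

if __name__ == "__main__":
    print(triage(Alert("ALRT-7", "low", "pin_dependency")))         # agent acts
    print(triage(Alert("ALRT-8", "critical", "quarantine_image")))  # human decides
```

The design choice here is the one the Sysdig report describes: automation handles the routine, bounded cases at machine speed, while anything outside the envelope lands in front of a person.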
Learn how to leverage ML-BOMs to gain immediate visibility into every LLM in your environment.