Spectra Assure Free Trial
Get your 14-day free trial of Spectra Assure
AI coding assistants could be tremendous aids for enhancing the quality and security of software. But as deployed now, they’re usually not. At too many organizations, AI coding assistants aren’t just creating code 10 times faster; they’re also introducing software quality and security issues at similarly blistering velocity.
The result is an acceleration of the degradation in software quality that began years ago, a point recently made by Denis Stetskov, a longtime developer working with NineTwoThree Studio, in his Substack. We are, he wrote, currently living through the “greatest software quality crisis in computing history.”
Denis Stetskov: This isn’t about AI. The quality crisis started years before ChatGPT existed. AI just weaponized existing incompetence.
Exhibit A for Stetskov was Apple’s September release of a refreshed Calculator app in macOS 26 with a memory leak that consumed 32GB of RAM, more memory than many computers had a decade ago. A generation ago, he wrote, this kind of bug would have prompted herculean efforts to fix it and analyze what went wrong. “Today, it’s just another bug report in the queue,” and it barely makes the news.
But with AI coding, the bug-fix queue is rapidly going from barely manageable to nearly impossible to prioritize and address — at least with traditional development models.
Here’s why the great software quality crisis has come to a head — and how you can get out in front of it by adopting a spec-driven development model.
Over the past couple of weeks, new studies have come out that validate Stetskov’s claim about AI-enhanced coding. The data confirms that nearly everyone is using AI coding agents and shipping code much faster.
But nearly everyone also implicitly knows that quality and security consequences from this increased velocity are likely creating a net-negative business impact at many organizations. And few businesses are addressing the problem; most simply don’t have enough visibility or control over their AI coding practices to understand or manage the fallout.
In the most recent study from Cycode, which is based on a poll of 400 CISOs, application security (AppSec) directors, and DevSecOps managers, nearly all of the respondents — 97% — said that their organizations are already using or piloting AI coding assistants in their development workflows. And almost one-third of the respondents said that AI now generates most of their code. The result for 78% of the organizations is increased developer productivity.
But here’s the rub:
Another study, by the developer platform DX, examined the working habits of more than 135,000 developers across hundreds of companies. Engineering users of AI tools reported that, on average, they were saving 3.6 hours per week and shipping 60% more pull requests than non-users.
But the DX study also suggests that existing development bottlenecks can eat up those time savings. As Rob Bowley, a product and technology consultant for Pragmatic Partners, noted in a recent analysis of the DX report, “meetings, interruptions, review delays, and CI wait times cost developers more time than AI saves.”
Rob Bowley: You can save four hours writing code faster, but if you lose six hours to slow builds, context switching, poorly-run meetings, the net effect is negative.
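Bowley's arithmetic can be written out as a minimal sketch. The function name and parameters below are illustrative assumptions, not part of the DX report; the four-hour and six-hour figures are the ones from his quote.

```python
def net_hours_per_week(hours_saved_by_ai: float, hours_lost_to_friction: float) -> float:
    """Net weekly change in developer time; a negative value is a net loss."""
    return hours_saved_by_ai - hours_lost_to_friction


# Bowley's illustrative figures: four hours saved by AI, six hours lost
# to slow builds, context switching, and poorly run meetings.
bowley_net = net_hours_per_week(4.0, 6.0)
print(bowley_net)  # -2.0
```

The point of the sketch is only that the savings and the friction sit in the same ledger: shrinking the second argument matters as much as growing the first.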
His conclusion is that AI may accelerate processes but it’s not going to fix broken ones. And that can have extreme consequences for quality and technical debt.
The DX report’s data on the quality impact of AI-assisted coding offers insights into how this plays out on the macro scale. The study shows a very slight average improvement in quality metrics such as change failure rates (CFRs), change confidence, and code maintainability. But these averages hide an extremely high degree of variability from organization to organization.
In a preview of this report data, DX deputy CTO Justin Reock observed that, while industry averages suggest modest improvements across all quality metrics, “the reality is far more nuanced, with results varying by over 40 points between companies and ranging from significant quality improvements to concerning degradations.”
Bowley said this variability in CFRs isn’t noise, but rather one of the most important signals in DX’s data.
Rob Bowley: Strong quality practices get faster. Weak practices accumulate debt faster. The organizations seeing genuine gains are those already practicing modern software engineering. Those practices remain rare.
The real problem with AI’s impact on quality is the way it distorts complexity, said Aleksey Stukalov of the IntelliJ IDEA division. In a recent LinkedIn post, he noted that he had to clean up 60% of the code produced by an AI agent in a recent project, adding that AI “flips the table” on software development’s “continuous fight with complexity.” After decades of work by the industry to cage complexity with managed services, frameworks, and maintainable architectures, AI throws it all to the wind, he said.
Aleksey Stukalov: Generation is easy. It ships code with no effort and ignores trade-offs. Maintenance is hard. Coupling, cohesion, run costs, reliability: [that’s] all your problem. [Imagine that] somebody gifted you a dog. Zero purchase price, daily work for decades. Feeding, training, vet bills, walks: all yours. Enjoy!
As a result, Stukalov said, he worries that we’re losing the fight against entropy in source code — which inevitably results in a cascade of reliability and security issues. Therein lies the problem: When AI-generated code is viewed outside the context of architectural choices and risk management tradeoffs, the productivity gains seem endless. But once you add in those issues, the picture changes.
When teams ship commits at 10 times the previous rate, the overall math changes, said Joe Magerramov, a vice president and distinguished engineer at Amazon Web Services, in a recent exploration of the new calculus of AI-based coding.
Joe Magerramov: What used to be a production-impacting bug once or twice a year can become a weekly occurrence. Even if most bugs get caught in integration or testing environments, they will still impact the shared code base, requiring investigation and slowing the rest of the team down.
AI acceleration also adds strain to decision-making schedules, because higher throughput means a faster rate at which choices have to be made, he said. “Should we use this caching strategy or that one? How should we handle this edge case? What's the right abstraction here?”
Joe Magerramov: At normal velocity, a team might make one or two of these decisions per week. At 10x velocity, they are making multiples each day. This means we need to fundamentally rethink how we approach building software. CI/CD pipelines designed for 10 commits per day will buckle under 100.
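To make the pipeline point concrete, here is a rough sketch of how a backlog builds when commit volume outruns CI capacity. It assumes a deliberately simple model that is not from Magerramov's post: a fixed number of commits arrive each day, the pipeline clears a fixed number each day, and anything left over waits until the next day. The function name and the five-day window are hypothetical.

```python
def ci_backlog(commits_per_day: int, pipeline_capacity_per_day: int, days: int) -> list[int]:
    """Return the number of commits still waiting for CI at the end of each day."""
    backlog = 0
    history = []
    for _ in range(days):
        # New commits join the queue; the pipeline clears up to its daily capacity.
        backlog = max(0, backlog + commits_per_day - pipeline_capacity_per_day)
        history.append(backlog)
    return history


# A pipeline sized for 10 commits per day, at normal and at 10x velocity.
normal = ci_backlog(commits_per_day=10, pipeline_capacity_per_day=10, days=5)
accelerated = ci_backlog(commits_per_day=100, pipeline_capacity_per_day=10, days=5)
print(normal)       # [0, 0, 0, 0, 0]
print(accelerated)  # [90, 180, 270, 360, 450]
```

Real pipelines batch, parallelize, and fail in messier ways, but even this toy model shows why the queue never drains once arrivals exceed capacity.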
Getting to a place that fully accounts for the new calculus will take rethinking more than process flows and tooling. Org charts and fundamental KPIs around engineering success will also need to be re-examined.
A growing contingent of engineering thought leaders believe that many answers will be found in what some are calling “spec-driven development,” where coding excellence and quality are determined by how well engineers handle the specifications that drive AI agents to generate code.
This is the direction in which the so-called godfather of DevOps, Patrick Debois, is going with his thinking about AI and software quality. Debois predicts that specification-driven development will be the next fundamental shift in how the industry thinks about building software in an AI-native era. And he believes that the future of software quality will likely be intertwined with evaluating the quality of specs, he wrote recently.
Patrick Debois: Just as we developed test coverage metrics and code quality tools, we’ll need ways to evaluate spec quality. How complete is your spec? How testable? How maintainable? These metrics don’t exist yet, but they’re coming.
Spec-driven development pushes forward the idea that developers will shift from being the writers of code to being the managers of what takes over the writing of it. Reyk Flöter, director of engineering at Kraken, wrote recently that developers today have to iterate prompts well and understand the nuances of the various models they are working with.
Reyk Flöter: This is about setting expectations and adjusting for the different ways they interpret instructions. Engineering managers do this every day; different reports have different communication styles and strengths.
The reality is that AI coding is also reshaping the developer career pipeline, and this new engineering calculus will be one of the biggest obstacles to sustaining spec-driven development. In the near term, many organizations will see AI as a way to drastically slim down their development organizations by replacing junior developers with AI agents. But that creates a number of problems.
Stetskov, in his Substack post, said having no juniors today will result in having no seniors tomorrow. And that equals having no one to direct the AI agents with specs and no one to fix what AI breaks in the future. “[Senior] developers don’t emerge from thin air,” he said.
Following this to its logical conclusion, the future of quality software is going to rest on sound management that thinks not only about the sustainability of workflows, but also about the long-term health of the talent pipeline. The reality, said Flöter, is that AI doesn’t fix organizational woes. In fact, it makes them more apparent.
Reyk Flöter: The teams that thrive will not treat AI as an escape hatch from thinking; they will treat AI as a multiplier for good engineering disciplines, staying close to the craft while learning to manage leverage with precision. Management and creation start to merge. The future belongs to those who can do both, who can guide intelligent systems and still understand what the code is supposed to do when you face an outage during a 3 a.m. incident.