It was probably inevitable that vibe coding would cause a cascade of complex changes to coding practices and software development. As AI-generated code continues to comprise a larger share of real-world codebases, one of the most consequential changes has been a major shift in the quantity of cybersecurity risks—and the nature of those security risks.
Vibe coding, in a nutshell, is the notion of building products with nothing but AI tools and natural language prompts. It should be understood as distinct from agentic coding or agentic engineering, which entail more deliberate use of AI coding tools by an experienced programmer. The term was coined in a tweet from Andrej Karpathy, an esteemed developer and AI veteran, in which he described a freewheeling style of AI coding that relies entirely on large language model (LLM) prompting, letting the user “fully give in to the vibes” and “forget that the code even exists.”
He described it as a fun approach to “throwaway weekend projects,” but the proliferation of powerful AI coding has opened the door to vibe coding in the workplace—even in production software handling sensitive data. As of late 2025, industry research indicates that AI-generated code now represents 22% of all merged code.1 On GitHub, reports estimate that 46% of all written code was generated by Copilot.2
This has created a major expansion in vulnerabilities and security issues beyond what could be explained by the natural increase in volume and velocity of new code that coding assistants enable. One analysis of real-world codebases found that while AI-assisted teams were shipping code 4 times faster, they were shipping 10 times as many security flaws.3
Abandoning vibe coding altogether would sacrifice significant productivity gains from automation and increased participation, but it’s clear that secure coding practices must evolve in response. Traditional security checks are designed to address how developers—more specifically, trained human developers—typically make mistakes. Organizations now need an updated security posture and updated security tools, tailored to the new reality driven by AI agents and novice vibe coders.
Across different platforms, environments and LLMs, industry research consistently indicates that increased AI code generation is accompanied by increased application security (AppSec) vulnerabilities.
For example:
A December 2025 study of open-source repositories found that AI-generated code introduced security vulnerabilities in 45% of development tasks. Comparing AI-assisted pull requests (PRs) against human-only PRs, the researchers found that AI-powered PRs generated 2.74 times more security issues (and 1.7 times more issues overall) than human-authored code.4
A 2026 GitGuardian report on secret sprawl found that Claude Code-assisted commits exposed secrets more than twice as often as human-only commits. All told, hardcoded secrets (such as passwords and API keys) exposed in public GitHub commits increased by 34% year-over-year in 2025, the largest single-year jump on record.5
An October 2025 report noted that critical vulnerabilities and data exposures in vibe-coded applications, such as misconfigured APIs and other authentication issues, were often out “in the open,” accessible directly through public endpoints.6
Security teams should also consider that public repositories on GitHub (and similar services) are only the visible tip of the iceberg. GitGuardian suggests that internal repositories are six times more likely to contain hardcoded secrets than public repos.
Of equal concern is the fact that sensitive credentials exposed through vibe coding are apparently less likely to be addressed and remediated than security issues that arise through traditional coding workflows. GitGuardian’s report noted that almost 70% of exposed credentials that had been validated as legitimate in 2022 remained valid through January 2025. 64% were still exposed and unrevoked as of January 2026. In one notable incident, Football Australia inadvertently exposed its AWS access keys in its website’s source code—they remained exposed for over 700 days.
Taken together, these patterns suggest that vibe coding security risks are accumulating at a rate greater than what traditional remediation programs were designed to handle.
The frequency and magnitude of AI-derived security flaws seem to persist even as LLM functionality continues to improve. Industry observers have therefore suggested that the training pipelines and benchmarks used to optimize and validate model performance aren’t always naturally aligned with secure code generation in the real world.4 Most studies evaluate AI-generated code in isolation, with a focus on correctness—whether it simply compiles, runs and executes its task—and performance on synthetic benchmarks.7
But in enterprise development environments, models don’t operate in isolation. GitGuardian’s report found that LLM-related infrastructure, such as services for orchestration, retrieval augmented generation (RAG) or vector storage, exposed secrets at a rate five times greater than that of core model providers.5
In other words, many security risks of vibe coding aren’t derived from LLMs: they’re derived from the ecosystem in which the models are used. Newly released inference providers, gateways, registries and integration layers enter production workflows quickly—often too quickly for security postures (or novice vibe coders) to keep up.
Agentic AI adds further complexity to security considerations: when an AI agent is granted local access to files, credential stores and terminals, the computer itself can become another potential attack surface. Prompt injection and supply-chain attacks that leverage exposed local credentials can snowball into organizational risks. In the infamous Shai-Hulud worm, which compromised the world’s largest JavaScript registry in late 2025, almost 60% of compromised machines weren’t personal workstations—they were continuous integration/continuous delivery (CI/CD) runners4 (whose purpose, ironically, is to automate security testing and ensure code quality).
An important exacerbating factor for vibe coding security risks is the way that AI-driven code changes are typically packaged: On average, AI-assisted developers produce over three times more commits than their analog peers, but package those commits into significantly fewer—and much bigger—pull requests (PRs). These vibe-coded PRs are usually larger in scope, altering multiple files and services.
Cursor’s “Developer Habits Report,” released in Spring 2026, noted that the average number of lines of code per PR had risen by roughly 250% year over year (and that the growth rate itself was accelerating). “Mega PRs,” which Cursor defined as PRs containing changes to at least 1,000 lines of code, rose from an 8% share of all PRs in January 2025 to 13.9% of all PRs in May 2026.8
Big, sprawling PRs that touch many parts of the codebase are much harder to exhaustively review than smaller, more targeted PRs. Isolated changes to one service or code block might require additional changes elsewhere to avoid downstream issues, but those ancillary effects are less likely to be caught in code review if the PR being reviewed takes a shotgun approach that dilutes the reviewer’s attention. Research from Apiiro, an AppSec platform, found that while PR volume for AI-assisted teams fell by nearly a third, those teams shipped 10 times more security defects.3
In short, AI accelerates every part of code creation—including the bad habit of implementing too many changes at once. Concentrating and accelerating change can also concentrate and accelerate the risk of each merge.
A large-scale study of 500,000 code samples, presented at the 36th IEEE International Symposium on Software Reliability Engineering (ISSRE 2025), determined that “AI-generated code differs from human-written code not only in the number of security vulnerabilities but also in their nature and distribution.”
This raises an existential concern: existing frameworks for assessing code quality and security vulnerabilities are “human-centered, designed around assumptions about human cognition, error modes, and review processes.” AI-generated code demonstrates markedly different characteristics and tendencies, making those human-centric frameworks ill-suited to evaluating it.7
Compounding the issue, research into vibe coding security risks has also revealed multiple emergent types of attacks that have no equivalent in human-coded software.
LLMs, and the AI coding tools they power, have a tendency to recommend packages that don’t actually exist—and research has found that these hallucinated package names are sometimes repeated across different scenarios and even different models. Slopsquatting, a type of package hallucination attack, exploits this tendency.
Slopsquatting attackers observe the names of hallucinated packages that LLMs repeatedly generate and pre-register these (previously) fictitious package names on public registries. When vibe coders, often altogether unfamiliar with commonly used packages (and thus unlikely to spot a fake one), commit code containing these package names, they are unknowingly installing malicious code.
This has no analogue in traditional software development. A human programmer would be unlikely to invent a package that doesn’t exist, and even if one did, it would be impractical, if not impossible, for a would-be attacker to predict this, guess the name of the nonexistent package and register malicious code under that package name.
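One practical defense against slopsquatting is to refuse any dependency that has not been explicitly vetted. The following is a minimal sketch of that idea, not a complete supply-chain control: it assumes dependencies are declared in a requirements.txt-style file, and the `VETTED_PACKAGES` allowlist and the function names are hypothetical.

```python
import re

# Hypothetical allowlist of packages the team has already reviewed.
VETTED_PACKAGES = {"requests", "numpy", "flask"}


def normalize(name):
    """Normalize a package name: lowercase, runs of '-', '_', '.' become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_requirement_names(requirements_text):
    """Extract normalized package names from requirements.txt-style text."""
    names = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as "-r other.txt"
        # The name is everything before a version specifier, extra or marker.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(normalize(match.group(0)))
    return names


def find_unvetted(requirements_text, vetted=VETTED_PACKAGES):
    """Return declared packages that are not on the vetted allowlist."""
    allowed = {normalize(name) for name in vetted}
    return [n for n in parse_requirement_names(requirements_text) if n not in allowed]


if __name__ == "__main__":
    sample = "requests==2.31.0\nfast-json-utilz>=1.0  # suggested by an assistant\nNumPy\n"
    for name in find_unvetted(sample):
        print(f"Unvetted dependency, review before installing: {name}")
```

A check like this only flags unfamiliar names; a reviewer still has to decide whether a flagged package is legitimate, a typo or a hallucinated name that an attacker has since registered.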
Rules file backdoor attacks exploit the configuration (“rules”) files used to customize AI agent behavior when generating or modifying code in platforms such as Cursor, GitHub Copilot and Claude Code.
Attackers typically modify popular open source rules files, embedding malicious instructions in the form of hidden Unicode characters that are nearly impossible to detect. Once these changes have been accepted, the malicious instructions are activated in every subsequent AI coding session that references the infected rules file. As a result, any code generated during such sessions can be compromised, and because the injected instructions consist of Unicode characters that are essentially invisible, the rules file itself appears clean in code review.
The attackers’ goal is usually to introduce vulnerabilities and “backdoors” in subsequently generated code (and the applications using that code) that they can exploit later.
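Because the hidden characters are invisible to a human reviewer but not to a program, a rules file can be scanned before it is accepted. The sketch below is one possible approach, assuming the rules file is UTF-8 text: it flags every character in the Unicode "Cf" (format) category, which includes zero-width characters, bidirectional controls and tag characters. The function name is hypothetical, and legitimate text in some languages and emoji sequences also uses format characters, so findings need human judgment.

```python
import unicodedata


def find_hidden_characters(text):
    """Return (line, column, code point, name) for each invisible format character."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            # Category "Cf" covers zero-width spaces and joiners, bidirectional
            # overrides, the byte order mark and Unicode tag characters.
            if unicodedata.category(ch) == "Cf":
                findings.append(
                    (line_no, col, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
                )
    return findings


if __name__ == "__main__":
    # Hypothetical rules file content with two hidden characters embedded.
    rules = "Always use TypeScript.\u200b\nPrefer const.\u202e\n"
    for line_no, col, code_point, name in find_hidden_characters(rules):
        print(f"line {line_no}, col {col}: {code_point} {name}")
```

Running a scan like this in a pre-merge check gives reviewers a visible signal for content they cannot otherwise see.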
Armed with an understanding of these specific patterns, organizations should update their security best practices to benefit from the advantages of AI-generated code without falling victim to the types of vulnerabilities that arise from indiscriminate vibe coding.
While the potential for these security defects may arise from the imperfections of AI coding tools, the realization of these risks is a failure of human oversight. Developers are, ultimately, still in control of what additions and changes get accepted, revised or ignored. Coding assistants continue to expand and improve their built-in guardrails, but human review remains the final barrier to security vulnerabilities going live—which is why IBM Bob’s default settings integrate manual reviews and human-in-the-loop approval at each stage.
That said, there are ways to improve the efficacy of human review.
Flaws buried within huge, sprawling pull requests are significantly harder to spot in a security check than those located in more focused PRs. Security teams should implement guidelines for PR size accordingly.
Packages should be given particular scrutiny to minimize risk of slopsquatting attacks. Security teams should note that with each new iteration of a given AI model, new package hallucination patterns may emerge.
Developers and security teams alike should familiarize themselves with the differences between human-written and AI-generated code defects.
Organizations should regard AI-generated code as a distinct input category requiring distinct, dedicated security controls (rather than simply integrating AI output into standard, human-oriented review processes).
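The PR size guideline above can be enforced automatically rather than left to reviewer discretion. As a rough sketch, the function below reads the tab-separated output of `git diff --numstat` (added lines, deleted lines, path) and flags a change set that reaches a line threshold. The 1,000-line default mirrors the "mega PR" definition cited earlier; the function names and the idea of wiring this into a CI check are assumptions for illustration.

```python
MAX_CHANGED_LINES = 1000  # mirrors the "mega PR" definition; tune per team


def summarize_numstat(numstat_text):
    """Count files and changed lines from `git diff --numstat` output."""
    files = 0
    changed_lines = 0
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # not a numstat row
        added, deleted = parts[0], parts[1]
        files += 1
        # Binary files report "-" for both counts: count the file, not lines.
        if added.isdigit():
            changed_lines += int(added)
        if deleted.isdigit():
            changed_lines += int(deleted)
    return files, changed_lines


def check_pr_size(numstat_text, max_lines=MAX_CHANGED_LINES):
    """Report whether a change set reaches the size threshold."""
    files, changed_lines = summarize_numstat(numstat_text)
    return {
        "files": files,
        "changed_lines": changed_lines,
        "too_large": changed_lines >= max_lines,
    }


if __name__ == "__main__":
    sample = "600\t350\tsrc/api.py\n40\t10\tsrc/auth.py\n-\t-\tassets/logo.png\n"
    print(check_pr_size(sample))
```

A failing check does not have to block a merge outright; it can simply require the author to split the change or request an additional security reviewer.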
As a broad axiom, organizations should treat any AI-derived code or modules as untrusted input in the same manner they’d treat external library code. Static application security testing (SAST) should be a mandatory gate through which all AI contributions must pass before entering the codebase. Third-party rules files, such as open source, preconfigured Cursor Rules,
The ongoing improvement of agentic engineering platforms presents an opportunity to fight fire with fire through AI-powered secrets detection, in which AI-specific credentials, such as LLM API keys or vector database tokens, have their own dedicated detection signatures.
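To make the idea of dedicated detection signatures concrete, the sketch below scans text against a small table of named patterns. Only the AWS access key ID pattern reflects a widely documented format; the `sk-` prefix for LLM API keys and the `vdb_` prefix for vector database tokens are assumptions chosen for illustration, not any vendor's actual key format. Real scanners typically pair signatures like these with entropy and validity checks.

```python
import re

# Named detection signatures. Patterns marked "assumed" are illustrative only.
SIGNATURES = {
    # AWS access key IDs: a 4-letter prefix followed by 16 uppercase alphanumerics.
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # Assumed format for an LLM provider API key.
    "llm_api_key_assumed": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    # Hypothetical format for a vector database token.
    "vector_db_token_assumed": re.compile(r"\bvdb_[A-Za-z0-9]{32}\b"),
}


def scan_for_secrets(text):
    """Return (line number, signature name, redacted preview) for each match."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SIGNATURES.items():
            for match in pattern.finditer(line):
                # Keep only a short prefix so the report never repeats the secret.
                findings.append((line_no, name, match.group(0)[:4] + "..."))
    return findings


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
    sample = 'key = "AKIAIOSFODNN7EXAMPLE"\ntoken = "sk-' + "a" * 24 + '"\n'
    for finding in scan_for_secrets(sample):
        print(finding)
```

Keeping signatures in a named table makes it straightforward to add a new entry whenever a new AI service enters the stack.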
IBM Bob, for instance, has built-in tools to enforce secure coding and access control at scale through a combination of Bob custom modes, HashiCorp Vault integration and IBM MCP Gateway. It likewise performs real-time code review, scanning code for complexity issues, potential vulnerabilities and refactoring opportunities that can be addressed inline or reviewed later in the
The ability to catch newly introduced defects in a code base is largely dependent on the readability of that code base. Establishing and enforcing code quality standards is therefore essential to facilitating productive security reviews. Shared standards and routine refactoring will maximize the chances of insecure code being identified and addressed.
1. “AI-assisted engineering: Q4 impact report”, DX, 2025
2. “Github Copilot Usage Data Statistics”, Tenet, 18 July 2025
3. “4x Velocity, 10x Vulnerabilities: AI Coding Assistants Are Shipping More Risks”, Apiiro, 4 September 2025
4. “October 2025 Update: GenAI Code Security Report”, Veracode, October 2025
5. “The State of Secrets Sprawl 2026: AI-Service Leaks Surge 81% and 29M Secrets Hit Public GitHub”, GitGuardian, 17 March 2026
6. “Methodology: How we discovered over 2k high-impact vulnerabilities in apps built with vibe coding platforms”, Escape, 29 October 2025
7. “Human-Written vs. AI-Generated Code: A Large-Scale Study of Defects, Vulnerabilities, and Complexity”, arXiv, 29 August 2025
8. “The Cursor Developer Habits Report: Spring 2026”, Cursor, 1 June 2026 (archived 4 June 2026)