The State of Vibe Coding Security in 2026: What Scanning Thousands of AI-Built Apps Reveals

Quick Answer

After scanning thousands of AI-generated applications built with tools like Cursor, Bolt, Lovable, and Claude Code, clear patterns emerge: 73% ship with unprotected API routes, 45% expose secrets in frontend bundles, and only 12% have meaningful test coverage. The code compiles, the UI looks polished, but the security and structural foundations are consistently weak. This report breaks down the ten most common issues, what causes them, and how to close the gaps before production.

Why This Data Matters Now

The volume of AI-generated code shipping to production in 2026 is unprecedented. GitHub's 2025 Octoverse report found that over 46% of all code on GitHub is now AI-assisted. Tools like Bolt and Lovable scaffold full-stack applications in minutes. Cursor and Windsurf embed AI directly into professional IDE workflows. Claude Code generates entire feature branches from natural language descriptions. v0 produces deployment-ready React components on demand.

But speed creates a blind spot. When you can build a full SaaS product in a weekend, there is no natural pause point for security review. The code works, the demo looks great, and the deploy button is one click away. What gets missed in that flow is not random - it follows patterns. Specific categories of security, quality, and performance issues surface in AI-generated codebases with striking regularity, regardless of which tool generated them.

Apiiro's 2025 research confirmed that AI-generated code contains security vulnerabilities at 2.74x the rate of human-written code. The question is not whether AI-generated code has problems - it does. The question is which problems appear most often, and how to systematically eliminate them. That is what this data answers.

Top 10 Issues Found Across AI-Generated Codebases

The following table lists ten of the most frequently detected issues when scanning applications built with Cursor, Bolt, Lovable, Claude Code, v0, and Windsurf. These findings span security vulnerabilities, structural code quality, dependency health, and production readiness - the four dimensions that determine whether an app is actually safe to ship.

#	Issue	Category	Prevalence	Severity
1	Missing SSL certificate monitoring	Monitoring	91%	High
2	Missing security headers (HSTS, CSP)	Security	89%	High
3	Missing React error boundaries	Quality	78%	Medium
4	Unprotected API routes	Security	73%	High
5	God files (500+ lines)	Structure	67%	Medium
6	Hardcoded secrets in frontend code	Security	45%	Critical
7	Hallucinated npm imports	Dependencies	34%	High
8	Excessive dependency count (47+ packages)	Dependencies	~60%	Medium
9	Low test coverage (ratio below 0.3)	Quality	88%	High
10	Below-average Lighthouse performance	Performance	~65%	Medium

Three things stand out. First, the top issues are not obscure edge cases - they are fundamental gaps in authentication, monitoring, and structural hygiene. Second, the prevalence numbers are high across all tools, not specific to any single generator. Third, the most critical findings (hardcoded secrets, unprotected routes) are exactly the ones attackers look for first.

The Security Layer: What Vibe Check Scans Reveal

Security findings dominate the top of the list, and the patterns are consistent. The 73% unprotected API route figure means that nearly three out of four AI-generated backends have at least one endpoint that accepts requests from anyone on the internet without verifying identity. This is not a hypothetical risk - it is an open door.

The root cause is predictable. When a developer prompts "create an endpoint that returns user data," the AI generates exactly that: a route that returns user data. It does not add authentication middleware unless explicitly asked, because the prompt did not mention it. The same logic applies to the 45% secret exposure rate. AI tools wiring up Supabase or Stripe will often place secret keys (a Supabase service-role key or a Stripe secret key, as opposed to the publishable keys meant for the browser) in environment variables prefixed with NEXT_PUBLIC_ or VITE_, which the bundler inlines into client-side JavaScript, because that is the fastest way to make the code work.

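// ❌ BAD: AI-generated API route - no auth, no validation, SQL built by string interpolation
// This pattern appears in 73% of scanned vibe-coded apps
import { db } from '@/lib/db'; // assumed path to a node-postgres-style client

export async function GET(req: Request) {
  const { searchParams } = new URL(req.url);
  const userId = searchParams.get('userId');

  // Direct database query: no ownership check, and userId is interpolated
  // straight into the SQL string (injection risk)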
  const userData = await db.query(
    `SELECT * FROM users WHERE id = '${userId}'`
  );

  return Response.json(userData.rows[0]);
}

// ✅ GOOD: Authenticated, validated, parameterized
import { auth } from '@/lib/auth';
import { db } from '@/lib/db'; // assumed path to a node-postgres-style client
import { z } from 'zod';

const paramsSchema = z.object({
  userId: z.string().uuid(),
});

export async function GET(req: Request) {
  const session = await auth();
  if (!session) {
    return Response.json({ error: 'Unauthorized' }, { status: 401 });
  }

  const { searchParams } = new URL(req.url);
  const parsed = paramsSchema.safeParse({
    userId: searchParams.get('userId'),
  });
  if (!parsed.success) {
    return Response.json({ error: 'Invalid input' }, { status: 400 });
  }

  // Only allow users to access their own data
  if (parsed.data.userId !== session.user.id) {
    return Response.json({ error: 'Forbidden' }, { status: 403 });
  }

  const userData = await db.query(
    'SELECT id, name, email FROM users WHERE id = $1',
    [parsed.data.userId]
  );

  return Response.json(userData.rows[0] ?? null);
}
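
The secret-exposure half of this pattern can also be caught mechanically. The sketch below assumes environment variables are available as a plain name-to-value record and flags variables whose NEXT_PUBLIC_ or VITE_ prefix sends them to the browser while the name or value looks like a secret. The function name findExposedSecrets and both heuristics are illustrative assumptions, not an exhaustive detector.

```typescript
// Prefixes that Next.js and Vite inline into client-side JavaScript.
const PUBLIC_PREFIXES = ["NEXT_PUBLIC_", "VITE_"];

// Heuristics (assumptions for illustration): names and value shapes that suggest a secret.
const SECRET_NAME_PATTERN = /(SECRET|SERVICE_ROLE|PRIVATE_KEY|PASSWORD)/i;
const SECRET_VALUE_PATTERN = /^(sk|rk)_(live|test)_/; // Stripe secret and restricted key prefixes

interface ExposedSecret {
  name: string;
  reason: "name" | "value";
}

// Returns every publicly prefixed variable that looks like it holds a secret.
function findExposedSecrets(env: Record<string, string>): ExposedSecret[] {
  const findings: ExposedSecret[] = [];
  for (const [name, value] of Object.entries(env)) {
    const isPublic = PUBLIC_PREFIXES.some((prefix) => name.startsWith(prefix));
    if (!isPublic) continue; // unprefixed variables are not inlined by the bundler
    if (SECRET_VALUE_PATTERN.test(value)) {
      findings.push({ name, reason: "value" });
    } else if (SECRET_NAME_PATTERN.test(name)) {
      findings.push({ name, reason: "name" });
    }
  }
  return findings;
}
```

A check like this could run against the parsed contents of a .env file before each deploy. Publishable keys, such as a Stripe key starting with pk_, pass through untouched because they are designed to be public.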

The 89% missing security headers rate is equally concerning. Headers like Strict-Transport-Security, Content-Security-Policy, and X-Frame-Options are deployment-level configurations that AI code generators almost never set. Vercel, Netlify, and similar platforms add some headers by default, but those defaults are incomplete. The OWASP Top 10 lists security misconfiguration - which includes missing headers - as its fifth category (A05) in the 2021 edition and its second (A02) in the 2025 edition.
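
A header check is simple to automate. The following is a minimal sketch, assuming the response headers have already been collected into a plain object. The function name missingSecurityHeaders is hypothetical, and the required list covers only the three headers named above.

```typescript
// The three headers named in the text, lowercase for case-insensitive matching.
const REQUIRED_SECURITY_HEADERS = [
  "strict-transport-security",
  "content-security-policy",
  "x-frame-options",
];

// Returns the required headers that are absent from a response's header map.
function missingSecurityHeaders(headers: Record<string, string>): string[] {
  const present = new Set(Object.keys(headers).map((name) => name.toLowerCase()));
  return REQUIRED_SECURITY_HEADERS.filter((name) => !present.has(name));
}
```

In a deployed check, the header map could come from a fetch call against the production URL, for instance via Object.fromEntries(response.headers). A non-empty result is the signal to fix the platform configuration rather than the application code.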

The Structural Layer: What Vibe X-Ray Patterns Show

Beyond individual findings, the structural analysis of AI-generated codebases reveals deeper architectural problems. When you map the import graph, module boundaries, and symbol relationships across an entire project, a pattern emerges: AI-generated code is structurally fragile.

The 67% god file rate means that two-thirds of AI-built apps have at least one file exceeding 500 lines that handles routing, data fetching, business logic, and rendering in a single module. These files are the blast radius amplifiers - a bug in one function can cascade through hundreds of lines of tightly coupled logic. Structural analysis shows that god files typically have 3-5x more downstream dependents than well-factored modules, meaning a single change can break multiple features.
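
Detecting god files by size alone takes only a few lines. This sketch assumes the project's files have been read into a path-to-contents map. The findGodFiles name is hypothetical, and line count is only a proxy for the mixed responsibilities described above.

```typescript
const GOD_FILE_LINE_THRESHOLD = 500; // files exceeding this many lines are flagged

interface FileReport {
  path: string;
  lines: number;
}

// Counts lines, ignoring a single trailing newline so it does not add a phantom line.
function countLines(content: string): number {
  return content.replace(/\r?\n$/, "").split(/\r?\n/).length;
}

// Returns files over the threshold, largest first.
function findGodFiles(
  files: Record<string, string>,
  threshold: number = GOD_FILE_LINE_THRESHOLD,
): FileReport[] {
  return Object.entries(files)
    .map(([path, content]) => ({ path, lines: countLines(content) }))
    .filter((report) => report.lines > threshold)
    .sort((a, b) => b.lines - a.lines);
}
```

Sorting largest first puts the files with the biggest likely blast radius at the top of the refactoring list.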

The dependency picture tells a similar story. AI-generated apps average 47 npm dependencies compared to roughly 23 for hand-coded equivalents of similar complexity. More dependencies mean more transitive attack surface. Snyk's 2025 State of Open Source Security report found that the average JavaScript project has 4 known vulnerabilities in its transitive dependency tree. If vulnerability counts scale with dependency counts, doubling the dependencies roughly doubles the CVE exposure before a single line of application code is considered.

Perhaps the most uniquely AI-specific finding is the 34% hallucinated import rate. One in three AI-generated codebases imports at least one npm package that does not exist. These phantom dependencies silently break builds or, worse, create supply chain attack vectors if someone registers the hallucinated package name on npm with malicious code. This is a problem that simply did not exist before AI code generation.
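
A first-pass filter for phantom dependencies is to compare what the code imports against what package.json declares. The sketch below is a regex-based approximation under stated assumptions: it handles static import, export ... from, and require forms only, treats "@/" as a path alias, and carries a partial list of Node built-ins. All names are hypothetical. An undeclared import is only a candidate; confirming that a package is hallucinated would need a lookup against the npm registry, which is left out here.

```typescript
// Partial list of Node built-ins that may be imported without the "node:" prefix.
const NODE_BUILTINS = new Set(["fs", "path", "http", "https", "crypto", "os", "url", "util", "stream", "events"]);

// Reduces a module specifier to its package name, or null if it is not an npm package.
function packageNameOf(specifier: string): string | null {
  if (specifier.startsWith(".") || specifier.startsWith("/")) return null; // file path
  if (specifier.startsWith("node:") || specifier.startsWith("@/")) return null; // builtin or assumed alias
  const parts = specifier.split("/");
  const name = specifier.startsWith("@") ? parts.slice(0, 2).join("/") : parts[0];
  return NODE_BUILTINS.has(name) ? null : name;
}

// Returns packages imported in `source` that `declared` (package.json dependency names) does not list.
function findUndeclaredImports(source: string, declared: string[]): string[] {
  // Matches `... from "x"`, `import "x"`, and `require("x")`; created per call so no state leaks.
  const importPattern = /(?:\bfrom\s+|\bimport\s+|\brequire\(\s*)["']([^"']+)["']/g;
  const declaredSet = new Set(declared);
  const undeclared = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = importPattern.exec(source)) !== null) {
    const name = packageNameOf(match[1]);
    if (name !== null && !declaredSet.has(name)) undeclared.add(name);
  }
  return Array.from(undeclared);
}
```

The declared list would normally be the combined keys of dependencies and devDependencies. Anything the function returns deserves a manual look before anyone runs npm install on the name.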

The Production Layer: What Vibe Monitoring Patterns Expose

Scanning does not stop at the source code. The production health of deployed AI-generated apps reveals a third category of problems that only surface when the application meets real traffic and real infrastructure.

The 91% missing SSL monitoring figure is the starkest number in the entire dataset. Nearly every AI-generated app that gets deployed has zero alerting for certificate expiration. When that certificate expires - and it will - the site goes down with a browser-level security warning. No amount of clean code matters if users cannot reach the application.
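
The alerting logic itself is small. This sketch assumes the certificate's expiry date has already been obtained (in Node, one option is the valid_to field returned by getPeerCertificate() on a TLS socket) and only classifies it. The function names and the 14-day warning window are assumptions for illustration.

```typescript
type CertStatus = "ok" | "warning" | "expired";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days from `now` until the certificate's notAfter date (negative once expired).
function daysUntilExpiry(notAfter: Date, now: Date): number {
  return Math.floor((notAfter.getTime() - now.getTime()) / MS_PER_DAY);
}

// Classifies a certificate; the 14-day warning window is an assumed default.
function certStatus(notAfter: Date, now: Date, warnDays: number = 14): CertStatus {
  const days = daysUntilExpiry(notAfter, now);
  if (days < 0) return "expired";
  return days <= warnDays ? "warning" : "ok";
}
```

Running this on a daily schedule and alerting on anything other than "ok" leaves time to fix a failed renewal before users see a browser warning.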

Lighthouse performance scores tell another story. The average AI-generated site scores 62 on performance, compared to 78 for hand-coded equivalents. The gap comes from three sources: unoptimized images (AI tools do not configure next/image properly or compress assets), excessive JavaScript bundles (those 47 dependencies all ship to the browser), and missing code splitting. These are not code bugs - they are deployment and build configuration gaps that accumulate across the bloated dependency trees AI tools create.

The missing React error boundary rate of 78% means that when a runtime JavaScript error occurs - and it will - the entire application crashes to a white screen. No fallback UI, no error reporting, no recovery path. React and Next.js provide error boundary patterns (Next.js has an error.tsx file convention for this), but AI generators rarely include them unless the prompt specifically requests error handling.

What This Means for Founders and CTOs

The data does not argue against vibe coding. Building with AI is too productive to abandon, and the tools will only improve. But it does argue that the gap between "it works" and "it is production-ready" is wider for AI-generated code than most builders realize. The issues found are not theoretical - they are the exact attack vectors that automated scanners, penetration testers, and bad actors look for.

Automated scanning platforms like VibeDoctor (vibedoctor.io) run 149+ checks across security, code quality, dependencies, and AI-specific patterns, providing a Vitals Score that benchmarks your app against production-ready standards. Free to sign up.

Three practices close the majority of these gaps:

Scan every commit, not just pre-launch. AI-generated code drifts. Each prompt cycle introduces new routes, new dependencies, and new configuration. Continuous scanning catches regressions the moment they appear - not three months later when an attacker finds them.
Track structural health over time. A single scan shows findings. Tracking your codebase structure across versions shows trends: is dependency count growing? Are god files multiplying? Is the import graph getting more tangled? Trend data turns reactive fixing into proactive architecture decisions.
Monitor production, not just code. SSL expiry, uptime, Lighthouse scores, security headers - these are runtime properties that change independently of your source code. A certificate renewal failure or a CDN misconfiguration will take your site down regardless of how clean your codebase is.
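
The first two practices reduce to comparing each scan with the previous one. Here is a minimal sketch of that comparison. The ScanSnapshot shape and its field names are hypothetical, chosen to mirror the metrics discussed in this report rather than any particular scanner's output.

```typescript
// Hypothetical shape of one scan result; field names are assumptions for illustration.
interface ScanSnapshot {
  dependencyCount: number;
  godFileCount: number;
  unprotectedRoutes: number;
}

// Lists every metric that got worse (increased) between two scans.
function findRegressions(previous: ScanSnapshot, current: ScanSnapshot): string[] {
  const metrics: (keyof ScanSnapshot)[] = ["dependencyCount", "godFileCount", "unprotectedRoutes"];
  return metrics
    .filter((metric) => current[metric] > previous[metric])
    .map((metric) => `${metric}: ${previous[metric]} -> ${current[metric]}`);
}
```

A CI step could fail the build whenever this list is non-empty, which makes drift visible on the commit that introduced it.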

The Veracode 2024 State of Software Security report found that 63% of applications contain input validation flaws, a number that AI coding has not improved. The tools generate functional code at unprecedented speed, but functional is not the same as safe. The data from scanning thousands of AI-built apps makes that distinction quantifiable.

Which AI coding tool produces the most secure code?

No tool consistently produces secure code out of the box. In pattern analysis, Claude Code and Cursor with Claude tend to include input validation and error handling more often than Bolt or Lovable, but all tools regularly miss authentication middleware, security headers, and rate limiting. The safest approach is any tool combined with automated scanning before deployment.

Are these issues specific to vibe-coded apps or do hand-coded apps have them too?

Many of these issues also appear in hand-coded applications - missing security headers and low test coverage are industry-wide problems. The difference is frequency and concentration. AI-generated apps have these issues at significantly higher rates because the generation process optimizes for functionality, not defense. A human developer is more likely to add auth middleware from habit; an AI only adds it when prompted.

What is a hallucinated import and why is it dangerous?

A hallucinated import is an import statement that an AI model generates for an npm package that does not exist. The model invents a plausible-sounding package name based on training data patterns. This breaks builds at best. At worst, an attacker registers the hallucinated package name on npm with malicious code, and any developer who runs npm install unknowingly executes it. This is often called slopsquatting, a close relative of typosquatting and dependency confusion attacks, and hallucinated imports create the perfect conditions for it.

How many checks should a production-ready scan include?

A thorough scan covers at minimum: secret detection, CVE scanning for dependencies, static analysis for code quality, security header verification, SSL certificate validation, Lighthouse performance metrics, and AI-specific pattern checks like hallucinated imports and god files. Scanning only one dimension - for example, only dependencies - misses the majority of issues found in AI-generated codebases.

Can I fix all these issues by writing better prompts?

Better prompts help but do not solve the problem. You can prompt for authentication and input validation on specific routes, but you cannot prompt for the absence of all 149+ patterns simultaneously. Security is a system-level property, not a per-function property. Prompting handles individual gaps; scanning handles the system. Both are necessary.