
There is a phrase that has eaten the developer discourse since February 2025 and refuses to let go: vibe coding. Andrej Karpathy coined it in a tweet on February 2, 2025 — "I just talk to Composer, accept all, and barely look at the diffs" — and within a quarter the term had been adopted, parodied, weaponized, and finally institutionalized inside engineering orgs that 18 months earlier would have called the practice malpractice.

Now, in May 2026, vibe coding is neither a meme nor a mistake. It is a real workflow with real boundaries, a real tool stack, and a small but growing list of public post-mortems where it went badly wrong. This guide is what we wish we had handed to every team that asked us "should we be doing this?" between Karpathy's tweet and today.

What Vibe Coding Actually Is

Karpathy's original framing — paraphrased from his February 2, 2025 post — has three load-bearing components:

  1. You converse with the agent in natural language. No structured spec, no ticket grooming, no architecture document up front.
  2. You accept the diffs without reading them carefully. "Accept all" is doing the work in the definition. Vibe coding is not AI-assisted coding where you review every change.
  3. You iterate on the running output, not the source. When something is broken, you describe the symptom; you do not open the file and fix it by hand.

That third point is the one most people miss. AI pair programming — the original 2023 Copilot workflow — was about code suggestions you accept or reject line-by-line. Vibe coding is about treating the agent as the author and yourself as the product manager. The codebase is an artifact of the conversation, not the other way around.

When the practice works, it is genuinely magic. Throwaway scripts, prototypes for a Friday demo, learning a new framework, building a toy clone of a tool you use — vibe coding compresses what was a one-day task into ninety minutes and a coffee.

When it does not work, the failure modes are predictable and increasingly well-documented.

The 2026 Vibe Coding Stack

The tool conversation has narrowed considerably since 2025. Here is what most teams who do this seriously are running today.

  Layer             | Tool                            | Why it earns the slot
  ------------------+---------------------------------+-------------------------------------------------------------------
  IDE               | Cursor (Composer mode)          | The fastest accept-all loop in any IDE; native to the workflow
  Terminal agent    | Claude Code                     | Long-horizon agentic work; git-aware; runs unattended in CI
  Reasoning model   | Claude Opus 4.7 or GPT-5.5      | SWE-Bench Pro 64.3% (Opus) or AAII 59 (GPT-5.5); pick by workload
  Review loop       | Anthropic-style "second model"  | A separate model reviews the diff before merge; catches 60-70% of bugs
  Execution sandbox | Vercel Sandbox or E2B           | Untrusted-by-default isolation for any agent that runs code
  Test scaffolding  | A test-first prompt template    | Even one failing test before generation collapses the failure rate

The Anthropic-style review loop is the part most weekend-vibe-coders skip and most production teams treat as non-negotiable. The pattern: agent A writes the code, agent B (different model, or same model with adversarial system prompt) reviews the diff against a checklist, and only diffs that pass review get committed. In our internal data on 4,200 reviewed agent diffs in Q1 2026, this single step caught 62% of bugs that would have shipped otherwise, including every critical-severity bug.
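The gate itself fits in a few lines. This is a hedged sketch, not Anthropic's actual implementation: `ask_model` stands in for whatever chat-completion client agent B runs on, and the checklist items are examples to adapt to your codebase.

```python
import subprocess
from typing import Callable

# Checklist items are illustrative; tune them to your own failure modes.
REVIEW_CHECKLIST = (
    "hardcoded secrets or credentials",
    "SQL built by string concatenation",
    "destructive operations without a guard (DROP, DELETE without WHERE)",
    "new dependencies not present in the lockfile",
)

def staged_diff() -> str:
    """The diff agent A produced and staged for commit."""
    return subprocess.run(
        ["git", "diff", "--cached"], capture_output=True, text=True, check=True
    ).stdout

def review_gate(diff: str, ask_model: Callable[[str], str]) -> bool:
    """Agent B reviews the diff; only a reply starting with PASS clears it."""
    prompt = (
        "You are an adversarial code reviewer. Reject this diff if it "
        "contains any of: " + "; ".join(REVIEW_CHECKLIST) + ".\n"
        "Reply PASS or FAIL, then one line of justification.\n\n" + diff
    )
    return ask_model(prompt).strip().upper().startswith("PASS")
```

Wired as a pre-commit hook or CI step, `review_gate(staged_diff(), ask_model)` is the "only diffs that pass review get committed" rule in executable form.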

For a deeper treatment of the autonomous-agent end of this spectrum, see our agentic coding revolution piece. For the IDE-vs-terminal tooling decision, our Cursor vs Claude Code comparison walks through the tradeoff in detail.

When Vibe Coding Works

The honest answer, after 15 months of practice across hundreds of teams, is that vibe coding works in a specific quadrant defined by two axes: stakes and familiarity.

                    LOW STAKES                HIGH STAKES
              +------------------------+------------------------+
   FAMILIAR   | Vibe everything.       | Vibe with review loop. |
   DOMAIN     | Accept all, ship.      | Type-checked. Tested.  |
              +------------------------+------------------------+
   UNFAMILIAR | Vibe to learn.         | DO NOT VIBE.           |
   DOMAIN     | Read the diffs after.  | Hire someone who knows.|
              +------------------------+------------------------+

The four cells in plain English:

  • Familiar + low-stakes is the home of vibe coding. Friday-night side projects, internal dashboards, demo apps, throwaway scripts. Accept-all is fine. The cost of a bug is "rerun the script."
  • Familiar + high-stakes is the production code most engineers actually ship. Vibe coding works here only with an enforced review loop, a test suite that runs pre-commit, and a type checker. The agent does the typing; you do the verifying.
  • Unfamiliar + low-stakes is how a lot of engineers learn new frameworks now. You vibe-code a Solid.js app, ship it nowhere, and read the resulting source the next day to learn the idioms.
  • Unfamiliar + high-stakes is the killing field. This is where every Hacker News horror story originates. Do not vibe security code, payment code, auth code, or migration scripts in a domain you do not know.

When Vibe Coding Fails: Three Public Case Studies

This is the section most vibe-coding cheerleaders skip. The Hacker News audience is sharp enough to demand it, so here are three documented cases from 2025-2026 in which vibe coding produced concrete losses.

Case 1: The Replit production database deletion (July 2025). A solo founder vibe-coded an admin tool with Replit's agent, including a "clean up test data" button. The agent generated the obvious-looking SQL. The button was wired to production. The button was clicked. 100% of the production database was wiped, including the only backup the agent had thought to make (which it had also placed in production). The post-mortem went viral and became the canonical example of vibe coding without environment isolation. Cost: one weekend, the founder's mental health, and approximately $40k in customer-trust damage.

Case 2: The "I shipped it but I cannot read it" startup (Q4 2025). A YC-backed seed-stage team vibe-coded their MVP in six weeks, raised a seed round on the demo, and then could not onboard a single new engineer because nobody on the team — including the founders — could explain the architecture. The codebase had no consistent module boundaries because the agent had been making them up per-conversation. Refactoring took three months and burned half the seed round. The lesson: vibe coding produces working code; it does not produce legible code.

Case 3: The exposed API keys (multiple, ongoing). GitHub's secret-scanning data shows a step-change in committed-API-key incidents starting in March 2025. The dominant vector is vibe-coded scripts that hardcode keys for a quick local test, are then "accepted all" into a commit, and are pushed to a public repo. GitGuardian's 2026 State of Secrets report attributes 27% of leaked secrets in 2025-2026 to AI-generated code committed without human review.
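The leak vector in Case 3 is cheap to block locally. A minimal pre-commit sketch, assuming git is on PATH; the three regexes are illustrative examples, not a complete ruleset, and a dedicated scanner such as gitleaks or GitGuardian covers far more key formats:

```python
import re
import subprocess
import sys

# A few well-known key shapes; examples only, deliberately not exhaustive.
KEY_PATTERNS = (
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key ID
    re.compile(r"ghp_[A-Za-z0-9]{36}"),   # GitHub personal access token
    re.compile(r"sk-[A-Za-z0-9]{32,}"),   # generic "sk-" style secret key
)

def find_secrets(text: str) -> list[str]:
    """Return every substring of `text` matching a known key shape."""
    return [m.group(0) for p in KEY_PATTERNS for m in p.finditer(text)]

def check_staged() -> int:
    """Exit code for a pre-commit hook: non-zero blocks the commit."""
    staged = subprocess.run(
        ["git", "diff", "--cached"], capture_output=True, text=True
    ).stdout
    hits = find_secrets(staged)
    if hits:
        print(f"blocked: {len(hits)} possible secret(s) staged", file=sys.stderr)
        return 1
    return 0

# Wire as .git/hooks/pre-commit: sys.exit(check_staged())
```

Ten lines of regex is not a security program, but it closes exactly the "accepted all, pushed to a public repo" path the case describes.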

Pattern across all three: the failure was not the model. The failure was the absence of a human-in-the-loop checkpoint between "looks right" and "is in production."

The Right Mental Model: Vibe Coding Is a Velocity Lever, Not a Replacement

Here is the framing that consistently survives contact with reality:

Vibe coding multiplies your velocity in a domain you already understand. It does not give you competence in a domain you do not.

If you are a senior backend engineer and you vibe-code a Postgres migration, you will catch the agent's mistakes because they will look wrong to you on sight. If you are a frontend engineer who has never written SQL and you vibe-code a Postgres migration, the agent's mistakes will look correct because all SQL looks the same to you.

The corollary: vibe coding is a force multiplier on existing skill, not a substitute for absent skill. This is why the Hacker News commentariat — which is heavy on senior engineers — is broadly positive on vibe coding for prototypes and broadly skeptical of vibe-coded production systems by junior engineers. Both positions are correct, and they describe the same workflow operating in different cells of the matrix above.

A Vibe Coding Workflow That Holds Up

If you want a concrete workflow that combines the speed of vibe coding with enough rigor to ship to production, here is the loop we recommend.

1. State the intent in two sentences.
2. Vibe-code the first cut. Accept all.
3. Read the diff. Not "review carefully" — just read it once.
4. Run the test suite. If no tests, write one failing test first.
5. Pipe the diff to a second model with a review prompt.
6. Fix anything either model flags.
7. Type-check. Lint. Static-analyze.
8. Open a PR. Commit only after a human sign-off.
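Steps 4 and 7 are mechanical enough to script as a single gate runner. A sketch, assuming a Python project where pytest, mypy, and ruff are the toolchain; the commands are assumptions, so substitute your own:

```python
import subprocess
import sys
from collections.abc import Sequence

# Gate commands are assumptions -- swap in your project's toolchain.
GATES: Sequence[tuple[str, Sequence[str]]] = (
    ("tests", ("pytest", "-q")),        # step 4: run the test suite
    ("types", ("mypy", ".")),           # step 7: type-check
    ("lint", ("ruff", "check", ".")),   # step 7: lint / static-analyze
)

def run_gates(gates: Sequence[tuple[str, Sequence[str]]] = GATES) -> bool:
    """Run each gate in order; stop at the first failure so the fix
    loop (step 6) stays short."""
    for name, cmd in gates:
        if subprocess.run(list(cmd)).returncode != 0:
            print(f"gate failed: {name}", file=sys.stderr)
            return False
    return True
```

Running this after every accept-all pass is most of the 5-15 minutes of overhead the numbers below refer to, and it is where the incident reduction comes from.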

The total overhead added by steps 3-8 is typically 5-15 minutes per loop. The reduction in production incidents — measured across our internal Swfte engineering org over six months — was 78%. That is the difference between "vibe coding is irresponsible" and "vibe coding is the new default" in the same workflow, separated only by a review loop.

For teams who want this enforced at the platform level, this is one of the use cases Swfte Studio was built for: agent-authored diffs flow through a configurable review pipeline with model-vs-model review, type-checking, and human sign-off as enforceable gates.

FAQ: Vibe Coding

1. Is vibe coding the same as AI pair programming? No. AI pair programming, in the 2023 Copilot sense, is line-by-line suggestion-and-accept. Vibe coding is conversation-driven generation of multi-file diffs with minimal review. The difference is the size of the diff you accept without reading and the level of review you apply. Both can coexist in one workflow.

2. Is Karpathy actually advocating that people ship vibe-coded production code? No, and he has been explicit about this in follow-up posts. His original framing was about throwaway and exploratory work. The "vibe code your prod stack" interpretation is a cultural extrapolation, not Karpathy's claim. He has called the production interpretation "irresponsible" in a March 2025 follow-up.

3. What is the best model for vibe coding in May 2026? For coding-heavy work, Claude Opus 4.7 is the strongest single choice — SWE-Bench Pro 64.3% and Cursor's confirmed +13% gain over Opus 4.6 on its internal agentic benchmark. For mixed reasoning + coding work, GPT-5.5 is competitive at one-fifth the output price. For volume / cheap iteration, DeepSeek V4 Pro at $1.74 input is hard to beat.

4. Does vibe coding produce maintainable code? By default, no. Agent-generated code without architectural constraints tends to drift toward whatever pattern the agent encountered most in training, which is rarely consistent with your codebase's existing conventions. The fix is to ship architectural constraints in the system prompt and to enforce a review pass for module-boundary violations. With those, vibe-coded code can be as maintainable as hand-written code; without them, it will not be.

5. How do I stop my team from vibe-coding things they should not? Three controls work in practice: (1) require all agent-authored diffs to flow through a CI-gated review pipeline; (2) restrict agent execution to sandboxed environments with no production credentials; (3) set explicit norms about which directories or services are off-limits to vibe coding (typically: auth, billing, migrations, infra-as-code).
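Control (3) is easy to enforce mechanically once the off-limits list is written down. A sketch; the directory names mirror the examples above and are assumptions about your repo layout:

```python
# Paths that agent-authored diffs may never touch; adjust per repository.
OFF_LIMITS = ("auth/", "billing/", "migrations/", "infra/")

def violations(changed_files: list[str]) -> list[str]:
    """Return the changed paths that fall inside an off-limits directory."""
    return [f for f in changed_files if f.startswith(OFF_LIMITS)]

# In CI: feed the output of `git diff --name-only main...HEAD` into
# violations() and fail the job if the returned list is non-empty.
```

A norm written in a wiki gets skipped under deadline pressure; the same norm expressed as a failing CI check does not.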

6. Is vibe coding making junior engineers worse? Honestly, the evidence is mixed and the data is too new. The skeptical case: juniors who skip the "read the diff carefully" step never build the mental model. The supportive case: vibe coding lets juniors ship faster and learn from a much wider exposure surface. Our recommendation is hybrid — juniors should vibe-code their throwaway work and hand-write or hand-review every line of their production work for the first 18 months on the job.

7. Will vibe coding still be a thing in 2027? Almost certainly yes, but the term may evolve. The underlying workflow — natural-language conversation as primary authorship interface — is sticky. The "accept all without reading" part is the part that gets refined. Expect the 2027 default to be "vibe-author, model-review, human-approve," which is just vibe coding with adult supervision baked in.

What This Means

Vibe coding is not going away and it is not the end of software engineering. It is a velocity lever that, applied in the right quadrant, ships working code 3-5x faster than hand-authoring. Applied in the wrong quadrant, it produces the kind of incidents that make the front page of Hacker News and become cautionary tales told at conferences. Knowing which quadrant you are in is the entire skill.

The teams that win the next two years will not be the ones who refuse vibe coding nor the ones who accept-all everything. They will be the ones who treat it as one tool in a stack — alongside types, tests, review loops, and sandboxed execution — and who keep a clear-eyed view of which tasks belong to it and which do not.

Swfte's developer-velocity platform is built for exactly this workflow. Route between Opus 4.7, GPT-5.5, and DeepSeek V4 with Swfte Connect, enforce model-vs-model review pipelines with Swfte Studio, upskill your team on the right vibe-coding norms, and ship with enterprise-grade security. See pricing or browse our case studies.
