How to Vibe Code Responsibly in Production — Erik Schluntz (Anthropic, Code with Claude)

Code with Claude (Anthropic) · April 20, 2026

Erik Schluntz · 00:38 "I broke my hand in a bike accident — two months in a cast — and Claude wrote all my code for me during those two months."

A technical talk at Code with Claude (Anthropic, 2025), about 31 minutes. Speaker: Erik Schluntz (Anthropic, Programming Agents, co-author of Building Effective Agents). The subject is how to vibe code responsibly in production, drawn from his lived experience inside Anthropic.

Erik Schluntz is an Anthropic researcher working on coding agents (Programming Agents), including Claude Code. He is a co-author, with Barry Zhang, of Building Effective Agents, Anthropic's guide to agent design, published on the Anthropic official blog in December 2024. The guide systematizes principles like 'workflows vs agents,' 'when to use an agent,' and 'start with simple patterns,' and is one of the most-cited agent design guides in the industry. This 31-minute talk is a technical session at Code with Claude (Anthropic, 2025).

The personal episode that opens the talk is striking: "Last year I broke my hand in a bike commuting accident and spent two months in a cast. During those two months, Claude wrote all my code for me" (00:38 – 00:46). The accidental constraint of an injury became the basis for establishing, through lived experience, "how to have AI responsibly write code in production environments" — knowledge he then brought back into Anthropic's internal AI-driven development culture. This personal arc is the source of the talk's overall credibility.

The core of the talk is the idea of "taking Karpathy's vibe coding definition seriously and bringing it into production." "If you're running tight feedback loops with AI in Cursor or Copilot, you're not strictly vibe coding" (01:21). True vibe coding is the fully automatic state in which "you forget the code even exists." Schluntz then adds his own extension, "you forget the code exists, but you don't forget the product exists" (04:55), which builds the bridge to production operation.

The 31 minutes cover five central themes. (1) The exponential — the length of tasks AI can handle is doubling every seven months: today an hour, in 1–2 years a day, then a week. Keeping in lockstep is impossible. (2) An old problem in a new form — the same problem as a CTO managing technical specialists, a PM managing engineers, a CEO managing accountants. Managing an implementation you do not understand is a challenge as old as civilization. (3) The leaf-node strategy — concentrate vibe coding on parts of the code nothing else depends on. The core architecture is deeply understood by humans. (4) Be Claude's PM — a JFK-style "ask not what Claude can do for you, but what you can do for Claude," with 15–20 minutes of context-building investment. (5) The demonstration — a 22,000-line PR merged in a day — a large change to Anthropic's internal production reinforcement-learning codebase, accomplished with leaf-node focus + verifiable-checkpoint design.

Key Observations

"Two months in a cast + Claude wrote all my code" — the personal experience (00:38 – 01:02)

The source of credibility for Schluntz's entire argument. "Last year I broke my hand in a bike commuting accident and spent two months in a cast. During those two months, Claude wrote all my code for me" (00:38 – 00:46).

"Figuring out how to do this effectively was very important to me" (00:46 – 00:51). The external constraint — physically unable to write — turned into a natural laboratory for "AI-driven development best practices." The insights gained were then brought back into other Anthropic teams and into model training — a good example of the path by which personal experience becomes organizational knowledge.

This personal episode runs through the talk. "Be Claude's PM," "the leaf-node strategy," "the context-building investment": all are pragmatism that came out of two months of on-the-ground validation. Read as constraint-driven invention ("what a researcher who could not write code came up with out of necessity") rather than as an abstract framework, the talk becomes more persuasive.

The true definition of vibe coding — "forget the code even exists" (01:00 – 04:00)

Schluntz's framing: "Many people conflate vibe coding with 'doing a lot of AI code generation.' But that's not quite right" (01:00). Running tight feedback loops with AI in Cursor or Copilot is, strictly, not vibe coding.

He cites Karpathy's definition: "'Fully give in to the vibes, embrace exponentials, and forget that the code even exists.' The important part here is 'forget that the code even exists'" (01:34 – 01:48). This is fully consistent with Karpathy's AI Ascent 2026 talk.

The negative and positive sides of vibe coding, sorted out. The negative: "Someone coding for the first time who doesn't understand what they're doing — API keys exceeding their limit, subscription bypass, weird data in the DB" (02:18 – 02:33). The positive: "Video games, fun side projects, domains where bugs are OK" (02:42 – 02:49). Schluntz's framing question: "If the success stories are limited to toy, low-stakes domains, why should we care about vibe coding?" (02:54 – 03:09). The answer comes in the next section.

The exponential — doubling every 7 months, next year a day, the year after a week (03:09 – 04:30)

The most concrete number Schluntz presents is the exponential of AI-feasible task length: "The length of tasks AI can do is doubling every seven months. We're at about an hour today" (03:14 – 03:20).

The near future where this is fine: "Have Claude Code write a feature that takes an hour, review all of it, and stay closely engaged" (03:24 – 03:38).

But the picture two steps ahead is frightening: "What about next year? The year after? When AI is powerful enough to generate a day's or a week's worth of work at once, you cannot keep up if you insist on moving in lockstep" (03:42 – 03:59). The conclusion: "If you want the benefit of this exponential, you have to find a responsible way to give in and use it" (04:01 – 04:09).
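
To make the compounding concrete, here is a minimal sketch of the arithmetic. It assumes a clean exponential built from the talk's two figures (about one hour today, doubling every seven months). The function names and the 8-hour "working day" and 40-hour "working week" thresholds are illustrative assumptions, not figures from the talk.

```python
import math

BASE_HOURS = 1.0        # "about an hour today" (from the talk)
DOUBLING_MONTHS = 7.0   # "doubling every seven months" (from the talk)


def task_length_hours(months_from_now: float) -> float:
    """Extrapolated task length in hours, assuming a pure exponential."""
    return BASE_HOURS * 2 ** (months_from_now / DOUBLING_MONTHS)


def months_until(target_hours: float) -> float:
    """Months until the extrapolated task length reaches target_hours."""
    return DOUBLING_MONTHS * math.log2(target_hours / BASE_HOURS)


if __name__ == "__main__":
    for months in (0, 12, 24):
        print(f"{months:>2} months: ~{task_length_hours(months):.1f} h")
    # Assumed thresholds: 8 h as a working day, 40 h as a working week.
    print(f"8 h after ~{months_until(8):.0f} months")
    print(f"40 h after ~{months_until(40):.0f} months")
```

Under these assumptions the extrapolation gives roughly 3.3 hours after one year and roughly 10.8 hours after two, and an 8-hour task arrives after about 21 months. A real trend would carry wide error bars, so the numbers are best read as orders of magnitude.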

These numbers align with the "December inflection point" and "computers a million times faster" arguments in Karpathy's AI Ascent 2026 talk. Independent sources in the industry share the same vision of the future, which gives the claim more than one footing.

The compiler analogy — "trust without reading the assembly" (04:30 – 06:00)

Schluntz's analogy. "In the early days of the compiler, many developers really didn't trust it. They'd use the compiler but read the output assembly to confirm it was the same as what they'd write" (04:35 – 04:46). "But that doesn't scale. At some point you have to deal with a large system and you have no choice but to trust the system" (04:46 – 04:54).

Schluntz's own production-side definition: "You forget the code exists, but you don't forget the product exists" (04:55 – 05:01).

Completing the compiler analogy: "Back to the compiler analogy, we all know there's assembly down there, but most of us don't have to think about the actual assembly anymore. And yet we still build good software without understanding assembly. I think software is going to reach the same level" (05:01 – 05:25). This applies the industry's standard line about rising abstraction layers to the AI era.

An old problem in a new form — the CTO / PM / CEO management problem (06:00 – 08:30)

Schluntz's strong claim: "This is not a new problem" (06:00). "How does a CTO manage experts in fields where they aren't the expert? How does a PM review a feature built in code they can't read? How does a CEO check the accountant's work without being a finance expert themselves?" (06:08 – 06:24).

These problems have existed for centuries, millennia, and there are solutions: "A CTO can write acceptance tests without understanding the implementation"; "a PM can actually use the product and check whether it works as expected"; "a CEO can build trust through spot checks" (06:30 – 07:05).

Schluntz's philosophical conclusion: "Managing an implementation you don't understand is actually a problem as old as civilization itself. Every manager in the world is already dealing with this. It's just that we software engineers aren't used to it. We're used to the pure individual contributor who understands the stack all the way to the bottom" (07:08 – 07:38). "To become the most productive, you need to let go of that" (07:38 – 07:46).

How to let go safely and responsibly: "Find verifiability at the abstraction layer: a layer where you can verify behavior without knowing the implementation below" (07:48 – 08:00). This is the core design principle of the entire talk.

Tech debt — the one exception to verifiability (08:30 – 11:00)

Schluntz's candid caveat: "There is one caveat, though: technical debt (tech debt)" (08:30).

The problem: "Right now, there's no good way to measure or verify technical debt without reading the code yourself. In the other systems — the accountant example, the PM example — you can verify the parts you care about without knowing the implementation. But technical debt is one of the rare things with no good verification method other than becoming an expert in the implementation" (08:30 – 09:04).

Schluntz's answer — the leaf-node strategy: "Focus on the 'leaf nodes' of the codebase. Parts of the code or system that nothing else depends on — terminal features, bells and whistles" (09:16 – 09:31).

Why leaf nodes are safe: "Tech debt in a leaf node is pretty OK — nothing else depends on it. It's less likely to change, and it's less likely that something further will be built on top" (09:36 – 09:51). The core (the trunk and the lower branches), on the other hand: "The core architecture, where as an engineer you need to deeply understand things. This is where things change, where other things get built on top" (09:55 – 10:05).
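
One way to make "nothing else depends on it" mechanical is to look at the import graph. The sketch below is an illustration under stated assumptions, not the method used at Anthropic: `find_leaf_nodes` and the example module names are hypothetical, and the graph is assumed to be available as a mapping from each module to the modules it imports.

```python
from typing import Dict, Set


def find_leaf_nodes(imports: Dict[str, Set[str]]) -> Set[str]:
    """Return modules that no other module in the graph imports.

    `imports` maps each module to the set of modules it depends on.
    """
    all_modules = set(imports)
    for deps in imports.values():
        all_modules |= deps
    depended_on: Set[str] = set()
    for module, deps in imports.items():
        depended_on |= deps - {module}   # ignore self-imports
    return all_modules - depended_on


# Hypothetical codebase, invented for illustration.
EXAMPLE_IMPORTS = {
    "core.db": set(),
    "core.auth": {"core.db"},
    "features.export_csv": {"core.db", "core.auth"},
    "features.dark_mode": set(),
}

if __name__ == "__main__":
    print(sorted(find_leaf_nodes(EXAMPLE_IMPORTS)))
```

Here the two `features.*` modules come out as leaves, while `core.db` and `core.auth` do not. The graph test is only a first filter: an entry point also has no dependents, and the talk's other criteria (unlikely to change, unlikely to be built upon) still need human judgment.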

Schluntz's optimistic addition: "Models keep getting better. We may be heading to a world where we can trust them to write extensible, tech-debt-free code. I trust Claude 4 models much more than 3.7" (10:10 – 10:25). A view in which the boundary of the leaf-node region widens over time.

"Be Claude's PM" — the JFK-style maxim (11:00 – 13:30)

The most quotable line in the talk: "Be Claude's PM. Ask not what Claude can do for you, but what you can do for Claude" (11:09 – 11:15). A riff on JFK's 1961 inaugural address.

Concrete advice: "What guidance and context would a new teammate need to succeed at this task? We've gotten used to rapid back-and-forth chat with AI — 'build this feature,' 'fix this bug'" (11:24 – 11:51).

But in reality: "If a human were told on day one 'implement this feature,' you wouldn't expect them to succeed. They need a tour of the codebase, the requirements, the spec, the constraints" (11:51 – 12:15). So in vibe coding, the developer is responsible for giving Claude this information and setting it up to succeed.

Schluntz's concrete workflow: "When I build a feature with Claude, I spend 15–20 minutes consolidating the guidance into a single prompt before I let Claude cook. Those 15–20 minutes often involve another conversation, going back and forth with Claude — exploring the codebase, finding files, building a plan, identifying which files need to change, the patterns to follow" (12:24 – 13:09). "Once the artifact and all the information are gathered, I hand it to Claude in a new context with 'execute this plan'" (13:09 – 13:20).
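
The "artifact" handed to a fresh context can be as simple as a structured brief rendered into one prompt. The following is a minimal sketch of that idea. The `TaskBrief` class, its field names, and the rendered layout are all hypothetical, chosen to mirror the items the talk lists (requirements, constraints, files to change, patterns to follow).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskBrief:
    """Hypothetical container for the context a new teammate would need."""
    goal: str
    requirements: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)
    files_to_change: List[str] = field(default_factory=list)
    patterns_to_follow: List[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the brief as a single prompt for a fresh context."""
        sections = [
            ("Requirements", self.requirements),
            ("Constraints", self.constraints),
            ("Files to change", self.files_to_change),
            ("Patterns to follow", self.patterns_to_follow),
        ]
        lines = [f"Goal: {self.goal}"]
        for title, items in sections:
            if items:  # skip sections that were left empty
                lines.append("")
                lines.append(f"{title}:")
                lines.extend(f"- {item}" for item in items)
        lines.append("")
        lines.append("Execute this plan.")
        return "\n".join(lines)
```

A brief built during the exploratory conversation, for example `TaskBrief(goal="Add CSV export", requirements=["Stream rows"])`, can then be pasted into a new session via `to_prompt()`. Keeping the brief as data also makes it easy to attach to the pull request for review.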

"It's dangerous for non-technical people to build a business in production" (13:30 – 14:30)

Schluntz's frank limitation: "Despite the title, production vibe coding isn't for everyone. A completely non-technical person trying to build a full business from scratch — that's dangerous" (13:33 – 13:45).

The reason: "Because you can't ask the right questions. You can't be an effective PM for Claude" (13:55 – 14:05). This is fully consistent with Karpathy's "floor vs ceiling" framing — the "raising the floor" effect of vibe coding is wonderful, but production is a different matter.

Schluntz's implication: welcome vibe coding as "democratization," but treat it on a separate axis from "production responsibility" — a two-layer framing. The line: "people who can't code making a game or a side project" is wonderful; "people who can't code making a payment system" is dangerous.

A 22,000-line PR merged in a day — proof from inside Anthropic (14:30 – 18:30)

The strongest evidence of the talk: "We recently merged a 22,000-line change to our production reinforcement-learning codebase, most of it written by Claude" (14:39 – 14:46).

Four ways they executed it responsibly:

  1. Take the Claude PM role seriously: "We didn't merge in one prompt. It took several human-days of work — requirements organization, guiding Claude, system design" (15:09 – 15:25).
  2. Focus on leaf nodes: "The changes were concentrated mostly on leaf nodes of the codebase. We knew tech debt there would be OK" (16:11 – 16:25).
  3. Humans review the core: "The parts we considered important, the parts that need to be extensible — humans reviewed those carefully" (16:25 – 16:35).
  4. Design verifiable checkpoints: "We carefully designed a stress test for stability, designed the whole system so that the inputs and outputs were easy for humans to verify — that let us confirm correctness without reading all the code" (16:39 – 17:35).
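
The fourth point, checking inputs and outputs instead of reading every line, can be sketched as a seeded stress test over invariants. This is an illustration only: `normalize_weights` is a hypothetical stand-in for generated code, and the three invariants are assumptions chosen for the example, not the checks Anthropic used.

```python
import math
import random
from typing import Callable, List


def normalize_weights(weights: List[float]) -> List[float]:
    """Stand-in for generated code we do not intend to read line by line."""
    total = sum(weights)
    return [w / total for w in weights]


def stress_test(fn: Callable[[List[float]], List[float]],
                runs: int = 1000, seed: int = 0) -> int:
    """Feed random inputs to fn and check output invariants.

    Returns the number of runs checked; raises AssertionError on failure.
    """
    rng = random.Random(seed)  # fixed seed so failures are reproducible
    for _ in range(runs):
        size = rng.randint(1, 50)
        weights = [rng.uniform(0.001, 1000.0) for _ in range(size)]
        result = fn(weights)
        # Invariants a human can verify without reading fn's body.
        assert len(result) == len(weights), "length changed"
        assert all(r >= 0 for r in result), "negative weight"
        assert math.isclose(sum(result), 1.0, rel_tol=1e-9), "does not sum to 1"
    return runs
```

Calling `stress_test(normalize_weights)` exercises the function on a thousand random inputs. The reviewer only needs to agree that the invariants capture what "correct" means at this layer, which is the kind of checkpoint the talk describes.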

The result: "We had the same level of confidence as in any other change to the codebase. But we delivered it in a tiny fraction of the time and effort it would have taken to hand-write and review every line" (17:48 – 18:05).

The most exciting side effect: "We didn't just save a week of human time. Once we knew this was possible, we started thinking about engineering differently. When something that takes two weeks now takes a day, you realize you can do bigger features, bigger changes" (18:21 – 18:53). "The marginal cost of software drops, and we can consume and build more software" (18:57 – 19:05). An AI-era version of the Jevons paradox.

Security — proposing a "provably correct" framework (24:30 – 27:30)

In response to an audience question (24:30 – 25:00): "A few months ago someone showed that the top 10 vibe-coded apps were so vulnerable that someone who isn't even a professional hacker could exfiltrate information."

Schluntz's answer: "It all comes back to 'be Claude's PM, understand the context well enough, know what's dangerous, what's safe, where to be careful'" (25:14 – 25:35). "The cases that get reported are people doing it who shouldn't be coding in the first place" (25:40 – 25:50).

A product-design-level solution: "I expect provably correct hosting systems to emerge. The authentication and payment parts are prepared for you; you just fill in the UI layer" (26:23 – 26:43). He cites Claude Artifacts as the simplest example (frontend-only, no authentication, no payment, and therefore structurally safe). "Someone should build the backend version of this" (27:00 – 27:25), planting a product-idea seed for the industry.

Test-driven development (TDD) and the practical wisdom of leaf nodes (28:00 – 31:00)

Response to the final audience question — how Claude writes tests. "Claude tends to fall into the rabbit hole of writing tests that are too closely tied to the implementation" (29:55).

Schluntz's countermeasure: "I instruct specifically: 'write only three end-to-end tests — a happy path, an error case, and this other error case.' Very specifically, I instruct the tests to be general and end-to-end" (30:08 – 30:42).

Schluntz's working style: "When I'm vibe coding, the only part of the code I read — or the first part I read — is the tests. If I agree with the tests and the tests pass, I feel the code is broadly fine" (30:48 – 31:04). "When I can prompt Claude to write minimalist end-to-end tests, this works best" (31:04 – 31:15). An industry insight that TDD pairs well with AI-driven development.
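
As an illustration of the "three tests" instruction, the sketch below pairs a small feature with exactly one happy path and two error cases. Everything here is hypothetical: `parse_duration` stands in for whatever Claude was asked to build, and the tests touch only its public input and output, never its internals.

```python
def parse_duration(text: str) -> int:
    """Hypothetical feature under test: '1h30m' -> 5400 seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    total, digits = 0, ""
    for ch in text.strip():
        if ch.isdigit():
            digits += ch
        elif ch in units and digits:
            total += int(digits) * units[ch]
            digits = ""
        else:
            raise ValueError(f"invalid duration: {text!r}")
    if digits or not text.strip():
        # Trailing number without a unit, or empty input.
        raise ValueError(f"invalid duration: {text!r}")
    return total


def test_happy_path():
    assert parse_duration("1h30m") == 5400


def test_error_unknown_unit():
    try:
        parse_duration("5x")
    except ValueError:
        return
    raise AssertionError("expected ValueError for unknown unit")


def test_error_empty_input():
    try:
        parse_duration("")
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")
```

Because none of the tests mention the `units` table or the parsing loop, the implementation can be rewritten freely without touching them. That independence is what makes the tests worth reading first.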

Industry Context

Code with Claude is the developer-focused conference Anthropic runs. It held its SF event in May 2025 (with Hannah Moran and Christian Ryan's Prompting 101, Boris Cherny's Claude Code session, and others), followed by various extension events later in the year. Schluntz's talk was published on Anthropic's official channel as an outward-facing account of the company's AI-driven development culture.

Timing matters here: Schluntz's talk was published before Karpathy's AI Ascent 2026 talk (May 2026), yet the conceptual framing is fully aligned. The relationship is complementary: "Karpathy organizes from the industry's bird's-eye view, Schluntz concretizes the on-the-ground implementation." The same industry shift is confirmed by multiple independent, authoritative sources.

Schluntz's Building Effective Agents (co-written with Barry Zhang, December 2024) is one of the most-cited agent design guides in the industry. It systematizes principles like "workflows vs agents" and "start with simple patterns." Together with Skills not Agents (Barry Zhang × Mahesh Murag, December 2025), it is the core of Anthropic's outward communication of its 2025–2026 AI-driven development culture.

Where it sits among related videos

Schluntz's talk is "on-the-ground implementation inside Anthropic"; Karpathy's talk is "the bird's-eye view of the industry." Read together, the low layer (concrete tactics) and the high layer (paradigm framing) lock in. Schluntz's "22,000-line PR," "leaf-node strategy," and "Claude PM" read as concrete instances of Karpathy's "Software 3.0," "agentic engineering," and "floor vs ceiling."

Implementation Implications

Implications of this talk for technologists building LLM products:

First, make leaf-node identification a standard of technical design. In your own codebase, run the exercise of distinguishing "terminal features that nothing else depends on" from "the core architecture" as a prerequisite for Claude-driven development. Tag each feature as "leaf or trunk," with a policy that only leaves are eligible for vibe coding and the trunk requires human review.
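
A "leaf or trunk" policy can be enforced with a very small gate in review tooling. The sketch below assumes the trunk is identified by path prefix. The prefixes and the function name are hypothetical, and a real team would maintain its own list.

```python
from typing import Iterable, List, Sequence

# Hypothetical path prefixes marking the trunk (core architecture).
TRUNK_PREFIXES = ("core/", "infra/")


def paths_requiring_human_review(
    changed_paths: Iterable[str],
    trunk_prefixes: Sequence[str] = TRUNK_PREFIXES,
) -> List[str]:
    """Return the changed paths that fall under the trunk policy."""
    prefixes = tuple(trunk_prefixes)  # str.startswith accepts a tuple
    return sorted(p for p in changed_paths if p.startswith(prefixes))
```

A change touching `features/export.py` and `core/auth.py` would flag only `core/auth.py` for line-by-line human review, leaving the leaf file eligible for the lighter, test-first review described in the talk.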

Second, evaluate the 15–20 minutes of context-building investment organizationally. Instead of "just asking Claude immediately," make the work of designing "how to hand information to Claude" an explicit engineering step. Operationally, evaluate "the prompts used and the plan" alongside the PR at review time.

Third, make verifiable-checkpoint design a prerequisite for production rollout. Build in "inputs and outputs you can confirm correct without reading the code" from the feature-design stage. Stress tests, acceptance tests, observable input-output pairs — design these abstraction layers intentionally.

Fourth, build the seven-month doubling into the product roadmap. Doubling every seven months compounds to roughly 3.3x after one year and roughly 11x after two, so a product built assuming "today, hour-long tasks are feasible" will be obsolete in 1–2 years. Build into the roadmap the transition to agents that can handle "day-long tasks" and "week-long tasks."

Fifth, update hiring criteria. Add "people who can be Claude's PM" to your technical hiring rubric: people who can ask the right questions, people who don't begrudge the 15–20 minutes of context-building investment, and people who can distinguish leaf nodes from the core. This is consistent with the redesign of the hiring process Karpathy proposes.

Critical Perspective

The strength of this talk is the demonstration, backed by a concrete number — "the 22,000-line PR." There are also caveats.

First, the 22,000-line PR example is about Anthropic's internal production reinforcement-learning codebase — a fully offline system (Schluntz confirmed this in response to an audience question). For systems that include authentication, payment, user data, and external communication, the security risk is structurally different. "Vibe coding 22,000 lines into an outward-facing service" will not necessarily produce the same result.

Second, applying the leaf-node strategy requires experience. Judging "is this a leaf or trunk?" is hard for beginners. The strategy fits the developer cohort Schluntz presupposes, "senior engineers with a deep understanding of the codebase," but junior engineers and whole teams need a separate educational design.

Third, "the 15–20 minutes of context-building investment" can be measured at the individual productivity level, but is hard to set up as an organizational metric. If "15 minutes of context building" is not built into a visible engineering step, it collides with a culture that demands "output you can run immediately."

Fourth, the tension between Karpathy's "raising the floor" and Schluntz's "production is a different matter." Schluntz's statement — "non-technical people should not build a business in production" — contradicts the democratization ideal. It carries the conservative implication that "domains requiring expertise can only be entered by experts." This is a contested topic in the AI industry's democratization debate.

Fifth, Schluntz's "provably correct hosting system" proposal does not yet exist as a product (as of May 2026). It is valuable as a seed for industry ideation, but at present, "vibe coding in production" requires supplementing security, authentication, and payment with existing frameworks.

These caveats aside, as a rare outward publication of "on-the-ground implementation inside Anthropic," this talk's value is decisive. It is also a historical record in the sense that the company publishes its in-house AI-driven development culture as teaching material for the wider industry.

Reader Takeaways

  • Partition your own codebase into "leaf nodes" and "core" and write an operational policy that concentrates vibe coding on leaf nodes
  • Replace "ask Claude immediately" with "15–20 minutes of context-building investment." Evaluate the prompts and the plan as part of PR review
  • Make "verifiable checkpoints" a prerequisite of feature design. Build layers where stress tests, acceptance tests, and observable input/output pairs let you confirm correctness without reading the code
  • Build "AI task length doubles every 7 months" into the product roadmap. A product optimized for "today's AI" will be obsolete in 1–2 years
  • Update hiring criteria to "people who can be Claude's PM" = three abilities: ask the right questions, build context, and identify leaf nodes
  • Treat Schluntz's limitation — "production vibe coding isn't for everyone" — as separate from the democratization ideal. The strategy differs between games / side projects and production systems involving authentication and payment

Video Outline

  • (00:00) Opening — "everyone's favorite topic, vibe coding, and responsibly in production"
  • (00:23) Erik Schluntz self-introduction — Anthropic Programming Agents, co-author of Building Effective Agents
  • (00:38) "I broke my hand in a bike accident, two months in a cast, Claude wrote all my code"
  • (01:00) The true definition of vibe coding — Cursor / Copilot strictly aren't vibe coding
  • (01:34) Citing Karpathy's definition — "forget that the code even exists"
  • (02:18) The downside of vibe coding — API key overruns, weird data in the DB
  • (02:42) The upside of vibe coding — games, side projects, low-stakes domains
  • (03:14) The exponential — AI task length doubles every 7 months
  • (03:42) Next year a day, the year after a week — lockstep impossible
  • (04:30) Compiler analogy — trust without reading the assembly
  • (04:55) "Forget the code exists, but don't forget the product exists"
  • (06:00) An old problem in a new form — the CTO / PM / CEO management problem
  • (07:08) "A problem as old as civilization" — managing implementations you don't understand
  • (07:48) Verifiability at the abstraction layer is the answer
  • (08:30) Caveat: tech debt is the exception to verifiability
  • (09:16) The leaf-node strategy — parts of the code nothing else depends on
  • (09:55) The core (trunk) needs deep human understanding
  • (10:10) As models improve, the leaf-node boundary widens
  • (11:09) "Ask not what Claude can do for you, but what you can do for Claude"
  • (11:24) The mindset of being Claude's PM
  • (12:24) The 15–20-minute context-building investment — back-and-forth in a separate conversation, planning
  • (13:33) "It's dangerous for non-technical people to build a business in production"
  • (14:39) "22,000-line change merged to the production reinforcement-learning codebase"
  • (15:09) Four responsible execution methods
  • (16:11) Focus on leaf nodes
  • (16:39) Verifiable-checkpoint design — stress tests + inputs/outputs
  • (17:48) Result — same confidence as any other change, in a fraction of the time
  • (18:21) Marginal cost of software drops — bigger changes become possible
  • (19:46) Closing 4 principles — Claude's PM, leaf nodes, verifiability, the exponential
  • (20:06) Q&A begins
  • (20:08) Q1: Does the way we learn change in the AI era?
  • (22:18) Q2: How much information to put in the context
  • (24:30) Q3: Balancing vibe coding and cybersecurity
  • (26:23) The provably correct hosting system proposal
  • (27:00) Claude Artifacts as an example
  • (28:00) Q4: TDD tips — narrow to three end-to-end tests
  • (31:00) Q5: Distinguishing Claude Code from Cursor / VS Code; when to compact
  • (31:00) Closing, applause

Key Quotes

  • "I broke my hand in a bike accident — two months in a cast — and Claude wrote all my code for me" (Schluntz, 00:38)
  • "True vibe coding is 'forgetting that the code even exists'" (Schluntz, citing Karpathy, 01:43)
  • "The length of tasks AI can do is doubling every seven months. We're at about an hour today" (Schluntz, 03:14)
  • "When AI can handle day-long or week-long tasks next year or the year after, you can't keep up if you insist on moving in lockstep" (Schluntz, 03:42)
  • "You forget the code exists, but you don't forget the product exists" (Schluntz, 04:55)
  • "Managing an implementation you don't understand is a problem as old as civilization" (Schluntz, 07:08)
  • "Tech debt is one of the rare things with no good verification method other than becoming an expert in the implementation" (Schluntz, 08:48)
  • "Tech debt in a leaf node is OK — nothing else depends on it" (Schluntz, 09:36)
  • "Ask not what Claude can do for you, but what you can do for Claude" (Schluntz, 11:09)
  • "I spend 15–20 minutes consolidating the guidance into a single prompt, and then let Claude cook" (Schluntz, 12:24)
  • "It's dangerous for a non-technical person to try to build a full business from scratch" (Schluntz, 13:33)
  • "22,000-line change merged to our production reinforcement-learning codebase, most of it written by Claude" (Schluntz, 14:39)
  • "We didn't just save a week of human time — we started thinking about engineering differently" (Schluntz, 18:21)
  • "The marginal cost of software drops, and we can consume and build more software" (Schluntz, 18:57)
  • "In 1–2 years, people who insist on reading and writing all the code will be at a serious disadvantage" (Schluntz, 19:46)
  • "A provably correct hosting system — authentication and payment are pre-prepared; you just fill in the UI layer" (Schluntz, 26:23)
  • "The only part of the code I read — or the first part I read — is the tests" (Schluntz, 30:48)

Sources

Vibe Coding - Erik Schluntz, Anthropic Team (Code with Claude)

Glossary

Vibe coding
Term coined by Andrej Karpathy on X in February 2025. Fully automatic AI coding in which you "fully give in to the vibes, embrace exponentials, and forget that the code even exists." Schluntz applies this definition strictly in the talk: "the tight feedback loop of Cursor / Copilot is, strictly, not vibe coding."
Building Effective Agents
Anthropic's industry-standard guide to agent design, published December 2024. Co-written by Erik Schluntz and Barry Zhang. Systematizes principles like "workflows vs agents," "when to use an agent," and "start with simple patterns." The most-cited agent design guide in the industry.
Leaf-node strategy
The core tactic of production vibe coding that Schluntz proposed in this talk. Partition the codebase into "leaf nodes (terminal features nothing else depends on)" and "the core (the trunk and lower branches of the system)." Concentrate vibe coding on leaf nodes and have humans deeply understand the core, because tech debt in a leaf node has limited system-wide impact.
Verifiability at the abstraction layer
The core principle of production vibe coding Schluntz proposed in this talk. Design a layer where behavior can be verified at a higher abstraction level without knowing the underlying implementation. CTO acceptance tests, PM product trials, CEO spot checks — these are the design decisions that fit.
Technical debt (tech debt)
A software engineering term proposed by Ward Cunningham in 1992. The phenomenon by which short-term implementation compromises accumulate as future maintenance costs. Schluntz argues in this talk that this is the one domain in which "there is no way to verify without reading the code," treating it as the limit of vibe-coding strategy.
Be Claude's PM
The mindset of vibe coding that Schluntz proposed in this talk. The developer is Claude's product manager, responsible for providing the context, requirements, and constraints needed for the task to succeed. The phrase adapts the line from JFK's January 1961 inaugural address: "Ask not what your country can do for you; ask what you can do for your country."
The exponential of AI-feasible task length
An observation METR (Model Evaluation and Threat Research) published in 2025. The length of tasks an AI model can complete consistently (measured in human-time equivalent) is doubling roughly every seven months. About an hour as of 2025, which extrapolates at that rate to roughly 3.3 hours a year later and roughly 11 hours two years later. The number Schluntz uses as the spine of this talk.
Marginal cost of software
An application of the economics concept. When the cost of producing one additional unit (marginal cost) drops, consumption rises — the demand-curve effect. Schluntz argues that "AI lowers the marginal cost of software, so we can consume and build more software." An AI-era version of the Jevons paradox.
Provably correct hosting system
A product idea Schluntz proposes near the end of this talk. A "coloring-book-style" framework in which the important parts — authentication, payment, security — are pre-designed and the developer fills in only the UI layer. Inspired by Claude Artifacts (frontend-only, so structurally safe), with room to build a version that includes the backend.
Claude Artifacts
A Claude feature published by Anthropic in 2024. Code that Claude writes is hosted, run, and displayed inside a sandbox in Claude AI. Frontend-only, with no authentication and no payment, so structurally safe. Schluntz cites this as an example of "provably correct."
Code with Claude
The developer-focused conference Anthropic runs. The SF event in May 2025 (with Hannah Moran and Christian Ryan's Prompting 101, Boris Cherny's Claude Code session, this talk, etc.) was followed by various extension events that year. A venue for Anthropic to communicate its AI-driven development culture outward.
Barry Zhang
An Anthropic research engineer and Schluntz's co-author on Building Effective Agents. Led the design of Agent Skills, announced by Anthropic in 2025. Reached the Skills concept from the prototype insight "we built Claude Code and realized it was a general-purpose agent." Detailed in the MEMEX article "Skills not Agents."
JFK's 1961 inaugural address
The famous passage John F. Kennedy delivered in his presidential inaugural address in January 1961: "Ask not what your country can do for you; ask what you can do for your country." Schluntz's "Ask not what Claude can do for you, but what you can do for Claude" rephrases this.
Anthropic production reinforcement-learning codebase
The internal Anthropic system Schluntz cites as the target of the 22,000-line PR. In response to an audience question (around 26:00), Schluntz makes clear that this is "fully offline." Because it does not include authentication, payment, or external communication, the security risk is structurally low. That is the contextual condition that made 22,000-line vibe coding possible.
Jevons paradox
Proposed by the 19th-century British economist William Stanley Jevons in 1865. The phenomenon that "when a resource can be used more efficiently, consumption of that resource actually rises." Example: improvements in the efficiency of coal engines did not reduce coal consumption — they raised it. Schluntz's "the marginal cost of software drops, and we can consume and build more software" is the AI-era version.
METR (Model Evaluation and Threat Research)
An independent research organization founded in 2022. Formerly ARC Evals. Conducts capability evaluations of large AI models, with pre-release access to models from OpenAI, Anthropic, and others, responsible for detecting dangerous capabilities. The observation it published in 2025 — "the length of tasks AI can complete doubles every seven months" — is widely cited in the industry.