Summary of "Stop Prompting Claude. Use Karpathy's Method Instead."

Main idea / premise

The video claims that most people prompt Claude (and similar LLMs) “wrong,” and argues for a faster, more reliable approach attributed to Andrej Karpathy. The method is broken into three layers (spec, verifier, and environment), plus a concluding “one thing to focus on” in the age of AI.


Layer 1: Spec (bridge human goals to what the model can execute)

Key limitation highlighted

State-of-the-art models struggle with context-driven decisions because they lack a reliable signal for the real-world “situation.”

Core concept

A spec is a structured format for delivering your understanding to the model—so the model operates within the correct framing.

Criticism of the common approach

Karpathy is presented as disliking high-level “plan mode” as too superficial. Instead, you should design a detailed spec through collaboration with the agent.

How to build the spec (as described)

  1. Uncover your goal

    • Distinguish between a task (“create end-of-month report”) and the underlying decision/conclusion the task supports.
    • Technique: have Claude “interview” you to identify the real goal.
  2. Work agile, not waterfall

    • Waterfall: give the agent everything at once and see the final result later.
    • Agile speccing: break scope into smaller pieces, show checkpoints, review, adjust, repeat.
    • Technique: bias the spec toward smaller, compartmentalized segments.
  3. Be precise and use your brain

    • Precision reduces assumptions; assumptions increase drift.
    • Add instructions like: “verify key decisions explicitly” so the model can’t silently skip important choices.
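
The three steps above can be sketched as a small data structure that is rendered into prompt text. This is only an illustration: the Spec and Segment classes, their field names, and the example goal are hypothetical and do not come from the video.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One small, reviewable piece of the overall scope."""
    name: str
    checkpoint: str  # what the agent shows you before moving on


@dataclass
class Spec:
    """Hypothetical container for the three steps; not a format from the video."""
    task: str
    underlying_goal: str
    segments: list[Segment] = field(default_factory=list)
    key_decisions: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Turn the spec into plain prompt text."""
        lines = [
            f"Task: {self.task}",
            f"Underlying goal: {self.underlying_goal}",
            "Work in segments and stop at each checkpoint for review:",
        ]
        for i, seg in enumerate(self.segments, start=1):
            lines.append(f"  {i}. {seg.name} (checkpoint: {seg.checkpoint})")
        lines.append("Verify these key decisions explicitly before acting:")
        for decision in self.key_decisions:
            lines.append(f"  - {decision}")
        return "\n".join(lines)


# Example values are invented for illustration only.
spec = Spec(
    task="Create end-of-month report",
    underlying_goal="Decide whether to renew the vendor contract",
    segments=[
        Segment("Gather spend data", "show the raw totals"),
        Segment("Draft findings", "show an outline"),
    ],
    key_decisions=["Which months count as the reporting period"],
)
prompt_text = spec.render()
print(prompt_text)
```

Keeping the task and the underlying goal as separate fields makes it harder to write one without the other, and listing segments with checkpoints biases the spec toward small, reviewable pieces.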

Output

A final “modern engineering” prompt/process to produce a tightly scoped, goal-aligned spec.


Layer 2: Verifier (make evaluation measurable and enforce verification)

Problem addressed

Karpathy framing referenced: “animals versus ghosts”

Three verification tactics

  1. Set evaluation criteria up front

    • Vague: “make the report look good.”
    • Precise: “report must have three sections, each ends with a recommendation.”
    • Add this into the verification prompt.
  2. Use a second AI model as a critic

    • “Second librarian” idea: a different model checks/grades the first model’s output using different knowledge/assumptions.
    • Mentions: for Claude Code workflows, use the Codex plugin to run consistency checks or validate steps via another system (e.g., “ensure both systems agree”).
  3. Pull external signal where possible

    • Technical: verify deployment by connecting Claude to the deployment system and confirming success.
    • Non-technical: load historical reports to enforce the required format/spec during verification.
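
The first tactic (precise criteria set up front) can be turned into an automatic check. The sketch below tests the example criterion "three sections, each ends with a recommendation." It assumes a report layout that is not specified in the video: sections start with a line beginning "## " and a recommendation is a final line beginning "Recommendation:". The function names are hypothetical.

```python
def split_sections(report: str) -> list[str]:
    """Split a report into section bodies on lines starting with '## ' (assumed layout)."""
    sections, current = [], None
    for line in report.splitlines():
        if line.startswith("## "):
            if current is not None:
                sections.append(current)
            current = []  # start collecting a new section body
        elif current is not None:
            current.append(line)
    if current is not None:
        sections.append(current)
    return ["\n".join(body).strip() for body in sections]


def verify_report(report: str) -> list[str]:
    """Return the failed criteria; an empty list means the report passes."""
    failures = []
    sections = split_sections(report)
    if len(sections) != 3:
        failures.append(f"expected 3 sections, found {len(sections)}")
    for i, body in enumerate(sections, start=1):
        last_line = body.splitlines()[-1] if body else ""
        if not last_line.startswith("Recommendation:"):
            failures.append(f"section {i} does not end with a recommendation")
    return failures


good = (
    "## Revenue\nUp.\nRecommendation: hold.\n"
    "## Costs\nFlat.\nRecommendation: cut.\n"
    "## Risks\nLow.\nRecommendation: monitor."
)
bad = "## Revenue\nUp.\n## Costs\nRecommendation: cut travel."

print(verify_report(good))
print(verify_report(bad))
```

The list of failures can be fed back to the model as part of the verification prompt, so a vague instruction like "make the report look good" is replaced by concrete, checkable feedback.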

Claim

Boris Cherny, the creator of Claude Code, is quoted as saying that with a feedback loop, Claude Code can produce output of 2–3x higher quality (as stated in the video).


Layer 3: Environment (tooling + persistent workspace that improves over time)

Analogy

Key complaint

Most people “build the workshop from scratch” each time; merely keeping chat history isn’t the same.

How to build the environment (practical components)

  1. Create and maintain a CLAUDE.md

    • Claude reads/injects it automatically on each prompt.
    • Include a verification plan so verification is not optional.
    • The video describes sections like:
      • repo/workspace description
      • routing and “custom skills”
      • knowledge architecture (where to look for info)
      • key rules that must always be followed
  2. Build an “LLM knowledge base” (Karpathy concept)

    • Create a local folder/retrieval structure so Claude can ingest materials and quickly find the right references.
    • Emphasizes: “your data is your moat.”
  3. Build reusable custom skills

    • If you do something repeatedly, make a custom skill/handbook for it.
    • Skills compound with usage (e.g., “run water through the hose”).
  4. Add true guardrails (rule enforcement at tool level)

    • Prompt-only rules like “don’t make up information” aren’t guaranteed.
    • For critical safety/accuracy, enforce restrictions using tool hooks (e.g., block edits to a protected folder such as one named “/important, don't edit”).
    • Guardrails categories:
      • Always do (autopilot-safe)
      • Ask first (double-check)
      • Never do (cannot be crossed)
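
The three guardrail categories can be expressed as a small policy function that a tool-level hook would consult before an action runs. This is a sketch under stated assumptions: it does not use Claude Code's actual hook interface, and the decide function, the action names, and the folder names "important" and "config" are hypothetical placeholders.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical policy table; the folder names are placeholders, not from the video.
NEVER_EDIT = [PurePosixPath("important")]
ASK_FIRST = [PurePosixPath("config")]


def _is_inside(path: PurePosixPath, folder: PurePosixPath) -> bool:
    """True if path is the folder itself or anything beneath it."""
    return path == folder or folder in path.parents


def decide(action: str, target: str) -> str:
    """Classify a proposed tool action as 'allow', 'ask', or 'deny'."""
    # Normalize first so 'notes/../important/x' cannot slip past the check.
    path = PurePosixPath(posixpath.normpath(target))
    if action == "read":
        return "allow"  # reads are treated as autopilot-safe in this sketch
    if any(_is_inside(path, folder) for folder in NEVER_EDIT):
        return "deny"  # never do: cannot be crossed
    if any(_is_inside(path, folder) for folder in ASK_FIRST):
        return "ask"  # ask first: a human double-checks
    return "allow"  # always do: autopilot-safe


print(decide("edit", "important/plan.txt"))
print(decide("edit", "config/settings.yaml"))
print(decide("edit", "notes/draft.md"))
```

Because the decision is made in code rather than in the prompt, a "deny" result holds even if the model ignores or forgets a written rule.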

Result

An “end-to-end Karpathy method” combining spec → verifier → environment.


“One thing to focus on” in the age of AI


Main speakers / sources (as indicated)
