Stop Writing Prompts: Your AI Is Waiting for a System That Runs Itself

A screenwriter doesn't stay on set to watch every camera after finishing the script.

But that is how most people work with AI today: you write the "script" (the prompt), then stay on set the whole time, monitoring every frame. When the AI makes a mistake, you correct it. When it goes off track, you pull it back. Round after round, you spend more time on set than the AI spends working.

Boris Cherny, creator of Claude Code, said something this June: "I no longer write prompts for Claude. I have loops running that prompt Claude for me."

Behind this statement is a paradigm shift: from Prompt Engineering (you write instructions) to Loop Engineering (you design systems). The point is not that AI has become smarter; it is that your role has changed.

From "Writing Lines" to "Building the Set"

The easiest way to understand this shift is to look at the magnitude of token consumption:

Stage                 Your Role                                 Token Magnitude
Prompt Engineering    Full-time operator                        Hundreds to thousands
Context Engineering   Information pipeline designer             Thousands to tens of thousands
Harness Engineering   System builder, occasional intervention   Tens of thousands to hundreds of thousands
Loop Engineering      Loop designer, system runs itself         Hundreds of thousands to continuous

Each stage doesn't replace the previous one; it builds on top of it. You still write prompts, but you no longer feed them manually round after round—an automatically running loop does it for you.

The core components of this loop are five plus one:

  • Heartbeat: A timed trigger. You don't check; the system wakes itself up on schedule.
  • Worktrees: Parallel isolation zones. Multiple agents can work simultaneously without conflict.
  • Skills: Persistent rule files. Agents can call them anytime, so you don't have to repeat yourself each time.
  • Connectors: Pipes to the outside world. Search, APIs, databases—agents can touch them directly.
  • Verifier: An independent judge. The agent that writes code and the agent that reviews code must not be the same.
  • Memory Spine: Persistent state outside of conversation. Prevents agents from forgetting during long-running tasks.

The first five components form a system that can run on its own; the memory spine is the central nervous system that ties them together.

Three Scenarios: How Loops Run

Scenario 1: Inventory Monitoring—From "You Check" to "It Checks for You"

You want to grab a new phone. Before, you refreshed the page ten times a day. The Loop Engineering approach: set a heartbeat to trigger every hour, define the goal as "the Out of Stock button no longer shows and the price is under budget," and let the memory spine record the last observed state. If the inventory hasn't changed, the loop reports nothing; it notifies you only when something changes.
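The state comparison described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the file name `inventory_state.json`, the observation fields `in_stock` and `price`, and the function names are all assumptions made for the example, and fetching the page and scheduling the hourly heartbeat are left out.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical location of the "memory spine": state kept outside any conversation.
STATE_FILE = Path("inventory_state.json")


def goal_met(observation: dict, budget: float) -> bool:
    """Goal from the scenario: the item is in stock and the price is under budget."""
    return observation["in_stock"] and observation["price"] < budget


def heartbeat_tick(
    observation: dict, budget: float, state_file: Path = STATE_FILE
) -> Optional[str]:
    """Handle one heartbeat. Return a message only when the state changed."""
    previous = json.loads(state_file.read_text()) if state_file.exists() else None
    state_file.write_text(json.dumps(observation))  # remember for the next tick
    if observation == previous:
        return None  # nothing changed since the last tick: stay silent
    # The very first tick has no previous state, so it is reported as a change.
    status = "GOAL MET" if goal_met(observation, budget) else "changed, goal not met"
    return f"{status}: {observation}"
```

In practice a scheduler such as cron would call `heartbeat_tick` once an hour with a fresh observation; the function itself holds no state between calls, which is the role the article gives to the memory spine.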

AI goes from a "question-answering tool" to a "sentry monitoring the situation."

Scenario 2: Cross-Platform Opportunity Scanning—From "You Search" to "It Screens for You"

The loop automatically scans platforms such as Reddit and X for posts mentioning "support team overload" or "seeking AI consultant." The skills component presets the scoring rules (a 1-10 quality rating), and connectors pull data via MCP (Model Context Protocol). Human gate: the AI only drafts private messages and never sends them automatically, which protects your reputation.
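A rough sketch of the scoring rule and the human gate might look like this. Everything here is illustrative: the phrase weights, the threshold `MIN_SCORE`, and the `Draft`, `screen`, and `send` names are assumptions, the posts are assumed to have been fetched already by a connector, and `send` only returns a string rather than contacting any platform.

```python
from dataclasses import dataclass

# Hypothetical scoring rules; in the article's terms these would live in a skill file.
KEYWORD_SCORES = {"support team overload": 6, "seeking ai consultant": 8}
MIN_SCORE = 6  # assumed threshold for drafting a message


@dataclass
class Draft:
    post_id: str
    score: int
    message: str
    approved: bool = False  # only a human reviewer sets this to True


def score_post(text: str) -> int:
    """Rate a post 1-10: start at 1, add the weight of each matched phrase, cap at 10."""
    lowered = text.lower()
    total = 1 + sum(w for phrase, w in KEYWORD_SCORES.items() if phrase in lowered)
    return min(total, 10)


def screen(posts: list[dict]) -> list[Draft]:
    """Turn already-fetched posts into drafts for the ones that score high enough."""
    drafts = []
    for post in posts:
        score = score_post(post["text"])
        if score >= MIN_SCORE:
            message = f"Draft reply for post {post['id']} (score {score}/10)"
            drafts.append(Draft(post["id"], score, message))
    return drafts


def send(draft: Draft) -> str:
    """Human gate: refuse to send anything a person has not approved."""
    if not draft.approved:
        raise PermissionError("human gate: draft has not been approved")
    return f"sent to {draft.post_id}"  # placeholder for a real delivery step
```

The gate sits in `send`, not in `screen`, so the loop can draft as much as it likes while nothing leaves the system without a person approving it.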

Before you even start searching, AI has already completed demand mining and draft preparation.

Scenario 3: Automated PR Fix—From "You Fix Bugs" to "It Stands Guard for You"

When a PR's tests fail on GitHub, the AI fixes the PR automatically. Key design: open a separate worktree branch for each fix task so that agents do not conflict. After the fix, a separate, independent model runs the tests to verify it: the code writer cannot be both player and referee. If the tests fail, the error output is fed back to the LLM, and the self-correction loop repeats until they pass.
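The writer/verifier split and the feedback loop can be sketched like this. It is a minimal sketch under stated assumptions: `propose_fix` and `verify` are hypothetical callables standing in for the two independent agents, worktree creation is omitted, and the `max_attempts` cap is an addition of this example (the text says the loop runs until the tests pass).

```python
from typing import Callable, Optional


def fix_until_green(
    propose_fix: Callable[[str], str],       # writer agent: error text -> candidate patch
    verify: Callable[[str], Optional[str]],  # verifier: patch -> error text, or None on pass
    initial_error: str,
    max_attempts: int = 3,                   # assumed cap, not part of the original description
) -> Optional[str]:
    """Feed verifier errors back to the writer until the tests pass or attempts run out."""
    error = initial_error
    for _ in range(max_attempts):
        patch = propose_fix(error)  # the writer never judges its own work
        result = verify(patch)      # a separate component decides pass or fail
        if result is None:
            return patch            # tests passed
        error = result              # failure output becomes the next prompt
    return None                     # give up and hand the PR to a human
```

Keeping `propose_fix` and `verify` as separate parameters makes the "player is not the referee" rule structural: the loop cannot be wired up without naming who verifies.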

Boris Cherny says this mode allows him to have agents automatically handle PRs even while on vacation.

Some Have Already Gone to Extremes

Steve Yegge (former Google engineer) built a system called Gas Town that runs 20-30 Claude Code instances simultaneously. A "Mayor" agent orchestrates, "Polecats" handle execution, and "Witnesses" watch for stuck agents. All state is stored in Git, so progress isn't lost even after a crash.

With this system, he submitted 40,000 lines of code and 100+ PRs in one month. The cost? 40 Claude Code Max accounts, thousands of dollars per month.

This is the double edge of Loop Engineering: it can indeed multiply output, but an amplifier doesn't care about direction. Good loops multiplied by good engineers are nuclear bombs; bad loops multiplied by bad decisions are meat grinders.

Real Engineering is Not About "Making It Run," But "Keeping It on Track"

Addy Osmani (Google Cloud AI Director) reminds us of a cold fact: a loop only saves money if it satisfies four conditions at once. Miss any one of them and the cost outweighs the benefit:

  1. The task repeats at least once a week—the setup cost of the loop is amortized over repeated runs.
  2. Results can be automatically verified—there are tests, compilation, or clear right-wrong judgments.
  3. Token budget can tolerate waste—loops inevitably have idle runs and invalid attempts.
  4. The agent has tools equivalent to a senior engineer—otherwise it can't even fix files correctly.

The two most typical failure modes:

  • Overbaking: With a vague goal (such as "make the code look better"), the AI may refactor the entire project over a minor issue and produce a pile of unwanted features.
  • Money Furnace: With no circuit breaker, an unsupervised loop can burn tens of millions of tokens in a few hours.
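One way to guard against the Money Furnace failure mode is a budget-based circuit breaker, sketched below. The class name, the two limits, and the convention that each loop step returns the number of tokens it used are assumptions made for this example; a real loop would also exit when its verifier reports that the goal has been reached.

```python
from typing import Callable


class CircuitBreaker:
    """Stops a loop once a token budget or an iteration budget is used up."""

    def __init__(self, max_tokens: int, max_iterations: int) -> None:
        self.max_tokens = max_tokens
        self.max_iterations = max_iterations
        self.tokens = 0
        self.iterations = 0

    def allow(self) -> bool:
        """True while both budgets still have room for another iteration."""
        return self.tokens < self.max_tokens and self.iterations < self.max_iterations

    def record(self, tokens_used: int) -> None:
        """Account for one finished iteration."""
        self.tokens += tokens_used
        self.iterations += 1


def run_guarded(step: Callable[[], int], breaker: CircuitBreaker) -> str:
    """Run `step` (which returns the tokens it consumed) until the breaker trips."""
    while breaker.allow():
        breaker.record(step())
    return f"stopped after {breaker.iterations} iterations, {breaker.tokens} tokens"
```

Because the check happens before each iteration, the final step can overshoot the token budget by at most one step's worth of tokens; a stricter breaker would need a per-step estimate up front.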

Someone on Zhihu summarized it well: "Loops are real, but most people can't use them yet." This isn't cold water; it's helping you avoid a "fruitless build."

Your New Role: Not Operator, but System Architect

Loop Engineering marks the era of "Deploy and Walk Away" in AI collaboration.

Your edge is no longer crafting clever prompts, but precisely defining three things:

  1. What to observe? (What does the heartbeat check each time it fires?)
  2. When to stop? (How does the verifier determine the goal is achieved?)
  3. What to do when things go wrong? (Where is the circuit breaker?)

This is not a story of "AI is getting stronger so humans can coast." The model's capabilities haven't leaped qualitatively; what changed is how thick a foundation you're willing to lay for it.

From screenwriter to set architect: you haven't left the set; your position on the set has changed.


References: Addy Osmani "Loop Engineering", Boris Cherny Claude Code talk, O'Reilly Radar, Steve Yegge Gas Town project, Cobus Greyling blog
