RALPH LOOP

Ralph Loop Failure Modes (Context Rot, Runaway Cost) and How to Avoid Them

[Image: a green CRT terminal showing a looping agent process, with warning markers branching off to labeled guardrails, suggesting failure modes mapped to fixes.]

A Ralph loop does not fail in mysterious ways. It fails in a short list of predictable ones: the agent forgets the plot, the loop burns money forever, it thrashes on a task that is too big or too vague, it ships work that looks done but is wrong, or it damages something on your machine. Each of these has a known cause and a specific guardrail. None of them require luck to avoid.

This post walks through every failure mode of running an AI coding agent in a loop, why each one happens, and the exact mechanism in ralph.sh that prevents it. If you are setting up a loop for the first time or wondering why a run went sideways, this is the checklist.

What are the failure modes of a Ralph loop?

Five, mostly. They are context rot, runaway cost, thrashing on a bad task, silent wrong work, and sandbox damage. Every one maps to a guardrail you can turn on or design around. Here is the short version before the details:

  1. Context rot. Fixed by fresh context per iteration and state on disk.
  2. Runaway cost and infinite loops. Fixed by an iteration cap, a completion promise, and a budget you set.
  3. Thrashing on a bad or oversized task. Fixed by atomic tasks, one task per iteration, and the BLOCKED and DECIDE promise tags.
  4. Silent wrong work. Fixed by verification gates and screenshots.
  5. Sandbox escape or damage. Fixed by running the agent inside a Docker Sandbox.

The rest of this post takes each one in turn.

Context rot is the slow decay of an agent’s reasoning as its context window fills with stale and conflicting information. Old tool output, abandoned plans, your corrections, the agent’s own apologies, half-finished edits: all of it competes with the task that actually matters right now. The longer a single session runs, the worse the signal-to-noise ratio gets. You see it as repeated mistakes, forgotten constraints, and confident edits that quietly undo earlier good work.

This is the failure the Ralph technique was built to kill, so the guardrail is the core of the design rather than a bolt-on.

Each iteration spawns the agent with a clean context window. It reads only what matters right now: the prompt, a short project summary, the task list, and the spec for the one task it is about to work. There is no transcript of the previous twelve iterations clogging the window. The model spends its attention on the current task instead of re-reading its own history.

Think of it as a series of sprints rather than one marathon. A marathon session accumulates fatigue. A fresh sprint each time stays sharp.

If the agent forgets everything between iterations, how does it make progress? Because progress does not live in the conversation. It lives in files. Ralph keeps state under .agent/:

.agent/
├── PROMPT.md # Prompt sent to the agent each iteration
├── tasks.json # Task lookup table
├── tasks/ # Per-task specs (TASK-{ID}.json)
├── logs/LOG.md # Progress log
└── history/ # Per-iteration output logs

When a fresh agent boots, it reads .agent/tasks.json to see what is done, reads .agent/logs/LOG.md to see what happened recently, and reads the git log to see the actual committed changes. The code on disk is the record. The window never grows large enough to rot because it resets every pass. The deeper version of this design, including how the prompt file makes a fresh agent reorient in seconds, is in how to write the PROMPT.md file that drives a Ralph loop.
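The orientation step above can be sketched as a small shell function. This is an illustration only, not code from ralph.sh: the function name `orient`, the 20-line log tail, and the 10-commit window are all assumptions. Only the file paths come from the layout shown earlier.

```shell
# Hypothetical sketch: print the on-disk state a fresh iteration would consult.
# orient [AGENT_DIR]   (AGENT_DIR defaults to .agent)
orient() {
  local dir="${1:-.agent}"

  echo "== tasks =="
  cat "$dir/tasks.json"            # what is done and what is pending

  echo "== recent log =="
  tail -n 20 "$dir/logs/LOG.md"    # what happened recently (20 lines is an assumed window)

  echo "== recent commits =="
  # The committed code is the record; fall back quietly outside a git repo.
  git log --oneline -n 10 2>/dev/null || echo "(no git history)"
}
```

Nothing in this output depends on a previous conversation, which is the point: every pass can rebuild its bearings from files alone.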

Runaway cost and infinite loops: the money fire

The naive Ralph loop is one line:

while :; do cat PROMPT.md | your-agent-cli; done

That loop never stops on its own. It has no iteration cap and no notion of success. Point it at a paid model and walk away, and it will happily keep calling the API until your bill says otherwise. Even when the work is genuinely done, a loop without a stop condition will keep re-running the agent to rediscover that there is nothing left to do, paying full freight each time.

Runaway cost has two shapes. One is the loop that never terminates. The other is the loop that terminates eventually but does far more iterations than the work needed. Both are guardrail problems, not model problems.

ralph.sh takes a hard cap on iterations. The default is ten. You set a different one explicitly:

./ralph.sh -n 50
./ralph.sh --max-iterations 5

To smoke test the setup without committing to a long run, do exactly one pass:

./ralph.sh --once

When the loop hits the cap with work still pending, it exits with code 1 (MAX_ITERATIONS). That is not a crash. It is the safety net doing its job. You read the log, decide whether to top up the budget, and run it again.
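To make the contrast with the uncapped one-liner concrete, here is a minimal sketch of a capped loop. It is not the ralph.sh source. The function name `run_capped` is hypothetical, and it assumes the wrapped command returns 0 once the work is finished and non-zero while work is still pending.

```shell
# Hypothetical sketch: run a command at most MAX times.
# run_capped MAX CMD [ARGS...]
# Returns 0 as soon as CMD succeeds, 1 if the cap is reached first.
run_capped() {
  local max="$1" i=1
  shift
  while [ "$i" -le "$max" ]; do
    echo "iteration $i of $max" >&2
    if "$@"; then
      return 0          # the command reported that the work is done
    fi
    i=$((i + 1))
  done
  return 1              # cap reached with work still pending
}
```

The important property is that the loop has two exits instead of none: success ends it early, and the cap ends it no matter what the command does.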

The loop stops on an explicit signal, never on a guess. The agent emits a promise tag to declare where things stand:

  • <promise>COMPLETE</promise>: every task is finished.
  • <promise>BLOCKED:reason</promise>: the agent needs a human to clear something.
  • <promise>DECIDE:question</promise>: the agent hit a real decision point.

ralph.sh watches for these tags and translates them into exit codes:

Code  Meaning
0     COMPLETE. All tasks finished.
1     MAX_ITERATIONS. Reached the cap.
2     BLOCKED. Needs human help.
3     DECIDE. Needs a human decision.

A COMPLETE ends the run cleanly the moment the work is actually done, so you do not pay for victory laps. The full design of the stop condition, and why a promise beats letting the agent decide it feels finished, is in completion promises and exit codes.
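One way to picture the tag-to-exit-code translation is a single pattern match over the agent's output. The sketch below is an assumption about how such a mapping could look, not how ralph.sh implements it. The function name `promise_exit_code` is hypothetical, and so is the precedence when more than one tag appears (COMPLETE, then BLOCKED, then DECIDE).

```shell
# Hypothetical sketch: map a promise tag in the agent output to an exit code.
# promise_exit_code OUTPUT
# Prints 0, 2, or 3 when a tag is found; prints nothing and returns 1 otherwise.
promise_exit_code() {
  case "$1" in
    *"<promise>COMPLETE</promise>"*)    echo 0 ;;  # all tasks finished
    *"<promise>BLOCKED:"*"</promise>"*) echo 2 ;;  # needs human help
    *"<promise>DECIDE:"*"</promise>"*)  echo 3 ;;  # needs a human decision
    *) return 1 ;;                                 # no tag: keep iterating
  esac
}
```

Output with no tag maps to "keep going", which is why the iteration cap still matters: an agent that never emits a tag would otherwise run forever.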

Cost control is mostly about deciding limits before you start: how many iterations, which model, and what the loop is allowed to touch. A cheaper model on a well-specified task can outperform an expensive model on a vague one, because the expensive model spends its budget flailing. The full treatment of caps, model choice, and the verification gates that keep a loop from spinning is in cost control for autonomous AI coding agents.

Give an agent a task like “refactor the billing system” and it will thrash. The task is too big to hold in one iteration and too vague to verify, so the agent makes a sprawling change, second-guesses it, partially reverts, and produces a commit you cannot review. Repeat that across iterations and the loop spins without converging. This is the most common reason a loop “does not work,” and it is almost always a task design problem rather than an agent problem.

The unit of work matters. A task should be small enough to finish in a single iteration and specific enough to verify. “Add a unit test for the discount calculation in cart.ts” is atomic. “Improve test coverage” is not. The loop is only as good as the tasks you feed it. Garbage tasks in, garbage commits out. How to decompose a large project into atomic, independently verifiable packets is the heart of spec-driven development, and it is upstream of everything the loop does.

The rule that keeps a loop reliable is blunt: the agent completes exactly one task, commits, and stops. It never batches several tasks into one iteration. Batching is how an agent drifts off scope and produces a tangled diff. One task per invocation keeps each commit reviewable and each iteration’s blast radius small.

This pairs with the task lookup table. The agent reads .agent/tasks.json, picks the highest-priority incomplete task, opens its spec at .agent/tasks/TASK-{ID}.json, works only those steps, and exits. The table is a lightweight index, so a project can hold hundreds of tasks while each iteration loads only the one it needs.

Fix: BLOCKED and DECIDE instead of guessing

When a task is genuinely underspecified or the agent hits a wall it cannot clear, the worst outcome is for it to guess and burn iterations on a wrong assumption. The promise tags give it an honest exit. BLOCKED:reason stops the loop and hands you the blocker. DECIDE:question stops and asks for your call. The loop returns exit code 2 or 3, and you are not staring at a dead terminal wondering what happened. A loop that stops to ask is far cheaper than a loop that thrashes in silence.
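On the calling side, a wrapper around an overnight run can turn the exit code into a message you will actually notice in the morning. The sketch below assumes only the four documented exit codes. The function name `describe_exit` and the message wording are hypothetical.

```shell
# Hypothetical sketch: turn the loop's exit code into a human-readable status.
# describe_exit CODE
describe_exit() {
  case "$1" in
    0) echo "COMPLETE: all tasks finished" ;;
    1) echo "MAX_ITERATIONS: cap reached, review the log before re-running" ;;
    2) echo "BLOCKED: the agent needs human help" ;;
    3) echo "DECIDE: the agent needs a human decision" ;;
    *) echo "UNKNOWN: unexpected exit code $1" ;;
  esac
}

# Example usage (run the loop, then report how it ended):
#   ./ralph.sh -n 20; describe_exit "$?"
```

The catch-all branch is deliberate: a code outside the documented four most likely means the script itself crashed, and that deserves a different reaction from a clean stop.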

Silent wrong work: it looks done but it is broken

The scariest failure is the one that looks like success. The agent flips a task to done, commits, and moves on, but the code does not actually work. Maybe it compiles and the tests it wrote test the wrong thing. Maybe the UI renders but the button does nothing. Across an overnight run, silent wrong work compounds: later tasks build on the broken one, and you wake up to a green task list and a red product.

The cause is letting the agent self-assess on a vibe. The fix is to take that judgment away from the agent and give it to a tool that cannot be fooled by optimism.

Before the agent calls a task done, it runs the project’s checks. Ralph assumes a verification stack: Playwright for end-to-end tests, Vitest for unit tests, TypeScript for types, ESLint for linting, and Prettier for formatting. The repo mantra is exactly this blunt: if you didn’t test it, it doesn’t work.

The gate is binary. The checks pass or the iteration is not done. If a check fails, the agent loops back inside the same iteration and fixes the code until it passes. The agent does not get to claim success while the test suite is red. This is what makes a commit in the git log trustworthy rather than aspirational.
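A binary gate is easy to sketch: run each check in order and refuse to report success if any one fails. The function below is a hypothetical illustration, not part of ralph.sh. The name `run_gate` is an assumption, and the commands in the usage comment are common invocations for the tools named above that your project may wire up differently.

```shell
# Hypothetical sketch: a binary verification gate.
# run_gate CHECK...
# Each argument is one check, given as a single command string.
# Runs the checks in order; returns 1 at the first failure, 0 if all pass.
run_gate() {
  local check
  for check in "$@"; do
    echo "gate: $check" >&2
    if ! sh -c "$check"; then
      echo "gate FAILED: $check" >&2
      return 1          # stop here: the task is not done
    fi
  done
  return 0              # every check passed
}

# Example usage (assumed commands for a TypeScript project):
#   run_gate "npx tsc --noEmit" "npx eslint ." "npx prettier --check ." \
#            "npx vitest run" "npx playwright test"
```

Stopping at the first failure keeps the feedback short, so the agent sees one concrete thing to fix rather than a wall of cascading errors.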

Fix: screenshots for the things tests miss

Some failures do not show up in a unit test. A layout that is technically rendered but visually broken passes a DOM assertion and fails a human glance. For UI work, the loop takes a screenshot as part of completing a task, so the artifact you review in the morning includes proof of what the agent saw. A screenshot is a cheap, honest signal that catches the class of bugs that slip past assertions.

Autonomous agents run in bypass-permissions mode, often called YOLO mode, because stopping to approve every file write would defeat the point of an overnight loop. Claude Code calls this --dangerously-skip-permissions. The flag name is honest. An agent with your permissions and no approval prompts can read your SSH keys, touch files outside the repo, run arbitrary install scripts, and make network calls you never intended. On your laptop, that is a real risk. The mistake is treating the agent’s good behavior as the safety boundary.

Run the agent inside an isolated Docker Sandbox microVM. The boundary is the sandbox, not the agent’s restraint. ralph.sh does this by default through the sbx CLI, giving each run a deterministic sandbox name of the form ralph-<agent>-<current-dir>-<hash8>. Inside that microVM, the agent can run in YOLO mode because the worst it can damage is the sandbox.

You can inspect and operate the sandbox like any container:

sbx ls
sbx exec -it <name> bash

Network is deny-by-default. The agent gets only the domains you allow:

sbx policy allow network <name> registry.npmjs.org

That last point matters for both safety and cost. A deny-by-default network blocks exfiltration and stops an agent from wandering off to install or call something you did not sanction. The isolation model is documented in the Docker Sandboxes docs, and the bypass flag the loop relies on is in the Claude Code docs.

Here is the whole picture in one view: each failure on the left, each guardrail on the right.

flowchart LR
    subgraph Failures["Failure modes"]
        F1["Context rot"]
        F2["Runaway cost / infinite loop"]
        F3["Thrashing on a bad task"]
        F4["Silent wrong work"]
        F5["Sandbox damage"]
    end
    subgraph Guards["Guardrails"]
        G1["Fresh context + state on disk"]
        G2["-n cap + completion promise + budget"]
        G3["Atomic tasks, one per iteration, BLOCKED / DECIDE"]
        G4["Verification gates + screenshots"]
        G5["Docker Sandbox microVM"]
    end
    F1 --> G1
    F2 --> G2
    F3 --> G3
    F4 --> G4
    F5 --> G5

Read it as a contract. If you have turned on the guardrail, the matching failure mode is handled. If a run went wrong, find which guardrail was missing or misconfigured.

Before you start a multi-hour or overnight loop, walk this list. It is the difference between waking up to a clean branch and waking up to a mess.

  1. Cap the iterations. Pass -n with a number you are willing to pay for. Do not run the uncapped one-liner.
  2. Test the setup with one pass. Run ./ralph.sh --once and read the output before trusting a long run.
  3. Check your tasks are atomic. Each task should be finishable in one iteration and verifiable by a check. If you cannot write the acceptance criteria, the task is not ready.
  4. Confirm the verification stack runs. Make sure tests, lint, and type checking actually execute in the project, because the loop leans on them as its truth signal.
  5. Run in the sandbox. Keep the agent inside the Docker Sandbox and the network on deny-by-default. Allow only the domains the build genuinely needs.
  6. Pick the right agent and model. Switch agents with --agent and pass model flags after a separator:
./ralph.sh --agent codex -- --model gpt-5.5
./ralph.sh -a gemini -- --model pro

Supported agents are claude (the default), codex, copilot, cursor, gemini, and opencode.

A Ralph loop is persistence with a memory and a stop button. Its failure modes are the failure modes of any long autonomous process: it forgets, it overspends, it chases a bad goal, it lies to itself about success, and it can break things it touches. The technique does not pretend these do not exist. It pairs each one with a guardrail and makes the guardrails the default.

None of this removes your judgment from the loop. You still write the tasks, set the budget, and review the commits. What the guardrails buy you is the confidence to let an agent grind for hours without standing over it. Get the five right and the loop becomes boring in the best way: it either finishes, or it stops and tells you exactly why.

Frequently asked questions

What are the most common ways a Ralph loop fails?

There are five common failure modes: context rot where the agent loses the plot over a long session, runaway cost or infinite loops, thrashing on a task that is too big or too vague, silent wrong work that looks done but is broken, and damage from running an agent without a sandbox. Each one maps to a specific guardrail in ralph.sh.

How does a Ralph loop avoid context rot?

It resets the context window every iteration and stores project state on disk. Because the agent reads the task list, logs, and git history each pass instead of carrying a growing conversation, the window never gets large enough to rot. Progress lives in files like tasks.json and commits, not in chat memory.

How do I stop a Ralph loop from burning money?

Set an iteration cap with -n, rely on the completion promise so the loop stops the moment all tasks are done, and choose your model and budget before you start. The loop exits with code 1 when it hits the cap, which is a safety net rather than a failure. Running a single pass with --once first confirms the setup before a long run.

Why does my agent thrash instead of finishing the task?

Almost always because the task is too large or too vaguely specified to finish and verify in one iteration. The fix is atomic tasks, one task per invocation, and clear acceptance criteria. When a task is genuinely blocked or needs a decision, the agent should emit a BLOCKED or DECIDE promise and stop rather than guess.

Is it safe to run an autonomous coding agent in YOLO mode?

Only inside a sandbox. Bypass-permissions mode is dangerous on your laptop because the agent can read credentials and touch files outside the repo. Run it inside a Docker Sandbox microVM with deny-by-default networking, so the boundary is the sandbox rather than the agent's good behavior.