RALPH LOOP

How to Write the PROMPT.md File That Drives a Ralph Loop

A diagram of a PROMPT.md file feeding into each fresh-context iteration of a Ralph loop

PROMPT.md is the instruction the loop sends to the agent on every single iteration. It is not setup documentation and it is not a one-time kickoff message. It is the text a fresh agent reads at the top of each pass, with zero memory of the last one, so it has to make that agent reorient from disk and start working within seconds.

Get it right and a stateless agent reads its task list, picks one task, verifies the result, and commits, over and over, until the work is done. Get it wrong and the loop drifts: the agent batches tasks, forgets where state lives, or never signals that it finished. This post shows exactly what to put in .agent/PROMPT.md, with an annotated example you can copy.

If the loop itself is new to you, start with what the Ralph technique is and come back. The pattern was popularized by Geoffrey Huntley, who framed Ralph as a Bash loop that builds software while you sleep. The prompt file is the steering wheel of that loop.

Why does PROMPT.md matter so much in a Ralph loop?

The defining property of a Ralph loop is that each iteration starts the agent with a clean context window. There is no carryover transcript. The agent that runs iteration 31 knows nothing about iterations 1 through 30 except what it can read from disk. This is deliberate. It avoids context rot, where a long session drifts and the agent loses the plot as old output piles up in the window. For the deeper version of this idea, see context engineering for long-running agents.

The trade for that clean window is memory. A fresh agent remembers nothing, so the filesystem and git history become the memory layer. PROMPT.md is the bridge between the two. It is the only text guaranteed to be in front of the agent at the start of every pass, and its whole job is to point an amnesiac agent at where its memory lives and tell it what to do next.

ralph.sh re-sends the same PROMPT.md on each iteration. (Anthropic shipped an official Claude Code plugin that does this with a Stop Hook that re-injects the prompt. Our implementation is the hackable ralph.sh script, which loops the agent directly.) Because the text is identical every pass, it has to be written for an agent that is reading it for the first time, every time. That single constraint drives everything else in this post.

What does every PROMPT.md need to include?

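A working prompt file does four jobs: it points the agent at its state files, enforces one task per iteration, defines verification, and specifies the completion signals. Miss any one of them and the loop stalls in a predictable way.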

A fresh agent cannot remember what it did, so the first thing the prompt must do is hand it a map of what lives on disk. Reference the files by path so the agent reads them before doing anything:

  • .agent/prd/SUMMARY.md: the condensed description of what is being built. This is the “why.”
  • .agent/tasks.json: the task lookup table, the list of work with passes flags.
  • .agent/tasks/TASK-{ID}.json: the full spec for a single task, including its steps and acceptance criteria.
  • .agent/logs/LOG.md: the running log of past iterations, newest at the top, so the agent can see what already happened.
  • .agent/STEERING.md: critical work injected mid-run that the agent handles before resuming the task list.
  • .agent/STRUCTURE.md: the current directory layout, so the agent does not reinvent paths.

The agent reorients by reading these in order. The summary tells it the goal, the log tells it the recent past, the task table tells it what is left.

The one-task rule is what makes the whole thing reliable. The agent picks the highest-priority task with passes: false, works only that task, commits, and stops. It never batches. Batching is the most common way a loop goes off the rails, because a single commit that touches five tasks is impossible to verify and impossible to revert cleanly. One task per invocation keeps each step small enough to test and small enough to bisect later. The whole rule is worth its own read in one task per iteration.
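The selection step can be sketched in a few lines. This is an illustration, not code from ralph.sh: the post does not show the tasks.json schema, so the `id`, `priority`, and `passes` fields, and the convention that a lower `priority` number means more urgent, are assumptions.

```python
import json
from typing import Optional


def next_task(tasks: list[dict]) -> Optional[dict]:
    """Return the single highest-priority task that has not passed yet.

    Assumed (hypothetical) schema: each task is a dict with "id",
    "priority" (lower number = more urgent), and "passes" (bool).
    """
    pending = [t for t in tasks if not t.get("passes", False)]
    if not pending:
        return None  # nothing left: this is when COMPLETE would be emitted
    return min(pending, key=lambda t: t["priority"])


if __name__ == "__main__":
    sample = json.loads("""
    [
      {"id": "TASK-1", "priority": 1, "passes": true},
      {"id": "TASK-2", "priority": 3, "passes": false},
      {"id": "TASK-3", "priority": 2, "passes": false}
    ]
    """)
    print(next_task(sample)["id"])  # TASK-3
```

The function returns one task or nothing. There is deliberately no way to ask it for "the next three," which mirrors the rule the prompt enforces.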

Define how the agent verifies its own work

A loop is only as good as its verification, because tests are what gate progress without you in the chair. The prompt has to spell out the stack the agent runs before it commits: Vitest for unit tests, Playwright for end-to-end tests, tsc for types, ESLint for lint, Prettier for format. For UI work, the prompt should require a Playwright smoke test and a saved screenshot so the agent has visual ground truth. The repo mantra is blunt: if you didn’t test it, it doesn’t work.
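To make "gate" concrete, here is a rough sketch of a runner that executes the stack in order and stops at the first failure. The command lines are common invocations of these tools, not the ones the repo necessarily uses, and the gate order is an arbitrary choice for this sketch. `first_failing_gate` and `run_command` are hypothetical names.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Assumed invocations; a real project may wrap these in npm scripts.
GATES: list[tuple[str, list[str]]] = [
    ("types", ["npx", "tsc", "--noEmit"]),
    ("lint", ["npx", "eslint", "."]),
    ("format", ["npx", "prettier", "--check", "."]),
    ("unit", ["npx", "vitest", "run"]),
    ("e2e", ["npx", "playwright", "test"]),
]


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def first_failing_gate(
    gates: Sequence[tuple[str, Sequence[str]]] = GATES,
    run: Callable[[Sequence[str]], int] = run_command,
) -> Optional[str]:
    """Run gates in order; return the name of the first failure, or None."""
    for name, cmd in gates:
        if run(cmd) != 0:
            return name  # stop early: later gates are not run
    return None
```

Calling `first_failing_gate()` from the project root returns `None` only when every command exits 0, which is the condition the prompt should require before the commit step. The `run` parameter exists so the logic can be exercised without the Node toolchain installed.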

The loop stops on an explicit signal, not a vibe. The prompt defines the promise tags the agent emits so ralph.sh knows what happened:

<promise>TASK-{ID}:DONE</promise> one task finished, stop this iteration
<promise>COMPLETE</promise> all tasks finished, exit the loop
<promise>BLOCKED:reason</promise> needs human help
<promise>DECIDE:question</promise> needs a decision

Those tags map to exit codes, which is what makes the loop scriptable in CI or a cron job:

./ralph.sh -n 50
echo $? # 0 COMPLETE, 1 MAX_ITERATIONS, 2 BLOCKED, 3 DECIDE

The per-task DONE tag is what ends a single iteration. The COMPLETE tag is what ends the entire run. Both belong in the prompt, and the loop logic depends on the agent emitting them exactly. The full mechanics live in completion promises and exit codes.
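As an illustration of why exact tags matter, a parser for them might look like the sketch below. It is not the logic inside ralph.sh. The tag formats and exit codes come from the lists above; the function name and the choice to honor the last tag in the output are assumptions.

```python
import re
from typing import Optional

PROMISE_RE = re.compile(r"<promise>(.*?)</promise>", re.DOTALL)

# Exit codes as listed in the post. Code 1 (MAX_ITERATIONS) is produced by
# the loop running out of passes, not by any tag.
EXIT_CODES = {"COMPLETE": 0, "BLOCKED": 2, "DECIDE": 3}


def parse_promise(output: str) -> Optional[tuple[str, str]]:
    """Return (kind, detail) for the last promise tag in the output, or None."""
    matches = PROMISE_RE.findall(output)
    if not matches:
        return None
    body = matches[-1].strip()
    if body == "COMPLETE":
        return ("COMPLETE", "")
    kind, _, detail = body.partition(":")
    if kind in ("BLOCKED", "DECIDE"):
        return (kind, detail)  # detail is the reason or the question
    if body.endswith(":DONE"):
        return ("DONE", body[: -len(":DONE")])  # detail is the task id
    return None  # unrecognized tag: treat as no signal
```

A tag with a typo, such as `<promise>COMPLETED</promise>`, falls through to `None`, so the loop would keep iterating. That is the practical reason the prompt has to spell the tags out character for character.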

Here is the shape of a real .agent/PROMPT.md, trimmed to the load-bearing parts. Notice how it front-loads the one-task rule, points at state with @ file references, lays out a numbered task flow, then closes with hard rules and help tags.

> ONE TASK PER INVOCATION. Complete one task from @.agent/tasks.json,
> commit, output <promise>TASK-{ID}:DONE</promise>, and STOP.
## Overview
You are implementing the project described in @.agent/prd/SUMMARY.md
## Required Setup
Run `npm run dev` (as a background process) in the `src` directory.
## Before Starting
Check @.agent/STEERING.md for critical work. Handle it in sequence,
remove when done, then proceed to tasks.
## Task Flow
1. Pick the highest-priority task with `passes: false` in tasks.json
2. Read the full spec: .agent/tasks/TASK-{ID}.json
3. Check existing structure in @.agent/STRUCTURE.md
4. Implement step by step and write a unit test
5. UI tasks: Playwright smoke test, save a screenshot, verify it
6. Run eslint --fix, prettier --write, and e2e for affected files
7. Run tsc and unit tests project-wide
8. All tests must pass. Broke an unrelated test? Fix it first.
9. Set `passes: true` in tasks.json for the completed task
10. Log entry to .agent/logs/LOG.md (date, summary, screenshot path)
11. Commit using the Conventional Commit format
## Rules
- Only ONE task per invocation. After committing, output the DONE
promise and STOP. Do NOT read the next task.
- Kill background processes before the promise tag.
- No git push.
- When ALL tasks pass, output <promise>COMPLETE</promise> and nothing else.
## Help Tags
- BLOCKED: environment issues you cannot fix from the sandbox
- DECIDE: a real decision a human has to make

A few things are doing the heavy lifting here, and they are worth calling out:

  • The very first line is the one-task rule in a blockquote, and the Rules section repeats it. The agent meets the rule before anything else and again before it finishes reading, because batching is the failure mode that wrecks the most runs.
  • Every state file is referenced with @ so the agent loads it. The prompt does not paste the contents inline. It points, and the agent reads fresh each pass.
  • The task flow is numbered and verification is steps 6 through 8, before the commit at step 11. Verification is not optional and it is not last.
  • The Before Starting step makes STEERING.md the first thing checked before any task is picked, so you can steer a running loop by editing one file mid-run.
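Because the prompt points at files instead of pasting them, a missing file is a silent failure. A small preflight check, sketched here under the assumption that @ references are plain relative paths, can list what the prompt points at and what is absent. The function names are hypothetical.

```python
import re
from pathlib import Path

# Matches @-prefixed relative paths such as @.agent/tasks.json
AT_REF_RE = re.compile(r"@([\w./-]+)")


def at_references(prompt_text: str) -> list[str]:
    """Return unique @-referenced paths in order of first appearance."""
    seen: dict[str, None] = {}
    for path in AT_REF_RE.findall(prompt_text):
        seen.setdefault(path.rstrip("."), None)  # drop a sentence-ending period
    return list(seen)


def missing_references(prompt_text: str, root: Path) -> list[str]:
    """Return referenced paths that do not exist under the project root."""
    return [p for p in at_references(prompt_text) if not (root / p).exists()]
```

Running `missing_references(Path(".agent/PROMPT.md").read_text(), Path("."))` before starting a long run should return an empty list.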

This is how PROMPT.md feeds each iteration. The same text re-enters a fresh context every pass, and the agent rebuilds its understanding from disk before touching code:

flowchart TD
    P["PROMPT.md (re-sent every iteration)"] --> S["ralph.sh starts an iteration"]
    S --> R["Fresh-context agent reads PROMPT.md"]
    R --> O["Reorient from disk: SUMMARY.md, tasks.json, LOG.md, STEERING.md"]
    O --> T["Pick one task with passes:false"]
    T --> W["Implement, run tests, lint, types"]
    W --> G{"All gates pass?"}
    G -->|"No"| B["Fix next pass, or emit BLOCKED"]
    G -->|"Yes"| C["Commit, set passes:true, emit DONE"]
    C --> Q{"All tasks done?"}
    Q -->|"No"| P
    Q -->|"Yes"| D["Emit COMPLETE, loop exits 0"]
    B --> P

The arrow from the “No” branches back up to PROMPT.md is the entire point. Nothing carries over in the agent’s head. The prompt and the files on disk are what survive between passes.
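The flow can also be read as a few lines of control logic. The sketch below is a simplified model, not a port of ralph.sh: `run_agent` is a hypothetical stand-in for launching the agent CLI with a clean context, and re-reading the file on every pass is a choice made for this sketch.

```python
from pathlib import Path
from typing import Callable

MAX_ITERATIONS_EXIT = 1  # exit codes follow the mapping given in the post


def run_loop(
    prompt_path: Path,
    run_agent: Callable[[str], str],
    max_iterations: int,
) -> int:
    """Send the same prompt to a fresh agent each pass until a terminal tag appears.

    `run_agent` (hypothetical) takes the prompt text and returns the agent's
    output. Nothing from one call is passed into the next.
    """
    for _ in range(max_iterations):
        prompt = prompt_path.read_text()  # same file, read again every pass
        output = run_agent(prompt)        # fresh context: only the prompt goes in
        if "<promise>COMPLETE</promise>" in output:
            return 0
        if "<promise>BLOCKED:" in output:
            return 2
        if "<promise>DECIDE:" in output:
            return 3
        # TASK-{ID}:DONE or no tag at all: start the next iteration
    return MAX_ITERATIONS_EXIT
```

The only input to each iteration is the prompt text. Everything else the agent knows, it has to read from disk itself.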

How do you switch modes by editing PROMPT.md?

The default PROMPT.md is written for implementation: build the next task, test it, commit. But the loop is mode-agnostic. The agent does whatever the prompt tells it to, so you change the loop’s behavior by editing one file. The state files, the one-task rule, and the promise tags stay the same. Only the task flow changes.

A few practical modes:

  • Implementation (default): pick a passes: false task, build it, verify, commit. This is the shape shown above.
  • Refactor: keep behavior identical, improve structure. The task flow becomes “refactor one module, prove behavior is unchanged by running the existing tests, commit.” The acceptance gate is that no test changed and all of them still pass.
  • Review: read the diff from the last run and flag issues instead of writing features. The task flow becomes “read recent commits, check against the PRD and the code standards, write findings to a file, commit the report.” No production code changes.
  • Test backfill: write missing tests for code that already exists. The task flow becomes “pick an untested module, write unit and e2e coverage, confirm the suite passes, commit.”

Because the mode lives entirely in PROMPT.md, you can keep a few prompt variants in version control and swap them in for a run. Run an implementation pass overnight, then point the same loop at a review prompt in the morning to audit what it built. The loop machinery in ralph.sh does not change. You are only changing the instruction.
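One way to manage variants is a small helper that copies the chosen one into place before a run. This is only a sketch: the `.agent/prompts/` directory and the per-mode file names are hypothetical, and only `.agent/PROMPT.md` comes from the layout described in this post.

```python
import shutil
from pathlib import Path


def activate_prompt(project_root: Path, mode: str) -> Path:
    """Copy .agent/prompts/<mode>.md over .agent/PROMPT.md and return the target.

    The prompts/ directory and its file names are hypothetical.
    """
    agent_dir = project_root / ".agent"
    variant = agent_dir / "prompts" / f"{mode}.md"
    if not variant.is_file():
        raise FileNotFoundError(f"no prompt variant for mode {mode!r}: {variant}")
    target = agent_dir / "PROMPT.md"
    shutil.copyfile(variant, target)  # overwrites the active prompt
    return target
```

With this in place, `activate_prompt(Path("."), "review")` followed by a normal ralph.sh run would switch the loop into review mode without touching the script.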

Most broken loops trace back to a prompt that fails at one of the four jobs above. These are the mistakes that show up most.

“Build the feature and make it good” gives a fresh agent nothing to act on. It does not know which task, where the spec is, or what “good” means. A vague prompt produces a vague run: the agent improvises, picks an arbitrary task, skips verification, and you wake up to a branch you cannot trust. Be concrete. Name the files, name the rule, name the gates. The agent can only be as precise as the prompt.

Assuming the agent remembers earlier iterations is the subtle mistake. It is tempting to write “continue where you left off” or “you already set up the database.” A fresh-context agent did not leave off anywhere and does not remember setting anything up. Every instruction that assumes memory is an instruction the agent cannot follow. Write the prompt as if the agent has never seen the project, because on every iteration, it has not. Push all state to disk and point the prompt at it.

If the prompt does not aggressively enforce one task per iteration, a capable agent will try to be helpful and knock out three tasks in one pass. That produces a fat commit that mixes concerns, fails verification in a way that is hard to localize, and cannot be reverted without losing good work. The fix is to repeat the one-task rule at the top and in the rules section, and to require the DONE promise plus a hard stop after the commit. When batching does slip through, it is usually one of the documented Ralph loop failure modes, and the fix is a structural change to the prompt, not more iterations.
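Batching can also be detected after the fact by comparing tasks.json before and after a commit. The sketch below assumes a hypothetical schema in which each task has an `id` and a boolean `passes` field; the post does not show the real schema.

```python
def newly_passed(before: list[dict], after: list[dict]) -> list[str]:
    """Return ids of tasks whose `passes` flag went from false to true."""
    was_passing = {t["id"]: bool(t.get("passes")) for t in before}
    return [
        t["id"]
        for t in after
        if t.get("passes") and not was_passing.get(t["id"], False)
    ]


def is_single_task_commit(before: list[dict], after: list[dict]) -> bool:
    """True when exactly one task was completed between the two snapshots."""
    return len(newly_passed(before, after)) == 1
```

The two snapshots could come from `git show HEAD~1:.agent/tasks.json` and `git show HEAD:.agent/tasks.json`, parsed with `json.loads`. A commit that flips two or more flags is a sign the one-task rule needs to be stated more forcefully.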

Skipping the verification gate or the completion signal

A prompt with no test gate degrades the loop toward one-shot prompting, because there is nothing to stop a bad commit from landing. A prompt with no completion signal means the loop runs to its iteration cap even when the work is done, burning tokens for nothing. Always specify the verification stack and always specify the promise tags. Those two pieces are what let you start a run and walk away.

A short checklist before you commit your PROMPT.md

Run through this before you point ralph.sh at a real project:

  • Does the prompt name every state file the agent needs (SUMMARY.md, tasks.json, the task spec, LOG.md, STEERING.md)?
  • Is the one-task rule stated at the top and enforced in the rules?
  • Does it list the exact verification commands and require them before the commit?
  • Does it define the promise tags and require a stop after the per-task DONE?
  • Does it read cleanly to an agent that has never seen the project before?

If all five are yes, a fresh-context agent can reorient and make progress on every pass. That is the entire job of the file.
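Parts of this checklist can be automated with plain string checks. The sketch below is a heuristic with hypothetical names, not a tool that ships with the loop. It covers the state files, the promise tags, and the repeated one-task rule. Whether the verification commands are right and whether the prompt reads cleanly to a newcomer still need a human eye.

```python
REQUIRED_MENTIONS = {
    "state: summary": "SUMMARY.md",
    "state: task table": "tasks.json",
    "state: task spec": ".agent/tasks/",
    "state: log": "LOG.md",
    "state: steering": "STEERING.md",
    "signal: per-task DONE": ":DONE</promise>",
    "signal: COMPLETE": "<promise>COMPLETE</promise>",
}


def lint_prompt(prompt_text: str) -> list[str]:
    """Return the names of checklist items the prompt text does not cover."""
    problems = [
        name
        for name, needle in REQUIRED_MENTIONS.items()
        if needle not in prompt_text
    ]
    # Heuristic: the one-task rule should be stated more than once.
    if prompt_text.lower().count("one task") < 2:
        problems.append("one-task rule stated fewer than twice")
    return problems
```

An empty list means the mechanical items are covered. A vague prompt such as "Build the feature and make it good" fails every check.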

Frequently asked questions

What is PROMPT.md in a Ralph loop?

It is the instruction the loop sends to the agent on every iteration. The script ralph.sh re-sends the same PROMPT.md each pass, and because each iteration starts with a fresh context window, the prompt has to make a stateless agent reorient from disk and act. It points the agent at its state files, enforces one task per iteration, defines verification, and specifies the promise tags that signal completion.

Where do I put the PROMPT.md file?

It lives at .agent/PROMPT.md in your project, alongside the other state files the loop reads: tasks.json, the per-task specs in tasks/, prd/SUMMARY.md, logs/LOG.md, and STEERING.md. Running npx @pageai/ralph-loop scaffolds the .agent directory with a default implementation-mode prompt you can edit.

How do I change what the loop does without changing the script?

Edit PROMPT.md. The loop is mode-agnostic, so the agent does whatever the prompt tells it. Swap the task flow to refactor, review, or backfill tests while keeping the one-task rule, the state references, and the promise tags the same. The ralph.sh machinery does not change, only the instruction does.

Why does PROMPT.md have to repeat the one-task rule?

Because a capable agent will try to be helpful and batch several tasks into one pass, which produces a fat commit that is hard to verify and hard to revert. Stating the rule at the top and in the rules section, then requiring a DONE promise and a hard stop after the commit, keeps each iteration to a single verified task.

What happens if my PROMPT.md has no completion signal?

The loop runs until it hits the iteration cap and exits with code 1 (MAX_ITERATIONS) even when the work is actually done, wasting tokens. Always define the promise tags so the agent can emit COMPLETE when all tasks pass, which exits the loop with code 0, and BLOCKED or DECIDE when it needs a human.