Breaking a PRD Into Atomic Agent Tasks

A PRD decomposed into atomic task packets that feed a tasks.json lookup table an AI agent works one at a time.

Apr 17, 2026 - 15 min read - 3200 words

Creator of RalphLoop.sh, founder of PageAI

To break a PRD into tasks an agent can build, you decompose it into atomic task packets: small, self-contained units of work, each with one objective, the files to inspect, ordered steps, and acceptance criteria a machine can check. The agent picks one packet, builds it, verifies it, commits, and stops. The PRD is the contract for the whole project. A task packet is the contract for one iteration.

This post is the decomposition itself: what goes inside a single packet, how Ralph stores packets in .agent/tasks.json and .agent/tasks/TASK-{ID}.json, how to size each one so a single iteration can finish and commit, and how dependencies and priority decide the order the loop works them. It assumes you already have a buildable PRD. If you do not, start with how to write a PRD an AI agent can actually build from, then come back here to chop it up.

Why a PRD is the wrong unit of work for an agent

A PRD describes a finished product. An agent does not build finished products. It edits files in a single context window, and that window is finite. Hand it the whole PRD and tell it to build, and one of two things happens. It tries to do everything at once and produces a sprawling diff you cannot review, or it loses the plot halfway through and starts contradicting decisions it made an hour earlier. That second failure is context rot, and it gets worse the longer a single session runs.

The fix is to never ask the agent to hold the whole project in its head. You decompose the PRD into the smallest pieces that still make sense on their own, and you feed the agent exactly one piece per iteration with a fresh context window each time. The Ralph technique is built around this: each loop starts the agent clean, it reads the task list, it picks one packet, and the filesystem carries the memory between passes instead of a long chat history.

So the real question is not “how do I get the agent to build my PRD.” It is “what is the smallest packet of work I can define that the agent can finish and verify in one sitting.” Get that unit right and the loop becomes reliable. Get it wrong and no amount of prompting saves you.

What goes inside an atomic task packet

A task packet is atomic when it has a single objective and everything the agent needs to hit it. Four parts make a packet complete.

The objective. One sentence describing the one thing this packet delivers. Not “build authentication.” That is a feature, not a task. “POST /api/auth/register creates a new user account” is a task. If you cannot state the objective in a single sentence without the word “and,” the packet is too big and you split it.

The files to inspect. Name the files the agent should read before it touches anything, and the files it is expected to change. This is the cheapest way to stop an agent from inventing a parallel module next to the one you already have. In Ralph these live in the step details and in technicalNotes: “extend the existing auth module in src/lib/auth.ts, do not add a second one.” A packet that points the agent at the right files reuses code instead of duplicating it.

The ordered steps. The packet breaks the objective into a short sequence the agent works in order. Each step is concrete enough to act on, and the last step is almost always “write the tests that prove the acceptance criteria.” Tests are steps inside the packet, not a separate task scheduled for later.

Testable acceptance criteria. Each criterion is a condition the agent can confirm by running a command and reading the output. “Login works” is a vibe. “POST with a wrong password returns 401 and the body { error: 'Invalid credentials' }” is a criterion. The test is simple: can a machine return yes or no without your opinion? If not, rewrite it. The discipline of writing verifiable criteria is the same one that makes the PRD buildable, covered in how to write a PRD for an AI agent.

The packet is the unit of verification, not just the unit of work. When the agent finishes, it does not ask you whether it is done. It checks the criteria, and the criteria answer for it.
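
To make the yes-or-no test concrete, here is a minimal sketch of the wrong-password criterion written as a predicate. The HttpResult shape and the function name are illustrative assumptions, not part of Ralph. In a real project this check would live inside a test runner as an assertion.

```typescript
// Hypothetical shape for a captured HTTP response; not a Ralph type.
interface HttpResult {
  status: number;
  body: unknown;
}

// Criterion: "POST with a wrong password returns 401 and the body
// { error: 'Invalid credentials' }". The answer is a plain yes or no.
function meetsWrongPasswordCriterion(res: HttpResult): boolean {
  if (res.status !== 401) return false;
  if (typeof res.body !== "object" || res.body === null) return false;
  const error = (res.body as Record<string, unknown>).error;
  return error === "Invalid credentials";
}

// Usage with hand-written sample responses.
const rejectedLogin: HttpResult = { status: 401, body: { error: "Invalid credentials" } };
const leakedLogin: HttpResult = { status: 200, body: { token: "abc" } };
console.log(meetsWrongPasswordCriterion(rejectedLogin)); // true
console.log(meetsWrongPasswordCriterion(leakedLogin)); // false
```

"Login works" cannot be written this way, which is the signal to rewrite it.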

How Ralph stores this: a lookup table plus per-task specs

Ralph splits the task list into two layers, and the split is what lets a run scale to hundreds of packets without drowning the agent in detail.

The root .agent/tasks.json is a lookup table, not the detail. It is a flat list where each entry is a stub that points at a spec file. The agent scans this list every iteration to find the next thing to do, so it stays lean on purpose.

[
  {
    "id": "TASK-1",
    "title": "Verify project prerequisites and access",
    "category": "setup",
    "specFilePath": ".agent/tasks/TASK-1.json",
    "passes": false
  },
  {
    "id": "TASK-2",
    "title": "Add avatars storage bucket and config",
    "category": "infrastructure",
    "specFilePath": ".agent/tasks/TASK-2.json",
    "passes": false
  },
  {
    "id": "TASK-3",
    "title": "POST /api/avatar uploads and stores a user avatar",
    "category": "api-endpoint",
    "specFilePath": ".agent/tasks/TASK-3.json",
    "passes": false
  }
]

Each stub carries a passes flag. It starts false and only flips to true after the agent verifies the work for that packet. The loop reads this flag to decide what is left. Keeping the table this thin matters: a run with two hundred packets still scans fast, because the table holds titles and pointers, not the full contracts. That scaling pattern is its own topic in task lookup tables for agents.
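
As a rough illustration of why the thin table is cheap to scan, the sketch below filters stubs by the passes flag. The field names mirror the tasks.json example above. The remainingStubs helper is hypothetical and is not Ralph's own code.

```typescript
// Stub fields as shown in the tasks.json example.
interface TaskStub {
  id: string;
  title: string;
  category: string;
  specFilePath: string;
  passes: boolean;
}

// Hypothetical helper: the packets the loop still has to work.
function remainingStubs(stubs: TaskStub[]): TaskStub[] {
  return stubs.filter((stub) => !stub.passes);
}

// Sample table where only the prerequisite gate has been verified.
const sampleTable: TaskStub[] = [
  { id: "TASK-1", title: "Verify project prerequisites and access", category: "setup", specFilePath: ".agent/tasks/TASK-1.json", passes: true },
  { id: "TASK-2", title: "Add avatars storage bucket and config", category: "infrastructure", specFilePath: ".agent/tasks/TASK-2.json", passes: false },
  { id: "TASK-3", title: "POST /api/avatar uploads and stores a user avatar", category: "api-endpoint", specFilePath: ".agent/tasks/TASK-3.json", passes: false },
];

console.log(remainingStubs(sampleTable).map((stub) => stub.id).join(", ")); // TASK-2, TASK-3
```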

The detail lives in the per-task spec at .agent/tasks/TASK-{ID}.json. This is the full packet: the objective as a description, the files to inspect inside the steps, the ordered steps, the testable acceptance criteria, the dependencies, an estimated complexity, and any technical notes.

{
  "id": "TASK-3",
  "title": "POST /api/avatar uploads and stores a user avatar",
  "category": "api-endpoint",
  "description": "Accept an image upload from an authenticated user, validate it, store it in the avatars bucket, and save the URL on the user row.",
  "acceptanceCriteria": [
    "POST with a valid PNG or JPEG under 2 MB returns 200 with the stored avatar URL",
    "A file over 2 MB returns 413 with the error text File too large",
    "A non-image content type returns 415",
    "An unauthenticated request returns 401",
    "The avatar URL is persisted on the user row and survives a re-fetch"
  ],
  "steps": [
    {
      "step": 1,
      "description": "Add the upload route handler",
      "details": "Read src/lib/auth.ts for the session helper and src/lib/storage.ts for the bucket client. Add POST /api/avatar, validate size and content type, upload, then update users.avatarUrl.",
      "pass": false
    },
    {
      "step": 2,
      "description": "Write tests for every acceptance criterion",
      "details": "Add Vitest cases for valid upload, oversized file, wrong content type, unauthenticated request, and persistence after re-fetch.",
      "pass": false
    }
  ],
  "dependencies": ["TASK-1", "TASK-2"],
  "estimatedComplexity": "medium",
  "technicalNotes": [
    "Reuse the storage client in src/lib/storage.ts, do not add a second SDK",
    "Strip EXIF data before storing the image"
  ]
}

Notice the four parts of a packet map onto the JSON. description is the objective. The step details name the files to inspect and change. steps is the ordered sequence. acceptanceCriteria is the checkable definition of done. TASK-1 is always reserved for prerequisite verification, so the agent confirms credentials, tools, and access exist before any feature work starts.
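
If you edit specs by hand, a small completeness check can catch a packet that is missing one of the four parts. The interfaces below are inferred from the example JSON above and may not match Ralph's full schema. The missingParts function is a hypothetical helper.

```typescript
// Shapes inferred from the TASK-3 example; treat them as assumptions.
interface TaskStep {
  step: number;
  description: string;
  details: string;
  pass: boolean;
}

interface TaskSpec {
  id: string;
  title: string;
  category: string;
  description: string;
  acceptanceCriteria: string[];
  steps: TaskStep[];
  dependencies: string[];
  estimatedComplexity: string;
  technicalNotes: string[];
}

// Returns the names of the packet parts that are missing.
// An empty steps array is reported as missing both steps and files,
// because the files to inspect live in the step details.
function missingParts(spec: TaskSpec): string[] {
  const missing: string[] = [];
  if (spec.description.trim() === "") missing.push("objective");
  if (spec.steps.length === 0) missing.push("steps");
  if (spec.steps.every((s) => s.details.trim() === "")) missing.push("files to inspect");
  if (spec.acceptanceCriteria.length === 0) missing.push("acceptance criteria");
  return missing;
}

// A draft packet that has no acceptance criteria yet.
const draftSpec: TaskSpec = {
  id: "TASK-3",
  title: "POST /api/avatar uploads and stores a user avatar",
  category: "api-endpoint",
  description: "Accept an image upload from an authenticated user.",
  acceptanceCriteria: [],
  steps: [{ step: 1, description: "Add the upload route handler", details: "Read src/lib/auth.ts and src/lib/storage.ts.", pass: false }],
  dependencies: ["TASK-1", "TASK-2"],
  estimatedComplexity: "medium",
  technicalNotes: [],
};

console.log(missingParts(draftSpec).join(", ")); // acceptance criteria
```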

Here is how a PRD turns into packets and lands in the lookup table.

flowchart TD
  PRD["prd/PRD.md: goals, constraints, criteria"] --> Decompose["Decompose into atomic packets"]
  Decompose --> P1["tasks/TASK-1.json: prerequisite gate"]
  Decompose --> P2["tasks/TASK-2.json: storage config"]
  Decompose --> P3["tasks/TASK-3.json: upload endpoint"]
  Decompose --> Pn["tasks/TASK-N.json: ..."]
  P1 --> Index["tasks.json lookup table (stubs + passes)"]
  P2 --> Index
  P3 --> Index
  Pn --> Index
  Index --> Loop["ralph.sh: pick one packet per iteration"]
  Loop --> Verify["Build, test, screenshot, commit, set passes true"]

The PRD is the source. Decomposition produces one spec file per packet. The lookup table indexes them. The loop reads the table, opens one spec, and works it. You do not write all of this by hand: the prd-creator skill generates the table and the spec files from your PRD, then you edit the packets that need sharpening.
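
The relationship between spec files and the lookup table is mechanical enough to sketch. The function below derives one stub per spec, using the .agent/tasks/TASK-{ID}.json path pattern from this post. It is an illustration of the mapping only. It is not how the prd-creator skill is implemented, and buildLookupTable is a hypothetical name.

```typescript
// The subset of a spec that the lookup table repeats.
interface SpecSummary {
  id: string;
  title: string;
  category: string;
}

interface LookupStub extends SpecSummary {
  specFilePath: string;
  passes: boolean;
}

// Derive one stub per spec. Every packet starts unverified.
function buildLookupTable(specs: SpecSummary[]): LookupStub[] {
  return specs.map((spec) => ({
    id: spec.id,
    title: spec.title,
    category: spec.category,
    specFilePath: `.agent/tasks/${spec.id}.json`,
    passes: false,
  }));
}

const derivedTable = buildLookupTable([
  { id: "TASK-1", title: "Verify project prerequisites and access", category: "setup" },
  { id: "TASK-2", title: "Add avatars storage bucket and config", category: "infrastructure" },
]);

console.log(derivedTable[1].specFilePath); // .agent/tasks/TASK-2.json
```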

Sizing a packet so one iteration can finish and commit

The hardest part of decomposition is sizing. Too big and the agent runs out of context before it finishes, leaves the packet half built, and the next iteration starts cold on a mess. Too small and you spend more iterations on overhead than on work. The target is the largest packet that still finishes, verifies, and commits inside a single iteration.

A few concrete rules keep packets in the right range.

One objective, no “and.” If the title needs the word “and” to be honest, it is two packets. “Add the upload endpoint and the avatar UI” is two: the endpoint is an api-endpoint packet, the UI is a ui-ux packet that depends on it. Split on the “and.”

One area of the codebase. A packet that touches the data model, the API, and the frontend in one pass is too wide. Each layer is its own packet. The migration is one. The endpoint that reads the new column is another. The component that renders it is a third. Narrow packets keep the diff readable and the failure isolated.

Verifiable in the steps it contains. If proving the criteria would take more steps than building the feature, the packet is doing too much. A well-sized packet has a handful of steps, and the test step covers the criteria without ballooning into a second project.

A clean commit at the end. The end state of every packet is a commit that builds, passes its tests, and changes nothing it did not need to. If you cannot imagine that commit as a single coherent change, the packet is wrong.
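
Some of these rules can be checked mechanically before a run. The sketch below is a hypothetical lint, not a Ralph feature. The MAX_STEPS threshold is an assumed stand-in for "a handful of steps," and the keyword checks are crude heuristics that will produce false positives.

```typescript
interface SizedStep {
  description: string;
}

interface SizedPacket {
  title: string;
  steps: SizedStep[];
}

// Assumed threshold for "a handful of steps"; tune it for your project.
const MAX_STEPS = 5;

// Returns human-readable warnings; an empty array means no rule tripped.
function sizeWarnings(packet: SizedPacket): string[] {
  const warnings: string[] = [];
  if (/\band\b/i.test(packet.title)) {
    warnings.push('title contains "and", consider splitting');
  }
  if (packet.steps.length > MAX_STEPS) {
    warnings.push(`more than ${MAX_STEPS} steps`);
  }
  const last: SizedStep | undefined = packet.steps[packet.steps.length - 1];
  if (last === undefined || !/\btests?\b/i.test(last.description)) {
    warnings.push("last step does not write tests");
  }
  return warnings;
}

const tooWide: SizedPacket = {
  title: "Add the upload endpoint and the avatar UI",
  steps: [{ description: "Add the route" }, { description: "Add the component" }],
};
const wellSized: SizedPacket = {
  title: "POST /api/auth/register creates a new user account",
  steps: [
    { description: "Add the register route handler" },
    { description: "Write tests for every acceptance criterion" },
  ],
};

console.log(sizeWarnings(tooWide).length); // 2
console.log(sizeWarnings(wellSized).length); // 0
```

A lint like this only flags candidates. Whether the final commit reads as one coherent change is still a judgment you make when reviewing the packet.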

The economics back this up. The loop runs one packet per iteration, and you cap iterations with ./ralph.sh -n 50 (the default is 10). A run of small packets makes steady, auditable progress: one commit per packet, one test per criterion, one screenshot per UI change. A run of giant packets thrashes, because the agent keeps almost finishing and never quite committing. When you want to watch a single packet land before turning the loop loose, run exactly one iteration:

npx @pageai/ralph-loop
./ralph.sh --once

The rule that one invocation handles exactly one packet, commits, and stops is the reliability backbone of the whole approach. It has enough nuance to deserve its own treatment in one task per iteration.

Dependencies and priority ordering

Atomic packets are not independent of each other. The upload endpoint cannot exist before the storage bucket. The avatar UI cannot render a URL the endpoint does not yet return. Decomposition has to capture this order, or the agent builds on a foundation that is not there and the iteration fails.

Ralph encodes order in two places. The dependencies array on each spec names the packets that must pass first. In the example above, TASK-3 depends on ["TASK-1", "TASK-2"], so the loop will not start the upload endpoint until the prerequisite gate and the storage config both report passes: true. The agent respects this when it selects the next packet, which means you can write the whole task list up front without worrying that the agent jumps ahead.

Priority decides what to do among packets that are all unblocked. On each iteration, the loop finds the highest-priority incomplete task whose dependencies are satisfied, opens its spec, and works it. The selection logic on each pass is small.

flowchart TD
  Start["Fresh context"] --> Scan["Scan tasks.json for incomplete packets"]
  Scan --> Ready{"Dependencies all passes true?"}
  Ready -->|"no"| Skip["Skip, try next packet"]
  Ready -->|"yes"| Pick["Pick highest priority among ready packets"]
  Skip --> Scan
  Pick --> Work["Open TASK-{ID}.json, work the steps"]
  Work --> Gate["Tests, lint, types, screenshot"]
  Gate -->|"fail"| Work
  Gate -->|"pass"| Commit["Set passes true, commit, stop"]
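
The selection rule in the flowchart can be sketched in a few lines. Two assumptions are baked in. First, priority is taken to be the order of entries in the table, since this post does not show a priority field. Second, dependencies are merged into each entry for brevity, although in Ralph they live in the per-task spec. The pickNext function is hypothetical and only illustrates the rule the agent follows.

```typescript
// A table entry merged with its dependencies, for illustration only.
interface Candidate {
  id: string;
  passes: boolean;
  dependencies: string[];
}

// Pick the first incomplete packet whose dependencies have all passed.
// Assumption: earlier entries in the table have higher priority.
function pickNext(entries: Candidate[]): Candidate | undefined {
  const passed = new Set(entries.filter((e) => e.passes).map((e) => e.id));
  return entries.find(
    (e) => !e.passes && e.dependencies.every((dep) => passed.has(dep)),
  );
}

const midRun: Candidate[] = [
  { id: "TASK-1", passes: true, dependencies: [] },
  { id: "TASK-3", passes: false, dependencies: ["TASK-1", "TASK-2"] },
  { id: "TASK-2", passes: false, dependencies: ["TASK-1"] },
];

const nextPacket = pickNext(midRun);
console.log(nextPacket ? nextPacket.id : "nothing ready"); // TASK-2
```

TASK-3 sits earlier in the sample table but is skipped because TASK-2 has not passed yet, which is the "Skip, try next packet" branch in the diagram.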

Two practices keep ordering dependable. First, front-load the foundation. The prerequisite gate is TASK-1 for a reason: nothing should run before the agent confirms it has the access and tools to run at all. Data-model packets come early, because most feature packets depend on them. Second, keep dependency chains shallow. A packet that depends on a packet that depends on a packet is a sign the work is really one larger unit you split too aggressively, or that an intermediate packet is doing too little. Wide and shallow beats narrow and deep, because shallow graphs give the loop more ready packets to choose from at any moment, which keeps it moving even if one branch is blocked.
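
One way to spot deep chains before a run is to measure the longest dependency path behind each packet. The sketch below uses a hypothetical graph with made-up ids, and chainDepth is not a Ralph function. It also throws on a cycle, since a cycle would leave every packet in it permanently blocked.

```typescript
// Maps a packet id to the ids it depends on.
type DependencyMap = Record<string, string[]>;

// Longest dependency chain behind `id`: 0 for a packet with no dependencies.
function chainDepth(id: string, deps: DependencyMap, seen: string[] = []): number {
  if (seen.indexOf(id) !== -1) {
    throw new Error(`Dependency cycle at ${id}`);
  }
  const parents = deps[id] || [];
  let depth = 0;
  for (const parentId of parents) {
    depth = Math.max(depth, 1 + chainDepth(parentId, deps, seen.concat(id)));
  }
  return depth;
}

// Hypothetical graph: GATE is the foundation, DEEP sits on a three-link chain.
const sampleDeps: DependencyMap = {
  GATE: [],
  MODEL: ["GATE"],
  ENDPOINT: ["MODEL"],
  DEEP: ["ENDPOINT"],
  WIDE: ["GATE"],
};

console.log(chainDepth("WIDE", sampleDeps)); // 1
console.log(chainDepth("DEEP", sampleDeps)); // 3
```

A packet with a high depth is not automatically wrong, but it is the first place to look when the loop keeps finding nothing ready to work.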

When you need to reorder mid-run, you do not kill the loop. You edit .agent/STEERING.md, and the agent handles that critical work on its next iteration before returning to the task list. That is how you inject “fix the failing migration before anything else” without losing momentum.

A short worked example

Take “let users upload a profile avatar” from a PRD and watch it become packets.

That sentence is a feature, not a task. Decomposition asks: what are the smallest verifiable units, and in what order? TASK-1 is the prerequisite gate, always. TASK-2 configures the storage bucket and is a dependency for everything that stores a file. TASK-3 is the upload endpoint, depending on the bucket. TASK-4 is the avatar component in the UI, depending on the endpoint because it renders the URL the endpoint returns. TASK-5 handles deletion, depending on the upload existing.

Each packet gets criteria a machine can check, not opinions. For the endpoint: a valid image under the size limit returns 200 with a URL, an oversized file returns 413, a wrong content type returns 415, an unauthenticated request returns 401. For the UI: the component shows a placeholder when no avatar is set, shows the image after a successful upload, and a Playwright run plus a screenshot confirms it. The feature that read as one line in the PRD is now five committed, tested packets, each small enough to finish in a single iteration, ordered so the loop never builds ahead of its foundation.

The framing of phases that produces these packets (specify, plan, decompose into tasks, implement and verify) comes from GitHub Spec Kit, and the loop that runs the packets autonomously was popularized by Geoffrey Huntley in his original Ralph writeup. Decomposition is the phase that decides whether the loop succeeds, which is why it is worth slowing down for.

Where to go next

If you are building this workflow, read across the spec-driven cluster:

Spec-driven development with AI for the full Specify, Plan, Tasks, Implement workflow these packets sit inside.
How to write a PRD an AI agent can actually build from for the goals, constraints, and acceptance criteria you decompose here.
One task per iteration for the rule that makes one-packet-at-a-time reliable.
Task lookup tables for agents for scaling the table to hundreds of packets.

For the mechanics of the loop that reads these packets on every pass and the fresh-context design behind it, start with what is the Ralph technique.

Frequently asked questions

How do I break a PRD into tasks an AI agent can build?

Decompose the PRD into atomic task packets, where each packet has one objective, the files to inspect, ordered steps, and acceptance criteria a machine can check by running a command and reading the output. Split on any objective that needs the word “and,” keep each packet inside one area of the codebase, and make sure the agent can finish and commit it in a single iteration.

What makes a task packet atomic?

A packet is atomic when it has a single objective and everything the agent needs to hit it without holding the rest of the project in its head. That means one deliverable, the files to read and change, a short ordered sequence of steps with tests as the final step, and verifiable acceptance criteria. If you cannot state the objective in one sentence without the word “and,” the packet is too big.

How does Ralph store a decomposed PRD?

Ralph uses two layers. The root .agent/tasks.json is a lean lookup table of stubs, each with an id, title, category, a pointer to a spec file, and a passes flag. The detail lives in per-task specs at .agent/tasks/TASK-{ID}.json, which hold the description, acceptance criteria, ordered steps, dependencies, estimated complexity, and technical notes. The agent scans the table to pick a packet and opens the spec to work it.

How big should a single agent task be?

The right size is the largest packet that still finishes, verifies, and commits inside one iteration. Too big and the agent runs out of context and leaves the work half done. Too small and overhead dominates. Aim for one objective, one area of the codebase, a handful of steps including the tests, and a single clean commit at the end.

How do dependencies and priority decide task order?

Each spec has a dependencies array naming the packets that must pass first, so the loop will not start a packet until its dependencies report passes: true. Among packets whose dependencies are satisfied, the loop works the highest-priority one. Front-load the prerequisite gate and data-model packets, keep dependency chains shallow, and edit .agent/STEERING.md to reorder mid-run without stopping the loop.

Run your own Ralph loop

Ralph is a hackable script you point at your project. Install it and let an agent work through your task list.

npx @pageai/ralph-loop
