How to Inspect and Debug Inside an AI Agent Sandbox
A sandbox is not a black box. Here is how to find the running microVM, shell into it with sbx exec, reproduce the failure by hand, and read the network log.
A sandboxed agent should start with no outbound network and earn each domain it reaches. Here is how to allowlist exactly what a task needs with sbx policy, so package installs work and exfiltration does not.
Bypass-permissions mode hands an agent full shell access with no prompts. That is reckless on your host and a non-event inside a Docker Sandbox. Here is how to run YOLO mode when the blast radius is contained.
Vibe coding is fast for throwaways and dangerous for production. Here is an honest comparison with spec-driven development, why autonomous agents need a spec, and how to pick per task.
A plain Docker container is a start, not a boundary. Here is how Docker Sandboxes microVMs compare to a hand-rolled container for isolating an autonomous coding agent.
A flat task list stops working past a few dozen entries. A lookup table that indexes per-task spec files lets an autonomous agent grind through hundreds of tasks without losing track.
An autonomous coding agent runs with your full user permissions, which means it can read your SSH keys and push to your remotes. A sandbox is the only blast radius cheap enough to lose.
The most reliable rule for autonomous coding is one task per invocation, commit, then stop. Here is why batching tasks wrecks an agent loop and how Ralph enforces the rule.
A PRD is too big for an agent to build in one shot. Here is how to decompose it into atomic, independently verifiable task packets the loop finishes one at a time.
You do not have to kill a Ralph loop to redirect it. Edit .agent/STEERING.md mid-run and the agent reads it at the top of the next iteration, handles the critical work first, then resumes the task list.