Reference

Docker Sandboxes

Docker Sandboxes runs coding agents in isolated microVMs. Ralph uses the standalone sbx CLI and starts agents with a deterministic --name so the sandbox can be reused and stopped cleanly.

Use ./ralph.sh --login to authenticate inside the correct named sandbox. Use ./ralph.sh --print-name if you only need the deterministic sandbox name for debugging.

Useful Docker references:

Docker AI overview explains how Docker Sandboxes fits with Gordon, Model Runner, MCP Toolkit, and Docker Agent.
Docker Sandboxes covers installation, usage, security, credentials, and troubleshooting.
Supported Docker Sandboxes agents lists the agent CLIs Docker supports.

Playwright configuration

If you are using Playwright, here is a recommended configuration:

import { defineConfig, devices } from '@playwright/test';

/**
 * See https://playwright.dev/docs/test-configuration.
 */
export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  globalTimeout: 30 * 60 * 1000,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 1,
  workers: process.env.CI ? 3 : 6,
  reporter: 'html',
  use: {
    baseURL: 'http://localhost:3000',
    trace: 'on-first-retry',
  },

  // NB: only Chromium will run in Docker (arm64).
  projects: [
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
    }
  ],
});

Vitest configuration

If you are using Vitest, here is a recommended configuration:

import { defineConfig } from "vitest/config";
import react from "@vitejs/plugin-react";
import path from "path";

export default defineConfig({
  plugins: [react()],
  test: {
    environment: "node", // switch to "jsdom" for tests that render React components
    globals: true,
    include: ["lib/**/*.test.ts", "lib/**/*.test.tsx"],
    // setupFiles: ['./vitest.setup.ts'], // Include this if using Next.js
  },
  resolve: {
    alias: {
      "@": path.resolve(__dirname),
    },
  },
});

If you are using Next.js, uncomment setupFiles in the Vitest configuration above and add a vitest.setup.ts file that mocks the next/image and next/link components:

import '@testing-library/jest-dom/vitest'
import { vi } from 'vitest'
import React from 'react'

// If using Next.js, mock next/image
vi.mock('next/image', () => ({
  default: ({ src, alt, ...props }: { src: string; alt: string }) => {
    return React.createElement('img', { src, alt, ...props })
  },
}))

// If using Next.js, mock next/link
vi.mock('next/link', () => ({
  default: ({
    children,
    href,
    ...props
  }: {
    children: React.ReactNode
    href: string
  }) => {
    return React.createElement('a', { href, ...props }, children)
  },
}))

Running with a different agentic CLI

Ralph can run several Docker Sandboxes agents without editing the script:

./ralph.sh --agent claude    # default
./ralph.sh --agent codex
./ralph.sh --agent copilot
./ralph.sh --agent cursor
./ralph.sh --agent gemini
./ralph.sh --agent opencode

You can also pass agent-specific options after Ralph’s -- separator:

./ralph.sh --agent codex -- --model gpt-5.3-codex
./ralph.sh --agent gemini -- --model pro
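
The -- separator follows the common CLI convention: everything before it belongs to the wrapper, everything after it is handed to the wrapped command unchanged. The sketch below illustrates that convention only. It is not Ralph's implementation (Ralph is a shell script), and splitAtSeparator is a hypothetical helper name.

```typescript
// Hypothetical helper, for illustration only: split an argument list at the first "--".
// Arguments before the separator are the wrapper's own options; arguments after it
// are passed through to the agent CLI untouched.
function splitAtSeparator(argv: string[]): { own: string[]; passthrough: string[] } {
  const index = argv.indexOf("--");
  if (index === -1) {
    // No separator: every argument belongs to the wrapper.
    return { own: [...argv], passthrough: [] };
  }
  return { own: argv.slice(0, index), passthrough: argv.slice(index + 1) };
}

const { own, passthrough } = splitAtSeparator([
  "--agent", "codex", "--", "--model", "gpt-5.3-codex",
]);
console.log(own);         // [ '--agent', 'codex' ]
console.log(passthrough); // [ '--model', 'gpt-5.3-codex' ]
```

Only the first -- is treated as the separator in this sketch, so a later -- would reach the agent CLI as an ordinary argument.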

Ralph currently supports claude, codex, copilot, cursor, gemini, and opencode. Docker’s docs list every agent CLI that Docker Sandboxes supports. Because the agent name is part of Ralph’s sandbox name, each agent gets its own sandbox per project: switching agents creates that agent’s sandbox, or reuses it if it already exists.
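
To make the idea of a deterministic, per-agent sandbox name concrete, here is a minimal sketch of one way such a name could be derived. The naming scheme (a ralph- prefix, the agent, the project folder, and a short hash of the absolute path) and the sandboxName function are assumptions made for illustration, not Ralph's actual format. Run ./ralph.sh --print-name to see the real name.

```typescript
import { createHash } from "node:crypto";
import { basename, resolve } from "node:path";

// Hypothetical naming scheme, NOT Ralph's actual one:
// "ralph-<agent>-<project folder>-<short hash of the absolute project path>".
function sandboxName(agent: string, projectDir: string): string {
  const absolute = resolve(projectDir);
  // Reduce the folder name to lowercase letters, digits, and dashes.
  const slug =
    basename(absolute)
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-+|-+$/g, "") || "project";
  // The hash keeps two projects with the same folder name apart.
  const hash = createHash("sha256").update(absolute).digest("hex").slice(0, 8);
  return `ralph-${agent}-${slug}-${hash}`;
}

// Same agent and project always give the same name, so the sandbox can be reused.
console.log(sandboxName("claude", "/work/my-app") === sandboxName("claude", "/work/my-app")); // true
// A different agent gives a different name, so it gets its own sandbox.
console.log(sandboxName("claude", "/work/my-app") === sandboxName("codex", "/work/my-app")); // false
```

Because the name depends only on its inputs, it can be recomputed later to find, reuse, or stop the same sandbox without storing any state.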

Starting from scratch

For the loop to work, the agent needs a way to verify its own implementation.

At a minimum, that means an end-to-end test framework and a unit test framework.

For example, the following commands install Playwright and Vitest, along with jsdom, TypeScript, ESLint, and Prettier:

npm i @playwright/test vitest jsdom typescript eslint prettier -D

# If using React, we also recommend installing:
npm i @vitejs/plugin-react @testing-library/dom @testing-library/jest-dom @testing-library/react @testing-library/user-event -D

It is recommended that you add skills for your specific language and framework. See skills.sh to discover existing skills.