
Running AI Agents in Devcontainers

How to use devcontainers to run AI agents like Claude Code and Codex in isolated environments you control, so you can skip the permission prompts.

The problem with running an AI agent on your machine is that it has your privileges. It can read your files, run commands, install packages, hit the network. Every rm -rf is one bad prompt away. Every curl | bash is a gamble. You’ve seen the horror stories. We’ve all seen the horror stories.

Docker Sandbox tackles this: a way to run AI agents in isolated containers with their own filesystem, network, and process space. It supports Claude Code, Codex, Copilot, Gemini, Kiro, OpenCode, and others (all currently experimental). It works well. For a lot of people, it’s the right answer.

But Docker Sandbox gives you Docker’s container, configured Docker’s way. You don’t control the base image. You don’t pick the toolchain. You don’t wire in your own skills, prompts, or agent configurations beyond what the sandbox supports out of the box. If your workflow depends on a specific environment (a particular CUDA version, a custom build tool, a finely tuned set of agent skills) you’re working around the sandbox instead of with it.

I ran into this pretty quickly. I’ve got a set of Claude Code skills I’ve built up over time, and I wanted my agent running in an environment that matched my actual project setup. Docker Sandbox didn’t give me that level of control. So I tried something different.

The Devcontainer Approach

Instead of relying on Docker’s sandboxing, you can use Devcontainers (the .devcontainer configurations that VS Code and GitHub Codespaces use) to create fully isolated environments for your AI agents.

Here’s what you get:

  • Full flexibility. Configure the container however you want. Need Python 3.11 with PyTorch? Go 1.26 with Just? Node 20? It’s all Dockerfile commands.
  • Mount your agents. Bind-mount your existing Claude Code skills, Codex configurations, or custom agents into the container. You control exactly what goes in.
  • Persistent state. Your agents remember things between runs. No starting from scratch every time.
  • Host protection. The agent runs in the container. It can’t touch your home directory, your SSH keys, or that one file you really shouldn’t have left in /tmp.

How It Works

You add a .devcontainer/ directory to your project with a Dockerfile and a devcontainer.json. The Dockerfile is your environment, same as any other Docker setup:

# .devcontainer/Dockerfile
FROM mcr.microsoft.com/devcontainers/base:ubuntu

RUN apt-get update && apt-get install -y \
    curl \
    git \
    build-essential

# Install Go
RUN curl -fsSL https://go.dev/dl/go1.26.0.linux-amd64.tar.gz | tar -C /usr/local -xz
ENV PATH="/usr/local/go/bin:${PATH}"

# Install Node
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - && \
    apt-get install -y nodejs

# Install Claude Code so the `claude` CLI invoked by postCreateCommand exists
RUN npm install -g @anthropic-ai/claude-code

Then the devcontainer.json pulls it together and mounts your agent config:

// .devcontainer/devcontainer.json
{
  "name": "dev-claude",
  "build": {
    "dockerfile": "Dockerfile"
  },
  "mounts": [
    "source=${localEnv:HOME}/.claude,target=/home/vscode/.claude,type=bind,consistency=cached"
  ],
  "postCreateCommand": "claude --version"
}

The mounts array is the key part. You’re bind-mounting your host’s ~/.claude directory into the container so the agent has access to your skills and configuration, but everything else (your home directory, your keys, your files) stays on the host where it belongs. If you’re using Codex instead, mount ~/.codex the same way.

When you open the project in VS Code with the Dev Containers extension, or when GitHub Codespaces spins up, you get a fresh container with your AI agent already wired in. The agent operates inside. You’re protected outside.
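If you’d rather drive this from a terminal than from VS Code, the devcontainer CLI (the `@devcontainers/cli` npm package) can build and start the same container. A sketch, assuming Node and Docker are already on the host:

```shell
# Install the devcontainer CLI on the host
npm install -g @devcontainers/cli

# Build the image and start the container defined in .devcontainer/
devcontainer up --workspace-folder .

# Run a command inside the running container
devcontainer exec --workspace-folder . claude --version
```

Same container, same mounts, no editor required, which also makes it easy to script.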

Why Not Docker Sandbox?

Docker Sandbox is a great option, especially if the defaults work for your workflow. But devcontainers give you more control in a few key areas:

Customization. Docker Sandbox gives you a pre-configured container. Devcontainers give you whatever you need. A specific CUDA version, a custom toolchain, something compiled from source. You write the Dockerfile.

Agent integration. Your Claude Code skills took time to build. Your Codex prompts are finely tuned. With devcontainers, you bind-mount your ~/.claude or ~/.codex directory and your agent has everything it needs.

Persistence. Devcontainers persist between sessions. Everything carries over without rebuilding. With Docker Sandbox, each run starts fresh. If your agent generated useful context or built up a .claude/memory over the course of a session, that’s gone next time.
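If you want that state to survive container rebuilds too, you can put it on a named Docker volume instead of relying on the container’s writable layer. A sketch of the extra mount entry (the `claude-memory` volume name and the target path are illustrative assumptions, not a convention):

```json
{
  "mounts": [
    "source=${localEnv:HOME}/.claude,target=/home/vscode/.claude,type=bind,consistency=cached",
    "source=claude-memory,target=/workspaces/my-project/.claude,type=volume"
  ]
}
```

Docker manages the volume’s lifecycle independently of the container, so a rebuild of the image doesn’t wipe the agent’s accumulated context.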

The Tradeoffs

It’s not all perfect. Devcontainers need more setup than docker run. You’ve got Dockerfiles to manage, volumes to mount, configurations to sync. There’s maintenance. And if you change your base image, you’ll need to rebuild, which depending on your Dockerfile could mean a coffee break or a lunch break.

You also need to be thoughtful about what you mount. Bind-mounting your entire home directory defeats the purpose. I made this mistake early on, mounting $HOME because it was easy. The agent had access to my SSH keys, my AWS credentials, my shell history. Not great. Mount only what the agent needs: its config, maybe a credentials file for a specific service, and the project itself. Everything else stays outside.
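In devcontainer.json terms, that means enumerating individual mounts rather than reaching for `$HOME`. A sketch; the read-only token file is a hypothetical example of scoping a single credential to the container:

```json
{
  "mounts": [
    "source=${localEnv:HOME}/.claude,target=/home/vscode/.claude,type=bind",
    "source=${localEnv:HOME}/.config/some-service/token,target=/home/vscode/.config/some-service/token,type=bind,readonly"
  ]
}
```

The mount strings use Docker’s `--mount` syntax, so `readonly` works for anything the agent should read but never write.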

But when you’re running autonomous agents that could accidentally delete your entire $HOME? The tradeoffs start to make sense.

Let It Rip

OK so here’s the real payoff. Both Claude Code and Codex have flags that let them run without constantly asking for permission:

claude --dangerously-skip-permissions
codex --yolo
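If you’re using the devcontainer CLI, the full-auto workflow is two commands from the host; a sketch, assuming `@devcontainers/cli` is installed and the project has a .devcontainer/ setup like the one above:

```shell
# Build/start the container, then run Claude Code inside it with
# permission prompts disabled -- the blast radius is the container,
# not the host.
devcontainer up --workspace-folder .
devcontainer exec --workspace-folder . claude --dangerously-skip-permissions
```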

On your host machine, these flags are genuinely scary. The agent can run any command, modify any file, install anything, all without asking. One bad tool call and you’re spending your evening restoring from backups.

Inside a devcontainer? The blast radius is the container. The agent can rm -rf / and the worst thing that happens is you rebuild the container. Your host is untouched. Your SSH keys are safe. Your credentials are where you left them.

This is, honestly, the whole point. The permission prompts in these tools exist because the agent is running on your machine with your privileges. Remove that risk by isolating the environment and the prompts become unnecessary overhead. You get a fully autonomous agent that can move fast without the constant “are you sure?” interruptions. And you sleep fine because it can’t touch anything that matters.

The Bottom Line

If the defaults fit your workflow, Docker Sandbox is probably all you need. But if you want full control over the environment your agents run in, devcontainers give you exactly that.

Configure your container. Mount your agents. Let the AI do its work in a box that can’t hurt the host.

If you’re doing something similar, or you’ve found a better approach, I’d love to hear about it. Hit me up on Twitter or Bluesky.