Tuan-Anh Tran
Reflecting on my initial assumptions about AI agent security, filesystem isolation, and why simplicity often beats over-engineering.
January 22, 2026

I was wrong about AI agent sandboxing


When I first started building and deploying AI agents that could actually do things (edit code, run tests, manage infrastructure), my instinct was to lock everything down, perhaps because I’ve been in the container/microVM space for so long. I was convinced we needed complex, multi-layered isolation to prevent an agent from hallucinating its way into an rm -rf / disaster.

Looking back, I realized I was wrong about several key architectural decisions. I was over-engineering for a future that hadn’t arrived yet, and in the process, I missed the simpler, more elegant solutions right in front of me.

I was wrong about filesystem sandboxing

In the early days, I was obsessed with filesystem isolation. I spent a lot of time thinking about how to provide each agent with a clean, disposable environment. I gravitated toward OverlayFS, the same technology that powers Docker layers. The idea was to have a read-only base layer and a writable “upper” layer that would be discarded after the agent finished its task.

But here is the thing: the problem was never scale. OverlayFS is just additional, unnecessary complexity for this use case. Most developers don’t want to add another specialized tool or a complex kernel-level dependency to their stack just to run an agent safely.

Since the vast majority of AI agent use cases today are focused on coding, git worktree is all most people actually need. We want to use the tools we already have and trust.

Git worktrees allow you to have multiple branches checked out in different directories simultaneously, sharing the same underlying .git history. For an AI agent, this is perfect. You give the agent its own worktree; it can make changes, run tests, and even “corrupt” its local environment without affecting the main working directory or other agents. If it succeeds, you commit. If it fails, you just delete the directory. It’s built-in, lightweight, and uses the tools we already have.
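The whole flow fits in a few commands. Here is a sketch of the worktree-per-agent lifecycle; the repo path, branch name, and file are illustrative stand-ins for your real project:

```shell
# Stand-in repo; in practice this is your existing project checkout
set -e
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.email agent@example.com
git -C "$repo" config user.name "Agent"
echo "v1" > "$repo/app.py"
git -C "$repo" add app.py
git -C "$repo" commit -qm "initial commit"

# Give the agent its own checkout on a throwaway branch
wt="$repo-agent"
git -C "$repo" worktree add -q -b agent-task "$wt"

# The agent edits, runs tests, even breaks things -- only inside its worktree
echo "experimental change" >> "$wt/app.py"

# Task failed? Delete the worktree and the branch; the main checkout is untouched
git -C "$repo" worktree remove --force "$wt"
git -C "$repo" branch -q -D agent-task
```

If the task succeeds instead, you merge or cherry-pick from agent-task before cleaning up. Both worktrees share one .git directory, so creating and destroying them is nearly free.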

I was wrong about WebAssembly (WASM)

I really thought WASM would be the silver bullet for agent security. That’s why I wrote hyper-mcp.

The promise of WASM is incredible: a platform-agnostic, capability-based security model that sandboxes network and filesystem access by default. I envisioned a world where every agent tool was a WASM module, strictly limited in what it could do.

The problem? Nobody wants to rewrite everything.

Forcing every tool, script, and CLI utility into a WASM runtime creates massive friction. If an agent needs a specialized Python library or a specific version of a CLI tool, WASM becomes a bottleneck rather than an enabler.

Maybe solutions like on-demand microVMs are necessary for providers who need to run untrusted code at scale. But for the masses, they’re complete overkill.

In practice, a lightweight bubblewrap (bwrap) sandbox is more than enough. It provides Linux namespace isolation without requiring you to compile your entire ecosystem to WASI. Or, in many cases, the combination of tmux and separate git worktree instances provides enough logical isolation to get the job done without the overhead. This is exactly why projects like Gastown and multiclaude have started to surface: they lean into existing, familiar primitives rather than trying to reinvent the sandbox.

I was wrong about MCP

The Model Context Protocol (MCP) is a fantastic effort to standardize how LLMs interact with external tools. I’ve spent a lot of time working with it, but I’ve had to adjust my expectations.

LLMs are trained on the vast sum of existing human knowledge, which includes decades of CLI usage. They “know” how to use bash, grep, awk, find, and curl because they’ve seen millions of examples of them in their training data.

When we wrap these tools in a new abstraction like MCP, we often introduce a “translation layer” that doesn’t exist in the LLM’s primary world-view. The performance (both in terms of speed and reasoning quality) is often lower when an agent is forced through an MCP interface it’s seeing for the first time, compared to just letting it write a bash script.

This is exactly why something like Claude Skills is gaining so much traction. It’s simple. It’s just a single Markdown file that describes what the agent can do. Everyone can create their own without needing to learn a new protocol or set up a complex server.
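For flavor, a skill file can be as small as this. The sketch below is illustrative, not copied from any official example; the skill name, frontmatter fields, and steps are all hypothetical:

```markdown
---
name: run-tests
description: Run the project test suite and summarize any failures.
---

# Run tests

1. Run the test command from the repository root.
2. If a test fails, read the failing test file and propose a fix.
3. Report a one-paragraph summary of what passed and what broke.
```

That’s it: a short description of a capability, written in prose the model already understands.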

The CLI is the native language of technical AI agents. Standardizing the transport (like MCP does) is great, but we shouldn’t try to abstract away the power and familiarity of the shell.

Complexity is the enemy of security

The biggest lesson I’ve learned is that in the world of AI agents, complexity is the enemy. Every layer of abstraction you add is another place for things to break, and another thing the LLM has to “understand” to be effective.

Sometimes, the best sandbox isn’t a high-tech container or a WASM runtime. Sometimes, it’s just a clean git worktree and a healthy dose of pragmatism.
