Sandbox Isolation: How A2A Cloud Runs Agent Code Inside microVMs
Containers aren't enough. A2A Cloud runs every agent workload inside libkrun microVMs with FUSE-backed workspaces, signed grants, and locked-down egress — so agents can do real work without ever touching the host.
Agents need to *execute*. Run shell. Run Python. Install deps. Generate files. Test their own output.
That's the whole point. And it's exactly where most agent platforms get hand-wavy.
A2A Cloud doesn't. Every line of agent code runs inside a libkrun microVM, with the workspace exposed through a custom FUSE filesystem, network egress disabled by default, and access scoped by signed grants. This is the security layer that makes the rest of the platform possible.
Let's open the hood.
The Stack: libkrun + microsandbox + FUSE
The sandbox runtime is a FastAPI service (sandbox_runtime/service.py) that exposes five endpoints:
- POST /v1/run_shell: one-shot shell execution
- POST /v1/run_python: one-shot Python execution
- POST /v1/sandboxes: create a persistent sandbox session
- POST /v1/sandboxes/{name}/exec: exec inside a persistent session
- DELETE /v1/sandboxes/{name}: teardown
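To make the one-shot route concrete, here is a minimal sketch of how a caller might build (not send) a request to it. Only the route and the X-A2A-Grant header name come from this article; the base URL, the JSON field name `command`, and the helper `build_run_shell_request` are assumptions for illustration, not the service's documented schema.

```python
import json
import urllib.request


def build_run_shell_request(base_url: str, command: str, grant: str) -> urllib.request.Request:
    """Build a POST /v1/run_shell request without sending it."""
    # The "command" field name is an assumption; the real request schema may differ.
    body = json.dumps({"command": command}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/run_shell",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-A2A-Grant": grant},
    )


# Hypothetical host and grant value, used only to show the shape of the call.
req = build_run_shell_request("http://localhost:8000", "echo hello", "example-grant-token")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it; the sketch stops short of that so it runs without a live service.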
Under that wrapper sits the real isolation:
from microsandbox import Sandbox, Volume, RegistryAuth

cfg = {
    "name": self._sandbox_name,
    "image": self._image,
    "volumes": {self._guest_mount: Volume.bind(self._mountpoint)},
    "replace": True,
    "memory_mib": self._resource_caps.vm_memory_mib,
    "cpus": self._resource_caps.vm_cpu_count,
}
self._sandbox = await Sandbox.create(cfg)

libkrun is the hypervisor. microsandbox is the Python control layer. Each agent invocation is its own microVM. Not a container. Not a namespace. A microVM.
That's the moat. Containers share a kernel. microVMs do not. A compromised agent can't pivot to the host.
FUSE Mode: The Workspace Lives on S3
Agents need a /workspace. But the workspace can't actually be on the host — it's a MinIO bucket, scoped to a grant.
So A2A Cloud built a live FUSE filesystem that translates VFS calls into S3 operations.
Architecture diagram (from the module docstring):
sandbox guest:/workspace
↔ Volume.bind
↔ host:/var/run/sb-{id}/mnt (FUSE)
↔ S3FuseAdapter
↔ Backend
↔ MinIO

Real VFS handlers:
- read → download_files() with 4MB read-ahead
- write → buffered SpooledTemporaryFile (max 64MB in-memory), flushed via overwrite_stream()
- unlink → backend.delete()
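The handler mapping above can be sketched in a few lines. This is a simplified illustration, not the real S3FuseAdapter: the in-memory backend, the class names, the method signatures, and the choice to flush on release are all assumptions, and the read-ahead is left out.

```python
import tempfile


class InMemoryBackend:
    """Hypothetical stand-in for the object store (MinIO in the article)."""

    def __init__(self):
        self.objects = {}

    def download(self, key):
        return self.objects[key]

    def overwrite_stream(self, key, stream):
        self.objects[key] = stream.read()

    def delete(self, key):
        del self.objects[key]


class S3AdapterSketch:
    """Maps filesystem-style calls onto backend operations."""

    SPOOL_MAX = 64 * 1024 * 1024  # stay in memory up to 64MB, then spill to disk

    def __init__(self, backend):
        self.backend = backend
        self._pending = {}  # path -> spooled buffer holding unflushed writes

    def read(self, path, size, offset):
        # No read-ahead here: fetch the whole object and slice it.
        return self.backend.download(path)[offset:offset + size]

    def write(self, path, data, offset):
        buf = self._pending.get(path)
        if buf is None:
            buf = tempfile.SpooledTemporaryFile(max_size=self.SPOOL_MAX)
            self._pending[path] = buf
        buf.seek(offset)
        buf.write(data)
        return len(data)

    def release(self, path):
        # Assumption: buffered writes are pushed when the file handle closes.
        buf = self._pending.pop(path, None)
        if buf is not None:
            buf.seek(0)
            self.backend.overwrite_stream(path, buf)
            buf.close()

    def unlink(self, path):
        self.backend.delete(path)


backend = InMemoryBackend()
fs = S3AdapterSketch(backend)
fs.write("outputs/report.txt", b"hello ", 0)
fs.write("outputs/report.txt", b"world", 6)
fs.release("outputs/report.txt")
print(fs.read("outputs/report.txt", 5, 6))  # b'world'
```

Buffering writes in a spooled temp file means small outputs never touch the host disk, while large ones spill over instead of exhausting memory.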
Kernel cache is disabled entirely — auto_cache=False, attr_timeout=0.0, entry_timeout=0.0. Why? Because the grant is the source of truth. Stale cached entries could expose data a revoked grant shouldn't see.
This is paranoid by design. Good.
Bridge Mode: The Mac Compatibility Lane
Macs on older Apple Silicon (M1/M2) can't do live FUSE the way Linux hosts can. So the sandbox runtime ships a second mode: bridge mode.
Bridge mode pre-hydrates a host tmpdir from the backend and raw bind-mounts it into the guest. When the exec finishes, it snapshots the directory, diffs it against the pre-exec snapshot, and pushes the changes back. _snapshot_dir() + _flush_changes() do the work.
Bridge mode is the development compatibility path. FUSE mode is the production path. The selection happens at runtime in microsandbox_fuse.py:1346-1362.
Same API. Same agent contract. Different mount strategy.
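The snapshot-and-diff idea behind bridge mode can be sketched like this. The function names `snapshot_dir` and `diff_snapshots` are hypothetical, and comparing content hashes is an assumption; the real _snapshot_dir() and _flush_changes() may track changes differently (for example by size or mtime).

```python
import hashlib
import os
import tempfile


def snapshot_dir(root):
    """Return {relative_path: sha256 hex digest} for every file under root."""
    snap = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                snap[os.path.relpath(full, root)] = hashlib.sha256(fh.read()).hexdigest()
    return snap


def diff_snapshots(before, after):
    """Split changes into paths to upload and paths to delete."""
    changed = sorted(p for p, digest in after.items() if before.get(p) != digest)
    removed = sorted(p for p in before if p not in after)
    return changed, removed


with tempfile.TemporaryDirectory() as workdir:
    with open(os.path.join(workdir, "keep.txt"), "w") as fh:
        fh.write("unchanged")
    with open(os.path.join(workdir, "old.txt"), "w") as fh:
        fh.write("to be removed")
    before = snapshot_dir(workdir)

    # Simulate what an exec inside the guest might do to the bind-mounted dir.
    os.remove(os.path.join(workdir, "old.txt"))
    with open(os.path.join(workdir, "new.txt"), "w") as fh:
        fh.write("fresh output")
    after = snapshot_dir(workdir)

    changed, removed = diff_snapshots(before, after)
    print(changed, removed)  # ['new.txt'] ['old.txt']
```

A flush step would then upload every path in `changed` and delete every path in `removed` on the backend.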
Write Policies: outputs-only by Default
Not every skill should be allowed to scribble all over the workspace. The sandbox runtime ships two write policies:
- "outputs" (default): writes restricted to /outputs/ and /memories/AGENTS.md
- "workspace": full /workspace writable
A write outside the approved prefix returns EACCES at the FUSE layer (_is_sandbox_writable_path()). Not a runtime exception in Python — an actual filesystem permission denial. The agent code *cannot* lie about it.
This is the principle of least authority, enforced at the kernel boundary. That's the right place.
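Here is a minimal sketch of what such a path check could look like. It is not the real _is_sandbox_writable_path(): the function names, the workspace-relative path convention, and the normalization step are assumptions made for illustration.

```python
import errno
import posixpath

# Taken from the "outputs" policy described above.
OUTPUTS_PREFIXES = ("/outputs/",)
OUTPUTS_FILES = ("/memories/AGENTS.md",)


def is_writable(path, policy="outputs"):
    """Decide whether a workspace-relative path may be written under a policy."""
    # Normalize first so "/outputs/../x" cannot sneak past the prefix check.
    norm = posixpath.normpath("/" + path.lstrip("/"))
    if policy == "workspace":
        return True
    if policy == "outputs":
        return norm in OUTPUTS_FILES or any(norm.startswith(p) for p in OUTPUTS_PREFIXES)
    raise ValueError(f"unknown write policy: {policy!r}")


def check_write(path, policy="outputs"):
    """Raise the same errno a filesystem-level denial would surface."""
    if not is_writable(path, policy):
        raise OSError(errno.EACCES, "write denied by policy", path)


print(is_writable("/outputs/report.md"))        # True
print(is_writable("/outputs/../src/main.py"))   # False
print(is_writable("/src/main.py", "workspace")) # True
```

In a FUSE handler, the `OSError` with `EACCES` is what turns the policy decision into a permission error the guest process sees from its own write call.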
Network Egress: Off by Default
A fresh sandbox boots with network_disabled, and the first exec runs _apply_network_policy(). That script uses iptables/nftables to DROP all egress except localhost.
Want network? You ask for it. The grant carries the flag, the runtime honors it, and even then access is policy-checked.
This is the difference between "sandbox" and *sandbox*. A lot of agent platforms hand-wave network isolation. A2A Cloud blocks at the firewall.
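For a feel of what a default-deny egress policy looks like in iptables terms, here is a small sketch that only builds the commands and prints them. The rule set and the helper name `default_egress_lockdown` are assumptions; the article does not show what _apply_network_policy() actually runs, and an nftables version would look different.

```python
import shlex


def default_egress_lockdown():
    """Return iptables commands (as argv lists) for loopback-only egress."""
    return [
        # Allow traffic leaving through the loopback interface.
        ["iptables", "-A", "OUTPUT", "-o", "lo", "-j", "ACCEPT"],
        # Set the OUTPUT chain's default policy to DROP for everything else.
        ["iptables", "-P", "OUTPUT", "DROP"],
    ]


rules = default_egress_lockdown()
for argv in rules:
    print(shlex.join(argv))
```

The accept rule is added before the policy flips to DROP so loopback traffic is never interrupted while the rules are applied.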
Resource Caps Are Real
The ResourceCaps model is concrete, not suggestive:
- vm_memory_mib: hard guest memory limit
- vm_cpu_count: hard CPU count
- max_single_file_bytes: default 1GB
- max_spool_total_bytes: default 4GB
- max_open_files: default 64
- max_aexecute_wall_s: default 300s
CapsTracker enforces these at the FUSE callback layer. Runaway agent? Capped at the filesystem before it can starve the host.
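A rough sketch of how caps like these could be modeled and checked follows. The field names and defaults mirror the list above; everything else is an assumption, including the class names, the callback names, the exception type, and reading "GB" as 1024^3 bytes. The VM memory and CPU values in the usage are arbitrary, since the article states no defaults for them.

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # assumption: the article's "GB" is treated as binary gigabytes


@dataclass(frozen=True)
class ResourceCapsSketch:
    vm_memory_mib: int
    vm_cpu_count: int
    max_single_file_bytes: int = 1 * GIB
    max_spool_total_bytes: int = 4 * GIB
    max_open_files: int = 64
    max_aexecute_wall_s: int = 300


class CapExceeded(Exception):
    """Raised when a filesystem operation would cross a cap."""


class CapsTrackerSketch:
    """Per-session counters consulted from filesystem callbacks."""

    def __init__(self, caps):
        self.caps = caps
        self.open_files = 0
        self.spooled_bytes = 0

    def on_open(self):
        if self.open_files + 1 > self.caps.max_open_files:
            raise CapExceeded("max_open_files")
        self.open_files += 1

    def on_write(self, file_size_after, new_bytes):
        if file_size_after > self.caps.max_single_file_bytes:
            raise CapExceeded("max_single_file_bytes")
        if self.spooled_bytes + new_bytes > self.caps.max_spool_total_bytes:
            raise CapExceeded("max_spool_total_bytes")
        self.spooled_bytes += new_bytes

    def on_release(self, flushed_bytes=0):
        self.open_files = max(0, self.open_files - 1)
        self.spooled_bytes = max(0, self.spooled_bytes - flushed_bytes)


# Tiny cap so the limit is easy to hit in a demo.
tracker = CapsTrackerSketch(ResourceCapsSketch(vm_memory_mib=512, vm_cpu_count=1, max_open_files=2))
tracker.on_open()
tracker.on_open()
try:
    tracker.on_open()
except CapExceeded as exc:
    print("denied:", exc)  # denied: max_open_files
```

Checking before incrementing keeps the counters consistent: a denied operation leaves the tracker exactly as it was.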
Grants Reach All The Way Down
Every sandbox call accepts either a bearer token or an X-A2A-Grant header. Grants carry allow/deny fnmatch patterns, write prefixes, and bucket scoping.
The grant doesn't just gate the API — it gates the filesystem. The FUSE layer checks _is_sandbox_writable_path() against the grant's write prefixes on every write call. MinIO sees download_files() and overwrite_stream() calls filtered through grant patterns.
The host MinIO credentials never leave the host. The guest only sees its mounted workspace. Zero credential leakage by design.
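The allow/deny pattern check can be illustrated with the standard library's fnmatch module. This is a sketch under assumptions: the function name `grant_allows`, the example patterns, and the deny-wins precedence are not taken from the platform's code.

```python
from fnmatch import fnmatchcase


def grant_allows(path, allow, deny):
    """Return True if path matches an allow pattern and no deny pattern."""
    # Assumption: a deny match always wins over an allow match.
    if any(fnmatchcase(path, pattern) for pattern in deny):
        return False
    return any(fnmatchcase(path, pattern) for pattern in allow)


# Hypothetical grant contents, for illustration only.
allow = ["outputs/*", "memories/AGENTS.md"]
deny = ["*.key"]

print(grant_allows("outputs/report.md", allow, deny))   # True
print(grant_allows("outputs/secret.key", allow, deny))  # False
print(grant_allows("src/main.py", allow, deny))         # False
```

`fnmatchcase` is used instead of `fnmatch` so matching stays case-sensitive on every host platform. Note that `*` in fnmatch patterns also matches path separators, so `outputs/*` covers nested paths too.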
What This Adds Up To
Most "agent sandboxes" are containers with a polite README. A2A Cloud's sandbox is a microVM with:
- libkrun hypervisor isolation (real kernel boundary)
- FUSE-backed workspace tied to S3 + grants
- Egress-disabled-by-default networking
- Filesystem-enforced write policies
- Hard resource caps tracked per session
- No host credentials in guest, ever
That stack is what lets the platform say: "yes, run that generated code." And mean it.
Why This Matters For Builders And Buyers
For builders: you can write agents that do real work — install packages, run validators, render assets, test their own output — without becoming a security engineer. The platform owns the boundary.
For buyers: you can adopt agents from the marketplace knowing the runtime keeps them from pivoting to the host, exfiltrating data, or escaping the sandbox. Egress is blocked by default. Filesystem access is grant-scoped. Resources are capped.
For everyone: this is what production-grade agent infrastructure looks like. No magic. No hand-waving. Just real isolation, all the way down.
Most platforms talk about "safe execution." A2A Cloud ships it.