Giving AI Agents a Place to Run Code

As AI agents become more capable, a critical question keeps surfacing: where should they actually execute code? Running untrusted, LLM-generated code on your host machine is a non-starter. Sharing a single persistent environment between agents leads to conflicts and state pollution. What's needed is something disposable, isolated, and fast.

That's why we built Monza — an open-source platform for creating ephemeral, Docker-backed development sandboxes designed specifically for AI agent workflows.

The Problem

Today's AI coding agents — whether they're debugging, prototyping, or running data pipelines — need a safe place to execute arbitrary code. The options aren't great:

  • Local execution is risky. One bad rm -rf and you're having a bad day.
  • Persistent cloud environments are expensive and accumulate stale state.
  • Serverless functions are too constrained for real development tasks: execution time limits and restricted filesystem access get in the way of builds and test runs.

What agents really need are short-lived containers that spin up instantly, run whatever the agent needs, and disappear when the work is done.

How Monza Works

Monza has a simple architecture: a Go backend that orchestrates Docker containers, backed by PostgreSQL for state tracking.

When an agent requests a sandbox, Monza:

  1. Provisions a Docker container from a devcontainer.json template
  2. Exposes a REST API for interacting with the container
  3. Tracks the sandbox lifecycle — from creating to running to expired
  4. Automatically cleans up idle sandboxes after a configurable TTL (default: 15 minutes)

Agents keep their sandboxes alive by sending heartbeats. Stop sending heartbeats, and the sandbox expires — no orphaned containers eating up resources.
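
To make the heartbeat and TTL behavior concrete, here is a minimal sketch of how idle expiry could be modeled. It is not Monza's Go implementation: the SandboxRecord class, its field names, and the reap_expired helper are all hypothetical, and only the 15-minute default and the running/expired states come from the description above.

```python
from dataclasses import dataclass

DEFAULT_TTL_SECONDS = 15 * 60  # the default TTL of 15 minutes stated in this post


@dataclass
class SandboxRecord:
    """Hypothetical stand-in for the per-sandbox state a backend might track."""
    sandbox_id: str
    last_heartbeat: float                      # seconds on an arbitrary clock
    ttl_seconds: float = DEFAULT_TTL_SECONDS
    status: str = "running"

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat; a sandbox that already expired stays expired."""
        if self.status == "running":
            self.last_heartbeat = now

    def is_idle(self, now: float) -> bool:
        """True once a full TTL has passed since the last heartbeat."""
        return now - self.last_heartbeat >= self.ttl_seconds


def reap_expired(records: list[SandboxRecord], now: float) -> list[str]:
    """Mark idle running sandboxes as expired and return their ids."""
    expired = []
    for record in records:
        if record.status == "running" and record.is_idle(now):
            record.status = "expired"
            expired.append(record.sandbox_id)
    return expired


# Two sandboxes start at t=0; only "a" sends a heartbeat at t=600.
a = SandboxRecord("a", last_heartbeat=0.0)
b = SandboxRecord("b", last_heartbeat=0.0)
a.heartbeat(now=600.0)
reaped = reap_expired([a, b], now=900.0)
print(reaped)
```

In this model the sandbox that stopped sending heartbeats is reclaimed at the 15-minute mark, while the one that checked in keeps running until its own TTL elapses.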

What You Can Do Inside a Sandbox

Once a sandbox is running, the API gives agents full control:

  • Execute commands — run build scripts, install dependencies, execute tests
  • Upload files — push source code, configs, or data into the container (up to 100 MB)
  • Download files — pull build artifacts, logs, or generated output back out
  • Port mapping — access services running inside the container
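
Because uploads are capped at 100 MB, an agent may want to check file sizes before sending anything. The sketch below is one way to do that on the client side. It assumes "100 MB" means 100 * 1024 * 1024 bytes, which this post does not specify, and check_upload_size is a hypothetical helper, not part of any Monza library.

```python
import os
import tempfile

# Assumption: the 100 MB cap is interpreted as 100 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def check_upload_size(path, limit=MAX_UPLOAD_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"{path} is {size} bytes; limit is {limit} bytes")
    return size


# Demo: write a small file and check it against the default and a tiny limit.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"print('hello')\n")
    demo_path = handle.name

demo_size = check_upload_size(demo_path)

try:
    check_upload_size(demo_path, limit=10)
    rejected = False
except ValueError:
    rejected = True

os.remove(demo_path)  # clean up the demo file
```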

The entire lifecycle is managed through a clean REST API:

POST   /api/sandboxes                              # Create a sandbox
GET    /api/sandboxes/{id}                         # Get sandbox status
POST   /api/sandboxes/{id}/heartbeat               # Keep it alive
POST   /api/sandboxes/{id}/files/upload            # Push files in
GET    /api/sandboxes/{id}/files/download?path=... # Pull files out
DELETE /api/sandboxes/{id}                         # Tear it down
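
As a rough illustration of how a client might address these endpoints, the following sketch builds (but does not send) requests with Python's standard library. Several details are assumptions that this post does not confirm: the base URL, the JSON request body with a "template" field, and the example sandbox id and file path. Only the HTTP methods and URL paths are taken from the list above.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the post does not state which host or port the server listens on.
BASE_URL = "http://localhost:8080"


def sandbox_request(method, path, query=None, body=None):
    """Build a urllib Request for one of the endpoints listed above."""
    url = BASE_URL + path
    if query:
        url += "?" + urllib.parse.urlencode(query)
    data = None
    headers = {}
    if body is not None:
        # Assumption: request bodies are JSON; the real schema may differ.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


sandbox_id = "sbx-123"  # hypothetical id; real ids come from the create response

create = sandbox_request("POST", "/api/sandboxes", body={"template": "python-3.12"})
heartbeat = sandbox_request("POST", f"/api/sandboxes/{sandbox_id}/heartbeat")
download = sandbox_request(
    "GET",
    f"/api/sandboxes/{sandbox_id}/files/download",
    query={"path": "/workspace/out.csv"},
)
delete = sandbox_request("DELETE", f"/api/sandboxes/{sandbox_id}")

for request in (create, heartbeat, download, delete):
    print(request.get_method(), request.full_url)
```

Passing any of these objects to urllib.request.urlopen would send it to a running server. Note that the download path is URL-encoded as a query parameter rather than embedded in the route.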

LangChain Integration

We also built langchain-monza — a Python client library that makes it trivial to use Monza sandboxes from AI agent frameworks.

from langchain_monza import MonzaSandbox

with MonzaSandbox(template="python-3.12") as sandbox:
    result = sandbox.execute("echo 'Hello from the sandbox!'")
    print(result)

The MonzaSandbox class handles sandbox creation, command execution, file operations, and automatic cleanup via Python's context manager protocol. No need to worry about dangling containers — when the with block exits, the sandbox is torn down.
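
The cleanup guarantee comes from the context manager protocol itself. The sketch below uses a hypothetical FakeSandbox stand-in (it is not the langchain-monza implementation and talks to no server) to show that the teardown step runs even when the code inside the with block raises.

```python
class FakeSandbox:
    """Illustrative stand-in only; it does not contact a Monza server."""

    def __init__(self, template):
        self.template = template
        self.closed = False

    def execute(self, command):
        if self.closed:
            raise RuntimeError("sandbox is already closed")
        return f"ran: {command}"

    def close(self):
        # A real client would ask the backend to delete the sandbox here.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, traceback):
        self.close()
        return False  # do not swallow exceptions from the with block


try:
    with FakeSandbox("python-3.12") as sandbox:
        output = sandbox.execute("echo ok")
        raise RuntimeError("simulated agent failure")
except RuntimeError:
    pass

print(output, sandbox.closed)
```

Returning False from __exit__ lets the original error propagate to the caller, so failures stay visible while the sandbox is still released.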

For longer-running workflows, you can also manage the lifecycle manually:

sandbox = MonzaSandbox(template="python-3.12")
try:
    sandbox.execute("pip install pandas && python analysis.py")
finally:
    sandbox.close()  # always tear the sandbox down, even if a command fails

Install it with pip:

pip install langchain-monza

Why Ephemeral?

The key design decision behind Monza is that sandboxes are disposable by default. This isn't a limitation — it's the point:

  • Security — each task gets a fresh, isolated environment. No cross-contamination between agent runs.
  • Resource efficiency — idle containers are automatically reclaimed. No zombie processes.
  • Reproducibility — every execution starts from a known state defined by the devcontainer template.
  • Simplicity — no state management, no environment drift, no cleanup scripts.

Devcontainer Templates

Monza uses the devcontainer.json standard for defining sandbox environments. Drop a template into the configured directory, and agents can request it by name. This means you can predefine environments for Python, Node.js, Go, Rust — whatever your agents need — with all the right tooling pre-installed.
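
As an example of what such a template might contain, this sketch writes a small devcontainer.json for a Python environment. The keys used (name, image, postCreateCommand) are standard devcontainer.json properties, but the directory layout (one folder per template name), the image tag, and the write_template helper are assumptions for illustration; check the Monza backend's configuration for how it actually discovers templates.

```python
import json
import tempfile
from pathlib import Path


def write_template(templates_dir, name, image, post_create):
    """Write <templates_dir>/<name>/devcontainer.json and return its path.

    Assumption: one folder per template name; Monza's real layout may differ.
    """
    config = {
        "name": name,
        "image": image,
        "postCreateCommand": post_create,
    }
    target = Path(templates_dir) / name / "devcontainer.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return target


# Demo: write into a scratch directory rather than a real Monza install.
scratch_dir = tempfile.mkdtemp()
template_path = write_template(
    scratch_dir,
    name="python-3.12",
    image="mcr.microsoft.com/devcontainers/python:3.12",  # example image tag
    post_create="pip install pandas",
)
print(template_path.read_text(encoding="utf-8"))
```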

Getting Started

Monza requires Go 1.22+, Docker, and PostgreSQL. Clone the backend, configure your database URL, and start the server:

git clone https://github.com/Glyph-Software/monza-backend
cd monza-backend
export DATABASE_URL="postgres://user:pass@localhost:5432/monza"
go run cmd/server/main.go

Then install the Python client and start creating sandboxes from your agent code.

Both repositories, listed at the end of this post, are open source under the MIT license.

What's Next

Monza is still early, and we're actively developing it. We're exploring support for persistent volumes, multi-container workspaces, and tighter integrations with popular agent frameworks beyond LangChain. If you're building AI agents that need code execution, we'd love your feedback and contributions.

Glyph-Software/monza-backend
Go backend for ephemeral sandbox management — Docker orchestration, REST API, and automatic cleanup.
Glyph-Software/langchain-monza
Python client library for Monza — create and manage sandboxes from AI agent frameworks like LangChain.