Agent Sandbox

Persistent filesystem sessions for TPMJS agents. Clone repos, edit files, run commands — all state preserved across tool calls within a conversation.

Stateful Sessions
Git Support
Shell Execution
File I/O
Self-hostable

Introduction

The Agent Sandbox is a Deno-based execution server that gives each agent conversation its own persistent workspace directory. Unlike the default stateless executor where each tool call is independent, the sandbox preserves files across tool calls so agents can perform multi-step workflows.

A file written in the first turn of a conversation is still there in the tenth turn. A git repo cloned in one step can be modified and committed in the next. This is what makes agentic coding workflows possible.

User Message
┌─────────────────────────────────────┐
│  TPMJS Web App (Next.js)            │
│                                     │
│  1. Create/resume sandbox session   │
│  2. Stream AI response              │
│  3. Route tool calls to sandbox     │
│  4. Persist messages to DB          │
└──────────────┬──────────────────────┘
               │  POST /execute-tool
               │  { sessionId, params }
┌─────────────────────────────────────┐
│  Agent Sandbox Server (Deno)        │
│                                     │
│  Session: agent123:conv456          │
│  ┌───────────────────────────────┐  │
│  │  /workspace/                  │  │
│  │  ├── my-repo/                 │  │
│  │  │   ├── .git/                │  │
│  │  │   ├── src/                 │  │
│  │  │   └── README.md            │  │
│  │  └── notes.txt                │  │
│  └───────────────────────────────┘  │
│                                     │
│  Inject _sandboxWorkDir → execute   │
└─────────────────────────────────────┘

When to Use the Sandbox

🔧

Multi-step Workflows

Generate files in one step, process them in the next. Build pipelines that span multiple tool calls.

📂

Git Operations

Clone repositories, make changes, commit, and push. The full git workflow within a conversation.

💾

File-based Processing

Tools that read and write files need shared state. CSV analysis, log parsing, code generation — all work naturally.

🏗️

Code Generation & Testing

Generate code in one step, run tests or build it in another. Iterate based on results.

Don't need statefulness? Use the default executor. It's simpler, faster, and doesn't require a separate server. The sandbox is specifically for workflows where files must persist between tool calls.

How It Works

Here's a multi-turn conversation showing how state persists:

// Turn 1: User says "Clone the repo and show me the README"
// Agent calls:
shellExec({ command: "git clone https://github.com/user/repo.git --depth 1" })
readFile({ path: "repo/README.md" })

// Turn 2: User says "Add a license file"
// Agent calls (files from turn 1 still exist):
writeFile({ path: "repo/LICENSE", content: "MIT License..." })
shellExec({ command: "cd repo && git add LICENSE && git commit -m 'Add license'" })

// Turn 3: User says "Show me the git log"
// Agent calls (all prior changes persist):
shellExec({ command: "cd repo && git log --oneline" })

End-to-End Flow

  1. User sends a message to an agent with sandbox enabled
  2. The conversation route creates or resumes a sandbox session (ID: agentId:conversationId)
  3. The AI model decides which tools to call based on the user message
  4. Tool calls are routed to the sandbox server with the session ID
  5. The sandbox server resolves the session workspace and injects _sandboxWorkDir into params
  6. The tool executes in the workspace directory, so files persist on disk
  7. Results flow back to the AI model, which may call more tools or respond with text
  8. All messages, tool calls, and results are persisted to the database
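
Steps 4 to 6 of the flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the server's actual code: workDirFor and injectWorkDir are hypothetical names, and the directory layout is inferred from the example workDir values shown in the Session API responses.

```typescript
// Hypothetical base directory, taken from the example workDir in this document.
const SESSIONS_ROOT = "/tmp/tpmjs-sandbox/sessions";

// Map a session ID such as "agent123:conv456" to its workspace directory.
// Assumption: ":" is replaced with "_" so the ID is safe as a directory name.
function workDirFor(sessionId: string): string {
  return `${SESSIONS_ROOT}/${sessionId.replace(/:/g, "_")}/workspace`;
}

// Return a copy of the tool params with _sandboxWorkDir added.
// The caller's object is left untouched.
function injectWorkDir(
  params: Record<string, unknown>,
  sessionId: string,
): Record<string, unknown> {
  return { ...params, _sandboxWorkDir: workDirFor(sessionId) };
}

const injected = injectWorkDir({ command: "ls -la" }, "agent123:conv456");
console.log(injected);
```

A tool that receives these params can then treat _sandboxWorkDir as its working directory.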

shellExec

Execute any shell command in the workspace. Supports git, npm, python, and any CLI tool available in the container.

Parameters

Parameter  Type    Required  Description
command    string  Yes       Shell command passed to sh -c
timeout    number  No        Timeout in milliseconds (default: 30000, max: 300000)

Example

// Agent calls shellExec
{
  "command": "git clone https://github.com/user/repo.git --depth 1",
  "timeout": 60000
}

Response:

{
  "stdout": "Cloning into 'repo'...\n",
  "stderr": "",
  "exitCode": 0,
  "durationMs": 2341,
  "truncated": false
}

Timeout behavior: On timeout, the server sends SIGTERM, waits 2 seconds, then sends SIGKILL. Exit code 137 (128 + 9, the SIGKILL signal number) indicates the process was force-killed after a timeout.

Output truncation: stdout and stderr are each truncated at 100 KB.
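
The timeout and truncation limits above could be applied along these lines. This is a sketch only: effectiveTimeout and truncateOutput are hypothetical helpers, and it assumes that oversized timeouts are clamped rather than rejected and that 100 KB means 100 * 1024 bytes of UTF-8.

```typescript
const DEFAULT_TIMEOUT_MS = 30000;
const MAX_TIMEOUT_MS = 300000;
const MAX_OUTPUT_BYTES = 100 * 1024; // assumption: 1 KB = 1024 bytes

// Pick the effective timeout. Assumption: values above the maximum are
// clamped, and missing or invalid values fall back to the default.
function effectiveTimeout(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested) || requested <= 0) {
    return DEFAULT_TIMEOUT_MS;
  }
  return Math.min(requested, MAX_TIMEOUT_MS);
}

// Cut text to at most maxBytes of UTF-8 without splitting a character.
function truncateOutput(
  text: string,
  maxBytes: number = MAX_OUTPUT_BYTES,
): { text: string; truncated: boolean } {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) {
    return { text, truncated: false };
  }
  // If the first excluded byte is a UTF-8 continuation byte (10xxxxxx),
  // the cut lands inside a character, so back up to its first byte.
  let end = maxBytes;
  while (end > 0 && ((bytes[end] ?? 0) & 0xc0) === 0x80) {
    end--;
  }
  return { text: new TextDecoder().decode(bytes.subarray(0, end)), truncated: true };
}
```

Applying truncateOutput separately to stdout and stderr matches the "each" wording of the limit.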

readFile

Read a file from the workspace. Paths are relative to the workspace root.

Parameters

Parameter  Type    Required  Description
path       string  Yes       File path relative to workspace root

Example

// Agent calls readFile
{
  "path": "repo/src/index.ts"
}

Response:

{
  "content": "export function hello() {\n  return 'world';\n}\n",
  "path": "repo/src/index.ts",
  "size": 48,
  "truncated": false
}

Content is truncated at 500 KB. Symlinks that escape the workspace are blocked.
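
A lexical version of the workspace check might look like the sketch below. resolveInWorkspace is a hypothetical name, and rejecting absolute paths is an assumption. The sketch covers only ".." traversal; catching symlinks that point outside the workspace would additionally require resolving the real path on disk.

```typescript
// Resolve a workspace-relative path and reject anything that escapes the root.
// Lexical check only: symlinks are not resolved here.
function resolveInWorkspace(workDir: string, relPath: string): string {
  // Assumption: absolute paths are rejected outright.
  if (relPath.startsWith("/")) {
    throw new Error(`Absolute paths are not allowed: ${relPath}`);
  }
  const parts: string[] = [];
  for (const segment of relPath.split("/")) {
    if (segment === "" || segment === ".") {
      continue; // skip empty and current-directory segments
    }
    if (segment === "..") {
      if (parts.length === 0) {
        throw new Error(`Path escapes workspace: ${relPath}`);
      }
      parts.pop(); // step back one level inside the workspace
      continue;
    }
    parts.push(segment);
  }
  return parts.length === 0 ? workDir : `${workDir}/${parts.join("/")}`;
}
```

For example, "repo/../notes.txt" resolves to notes.txt inside the workspace, while "../etc/passwd" throws.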

writeFile

Create or overwrite a file. Parent directories are created automatically by default.

Parameters

Parameter   Type     Required  Description
path        string   Yes       File path relative to workspace root
content     string   Yes       Content to write
createDirs  boolean  No        Create parent directories if needed (default: true)

Example

// Agent calls writeFile
{
  "path": "src/main.py",
  "content": "import json\n\ndef transform(data):\n    return json.dumps(data)\n",
  "createDirs": true
}

Response:

{
  "success": true,
  "path": "src/main.py",
  "bytesWritten": 58
}

listFiles

List files and directories. Supports recursive listing with depth control.

Parameters

Parameter  Type     Required  Description
path       string   No        Directory path relative to workspace root (default: ".")
recursive  boolean  No        List recursively (default: false)
maxDepth   number   No        Max depth for recursive listing (default: 3)

Example

// Agent calls listFiles
{
  "path": ".",
  "recursive": true,
  "maxDepth": 2
}

Response:

{
  "path": ".",
  "entries": [
    { "name": "repo", "type": "directory", "size": null },
    { "name": "repo/README.md", "type": "file", "size": 1024 },
    { "name": "repo/src", "type": "directory", "size": null },
    { "name": "repo/src/index.ts", "type": "file", "size": 48 },
    { "name": "src", "type": "directory", "size": null },
    { "name": "src/main.py", "type": "file", "size": 58 }
  ],
  "truncated": false
}

Capped at 1,000 entries. Symlinks pointing outside the workspace are silently skipped.

Session Lifecycle

1. Session Created

When you send the first message to a sandbox agent, a session is automatically created. The session ID is agentId:conversationId. Each session gets its own workspace directory.

2. TTL Extended on Activity

Every tool call extends the session TTL (default: 24 hours). Session creation is idempotent — subsequent messages just extend the timeout and resume the workspace.

3. Files Persist Across Turns

All files in the workspace persist between tool calls and conversation turns. Git repos, generated files, build artifacts — everything stays until the session ends.

4. Auto-Expire or Manual Cleanup

Sessions expire after the TTL (24 hours of inactivity). Deleting a conversation immediately destroys the session. A cleanup sweep runs every 60 seconds.

Orphan cleanup: On server startup, any leftover workspace directories from a previous run are automatically removed. Graceful shutdown (SIGTERM/SIGINT) destroys all active sessions.

What happens when a session expires? After 24 hours of inactivity, the session and its workspace are deleted. If you send a message after that, a new session is created with a fresh, empty workspace: any files, git repos, or build artifacts from the previous session will be gone. The agent can still use all sandbox tools but starts from scratch.

Coming soon: Workspace snapshots to object storage. Expired sessions will be automatically archived and restored on resume, giving true persistence across session expirations and server restarts.
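
The lifecycle described in this section (idempotent creation, TTL extension on activity, periodic sweep) can be modeled with a small in-memory store. The class and method names below are hypothetical, and the real server also has to create and delete workspace directories, which this sketch leaves to the caller.

```typescript
interface Session {
  sessionId: string;
  createdAt: number; // epoch milliseconds
  expiresAt: number; // epoch milliseconds
}

class SessionStore {
  private sessions = new Map<string, Session>();
  private ttlMs: number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Idempotent: an existing, unexpired session is resumed and its TTL extended.
  createOrResume(sessionId: string, now: number): { session: Session; resumed: boolean } {
    const existing = this.sessions.get(sessionId);
    if (existing && existing.expiresAt > now) {
      existing.expiresAt = now + this.ttlMs;
      return { session: existing, resumed: true };
    }
    const session: Session = { sessionId, createdAt: now, expiresAt: now + this.ttlMs };
    this.sessions.set(sessionId, session);
    return { session, resumed: false };
  }

  // Called on every tool call to push the expiry forward.
  // Returns false when the session is unknown or already expired.
  touch(sessionId: string, now: number): boolean {
    const session = this.sessions.get(sessionId);
    if (!session || session.expiresAt <= now) {
      return false;
    }
    session.expiresAt = now + this.ttlMs;
    return true;
  }

  // Periodic sweep: drop expired sessions and return their IDs so the
  // caller can delete the matching workspace directories.
  sweep(now: number): string[] {
    const expired: string[] = [];
    this.sessions.forEach((session, id) => {
      if (session.expiresAt <= now) {
        expired.push(id);
      }
    });
    expired.forEach((id) => this.sessions.delete(id));
    return expired;
  }
}
```

Passing the current time in as a parameter keeps the store easy to test; a server would call sweep(Date.now()) from a timer, for instance every 60 seconds.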

Session API

These endpoints are called automatically by the TPMJS web app. You only need them if building a custom integration.

POST /sessions — Create or Resume

curl -X POST https://your-sandbox.up.railway.app/sessions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $EXECUTOR_API_KEY" \
  -d '{"sessionId": "agent123:conv456", "timeoutSeconds": 86400}'

Response:

{
  "sessionId": "agent123:conv456",
  "workDir": "/tmp/tpmjs-sandbox/sessions/agent123_conv456/workspace",
  "createdAt": "2026-02-16T10:00:00.000Z",
  "expiresAt": "2026-02-17T10:00:00.000Z",
  "resumed": false
}

Idempotent: if the session already exists, returns it with "resumed": true and extends the TTL.

GET /sessions/:id — Get Status

curl https://your-sandbox.up.railway.app/sessions/agent123:conv456 \
  -H "Authorization: Bearer $EXECUTOR_API_KEY"

Response:

{
  "sessionId": "agent123:conv456",
  "workDir": "/tmp/tpmjs-sandbox/sessions/agent123_conv456/workspace",
  "createdAt": "2026-02-16T10:00:00.000Z",
  "expiresAt": "2026-02-17T10:00:00.000Z",
  "toolCallCount": 7
}

DELETE /sessions/:id — Destroy

curl -X DELETE https://your-sandbox.up.railway.app/sessions/agent123:conv456 \
  -H "Authorization: Bearer $EXECUTOR_API_KEY"

Immediately removes the session and deletes the workspace directory.

Sandbox Configuration

Enable sandbox by toggling the sandboxEnabled switch on your agent. Sandbox is independent of the executor setting — you can use sandbox with either the default executor or a custom one.

// Enable sandbox on an agent via the API:
PATCH /api/agents/:id
{
  "sandboxEnabled": true
}

// Sandbox URL and API key are configured via environment variables:
// AGENT_SANDBOX_URL=https://your-sandbox.up.railway.app
// AGENT_SANDBOX_API_KEY=your-secret-key

Config Cascade

Executor configuration follows a cascade — agent-level overrides collection-level, which overrides the system default:

// Resolution order (highest to lowest priority):
Agent.executorConfig     →  applies to ALL tools
Collection.executorConfig →  applies to tools in that collection
System default           →  stateless package-executor

               TPMJS Default Sandbox                      Self-hosted
Setup          Zero config: just toggle "Enable Sandbox"  Deploy template to Railway/Docker
URL            Leave URL field empty                      Enter your sandbox URL
Data Privacy   Shared infrastructure                      Your infrastructure, your data
Customization  Standard limits                            Custom TTL, disk quotas, session limits
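
The config cascade in this section amounts to a first-defined-wins lookup. The sketch below is generic over the config shape because the actual fields of executorConfig are not spelled out here; resolveExecutorConfig, the ExecutorConfig interface, and the example URLs are all hypothetical.

```typescript
// Return the highest-priority config that is set:
// agent level, then collection level, then the system default.
function resolveExecutorConfig<T>(
  agentConfig: T | null | undefined,
  collectionConfig: T | null | undefined,
  systemDefault: T,
): T {
  return agentConfig ?? collectionConfig ?? systemDefault;
}

// Hypothetical config shape, for illustration only.
interface ExecutorConfig {
  url: string;
}

const systemDefault: ExecutorConfig = { url: "https://default-executor.example.com" };
const collectionConfig: ExecutorConfig = { url: "https://collection-executor.example.com" };

// No agent-level config is set, so the collection-level config applies.
const resolved = resolveExecutorConfig<ExecutorConfig>(undefined, collectionConfig, systemDefault);
console.log(resolved.url);
```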

Environment Variables

Sandbox Server

Variable                     Default  Description
EXECUTOR_API_KEY             (none)   Bearer token for authentication. If unset, auth is disabled.
MAX_CONCURRENT_SESSIONS      50       Maximum number of active sessions
DEFAULT_SESSION_TTL_SECONDS  86400    Session timeout (24 hours)
SESSION_DISK_QUOTA_MB        100      Per-session disk quota in MB
PORT                         3002     Server port
GITHUB_TOKEN                 (none)   GitHub PAT for cloning private repos via credential helper

Web App

Variable           Default  Description
AGENT_SANDBOX_URL  (none)   URL of the sandbox server. Can be overridden per-agent via executorConfig.url.

Agent Environment Variables

Agents and collections can define environment variables that are injected into each tool call. Agent-level env vars override collection-level ones:

// Collection envVars (base)
{ "API_KEY": "coll-key", "REGION": "us-east-1" }

// Agent envVars (overrides)
{ "API_KEY": "agent-key", "DEBUG": "true" }

// Merged result
{ "API_KEY": "agent-key", "REGION": "us-east-1", "DEBUG": "true" }

Env vars are set before each tool execution and cleaned up afterward. They are not persisted across calls.
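
The merge and the set-then-clean-up behavior can be sketched as below. mergeEnvVars and withEnv are hypothetical helpers. The sketch operates on a plain object rather than the real process environment and only covers synchronous callbacks; an async version would need to await the call before restoring.

```typescript
type EnvVars = Record<string, string>;

// Agent-level values override collection-level values with the same key.
function mergeEnvVars(collectionEnv: EnvVars, agentEnv: EnvVars): EnvVars {
  return { ...collectionEnv, ...agentEnv };
}

// Apply vars to the target environment for the duration of fn,
// then restore whatever was there before.
function withEnv<T>(
  target: Record<string, string | undefined>,
  vars: EnvVars,
  fn: () => T,
): T {
  const previous: Record<string, string | undefined> = {};
  for (const key of Object.keys(vars)) {
    previous[key] = target[key];
    target[key] = vars[key];
  }
  try {
    return fn();
  } finally {
    for (const key of Object.keys(vars)) {
      if (previous[key] === undefined) {
        delete target[key]; // the key did not exist before the call
      } else {
        target[key] = previous[key];
      }
    }
  }
}

const merged = mergeEnvVars(
  { API_KEY: "coll-key", REGION: "us-east-1" },
  { API_KEY: "agent-key", DEBUG: "true" },
);
console.log(merged);
```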

Git Integration

The sandbox container includes git and openssh-client. Agents can clone repos, create branches, make commits, and push changes.

Public Repos

No configuration needed. Agents can clone any public repo:

shellExec({ command: "git clone https://github.com/user/repo.git --depth 1" })

Private Repos

Set the GITHUB_TOKEN environment variable on the sandbox server. A built-in credential helper reads the token from the server environment and supplies it to git as an x-access-token credential, so no .gitconfig or SSH key setup is needed.
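
For readers unfamiliar with git credential helpers: git asks a helper for credentials and expects key=value lines on its standard output. The function below sketches the payload such a helper would return for a token. credentialHelperOutput is a hypothetical name, and this is not the sandbox's actual helper.

```typescript
// Build the stdout payload of a git credential helper answering a "get" request.
// Git reads one key=value pair per line.
function credentialHelperOutput(token: string | undefined): string {
  if (!token) {
    return ""; // no token configured: return nothing and let git proceed without credentials
  }
  return `username=x-access-token\npassword=${token}\n`;
}

// "ghp_example" is a placeholder, not a real token.
console.log(credentialHelperOutput("ghp_example"));
```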

Deploy to Railway

The fastest way to get a self-hosted sandbox running.

# Clone the template
git clone https://github.com/tpmjs/tpmjs.git
cd tpmjs/templates/agent-sandbox

# Deploy to Railway
npm install -g @railway/cli
railway login
railway init
railway up

# Set env vars in Railway dashboard:
#   EXECUTOR_API_KEY=your-secret
#   GITHUB_TOKEN=ghp_... (optional, for private repos)

Set EXECUTOR_API_KEY in the Railway dashboard to secure your instance.

Deploy with Docker

# Build
docker build -t tpmjs-sandbox .

# Run
docker run -p 3002:3002 \
  -e EXECUTOR_API_KEY=your-secret-key \
  -e GITHUB_TOKEN=ghp_optional \
  tpmjs-sandbox

The Docker image is based on denoland/deno:2.1.9 and includes curl, git, and openssh-client. It runs as a non-root user with write access restricted to /tmp.

Connect to TPMJS

Point the web app at your sandbox server:

# apps/web/.env.local
AGENT_SANDBOX_URL=https://your-sandbox.up.railway.app

Or for local development:

# Run locally (no auth)
cd templates/agent-sandbox
deno run --allow-net --allow-env --allow-read --allow-write=/tmp --allow-run server.ts

# Quick test
curl http://localhost:3002/health | jq .

Security Model

🔐

Authentication

All endpoints except /health and /metrics require a Bearer token when EXECUTOR_API_KEY is set. Disable auth by leaving it unset (dev only).

📁

Path Sandboxing

All file paths are normalized, resolved, and validated to stay within the workspace. Symlinks are checked via realPath(). Escape attempts throw an error.

👤

Process Isolation

Runs as a non-root deno user. Write access is restricted to /tmp. Each session has its own workspace directory. Env vars are cleaned up after each call.

⏱️

Timeout Protection

Shell commands have configurable timeouts (default 30s, max 5min). On timeout the process receives SIGTERM, then SIGKILL 2 seconds later. Executor calls also have an HTTP-level timeout of 5 minutes.
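
The authentication rule from this section can be expressed as a small predicate. isAuthorized is a hypothetical name and the sketch uses a plain string comparison; a production server would likely prefer a constant-time comparison.

```typescript
// Endpoints that stay open even when an API key is configured.
const PUBLIC_PATHS = ["/health", "/metrics"];

// Decide whether a request may proceed.
// apiKey is the configured EXECUTOR_API_KEY; undefined or empty disables auth.
function isAuthorized(
  path: string,
  authorizationHeader: string | null,
  apiKey: string | undefined,
): boolean {
  if (!apiKey) {
    return true; // auth disabled (dev only)
  }
  if (PUBLIC_PATHS.indexOf(path) !== -1) {
    return true;
  }
  return authorizationHeader === `Bearer ${apiKey}`;
}
```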

Limits & Quotas

Resource               Limit                  Configurable
Disk per session       100 MB                 SESSION_DISK_QUOTA_MB
Concurrent sessions    50                     MAX_CONCURRENT_SESSIONS
Session TTL            24 hours               DEFAULT_SESSION_TTL_SECONDS
Shell command timeout  30s (max 5min)         Per tool call
stdout/stderr          100 KB each            No
File read size         500 KB                 No
Directory listing      1,000 entries          No
Module cache           200 entries, 2min TTL  No

Execute Tool API

The core endpoint for running tools within a session.

curl -X POST https://your-sandbox.up.railway.app/execute-tool \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $EXECUTOR_API_KEY" \
  -d '{
    "packageName": "@tpmjs/official-sandbox-shell",
    "name": "shellExec",
    "version": "0.1.0",
    "params": { "command": "ls -la" },
    "sessionId": "agent123:conv456",
    "env": { "MY_VAR": "value" }
  }'

Response:

{
  "success": true,
  "output": {
    "stdout": "total 8\ndrwxr-xr-x 3 deno deno 4096 ...\n",
    "stderr": "",
    "exitCode": 0,
    "durationMs": 12,
    "truncated": false
  },
  "executionTimeMs": 45
}

Request Fields

Parameter    Type    Required  Description
packageName  string  Yes       npm package name (e.g. @tpmjs/official-sandbox-shell)
name         string  Yes       Named export from the package (e.g. shellExec)
version      string  No        Package version (default: latest)
params       object  No        Parameters passed to tool.execute()
sessionId    string  No        Session ID for workspace resolution
env          object  No        Environment variables set for this execution only
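
For a custom integration, the request fields above map naturally onto a typed client. The sketch below is an assumption-laden illustration: the interface and function names are hypothetical, and it assumes the server signals failures with a non-2xx HTTP status, which this page does not state.

```typescript
interface ExecuteToolRequest {
  packageName: string;
  name: string;
  version?: string;
  params?: Record<string, unknown>;
  sessionId?: string;
  env?: Record<string, string>;
}

interface HttpInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Build the HTTP request options for POST /execute-tool.
// The Authorization header is only added when an API key is given.
function buildExecuteToolInit(request: ExecuteToolRequest, apiKey?: string): HttpInit {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }
  return { method: "POST", headers, body: JSON.stringify(request) };
}

// Send the request with the global fetch API and return the parsed JSON body.
async function executeTool(
  baseUrl: string,
  request: ExecuteToolRequest,
  apiKey?: string,
): Promise<unknown> {
  const response = await fetch(`${baseUrl}/execute-tool`, buildExecuteToolInit(request, apiKey));
  if (!response.ok) {
    throw new Error(`execute-tool failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping the request construction separate from the network call makes the payload easy to inspect and test without a running sandbox.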

Health & Metrics

GET /health

No authentication required. Use for uptime monitoring and capability detection.

{
  "status": "ok",
  "version": "1.0.0",
  "info": {
    "runtime": "deno",
    "denoVersion": "2.1.9",
    "activeSessions": 3,
    "maxSessions": 50,
    "cachedModules": 12,
    "capabilities": {
      "sessions": true,
      "executeToolWithSession": true
    }
  }
}
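
Capability detection against this response could look like the sketch below. supportsSessions is a hypothetical helper, and the interface lists only the fields it reads from the example above.

```typescript
interface HealthResponse {
  status: string;
  info?: {
    capabilities?: {
      sessions?: boolean;
      executeToolWithSession?: boolean;
    };
  };
}

// True only when the server is healthy and advertises both session capabilities.
function supportsSessions(health: HealthResponse): boolean {
  const capabilities = health.info?.capabilities;
  return (
    health.status === "ok" &&
    capabilities?.sessions === true &&
    capabilities?.executeToolWithSession === true
  );
}
```

A client can fall back to stateless execution when this returns false.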

GET /metrics

No authentication required. Operational metrics for dashboards and alerting.

{
  "uptime": 3600000,
  "sessions": {
    "active": 3,
    "max": 50,
    "totalCreated": 150,
    "totalDestroyed": 147
  },
  "executions": {
    "total": 1024,
    "successful": 1010,
    "failed": 14
  },
  "memory": {
    "rss": 67108864,
    "heapUsed": 45000000,
    "heapTotal": 60000000
  },
  "cache": {
    "modules": 12,
    "maxSize": 200
  }
}