Agent Sandbox

Persistent filesystem sessions for TPMJS agents. Clone repos, edit files, run commands — all state preserved across tool calls within a conversation.

Stateful Sessions

Git Support

Shell Execution

File I/O

Self-hostable

Introduction

The Agent Sandbox is a Deno-based execution server that gives each agent conversation its own persistent workspace directory. Unlike the default stateless executor, where each tool call is independent, the sandbox preserves files across tool calls so agents can perform multi-step workflows.

A file written in the first turn of a conversation is still there in the tenth turn. A git repo cloned in one step can be modified and committed in the next. This is what makes agentic coding workflows possible.

User Message
  │
  ▼
┌─────────────────────────────────────┐
│  TPMJS Web App (Next.js)            │
│                                     │
│  1. Create/resume sandbox session   │
│  2. Stream AI response              │
│  3. Route tool calls to sandbox     │
│  4. Persist messages to DB          │
└──────────────┬──────────────────────┘
               │  POST /execute-tool
               │  { sessionId, params }
               ▼
┌─────────────────────────────────────┐
│  Agent Sandbox Server (Deno)        │
│                                     │
│  Session: agent123:conv456          │
│  ┌───────────────────────────────┐  │
│  │  /workspace/                  │  │
│  │  ├── my-repo/                 │  │
│  │  │   ├── .git/                │  │
│  │  │   ├── src/                 │  │
│  │  │   └── README.md            │  │
│  │  └── notes.txt                │  │
│  └───────────────────────────────┘  │
│                                     │
│  Inject _sandboxWorkDir → execute   │
└─────────────────────────────────────┘

When to Use the Sandbox

🔧

Multi-step Workflows

Generate files in one step, process them in the next. Build pipelines that span multiple tool calls.

📂

Git Operations

Clone repositories, make changes, commit, and push. The full git workflow within a conversation.

💾

File-based Processing

Tools that read and write files need shared state. CSV analysis, log parsing, code generation — all work naturally.

🏗️

Code Generation & Testing

Generate code in one step, run tests or build it in another. Iterate based on results.

Don't need statefulness? Use the default executor. It's simpler, faster, and doesn't require a separate server. The sandbox is specifically for workflows where files must persist between tool calls.

How It Works

Here's a multi-turn conversation showing how state persists:

// Turn 1: User says "Clone the repo and show me the README"
// Agent calls:
shellExec({ command: "git clone https://github.com/user/repo.git --depth 1" })
readFile({ path: "repo/README.md" })

// Turn 2: User says "Add a license file"
// Agent calls (files from turn 1 still exist):
writeFile({ path: "repo/LICENSE", content: "MIT License..." })
shellExec({ command: "cd repo && git add LICENSE && git commit -m 'Add license'" })

// Turn 3: User says "Show me the git log"
// Agent calls (all prior changes persist):
shellExec({ command: "cd repo && git log --oneline" })

End-to-End Flow

1. User sends a message to an agent with sandbox enabled
2. The conversation route creates or resumes a sandbox session (ID: agentId:conversationId)
3. The AI model decides which tools to call based on the user message
4. Tool calls are routed to the sandbox server with the session ID
5. The sandbox server resolves the session workspace and injects _sandboxWorkDir into params
6. The tool executes in the workspace directory, so files persist on disk
7. Results flow back to the AI model, which may call more tools or respond with text
8. All messages, tool calls, and results are persisted to the database
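Steps 2 and 5 can be sketched as a few small helpers. This is a minimal illustration, not the server's actual code: the function names are hypothetical, and the directory layout (including the mapping of ":" to "_") is inferred from the sample workDir shown in the Session API responses.

```typescript
// Hypothetical helpers; names and path layout are assumptions based on this page's examples.
type ToolParams = Record<string, unknown>;

// Taken from the sample workDir in the Session API section.
const SESSIONS_ROOT = "/tmp/tpmjs-sandbox/sessions";

// Step 2: the session ID is agentId:conversationId.
function sessionIdFor(agentId: string, conversationId: string): string {
  return `${agentId}:${conversationId}`;
}

// Step 5 (first half): resolve the workspace for a session.
// Assumption: ":" is replaced with "_" in the directory name.
function workDirFor(sessionId: string): string {
  return `${SESSIONS_ROOT}/${sessionId.replace(/:/g, "_")}/workspace`;
}

// Step 5 (second half): inject _sandboxWorkDir without mutating the caller's params.
function injectWorkDir(params: ToolParams, workDir: string): ToolParams {
  return { ...params, _sandboxWorkDir: workDir };
}
```

A tool that receives the injected params can then treat _sandboxWorkDir as the base directory for every file operation.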

shellExec

Execute any shell command in the workspace. Supports git, npm, python, and any CLI tool available in the container.

Parameters

Parameter	Type	Required	Description
command	string	Yes	Shell command passed to sh -c
timeout	number	No	Timeout in milliseconds (default: 30000, max: 300000)

Example

// Agent calls shellExec
{
  "command": "git clone https://github.com/user/repo.git --depth 1",
  "timeout": 60000
}

Response:

{
  "stdout": "Cloning into 'repo'...\n",
  "stderr": "",
  "exitCode": 0,
  "durationMs": 2341,
  "truncated": false
}

Timeout behavior: On timeout, the server sends SIGTERM, waits 2 seconds, then sends SIGKILL. Exit code 137 (128 + 9, the signal number of SIGKILL) indicates that the command was force-killed after timing out.

Output truncation: stdout and stderr are each truncated at 100 KB.
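The timeout and truncation limits above could be enforced along these lines. This is a sketch under stated assumptions: the function names are hypothetical, 1 KB is taken to mean 1024 bytes, and whether the real server clamps an oversized timeout or rejects it is not stated on this page (clamping is assumed here).

```typescript
const DEFAULT_TIMEOUT_MS = 30000;  // documented default
const MAX_TIMEOUT_MS = 300000;     // documented maximum
const MAX_OUTPUT_BYTES = 100 * 1024; // assumes 1 KB = 1024 bytes

// Pick the effective timeout: fall back to the default, clamp to the maximum.
function resolveTimeout(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested) || requested <= 0) {
    return DEFAULT_TIMEOUT_MS;
  }
  return Math.min(requested, MAX_TIMEOUT_MS);
}

// Cut a captured stream down to the limit and report whether anything was dropped.
function truncateOutput(
  bytes: Uint8Array,
  limit: number = MAX_OUTPUT_BYTES,
): { bytes: Uint8Array; truncated: boolean } {
  if (bytes.length <= limit) {
    return { bytes, truncated: false };
  }
  return { bytes: bytes.subarray(0, limit), truncated: true };
}
```

Truncating on raw bytes keeps the limit exact; decoding to a string afterward may leave a partial character at the cut point.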

readFile

Read a file from the workspace. Paths are relative to the workspace root.

Parameters

Parameter	Type	Required	Description
path	string	Yes	File path relative to workspace root

Example

// Agent calls readFile
{
  "path": "repo/src/index.ts"
}

Response:

{
  "content": "export function hello() {\n  return 'world';\n}\n",
  "path": "repo/src/index.ts",
  "size": 48,
  "truncated": false
}

Content is truncated at 500 KB. Symlinks that escape the workspace are blocked.
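The workspace-relative path rule can be illustrated with a purely lexical check. This is a minimal sketch, not the server's implementation: resolveInWorkspace is a hypothetical name, rejecting absolute paths is an assumption, and symlink escapes (which the server is described as checking separately) are not covered because that requires touching the filesystem.

```typescript
// Lexical check only; it does not follow symlinks.
function resolveInWorkspace(workDir: string, relPath: string): string {
  // Assumption: absolute paths are rejected rather than re-rooted.
  if (relPath.startsWith("/")) {
    throw new Error(`Absolute paths are not allowed: ${relPath}`);
  }
  const parts: string[] = [];
  for (const segment of relPath.split("/")) {
    if (segment === "" || segment === ".") continue; // skip no-op segments
    if (segment === "..") {
      // Popping past the workspace root would escape it.
      if (parts.length === 0) {
        throw new Error(`Path escapes workspace: ${relPath}`);
      }
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  return parts.length === 0 ? workDir : `${workDir}/${parts.join("/")}`;
}
```

For example, "repo/../notes.txt" resolves inside the workspace, while "../etc/passwd" throws.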

writeFile

Create or overwrite a file. Parent directories are created automatically by default.

Parameters

Parameter	Type	Required	Description
path	string	Yes	File path relative to workspace root
content	string	Yes	Content to write
createDirs	boolean	No	Create parent directories if needed (default: true)

Example

// Agent calls writeFile
{
  "path": "src/main.py",
  "content": "import json\n\ndef transform(data):\n    return json.dumps(data)\n",
  "createDirs": true
}

Response:

{
  "success": true,
  "path": "src/main.py",
  "bytesWritten": 58
}

listFiles

List files and directories. Supports recursive listing with depth control.

Parameters

Parameter	Type	Required	Description
path	string	No	Directory path relative to workspace root (default: ".")
recursive	boolean	No	List recursively (default: false)
maxDepth	number	No	Max depth for recursive listing (default: 3)

Example

// Agent calls listFiles
{
  "path": ".",
  "recursive": true,
  "maxDepth": 2
}

Response:

{
  "path": ".",
  "entries": [
    { "name": "repo", "type": "directory", "size": null },
    { "name": "repo/README.md", "type": "file", "size": 1024 },
    { "name": "repo/src", "type": "directory", "size": null },
    { "name": "repo/src/index.ts", "type": "file", "size": 48 },
    { "name": "src", "type": "directory", "size": null },
    { "name": "src/main.py", "type": "file", "size": 58 }
  ],
  "truncated": false
}

Capped at 1,000 entries. Symlinks pointing outside the workspace are silently skipped.

Session Lifecycle

Session Created

When you send the first message to a sandbox agent, a session is automatically created. The session ID is agentId:conversationId. Each session gets its own workspace directory.

TTL Extended on Activity

Every tool call extends the session TTL (default: 24 hours). Session creation is idempotent — subsequent messages just extend the timeout and resume the workspace.

Files Persist Across Turns

All files in the workspace persist between tool calls and conversation turns. Git repos, generated files, build artifacts — everything stays until the session ends.

Auto-Expire or Manual Cleanup

Sessions expire after the TTL (24 hours of inactivity). Deleting a conversation immediately destroys the session. A cleanup sweep runs every 60 seconds.

Orphan cleanup: On server startup, any leftover workspace directories from a previous run are automatically removed. Graceful shutdown (SIGTERM/SIGINT) destroys all active sessions.

What happens when a session expires? After 24 hours of inactivity, the session and its workspace are deleted. If you send a message after that, a new session is created with a fresh, empty workspace — any files, git repos, or build artifacts from the previous session will be gone. The agent can still use all sandbox tools but starts from scratch.
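The lifecycle described above (idempotent create, TTL extension on activity, periodic sweep) can be modeled with a small in-memory store. This is a simplified sketch with hypothetical names; the real server also creates and deletes workspace directories, which is omitted here. Time is passed in explicitly so the behavior is easy to follow.

```typescript
interface Session {
  sessionId: string;
  expiresAt: number; // epoch milliseconds
}

class SessionStore {
  private sessions = new Map<string, Session>();
  private ttlMs: number;

  constructor(ttlMs: number = 86400 * 1000) { // documented default: 24 hours
    this.ttlMs = ttlMs;
  }

  // Create or resume a session; either way the expiry moves forward.
  touch(sessionId: string, now: number): { session: Session; resumed: boolean } {
    const existing = this.sessions.get(sessionId);
    if (existing && existing.expiresAt > now) {
      existing.expiresAt = now + this.ttlMs;
      return { session: existing, resumed: true };
    }
    // Missing or already expired: start fresh.
    const session: Session = { sessionId, expiresAt: now + this.ttlMs };
    this.sessions.set(sessionId, session);
    return { session, resumed: false };
  }

  // Remove expired sessions and return their IDs (the page describes a 60-second sweep).
  sweep(now: number): string[] {
    const expired: string[] = [];
    for (const [id, session] of this.sessions) {
      if (session.expiresAt <= now) {
        this.sessions.delete(id);
        expired.push(id);
      }
    }
    return expired;
  }
}
```

A server built this way would call touch on every tool call and run sweep from a timer, deleting the workspace directory for each ID that sweep returns.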

Coming soon: Workspace snapshots to object storage. Expired sessions will be automatically archived and restored on resume, giving true persistence across session expirations and server restarts.

Session API

These endpoints are called automatically by the TPMJS web app. You only need them if building a custom integration.

POST /sessions — Create or Resume

curl -X POST https://your-sandbox.up.railway.app/sessions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $EXECUTOR_API_KEY" \
  -d '{"sessionId": "agent123:conv456", "timeoutSeconds": 86400}'

Response:

{
  "sessionId": "agent123:conv456",
  "workDir": "/tmp/tpmjs-sandbox/sessions/agent123_conv456/workspace",
  "createdAt": "2026-02-16T10:00:00.000Z",
  "expiresAt": "2026-02-17T10:00:00.000Z",
  "resumed": false
}

Idempotent: if the session already exists, returns it with "resumed": true and extends the TTL.

GET /sessions/:id — Get Status

curl https://your-sandbox.up.railway.app/sessions/agent123:conv456 \
  -H "Authorization: Bearer $EXECUTOR_API_KEY"

Response:

{
  "sessionId": "agent123:conv456",
  "workDir": "/tmp/tpmjs-sandbox/sessions/agent123_conv456/workspace",
  "createdAt": "2026-02-16T10:00:00.000Z",
  "expiresAt": "2026-02-17T10:00:00.000Z",
  "toolCallCount": 7
}

DELETE /sessions/:id — Destroy

curl -X DELETE https://your-sandbox.up.railway.app/sessions/agent123:conv456 \
  -H "Authorization: Bearer $EXECUTOR_API_KEY"

Immediately removes the session and deletes the workspace directory.

Sandbox Configuration

Enable the sandbox by toggling the sandboxEnabled switch on your agent. The sandbox is independent of the executor setting, so you can use it with either the default executor or a custom one.

// Enable sandbox on an agent via the API:
PATCH /api/agents/:id
{
  "sandboxEnabled": true
}

// Sandbox URL and API key are configured via environment variables:
// AGENT_SANDBOX_URL=https://your-sandbox.up.railway.app
// AGENT_SANDBOX_API_KEY=your-secret-key

Config Cascade

Executor configuration follows a cascade — agent-level overrides collection-level, which overrides the system default:

// Resolution order (highest to lowest priority):
Agent.executorConfig     →  applies to ALL tools
Collection.executorConfig →  applies to tools in that collection
System default           →  stateless package-executor
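The resolution order can be expressed as a simple fallback chain. This is a sketch only: the ExecutorConfig shape is an assumption (the page mentions executorConfig.url but does not document the full object), and the "default" type is a stand-in for the stateless package-executor.

```typescript
// Assumed shape; only `url` is mentioned in this documentation.
interface ExecutorConfig {
  type: string;
  url?: string;
}

// Stand-in for the stateless package-executor.
const SYSTEM_DEFAULT: ExecutorConfig = { type: "default" };

// Agent-level config wins, then the tool's collection, then the system default.
function resolveExecutorConfig(
  agentConfig?: ExecutorConfig | null,
  collectionConfig?: ExecutorConfig | null,
): ExecutorConfig {
  return agentConfig ?? collectionConfig ?? SYSTEM_DEFAULT;
}
```

Because agent-level config applies to all tools, a per-tool lookup only needs the agent's config and the config of the collection that tool belongs to.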

Aspect	TPMJS Default Sandbox	Self-hosted
Setup	Zero config — just toggle "Enable Sandbox"	Deploy template to Railway/Docker
URL	Leave URL field empty	Enter your sandbox URL
Data Privacy	Shared infrastructure	Your infrastructure, your data
Customization	Standard limits	Custom TTL, disk quotas, session limits

Environment Variables

Sandbox Server

Variable	Default	Description
EXECUTOR_API_KEY	—	Bearer token for authentication. If unset, auth is disabled.
MAX_CONCURRENT_SESSIONS	50	Maximum number of active sessions
DEFAULT_SESSION_TTL_SECONDS	86400	Session timeout (24 hours)
SESSION_DISK_QUOTA_MB	100	Per-session disk quota in MB
PORT	3002	Server port
GITHUB_TOKEN	—	GitHub PAT for cloning private repos via credential helper

Web App

Variable	Default	Description
AGENT_SANDBOX_URL	—	URL of the sandbox server. Can be overridden per-agent via executorConfig.url.

Agent Environment Variables

Agents and collections can define environment variables that are injected into each tool call. Agent-level env vars override collection-level ones:

// Collection envVars (base)
{ "API_KEY": "coll-key", "REGION": "us-east-1" }

// Agent envVars (overrides)
{ "API_KEY": "agent-key", "DEBUG": "true" }

// Merged result
{ "API_KEY": "agent-key", "REGION": "us-east-1", "DEBUG": "true" }

Env vars are set before each tool execution and cleaned up afterward. They are not persisted across calls.
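The merge and cleanup behavior might look like the following. This is a minimal sketch with hypothetical helper names; it operates on a plain object rather than the real process environment so the effect is easy to see.

```typescript
type EnvVars = Record<string, string>;

// Agent-level values override collection-level ones with the same key.
function mergeEnv(collectionEnv: EnvVars, agentEnv: EnvVars): EnvVars {
  return { ...collectionEnv, ...agentEnv };
}

// Apply vars to `target` and return a function that restores the previous state.
function applyEnv(
  target: Record<string, string | undefined>,
  vars: EnvVars,
): () => void {
  const previous = new Map<string, string | undefined>();
  for (const [key, value] of Object.entries(vars)) {
    previous.set(key, target[key]);
    target[key] = value;
  }
  return () => {
    for (const [key, oldValue] of previous) {
      if (oldValue === undefined) {
        delete target[key]; // the key did not exist before the call
      } else {
        target[key] = oldValue;
      }
    }
  };
}
```

Calling the returned restore function in a finally block after the tool finishes ensures cleanup happens even when the tool throws.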

Git Integration

The sandbox container includes git and openssh-client. Agents can clone repos, create branches, make commits, and push changes.

Public Repos

No configuration needed. Agents can clone any public repo:

shellExec({ command: "git clone https://github.com/user/repo.git --depth 1" })

Private Repos

Set the GITHUB_TOKEN environment variable on the sandbox server. A built-in credential helper reads the token from the server environment and provides it to git as an x-access-token credential, so no .gitconfig or SSH key setup is needed.
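For readers unfamiliar with git credential helpers, the core of one can be sketched as a pure function. Git invokes a helper with an operation name (get, store, or erase), writes key=value attributes to its stdin, and reads key=value lines from its stdout. The code below is an illustration, not the sandbox's built-in helper: the function names are hypothetical, and limiting the token to https requests for github.com is an assumption.

```typescript
// Parse the key=value lines git writes to a credential helper's stdin.
function parseAttributes(input: string): Record<string, string> {
  const attrs: Record<string, string> = {};
  for (const line of input.split("\n")) {
    const eq = line.indexOf("=");
    if (eq > 0) {
      attrs[line.slice(0, eq)] = line.slice(eq + 1);
    }
  }
  return attrs;
}

// Produce the helper's stdout for one invocation; an empty string means "no credential".
function credentialResponse(
  operation: string,
  input: string,
  token: string | undefined,
): string {
  if (operation !== "get" || !token) return "";
  const attrs = parseAttributes(input);
  // Assumption: only answer for GitHub over https.
  if (attrs["protocol"] !== "https" || attrs["host"] !== "github.com") return "";
  return `username=x-access-token\npassword=${token}\n`;
}
```

Returning nothing for other hosts lets git fall through to its normal credential handling instead of sending the token where it does not belong.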

Deploy to Railway

The fastest way to get a self-hosted sandbox running.

# Clone the template
git clone https://github.com/tpmjs/tpmjs.git
cd tpmjs/templates/agent-sandbox

# Deploy to Railway
npm install -g @railway/cli
railway login
railway init
railway up

# Set env vars in Railway dashboard:
#   EXECUTOR_API_KEY=your-secret
#   GITHUB_TOKEN=ghp_... (optional, for private repos)

If EXECUTOR_API_KEY is unset, authentication is disabled, so set it in the Railway dashboard before exposing your instance.

Deploy with Docker

# Build
docker build -t tpmjs-sandbox .

# Run
docker run -p 3002:3002 \
  -e EXECUTOR_API_KEY=your-secret-key \
  -e GITHUB_TOKEN=ghp_optional \
  tpmjs-sandbox

The Docker image is based on denoland/deno:2.1.9 and includes curl, git, and openssh-client. It runs as a non-root user with write access restricted to /tmp.

Connect to TPMJS

Point the web app at your sandbox server:

# apps/web/.env.local
AGENT_SANDBOX_URL=https://your-sandbox.up.railway.app

Or for local development:

# Run locally (no auth)
cd templates/agent-sandbox
deno run --allow-net --allow-env --allow-read --allow-write=/tmp --allow-run server.ts

# Quick test
curl http://localhost:3002/health | jq .

Security Model

🔐

Authentication

All endpoints except /health and /metrics require a Bearer token when EXECUTOR_API_KEY is set. Disable auth by leaving it unset (dev only).

📁

Path Sandboxing

All file paths are normalized, resolved, and validated to stay within the workspace. Symlinks are checked via realPath(). Escape attempts throw an error.

👤

Process Isolation

Runs as a non-root deno user. Write access is restricted to /tmp. Each session has its own workspace directory. Env vars are cleaned up after each call.

⏱️

Timeout Protection

Shell commands have configurable timeouts (default 30s, max 5min). On timeout, the process receives SIGTERM, then SIGKILL 2 seconds later. Executor calls also have an HTTP-level timeout of 5 minutes.
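The authentication rule in this section reduces to a small decision function. This is a sketch with a hypothetical name, assuming the token arrives in a standard Authorization header; a production server would likely use a constant-time comparison rather than plain string equality.

```typescript
// Endpoints documented as not requiring authentication.
const PUBLIC_PATHS = new Set<string>(["/health", "/metrics"]);

function isAuthorized(
  path: string,
  authorizationHeader: string | undefined,
  apiKey: string | undefined, // value of EXECUTOR_API_KEY, if set
): boolean {
  if (!apiKey) return true;                // auth is disabled when the key is unset
  if (PUBLIC_PATHS.has(path)) return true; // public endpoints skip the check
  return authorizationHeader === `Bearer ${apiKey}`;
}
```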

Limits & Quotas

Resource	Limit	Configurable
Disk per session	100 MB	SESSION_DISK_QUOTA_MB
Concurrent sessions	50	MAX_CONCURRENT_SESSIONS
Session TTL	24 hours	DEFAULT_SESSION_TTL_SECONDS
Shell command timeout	30s (max 5min)	Per tool call
stdout/stderr	100 KB each	No
File read size	500 KB	No
Directory listing	1,000 entries	No
Module cache	200 entries, 2min TTL	No

Execute Tool API

The core endpoint for running tools within a session.

curl -X POST https://your-sandbox.up.railway.app/execute-tool \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $EXECUTOR_API_KEY" \
  -d '{
    "packageName": "@tpmjs/official-sandbox-shell",
    "name": "shellExec",
    "version": "0.1.0",
    "params": { "command": "ls -la" },
    "sessionId": "agent123:conv456",
    "env": { "MY_VAR": "value" }
  }'

Response:

{
  "success": true,
  "output": {
    "stdout": "total 8\ndrwxr-xr-x 3 deno deno 4096 ...\n",
    "stderr": "",
    "exitCode": 0,
    "durationMs": 12,
    "truncated": false
  },
  "executionTimeMs": 45
}

Request Fields

Parameter	Type	Required	Description
packageName	string	Yes	npm package name (e.g. @tpmjs/official-sandbox-shell)
name	string	Yes	Named export from the package (e.g. shellExec)
version	string	No	Package version (default: latest)
params	object	No	Parameters passed to tool.execute()
sessionId	string	No	Session ID for workspace resolution
env	object	No	Environment variables set for this execution only
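For a custom integration, the request fields above can be assembled with a small builder. This is a hedged sketch: buildExecuteToolRequest and the HttpRequest shape are hypothetical names, and the only validation performed is the two required fields from the table.

```typescript
interface ExecuteToolRequest {
  packageName: string;
  name: string;
  version?: string;
  params?: Record<string, unknown>;
  sessionId?: string;
  env?: Record<string, string>;
}

interface HttpRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
}

function buildExecuteToolRequest(
  baseUrl: string,
  apiKey: string | undefined,
  request: ExecuteToolRequest,
): HttpRequest {
  if (!request.packageName || !request.name) {
    throw new Error("packageName and name are required");
  }
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/execute-tool`, // tolerate a trailing slash
    method: "POST",
    headers,
    body: JSON.stringify(request),
  };
}
```

The returned object maps directly onto fetch(url, { method, headers, body }).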

Health & Metrics

GET /health

No authentication required. Use for uptime monitoring and capability detection.

{
  "status": "ok",
  "version": "1.0.0",
  "info": {
    "runtime": "deno",
    "denoVersion": "2.1.9",
    "activeSessions": 3,
    "maxSessions": 50,
    "cachedModules": 12,
    "capabilities": {
      "sessions": true,
      "executeToolWithSession": true
    }
  }
}
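Capability detection against this response could be done as follows. This is a sketch with hypothetical function names; the interface covers only the fields shown in the sample above and treats all of them as optional in case an older server omits them.

```typescript
interface HealthResponse {
  status: string;
  info?: {
    activeSessions?: number;
    maxSessions?: number;
    capabilities?: {
      sessions?: boolean;
      executeToolWithSession?: boolean;
    };
  };
}

// True only when the server is healthy and advertises both session capabilities.
function supportsSessions(health: HealthResponse): boolean {
  const caps = health.info?.capabilities;
  return (
    health.status === "ok" &&
    caps?.sessions === true &&
    caps?.executeToolWithSession === true
  );
}

// True when the server reports room for at least one more session.
function hasSessionCapacity(health: HealthResponse): boolean {
  const active = health.info?.activeSessions;
  const max = health.info?.maxSessions;
  return typeof active === "number" && typeof max === "number" && active < max;
}
```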

GET /metrics

No authentication required. Operational metrics for dashboards and alerting.

{
  "uptime": 3600000,
  "sessions": {
    "active": 3,
    "max": 50,
    "totalCreated": 150,
    "totalDestroyed": 147
  },
  "executions": {
    "total": 1024,
    "successful": 1010,
    "failed": 14
  },
  "memory": {
    "rss": 67108864,
    "heapUsed": 45000000,
    "heapTotal": 60000000
  },
  "cache": {
    "modules": 12,
    "maxSize": 200
  }
}
