Agent Sandbox
Persistent filesystem sessions for TPMJS agents. Clone repos, edit files, run commands — all state preserved across tool calls within a conversation.
Introduction
The Agent Sandbox is a Deno-based execution server that gives each agent conversation its own persistent workspace directory. Unlike the default stateless executor where each tool call is independent, the sandbox preserves files across tool calls so agents can perform multi-step workflows.
A file written in the first turn of a conversation is still there in the tenth turn. A git repo cloned in one step can be modified and committed in the next. This is what makes agentic coding workflows possible.
User Message
│
▼
┌─────────────────────────────────────┐
│ TPMJS Web App (Next.js) │
│ │
│ 1. Create/resume sandbox session │
│ 2. Stream AI response │
│ 3. Route tool calls to sandbox │
│ 4. Persist messages to DB │
└──────────────┬──────────────────────┘
│ POST /execute-tool
│ { sessionId, params }
▼
┌─────────────────────────────────────┐
│ Agent Sandbox Server (Deno) │
│ │
│ Session: agent123:conv456 │
│ ┌───────────────────────────────┐ │
│ │ /workspace/ │ │
│ │ ├── my-repo/ │ │
│ │ │ ├── .git/ │ │
│ │ │ ├── src/ │ │
│ │ │ └── README.md │ │
│ │ └── notes.txt │ │
│ └───────────────────────────────┘ │
│ │
│ Inject _sandboxWorkDir → execute │
└─────────────────────────────────────┘
When to Use the Sandbox
Multi-step Workflows
Generate files in one step, process them in the next. Build pipelines that span multiple tool calls.
Git Operations
Clone repositories, make changes, commit, and push. The full git workflow within a conversation.
File-based Processing
Tools that read and write files need shared state. CSV analysis, log parsing, code generation — all work naturally.
Code Generation & Testing
Generate code in one step, run tests or build it in another. Iterate based on results.
Don't need statefulness? Use the default executor. It's simpler, faster, and doesn't require a separate server. The sandbox is specifically for workflows where files must persist between tool calls.
How It Works
Here's a multi-turn conversation showing how state persists:
// Turn 1: User says "Clone the repo and show me the README"
// Agent calls:
shellExec({ command: "git clone https://github.com/user/repo.git --depth 1" })
readFile({ path: "repo/README.md" })
// Turn 2: User says "Add a license file"
// Agent calls (files from turn 1 still exist):
writeFile({ path: "repo/LICENSE", content: "MIT License..." })
shellExec({ command: "cd repo && git add LICENSE && git commit -m 'Add license'" })
// Turn 3: User says "Show me the git log"
// Agent calls (all prior changes persist):
shellExec({ command: "cd repo && git log --oneline" })
End-to-End Flow
1. User sends a message to an agent with sandbox enabled
2. The conversation route creates or resumes a sandbox session (ID: agentId:conversationId)
3. The AI model decides which tools to call based on the user message
4. Tool calls are routed to the sandbox server with the session ID
5. The sandbox server resolves the session workspace and injects _sandboxWorkDir into params
6. The tool executes in the workspace directory — files persist on disk
7. Results flow back to the AI model, which may call more tools or respond with text
8. All messages, tool calls, and results are persisted to the database
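The routing step above can be sketched from the web app's side. This is a minimal illustration, not the actual TPMJS implementation; the `buildExecuteToolRequest` helper is hypothetical, but the payload shape follows the Execute Tool API documented later on this page.

```typescript
// Hypothetical sketch of how a client might assemble an /execute-tool request.
// The sessionId convention (agentId:conversationId) comes from the docs above.

interface ExecuteToolRequest {
  packageName: string;
  name: string;
  params: Record<string, unknown>;
  sessionId: string;
  env?: Record<string, string>;
}

function buildExecuteToolRequest(
  agentId: string,
  conversationId: string,
  packageName: string,
  toolName: string,
  params: Record<string, unknown>,
): ExecuteToolRequest {
  return {
    packageName,
    name: toolName,
    params,
    // Session ID convention: agentId:conversationId
    sessionId: `${agentId}:${conversationId}`,
  };
}

const req = buildExecuteToolRequest(
  "agent123",
  "conv456",
  "@tpmjs/official-sandbox-shell",
  "shellExec",
  { command: "ls -la" },
);
// req.sessionId === "agent123:conv456"
```

The web app would POST this body to the sandbox server's /execute-tool endpoint with the Bearer token attached.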
shellExec
Execute any shell command in the workspace. Supports git, npm, python, and any CLI tool available in the container.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| command | string | Yes | Shell command passed to sh -c |
| timeout | number | No | Timeout in milliseconds (default: 30000, max: 300000) |
Example
// Agent calls shellExec
{
"command": "git clone https://github.com/user/repo.git --depth 1",
"timeout": 60000
}
Response:
{
"stdout": "Cloning into 'repo'...\n",
"stderr": "",
"exitCode": 0,
"durationMs": 2341,
"truncated": false
}
Timeout behavior: On timeout, the server sends SIGTERM, waits 2 seconds, then sends SIGKILL. Exit code 137 indicates a timeout kill.
Output truncation: stdout and stderr are each truncated at 100 KB.
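The truncation and timeout rules above can be expressed as two small helpers. This is a hypothetical sketch of the documented behavior (100 KB per stream, exit code 137 on timeout kill), not the sandbox's actual source:

```typescript
// Sketch of the output-truncation and timeout-detection rules described above.

const MAX_STREAM_BYTES = 100 * 1024; // 100 KB cap per stream (stdout/stderr)

function truncateStream(output: string): { text: string; truncated: boolean } {
  const bytes = new TextEncoder().encode(output);
  if (bytes.length <= MAX_STREAM_BYTES) {
    return { text: output, truncated: false };
  }
  // Keep only the first 100 KB and flag the result as truncated.
  const sliced = bytes.slice(0, MAX_STREAM_BYTES);
  return { text: new TextDecoder().decode(sliced), truncated: true };
}

// Exit code 137 = 128 + SIGKILL(9): the process was killed after the timeout.
function wasKilledByTimeout(exitCode: number): boolean {
  return exitCode === 137;
}
```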
readFile
Read a file from the workspace. Paths are relative to the workspace root.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | Yes | File path relative to workspace root |
Example
// Agent calls readFile
{
"path": "repo/src/index.ts"
}
Response:
{
"content": "export function hello() {\n return 'world';\n}\n",
"path": "repo/src/index.ts",
"size": 48,
"truncated": false
}
writeFile
Create or overwrite a file. Parent directories are created automatically by default.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | Yes | File path relative to workspace root |
| content | string | Yes | Content to write |
| createDirs | boolean | No | Create parent directories if needed (default: true) |
Example
// Agent calls writeFile
{
"path": "src/main.py",
"content": "import json\n\ndef transform(data):\n return json.dumps(data)\n",
"createDirs": true
}
Response:
{
"success": true,
"path": "src/main.py",
"bytesWritten": 58
}
listFiles
List files and directories. Supports recursive listing with depth control.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | No | Directory path relative to workspace root (default: ".") |
| recursive | boolean | No | List recursively (default: false) |
| maxDepth | number | No | Max depth for recursive listing (default: 3) |
Example
// Agent calls listFiles
{
"path": ".",
"recursive": true,
"maxDepth": 2
}
Response:
{
"path": ".",
"entries": [
{ "name": "repo", "type": "directory", "size": null },
{ "name": "repo/README.md", "type": "file", "size": 1024 },
{ "name": "repo/src", "type": "directory", "size": null },
{ "name": "repo/src/index.ts", "type": "file", "size": 48 },
{ "name": "src", "type": "directory", "size": null },
{ "name": "src/main.py", "type": "file", "size": 58 }
],
"truncated": false
}
Session Lifecycle
Session Created
When you send the first message to a sandbox agent, a session is automatically created. The session ID is agentId:conversationId. Each session gets its own workspace directory.
TTL Extended on Activity
Every tool call extends the session TTL (default: 24 hours). Session creation is idempotent — subsequent messages just extend the timeout and resume the workspace.
Files Persist Across Turns
All files in the workspace persist between tool calls and conversation turns. Git repos, generated files, build artifacts — everything stays until the session ends.
Auto-Expire or Manual Cleanup
Sessions expire after the TTL (24 hours of inactivity). Deleting a conversation immediately destroys the session. A cleanup sweep runs every 60 seconds.
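The lifecycle rules above (idempotent create-or-resume, TTL extension on activity, periodic expiry sweep) can be sketched as a small in-memory session table. This is a hypothetical illustration, not the sandbox's actual implementation:

```typescript
// Sketch of the session lifecycle described above.

const DEFAULT_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

interface Session {
  sessionId: string;
  createdAt: number;
  expiresAt: number;
}

const sessions = new Map<string, Session>();

function createOrResume(
  sessionId: string,
  now: number,
  ttlMs = DEFAULT_TTL_MS,
): { session: Session; resumed: boolean } {
  const existing = sessions.get(sessionId);
  if (existing) {
    // Idempotent: resuming just extends the TTL.
    existing.expiresAt = now + ttlMs;
    return { session: existing, resumed: true };
  }
  const session = { sessionId, createdAt: now, expiresAt: now + ttlMs };
  sessions.set(sessionId, session);
  return { session, resumed: false };
}

// Modeled on the periodic cleanup sweep (every 60 s in the real server).
function sweepExpired(now: number): string[] {
  const removed: string[] = [];
  for (const [id, s] of sessions) {
    if (s.expiresAt <= now) {
      sessions.delete(id);
      removed.push(id);
    }
  }
  return removed;
}
```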
Session API
These endpoints are called automatically by the TPMJS web app. You only need them if building a custom integration.
POST /sessions — Create or Resume
curl -X POST https://your-sandbox.up.railway.app/sessions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $EXECUTOR_API_KEY" \
-d '{"sessionId": "agent123:conv456", "timeoutSeconds": 86400}'
Response:
{
"sessionId": "agent123:conv456",
"workDir": "/tmp/tpmjs-sandbox/sessions/agent123_conv456/workspace",
"createdAt": "2026-02-16T10:00:00.000Z",
"expiresAt": "2026-02-17T10:00:00.000Z",
"resumed": false
}
Idempotent: if the session already exists, it is returned with "resumed": true and its TTL is extended.
GET /sessions/:id — Get Status
curl https://your-sandbox.up.railway.app/sessions/agent123:conv456 \
-H "Authorization: Bearer $EXECUTOR_API_KEY"
Response:
{
"sessionId": "agent123:conv456",
"workDir": "/tmp/tpmjs-sandbox/sessions/agent123_conv456/workspace",
"createdAt": "2026-02-16T10:00:00.000Z",
"expiresAt": "2026-02-17T10:00:00.000Z",
"toolCallCount": 7
}
DELETE /sessions/:id — Destroy
curl -X DELETE https://your-sandbox.up.railway.app/sessions/agent123:conv456 \
-H "Authorization: Bearer $EXECUTOR_API_KEY"
Immediately removes the session and deletes the workspace directory.
Sandbox Configuration
Enable sandbox by toggling the sandboxEnabled switch on your agent. Sandbox is independent of the executor setting — you can use sandbox with either the default executor or a custom one.
// Enable sandbox on an agent via the API:
PATCH /api/agents/:id
{
"sandboxEnabled": true
}
// Sandbox URL and API key are configured via environment variables:
// AGENT_SANDBOX_URL=https://your-sandbox.up.railway.app
// AGENT_SANDBOX_API_KEY=your-secret-key
Config Cascade
Executor configuration follows a cascade — agent-level overrides collection-level, which overrides the system default:
// Resolution order (highest to lowest priority):
Agent.executorConfig → applies to ALL tools
Collection.executorConfig → applies to tools in that collection
System default → stateless package-executor
| | TPMJS Default Sandbox | Self-hosted |
|---|---|---|
| Setup | Zero config — just toggle "Enable Sandbox" | Deploy template to Railway/Docker |
| URL | Leave URL field empty | Enter your sandbox URL |
| Data Privacy | Shared infrastructure | Your infrastructure, your data |
| Customization | Standard limits | Custom TTL, disk quotas, session limits |
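The resolution order described above (agent overrides collection, which overrides the system default) amounts to a null-coalescing chain. A minimal sketch, with hypothetical types:

```typescript
// Sketch of the executor-config cascade: agent > collection > system default.

interface ExecutorConfig {
  url?: string;
}

// Stands in for the stateless package-executor default.
const SYSTEM_DEFAULT: ExecutorConfig = {};

function resolveExecutorConfig(
  agentConfig: ExecutorConfig | null,
  collectionConfig: ExecutorConfig | null,
): ExecutorConfig {
  // Highest priority wins; fall through to the system default.
  return agentConfig ?? collectionConfig ?? SYSTEM_DEFAULT;
}
```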
Environment Variables
Sandbox Server
| Variable | Default | Description |
|---|---|---|
| EXECUTOR_API_KEY | — | Bearer token for authentication. If unset, auth is disabled. |
| MAX_CONCURRENT_SESSIONS | 50 | Maximum number of active sessions |
| DEFAULT_SESSION_TTL_SECONDS | 86400 | Session timeout (24 hours) |
| SESSION_DISK_QUOTA_MB | 100 | Per-session disk quota in MB |
| PORT | 3002 | Server port |
| GITHUB_TOKEN | — | GitHub PAT for cloning private repos via credential helper |
Web App
| Variable | Default | Description |
|---|---|---|
| AGENT_SANDBOX_URL | — | URL of the sandbox server. Can be overridden per-agent via executorConfig.url. |
Agent Environment Variables
Agents and collections can define environment variables that are injected into each tool call. Agent-level env vars override collection-level ones:
// Collection envVars (base)
{ "API_KEY": "coll-key", "REGION": "us-east-1" }
// Agent envVars (overrides)
{ "API_KEY": "agent-key", "DEBUG": "true" }
// Merged result
{ "API_KEY": "agent-key", "REGION": "us-east-1", "DEBUG": "true" }
Env vars are set before each tool execution and cleaned up afterward. They are not persisted across calls.
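The merge rule above (agent-level vars shadow collection-level ones) is exactly what an object spread gives you. A minimal sketch, with a hypothetical helper name:

```typescript
// Sketch of the env-var merge: agent values override collection values.

function mergeEnvVars(
  collectionEnv: Record<string, string>,
  agentEnv: Record<string, string>,
): Record<string, string> {
  // The later spread wins, so agent values shadow collection values.
  return { ...collectionEnv, ...agentEnv };
}

const merged = mergeEnvVars(
  { API_KEY: "coll-key", REGION: "us-east-1" },
  { API_KEY: "agent-key", DEBUG: "true" },
);
// merged: { API_KEY: "agent-key", REGION: "us-east-1", DEBUG: "true" }
```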
Git Integration
The sandbox container includes git and openssh-client. Agents can clone repos, create branches, make commits, and push changes.
Public Repos
No configuration needed. Agents can clone any public repo:
shellExec({ command: "git clone https://github.com/user/repo.git --depth 1" })
Private Repos
Set the GITHUB_TOKEN environment variable on the sandbox server. A built-in credential helper reads GITHUB_TOKEN from the server environment and provides it as an x-access-token credential to git. No .gitconfig or SSH key setup is needed.
Deploy to Railway
The fastest way to get a self-hosted sandbox running.
# Clone the template
git clone https://github.com/tpmjs/tpmjs.git
cd tpmjs/templates/agent-sandbox
# Deploy to Railway
npm install -g @railway/cli
railway login
railway init
railway up
# Set env vars in Railway dashboard:
# EXECUTOR_API_KEY=your-secret
# GITHUB_TOKEN=ghp_... (optional, for private repos)
Set EXECUTOR_API_KEY in the Railway dashboard to secure your instance.
Deploy with Docker
# Build
docker build -t tpmjs-sandbox .
# Run
docker run -p 3002:3002 \
-e EXECUTOR_API_KEY=your-secret-key \
-e GITHUB_TOKEN=ghp_optional \
tpmjs-sandbox
The Docker image is based on denoland/deno:2.1.9 and includes curl, git, and openssh-client. It runs as a non-root user with write access restricted to /tmp.
Connect to TPMJS
Point the web app at your sandbox server:
# apps/web/.env.local
AGENT_SANDBOX_URL=https://your-sandbox.up.railway.app
Or for local development:
# Run locally (no auth)
cd templates/agent-sandbox
deno run --allow-net --allow-env --allow-read --allow-write=/tmp --allow-run server.ts
# Quick test
curl http://localhost:3002/health | jq .
Security Model
Authentication
All endpoints except /health and /metrics require a Bearer token when EXECUTOR_API_KEY is set. Disable auth by leaving it unset (dev only).
Path Sandboxing
All file paths are normalized, resolved, and validated to stay within the workspace. Symlinks are checked via realPath(). Escape attempts throw an error.
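The core validation rule can be sketched as follows. This is a hypothetical helper illustrating the normalize-resolve-check pattern; the real server additionally follows symlinks via realPath(), which this sketch omits:

```typescript
// Sketch of path sandboxing: resolve the requested path against the
// workspace root and reject anything that escapes it.

import { resolve, sep } from "node:path";

function resolveInWorkspace(workspaceRoot: string, requestedPath: string): string {
  const root = resolve(workspaceRoot);
  // resolve() normalizes "..", ".", and redundant separators.
  const target = resolve(root, requestedPath);
  // The target must be the root itself or live strictly under root + separator,
  // which also blocks prefix tricks like /workspace-evil vs /workspace.
  if (target !== root && !target.startsWith(root + sep)) {
    throw new Error(`Path escapes workspace: ${requestedPath}`);
  }
  return target;
}
```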
Process Isolation
Runs as a non-root deno user. Write access is restricted to /tmp. Each session has its own workspace directory. Env vars are cleaned up after each call.
Timeout Protection
Shell commands have configurable timeouts (default 30s, max 5min). SIGTERM then SIGKILL. HTTP-level timeout of 5 minutes on executor calls.
Limits & Quotas
| Resource | Limit | Configurable |
|---|---|---|
| Disk per session | 100 MB | SESSION_DISK_QUOTA_MB |
| Concurrent sessions | 50 | MAX_CONCURRENT_SESSIONS |
| Session TTL | 24 hours | DEFAULT_SESSION_TTL_SECONDS |
| Shell command timeout | 30s (max 5min) | Per tool call |
| stdout/stderr | 100 KB each | No |
| File read size | 500 KB | No |
| Directory listing | 1,000 entries | No |
| Module cache | 200 entries, 2min TTL | No |
Execute Tool API
The core endpoint for running tools within a session.
curl -X POST https://your-sandbox.up.railway.app/execute-tool \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $EXECUTOR_API_KEY" \
-d '{
"packageName": "@tpmjs/official-sandbox-shell",
"name": "shellExec",
"version": "0.1.0",
"params": { "command": "ls -la" },
"sessionId": "agent123:conv456",
"env": { "MY_VAR": "value" }
}'
Response:
{
"success": true,
"output": {
"stdout": "total 8\ndrwxr-xr-x 3 deno deno 4096 ...\n",
"stderr": "",
"exitCode": 0,
"durationMs": 12,
"truncated": false
},
"executionTimeMs": 45
}
Request Fields
| Parameter | Type | Required | Description |
|---|---|---|---|
| packageName | string | Yes | npm package name (e.g. @tpmjs/official-sandbox-shell) |
| name | string | Yes | Named export from the package (e.g. shellExec) |
| version | string | No | Package version (default: latest) |
| params | object | No | Parameters passed to tool.execute() |
| sessionId | string | No | Session ID for workspace resolution |
| env | object | No | Environment variables set for this execution only |
Health & Metrics
GET /health
No authentication required. Use for uptime monitoring and capability detection.
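For capability detection, a client can inspect the capabilities object in the health payload. A minimal sketch, assuming the field names in the example response below:

```typescript
// Sketch of client-side capability detection against a /health payload.

interface HealthResponse {
  status: string;
  info?: {
    capabilities?: { sessions?: boolean; executeToolWithSession?: boolean };
  };
}

function supportsSessions(health: HealthResponse): boolean {
  // Optional chaining guards against older servers that omit the fields.
  return health.status === "ok" &&
    health.info?.capabilities?.sessions === true;
}
```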
{
"status": "ok",
"version": "1.0.0",
"info": {
"runtime": "deno",
"denoVersion": "2.1.9",
"activeSessions": 3,
"maxSessions": 50,
"cachedModules": 12,
"capabilities": {
"sessions": true,
"executeToolWithSession": true
}
}
}
GET /metrics
No authentication required. Operational metrics for dashboards and alerting.
{
"uptime": 3600000,
"sessions": {
"active": 3,
"max": 50,
"totalCreated": 150,
"totalDestroyed": 147
},
"executions": {
"total": 1024,
"successful": 1010,
"failed": 14
},
"memory": {
"rss": 67108864,
"heapUsed": 45000000,
"heapTotal": 60000000
},
"cache": {
"modules": 12,
"maxSize": 200
}
}
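For alerting, one derived signal is the execution failure rate. A minimal sketch, with a hypothetical helper whose field names follow the example response above:

```typescript
// Sketch of deriving a failure rate from the /metrics executions block.

interface ExecutionMetrics {
  total: number;
  successful: number;
  failed: number;
}

function failureRate(executions: ExecutionMetrics): number {
  if (executions.total === 0) return 0; // avoid division by zero on a fresh server
  return executions.failed / executions.total;
}

// With the example above: 14 / 1024 ≈ 0.0137 (about 1.4%).
```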