Scenarios are AI-generated test cases for your tool collections. They automatically verify that your tools work correctly and provide quality metrics over time.
Automated testing for tool collections
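A Scenario is a test case that exercises your tool collection with a realistic user prompt. When you run a scenario, an agent executes the prompt using your collection's tools, and the run is recorded with a pass or fail verdict and a reason.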
This enables continuous testing of your tool collections, similar to unit tests for code.
Typical scenarios for different tool types
Test data extraction from various webpage structures:
"Scrape the main heading and first paragraph from https://example.com"
"Extract all links from a news article page"
"Get the price and description from an e-commerce product page"
Verify search accuracy and relevance:
"Search for recent news about climate change and summarize the top 3 results"
"Find documentation for the React useState hook"
"Search for restaurants near Times Square, New York"
Test transformations and calculations:
"Convert 100 USD to EUR using current exchange rates"
"Parse this JSON and extract the user email addresses"
"Calculate the average of these numbers: 10, 20, 30, 40, 50"
Verify file operations and content extraction:
"Generate a PDF report with the title 'Monthly Summary'"
"Extract text content from a markdown document"
"Create a CSV file with sample user data"
Install the CLI and authenticate
npm install -g @tpmjs/cli
You need a TPMJS API key to run scenarios. Get one from your dashboard settings.
tpm auth login
tpm scenario generate
AI-generate test scenarios based on your collection's tools. The generator analyzes your tools and creates realistic prompts.
# Generate 1 scenario (default)
tpm scenario generate my-collection
# Generate multiple scenarios
tpm scenario generate my-collection --count 5
# Skip similarity check (allow duplicates)
tpm scenario generate my-collection --count 3 --skip-similarity
Generating 3 scenarios for "Web Scraping Toolkit"...
✓ Generated 3 scenarios:
1. "Scrape the main article content from a news website and extract..."
   Similarity: 0% (unique)
   Tags: web-scraping, content-extraction
2. "Extract all image URLs from an e-commerce product gallery..."
   Similarity: 15% (unique)
   Tags: web-scraping, images, e-commerce
3. "Get the current weather data from a weather service page..."
   Similarity: 8% (unique)
   Tags: web-scraping, weather, data-extraction
Use 'tpm scenario list my-collection' to view all scenarios.
Generated scenarios are checked against existing ones using vector similarity. If a scenario is >70% similar to an existing one, you'll see a warning. This helps maintain diverse test coverage.
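The similarity check described above can be illustrated with a small sketch. The docs only say "vector similarity", so the use of cosine similarity, the function names, and the toy vectors below are all assumptions made for illustration; only the 70% warning threshold comes from this page.

```python
import math

# From the docs: a warning is shown above 70% similarity.
SIMILARITY_WARNING_THRESHOLD = 0.70


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def max_similarity(candidate, existing):
    """Highest similarity between a new scenario and any existing one."""
    return max((cosine_similarity(candidate, e) for e in existing), default=0.0)


def is_near_duplicate(candidate, existing):
    """True when the new scenario would trigger the similarity warning."""
    return max_similarity(candidate, existing) > SIMILARITY_WARNING_THRESHOLD


# Hypothetical 2-dimensional embeddings, purely for demonstration.
existing_scenarios = [[0.0, 1.0], [0.5, 1.0]]
print(is_near_duplicate([1.0, 0.0], existing_scenarios))
print(is_near_duplicate([0.5, 0.9], existing_scenarios))
```

Under these assumptions, the first candidate points in a different direction from both existing vectors and is accepted, while the second is nearly parallel to an existing one and would be flagged.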
tpm scenario list
View all scenarios for a collection or browse public scenarios.
# List scenarios for a specific collection
tpm scenario list my-collection
# List all public scenarios
tpm scenario list
# With pagination
tpm scenario list --limit 50 --offset 20
# Filter by tags
tpm scenario list --tags web-scraping,api
# Output as JSON
tpm scenario list my-collection --json
Scenarios for Web Scraping Toolkit
Name                              Quality  Runs  Status  Tags
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Scrape main article content...    85%      12    pass    web-scraping
Extract image URLs from gallery   92%      8     pass    images, e-commerce
Get weather data from service...  45%      5     fail    weather, data
Showing 3 scenario(s)
tpm scenario run
Execute all scenarios for a collection. This is ideal for CI/CD pipelines or batch testing.
# Run all scenarios for a collection
tpm scenario run my-collection
# Verbose output with detailed progress
tpm scenario run my-collection --verbose
# JSON output for CI integration
tpm scenario run my-collection --json
Running 3 scenarios for "Web Scraping Toolkit"...
[1/3] Scrape main article content...
✓ PASSED (2.3s) - Successfully extracted article content
[2/3] Extract image URLs from gallery
✓ PASSED (1.8s) - Found and returned 12 image URLs
[3/3] Get weather data from service...
✗ FAILED (3.1s) - Weather service returned 403 error
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Results: 2 passed, 1 failed, 0 errors
Total time: 7.2s
Quota remaining: 47 runs/day
Exit codes:
0 - All scenarios passed
1 - One or more scenarios failed
tpm scenario test
Run a single scenario by its ID. Useful for debugging or re-testing specific failures.
# Run a single scenario
tpm scenario test clu123abc456
# With verbose output
tpm scenario test clu123abc456 --verbose
# JSON output
tpm scenario test clu123abc456 --json
Scenario: Scrape main article content from a news website
Collection: Web Scraping Toolkit
✓ Scenario PASSED
Results
Status: completed
Verdict: pass
Reason: The agent successfully extracted the main article heading and...
Usage
Duration: 2,341ms
Tokens: 1,245 (in: 892, out: 353)
Run ID: run_abc123def456
Quota remaining: 46 runs/day
tpm scenario info
View detailed information about a scenario including its run history and quality metrics.
# View scenario details
tpm scenario info clu123abc456
# Include run history
tpm scenario info clu123abc456 --runs 10
# JSON output
tpm scenario info clu123abc456 --json
Scenario: Scrape main article content from a news website
ID: clu123abc456
Collection: Web Scraping Toolkit
Created: 2 weeks ago
Prompt:
Scrape the main article content from a news website and extract
the headline, author, publication date, and body text.
Quality Metrics:
Score: 85%
Total Runs: 12
Consecutive Passes: 5
Last Run: 2 hours ago (pass)
Tags: web-scraping, content-extraction, news
Recent Runs:
#12  pass  2h ago  2,341ms  "Successfully extracted article content"
#11  pass  1d ago  2,156ms  "Successfully extracted article content"
#10  pass  2d ago  2,892ms  "Successfully extracted article content"
#9   fail  3d ago  4,521ms  "Timeout waiting for page load"
#8   pass  4d ago  2,234ms  "Successfully extracted article content"
How scenario quality is calculated
Quality scores help identify reliable scenarios and track improvement over time. Scores range from 0% to 100%.
Scenarios earn bonus points for consecutive passes and lose points for consecutive failures:
A scenario with 5 consecutive passes would have a quality score of approximately 85%: the streak bonus is 5% × 5 + 1% × (1+2+3+4+5) = 25% + 15% = 40%, which is added to the base score. High-quality scenarios are featured on the TPMJS homepage showcase.
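The streak arithmetic in the example above can be written out as a short sketch. It covers only the consecutive-pass bonus as it appears in that worked example; the penalty for consecutive failures is not specified on this page and is left out. The function names are hypothetical, and the base score of 45 is an assumption chosen only because it reproduces the approximate 85% quoted above.

```python
def streak_bonus(consecutive_passes: int) -> int:
    """Bonus, in percentage points, for a run of consecutive passes.

    Mirrors the worked example: 5% per pass, plus 1% x (1 + 2 + ... + n).
    """
    if consecutive_passes < 0:
        raise ValueError("consecutive_passes must be non-negative")
    flat = 5 * consecutive_passes
    escalating = consecutive_passes * (consecutive_passes + 1) // 2
    return flat + escalating


def quality_score(base_score: int, consecutive_passes: int) -> int:
    """Base score plus streak bonus, clamped to the documented 0-100% range."""
    return max(0, min(100, base_score + streak_bonus(consecutive_passes)))


print(streak_bonus(5))       # 25 + 15 = 40
print(quality_score(45, 5))  # assumed base of 45 gives 85
```

The clamp reflects the statement that scores range from 0% to 100%; how the real scoring treats values outside that range is not documented here.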
Automate scenario testing in your pipeline
Integrate scenario testing into your CI/CD pipeline to catch regressions early.
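For pipelines that prefer a script over a shell check, the JSON summary produced by `tpm scenario run --json` could be inspected along these lines. This is a minimal sketch that assumes the summary has the `failed`, `errors`, and `results` fields shown in this section's sample output; the `check_run` helper, the second scenario ID, and the sample data are made up for illustration. Treating `errors` as a failure condition is a design choice of this sketch, not documented behavior.

```python
import json

# Illustrative data only; field names follow the sample JSON in this section.
SAMPLE_OUTPUT = """
{
  "collection": "my-collection",
  "total": 2,
  "passed": 1,
  "failed": 1,
  "errors": 0,
  "results": [
    {"scenarioId": "clu123abc456", "name": "Scrape main article content",
     "verdict": "pass", "reason": "Successfully extracted article content"},
    {"scenarioId": "clu789xyz000", "name": "Get weather data from service",
     "verdict": "fail", "reason": "Weather service returned 403 error"}
  ]
}
"""


def check_run(raw: str) -> tuple[bool, list[str]]:
    """Return (ok, problems) for a JSON run summary.

    ok is True only when both the failed and errors counts are zero.
    problems lists every result whose verdict is not "pass".
    """
    summary = json.loads(raw)
    problems = [
        f'{result["name"]}: {result["reason"]}'
        for result in summary.get("results", [])
        if result.get("verdict") != "pass"
    ]
    ok = summary.get("failed", 0) == 0 and summary.get("errors", 0) == 0
    return ok, problems


ok, problems = check_run(SAMPLE_OUTPUT)
for line in problems:
    print("FAILED:", line)
print("all scenarios passed" if ok else "run has failures")
```

In a real pipeline the string would come from the file written by the run step, and the script would exit non-zero when `ok` is false.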
name: TPMJS Scenario Tests
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  test-scenarios:
    runs-on: ubuntu-latest
    steps:
      - name: Install TPMJS CLI
        run: npm install -g @tpmjs/cli
      - name: Configure API Key
        run: |
          mkdir -p ~/.config/tpmjs
          echo '{"apiKey":"${{ secrets.TPMJS_API_KEY }}"}' > ~/.config/tpmjs/config.json
      - name: Run Scenarios
        run: tpm scenario run my-collection --json > results.json
      - name: Check Results
        run: |
          FAILED=$(jq '.failed' results.json)
          if [ "$FAILED" -gt 0 ]; then
            echo "❌ $FAILED scenario(s) failed"
            exit 1
          fi
          echo "✅ All scenarios passed"
{
  "collection": "my-collection",
  "total": 5,
  "passed": 4,
  "failed": 1,
  "errors": 0,
  "duration": 12345,
  "results": [
    {
      "scenarioId": "clu123abc456",
      "name": "Scrape main article content",
      "status": "pass",
      "duration": 2341,
      "verdict": "pass",
      "reason": "Successfully extracted article content"
    }
  ]
}
Daily quota and usage tracking
Scenario execution is subject to daily rate limits to ensure fair usage:
The remaining quota is shown after each scenario run. Plan your CI/CD schedules accordingly to stay within limits.
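When planning a schedule, a quick calculation shows how many complete collection runs fit in the remaining quota. This sketch assumes each scenario execution consumes one run from the daily quota, which matches the sample outputs on this page (47 remaining after a three-scenario run, 46 after a single test) but is not stated as a rule. The function name is hypothetical.

```python
def full_runs_within_quota(remaining_runs: int, scenarios_in_collection: int) -> int:
    """Number of complete collection runs that fit in the remaining daily quota.

    Assumes one quota unit per scenario execution (an inference from the
    sample outputs, not a documented guarantee).
    """
    if scenarios_in_collection <= 0:
        raise ValueError("scenarios_in_collection must be positive")
    if remaining_runs < 0:
        raise ValueError("remaining_runs must be non-negative")
    return remaining_runs // scenarios_in_collection


# With 47 runs left and a 3-scenario collection, 15 full runs fit (2 runs spare).
print(full_runs_within_quota(47, 3))
```

A pipeline that triggers on every push could exhaust the quota quickly under this assumption, so a scheduled run or a run limited to the main branch may be a safer default.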
Continue learning about TPMJS
REST API endpoints for programmatic scenario management
Create and manage tool collections for your scenarios
Learn how scenarios use agents for execution
Discover tools to add to your collections