
createEval

@tpmjs/tools-evals-blah

Create a new evaluation definition on evals.blah.dev. Requires API key.

Official
research
v0.1.0
MIT
⚠️ This tool is currently broken

Execution failed: runtime error with test parameters: eval_type must be "rubric" or "semantic"

Last checked: 3/1/2026, 4:27:41 AM


Installation & Usage

Install this tool and use it with the AI SDK

1. Install the package

npm install @tpmjs/tools-evals-blah
pnpm add @tpmjs/tools-evals-blah
yarn add @tpmjs/tools-evals-blah
bun add @tpmjs/tools-evals-blah
deno add npm:@tpmjs/tools-evals-blah

2. Import the tool

import { createEval } from '@tpmjs/tools-evals-blah';

3. Use with AI SDK

import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { createEval } from '@tpmjs/tools-evals-blah';

const result = await generateText({
  model: openai('gpt-4o'),
  tools: { createEval },
  prompt: 'Your prompt here...',
});

console.log(result.text);

Signature

(name: string, prompt: string, eval_type: string, eval_criteria: string, description?: string, expected_behavior?: string) => Promise<unknown>

Tags

ai
api
benchmarks
blah
create
definition
dev
eval
evals
evaluation
key
leaderboard
llm
new
requires
research
tpmjs

Parameters

Available configuration options

Auto-extracted
name
Required
Type: string

Eval name (1-100 characters)

prompt
Required
Type: string

The prompt that will be sent to models

eval_type
Required
Type: string

"rubric" scores against a detailed rubric; "semantic" compares to an ideal response

eval_criteria
Required
Type: string

JSON string of scoring criteria. For rubric: {"rubric": "...", "max_score": 1}. For semantic: {"ideal_response": "...", "rubric": "..."}.

description
Optional
Type: string

Optional eval description (max 1000 characters)

expected_behavior
Optional
Type: string

Optional human-readable description of expected behavior (max 2000 characters)
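Because eval_criteria is a JSON string, it is safer to build it with JSON.stringify than to hand-write it. A minimal sketch in TypeScript: the helper names are illustrative; only the field names (rubric, max_score, ideal_response) come from the parameter docs above.

```typescript
// Build the eval_criteria JSON string for a "rubric" eval.
// Field names (rubric, max_score) follow the parameter docs above.
function rubricCriteria(rubric: string, maxScore: number): string {
  return JSON.stringify({ rubric, max_score: maxScore });
}

// Build the eval_criteria JSON string for a "semantic" eval.
// Field names (ideal_response, rubric) follow the parameter docs above.
function semanticCriteria(idealResponse: string, rubric: string): string {
  return JSON.stringify({ ideal_response: idealResponse, rubric });
}

// Example: criteria string for a rubric-scored eval.
const criteria = rubricCriteria('Rate code clarity 0-1', 1);
```

The resulting string can be passed directly as the eval_criteria argument.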

Schema extracted: 3/1/2026, 4:27:40 AM

README

@tpmjs/tools-evals-blah

AI SDK tools for evals.blah.dev — the open LLM evaluation platform. Register models, create evals, trigger runs, and check the leaderboard.

Installation

npm install @tpmjs/tools-evals-blah

Setup

Read-only tools (list, get, leaderboard) require no authentication.

For write operations (create model, create eval, trigger run), set your API key:

export EVALS_BLAH_API_KEY=blah_your_api_key_here

Get an API key at https://evals.blah.dev/settings/api-keys
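Since write operations fail without the key, it can help to check for it at startup rather than at the first tool call. A small sketch: the environment variable name comes from the export line above; the function name and error text are illustrative.

```typescript
// Throw early if the write-capable API key is missing from the environment.
// EVALS_BLAH_API_KEY is the variable named in the setup instructions above.
function requireApiKey(): string {
  const key = process.env.EVALS_BLAH_API_KEY;
  if (!key) {
    throw new Error(
      'EVALS_BLAH_API_KEY is not set; createModel, createEval, and triggerRun will fail'
    );
  }
  return key;
}
```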

Usage

import {
  listModels,
  getLeaderboard,
  createModel,
  createEval,
  triggerRun,
} from '@tpmjs/tools-evals-blah';

// List all models (no auth needed)
const models = await listModels.execute({});

// Check the leaderboard (no auth needed)
const leaderboard = await getLeaderboard.execute({});

// Register a model (requires API key)
const model = await createModel.execute({
  name: 'My Model',
  inference_uri: 'openai/gpt-4.1-mini',
});

// Create an eval (requires API key).
// Note: `eval` is a reserved word in ES modules, so use a different name.
const evalDef = await createEval.execute({
  name: 'Code Clarity',
  prompt: 'Write a function to reverse a string',
  eval_type: 'rubric',
  eval_criteria: '{"rubric": "Rate code clarity 0-1", "max_score": 1}',
});

// Trigger a run (requires API key)
const run = await triggerRun.execute({});

Tools

Tool             Auth  Description
listModels       No    List all registered LLM models
getModel         No    Get a model by ID
createModel      Yes   Register a new model
getModelResults  No    Get all eval results for a model
listEvals        No    List all evaluation definitions
getEval          No    Get an eval by ID
createEval       Yes   Create a new evaluation
listRuns         No    List all eval runs
getRun           No    Get a run by ID
getRunResults    No    Get all results for a run
triggerRun       Yes   Trigger a new eval run
getResult        No    Get a single result by ID
getLeaderboard   No    Get model rankings

License

MIT

Statistics

Downloads/month

0

Quality Score

0%

Bundle Size

NPM Keywords

tpmjs
research
ai
evals
llm
leaderboard
benchmarks

Maintainers

thomasdavis (thomasalwyndavis@gmail.com)

Frameworks

vercel-ai