@tpmjs/tools-judge
Evaluate an AI conversation across 10 quality metrics. Use this tool frequently in agentic loops to verify the AI is making progress, staying on track, and actually completing what the user intended. Returns scores, reasoning, must-dos, and improvement suggestions for each metric.
Install this tool and use it with the AI SDK
npm install @tpmjs/tools-judge
pnpm add @tpmjs/tools-judge
yarn add @tpmjs/tools-judge
bun add @tpmjs/tools-judge
deno add npm:@tpmjs/tools-judge

import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { judgeConversation } from '@tpmjs/tools-judge';
const result = await generateText({
model: openai('gpt-4o'),
tools: { judgeConversation },
prompt: 'Your prompt here...',
});
console.log(result.text);

(messages: { role: string; content: string; toolCalls: { args: Record<string, unknown>; toolName: string }[]; toolResults: { result: { }; toolName: string }[] }[], context?: string, strictMode?: boolean, originalUserRequest?: string) => Promise<unknown>

Available configuration options
messages (array): Array of AI SDK messages to evaluate. Each message should have role and content.
originalUserRequest (string): Optional: the original user request, if different from the first message.
context (string): Optional: additional context about what the conversation should accomplish.
strictMode (boolean): Optional: if true, requires higher scores to pass (default: false).
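A minimal sketch of building a messages payload that matches the schema above. The interface below is derived from the documented signature (toolCalls and toolResults are made optional here for plain text turns, which is an assumption), and the conversation contents are purely illustrative:

```typescript
// Shape of a single message, per the documented schema. The real tool's
// signature marks toolCalls/toolResults as arrays; they are optional here
// for plain text turns (assumption, not confirmed by the docs).
interface JudgeMessage {
  role: string;
  content: string;
  toolCalls?: { args: Record<string, unknown>; toolName: string }[];
  toolResults?: { result: Record<string, unknown>; toolName: string }[];
}

// Illustrative conversation to evaluate; roles follow the AI SDK convention.
const messages: JudgeMessage[] = [
  { role: 'user', content: 'Summarize this repository.' },
  {
    role: 'assistant',
    content: 'Reading the repo first.',
    toolCalls: [{ toolName: 'readFile', args: { path: 'README.md' } }],
    toolResults: [{ toolName: 'readFile', result: { text: '# My repo' } }],
  },
  { role: 'assistant', content: 'The repository is a small demo project.' },
];

console.log(messages.length);
```

This array would be passed as the `messages` argument, optionally alongside `context`, `strictMode`, and `originalUserRequest`.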
Schema extracted: 3/3/2026, 4:21:33 AM
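The per-metric scores and strictMode flag suggest a gating pattern for agentic loops. Since the tool's return type is Promise<unknown>, the result shape below (metrics with score, passed, and mustDos fields) is an assumption for illustration, not the tool's documented output:

```typescript
// Hypothetical per-metric result shape; the real tool returns Promise<unknown>,
// so both this interface and the 0-10 score scale are assumptions.
interface MetricResult {
  score: number;
  passed: boolean;
  mustDos: string[];
}

// Keep iterating while any metric fails or falls below the threshold;
// strictMode raises the bar, mirroring the tool's strictMode option.
function shouldKeepIterating(
  metrics: Record<string, MetricResult>,
  strictMode = false,
): boolean {
  const threshold = strictMode ? 8 : 6;
  return Object.values(metrics).some((m) => !m.passed || m.score < threshold);
}

const demo: Record<string, MetricResult> = {
  relevance: { score: 9, passed: true, mustDos: [] },
  completeness: { score: 5, passed: false, mustDos: ['Answer all sub-questions'] },
};

console.log(shouldKeepIterating(demo)); // true: completeness fails
```

In practice the judge's must-dos could be fed back into the next prompt before re-running the loop.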
Downloads/month: 4
GitHub Stars: 19