TypeScript SDK
AgentV provides two npm packages for programmatic use:
- `@agentv/eval`: custom assertions and code graders
- `@agentv/core`: programmatic evaluation API and typed configuration
Installation
```bash
# Assertion SDK (defineAssertion, defineCodeGrader)
npm install @agentv/eval

# Programmatic API (evaluate, defineConfig)
npm install @agentv/core
```

Custom Assertions
Use `defineAssertion` from `@agentv/eval` to create reusable assertion types. Place them in `.agentv/assertions/`; they are auto-discovered by filename.
Pass/Fail Pattern
```typescript
import { defineAssertion } from '@agentv/eval';

// Passes when the output contains at least three words.
export default defineAssertion(({ output }) => {
  const wordCount = (output ?? '').trim().split(/\s+/).filter(Boolean).length;
  return {
    pass: wordCount >= 3,
    reasoning: `Output has ${wordCount} words`,
  };
});
```

Score Pattern
Return a `score` (0–1) instead of `pass` for graded evaluation:
```typescript
import { defineAssertion } from '@agentv/eval';

export default defineAssertion(({ output, traceSummary }) => ({
  pass: (output ?? '').length > 0 && (traceSummary?.eventCount ?? 0) <= 10,
  reasoning: 'Checks content exists and is efficient',
}));
```

If only `pass` is given, `score` is 1 (pass) or 0 (fail).
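For a graded result, the handler can return `score` directly. The sketch below shows one possible scoring rule, a linear ramp toward a target word count. The names `scoreWordCount`, `ScoredResult`, and `targetWords` are hypothetical and not part of `@agentv/eval`; the logic is kept as a plain function so it runs without the SDK installed.

```typescript
// Hypothetical result shape, mirroring the `score` and `reasoning` fields described above.
interface ScoredResult {
  score: number; // between 0 and 1
  reasoning: string;
}

// Graded word-count check: the score ramps linearly from 0 up to 1 at `targetWords`.
// `targetWords` is an illustrative parameter, not an AgentV option.
function scoreWordCount(output: string | undefined, targetWords = 20): ScoredResult {
  const wordCount = (output ?? '').trim().split(/\s+/).filter(Boolean).length;
  const score = Math.min(wordCount / targetWords, 1);
  return {
    score,
    reasoning: `Output has ${wordCount} of ${targetWords} target words`,
  };
}
```

Inside an assertion file this could be wired as `export default defineAssertion(({ output }) => scoreWordCount(output));`, assuming the handler accepts a result with `score` and no `pass`, as the text above suggests.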
Using in YAML
Convention-based discovery maps filename → assertion type:

```text
.agentv/assertions/word-count.ts → type: word-count
.agentv/assertions/sentiment.ts → type: sentiment
```

Reference the type directly in your eval file; no `command:` is needed:

```yaml
assertions:
  - type: word-count
  - type: contains
    value: "Hello"
```

Code Graders
Use `defineCodeGrader` from `@agentv/eval` for full control over scoring with an explicit `assertions` array:
```typescript
import { defineCodeGrader } from '@agentv/eval';

export default defineCodeGrader(({ output, traceSummary }) => ({
  score: (output ?? '').length > 0 && (traceSummary?.eventCount ?? 0) <= 5 ? 1.0 : 0.5,
  assertions: [
    { text: 'Answer is not empty', passed: (output ?? '').length > 0 },
    { text: 'Efficient tool usage', passed: (traceSummary?.eventCount ?? 0) <= 5 },
  ],
}));
```

`defineCodeGrader` graders are referenced in YAML with `type: code-grader` and `command: [bun, run, grader.ts]`. `defineAssertion` uses convention-based discovery instead: place the file in `.agentv/assertions/` and reference it by name.
For detailed patterns, input/output contracts, and language-agnostic examples, see Code Graders.
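Rather than hard-coding the score, a grader could derive it from its own `assertions` array. The following is a minimal sketch of that idea, not AgentV behaviour: `scoreFromAssertions` and `AssertionEntry` are hypothetical names, and treating an empty list as 0 is an arbitrary choice.

```typescript
// Shape of one entry in the `assertions` array used by the grader example.
interface AssertionEntry {
  text: string;
  passed: boolean;
}

// Derive a 0-to-1 score as the fraction of passed checks.
// Returning 0 for an empty list is an arbitrary choice made for this sketch.
function scoreFromAssertions(assertions: AssertionEntry[]): number {
  if (assertions.length === 0) return 0;
  const passedCount = assertions.filter((a) => a.passed).length;
  return passedCount / assertions.length;
}

// Example: one of two checks passed.
const checks: AssertionEntry[] = [
  { text: 'Answer is not empty', passed: true },
  { text: 'Efficient tool usage', passed: false },
];
const graderResult = { score: scoreFromAssertions(checks), assertions: checks };
```

The `graderResult` object has the same `score` plus `assertions` shape that the `defineCodeGrader` handler returns, so the helper could be called from inside that handler.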
Wire Format vs SDK Format
Raw grader stdin uses snake_case because it crosses a process boundary and may be consumed by Python, shell, jq, or external dashboards. The `@agentv/eval` SDK converts that payload to idiomatic TypeScript camelCase before calling your handler.
| Raw stdin | SDK handler field |
|---|---|
| `expected_output` | `expectedOutput` |
| `output_path` | `outputPath` |
| `trace_summary` | `traceSummary` |
| `token_usage` | `tokenUsage` |
| `cost_usd` | `costUsd` |
| `duration_ms` | `durationMs` |
| `workspace_path` | `workspacePath` |
`output` is already the final answer string in both formats. Transcript-aware code should read `messages`, `trace.messages`, or `trace.events`; answer-text graders should read `output` or the deprecated `answer` alias.
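The key renaming in the table can be illustrated with a small conversion helper. This is a sketch of the naming convention only, not the SDK's actual implementation: `snakeToCamel` and `camelizeKeys` are hypothetical names, and whether the SDK also converts nested keys is not stated here, so the sketch renames top-level keys only.

```typescript
// Convert one snake_case key to camelCase, e.g. "trace_summary" -> "traceSummary".
function snakeToCamel(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Rename top-level keys only; nested objects are copied as-is in this sketch.
function camelizeKeys(payload: Record<string, unknown>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(payload)) {
    result[snakeToCamel(key)] = payload[key];
  }
  return result;
}
```

A helper like this is only relevant when reading raw stdin from TypeScript without the SDK; handlers defined through `@agentv/eval` already receive camelCase fields.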
Programmatic API
Use `evaluate()` from `@agentv/core` to run evaluations as a library, with no YAML needed.
Inline Test Definitions
```typescript
import { evaluate } from '@agentv/core';

const { results, summary } = await evaluate({
  tests: [
    {
      id: 'greeting',
      input: 'Say hello',
      assertions: [{ type: 'contains', value: 'Hello' }],
    },
  ],
});

console.log(`${summary.passed}/${summary.total} passed`);
```

`evaluate()` auto-discovers the default target from `.agentv/targets.yaml` and credentials from `.env`.
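When running evaluations from a script or CI job, the summary can be turned into a pass/fail signal. The sketch below relies only on the `passed` and `total` fields used in the example; `exitCodeFor`, `SummaryLike`, and `minPassRate` are hypothetical names for illustration and are not part of `@agentv/core`.

```typescript
// Only `passed` and `total` appear in the example; no other summary fields are assumed.
interface SummaryLike {
  passed: number;
  total: number;
}

// Returns 0 when the pass rate meets `minPassRate`, otherwise 1, for use as a process exit code.
// `minPassRate` is a hypothetical threshold, not an AgentV setting.
// An empty run (total of 0) is treated as a failure in this sketch.
function exitCodeFor(summary: SummaryLike, minPassRate = 1): number {
  const passRate = summary.total === 0 ? 0 : summary.passed / summary.total;
  return passRate >= minPassRate ? 0 : 1;
}
```

After `evaluate()` resolves, a script could set `process.exitCode = exitCodeFor(summary)` so that a failed evaluation fails the job.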
File-Based via specFile
Point to an existing YAML eval instead of inlining tests:
```typescript
import { evaluate } from '@agentv/core';

const { results, summary } = await evaluate({
  specFile: './evals/my-eval.eval.yaml',
});
```

Typed Configuration
Create `agentv.config.ts` at your project root for type-safe, validated configuration using `defineConfig()` from `@agentv/core`:
```typescript
import { defineConfig } from '@agentv/core';

export default defineConfig({
  execution: {
    workers: 5,
    maxRetries: 2,
    verbose: true,
    otelFile: '.agentv/results/otel-{timestamp}.json',
  },
  output: { dir: './results' },
  limits: { maxCostUsd: 10.0 },
});
```

The config file is auto-discovered by the CLI from your project root and validated with Zod at startup.
Scaffold Commands
Bootstrap new assertions and eval files from the CLI:
```bash
# Create a new assertion type
agentv create assertion <name>  # → .agentv/assertions/<name>.ts

# Create a new eval with test cases
agentv create eval <name>       # → evals/<name>.eval.yaml + .cases.jsonl
```