VibeTestContext
The VibeTestContext interface is injected into vibeTest functions via Vitest fixtures. It provides methods and properties for agent execution, evaluation, assertions, and cumulative state tracking.
Interface
interface VibeTestContext {
  runAgent(opts: RunAgentOptions): AgentExecution;
  judge<T = DefaultJudgmentResult>(
    result: RunResult,
    options: JudgeOptions<T>
  ): Promise<T>;
  expect: typeof import('vitest')['expect'];
  annotate(message: string, type?: string, attachment?: TestAttachment): Promise<void>;
  task: import('vitest').TestContext['task'];
  files: FileAccessor;
  tools: ToolAccessor;
  timeline: TimelineAccessor;
}
Properties
runAgent
runAgent(opts: RunAgentOptions): AgentExecution
Execute an agent with the given options. Automatically captures hooks, git state, file changes, and tool calls.
Parameters:
- opts - Agent execution configuration
Returns:
- AgentExecution - Thenable execution handle with reactive watch capabilities
Behavior:
- State accumulates across multiple runAgent() calls in the same test
- Access cumulative state via context.files, context.tools, and context.timeline
Example:
vibeTest('multi-run test', async ({ runAgent, files, expect }) => {
  // First run
  const result1 = await runAgent({ prompt: '/implement feature A' });

  // Second run (state accumulates)
  const result2 = await runAgent({ prompt: '/implement feature B' });

  // Access cumulative state through the destructured files accessor
  expect(files.changed().length).toBeGreaterThan(0);
});
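The "thenable" handle described under Returns can be pictured with a small sketch. This is not the library's AgentExecution class: `SketchExecution` and `startSketchRun` are hypothetical names, and the watch capabilities are left out. It only shows why an object with a `then()` method can be passed to `await`.

```typescript
// Hypothetical stand-in for AgentExecution; the real class may differ.
class SketchExecution<T> implements PromiseLike<T> {
  private readonly work: Promise<T>;

  constructor(work: Promise<T>) {
    this.work = work;
  }

  // Implementing then() is what makes the handle awaitable.
  then<R1 = T, R2 = never>(
    onFulfilled?: ((value: T) => R1 | PromiseLike<R1>) | null,
    onRejected?: ((reason: unknown) => R2 | PromiseLike<R2>) | null
  ): Promise<R1 | R2> {
    return this.work.then(onFulfilled, onRejected);
  }
}

// Hypothetical helper that resolves immediately with a fake result.
function startSketchRun(prompt: string): SketchExecution<{ prompt: string; done: boolean }> {
  return new SketchExecution(Promise.resolve({ prompt, done: true }));
}

const sketchRun = startSketchRun('/implement feature A');
sketchRun.then(result => console.log(result.prompt, result.done));
```

Because the handle is an object rather than a bare promise, it has room for extra methods alongside `then()`, which is presumably where the reactive watch features live.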
judge
judge<T = DefaultJudgmentResult>(
  result: RunResult,
  options: {
    rubric: Rubric;
    instructions?: string;
    resultFormat?: z.ZodType<T>;
    throwOnFail?: boolean;
  }
): Promise<T>
Evaluate a RunResult using LLM-based judgment. The judge is a specialized agent that formats the rubric into a prompt internally.
Type Parameters:
- T - Type of judgment result (default: DefaultJudgmentResult)
Parameters:
- result - The RunResult to evaluate
- options.rubric - Evaluation criteria
- options.instructions - Optional custom instructions for evaluation
- options.resultFormat - Optional Zod schema for type-safe results
- options.throwOnFail - If true, throws an error when the judgment fails
Returns:
- Promise<T> - Structured judgment result
Example:
vibeTest('quality evaluation', async ({ runAgent, judge, expect }) => {
  const result = await runAgent({ prompt: '/refactor code' });

  const judgment = await judge(result, {
    rubric: {
      name: 'Code Quality',
      criteria: [
        { name: 'readability', description: 'Code is easy to read' }
      ]
    }
  });

  expect(judgment.passed).toBe(true);
});
expect
expect: typeof import('vitest')['expect']
Context-bound expect function for snapshot concurrency safety.
Purpose:
- Use this instead of the global expect in concurrent tests
- Ensures snapshots are properly isolated
Example:
vibeTest.concurrent('concurrent test', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/task' });

  // Use context.expect, not global expect
  expect(result.files.changed()).toMatchSnapshot();
});
annotate
annotate(
  message: string,
  type?: string,
  attachment?: TestAttachment
): Promise<void>
Stream annotations to reporters in real-time.
Parameters:
- message - Annotation message
- type - Optional annotation type (e.g., 'info', 'warning', 'error')
- attachment - Optional file attachment (Vitest moves it to attachmentsDir automatically)
Example:
vibeTest('annotated test', async ({ runAgent, annotate }) => {
  await annotate('Starting feature implementation');

  const result = await runAgent({ prompt: '/implement feature' });

  await annotate('Implementation complete', 'success');
});
task
task: import('vitest').TestContext['task']
Access to Vitest task metadata for custom meta storage.
Use Cases:
- Store custom metadata for reporters
- Access test name, file path, etc.
Example:
vibeTest('metadata test', async ({ runAgent, task }) => {
  const result = await runAgent({ prompt: '/task' });

  // Store custom metadata
  task.meta.customData = { feature: 'auth' };
});
files
files: {
  changed(): FileChange[];
  get(path: string): FileChange | undefined;
  filter(glob: string | string[]): FileChange[];
  stats(): {
    added: number;
    modified: number;
    deleted: number;
    renamed: number;
    total: number;
  };
}
Access cumulative file changes across all runAgent() calls in this test.
Methods:
changed()
Returns all files changed across all agent runs.
const allFiles = context.files.changed();
console.log(`Total files changed: ${allFiles.length}`);
get(path)
Get a specific file by path.
const file = context.files.get('src/index.ts');
if (file) {
  const content = await file.after?.text();
}
filter(glob)
Filter files by glob pattern.
const tsFiles = context.files.filter('**/*.ts');
const testFiles = context.files.filter(['**/*.test.ts', '**/*.spec.ts']);
stats()
Get change statistics.
const stats = context.files.stats();
console.log(`Added: ${stats.added}`);
console.log(`Modified: ${stats.modified}`);
console.log(`Deleted: ${stats.deleted}`);
console.log(`Total: ${stats.total}`);
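To make the counts returned by stats() concrete, here is a minimal sketch that tallies a list of changes by kind. The `SketchFileChange` shape and its `changeType` field are assumptions made for this illustration; the library's actual FileChange type and counting rules may differ.

```typescript
// Hypothetical shape for illustration only; the real FileChange type may differ.
type ChangeType = 'added' | 'modified' | 'deleted' | 'renamed';

interface SketchFileChange {
  path: string;
  changeType: ChangeType;
}

interface ChangeStats {
  added: number;
  modified: number;
  deleted: number;
  renamed: number;
  total: number;
}

// Tally changes by kind; total is the number of changes seen.
function summarizeChanges(changes: SketchFileChange[]): ChangeStats {
  const stats: ChangeStats = { added: 0, modified: 0, deleted: 0, renamed: 0, total: 0 };
  for (const change of changes) {
    stats[change.changeType] += 1;
    stats.total += 1;
  }
  return stats;
}

const sampleStats = summarizeChanges([
  { path: 'src/index.ts', changeType: 'modified' },
  { path: 'src/auth.ts', changeType: 'added' },
  { path: 'src/legacy.ts', changeType: 'deleted' },
]);
console.log(sampleStats);
```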
tools
tools: {
  all(): ToolCall[];
  used(name: string): number;
  findFirst(name: string): ToolCall | undefined;
  filter(name: string): ToolCall[];
  failed(): ToolCall[];
  succeeded(): ToolCall[];
}
Access cumulative tool calls across all runAgent() calls in this test.
Methods:
all()
Get all tool calls from all runs.
const allTools = context.tools.all();
console.log(`Total tools used: ${allTools.length}`);
used(name)
Count uses of a specific tool.
const editCount = context.tools.used('Edit');
console.log(`Edit used ${editCount} times`);
findFirst(name)
Find the first use of a specific tool.
const firstWrite = context.tools.findFirst('Write');
if (firstWrite) {
  console.log('First file written:', firstWrite.input);
}
filter(name)
Get all calls to a specific tool.
const bashCalls = context.tools.filter('Bash');
bashCalls.forEach(call => {
  console.log('Command:', call.input);
});
failed()
Get all failed tool calls.
const failures = context.tools.failed();
if (failures.length > 0) {
  console.error('Failed tools:', failures.map(t => t.name));
}
succeeded()
Get all successful tool calls.
const successful = context.tools.succeeded();
console.log(`${successful.length} tools succeeded`);
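The relationship between these methods can be sketched with an in-memory accessor over a fixed list of calls. This is an illustrative model only: `SketchToolCall`, its `ok` flag, and `createToolAccessor` are hypothetical names, and the real ToolCall type may record success differently.

```typescript
// Hypothetical ToolCall shape; the `ok` flag is an assumption for this sketch.
interface SketchToolCall {
  name: string;
  input: unknown;
  ok: boolean;
}

// Build an accessor over a fixed list of calls, mirroring the method names above.
function createToolAccessor(calls: SketchToolCall[]) {
  return {
    all: () => [...calls],
    used: (name: string) => calls.filter(c => c.name === name).length,
    findFirst: (name: string) => calls.find(c => c.name === name),
    filter: (name: string) => calls.filter(c => c.name === name),
    failed: () => calls.filter(c => !c.ok),
    succeeded: () => calls.filter(c => c.ok),
  };
}

const sketchTools = createToolAccessor([
  { name: 'Write', input: { file_path: 'a.ts' }, ok: true },
  { name: 'Bash', input: { command: 'npm test' }, ok: false },
  { name: 'Write', input: { file_path: 'b.ts' }, ok: true },
]);

console.log(sketchTools.used('Write'));
console.log(sketchTools.failed().map(c => c.name));
```

In this model, used(name) is simply the length of filter(name), and failed() and succeeded() partition all().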
timeline
timeline: {
  events(): AsyncIterable<TimelineEvent>;
}
Access a unified timeline of events across all runAgent() calls.
Methods:
events()
Returns an async iterable over all timeline events.
for await (const event of context.timeline.events()) {
  console.log(`${event.type} at ${event.timestamp}`);
}
Event Types:
- tool_use - Tool invocation
- tool_result - Tool completion
- todo_update - TODO status change
- notification - Agent notification
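Because events() returns an async iterable, summarizing the timeline means consuming it with `for await`. The sketch below counts events per type using a stand-in generator. `SketchTimelineEvent`, `sketchEvents`, and `countByType` are hypothetical names, and the real TimelineEvent type may carry more fields than the `type` and `timestamp` shown here.

```typescript
// Hypothetical event shape; only `type` and `timestamp` are assumed.
interface SketchTimelineEvent {
  type: 'tool_use' | 'tool_result' | 'todo_update' | 'notification';
  timestamp: number;
}

// Stand-in for timeline.events(): an async generator over a fixed list.
async function* sketchEvents(): AsyncGenerator<SketchTimelineEvent> {
  yield { type: 'tool_use', timestamp: 1 };
  yield { type: 'tool_result', timestamp: 2 };
  yield { type: 'tool_use', timestamp: 3 };
  yield { type: 'notification', timestamp: 4 };
}

// Consume any async iterable of events and count how many of each type occur.
async function countByType(
  events: AsyncIterable<SketchTimelineEvent>
): Promise<Map<string, number>> {
  const counts = new Map<string, number>();
  for await (const event of events) {
    counts.set(event.type, (counts.get(event.type) ?? 0) + 1);
  }
  return counts;
}

const timelineCounts = countByType(sketchEvents());
timelineCounts.then(counts => console.log(Array.from(counts.entries())));
```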
Module Augmentation
Vibe-check augments Vitest's TestContext to include VibeTestContext:
declare module 'vitest' {
  export interface TestContext extends VibeTestContext {}
}
This allows TypeScript to recognize vibe-check properties in test functions.
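The mechanism behind this is TypeScript's interface declaration merging. The sketch below shows the same effect inside a single file, without `declare module`. `SketchTestContext` and `SketchExtras` are hypothetical stand-ins for Vitest's TestContext and VibeTestContext.

```typescript
// Stand-in for vibe-check's VibeTestContext.
interface SketchExtras {
  files: { changed(): string[] };
}

// Stand-in for Vitest's TestContext.
interface SketchTestContext {
  taskName: string;
}

// A second declaration with the same name merges into the first,
// so SketchTestContext now also has every member of SketchExtras.
interface SketchTestContext extends SketchExtras {}

const sketchContext: SketchTestContext = {
  taskName: 'example',
  files: { changed: () => ['src/index.ts'] },
};

console.log(sketchContext.files.changed().length);
```

Module augmentation applies the same merge across package boundaries: the `declare module 'vitest'` wrapper tells the compiler which module's TestContext to extend.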
Access VibeTestContext through destructuring in vibeTest functions:
import { vibeTest } from '@dao/vibe-check';

vibeTest('example', async ({ runAgent, judge, expect, files, tools }) => {
  // All properties available via destructuring
  const result = await runAgent({ prompt: '/task' });

  expect(result).toBeDefined();
  expect(files.changed().length).toBeGreaterThan(0);
  expect(tools.used('Edit')).toBeGreaterThan(0);
});
See Also
- vibeTest() - Test function using this context
- WorkflowContext - Workflow equivalent
- RunResult - Result type returned by runAgent()
- Cumulative State Guide - Using files/tools/timeline