Custom Matchers

Vibe-check provides custom Vitest matchers tailored for testing agent behavior. These matchers work on RunResult objects and provide expressive, readable assertions.

Available Matchers

File Matchers

`toHaveChangedFiles(paths)`

Assert that specific files were changed (supports glob patterns).

import { vibeTest } from '@dao/vibe-check';

vibeTest('changes expected files', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/refactor' });

  // Exact paths
  expect(result).toHaveChangedFiles(['src/auth.ts', 'tests/auth.test.ts']);

  // Glob patterns
  expect(result).toHaveChangedFiles(['src/**/*.ts']);

  // Single path
  expect(result).toHaveChangedFiles('src/main.ts');
});

Parameters:

paths - string | string[] - Exact paths or glob patterns

Passes when: Every specified path or pattern matches at least one changed file.

Fails when: Any specified path or pattern matches no changed file.
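
The pass/fail rule above can be read as "every pattern must match at least one changed path". The following is a minimal sketch of that rule, not vibe-check's implementation: `globToRegExp` and `allPatternsMatch` are hypothetical helpers, and the glob support is deliberately limited to `*` and `**`.

```typescript
// Hypothetical helper: converts a small glob subset (`*`, `**`) to a RegExp.
// Assumption: the real matcher may use a full glob library with more syntax.
function globToRegExp(glob: string): RegExp {
  let source = '';
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch !== '*') {
      // Escape regex metacharacters so literal path characters match themselves.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    } else if (glob.charAt(i + 1) === '*') {
      i++; // consume the second '*'
      if (glob.charAt(i + 1) === '/') {
        i++; // consume the '/' so '**/' can also match zero directories
        source += '(?:.*/)?';
      } else {
        source += '.*';
      }
    } else {
      source += '[^/]*'; // a single '*' stays within one path segment
    }
  }
  return new RegExp(`^${source}$`);
}

// Hypothetical helper: true when every path or pattern matches at least one
// changed file, which is the rule this sketch assumes.
function allPatternsMatch(paths: string | string[], changed: string[]): boolean {
  const patterns = Array.isArray(paths) ? paths : [paths];
  return patterns.every((pattern) => {
    const re = globToRegExp(pattern);
    return changed.some((file) => re.test(file));
  });
}
```

In this sketch an exact path is simply a pattern without wildcards, which is why a single string and an array of globs can share one code path.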

`toHaveNoDeletedFiles()`

Assert that no files were deleted during execution.

vibeTest('never deletes files', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/refactor' });

  expect(result).toHaveNoDeletedFiles();
});

Passes when: No files have changeType === 'deleted'.

Fails when: Any file was deleted.

Use case: Protect against destructive refactorings.

Tool Matchers

`toHaveUsedTool(name, opts?)`

Assert that a specific tool was used, optionally with a minimum count.

vibeTest('uses Edit tool', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/refactor' });

  // Basic usage (at least once)
  expect(result).toHaveUsedTool('Edit');

  // With minimum count
  expect(result).toHaveUsedTool('Edit', { min: 3 });
});

Parameters:

name - string - Tool name (e.g., 'Edit', 'Read', 'Bash')
opts.min - number (optional) - Minimum usage count (default: 1)

Passes when: Tool was used at least min times.

Fails when: Tool was not used, or used fewer than min times.
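
As a rough illustration of the counting rule above, the check can be sketched as a pure function. `usedToolAtLeast` and the `ToolCall` shape are hypothetical; only the `name` field and the default `min` of 1 come from this page.

```typescript
// Assumed minimal shape of a tool call; real records likely carry more fields.
interface ToolCall {
  name: string;
}

// Hypothetical helper mirroring the documented rule: the tool must appear
// at least `min` times, and `min` defaults to 1.
function usedToolAtLeast(
  calls: ToolCall[],
  name: string,
  opts: { min?: number } = {}
): boolean {
  const min = opts.min ?? 1;
  const count = calls.filter((call) => call.name === name).length;
  return count >= min;
}
```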

`toUseOnlyTools(allowlist)`

Assert that only allowed tools were used (allowlist pattern).

vibeTest('only uses safe tools', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/analyze code' });

  // Only allow reading/grepping, no modifications
  expect(result).toUseOnlyTools(['Read', 'Grep', 'Glob']);
});

Parameters:

allowlist - string[] - List of allowed tool names

Passes when: All tool calls are in the allowlist.

Fails when: Any tool not in allowlist was used.

Use case: Enforce read-only operations, prevent destructive tools.
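
The allowlist rule can be sketched as a function that returns the offending tool names, which is also the information a useful failure message would need. `disallowedTools` is a hypothetical helper, not part of vibe-check.

```typescript
// Hypothetical helper: returns the distinct names of tools that were called
// but are not in the allowlist. An empty result corresponds to a pass.
function disallowedTools(calls: { name: string }[], allowlist: string[]): string[] {
  const allowed = new Set(allowlist);
  const offenders = new Set<string>();
  for (const call of calls) {
    if (!allowed.has(call.name)) {
      offenders.add(call.name);
    }
  }
  return Array.from(offenders);
}
```

Under this sketch a run with no tool calls passes trivially, so pairing the allowlist with a positive assertion such as `toHaveUsedTool` guards against an agent that did nothing.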

Quality Matchers

`toCompleteAllTodos()`

Assert that all TODOs were completed (none pending or in_progress).

vibeTest('completes all TODOs', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/implement feature X' });

  expect(result).toCompleteAllTodos();
});

Passes when: All TODOs have status === 'completed'.

Fails when: Any TODO has status === 'pending' or status === 'in_progress'.

Use case: Verify agent finished all planned work.
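
A minimal sketch of the status rule above, assuming each TODO carries one of the three statuses named on this page. The `Todo` shape (in particular the `text` field) and `unfinishedTodos` are hypothetical.

```typescript
type TodoStatus = 'pending' | 'in_progress' | 'completed';

// Assumed shape; `text` is a hypothetical field used only for reporting.
interface Todo {
  text: string;
  status: TodoStatus;
}

// Hypothetical helper: returns every TODO that is not completed.
// An empty result corresponds to the matcher passing.
function unfinishedTodos(todos: Todo[]): Todo[] {
  return todos.filter((todo) => todo.status !== 'completed');
}
```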

`toHaveNoErrorsInLogs()`

Assert that no errors occurred during execution (checks logs and timeline).

vibeTest('runs without errors', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/fix type errors' });

  expect(result).toHaveNoErrorsInLogs();
});

Passes when: No error events in timeline, no failed tool calls.

Fails when: Errors found in logs or tool failures detected.

Use case: Ensure clean execution without failures.

Cost Matchers

`toStayUnderCost(maxUsd)`

Assert that total cost stayed within budget.

vibeTest('stays under budget', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/add feature' });

  // Budget: $2.00
  expect(result).toStayUnderCost(2.00);
});

Parameters:

maxUsd - number - Maximum allowed cost in USD

Passes when: result.metrics.totalCostUsd <= maxUsd.

Fails when: Cost exceeds budget.

Use case: Enforce cost constraints for expensive operations.

LLM-Based Matchers

`toPassRubric(rubric)`

Assert that the result passes an LLM-based quality evaluation.

vibeTest('meets quality standards', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/refactor codebase' });

  // Async matcher - uses judge internally
  await expect(result).toPassRubric({
    name: 'Code Quality',
    criteria: [
      { name: 'has_tests', description: 'Added comprehensive test coverage' },
      { name: 'no_todos', description: 'No TODO comments left in code' },
      { name: 'type_safe', description: 'All code is properly typed' }
    ]
  });
});

Parameters:

rubric - Rubric - Evaluation criteria (see Rubrics Guide)

Passes when: Judge evaluation returns passed: true.

Fails when: Judge evaluation fails or rubric criteria not met.

Use case: Quality gates that require semantic understanding.

Hook Capture Matchers

`toHaveCompleteHookData()`

Assert that all hook events were captured successfully.

vibeTest('has complete hook data', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/task' });

  expect(result).toHaveCompleteHookData();
});

Passes when: result.hookCaptureStatus.complete === true.

Fails when: Hook capture was incomplete or failed.

Use case: Debug hook capture issues, ensure data integrity.

Usage Patterns

Pattern 1: Comprehensive Validation

Combine multiple matchers for thorough validation:

vibeTest('full validation', async ({ runAgent, expect }) => {
  const result = await runAgent({
    prompt: '/implement auth with tests'
  });

  // Files
  expect(result).toHaveChangedFiles(['src/auth.ts', 'tests/auth.test.ts']);
  expect(result).toHaveNoDeletedFiles();

  // Quality
  expect(result).toCompleteAllTodos();
  expect(result).toHaveNoErrorsInLogs();

  // Cost
  expect(result).toStayUnderCost(2.00);

  // Tools
  expect(result).toHaveUsedTool('Edit', { min: 2 });

  // LLM evaluation
  await expect(result).toPassRubric({
    name: 'Implementation Quality',
    criteria: [
      { name: 'tests', description: 'Has comprehensive test coverage' },
      { name: 'types', description: 'Properly typed' }
    ]
  });
});

Pattern 2: File Change Validation

Verify specific file patterns:

vibeTest('modifies only src/', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/refactor' });

  // At least one TypeScript file under src/ changed
  // (this alone does not rule out changes elsewhere)
  expect(result).toHaveChangedFiles(['src/**/*.ts']);

  // No config/database changes
  const configChanged = result.files.changed().some(f =>
    f.path.startsWith('config/') || f.path.startsWith('database/')
  );
  expect(configChanged).toBe(false);

  // No deletions
  expect(result).toHaveNoDeletedFiles();
});
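
The deny-list check in the example covers only `config/` and `database/`. To assert that every change stays inside a set of directories, one option is a small helper like the sketch below. `pathsOutside` is hypothetical and works on plain path strings.

```typescript
// Hypothetical helper: returns the changed paths that do not start with any
// of the allowed prefixes. An empty result means all changes stayed inside.
function pathsOutside(changedPaths: string[], allowedPrefixes: string[]): string[] {
  return changedPaths.filter(
    (path) => !allowedPrefixes.some((prefix) => path.startsWith(prefix))
  );
}
```

Inside a test this could be used as `expect(pathsOutside(result.files.changed().map(f => f.path), ['src/'])).toEqual([])`, assuming changed paths are relative to the workspace root.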

Pattern 3: Tool Allowlist

Restrict tool usage:

vibeTest('read-only analysis', async ({ runAgent, expect }) => {
  const result = await runAgent({
    prompt: '/analyze codebase for security issues'
  });

  // Allow read tools, plus Bash for shell commands
  // (note: Bash itself can modify files)
  expect(result).toUseOnlyTools(['Read', 'Grep', 'Glob', 'Bash']);

  // Verify no Edit/Write calls
  const destructiveTools = result.tools.all().filter(t =>
    ['Edit', 'Write', 'NotebookEdit'].includes(t.name)
  );
  expect(destructiveTools).toHaveLength(0);
});

Pattern 4: Cost-Aware Testing

Enforce budgets per operation:

vibeTest('budget per complexity', async ({ runAgent, expect }) => {
  // Simple task: $1 budget
  const simple = await runAgent({ prompt: '/fix typo in README' });
  expect(simple).toStayUnderCost(1.00);

  // Medium task: $3 budget
  const medium = await runAgent({ prompt: '/add tests for one module' });
  expect(medium).toStayUnderCost(3.00);

  // Complex task: $10 budget
  const complex = await runAgent({ prompt: '/refactor entire codebase' });
  expect(complex).toStayUnderCost(10.00);
});
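
When many tests share the same tiers, the budgets can live in one table instead of being repeated as literals. This is a sketch only: the `Complexity` type, `BUDGET_USD`, and both helpers are hypothetical, and the dollar amounts are the ones used in the example above.

```typescript
type Complexity = 'simple' | 'medium' | 'complex';

// Budgets in USD, copied from the example tiers on this page.
const BUDGET_USD: Record<Complexity, number> = {
  simple: 1.0,
  medium: 3.0,
  complex: 10.0,
};

// Hypothetical helper: looks up the budget for a tier.
function budgetFor(complexity: Complexity): number {
  return BUDGET_USD[complexity];
}

// Hypothetical helper: same comparison as the documented rule (cost <= max).
function isWithinBudget(costUsd: number, complexity: Complexity): boolean {
  return costUsd <= budgetFor(complexity);
}
```

A test would then read `expect(result).toStayUnderCost(budgetFor('medium'))`, and changing a tier means editing one line.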

Pattern 5: Quality Gates

Combine matchers with LLM evaluation:

vibeTest('quality gate', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/implement feature' });

  // Basic quality checks
  expect(result).toCompleteAllTodos();
  expect(result).toHaveNoErrorsInLogs();
  expect(result).toHaveChangedFiles(['src/**', 'tests/**']);

  // LLM-based quality evaluation
  await expect(result).toPassRubric({
    name: 'Feature Quality',
    criteria: [
      { name: 'tests', description: 'Has comprehensive test coverage' },
      { name: 'docs', description: 'Added user-facing documentation' },
      { name: 'types', description: 'All code is type-safe' },
      { name: 'errors', description: 'Proper error handling' }
    ]
  });
});

Using Matchers in Watchers

Custom matchers work in reactive watchers:

vibeTest('matchers in watchers', async ({ runAgent, expect }) => {
  const execution = runAgent({ prompt: '/refactor' });

  execution.watch(({ files, tools, metrics }) => {
    // Note: Matchers work on RunResult, not PartialRunResult
    // Use standard expect() for partial state

    // Files check
    const deleted = files.changed().filter(f => f.changeType === 'deleted');
    expect(deleted).toHaveLength(0);

    // Tool check
    expect(tools.failed().length).toBeLessThan(3);

    // Cost check
    if (metrics.totalCostUsd) {
      expect(metrics.totalCostUsd).toBeLessThan(5.0);
    }
  });

  const result = await execution;

  // Custom matchers on final result
  expect(result).toCompleteAllTodos();
  expect(result).toHaveChangedFiles(['src/**']);
});

Negation

All matchers support .not for negation:

vibeTest('negation examples', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/task' });

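  // Should NOT change these files. One assertion per pattern: negating a
  // multi-pattern call only negates the combined "all patterns match" check.
  expect(result).not.toHaveChangedFiles('config/**');
  expect(result).not.toHaveChangedFiles('database/**');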

  // Should use more than just Read
  expect(result).not.toUseOnlyTools(['Read']);

  // Should NOT stay under $0.50 (expect it to cost more)
  expect(result).not.toStayUnderCost(0.50);
});
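
One caveat when negating a call that lists several paths: if `toHaveChangedFiles` passes only when all of its paths match, as described earlier on this page, then `.not` asserts merely that at least one of them did not match. The sketch below illustrates the difference with a simplified stand-in; `allChanged` is hypothetical and does exact matching only.

```typescript
// Simplified stand-in for the documented rule: true when EVERY expected
// path appears among the changed paths (exact matching, no globs).
function allChanged(expected: string[], changed: string[]): boolean {
  return expected.every((path) => changed.includes(path));
}

const changed = ['config/app.json', 'src/main.ts'];
const forbidden = ['config/app.json', 'database/schema.sql'];

// Negating the combined check is true even though config/app.json changed,
// because database/schema.sql did not change and the positive check fails.
const combinedNegation = !allChanged(forbidden, changed);

// Negating each path separately is false, because config/app.json did change.
const separateNegations = forbidden.every((path) => !allChanged([path], changed));
```

If the real matcher behaves this way, one negated assertion per forbidden pattern is the safer form.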

Matcher Comparison Table

| Matcher | Validates | Use Case |
| --- | --- | --- |
| `toHaveChangedFiles` | File changes match patterns | Verify expected modifications |
| `toHaveNoDeletedFiles` | No deletions | Protect against data loss |
| `toHaveUsedTool` | Tool was called | Verify specific tool usage |
| `toUseOnlyTools` | Only allowed tools used | Enforce read-only or safe tools |
| `toCompleteAllTodos` | All TODOs completed | Verify work finished |
| `toHaveNoErrorsInLogs` | Clean execution | Ensure no failures |
| `toStayUnderCost` | Within budget | Enforce cost constraints |
| `toPassRubric` | LLM quality check | Semantic validation |
| `toHaveCompleteHookData` | Hook capture complete | Debug data issues |

Best Practices

Combine matchers - Use multiple for comprehensive validation
Start simple - Basic matchers first, LLM evaluation last
Use globs - File patterns are more maintainable than exact paths
Budget per task - Set appropriate cost limits based on complexity
Allowlist tools - Use toUseOnlyTools for read-only or safe operations
Async matchers - Remember await for toPassRubric
Meaningful failures - Matchers provide detailed error messages

Troubleshooting

Matcher Not Found

Problem: TypeScript error: Property 'toXxx' does not exist.

Cause: setupFiles not configured or custom matchers not imported.

Solution: Ensure defineVibeConfig sets test.setupFiles: ['@dao/vibe-check/setup'].

toHaveChangedFiles Always Fails

Problem: Matcher fails even though files look correct.

Cause: Path mismatch (absolute vs relative, casing, separators).

Solution: Check result.files.changed().map(f => f.path) to see exact paths, adjust patterns accordingly.
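
If the printed paths turn out to be absolute or to use backslashes, normalizing them before comparing against patterns can help narrow down the mismatch. The helper below is a hypothetical debugging aid, not part of vibe-check, and `workspaceRoot` is an assumed parameter supplied by the caller. It does not address casing differences.

```typescript
// Hypothetical debugging helper: converts a path to forward slashes and
// strips the workspace root and any leading './' so it can be compared
// with relative patterns such as 'src/**/*.ts'.
function normalizeChangedPath(path: string, workspaceRoot: string): string {
  const toPosix = (p: string): string => p.replace(/\\/g, '/');
  // Ensure the root ends with exactly one '/' so prefix stripping is clean.
  const root = toPosix(workspaceRoot).replace(/\/+$/, '') + '/';
  const posix = toPosix(path);
  const relative = posix.startsWith(root) ? posix.slice(root.length) : posix;
  return relative.replace(/^\.\//, '');
}
```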

toPassRubric Never Resolves

Problem: Test hangs on await expect(result).toPassRubric(...).

Cause: Judge call failing or API key missing.

Solution: Check ANTHROPIC_API_KEY env var, verify rubric is valid, check network connectivity.
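
A fail-fast check before the agent runs can turn a silent hang into an immediate, readable error. The sketch below is one way to do that; `missingEnvVars` is a hypothetical helper, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Hypothetical pre-flight helper: returns the names of required variables
// that are unset or blank in the given environment object.
function missingEnvVars(
  names: string[],
  env: Record<string, string | undefined>
): string[] {
  return names.filter((name) => (env[name] ?? '').trim() === '');
}
```

In a Node-based test setup this might be called as `missingEnvVars(['ANTHROPIC_API_KEY'], process.env)`, throwing if the result is non-empty.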

toUseOnlyTools Too Strict

Problem: Matcher fails because of system tools (like Bash for git).

Cause: Allowlist too narrow.

Solution: Add system tools to allowlist: ['Read', 'Edit', 'Bash', 'Grep'].

Quick Reference

vibeTest('matcher quick reference', async ({ runAgent, expect }) => {
  const result = await runAgent({ prompt: '/task' });

  // Files
  expect(result).toHaveChangedFiles(['src/**/*.ts']);
  expect(result).toHaveNoDeletedFiles();

  // Tools
  expect(result).toHaveUsedTool('Edit', { min: 2 });
  expect(result).toUseOnlyTools(['Read', 'Edit', 'Bash']);

  // Quality
  expect(result).toCompleteAllTodos();
  expect(result).toHaveNoErrorsInLogs();

  // Cost
  expect(result).toStayUnderCost(5.00);

  // LLM evaluation (async)
  await expect(result).toPassRubric({
    name: 'Quality',
    criteria: [{ name: 'test', description: 'Has tests' }]
  });

  // Hook capture
  expect(result).toHaveCompleteHookData();
});
