AgentExecution

AgentExecution is a thenable class returned by runAgent() and stage(). Its watch() method registers assertions that run while the agent is still executing, so a test can fail fast instead of waiting for the run to finish.

Class Definition

class AgentExecution {
  watch(fn: WatcherFn): this;
  then<T, U>(
    onFulfilled?: (value: RunResult) => T | Promise<T>,
    onRejected?: (reason: unknown) => U | Promise<U>
  ): Promise<T | U>;
  catch<U>(onRejected?: (reason: unknown) => U | Promise<U>): Promise<RunResult | U>;
  finally(onFinally?: () => void): Promise<RunResult>;
  abort(reason?: string): void;
}

Important: AgentExecution is a thenable object (implements then/catch/finally) but NOT a Promise subclass. It’s fully awaitable and works with Promise.all/race, but instanceof Promise returns false.

Methods

watch

watch(fn: WatcherFn): this

Parameters:

fn - Watcher function receiving PartialRunResult

Returns:

this - For method chaining

When Watchers Run: Watchers are invoked after each significant hook event:

PostToolUse - After each tool completes
TodoUpdate - When TODO status changes
Notification - When agent sends notifications

Execution Guarantees:

Watchers execute sequentially in registration order (not parallel)
Each watcher completes before the next starts
If any watcher throws, execution aborts immediately
No race conditions: only one watcher runs at a time
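
The guarantees above can be pictured with a small sketch. This is not the library's dispatcher; `Snapshot`, `Watcher`, `runWatchers`, and `demoWatchers` are hypothetical names, and `Snapshot` is a stand-in for the much richer PartialRunResult. The sketch only shows how awaiting each watcher inside a loop yields registration order, one watcher at a time, and a stop on the first throw.

```typescript
// Hypothetical types; the real PartialRunResult carries files, tools, metrics, etc.
type Snapshot = { event: string };
type Watcher = (ctx: Snapshot) => void | Promise<void>;

// Runs watchers one at a time in registration order.
// The first watcher that throws ends the loop and the error propagates.
async function runWatchers(watchers: Watcher[], ctx: Snapshot): Promise<void> {
  for (const watcher of watchers) {
    await watcher(ctx); // the next watcher starts only after this one settles
  }
}

async function demoWatchers(): Promise<{ order: string[]; error: string }> {
  const order: string[] = [];
  const watchers: Watcher[] = [
    async () => {
      await new Promise(resolve => setTimeout(resolve, 10));
      order.push("first");
    },
    () => {
      order.push("second");
      throw new Error("budget exceeded");
    },
    () => {
      order.push("third"); // never reached
    },
  ];

  let error = "";
  try {
    await runWatchers(watchers, { event: "PostToolUse" });
  } catch (e) {
    error = (e as Error).message;
  }
  return { order, error };
}
```

In this sketch the slow async watcher still finishes before the second one starts, and the third never runs because the second threw.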

Example - Fail-Fast on File Violations:

import { vibeTest } from '@dao/vibe-check';

vibeTest('restrict file changes', async ({ runAgent, expect }) => {
  const execution = runAgent({
    prompt: '/refactor authentication'
  });

  // Abort if non-auth files are modified
  execution.watch(({ files }) => {
    // Collect every changed file outside src/auth/
    const otherFiles = files.changed().filter(f =>
      !f.path.startsWith('src/auth/')
    );

    if (otherFiles.length > 0) {
      const paths = otherFiles.map(f => f.path).join(', ');
      expect.fail(`Modified non-auth files: ${paths}`);
    }
  });

  const result = await execution;
  // Only reaches here if watcher never threw
});

Example - Multiple Watchers (Sequential):

vibeTest('multiple watchers', async ({ runAgent, expect }) => {
  const execution = runAgent({ prompt: '/task' });

  execution
    .watch(({ tools }) => {
      // Watcher 1: Limit tool failures
      expect(tools.failed().length).toBeLessThan(3);
    })
    .watch(({ metrics }) => {
      // Watcher 2: runs only if watcher 1 passes
      expect(metrics.totalCostUsd).toBeLessThan(5.0);
    })
    .watch(({ todos }) => {
      // Watcher 3: runs only if watchers 1 and 2 pass
      const completed = todos.filter(t => t.status === 'completed').length;
      expect(completed).toBeGreaterThan(0);
    });

  await execution;
});

Example - Cost Budget Enforcement:

vibeTest('enforce cost budget', async ({ runAgent, expect }) => {
  const execution = runAgent({ prompt: '/expensive-task' });

  execution.watch(({ metrics }) => {
    if (metrics.totalCostUsd && metrics.totalCostUsd > 1.0) {
      expect.fail(`Cost exceeded budget: $${metrics.totalCostUsd.toFixed(4)}`);
    }
  });

  await execution; // Aborts if cost > $1.00
});

then

then<T, U>(
  onFulfilled?: (value: RunResult) => T | Promise<T>,
  onRejected?: (reason: unknown) => U | Promise<U>
): Promise<T | U>

Makes AgentExecution awaitable by implementing the thenable interface.

Parameters:

onFulfilled - Called when execution succeeds
onRejected - Called when execution fails

Returns:

Promise<T | U> - Promise resolving to fulfillment or rejection value

Usage:

// Await directly
const result = await runAgent({ prompt: '/task' });

// Use then/catch
runAgent({ prompt: '/task' })
  .then(result => console.log('Success:', result.files.stats()))
  .catch(error => console.error('Failed:', error));

// Promise.all
const [result1, result2] = await Promise.all([
  runAgent({ prompt: '/task1' }),
  runAgent({ prompt: '/task2' })
]);

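// Promise.race (rejects if the agent takes longer than 60 seconds)
// Note: losing the race does not stop the agent; call abort() to cancel it.
const raced = await Promise.race([
  runAgent({ prompt: '/task' }),
  new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error('Timeout')), 60_000)
  )
]);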

catch

catch<U>(
  onRejected?: (reason: unknown) => U | Promise<U>
): Promise<RunResult | U>

Handle errors from execution or watchers.

Parameters:

onRejected - Error handler function

Returns:

Promise<RunResult | U> - Promise resolving to result or error handler return value

Example:

vibeTest('handle execution errors', async ({ runAgent }) => {
  const result = await runAgent({ prompt: '/task' })
    .catch(error => {
      console.error('Execution failed:', error);
      // Return fallback result or re-throw
      throw error;
    });
});

finally

finally(onFinally?: () => void): Promise<RunResult>

Cleanup handler that runs whether execution succeeds or fails.

Parameters:

onFinally - Cleanup function

Returns:

Promise<RunResult> - Promise resolving to run result

Example:

vibeTest('cleanup example', async ({ runAgent }) => {
  let resourcesAllocated = false;

  const result = await runAgent({ prompt: '/task' })
    .finally(() => {
      // Always cleanup, even if execution fails
      if (resourcesAllocated) {
        console.log('Cleaning up resources');
      }
    });
});

abort

abort(reason?: string): void

Manually abort the execution. Use this to cancel long-running agents programmatically.

Parameters:

reason - Optional abort reason (included in rejection error)
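
One way such cancellation can be wired up is sketched below. This is an illustration under stated assumptions, not the library's implementation: `CancellableTask` and `demoAbort` are hypothetical names, and the sketch assumes the reason becomes the message of the rejection error. It also only rejects the outer promise; stopping the underlying work is a separate concern that is not modeled here.

```typescript
// Hypothetical cancellable wrapper; names are illustrative only.
class CancellableTask<R> {
  private rejectOuter!: (reason: Error) => void;
  readonly promise: Promise<R>;

  constructor(work: Promise<R>) {
    // Whichever happens first wins: the work settling, or abort() rejecting.
    this.promise = new Promise<R>((resolve, reject) => {
      this.rejectOuter = reject;
      work.then(resolve, reject);
    });
  }

  abort(reason?: string): void {
    // Rejecting an already settled promise is a no-op, so a late abort is harmless.
    this.rejectOuter(new Error(reason ?? "Aborted"));
  }
}

async function demoAbort(): Promise<string> {
  // A promise that never settles stands in for a long-running agent.
  const never = new Promise<string>(() => {});
  const task = new CancellableTask(never);

  task.abort("Timeout after 60s");

  try {
    await task.promise;
    return "completed";
  } catch (e) {
    return (e as Error).message;
  }
}
```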

Example - Timeout Implementation:

vibeTest('custom timeout', async ({ runAgent }) => {
  const execution = runAgent({ prompt: '/long-task' });

  // Abort after 60 seconds
  const timer = setTimeout(() => {
    execution.abort('Timeout after 60s');
  }, 60_000);

  try {
    await execution;
  } catch (error) {
    console.error('Aborted:', error); // error includes "Timeout after 60s"
  } finally {
    clearTimeout(timer); // don't leave the timer pending once the run settles
  }
});

Example - User Cancellation:

import { vibeTest } from '@dao/vibe-check';

vibeTest('cancellable task', async ({ runAgent }) => {
  const execution = runAgent({ prompt: '/task' });

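  // Abort when the user presses Ctrl+C (SIGINT)
  const onSigint = () => execution.abort('User cancelled');
  process.once('SIGINT', onSigint);

  try {
    await execution;
  } finally {
    process.off('SIGINT', onSigint); // don't leak the listener into later tests
  }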
});

WatcherFn Type

type WatcherFn = (ctx: PartialRunResult) => void | Promise<void>;

Watcher function type that receives partial execution state.

Behavior:

Can be sync or async
If it throws (or, when async, returns a promise that rejects), execution aborts immediately
Receives PartialRunResult with current state

Example:

const watchCost: WatcherFn = ({ metrics }) => {
  if (metrics.totalCostUsd && metrics.totalCostUsd > 1.0) {
    throw new Error(`Cost exceeded: $${metrics.totalCostUsd}`);
  }
};

const watchFiles: WatcherFn = async ({ files }) => {
  const changed = files.changed();
  if (changed.length > 50) {
    throw new Error(`Too many files changed: ${changed.length}`);
  }
};

Thenable vs Promise

AgentExecution is thenable but not a Promise subclass.

What Works:

// ✅ Awaiting
const result = await execution;

// ✅ Promise.all
await Promise.all([execution1, execution2]);

// ✅ Promise.race
await Promise.race([execution, timeout]);

// ✅ then/catch/finally
execution.then(...).catch(...).finally(...);

What Doesn’t Work:

// ❌ instanceof Promise
execution instanceof Promise; // false

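// ❌ Static Promise methods called on the instance
execution.any(...); // TypeError: execution.any is not a function
// (Promise.any([execution]) is fine, because it accepts thenables)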

Why Not a Promise Subclass?

AgentExecution needs custom behavior:

watch() method for reactive assertions
abort() method for cancellation
Avoid prototype pollution

Subclassing Promise would complicate the implementation and limit flexibility.
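
As a rough illustration of the composition alternative, here is a minimal sketch of a thenable that wraps an ordinary promise. It is not the library's code: `MiniExecution` and `demoThenable` are hypothetical names, and the real class also provides watch(), abort(), catch(), and finally(). The point is only that implementing then() is enough for `await` and `Promise.all` to accept the object, while `instanceof Promise` stays false.

```typescript
// Hypothetical stand-in for AgentExecution; names are illustrative only.
class MiniExecution<R> implements PromiseLike<R> {
  // The actual work lives in an ordinary promise held by composition.
  private readonly inner: Promise<R>;

  constructor(work: () => Promise<R>) {
    this.inner = work();
  }

  // Implementing then() is what makes the object awaitable.
  then<T = R, U = never>(
    onFulfilled?: ((value: R) => T | PromiseLike<T>) | null,
    onRejected?: ((reason: unknown) => U | PromiseLike<U>) | null
  ): Promise<T | U> {
    return this.inner.then(onFulfilled, onRejected);
  }
}

async function demoThenable(): Promise<{
  value: string;
  isPromise: boolean;
  all: string[];
}> {
  const exec = new MiniExecution(async () => "done");
  const other = new MiniExecution(async () => "also done");

  const value = await exec;                      // await accepts any thenable
  const all = await Promise.all([exec, other]);  // so does Promise.all

  return { value, isPromise: exec instanceof Promise, all };
}
```

Because the wrapper owns its promise rather than inheriting from Promise, extra methods can be added to the class without touching Promise itself.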

Complete Example

import { vibeTest } from '@dao/vibe-check';

vibeTest('complete agent execution example', async ({ runAgent, expect }) => {
  const execution = runAgent({
    prompt: '/implement user authentication'
  });

  // Reactive assertions
  execution
    .watch(({ files }) => {
      // Only allow auth-related files
      const nonAuthFiles = files.changed().filter(f =>
        !f.path.includes('auth')
      );
      if (nonAuthFiles.length > 0) {
        expect.fail('Modified non-auth files');
      }
    })
    .watch(({ metrics }) => {
      // Enforce cost budget
      if (metrics.totalCostUsd && metrics.totalCostUsd > 0.50) {
        expect.fail('Cost exceeded $0.50');
      }
    })
    .watch(({ todos }) => {
      // Track progress
      const completed = todos.filter(t => t.status === 'completed').length;
      console.log(`${completed} tasks completed`);
    });

  // Set timeout
  const timeout = setTimeout(() => {
    execution.abort('Timeout after 5 minutes');
  }, 300_000);

  try {
    // Await result
    const result = await execution;

    // Standard assertions after execution
    expect(result.files).toHaveChangedFiles(['src/auth/**']);
    expect(result).toCompleteAllTodos();

  } catch (error) {
    console.error('Execution failed:', error);
    throw error;

  } finally {
    // Cleanup
    clearTimeout(timeout);
  }
});