AgentExecution
AgentExecution is a thenable class returned by runAgent() and stage(). It provides reactive watch capabilities for assertions during agent execution, enabling fail-fast testing.
Class Definition
class AgentExecution {
  watch(fn: WatcherFn): this;
  then<T, U>(
    onFulfilled?: (value: RunResult) => T | Promise<T>,
    onRejected?: (reason: unknown) => U | Promise<U>
  ): Promise<T | U>;
  catch<U>(onRejected?: (reason: unknown) => U | Promise<U>): Promise<RunResult | U>;
  finally(onFinally?: () => void): Promise<RunResult>;
  abort(reason?: string): void;
}
Important: AgentExecution is a thenable object (it implements then/catch/finally) but NOT a Promise subclass. It is fully awaitable and works with Promise.all/race, but instanceof Promise returns false.
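The thenable behavior described above can be illustrated without the library. The sketch below is not the AgentExecution implementation; `MiniThenable` and `demo` are hypothetical names, and the class only assumes what the note states: an object with a `then` method that wraps a Promise instead of extending it.

```typescript
// Hypothetical stand-in for AgentExecution: wraps a Promise rather than
// extending it. Not the library's actual implementation.
class MiniThenable<V> implements PromiseLike<V> {
  constructor(private readonly inner: Promise<V>) {}

  then<T = V, U = never>(
    onFulfilled?: ((value: V) => T | PromiseLike<T>) | null,
    onRejected?: ((reason: unknown) => U | PromiseLike<U>) | null
  ): Promise<T | U> {
    // Delegate to the wrapped Promise.
    return this.inner.then(onFulfilled, onRejected);
  }
}

async function demo(): Promise<{ awaited: number; viaAll: number[]; isPromise: boolean }> {
  const a = new MiniThenable(Promise.resolve(1));
  const b = new MiniThenable(Promise.resolve(2));

  const awaited = await a;                  // await only needs a then() method
  const viaAll = await Promise.all([a, b]); // Promise.all resolves thenables too
  const isPromise = a instanceof Promise;   // false: not on the Promise prototype chain

  return { awaited, viaAll, isPromise };
}
```

Code that branches on `instanceof Promise` will therefore not recognize such an object; checking for a callable `then` is the more general test.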
Methods
watch(fn: WatcherFn): this
Register a watcher function that runs during execution for reactive assertions.
Parameters:
- fn: Watcher function receiving PartialRunResult
Returns:
- this: For method chaining
When Watchers Run: Watchers are invoked after each significant hook event:
- PostToolUse: After each tool completes
- TodoUpdate: When TODO status changes
- Notification: When the agent sends notifications
Execution Guarantees:
- Watchers execute sequentially in registration order (not parallel)
- Each watcher completes before the next starts
- If any watcher throws, execution aborts immediately
- No race conditions: only one watcher runs at a time
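The guarantees above amount to an awaited loop over the registered watchers. The following is a minimal sketch of that idea, not the library's dispatcher; `Watcher`, `runWatchers`, and `demoOrder` are hypothetical names introduced for illustration.

```typescript
type Watcher<S> = (state: S) => void | Promise<void>;

// Hypothetical dispatcher: runs watchers one at a time, in registration order.
// A throw (or rejection) from any watcher propagates and skips the rest.
async function runWatchers<S>(watchers: Watcher<S>[], state: S): Promise<void> {
  for (const watcher of watchers) {
    await watcher(state); // the next watcher does not start until this one settles
  }
}

async function demoOrder(): Promise<string[]> {
  const log: string[] = [];
  const watchers: Watcher<number>[] = [
    async () => {
      // Slow async watcher: the second watcher still waits for it.
      await new Promise<void>((resolve) => setTimeout(resolve, 10));
      log.push("first");
    },
    () => {
      log.push("second");
      throw new Error("limit exceeded");
    },
    () => {
      log.push("third"); // never reached
    },
  ];

  try {
    await runWatchers(watchers, 0);
  } catch (error) {
    log.push(`aborted: ${(error as Error).message}`);
  }
  return log;
}
```

Here `demoOrder()` resolves to `["first", "second", "aborted: limit exceeded"]`: the slow first watcher finishes before the second starts, and the third never runs.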
Example - Fail-Fast on File Violations:
import { vibeTest } from '@dao/vibe-check';

vibeTest('restrict file changes', async ({ runAgent, expect }) => {
  const execution = runAgent({ prompt: '/refactor authentication' });

  // Abort if non-auth files are modified
  execution.watch(({ files }) => {
    const authFiles = files.changed().filter(f =>
      f.path.startsWith('src/auth/')
    );
    const otherFiles = files.changed().filter(f =>
      !f.path.startsWith('src/auth/')
    );

    if (otherFiles.length > 0) {
      expect.fail(`Modified non-auth files: ${otherFiles.map(f => f.path)}`);
    }
  });

  const result = await execution; // Only reaches here if watcher never threw
});
Example - Multiple Watchers (Sequential):
vibeTest('multiple watchers', async ({ runAgent, expect }) => {
  const execution = runAgent({ prompt: '/task' });

  execution
    .watch(({ tools }) => {
      // Watcher 1: Limit tool failures
      expect(tools.failed().length).toBeLessThan(3);
    })
    .watch(({ metrics }) => {
      // Watcher 2: runs only if watcher 1 passes
      expect(metrics.totalCostUsd).toBeLessThan(5.0);
    })
    .watch(({ todos }) => {
      // Watcher 3: runs only if watchers 1 and 2 pass
      const completed = todos.filter(t => t.status === 'completed').length;
      expect(completed).toBeGreaterThan(0);
    });

  await execution;
});
Example - Cost Budget Enforcement:
vibeTest('enforce cost budget', async ({ runAgent, expect }) => {
  const execution = runAgent({ prompt: '/expensive-task' });

  execution.watch(({ metrics }) => {
    if (metrics.totalCostUsd && metrics.totalCostUsd > 1.0) {
      expect.fail(`Cost exceeded budget: $${metrics.totalCostUsd.toFixed(4)}`);
    }
  });

  await execution; // Aborts if cost > $1.00
});
then<T, U>(
  onFulfilled?: (value: RunResult) => T | Promise<T>,
  onRejected?: (reason: unknown) => U | Promise<U>
): Promise<T | U>
Make AgentExecution awaitable (thenable interface).
Parameters:
- onFulfilled: Called when execution succeeds
- onRejected: Called when execution fails
Returns:
- Promise<T | U>: Promise resolving to the fulfillment or rejection handler's value
Usage:
// Await directly
const result = await runAgent({ prompt: '/task' });

// Use then/catch
runAgent({ prompt: '/task' })
  .then(result => console.log('Success:', result.files.stats()))
  .catch(error => console.error('Failed:', error));

// Promise.all
const [result1, result2] = await Promise.all([
  runAgent({ prompt: '/task1' }),
  runAgent({ prompt: '/task2' })
]);

// Promise.race (rejects with an Error if the timer wins)
const raced = await Promise.race([
  runAgent({ prompt: '/task' }),
  new Promise((_, reject) =>
    setTimeout(() => reject(new Error('Timeout')), 60000)
  )
]);
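A timeout race like the one in the usage above leaves its timer pending when the agent finishes first. The sketch below shows one way to get the same effect while clearing the timer; `withTimeout` is a hypothetical helper, not part of the documented API, and it only assumes that the work is a thenable. It stops waiting but does not cancel the underlying work (abort() is the documented way to do that).

```typescript
// Hypothetical helper (not part of the documented API): settles with the
// work's outcome, or rejects with a timeout Error if `ms` elapses first.
function withTimeout<T>(work: PromiseLike<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timeout after ${ms}ms`)),
      ms
    );

    work.then(
      (value) => {
        clearTimeout(timer); // work won: cancel the pending timer
        resolve(value);
      },
      (reason) => {
        clearTimeout(timer); // work failed first: forward its error
        reject(reason);
      }
    );
  });
}
```

Because the parameter is typed as `PromiseLike<T>`, any thenable can be passed, which is the property AgentExecution relies on to work with Promise.all and Promise.race.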
catch<U>(
  onRejected?: (reason: unknown) => U | Promise<U>
): Promise<RunResult | U>
Handle errors from execution or watchers.
Parameters:
- onRejected: Error handler function
Returns:
- Promise<RunResult | U>: Promise resolving to the result or the error handler's return value
Example:
vibeTest('handle execution errors', async ({ runAgent }) => {
  const result = await runAgent({ prompt: '/task' })
    .catch(error => {
      console.error('Execution failed:', error);
      // Return fallback result or re-throw
      throw error;
    });
});
finally
finally(onFinally?: () => void): Promise<RunResult>
Cleanup handler that runs whether execution succeeds or fails.
Parameters:
- onFinally: Cleanup function
Returns:
- Promise<RunResult>: Promise resolving to the run result
Example:
vibeTest('cleanup example', async ({ runAgent }) => {
  let resourcesAllocated = false;

  const result = await runAgent({ prompt: '/task' })
    .finally(() => {
      // Always cleanup, even if execution fails
      if (resourcesAllocated) {
        console.log('Cleaning up resources');
      }
    });
});
abort(reason?: string): void
Manually abort the execution. Use this to cancel long-running agents programmatically.
Parameters:
- reason: Optional abort reason (included in the rejection error)
Example - Timeout Implementation:
vibeTest('custom timeout', async ({ runAgent }) => {
  const execution = runAgent({ prompt: '/long-task' });

  // Abort after 60 seconds
  const timer = setTimeout(() => {
    execution.abort('Timeout after 60s');
  }, 60_000);

  try {
    await execution;
  } catch (error) {
    console.error('Aborted:', error); // "Timeout after 60s"
  } finally {
    clearTimeout(timer); // Don't leave the timer pending if the run ends early
  }
});
Example - User Cancellation:
import { vibeTest } from '@dao/vibe-check';

vibeTest('cancellable task', async ({ runAgent }) => {
  const execution = runAgent({ prompt: '/task' });

  // Simulate user clicking "Cancel" button
  process.on('SIGINT', () => {
    execution.abort('User cancelled');
  });

  await execution;
});
WatcherFn Type
type WatcherFn = (ctx: PartialRunResult) => void | Promise<void>;
Watcher function type that receives partial execution state.
Behavior:
- Can be sync or async
- If it throws, execution aborts immediately
- Receives PartialRunResult with current state
Example:
const watchCost: WatcherFn = ({ metrics }) => {
  if (metrics.totalCostUsd && metrics.totalCostUsd > 1.0) {
    throw new Error(`Cost exceeded: $${metrics.totalCostUsd}`);
  }
};

const watchFiles: WatcherFn = async ({ files }) => {
  const changed = files.changed();
  if (changed.length > 50) {
    throw new Error(`Too many files changed: ${changed.length}`);
  }
};
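Since a watcher is an ordinary function, limits like the ones above can be parameterized with a small factory. This is a sketch under assumptions: `PartialStateSketch` and `WatcherFnSketch` are local stand-ins that model only the `metrics.totalCostUsd` field used in the examples, and `maxCost` is a hypothetical helper rather than a library export.

```typescript
// Local stand-ins for the library's types; only the field used here is modeled.
interface PartialStateSketch {
  metrics: { totalCostUsd?: number };
}
type WatcherFnSketch = (ctx: PartialStateSketch) => void | Promise<void>;

// Hypothetical factory: returns a watcher that throws once the reported
// cost passes `limitUsd`. A missing cost is treated as "nothing to check".
function maxCost(limitUsd: number): WatcherFnSketch {
  return ({ metrics }) => {
    const cost = metrics.totalCostUsd;
    if (cost !== undefined && cost > limitUsd) {
      throw new Error(
        `Cost exceeded: $${cost.toFixed(2)} > $${limitUsd.toFixed(2)}`
      );
    }
  };
}
```

With the real types, a factory of this shape would presumably be registered as `execution.watch(maxCost(1.0))`, so the same budget rule can be reused across tests with different limits.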
Thenable vs Promise
AgentExecution is thenable but not a Promise subclass.
What Works:
// ✅ Awaiting
const result = await execution;

// ✅ Promise.all
await Promise.all([execution1, execution2]);

// ✅ Promise.race
await Promise.race([execution, timeout]);

// ✅ then/catch/finally
execution.then(...).catch(...).finally(...);
What Doesn’t Work:
// ❌ instanceof Promise
execution instanceof Promise; // false

// ❌ Methods outside the class definition (Promise.any is a static on Promise, not an instance method)
execution.any(...); // Error: not a method
Why Not a Promise Subclass?
AgentExecution needs custom behavior:
- watch() method for reactive assertions
- abort() method for cancellation
- Avoid prototype pollution
Subclassing Promise would complicate the implementation and limit flexibility.
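One way to picture the composition approach is a plain class that owns a Promise and adds watch() and abort() beside then(). The sketch below is an illustration only, not the library's implementation: `ExecutionSketch`, its `run` callback, and `emit` are hypothetical, and abort() here merely rejects the pending result without stopping the underlying work.

```typescript
type Watch<S> = (state: S) => void | Promise<void>;

// Hypothetical composition-based thenable. `run` stands in for the agent
// loop and calls `emit` whenever there is new partial state.
class ExecutionSketch<S, R> implements PromiseLike<R> {
  private readonly watchers: Watch<S>[] = [];
  private readonly done: Promise<R>;
  private abortWith: (reason: Error) => void = () => {};

  constructor(run: (emit: (state: S) => Promise<void>) => Promise<R>) {
    this.done = new Promise<R>((resolve, reject) => {
      this.abortWith = reject;
      // Start one microtask later so chained watch() calls register first.
      Promise.resolve()
        .then(() => run((state) => this.emit(state)))
        .then(resolve, reject);
    });
  }

  watch(fn: Watch<S>): this {
    this.watchers.push(fn);
    return this; // enables chaining
  }

  abort(reason?: string): void {
    // Rejects the result; a no-op if the run has already settled.
    this.abortWith(new Error(reason ?? "Aborted"));
  }

  then<T = R, U = never>(
    onFulfilled?: ((value: R) => T | PromiseLike<T>) | null,
    onRejected?: ((reason: unknown) => U | PromiseLike<U>) | null
  ): Promise<T | U> {
    return this.done.then(onFulfilled, onRejected);
  }

  private async emit(state: S): Promise<void> {
    for (const fn of this.watchers) await fn(state); // sequential, in order
  }
}

async function demoSketch(): Promise<string> {
  const execution = new ExecutionSketch<number, string>(async (emit) => {
    for (let step = 1; step <= 5; step++) await emit(step);
    return "finished";
  }).watch((step) => {
    if (step >= 3) throw new Error(`stopped at step ${step}`);
  });

  try {
    return await execution;
  } catch (error) {
    return (error as Error).message;
  }
}
```

In this sketch `demoSketch()` resolves to `"stopped at step 3"`: the watcher's throw rejects the `emit` call, which fails the run and rejects the awaited result. None of this requires touching the Promise prototype or overriding Promise internals.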
Complete Example
import { vibeTest } from '@dao/vibe-check';

vibeTest('complete agent execution example', async ({ runAgent, expect }) => {
  const execution = runAgent({ prompt: '/implement user authentication' });

  // Reactive assertions
  execution
    .watch(({ files }) => {
      // Only allow auth-related files
      const nonAuthFiles = files.changed().filter(f =>
        !f.path.includes('auth')
      );
      if (nonAuthFiles.length > 0) {
        expect.fail('Modified non-auth files');
      }
    })
    .watch(({ metrics }) => {
      // Enforce cost budget
      if (metrics.totalCostUsd && metrics.totalCostUsd > 0.50) {
        expect.fail('Cost exceeded $0.50');
      }
    })
    .watch(({ todos }) => {
      // Track progress
      const completed = todos.filter(t => t.status === 'completed').length;
      console.log(`${completed} tasks completed`);
    });

  // Set timeout
  const timeout = setTimeout(() => {
    execution.abort('Timeout after 5 minutes');
  }, 300_000);

  try {
    // Await result
    const result = await execution;

    // Standard assertions after execution
    expect(result.files).toHaveChangedFiles(['src/auth/**']);
    expect(result).toCompleteAllTodos();
  } catch (error) {
    console.error('Execution failed:', error);
    throw error;
  } finally {
    // Cleanup
    clearTimeout(timeout);
  }
});
See Also
- Reactive Watchers Guide - Using watch() for fail-fast testing
- PartialRunResult - Partial state type for watchers
- RunResult - Final result type
- runAgent() - Function that returns AgentExecution