How-To Guides

These guides provide practical solutions to specific problems. Each guide is task-oriented and focuses on achieving a particular goal.

Testing Guides

Learn testing patterns including reactive watchers, cumulative state, custom matchers, and matrix testing.

Advanced Guides

Master advanced features like MCP servers, cost optimization, bundle cleanup, and multi-modal prompts.


Learn patterns for writing effective tests and assertions.

Implement fail-fast assertions with AgentExecution.watch() to catch issues early during agent runs.

Topics: Real-time monitoring, partial results, early termination, error detection

Track and aggregate state across multiple agent runs for comprehensive testing scenarios.

Topics: Multi-run tracking, state aggregation, cross-run analysis, data persistence

Use all available matchers for files, tools, quality checks, and cost constraints.

Topics: File matchers, tool matchers, quality matchers, cost matchers, matcher chaining

Generate Cartesian product tests to benchmark multiple models and configurations.

Topics: Test generation, model comparison, configuration matrices, performance analysis
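
As a rough illustration of the Cartesian product idea, the sketch below expands a set of named axes into every combination, one test case per combination. The `cartesian` helper and the axis names (`model`, `maxTurns`) are hypothetical and are not taken from this library's API; the guide itself describes the supported way to generate matrix tests.

```typescript
// Hypothetical helper: expands named axes into every combination.
// Not part of any library API; shown only to illustrate the technique.
function cartesian(
  axes: Record<string, readonly unknown[]>,
): Record<string, unknown>[] {
  // Start with one empty combination and extend it one axis at a time.
  let combos: Record<string, unknown>[] = [{}];
  for (const key of Object.keys(axes)) {
    const next: Record<string, unknown>[] = [];
    for (const combo of combos) {
      for (const value of axes[key]) {
        next.push({ ...combo, [key]: value });
      }
    }
    combos = next;
  }
  return combos;
}

// Illustrative axes: two placeholder model names and two turn limits.
const matrix = cartesian({
  model: ["model-a", "model-b"],
  maxTurns: [4, 8],
});
```

Each entry in `matrix` could then drive one generated test, so two models and two turn limits yield four test cases. The number of cases is the product of the axis sizes, which is worth keeping in mind when each run has a cost.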


Build production-ready agent workflows and pipelines.

Create multi-stage workflows that orchestrate complex agent interactions.

Topics: Stage definitions, cumulative context, cross-stage data sharing, pipeline composition

Implement retry logic and iterative workflows using until() helpers.

Topics: Retry strategies, convergence testing, iterative refinement, condition checking
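
The general shape of a retry-until-condition loop can be sketched as follows. This is a minimal, generic version under stated assumptions: `retryUntil`, its parameters, and the scoring example are hypothetical and do not reproduce the signature of the `until()` helpers mentioned above.

```typescript
// Hypothetical generic loop; NOT the library's until() helper.
// Runs `attempt` repeatedly until `isDone` accepts the result
// or `maxIterations` attempts have been made.
async function retryUntil<T>(
  attempt: (iteration: number) => Promise<T>,
  isDone: (result: T) => boolean,
  maxIterations: number,
): Promise<{ result: T; iterations: number; converged: boolean }> {
  if (maxIterations < 1) {
    throw new Error("maxIterations must be at least 1");
  }
  let iterations = 1;
  let result = await attempt(iterations);
  while (!isDone(result) && iterations < maxIterations) {
    iterations += 1;
    result = await attempt(iterations);
  }
  return { result, iterations, converged: isDone(result) };
}

// Illustrative usage: a stand-in "agent run" whose score improves by 40
// on each attempt, stopping once the score reaches 100.
let score = 0;
const outcome = retryUntil(
  async () => {
    score += 40;
    return score;
  },
  (s) => s >= 100,
  5,
);
```

Bounding the loop with `maxIterations` and reporting `converged` separately lets a caller tell a successful refinement apart from one that merely ran out of attempts.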

Build resilient workflows with comprehensive error handling strategies.

Topics: Error recovery, graceful degradation, fallback strategies, error reporting


Evaluate and benchmark agent quality systematically.

Leverage LLM-based evaluation to assess agent output quality.

Topics: Judge configuration, rubric application, scoring systems, quality gates

Design effective rubrics for consistent and reliable evaluation.

Topics: Rubric structure, criterion design, scoring scales, best practices

Compare models, configurations, and prompts with systematic benchmarking.

Topics: Performance metrics, cost analysis, model comparison, regression detection


Master advanced features and optimization techniques.

Integrate Model Context Protocol servers for enhanced agent capabilities.

Topics: MCP configuration, server integration, tool availability, context management

Reduce costs while maintaining quality through strategic optimizations.

Topics: Token reduction, model selection, prompt optimization, caching strategies

Manage artifact storage and implement cleanup policies.

Topics: Retention policies, storage management, cleanup strategies, disk usage

Use text, images, and files in your agent prompts effectively.

Topics: Image prompts, file attachments, mixed content, format handling