Lazy Loading & Memory Efficiency
Vibe-check captures complete execution context (file content, messages, diffs) but keeps memory usage low through lazy loading. Large data stays on disk until you actually need it.
The Problem
Agent executions can produce massive amounts of data:
- File changes: 100+ files, each 10-100 KB (up to ~10 MB total)
- Conversation messages: Long transcripts with tool results (1 MB+)
- Git diffs: Unified diffs for all changes (5 MB+)
If we loaded everything into memory:
    // ❌ BAD: Loading all data upfront
    const result = await runAgent({ agent, prompt });
    // Memory: 20 MB per test
    // 100 tests = 2 GB memory usage
Vitest runs tests in parallel. With 100 concurrent tests, memory usage explodes.
The Solution: Hybrid Architecture
Vibe-check uses a hybrid approach:
- Summaries in memory - Small metadata (file paths, sizes, hashes)
- Content on disk - Large data in RunBundle (lazy-loaded on demand)
- Lazy accessors - Data fetched only when accessed
    // ✅ GOOD: Lazy loading
    const result = await runAgent({ agent, prompt });
    // Memory: ~50 KB (just metadata)

    // Load specific file only when needed
    const file = result.files.get('src/auth.ts');
    const content = await file.after?.text(); // <-- NOW loads from disk
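A minimal sketch of how a lazy accessor of this kind might be built: the object holds only metadata, and the content is fetched when `text()` is called. `LazyContent`, `makeLazyContent`, and the injected `loader` are hypothetical names for illustration, not vibe-check's actual implementation; the loader stands in for a read from the RunBundle on disk.

```typescript
// Hypothetical sketch: metadata lives in memory, content is fetched on demand.
interface LazyContent {
  sha256: string;
  size: number;
  text(): Promise<string>;
}

// `loader` stands in for "read this blob from the on-disk bundle".
function makeLazyContent(
  sha256: string,
  size: number,
  loader: () => Promise<string>,
): LazyContent {
  return {
    sha256,
    size,
    // Nothing is read until text() is called. Each call goes back to the
    // loader, so this object never retains the content string itself.
    text: () => loader(),
  };
}
```

Not caching the result is a deliberate choice in this sketch: it keeps the in-memory footprint at metadata size, at the cost of re-reading if `text()` is called twice.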
Memory Comparison
Approach | Memory per Test | 100 Parallel Tests |
---|---|---|
Eager (all data in memory) | 20 MB | 2 GB |
Hybrid (lazy loading) | 50 KB | 5 MB |
Savings | 400x less | 400x less |
What’s Lazy-Loaded
1. File Content
In-memory (RunResult):
    interface FileChange {
      path: string;                 // ~50 bytes
      changeType: 'modified';       // ~10 bytes
      before?: {
        sha256: string;             // 64 bytes
        size: number;               // 8 bytes
        text(): Promise<string>;    // <-- Lazy
        stream(): ReadableStream;   // <-- Lazy
      };
      after?: { ... };              // Same
      stats?: { ... };              // ~50 bytes
    }
Total in-memory per file: ~200 bytes (metadata only)
On-disk (RunBundle):
    .vibe-bundles/abc123/
      files/
        before/
          <sha256>.txt.gz    # 10 KB compressed
        after/
          <sha256>.txt.gz    # 10 KB compressed
When you access:
    const content = await file.after?.text();
    // 1. Read from bundle: .vibe-bundles/abc123/files/after/<sha256>.txt.gz
    // 2. Decompress if gzipped
    // 3. Return string (now in memory: ~50 KB)
2. Conversation Messages
In-memory (RunResult):
    readonly messages: Array<{
      role: 'assistant';
      summary: string;            // First 120 chars (~120 bytes)
      ts: number;                 // 8 bytes
      load(): Promise<unknown>;   // <-- Lazy
    }>;
Total in-memory per message: ~150 bytes
On-disk (RunBundle):
    .vibe-bundles/abc123/
      messages.ndjson    # Full message content
When you access:
    const full = await msg.load();
    // Read full message from messages.ndjson
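As an illustration of what a `load()` call has to do, here is a sketch that picks one record out of NDJSON text by line index. It assumes one JSON value per line and that each in-memory summary remembers its line index; `loadMessageAt` is a hypothetical helper, and the real bundle reader may work differently.

```typescript
// Hypothetical sketch: pick one record out of NDJSON text by line index.
// Assumes one JSON value per line, as the NDJSON format requires.
function loadMessageAt(ndjson: string, index: number): unknown {
  const lines = ndjson.split("\n").filter((line) => line.trim() !== "");
  const line = lines[index];
  if (line === undefined) {
    throw new RangeError(`no message at index ${index}`);
  }
  return JSON.parse(line);
}
```

This version takes the whole file as a string for simplicity. For a large transcript, a reader that streams the file and stops at the requested line would keep memory bounded.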
3. Git Diffs
In-memory (RunResult):
    readonly git: {
      before?: { head: string; dirty: boolean };   // ~100 bytes
      after?: { head: string; dirty: boolean };    // ~100 bytes
      changedCount: number;                        // 8 bytes
      diffSummary(): Promise<Array<{ ... }>>;      // <-- Lazy
    };
On-disk (RunBundle):
    .vibe-bundles/abc123/
      git-diff.txt    # Full unified diff (potentially MB)
4. File Patches
In-memory (RunResult):
    interface FileChange {
      stats?: {
        added: number;     // 8 bytes
        deleted: number;   // 8 bytes
        chunks: number;    // 8 bytes
      };
      patch(format?: 'unified' | 'json'): Promise<string | object>;  // <-- Lazy
    }
On-disk (RunBundle):
    .vibe-bundles/abc123/
      git-diff.txt    # Unified diff (extracted per-file on demand)
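Per-file extraction from a combined diff can be sketched as follows. This assumes the file is standard `git diff` output in which every file section begins with a `diff --git a/<path> b/<path>` line, and that the path is unchanged on both sides (renames would need extra handling). `extractFilePatch` is a hypothetical name, not a vibe-check API.

```typescript
// Hypothetical sketch: pull one file's section out of `git diff` output.
// Assumes each file's section starts with "diff --git a/<path> b/<path>".
function extractFilePatch(fullDiff: string, path: string): string | undefined {
  // Split before every line that starts a new file section.
  const sections = fullDiff.split(/^(?=diff --git )/m);
  return sections.find((section) =>
    section.startsWith(`diff --git a/${path} b/${path}\n`),
  );
}
```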
API Patterns
Small Files: Use text()
For files < 10 MB:
    const file = result.files.get('src/auth.ts');
    const content = await file.after?.text();
    // Loads entire file into memory (string)
Memory impact: File size (e.g., 50 KB string in memory)
Large Files: Use stream()
For files > 10 MB:
    const file = result.files.get('data/large.json');
    const stream = file.after?.stream();

    // Process chunk by chunk (memory-efficient)
    if (stream) {
      for await (const chunk of stream) {
        processChunk(chunk);
      }
    }
Memory impact: One chunk at a time (e.g., ~4 KB per iteration)
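Iterating a byte stream generally yields chunks rather than whole lines, so line-oriented processing needs a small splitter in between. The sketch below assumes the chunks have already been decoded to strings (for raw bytes, a `TextDecoder` used with `{ stream: true }` would sit in front of it). `splitLines` is a hypothetical helper, not part of vibe-check.

```typescript
// Hypothetical helper: re-split decoded text chunks into lines.
// Chunk boundaries can fall mid-line, so the unfinished tail is carried over.
async function* splitLines(
  chunks: AsyncIterable<string>,
): AsyncGenerator<string> {
  let tail = "";
  for await (const chunk of chunks) {
    const parts = (tail + chunk).split("\n");
    tail = parts.pop() ?? ""; // last piece may be an incomplete line
    yield* parts;
  }
  if (tail !== "") yield tail; // final line without a trailing newline
}
```

Memory stays bounded by the longest single line plus one chunk, since only the carried-over tail is retained between iterations.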
Bulk Operations: Filter First
Inefficient (loads all files):
    // ❌ BAD: Loads content for all 100 files
    const allFiles = result.files.changed();
    for (const file of allFiles) {
      const content = await file.after?.text();
      if (content?.includes('TODO')) {
        // ...
      }
    }
Efficient (filter by metadata, then load):
    // ✅ GOOD: Filters by path first
    const srcFiles = result.files.filter('src/**/*.ts');
    for (const file of srcFiles) {
      const content = await file.after?.text();
      if (content?.includes('TODO')) {
        // ...
      }
    }
Why better: Only loads files matching the glob pattern.
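To show why filtering costs nothing in terms of content loading, here is a sketch of a path-only filter. It handles just `**/`, `**`, and `*`, which is far less than a real glob library, and is not how vibe-check matches paths; `globToRegExp` and `filterByGlob` are hypothetical names.

```typescript
// Hypothetical sketch: turn a simple glob into a RegExp and filter by path only.
// Supports "**/" (any number of directories), "**", and "*" (one path segment).
function globToRegExp(glob: string): RegExp {
  const pattern = glob.replace(
    /\*\*\/|\*\*|\*|[.+?^${}()|[\]\\]/g,
    (token) => {
      if (token === "**/") return "(?:.*/)?";
      if (token === "**") return ".*";
      if (token === "*") return "[^/]*";
      return "\\" + token; // escape regex metacharacters literally
    },
  );
  return new RegExp(`^${pattern}$`);
}

// Only the `path` field is inspected, so no file content is touched.
function filterByGlob<T extends { path: string }>(files: T[], glob: string): T[] {
  const re = globToRegExp(glob);
  return files.filter((file) => re.test(file.path));
}
```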
Content-Addressed Storage
Files are stored by SHA-256 hash to deduplicate content:
    .vibe-bundles/abc123/
      files/
        before/
          a1b2c3d4e5f6...txt.gz    # File content (hash-named)
        after/
          a1b2c3d4e5f6...txt.gz    # Same hash = same content
          f6e5d4c3b2a1...txt.gz    # Different hash = different content
Benefits:
- Deduplication - If a file hasn’t changed, before/after share the same hash (stored once)
- Integrity - Hash mismatch = corruption detected
- Efficient comparison - Compare hashes instead of full content
Example:
    const file = result.files.get('README.md');

    // Fast comparison (just hashes)
    if (file.before?.sha256 === file.after?.sha256) {
      console.log('File unchanged (content-wise)');
    }

    // No need to load content to compare
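The deduplication and integrity properties can be illustrated with a small in-memory stand-in for a content-addressed directory. `BlobStore` is a hypothetical class that keeps blobs in a `Map` instead of on disk; it uses Node's `createHash` for the SHA-256 digest and makes no claim about vibe-check's actual storage code.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory stand-in for a content-addressed blob directory.
class BlobStore {
  private blobs = new Map<string, string>();

  // Stores content under its SHA-256 hex digest; identical content is kept once.
  put(content: string): string {
    const sha256 = createHash("sha256").update(content, "utf8").digest("hex");
    if (!this.blobs.has(sha256)) this.blobs.set(sha256, content);
    return sha256;
  }

  // Re-hashes on read so a corrupted blob is reported instead of returned.
  get(sha256: string): string {
    const content = this.blobs.get(sha256);
    if (content === undefined) throw new Error(`unknown blob ${sha256}`);
    const actual = createHash("sha256").update(content, "utf8").digest("hex");
    if (actual !== sha256) throw new Error(`hash mismatch for ${sha256}`);
    return content;
  }

  // Number of distinct blobs currently stored.
  get count(): number {
    return this.blobs.size;
  }
}
```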
Compression
Large files are gzip-compressed to save disk space:
    // Storage decision (automatic):
    if (fileSize > 10 * 1024) { // 10 KB
      // Store as .txt.gz (compressed)
      await writeGzipped(content);
    } else {
      // Store as .txt (uncompressed, faster access)
      await writeRaw(content);
    }
When you load:
    const content = await file.after?.text();
    // Framework detects .gz extension and decompresses automatically
Trade-offs:
- Pros: 5-10x disk savings for large files
- Cons: Slight CPU overhead on load (negligible)
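A round-trip sketch of the threshold rule and extension-based decompression, using Node's `zlib`. `encodeBlob`, `decodeBlob`, `StoredBlob`, and `GZIP_THRESHOLD_BYTES` are hypothetical names, and the synchronous zlib calls are chosen only to keep the example short; the framework's own write and read paths may differ.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

const GZIP_THRESHOLD_BYTES = 10 * 1024; // 10 KB, the cutoff quoted above

interface StoredBlob {
  fileName: string;
  bytes: Uint8Array;
}

// Write path: compress only when the UTF-8 payload exceeds the threshold.
function encodeBlob(sha256: string, content: string): StoredBlob {
  const raw = Buffer.from(content, "utf8");
  return raw.byteLength > GZIP_THRESHOLD_BYTES
    ? { fileName: `${sha256}.txt.gz`, bytes: gzipSync(raw) }
    : { fileName: `${sha256}.txt`, bytes: raw };
}

// Read path: the file extension says whether to decompress.
function decodeBlob(blob: StoredBlob): string {
  const raw = blob.fileName.endsWith(".gz")
    ? gunzipSync(blob.bytes)
    : Buffer.from(blob.bytes);
  return raw.toString("utf8");
}
```

Keeping the decision in the file name means the reader needs no separate index to know how a blob was stored.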
Streaming for Reporters
HTML Reporter uses lazy loading for scalability:
    // Generate HTML report for 100 tests
    for (const test of tests) {
      const result = getRunResult(test);

      // Only load summaries (in-memory)
      const fileCount = result.files.changed().length;
      const cost = result.metrics.totalCostUsd;

      // Lazy-load diffs only for failed tests
      if (!test.passed) {
        for await (const event of result.timeline.events()) {
          renderEvent(event);
        }

        const file = result.files.get('problem.ts');
        const patch = await file.patch('unified');
        renderDiff(patch);
      }
    }
Memory usage: Only loads data for failed tests (not all 100).
When Data Is Loaded
Access Pattern | Load Timing | Memory Impact |
---|---|---|
result.files.changed() | Immediately | Metadata only (~200 bytes/file) |
file.after?.text() | On-call | Full file content (~50 KB) |
file.after?.stream() | On-iterate | One chunk at a time (~4 KB) |
result.timeline.events() | On-iterate | One event at a time (~1 KB) |
msg.load() | On-call | Full message (~10 KB) |
file.patch('unified') | On-call | Diff text (~20 KB) |
Benefits
1. Scalable Parallel Testing
    // 100 tests running concurrently
    // Memory: 100 tests × 50 KB = 5 MB
    // Without lazy loading: 100 tests × 20 MB = 2 GB
2. Fast Test Execution
    // Test completes immediately
    const result = await runAgent({ agent, prompt });
    expect(result).toHaveChangedFiles(['src/**']);
    // ✅ Passes without loading any file content
3. Efficient Matchers
    // Matcher only loads what it needs
    expect(result).toHaveChangedFiles(['src/**']);
    // 1. Filters by glob (metadata only)
    // 2. No file content loaded
4. Memory-Bounded Reporters
    // HTML reporter processes one test at a time
    for (const test of tests) {
      renderTest(test); // Load, render, discard
    }
    // Peak memory: one test's data (~20 MB max)
Summary
Lazy loading provides both memory efficiency and performance:
- Summaries in memory - Fast access to metadata
- Content on disk - Loaded only when accessed
- Content-addressed storage - Deduplication via hashes
- Compression - Disk space savings for large files
- Streaming APIs - Process large data without memory bloat
Result: Vibe-check can handle 100+ file changes without slowing down tests or bloating memory.
See Also
- Run Bundle - On-disk storage structure
- Auto-Capture - What data is captured
- Context Manager - Lazy loading implementation