Multi-Modal Prompts

This guide covers how to create multi-modal prompts using the prompt() helper. You’ll learn how to combine text, images, files, and slash commands for richer agent interactions.

The prompt() helper builds multi-modal user messages that combine:

  • Text - Natural language instructions
  • Images - Screenshots, diagrams, UI mockups
  • Files - Reference documents (PRDs, specs, examples)
  • Commands - Slash commands like /refactor, /analyze

import { prompt } from '@dao/vibe-check';

const messages = prompt({
  text: 'Implement this design',
  images: ['./mockup.png'],
  files: ['./requirements.md']
});

await runAgent({ prompt: messages });

Simple text prompts don’t need prompt():

// These are equivalent:
await runAgent({ prompt: 'Refactor auth.ts' });
await runAgent({ prompt: prompt({ text: 'Refactor auth.ts' }) });

Include screenshots or diagrams:

import { vibeTest, prompt } from '@dao/vibe-check';

vibeTest('implement from mockup', async ({ runAgent, expect }) => {
  const result = await runAgent({
    prompt: prompt({
      text: 'Implement this login screen',
      images: ['./mockups/login.png']
    })
  });

  expect(result).toHaveChangedFiles(['src/components/Login.tsx']);
});

Include reference documents:

vibeTest('implement from spec', async ({ runAgent, expect }) => {
  const result = await runAgent({
    prompt: prompt({
      text: 'Implement the API according to this spec',
      files: [
        './docs/api-spec.md',
        './docs/data-models.md'
      ]
    })
  });

  expect(result).toHaveChangedFiles(['src/api/**/*.ts']);
});

Execute Claude Code commands:

vibeTest('refactor with context', async ({ runAgent }) => {
  const result = await runAgent({
    prompt: prompt({
      command: '/refactor',
      files: ['./docs/architecture.md'] // Provide architectural context
    })
  });
});

Combine text, images, files, and commands:

vibeTest('full-featured prompt', async ({ runAgent }) => {
  const result = await runAgent({
    prompt: prompt({
      command: '/implement',
      text: 'Build the user profile page matching this design and requirements',
      images: [
        './mockups/profile-page.png',
        './mockups/mobile-view.png'
      ],
      files: [
        './docs/prd.md',
        './docs/api-schema.json',
        './examples/similar-component.tsx'
      ]
    })
  });
});

What the agent receives:

  1. /implement command
  2. Text instructions
  3. Two images (design mockups)
  4. Three reference files (PRD, API schema, example code)

prompt() accepts images as:

  • File paths - Automatically read and converted to base64
  • Buffers - Pre-loaded image data

import { readFile } from 'node:fs/promises';

// From file paths (preferred)
prompt({
  images: ['./screenshot.png', './diagram.jpg']
});

// From buffers
const imageBuffer = await readFile('./screenshot.png');
prompt({
  images: [imageBuffer]
});

// Mix both
prompt({
  images: [
    './screenshot1.png', // File path
    imageBuffer          // Buffer
  ]
});

Claude supports common image formats:

  • PNG (.png)
  • JPEG (.jpg, .jpeg)
  • GIF (.gif)
  • WebP (.webp)
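
If you collect image paths dynamically, it can help to filter out unsupported formats before passing them to prompt(). This is a minimal sketch; SUPPORTED_EXTENSIONS and onlySupportedImages are illustrative names, not part of the vibe-check API:

const SUPPORTED_EXTENSIONS = ['.png', '.jpg', '.jpeg', '.gif', '.webp'];

// Illustrative helper: keep only files Claude can read as images
function onlySupportedImages(paths: string[]): string[] {
  return paths.filter(path =>
    SUPPORTED_EXTENSIONS.some(ext => path.toLowerCase().endsWith(ext))
  );
}

prompt({
  text: 'Review these assets',
  images: onlySupportedImages(['./mockup.png', './notes.txt', './flow.webp'])
});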

Image loading errors are logged but don’t fail the prompt:

prompt({
  text: 'Analyze this design',
  images: [
    './exists.png',  // ✅ Loaded
    './missing.png', // ⚠️ Warning logged, skipped
    './invalid.txt'  // ⚠️ Warning logged, skipped
  ]
});
// Prompt succeeds with only exists.png

Files are automatically read and included as text:

prompt({
  text: 'Implement according to spec',
  files: ['./spec.md']
});

// Agent receives:
//   Text: "Implement according to spec"
//   File content: "# API Specification\n\n## Endpoints\n..."

Include multiple reference documents:

prompt({
  files: [
    './requirements.md',  // User requirements
    './architecture.md',  // System architecture
    './api-spec.json',    // API specification
    './examples/user.ts'  // Example code
  ]
});

Any text file format works:

  • Markdown (.md, .mdx)
  • Code (.ts, .js, .py, .java, etc.)
  • JSON/YAML (.json, .yaml, .yml)
  • Plain text (.txt)

Some complete examples that combine these features:

vibeTest('implement from design', async ({ runAgent, expect }) => {
  const result = await runAgent({
    prompt: prompt({
      text: `
        Implement this checkout page:
        - Match the design exactly
        - Use Tailwind CSS
        - Add form validation
        - Handle payment processing
      `,
      images: ['./designs/checkout.png'],
      files: [
        './docs/design-system.md',
        './src/components/Button.tsx' // Example component
      ]
    })
  });

  expect(result).toHaveChangedFiles(['src/pages/Checkout.tsx']);
});

vibeTest('implement API', async ({ runAgent, expect }) => {
  const result = await runAgent({
    prompt: prompt({
      command: '/implement',
      text: 'Create REST API endpoints for user management',
      files: [
        './openapi.json',          // OpenAPI specification
        './src/models/User.ts',    // Data model
        './src/middleware/auth.ts' // Auth example
      ]
    })
  });

  expect(result).toHaveChangedFiles(['src/routes/users.ts']);
  expect(result).toHaveUsedTool('Write');
});

vibeTest('fix visual bug', async ({ runAgent, expect }) => {
  const result = await runAgent({
    prompt: prompt({
      text: `
        Fix the alignment issue shown in the screenshot.
        The submit button should be aligned to the right.
      `,
      images: ['./bug-reports/misaligned-button.png'],
      files: ['./src/components/Form.tsx']
    })
  });

  expect(result.files.get('src/components/Form.tsx')).toBeDefined();
});

vibeTest('implement feature with examples', async ({ runAgent }) => {
  const result = await runAgent({
    prompt: prompt({
      text: `
        Implement user authentication similar to the example,
        but add OAuth support as shown in the diagram.
      `,
      images: ['./diagrams/oauth-flow.png'],
      files: [
        './requirements.md',
        './examples/basic-auth.ts',
        './config/oauth-providers.json'
      ]
    })
  });
});

A few best practices when building multi-modal prompts:

// ✅ Good: Reference existing code
prompt({
  text: 'Add OAuth support to the existing auth system',
  files: ['./src/auth/index.ts']
});

// ❌ Bad: Include entire codebase
prompt({
  text: 'Add OAuth support',
  files: [
    './src/auth/index.ts',
    './src/auth/password.ts',
    './src/auth/jwt.ts',
    './src/auth/session.ts',
    // ... 20 more files
  ]
});
// Too much context = higher cost, slower response

// ✅ Good: Clear instructions
prompt({
  text: 'Implement the mobile responsive version of this design, focusing on <768px screens',
  images: ['./mobile-mockup.png']
});

// ❌ Bad: Vague instructions
prompt({
  text: 'Do this',
  images: ['./mobile-mockup.png']
});

// ✅ Good: Logical order (spec → examples → related code)
files: [
  './requirements.md',     // What to build
  './examples/similar.ts', // How similar code works
  './src/utils/helper.ts'  // Related utilities
]

// ❌ Bad: Random order
files: [
  './src/utils/helper.ts',
  './requirements.md',
  './examples/similar.ts'
]

// ✅ Good: Optimized images (see the downscaling sketch below)
images: ['./mockup-optimized.png'] // 500 KB

// ❌ Bad: Uncompressed images
images: ['./mockup-raw.png'] // 8 MB
// Higher cost, slower upload, may hit limits
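
If a mockup file is very large, one option is to downscale it to a Buffer before including it. A sketch assuming the third-party sharp package is installed; the 1600px width and withoutEnlargement option are arbitrary example choices:

import sharp from 'sharp';

// Downscale a large mockup before including it in the prompt
const optimized = await sharp('./mockup-raw.png')
  .resize({ width: 1600, withoutEnlargement: true })
  .toBuffer();

prompt({
  text: 'Implement this design',
  images: [optimized] // prompt() accepts Buffers as well as paths
});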

// ✅ Good: One command with context
prompt({
  command: '/refactor',
  text: 'Focus on improving performance and readability',
  files: ['./docs/performance-guidelines.md']
});

// ❌ Bad: Separate prompts
// First: "/refactor"
// Then: "improve performance"
// Then: provide guidelines separately

Prompt inputs can also be built dynamically:

vibeTest('adaptive prompt', async ({ runAgent }) => {
  const isComplexFeature = true;

  const result = await runAgent({
    prompt: prompt({
      text: 'Implement user profile',
      files: [
        './requirements.md',
        // Include architecture docs only for complex features
        ...(isComplexFeature ? ['./architecture.md'] : [])
      ]
    })
  });
});

import { readdir } from 'node:fs/promises';

vibeTest('analyze all mockups', async ({ runAgent }) => {
  // Find all mockup files
  const mockupFiles = (await readdir('./mockups'))
    .filter(f => f.endsWith('.png'))
    .map(f => `./mockups/${f}`);

  const result = await runAgent({
    prompt: prompt({
      text: 'Analyze consistency across all mockups',
      images: mockupFiles
    })
  });
});

vibeTest('include related files', async ({ runAgent }) => {
  const featureFiles = [
    './src/features/auth/Login.tsx',
    './src/features/auth/Register.tsx',
    './src/features/auth/ForgotPassword.tsx'
  ];

  const result = await runAgent({
    prompt: prompt({
      text: 'Add OAuth support to all auth components',
      files: featureFiles
    })
  });
});

Multi-modal prompts work in workflows too:

import { vibeWorkflow, prompt } from '@dao/vibe-check';

vibeWorkflow('design implementation', async (wf) => {
  // Stage 1: Implement desktop view
  const desktop = await wf.stage('desktop view', {
    prompt: prompt({
      text: 'Implement desktop layout (>1024px)',
      images: ['./mockups/desktop.png'],
      files: ['./requirements.md']
    })
  });

  // Stage 2: Implement mobile view
  const mobile = await wf.stage('mobile view', {
    prompt: prompt({
      text: 'Add mobile responsive styles (<768px)',
      images: ['./mockups/mobile.png'],
      files: [desktop.files.get('src/App.tsx')?.path] // Reference desktop implementation
    })
  });

  // Stage 3: Add interactivity
  await wf.stage('interactivity', {
    prompt: prompt({
      command: '/implement',
      text: 'Add form validation and animations',
      files: ['./docs/interaction-spec.md']
    })
  });
});

If you run into problems, here are common symptoms and fixes.

Symptom: Images appear missing in agent context

Solution: Check file paths and permissions

import { access } from 'node:fs/promises';

// Verify images exist
const imagePath = './mockup.png';

try {
  await access(imagePath);
  console.log('✓ Image exists');
} catch {
  console.error('✗ Image not found:', imagePath);
}

Symptom: Files not included in prompt

Solution: Check file encoding and size

import { stat, readFile } from 'node:fs/promises';

const filePath = './spec.md';
const stats = await stat(filePath);
console.log('File size:', stats.size, 'bytes');

if (stats.size > 1024 * 1024) { // > 1 MB
  console.warn('File is large, may increase cost');
}

// Test read
const content = await readFile(filePath, 'utf-8');
console.log('Preview:', content.slice(0, 100));
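
When a prompt references many assets, a small preflight check can catch missing files and oversized inputs in one pass. This is a sketch only; checkPromptAssets is an illustrative helper with an arbitrary 1 MB warning threshold, not part of the vibe-check API:

import { access, stat } from 'node:fs/promises';

// Illustrative helper: confirm each asset exists and warn on large files
async function checkPromptAssets(paths: string[]): Promise<void> {
  for (const path of paths) {
    try {
      await access(path);
      const { size } = await stat(path);
      if (size > 1024 * 1024) { // arbitrary 1 MB example threshold
        console.warn(`⚠️ ${path} is ${(size / (1024 * 1024)).toFixed(1)} MB; large inputs increase cost`);
      }
    } catch {
      console.error(`✗ Missing asset: ${path}`);
    }
  }
}

await checkPromptAssets(['./mockups/login.png', './docs/api-spec.md']);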

Symptom: Multi-modal prompts cost more than expected

Solution: Reduce image/file count

// ✅ Good: Essential images only
images: ['./primary-mockup.png']

// ❌ Bad: All possible images
images: [
  './mockup1.png',
  './mockup2.png',
  './mockup3.png',
  './mockup4.png',
  './mockup5.png'
]

For reference, the prompt() signature:

function prompt(config: {
  /** Text content */
  text?: string;
  /** Images (file paths or buffers) */
  images?: Array<string | Buffer>;
  /** Files to include (content will be read) */
  files?: Array<string>;
  /** Slash command */
  command?: string;
}): AsyncIterable<SDKUserMessage>;

Returns: AsyncIterable<SDKUserMessage>, compatible with runAgent({ prompt }).
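
To see what prompt() actually produces, you can iterate the result directly. A minimal sketch, assuming you rebuild the prompt for the real run since iterating may consume the sequence; the exact SDKUserMessage shape comes from the SDK, so each message is simply logged as-is:

// Inspect the generated messages (debugging only)
const preview = prompt({
  text: 'Refactor auth.ts',
  files: ['./docs/spec.md']
});

for await (const message of preview) {
  console.log(message);
}

// Build a fresh prompt for the real run
await runAgent({
  prompt: prompt({ text: 'Refactor auth.ts', files: ['./docs/spec.md'] })
});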

