Lesson 4: Media Messages (Multimodality)
This lesson demonstrates the correct way to send images to a multimodal (vision-capable) LLM using the SDK's built-in message helpers.
Code: lesson_4_media_messages.mjs
The most important part of this lesson is the message payload. To link an image with a question, you send the media message first, followed immediately by the user message in the same request array.
// lesson_4_media_messages.mjs
// Merci SDK Tutorial: Lesson 4 - Using Media Messages (Multimodality)

// --- IMPORTS ---
// We import helpers for each distinct message type we will send.
import { MerciClient, createMediaMessage, createUserMessage } from '../lib/merci.2.14.0.mjs';
import { token } from '../secret/token.mjs';

const MODEL = 'openai-gpt-5-mini';

async function main() {
    console.log(`--- Merci SDK Tutorial: Lesson 4 - Media Messages (Model: ${MODEL}) ---`);

    try {
        // --- STEP 1: INITIALIZE THE CLIENT ---
        console.log('[STEP 1] Initializing MerciClient...');
        const client = new MerciClient({ token });

        // --- STEP 2: DEFINE PROMPT AND INPUT DATA ---
        // We define the path to our local image and the text prompt that refers to it.
        console.log('[STEP 2] Preparing prompt and input data...');
        const imagePath = './image.png';
        const userPrompt = "What is in this image? Describe it in a single, detailed sentence.";

        // --- STEP 3: CONFIGURE THE CHAT SESSION ---
        // No special configuration is needed on the session itself, just the right model.
        console.log('[STEP 3] Configuring the chat session...');
        const chatSession = client.chat.session(MODEL);

        // --- STEP 4: PREPARE THE MESSAGE PAYLOAD ---
        // THIS IS THE MOST IMPORTANT PART OF THE LESSON.
        // To link an image with a question, you send the media message first,
        // followed immediately by the user message in the same request.
        console.log('[STEP 4] Building the message array with separate media and user messages...');
        const messages = [
            await createMediaMessage(imagePath),
            createUserMessage(userPrompt)
        ];

        // --- STEP 5: EXECUTE THE REQUEST & PROCESS THE RESPONSE ---
        console.log('[STEP 5] Sending multimodal request and processing stream...');
        let finalResponse = '';
        process.stdout.write('🤖 Vision Assistant > ');
        for await (const event of chatSession.stream(messages)) {
            if (event.type === 'text') {
                process.stdout.write(event.content);
                finalResponse += event.content;
            }
        }
        process.stdout.write('\n');
        console.log('\n[INFO] Stream finished. Response fully received.');

        // --- FINAL RESULT ---
        console.log('\n--- FINAL RESULT ---');
        console.log(`🖼️ Media > ${imagePath}`);
        console.log(`👤 User > ${userPrompt}`);
        console.log(`🤖 Vision Assistant > ${finalResponse}`);
        console.log('--------------------');
    } catch (error) {
        // --- ROBUST ERROR HANDLING ---
        if (error.code === 'ENOENT') {
            console.error(`\n[FATAL ERROR] Image file not found at "${error.path}"`);
            console.error(' Please make sure the image file exists before running the script.');
            process.exit(1);
        }
        console.error('\n[FATAL ERROR] An error occurred during the operation.');
        console.error(' Message:', error.message);
        if (error.status) {
            console.error(' API Status:', error.status);
        }
        if (error.details) {
            console.error(' Details:', JSON.stringify(error.details, null, 2));
        }
        if (error.stack) {
            console.error(' Stack:', error.stack);
        }
        console.error('\n Possible causes: Invalid token, network issues, or an API service problem.');
        process.exit(1); // Exit with a non-zero code to indicate failure.
    }
}

main().catch(console.error);
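The streaming loop in Step 5 can be tried without a token or network access. The following is a minimal sketch, not part of the Merci SDK: `mockStream` is a hypothetical stand-in for `chatSession.stream(messages)`, and the `status` event type is invented here only to show that non-text events are skipped. The `{ type, content }` event shape is the one the lesson code reads.

```javascript
// Hypothetical stand-in for chatSession.stream(messages).
// It yields events shaped like the ones the lesson reads: { type, content }.
// The 'status' event type is an assumption made for illustration only.
async function* mockStream() {
    yield { type: 'status', content: 'started' };
    yield { type: 'text', content: 'A red ' };
    yield { type: 'text', content: 'bicycle.' };
}

// Concatenate the content of every 'text' event and ignore all other events,
// mirroring how the lesson builds finalResponse.
async function collectText(stream) {
    let text = '';
    for await (const event of stream) {
        if (event.type === 'text') {
            text += event.content;
        }
    }
    return text;
}

collectText(mockStream()).then((text) => console.log(text)); // prints "A red bicycle."
```

Because `collectText` accepts any async iterable, the same function should work with a real stream, provided its events have the shape assumed above.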
Expected Output
Assuming your `image.png` shows the merci-sdk logo, the model analyzes the image and returns a one-sentence description. The final block of the script's output then looks like this:
--- FINAL RESULT ---
🖼️ Media > ./image.png
👤 User > What is in this image? Describe it in a single, detailed sentence.
🤖 Vision Assistant > The image displays a modern, glossy logo featuring a large, three-dimensional 'M' symbol stylized as a flowing ribbon, transitioning from vibrant blue on the left to warm orange on the right, casting subtle shadows to enhance its depth, all positioned above the white, lowercase, sans-serif text "merci-sdk" against a solid dark navy blue background.
--------------------
Prerequisite
Before running, you must create an image file named `image.png` in the same directory as the script, or change the `imagePath` variable to point to an existing image.
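If you would rather check the prerequisite up front than rely on the `ENOENT` branch of the error handler, a small pre-flight check is one option. This is a sketch using only Node.js built-ins. `imageExists` is a hypothetical helper, not an SDK function, and it assumes the path is resolved the same way the SDK resolves it, which the lesson does not confirm.

```javascript
import { access } from 'node:fs/promises';
import { constants } from 'node:fs';

// Hypothetical helper (not part of the Merci SDK).
// Resolves to true when the file exists and is readable, false otherwise.
async function imageExists(imagePath) {
    try {
        await access(imagePath, constants.R_OK);
        return true;
    } catch {
        return false;
    }
}

// Possible usage inside main(), before building the message array:
//   if (!(await imageExists(imagePath))) {
//       console.error(`Image file not found at "${imagePath}"`);
//       process.exit(1);
//   }
```

The check is advisory: the file could still disappear between the check and the read, so the existing `ENOENT` handling remains worth keeping.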