Faceless Video Maker
Tutorial

How to Create Eye-Catching B-Roll Images for YouTube Videos Using AI

Stock footage is expensive and generic. The same clip appears in thousands of videos. AI-generated images let you create custom visuals that match exactly what your script is saying, and they're unique to your video.

8 min read

What is B-Roll and why does it matter?

B-Roll is the visual content that plays on screen while the narrator is speaking. In a traditional video production, it's the footage that supports the story: a cutaway to a street, an old photograph, a close-up of an object.

In a faceless YouTube video, B-Roll images are your entire visual layer. The viewer watches a sequence of images while hearing the voiceover. The quality, variety, and relevance of those images directly affect whether people keep watching.

Why not just use stock photos? Stock photos are generic, look dated, and often don't match the specific moment in your script. AI-generated images can show exactly what your narration is describing: a specific scene, a specific era, a specific mood.

Step-by-step: creating B-Roll images for your video

1. Set your image parameters

In the B-Roll Library, you'll see three settings before splitting your script:

Chars per image

How many characters of script each image covers. 500–700 chars ≈ one image per 15–20 seconds of narration. For a 3-minute video this gives you about 8–10 images.
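To get a feel for how this setting changes the number of images, here is a rough sketch of the arithmetic. It assumes the tool simply cuts the script every N characters and rounds up; the exact rounding isn't documented here, and `estimate_image_count` is a hypothetical helper, not part of the tool.

```python
import math


def estimate_image_count(script: str, chars_per_image: int) -> int:
    """Estimate how many images a script needs at a given chars-per-image setting.

    Assumption: the script is cut every `chars_per_image` characters and any
    leftover text gets its own image (ceiling division).
    """
    if chars_per_image <= 0:
        raise ValueError("chars_per_image must be positive")
    if not script:
        return 0
    return math.ceil(len(script) / chars_per_image)


# A stand-in for a 5,400-character script (length chosen only for illustration).
script = "x" * 5400
for setting in (500, 600, 700):
    print(setting, estimate_image_count(script, setting))
# prints:
# 500 11
# 600 9
# 700 8
```

A lower setting means more images and more visual variety, but also more credits spent on generation.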

Image Style

The visual style for all images in this video. Choose one and stick with it: consistency makes the video look professional.

Image Language

If any text appears inside the images (signs, labels, captions), this sets the language for that text.

2. Split your script into segments

Click "Split Script into Segments". The tool reads your script and divides it into equal chunks, one image segment per chunk. Each segment shows you the text it covers.

This step is free because it only splits text. No AI credits are used until you start generating images.
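If you want to preview roughly how your script will be divided before opening the tool, the idea can be sketched in a few lines. This is only an approximation: `split_script` is a hypothetical helper, and it assumes segments break at word boundaries, which may differ from how the tool actually cuts the text.

```python
def split_script(script: str, chars_per_image: int) -> list[str]:
    """Greedily pack whole words into segments of up to `chars_per_image` characters.

    Assumption: segments break between words. A single word longer than the
    limit is kept whole in its own segment rather than being cut.
    """
    if chars_per_image <= 0:
        raise ValueError("chars_per_image must be positive")
    segments: list[str] = []
    current = ""
    for word in script.split():
        candidate = f"{current} {word}" if current else word
        if current and len(candidate) > chars_per_image:
            # Adding this word would overflow the segment, so start a new one.
            segments.append(current)
            current = word
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


for segment in split_script("one two three four five six", 9):
    print(repr(segment))
```

Each returned string corresponds to one segment card, and therefore to one image.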

3. Generate images one by one (or all at once)

Each segment card has a "Generate Image" button. The AI reads the segment text, understands what's happening in that part of your story, and generates an image that matches.

You can click generate on all segments in quick succession, and they'll process in parallel. Most images are ready in about 30 seconds.
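To see why starting every segment at once is faster than waiting for each image in turn, here is a small illustration of parallel processing. Everything in it is hypothetical: `generate_image` is a stand-in that just sleeps briefly and returns a made-up file name, and nothing here calls the real tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def generate_image(segment_text: str) -> str:
    """Hypothetical stand-in for one image request; returns a fake file name."""
    time.sleep(0.05)  # pretend the request takes a while
    return f"image-{len(segment_text)}-chars.png"


def generate_all(segments: list[str]) -> list[str]:
    """Start every request at once and collect the results in segment order."""
    with ThreadPoolExecutor(max_workers=max(1, len(segments))) as pool:
        # pool.map returns results in the same order as the input segments.
        return list(pool.map(generate_image, segments))


segments = ["A dark hallway.", "A door creaks open.", "Footsteps on the stairs."]
print(generate_all(segments))
```

Because the requests overlap, the total wait is close to the time of the slowest single image rather than the sum of all of them.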

Screenshot: the B-Roll Library with Chars per image set to 1000, Image Style set to Anime/Manga, and Image Language set to English, above a grid of 3 generated anime-style images. Each image shows a scene from the horror script and has Regenerate and Delete buttons.

Each image card shows the segment text and the generated image. Regenerate any you want to change.

4. Review and regenerate

Once all images are generated, scroll through and identify any that don't fit. You can:

  • Regenerate: tries again with the same description (free if the previous attempt failed, otherwise 10 credits)
  • Edit the description: click on the segment text to change it, then regenerate with a more specific prompt
  • Delete and skip: if a segment works better without an image

Choosing the right image style for your niche

Your image style defines the visual identity of your channel. Pick one that fits your content and be consistent. Here are the main options:

Cinematic Photo

History, True Crime, Documentary

Realistic, filmic images. Works for any niche that benefits from a grounded, credible look.

Anime / Manga

Horror, Fantasy, Action

Stylised, expressive images. Great for story-based content where emotion is key.

Watercolor

Travel, Culture, Human Interest

Soft, artistic images. Works for content that needs warmth and texture.

Flat Illustration

Finance, Science, Education

Clean, modern images. Ideal for explainer content that needs clarity over atmosphere.

Isometric

Tech, Productivity, Business

3D-looking flat graphics. Makes abstract concepts visual and easy to follow.

Retro / Vintage

History, 70s–90s culture

Faded, aged look. Perfect for historical content or nostalgia-driven channels.

You can also type a custom style description instead of picking a preset, for example "dark oil painting with dramatic shadows" or "children's book illustration".

How images are timed in the exported video

You don't need to manually set how long each image stays on screen. When you export the video, the tool automatically divides the total voiceover duration evenly across all your B-Roll images.

For example, a 3-minute voiceover with 8 images means each image shows for approximately 22 seconds. During those 22 seconds, the image slowly pans and zooms (Ken Burns effect) to keep things visually dynamic.
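The even split is simple division, and the slow zoom is a gradual change over each image's time slot. The sketch below shows both under stated assumptions: crossfade overlap is ignored, and the zoom range (1.0 to 1.1) is a made-up example value, since the tool's actual pan and zoom amounts aren't given here. Both function names are hypothetical.

```python
def image_schedule(voiceover_seconds: float, image_count: int) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds, splitting the voiceover evenly."""
    if image_count <= 0:
        raise ValueError("image_count must be positive")
    per_image = voiceover_seconds / image_count
    return [(i * per_image, (i + 1) * per_image) for i in range(image_count)]


def zoom_at(elapsed: float, duration: float,
            start_zoom: float = 1.0, end_zoom: float = 1.1) -> float:
    """Linearly interpolate a zoom factor across one image's time slot.

    The 1.0 -> 1.1 range is an illustrative assumption, not a value from the tool.
    """
    progress = min(1.0, max(0.0, elapsed / duration))
    return start_zoom + (end_zoom - start_zoom) * progress


schedule = image_schedule(180, 8)  # 3-minute voiceover, 8 images
print(schedule[0])   # prints (0.0, 22.5)
print(schedule[-1])  # prints (157.5, 180.0)
```

With 8 images over 180 seconds, each slot is 22.5 seconds, which matches the "approximately 22 seconds" in the example above.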

Transitions between images are crossfades: the last frame of one image blends into the first frame of the next. This gives the video a smooth, professional feel without any manual editing.
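For the curious, a crossfade is a weighted average of two frames, with the weight sliding from the outgoing image to the incoming one. Here is a minimal sketch, assuming a simple linear blend and frames represented as flat lists of pixel values; the tool's actual blending curve and transition length aren't specified here, and `crossfade` is a hypothetical helper.

```python
def crossfade(frame_a: list[float], frame_b: list[float], progress: float) -> list[float]:
    """Blend two frames: progress 0.0 is all frame_a, 1.0 is all frame_b."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of pixel values")
    t = min(1.0, max(0.0, progress))  # clamp to the valid range
    return [(1 - t) * a + t * b for a, b in zip(frame_a, frame_b)]


outgoing = [0.0, 100.0]   # two pixel values from the last frame of one image
incoming = [100.0, 0.0]   # the same two pixels in the first frame of the next
print(crossfade(outgoing, incoming, 0.25))  # prints [25.0, 75.0]
print(crossfade(outgoing, incoming, 0.5))   # prints [50.0, 50.0]
```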

Generate your first B-Roll images for free

100 free credits at signup. Start creating custom visuals for your next YouTube video today.
