Hello! Today's topic is this familiar complaint:
"When you create an anime-style video with AI, the characters turn into different people from frame to frame."
Every time I hear that lament, I nod so hard my neck nearly snaps. Recently, though, the situation has changed dramatically.
Vidu has officially released Multi-Reference Consistency, which loads up to seven reference images at once and matches the tagged images against each frame, making it much easier to keep several characters consistent at the same time. Vidu's Reference to Video, however, currently generates only 4 seconds of video per run. This article therefore walks through the steps for building a 30-second PV from "8 x 4-second clips": 32 seconds of footage, which leaves a little room for trimming in the edit.
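The clip arithmetic can be checked with a few lines of Python. This is only a sketch: `plan_clips` is a hypothetical helper written for this article, not part of Vidu, and it assumes every clip is exactly 4 seconds long.

```python
import math

def plan_clips(target_seconds: int, clip_seconds: int = 4):
    """Return (start, end) windows that cover target_seconds with fixed-length clips."""
    count = math.ceil(target_seconds / clip_seconds)
    return [(i * clip_seconds, (i + 1) * clip_seconds) for i in range(count)]

clips = plan_clips(30)
total = clips[-1][1]  # end time of the last clip
print(len(clips), "clips,", total, "s generated,", total - 30, "s to trim in editing")
# prints: 8 clips, 32 s generated, 2 s to trim in editing
```

Rounding up is the point here: seven clips would give only 28 seconds, so the eighth clip is needed and the surplus is removed in editing.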
The setting is a rooftop adorned with glowing neon signs. There are three reasons for choosing this location:
1. It allows both horizontal and vertical movement
The added height variation brings dynamic action, keeping even 4-second clips visually engaging.
2. The background can be handled with a single image
A helipad floor and distant skyline—once these two elements are drawn, the camera movement won't feel unnatural. This means more reference slots can be dedicated to characters.
3. Lighting effects are easy to achieve
A nighttime cityscape with neon makes it simple to add rim lighting along the edges of characters' hair, which pairs beautifully with cel shading.
| Name | Age | Colors (HEX) | Fixed Items | Silhouette |
|---|---|---|---|---|
| Akari | 20 | Hair #C8A2FF + Star Emblem #FFD700 | Silver flight jacket, gold star on left chest | Slim, scar on one eyebrow |
I am often asked whether it is enough to give the color a rough name. In my experience, writing down the HEX code seems to reduce color drift in Midjourney considerably. Roughly speaking, it is like handing over the exact paint number instead of describing the shade.
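To see what a HEX code actually pins down, the short Python sketch below converts Akari's two colors into RGB values. `hex_to_rgb` is a hypothetical helper for illustration. It says nothing about how Midjourney interprets the code internally.

```python
def hex_to_rgb(code: str) -> tuple:
    """Convert a '#RRGGBB' string into an (R, G, B) tuple of 0-255 integers."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected 6 hex digits, got {code!r}")
    # Each pair of hex digits is one channel: red, green, blue.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

palette = {"Akari hair": "#C8A2FF", "Akari star emblem": "#FFD700"}
for name, code in palette.items():
    print(name, code, hex_to_rgb(code))
# Akari hair #C8A2FF (200, 162, 255)
# Akari star emblem #FFD700 (255, 215, 0)
```

A word like "lavender" covers a whole range of such triples, while the HEX code names exactly one of them.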
Niji 6 is an anime-specialized model known for its clean line art and vibrant colors. While v7.0 is also excellent, its prompt handling can be trickier, so for this tutorial, we'll use Niji 6.
Prompt: imagine PG-13 character reference, young adult woman (age 20) cyber-punk heroine,
front view centered, modern TV-anime style, thin colored line art, variable outline,
two-tone cel shading, short lavender hair with magenta streaks, silver flight jacket with gold star patch,
neutral grey background --ar 3:4 --stylize 150 --niji 6 --seed 777
Although the prompt asked for a front view, Midjourney returned a full three-view sheet. This image serves as the base for generating further poses.

Midjourney operates on Discord. Right-click the output image and select “Copy Link” so the URL can be reused in later prompts.
Replacing <Akari_front_URL> with the copied image link lets you keep the character design and art style consistent across different views.
Prompt: imagine profile left view, facing right, modern TV anime style, colored thin line art, two-tone cel shading, soft pastel palette, shallow depth of field, neutral grey BG, modest clothing, PG-13 --cref <Akari_front_URL> --sref <Akari_front_URL> --ar 3:4 --stylize 150 --niji 6
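Cref (Character Reference): Keeps the character's face, hair, and outfit consistent with the referenced image.
Sref (Style Reference): Maintains the overall art style and visual tone (e.g., oil painting or manga style).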
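Because every follow-up prompt repeats the same flags, it can help to assemble them programmatically. The sketch below is plain string formatting in Python: `build_prompt` is a hypothetical helper, and the flags and default values are simply the ones used in this article.

```python
def build_prompt(description: str, ref_url: str, aspect: str = "3:4",
                 stylize: int = 150) -> str:
    """Assemble a prompt that reuses one image as both character and style reference."""
    flags = [
        f"--cref {ref_url}",   # character reference
        f"--sref {ref_url}",   # style reference
        f"--ar {aspect}",
        f"--stylize {stylize}",
        "--niji 6",
    ]
    # Drop stray whitespace and a trailing comma before appending the flags.
    return f"{description.strip().rstrip(',')} " + " ".join(flags)

print(build_prompt("profile left view, facing right, neutral grey BG, PG-13",
                   "<Akari_front_URL>"))
```

Keeping the flags in one place means a change such as a different `--stylize` value is applied to every pose at once.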

Prompt: imagine PG-13 character reference, young adult man (age 22) cyber-samurai, front view, modern TV anime style, colored thin line art, two-tone cel shading, short crimson hair, black tech trench coat, glowing gold katana held down, neutral grey studio BG, symmetrical stance --ar 3:4 --stylize 150 --niji 6 --seed 888
This prompt generates a symmetrical front-facing base image of Ryu in a cyber-samurai style, featuring clear anime-style line art, bold cel shading, and a neutral background suitable for character reference use.

Prompt: imagine aerial downward slash pose, coat fluttering, red energy trails, cinematic dusk rooftop, anime style, variable-width outline, two-tone cel shading, gentle bloom, PG-13 --cref <Ryu_front_URL> --sref <Ryu_front_URL> --ar 16:9 --stylize 200 --niji 6
This prompt generates an action scene of Ryu in mid-air performing a downward slash. His coat flows dramatically with red energy trails against a cinematic dusk rooftop background. By using both --cref and --sref, the character design and visual style remain consistent with the base reference image.

Prompt: imagine orthographic reference sheet, hovering spherical drone mascot, diameter 30 cm, teal alloy body, central camera lens, four holographic wings emitting soft yellow light, modern anime cel shading, colored thin line art, neutral grey studio BG --ar 1:1 --stylize 120 --niji 6
This prompt generates a three-view (orthographic) reference sheet of Pixie, a 30 cm-wide floating spherical drone mascot. It features a teal alloy body, a central camera lens, and four soft yellow holographic wings, all rendered in clean anime-style cel shading with thin colored lines. The sheet is well suited to keeping the character consistent when planning the animation.

Prompt: imagine cinematic rooftop helipad at dusk, neon-lit skyline, soft fog layers,
2.5D anime background, thin colored line art, two-tone cel shading,
no characters --ar 16:9 --stylize 250 --niji 6 --seed 12345
Note: Including "no characters" in the prompt is essential. If it is omitted, random passersby may appear in the scene, which can confuse the Multi-Reference system and compromise character consistency.

Generate several additional images for each of Akari, Ryu, and Pixie, such as a left profile, a back view, and an action pose (running, slashing, or rotating). Referring to the base image with --cref <Akari_front_URL> and explicitly stating "hair remains lavender" helps minimize color drift.
One set consists of the base front view plus two auxiliary views (side and back), which matches the maximum of three images that the Vidu AI video generator lets you upload at the same time. For running poses, generating at a 16:9 aspect ratio helps prevent "limb clipping."
After generating the images, download them and organize the file names like this:
Akari_front.png
Akari_profile.png
Ryu_front.png
…
BG_rooftop.png
This will make tagging in Vidu easier.
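If there are many files, a short script can confirm that they all follow the naming scheme. The following is a minimal Python sketch. The regular expression is an assumption based on the example names above (`<Subject>_<view>.png`), and `group_by_subject` is a hypothetical helper.

```python
import re

# Assumed pattern for names such as "Akari_front.png" or "BG_rooftop.png".
NAME_PATTERN = re.compile(r"^(?P<subject>[A-Za-z0-9]+)_(?P<view>[A-Za-z0-9]+)\.png$")

def group_by_subject(filenames):
    """Group files by the part before the underscore and collect misnamed files."""
    groups, rejected = {}, []
    for name in filenames:
        match = NAME_PATTERN.match(name)
        if match:
            groups.setdefault(match["subject"], []).append(match["view"])
        else:
            rejected.append(name)
    return groups, rejected

groups, rejected = group_by_subject(
    ["Akari_front.png", "Akari_profile.png", "Ryu_front.png", "BG_rooftop.png", "image (3).png"]
)
print(groups)    # views collected per subject
print(rejected)  # files that still need renaming
```

Anything that lands in `rejected` is a file that kept its default download name and would be hard to tell apart when tagging.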
If you select "Speed" for the resolution, you can click the HD button in the top right of the thumbnail after generation to upscale to 1080p.

1. You can register up to three images at once using the My References button.

2. The order in which the images are added automatically determines the reference order, so place the most important one, the front view, at the top.
3. Set the Reference Name.
Enter a name that is easy to distinguish, such as the character name or pose name.
```
@SCENE_BG
@Akari_front
@Ryu_front
@Pixie_front
@Akari_sheet ... (auxiliary)
@Ryu_sheet ... same
```

4. Check the Style.
If you want to add more depth, try changing it to 3D Rendering or 2.5D Animation.
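The registration advice above (front view first, at most three images at once) can be expressed as a small Python sketch. `pick_reference_set` is a hypothetical helper, and the `_front` suffix is an assumption that follows the file names used in this article.

```python
def pick_reference_set(views, limit=3):
    """Put front views first, keep the remaining input order, and cap the set at `limit`."""
    front = [v for v in views if v.endswith("_front")]
    others = [v for v in views if not v.endswith("_front")]
    return (front + others)[:limit]

akari = ["Akari_profile", "Akari_back", "Akari_front", "Akari_running"]
print(pick_reference_set(akari))
# ['Akari_front', 'Akari_profile', 'Akari_back']
```

The fourth image is simply dropped here. In practice you would choose which auxiliary view to keep depending on the shot.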

```
# C1 0-4s: Introduction
@SCENE_BG drone rise, neon skyline dusk, gentle bloom, PG-13
# C2 4-8s: Akari Running
@SCENE_BG @Akari_front @Akari_profile enters left sprinting, lavender trail, dolly-in, rim light
# C3 8-12s: Ryu Landing
@SCENE_BG @Ryu_front descends from the sky, gold katana sparks, tilt-down, dust puff
# C4 12-16s: Pixie Joins
@SCENE_BG @Pixie_front hovers center, teal holo-wings pulse, zoom-in 120→80 mm
# C5 16-20s: Confrontation
@SCENE_BG @Akari_front @Ryu_front face-off, jackets flutter, static 50 mm lens
# C6 20-24s: Slow-Motion Dramatic Shot
@SCENE_BG slow-motion 0.5×, lavender vs gold energy arcs, handheld 3%
# C7 24-28s: Collision
@SCENE_BG impact shockwave, cyan-magenta grade, handheld 5%, debris particles
# C8 28-32s: Pull-Back End
@SCENE_BG camera pulls back, sunset sky fills frame, silhouettes, gentle bloom
```
✱ Quick Explanation of Camera Terms
When typing in the prompt input field, entering @ will automatically display reference suggestions.
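Since every @tag has to match a registered reference name, it is worth checking the storyboard text before pasting it into Vidu. The Python sketch below is one way to do that. The `registered` set is an assumed example, and `unknown_tags` is a hypothetical helper rather than anything Vidu provides.

```python
import re

# Assumed set of names registered under My References.
registered = {"SCENE_BG", "Akari_front", "Akari_profile", "Ryu_front", "Pixie_front"}

storyboard = {
    "C2": "@SCENE_BG @Akari_front @Akari_profile enters left sprinting, lavender trail, dolly-in, rim light",
    "C5": "@SCENE_BG @Akari_front @Ryu_front face-off, jackets flutter, static 50 mm lens",
}

def unknown_tags(prompt, known):
    """Return the @tags used in a prompt that are not in the set of registered names."""
    tags = re.findall(r"@(\w+)", prompt)
    return [tag for tag in tags if tag not in known]

for clip, prompt in storyboard.items():
    print(clip, unknown_tags(prompt, registered))
```

An empty list means every tag in that clip resolves. A storyboard that calls @Akari_profile when only @Akari_sheet was registered, for example, would show up here before any credits are spent.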

1. Switch between horizontal movement, vertical movement, and rotation every 4 seconds to add contrast.
2. If you include a lens focal length (for example, 50 mm) as a rough guide to camera distance, Vidu reproduces the perspective surprisingly faithfully.
3. Specify slow motion as "0.5×" in the prompt. Stretching a clip in post-editing degrades the image quality, so the result is smoother if Vidu generates the slow motion from the start.
A new feature added by Vidu in April 2025 is a tool that generates sound effects from nothing more than text and a timestamp. If you write something like "0-2 s: wind" or "2-4 s: sword clash," it layers the sounds according to the time ranges you specify.
Example: C7 Collision SE

Click the Timestamp button before entering the text. You can adjust the length of the sound effect with the Sets total duration control at the bottom of the screen.
In the past, we would search through sound effect websites and manually match waveforms, but with this feature, you can complete a clip in just a few seconds.
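The timestamp notation is regular enough to generate from a list. The Python sketch below builds cue lines in the "0-2 s: wind" style and checks that each one fits inside a 4-second clip. `format_cues` is a hypothetical helper, and the exact text format Vidu accepts should be confirmed in its UI.

```python
def format_cues(cues, clip_seconds=4):
    """Turn (start, end, label) tuples into 'start-end s: label' lines, checking the ranges."""
    lines = []
    for start, end, label in cues:
        # Each cue must start at or after 0, end after it starts, and stay inside the clip.
        if not 0 <= start < end <= clip_seconds:
            raise ValueError(f"cue {label!r} falls outside 0-{clip_seconds} s")
        lines.append(f"{start}-{end} s: {label}")
    return "\n".join(lines)

print(format_cues([(0, 2, "wind"), (2, 4, "sword clash")]))
# 0-2 s: wind
# 2-4 s: sword clash
```

The range check is the useful part: a cue that runs past the end of the clip is caught before it is typed into the tool.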
Suno AI is a music generation service that creates 2-3 minute songs from just text input.
I will skip the detailed explanation here, but you can generate background music by turning on "Instrumental" on the Suno AI Create screen and entering a text description under "Style."
Style Example
Japanese anime rock, uplifting Instruments: cinematic synthwave, pulsing bass arpeggio, punchy electronic drums
https://suno.com/song/48c1d3e0-3c4a-40f1-b29a-6910af0553e5?sh=3kMOrCZFOaSiQhbR

About 90% of the audio comes from Vidu's built-in sound effects and Suno AI music. The remaining 10% is polish with EQ and limiters in Filmora and Resolve, which lets you finish a 30-second animation without any rework.
1. Hair color changes midway
2. Face appears in a back shot
3. Camera speed jumps at the 4-second boundary
| Name | Age | Colors (HEX) | Fixed Items | Silhouette |
|---|---|---|---|---|
| Ryu | 22 | Hair #FF3030 + Sword #FFD700 | Black trench coat, sword with pulsing red LEDs | Inverted triangle body type |
| Pixie | - | Core #00E0FF + Wings #FFFA00 | 30 cm spherical drone, 4 holographic wings | Sphere with wings |
The 4-second limit is not a "constraint" but rather an editing point. Approached this way, Vidu becomes a high-speed feedback tool that lets you "think → instantly see results." You can register up to 3 reference images, but a 2-image setup with front and side views is more practical. After generation, a single click on the HD button upscales the clip to 1080p.
That being said, the UI and pricing of AI tools change daily. Please check the official documentation, and don't be afraid to try things out, even if you fail. I look forward to seeing your original AI animations on your timeline next!