Part 1. Why did I decide to create a story with ChatGPT?
“AI and short animations: that sounds kind of complicated, doesn't it?”
In reality, taking the first step toward bringing your imagination to life can be surprisingly simple with tools like ChatGPT and the Vidu AI video generator.
In my creative process, I find myself using ChatGPT quite naturally: to organize ideas, to talk things through when I'm unsure, or to gradually put into words emotions that are hard to express.
The short animation I created this time, The Nameless Sound, was shaped little by little through dialogue with AI. Even during production, there wasn't anything particularly special about how I used it. Just like always, I kept having conversations with ChatGPT, and slowly uncovered the outline of the story.
The theme I chose was “memory.” Rather than tracing a clearly defined storyline, I wanted to depict something soft and vague — something that lingers quietly in someone's heart.
In this article, I'd like to share just a little about how I usually work — how I shape stories together with ChatGPT.
If you're thinking about trying to make an AI-generated animation, or even if you've never used AI before, I hope my natural creative process can offer you a helpful hint or two.

Part 2. Collaborative process with ChatGPT
2-1. Verbalization of genre and theme
The first question I posed to ChatGPT was:
"I'm thinking of creating a fictional anime using AI. What genre would be best for a mysterious, Western atmosphere like Harry Potter, or a slightly rusty steampunk image?"
At this point, the world, theme, and characters were still completely undecided.
I decided to start by putting the genre and atmosphere into words, and then explore from there together with ChatGPT.
At one point, I also shared a favorite image with ChatGPT and said, "I want to create an anime with this kind of girl and background atmosphere."
Building on the atmosphere of that image, I asked ChatGPT to suggest several themes and settings: "So what kind of story would suit this world?"
Among the suggestions, I was strongly attracted to the theme of "a girl who touches memories," and from there, I gradually solidified the outline of the story.
2-2. Title and protagonist setting
When the genre and theme were somewhat clear, I told ChatGPT: "Ultimately, I want to finish it as a short MV (within 5 minutes) with a story."
ChatGPT then summarized the conversation so far as a kind of planning memo, complete with a proposed structure.
Example:
- Genre (philosophical fantasy x sensory production)
- Concept (when you touch a building, memories flow as "sound and image")
- Characteristics of the protagonist (a silent girl who doesn't show much emotional turmoil)
- MV structure (a 5-minute production plan on a timeline)
- Music structure, world view rules, future production steps, etc.
Reading that helped me organize my thoughts. At that point, I told ChatGPT:
"From here on, I want to proceed step by step."
This is the "magic word" I always use when creating with ChatGPT. With this one phrase, the conversation becomes much easier to follow, and my own thinking deepens one step at a time.
The first step was to set the character of the protagonist and decide on a name.
I already had a clear impression of the girl. She's quiet and doesn't show much emotion, but her heart flutters slightly when she touches memories. At this point, though, she still didn't have a name.
When I asked ChatGPT to suggest some names that would suit this character, it responded with several names based on their meaning and sound. Among them, the name that caught my eye was "Lyra."
It wasn't too bright, but not too heavy either. The clear sound and the feeling of being called from somewhere far away perfectly matched the image of the girl I had in my mind.
By having the nameless "girl of memories" emerge as "Lyra", I felt like the entire work became much more personal.

2-3. Composing the song and lyrics
Once the direction of the story was solidified, the next step was to compose the song. For me, this order is very important.
If the sound comes first, I can assemble the video to match the flow of emotions. It's easier to imagine the editing, and above all, I like the feeling of "cutting and pasting the video to match the sound." For me, this is how the work takes shape more naturally.
So I said this to ChatGPT:
"I want to compose the song for the MV first. Can you do that?"
"It would be good to have lyrics. Which artists' songs would be closest to the image I have in mind?"
At this point, I was planning to ask another creator to compose the song, so I worked with ChatGPT to organize things like "How do I convey the image?" and "What kind of melody and lyrics would be appropriate?"
First, I thought about the direction of the lyrics to match the atmosphere of the work, and then we adjusted the scenery, emotional ups and downs, and the flow of the composition.
[Lyrics]
A name the city quietly left behind
A voice dissolved into the twilight's sigh
Only I can feel it still,
A presence echoing, faint and real

When I gently touch that rusted door
Laughter rewinds the time once more
Only the kindness that once lived here
Still quietly breathes, ever near

My shadow's gone, nowhere in sight
Not even held in the city's mind
But that's okay, what I hope for most
Is to touch a heart, like a passing ghost

Each time the bell rings, slightly late
A thread of the world starts to unlace
Chasing a song no one can hear
I'm still here, quietly near

Even if my name fades from memory
If this sound can reach somebody
In the moment you turn around
I want to live on in the lingering sound
When I completed these lyrics, I felt that the "sound" for this work was finally born. Even without the video, I had a strange feeling that the protagonist Lyra's feelings and the atmosphere of the world were quietly emerging.
2-4. Cutting and Story Structure
Once the foundations of sound and words were laid, we moved on to the phase of designing the flow of the entire video. The first request I made to ChatGPT was:
"I want you to design the flow of the storyboard."
From here, we began to refine our direction, including some practical issues.
One of these was a limitation of Midjourney v7. At the time of production, "character reference (cref)" was not yet supported, which made it difficult to show the face of the main character, Lyra, consistently across shots. So I told ChatGPT:
"It seems that we won't be able to use Lyra's face much, so I would like to focus on scenes with cityscapes and her back."
(Note: New features such as "omni reference" have since been released, but at the time of production, it was a constant process of trial and error.)
I also consulted ChatGPT about the following idea for adding variety to the video.
"I thought it would be interesting to have a scene looking down on the city from a bird's eye view, so I'd like to consider including a bird."
In this way, rather than clinging to the original ideal, we fine-tuned the direction in response to technical constraints and to how the staging itself evolved, a process that felt unique to collaborating with AI.
ChatGPT suggested "showing emotion through a character's back or posture" and "using a bird's-eye view" in line with these conditions. Based on these suggestions, we adjusted the placement of the final cuts and the directorial composition, and the core of the video gradually solidified.

2-5. Prompt Generation for Midjourney
With the story and music solidified and the general flow of the video in sight, the next step was visualization. For this work, I used Midjourney v7 to generate images, focusing on backgrounds and striking scenes.
The image generation was based on a 30-cut storyboard proposed by ChatGPT. Assuming a 5-minute MV, it laid out "what kind of scene" and "what emotions are being expressed" for each time segment.
I used this proposal as a starting point and asked ChatGPT to create a Midjourney prompt for each cut. I specified that the list be compiled in Google Spreadsheet format and that all prompts be written in English.
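For readers who want to prepare a similar sheet themselves, here is a minimal Python sketch of the arithmetic behind it: 5 minutes (300 seconds) divided into 30 cuts gives 10 seconds per cut. The even spacing, the column names, and the helper functions are all assumptions made for illustration; the actual storyboard from ChatGPT was not necessarily evenly timed, and the scene, emotion, and prompt columns are left blank to be filled in.

```python
import csv
import io


def format_timestamp(seconds: int) -> str:
    """Format a number of seconds as M:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


def build_cut_sheet(total_seconds: int = 300, num_cuts: int = 30) -> list[dict]:
    """Split the running time into equal-length cuts (an assumption; real cuts vary)."""
    rows = []
    for i in range(num_cuts):
        # Integer arithmetic keeps the last cut ending exactly at total_seconds.
        start = total_seconds * i // num_cuts
        end = total_seconds * (i + 1) // num_cuts
        rows.append({
            "cut": i + 1,
            "start": format_timestamp(start),
            "end": format_timestamp(end),
            "scene": "",      # hypothetical column: what kind of scene
            "emotion": "",    # hypothetical column: what emotion is expressed
            "prompt_en": "",  # hypothetical column: English Midjourney prompt
        })
    return rows


def to_csv(rows: list[dict]) -> str:
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_cut_sheet()
sheet = to_csv(rows)
print(sheet.splitlines()[0])  # header row
print(sheet.splitlines()[1])  # first cut
```

The resulting CSV text can be saved to a file and imported into Google Sheets, where each row can then be filled in by hand or with text pasted from a ChatGPT conversation.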
I did not follow the completed storyboard strictly. Instead, I lined up the images Midjourney produced and rebuilt the sequence based on my own feelings. For cuts where I felt that "this kind of scene is needed here," I prepared new prompts and generated additional images.
To keep the character's image consistent, I used Midjourney's Edit function to create background variations and Vidu's image-to-video function to turn Lyra's back view into video and extract it, then composited the results in Canva.
2-6. How to create poetic lines
Once the foundations for the video and sound were in place, the last thing I worked on was the wording.
The task was to gather words little by little, focusing on lines that gently accompany the story.
The first thing I asked ChatGPT was:
"I want you to create poetic lines that focus on Lyra's inner voice."
ChatGPT embraced my suggestion to use ambiguous expression and keep Lyra's existence mysterious, and offered several ideas for lines.
Some of the lines created this way were later reviewed again and worked into the video during editing.
For example, the line "If it resonates in you even after I disappear" was particularly memorable.
At first, I was a little worried that the "free expression" ChatGPT proposed would be too difficult to understand.
However, once ChatGPT explained the appeal of "stories born in the free space" and "expression that leaves it up to the viewer," and I experienced it for myself, I ended up liking that style.
