Mastering the Prompt: The Ultimate Guide to Gaga AI Video Generation

The prompt is your director’s script: it determines the content and movement of your video. The more complete, precise, and rich your description, the higher the quality of the generated video and the more closely it will match your vision.

To help you get started quickly, we provide four distinct prompt formulas tailored to different creative needs, along with a detailed vocabulary dictionary.

Prompt Formulas

1. The BASIC Formula

Target User: New AI video users or those seeking simple creative inspiration. Simple, free-form prompts generate videos with more imaginative results.

| Component | Description | Example (Text-to-Video) |
| --- | --- | --- |
| Subject | The main focus (person, animal, object, or imaginative entity). | A black-haired girl in mecha-Hanfu |
| Scene | The environment where the subject is located (background, foreground, real, or fictional). | with her hair in an updo, turning to look at the camera |
| Motion | The specific action or movement (still, subtle, large, partial, or overall movement). | her soft, lustrous hair gently dancing in the air. |

Video Effect: A black-haired girl in mecha-Hanfu with her hair in an updo, turning to look at the camera, her soft, lustrous hair gently dancing in the air.
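
If you prefer to keep the three Basic components separate until the last moment, a small helper can join them for you. The following is only an illustrative sketch: `build_basic_prompt` is a hypothetical name, not part of Gaga AI, and its output is simply text to paste into the prompt box.

```python
def build_basic_prompt(subject: str, scene: str, motion: str) -> str:
    """Join the Basic formula parts (Subject + Scene + Motion) into one prompt."""
    # Trim whitespace and trailing punctuation so the joined text reads cleanly.
    parts = [p.strip().rstrip(".,") for p in (subject, scene, motion)]
    # Skip any component that was left empty.
    parts = [p for p in parts if p]
    return ", ".join(parts) + "." if parts else ""


prompt = build_basic_prompt(
    subject="A black-haired girl in mecha-Hanfu",
    scene="with her hair in an updo, turning to look at the camera",
    motion="her soft, lustrous hair gently dancing in the air.",
)
print(prompt)
```

Keeping the parts separate makes it easy to swap only the motion or only the scene between attempts.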
Image-to-Video Example

Prompt: “A cinematic 10-second video of a paraglider flying smoothly over the deep blue sea under a clear sky. The person with a red harness and helmet is gliding calmly, speaking naturally to the camera with synchronized lip movements. The golden-yellow parachute moves gently with the wind, while soft ambient ocean sounds and subtle background music enhance the atmosphere. The scene feels adventurous, peaceful, and realistic, with fluid motion and accurate facial expressions.”

Negative Prompt: “blurry, distorted face, unrealistic lip sync, glitchy movement, low resolution, extra limbs, deformed parachute, text artifacts, unnatural colors, overexposed sky, cartoonish style, grainy quality.”

Video Effect: The picture is used as the first frame of the video, and the rest of the video is generated according to the prompt.

2. The ADVANCED Formula

Target User: Users with some AI video experience. Adding richer, more detailed descriptions enhances video quality, vividness, and narrative depth.

| Component | Description | Example Detail |
| --- | --- | --- |
| Subject Detail | Detailed description of the subject’s appearance | e.g., “a black-haired Miao girl in minority attire” |
| Scene Detail | Detailed description of the environment | e.g., “a gloomy cathedral with cracked marble” |
| Motion Detail | Detailed description of movement features | Amplitude, speed, or effect (e.g., “violently shaking,” “shattered glass”) |
| Camera Language | Shot type, angle, lens, or camera movement | See dictionary below |
| Atmosphere | Description of the desired mood | e.g., “dreamy,” “lonely,” “epic” |
| Stylization | Description of the visual style | e.g., “Cyberpunk,” “Gouache Illustration,” “Wasteland Style” |
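
The six Advanced components can also be kept as named fields and rendered in a fixed order. This is a minimal sketch under assumptions: `AdvancedPrompt` and `render` are hypothetical names, the phrases are drawn from the examples above with a few connecting words added for illustration, and nothing here calls Gaga AI itself.

```python
from dataclasses import dataclass


@dataclass
class AdvancedPrompt:
    """Hypothetical container for the Advanced formula components."""
    subject: str
    scene: str
    motion: str
    camera: str = ""
    atmosphere: str = ""
    style: str = ""

    def render(self) -> str:
        # Order follows the table: subject, scene, motion, camera, atmosphere, style.
        ordered = [self.subject, self.scene, self.motion,
                   self.camera, self.atmosphere, self.style]
        # Drop empty components and normalise trailing full stops.
        sentences = [part.strip().rstrip(".") for part in ordered if part.strip()]
        return ". ".join(sentences) + "." if sentences else ""


advanced = AdvancedPrompt(
    subject="A black-haired Miao girl in minority attire",
    scene="She stands in a gloomy cathedral with cracked marble",
    motion="The windows are violently shaking, scattering shattered glass",
    camera="Close-up shot",
    atmosphere="Lonely atmosphere",
    style="Gouache Illustration style",
)
print(advanced.render())
```

Optional fields default to empty strings, so the same class also covers prompts that specify only subject, scene, and motion.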

3. The CAMERA MOVEMENT Formula

Target User: Users with clear requirements for camera work, suitable for professional video output. Adding specific movement details enhances video dynamism and narrative flow.

| Component | Description | Example Prompt |
| --- | --- | --- |
| Camera Movement Description | A specific description of the lens movement over time. (Keep movement duration under 5s.) | The camera starts on a full-screen, antique wooden screen, slowly panning left to reveal an ancient style girl sitting behind the screen, wearing Shu embroidery Hanfu, with her hair in a high bun, conducting an online video conference. |

Video Effect: The camera slowly pans left from an antique wooden screen, revealing an ancient style girl sitting behind it, who is wearing Hanfu and conducting an online video conference.

4. The TRANSFORMATION Formula

Target User: Users with a clear creative need for subject morphing. Adding a transformation description enhances the video’s fun and visual impact.

| Component | Description | Example Prompt |
| --- | --- | --- |
| Subject A | Features/state of the subject before transformation. | Japanese comic style. In a corner of a city street, a black cat crouches under a streetlamp, gazing at the distant neon lights. |
| Transformation Process | Detailed description of the process from A to B (enhances naturalness). | Suddenly, a blue light descends from the sky, quickly enveloping its body. The black cat rises into the air in the light, its black fur gradually dissipating as its body rapidly elongates. |
| Subject B | Features/state of the subject after transformation. | Its fur turns into a sleek black suit, outlining a slender silhouette. The cat ears disappear, and the facial contours gradually sharpen, eventually transforming into the handsome and cold face of a young boy. |

Video Effect: A black cat transforms into a handsome, mysterious young man in a black suit on a city street, in a Japanese comic style.
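
Because the Transformation formula is sequential (before, process, after), one way to keep the stages in order is to write them separately and join them at the end. The sketch below assumes a hypothetical helper, `build_transformation_prompt`, and uses shortened versions of the example text above; it only produces a string.

```python
def build_transformation_prompt(
    subject_a: str, process: str, subject_b: str, style: str = ""
) -> str:
    """Order the stages as: style, Subject A, transformation process, Subject B."""
    stages = []
    for text in (style, subject_a, process, subject_b):
        text = text.strip()
        if not text:
            continue
        # End every stage as a sentence so the stages do not run together.
        if text[-1] not in ".!?":
            text += "."
        stages.append(text)
    return " ".join(stages)


prompt = build_transformation_prompt(
    style="Japanese comic style",
    subject_a="In a corner of a city street, a black cat crouches under a streetlamp",
    process="A blue light descends from the sky and envelops its body as it rises into the air",
    subject_b="Its fur turns into a sleek black suit and its face becomes that of a young boy",
)
print(prompt)
```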

Bonus: The Sound Formula

For models like Wan2.5 that support native audio generation, you can describe the full soundscape.

| Sound Element | Description Details | Example |
| --- | --- | --- |
| Voice | Content, Emotion, Accent, Timbre. | A British woman’s voice saying: “Don’t blink,” in a nervous, sharp tone. |
| Sound Effects | Source, Action, Environment. | A glass breaks and shatters into tiny pieces, with a sudden silence afterward. |
| BGM | Style, Mood. | Epic orchestral background music, suspenseful style. |
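
A simple way to apply the Sound Formula is to append the sound sentences to the visual prompt. The helper below is a hedged sketch: `add_sound` is a hypothetical name, the visual prompt is a placeholder invented for the example, and whether a given model responds better to a particular ordering is not something this guide specifies.

```python
def add_sound(visual_prompt: str, voice: str = "", effects: str = "", bgm: str = "") -> str:
    """Append any non-empty sound descriptions after the visual description."""
    parts = [visual_prompt.strip()]
    for text in (voice, effects, bgm):
        text = text.strip()
        if text:
            parts.append(text)
    return " ".join(p for p in parts if p)


full_prompt = add_sound(
    "A woman stands in a dark hallway.",  # placeholder visual prompt
    voice="A British woman's voice saying: 'Don't blink,' in a nervous, sharp tone.",
    effects="A glass breaks and shatters into tiny pieces, with a sudden silence afterward.",
    bgm="Epic orchestral background music, suspenseful style.",
)
print(full_prompt)
```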

Prompt Dictionary: Enhancing Control and Expression

By writing prompts across different dimensions, you can increase the controllability and expressiveness of the generated video.

1. Shot Type

| Type | Example Prompt | Video Effect |
| --- | --- | --- |
| Close-Up | Close-up shot \| The lens focuses on a girl’s face in ancient style, with soft light casting a delicate outline on her skin. | Focus on intricate facial details. |
| Medium Shot | Medium shot \| The lens shows a girl in ancient style gracefully walking among flowers, her long skirt fluttering in the wind, as if blending with nature. | Shows the character’s body language and surroundings. |
| Long Shot | Long shot \| The lens shows a bustling city street. Pedestrians come and go on the wide sidewalks, creating a lively scene. | Captures the entire scene and environment. |
| Bird’s Eye View | Bird’s eye view \| The lens adopts a bird’s eye perspective, looking down on the entire city, showing the interwoven streets and buildings. | An overhead, geographic perspective. |
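
The shot-type examples above share one pattern: a shot label, a vertical bar, then the description. If you reuse that pattern often, a small lookup can apply it consistently. This is an illustrative sketch only; `SHOT_LABELS` and `with_shot_type` are hypothetical names, and the labels are copied from the examples above.

```python
SHOT_LABELS = {
    "close_up": "Close-up shot",
    "medium": "Medium shot",
    "long": "Long shot",
    "birds_eye": "Bird's eye view",
}


def with_shot_type(shot: str, description: str) -> str:
    """Prefix a description with its shot label, as in 'Long shot | ...'."""
    if shot not in SHOT_LABELS:
        known = ", ".join(sorted(SHOT_LABELS))
        raise ValueError(f"unknown shot type {shot!r}; expected one of: {known}")
    return f"{SHOT_LABELS[shot]} | {description.strip()}"


print(with_shot_type("long", "The lens shows a bustling city street."))
```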

2. Perspective

| Type | Example Prompt | Video Effect |
| --- | --- | --- |
| Low Angle | The video starts with a pair of walking legs. The lens is shot from a low angle, focusing on the movement of the feet. Shoes tread on the rough, deserted ground, surrounded by broken concrete and scattered weeds, showing the desolation and decadence of a wasteland style. | Makes the subject appear dominant or imposing. |
| Drone (FPV) | FPV drone perspective \| The video begins with an FPV (First-Person View) drone shot, bringing an immersive experience. The camera rapidly flies through the city’s skyscrapers, showing a magnificent urban landscape. | A dynamic, high-speed, immersive viewpoint. |

3. Lens

| Type | Example Prompt | Video Effect |
| --- | --- | --- |
| Fisheye Lens | Fisheye lens \| The video shows a scene under a fisheye lens, bringing a unique curved effect that makes the busy city street more eye-catching. | Creates a wide, distorted, high-impact visual. |
| Wide Angle | Wide-angle lens \| The video captures a surveillance scene of a busy city street with a wide-angle lens. The picture is broad with an open field of view. | Captures a large expanse of space. |

4. Camera Movement

| Type | Example Prompt | Video Effect |
| --- | --- | --- |
| Push In | A giant cubical stone stands in the center of the square, surrounded by tranquility. The camera slowly pushes in, gradually approaching the stone, and the rough texture and traces of time become clear. | Focuses attention on a subject or detail. |
| Pull Out | A giant cubical stone stands firmly in the center of the square. The camera slowly pulls out, the majestic outline of the stone gradually appears, and the surrounding pavement and lawn are slowly revealed. | Reveals the subject within its larger context. |
| Track/Follow | A cubical stone rolls in the center of the square, and the camera follows the movement. The camera closely tracks the gravel, capturing the subtle textures on the ground and the motion as the spring breeze sweeps past. | Maintains focus on a moving subject. |
| Orbit | A giant cubical stone stands in the center of the square. The camera rotates and orbits the stone, capturing the rough texture and the subtle sheen of light on the surface. | Shows the subject from all sides. |

5. Speed

| Type | Example Prompt | Video Effect |
| --- | --- | --- |
| Slow | The race car moves slowly; the background gradually clears in silence. The speed is gentle, bringing a relaxed experience. | Emphasizes the detail and tranquility of the scene. |
| Fast | The race car moves quickly; the background blurs instantly. The speed is rapid, bringing an intense thrill. | Emphasizes motion, speed, and adrenaline. |
| Slow Motion | The flow of people moves slowly, in slow motion, magnifying every pedestrian’s pace. | Highlights subtle movements and creates a dramatic effect. |
| Time-lapse | The video shows the amazing process of plants rapidly growing and blooming with a time-lapse effect. | Compresses long periods of time into a few seconds, emphasizing growth or change. |

6. Atmosphere

| Type | Example Prompt | Video Effect |
| --- | --- | --- |
| Vibrant/Joyful | The video shows a vibrant forest, with sunlight casting golden spots through the canopy. Birds chirp happily, and leaves sway with the music, creating a thriving and joyful scene. | A bright, lively, and energetic mood. |
| Dreamy/Quiet | The video shows a deep forest; the night stillness is like a soft veil. The surroundings are silent, with only the rustling of the wind. The sky is dark blue, with stars twinkling like distant gems, creating a serene and dreamy atmosphere. | A soft, peaceful, and ethereal mood. |
| Lonely/Pensive | The video shows a lonely forest, silent all around. Leaves gently fall, making a slight rustling sound. The empty space is filled only with the lonely echo of the wind, creating a lonely and melancholy atmosphere. | A quiet, reflective, and solitary mood. |
| Tense/Gloomy | The video shows a tense forest with a fierce wind whipping the treetops. Thick clouds cover the sky, creating a gloomy and heavy atmosphere, suggesting impending change or danger. | A dark, uneasy, and highly anxious mood. |
| Epic/Reverent | The video shows a majestic forest, with tall trees standing like guardians. Sunlight filters through the dense canopy, casting dappled light. The whole scene exudes grandeur and tranquility, inspiring a sense of awe. | A grand, powerful, and deeply respectful mood. |

7. Style

| Type | Example Prompt | Video Effect |
| --- | --- | --- |
| Cyberpunk | Retro Cyberpunk style – A cyber warrior in a leather jacket walks through an abandoned electronic factory under blinking neon lights. The camera pulls out from his back to show a futuristic city night scene. | High-tech, neon-lit, often dystopian style. |
| Wasteland Style | The video shows a stunning appearance of a celestial fairy in a wasteland style setting. She wears tattered but gorgeous clothes, with a pair of strange wings made of ruins and fragments, soaring above the barren landscape. | Post-apocalyptic, desolate, and often rugged aesthetic. |
| Line Art Illustration | In the courtroom, an intelligent and eloquent fox lawyer in a crisp suit is arguing passionately for his client. Line art animation. | A distinct visual style based on clean, defined outlines. |
| Chinese Anime | Chinese anime style time-traveling girl, studying etiquette in an ancient palace under candlelight, with every move showing classical elegance. | A stylized, distinctively Chinese animation aesthetic. |
| Felt Style | The video shows a figure in a kitchen, made of felt, adding a touch of childlike fun. The felt person is standing in a miniature kitchen, holding a small spatula, seemingly cooking with great care. | A charming, tactile, handcrafted visual style. |
| Classic Masterpiece | In Van Gogh’s Starry Night, a skateboarder in modern clothes weaves through twisted trees. The light and shadow effects under the starry sky interweave with the skateboard’s trajectory. | An animated interpretation based on a famous work of art. |
| Pixel Game | A person stands in a pixel game style world, equipped with the most gorgeous, high-resolution 8K texture package ever. | A retro, digitized, 8-bit or 16-bit video game aesthetic. |

Gaga AI Core Feature: Image Animation

The primary feature of Gaga AI is Image-to-Video Generation. It uses AI algorithms to analyze the content of your static image and, guided by your text prompt, generates realistic movement, expressions, or camera effects to bring the photo to life.

| Feature | Benefit |
| --- | --- |
| Preserves Detail | The output video maintains the look, style, and subject consistency of your original photo. |
| Realistic Motion | Animate characters with lifelike movement (e.g., blinking, breathing) or add dynamic environmental effects (e.g., wind, rippling water). |
| Creative Control | Use the text prompt to direct exactly how the image should move or what the camera should do. |

Your Steps to Create Video on Gaga AI

Here is the simple workflow for creating your masterpiece using the Gaga AI Video Generator:

Step 1: Choose Your Generation Mode

  • Select your preferred mode:
    • IT-to-AV (Talking Avatar): Start with a custom avatar and your prompt.
    • Image-to-Video (Digital Actor): Upload your starting image first.

Step 2: Set Your Video Parameters

  • Model Selection: Choose the best AI model for your needs (e.g., a high-consistency model for character shots or a speed model for fast drafts).
  • Aspect Ratio: Select the correct dimensions (e.g., 9:16 for TikTok/Reels, 16:9 for YouTube).
  • Duration: Set the desired length of your clip (usually 5-10 seconds).
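
If you plan clips for several platforms, it can help to note these three parameters in one place before opening the generator. The sketch below is purely a planning aid under stated assumptions: `VideoSettings` and `settings_for` are hypothetical names, the model string is a placeholder rather than a real Gaga AI model name, and the 5 to 10 second check mirrors the "usual" range above, not a documented hard limit.

```python
from dataclasses import dataclass

# Aspect ratios taken from the examples in Step 2.
ASPECT_RATIOS = {"tiktok": "9:16", "reels": "9:16", "youtube": "16:9"}


@dataclass(frozen=True)
class VideoSettings:
    model: str
    aspect_ratio: str
    duration_s: int


def settings_for(platform: str, model: str, duration_s: int = 5) -> VideoSettings:
    """Pick an aspect ratio for a platform and sanity-check the duration."""
    key = platform.strip().lower()
    if key not in ASPECT_RATIOS:
        raise ValueError(f"no aspect ratio recorded for {platform!r}")
    # Assumption: treat the usual 5-10 second range as the allowed range.
    if not 5 <= duration_s <= 10:
        raise ValueError("duration_s is outside the usual 5-10 second range")
    return VideoSettings(model, ASPECT_RATIOS[key], duration_s)


draft = settings_for("TikTok", model="speed-model-placeholder", duration_s=5)
print(draft)
```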

Step 3: Enter Your Image and Prompt

  • Upload a clear image, then enter a simple prompt in the text box describing the action or movement you want to see.
    • Bad Prompt: Make it move.
    • Good Prompt: The camera slowly pans across the landscape with a slight, subtle fog effect.
    • Better Prompt: The woman in the photo blinks and gives a slight smile. Cinematic 4K.

Step 4: Generate and Refine

  • Click the “Generate” button. The AI will process your request.
  • Review the generated video. If it’s close but not perfect, refine your prompt by adjusting one or two descriptive elements and generate again.
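
The advice to adjust only one or two elements per attempt can be made systematic by listing variants that each differ from your base prompt in exactly one component. This is a rough sketch, not a Gaga AI feature: `one_change_variants` is a hypothetical helper, and the alternative phrasings are invented for illustration.

```python
def one_change_variants(
    base: dict[str, str], alternatives: dict[str, list[str]]
) -> list[dict[str, str]]:
    """Return copies of `base` that each change exactly one component."""
    variants = []
    for component, options in alternatives.items():
        for option in options:
            # Skip options identical to the base so every variant is a real change.
            if option != base.get(component):
                variant = dict(base)
                variant[component] = option
                variants.append(variant)
    return variants


base = {
    "motion": "The woman in the photo blinks and gives a slight smile",
    "style": "Cinematic 4K",
}
alternatives = {
    "motion": ["The woman in the photo slowly turns her head"],
    "style": ["Cinematic 4K", "Soft film grain"],
}
variants = one_change_variants(base, alternatives)
for variant in variants:
    print(". ".join(variant.values()) + ".")
```

Changing one component at a time makes it easier to tell which edit improved or worsened the result.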

Pro Tip: Start with the Basic Formula to get an initial idea, then switch to the Advanced Formula to add specific details for lighting, style, and camera work. Happy creating!
