Video Generation

Generate videos from text prompts, animate still images, or edit existing videos with natural language. Choose between multiple AI models depending on your quality, speed, and audio needs.

Models

| Model | Resolution | Duration | Audio | Speed | Best for |
| --- | --- | --- | --- | --- | --- |
| Grok Video | Up to 720p | 1–15 seconds | Included automatically | Medium | Quick videos with audio out of the box |
| Luma Ray 2 | Up to 4K | 5s or 9s | Add after generation | Slower | High-resolution output, controlled audio |
| Luma Ray 2 Flash | Up to 4K | 5s or 9s | Add after generation | Faster | Previews, iterations, batch processing |

Choosing a model

  • Need audio included automatically? Use Grok Video
  • Need 1080p or 4K resolution? Use Luma Ray 2 (Ray 2 Flash supports the same resolutions when speed matters more)
  • Iterating on prompts or doing batch processing? Use Luma Ray 2 Flash for speed
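
The guidance above can be read as a small decision rule. The sketch below is illustrative only: `choose_model` and its parameters are hypothetical names, not part of the product, and the fallback to Luma Ray 2 when nothing special is requested is an assumption.

```python
def choose_model(needs_builtin_audio: bool = False,
                 min_resolution: str = "720p",
                 prioritize_speed: bool = False) -> str:
    """Pick a model name using the rules of thumb in this section (hypothetical helper)."""
    high_res = min_resolution in {"1080p", "4K"}
    if needs_builtin_audio:
        if high_res:
            # Grok Video is listed as "Up to 720p", so both needs cannot be met at once.
            raise ValueError(
                "Built-in audio (Grok Video) is limited to 720p; "
                "use a Luma model and add audio after generation."
            )
        return "Grok Video"
    if prioritize_speed:
        # Ray 2 Flash is listed with the same resolution ceiling as Ray 2.
        return "Luma Ray 2 Flash"
    # Assumed default when neither audio nor speed is the priority.
    return "Luma Ray 2"


print(choose_model(needs_builtin_audio=True))
print(choose_model(min_resolution="4K"))
```

Raising an error for the "audio plus high resolution" case makes the trade-off explicit: the nearest workaround described on this page is a Luma model followed by Add Audio.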

Generation modes

Text to video

Generate a video entirely from a text prompt. Describe the scene, action, and mood. Available on all models.

Image to video

Animate a still image. Upload a photo or select from the media library — the AI creates a video based on the image content and your prompt. Available on all models.

You can upload up to 10 images at once for batch processing. Each image creates a separate video job with the same prompt and settings. See Batch Processing below.

Video editing

Edit an existing video with natural language. Provide a source video and describe what to change — the AI modifies only what you ask for while preserving the rest.

Note: Video editing is currently available with Grok Video only.

Settings

Settings vary by model. The Generate page automatically updates available options when you switch models.

Grok Video

| Setting | Options | Notes |
| --- | --- | --- |
| Duration | 1–15 seconds | Flexible range |
| Aspect Ratio | 9:16, 16:9, 1:1 | Image-to-video defaults to Auto (matches source) |
| Resolution | 480p, 720p | |

Luma Ray 2 / Ray 2 Flash

| Setting | Options | Notes |
| --- | --- | --- |
| Duration | 5 seconds or 9 seconds | Only two options |
| Aspect Ratio | 9:16, 16:9, 1:1, 4:3, 3:4, 21:9, 9:21 | More options, including ultrawide |
| Resolution | 540p, 720p, 1080p, 4K | Higher resolutions available |
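
If you prepare settings outside the Generate page (for example in a planning spreadsheet or script), the two settings tables above can be encoded as data and checked before you submit anything. This is a minimal sketch: `ModelLimits`, `LIMITS`, and `validate_settings` are hypothetical names, whole-second durations for Grok Video are an assumption, and the image-to-video "Auto" aspect ratio is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelLimits:
    durations: frozenset       # allowed durations in seconds
    aspect_ratios: frozenset
    resolutions: frozenset


# Values copied from the settings tables; whole seconds for Grok are assumed.
GROK = ModelLimits(
    durations=frozenset(range(1, 16)),
    aspect_ratios=frozenset({"9:16", "16:9", "1:1"}),
    resolutions=frozenset({"480p", "720p"}),
)
LUMA = ModelLimits(
    durations=frozenset({5, 9}),
    aspect_ratios=frozenset({"9:16", "16:9", "1:1", "4:3", "3:4", "21:9", "9:21"}),
    resolutions=frozenset({"540p", "720p", "1080p", "4K"}),
)
LIMITS = {"Grok Video": GROK, "Luma Ray 2": LUMA, "Luma Ray 2 Flash": LUMA}


def validate_settings(model: str, duration: int, aspect_ratio: str, resolution: str) -> list:
    """Return a list of human-readable problems; an empty list means the combination is listed."""
    limits = LIMITS[model]
    problems = []
    if duration not in limits.durations:
        problems.append(f"duration {duration}s is not offered by {model}")
    if aspect_ratio not in limits.aspect_ratios:
        problems.append(f"aspect ratio {aspect_ratio} is not offered by {model}")
    if resolution not in limits.resolutions:
        problems.append(f"resolution {resolution} is not offered by {model}")
    return problems


print(validate_settings("Grok Video", 10, "21:9", "4K"))
```

Returning every problem at once, instead of stopping at the first, mirrors how a form would flag all invalid fields after a model switch.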

Audio

Grok Video — automatic audio

Grok-generated videos include audio automatically. The model generates appropriate sound effects, music, or ambient audio based on your prompt. There is no way to disable or customise the audio at generation time.

Luma — Add Audio (post-production)

Luma-generated videos are silent by default. After a Luma video completes, you can add AI-generated audio:

  1. Open the completed Luma video job
  2. Scroll to Add Audio
  3. Describe the audio you want (e.g., "calm background music with soft piano")
  4. Optionally describe what to avoid (e.g., "vocals, speech, loud drums")
  5. Click Add Audio

The audio is generated by Luma's AI and merged into the video. This typically takes 1–2 minutes. Check the Audit Log on the job detail page for an "audio added successfully" entry, then refresh the page to play the updated video.

Bulk Add Audio

Add the same audio to multiple Luma videos at once:

  1. Go to the Jobs page
  2. Select completed Luma video jobs (checkboxes)
  3. Click the Add Audio button that appears in the bulk action bar
  4. Enter one audio description — it applies to all selected videos
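
Conceptually, the bulk action fans one description out to every eligible job. The sketch below illustrates that fan-out under stated assumptions: the job dictionaries, their `id`, `model`, and `status` keys, and the `bulk_audio_requests` helper are all hypothetical and do not describe the product's internal data model or API.

```python
def bulk_audio_requests(jobs: list, description: str) -> list:
    """Build one add-audio request per completed Luma job (illustrative only)."""
    eligible = [
        job for job in jobs
        # Only completed Luma videos qualify for Add Audio on this page.
        if job["model"].startswith("Luma") and job["status"] == "completed"
    ]
    # The same description is reused for every selected video.
    return [{"job_id": job["id"], "description": description} for job in eligible]


selected = [
    {"id": 1, "model": "Luma Ray 2", "status": "completed"},
    {"id": 2, "model": "Grok Video", "status": "completed"},
    {"id": 3, "model": "Luma Ray 2 Flash", "status": "processing"},
]
print(bulk_audio_requests(selected, "calm background music with soft piano"))
```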

Batch processing

Upload multiple images at once in Image to Video mode to create separate video jobs for each image:

  1. Select Video as the asset type and Image to Video mode
  2. Click the Upload tab
  3. Select up to 10 images (hold Ctrl/Cmd to multi-select)
  4. A preview grid shows all selected images with a count
  5. Remove individual images if needed, or click Remove all
  6. Click Generate — each image creates a separate job

Progress shows "Creating job 3 of 10..." as jobs are submitted. All jobs appear on the Jobs page.
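
The batch behaviour described above (one job per image, shared prompt and settings, a limit of 10 images) can be sketched as follows. `plan_batch` and the shape of each job dictionary are hypothetical; the sketch only plans jobs locally and does not submit anything.

```python
MAX_BATCH_IMAGES = 10  # limit stated in this section


def plan_batch(image_paths: list, prompt: str, settings: dict) -> list:
    """Expand a list of images into one job description per image (illustrative only)."""
    if not image_paths:
        raise ValueError("Select at least one image.")
    if len(image_paths) > MAX_BATCH_IMAGES:
        raise ValueError(f"Select at most {MAX_BATCH_IMAGES} images per batch.")
    total = len(image_paths)
    jobs = []
    for index, path in enumerate(image_paths, start=1):
        # Mirrors the progress message shown while jobs are submitted.
        print(f"Creating job {index} of {total}...")
        # Each job gets its own copy of the settings so later edits do not leak across jobs.
        jobs.append({"image": path, "prompt": prompt, "settings": dict(settings)})
    return jobs


jobs = plan_batch(
    ["kitchen.jpg", "lounge.jpg", "garden.jpg"],
    "slow cinematic pan across the room",
    {"model": "Luma Ray 2 Flash", "duration": 5},
)
```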

Tip: Batch processing works with any video model (Grok, Ray 2, or Ray 2 Flash). For real estate walkthroughs, see the dedicated Real Estate Walkthrough guide.

Prompt templates

The Generate page includes pre-built templates for common video types, organised by category:

  • E-Commerce — product showcases, unboxing, before & after
  • Lifestyle — lifestyle in action, cinematic reveals
  • UGC / Testimonial — creator-style testimonials
  • Food & Beverage — food hero shots
  • Tech & Apps — app demos
  • Sale & Promo — flash sales, promotions
  • Fintech / DeFi — card lifestyle, global transfers, yield promos
  • Fashion / Red Carpet — red carpet arrivals, event-specific templates
  • Real Estate — walkthroughs, 360 tours, auction hype, and more

Each template pre-fills the prompt with customisable variables. Some templates (like House Walkthrough) also auto-configure the form settings.

Processing time

| Model | Typical time |
| --- | --- |
| Grok Video | 2–5 minutes |
| Luma Ray 2 | 1–3 minutes |
| Luma Ray 2 Flash | 30–60 seconds |

Video editing (Grok only) takes longer than text-to-video because of the extra processing involved.

Retry

If a video job fails or stalls, click Retry on the job detail page to re-run generation. If the provider has already finished processing, the existing result is recovered immediately instead of being regenerated.
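
One way to picture the Retry behaviour is "check for a finished result first, re-run only if there is none". The sketch below assumes that ordering; `retry_job`, `fetch_provider_result`, and `resubmit` are hypothetical callables supplied by the caller, not real product or provider functions.

```python
def retry_job(job: dict, fetch_provider_result, resubmit) -> str:
    """Recover a finished result if one exists, otherwise re-run generation (illustrative only)."""
    result = fetch_provider_result(job["id"])
    if result is not None:
        # The provider already finished, so no new generation is needed.
        job["status"] = "completed"
        job["result"] = result
        return "recovered"
    # Nothing to recover: submit the job again and wait for it as usual.
    resubmit(job["id"])
    job["status"] = "processing"
    return "resubmitted"


stalled = {"id": "job-42", "status": "stalled"}
outcome = retry_job(stalled, lambda job_id: "video.mp4", lambda job_id: None)
print(outcome, stalled["status"])
```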

Creativity levels

Some templates include a creativity level that controls how faithfully the AI reproduces the source image in Image to Video mode:

| Level | Behaviour | Used by |
| --- | --- | --- |
| Faithful | Strict fidelity: no alterations to the source image | House Walkthrough |
| Balanced | Gentle fidelity (default) | Most templates |
| Creative | Encourages reimagining the scene | Renovation Opportunity |

Templates that use the Faithful level also run an auto-analysis step before generation. This step describes every visible element in the source image so the AI can preserve details accurately.
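
The relationship between templates, creativity levels, and the auto-analysis step can be summarised in a few lines. This is a sketch for illustration: the `Creativity` enum and helper names are hypothetical, and only the two templates named in this section are mapped explicitly; every other template falls back to Balanced, the stated default.

```python
from enum import Enum


class Creativity(Enum):
    FAITHFUL = "faithful"   # strict fidelity to the source image
    BALANCED = "balanced"   # gentle fidelity (default)
    CREATIVE = "creative"   # encourages reimagining the scene


# Only the templates named on this page are listed.
TEMPLATE_LEVELS = {
    "House Walkthrough": Creativity.FAITHFUL,
    "Renovation Opportunity": Creativity.CREATIVE,
}


def creativity_for(template: str) -> Creativity:
    """Return the creativity level for a template, defaulting to Balanced."""
    return TEMPLATE_LEVELS.get(template, Creativity.BALANCED)


def runs_auto_analysis(template: str) -> bool:
    """Auto-analysis is described only for Faithful templates."""
    return creativity_for(template) is Creativity.FAITHFUL


print(creativity_for("House Walkthrough").value, runs_auto_analysis("House Walkthrough"))
```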