Video Generation
Generate videos from text prompts, animate still images, or edit existing videos with natural language. Choose between multiple AI models depending on your quality, speed, and audio needs.
Models
| Model | Resolution | Duration | Audio | Speed | Best for |
|---|---|---|---|---|---|
| Grok Video | Up to 720p | 1–15 seconds | Included automatically | Medium | Quick videos with audio out of the box |
| Luma Ray 2 | Up to 4K | 5s or 9s | Add after generation | Slower | High-resolution output, controlled audio |
| Luma Ray 2 Flash | Up to 4K | 5s or 9s | Add after generation | Faster | Previews, iterations, batch processing |
- Need audio included automatically? Use Grok Video
- Need 1080p or 4K resolution? Use Luma Ray 2 (Ray 2 Flash supports the same resolutions when speed matters more)
- Iterating on prompts or doing batch processing? Use Luma Ray 2 Flash for speed
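The comparison table and rules of thumb above can be expressed as a small lookup. This is a minimal sketch and not part of the product: `ModelInfo`, `MODELS`, and `candidate_models` are hypothetical names, and 4K is assumed to mean a height of 2160 pixels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    max_height: int   # maximum vertical resolution; 4K assumed to be 2160
    auto_audio: bool  # True if audio is included at generation time
    speed: str        # relative speed label from the Models table


# Values transcribed from the Models table; the structure itself is hypothetical.
MODELS = (
    ModelInfo("Grok Video", 720, True, "medium"),
    ModelInfo("Luma Ray 2", 2160, False, "slower"),
    ModelInfo("Luma Ray 2 Flash", 2160, False, "faster"),
)


def candidate_models(min_height: int = 0, auto_audio: bool = False) -> list[str]:
    """Return the names of models that satisfy both requirements."""
    return [
        m.name
        for m in MODELS
        if m.max_height >= min_height and (m.auto_audio or not auto_audio)
    ]
```

For example, `candidate_models(min_height=1080)` returns both Luma models, while asking for 1080p together with automatic audio returns an empty list, since no single model offers both.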
Generation modes
Text to video
Generate a video entirely from a text prompt. Describe the scene, action, and mood. Available on all models.
Image to video
Animate a still image. Upload a photo or select from the media library — the AI creates a video based on the image content and your prompt. Available on all models.
You can upload up to 10 images at once for batch processing. Each image creates a separate video job with the same prompt and settings. See Batch processing below.
Video editing
Edit an existing video with natural language. Provide a source video and describe what to change — the AI modifies only what you ask for while preserving the rest.
Video editing is currently available with Grok Video only.
Settings
Settings vary by model. The Generate page automatically updates available options when you switch models.
Grok Video
| Setting | Options | Notes |
|---|---|---|
| Duration | 1–15 seconds | Flexible range |
| Aspect Ratio | 9:16, 16:9, 1:1 | Image-to-video defaults to Auto (matches source) |
| Resolution | 480p, 720p | |
Luma Ray 2 / Ray 2 Flash
| Setting | Options | Notes |
|---|---|---|
| Duration | 5 seconds or 9 seconds | Only two options |
| Aspect Ratio | 9:16, 16:9, 1:1, 4:3, 3:4, 21:9, 9:21 | More options including ultrawide |
| Resolution | 540p, 720p, 1080p, 4K | Higher resolutions available |
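Because the valid options differ per model, it can help to check a combination before submitting it. The following is a sketch under stated assumptions: `SETTINGS`, `duration_ok`, and `validate` are hypothetical helpers, durations are assumed to be whole seconds, and the Auto aspect ratio used by Grok image-to-video is not modelled.

```python
# Allowed values transcribed from the settings tables; names are hypothetical.
SETTINGS = {
    "Grok Video": {
        "aspect_ratios": {"9:16", "16:9", "1:1"},
        "resolutions": {"480p", "720p"},
    },
    "Luma Ray 2": {
        "aspect_ratios": {"9:16", "16:9", "1:1", "4:3", "3:4", "21:9", "9:21"},
        "resolutions": {"540p", "720p", "1080p", "4K"},
    },
}
# Ray 2 Flash shares the Ray 2 settings table.
SETTINGS["Luma Ray 2 Flash"] = SETTINGS["Luma Ray 2"]


def duration_ok(model: str, seconds: int) -> bool:
    """Grok accepts any duration from 1 to 15 s; Luma accepts only 5 s or 9 s."""
    if model == "Grok Video":
        return 1 <= seconds <= 15
    return seconds in (5, 9)


def validate(model: str, seconds: int, aspect_ratio: str, resolution: str) -> list[str]:
    """Return a list of problems; an empty list means the combination is valid."""
    if model not in SETTINGS:
        return [f"unknown model: {model}"]
    problems = []
    if not duration_ok(model, seconds):
        problems.append(f"duration {seconds}s not supported by {model}")
    if aspect_ratio not in SETTINGS[model]["aspect_ratios"]:
        problems.append(f"aspect ratio {aspect_ratio} not supported by {model}")
    if resolution not in SETTINGS[model]["resolutions"]:
        problems.append(f"resolution {resolution} not supported by {model}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every invalid setting at once.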
Audio
Grok Video — automatic audio
Grok-generated videos include audio automatically. The model generates appropriate sound effects, music, or ambient audio based on your prompt. There is no way to disable or customise the audio at generation time.
Luma — Add Audio (post-production)
Luma-generated videos are silent by default. After a Luma video completes, you can add AI-generated audio:
- Open the completed Luma video job
- Scroll to Add Audio
- Describe the audio you want (e.g., "calm background music with soft piano")
- Optionally describe what to avoid (e.g., "vocals, speech, loud drums")
- Click Add Audio
The audio is generated by Luma's AI and merged into the video. This typically takes 1–2 minutes. Check the Audit Log on the job detail page for an "audio added successfully" entry, then refresh to play the updated video.
Bulk Add Audio
Add the same audio to multiple Luma videos at once:
- Go to the Jobs page
- Select completed Luma video jobs (checkboxes)
- The Add Audio button appears in the bulk action bar
- Enter one audio description — it applies to all selected videos
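The selection rule in the steps above (completed Luma jobs only, one description for all of them) can be sketched as a filter. The job dictionaries and their `id`, `model`, and `status` fields are hypothetical and chosen only for illustration; they do not describe a real export format or API.

```python
def eligible_for_bulk_audio(jobs: list[dict]) -> list[dict]:
    """Keep only completed jobs that were produced by a Luma model."""
    return [
        job
        for job in jobs
        if job["status"] == "completed" and job["model"].startswith("Luma")
    ]


def bulk_audio_requests(jobs: list[dict], description: str) -> list[dict]:
    """Pair every eligible job with the same audio description."""
    return [
        {"job_id": job["id"], "audio_prompt": description}
        for job in eligible_for_bulk_audio(jobs)
    ]
```

Grok jobs are filtered out because their audio is generated automatically, and unfinished jobs are filtered out because audio is added after generation completes.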
Batch processing
Upload multiple images at once in Image to Video mode to create separate video jobs for each image:
- Select Video as the asset type and Image to Video mode
- Click the Upload tab
- Select up to 10 images (hold Ctrl/Cmd to multi-select)
- A preview grid shows all selected images with a count
- Remove individual images if needed, or click Remove all
- Click Generate — each image creates a separate job
Progress shows "Creating job 3 of 10..." as jobs are submitted. All jobs appear on the Jobs page.
Batch processing works with any video model (Grok, Ray 2, or Ray 2 Flash). For real estate walkthroughs, see the dedicated Real Estate Walkthrough guide.
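If you have more than 10 images, you need to submit them in several batches. The sketch below shows one way to plan this: it splits a list of image paths into groups of at most 10 and builds one job description per image with a shared prompt and settings. `plan_batches`, `progress_messages`, and the job dictionary layout are hypothetical and only mirror the behaviour described above.

```python
MAX_IMAGES_PER_BATCH = 10  # upload limit stated in the text


def plan_batches(images: list[str], prompt: str, settings: dict) -> list[list[dict]]:
    """Split images into batches of at most 10, with one job per image."""
    batches = []
    for start in range(0, len(images), MAX_IMAGES_PER_BATCH):
        chunk = images[start:start + MAX_IMAGES_PER_BATCH]
        batches.append(
            [{"image": path, "prompt": prompt, "settings": dict(settings)} for path in chunk]
        )
    return batches


def progress_messages(batch: list[dict]) -> list[str]:
    """Reproduce the progress text shown while a batch is submitted."""
    total = len(batch)
    return [f"Creating job {n} of {total}..." for n in range(1, total + 1)]
```

With 23 images, `plan_batches` yields three batches of 10, 10, and 3 jobs. Each job gets its own copy of the settings, so changing one job afterwards does not affect the others.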
Prompt templates
The Generate page includes pre-built templates for common video types, organised by category:
- E-Commerce — product showcases, unboxing, before & after
- Lifestyle — lifestyle in action, cinematic reveals
- UGC / Testimonial — creator-style testimonials
- Food & Beverage — food hero shots
- Tech & Apps — app demos
- Sale & Promo — flash sales, promotions
- Fintech / DeFi — card lifestyle, global transfers, yield promos
- Fashion / Red Carpet — red carpet arrivals, event-specific templates
- Real Estate — walkthroughs, 360 tours, auction hype, and more
Each template pre-fills the prompt with customisable variables. Some templates (like House Walkthrough) also auto-configure the form settings.
Processing time
| Model | Typical time |
|---|---|
| Grok Video | 2–5 minutes |
| Luma Ray 2 | 1–3 minutes |
| Luma Ray 2 Flash | 30–60 seconds |
Video editing (Grok only) adds overhead compared to text-to-video, so expect longer processing times.
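The typical times above give a rough range for a set of jobs. The sketch below multiplies the per-job range by the number of jobs. It assumes jobs are processed one after another, which this page does not state, so treat the result as a pessimistic planning figure. `TYPICAL_SECONDS` and `sequential_estimate` are hypothetical names.

```python
# Typical per-job times from the table, converted to seconds.
TYPICAL_SECONDS = {
    "Grok Video": (120, 300),
    "Luma Ray 2": (60, 180),
    "Luma Ray 2 Flash": (30, 60),
}


def sequential_estimate(model: str, job_count: int) -> tuple[int, int]:
    """Return (low, high) total seconds, assuming jobs run strictly in sequence."""
    low, high = TYPICAL_SECONDS[model]
    return low * job_count, high * job_count
```

Under that assumption, a full batch of 10 Ray 2 Flash jobs would take between 300 and 600 seconds (5 to 10 minutes).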
Retry
If a video job fails or stalls, click Retry on the job detail page to re-run generation. If the provider has already finished processing, the result is recovered immediately.
Creativity levels
Some templates include a creativity level that controls how faithfully the AI reproduces the source image in Image to Video mode:
| Level | Behaviour | Used by |
|---|---|---|
| Faithful | Strict fidelity — no alterations to the source image | House Walkthrough |
| Balanced | Gentle fidelity (default) | Most templates |
| Creative | Encourages reimagining the scene | Renovation Opportunity |
Templates that use the Faithful level also run an auto-analysis step before generation. This step describes every visible element in the source image to help the AI preserve details accurately.
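The relationship between templates, creativity levels, and the auto-analysis step can be summarised in a few lines. This is an illustrative sketch only: the `Creativity` enum, `TEMPLATE_LEVELS`, and both functions are hypothetical, and falling back to Balanced for any template not listed is an assumption based on Balanced being the default.

```python
from enum import Enum


class Creativity(Enum):
    FAITHFUL = "faithful"
    BALANCED = "balanced"
    CREATIVE = "creative"


# Only the two templates named in the table are listed; others fall back to Balanced.
TEMPLATE_LEVELS = {
    "House Walkthrough": Creativity.FAITHFUL,
    "Renovation Opportunity": Creativity.CREATIVE,
}


def level_for(template: str) -> Creativity:
    """Look up a template's creativity level, defaulting to Balanced."""
    return TEMPLATE_LEVELS.get(template, Creativity.BALANCED)


def runs_auto_analysis(template: str) -> bool:
    """Auto-analysis of the source image runs only at the Faithful level."""
    return level_for(template) is Creativity.FAITHFUL
```

Here `runs_auto_analysis("House Walkthrough")` is `True`, while the same call for Renovation Opportunity or for an unlisted template is `False`.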