True Multi-Modal Input
Combine up to 9 images, 3 video clips (15 seconds total across all clips), 3 audio files, and text prompts in the same project. Use natural language to reference motion, effects, camera movements, and sounds from any uploaded asset.
- Upload text + image + video + audio together
- Up to 12 assets combined in a single project
- Natural language reference control
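
For anyone preparing uploads in bulk, the limits above can be checked before submitting. The following is a minimal sketch only: the `Asset` class and `check_assets` function are hypothetical names, not part of any product API, and it assumes the 12-asset figure is a combined cap across images, video, and audio (the per-type maximums add up to 15, so not every type can be maxed out at once).

```python
from dataclasses import dataclass

# Limits taken from the feature description.
MAX_IMAGES = 9
MAX_VIDEOS = 3
MAX_VIDEO_SECONDS = 15.0  # combined duration of all video clips
MAX_AUDIO = 3
MAX_ASSETS = 12  # ASSUMPTION: combined cap across all asset types


@dataclass
class Asset:
    """Hypothetical description of one uploaded file."""
    kind: str             # "image", "video", or "audio"
    seconds: float = 0.0  # duration; only checked for video here


def check_assets(assets: list[Asset]) -> list[str]:
    """Return a list of limit violations; an empty list means the set fits."""
    problems: list[str] = []
    counts = {"image": 0, "video": 0, "audio": 0}
    video_seconds = 0.0

    for asset in assets:
        if asset.kind not in counts:
            problems.append(f"unknown asset kind: {asset.kind}")
            continue
        counts[asset.kind] += 1
        if asset.kind == "video":
            video_seconds += asset.seconds

    if counts["image"] > MAX_IMAGES:
        problems.append(f"too many images: {counts['image']} > {MAX_IMAGES}")
    if counts["video"] > MAX_VIDEOS:
        problems.append(f"too many video clips: {counts['video']} > {MAX_VIDEOS}")
    if video_seconds > MAX_VIDEO_SECONDS:
        problems.append(f"video too long: {video_seconds}s > {MAX_VIDEO_SECONDS}s")
    if counts["audio"] > MAX_AUDIO:
        problems.append(f"too many audio files: {counts['audio']} > {MAX_AUDIO}")
    if len(assets) > MAX_ASSETS:
        problems.append(f"too many assets: {len(assets)} > {MAX_ASSETS}")
    return problems
```

Under these assumptions, 9 images plus 3 five-second video clips fits exactly (12 assets, 15 seconds of video), while adding a single audio file to that set would exceed the combined cap.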