Google just launched Gemini Omni Flash at I/O — an AI model that generates and edits video through natural conversation. Here's what HK creators need to know.
Google just dropped a bombshell at Google I/O. Meet Gemini Omni Flash — Google's new "any-to-any" AI model that can generate and edit video from any combination of text, images, audio, and video input. And it's rolling out now.
Here's what Hong Kong creators, agencies, and marketers need to know.
What Is Gemini Omni?
Gemini Omni is Google DeepMind's latest leap in generative AI. Unlike traditional video models that take text prompts and output video, Omni is built on Gemini's native multimodal architecture — it can reason and create simultaneously.
The first model in the family, Gemini Omni Flash, is available today in the Gemini app, Google Flow, and YouTube Shorts. It represents a shift from single-modal generation (text-to-video) to truly multimodal creation (any-to-video).
Edit Videos Through Conversation
The headline feature? You can edit videos using natural language, conversationally.
Every instruction builds on the last. Characters stay consistent. Physics holds up. The scene remembers what came before. Want to turn a marble sculpture into bubbles mid-shot? Just ask. Need to change a mirror into a rippling liquid effect in post? Say it in plain English.
This is a game-changer for HK video production agencies working on tight deadlines. Instead of complex 3D compositing or frame-by-frame VFX, you can iterate on video edits as naturally as chatting with a colleague.
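For the technically curious, here is one way to picture "every instruction builds on the last". This is a minimal sketch, not Google's API: `EditSession`, `ask`, and the clip name `sculpture.mp4` are hypothetical names made up for illustration, and nothing here calls a real model. It only shows how each new request could carry the full history of earlier turns.

```python
from dataclasses import dataclass, field


@dataclass
class EditSession:
    """Hypothetical stand-in for a conversational video edit session.

    It does not call any Google service; it only models how turns accumulate.
    """
    source_clip: str
    instructions: list[str] = field(default_factory=list)

    def ask(self, instruction: str) -> str:
        """Record a new instruction and return the full request context."""
        self.instructions.append(instruction)
        return self.context()

    def context(self) -> str:
        """Build the context for the next edit: the clip plus every turn so far."""
        lines = [f"clip: {self.source_clip}"]
        for turn, text in enumerate(self.instructions, start=1):
            lines.append(f"turn {turn}: {text}")
        return "\n".join(lines)


session = EditSession(source_clip="sculpture.mp4")
session.ask("Turn the marble sculpture into bubbles mid-shot.")
latest = session.ask("Make the mirror ripple like liquid.")
print(latest)
```

The second request includes the first instruction as well as the new one, which is the basic idea behind an edit that "remembers what came before".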
Grounded in Real-World Knowledge
Because Gemini Omni is built on Gemini, Google's foundation model, it brings real-world understanding to video generation. It knows what objects are, how they behave, and how scenes logically flow together. That means fewer nonsensical outputs and more usable, production-ready footage.
For Hong Kong marketers creating product demos, social media content, or ad creative, this means faster turnaround and higher quality — no more fighting with models that can't tell a teacup from a teapot.
Rolling Out Now
Gemini Omni Flash is launching across three surfaces:
- Gemini app: for individual creators and experimentation
- Google Flow: for AI filmmaking workflows
- YouTube Shorts: integrated directly into Shorts creation tools
Additional output modalities (image and audio generation) are confirmed for future releases. This is the first step in a broader Omni roadmap.
What This Means for Cooly.ai Users
If you're already using Cooly Studio to generate AI images and videos, Gemini Omni Flash represents an exciting new capability on the horizon. Cooly.ai is actively tracking the latest AI video models — from Runway to Kling to Veo — and Omni's conversational editing approach is exactly the kind of workflow innovation that makes AI video production accessible for HK agencies.
At Cooly.ai, we're building tools that let you experiment with the best generative AI models in one place. Whether you're a solo creator or a full-service agency, staying on top of models like Gemini Omni Flash is how you stay ahead.
Frequently Asked Questions
Q: When is Gemini Omni available?
A: Gemini Omni Flash is rolling out now to the Gemini app, Google Flow, and YouTube Shorts.
Q: Can Gemini Omni generate images or audio too?
A: Not yet. The current release focuses on video output. Google confirmed that image and audio output modalities are coming in future updates.
Q: How is Omni different from Veo?
A: Veo is Google's video generation model focused on text-to-video and image-to-video. Omni is a multimodal reasoning + creation model: it can take any combination of text, images, audio, and video as input and edit or generate video conversationally.
Q: Can I use Gemini Omni in Cooly Studio?
A: Not yet directly, but Cooly.ai continuously integrates the best AI models as they become available. Keep an eye on cooly.ai for updates.
Q: Is this useful for Hong Kong businesses?
A: Absolutely. For HK agencies producing video content, the ability to edit video through conversation dramatically reduces production time. Marketers can iterate on ad creative faster, and creators can experiment without expensive post-production tools.
