Google Launches Gemini Omni Flash, a Multimodal AI Video Model

Google debuts Gemini Omni Flash at I/O 2026, enabling video creation and editing from any combination of inputs.


Google Unveils Gemini Omni Flash at I/O 2026

Google unveiled Gemini Omni Flash on May 19, 2026, at its annual Google I/O developer conference in Mountain View, California — the first model in a new line called Gemini Omni. The announcement was authored by Koray Kavukcuoglu, Chief Technology Officer of Google DeepMind and Chief AI Architect at Google. Gemini Omni is described by Google as a model that can create anything from any input, starting with video, and allows users to edit content using conversational language.

Google is not new to AI video: it launched the Veo 3 model the previous year, which focused on turning text into video. Omni, by contrast, accepts any input and creates video on demand. The new model combines reasoning abilities with generative AI tools, allowing users to create and edit videos using combinations of text, images, audio and video inputs.

Conversational Editing and Multi-Turn Refinement

The model allows users to edit videos through natural language conversations, with each instruction building on previous edits while maintaining consistency in characters, scenes and physics. Users can change actions in videos, add new characters or objects, modify environments, alter styles and adjust camera angles through multiple editing prompts.

Google states that in multi-turn editing, changes can be applied sequentially — transforming a violinist's environment, removing the instrument, and shifting camera angle — without losing continuity from the original scene. This architecture of compounding edits distinguishes Omni Flash from standalone generation tools that treat each output as a discrete, independent creation.
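The difference between compounding edits and standalone generation can be sketched in a few lines of Python. This is an illustrative model only, not Google's API, which had not been released at the time of writing. The `EditSession` class, its method names, the request layout and the `violinist.mp4` filename are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EditSession:
    """Hypothetical model of a multi-turn editing session (not Google's API)."""
    source_clip: str
    instructions: list[str] = field(default_factory=list)

    def edit(self, instruction: str) -> "EditSession":
        # Each turn is appended, so later edits build on every earlier one.
        self.instructions.append(instruction)
        return self

    def compound_request(self) -> dict:
        # Multi-turn: the full ordered history travels with the original clip.
        return {"source": self.source_clip, "history": list(self.instructions)}

    def standalone_request(self) -> dict:
        # Single-shot contrast: only the latest instruction, no prior context.
        return {"source": None, "history": self.instructions[-1:]}


# The three sequential edits from Google's violinist example.
session = EditSession("violinist.mp4")  # hypothetical filename
session.edit("transform the environment")
session.edit("remove the instrument")
session.edit("shift the camera angle")
```

In this sketch, `compound_request()` carries all three instructions in order together with the source clip, while `standalone_request()` carries only the last one. That mirrors the contrast the article draws between Omni Flash and tools that treat each output as an independent creation.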

Physics Reasoning and Knowledge-Grounded Generation

The model is designed to generate videos grounded in Gemini's real-world knowledge, including concepts related to history, science and cultural context. Google added that the model has an improved understanding of physics concepts such as gravity, kinetic energy and fluid dynamics to create more realistic visuals.

Google illustrates this capability with examples including a marble rolling on a chain-reaction track, a claymation explainer of protein folding, and an alphabet video in which 26 unusual objects — one per letter — appear in rapid-fire sequence with lower-thirds displayed as handwritten slips of paper. These demonstrations position Gemini Omni as capable of translating complex conceptual prompts into visually coherent output without requiring frame-by-frame directorial input.

Avatar Feature and Responsible Deployment

Google says it is taking a cautious approach to safely deploying Omni. At launch, users can create videos with their own voice using a digital avatar of themselves, but Google is still testing the ability to edit videos to change audio and speech, in order to better understand how to bring that capability to users responsibly.

The avatar use cases that DeepMind researcher Gabe Barth-Maron and a Google representative described to TechCrunch were personal in nature, such as making a video of oneself winning an award or going to the moon, or removing a passerby from the background of a vacation video. Despite the near-term consumer focus, Google will make Omni available via API in the coming weeks, and the enterprise and creative implications are considered significant.

SynthID Watermarking and Content Verification

All videos created with Gemini Omni include Google's SynthID digital watermark technology, which can be verified through the Gemini app, Gemini in Chrome and Google Search. Google CEO Sundar Pichai stated that since the launch of the SynthID watermarking system three years ago, over 100 billion images and videos have been watermarked, along with 60,000 years' worth of audio assets. Pichai added that OpenAI, Kakao and ElevenLabs have signed on to adopt Google's watermarking system, with Nvidia already committed as of the previous year.

Google is also expanding built-in C2PA Content Credentials, a separate tool that lets users check content metadata via Google Lens, including where an image originated and whether it was edited using generative AI tools.

Availability Across Platforms and Subscription Tiers

Gemini Omni Flash is rolling out globally to Google AI Plus, Pro and Ultra subscribers through the Gemini app and Google Flow. Google said the feature is also being introduced at no cost to users on YouTube Shorts and the YouTube Create app starting this week. The company said the model will also be made available to developers and enterprise customers through APIs in the coming weeks.

Omni Flash currently generates videos of up to 10 seconds. According to a Google representative who spoke with TechCrunch, this cap is not a model limitation but a decision based on a desire to reach more users and an expectation that most users will not want to make significantly longer videos at this stage. Longer video durations are described as being in the pipeline.

A Google representative noted the model's text-rendering capabilities as a particular strength with commercial applications: "If you want a product somewhere, or even just a slogan, it needs to be accurate. We definitely anticipate filmmakers and other kinds of creators are going to be using this model as well." Google did not disclose specifics on any pricing adjustments associated with API access for enterprise customers.
