Google DeepMind has just launched Veo 2, competing with OpenAI’s Sora. While Sora can generate 20-second clips at up to 1080p resolution, Google’s Veo 2 model can produce minutes-long videos at stunning 4K resolution.
Google also claims Veo 2 can understand real-world physics and the nuances of human movement and expression, something that models like Sora still struggle with. Here’s everything you need to know about Google’s new Veo 2 model.
What’s New With Google’s Veo 2 Model?
Earlier this year, Google announced Veo, its text-to-video AI model. Building on that, the team has now introduced the upgraded Veo 2 model.
One of the biggest improvements in Veo 2 is its understanding of real-world physics and human movement. For example, if you’ve tried models like Sora, you might have noticed issues such as extra fingers on a hand or objects that don’t belong in a scene. Google says Veo 2 produces these artifacts less often, resulting in more natural and coherent outputs, and claims it hallucinates less than other models.
In addition, Veo 2 can understand the language of cinematography. You can specify a genre, lens type, or cinematic effects, and Veo 2 will follow these instructions. For instance, you can ask for low-angle tracking shots, shallow depth of field, or an 18mm lens.
However, Google says the model still struggles with complex scenes or complex motions.
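To make the cinematography instructions concrete, here is a minimal sketch of how a prompt with those details could be assembled as plain text. The helper name `build_video_prompt` and its parameters are hypothetical and purely illustrative; this is not Google's API and it does not call Veo 2, it only builds the string you might paste into a tool like VideoFX.

```python
def build_video_prompt(subject, genre=None, shot=None, lens=None, effects=()):
    """Compose a plain-text prompt from a subject plus optional cinematic hints.

    Hypothetical helper for illustration only; it is not part of any Google SDK.
    """
    parts = [subject.strip()]
    if genre:
        parts.append(f"{genre} style")
    if shot:
        parts.append(shot)
    if lens:
        parts.append(f"{lens} lens")
    # Any extra effects (e.g. depth of field) are appended as-is.
    parts.extend(effects)
    return ", ".join(parts)


prompt = build_video_prompt(
    "a cyclist riding through a rain-soaked city at night",
    genre="film noir",
    shot="low-angle tracking shot",
    lens="18mm",
    effects=["shallow depth of field"],
)
print(prompt)
```

Keeping the subject first and the camera details after it is just one reasonable ordering; how closely the model follows each hint is something you would have to test yourself.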
The model can also generate longer videos, up to several minutes in length, at resolutions of up to 4K. Compared to OpenAI’s Sora, which is limited to 20-second clips at 1080p, this is a big step up. The original Veo, by comparison, could only generate videos up to a minute long at 1080p.
All videos generated with the Veo 2 model will include an invisible SynthID watermark to identify them as AI-generated, helping reduce the risk of misinformation.
How Can You Use Veo 2?
Just like the original Veo model, Veo 2 is not yet available to everyone. Right now, it is accessible through Google’s VideoFX tool, which is part of Google Labs. Access is currently limited, and users can sign up for the waitlist. While Veo 2 supports 4K resolution and extended durations, the current implementation in VideoFX is limited to 720p resolution and eight-second clips.
Google is also planning to integrate Veo 2 into YouTube Shorts next year, opening up more possibilities for creators. For developers and enterprises, Veo 2 is rolling out via Vertex AI, Google’s AI platform.
What About Imagen 3?
Alongside Veo 2, Google has also upgraded its Imagen 3 image-generation model. Imagen 3 now produces brighter, more detailed images, and Google claims it follows prompts more accurately. Imagen 3 is available in ImageFX, which is rolling out to over 100 countries.
Google didn’t stop there. It also introduced a new experimental tool called Whisk, which combines Imagen 3 with Google’s Gemini AI for even more creative control. With Whisk, you can remix elements like subjects, scenes, and styles to create unique images. For example, you could upload an image, describe a scene, and add a specific art style to create something completely new. Whisk is available through Google Labs in the U.S., so if you’re curious, you can give it a try.