Google I/O 2024’s keynote session was a 112-minute-long affair where the company made several major announcements focused on artificial intelligence (AI). The announcements ranged from new AI models to the integration of AI into Google products, but perhaps one of the most interesting introductions was Veo, an AI-powered video generation model that can generate 1080p resolution videos. The tech giant said that the AI tool can generate videos that go beyond the one-minute mark. Notably, OpenAI also unveiled its video AI model, dubbed Sora, in February.
During the event, Demis Hassabis, co-founder and CEO of Google DeepMind, unveiled Veo. Announcing the AI model, he said, “Today, I am excited to announce our latest and most capable generative video model called Veo. Veo creates high-quality 1080p videos from text, image and video prompts. It can capture the details of your instructions in different visual and cinematic styles.”
The tech giant claims that Veo can closely follow prompts to understand the nuance and tone of a phrase and then generate a video that resembles it. The AI model can generate videos in different styles such as timelapses, close-ups, fast-tracking shots, aerial shots, and shots with various lighting and depth-of-field effects. Apart from video generation, the AI model can also edit videos when the user provides it with an initial video and a prompt to add or remove something. Further, it can also generate videos beyond the one-minute mark, either from a single prompt or via multiple sequential prompts.
To solve the problem of consistency in video generation models, Veo uses latent diffusion transformers. This helps reduce the instances of characters, objects, or the entire scene flickering, jumping, or morphing unexpectedly between frames. Google highlighted that videos created by Veo will be watermarked using SynthID, the company’s in-house tool for watermarking and identifying AI-generated content. The model will soon be available to select creators via the VideoFX tool at Google Labs.
Veo’s similarities with OpenAI’s Sora
While neither of the AI models is available to the public yet, both share several similarities. Veo can generate 1080p videos with a duration that can surpass one minute, whereas OpenAI’s Sora can generate videos of up to 60 seconds. Both models can generate videos from text prompts, images, and videos. Based on diffusion models, both are capable of producing videos with multiple shots, styles, and cinematography techniques. Both Sora and Veo also come with AI-generated content labels. Sora uses the Coalition for Content Provenance and Authenticity (C2PA) standard while Veo uses its native SynthID.