Google DeepMind on Wednesday unveiled the successor to Genie, its artificial intelligence (AI) model that could generate limitless 2D game worlds. Dubbed Genie 2, the new AI model is capable of generating unique, action-controllable, playable 3D environments from a single image prompt. Calling Genie 2 an AI “world model”, the company stated that it can generate environments up to a minute long with consistent objects. The company said these generated worlds can be played by humans or used to train AI agents.
Google DeepMind Unveils Genie 2 AI Model
In a blog post, the company detailed the new AI model and its capabilities. While its predecessor could only generate game worlds for 2D platformer games, the Genie 2 AI model can generate 3D worlds complete with consistent models that can be interacted with. This means humans or AI agents can walk, run, swim, climb, and perform other actions in these environments.
Genie 2’s generative capabilities allow it to produce routes, buildings, and objects that cannot be seen in the input image. These elements are designed and rendered by the model from scratch. Additionally, the foundation model is capable of maintaining consistency in these environments. This means that even when a player moves away from one area and later returns, the environment remains the same.
Apart from this, Genie 2 is capable of generating different perspectives, such as first-person, isometric, or third-person views. Further, users can interact with the objects in the generated worlds and perform actions such as opening a door, bursting a balloon, or climbing a ladder. The model can also be prompted to generate physics-related effects such as water ripples, smoke, gravity, directional lighting, reflections, and more.
Coming to the technical details, DeepMind explained that Genie 2 is an autoregressive latent diffusion model that has been trained on a large video dataset. The transformer architecture also includes an autoencoder, which enables frame-by-frame generation of these worlds.
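To make the phrase "autoregressive latent diffusion" more concrete, the control flow can be sketched roughly as follows. This is a toy illustration only, not DeepMind's implementation: every function name here (`encode`, `decode`, `denoise_step`, `rollout`) is hypothetical, the "autoencoder" is simple averaging, and the "denoiser" is a hand-written rule standing in for a learned transformer. It shows only the loop structure: compress the prompt image into a latent, then produce each new latent frame by iterative refinement conditioned on past latents and the player's action.

```python
from typing import List, Sequence

Latent = List[float]


def encode(frame: Sequence[float]) -> Latent:
    # Toy encoder (assumption): compress by averaging adjacent pairs of values.
    return [(frame[i] + frame[i + 1]) / 2 for i in range(0, len(frame) - 1, 2)]


def decode(latent: Latent) -> List[float]:
    # Toy decoder (assumption): expand by repeating each latent value twice.
    return [value for value in latent for _ in range(2)]


def denoise_step(sample: Latent, context: List[Latent], action: float) -> Latent:
    # Stand-in for a learned denoiser: move the sample halfway toward a target
    # built from the mean of the past latents, shifted by the action.
    target = [sum(column) / len(column) + action for column in zip(*context)]
    return [s + 0.5 * (t - s) for s, t in zip(sample, target)]


def rollout(first_frame: Sequence[float], actions: Sequence[float],
            denoise_iters: int = 8) -> List[List[float]]:
    # Autoregressive loop: each new frame is conditioned on all earlier latents.
    history = [encode(first_frame)]
    frames = []
    for action in actions:
        latent = [0.0] * len(history[0])  # deterministic stand-in for sampled noise
        for _ in range(denoise_iters):
            latent = denoise_step(latent, history, action)
        history.append(latent)            # feed the result back in as context
        frames.append(decode(latent))     # map the latent back to frame space
    return frames


frames = rollout([0.0, 0.0, 1.0, 1.0], actions=[0.0, 1.0, 0.0])
```

In this sketch, one frame is produced per action, and each generated latent is appended to the history so that later frames depend on earlier ones, which is the sense in which such a model is "autoregressive".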
Notably, DeepMind also released an AI model dubbed Scalable Instructable Multiworld Agent, or SIMA, earlier this year, which is capable of acting as an AI agent in 3D worlds. The company says Genie 2 can provide unique environments to such AI agents and train them for various real-life scenarios.
Because the world model can generate unique environments, Google says this will eliminate the risk of data contamination and allow developers to accurately assess an AI agent’s capabilities.