Microsoft boss Satya Nadella has said the company plans to train its new gaming AI on a “catalogue of games” soon, so it can then “start playing them”.
Speaking on the Dwarkesh Patel podcast, Nadella described seeing Microsoft’s gaming AI Muse generate gameplay videos that reflected a user’s input as a “massive, massive moment of ‘wow'”.
In short, Muse has so far been trained on years of gameplay footage from Ninja Theory’s multiplayer brawler Bleeding Edge. Having ingested this gameplay archive, Muse can generate new gameplay footage and estimate how it would change in response to a specific user prompt. For example, Muse could generate gameplay footage with an additional jump pad added to the level, if prompted to do so.
As yet, this is not an AI that can code its own games or dream up its own concepts. Rather, it’s a tool that could eventually speed up development by generating gameplay footage on demand.
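Microsoft has not published Muse’s architecture or any API in the material covered here, so the following is only a rough conceptual sketch of the idea the article describes: a model that takes recent gameplay frames plus a controller input and predicts the next frame, rolled forward step by step. All names, shapes and the stand-in “model” below are hypothetical.

```python
# Illustrative sketch only: this is NOT Muse's real model or API.
# It mimics the described loop of frames + controller input -> next frame.
import numpy as np

FRAME_SHAPE = (128, 128, 3)  # hypothetical low-resolution frame size


def predict_next_frame(recent_frames, controller_input, rng):
    """Stand-in for a learned world model: returns a random frame-shaped
    array rather than a real prediction."""
    return rng.random(FRAME_SHAPE, dtype=np.float32)


def generate_gameplay(seed_frames, inputs, rng):
    """Autoregressively roll the 'model' forward over a sequence of
    controller inputs, appending each predicted frame to the context."""
    frames = list(seed_frames)
    for action in inputs:
        frames.append(predict_next_frame(frames[-4:], action, rng))
    return frames


rng = np.random.default_rng(0)
seed = [rng.random(FRAME_SHAPE, dtype=np.float32) for _ in range(4)]
clip = generate_gameplay(seed, ["jump", "attack", "move_left"], rng)
print(len(clip), clip[-1].shape)  # 7 (128, 128, 3)
```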
“Can you actually generate games that are both consistent and then have the ability to generate the diversity of what that game represents, and then are persistent to user mods?” Nadella said. “That’s what this is.
“What I’m excited about is bringing – we’re going to have a catalogue of games soon that we will start using these models, or we’re going to train these models to generate, and then start playing them.”
Nadella did not elaborate on how users would be able to play the footage Muse generates.
“When Phil Spencer first showed it to me, he had an Xbox controller and this model basically took the input and generated the output based on the input. And it was consistent with the game,” Nadella continued.
“That to me is a massive, massive moment of ‘wow’. It’s kind of like the first time we saw ChatGPT complete sentences, or Dall-E draw, or Sora. This is one such moment.”
Writing in response to Microsoft’s announcement of Muse last week, gaming AI expert Michael Cook described the advancement as “impressive” but “not a practical process”.
“It’s impressive that it can do this using visual information because things like lighting, camera angles, user interface and so on are a lot for an AI model to handle. But ultimately, even with all of this data, all the time spent annotating datasets, and so on, it was still only just about able to generate footage predicting player behaviour.
“The research team behind this probably believe it will get more efficient over time, which might make it more affordable or tractable for small developers,” Cook continued. “However, it still raises the question of how we get video footage of people playing our game in the first place.”
Last month, Take-Two Interactive boss Strauss Zelnick weighed in with his opinion of AI, saying that “artificial intelligence is an oxymoron, there’s no such thing”.