Researchers from technology company Virtuals Protocol have published a paper on MarioVGG, a new text-to-video AI model that can simulate Super Mario Bros. gameplay footage from basic text input (thanks, Ars Technica).
The model was trained on 737,000 frames of Super Mario Bros. gameplay, showing Nintendo’s beloved plumber across 32 different levels with varying degrees of success and failure (141 wins and 139 losses, according to GitHub). From those frames and the order they appear in, the AI model “learns” what commands like “jump” and “run” look like on screen, and can then simulate those commands in video form, physics and all.
The Virtuals Protocol paper shows the model in action through a series of short videos that, from a distance, look remarkably similar to the iconic NES platformer. The company highlighted a selection of those videos on Twitter, claiming, “The era of infinite interactive worlds is here”:
While the model is capable of recreating select Mario moves, this is hardly a one-to-one simulation. To keep things simple, the researchers focused on just two inputs: “run right” and “run right and jump.” The resolution has been reduced from the NES’s 256×240 to a much smaller 64×48, and the output contains only a fraction of the frames (seven generated frames for every 35 input frames), so things are far from silky smooth.
It isn’t fast, either. The single RTX 4090 graphics card used in the study could only produce a six-frame video sequence every six seconds, and while the final frame of one sequence could be used as the first frame of the next (approximating something like a continuous level), the researchers admit the approach is, for now, “not practical and friendly for interactive video games”.
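The chaining trick described above can be sketched as a simple loop. This is a hypothetical illustration, not the researchers’ code: `generate_sequence` here is a placeholder stand-in for the actual MarioVGG model, and the frame values are just labeled strings rather than real 64×48 images.

```python
def generate_sequence(start_frame, action, length=6):
    """Stand-in for the video model: returns `length` placeholder frames.

    The real model would return 64x48 video frames conditioned on the
    starting frame and a text action such as "run right and jump".
    """
    return [f"{start_frame}->{action}#{i}" for i in range(length)]

def chain_sequences(first_frame, actions, length=6):
    """Chain sequences so each new one continues from the previous last frame."""
    video = [first_frame]
    for action in actions:
        # Seed the next sequence with the final frame generated so far.
        segment = generate_sequence(video[-1], action, length)
        video.extend(segment)
    return video

# One seed frame plus two six-frame sequences: 13 frames in total.
frames = chain_sequences("frame0", ["run right", "run right and jump"])
```

At six seconds per six-frame sequence, stringing together anything level-length this way quickly adds up, which is why the researchers call the approach impractical for interactive play.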
On top of all that, the results are full of errors. A closer look at the videos above reveals Mario changing colors on the fly, morphing into enemies, sliding through normally impassable objects, and occasionally disappearing entirely. Official Mario this is not.
Still, the researchers haven’t given up hope that a model like this could be used to develop games in the future. “While replacing game development and game engines entirely using a video generation model may still not be practical and compelling at this point,” the paper concludes, “we show that it is possible and an option with only a limited dataset of a single game domain.”
The ability of AI to link cause and effect between user input and on-screen gameplay is an impressive concept, but that closing note about potentially “replacing game development” leaves a sour taste.
As if you needed a reminder, 2024 was one of the industry’s worst years for game developer layoffs, with studios big and small cutting headcount to reduce costs. An AI tool that can accurately replicate gameplay may still be a long way off, but how this technology fits into current working practices will increasingly be a cause for concern in the coming years if it keeps advancing at this rate.
Just last week, Bayonetta 3 voice actress Jennifer Hale said AI is “coming for all of us” as negotiations in the ongoing SAG-AFTRA strike turned to its use in video game acting.