“But can it run Doom?” is an adage that has managed to travel through almost every piece of tech on the market. From controlling the game with a toaster to running Doom on a pregnancy test (please wash your hands after), it has almost become a benchmark of geeky creativity.
A joint effort from researchers at Google Research, Google DeepMind, and Tel Aviv University has managed to get the classic shooter running on nothing but a neural network (via Futurism). It essentially generates each frame on the fly, using a model trained on footage of the real game. You can check out a video of it running in real time right here, but there are natural limitations to it.
For the unaware, a neural network is a computing structure loosely modelled on the human brain, built from layers of connected nodes that learn patterns from data rather than following hand-written rules. They are notably used in predictive models because, once trained, they can generalise to inputs they haven't seen before in a way traditional, rule-based software can't.
A large part of making them better is called "training", where the network makes small adjustments over and over while churning through wide sets of data. When a network is "trained" on something, that means it has taken in that data and learned patterns from it that it can reuse. In the case of generative AI, a trained model's output will tend to closely resemble its source material until its training data is broad enough.
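If you want a feel for what "training" actually looks like, here's a deliberately tiny sketch in Python. The data and the "network" (just four numbers acting as weights) are invented purely for illustration and have nothing to do with the paper, but the loop is the core idea: compare the model's guesses to known answers, then nudge the weights to shrink the error, over and over.

```python
import numpy as np

# Toy illustration of "training". Everything here (the data, the network
# size, the learning rate) is made up purely for this example.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                       # 200 example inputs, 4 features each
true_w = np.array([1.5, -2.0, 0.5, 3.0])            # the hidden pattern in the data
y = X @ true_w + rng.normal(scale=0.1, size=200)    # the answers the network should learn

w = np.zeros(4)          # the network's weights start out knowing nothing
learning_rate = 0.05

for step in range(500):                # "training" = many small passes over the data
    pred = X @ w                       # what the network currently predicts
    error = pred - y                   # how far off it is
    grad = X.T @ error / len(y)        # which direction to nudge each weight
    w -= learning_rate * grad          # a small correction, repeated again and again

print("learned weights:", np.round(w, 2))   # ends up close to the pattern in the data
```

A model that generates Doom frames is doing the same basic loop, just with an enormously larger number of weights and with recorded gameplay instead of a handful of numbers.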
Though it’s obviously very impressive to run something this complex through a neural model, it’s worth noting that the game is played at a slow pace in the video, with plenty of cuts—clearly grabbing the most fluid and realistic moments of the game. This isn’t to diminish the work but to place it in context. You can’t go out and just play through Doom right now with the support of a neural network. It is a test and not much more than that right now.
Without getting into any ethical discussion of (specifically generative) AI use in games, it's worth noting that the accompanying paper acknowledges the experiment's limitations and makes an argument for the future of the tech. Due to the neural network's limited memory, it can only hold onto roughly three seconds of gameplay at a time. It does seem to keep track of HUD elements, but details like your position in the level and other game state can get lost while playing. It also fails to fully predict following frames, as you can see in the video above, with visual glitches and a lack of clarity in some areas.
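To picture why that three-second memory matters, here's a rough Python sketch. The frame rate, window size, and function names are my own assumptions for illustration, not the researchers' code, but the gist is the same: the model only ever conditions the next frame on a short rolling window of recent frames and inputs, so anything older simply isn't there to be remembered.

```python
from collections import deque

FPS = 20                                  # assumed frame rate for this sketch
CONTEXT_SECONDS = 3                       # the paper's stated memory is about 3 seconds
CONTEXT_FRAMES = FPS * CONTEXT_SECONDS    # ~60 frames of history (an assumption here)

history = deque(maxlen=CONTEXT_FRAMES)    # older frames silently fall off the back

class StandInModel:
    """Placeholder for the trained network, just so the loop below runs."""
    def predict(self, recent_history, action):
        # A real model would render a new frame from the recent frames and
        # the player's input; here we just return a description of the call.
        return f"frame conditioned on {len(recent_history)} past frames, action={action}"

def generate_next_frame(model, action):
    # Anything that happened more than CONTEXT_FRAMES ago is no longer in
    # `history`, so the model cannot remember it, which is why off-screen
    # state can drift or vanish during play.
    frame = model.predict(list(history), action)
    history.append((frame, action))
    return frame

model = StandInModel()
for action in ["forward", "forward", "shoot"]:
    print(generate_next_frame(model, action))
```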
Following on from this, the paper says “We note that nothing in our technique is Doom specific except for the reward function for the RL-agent”.
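For anyone wondering what that means in practice, a reward function is the bit of game-specific scoring that tells the RL agent which play is good, so it produces gameplay worth training the network on. Here's a purely hypothetical, Doom-flavoured example; the state values and weightings below are invented for illustration and aren't taken from the paper.

```python
# Hypothetical reward function: scores one step of the agent's play.
# The keys and numbers are made up for this example.

def doom_style_reward(prev_state: dict, state: dict) -> float:
    reward = 0.0
    reward += 1.0 * (state["kills"] - prev_state["kills"])      # reward making progress
    reward -= 0.5 * (prev_state["health"] - state["health"])    # penalise taking damage
    if state["level_complete"]:
        reward += 5.0                                           # big bonus for finishing
    return reward

# Example: the agent killed an enemy but lost 10 health on this step.
print(doom_style_reward(
    {"kills": 3, "health": 80, "level_complete": False},
    {"kills": 4, "health": 70, "level_complete": False},
))  # 1.0 - 5.0 = -4.0
```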
The paper goes on to say that the same basic network could be used to try to emulate other games, and argues that an engine like this could eventually replace or work alongside programmers on actual games. Since the network appears to be trained on specific games with specific mechanics, though, no argument is made for how this would translate to creating entirely new ones. The paper adds that such an engine could "include strong guarantees on frame rates and memory footprints", before conceding: "We have not experimented with these directions yet and much more work is required here, but we are excited to try!"
It is unclear how well this network will function outside of what we've seen thus far, and I'd encourage not making too many assumptions about the future of the tech just yet, but it is still rather impressive in itself, even if the paper's ambitions for the future seem a bit lofty.