Why LLMs Can Write Video Games But Can't Play Them: NYU Professor Explains

By Jaspal March 31, 2026 Updated: March 31, 2026

Here’s a paradox that should bother anyone who thinks AI is about to replace everything: LLMs can write the code for a video game in one prompt, but they absolutely cannot play one. NYU professor Julian Togelius explains why this matters more than you think.

Coding Is a Well-Designed Game

According to Togelius, director of NYU’s Game Innovation Lab, the reason LLMs excel at coding is that programming is essentially a perfectly designed game. You get a specification (the level), write code (the action), run it (the test), and get immediate, granular feedback — does it compile? Does it crash? Does it pass tests?
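The loop described above (specification, code, run, feedback) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything Togelius or any LLM vendor describes: the names `evaluate`, `play`, `TESTS`, and `ATTEMPTS` are hypothetical, the "specification" is a toy list of input/output pairs, and the three hard-coded attempts stand in for successive model outputs.

```python
# A toy version of the "coding as a game" loop: each attempt is scored with
# granular feedback (does it compile? does it crash? how many tests pass?).
# All names here are hypothetical and exist only for illustration.

def evaluate(source, tests, func_name="solve"):
    """Run one candidate program against the tests and return feedback."""
    feedback = {"compiles": False, "crashed": False,
                "passed": 0, "total": len(tests)}
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError:
        return feedback  # did not even compile
    feedback["compiles"] = True

    namespace = {}
    try:
        exec(code, namespace)          # define the candidate function
        func = namespace[func_name]
        for args, expected in tests:
            if func(*args) == expected:
                feedback["passed"] += 1
    except Exception:
        feedback["crashed"] = True     # runtime failure
    return feedback


# The "level": a specification expressed as (arguments, expected result) pairs.
TESTS = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

# Stand-ins for successive attempts at the same specification.
ATTEMPTS = [
    "def solve(a, b) return a + b",    # missing colon: syntax error
    "def solve(a, b): return a - b",   # runs, but the logic is wrong
    "def solve(a, b): return a + b",   # correct
]


def play(attempts, tests):
    """Try each attempt in order, stopping at the first that passes everything."""
    history = []
    for source in attempts:
        fb = evaluate(source, tests)
        history.append(fb)
        if fb["compiles"] and not fb["crashed"] and fb["passed"] == fb["total"]:
            break
    return history


if __name__ == "__main__":
    for step, fb in enumerate(play(ATTEMPTS, TESTS), start=1):
        print(step, fb)
```

In this sketch the first attempt fails to compile, the second passes one of three tests, and the third passes all of them. Every step returns a precise signal about what went wrong, which is the kind of immediate, granular feedback the paragraph above describes.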

“There’s a theory from game designer Raph Koster that games are fun because we learn to play them as we play them,” Togelius told IEEE Spectrum. “From that perspective, writing code is an extremely well-designed game.”

Why Video Games Break AI

Video games are a completely different problem. LLMs have little spatial reasoning ability, because that kind of knowledge is largely absent from their text training data. Even AlphaZero, which mastered both chess and Go, had to be trained separately for each game. No general game-playing AI exists.

Togelius ran a game AI benchmark competition for seven years and eventually stopped because progress plateaued. Agents got better at some games but worse at others. When they recently updated the framework for LLMs: “They fail. They absolutely suck. All of them. They don’t even do as well as a simple search algorithm.”

The Write-but-Can’t-Play Paradox

The strangest part: you can ask an LLM to code a playable Asteroids clone in one prompt, and it works. But the LLM cannot play the game it just created. Game development is iterative: you write, test, and adjust the feel. An LLM can't run that playtesting loop on its own.

“The LLM doesn’t know much about how to use it,” Togelius says. And this limitation extends beyond games: AI can build a GUI with buttons, but it has no idea how a human would actually interact with it.

Games Are Harder Than the Real World

Counterintuitively, games are more diverse than reality. The real world has the same physics everywhere, which is why Waymo works. Halo and Space Invaders, by contrast, differ from each other in more meaningful ways than two academic essays on quantum physics do.

The Bottom Line

The inability of LLMs to play video games isn’t a trivial limitation — it reveals a fundamental gap in how these models understand the world. They process language brilliantly but lack spatial reasoning, real-time adaptation, and the ability to learn through interaction. Until AI can play a game it’s never seen before, claims about general intelligence should come with a very large asterisk.