Measuring AI Advancement Through Games in 2024

Artificial intelligence has made remarkable advances in recent years, with AI agents reaching new heights in complex strategy games. Games have long served as benchmarks for AI capabilities, providing controlled environments where researchers can test and measure progress. In 2024, games continue to push the boundaries of AI, revealing how far algorithms have come and how much further they must go to match human intelligence.

A Brief History of AI in Games

Game-playing has been an AI research area since the earliest days of the field. Claude Shannon laid out the principles of chess programming in 1950, and the Los Alamos chess program of 1956 played a simplified six-by-six version of the game. In the mid-1950s, Arthur Samuel created checkers programs that could learn from their own gameplay.

As computing power increased, more complex games were tackled. In 1997, IBM's Deep Blue defeated world chess champion Garry Kasparov, demonstrating an AI capable of mastering one of the most studied games in history. This was an impressive feat at the time, achieved through sheer computing power and efficient tree-search algorithms: Deep Blue could evaluate 200 million chess positions per second.
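
The tree-search idea behind classical chess engines can be sketched in a few lines. Below is minimax with alpha-beta pruning over a toy game tree; Deep Blue's actual search was vastly more engineered (specialized hardware, handcrafted evaluation, search extensions), so treat this as the core idea only, with all names and values invented for illustration.

```python
# Minimax with alpha-beta pruning over a toy game tree. This is the
# core idea only: real chess engines add specialized evaluation
# functions, move ordering, and many search extensions.

def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    """Return the minimax value of `state`, skipping branches that
    cannot change the final decision."""
    kids = children(state)
    if depth == 0 or not kids:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:   # the minimizing player will avoid this line
                break
        return value
    value = float("inf")
    for child in kids:
        value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                     True, children, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

# Toy tree: the root chooses between two replies; leaves are static scores.
tree = {"root": ["a", "b"], "a": [3, 5], "b": [2, 9]}
value = alphabeta("root", 2, float("-inf"), float("inf"), True,
                  children=lambda s: tree.get(s, []),
                  evaluate=lambda s: s if isinstance(s, int) else 0)
print(value)   # 3: the maximizer picks "a", whose worst case is 3
```

Pruning is what makes deep search feasible: once the maximizer has a guaranteed 3 from branch "a", seeing the 2 in branch "b" is enough to abandon it without evaluating the 9.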

Kasparov vs Deep Blue

Kasparov vs Deep Blue in 1997. Source: BBC

As AI advanced, more complex games with larger state spaces and imperfect information challenged researchers. In 2011, IBM Watson defeated top human players at Jeopardy!, showcasing its natural language processing abilities. Watson could parse clues and rapidly rank probable responses by consulting its 200-million-page knowledge base.

The following years saw AI achieve superhuman performance in classic Atari video games using reinforcement learning. DeepMind's DQN agent, learning directly from raw pixels, surpassed professional human play in over half of the Atari games tested in 2015.
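
The reward-driven update rule underlying DQN can be shown in its simplest, tabular form. Here is a sketch of Q-learning on a toy five-state corridor; all states, rewards, and parameters are invented for illustration, and DQN's contribution was replacing this table with a deep network over game pixels.

```python
import random

# Tabular Q-learning on a five-state corridor: the same reward-driven
# update rule that DQN scales up with a deep network over raw pixels.
# States 0..4; reaching state 4 pays +1. All parameters are illustrative.

N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                      # step left or step right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.3       # learning rate, discount, exploration

random.seed(0)
for _ in range(500):                    # training episodes
    s = 0
    while s != GOAL:
        if random.random() < eps:       # epsilon-greedy exploration
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        # Bellman update: nudge Q toward reward + discounted best future value
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# The learned greedy policy should step right (+1) from every state.
policy = [max(ACTIONS, key=lambda b: Q[(s, b)]) for s in range(GOAL)]
print(policy)
```

The agent is never told the rules; it discovers the "walk right" policy purely from trial, error, and delayed reward, which is the same learning signal DQN used for Atari.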

Atari

DeepMind's AI playing Atari games. Source: DeepMind

Go long resisted AI mastery due to its enormous search space: the game tree has a complexity on the order of 10^360, far greater than chess. But DeepMind's AlphaGo beat top professional player Lee Sedol 4-1 in 2016 by combining Monte Carlo tree search with deep neural networks. DeepMind continued to push Go AI to new heights, with AlphaGo Zero achieving superhuman performance purely through self-play learning.
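
The selection rule at the heart of Monte Carlo tree search can be sketched on a one-step toy "tree" (a two-armed bandit) for brevity. AlphaGo's full search repeats select/expand/simulate/backup down a deep tree, with neural networks guiding selection and evaluation; the actions and payoff probabilities below are invented for illustration.

```python
import math
import random

# UCB1 selection, the core of Monte Carlo tree search, on a one-step
# toy problem: action "R" wins a simulated playout 70% of the time,
# action "L" only 30%. The search should concentrate visits on "R".

random.seed(1)
stats = {"L": [0, 0.0], "R": [0, 0.0]}   # action -> [visits, total reward]

def rollout(action):
    # Random playout from the chosen action to the end of the game.
    p = 0.7 if action == "R" else 0.3
    return 1.0 if random.random() < p else 0.0

total = 0
for _ in range(2000):
    total += 1

    def ucb(a):
        # Mean reward plus an exploration bonus for rarely tried moves.
        n, w = stats[a]
        if n == 0:
            return float("inf")
        return w / n + math.sqrt(2 * math.log(total) / n)

    a = max(stats, key=ucb)
    stats[a][0] += 1
    stats[a][1] += rollout(a)   # back up the simulation result

best = max(stats, key=lambda a: stats[a][0])   # pick the most-visited action
print(best)
```

The exploration bonus shrinks as an action is sampled more, so the search keeps checking the weaker move occasionally while spending most simulations on the stronger one; AlphaGo's innovation was to bias this rule with a neural network's move priors and value estimates.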

Go

Lee Sedol reviewing his match against AlphaGo in 2016. Source: BBC

AI bots have also reached expert levels in poker, a game of hidden information and bluffing. Libratus, from Carnegie Mellon University, defeated top professionals at heads-up no-limit Texas hold 'em in 2017. Libratus employed a layered strategy of precomputed responses, real-time search, and self-improvement to win over $1.7 million in chips from poker professionals.

More recently, OpenAI Five became the first AI to defeat the reigning world champions at the multiplayer online battle arena game Dota 2. This marked a major milestone: Dota 2 presents a complex, real-time environment with hidden information, requiring agents to collaborate and think strategically. OpenAI Five reached this level through large-scale self-play reinforcement learning, accumulating the equivalent of tens of thousands of years of gameplay.

Dota 2

OpenAI Five playing Dota 2. Source: The Guardian

So in over 60 years of AI game research, we've gone from checkers to chess, Jeopardy!, and Go, and on to complex video games. However, the timeline shows slower progress in games that require handling imperfect information or modeling human psychology and behavior. This remains an ongoing challenge.

Ongoing Challenges in Game AI

While AI has achieved impressive results in games, open challenges remain, especially in complex multiplayer environments.

StarCraft has become a major AI challenge problem, with competitions testing agents in real-time strategy gameplay. The AIIDE StarCraft AI Competition has run annually since 2010, originally built on StarCraft: Brood War. In StarCraft II, the state of the art is DeepMind's AlphaStar, which reached Grandmaster level on the competitive 1v1 ladder in 2019. However, no AI has consistently defeated top professional players in full, unrestricted matches. StarCraft II gameplay requires strategic decision-making, spatial reasoning, memory, and execution under time pressure. AI still struggles with higher-level strategies like economic macromanagement and with adapting strategies across matches. Human unpredictability also poses a challenge.

StarCraft II

StarCraft II remains challenging for AI. Source: Blizzard

Open-world video games like Minecraft present an even greater challenge. With vast environments and endless possibilities, developing AI agents that can assist or compete with humans in these sandboxes will require general intelligence on par with humans. Minecraft involves navigating a 3D first-person environment, gathering resources, creatively building structures, and more. OpenAI's Hide and Seek environment captures some of these challenges in a simplified physics sandbox, but human-level Minecraft play remains far beyond current AI.

Imperfect-information games also continue to test AI abilities. No AI has consistently defeated top human players in bridge, and poker AI remains limited outside specific formats. In bridge, card play with complete information (the "double dummy" problem) is solved computationally, but bidding and play with hidden hands, where partners communicate through a constrained bidding language, remain difficult for AI. And bluffing remains an obstacle in poker, as humans leverage intuition and psychology to mislead opponents.

Human gameplay is often unpredictable and suboptimal. So AI trained through self-play can develop habits and blind spots that humans can exploit. For example, Dota 2 professionals noticed predictable patterns in OpenAI Five's gameplay they could capitalize on. Agents that more closely model human psychology and adaptability remain an area of ongoing research.

Multi-agent collaboration and competition also increase complexity, as AI must model other actors' mental states and incentives. Modeling theory of mind to cooperate, coordinate, and deceive like humans remains an open challenge. The cooperative card game Hanabi is a good example of a benchmark for these collaborative skills.

So while AI excels in certain domains, matching the general gameplay abilities of humans across a variety of titles remains an elusive goal.

Adoption of AI in Gaming

While pushing boundaries of AI, games have also been early adopters of leveraging AI and machine learning commercially.

  • Game design: AI generators can now create game maps and scenarios programmatically, helping designers scale content far beyond what manual authoring allows.

  • Game testing: Automated testing uses AI agents to play games repeatedly to detect bugs. This is faster and more comprehensive than manual playtesting.

  • NPC behavior: Non-player character AI controls behaviors like pathfinding, tactics, and decision making. Games leverage ML to make NPCs more realistic and reactive.

  • Procedural content: AI can procedurally generate detailed content like environments, sounds, and visual effects. This adds uniqueness and variety.

  • Player modeling: Understanding different player types and their behaviors through ML models allows adaptation to playstyle and personalization.

  • Cheat detection: ML is widely used to detect cheating bots based on player inputs and actions. This helps secure multiplayer games.

  • Recommendation systems: ML systems track player preferences to recommend new games, in-game items, and connections.
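
As a concrete illustration of the procedural content item above, here is a minimal cellular-automaton cave generator: start from random noise, then repeatedly smooth it into connected cave-like regions. The grid size, fill rate, and 5-neighbor rule are illustrative parameters, not drawn from any particular engine.

```python
import random

# Procedural map generation with a cellular automaton: random noise is
# smoothed into cave-like regions. True = wall, False = floor.

random.seed(42)
W, H, FILL, STEPS = 20, 10, 0.45, 4

# Start from random noise.
grid = [[random.random() < FILL for _ in range(W)] for _ in range(H)]

def wall_neighbors(g, x, y):
    count = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == dy == 0:
                continue
            nx, ny = x + dx, y + dy
            # Treat out-of-bounds as wall so the map edges close up.
            if not (0 <= nx < W and 0 <= ny < H) or g[ny][nx]:
                count += 1
    return count

# Smoothing rule: a cell becomes wall when 5+ of its 8 neighbors are walls.
for _ in range(STEPS):
    grid = [[wall_neighbors(grid, x, y) >= 5 for x in range(W)] for y in range(H)]

for row in grid:
    print("".join("#" if cell else "." for cell in row))
```

Changing the seed yields a different but similarly structured map, which is the appeal of the technique: one small rule set produces endless level variety.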

Adoption is widespread, from large studios like Ubisoft, EA, and Activision to indie developers. According to the GDC State of the Game Industry survey, close to 80% of developers already use or plan to integrate AI.

AI is driving the future of gaming by enabling new experiences, efficiencies and personalization for developers and players.

AI Gameplay in 2024

In 2024, these are some of the notable games serving as AI benchmarks and competitions:

StarCraft II

Blizzard's StarCraft II continues to be a platform for testing real-time strategy AI. Competitions such as the long-running AIIDE StarCraft AI Competition see bots compete in full matches on competitive maps.

AIIDE StarCraft Competition

AIIDE 2017 StarCraft Competition Example. Source: ResearchGate

State-of-the-art StarCraft II bots like DeepMind's AlphaStar rank above 99.8% of active human players, but no AI has consistently beaten top professionals. The difficulties outlined earlier (strategic decision-making under time pressure, economic macromanagement, and adapting to unpredictable human opponents) still apply.

Dota 2

OpenAI Five demonstrated superhuman Dota 2 performance, but only under a restricted pool of heroes and simplified rules. Matches with the full roster remain beyond current AI.

Developing AI that can handle Dota 2's complexity with over 100 unique characters remains an ongoing challenge. Compared to humans, OpenAI Five also lacked longer-term strategic thinking, relying more on tactical teamfights.

Poker

Success at limit Texas hold 'em led researchers to tackle no-limit poker, with its far greater complexity. But the card and betting abstractions used to make the game tractable weakened performance against humans.

Newer systems like Facebook's Pluribus combine self-play with real-time search, and in 2019 Pluribus defeated professionals in six-player no-limit hold 'em. Developing AI that can model human psychology and bluff successfully in full-ring games of nine or ten players remains an open challenge.
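
The regret-minimization idea underlying the counterfactual regret minimization (CFR) family that modern poker agents build on can be sketched with regret matching on a single decision. Here one player adapts to a fixed, biased rock-paper-scissors opponent; the opponent's mix and all parameters are invented for illustration.

```python
import random

# Regret matching on a single decision: the building block of the CFR
# family used by poker agents. One player adapts to a fixed, biased
# rock-paper-scissors opponent.

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    if mine == theirs:
        return 0.0
    return 1.0 if BEATS[mine] == theirs else -1.0

regret = {m: 0.0 for m in MOVES}
strategy_sum = {m: 0.0 for m in MOVES}

def current_strategy():
    # Play in proportion to positive accumulated regret.
    positive = {m: max(r, 0.0) for m, r in regret.items()}
    total = sum(positive.values())
    if total == 0:
        return {m: 1 / 3 for m in MOVES}
    return {m: v / total for m, v in positive.items()}

random.seed(0)
for _ in range(5000):
    strat = current_strategy()
    for m in MOVES:
        strategy_sum[m] += strat[m]
    mine = random.choices(MOVES, weights=[strat[m] for m in MOVES])[0]
    theirs = random.choices(MOVES, weights=[0.5, 0.3, 0.2])[0]  # rock-heavy opponent
    for m in MOVES:
        # Regret of not having played m instead of the chosen move.
        regret[m] += payoff(m, theirs) - payoff(mine, theirs)

avg = {m: strategy_sum[m] / 5000 for m in MOVES}
print(max(avg, key=avg.get))
```

The average strategy converges toward paper, the best response to a rock-heavy opponent; CFR applies this same accumulate-regret-and-reweight loop at every decision point of the enormous poker game tree.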

Poker AI Milestones

Timeline of AI milestones in poker. Source: Anthropic

Hide and Seek

Researchers have developed environments like Hide and Seek for testing fundamental AI capabilities. In these simulations, hider bots attempt to evade seeker bots in complex, partially observable grids.

State-of-the-art hide and seek AI still falls short of human cognition – seekers struggle with deducing hidden locations from incomplete information. This models the challenge of handling hidden information in the real world.

Hide and Seek

Example hide and seek environment. Source: Analytics Vidhya

Minecraft

Minecraft provides a more open-ended environment to develop intelligent agents. AI agents must navigate 3D first-person environments, gather resources, and creatively build structures and items.

OpenAI has trained reinforcement learning agents in Minecraft, but their capabilities remain limited compared to human players. Developing AI that can master Minecraft's near-infinite possibilities remains an unsolved grand challenge.

The AI Olympics Benchmark

To systematically measure and track AI progress across a spectrum of games, researchers have proposed the "AI Olympics." This benchmark would include a standardized set of games curated to test core aspects of intelligence.

Some suggested AI Olympic events include:

  • Memory challenges – Memorization and recall tasks test working memory critical to general intelligence.

  • Maze navigators – Assessing navigation skills in 2D and 3D environments with partial observability.

  • Physics prediction – Modeling basic physics like projectile motion and collisions.

  • Game imitators – Observing and learning to mimic human gameplay across genres.

  • Reverse engineering – Deducing game rules and mechanics from observations.

  • Game theory challenges – Multi-agent games requiring strategic reasoning about others.

  • Virtual decathlon – A range of virtual athletics events testing fine-grained motor control.

By standardizing metrics and levels, the AI Olympics could measure progress in these domains year over year. And bonus challenges against human benchmarks would reveal how AI capabilities compare. This concept neatly encapsulates the spirit of testing AI through games.
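
One established way to standardize metrics across games, already used in Atari benchmarks, is the human-normalized score: 0.0 means random-agent level, 1.0 means human level, and values above 1.0 are superhuman. A sketch, with illustrative baseline numbers rather than reported results:

```python
# Human-normalized scoring: puts different games on a common scale by
# anchoring each game's raw score between a random agent (0.0) and a
# human reference (1.0). The numbers plugged in below are illustrative.

def human_normalized(agent, random_score, human_score):
    return (agent - random_score) / (human_score - random_score)

results = {
    "breakout": human_normalized(agent=400.0, random_score=1.7, human_score=30.5),
    "montezuma": human_normalized(agent=0.0, random_score=0.0, human_score=4753.3),
}
for game, score in sorted(results.items()):
    print(f"{game}: {score:.2f}")
```

Normalization makes the AI Olympics idea workable: a score of 13.8 on one game and 0.0 on another can be compared and aggregated, exposing both superhuman strengths and total failures in the same report.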

Key Game AI Trends in 2024

Some promising trends gaining traction in game AI research:

  • Rapid learning – Going from beginner to expert play in a matter of hours through techniques like imitation learning. This more closely resembles human adaptability.

  • Self-play approaches – Training agents against incrementally improved versions of themselves unlocks new levels of performance in games like Go and poker.

  • Generative modeling – Models like AlphaStar's neural network build abstract representations of games, better handling new scenarios.

  • Theory of mind – New models aim to better capture human psychological states and mental models for games requiring deception and cooperation.

  • Model-free learning – Techniques like deep reinforcement learning make minimal assumptions about the game and have achieved superhuman performance in many titles.

  • Sim2real transfer – Transferring policies from simulation to real-world robotic systems could enable learning physical skills through games.
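
The self-play loop from the list above can be illustrated in miniature with fictitious play on rock-paper-scissors: two copies of the same agent best-respond to each other's empirical move frequencies, and their average behavior drifts toward the equilibrium strategy. A toy sketch, with all details invented for illustration:

```python
# Toy self-play: two copies of the same agent play rock-paper-scissors,
# each learning by fictitious play, i.e. best-responding to the
# opponent's empirical move frequencies. Self-play training in Go or
# poker follows the same loop at vastly larger scale.

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

# Each player's running tally of the opponent's observed moves
# (initialized to 1 each to avoid an empty model).
counts = [{m: 1 for m in MOVES}, {m: 1 for m in MOVES}]

def best_response(opp_counts):
    # Play the move that beats the opponent's most frequent move.
    predicted = max(opp_counts, key=opp_counts.get)
    return next(m for m in MOVES if BEATS[m] == predicted)

for _ in range(3000):
    a = best_response(counts[0])
    b = best_response(counts[1])
    counts[0][b] += 1   # player 0 observes player 1's move
    counts[1][a] += 1   # player 1 observes player 0's move

# Empirical play should approach the uniform Nash equilibrium.
freqs = [round(counts[1][m] / sum(counts[1].values()), 2) for m in MOVES]
print(freqs)   # each frequency close to 1/3
```

Neither agent is ever shown the equilibrium; it emerges because each improvement by one copy changes the problem the other copy must solve, the same dynamic that drove AlphaGo Zero and poker self-play systems.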

The Future of Game AI

In the coming years, games will continue pushing AI capabilities forward. Multi-agent collaboration and competition will be a driving force as AI is integrated into online video games. Massively open environments like the Metaverse will require new levels of social intelligence. And replicating arbitrary human skills in virtual worlds remains an AI grand challenge.

Surpassing human intelligence across the diverse spectrum of possible games remains far off. Game AI breakthroughs highlight progress but also reveal persistent limitations. As games increase in complexity, they will continue to humble AI agents and provide benchmarks to focus research efforts. Mastering games is akin to mastering many facets of life – a challenge AI researchers will be unraveling for decades to come.

While no game yet captures the full complexity of the real world, AI game research provides crucial insights into developing more human-like intelligence. And games will likely be the first domain where AIs consistently surpass human capabilities across the board. So game AI represents a critical milestone on the long road toward artificial general intelligence.