AI Game Playing: The Gardner Test

Source: scientificamerican.com

Published on June 16, 2025

Testing AI with Games

One way to develop AI models that are both safe and powerful is to test their capacity to pick up games on the fly. A new challenge has been proposed to assess how well AIs can adapt to, and adhere to, rules they have never seen.

Tic-tac-toe illustrates both the simplicity and the hidden complexity of games. Martin Gardner explored its strategic depth and its many variations, such as reverse (misère) play and three-dimensional boards. Gardner's games may offer insights into making artificial intelligence more humanlike: games, with their explicit rules, highlight a distinctly human ability. Even as AI models begin to mirror human thought, they still struggle to navigate novel rules. Overcoming this matters because the path to artificial general intelligence runs through AIs that can understand and follow rules.

The Gardner Test

A new evaluation, the Gardner test, would task an AI with playing a game whose rules it has never encountered, without human assistance. The surprise could be achieved simply by revealing the rules only at the start of the game. The Gardner test draws inspiration from general game playing (GGP), a field shaped by Michael Genesereth, in which competing AIs receive a game's rules at the start of a match, written in a formal mathematical language. The proposed test seeks to advance this by presenting the rules in natural language instead.
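The GGP idea of delivering rules as data at match start can be illustrated with a minimal sketch. The rule format below is hypothetical, not the actual Game Description Language used in GGP competitions; it just shows a program deriving legal moves and the winner from a rules object alone, using tic-tac-toe as the example game.

```python
# A hypothetical rules object handed to the player at match start.
# Real GGP systems use a formal language (GDL); this is an illustration.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

TIC_TAC_TOE = {
    "players": ("X", "O"),
    "initial": (" ",) * 9,      # 3x3 board, flattened
    "win_lines": WIN_LINES,
}

def legal_moves(rules, board):
    """Legal moves (empty cells), computed from the rules and state alone."""
    return [i for i, cell in enumerate(board) if cell == " "]

def winner(rules, board):
    """Return the winning player, if any line named in the rules is filled."""
    for a, b, c in rules["win_lines"]:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None
```

Because the player consults only the rules object, swapping in a different game (say, a 3-D board with different win lines) requires no retraining, only a different rules object.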

This goal is now more feasible thanks to breakthroughs in large language models (LLMs). The challenge could draw on games from past GGP competitions, such as Connect Four, Hex, and Pentago, as well as games Gardner wrote about. Input from the GGP research community, AI model developers, and Martin Gardner enthusiasts could sharpen the test's design.

Adapting to New Rules

Passing the test means building an AI that can master any strategy game on the fly. Strategy games demand thinking ahead, adapting to an opponent's responses, shifting objectives, and conforming to rules. Current AI models typically depend on knowing the rules in advance so their algorithms can be trained. AlphaZero, for example, plays chess, Go, and shogi at a superhuman level, but the rules must be fixed before training begins; faced with a game it has never seen, it will struggle.
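The contrast can be made concrete with a sketch of a generic player that is handed a game's rule functions only at match time and plays by exhaustive minimax search, with no pretraining. Everything here (the `rules` dictionary, the function names) is an illustrative assumption, not AlphaZero's or any GGP system's actual interface.

```python
def minimax(state, to_move, rules, player):
    """Exact game value of `state` for `player` under the supplied rules."""
    w = rules["winner"](state)
    if w is not None:
        return 1 if w == player else -1
    moves = rules["moves"](state)
    if not moves:
        return 0  # no moves and no winner: a draw
    nxt = rules["next_player"](to_move)
    vals = [minimax(rules["apply"](state, m, to_move), nxt, rules, player)
            for m in moves]
    return max(vals) if to_move == player else min(vals)

def best_move(state, to_move, rules):
    """Choose a minimax-optimal move for the player to move."""
    nxt = rules["next_player"](to_move)
    return max(rules["moves"](state),
               key=lambda m: minimax(rules["apply"](state, m, to_move),
                                     nxt, rules, to_move))

# Example rules supplied "at match time": ordinary tic-tac-toe.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]
TTT = {
    "moves": lambda s: [i for i, c in enumerate(s) if c == " "],
    "apply": lambda s, m, p: s[:m] + (p,) + s[m + 1:],
    "winner": lambda s: next((s[a] for a, b, c in LINES
                              if s[a] != " " and s[a] == s[b] == s[c]), None),
    "next_player": lambda p: "O" if p == "X" else "X",
}
```

Because the player never saw the rules before the match, handing it a different `rules` dictionary, even one for a game invented minutes earlier, changes its behavior immediately; that on-the-fly adaptability is what the Gardner test would probe.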

An AI model that performs well on the proposed test would adapt to new rules, even without training data, and follow any rule set with precision. Some AI systems can already play variants of simple games, but they often need careful prompting, and specialized tools can err by reproducing patterns from their training data rather than obeying the rules actually in force. An AI that passes the Gardner test would follow the rules exactly.

Such errors could have serious consequences in domains like national security or finance. AI systems that reliably follow rules could lead to machine intelligences with far greater flexibility. Humans generalize readily: strong game players can compete credibly across many different games. Playing games under unfamiliar rules is therefore a key step in evolving AI, potentially producing systems that are broadly capable yet rule-abiding. Testing the ability to play games on the fly might be the way to achieve AI that is powerful but safe.