Farina's research team beats top human Stratego player

3 articles · Updated · MIT News · May 5

Using new algorithms and training costing under $10,000, the team won 15 games, drew four and lost one against the game's best-ever player.
The result marks superhuman performance in Stratego, a bluffing-heavy imperfect-information game where earlier major research efforts had spent millions without reaching that level.
Farina, an MIT assistant professor, said the techniques could feed into broader AI systems for strategic decision-making in complex multi-agent settings with hidden information.

Can AI designed to deceive humans in games also teach us to become better, more collaborative negotiators?

Will AI that masters game theory solve global challenges, or lead to outcomes that are logical but disastrous?

As AI masters strategic deception for under $10,000, how can society defend against its misuse?

Revolutionizing Imperfect Information AI: DeepNash’s Efficient Path to Superhuman Stratego Performance

Overview

In late 2022, DeepMind's DeepNash AI achieved superhuman performance in Stratego, a board game with extensive hidden information and an enormous game tree. It did so by combining model-free deep reinforcement learning with the Regularised Nash Dynamics (R-NaD) algorithm, which uses self-play to converge to robust, unexploitable strategies. DeepNash's neural architecture and training methods let it learn bluffing and deception, and dropping expensive search sharply reduced its computational cost. Open-sourcing DeepNash makes the approach more accessible to researchers and could help extend R-NaD to real-world problems such as traffic optimization, though scaling and ethical concerns remain open challenges.

...
