Computers aren’t just unfeeling automatons that follow instructions and rapidly make millions of calculations — it turns out they can also bluff.
After conclusively besting humans at chess and Go, an artificial intelligence program has now beaten the world’s top players at poker. Libratus, an AI developed by researchers at Carnegie Mellon University, defeated four of the world’s top players in a 120,000-hand competition of heads-up no-limit Texas hold’em. In a paper published yesterday, Noam Brown and Tuomas Sandholm described how they built what might be the most powerful poker AI ever.
Libratus combines three separate modules that help it decide what to do with each hand. It relied mainly on reinforcement learning, a method of extreme trial and error: Libratus was taught only the rules of poker and then played game after game against itself. After trillions of games, it had come up with strategies that could help it beat its opponents.
The first moves the AI made were completely random. But through an algorithm called Monte Carlo counterfactual regret minimization, it studied its previous games to see which of its moves had eventually led it to win. It then started playing these moves more often. The researchers also coded the AI to bluff — it isn’t helpful if an AI becomes predictable in a game of poker.
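To give a feel for how regret-driven self-play works, here is a minimal sketch of regret matching, the basic update that counterfactual regret minimization builds on. It is not Libratus’ code and it is not the Monte Carlo variant: it uses rock-paper-scissors as a stand-in game, and every name in it (`PAYOFF`, `strategy_from_regrets`, `train`) is made up for illustration. Each player tracks how much it regrets not having played each move, then plays moves in proportion to that regret.

```python
# Illustrative sketch only: plain regret matching on rock-paper-scissors,
# not the Monte Carlo CFR variant and not Libratus' actual implementation.
ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

# PAYOFF[a][b] is the payoff to a player choosing a against an opponent choosing b.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to a uniform strategy.
    return [1.0 / ACTIONS] * ACTIONS


def train(iterations, initial_regrets):
    """Run two-player self-play and return each player's average strategy."""
    regrets = [list(initial_regrets[0]), list(initial_regrets[1])]
    strategy_sums = [[0.0] * ACTIONS, [0.0] * ACTIONS]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each action against the opponent's current mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(ACTIONS))
                for a in range(ACTIONS)
            ]
            expected = sum(
                strategies[player][a] * action_values[a] for a in range(ACTIONS)
            )
            for a in range(ACTIONS):
                # Regret: how much better action a would have done than the mix played.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    # Start each player with an arbitrary bias so there is something to unlearn.
    average = train(100_000, [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    print(average)
```

Over many iterations, each player’s average strategy drifts toward playing rock, paper and scissors about a third of the time each, the unexploitable mix for this game. Poker is vastly larger and involves hidden cards, but the idea of reinforcing moves the player regrets not having made is the same.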
Poker is an especially hard game for artificial intelligence to tackle. There are 10^163 (1 followed by 163 zeroes) different game situations in no-limit hold’em assuming starting stacks of 20,000 chips, making it impossible for even the strongest computers to calculate all possible game situations. Poker is also an imperfect-information game: unlike chess, where the computer knows the exact position of every piece on the board, in poker it must guess which cards its opponent holds.
Brown and Sandholm tested Libratus against four of the top poker players in the world (Jason Les, Dong Kim, Daniel McAulay, and Jimmy Chou) in a 120,000-hand Brains vs. AI challenge match over 20 days in January this year. A prize pool of $200,000 was allocated to the four human players in aggregate. Each human was guaranteed $20,000 of that pool. The remaining $120,000 was divided among them based on how much better each human did against Libratus than the worst-performing of the four. Libratus came up trumps decisively: it defeated the humans by a margin of 147 milli-big blinds (thousandths of a big blind) per hand. It also beat all four humans individually.
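The prize arithmetic can be sketched in a few lines. The article only says the remaining money was divided "based on" each player’s margin over the worst performer, so the strictly proportional split below is an assumption, and the function name, player labels and scores are hypothetical placeholders rather than real match results.

```python
# Illustrative sketch: assumes the bonus pool is split in direct proportion
# to each player's margin over the worst performer (an assumption, not a
# rule confirmed by the source).
def split_prize_pool(results, pool=200_000, guaranteed=20_000):
    """Return each player's payout given their results against the AI."""
    bonus_pool = pool - guaranteed * len(results)  # $120,000 for four players
    worst = min(results.values())
    margins = {name: score - worst for name, score in results.items()}
    total_margin = sum(margins.values())
    payouts = {}
    for name, margin in margins.items():
        if total_margin > 0:
            bonus = bonus_pool * margin / total_margin
        else:
            # Everyone tied: share the bonus pool equally.
            bonus = bonus_pool / len(results)
        payouts[name] = guaranteed + bonus
    return payouts


if __name__ == "__main__":
    # Hypothetical results in chips; not the actual match outcomes.
    made_up_results = {"A": -100, "B": -200, "C": -300, "D": -400}
    print(split_prize_pool(made_up_results))
```

Under this rule the worst-performing player collects only the guaranteed $20,000, and the payouts always add up to the full $200,000.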
“It was really difficult for us to play,” said Dong Kim, one of the beaten human players. “We would bring a strategy and it would be good for the day of, and then the next day it would bring something new to the table. And we were not ready for that, so it was overall really, really tough.” Kim said the AI played as though it could see his cards. “I’m not accusing it of cheating,” he said. “It was just that good.”