Nate and I had the great privilege of participating in the premiere broadcast of a series of matches between DeepStack, a state-of-the-art heads up no-limit hold ’em Artificial Intelligence, and human professionals. We found DeepStack to be a really tough competitor that left us questioning our play in both large and small pots. I’m sure we didn’t play nearly as well as heads up specialists would have, but it was great fun to try, and hopefully we did a good job of sharing the experience with the audience on Twitch. If you missed it, here’s a link to the replay!
Next week, Terrence Chan and Adam Schwartz of the 2+2 Pokercast will play DeepStack. I wanted to share some of my thoughts from the match with both them and the Thinking Poker community anyway, so I figure I might as well just collect my thoughts here.
- Bet Sizes. I haven’t discussed this with the Computer Poker Research Group, but it seems like there are only a few bet sizes that DeepStack considers for its own actions (though, as I understand it, its ability to respond to diverse bet sizes is one of its chief advances over previous NLHE AIs). For instance, into a pot of 1600, it might bet 800, 1600, or 3200, but it would never choose 2291 as a bet size unless that were its exact stack size.
This strikes me as the best opportunity to exploit DeepStack, though Terrence and Adam are probably more capable than I of determining how exactly to take advantage of that (it wasn’t something I actively tried to do during my match). Considering the range of bet sizes DeepStack does use, I suspect that generally it doesn’t lose much by not considering “weirder” amounts. However, this might be somewhat more problematic with shallow stacks, where never betting less than half pot (if that is even a constraint) might prevent it from having a bet-folding range at all.
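To make the bet-sizing point concrete, here is a rough sketch of what a discrete bet-size menu might look like. The pot fractions (half pot, pot, twice pot) are my own guess based only on the 800/1600/3200 example above, and `candidate_bets` is a hypothetical helper; none of this reflects DeepStack's actual implementation.

```python
from fractions import Fraction

# ASSUMPTION: these pot fractions are inferred from the 800/1600/3200
# example in the text, not taken from DeepStack itself.
ASSUMED_POT_FRACTIONS = (Fraction(1, 2), Fraction(1), Fraction(2))


def candidate_bets(pot, stack, fractions=ASSUMED_POT_FRACTIONS):
    """Return the sorted bet sizes available under a discrete menu.

    Any pot-fraction bet that would be at least the whole stack is
    dropped, and the all-in amount itself is always included.
    """
    sizes = {int(pot * f) for f in fractions if pot * f < stack}
    sizes.add(stack)  # all-in is always an option
    return sorted(sizes)


print(candidate_bets(1600, 20000))  # [800, 1600, 3200, 20000]
print(candidate_bets(1600, 2291))   # [800, 1600, 2291]
```

Under a menu like this, 2291 only shows up as a bet size when it happens to be the exact stack, which matches what I observed. With shallow stacks the menu shrinks quickly, which is where I'd guess the lack of smaller sizes might start to matter.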
- Threat of a Check-Raise. These were the spots where I felt I had the most difficulty setting aside my “feel” based on how human opponents tend to play and constructing minimally exploitable ranges. There are a lot of spots where (non-elite) human opponents don’t check-raise often. This is for a variety of reasons: lack of “obvious” bluffing candidates, difficulty of checking a strong hand multiple times, etc. As a result, I think I ended up with betting ranges that were sometimes too depolarized (getting raised off of strong draws or very-possibly-best made hands sucks) or simply too wide.
Example: There was one hand where I turned 84 into a bluff on AJ2Q4, and it check-raised me with 85o!
- Board Coverage. Nate and I talked a bit about this on stream. This is something you see when working with solvers as well, and is probably related to the threat of a check-raise discussed above. There are subtle things that DeepStack seems to do when making what might seem like arbitrary choices about candidates for floating or bluffing on early streets. The end result is a less predictable range on future streets.
For instance, I know that I want to have some Kx in my three-betting range when deep, and I typically choose some combination of KTs – KAs for this purpose. DeepStack almost certainly does a better job of getting the exact frequency right, but even if we miraculously had the same amount of Kx in our three-betting ranges, it probably builds its range by three-betting all combinations of Kxs at relatively low frequencies. This means it ends up connecting with boards like Q74 in three-bet pots in ways that I don’t. Likewise, its candidates for peeling or bluff-raising the flop can seem surprising, when the truth is that the choice is arbitrary in a vacuum but there is an incentive to reach turns and rivers with a wider variety of holdings than most humans do. Consequently, it’s harder (though still not impossible) to recognize a particular runout as good or bad for DeepStack based on its play on earlier streets.
- Surprising Play. DeepStack did more than a few things that surprised us. For the most part, we were willing to believe that it “knew” better and could, after the fact, wrap our heads around why it may have done what it did. But it made one play against me that I have a really hard time believing could possibly be correct.
At 200/400, I opened to 1200 with QTo, and DeepStack jammed 18,250 effective with 85o. When we’re talking about moving all in pre-flop, board coverage isn’t going to be a consideration, and although shoving ranges won’t be strictly linear because there will exist hands where calling > shoving > folding, it’s hard to imagine how folding could ever be correct if shoving 85o is +EV here.
It’s worth adding here that one feature of an equilibrium strategy is that it will not include “advertising” or “balancing” plays, even at a low frequency, that have a negative expected value. Now admittedly, DeepStack does not claim to have an equilibrium strategy, but the point is that shoving, even at a low frequency, can’t be justified simply by saying it’s a balancing play. It would have to have EV not less than 0 for shoving to be correct at any non-zero frequency.
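For anyone who wants to play with the numbers on that shove, here is a rough sketch of the EV arithmetic, measured relative to folding (so folding is worth exactly 0). I'm assuming that DeepStack was in the big blind with 400 already posted and that 18,250 is the effective stack at the start of the hand, blinds included. The fold frequency and the equity when called are hypothetical inputs you would have to supply; I don't know the true values.

```python
def shove_ev_vs_fold(fold_freq, equity_when_called,
                     stack=18_250, open_size=1_200, big_blind=400):
    """EV of jamming from the big blind, in chips, relative to folding.

    ASSUMPTIONS: the shover has already posted the big blind, `stack` is
    the effective stack at the start of the hand, and the opener either
    folds or calls for the full effective stack.
    """
    # If the opener folds, we win their raise and get our blind back.
    win_uncontested = open_size + big_blind
    # If called, we end up with our share of a 2 * stack pot, versus
    # the (stack - big_blind) we would have kept by folding.
    showdown = equity_when_called * 2 * stack - (stack - big_blind)
    return fold_freq * win_uncontested + (1 - fold_freq) * showdown


def breakeven_fold_freq(equity_when_called, **kwargs):
    """Fold frequency at which the shove has exactly zero EV vs. folding."""
    uncontested = shove_ev_vs_fold(1.0, equity_when_called, **kwargs)
    called = shove_ev_vs_fold(0.0, equity_when_called, **kwargs)
    if called >= 0:
        return 0.0  # the shove is fine even if it is always called
    return -called / (uncontested - called)


# Hypothetical input: 30% equity when called is a placeholder, not a
# measured figure for 85o against any real calling range.
print(round(breakeven_fold_freq(0.30), 3))
```

The point of the exercise is that the shove only belongs in the strategy at all if, for the real fold frequency and equity, `shove_ev_vs_fold` comes out at zero or better.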