Poker has always been considered the quintessential imperfect-information game: imperfect-information meaning that no player has full knowledge of the game state (in poker, the other players' cards). While we as computer scientists have made great strides in developing AI to play perfect-information games like chess and checkers, it is only recently that we've been able to crack the world of imperfect-information games.

In 2017, the DeepStack AI was revealed to the world, boasting the ability to beat professional human players at poker's most popular variant: Texas hold'em. More specifically, the AI was designed to play the heads-up no-limit variant of Texas hold'em, which means the game has no cap on bet sizes and is restricted to 2 players. Interestingly, since the game involves 2 players and imperfect information, it becomes a prime candidate for analysis with game theory.
Game theory can help us, for instance, determine how often we should be value-betting vs. bluffing when we’re on a given range.
Let’s break that terminology down a bit. Texas hold’em poker consists of 4 betting rounds, and the actions you take in each round [check, call, raise, fold] can give away how strong your hand could be. The likely range of hands you could be holding based on these actions is known as your range.
Now consider the bet on the river (after all community cards have been revealed) in a situation where you've pegged your opponent on a very tight range. At this point, you know for sure whether you have the winning or losing hand, but either one could still let you win the pot. If you have the winning hand, you want to value-bet: raising just enough that your opponent calls and you win a bigger pot. Conversely, if you have the losing hand, you want to bluff: raising enough to intimidate your opponent into folding, so you still win the pot.
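To make the trade-off concrete, here is a minimal sketch of the expected value of each line. The helper functions and parameter names are illustrative (not from DeepStack or any poker library); payoffs are measured relative to checking the hand down, and we assume a bluff loses whenever it is called.

```python
def value_bet_ev(pot, bet, call_prob):
    # Holding the winner: we take the pot whether the opponent calls or
    # folds; a call adds their matching bet to our winnings.
    return call_prob * (pot + bet) + (1 - call_prob) * pot

def bluff_ev(pot, bet, call_prob):
    # Holding the loser: a fold wins us the pot,
    # but a call costs us the bet we put in.
    return call_prob * (-bet) + (1 - call_prob) * pot
```

With a $100 pot and a $100 bet, an opponent who calls half the time makes the value-bet worth $150 on average, while the bluff breaks even; how often the opponent calls is exactly what the game-theoretic analysis below pins down.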
With all this in mind, consider the following scenario:
- The pot is sitting at $100
- The river card has just been revealed
- The opponent has placed you in some range R
- You’ve placed your opponent in a very narrow range
- Action starts with you
Here, you have 2 'games' to play out, depending on whether you've determined you hold the winning or losing hand.
This is the game you play if you have the winning hand (payoffs to Player A):

| Player A \ Player B | Call | Fold |
| --- | --- | --- |
| ValueBet | $200 | $100 |
| Fold | $0 | $0 |
This is the game you play if you have the losing hand (payoffs to Player A):

| Player A \ Player B | Call | Fold |
| --- | --- | --- |
| Bluff | -$100 | $100 |
| Fold | $0 | $0 |
Note that these games show only one payoff value per cell. Only Player A's payoffs appear in these tables because Player B is playing an entirely different game: action in poker is sequential, not simultaneous, which means Player B can, and should, react to Player A's choice before making their own. Player B only has 1 game to play, shown below:
| Player B \ Player A | ValueBet | Bluff |
| --- | --- | --- |
| Call | -$100 | $200 |
| Fold | $0 | $0 |
Notice that there is no option for Player A to fold. The game above can be treated as simultaneous, since B does not know whether A value-bet or bluffed; had A folded, B would know and would simply make $100.
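B's situation can be sketched directly from the table above. In this hypothetical helper, q is the probability that A is value-betting (rather than bluffing) when the bet comes in:

```python
def b_call_ev(q, call_vs_vb=-100, call_vs_bluff=200):
    # B's expected payoff for calling, given a belief that A
    # value-bets with probability q (payoffs from the table above).
    return q * call_vs_vb + (1 - q) * call_vs_bluff

def b_best_response(q):
    # Folding always pays $0, so B calls whenever calling
    # has strictly positive expectation.
    return "Call" if b_call_ev(q) > 0 else "Fold"
```

If A value-bets too rarely (small q), B profits by always calling; if A value-bets too often, B should always fold. This is why A must commit to a specific mix.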
Now, since A cannot control which game they are playing, they have to play in a way that maximizes their profit regardless of what B plays. Since B only has a choice when A doesn't fold, A has to settle on a ValueBet:Bluff ratio (folding when needed to maintain it). We can find this ratio by solving for q in the mixed Nash equilibrium of B's game (as seen in lecture).
It turns out that q = 2/3, meaning that A should value-bet twice as often as they bluff if they want B to be indifferent between calling and folding. We know B will be indifferent because all of this happens while they believe A is in range R, so they have no other information to work with. We can now find A's expected payoff: if B chooses to call, then A gets 2/3(200) + 1/3(-100) = $100, and if B chooses to fold, then A gets 2/3(100) + 1/3(100) = $100. Therefore, by using game theory, Player A can guarantee an expected payoff of $100 when playing range R.
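The equilibrium calculation above is short enough to check by hand in plain Python (the variable names are mine; the payoffs come from the tables):

```python
# B's payoffs from B's game: Call vs ValueBet = -100, Call vs Bluff = +200,
# Fold = 0 either way.
call_vs_vb, call_vs_bluff = -100, 200

# Indifference for B: q*(-100) + (1-q)*(+200) = 0, solved for q.
q = call_vs_bluff / (call_vs_bluff - call_vs_vb)  # = 200/300 = 2/3

# A's payoffs from A's two games, weighted by the ValueBet:Bluff mix.
ev_if_b_calls = q * 200 + (1 - q) * (-100)  # value-bet called / bluff called
ev_if_b_folds = q * 100 + (1 - q) * 100     # B folds either way
```

Both branches come out to $100, confirming that the 2:1 mix makes B's choice irrelevant to A's bottom line.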
Obviously, this was a very simple and isolated example, and DeepStack employs many more sophisticated techniques to produce its results. But it is insightful to see how just the basics of game theory can be directly applied to better your poker play. GTO [Game Theory Optimal] poker is a growing trend in the scene, and for good reason. After all: if you can't beat them, join them.
References
- Corrigan, Rory. “How You Should Think About Poker (But Probably Don’t).” Upswing Poker, 8 Dec. 2017.
- Webpage found at: https://upswingpoker.com/gto-poker-game-theory-optimal-strategy/
- Moravčík, Matej, et al. "DeepStack: Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker." Science, vol. 356, 2017. doi:10.1126/science.aam6960.
- Companion Webpage found at: https://www.deepstack.ai