David Wan – CSCC46 2020 Course Blog

Communities. When most of us think about them, we usually picture a group of people with strong ties to one another. They trust each other and, more importantly, are willing to cooperate to achieve a goal. But being the evil little gremlin I am, I wondered to myself: how strong are these communities actually? What would it take to completely demolish a community’s willingness to cooperate? Then I stumbled upon an article titled “Information Cascades and the Collapse of Cooperation” by Yang et al. I was instantly intrigued and felt the need to write about it. I wanted to gain insight into community dynamics and see empirical evaluations of what it is like to be a newcomer to a community. I also wanted to see the effects bad actors can have on a community in a quantified manner. I felt that such information could help me grasp the importance of community mechanisms such as moderators, especially in the realm of online forums.

I’ll quickly give a rundown of Yang et al.’s study. Given an underlying social network, each node is classified as either a cooperator or a defector. As time passes, a new node is introduced and connects to an existing node in the network, which the authors call a “role-model”. Ideally, new nodes would want to connect to cooperators and avoid defectors. When making its decision, the new node has access to public and private information. The public information is simply the degree of the nodes in the network, and the private information is a sample taken from one of two Gaussian distributions, one for cooperators and another for defectors.

Next, some important variables used in the study. A cooperator distributes a benefit b to its neighbours at a cost c. There is also a variable δ (the “selection strength”), which is the degree to which a node considers its payoff when choosing a role-model: the higher δ is, the more likely the node is to connect to role-models that provide a larger payoff. p is a weight between 0 and 1 that defines how heavily public information is considered when deciding whether to connect to a role-model; q is defined similarly for private information. Finally, there is the notion of P-cascades and N-cascades, which are information cascades that form when private and public information conflict. P-cascades are created when the private information of newcomers indicates they should connect to a role-model, but they instead follow public information and don’t connect. Similarly, N-cascades are created when the private information of newcomers indicates they shouldn’t connect to a role-model, but they instead follow public information and connect.

In terms of content relating to CSCC46, the study tackles the concepts of game theory and information cascades. Yang et al. utilized game theory (specifically evolutionary game theory) by treating connections between nodes as a game where the payoffs are based on a pre-defined benefit and cost variable.

Figure 1: Payoff matrix for game between node and its neighbours. Assume b > c > 0.
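To make the payoff idea concrete, here is a minimal sketch of the standard donation game, which matches the description above (a cooperator pays c and hands out b). The exact matrix in the paper's figure is not reproduced here, so treat the form below, the function names, and the values b = 10 and c = 2 as illustrative assumptions rather than the authors' implementation.

```python
def pair_payoff(me_cooperates, other_cooperates, b, c):
    """Payoff to 'me' from one interaction in a donation game (assumed form)."""
    gain = b if other_cooperates else 0.0  # I receive b only from a cooperator
    loss = c if me_cooperates else 0.0     # I pay c only if I cooperate
    return gain - loss


def node_payoff(node, strategy, neighbours, b, c):
    """Total payoff of a node: the sum over all of its neighbours."""
    return sum(
        pair_payoff(strategy[node], strategy[other], b, c)
        for other in neighbours[node]
    )


# Hypothetical toy network: A and B cooperate, D defects, everyone is connected.
b, c = 10.0, 2.0  # illustrative values with b > c > 0
strategy = {"A": True, "B": True, "D": False}
neighbours = {"A": ["B", "D"], "B": ["A", "D"], "D": ["A", "B"]}

payoffs = {n: node_payoff(n, strategy, neighbours, b, c) for n in strategy}
print(payoffs)
```

In this toy triangle the defector ends up with the largest payoff, which is the usual tension in this kind of game: defecting pays off individually even though mutual cooperation (b - c each) beats mutual defection (0 each).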

Information cascades are a concept in CSCC46 that this study directly addresses when it comes to the role-models newcomers choose. If public and private information about whether a node should connect to a role-model conflict, newcomers may make the wrong decision when choosing to connect to or reject a node. If a series of wrong decisions is made, successive newcomers may simply follow the crowd and make the same mistake, leading to a P-cascade or N-cascade.
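The conflict between public and private information can be sketched with a toy decision rule. This is only an illustration, not the formula from the paper: I am assuming both signals are scaled to the range 0 to 1 (higher meaning "looks like a cooperator"), that they are combined linearly using the weights p and q, and that the newcomer connects when the combined score reaches a decision threshold tau. The function name and all numbers are hypothetical.

```python
def decide(public_signal, private_signal, p, q, tau):
    """Return True if the newcomer connects to the candidate role-model.

    Assumed rule: a weighted sum of the two signals compared to a threshold.
    """
    score = p * public_signal + q * private_signal
    return score >= tau


# The candidate looks popular (high public signal), but the newcomer's
# private sample suggests a defector (low private signal).
public_signal, private_signal, tau = 0.9, 0.2, 0.5

# A crowd-follower weighs public information heavily and connects anyway.
follows_crowd = decide(public_signal, private_signal, p=0.8, q=0.2, tau=tau)

# An independent newcomer weighs private information heavily and refuses.
trusts_self = decide(public_signal, private_signal, p=0.2, q=0.8, tau=tau)

print(follows_crowd, trusts_self)
```

The same candidate gets two different answers depending only on the weights, which is the kind of disagreement that lets a run of crowd-following newcomers repeat each other's mistake.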

What I found interesting about this study is how it allowed public and private information to be weighted differently. This mimics how people in the real world might behave in this kind of scenario. Some people are more comfortable going along with the crowd, so they will value public information more heavily. Likewise, someone who is more independent may have more confidence in their own private information and will weigh it more heavily. On a related note, Yang et al. discovered that public information had significant effects on the underlying social network, even in limited amounts.

Figure 2: Graphs showing levels of cooperation at varying decision thresholds 𝜏 for three selection strengths δ. The left column of graphs is for the scenario where private information is weighted more heavily than public information. The right column is the opposite. Each row corresponds to a different benefit-to-cost ratio. Note: the last row should be “b/c=10/8”.

Notice in Figure 2 that when private information is weighted more heavily (left side), cooperation is generally high and persists to higher decision thresholds. Meanwhile, on the right side, cooperation was lower when public information was weighted more heavily. These findings could be an approximate answer to the question I posed at the beginning of this blog. If a group of individuals were to distort public information, even just a little bit, it could take a serious toll on how communities cooperate and function as a whole. This provides a basis for why community moderators seem so important: they can keep bad actors in check and prevent public information from being badly polluted.

While this study was only an approximation, it gives a rough idea of why cooperation is so fragile within communities. If others were to build upon this research and perhaps obtain real-world data on the subject, a mitigating factor for this fragility could be discovered. Not only would this make communities more resilient to collapses in cooperation, but it could also point towards a solution to the actual cause of these collapses.

References

Yang, G., Csikász-Nagy, A., Waites, W. et al. Information Cascades and the Collapse of Cooperation. Sci Rep 10, 8004 (2020). https://doi.org/10.1038/s41598-020-64800-z

https://www.nature.com/articles/s41598-020-64800-z

Ever since I was young, playing video games has been one of my favorite hobbies, and it still is today. Over the years, I have played games spanning a wide array of genres and have seen certain games come in and out of relevancy. As I observed these patterns, I found myself asking, “What games are currently popular? Why did certain games become relevant or irrelevant?” As such, I wanted to see how applications of network analysis could help me in my quest to gain a larger perspective of the gaming community.

Luckily, Xiaozhou Li and Boyang Zhang of Tampere University have responded to my desire with their January 2020 study “A Preliminary Network Analysis on Steam Game Tags: Another Way of Understanding Game Genres”. In their study, they examined Steam, the largest PC gaming platform, and its user-defined tagging system. The gist of the system is that users can assign the tags they feel best represent a game (for example, some might tag the game “Call of Duty” as an “Action” and “Shooter” game), and frequently applied tags become featured categories for that game. Li and Zhang analyzed this tagging system by building a network of game tags, creating an edge between two tags if both were applied to the same game. They then performed community detection to see how tags grouped into communities and labeled each community based on its highest-centrality nodes, as shown in Figure 1.
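The network construction described above can be sketched in a few lines of plain Python. The games and tags below are made up for illustration, and counting how many games share a pair of tags (an edge weight) is my own assumption; the study may treat edges differently.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each game maps to the tags users applied to it.
game_tags = {
    "Game A": ["Action", "Shooter", "Multiplayer"],
    "Game B": ["Action", "Shooter"],
    "Game C": ["Strategy", "Simulation"],
    "Game D": ["Strategy", "Simulation", "Multiplayer"],
}


def build_tag_network(game_tags):
    """Create an edge between two tags whenever both appear on the same game.

    Returns a Counter mapping (tag_a, tag_b) pairs, in alphabetical order,
    to the number of games that carry both tags.
    """
    edges = Counter()
    for tags in game_tags.values():
        # sorted(set(...)) removes duplicate tags and fixes the pair order
        for tag_a, tag_b in combinations(sorted(set(tags)), 2):
            edges[(tag_a, tag_b)] += 1
    return edges


edges = build_tag_network(game_tags)
for (tag_a, tag_b), weight in sorted(edges.items()):
    print(tag_a, "-", tag_b, "shared by", weight, "game(s)")
```

Even in this tiny example, "Action" and "Shooter" end up tied together more strongly than either is to "Multiplayer", which hints at how tag communities could emerge from real data.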

In terms of content related to CSCC46, this study applies the concepts of community detection and PageRank, although the latter has not been discussed yet as of writing this blog. One interesting thing to note is how Li and Zhang performed community detection: they used the “Louvain method” as opposed to the Girvan-Newman algorithm studied in class. The concept of betweenness also appeared in the study; edges with high betweenness connected game tags that usually are not used together. PageRank was used to evaluate the importance of the game tags within the network, much like how PageRank was originally used to rank search query results in search engines.
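Since PageRank has not come up in class yet, here is a minimal power-iteration sketch on a small, made-up tag network. The tags, the edges, the damping factor of 0.85, and the unweighted treatment of edges are all assumptions for illustration; this is not the configuration used by Li and Zhang.

```python
def pagerank(adjacency, damping=0.85, iterations=100):
    """Basic PageRank by power iteration.

    Assumes every node has at least one neighbour (no dangling nodes).
    """
    nodes = list(adjacency)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            # A node splits its damped rank evenly among its neighbours.
            share = damping * rank[node] / len(adjacency[node])
            for neighbour in adjacency[node]:
                new_rank[neighbour] += share
        rank = new_rank
    return rank


# Hypothetical undirected tag network, written as symmetric adjacency lists.
adjacency = {
    "Action": ["Shooter", "Multiplayer"],
    "Shooter": ["Action", "Multiplayer"],
    "Multiplayer": ["Action", "Shooter", "Strategy"],
    "Strategy": ["Multiplayer", "Simulation"],
    "Simulation": ["Strategy"],
}

rank = pagerank(adjacency)
for tag, score in sorted(rank.items(), key=lambda item: -item[1]):
    print(f"{tag}: {score:.3f}")
```

In this toy graph "Multiplayer" comes out on top because it bridges the two groups of tags, while "Simulation", with a single connection, ranks last.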

Figure 1: The resulting communities of tags, split into 4 groups: a) Strategy & Simulation Games, b) Puzzle & Arcade Games, c) RPG Games, d) Shooter Games

Figure 2: A graph showing the connections between the most popular tags.

So why would this information be interesting? Couldn’t you just load up a site like SteamCharts and see the top and trending games? Well, yes, you could. But that doesn’t give as broad a picture. The data that Li and Zhang extracted in their study would be useful for game developers, as they can see what kinds of games people are currently playing. It can provide insight into which genres are hot on the market or gaining traction. And while the data quality may have been marred by incorrect tagging, the study shows that the wisdom of crowds still provided relatively accurate data. Overall, I feel that if this information were frequently updated, publicly available, and covered a broader scope of platforms, it would be an incredibly useful tool for developers and gamers.

References

Li, X., Zhang, B. A Preliminary Network Analysis on Steam Game Tags: Another Way of Understanding Game Genres. 65-73 (2020). https://doi.org/10.1145/3377290.3377300

https://www.researchgate.net/publication/339081814_A_preliminary_network_analysis_on_steam_game_tags_another_way_of_understanding_game_genres

https://store.steampowered.com/tag/