Box Office Success and Link Analysis

Link analysis is a method of organizing data as nodes and edges and deriving patterns from that structure to gain insight. Suna (2019) asks whether this kind of network analysis can predict box office success.

The crucial idea behind this analysis is degree centrality. For a node v in a graph with n nodes, the normalized degree centrality is deg(v) / (n - 1), where deg(v) is the number of links attached to v.

Note that the expression is normalized by the number of other nodes in the graph. Graphically, the degree centrality of a given node is the number of links it has as a fraction of the remaining nodes in the network.
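As a minimal sketch of the normalization described above, assuming an undirected graph stored as a dictionary of neighbor sets (the function name and the toy graph are illustrative and not taken from Suna's article):

```python
def degree_centrality(adjacency):
    """Return the normalized degree centrality of every node.

    `adjacency` maps each node to the set of its neighbors
    (undirected graph, no self-loops).
    """
    n = len(adjacency)
    if n <= 1:
        # With no other nodes there is nothing to normalize by.
        return {node: 0.0 for node in adjacency}
    return {
        node: len(neighbors) / (n - 1)
        for node, neighbors in adjacency.items()
    }


# Hypothetical star-shaped graph: "A" is linked to every other node.
toy_graph = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A"},
}
centrality = degree_centrality(toy_graph)
print(centrality)
```

In this toy graph the hub "A" is linked to all three remaining nodes, so its centrality is the maximum possible value of 1, while each leaf is linked to one of three.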

To analyze the box office success of movies, we construct a network as follows:

  • Input: a list of movies, ordered by release date, each with its cast members
  • Nodes: Movie Actors
  • Edges: Co-starred

Suna (2019) accounts for the significance of the stars in a movie by keeping only the three lead actors of each movie. This produces a network of the most impactful actors, linked to the other impactful actors they have co-starred with. The degree centrality of each lead actor is then calculated using the formula mentioned previously.
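The construction above can be sketched as follows. This is a rough illustration under assumptions: the movie titles and actor names are placeholders, the cast lists are assumed to be in billing order, and `build_costar_network` is a hypothetical helper rather than code from Suna's article.

```python
from itertools import combinations


def build_costar_network(movies, top_k=3):
    """Link the first `top_k` billed actors of each movie to one another.

    `movies` is a list of (title, cast) pairs, with `cast` in billing order.
    Returns a dict mapping each lead actor to the set of their co-stars.
    """
    adjacency = {}
    for _title, cast in movies:
        leads = cast[:top_k]
        for actor in leads:
            adjacency.setdefault(actor, set())
        for a, b in combinations(leads, 2):
            if a != b:  # guard against a duplicated cast entry
                adjacency[a].add(b)
                adjacency[b].add(a)
    return adjacency


# Placeholder data, not real movies.
movies = [
    ("Movie 1", ["Actor A", "Actor B", "Actor C", "Actor D"]),
    ("Movie 2", ["Actor A", "Actor E", "Actor F"]),
    ("Movie 3", ["Actor E", "Actor G"]),
]
network = build_costar_network(movies)

# Normalized degree centrality: co-stars divided by the number of other actors.
n = len(network)
centrality = {actor: len(costars) / (n - 1) for actor, costars in network.items()}
print(centrality)
```

Because only the first three cast members are kept, "Actor D" never enters the network, which mirrors the idea of restricting the graph to the leads.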

Extracting a well-known actor from the network will result in a densely connected component that resembles the following image.

From the above component, we see that an actor's degree centrality tends to track their popularity and the significance they bring to their movies. Nicolas Cage, for instance, appears to have a high degree centrality. This raises the question of whether degree centrality is correlated with the box office success of an arbitrary movie. After comparing the degree centralities with movie revenues using the Spearman correlation coefficient, Suna (2019) found that the degree centrality of an actor is indeed statistically related to the revenue of the movies they have acted in. Specifically, the more significant the actor's role in a movie, the stronger the correlation between their degree centrality and the movie's revenue appears to be.
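To make the correlation step concrete, here is a small sketch of the Spearman coefficient, computed as the Pearson correlation of the ranks. The centrality and revenue values are made-up placeholders, not figures from Suna's analysis, and the helper names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def spearman(xs, ys):
    """Spearman rank correlation of two equally long sequences."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up values for five hypothetical movies.
lead_centrality = [0.10, 0.25, 0.40, 0.55, 0.70]
revenue_millions = [12, 30, 25, 80, 95]
rho = spearman(lead_centrality, revenue_millions)
print(round(rho, 3))
```

Spearman's coefficient only measures whether revenue tends to rise with centrality, not whether the relationship is linear, which suits quantities such as revenue that vary over very different scales.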

This form of link analysis is similar to the Hubs and Authorities algorithm (HITS), which is used as a voting mechanism for web pages. The HITS algorithm iteratively computes hub and authority scores for each web page, which serve as quantitative measures of the page's importance. We have therefore seen that analyzing a movie star network with a form of link analysis could help predict whether a movie will be successful at the box office, which highlights the importance of the centrality of nodes in a network.
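For comparison, the HITS iteration can be sketched on a tiny directed graph. This is an illustrative sketch under assumptions: the page names, the fixed iteration count, and the `hits` function are hypothetical. Unlike the undirected co-star network, HITS relies on the direction of the links.

```python
def hits(out_links, iterations=50):
    """Return (hub, authority) scores for a directed graph.

    `out_links` maps each page to the list of pages it links to.
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)

    hubs = {node: 1.0 for node in nodes}
    authorities = {node: 1.0 for node in nodes}

    for _ in range(iterations):
        # Authority update: sum the hub scores of pages linking in.
        new_auth = {node: 0.0 for node in nodes}
        for source, targets in out_links.items():
            for target in targets:
                new_auth[target] += hubs[source]

        # Hub update: sum the authority scores of pages linked to.
        new_hubs = {
            node: sum(new_auth[target] for target in out_links.get(node, ()))
            for node in nodes
        }

        # Normalize so the scores do not grow without bound.
        auth_norm = sum(v * v for v in new_auth.values()) ** 0.5 or 1.0
        hub_norm = sum(v * v for v in new_hubs.values()) ** 0.5 or 1.0
        authorities = {node: v / auth_norm for node, v in new_auth.items()}
        hubs = {node: v / hub_norm for node, v in new_hubs.items()}

    return hubs, authorities


# Hypothetical pages: p1 and p2 both point to p3, and p3 points to p4.
pages = {"p1": ["p3"], "p2": ["p3"], "p3": ["p4"], "p4": []}
hubs, authorities = hits(pages)
print(max(authorities, key=authorities.get))
```

Here p3 ends up with the highest authority score because two pages vote for it, while p1 and p2 receive equal hub scores for pointing at it.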

References
Suna, Ahmet (2019). Can Network Analysis Work for Predicting Success of Box Office Revenue. Retrieved November 19, 2020, from https://towardsdatascience.com/can-network-analysis-work-for-predicting-success-of-box-office-revenue-c8370c8427f9

How Graph Theory Could Be Used to Analyze the News

Using graph theory, one could generate a network of the entities that appear in daily news articles and analyze the connections between them. Such a network can be constructed using the following method, as outlined by Ferencz (2020):

  • Extract entities from news articles and define them as the nodes of the graph. An entity can be either a person or an organization.
  • Link each pair of entities that appear together in a news article; if the link already exists, increase its weight.
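The two steps above can be sketched as follows, assuming entity extraction has already been done (for example by a named-entity recognizer) and each article is represented by a list of entity names. The entity names and the `build_news_network` helper are placeholders, not taken from Ferencz's article.

```python
from collections import Counter
from itertools import combinations


def build_news_network(articles):
    """Count how often each pair of entities appears in the same article.

    `articles` is an iterable of entity lists, one list per article.
    Returns a Counter mapping a sorted (entity, entity) pair to its weight.
    """
    weights = Counter()
    for entities in articles:
        # Deduplicate so one article adds at most 1 to any pair's weight.
        unique = sorted(set(entities))
        for a, b in combinations(unique, 2):
            weights[(a, b)] += 1
    return weights


# Placeholder entities standing in for extracted people and organizations.
articles = [
    ["Person X", "Org A"],
    ["Person X", "Org A", "Person Y"],
    ["Person Y", "Org B"],
]
weights = build_news_network(articles)
for pair, weight in sorted(weights.items()):
    print(pair, weight)
```

Sorting each pair gives every undirected link a single key, so a repeated co-occurrence increases the existing weight instead of creating a second edge.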

Using the above method, we arrive at a network of people and organizations in which the strength of each relationship is given by the weight of the corresponding edge. An example of such a network is the following:

Graph generated using UK news from March 26, 2020. Source: https://towardsdatascience.com/building-a-social-network-from-the-news-using-graph-theory-by-marcell-ferencz-9155d314e77f

Several questions can be answered by constructing such a network. For example, one point of discussion could be along the lines of "how are person X and person Y related?" Such questions can be answered by analyzing the connections present in the graph. If there is a link between person X and person Y, we can infer that they were both involved in an event described in a news article. If there is no direct link, but a mutual entity connects person X and person Y, then it is likely that the two know each other in real life. Beyond describing the relationship between two entities, the connected components that form during the construction of the graph can be used to group related entities. Ferencz (2020), using the network derived from news articles posted in the UK, was able to categorize the news topics into eight themes (US politics, Australian academic institutes, and so on). The categorization was derived by carefully visualizing the connected components present in the network.
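A rough sketch of both ideas, the relationship check and the grouping by connected components, is shown below. The edge list and the helper names are hypothetical, and the graph is treated as unweighted for simplicity.

```python
def to_adjacency(edges):
    """Build an undirected adjacency dict from (a, b) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def relationship(adjacency, x, y):
    """Classify how x and y are related: direct link, mutual entity, or none."""
    if y in adjacency.get(x, set()):
        return "direct", set()
    mutual = adjacency.get(x, set()) & adjacency.get(y, set())
    if mutual:
        return "mutual", mutual
    return "none", set()


def connected_components(adjacency):
    """Return the connected components as a list of node sets."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Placeholder links between people and organizations.
edges = [("Person X", "Org A"), ("Person Y", "Org A"), ("Person Z", "Org B")]
graph = to_adjacency(edges)
print(relationship(graph, "Person X", "Person Y"))
print(connected_components(graph))
```

In this toy graph, Person X and Person Y share the mutual entity Org A, while Person Z falls into a separate component and would be grouped under a different theme.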

Furthermore, we could use the network to analyze the popularity or influence that a person has in the media. Such an analysis is straightforward and deals with the connectivity of a given entity to other entities. Specifically, if an entity has paths leading to many other entities in the graph, then we can conclude that the entity is quite popular in the news media. This ties back to the idea of degrees of separation discussed in the lectures. The Bacon number is a related phenomenon that also deals with degrees of separation, but specifically in Hollywood. It was found that only 12% of all the actors in Hollywood cannot be tied to Bacon through co-appearances. The same idea could be used to determine who has been the most influential person in the news by analyzing the links between the entities in the constructed graph.
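Degrees of separation from a chosen entity can be computed with a breadth-first search, in the same way a Bacon number is. The sketch below uses a placeholder graph and a hypothetical `degrees_of_separation` helper.

```python
from collections import deque


def degrees_of_separation(adjacency, source):
    """Return the shortest link distance from `source` to each reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


# Placeholder graph: A - B - C form a chain, and D is isolated.
graph = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B"},
    "D": set(),
}
distances = degrees_of_separation(graph, "A")
reach = len(distances) - 1  # entities reachable from "A", excluding itself
print(distances, reach)
```

Entities missing from the result, such as "D" here, cannot be tied to the source at all, which corresponds to the actors who have no Bacon number.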

In conclusion, graph theory can be applied to the news to analyze the relationships between the entities that appear in the media. Relationships between two particular entities, the categorization of entities into news topics, and the degree of influence of entities are only a few of the insights that can be derived from the news network. In theory, one could extend this construction to other industries to answer questions that are specific to each of them.

References:

  • Ferencz, Marcell (April 13, 2020). Building a Social Network from the News using Graph Theory. Retrieved October 17, 2020, from https://towardsdatascience.com/building-a-social-network-from-the-news-using-graph-theory-by-marcell-ferencz-9155d314e77f