After Page Rank Is No Longer Visible

Google has long used the PageRank algorithm to serve its users the most relevant search results. There was a time when PageRank scores were visible to all users, but after 2010 Google hid them. However, not being visible does not mean that Google has stopped using PageRank. According to Erika, Google not only kept the PageRank algorithm after 2010 but has also updated it in recent years, and it still plays a very important role in serving users the best search results in 2020. [1]

Many companies have tried to guess Google’s latest page ranking algorithm, and some have even developed alternatives of their own. One example is SEO PowerSuite, whose in-house Domain InLink Rank offers another way to rank the most valuable pages. Similar to Google’s old PageRank algorithm, it takes the number of incoming links and their weights into account when calculating a page’s rank. However, I could not find a detailed formula online that explains exactly how these factors are combined. Instead, this blog post looks at an experiment SEO PowerSuite conducted last year on how well Domain InLink Rank performs compared with Google’s SERPs (Search Engine Results Pages). [2][3]
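Since the InLink Rank formula is not public, the closest thing that can be sketched is the classic PageRank idea it is said to resemble. The snippet below is a minimal, unweighted power-iteration version; the toy link graph, the damping factor of 0.85, and the function name are my own assumptions for illustration, not anything taken from SEO PowerSuite or Google.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Classic PageRank by power iteration.

    links maps each page to the list of pages it links to.
    This is the textbook version, NOT the (unpublished) InLink Rank formula.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                # A page splits its rank equally among its outgoing links.
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank

# Hypothetical four-page web: D links out but nobody links to D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page C collects the most incoming links and ends up with the highest score, while D, which has no incoming links at all, ends up with the lowest.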

The experiment targeted around 33,500 keywords and their search results. Only the first 30 results for each keyword were kept, which gives roughly 1 million pages (33,500 × 30). Comparing Domain InLink Rank scores with positions in Google’s results showed a positive correlation, with a correlation coefficient of 0.128. This suggests that a page with a higher InLink Rank tends to appear higher in Google’s search results. However, by a common rule of thumb, a correlation coefficient below 0.3 is considered “weak”, so a value of 0.128 on its own indicates only a slight relationship.
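To make the idea of a correlation coefficient between two rankings more concrete, here is a small sketch. I am assuming a rank correlation (Spearman) purely for illustration; this post does not establish which coefficient the experiment actually used, and the helper names are my own. Only the 0.3 "weak" cutoff comes from the discussion above.

```python
from math import sqrt

def ranks_of(values):
    """Rank values from 1 (smallest) upward, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks_of(x), ranks_of(y))

def strength(r):
    """Label a coefficient using the 0.3 rule of thumb mentioned above."""
    return "weak" if abs(r) < 0.3 else "moderate or strong"

print(strength(0.128))
```

Two lists that sort their items in exactly the same order give a coefficient of 1, and exactly opposite orders give -1; the reported 0.128 sits very close to the "no relationship" end of that scale.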

Despite that, after comparing these results with those of other page ranking metrics, SEO PowerSuite makes a fair point that InLink Rank performs better than the alternatives. Moz, the “next best competitor” after SEO PowerSuite, has published results from a similar setup. Its highest correlation coefficient (0.12076) is about 6% lower in relative terms.
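The "about 6%" figure can be checked with a quick calculation. The two coefficients are the ones quoted above; the variable names are just mine.

```python
inlink_rank_corr = 0.128   # coefficient reported for InLink Rank
moz_best_corr = 0.12076    # highest coefficient reported for Moz

# Gap between the two, relative to the InLink Rank coefficient.
relative_gap = (inlink_rank_corr - moz_best_corr) / inlink_rank_corr
print(f"{relative_gap:.1%}")  # prints 5.7%
```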

(Figure 1) Comparing the performance of InLink Rank with the four products by Moz, in terms of correlation coefficients.
Image source: https://cdn1.link-assistant.com/images/news/google-page-rank-2019/screen-07.png

Aside from that, it is interesting that SEO PowerSuite has been working on detecting spammy hub pages and on giving web page owners guidance for improving their page rank. The top two approaches are checking the quality of backlinks and making good use of internal links.

On the one hand, backlinks are the links from other websites that point to yours. Under the InLink Rank model, all websites are authorities and hubs at the same time. Regularly checking whether any linking site has a low rank score, and getting rid of links from low-quality sites, can prevent a loss of page rank in the next round of updates. A tool named “SEO SpyGlass” checks the InLink Rank scores of backlink pages, as well as potential risks and errors affecting their authority.

(Figure 2) An example of using the SEO SpyGlass tool to analyze the InLink page rank for backlink pages.
Image source: https://cdn1.link-assistant.com/images/news/google-page-rank-2019/screen-10.png

On the other hand, making good use of internal links can save a lot of time. Internal links are described as acting like a “page rank storage” under the InLink model. To get the most out of them, it is important to make sure the site has no orphan pages (pages that no other page links to), because they waste resources. Having pages within a website link to each other ensures that page rank flows between them. A tool named “WebSite Auditor” visualizes this flow and makes it easier to find orphan pages.

(Figure 3) An example of using the WebSite Auditor tool to analyze the structure of a website and to detect if there are any orphan pages.
Image source: https://cdn1.link-assistant.com/images/news/google-page-rank-2019/screen-16.png
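Finding orphan pages is easy to sketch once a site's internal links are available as a simple mapping. The snippet below is my own illustration and has nothing to do with how WebSite Auditor works internally; the site map, the page paths, and the function name are all made up.

```python
def find_orphans(site_links, home="/"):
    """Return pages that no other page on the site links to.

    site_links maps each page to the internal pages it links to.
    The home page is excluded because visitors reach it directly.
    """
    pages = set(site_links) | {t for targets in site_links.values() for t in targets}
    # A link from a page to itself does not count as an incoming link.
    linked = {t for source, targets in site_links.items() for t in targets if t != source}
    return sorted(pages - linked - {home})

# Hypothetical site: nothing links to /old-landing.
site = {
    "/": ["/blog", "/about"],
    "/blog": ["/blog/post-1"],
    "/about": [],
    "/blog/post-1": ["/"],
    "/old-landing": ["/"],
}
print(find_orphans(site))  # ['/old-landing']
```

Note that /old-landing links out to the home page but receives no links itself, so no page rank can flow into it from the rest of the site.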

It is exciting to see the material we covered in lecture just 3 days ago at work in the real world. What we learned also helped me understand these articles and diagrams better, since they are so closely related. All sources are listed under “Reference” below; please feel free to dig in and read more!

Reference
1. Some description of Google PageRank and why it is still important:
https://www.semrush.com/blog/pagerank/
2. The experiment on Domain InLink Rank:
https://www.link-assistant.com/news/inlink-rank-correlation.html
3. The analysis of the experiment, and more related materials:
https://www.link-assistant.com/news/google-page-rank-2019.html

Network Analysis on 5G COVID-19 Conspiracy

Both “5G” and “coronavirus” are hot topics this year. One is the latest technology standard for broadband cellular networks, while the other causes the serious pandemic we are living through at the moment. However, I was very confused to see these two terms put together and discussed seriously. How are they related, and what exactly is the “5G COVID-19 conspiracy”? I started reading with all these questions in mind.

Early this year, a theory emerged that “the spread of the coronavirus is associated with 5G network technology”. This led to a huge number of tweets and retweets spreading the misinformation. To understand where this “5G coronavirus conspiracy” came from, how it spread, which parties were involved, and what we can do to fight it, the study by Ahmed et al. analyzed the event in several steps.

First, it used the keywords “#5Gcoronavirus” and “5Gcoronavirus” to collect English tweets that mentioned this topic. Then it used NodeXL to construct a graph in which nodes are users and an edge exists when one user replies to or mentions another. Vertices were grouped into clusters with the Clauset-Newman-Moore algorithm, which finds community structure in very large networks. Finally, a manual content analysis was performed to determine the purpose of the tweets: were they for or against the conspiracy, or were they posted maliciously to make people believe in it?
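The graph-building and clustering steps can be sketched in a few lines. The study itself used NodeXL; what follows is only my own small stand-in. The tweet fields ("user", "mentions", "reply_to") and the sample tweets are hypothetical, and the clustering is a naive version of the greedy modularity merging behind Clauset-Newman-Moore, without the data structures that make the real algorithm fast on very large networks.

```python
from collections import defaultdict
from itertools import combinations

def build_graph(tweets):
    """Undirected user graph with an edge for every reply-to or mention."""
    adj = defaultdict(set)
    for tweet in tweets:
        user = tweet["user"]
        adj[user]  # users who never mention anyone still become nodes
        targets = list(tweet.get("mentions", []))
        if tweet.get("reply_to"):
            targets.append(tweet["reply_to"])
        for other in targets:
            if other != user:
                adj[user].add(other)
                adj[other].add(user)
    return dict(adj)

def greedy_modularity(adj):
    """Repeatedly merge the two clusters whose merge raises modularity most."""
    m = sum(len(neighbours) for neighbours in adj.values()) / 2  # edge count
    clusters = {node: {node} for node in adj}
    if m == 0:
        return list(clusters.values())
    while True:
        best_gain, best_pair = 0.0, None
        for a, b in combinations(list(clusters), 2):
            # Number of edges running between cluster a and cluster b.
            between = sum(1 for u in clusters[a] for v in adj[u] if v in clusters[b])
            if between == 0:
                continue
            degree_a = sum(len(adj[u]) for u in clusters[a])
            degree_b = sum(len(adj[u]) for u in clusters[b])
            # Change in modularity if a and b were merged.
            gain = between / m - degree_a * degree_b / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            return list(clusters.values())
        a, b = best_pair
        clusters[a] |= clusters.pop(b)

# Hypothetical tweets: two tight groups joined by one mention, plus a loner.
tweets = [
    {"user": "a", "mentions": ["b", "c"]},
    {"user": "b", "reply_to": "c"},
    {"user": "c", "mentions": ["d"]},
    {"user": "d", "mentions": ["e", "f"]},
    {"user": "e", "reply_to": "f"},
    {"user": "g"},
]
graph = build_graph(tweets)
found = greedy_modularity(graph)
print([sorted(cluster) for cluster in found])
```

On this toy data the merging stops with three clusters: users a, b, c; users d, e, f; and the loner g on its own.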

Among all these clusters, the article discusses the three most interesting ones: groups 1, 2, and 4. Group 1 is the isolates group, whose Twitter users tweet without mentioning or replying to others, so the nodes that represent them are not connected to anyone else. They might be tweeting their opinions, but they do not contribute much to the spread of the conspiracy.

Group 2 is the broadcast group. These users share content about the 5Gcoronavirus topic while also mentioning and replying to others. As a result, more users hear about the topic, and some of them eventually become part of the cluster once they start tweeting and mentioning others themselves. This is reflected in the growing size of the group 2 cluster.

The last and most important cluster is group 4, the accounts that actively spread the conspiracy. Out of a total of 408 Twitter accounts, the study reports the top ten most influential accounts ranked by betweenness centrality score. Most of them are ordinary citizens who spread the conspiracy very actively. The tenth account is an exception: Donald Trump did not tweet much about 5Gcoronavirus himself. Instead, he was mentioned in a large number of tweets commenting on the conspiracy.
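Betweenness centrality counts how often a node sits on the shortest paths between other pairs of nodes, which is why an account that many people mention can score highly without tweeting much itself. Below is a minimal sketch using Brandes' algorithm for an unweighted, undirected graph. The star-shaped example and the account names are invented, and the scores are left unnormalized, so they will not match the scale NodeXL reports.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm).

    adj maps every node to the set of its neighbours (undirected graph).
    """
    score = {v: 0.0 for v in adj}
    for source in adj:
        stack = []
        preds = {v: [] for v in adj}   # predecessors on shortest paths
        paths = {v: 0 for v in adj}    # number of shortest paths from source
        dist = {v: -1 for v in adj}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                   # breadth-first search from source
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical network: four users all mention "hub" but not each other.
star = {
    "hub": {"u1", "u2", "u3", "u4"},
    "u1": {"hub"}, "u2": {"hub"}, "u3": {"hub"}, "u4": {"hub"},
}
scores = betweenness(star)
top = sorted(scores, key=scores.get, reverse=True)[:10]
print(top[0], scores[top[0]])  # hub 6.0
```

Here "hub" lies on the shortest path between all six pairs of the other users, so it gets a score of 6 while everyone else gets 0.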

The article argues that such misinformation could spread so quickly because no authority figure was pushing back. It is important for a public figure or other influential person to step up and counter the conspiracy. I agree with this conclusion: it is too difficult to prevent misinformation from arising, so we have to defeat it when it shows up.

Overall, this is a very interesting and detailed article that summarizes the event from both a descriptive and a technical point of view. I am very impressed by how closely the knowledge we learned in lecture relates to this real-world event. It is also exciting to analyze the cause and the solution from a technical point of view. I strongly recommend taking a look at it when you have time.

Reference:

  1. Ahmed W, Vidal-Alaball J, Downing J, López Seguí F
    COVID-19 and the 5G Conspiracy Theory: Social Network Analysis of Twitter Data