Compromised Networks

Recently, a new malware known as Nodersok/Divergent surfaced. While the steps this malware takes to infect a system are an interesting topic, they will not be the focus of this post. Rather, we will look at what an infected system can do within the network it is connected to.

The malware allows malicious JavaScript code to run under the legitimate program Node.exe. The payload contains basic functions which turn the infected machine into a proxy, accessible from a remote machine controlled by the attacker. While this does not give the attacker direct access to other machines on the network, it is still a breach of the infected PC’s network. An attacker can, for example, use that machine as part of other malicious activities. Since the requests are proxied through the infected machines, they may be traced back not to the attacker but to the infected PCs.

While the malware itself only turns machines into proxies, it is interesting to think about what security threats this may pose to the network, especially if the malware were to evolve. For example, take a machine that sits within an isolated network or, in graph terms, an isolated, disconnected SCC. Its machines would only be accessible from other machines on the same network. However, as soon as one machine becomes infected, the attacker may gain access to information on the other machines, since the requests would appear to come from the infected machine.

This raises the interesting question of how networks should be set up to protect against threats such as this. For example, is it necessary to have all the machines in one SCC, or can some parts be isolated even further? If communication channels between the machines are two-way (as in an undirected graph), a possible solution could be to use one-way communication between machines, as in a directed graph, to restrict the ways machines can reach each other. All in all, security breaches from malware that abuses network requests can pose a large threat, and should be taken into account when designing how machines are connected within a network.
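To make the one-way versus two-way idea concrete, here is a minimal sketch that compares what an infected machine can reach under each design. The machine names and links are hypothetical, chosen only for illustration; they do not describe any real network or how Nodersok behaves.

```python
from collections import deque

def build(links, directed):
    """Build an adjacency map from (source, destination) pairs."""
    adjacency = {}
    for src, dst in links:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set())
        if not directed:
            adjacency[dst].add(src)  # two-way channel
    return adjacency

def reachable(adjacency, start):
    """Return every node reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

# Hypothetical four-machine network (assumption, not from the article).
links = [
    ("infected", "printer"),
    ("fileserver", "infected"),
    ("fileserver", "backup"),
]

two_way = reachable(build(links, directed=False), "infected")
one_way = reachable(build(links, directed=True), "infected")
print(sorted(two_way))
print(sorted(one_way))
```

With two-way links the infected machine can reach every other machine, while with one-way links it can only reach the machines it is allowed to send to.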

References
https://www.microsoft.com/security/blog/2019/09/26/bring-your-own-lolbin-multi-stage-fileless-nodersok-campaign-delivers-rare-node-js-based-malware/

Community discovery model in dynamic social networks

A paper published in the IEEE (Institute of Electrical and Electronics Engineers) Internet of Things Journal proposes a new model to improve the quality of community discovery. The proposed model addresses some of the limitations of traditional methods and produces results with higher correspondence to the ground-truth communities.

Traditionally, community discovery methods treat network structure as a static topology. They neglect the information actually exchanged between users, as they only represent the possibility of interaction between them. However, modern microblogging networks like Twitter are extremely dynamic in both content distribution and topological structure. The flow of information, which traditional methods do not consider, could be applied to microblogging networks to better determine user interest communities.

An example of a dynamic network diagram

The proposed method uses several data analysis algorithms to filter and integrate the information interactions between users. It then uses a machine learning strategy to analyze this information and dynamically update the characteristics of the network in order to cluster it into communities.

This is interesting to me because, as we have learnt in class, data is sometimes lost in translation. This method takes some of the otherwise missing information into account and analyzes it before translation. The resulting communities correspond more closely to the ground truth, and so can provide more information than those found by traditional methods.

Reference:

Jiang, Liang, et al. “An Efficient Evolutionary User Interest Community Discovery Model in Dynamic Social Networks for Internet of People.” IEEE Internet of Things Journal, 2019, pp. 1–1., doi:10.1109/jiot.2019.2893625.

Vulnerability to spam attacks on social networks

Over the past decade, the growth of social media platforms has been enormous. These platforms are used to connect and build relationships with friends and people we interact with on a daily basis. Even though these social networks are meant for people to add their friends, it turns out that they are very susceptible to attackers looking to send spam for personal gain. But how do these attackers get the opportunity to insert themselves into a network of friends that they do not belong to? Taking Facebook as a common example, have you ever received a friend request from a random person? This is exactly how these attackers get themselves into a network. All it takes is for one member of a large network of friends to accept the attacker’s request, which gives the attacker a path to all other members of that network. If you have any personal information on your profile, such as your phone number or email, it is now compromised, making you vulnerable to spam attacks. Because of the undirected nature of these friend relationships, mindlessly accepting requests on these social networks will leave you exposed to such attacks.

If we think of this in terms of what we know about networks, this is an extremely common occurrence. Once a user forms a friend relationship with an attacker, this can be seen as a bridge edge from a group of attackers to a group of normal friends. Another aspect of social networks is the concept of mutual friends. The initial relationship between one attacker and one real user can quickly snowball into further relationships between other attackers and other real users. We then no longer have a single bridge edge connecting these two groups, but several local bridges that strengthen the connection between them and lead to more attacks.
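The difference between a bridge and a local bridge can be checked directly on a small graph. The sketch below uses a hypothetical group of three attackers and three friends (the node names are made up). It tests whether the first accepted request is a bridge, meaning that removing it disconnects its endpoints, and whether it is a local bridge, meaning that its endpoints have no neighbours in common.

```python
def neighbours(edges, node):
    """Return every node that shares an edge with `node`."""
    return {b if a == node else a for a, b in edges if node in (a, b)}

def connected(edges, source, target):
    """Depth-first search: is there any path from source to target?"""
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in neighbours(edges, node) - seen:
            seen.add(nxt)
            stack.append(nxt)
    return False

def is_bridge(edges, edge):
    """An edge is a bridge if removing it disconnects its endpoints."""
    remaining = [e for e in edges if set(e) != set(edge)]
    return not connected(remaining, *edge)

def is_local_bridge(edges, edge):
    """An edge is a local bridge if its endpoints share no neighbours."""
    a, b = edge
    return not (neighbours(edges, a) & neighbours(edges, b))

# Hypothetical groups (assumption): attackers a1..a3, friends f1..f3.
attackers = [("a1", "a2"), ("a2", "a3"), ("a1", "a3")]
friends = [("f1", "f2"), ("f2", "f3"), ("f1", "f3")]
first_request = ("a1", "f1")

one_link = attackers + friends + [first_request]
two_links = one_link + [("a2", "f2")]  # a second accepted request

print(is_bridge(one_link, first_request))
print(is_bridge(two_links, first_request))
print(is_local_bridge(two_links, first_request))
```

After the second request is accepted, the first edge is no longer a bridge, but it is still a local bridge because a1 and f1 have no mutual friends.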

In conclusion, being a part of a social network can leave us vulnerable to attackers looking to send spam messages. The obvious solution is to avoid including personal information in our social media profiles. Beyond this, we must be careful about who we decide to share our profiles with, as we could be creating a bridge between a network of attackers and our group of friends.

Reference

Shrivastava, N., Majumder, A., & Rastogi, R. (2008, April 12). Mining (Social) Network Graphs to Detect Random Link Attacks. Retrieved from https://ieeexplore.ieee.org/abstract/document/4497457.

An Analysis of the Allegiances of the 2019 Venezuela Presidential Crisis and their Interconnections

In last Monday’s class (30/09/19), we discussed the notion of balance in graphs, following the reasoning of how relationships between friends and enemies would be structured in realistic scenarios. If you have a balanced graph, you can use known relationships to predict the relationships between other nodes. For example, if you know that A is friends with B and B is enemies with C, it would be reasonable to guess that A is enemies with C as well. I thought that it would be interesting to apply this logic to a real-world situation and examine the relationships between nations. For this, I want to look at the ongoing Venezuelan Presidential Crisis. In short, the need to know is that there is a global debate regarding who is the rightful president of Venezuela: Nicolás Maduro or Juan Guaidó. Among the countries aligned with Maduro you have the likes of Russia, China, and Cuba. Among those supporting Guaidó, you find the USA, Canada, Brazil, and the UK. Keep in mind that the full list of countries and their declared allegiances is much larger, but I just want to paint a general picture.

At the centre of the issue, you have the two Venezuelan parties, which can comfortably be labelled as having a negative relationship. We can also label the relationship between each Venezuelan party and its supporters as positive. This gives us two clear factions, one supporting Guaidó and the other Maduro. According to the logic of balanced graphs, it would stand to reason that the countries within each faction all have positive relations with each other and negative relations with countries in the opposing faction. This holds in most high-profile cases, as shown in figure 1, with some examples being USA-UK or Russia-China, but there are some notable exceptions.

A prominent outlier I want to highlight is Canada-Cuba, two countries which have had very strong relations for decades. Despite this friendship, Canada and Cuba are in opposing factions regarding the Venezuelan Presidential Crisis. This Canada-Cuba relationship creates an imbalance in the graph, but does that strictly mean the relationship itself is prone to collapse? Over recent months, I have not heard of any deterioration in the relationship between these two countries despite the clear difference in policy regarding Venezuela. Of course, it wouldn’t be surprising to read that there is an increase in tension behind closed doors, but as of right now, it doesn’t feel accurate to say that Canada-Cuba relations are in some way flawed. This relationship causes other issues with balance, such as how Canada-Iran is a negative relationship, yet Cuba-Iran is a positive one.

Due to the Canada-Cuba example, I feel that it may be rather difficult to find a perfectly balanced graph using real-world data, as the world is simply too intricate to say definitively who is a friend or an enemy of whom. An important aspect to consider is that edges are binary: negative or positive. Relationships between countries, however, are volatile and subject to change. An example is Brazil-Russia, notably in opposing factions, whose relationship has been improving, but where it is still tough to gauge whether the current relationship could be described as friendly. Another thing to consider is a neutral relationship, such as the one between the UK and Cuba. Such a relationship can’t be expressed in a balanced graph, as each edge must be labelled negative or positive, which does not allow for neutrality of any kind. This is not to say that the notions of balanced graphs aren’t useful, but it may be more reasonable to look at an overall level of balance, expressed for instance as a probability, as opposed to merely saying that the graph is balanced or unbalanced.
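One simple way to express an overall level of balance is the fraction of triangles that are balanced. A triangle is balanced when the product of its three edge signs is positive (three friendships, or one friendship and two rivalries). The sketch below uses only relationships mentioned in this post, encoded as +1 or -1. The encoding and the choice of metric are my own assumptions, not an official measure of diplomatic relations.

```python
from itertools import combinations

# Signed edges taken from the post: +1 for positive, -1 for negative.
relations = {
    ("Guaidó", "Maduro"): -1,
    ("USA", "Guaidó"): +1,
    ("UK", "Guaidó"): +1,
    ("USA", "UK"): +1,
    ("Canada", "Cuba"): +1,
    ("Canada", "Iran"): -1,
    ("Cuba", "Iran"): +1,
}
signs = {frozenset(pair): sign for pair, sign in relations.items()}

def triangle_products(signs):
    """Map each fully labelled triangle to the product of its edge signs."""
    nodes = sorted({node for pair in signs for node in pair})
    products = {}
    for trio in combinations(nodes, 3):
        pairs = [frozenset(p) for p in combinations(trio, 2)]
        if all(p in signs for p in pairs):
            product = 1
            for p in pairs:
                product *= signs[p]
            products[trio] = product
    return products

products = triangle_products(signs)
balanced = [trio for trio, product in products.items() if product > 0]
balance_ratio = len(balanced) / len(products)

for trio, product in products.items():
    print(trio, "balanced" if product > 0 else "unbalanced")
print(balance_ratio)
```

Only two triangles have all three edges labelled here: USA-UK-Guaidó is balanced, and Canada-Cuba-Iran is unbalanced, which gives a ratio of 0.5 for this tiny sample.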

2019 Venezuelan Presidential Crisis Summary:

https://www.bbc.com/news/world-latin-america-48121148

Figure 1: Demonstrates the relationships between some of the larger countries involved in the 2019 Venezuelan Presidential Crisis

Social Networks altered by growing Information Network

With the world becoming more interconnected through the internet, people can share their ideologies with a significant number of people without meeting them in person. This removes communication barriers such as distance and time differences, and makes discussions more accessible to the general public.

Because social media and online discussion boards are largely unregulated and allow users to remain non-transparent, specific ideals can be broadcast throughout the social network without any accountability. This leads to

Natural barriers like distance are what prevent the intermingling of individuals in different areas, acting as natural filters that regulate the flow of information. Without these barriers, location is no longer a main factor when analyzing the social network.

Figure 1: The structure of the social network affecting the voters’ perceptions of information

As a result, the analysis of social and information networks cannot rely solely on the location of an individual node in a geographic area, as its edges may connect it to nodes in significantly different areas.

This is interesting with respect to the course because a simple graph with edges connecting nodes that communicate with one another does not give a complete picture of the situation, as the actual network has location as part of the structure and the channels of communication can be unpredictable.

Big data loads and enterprises

The study described in the article tells us the surprising fact that most enterprise networks are actually incapable of handling big data loads. Given the rise of technology, and the fact that most households now have access to the web through either personal devices or public ones such as those in a library, it is very surprising that a large number of companies cannot handle the flow of data through their networks. As was said in class, a graph can be used to represent the networks that make up the internet, with the nodes being the devices (routers, switches, etc., whether in an enterprise or a normal household) and the edges being the connections between two nodes (a connection meaning they can send data packets to each other). A great deal of information is sent every day, and it seems that the rate at which technology is advancing exceeds the rate at which companies can currently keep up.

This is really interesting to me because, with how much money there is to be made in the tech industry, I would have guessed that businesses would be able to keep up with the growing demand. The reality of the situation, however, is not too surprising given how big those networks are. The ones discussed in class with only 15 nodes are already really confusing to look at, so when thinking about the billions of devices around the world, it makes a lot more sense why this is so complicated to pull off. This has really opened my eyes, because it has helped me understand the scope of these networks and how large a scale they operate at.

https://www.networkworld.com/article/3440519/most-enterprise-networks-cant-handle-big-data-loads.html

Decentralized communication allows for more robust networks

I saw an article describing the use of decentralized communication over Bluetooth by protestors in Hong Kong (Wakefield, 2019). The decentralization of communication relates to the ideas in CSCC46 of network robustness and the connectedness of graphs.

In the protests, many protestors were using the messaging app Telegram to coordinate and to communicate in large groups. Telegram, as a cloud-based messaging app, uses a centralized network model, where communications between devices must travel through its servers (Telegram, n.d.). For example, for ‘Alice’ to contact ‘Bob’ through Telegram, Alice sends a message from her phone to Telegram’s server, which sends it on to Bob. However, this presents a few issues. First, in a large protest with many people, cell towers can quickly become overloaded, making it difficult to send messages. Another point of failure is that if Telegram’s server encounters issues, messages cannot be sent. This happened in June 2019, when Telegram’s servers faced a DDoS attack (Shieber, 2019).

Due to these risks, protestors started using apps such as Bridgefy and Firechat, which use peer-to-peer communication through Bluetooth to communicate. (Wakefield, 2019) In the context of a protest, it seems feasible. In a crowd, people are physically close together, so Bluetooth’s short 100m range is not an issue. (Wakefield, 2019) If a cell tower is overloaded, the users in that immediate area can still communicate with other users in the immediate area. If there are many users in the area using the app, they all assist in distributing messages to each other.

In the context of CSCC46, if we consider devices and equipment such as phones and cell towers to be nodes, and a connection between them to be edges, this decentralized communication allows for greater network robustness, as the graph does not quickly become disconnected when one important node (the cell tower) is removed. The distance between nodes can also be shorter, as the minimum distance is now phone directly to phone, instead of travelling through a cell tower and a server. This allows for more stable communication in tightly packed local areas, as long as there are enough nodes.
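The robustness argument can be sketched in a few lines: remove one node and check whether the remaining graph is still connected. Both graphs below are hypothetical toy examples (four phones, with names chosen for illustration), not a model of any real cell network or of how Bridgefy routes messages.

```python
def without_node(adjacency, removed):
    """Return a copy of the graph with one node (and its edges) removed."""
    return {
        node: nbrs - {removed}
        for node, nbrs in adjacency.items()
        if node != removed
    }

def is_connected(adjacency):
    """True if every node can be reached from an arbitrary start node."""
    if not adjacency:
        return True
    start = next(iter(adjacency))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(adjacency)

# Centralized: every phone talks only to the tower (a star graph).
centralized = {
    "tower": {"alice", "bob", "carol", "dave"},
    "alice": {"tower"},
    "bob": {"tower"},
    "carol": {"tower"},
    "dave": {"tower"},
}

# Decentralized: phones talk to nearby phones directly (a ring here).
mesh = {
    "alice": {"bob", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"bob", "dave"},
    "dave": {"carol", "alice"},
}

print(is_connected(without_node(centralized, "tower")))
print(is_connected(without_node(mesh, "alice")))
```

Losing the tower leaves every phone isolated, while the mesh survives the loss of any single phone.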

Centralized network, cell tower is a single point of failure
Decentralized network, each node (user) can communicate with each other independently

Decentralized communication through Bluetooth allows for a more robust network in small areas. This has applications outside of protests in any event with large crowds which may overload cell towers, such as a baseball game, a concert, or a natural disaster. (Bridgefy, n.d.) Using our knowledge of CSCC46 helps us analyze why some forms of communication can be more stable than others in certain situations.

References:

Bridgefy. (n.d.). Bridgefy. Retrieved from Bridgefy: https://bridgefy.me/

Shieber, J. (2019, June 13). Telegram faces DDoS attack in China… again | TechCrunch. Retrieved from TechCrunch: https://techcrunch.com/2019/06/12/telegram-faces-ddos-attack-in-china-again/

Telegram. (n.d.). Telegram Messenger. Retrieved from Telegram: https://telegram.org/

Wakefield, J. (2019, September 3). Hong Kong protesters using Bluetooth Bridgefy app – BBC news. Retrieved from BBC News: https://www.bbc.com/news/technology-49565587

Leveraging Community Detection Algorithms for Machine Learning

Hey everyone.

Today I want to discuss an interesting new study that came out recently involving the use of social networks as a tool to group datasets for machine learning models. A paper named “The Power of Communities: A Text Classification Model with Automated Labeling Process Using Network Community Detection”, published by Minjun Kim and Hiroki Sayama on September 25th, 2019, highlights a useful application of network logic and analysis to training machine learning text classification models.

If you have worked in data science before, you’ll understand the enormous amount of time that is spent on ETL (extract, transform, and load). On top of that, the data needs to be labelled and feature engineered before useful insights can be extracted from it. These two researchers describe how supervised and semi-supervised data are often associated with pre-defined keywords or data, which impacts classification. Other clustering algorithms, such as k-means, are biased towards words which repeatedly appear in different contexts, which skews the model and introduces unnecessary ambiguity. The paper explains Kim and Sayama’s method for applying a network community detection algorithm to group the preprocessed sentences into different communities and extract insights from them.

Particularly interesting is that the method used in their paper is the Louvain modularity algorithm for network community detection. This algorithm is based on evaluating the density of network links. This relates well to our class discussions on strong and weak ties, as the Louvain algorithm uses modularity, a value between -1/2 and 1 that compares the density of links inside communities to the density of links between communities. The Louvain method also relates to the Girvan-Newman algorithm: it optimizes the modularity measure that Newman and Girvan introduced, using a greedy heuristic based on local optimization.
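To show what the modularity score measures, here is a minimal sketch that computes it for a given partition of an unweighted, undirected graph. For each community it takes the fraction of edges inside the community minus the square of the community's share of total degree. This only evaluates the score that Louvain tries to maximize. It is not the Louvain algorithm itself, and the toy graph (two triangles joined by one edge) is an assumption for illustration.

```python
def modularity(edges, community_of):
    """Modularity of a partition: sum over communities of
    (internal edges / m) - (community degree / 2m) squared."""
    m = len(edges)
    internal = {}    # edges with both endpoints in the community
    degree_sum = {}  # total degree of the community's nodes
    for a, b in edges:
        for node in (a, b):
            c = community_of[node]
            degree_sum[c] = degree_sum.get(c, 0) + 1
        if community_of[a] == community_of[b]:
            c = community_of[a]
            internal[c] = internal.get(c, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Toy graph (assumption): two triangles joined by the single edge 3-4.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]

split = {1: "left", 2: "left", 3: "left", 4: "right", 5: "right", 6: "right"}
merged = {node: "all" for node in split}

q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(round(q_split, 4))
print(q_merged)
```

Splitting the graph into its two triangles scores 5/14 (about 0.357), while lumping everything into one community scores 0, so the split is the better community structure under this measure.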

I found this topic interesting because it showcases the various applications of network theory. By vectorizing sentences, we can discern mathematical properties that relate sentences semantically to each other and draw out communities without using natural language processing. This is especially interesting as it shows a concrete application of community detection as we saw it in class, and how it relates to cutting-edge academic research. For anyone interested, I have included a diagram of the communities as detected in the paper by Kim and Sayama below.

Citation:

Kim, M., & Sayama, H. (2019, September 25). The Power of Communities: A Text Classification Model with Automated Labeling Process Using Network Community Detection. Retrieved September 30, 2019, from https://arxiv.org/abs/1909.11706v1.

Needham, M. (n.d.). 6.1. The Louvain algorithm. Retrieved September 30, 2019, from https://neo4j.com/docs/graph-algorithms/current/algorithms/louvain/.

The Role of Networks in Disease Prevention

An understanding of networks is crucial to disease prevention. The importance of wind in the spread of disease is highlighted in an article by Joel H Ellwanger and José A B Chies at thelancet.com.

The channels of wind that carry airborne vectors, such as mosquitoes, may be interpreted as directed links in a network. The spread of malaria is affected by wind speed and direction (Ellwanger and Chies). The links may be weighted with respect to the strength of the wind, the number of mosquitoes, the potency of the pathogen, etc. Since it would be near impossible to represent every organism as a node, each node of the network would be a geographic segment containing hosts of the pathogen, including animals: for example, villages, towns, cities, forests, natural habitats, etc.

An example graph of the spread of airborne diseases.

This network may be converted into a mathematical graph, such as those seen in CSCC46 at the University of Toronto. Graph theory and analysis may then be applied to such graphs. The clustering coefficient is a measure of how much clustering occurs among nodes in a graph. The degree of a node is how many connections it has to other nodes. By analyzing the clustering coefficients and degrees of nodes on smaller portions of the graph, common sources of the pathogen may be deduced, as nodes would be clustered more densely around these sources. A breadth-first search of the graph, starting from any one of the sources, can be used to see how airborne diseases spread over time. Then, preventive measures may be put in place around these sources and other densely clustered areas to prevent the future spread of the disease.
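The three measures mentioned above (degree, clustering coefficient, and breadth-first search) can be sketched on a toy graph. The regions and links below are invented for illustration and are not data from the article. For simplicity the links are treated as undirected, whereas wind-driven links would really be directed and weighted.

```python
from collections import deque

# Hypothetical regions (assumption); an edge means vectors can travel
# between the two regions.
regions = {
    "swamp": {"village", "town", "forest"},
    "village": {"swamp", "town"},
    "town": {"swamp", "village", "city"},
    "forest": {"swamp"},
    "city": {"town"},
}

def degree(adjacency, node):
    """Number of connections a node has."""
    return len(adjacency[node])

def local_clustering(adjacency, node):
    """Fraction of pairs of neighbours that are themselves connected."""
    nbrs = adjacency[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adjacency[u])
    return 2 * links / (k * (k - 1))

def bfs_levels(adjacency, source):
    """Number of hops from the source to every reachable node."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in sorted(adjacency[node]):
            if nxt not in level:
                level[nxt] = level[node] + 1
                queue.append(nxt)
    return level

swamp_degree = degree(regions, "swamp")
swamp_clustering = local_clustering(regions, "swamp")
spread = bfs_levels(regions, "swamp")
print(swamp_degree, round(swamp_clustering, 3))
print(spread)
```

The breadth-first levels give a rough ordering of when each region would be exposed if the swamp were the source: its direct neighbours after one step, the city after two.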

By analyzing the networking behind the spread of airborne diseases, future outbreaks of these diseases may be more efficiently prevented. This study highlights the importance of network analysis and its countless applications to real world problems.

Ellwanger, Joel H., and José A. B. Chies. “Wind: a neglected factor in the spread of infectious diseases.” The Lancet, Elsevier Inc., 1 November 2018, https://www.thelancet.com/journals/lanplh/article/PIIS2542-5196(18)30238-9/fulltext

Strong Triadic Closure When Finding a Job

This article talks about the impact of weak ties on job seekers and recruiters; however, the author only covers the impact on the recruiters’ side. It states that if a company wants to build a team of talented people, hiring one talented person makes it more likely to hire others, because the new employee is a bridge to other networks of great people. This follows from the strong triadic closure property we talked about in lecture. In this network, the relationships are not only friendships but also employment relationships. Weak ties still exist between the company and the network of an employee, and they have an effect: the employee can offer a referral to someone in his or her network, and the company may hire that person in the end.

This is interesting because students who are about to graduate will have to find jobs. Sending resumes to as many companies as possible is one way to find a job, but it is not efficient, and the chance of getting an interview invitation is small. Knowing someone who works or has worked at a company, and asking him or her to provide a referral, seems to be a better choice. Although the article does not talk about the impact on the seeker’s side, weak ties work both ways: the job seeker can be recommended to the company through a referral, and the company is more likely to hire them (the weak tie effect) if they have the skills to meet the requirements.

Reference:

Harper, Everett. “Weak Ties Matter.” TechCrunch. TechCrunch, April 26, 2016. https://techcrunch.com/2016/04/26/weak-ties-matter/.