How good is YouTube Music’s recommendation algorithm?

I am an avid YouTube Music supporter. I truly believe that it is a better experience on a laptop than Spotify. But I’ve always wondered: how does its recommendation algorithm work? As on YouTube, whenever you click on a song (or a playlist) on YouTube Music, it plays that song/playlist and populates the sidebar with other songs that you might be interested in. This is a really cool feature that has allowed me to discover many new songs and artists, but it has also brought me to some weird and niche songs that I do not enjoy. For this blog, I’ve finally decided to test the almighty YouTube Music algorithm!

I used the open-source ytmusicapi library to interact with the YouTube Music API. The base songs that I used to get recommendations came from my liked list, which is automatically generated from all the songs I have liked over the years. I took the 200 most recently liked songs, and for each one I fetched a maximum of 25 recommendations. For each artist of a liked song, I connected them to the artists whose songs were recommended (including themselves, as a self-reference). In total, I had 538 artists (106 of which originated from liked songs) and 2534 connections between artists, only 1.75% of possible connections. I represented this data as a directed weighted graph, where each edge connects a liked artist to a recommended artist (or to the same artist, for self loops) and the weight is the number of songs connecting the two artists. A flaw in this design is that my favourite artists, from whom I have many liked songs, get more chances to connect to other artists and therefore end up with bigger neighbourhoods, higher weights and higher clustering coefficients.
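To make the graph construction concrete, here is a minimal sketch of how the edges and weights described above could be accumulated. It is not the exact script I ran: the function name `build_artist_graph` and the sample data are made up for illustration, and the input is assumed to already be a list of (liked artist, recommended artists) pairs. With ytmusicapi, I believe that list would come from something like `get_liked_songs(limit=200)` and `get_watch_playlist(videoId=..., limit=25)`, but treat those calls as an assumption and check the library docs.

```python
from collections import Counter

def build_artist_graph(recommendations):
    """Return a Counter mapping (source_artist, target_artist) -> weight.

    `recommendations` is an iterable of (liked_artist, recommended_artists)
    pairs, one pair per liked song. This input shape is an assumption made
    for the sketch, not the raw API response.
    """
    edges = Counter()
    for liked_artist, recommended_artists in recommendations:
        for rec_artist in recommended_artists:
            # Self loops are kept: an artist recommending itself still counts.
            edges[(liked_artist, rec_artist)] += 1
    return edges

# Made-up sample data, purely illustrative.
sample = [
    ("Artist A", ["Artist A", "Artist B", "Artist B"]),
    ("Artist A", ["Artist B", "Artist C"]),
    ("Artist B", ["Artist A"]),
]

graph = build_artist_graph(sample)
nodes = {artist for edge in graph for artist in edge}

for (source, target), weight in sorted(graph.items()):
    print(f"{source} -> {target}: weight {weight}")
```

Note how "Artist A", with two liked songs, ends up with more edges and higher weights than "Artist B". That is the bias towards my favourite artists mentioned above.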

The graph turned out to be disconnected, with 96.5% of nodes in one connected component and only one liked artist (Too Many Zooz) recommending other artists that no one else recommended. Too Many Zooz is a unique band with niche appeal, so it is no surprise that it created its own mini-network. The degree distribution follows an exponential decay, as expected. However, there are some bumps along the way for neighbourhoods of size 19 to 33. These are most likely the most popular artists, whom YouTube automatically tries to link to because popular artists are the most likely to be enjoyed by new listeners. The maximum degree is 108, belonging to Glass Animals. This, however, should be taken with a degree of caution (pun intended) because of the biases discussed above. The average degree is 9.42, while Gnp gives 2.0. This shows that artists are actually highly connected through the YouTube Music recommendation algorithm. That surprised me, because I always thought it would mostly recommend songs by the same artist. The data does partly corroborate that belief: 47.2% of artists were mostly recommended other songs by themselves. Still, it seems that in each set of recommendations there is enough variety of artists for the user to discover new things.
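As a quick sanity check on the numbers, the average degree and the 1.75% density can be recomputed from the node and edge counts alone. This sketch assumes each connection is counted once and that "possible connections" means unordered pairs of distinct artists. That is my reading of the figures rather than something the graph library told me, but the arithmetic does line up with it.

```python
from math import comb

n_artists = 538   # nodes reported above
n_edges = 2534    # connections reported above

# Each edge contributes to the degree of two artists.
average_degree = 2 * n_edges / n_artists

# Assumption: possible connections = unordered pairs of distinct artists.
density = n_edges / comb(n_artists, 2)

print(f"average degree: {average_degree:.2f}")  # 9.42
print(f"density: {density:.2%}")                # 1.75%
```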

The weight distribution is also very similar to the degree distribution, with the maximum weight being 82 for the self-references of Gregory Alan Isakov. It should also come as no surprise that the clustering coefficient was much higher than that of Gnp: 0.1869 vs 0.0175, an entire order of magnitude higher. This clearly shows that even though the YouTube Music algorithm tended to recommend a variety of artists for each song, similar artists keep getting recommended to each other. This may also be partly due to the sample biases touched on above.
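For anyone curious what a clustering coefficient measures, here is a rough sketch of one common definition: the average, over all nodes, of the fraction of a node's neighbour pairs that are themselves connected. It assumes edge direction, weights and self loops are all ignored, which may not match the exact variant behind the 0.1869 figure. The helper name `average_clustering` and the toy edges are made up for the example.

```python
from itertools import combinations

def average_clustering(edges):
    """Average local clustering coefficient of the undirected, unweighted
    version of the graph. Self loops are ignored."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set())
        neighbours.setdefault(v, set())
        if u != v:
            neighbours[u].add(v)
            neighbours[v].add(u)
    if not neighbours:
        return 0.0

    total = 0.0
    for nbrs in neighbours.values():
        k = len(nbrs)
        if k < 2:
            continue  # fewer than two neighbours contributes 0
        # Count the neighbour pairs that are directly connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbours[a])
        total += links / (k * (k - 1) / 2)
    return total / len(neighbours)

# Toy graph: a triangle A-B-C with one extra artist D hanging off C.
toy_edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
toy_clustering = average_clustering(toy_edges)
print(round(toy_clustering, 4))
```

In the toy graph, A and B each score 1, C scores 1/3 and D scores 0, so the average is 7/12.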

In terms of connectedness, as stated before, the graph is disconnected, with most nodes in one large component. There are also 4 strongly connected components (SCCs) containing more than one node, the largest containing 57 nodes, a whopping 10.6% of the graph. The others have sizes of 17 (3.2%), 3 (0.6%), and 2 (0.4%) nodes. In each SCC, all artists share the exact same genres, even when there are multiple genres. This shows how strongly the YouTube Music algorithm favours artists from the same genre when recommending new songs. For anyone interested, the large SCC is all alternative rock artists: The Black Keys, Younger Hunger, Gang of Youths, AJR, and IDK HOW, to name some of my favourites. Again, the size of the largest SCC may have been affected by the biases of the sample.
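A strongly connected component is a group of artists where every artist can reach every other one by following recommendations in their direction. As a rough illustration, here is a small sketch using Kosaraju's algorithm on a made-up graph. This is not necessarily how my graph library computes SCCs, and the function name and toy edges are hypothetical.

```python
def strongly_connected_components(edges):
    """Return the SCCs of a directed graph as a list of sets (Kosaraju)."""
    graph, reverse = {}, {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])
        reverse.setdefault(v, []).append(u)
        reverse.setdefault(u, [])

    # First pass: iterative DFS, recording nodes in order of completion.
    visited, order = set(), []
    for start in graph:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All successors explored: this node is finished.
                order.append(node)
                stack.pop()

    # Second pass: walk the reversed graph in decreasing finish order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components

# Toy graph: A, B, C recommend each other in a cycle, D and E recommend
# each other, and F is only ever recommended.
toy_edges = [("A", "B"), ("B", "C"), ("C", "A"),
             ("C", "D"), ("D", "E"), ("E", "D"), ("E", "F")]
components = strongly_connected_components(toy_edges)
multi_node = sorted((c for c in components if len(c) > 1), key=len, reverse=True)
for component in multi_node:
    print(sorted(component), f"{len(component) / 6:.1%} of the toy graph")
```

Dividing each component's size by the total number of nodes gives the same kind of percentages quoted above (57 out of 538 artists is about 10.6%).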

In conclusion, the YouTube Music algorithm is very well built. For each song, it will recommend mostly songs from the same artist, but it will always give enough variety for the listener to discover other artists from the same genre and broaden their experience. I tried uploading the relevant files and images to this post, but I kept getting errors, so they are all uploaded on Piazza (@30).

By Ezzeldin Ismail

 

 
