Neo4j is a powerful graph database management system that can store and manage multiple graphs contained in databases. It uses a query language called Cypher, which offers a visual and logical way of pattern matching nodes and relationships in a graph. I used Neo4j for a few assignments back in 2019 when I took CSCC01. One of those assignments was building a REST API for accessing IMDB data, and one of the endpoints computed the Kevin Bacon degree. I had been thinking about bringing up Neo4j at some point during the lectures, and now seems like the best time.
Below is a simple example of a Cypher query that returns a graph of Person nodes with a height property greater than 1.8 that are connected to Country nodes.
MATCH (p:Person)-[:FROM]->(c:Country)
WHERE p.height > 1.8
RETURN p, c;
I decided to write about this DBMS because I personally use the Neo4j sandbox when I need to visualize class topics. A free online sandbox is available at https://neo4j.com/sandbox/ with many prebuilt datasets, such as movies, the 2019 Women's World Cup, the US Congress, and movie reviews. You can even generate your own graph of tweets and mentions if you connect your Twitter account!
We can also compute things like the IN and OUT degree of nodes. Suppose we have a Twitter-esque network structure stored. We can compute the IN and OUT degree of Alice with the query below:
MATCH (u:User)
WHERE u.id = 'Alice'
RETURN u.id AS name,
size((u)-[:FOLLOWS]->()) AS follows,
size((u)<-[:FOLLOWS]-()) AS followers
The result has one row per matching user, with the columns name, follows, and followers.
Finally, we can also compute things such as the clustering coefficient, but I forgot how, and it wasn't the first result on Google. Anyhow, Neo4j offers a great way to visualize various topics covered in class. I hope that you will all play around with the sandbox, and maybe we could even use it for future demos in class!
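As a reminder of what the clustering coefficient measures, here is a minimal Python sketch. It is not a Neo4j or Cypher feature. It assumes the follow relationships are treated as undirected edges and uses the usual local definition: the fraction of pairs of a node's neighbours that are themselves connected. The function name `local_clustering` and the toy edge list are made up for illustration.

```python
from itertools import combinations


def local_clustering(edges, node):
    """Local clustering coefficient of `node`, treating edges as undirected."""
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    nbrs = neighbors.get(node, set())
    k = len(nbrs)
    if k < 2:
        # With fewer than two neighbours there are no pairs to check;
        # reporting 0.0 here is a convention chosen for this sketch.
        return 0.0

    # Count the pairs of neighbours that are directly connected to each other.
    links = sum(1 for x, y in combinations(nbrs, 2) if y in neighbors[x])

    # Divide by the number of possible pairs, k * (k - 1) / 2.
    return 2 * links / (k * (k - 1))


# Hypothetical toy network; the names are placeholders.
follows = [
    ("Alice", "Bob"),
    ("Alice", "Carol"),
    ("Alice", "Dave"),
    ("Bob", "Carol"),
]

print(local_clustering(follows, "Alice"))
```

In this toy graph Alice has three neighbours, and only one of the three possible neighbour pairs (Bob and Carol) is connected, so her coefficient is one third.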
Links: https://neo4j.com/sandbox/