In this post, I will demonstrate how to extract data from a Neo4j database and import it into Graph Commons. I will also provide some tips for fine-tuning and sharing your graph. Even though this demonstration is based on Neo4j, you should be able to apply the same techniques to prepare graphs for Graph Commons with your own favorite tool.
Neo4j is a popular open-source graph database that I work with every day to develop Graph Commons. The queries you will see later in the post are executed in the Neo4j query browser, a practical tool for quickly interacting with your data set. To get started with Neo4j, check out: http://neo4j.com/developer/get-started/
Introducing the data set and setting up the context
My current data set consists of around 10k nodes and 68k relations that I scraped from the official website of the Grand National Assembly of Turkey. In the portion of the set relevant to this post, I have 2243 Members of Parliament (MPs) since 1996 (20th term) and 81 cities from which MPs were elected. Each relation has a donem ("term" in Turkish) property specifying the term in which the MP represented that city. Some MPs serve for multiple terms and some in multiple cities.
I want to focus on how MPs transfer from one city to another, and map out their “migration” patterns. Normally one would assume a representative is elected from the city they were born in or a city they have lived in for a long time. However, it is also common to see MPs change their city as part of an election strategy.
The following query creates a relation between two cities when an MP migrates from one city to the other in two consecutive terms. To track multiple instances of this migration, a weight property is incremented every time there is a new instance of this relation.
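For readers who prefer to see the counting logic outside of Cypher, here is a minimal Python sketch of the same idea. It is not the query I ran: the `(mp_id, city, term)` record shape and the function name `migration_weights` are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def migration_weights(records):
    """Count city-to-city moves between consecutive terms.

    records: iterable of (mp_id, city, term) tuples (hypothetical shape).
    Returns a Counter mapping (from_city, to_city) -> weight.
    """
    # Group the cities each MP represented, keyed by term number.
    cities_by_mp = defaultdict(lambda: defaultdict(set))
    for mp_id, city, term in records:
        cities_by_mp[mp_id][term].add(city)

    weights = Counter()
    for terms in cities_by_mp.values():
        for term, from_cities in terms.items():
            # Only the immediately following term counts as "consecutive".
            to_cities = terms.get(term + 1, set())
            for src in from_cities:
                for dst in to_cities:
                    if src != dst:
                        # One more instance of this migration.
                        weights[(src, dst)] += 1
    return weights


sample_records = [
    ("mp1", "Ankara", 23), ("mp1", "Istanbul", 24),
    ("mp2", "Ankara", 23), ("mp2", "Istanbul", 24),
    ("mp3", "Van", 22), ("mp3", "Van", 23),      # stayed in the same city
    ("mp4", "Izmir", 20), ("mp4", "Bursa", 22),  # terms are not consecutive
]
sample_weights = migration_weights(sample_records)
```

In this made-up sample, only the Ankara to Istanbul move is counted, with a weight of 2, because the other two MPs either stayed put or skipped a term.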
Now I need to export these relations as CSV to import them into Graph Commons. I will run the following simple query and click the CSV export button in the query browser.
Notice the order of the columns in the output, which Graph Commons requires in order to parse the file. Here’s the sample output from this query.
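If you prepare the edge file with a script instead of the query browser, the same column ordering can be produced along these lines. This is only a sketch: the header names in `EDGE_HEADER`, the `City` node type, and the `MIGRATED_TO` edge name are my assumptions here, so check them against the import page before relying on them.

```python
import csv
import io

# Assumed column order for an edge file; verify against the import page.
EDGE_HEADER = ["From Type", "From Name", "Edge", "To Type", "To Name", "Weight"]


def edges_to_csv(weights, node_type="City", edge_name="MIGRATED_TO"):
    """Serialize {(from_city, to_city): weight} into CSV text.

    node_type and edge_name are hypothetical labels used for illustration.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(EDGE_HEADER)
    # Sort the pairs so the output is stable between runs.
    for (src, dst), weight in sorted(weights.items()):
        writer.writerow([node_type, src, edge_name, node_type, dst, weight])
    return buffer.getvalue()


edge_csv = edges_to_csv({("Ankara", "Istanbul"): 2, ("Van", "Diyarbakir"): 1})
```

Writing to an in-memory buffer keeps the sketch easy to inspect; to produce a file for upload, write the returned text to disk.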
Importing data and creating the first graph
I visit https://graphcommons.com/graphs/import, and create a new graph by importing my CSV file.
Here’s what my imported graph looks like initially:
Fine-tuning the graph
At this point, I can start fine-tuning the look of my graph. Using the layout tool tip box on the right side, I change the layout algorithm to Force Atlas 2 and I adjust layout parameters. I also open the cartography panel from the right drawer on the top right corner and change color and size parameters for nodes.
I see that some node labels are overlapping with each other, making the graph harder to read. To fix that, I can click the pause button on the right side of the screen (shortcut: the spacebar toggles play/pause). This stops the layout engine. Then, on the paused static graph, I can drag nodes slightly to better locations, making the graph more readable without losing the structure of the layout. When I am happy with the layout in pause mode, I can click the Save button at the top to save a snapshot (i.e. all node positions) of the graph. I can always come back to this snapshot even if I toggle play and change the layout: I mouse over the “Reset” button at the bottom right corner and select “Show snapshot” in the menu. Of course, I’d like everyone to see the nodes the way I positioned them. Therefore, in the same Reset menu, I check the “Open with snapshot” option, which makes sure the graph is displayed to visitors exactly the way I arranged it.
Here’s the final version of my graph: https://graphcommons.com/graphs/9bffe9ae-e913-4893-8cf0-a3787037f34d?auto=true.
At first glance, I see a triangle of major nodes: Ankara as the political capital, Istanbul as the cultural capital and Diyarbakir as what is considered to be the Kurdish capital of Turkey. Among the three, Istanbul stands out as the most prominent city in the graph. It has many edges in both directions to many other cities, which is expected given the strategic importance of the city. The formation of a cluster of cities with large Kurdish populations (i.e. Diyarbakir and Van) may imply a sense of commonality and shared politics in the region.
Adding node details
Graph Commons also supports the import of node details through CSV files. To demonstrate, I will query the MPs who changed the city they represented between the previous (24th) and the current (25th) terms. The following query will create the CSV file for the edges.
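As a rough illustration of what that selection does, here is a small Python sketch of the same filter. It is not the Cypher query itself: the `(mp_id, city, term)` record shape is an assumption, and for simplicity it assumes each MP represents a single city per term.

```python
def movers_between(records, prev_term=24, next_term=25):
    """Return {mp_id: (old_city, new_city)} for MPs whose city changed.

    records: iterable of (mp_id, city, term) tuples (hypothetical shape).
    Assumes at most one city per MP per term.
    """
    prev_city, next_city = {}, {}
    for mp_id, city, term in records:
        if term == prev_term:
            prev_city[mp_id] = city
        elif term == next_term:
            next_city[mp_id] = city
    # Keep only MPs present in both terms with a different city in each.
    return {
        mp_id: (prev_city[mp_id], next_city[mp_id])
        for mp_id in prev_city
        if mp_id in next_city and prev_city[mp_id] != next_city[mp_id]
    }


sample_movers = movers_between([
    ("mp1", "Ankara", 24), ("mp1", "Istanbul", 25),  # changed city
    ("mp2", "Van", 24), ("mp2", "Van", 25),          # same city
    ("mp3", "Izmir", 25),                            # new in the 25th term
])
```

Each entry in the result corresponds to one MP-level edge between the old and the new city.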
Now I will run another query to export the details of the MPs for Graph Commons. It is pretty much the same query as before, except that the RETURN clause lists the MP node data.
The output is the default format for the node details. Note that any header after Reference is treated as a custom property for the node type.
Now I have two CSV files, one for the edges and one for the node details. I will add both of these files on the import page and create a new graph the same way as before.
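To make the custom property rule concrete, here is a hedged Python sketch that writes a node details file with one extra column after Reference. The leading header names in `NODE_HEADER`, the `Party` column, and the sample MP are assumptions for illustration, not my actual output.

```python
import csv
import io

# Assumed default node columns; anything appended after "Reference"
# would be picked up as a custom property of the node type.
NODE_HEADER = ["Type", "Name", "Description", "Image", "Reference"]


def nodes_to_csv(nodes, custom_fields=()):
    """Serialize node dicts into CSV text; missing values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=NODE_HEADER + list(custom_fields),
        restval="",             # fill columns a node does not provide
        extrasaction="ignore",  # drop keys that are not listed as columns
        lineterminator="\n",
    )
    writer.writeheader()
    for node in nodes:
        writer.writerow(node)
    return buffer.getvalue()


node_csv = nodes_to_csv(
    [{"Type": "MP", "Name": "Jane Doe", "Description": "Bio text", "Party": "Independent"}],
    custom_fields=["Party"],
)
```

Here `Party` comes after Reference in the header row, so under the rule above it would be imported as a custom property on MP nodes.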
Notice that the image column is automatically assigned as the image for MP nodes in the graph. I can click on an MP node and see the bio of that MP in the description field. As a final touch-up, I update the title of my graph and add a cover image by clicking on the title at the top of the page.
Publish and share findings
Now that I am happy with the content and the look of my graph, I feel ready to share my graph. All standard meta tags are automatically added to graph pages to support previews on Facebook, Twitter and everywhere else. Here’s what my tweet looks like.
I can also embed this graph in a web page or a blog post. I click embed on the top right menu of the graph and copy the embed code inside this blog post. Here is the result.
In this post, I demonstrated a way to export CSV files from the Neo4j query browser to quickly create graphs on Graph Commons. I believe this is a very convenient method particularly for data scientists and researchers who work with large data sets in their local workspace and need an easy and practical medium to share their work.
Feel free to get in touch with us @graphcommons to send any suggestions, questions and comments.