Thursday 11 August 2011

(Re-)Using the Graph Explorer Pt. 1: Technology

As announced, I would like to use the next couple of posts to provide a little technical information on our PELAGIOS data visualization demo, aka the Graph Explorer.

In this post I'll provide a quick rundown of the technologies we have used to implement the demo. So if your interest is in understanding the code, deploying it on your own server, etc. then this post is for you!

If you don't care about the details under the hood, but want to use the API to build your own mashups: this will be covered in the next post.

If you're not interested in tinkering with the application at all, but rather want to know how you can get your own data into the graph, or create your own data aggregations: we'll cover that in part 3, which will focus on the data end of things.


In terms of architecture, the Graph Explorer is a pretty standard Web application: the user interface is implemented in JavaScript, so it should run in any reasonably modern browser, with no need for extra plugins (i.e. no Flash or Silverlight involved). It makes heavy use of Scalable Vector Graphics to visualize the graph, aided by a few little helper libraries underneath for added functionality and eye candy.

The server side is implemented in Java. To make things work at reasonable speeds, the Graph Explorer keeps an aggregation of PELAGIOS partners' data in a database. (Right now it's actually just a small sample subset of about 75,000 data records in total.)

Rather than using a relational database, we have used the Neo4j NoSQL graph database. Not only does this fit better with the graph-like structure of our source data (which is RDF), it also has the added benefit that we get a range of graph algorithms for free: e.g. the shortest path search, which is what you will see when you search for multiple places (example). Personally, I also found Neo4j easier to work with than a triple store. The recommended Neo4j practice is to define your own domain model and then work with concrete instances of Datasets, Places and OAC Annotations, rather than with a generic model of Nodes and Edges or Resources and Properties. That just felt much more straightforward, results in more concise and readable code, and (in my opinion) easily makes up for the sacrifices in terms of 'genericness' and the lack of a standardized query language.
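
To give a rough idea of what a shortest path search between places does, here is a minimal sketch in plain Java. It is not the Graph Explorer's code and it does not use the Neo4j API: it is a simple breadth-first search over an in-memory graph, and the class name GraphSketch, its methods and the dataset and place names are all made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: datasets and places are plain string labels here,
// whereas a real domain model would use dedicated Dataset and Place classes.
public class GraphSketch {

    // Undirected adjacency list: every node label maps to its neighbours.
    private final Map<String, List<String>> adjacency = new HashMap<>();

    /** Records that a dataset references a place, as an undirected edge. */
    public void addReference(String dataset, String place) {
        adjacency.computeIfAbsent(dataset, k -> new ArrayList<>()).add(place);
        adjacency.computeIfAbsent(place, k -> new ArrayList<>()).add(dataset);
    }

    /** Breadth-first search; returns the nodes on a shortest path, or an empty list if there is none. */
    public List<String> shortestPath(String from, String to) {
        if (!adjacency.containsKey(from) || !adjacency.containsKey(to)) {
            return Collections.emptyList();
        }
        // Maps each visited node to the node it was reached from (null for the start node).
        Map<String, String> previous = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        previous.put(from, null);
        queue.add(from);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(to)) {
                // Walk back from the target to the start to rebuild the path.
                LinkedList<String> path = new LinkedList<>();
                for (String node = to; node != null; node = previous.get(node)) {
                    path.addFirst(node);
                }
                return path;
            }
            for (String next : adjacency.get(current)) {
                if (!previous.containsKey(next)) {
                    previous.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        GraphSketch graph = new GraphSketch();
        graph.addReference("Dataset A", "Athens");
        graph.addReference("Dataset A", "Rome");
        graph.addReference("Dataset B", "Rome");
        graph.addReference("Dataset B", "Carthage");
        // Two places are connected through the datasets that reference them.
        System.out.println(graph.shortestPath("Athens", "Carthage"));
    }
}
```

With a graph database, this kind of traversal comes with the database, so there is no need to maintain the adjacency structure or the search by hand.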

Source Code & License

Like all our work, the Graph Explorer is open. We're licensing it under the GNU General Public License v3.0. You can get the source code from our GitHub repository, which also contains detailed build and deployment instructions.

P.S.: An online demo of the PELAGIOS Graph Explorer is available here. Screencasts explaining the basic usage are in this blogpost: The PELAGIOS Graph Explorer: A First Look
