Tuesday, 3 July 2012

Geographical information retrieval of historical regions

In the last few weeks I have been developing an API and an interface on top of the PELAGIOS API in order to retrieve historical places, and their related annotations, by using some geographical context. The issue of superimposing a geographical representation over a data collection is not novel. In modern data collections, for example the data produced by public administrations today, organisations use geographical nomenclatures such as administrative subdivisions, or NUTS when the data consists of statistical observations.

For historical data collections this is not always feasible, since the administrations of past kingdoms did not always provide a sharp definition of their boundaries (sharp in the modern sense, with precise coordinates for the regions' shapes), nor a deep subdivision that could help our information retrieval task.

Fortunately for PELAGIOS users, some of the boundaries of the provinces of the Roman Empire have been made available as shapefiles, and this helps us browse the wealth of annotations provided as geolocated linked data. The shapefiles in question were digitised from Barrington Atlas rasters (georegistered and supplied by AWMC) by Pedar Foss at DePauw University in 2007 within the context of the MAGIS project, and have been provided by Tom Elliott from the Institute for the Study of the Ancient World, New York University. The regions represented can be seen in the figure below.


Roman provinces up to AD 117 visualised in CartoDB

The present post is not about a single API or a single interface that can be implemented on top of PELAGIOS data and services. Instead, it aims to provide some insight into how to implement geographical browsing by using open source tools.

In order to visualise places and annotations from the PELAGIOS API we exploited its geographical search by bounding box. Being able to retrieve the places in the PELAGIOS network contained in a bounding box gets us halfway to filtering them by any polygon. By adopting a GIS we could, in fact, query the data by polygon directly. Unfortunately that would require storing all the annotation data and the regions' polygons in the same database, which goes against the distributed nature of the linked data paradigm and is not feasible in general scenarios: a copy of the PELAGIOS data would have to be provided to every interested user, who would then be forced to install GIS software and host their own polygons.

Instead, what we did here is decouple the management of the annotation data from the geographical retrieval features, trying to minimise the amount of software to install and to reuse as much as possible the data and services already provided. For this reason we uploaded the polygons we were interested in to CartoDB, a web-enabled version of PostGIS. CartoDB offers limited free use of its web platform, but users can download the open source version and install it on their own server if and when needed. CartoDB lets developers run SQL queries over HTTP requests, which makes the system easy to integrate.
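As a rough illustration of running SQL over HTTP, the sketch below builds the GET URL for a query that retrieves one province's shape. It assumes the hosted CartoDB SQL API endpoint pattern (account subdomain, /api/v2/sql, q and format parameters); the account name, the table name roman_provinces and the column name are hypothetical, while the_geom is CartoDB's usual geometry column. It is not the code used by the service described here.

```python
from urllib.parse import urlencode


def province_query(name: str, table: str = "roman_provinces") -> str:
    """Return SQL selecting one province by name.

    The table and its 'name' column are hypothetical; single quotes in the
    name are doubled so the literal stays well-formed.
    """
    safe_name = name.replace("'", "''")
    return f"SELECT name, the_geom FROM {table} WHERE name = '{safe_name}'"


def cartodb_sql_url(account: str, sql: str, fmt: str = "GeoJSON") -> str:
    """Build a GET URL for the SQL API of a hosted CartoDB account (assumed pattern)."""
    query = urlencode({"q": sql, "format": fmt})
    return f"https://{account}.cartodb.com/api/v2/sql?{query}"


# "example-account" is a placeholder, not a real account.
url = cartodb_sql_url("example-account", province_query("Aquitania"))
```

Fetching that URL with any HTTP client would be expected to return the matching rows, here as GeoJSON, which is convenient for the polygon handling that follows.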

As said earlier, once we have the capability to query by bounding box we are halfway to being able to query by polygon. By querying CartoDB we can retrieve the shape of a region by using its name (e.g. Aquitania in the figures below). If we want to retrieve all the PELAGIOS places contained in the Aquitania region, we first query the PELAGIOS API for the places contained in the bounding box of the polygon, and then filter those places by testing topological containment against the retrieved shape.
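The two steps can be sketched in a few lines. This is a minimal example, assuming the region is a single ring of (lon, lat) pairs without holes and that each place is a dictionary with lon and lat keys (both are assumptions for illustration). The containment test is plain even-odd ray casting on planar coordinates, swapped in here for whatever the actual service uses.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # (lon, lat)


def bounding_box(ring: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Return (min_lon, min_lat, max_lon, max_lat) of a polygon ring."""
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return min(lons), min(lats), max(lons), max(lats)


def point_in_ring(point: Point, ring: Sequence[Point]) -> bool:
    """Even-odd ray casting: count the edges crossed by a ray going east."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_places(places: List[Dict], ring: Sequence[Point]) -> List[Dict]:
    """Keep only the places whose coordinates fall inside the ring."""
    return [p for p in places if point_in_ring((p["lon"], p["lat"]), ring)]


# Made-up triangle and places: both points lie in the bounding box,
# but only the first lies inside the polygon.
ring = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
candidates = [
    {"name": "inside", "lon": 1.0, "lat": 1.0},
    {"name": "bbox-only", "lon": 3.0, "lat": 3.0},
]
kept = filter_places(candidates, ring)
```

The bounding box query is cheap and can be answered by the remote API, while the exact test runs only on the candidates that come back, so the polygon never has to live in the same database as the annotations.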


Selection of PELAGIOS places by using the region's bounding box

Filtering of those resources by using the polygon topological containment

The activities needed to extract places by polygon can be represented by the diagram below and involve three actors: the service implemented by the ECS department in Southampton (named ECS), the PELAGIOS API, and the CartoDB instance used for this scenario.
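The interaction between the three actors can be outlined as one function that plays the ECS role and receives the other two services as callables. This is only a sketch of the sequence: the function and parameter names are hypothetical, and the stand-ins at the bottom return made-up data instead of calling CartoDB or the PELAGIOS API.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float]               # (lon, lat)
BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)
Place = Dict[str, Any]


def places_in_region(
    region: str,
    fetch_shape: Callable[[str], Sequence[Point]],      # stands for CartoDB
    fetch_places: Callable[[BBox], List[Place]],        # stands for the PELAGIOS API
    contains: Callable[[Sequence[Point], Place], bool],  # containment test run by ECS
) -> List[Place]:
    """Run the three steps in order and return the places inside the region."""
    ring = fetch_shape(region)                  # 1. ask CartoDB for the shape
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    bbox = (min(lons), min(lats), max(lons), max(lats))
    candidates = fetch_places(bbox)             # 2. ask PELAGIOS for places in the bbox
    return [p for p in candidates if contains(ring, p)]  # 3. filter by polygon


# In-memory stand-ins with made-up data, so the sketch runs offline.
def demo_shape(region: str) -> Sequence[Point]:
    return [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]


def demo_places(bbox: BBox) -> List[Place]:
    return [{"name": "p1", "lon": 1.0, "lat": 1.0},
            {"name": "p2", "lon": 3.0, "lat": 3.0}]


def demo_contains(ring: Sequence[Point], place: Place) -> bool:
    # Shortcut valid only for the demo triangle above.
    return place["lon"] >= 0 and place["lat"] >= 0 and place["lon"] + place["lat"] < 4.0


result = places_in_region("demo", demo_shape, demo_places, demo_contains)
```

Passing the services in as parameters keeps the annotation store and the polygon store independent, which is the decoupling this approach relies on.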




This kind of geo-retrieval capability can be used to fuel applications like the one you can find at: http://pelagios.ecs.soton.ac.uk/historic/ The visualisation lets users view the annotations stored within the PELAGIOS network for Roman provinces and browse them accordingly. The web application has been implemented using the OpenLayers API, for which CartoDB provides a plugin that makes importing a map layer quite straightforward.





In the first screenshot above you can see how the frequency of annotations and places is rendered as a heat map: the higher the number of annotations, the "hotter" the colour of the blob at a given location. Places without annotations are represented too, in order to give the user an idea of all the locations available and not only of those annotated by PELAGIOS contributors, but more heavily annotated places are still "hotter" than the others.


Once a province has been clicked and the heat map is shown, the user can search for places within the province borders. Once a place has been selected (historical names are used, the ones represented within PELAGIOS), the map zooms to the place's coordinates and the available annotations are shown in a paginated list. The labels used in the list are the ones provided with the linked data representation of the annotation; where no label is available, the link itself is used instead. The awld library, developed by Sebastian Heath of Nomisma, shows an HTML representation of the annotation in a pop-up when hovering over the links.

All requests to gather the annotations' representations are made client side by the browser, exploiting the PELAGIOS API's support for CORS (Cross-Origin Resource Sharing), which allows JavaScript code to make requests to a host different from the one the code was downloaded from. Most of the REST services deployed to date do not offer this support, which hinders reusing them directly from JavaScript applications.
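To make the mechanism concrete: a server opts in to CORS by sending an Access-Control-Allow-Origin response header, and the browser lets the script read the response only if that header matches the page's origin or is the wildcard. The sketch below mimics that check for the simplest case (no credentials, no preflight); the function name is hypothetical and the real check is performed by the browser, not by application code.

```python
from typing import Mapping


def cors_allows(origin: str, response_headers: Mapping[str, str]) -> bool:
    """Simplified version of the browser's check for a simple cross-origin request."""
    # HTTP header names are case-insensitive.
    headers = {name.lower(): value for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    return allowed == "*" or allowed == origin


page_origin = "http://pelagios.ecs.soton.ac.uk"
open_service = cors_allows(page_origin, {"Access-Control-Allow-Origin": "*"})
closed_service = cors_allows(page_origin, {"Content-Type": "application/json"})
```

A service that omits the header (the second case) can still be called from server-side code, but a browser will refuse to hand its response to a script loaded from another origin.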

Future work will include the possibility to select annotations directly from the map by selecting an area of interest.
This work has been made possible by the support of the PELAGIOS project, and I would like to thank in particular Leif Isaksen, Elton Barker, and Simon Rainer, who provided ideas, support and feedback.

Gianluca Correndo
Research Fellow, WAIS group, Electronics and Computer Science, University of Southampton
