Geographical information retrieval of historical regions
Over the last few weeks I have been developing an API and an interface on top of the PELAGIOS API in order to retrieve historical places, and their related annotations, by using some geographical context. The problem of superimposing a geographical representation over a data collection is not novel. In modern data collections, for example the data produced by public administrations, organisations use geographical nomenclatures such as administrative subdivisions, or NUTS when the data consists of statistical observations.
For historical data collections this is not always feasible, since the administrations of past kingdoms did not always provide a sharp definition of their boundaries (sharp in the modern sense, with precise coordinates for the regions' shapes), nor a deep subdivision that could help our information retrieval task.
Fortunately for PELAGIOS users, some of the boundaries of the provinces of the Roman Empire have been made available as shapefiles, which can help us browse the wealth of annotations provided as geolocated linked data. The shapefiles in question were digitised from Barrington Atlas rasters (georegistered and supplied by AWMC) by Pedar Foss at DePauw University in 2007 within the context of the MAGIS project, and have been provided by Tom Elliott from the Institute for the Study of the Ancient World, New York University. The regions represented can be seen in the figure below.
This post is not about a single API or a single interface that can be implemented on top of PELAGIOS data and services. Instead, it aims to offer some insight into how to implement geographical browsing by using open source tools.
In order to visualize places and annotations from the PELAGIOS API we exploited its geographical search by bounding box. Once we can retrieve the places in the PELAGIOS network contained in a bounding box, we are halfway to filtering them by any polygon. With a GIS we could, in fact, query the data by polygon directly. Unfortunately, that would require all the annotation data and the regions' polygons to be stored in the same database, which goes against the distributed nature of the linked data paradigm and is not feasible in general scenarios. It would mean providing a copy of the PELAGIOS data to every interested user, who would then be forced to install GIS software and host their own polygons.
Instead, we decoupled the management of the annotation data from the geographical retrieval features, trying to minimize the amount of software to install and to reuse as much as possible the data and services already provided. For this reason we uploaded the polygons we were interested in to CartoDB, a web-enabled version of PostGIS. CartoDB offers limited free use of its web platform, and users can download the open source version and install it on their own server if and when needed. CartoDB lets developers run SQL queries over HTTP requests, which makes it easy to integrate with other systems.
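As a rough illustration of the SQL-over-HTTP idea, the sketch below only builds the request URL for a region lookup; it does not contact any server. The account name `example`, the table name `roman_provinces` and the column `name` are assumptions made for this example and are not taken from the actual setup. The endpoint path follows the pattern of CartoDB's hosted SQL API and may differ for a self-hosted installation.

```python
from urllib.parse import urlencode


def region_query(region_name, table="roman_provinces", name_column="name"):
    """Build a SQL statement selecting the geometry of one named region.

    `table` and `name_column` are hypothetical; `the_geom` is the geometry
    column name CartoDB uses by default.
    """
    # Double any single quote so the name is a valid SQL string literal.
    safe_name = region_name.replace("'", "''")
    return (
        f"SELECT {name_column}, the_geom "
        f"FROM {table} WHERE {name_column} = '{safe_name}'"
    )


def sql_api_url(account, sql, fmt="GeoJSON"):
    """Encode a SQL statement as a GET request against the SQL endpoint."""
    base = f"https://{account}.cartodb.com/api/v2/sql"
    return base + "?" + urlencode({"q": sql, "format": fmt})


if __name__ == "__main__":
    print(sql_api_url("example", region_query("Aquitania")))
```

Requesting the result as GeoJSON keeps the client side simple, since the polygon coordinates can then be read with any JSON parser.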
As said earlier, once we can query by bounding box we are halfway to being able to query by polygon. By querying CartoDB we can retrieve the shape of a region by its name (e.g. Aquitania in the figures below). To retrieve all the PELAGIOS places contained in the Aquitania region, we first query the PELAGIOS API for the places contained in the bounding box of the polygon, and then filter those places by testing topological containment against the retrieved shape.
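The two-step selection can be sketched in plain Python as follows. This is a minimal illustration and not the code of the actual service: it assumes the region is a single ring of (longitude, latitude) pairs without holes, uses a standard ray-casting test for containment, and represents places as dictionaries with hypothetical `id`, `lon` and `lat` fields. The bounding box test stands in for the query that would be sent to the PELAGIOS API.

```python
def bounding_box(ring):
    """Return (min_lon, min_lat, max_lon, max_lat) for a ring of (lon, lat)."""
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return (min(lons), min(lats), max(lons), max(lats))


def in_bbox(place, bbox):
    """Coarse test: is the place inside the rectangle around the region?"""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= place["lon"] <= max_lon and min_lat <= place["lat"] <= max_lat


def point_in_ring(lon, lat, ring):
    """Ray casting: count how many edges a ray going east from the point crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def filter_by_polygon(places, ring):
    """Step 1: keep places in the bounding box. Step 2: keep those in the ring."""
    bbox = bounding_box(ring)
    candidates = [p for p in places if in_bbox(p, bbox)]
    return [p for p in candidates if point_in_ring(p["lon"], p["lat"], ring)]


if __name__ == "__main__":
    # Toy triangular "region" and three made-up places.
    ring = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
    places = [
        {"id": "a", "lon": 1.0, "lat": 1.0},  # inside the triangle
        {"id": "b", "lon": 3.0, "lat": 3.0},  # in the bounding box only
        {"id": "c", "lon": 9.0, "lat": 9.0},  # outside the bounding box
    ]
    print([p["id"] for p in filter_by_polygon(places, ring)])
```

Real province shapes may be multi-part polygons or contain holes, in which case a geometry library or a PostGIS containment query would be a more robust choice than this hand-written test.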
The activities involved in extracting places by polygon are represented in the diagram below and involve three actors: the service implemented by the ECS dept. in Southampton (named ECS), the PELAGIOS API, and the CartoDB instance used for this scenario.
Roman provinces up to AD 117 visualised in CartoDB
Selection of PELAGIOS places by using the region's bounding box
Filtering of those resources by using the polygon's topological containment