Tuesday, 7 January 2014

The day of Pelagios: Berlin 11.12.13

Before the seasonal break of mince pies and Glühwein, the Pelagios team held a meeting in Berlin to address a range of issues relating to geospatial data aggregation and analysis. That we held it in Berlin reflected the fortunate co-presence there of a number of different digital humanities initiatives. Our hosts were the German Archaeological Institute (DAI): its ICT Director, Reinhard Förtsch, along with his researchers Philipp Gerth and Wolfgang Schmidle. Others joining us were:
The meeting presented us with the opportunity to talk first about Pelagios and its evolution. The Pelagios model of phases 1 and 2 uses annotations to facilitate linking (in our case through common references to places) rather than trying to unify different models. By enabling linking, each partner’s site also serves as a gateway to another, thereby maximizing the potential discoverability of these resources and avoiding fruitless attempts at creating individual portals that are supposed to do everything. Yet, even if we are decentralized, for linking to be facilitated we need a lightweight structure.
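To make the linking idea concrete, here is a minimal sketch in Python. It is an illustration only, not the Pelagios implementation: the partner URLs, the gazetteer URIs and the function names are all hypothetical. Each annotation simply pairs a partner resource with a place URI, and two resources count as linked when they reference the same place.

```python
from collections import defaultdict

# Hypothetical annotations: (partner resource URL, gazetteer place URI).
# None of these URLs are real; they only illustrate the shape of the data.
ANNOTATIONS = [
    ("http://partner-a.example.org/texts/1", "http://gazetteer.example.org/places/athens"),
    ("http://partner-b.example.org/coins/42", "http://gazetteer.example.org/places/athens"),
    ("http://partner-b.example.org/coins/43", "http://gazetteer.example.org/places/sparta"),
]


def index_by_place(annotations):
    """Group resources under the place URI they refer to."""
    index = defaultdict(set)
    for resource, place in annotations:
        index[place].add(resource)
    return index


def related_resources(resource, annotations):
    """Return other resources that share at least one place with `resource`."""
    index = index_by_place(annotations)
    related = set()
    for res, place in annotations:
        if res == resource:
            related |= index[place]
    related.discard(resource)
    return sorted(related)
```

No partner has to adopt another partner's data model here: the shared place reference is the only common ground, which is what keeps the structure lightweight.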

In Pelagios phase 3, work is concentrating on three areas. Since we are extending our model into new regions and time periods, gazetteers (essentially databases of place names) are crucial. Again, our approach is to enable linking between resources rather than trying to build a super gazetteer that contains all place names over time. With the aim of aligning gazetteers, we are currently investigating interoperability: what might a gazetteer 'ecosystem' look like? Options include using popular gazetteers as a backbone, though each comes with drawbacks (the Getty Thesaurus of Geographic Names is heavily curated, minimizing community involvement, while Geonames includes extraneous information like every hotel in Berlin), and using the SKOS vocabulary's 'close match' property to enable links between gazetteers. For the meeting we brought along a first preview of our 'cross gazetteer search', which runs on top of the linkages between the datasets from Pleiades and DARE. A screenshot of the user interface to the system is shown below.

Figure 1. Cross-Gazetteer Search Preview UI
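As a rough illustration of how a search might run across gazetteers joined by 'close match' links, here is a small Python sketch. It is not the code behind the preview: the URIs, the names and the functions are hypothetical, and it makes one simplifying assumption, namely that linked records can be merged into a single cluster. SKOS itself does not define 'close match' as transitive, so a real system would need to treat chains of links with more care.

```python
# Hypothetical gazetteer records: URI -> known names (illustrative data only).
NAMES = {
    "http://gaz-a.example.org/place/1": ["Athenae", "Athens"],
    "http://gaz-b.example.org/place/100": ["Athenae"],
    "http://gaz-a.example.org/place/2": ["Sparta"],
}

# Hypothetical 'close match' links between records in different gazetteers.
CLOSE_MATCHES = [
    ("http://gaz-a.example.org/place/1", "http://gaz-b.example.org/place/100"),
]


def find(parent, uri):
    """Follow parent pointers to the cluster root, compressing the path."""
    while parent[uri] != uri:
        parent[uri] = parent[parent[uri]]
        uri = parent[uri]
    return uri


def build_clusters(names, links):
    """Merge linked URIs into clusters (assumes links can be chained)."""
    parent = {uri: uri for uri in names}
    for a, b in links:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        parent[find(parent, a)] = find(parent, b)
    clusters = {}
    for uri in parent:
        clusters.setdefault(find(parent, uri), set()).add(uri)
    return list(clusters.values())


def cross_search(query, names, links):
    """Return every cluster in which some record carries the queried name."""
    q = query.casefold()
    hits = []
    for cluster in build_clusters(names, links):
        if any(q == n.casefold() for uri in cluster for n in names.get(uri, [])):
            hits.append(sorted(cluster))
    return sorted(hits)
```

Searching for "Athens" matches a name held by only one of the two records, yet the result contains both URIs, because the link carries the match from one gazetteer to the other.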

Our second task is to enable annotations to be made on primary data (both textual and visual), so that place names can be identified. Initial attempts at building a toolkit for annotating texts will be discussed in forthcoming posts on this blog. As for the challenge of annotating maps, two questions are particularly relevant: where can we get computers to do the heavy lifting? And where do humans have to come into the loop? Finally, we are also investigating ways of visualizing the resources in our network. Our heat map provides an early indication not only of the spatial spread but also of the intensity of the resources.

These three areas—relating to gazetteer interoperability, annotation methods and visualization—were the subjects of discussion.

The DAI started work in May on building a gazetteer of the Institute’s archaeological and bibliographical records. They have also been working with Wikidata and Wikimedia to explore how knowledge about the Roman frontier (the ‘Limes’) can be aggregated and used. One such example is an interactive timeline (seen below), showing how the border changed over time. Markus Schnöpf is currently working on a gazetteer for the Islamic world, which could help provide the basis for future Pelagios activity with Islamic texts. Meanwhile, at Stanford, Josh Ober’s team are developing a digital version of Mogens Hansen’s Polis inventory, which will not only provide a comprehensive dataset of settlements in ancient Greece, but also allow them to be searched in various ways using a simple browser plug-in map. (Watch this space for developments.) These projects join a list that includes Pleiades, the Digital Atlas of the Roman Empire, Chinese Historical GIS, and Past Place, as the key protagonists taking the first steps towards creating a gazetteer ecosystem.

Figure 2. An interactive timeline of the Roman ‘Limes’ (frontier)

Annotation methods
With Greg Crane’s Humboldt Professorship at the University of Leipzig, various new initiatives are being launched with the aim of utilizing digital resources for the study of the ancient world. One of these, the Historical Languages eLearning Project, is experimenting with e-learning strategies for teaching ancient Greek and Latin based around annotation. Pelagios could work with this team to help disambiguate names that prove too challenging for our automated workbench, or to experiment with using games to scale up annotation over larger numbers of documents. The ARIADNE project, here represented by Martin Doerr and Gerald Hiebel, is laying the foundations for inferencing over data rather than just data retrieval (which is what Pelagios focuses on). In particular, the CIDOC-CRM model adopted by ARIADNE uses a formal structure for describing concepts and relationships that, while more complex semantically, is compatible with the Pelagios annotation model; moreover, the results of Pelagios can be used as the basis for CRM-compliant data.

Throughout the discussion, we were also concerned with visualization developments that can help in the understanding and analysis of potentially massive datasets. Dirk Wintergrün presented on GeoTemCo, a platform for visualising spatio-temporal data. This looks potentially very powerful, and will be especially interesting once temporal content (derived from e.g. publication dates, person references and other sources) is combined with place annotations. We give one example below, since it provides a new way of looking at data that members of the Pelagios team produced in a previous project, GAP. Figure 3 shows GAP data from Herodotus and Pausanias in GeoTemCo, enabling the analysis and comparison of geographical references in these different books. Marian Dörk also demonstrated a wide range of exciting visualization possibilities that could answer specific research questions and appeal more broadly to the general public.

Figure 3. A comparison of places in Herodotus and Pausanias, using GAP data in GeoTemCo

