Pelagios

Friday, 15 November 2013

The nesting of EAGLE within Pelagios

In our previous post we introduced what EAGLE is and what it hopes to achieve. In this post we briefly outline some particularities of our data structure that show what we are bringing to Pelagios.

Most fundamentally we use the term ‘place’ as it is defined by Trismegistos Geo: this means taking ‘place’ in its broadest sense, to refer not only to towns and villages, but also to regions, districts and all kinds of micro-toponyms. All toponyms referring to a single place are listed on their individual cards, each of which has a unique TM Geo_ID number. The number itself contains no information, but creates a numerical order. If two places are identified and their cards joined, the Geo_ID number of the old card is preserved but henceforward contains only a reference to the new card.
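The card-merging rule described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `cards` table, the `redirect_to` key and the ID values are invented here and do not reflect how Trismegistos actually stores its data.

```python
from typing import Optional

# Hypothetical in-memory stand-in for TM Geo cards. Each Geo_ID maps either to
# a live card or to a stub that only points at the card it was merged into.
# All IDs and names below are invented for illustration.
cards = {
    101: {"names": ["Example Town"]},  # live card
    102: {"redirect_to": 101},         # old card, merged into 101
    103: {"redirect_to": 102},         # older card, merged earlier into 102
}

def resolve_geo_id(geo_id: int, table: dict) -> Optional[int]:
    """Follow references from old cards until a live card is reached."""
    seen = set()
    while geo_id in table:
        if geo_id in seen:
            raise ValueError(f"redirect loop at Geo_ID {geo_id}")
        seen.add(geo_id)
        target = table[geo_id].get("redirect_to")
        if target is None:
            return geo_id   # live card found
        geo_id = target     # keep following the chain
    return None             # ID not present in the table

current_id = resolve_geo_id(103, cards)
```

Because old numbers are never reused, an identifier cited in an older publication or dataset can still be resolved to the current card.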

Trismegistos Geo lists two kinds of places: ancient places attested in both literary and documentary sources, and modern places insofar as an ancient document has been found there. Sometimes no information about the ancient toponym is available, and the findspot of an ancient text has to be recorded under its modern name. With regard to ancient places, it is not always clear what is a real toponym and what is a common noun that refers to a geographical item (also called an appellative in linguistic studies). In this matter, Trismegistos follows the practical rule that any toponym listed in the geographical index of a publication is also listed in the geographical database. Trismegistos Geo is also adding Pleiades IDs for some locations, in order to facilitate the recognition of geographical entries in other databases. In addition, the cards store all names and variants; among them a standard name is chosen for both the ancient and the modern name. Moreover, every place is ascribed to a modern country, an ancient region and a Roman provincia, each in a separate field. The standard name for the modern country is the one used in English, and the correspondences between each modern country or region and the ancient provinces are those in use at the Epigraphic Database Heidelberg.
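To make the card layout described above more concrete, here is a minimal sketch of such a record as a Python dataclass. The class name, the field names and the Geo_ID value are assumptions made for this illustration; they are not the actual Trismegistos schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PlaceCard:
    """One card per place; field names are hypothetical."""
    geo_id: int                                   # opaque identifier
    standard_ancient_name: Optional[str] = None   # may be unknown
    standard_modern_name: Optional[str] = None
    variants: List[str] = field(default_factory=list)
    modern_country: Optional[str] = None          # English name
    ancient_region: Optional[str] = None
    roman_provincia: Optional[str] = None
    pleiades_id: Optional[str] = None             # only set for some places

    def all_names(self) -> List[str]:
        """Standard names first, then variants, without blanks or duplicates."""
        names = [self.standard_ancient_name, self.standard_modern_name, *self.variants]
        return list(dict.fromkeys(n for n in names if n))

# The Geo_ID is an invented placeholder; region and Pleiades ID are left unset.
gades = PlaceCard(
    geo_id=1,
    standard_ancient_name="Gades",
    standard_modern_name="Cadiz",
    variants=["Gadir", "Gades"],
    modern_country="Spain",
    roman_provincia="Baetica",
)
```

Keeping country, region and provincia in separate optional fields mirrors the point that a findspot may be known only by its modern name.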

Aligning the inscriptions in Trismegistos will mean that the “annotated thing” will not only represent the most up-to-date unique entry for that text but will also link in turn to multiple independent editions of the same text where they exist, and indeed to all quality-curated editions from the EAGLE BPN. In this way we will help minimize the possibility of duplicating records for the same place.

In the long term, we look forward to aligning both Trismegistos and Pleiades to Wikidata, in order to bring together the richness of both of these gazetteers. As we see it, establishing a network of gazetteers—one of the aims of Pelagios 3—is a highly valuable step towards harmonizing practice and making content reusable and extendable. We look forward to working with the Pelagios team to take linked ancient world data one step further in terms of data networking and interoperability, and together help facilitate research in all disciplines of the field, digital or otherwise.

Thursday, 14 November 2013

The EAGLE flies with Pelagios

EAGLE, the Europeana network of Ancient Greek and Latin Epigraphy, is joining Pelagios. EAGLE is itself a Best-Practice Network (BPN), co-funded through the ICT-Policy Support Programme of the European Commission, and aims to create a new online archive for epigraphy in Europe. As part of Europeana’s multi-lingual online collection of millions of digitised items from European museums, libraries, archives and multi-media collections, EAGLE will link and connect, using Linked Open Data (LOD) best practice, thousands of inscriptions, photos of inscriptions and related contextual items in a single readily-searchable platform.

The project will make available the vast majority of surviving inscriptions from the Greco-Roman world, complete with the essential information about them and, for the most important of them, one or more translations. By joining Pelagios, EAGLE will be able to connect with other major online projects about the Ancient World and make its data accessible to other aggregator and LOD projects, increasing the quality, usability and accessibility of the data provided by the BPN. For example, our partner Trismegistos (KU Leuven) has gathered geographical information concerning the provenance of the inscriptions listed by the major content providers: a total of some 35,235 place records and 124,569 place attestation records.

The EAGLE BPN looks forward to the possibilities of connecting materials that have for a long time been viewed only in isolation as a result of separation and localism. Data-wise, there are four tasks towards achieving this vision:

To make all content available in Europeana, the largest culture and heritage aggregator in Europe (#AllezCulture).
To use Wikidata for our translations of inscriptions. By gathering all existing translations of inscriptions and providing an easy-to-edit online database of translations, EAGLE aims to enrich both the data present in Wikimedia Commons, with curated content from the databases, and the database contents themselves, with contributions from the wider public.
To produce an open, interoperable format. In the EAGLE portal, data will be available in XML files compliant with the EpiDoc/TEI guidelines.
To produce open vocabularies that align the existing models used by single content providers. These will provide many other URIs which, we hope, will become a way to further connect other data on the basis of Object Type, Material and Type of inscription, to mention just some.

We at EAGLE are excited about joining Pelagios and look forward to helping online research about the ancient world take off.

Thursday, 17 October 2013

IWP2: Pelagios and the Beakers of Vicarello

The last few weeks have been a busy time for the Pelagios team. In parallel to kicking off our work on linking gazetteers as part of our first Infrastructure Workpackage (IWP 1), we also started to assemble some foundational bits and pieces of our second IWP - which is concerned with building up the data and annotation infrastructure.

Prelude: the Itinerarium Gaditanum

Jump cut to Vicarello, Italy, mid-19th century: excavations at the Aquae Apollinares baths in 1852 reveal three cylindrical vessels made of silver, between 95 and 153 mm in height. Excavations in 1863 reveal a fourth vessel of a similar kind. Although they differ in the details, each vessel has engraved on its surface the Itinerarium Gaditanum, the land route between Gades (Cadiz) and Rome, listing between 104 and 110 road stations along the way and the distances between them in milia passuum (Roman miles of a thousand paces, or approx. 1481 meters each).
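For readers who want to turn the engraved distances into modern units, the arithmetic is a single multiplication. The sketch below uses the approximate figure of 1481 meters per Roman mile quoted above; the stage distances in the example are invented and are not readings from the beakers.

```python
ROMAN_MILE_M = 1481  # meters per Roman mile, the approximate figure quoted above

def roman_miles_to_km(miles: float) -> float:
    """Convert a distance in Roman miles to kilometers."""
    return miles * ROMAN_MILE_M / 1000

def route_length_km(stage_distances) -> float:
    """Total length of a route given its stage distances in Roman miles."""
    return roman_miles_to_km(sum(stage_distances))

# Three invented stages of 12, 24 and 16 Roman miles (52 in total).
example_km = route_length_km([12, 24, 16])
```

Summing the miles first and converting once avoids accumulating small floating-point errors over a route with more than a hundred stages.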

Photo by Ryan Baumann CC-BY 2.0

The Vicarello Beakers, as they are now frequently referred to, have traditionally been identified as miniature replicas of a milestone probably erected in Gades, perhaps similar in design to the Miliarium Aureum (the Golden Milestone) in Rome. Originally, through the study of the different stations of the route, experts had dated them to various points between the reigns of Augustus and Tiberius. But recent palaeographic studies and comparisons with late documents such as the Antonine Itinerary or the Itinerarium Burdigalense, as well as their resemblance to the missorium of Theodosius, suggest a dating to the late third or early fourth century AD.

Their handy number of toponyms, as well as the fact that there are images and transcriptions available online already, makes the Vicarello Beakers an excellent test case to teach our data infrastructure a few new tricks. Technical details about the upgrades it's about to receive (complete with RDF samples and pointy brackets) will appear on our Wiki and through our mailing list in due course. But, for the purpose of this blog post, let me just give you a sneak preview of some of the things our upgraded data model can do.

Linked Data, Open Annotation, RDF, What?

You may recall that Pelagios is based on the principles of Linked Open Data, and that we have chosen the Open Annotation Data Model as the conceptual basis for our common vocabulary. These foundations will not change. But with a growing network of partners, more diverse content, and increasing amounts of data, it has become painfully clear that our initial data model from the days of Pelagios 1 and 2 has reached its limits. We now have so many partners and so much content that data for major places has become practically unmanageable - just try to find something useful in our data about Rome!

Mapped Pelagios annotations for one of the Vicarello Beakers

So what are the things our new data model will improve?

First and foremost, our new model allows for richer item metadata. There is now a much cleaner separation between information about the item, and information about the places that relate to it (and how). There is room to encode dates and temporal characteristics, categories, authorship, languages used in the source document - ordering dimensions which help us to get more structure into the pile of "anonymous place references" we agglomerated through our first two project phases.
In line with a richer metadata model, we have also adopted the FRBR distinction of Work and Expression. In FRBR terminology, the Vicarello Beakers are a Work - "a distinct intellectual or artistic creation". Each of the four beakers is termed an Expression of this Work. This is another straightforward ordering principle, which helps us to get more structure and hierarchy into our data.
One of the changes that happened in the transition between the (now deprecated) Open Annotation Collaboration model and the new Open Annotation model is support for multiple "annotation bodies". I'll refer to the OA spec for details. But as far as Pelagios is concerned, this change allows us to represent the different "faces" of a place reference in a source document - logical mappings to (a) gazetteer URI(s), its precise transcription, different images of it, etc. - in a much simpler way.
Toponyms in a document may follow a certain sequence or layout. The Vicarello Beakers are a prime example of this: they lay out their toponyms in a list with four columns, according to the sequence of the places along the route between Gades and Rome. We're experimenting with ways to record the logical ordering of toponyms in a document, and to put it to use for visualization.
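The four points above can be pictured with a small sketch in plain Python. It is not the Pelagios RDF vocabulary and not the Open Annotation serialization: the class names, field names, URIs and transcriptions are all invented for illustration, and only the overall shape (a Work with Expressions, place references with several "bodies", and an explicit position in the sequence) follows the description.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class PlaceReference:
    transcription: str                  # the toponym as written on the object
    gazetteer_uris: List[str]           # one or more gazetteer mappings
    image_urls: List[str] = field(default_factory=list)
    position: int = 0                   # order of the toponym in the document

    def bodies(self) -> List[Tuple[str, str]]:
        """All 'faces' of this reference as (kind, value) pairs."""
        out = [("transcription", self.transcription)]
        out += [("gazetteer", uri) for uri in self.gazetteer_uris]
        out += [("image", url) for url in self.image_urls]
        return out

@dataclass
class Expression:                       # e.g. one individual beaker
    label: str
    references: List[PlaceReference] = field(default_factory=list)

    def ordered_toponyms(self) -> List[str]:
        """Transcriptions in the order they appear in the document."""
        refs = sorted(self.references, key=lambda r: r.position)
        return [r.transcription for r in refs]

@dataclass
class Work:                             # e.g. the Vicarello Beakers as a whole
    title: str
    expressions: List[Expression] = field(default_factory=list)

# Invented sample data; the URIs are placeholders, not real gazetteer entries.
roma = PlaceReference("ROMA", ["http://example.org/gazetteer/roma"], position=2)
gades = PlaceReference("GADES", ["http://example.org/gazetteer/gades"], position=1)
beaker = Expression("Beaker 1", [roma, gades])
work = Work("Vicarello Beakers", [beaker])
```

Storing the position on each reference, rather than relying on list order, keeps the sequence intact even when annotations arrive out of order.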

This simple mashup shows the toponyms from the four Vicarello Beakers on a map. There's an information box with the Work metadata at the bottom, and if you look to the top-right, you will find a small layer menu which lets you switch places - and the path indicating the toponym sequence - on and off individually for each beaker. Click on a place, and a popup will show you the transcription from the Beakers, along with the gazetteer reference from Pleiades, which corresponds to the place.

What's noteworthy about this demo, however, is not so much the map itself - but rather that the map is generated completely automatically from a Pelagios RDF file, containing item metadata and OA annotations. (You can grab the RDF source file here.) In essence, these are also our first baby steps towards the Visualization Workbench - which is the objective of our third infrastructure workpackage.
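To give a feel for what "generated automatically" could involve, here is a rough sketch that turns an ordered list of resolved places into GeoJSON, with one point per place and a line for the toponym sequence. It is not the code behind the demo: the real input is an RDF file, whereas this sketch assumes the places have already been extracted as `(name, lon, lat)` tuples, and the coordinates shown are rough approximations.

```python
import json

def to_geojson(stops):
    """Build a GeoJSON FeatureCollection from (name, lon, lat) tuples.

    `stops` is assumed to be sorted already in route order.
    """
    features = [
        {
            "type": "Feature",
            "properties": {"name": name, "order": i},
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
        }
        for i, (name, lon, lat) in enumerate(stops)
    ]
    # A LineString needs at least two positions, so only add the path then.
    if len(stops) >= 2:
        features.append({
            "type": "Feature",
            "properties": {"kind": "route"},
            "geometry": {
                "type": "LineString",
                "coordinates": [[lon, lat] for _, lon, lat in stops],
            },
        })
    return {"type": "FeatureCollection", "features": features}

# Approximate coordinates for the two ends of the route (longitude, latitude).
stops = [("Gades", -6.29, 36.53), ("Roma", 12.48, 41.89)]
collection = to_geojson(stops)
geojson_text = json.dumps(collection, indent=2)
```

Most web mapping libraries can display a collection like this directly, which is why GeoJSON is a convenient intermediate format between annotations and a map.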

In the meantime, stay tuned for the exciting sequel to "Pelagios and the Beakers of Vicarello" - in which the Pelagios team will tackle their next Early Geospatial Document, and where we will shed some light on the workflow we use to compile our data, and how we transform it to Open Annotations.
