Friday 8 June 2012

Pelagios at the Linked Ancient World Data Institute (#lawdi)

Last week (31 May - 2 June) the Institute for the Study of the Ancient World, NYU, hosted a workshop on linked data in the ancient world. Pelagios were well represented, as Leif kicked off the invited presentations with a discussion of the difference between the semantic web and linked data, while I brought up the rear with a personal reflection on what the evolving digital world might mean for a Classical Studies researcher or student. (All presentations can be found here.)

Here I’d like to present 5 take-home points:

1. Of the different approaches to tying together resources on the web, linked open data seems the best bet, and not just because that's what Pelagios is doing! Linked open data uses a decentralised model in which participants agree on certain stable identifiers for things (such as places or names) and a way of mapping their data to them. So, for example, Pelagios uses Pleiades identifiers for ancient places and something known as RDF triples for expressing the relationship. We find that, by doing this, authority is diffused through the Pelagios ecosystem, meaning that there is no single point of failure (unless Pleiades fails, and, if that happens, we're all screwed anyway!) and that the extent to which projects annotate their data depends on the extent to which they want to hook into the network. Above all, as Sean Gillies, Pleiades’ head developer, has already emphasised in a previous post, it means doing what works.

2. Ok, but what difference does it make if your data is linked? Well, one great example provided at LAWDI by Andrew Meadows of Nomisma concerned coins. Within the world of linked data, it's now possible to discover, map and analyse not only find-spots (where the coins are found), but also where the same coins were minted and even the mines from which their metals derive. These data provide hitherto unparalleled access into the political and cultural deep structure that underpinned all kinds of interactions in the ancient world.

3. At one level, this kind of work represents a paradigm shift of sorts. The lone humanities scholar could hardly be expected to provide and analyse all these data by him or herself; linked data presupposes cooperation. But there is also a bigger point. If I think about my own experiences in Hestia, GAP and now Pelagios, it’s not only the case that each project has led to further, and more involved, collaboration; at each point where new skills or tools have been needed, we have found the person to carry out that work and brought them onto the team. Linking data means, when all is said and done, linking with people. Which is fun!

4. While formal collaborations are not the usual humanistic way of doing things, linking data is what scholars have been doing all the time, as evident in footnoting. But scholarship is not only about referring to some other data of some kind; the best scholars chase up the connections. So, for example, the late, great, Oxford don, Don Fowler, writes (in the chapter “On the Shoulders of Giants” of his book Roman Constructions, Oxford, 2000, p.116):
“Classicists have always been concerned with ‘parallels’ – with what goes after the magic word ‘cf.’... What has not been clear with the traditional citation of parallel passages is what the point of the activity is, how the parallels affect the interpretation of the text.”
With this abbreviation “cf.”, which derives from the Latin conferre, Fowler plays upon its meaning to compare or “to bring together”. Imagine reading a footnote and being able to check the ancient source or modern scholar cited, or find out what other materials (images, documents) relate to the place or person under investigation, simply by clicking on a link. This might be blue- or pie-in-the-sky thinking for present publications, but it will soon be possible in ISAW papers, where individual contributions will be identified down to the paragraph level, meaning that any paragraph can be cited, or tweeted, at will. Reading is going to get a lot more interactive.

5. Finally, this idea of linked open data is a powerful metaphor not only for thinking about our own world (and especially the internet) but also for approaching the ancient world. At the beginning of his enquiry (‘historia’) into why the Greeks and Persians came into conflict, Herodotus describes how he ‘came upon towns of men both small and great alike, for of the places that were once great, most have now become small, while those that were great in my time were small before’ (1.5.3). Like an Odysseus wandering the seas and coming to know the minds of many men (Homer, Odyssey 1.3), Herodotus writes about a world in which a people forcibly relocated to Persia ultimately derive (or so they claim) from refugees from Troy (the Paeonians, 5.13), where places as far flung as Marseilles and Cyprus are brought together to compare the meaning of a word (5.9), and where the river Ister (Danube) and the Nile frame the Histories’ geography (2.26, 33-34; 4.50, 53). In a world that is linked together in a myriad of different ways, investigations require making myriad uses of connections. Herodotus would have approved.
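
To make point 1 a little more concrete, here is a minimal sketch in Python of how two independent projects might describe their material as triples that point at the same place identifier. It uses plain tuples rather than a real RDF library, and almost everything in it is an assumption for illustration: the project and document URIs are made up, the predicate name is hypothetical (it is not the vocabulary Pelagios actually uses), and the place URIs are only in the Pleiades style and should be treated as placeholders.

```python
from collections import defaultdict

# Illustrative place URIs in the Pleiades style; the numbers are placeholders.
PLACE_A = "http://pleiades.stoa.org/places/579885"
PLACE_B = "http://pleiades.stoa.org/places/000000"

# Hypothetical predicate, invented for this sketch.
REFERS_TO = "http://example.org/vocab#refersTo"

# Each triple is (subject, predicate, object). The subjects are made-up URIs
# standing in for a text passage and two coin records held by separate projects.
project_a = [
    ("http://example.org/project-a/text/1", REFERS_TO, PLACE_A),
]
project_b = [
    ("http://example.org/project-b/coin/42", REFERS_TO, PLACE_A),
    ("http://example.org/project-b/coin/43", REFERS_TO, PLACE_B),
]


def group_by_place(*datasets):
    """Index every annotated resource under the place URI it refers to."""
    index = defaultdict(list)
    for triples in datasets:
        for subject, predicate, obj in triples:
            if predicate == REFERS_TO:
                index[obj].append(subject)
    return dict(index)


linked = group_by_place(project_a, project_b)
print(linked[PLACE_A])
```

Neither project needs to know about the other, and nothing here fetches the place URIs over the network: the two datasets join up simply because they use the same string to name the same place.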


  1. Great summary of a great event :-) Only thing I'd add is that if Pleiades goes down we're not necessarily screwed for several reasons:

    1. Because their data is open it would be possible to create a replacement service and because it's registered under it could probably be run under the same domain name too so people wouldn't need to repoint their URIs.

    2. Even if we had to repoint URIs, it would just be the domain rather than the slugs, which is a relatively trivial operation.

    3. Even if we couldn't do that, all the annotations would still merge together perfectly. The URIs act as shared IDs even if they're not resolvable. The only thing we'd lose is the data from Pleiades, which is why it doesn't make sense to centralize data there.

    The same goes for the Pelagios API - not only is it entirely replaceable but we should assume that it _will_ be imitated and replaced and improved upon. This is what makes the system so decentralized, which leads to both robustness and mutability (both good and bad).

    1. My comment was only meant in jest... :) But this fuller explanation of the robustness of the Pelagios ecosystem even I find helpful. Thanks Leif! Perhaps you'd like to explain RDF now?...