Pelagios: November 2011

Wednesday, 30 November 2011

How do we balance supporting novice spatial users alongside experts? Or, is geospatial analysis necessarily GIS?

The task for table 5 at the #jiscGEO breakout discussion was to come up with a recommendation about how to balance the needs of newcomers to geospatial analysis with those of ‘experts’ in using Geographical Information Systems (GIS). To have greatest impact and value for the community, there was general agreement that any strategy should address the needs of the subject specialist user community. This means exploring how the technology can be made to work for the user, rather than necessarily ‘up skilling’ the user to become a technical expert (say, in GIS).

By focusing simply on technology training, there is the danger that, as well as being seen as irrelevant, too difficult or simply just boring for users (academics or students), the data gets overlooked or is made to fit a ‘system’ of analysis. For example, one problem of using GIS in humanities is the issue of ‘fuzzy’ data. This isn’t just a case of the system failing to cope with fuzziness: it also betrays an underlying assumption that data can, and should be, disambiguated and clear. For humanists, however, the questions driving research are often precisely those that look to nuance or complicate the material. We like messy results. Humanists need worry less about producing an accurate and/or truthful representation and more about how maps can be used as entry points to explore the data—this is seeing maps as part of the investigative process rather than as an end in and of themselves.

Ideally, then, users should be involved in the development, enrichment and adaptation of geospatial technologies, to make those tools work for them. Therefore, we recommend that JISC build on its contacts within the HE sector to have teams of subject specialist users (i.e. the successful projects) go into universities where there is already a JISC presence to help co-ordination, show the target group the kinds of geospatial technologies that can be used, and get them involved in shaping these tools for the future: the ‘show and tell’ unconference rolled out across the sector, as it were.

Elton, Nicola, Rasmus, Claire, Ryan, Addy

Sunday, 27 November 2011

Growing use of OAC - an inventory (no doubt incomplete) of initiatives

As the Pelagios project’s Common Ontology for Place References (COPR) is based on the Open Annotation Collaboration ontology, JISC (in the person of David Flanders) suggested a blog post on the growing use of OAC in HE and research. It’s a bit late - blogger’s block - but here it is. Many thanks to Rob Sanderson for his useful input.

A lot of the work being done is in the humanities, where research practice is more human-centric and “annotation” - with various meanings - is a core component of research, or a fundamental “scholarly primitive”. Textual studies is a particularly active area:

Stanford University has been using OAC for work on annotating digitised mediaeval manuscripts. As these are frequently illustrated, this involves annotating structured text (maybe already marked up using TEI XML) and images within the texts. This has been taken up more widely in the SharedCanvas project, whose results are being used by various libraries and universities for annotating mediaeval manuscripts, including the British Library, the Bibliothèque nationale de France, and the Bodleian in Oxford, among others.

Emblem Books are another fruitful area for annotation. These form a genre of book, popular during the 16th and 17th centuries, containing collections of emblematic images with explanatory text, usually aiming to inspire the reader to contemplate some moral issue or other. The University of Illinois at Urbana-Champaign and Herzog August Bibliothek Wolfenbüttel have been collaborating with the Emblematica Online project on using OAC for annotating digitised emblem books. This also involves annotating structured text and images, although in printed books rather than manuscripts.

The AustLit project, based at the University of Queensland in Australia, has been applying OAC to the development of scholarly critical editions, specifically for annotating variations between different versions of a literary work.

An analogous approach could be used with variants within a “stemma” or family of manuscripts. In fact a use case of our own may be provided by the HERA-funded SAWS project, which is looking at complex relationships between mediaeval Greek and Arabic manuscripts of “wise sayings”, so-called gnomologia. I will be looking into this further.

A little (but not entirely) beyond textual studies, OAC is also being used for annotating historical maps - the Digital Mappaemundi project at Drew University is looking at methods of dealing with mediaeval maps and related geographical texts - in fact these maps can be thought of as complex images with original annotations, so the model may fit very well. Also at Cornell, the YUMA Universal Media Annotator tool has been used with OAC to annotate historical map collections.

OAC has also found applications in the digital libraries and archives world (the applications are not entirely disjoint from the above):

The US National Information Standards Organization (NISO) and the Internet Archive have launched an initiative for developing standards for creating and sharing bookmarks and annotations in e-books (announced October 2011), with various publishers interested. This will take on board the work done in OAC, although the standards developed will go beyond this.

Brown University Library is developing an annotation framework for the Fedora digital repository software based on OAC, linking the annotations created directly with TEI-encoded texts in their repository, and exploring how annotations can be attached to structural and semantic elements within those documents. Brown’s Women Writers Project will provide one of the initial test cases.

MITH (Maryland Institute for Technology in the Humanities) have been collaborating with the Alexander Street Press on using OAC to store annotations on their streaming library of educational videos. As an example of what they intend, they have produced a working prototype that allows shapes to be drawn so as to select regions of video for annotation.

And just to show that the sciences are not being ignored here, BIONLP at the University of Colorado - who work on natural language processing of biological texts - are investigating the use of OAC with entities and relationships automatically mined from such texts, and the FP7 Wf4Ever (Workflow Forever) project is using OAC for annotating research objects.

Any more contributions to this list happily accepted!

Friday, 11 November 2011

The Pelagios Graph Explorer: An information superhighway for the ancient world

Just as the settlements around the Ancient Mediterranean would seem disconnected without the sea to join them, so online ancient world resources have been separated, until now. Meaning “of the sea”, Pelagios has brought this world together using the principles of Linked Open Geodata. The Pelagios Graph Explorer allows students, researchers and the general public to discover the cities of antiquity and explore the rich interconnections between them.

The Pelagios Graph Explorer
Alice is an archaeology student from Memphis, TN. When not collecting Elvis singles, she loves nothing better than to find out about cities of the past. Recently she has come across Arachne, the database of images and finds from the German Archaeological Institute. She's interested in her hometown's namesake, Memphis, Egypt, and so she types it into the search box (fortunately it's the same word in German) and finds quite a few interesting results: 21 objects and 16 sets of photos. But what do they mean? What stories do they tell? And what role did Memphis play in the ancient world? What Alice doesn't know is that there are many other open resources out there with information about Memphis, its history and material culture, and she has no way to find out.

Enter the Pelagios Graph Explorer. Using the principles of Linked Open Data, the Pelagios Explorer allows people like Alice to discover those resources (including Arachne). When she types 'Memphis' into the Explorer's search box she is presented with a graph of information that shows her a range of different resources that relate to the city. When she hovers the mouse over the pink circle, a balloon pops up about the Perseus Digital Library, which seems to have 13 separate references to Memphis. And clicking on a reference in the data view takes her straight there.

Now that's all well and good, but it rather begs the question: How would she find out about Pelagios in the first place? The answer is simple. As well as being a human interface, Pelagios is also an API, allowing resource curators to embed links right next to their own content. For instance, Carlo the classicist might be exploring the geographic flow of Herodotus's Histories using GapVis, which has lots of handy info - a map, related sites, photos, etc. But the handy 'Pelagios Graph Explorer' link takes him straight to Pelagios and even fills in the details for him. This is the power of Linked Open Data - content providers such as Arachne can open up a world of contextual information with a single link.

There's a lot more we could tell you about Pelagios - the fact that you can use the Explorer to find relationships between multiple cities for instance, or that it's an ever-growing collective of content providers committed to the principle of openness and public access. We could also tell you about the plans we have for Pelagios2 - to refine the data, improve the search facilities, and expand the community. But we think the best way to explore it is to have a go yourself. So why not check out our user guides and dive in!

Who are Pelagios?

Pelagios is a collective of projects connected by a shared vision of a world - most eloquently described in Tom Elliott’s article ‘Digital Geography and Classics’ - in which the geography of the past is every bit as interconnected, interactive and interesting as the present. Each project represents a different perspective on Antiquity, whether map, text or archaeological record, but as a group we believe passionately that the combination of all of our contributions is enormously more valuable than the sum of its parts. We are committed to open access and a pragmatic lightweight approach that encourages and enables others to join us in putting the Ancient World online. Pelagios is just the first step in a longer journey which will require many such initiatives, but we welcome anyone who shares our vision to join us in realising it.

Members of the Pelagios Team at our February 2011 kick-off workshop. From left to right: Rainer Simon, Greg Crane, Mark Hedges, Reinhard Förtsch, Mathieu D’Aquin, Elton Barker and Sean Gillies. Missing from the photo are: Leif Isaksen, Sebastian Rahtz, Sebastian Heath, Neel Smith, Eric Kansa, Kate Byrne, Tom Elliott, Alex Dutton, Rasmus Krempel, Bridget Almas, Gabriel Bodard and Ethan Gruber.

Pelagios was made possible by the following organizations:

Wednesday, 9 November 2011

What Makes Spatial Special?

One of the nice aspects of being part of the jiscGEO programme is that occasionally we're thrown slightly more philosophical questions to chew on. The most recent one is simple but broad: 'What makes spatial special?' This is hardly a new topic of course, as one of our co-projects has pointed out. A lot of people have discussed the significance of the Spatial Turn and Kate Jones has done an excellent job in summarizing many of the key arguments. Rather than repeat them here I thought I'd approach them from a different angle: 'Why has Space become special and not Time?'

On the face of it the two have a great deal in common. For a start they are both values that not only underpin virtually any kind of information you can think of, but as dimensions (or a set of them) they also form a ratio scale which enables us both to order it and calculate relationships such as the closeness and density of data. As the simpler of the two (with just one dimension to deal with, rather than two or three), time seems by far the easier value for people to engage with. And yet there are no Temporal Information Systems, no Volunteered Temporal Information, no Temporal Gazetteers, no 'Temporal Turn' to speak of. So why has space, and not time, become the darling of the digital zeitgeist? Here's my theory: Because we experience space statically but time dynamically, a social asymmetry exists which makes spatial descriptions more useful socially.

Both time and space are affected by the Inverse Square Law of Relevance: as every good hack knows, a person's interest in a topic tends to fall off the further away they are from it, temporally and spatially. Of course that's not an absolute rule, but on the whole people are considerably more interested in today's home game than they are in foreign matches from yesteryear. The difference between space and time is that populations perceive themselves as being randomly dispersed throughout space, whereas time seems to be experienced simultaneously[1]. As a result, maps appear to be universally relevant because the distribution of relevance is spread across them. In contrast, almost our entire global attention is focussed on just one (travelling) moment in time. So while a map of Britain is equally relevant to people in London, Bangor and Inverness, a timeline of Britain is not equally relevant to Saxons, Normans and ourselves because the Saxons and the Normans are dead.

Enough of the beard-stroking, why should we care? It seems to me that there are two important conclusions to be drawn from this. The first is that the importance of maps is created socially and not individually. Because their relevance is determined by having multiple points of view, they can be enormously enhanced through social Web technologies, which is why Webmapping, despite having far less functionality than GIS, has rapidly outstripped it in utility. The less obvious lesson is that despite its ubiquity, spatial relevance is not spread evenly. Sparsely populated parts of the world (i.e. most of it) are not considered highly relevant by many people. By the same token, places in which mankind congregates (cities) tend to be seen as highly relevant. We see this most clearly in the number and diversity of named places they create. Whereas unoccupied spaces tend to have just a handful of big named places, densely occupied spaces have a name for every nook and cranny. That means to create really powerful, socially relevant maps we need to start thinking about visualizing places, rather than just spaces.

And what of poor old temporal technologies? Will we ever get people to be as interested in the past as they are in the present? That's for another blog post, but if you are interested, come and join us for the NeDiMAH/JISC workshop in Greenwich on November 30th where we'll be devoting plenty of space and time to the subject.

[1] Actually, physics gives us plenty of reasons to doubt that this is the case at all, but it certainly feels that way, which is what's...er...relevant here.

Friday, 4 November 2011

SPQR triples - inscriptions and papyri

The SPQR project has produced just over half a million triples, describing approximately 57,500 inscriptions and papyri. The triples were derived from the following epigraphic and papyrological datasets:

The Heidelberger Gesamtverzeichnis der griechischen Papyrusurkunden Aegyptens (HGV), a collection of metadata records for about 65,000 Greek papyri from Egypt.
The Inscriptions of Aphrodisias, a corpus of about 2,000 ancient Greek inscriptions from the city of Aphrodisias in modern Turkey.
The Inscriptions of Roman Tripolitania, a corpus of about 1,000 inscriptions from modern Libya.

The triples can be downloaded as RDF/XML from the following links:

Perseus and Pelagios

The Perseus geospatial data now includes annotations of ancient places with Pleiades URIs. Beginning next week, the Places widget in the Perseus interface will include links to download the Pleiades annotations in OAC-compliant RDF format. These links will appear for any text with place entity markup which also has places from this dataset. We are also providing a link to search on the top five most frequently mentioned of these places in the Pelagios graph explorer.

In addition, RDF files containing annotations for the occurrences across all texts in each collection will be available from the Perseus Open Source Downloads page.
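For readers who have not met OAC before, here is a rough sketch of what a single place annotation might look like when written out as N-Triples. It is an illustration rather than a copy of the Perseus output: the namespace and the hasTarget/hasBody property names are our recollection of the OAC draft vocabulary and should be checked against the actual downloads, the function name annotation_triples is made up for this post, and every example.org URI is a placeholder.

```python
# Assumed vocabulary: the OAC draft namespace and terms as we recall them.
OAC = "http://www.openannotation.org/ns/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotation_triples(annotation_uri, target_uri, place_uri):
    """Return N-Triples lines for one annotation.

    The target is the passage or record being annotated; the body is the
    gazetteer URI of the place it refers to. (Hypothetical helper.)
    """
    return [
        f"<{annotation_uri}> <{RDF_TYPE}> <{OAC}Annotation> .",
        f"<{annotation_uri}> <{OAC}hasTarget> <{target_uri}> .",
        f"<{annotation_uri}> <{OAC}hasBody> <{place_uri}> .",
    ]


if __name__ == "__main__":
    # Placeholder URIs only; real data would use Perseus and Pleiades URIs.
    for line in annotation_triples(
        "http://example.org/annotations/1",
        "http://example.org/texts/histories#memphis",
        "http://example.org/places/memphis",
    ):
        print(line)
```

Each annotation is just three statements: what it is, what it is about, and which place it points to. That simplicity is a large part of why the model travels so well between projects.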

To produce these annotations, we used the normalized and regular place names from the Pleiades+ dataset to identify likely matches with the Perseus places, and then the longitude and latitude coordinates from each source to validate and disambiguate these matches. Places which matched via this method are annotated with an "is" relationship to the Pleiades URI. For Perseus places which were not automatically mapped to a Pleiades URI via this method, we do a second pass at matching using the location coordinates, looking for Pleiades places within a certain range of the Perseus coordinates. Places which matched via this method are annotated with a "nearby" relationship to the Pleiades URI. These mappings are all stored with the Perseus place data in our database, and are available along with the other geospatial data for occurrences of these entities in the Perseus texts.
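The two-pass approach described above can be sketched in a few lines of Python. This is a simplified illustration and not the code running at Perseus: the Place record, the normalize and align functions, and both distance thresholds (validate_km and nearby_km) are assumptions made for the example, since the actual "certain range" is not stated here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    uri: str
    name: str
    lat: float
    lon: float


def normalize(name):
    """Lower-case and collapse whitespace so names compare consistently."""
    return " ".join(name.lower().split())


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def align(perseus, gazetteer, validate_km=50.0, nearby_km=10.0):
    """Map each Perseus place URI to ("is" | "nearby", gazetteer URI).

    Thresholds are illustrative assumptions, not the values used by Perseus.
    Places that match in neither pass are left out of the result.
    """
    by_name = {}
    for g in gazetteer:
        by_name.setdefault(normalize(g.name), []).append(g)

    def closest(p, candidates, limit_km):
        scored = [(haversine_km(p.lat, p.lon, g.lat, g.lon), g) for g in candidates]
        scored = [(d, g) for d, g in scored if d <= limit_km]
        return min(scored, key=lambda s: s[0])[1] if scored else None

    results = {}
    for p in perseus:
        # Pass 1: same normalized name, with coordinates used to validate
        # and disambiguate between namesakes.
        match = closest(p, by_name.get(normalize(p.name), []), validate_km)
        if match is not None:
            results[p.uri] = ("is", match.uri)
            continue
        # Pass 2: no usable name match, so fall back to proximity alone.
        match = closest(p, gazetteer, nearby_km)
        if match is not None:
            results[p.uri] = ("nearby", match.uri)
    return results
```

With a gazetteer containing two places called Memphis, one in Egypt and one in Tennessee, a Perseus place named "Memphis" with Egyptian coordinates would come back as an "is" match to the Egyptian entry, because the coordinate check rules out the namesake.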

Going forward, we hope to be able to continue to work on improving the automatic alignment of the Perseus and Pleiades+ place data, as well as providing the means for manual refinement and correction of the annotations. In this initial pass, we were able to automatically annotate a little over 15% of the distinct ancient place names already identified in the Perseus texts. We would like not only to increase the percentage of matches with the Pleiades data, but also to begin to take advantage of methods for automatically identifying place entities in the many texts in the Perseus repository which do not yet have this level of curation.
