Tuesday 26 June 2012

Using the Pelagios data in widgets


Over the last few months I have been developing embeddable widgets for the Pelagios project, using the data from Pelagios partners via the Pelagios API. When I tried out early versions of these widgets with potential users, it became clear that there was a tension between the data available and the data I needed to make a really good user interface.

As a reminder, a Pelagios-compliant set of data consists of a set of annotations. Each annotation body is a place in the Pleiades gazetteer and each annotation target is a URI identifying something that has an association with that place. Datasets can be grouped into subdatasets in a tree-like structure and partners provide a VoID file containing information about the root dataset.

Below I outline the main issues that have cropped up.

1) Titles for annotation targets 

Let us suppose we want to show all the items that Pelagios partners have annotated with Corinth. It is not very useful to give users a list of the URIs of these items, even when they are divided into subdatasets. Initially, as the URIs were all the data I had, I either had to display them or list the items as 'Item 1', 'Item 2', 'Item 3', and so on. Needless to say, neither of these options was popular with users.

As part of Pelagios 2, most partners have added a title to the annotation using dcterms:title, giving more descriptive information about items. Strictly, however, this should be the title of the annotation, not of the annotated target. This is a subtle but important difference, and of course some partners treated it as such. In the discussion on the Pelagios mailing list, it was suggested that the title be given to the target of the annotation instead, which makes sense to me and will hopefully make its way into the Pelagios Cookbook. In the meantime, as most partners have not updated their data yet, I am displaying the title of the annotation target if it exists, and falling back to the title of the annotation if it does not.
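The fallback described above could be sketched roughly as follows in Python. This is only an illustration, not the widget code: the `Annotation` class, its field names, and the `display_title` function are hypothetical, and the final fallback to the bare URI is an assumption based on what the early widgets had to show.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """Hypothetical in-memory view of one Pelagios annotation."""
    target_uri: str                          # URI of the annotated item
    place_uri: str                           # annotation body: a Pleiades place URI
    annotation_title: Optional[str] = None   # dcterms:title on the annotation
    target_title: Optional[str] = None       # dcterms:title on the target


def display_title(annotation: Annotation) -> str:
    """Prefer the target's title, then the annotation's, then the bare URI."""
    for candidate in (annotation.target_title, annotation.annotation_title):
        # Skip missing titles and titles that are only whitespace.
        if candidate and candidate.strip():
            return candidate.strip()
    # Assumed last resort: show the URI rather than 'Item 1', 'Item 2', ...
    return annotation.target_uri
```

With this ordering, partners who have already moved the title onto the target are shown correctly, and partners who have not yet updated their data still get a readable label.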

2) Dataset and subdataset titles and divisions 

Should the title of a dataset describe the set of annotations or should it describe the set of annotation targets? The former is strictly correct but will not make much sense to users who have not heard of Pleiades and are not thinking in terms of annotations. How do we deal with this, and what guidance do we give on making dataset and subdataset titles meaningful to users who may not have heard of many of the partners before seeing the widget?

There is also the question of granularity of subdatasets at different levels. Partners are free to structure their datasets and subdatasets how they wish, but this could easily leave us with user interface problems in the widgets. One possible option here might be to give some examples in the Pelagios Cookbook of subdataset structures that have worked well from an information architecture perspective for different types of partners.

3) Should annotation target URIs be URLs of HTML pages about the target?

We would obviously like users to be able to find out more information about an annotation target. At the moment all we have about an annotation target is the URI and probably a title. There is no reason why that URI should be anything but an identifier. On the other hand, most partners have been using the URL of a page about the annotation target (e.g. the page in the online museum catalogue), so in the widget I am providing a link to the URI. But for a few partners, the URI might not exist as a URL, or might contain RDF, which is clearly not very user friendly. So there is an interesting question here of how we can associate the URL of a page about an annotation target with the annotation target in the data, without making things over-complicated.
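One way a widget might cope with this, sketched in Python under stated assumptions: the `link_for_target` function and its optional `page_url` argument are hypothetical, standing in for whatever property the project might eventually use to point at a human-readable page. The sketch only checks that a candidate looks like a web address; it cannot tell whether the address actually resolves or returns RDF.

```python
from typing import Optional
from urllib.parse import urlparse


def link_for_target(target_uri: str, page_url: Optional[str] = None) -> Optional[str]:
    """Pick something safe to link to, or None if nothing looks like a web page.

    page_url is a hypothetical, separately supplied address of an HTML page
    about the target; target_uri is the identifier from the annotation.
    """
    for candidate in (page_url, target_uri):
        # Only http(s) addresses are treated as linkable; identifiers with
        # other schemes (for example urn:) are shown without a link.
        if candidate and urlparse(candidate).scheme in ("http", "https"):
            return candidate
    return None
```

Returning None lets the widget show the title as plain text instead of a link that would lead nowhere.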

4) Meta information about partners used in the widgets

In the widgets, we want to tell users something about each partner. Users may not have heard of the partners in question, or indeed Pelagios, so a list of the Pelagios partners by itself may be fairly meaningless to them.

To give users rather more context, I asked each partner for a logo file and a short 'strapline' describing the type of data that they provide. I collated these manually and assembled them in a JSON file, from which the widgets then grab the relevant information. However, it would be nice if each partner supplied this information in their VoID file instead, so that the JSON file would not need updating for new partners. But how best to do this? For many partners, for example, the dcterms:description field is too long to use as a strapline. So this is another thing for the project to think about.
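To make the current approach concrete, here is a rough Python sketch of looking up partner details in such a JSON file. The layout of the file, the key names (`name`, `logo`, `strapline`), the example values, and the `partner_info` function are all placeholders invented for illustration, not the project's real file.

```python
import json

# Hypothetical contents of the hand-assembled partner file; the keys and
# values are placeholders, not the project's real data.
PARTNERS_JSON = """
{
  "http://example.org/datasets/partner-a": {
    "name": "Partner A",
    "logo": "logos/partner-a.png",
    "strapline": "Coins from the ancient Mediterranean"
  }
}
"""


def partner_info(dataset_uri, partners):
    """Return display metadata for a root dataset, with safe defaults."""
    entry = partners.get(dataset_uri, {})
    return {
        # Fall back to the dataset URI when a partner is not in the file yet.
        "name": entry.get("name", dataset_uri),
        "logo": entry.get("logo"),
        "strapline": entry.get("strapline", ""),
    }


partners = json.loads(PARTNERS_JSON)
```

The defaults show why the manual file is a maintenance burden: any partner missing from it appears in the widget with no logo or strapline until someone edits the file by hand.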

In summary

One of the interesting aspects of building the widgets has been how it has made the data from the Pelagios partners much more transparent. As a result, several interesting questions have arisen for the project as a whole. Do you have any thoughts on how the project should deal with the issues above? Please do feel welcome to comment below - I would love to hear your ideas and suggestions.


