Wednesday, 23 March 2011

Pelagios Workshop SPARQL Demo & RDFa


Note: The following post is heavily based on notes by Mathieu D’Aquin and Sebastian Heath.

Ahead of the workshop we have made a test SPARQL endpoint available for anyone who wants to try out our approach and see what might be possible. The demonstrator was built by Mathieu D’Aquin based on our weekly Skype meetings and discussion list (which you can follow at our Google Group). The endpoint is available at: 
http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/query
using the OWLIM triple store with a Sesame interface.

Test with DME data
We have loaded the store with test data from the OAC descriptions available at: http://dme.ait.ac.at/yuma-server/timeline

For example, http://dme.ait.ac.at/yuma-server/api/annotation/369 is described with the following triples (see http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/explore?resource=%3Chttp://dme.ait.ac.at/yuma-server/api/annotation/369%3E)

<http://dme.ait.ac.at/yuma-server/api/annotation/369>     rdf:type       rdfs:Resource
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     rdf:type       oac:Annotation           
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     dc:title       "Algier"           
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     j.1:created    "2011-03-17 15:14:17.899"
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     j.1:modified   "2011-03-17 15:14:17.899"
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     dc:creator     "guest"           
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     oac:hasBody            <http://dme.ait.ac.at/yuma-server/api/annotation/369#body>
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     oac:hasTarget  _:node341


_:node341            rdf:type            rdfs:Resource           
_:node341            rdf:type            oac:ConstrainedTarget
_:node341            oac:constrainedBy   _:node342           
_:node341            oac:constrains      "http://dme.ait.ac.at/samples/maps/oenb/AC04248667.tif"           

<http://dme.ait.ac.at/yuma-server/api/annotation/369#body>   rdf:type   rdfs:Resource           
<http://dme.ait.ac.at/yuma-server/api/annotation/369#body>   rdf:type   oac:Body
<http://dme.ait.ac.at/yuma-server/api/annotation/369#body>   rdfs:label "Control Point for 'Algier' (36.752887, 3.042048)"


Example queries:
select distinct ?x where {
     ?x a oac:Annotation
}



gives the list of available annotations. (see http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/query?queryLn=SPARQL&query=PREFIX%20oac:%3Chttp://www.openannotation.org/ns/%3E%0A%0Aselect%20distinct%20%3Fx%20where%20{%3Fx%20a%20oac:Annotation}&limit=0&infer=true)
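
The workbench links quoted in this post all follow the same pattern: the query text is URL-encoded and passed with the queryLn, query, limit and infer parameters. As a rough sketch (the helper name build_query_url is our own invention, not part of Sesame or OWLIM, and the encoding differs cosmetically from the hand-written links), such a link could be assembled like this:

```python
from urllib.parse import urlencode

# Endpoint address as given in this post.
ENDPOINT = (
    "http://kmi-web01.open.ac.uk:8080/openrdf-workbench/"
    "repositories/pelagiosTest/query"
)


def build_query_url(sparql, limit=100, infer=True):
    """Return a workbench URL that runs `sparql` when opened (hypothetical helper)."""
    params = {
        "queryLn": "SPARQL",                     # query language, as in the links above
        "query": sparql,                         # urlencode percent-encodes the text
        "limit": str(limit),                     # the first example link uses limit=0
        "infer": "true" if infer else "false",   # include inferred triples
    }
    return ENDPOINT + "?" + urlencode(params)


QUERY = (
    "PREFIX oac:<http://www.openannotation.org/ns/>\n"
    "select distinct ?x where { ?x a oac:Annotation }"
)
url = build_query_url(QUERY, limit=0)
print(url)
```

Pasting the printed URL into a browser should, if the test endpoint is still running, show the same result table as the link above.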

select distinct ?lab where {
    ?x a oac:Annotation.
    ?x oac:hasBody ?b.
    ?b rdfs:label ?lab.
    ?x oac:hasTarget ?t.
    ?t oac:constrains "http://dme.ait.ac.at/samples/maps/oenb/AC04248667.tif"
}

gives the list of texts associated with the document http://dme.ait.ac.at/samples/maps/oenb/AC04248667.tif

Added Pleiades links:
A simple similarity-based association was applied to detect when an annotation relates to a place known to Pleiades. For each such annotation (e.g., http://dme.ait.ac.at/yuma-server/api/annotation/366 about “Corsica”, which is http://pleiades.stoa.org/places/991339), triples similar to the following were added:

<http://dme.ait.ac.at/yuma-server/api/annotation/366>   oac:hasBody   <http://pleiades.stoa.org/places/991339>
<http://pleiades.stoa.org/places/991339>   rdf:type   wgs84_pos:SpatialThing
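
The post does not say which similarity measure was used, so the following is only an illustration of the general idea, not the actual matching code. It assumes a small name-to-URI lookup table (GAZETTEER, hypothetical apart from the Corsica entry quoted above) and uses the fuzzy string matching in Python's standard difflib module as a stand-in for the real process:

```python
from difflib import get_close_matches

# Hypothetical gazetteer excerpt: place name -> Pleiades URI.
# Only the Corsica entry is taken from this post.
GAZETTEER = {
    "Corsica": "http://pleiades.stoa.org/places/991339",
}


def match_place(title, gazetteer=GAZETTEER, cutoff=0.8):
    """Return the URI of the closest place name, or None if nothing is close enough."""
    by_lower = {name.lower(): uri for name, uri in gazetteer.items()}
    hits = get_close_matches(title.lower(), list(by_lower), n=1, cutoff=cutoff)
    return by_lower[hits[0]] if hits else None


def body_triple(annotation_uri, title):
    """Build the extra oac:hasBody triple for an annotation, if its title matches."""
    place_uri = match_place(title)
    if place_uri is None:
        return None
    return (annotation_uri, "oac:hasBody", place_uri)


triple = body_triple(
    "http://dme.ait.ac.at/yuma-server/api/annotation/366", "Corsica"
)
print(triple)
```

The cutoff of 0.8 is an arbitrary choice for the sketch; a real run would need tuning and probably manual checking of ambiguous names.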

Small extension of the Ontology:
A new class called GeoAnnotation was created. Intuitively, this class represents the oac:Annotation(s) that point to (oac:hasBody) geographical objects (wgs84_pos:SpatialThing).
It is therefore defined in abstract OWL syntax as 

class (GeoAnnotation partial oac:Annotation)
class (GeoAnnotation complete restriction(oac:hasBody someValuesFrom(wgs84_pos:SpatialThing)))

In other words, GeoAnnotation is the class of annotations that have a body which is a SpatialThing. In triple form:

<http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>  rdf:type        owl:Class
<http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>  rdfs:subClassOf oac:Annotation
<http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>  owl:equivalentClass _:node417
_:node417            rdf:type                 owl:Restriction
_:node417            owl:onProperty           oac:hasBody
_:node417            owl:someValuesFrom       wgs84_pos:SpatialThing           


Based on this definition, the system is able to infer that the annotations that have been connected to Pleiades objects are GeoAnnotations. The query:

select distinct ?x where {
  ?x a <http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>
}
gives the corresponding results (see http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/query?queryLn=SPARQL&query=select%20distinct%20%3Fx%20where%20{%3Fx%20a%20%3Chttp://pelagios-project.org/ontology/oac-geo/GeoAnnotation%3E}&limit=100&infer=true )
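
To make the inference step concrete, here is a toy re-implementation of the someValuesFrom rule in plain Python. It is only a sketch of what the restriction means, not of how OWLIM actually reasons: an annotation counts as a GeoAnnotation as soon as at least one of its oac:hasBody values is typed as wgs84_pos:SpatialThing. The function name and the tuple representation of triples are our own assumptions; the sample triples are the ones quoted earlier in this post.

```python
RDF_TYPE = "rdf:type"
HAS_BODY = "oac:hasBody"
SPATIAL_THING = "wgs84_pos:SpatialThing"


def infer_geo_annotations(triples):
    """Return every subject with at least one oac:hasBody value that is a SpatialThing."""
    # Step 1: collect all resources explicitly typed as SpatialThing.
    spatial = {s for (s, p, o) in triples if p == RDF_TYPE and o == SPATIAL_THING}
    # Step 2: keep the subjects whose body is one of those resources.
    return {s for (s, p, o) in triples if p == HAS_BODY and o in spatial}


ANN_366 = "http://dme.ait.ac.at/yuma-server/api/annotation/366"
ANN_369 = "http://dme.ait.ac.at/yuma-server/api/annotation/369"

triples = {
    (ANN_366, RDF_TYPE, "oac:Annotation"),
    (ANN_366, HAS_BODY, "http://pleiades.stoa.org/places/991339"),
    ("http://pleiades.stoa.org/places/991339", RDF_TYPE, SPATIAL_THING),
    (ANN_369, RDF_TYPE, "oac:Annotation"),
    (ANN_369, HAS_BODY, ANN_369 + "#body"),  # a plain oac:Body, not a place
}

geo = infer_geo_annotations(triples)
print(geo)
```

Annotation 366 qualifies because its body is the Pleiades place for Corsica; annotation 369 does not, because its only body is a text label.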

The query:
select distinct ?d ?b ?l where {
  ?x a <http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>.
  ?x oac:hasBody ?b.
  ?x oac:hasTarget ?t.
  ?t oac:constrains ?d.
  ?b a wgs84_pos:SpatialThing.         
  ?b rdfs:label ?l
}
gives the list of relationships between documents and Pleiades places (see http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/query?queryLn=SPARQL&query=PREFIX%20oac:%3Chttp://www.openannotation.org/ns/%3E%0APREFIX%20wgs84_pos:%3Chttp://www.w3.org/2003/01/geo/wgs84_pos%23%3E%0APREFIX%20rdfs:%3Chttp://www.w3.org/2000/01/rdf-schema%23%3E%0A%0Aselect%20distinct%20%3Fd%20%3Fb%20%3Fl%20where%20{%3Fx%20a%20%3Chttp://pelagios-project.org/ontology/oac-geo/GeoAnnotation%3E.%0A%3Fx%20oac:hasBody%20%3Fb.%20%3Fx%20oac:hasTarget%20%3Ft.%20%3Ft%20oac:constrains%20%3Fd.%20%3Fb%20a%20wgs84_pos:SpatialThing.%20%3Fb%20rdfs:label%20%3Fl}%0A&limit=100&infer=true )

Test with Arachne Data

Using the following query:
http://arachne.uni-koeln.de:8080/solr/select?indent=on&version=2.2&q=*:*&fq=kategorie:topographie&start=0&rows=10000&fl=+Pfad,+id,+kurzbeschreibung,+antikeRoemProvinzTopographie,+ort,+Genauigkeit+Ort_antik,+antikeGriechLandschaftTopographie,+Geonamesid&qt=standard&wt=json&explainOther=&hl.fl=

we can obtain a list of potential annotations from Arachne, with information about images relating to places, that can have modern and ancient names.

There are (apparently) a bit more than 5000 items in this list. Running the same similarity-based process as above, we could relate a bit more than 2000 of them to Pleiades URIs.

They all appear, with the previous queries, as Annotations and GeoAnnotations. With this dataset, there are now many documents referring to the same place. We can for example obtain the list of documents referring to “” with the query:

select distinct ?x where {
     ?y oac:hasBody <> . 
     ?y oac:hasTarget ?x
}

The list of all the (geo)annotations relating to it can be retrieved through the query:

PREFIX oac:<http://www.openannotation.org/ns/>
select ?x where {
  ?x a <http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>.
  ?x oac:hasBody <http://pleiades.stoa.org/places/1027>
}
http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/query?queryLn=SPARQL&query=PREFIX%20oac%3A%3Chttp%3A%2F%2Fwww.openannotation.org%2Fns%2F%3E%0A%0Aselect%20%3Fx%20where%20{%0A%20%20%20%3Fx%20a%20%3Chttp%3A%2F%2Fpelagios-project.org%2Fontology%2Foac-geo%2FGeoAnnotation%3E.%0A%20%20%20%3Fx%20oac%3AhasBody%20%3Chttp%3A%2F%2Fpleiades.stoa.org%2Fplaces%2F1027%3E%0A}%0A%0A%20&limit=100&infer=true


Other possible extensions
There are different kinds of GeoAnnotation (and different types of documents considered). For example, some datasets include both documents about a location and documents that cite the location but are about something else. Maps of particular locations might have a specific status as well. We could, for example, add sub-properties of oac:hasTarget or subclasses of GeoAnnotation.

Using html link element to indicate presence of Pelagios ingestible RDF

In addition to Mathieu’s work, Sebastian Heath has been considering implementations in RDFa. It is common to use the html element ‘link’ within an html ‘head’ to indicate the location of alternate versions of a web resource. The most common application of this convention is to indicate the presence of Atom or RSS feeds.

E.g. <link rel="alternate" type="application/atom+xml" title="Atom feed" href="<URI Here>" />

Resource authors/publishers can use the following convention to indicate the location of RDF-serialized Pelagios-compatible oac:Annotations.

 <link rel="x-pelagios-oac-serialization" title="Pelagios compatible version" type="<mime-type>" href="<URI>"/>

  • The @rel MUST be equal to "x-pelagios-oac-serialization".
  • The @type SHOULD match the mime type of the resource pointed to by @href.
  • The @type SHOULD be one of:
  • The value of the title attribute is not significant. Authors MUST NOT use it to communicate any information to the Pelagios crawler.
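
For anyone wondering how a crawler might pick up these links, here is a minimal sketch using only Python's standard html.parser module. It is not the Pelagios crawler; the class and function names are made up for illustration, and the example.org addresses and the application/rdf+xml type in the sample page are assumptions rather than values fixed by the convention above.

```python
from html.parser import HTMLParser

PELAGIOS_REL = "x-pelagios-oac-serialization"


class PelagiosLinkFinder(HTMLParser):
    """Collect (href, type) pairs from matching <link> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # @rel must match exactly; @title is deliberately ignored.
        if attributes.get("rel") == PELAGIOS_REL and attributes.get("href"):
            self.links.append((attributes["href"], attributes.get("type")))


def find_pelagios_links(html):
    """Return all Pelagios serialization links found in an HTML string."""
    finder = PelagiosLinkFinder()
    finder.feed(html)
    return finder.links


# Sample page: the URLs and the MIME type are placeholders.
SAMPLE = """<html><head>
<link rel="alternate" type="application/atom+xml" title="Atom feed" href="http://example.org/feed.atom" />
<link rel="x-pelagios-oac-serialization" title="ignored" type="application/rdf+xml" href="http://example.org/annotations.rdf" />
</head><body></body></html>"""

links = find_pelagios_links(SAMPLE)
print(links)
```

The Atom feed link is skipped because its rel value differs, and the title attribute is never read, in line with the rule that it carries no meaning for the crawler.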

Monday, 14 March 2011

Pelagios Project Plan Part 7: Budget

Budget forecast*

  • Total Staffing costs 59%
  • Partner collaboration activities (incl. the workshop) 16%
  • Overheads (Estates and Indirects) 27%

*This is the total budget, including contributions In Kind from partners (see below)



Budget Management

The Arts Faculty Research Grants Manager is Suzanne Duncanson-Hunter. She is working closely with both the PI and CoI to make sure that the project keeps to budget. The PI and CoI are also in close correspondence with the JISC manager over how best to manage costs and utilise resources. In addition, the Open University, through Suzanne's efforts, have already drawn up an external consultancy contract with Rainer Simon of DME for work to be undertaken on Pelagios in WP3.


Budget Justification

The ontology work undertaken by LUCERO in WP1 complements the funding that they already have from JISC. Furthermore, all work on ontology specification, mapping and alignment done by the data and documentation partners (GAP, Perseus, DME, SPQR, Arachne) and Pleiades is payment In Kind. Because all of our partners are already committed to linked open data research and have secured sustainable and significant funding for themselves, Pelagios is able to make substantial research and infrastructural advances on a relatively modest budget, thus greatly maximizing Return On Investment for both JISC and project partners.


This is encapsulated by our 'start-up' event, the one-day workshop at KCL, which presents a unique opportunity for an intensive exchange of knowledge and experience on Linked Open GeoData from the community at large, as well as from all of our partners. For this reason we are intending to record the proceedings to help document the discussion of issues and methods relating to linking open data ontologies, so that it may be of use for other groups working in this area in the future. Our budget ensures that all invited external expert speakers will have UK costs borne, while also paying for the time and expenses of all the Pelagios partners, for both their participation at the workshop and the project meeting on the following day.

Saturday, 12 March 2011

Choosing an ontology: OAC

Part of Pelagios' first Workpackage is to decide on a Common Ontology for Place References (COPR). In doing so we are not attempting to reinvent the wheel - far from it. Good Linked Open Data practice is to reduce, reuse and recycle. To that end we have been investigating a variety of options and are now basing our approach on the Open Annotation Collaboration ontology.

The OAC is also a work in progress but, as luck would have it, they are holding their workshop in Chicago on exactly the same dates as ours (we now have a great line-up, btw, so register soon as places are filling up). Their fundamental principles seem to be exactly what we are looking for though: the ability to connect a target web document with some information about it (in our case, an ancient place).

The basics should look something like this:

ex:ann1 a oac:Annotation ;
       oac:hasBody <http://pleiades.stoa.org/places/423025> ;
       oac:hasTarget <some resource identifying the text + fragment> .



But a number of interesting issues remain - should we use Blank Nodes for the annotations themselves (especially in RDFa)? If not, where do we store them? Should we subclass the OAC ontology to specify that it is a geographic annotation? If so, should it be the oac:Annotation or oac:hasBody that we subclass, and where should that ontology be hosted? We are fortunate to have the assistance of Bernhard Haslhofer and Robert Sanderson in this discussion, who are both involved with OAC, and we look forward to them reporting back from OAC's workshop. In the meantime Mathieu D'Aquin is putting together our own SPARQL demonstrator to see how useful this approach may be in practice.


If you have any thoughts on this do let us know and you can follow the discussion itself over on our Google Group.