Wednesday 23 March 2011

Pelagios Workshop SPARQL Demo & RDFa


Note: The following post is heavily based on notes by Mathieu D’Aquin and Sebastian Heath.

Ahead of the workshop we have made a test SPARQL endpoint available for anyone who wants to try out our approach and see what might be possible. The demonstrator was built by Mathieu D’Aquin, based on our weekly Skype meetings and discussion list (which you can follow at our Google Group). The endpoint is available at:
http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/query
It uses the OWLIM triple store with a Sesame interface.

Test with DME data
We have loaded the store with test data from the OAC descriptions available at: http://dme.ait.ac.at/yuma-server/timeline

For example, http://dme.ait.ac.at/yuma-server/api/annotation/369 is described with the following triples (see http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/explore?resource=%3Chttp://dme.ait.ac.at/yuma-server/api/annotation/369%3E)

<http://dme.ait.ac.at/yuma-server/api/annotation/369>     rdf:type       rdfs:Resource
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     rdf:type       oac:Annotation           
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     dc:title       "Algier"           
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     j.1:created    "2011-03-17 15:14:17.899"
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     j.1:modified   "2011-03-17 15:14:17.899"
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     dc:creator     "guest"           
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     oac:hasBody            <http://dme.ait.ac.at/yuma-server/api/annotation/369#body>
<http://dme.ait.ac.at/yuma-server/api/annotation/369>     oac:hasTarget  _:node341


_:node341            rdf:type            rdfs:Resource           
_:node341            rdf:type            oac:ConstrainedTarget
_:node341            oac:constrainedBy   _:node342           
_:node341            oac:constrains      "http://dme.ait.ac.at/samples/maps/oenb/AC04248667.tif"           

<http://dme.ait.ac.at/yuma-server/api/annotation/369#body>   rdf:type   rdfs:Resource           
<http://dme.ait.ac.at/yuma-server/api/annotation/369#body>   rdf:type   oac:Body
<http://dme.ait.ac.at/yuma-server/api/annotation/369#body>   rdfs:label "Control Point for 'Algier' (36.752887, 3.042048)"


Example queries:
select distinct ?x where {
     ?x a oac:Annotation
}


gives the list of available annotations. (see http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/query?queryLn=SPARQL&query=PREFIX%20oac:%3Chttp://www.openannotation.org/ns/%3E%0A%0Aselect%20distinct%20%3Fx%20where%20{%3Fx%20a%20oac:Annotation}&limit=0&infer=true)
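The "see" links in this post all follow the same pattern: the endpoint address plus the parameters queryLn, query, limit and infer. As a minimal sketch, assuming that pattern holds for other queries too, a link of this kind could be assembled in Python as follows. The helper name build_query_url is hypothetical and not part of the demonstrator, and the standard library encodes spaces as "+" rather than "%20", which is an equivalent form in a query string.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint address as given in this post.
ENDPOINT = ("http://kmi-web01.open.ac.uk:8080/openrdf-workbench/"
            "repositories/pelagiosTest/query")


def build_query_url(sparql, limit=100, infer=True):
    """Hypothetical helper: build a workbench link that runs `sparql`.

    The parameter names (queryLn, query, limit, infer) are copied from
    the example links in the post; nothing else is assumed about the API.
    """
    params = {
        "queryLn": "SPARQL",
        "query": sparql,
        "limit": str(limit),
        "infer": "true" if infer else "false",
    }
    return ENDPOINT + "?" + urlencode(params)


QUERY = """PREFIX oac:<http://www.openannotation.org/ns/>

select distinct ?x where {
    ?x a oac:Annotation
}"""

# limit=0 and infer=true match the link shown above.
url = build_query_url(QUERY, limit=0)
print(url)
```

No request is sent here; the sketch only builds the address, so it can be pasted into a browser or passed to any HTTP client.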

select distinct ?lab where {
    ?x a oac:Annotation.
    ?x oac:hasBody ?b.
    ?b rdfs:label ?lab.
    ?x oac:hasTarget ?t.
    ?t oac:constrains "http://dme.ait.ac.at/samples/maps/oenb/AC04248667.tif"
}
gives the list of texts associated with the document http://dme.ait.ac.at/samples/maps/oenb/AC04248667.tif

Added Pleiades links:
A simple similarity-based matching was applied to detect when an annotation relates to a place known to Pleiades. For each such annotation (e.g., http://dme.ait.ac.at/yuma-server/api/annotation/366 about “Corsica”, which is http://pleiades.stoa.org/places/991339), triples similar to the following were added:

<http://dme.ait.ac.at/yuma-server/api/annotation/366>   oac:hasBody   <http://pleiades.stoa.org/places/991339>
<http://pleiades.stoa.org/places/991339>                rdf:type      wgs84_pos:SpatialThing
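The post does not say which similarity measure was used, so the following is only an illustrative sketch of the general idea, not the demonstrator's actual method. It assumes a small lookup table from Pleiades URIs to place names and compares normalised strings with Python's difflib; the function names, the threshold of 0.9 and the single-entry table are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalise(name):
    """Lower-case a name and collapse whitespace before comparing."""
    return " ".join(name.lower().split())


def best_match(title, gazetteer, threshold=0.9):
    """Return the URI whose name is most similar to `title`, or None.

    `gazetteer` maps place URIs to names. The threshold is an arbitrary
    illustrative value, not one taken from the demonstrator.
    """
    best_uri, best_score = None, 0.0
    for uri, name in gazetteer.items():
        score = SequenceMatcher(None, normalise(title), normalise(name)).ratio()
        if score > best_score:
            best_uri, best_score = uri, score
    return best_uri if best_score >= threshold else None


# Single entry taken from the example in the post.
GAZETTEER = {"http://pleiades.stoa.org/places/991339": "Corsica"}

print(best_match(" corsica ", GAZETTEER))  # matches after normalisation
print(best_match("Algier", GAZETTEER))     # too dissimilar, no match
```

A match found this way would then be recorded as an extra oac:hasBody triple on the annotation, as in the example above.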

Small extension of the Ontology:
A new class called GeoAnnotation was created. Intuitively, this class represents the oac:Annotation(s) that point to (oac:hasBody) geographical objects (wgs84_pos:SpatialThing).
It is therefore defined in abstract OWL syntax as 

class (GeoAnnotation partial oac:Annotation)
class (GeoAnnotation complete restriction(oac:hasBody someValuesFrom(wgs84_pos:SpatialThing)))

In other words, GeoAnnotation is the class of annotations that have a body which is a SpatialThing. In triple form:

<http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>  rdf:type        owl:Class
<http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>  rdfs:subClassOf oac:Annotation
<http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>  owl:equivalentClass _:node417
_:node417            rdf:type                 owl:Restriction
_:node417            owl:onProperty           oac:hasBody
_:node417            owl:someValuesFrom       wgs84_pos:SpatialThing           


Based on this definition, the system is able to infer that the annotations that have been connected to Pleiades objects are GeoAnnotations. The query:

select distinct ?x where {
  ?x a <http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>
}
gives the corresponding results (see http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/query?queryLn=SPARQL&query=select%20distinct%20%3Fx%20where%20{%3Fx%20a%20%3Chttp://pelagios-project.org/ontology/oac-geo/GeoAnnotation%3E}&limit=100&infer=true )
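To make the inference step more concrete, here is a minimal sketch of what the restriction amounts to, written in plain Python over a handful of triples taken from this post. It is not how OWLIM works internally; it only mimics the one rule "anything with an oac:hasBody that is a wgs84_pos:SpatialThing is a GeoAnnotation". Prefixed names are kept as plain strings for readability, and the function name is hypothetical.

```python
RDF_TYPE = "rdf:type"
HAS_BODY = "oac:hasBody"
ANNOTATION = "oac:Annotation"
SPATIAL_THING = "wgs84_pos:SpatialThing"
GEO_ANNOTATION = "http://pelagios-project.org/ontology/oac-geo/GeoAnnotation"


def infer_geo_annotations(triples):
    """Return subjects that have an oac:hasBody typed as a SpatialThing."""
    spatial = {s for s, p, o in triples if p == RDF_TYPE and o == SPATIAL_THING}
    return {s for s, p, o in triples if p == HAS_BODY and o in spatial}


A366 = "http://dme.ait.ac.at/yuma-server/api/annotation/366"
A369 = "http://dme.ait.ac.at/yuma-server/api/annotation/369"
CORSICA = "http://pleiades.stoa.org/places/991339"

triples = {
    (A366, RDF_TYPE, ANNOTATION),
    (A366, HAS_BODY, CORSICA),
    (CORSICA, RDF_TYPE, SPATIAL_THING),
    (A369, RDF_TYPE, ANNOTATION),
    (A369, HAS_BODY, A369 + "#body"),  # body is not a SpatialThing
}

geo = infer_geo_annotations(triples)
# The triples a reasoner would add for the matching annotations.
inferred = {(s, RDF_TYPE, GEO_ANNOTATION) for s in geo}
print(sorted(geo))
```

Only annotation 366 qualifies here, because its body is the Pleiades place; annotation 369 still has only its original text body.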

The query:
select distinct ?d ?b ?l where {
  ?x a <http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>.
  ?x oac:hasBody ?b.
  ?x oac:hasTarget ?t.
  ?t oac:constrains ?d.
  ?b a wgs84_pos:SpatialThing.         
  ?b rdfs:label ?l
}
gives the list of relationships between documents and Pleiades places (see http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/query?queryLn=SPARQL&query=PREFIX%20oac:%3Chttp://www.openannotation.org/ns/%3E%0APREFIX%20wgs84_pos:%3Chttp://www.w3.org/2003/01/geo/wgs84_pos%23%3E%0APREFIX%20rdfs:%3Chttp://www.w3.org/2000/01/rdf-schema%23%3E%0A%0Aselect%20distinct%20%3Fd%20%3Fb%20%3Fl%20where%20{%3Fx%20a%20%3Chttp://pelagios-project.org/ontology/oac-geo/GeoAnnotation%3E.%0A%3Fx%20oac:hasBody%20%3Fb.%20%3Fx%20oac:hasTarget%20%3Ft.%20%3Ft%20oac:constrains%20%3Fd.%20%3Fb%20a%20wgs84_pos:SpatialThing.%20%3Fb%20rdfs:label%20%3Fl}%0A&limit=100&infer=true )

Test with Arachne Data

Using the following query:
http://arachne.uni-koeln.de:8080/solr/select?indent=on&version=2.2&q=*:*&fq=kategorie:topographie&start=0&rows=10000&fl=+Pfad,+id,+kurzbeschreibung,+antikeRoemProvinzTopographie,+ort,+Genauigkeit+Ort_antik,+antikeGriechLandschaftTopographie,+Geonamesid&qt=standard&wt=json&explainOther=&hl.fl=

we can obtain a list of potential annotations from Arachne, with information about images relating to places, which can have both modern and ancient names.

There are (apparently) a bit more than 5000 items in this list. Running the same similarity-based process as above, we could relate a bit more than 2000 of them to Pleiades URIs.

They all appear, with the previous queries, as Annotations and GeoAnnotations. With this dataset, there are now many documents referring to the same place. We can for example obtain the list of documents referring to “” with the query:

select distinct ?x where {
     ?y oac:hasBody <> . 
     ?y oac:hasTarget ?x
}

The list of all the (geo)annotations relating to it can be retrieved through the query:

PREFIX oac:<http://www.openannotation.org/ns/>
select ?x where {
  ?x a <http://pelagios-project.org/ontology/oac-geo/GeoAnnotation>.
  ?x oac:hasBody <http://pleiades.stoa.org/places/1027>
}
http://kmi-web01.open.ac.uk:8080/openrdf-workbench/repositories/pelagiosTest/query?queryLn=SPARQL&query=PREFIX%20oac%3A%3Chttp%3A%2F%2Fwww.openannotation.org%2Fns%2F%3E%0A%0Aselect%20%3Fx%20where%20{%0A%20%20%20%3Fx%20a%20%3Chttp%3A%2F%2Fpelagios-project.org%2Fontology%2Foac-geo%2FGeoAnnotation%3E.%0A%20%20%20%3Fx%20oac%3AhasBody%20%3Chttp%3A%2F%2Fpleiades.stoa.org%2Fplaces%2F1027%3E%0A}%0A%0A%20&limit=100&infer=true


Other possible extensions
There are different kinds of GeoAnnotation (and different types of documents to consider). For example, some data would include both documents that are about a location and documents that cite the location but are about something else. Maps of particular locations might have a specific status as well. We could, for example, add sub-properties of oac:hasTarget or subclasses of GeoAnnotation.

Using html link element to indicate presence of Pelagios ingestible RDF

In addition to Mathieu’s work, Sebastian Heath has been considering implementations in RDFa. It is common to use the html element ‘link’ within an html ‘head’ to indicate the location of alternate versions of a web resource. The most common application of this convention is to indicate the presence of Atom or RSS feeds.

E.g. <link rel="alternate" type="application/atom+xml" title="Atom feed" href="<URI Here>" />

Resource authors/publishers can use the following convention to indicate the location of RDF-serialized Pelagios-compatible oac:Annotations.

 <link rel="x-pelagios-oac-serialization" title="Pelagios compatible version" type="<mime-type>" href="<URI>"/>

  • The @rel MUST be equal to "x-pelagios-oac-serialization".
  • The @type SHOULD match the mime type of the resource pointed to by @href.
  • The @type SHOULD be one of:
  • The value of the title attribute is not significant. Authors MUST NOT use it to communicate any information to the Pelagios crawler.
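As a rough illustration of how a crawler might apply the rules above, the following Python sketch scans a page for link elements whose @rel equals "x-pelagios-oac-serialization" and collects their @href and @type, ignoring @title. This is not the Pelagios crawler itself; the class and function names, the example.org addresses and the application/rdf+xml type in the sample page are assumptions made for the example.

```python
from html.parser import HTMLParser

PELAGIOS_REL = "x-pelagios-oac-serialization"


class PelagiosLinkFinder(HTMLParser):
    """Collect (href, type) pairs from matching <link> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # @rel must match exactly; @title is deliberately ignored.
        if attributes.get("rel") == PELAGIOS_REL and attributes.get("href"):
            self.links.append((attributes["href"], attributes.get("type")))


def find_pelagios_links(html):
    """Hypothetical helper: return all Pelagios links found in `html`."""
    finder = PelagiosLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.links


# Sample page; the URLs and the MIME type are placeholders.
SAMPLE = """<html><head>
<link rel="alternate" type="application/atom+xml" title="Atom feed" href="http://example.org/feed.atom" />
<link rel="x-pelagios-oac-serialization" title="ignored" type="application/rdf+xml" href="http://example.org/annotations.rdf" />
</head><body></body></html>"""

print(find_pelagios_links(SAMPLE))
```

The Atom link is skipped because its @rel differs, which is the behaviour the MUST rule on @rel asks for.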
