Thursday, 24 February 2011

Pelagios Project Plan Part 3: Risk Analysis and Success Plan

Risk Analysis

The main risks identified are shown below and will be reviewed every month:

| Risk | Prob. | Impact | Action to Mitigate |
| --- | --- | --- | --- |
| Overly-complex or simplistic ontology inhibits uptake | Low | Med | The domain and technical expertise of LUCERO, ACRG and Pleiades, plus concrete use-cases from each partner, ensure a robust ontology. The use of modular extensions means that problems specific to one document type will not affect others, permitting multiple interest groups. |
| Lack of community interest | Low | Med | Each project partner already comes to the table with an established community of users, among whom the Pelagios outputs will be rolled out. Other interest groups such as Google, Ordnance Survey, GeoNames, EDINA and the European Commission will be involved through participation in the workshop. |
| Project stream unable to fund mapping and publication of data | Med | Low | The global economic crisis threatens all academic funding, but the highly international nature of the consortium and low dependency across the Work Packages greatly spread this risk. |
| Critical failure of Pleiades Project | Low | Med | Pleiades is a core element of the digital infrastructure of the Institute for the Study of the Ancient World (ISAW), with high-level institutional support and long-term funding. It will be possible to use URIs generated from the Princeton Encyclopaedia of Classical Sites as a fallback position if necessary. |


Evaluation

The success of the core ontology for place references (COPR) and linked open geo-data (LOG) approach will be measured against each publication stream using SMART criteria:

Specific: The ability to map a given document type to i) persistent Pleiades URIs, and ii) COPR-compliant RDF (see project plan post 1)

Measurable: A comparison of the precision and recall of the linked open geo-data (LOG) against a benchmark manual mapping of a sample from the source material

Achievable: Each team will identify an appropriate corpus in terms of both complexity and volume to demonstrate the value of the ontology

Relevant: Publication streams will refer to ancient places centred on the Mediterranean (whence come the majority of ancient sources) and datasets likely to be of interest to a wide audience

Timely: Sample data will be available four months prior to the end of the project. Complete datasets will be available one month prior to the end of the project.
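
The Measurable criterion above could be computed along the following lines. This is only a minimal sketch, assuming that both the automatic mapping and the manual benchmark are held as sets of (reference, place URI) pairs; the function name, the reference ids and the example.org URIs are hypothetical and not taken from the project plan.

```python
def precision_recall(automatic, benchmark):
    """Compare automatically mapped (reference, place URI) pairs with a manual benchmark."""
    automatic, benchmark = set(automatic), set(benchmark)
    true_positives = len(automatic & benchmark)
    # Precision: share of the automatic mappings that the benchmark confirms.
    precision = true_positives / len(automatic) if automatic else 0.0
    # Recall: share of the benchmark mappings that the automatic process found.
    recall = true_positives / len(benchmark) if benchmark else 0.0
    return precision, recall


# Hypothetical sample: four manually mapped references, three automatic ones.
benchmark = {
    ("ref-1", "http://example.org/places/A"),
    ("ref-2", "http://example.org/places/B"),
    ("ref-3", "http://example.org/places/C"),
    ("ref-4", "http://example.org/places/D"),
}
automatic = {
    ("ref-1", "http://example.org/places/A"),
    ("ref-2", "http://example.org/places/B"),
    ("ref-3", "http://example.org/places/X"),  # aligned to the wrong place
}

precision, recall = precision_recall(automatic, benchmark)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Treating a mapping as correct only when both the reference and the place URI match is one possible choice; a stream could equally score reference detection and place alignment separately.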


In addition, each stream's documentation of how it implemented the core ontology is a key output of WP2 (see project plan post 6), from which future requirements can be identified. Each partner group will also trial the demonstration web services among its target user-groups and elicit their feedback.


Sustainability

The RDF ontology will be hosted by Pleiades, whose long-term strategy and funding ensure the stability of both the domain name and content. Instance data will be held by the relevant partners, each of which has sustainability at the core of its funding and institutional commitments. All web services and tools will use Open Source technology and Open Standards so as to be freely available for adoption by third parties. Code will be hosted in a suitable repository such as SourceForge or GitHub. A demonstrator version of the software will be hosted on multiple websites maintained by the project partners.

Tuesday, 22 February 2011

Pelagios Project Plan Part 2: Wider Benefits of Pelagios to Sector & Achievements for Host Institution

Pelagios benefits a number of institutions and the wider community through its extensive network of collaborations:

1. Pelagios brings together three projects within major international universities to thrash out a core ontology for all users working with place reference data. They are:

2. Pelagios takes this core ontology and investigates its adoption by five international projects, which are already finding ways of extracting and using placename information. Each representing a different kind of document type, they are:
  • Google Ancient Places has been sponsored by Google to discover and represent place references in the unstructured XML of the Google Books Corpus. GAP is a consortium based at the OU, Southampton, Edinburgh and Berkeley with considerable experience in ancient geospatial data drawn from HESTIA, Open Context and CHALICE.
  • The Perseus project is a world leader in providing digital resources of the ancient world, focused primarily on primary sources enriched with structural mark-up. It is backed by the US National Endowment for the Humanities, the National Science Foundation, the Institute of Museum and Library Services, the Fund for the Improvement of Postsecondary Education, the Department of Education, the Mellon Foundation, and the National Endowment for the Arts.
  • Arachne is the central object database of the German Archaeological Institute, hosted and developed in cooperation with the Cologne Digital Archaeology Laboratory at the University of Cologne. Arachne has over 5,000 registered users who can access its extensive OER materials (850,000 scanned images and documentation for c. 250,000 sites and objects) free of charge.
  • SPQR is a JISC-funded project (JISCMRD Programme) using LOD approaches to provide integrated views across diverse resources relating to fragmentary ancient documents.
  • The Digital Memory Engineering (DME) research group of the Austrian Institute of Technology (AIT), which has particular expertise in map documents, is exploring ways of enabling and integrating services for the Europeana cultural heritage Web portal (see the Annotation Prototype).
By having each project partner document the process by which they take up and use the ontology, Pelagios will provide a benchmark publication that offers guidance to the community for using linked open geo-data (a LOG paradigm). In addition, by bringing these existing international repositories of Ancient World resources together, Pelagios will encourage its partners to exchange data, practices and experience with each other, their large extant user-base, JISC, and users worldwide.

3. Finally, Pelagios will demonstrate the value of the ontology to non-technical users by experimenting with various accessible web services that could draw upon it.
Benefits (summary table)

| Who? | What? | How? |
| --- | --- | --- |
| Pelagios Partners | Partner projects are already investing considerable resources in geospatially annotating their holdings. Pelagios will enable them to do this in a common framework, thereby increasing reusability of their data. | ontology development; data annotation; process documentation; tool development; tool reviewing; workshop attendance. |
| Workshop attendees / Data publishers | Data publishers will benefit from both a field guide to LOG publication for multiple document types and a clear ontology against which their data can be validated. | constructive feedback through consultation; workshop attendance where possible. |
| End users (teachers, researchers, learners) | Easy-to-use web visualization tools will provide immediate access for researchers, students and members of the public. A REST-based Web service will give Digital Humanists the power to perform large scale queries and complex analyses. | interest in ancient geodata; adoption of tools and services. |
| JISC | Pelagios will provide JISC with a variety of infrastructure assets (ontology, documentation, tools) that enable it to offer strategic leadership in the field of Linked Open Geodata. | funding; involvement in project meetings and dissemination; synthesis project; programme activities. |

Monday, 21 February 2011

Pelagios Project Plan Part 1: Aims, Objectives and Final Output(s) of the project

Overall aim

To trial a method of linking open data (LOD) that enables scholars and enthusiasts alike to discover and make use of references to ancient places in maps, texts, images and tables.


Objectives

1. Define a Core Ontology for Place References (COPR). In discussion with external groups, the project partners will develop a base-line ontology to help users answer two kinds of query:

i) Within this document (text, map, database), give me a list/map of all the ancient places in it.

ii) Given this place, give me a list of documents (texts, maps, databases) that reference it.

i.e. we’re looking for the simplest method for enabling users to say with confidence that the document they’re looking at refers to place x (or vice versa), and then to bring up additional sources of information about it.

2. Trial this ontology on different document types (text, map, database) related to ancient world research, where each partner:

i) Aligns place references in their own document types to Uniform Resource Identifiers (URIs) in the Pleiades gazetteer of ancient places

ii) Generates Resource Description Framework (RDF) based on the ontology

iii) Documents that process so that it can be reproduced by others.

3. Create prototype tools and services, which are easily consumable by learners, educators, researchers and the public, to demonstrate the power and effectiveness of this approach.

i.e. answer the ‘so what?’ question by demonstrating some of the things that users will be able to do with this resource.
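
The two queries in Objective 1 amount to a lookup in both directions between documents and places. The sketch below illustrates that idea in plain Python, assuming each place reference has already been reduced to a (document URI, place URI) pair. The class name, method names and example.org identifiers are hypothetical; a real implementation would more likely query the partners' RDF.

```python
from collections import defaultdict


class PlaceReferenceIndex:
    """Two-way lookup between documents and the places they reference."""

    def __init__(self):
        self._places_by_document = defaultdict(set)
        self._documents_by_place = defaultdict(set)

    def add(self, document_uri, place_uri):
        """Record that a document refers to a place."""
        self._places_by_document[document_uri].add(place_uri)
        self._documents_by_place[place_uri].add(document_uri)

    def places_in(self, document_uri):
        """Query i: all places referenced within a document."""
        return sorted(self._places_by_document.get(document_uri, set()))

    def documents_referencing(self, place_uri):
        """Query ii: all documents that reference a place."""
        return sorted(self._documents_by_place.get(place_uri, set()))


# Hypothetical identifiers, for illustration only.
index = PlaceReferenceIndex()
index.add("http://example.org/texts/histories", "http://example.org/places/A")
index.add("http://example.org/texts/histories", "http://example.org/places/B")
index.add("http://example.org/maps/sheet-1", "http://example.org/places/A")

print(index.places_in("http://example.org/texts/histories"))
print(index.documents_referencing("http://example.org/places/A"))
```

Because every reference is stored under both keys, either question can be answered without scanning the whole collection.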


Outputs

1. A publicly available, lightweight core ontology which can then be modified or extended by different users: i.e. the ontology will permit modular extensions for different kinds of document so that details specific to each type will not add unnecessary complexity for users wishing to publish data in conformance with the core ontology. The COPR ontology will be immediately accessible as RDF/XML on the Pleiades website.

2. Documentation of the processes by which each partner identifies place references and aligns those references to Pleiades. Drawing on the consortium’s breadth of experience, we will detail these processes as a guide for others looking to adopt a linked open geodata (LOG) approach: i.e. although we’re using URIs based on Pleiades, the ontology should be able to work with other gazetteers (including those based on modern placenames). Project partners will host instance RDF describing their holdings on their own websites or in established repositories such as the TALIS Connected Commons.

3. Development of open source web services and neo-geographic web tools. These technologies will optimize machine-access to the raw RDF and make both the discovery and visualization (by map, list or ordered ‘timeline’) of the integrated geodata simple for learners, teachers, researchers and the general public. All software code will be made publicly available through an open repository (e.g. GitHub).
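
As a rough illustration of the instance RDF mentioned in Output 2, the sketch below writes place references out as N-Triples, one statement per line. It is not the COPR ontology: the predicate URI is a hypothetical stand-in, the document and gazetteer URIs are illustrative, and the function name is our own. The second reference points at a non-Pleiades gazetteer to show that nothing in the pattern depends on Pleiades itself.

```python
# Hypothetical predicate: a stand-in, not the real COPR vocabulary.
REFERS_TO = "http://example.org/copr#refersTo"


def to_ntriples(references):
    """Serialise (document URI, place URI) pairs as N-Triples, one statement per line."""
    return "\n".join(
        f"<{document}> <{REFERS_TO}> <{place}> ."
        for document, place in sorted(references)
    )


# Illustrative URIs only: a Pleiades-style place URI and one from another gazetteer.
references = [
    ("http://example.org/texts/histories", "http://pleiades.stoa.org/places/423025"),
    ("http://example.org/texts/histories", "http://example.org/modern-gazetteer/1"),
]

print(to_ntriples(references))
```

In practice an RDF library would be a safer choice than string formatting, since it handles escaping and other serialisations such as RDF/XML.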

Sunday, 6 February 2011

Welcome to PELAGIOS

PELAGIOS stands for 'Pelagios: Enable Linked Ancient Geodata In Open Systems'. The idea behind the project is simple, even if the actions to fulfill it - and the acronym - are not. On-line resources that reference ancient places are multiplying rapidly, bringing huge potential for the researcher provided that they can be found; but, even then, the user currently has no way of bringing the data together. PELAGIOS has teamed up with an international consortium of leading research groups to trial a method of linking open data (LOD) that will enable scholars and enthusiasts alike to discover all kinds of stuff related to ancient places and then to visualize it in accessible and meaningful ways.

The consortium of projects and research groups that make up PELAGIOS are as follows:

We'll be aiming to post the project's many turns pretty regularly (say, every other week or so), since we believe that the process as much as any outcome itself may be of interest to the community. Above all, since a project of this size and nature will only succeed if it has the input from those who are going to use it as a resource (i.e. you guys), we welcome your feedback. So join us in going places, ancient style.

Pelagios is funded by JISC as part of their #jiscGEO programme.