Using Pleiades names starts with
getting the specific ID for each name. Pleiades provides a search page that returns
these IDs: http://pleiades.stoa.org/places/
On that page the user can type in the
required name and read the ID off the resulting URI.
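As a small illustration of that last step, here is a minimal Python sketch that pulls the ID out of a place URI. It assumes the URI has the form http://pleiades.stoa.org/places/ followed by a numeric ID; the function name `pleiades_id_from_uri` is made up for this example, and the ID in the usage line is only there for illustration.

```python
from urllib.parse import urlparse


def pleiades_id_from_uri(uri: str) -> str:
    """Return the numeric ID from a URI of the form .../places/<id>.

    Assumption: the ID is the path segment right after 'places'.
    """
    path = urlparse(uri).path                    # e.g. "/places/423025"
    parts = [p for p in path.split("/") if p]    # drop empty segments
    if len(parts) < 2 or parts[0] != "places" or not parts[1].isdigit():
        raise ValueError(f"not a Pleiades place URI: {uri!r}")
    return parts[1]


# Illustrative usage with an example URI
print(pleiades_id_from_uri("http://pleiades.stoa.org/places/423025"))
```

The ID is kept as a string rather than an integer, since it is only ever used to build URIs again.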
SquinchPix never actually maintained
place-name information for the pictures in its DB as a field of its own. The place names are embedded in the captions,
of course, and in the tag or keyword tables.
But the tag ‘Rome’ is not treated any differently from the tag for,
e.g., ‘concrete’ or any other tag. As a
result there is no easy way to specifically pick out place name tags in an
automated fashion. What SquinchPix has
done all along is maintain a pretty accurate lat/long pair for every
picture. It’s the lat/long pairs that
drive location services on SquinchPix such as the Google map that gets
generated dynamically for every image.
In order to participate in the
Pelagios project SquinchPix decided to make two changes to the DB. First, in the table which contains information for
each picture (‘PI’), a field was added for an unambiguous modern name for the
location of the picture.
Then came the work of actually rooting
out the place names from the keyword table and associating the right place name
with the right picture. We wrote a
script that looked for all the pictures that were keyworded ‘Rome’ and entered the word
‘Rome’ in the new dedicated place name field for each of them. For the pictures that were NOT
keyworded ‘Rome’ the script simply dumped out the captions. Then we inspected those captions looking for
more place names. Next came ‘Athens’,
then ‘Mycenae’, ‘Naples’, ‘Tiryns’ and the rest. For each new place name the script labeled
that many more pictures and forced out fewer and fewer captions. From 20,000 pictures without place names we
used iteration to reduce that number to about 300 after two days of work. By the end of that time each locatable
picture had a specific place name associated with it. The remainder were almost all pictures of
artifacts with no secure find spot. That
remainder could probably be identified with some larger Pleiades-compliant name
such as ‘Syria’, or ‘Mediterranean’ but that work is for another stage.
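The loop described above can be sketched roughly as follows. This is not the original script, just a minimal Python illustration under some assumptions: pictures and keywords are held in plain dictionaries rather than DB tables, the function name `label_place` and the sample data are invented, and a picture that already has a place name is left alone on later passes.

```python
def label_place(pictures, keywords, place):
    """Label every still-unlabeled picture keyworded `place`.

    Returns the captions of the pictures that remain unlabeled, which
    can then be inspected by eye for the next place name to try.
    """
    for pic_id, pic in pictures.items():
        if pic["place"] is None and place in keywords.get(pic_id, set()):
            pic["place"] = place
    return [pic["caption"] for pic in pictures.values() if pic["place"] is None]


# Hypothetical sample data standing in for the picture and keyword tables
pictures = {
    1: {"caption": "Arch of Titus, Rome", "place": None},
    2: {"caption": "Parthenon, west front, Athens", "place": None},
    3: {"caption": "Bronze dagger, provenance unknown", "place": None},
}
keywords = {
    1: {"Rome", "arch"},
    2: {"Athens", "temple"},
    3: {"bronze", "dagger"},
}

# Each pass labels more pictures and dumps out fewer captions
for place in ["Rome", "Athens"]:
    remaining = label_place(pictures, keywords, place)
    print(f"{place}: {len(remaining)} captions still unlabeled")
```

The captions returned after the last pass correspond to the remainder discussed above, such as artifacts with no secure find spot.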
The second big change to the DB was the creation of a separate table that
uses the same modern place name established in the first change as a key to a
pair of values: the
corresponding Pleiades-compliant name and the Pleiades ID. This table was populated by hand, entry by
entry. On SquinchPix there are about 170
distinct and unambiguous place names so that there are that many records in
this new table. In addition to using the
Pleiades look-up facility we made use of the .kml which we ran in Google Earth
in parallel. If we couldn’t find the
place in Google Earth then we used the look-up facility. Even though it involved a much smaller
number of records, populating the table by hand took about four days.
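To make the shape of the two tables concrete, here is a hedged sketch using Python's built-in sqlite3 module and an in-memory database. Only the table name ‘PI’ comes from the description above; the lookup table name `place_lookup`, all column names, and the sample rows are assumptions made for the example, and the real SquinchPix DB may look quite different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Picture table with the new modern place name field (columns assumed)
    CREATE TABLE PI (
        id      INTEGER PRIMARY KEY,
        caption TEXT,
        place   TEXT
    );
    -- Hypothetical lookup table: modern name -> Pleiades name/ID pair
    CREATE TABLE place_lookup (
        modern_name   TEXT PRIMARY KEY,
        pleiades_name TEXT NOT NULL,
        pleiades_id   TEXT NOT NULL
    );
""")

# One hand-entered lookup record and two illustrative pictures
conn.execute("INSERT INTO place_lookup VALUES ('Rome', 'Roma', '423025')")
conn.executemany(
    "INSERT INTO PI VALUES (?, ?, ?)",
    [(1, "Arch of Titus, Rome", "Rome"),
     (2, "Bronze dagger, provenance unknown", None)],
)

# Go from each picture to its Pleiades name/ID pair via the modern name
rows = conn.execute("""
    SELECT PI.id, place_lookup.pleiades_name, place_lookup.pleiades_id
    FROM PI
    JOIN place_lookup ON PI.place = place_lookup.modern_name
    ORDER BY PI.id
""").fetchall()
print(rows)
```

Pictures with no place name simply drop out of the join, and any change on the Pleiades side touches only the lookup table.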
Once that
table was populated we had a secure way of going from the specific picture to
its modern place name and then to the Pleiades-compliant name/ID pair. Now we
simply wrote a script that would traverse all the pictures, get the
Pleiades-compliant name and number and use it to write out the Turtle-compliant
record. In this way (the extra table, that is) we could confine the fluctuating
nature of the Pleiades project to a ‘localized’ corner of the DB. We anticipate
that this table in our DB will change and will be maintained and updated on an
ongoing basis. The reason is that Pleiades is dynamic, and our
ideas about specific places and names may not mesh cleanly with theirs in all
instances, which necessitates the occasional negotiation. To their credit they
are very responsive to questions and suggestions about place names. I would
urge anyone engaged in a conversion project to communicate with them whenever better
ideas about place names or locations should surface.
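For readers who want a feel for the final step, the sketch below writes one Turtle statement per picture from (picture URI, Pleiades name, Pleiades ID) rows. It is an illustration only: the `ex:` vocabulary, the predicate names `ex:depictsPlace` and `ex:placeLabel`, the picture URI, and the function names are all placeholders, not the vocabulary Pelagios actually expects, so the real predicates would need to be swapped in.

```python
PLEIADES_BASE = "http://pleiades.stoa.org/places/"


def turtle_literal(text):
    """Quote a string as a Turtle literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def picture_to_turtle(picture_uri, pleiades_name, pleiades_id):
    """Build one Turtle record linking a picture to a Pleiades place."""
    return (
        f"<{picture_uri}> ex:depictsPlace <{PLEIADES_BASE}{pleiades_id}> ;\n"
        f"    ex:placeLabel {turtle_literal(pleiades_name)} .\n"
    )


def write_turtle(rows):
    """Return a Turtle document for rows of (picture URI, name, ID)."""
    # Placeholder prefix; replace with the vocabulary actually required
    header = "@prefix ex: <http://example.org/vocab#> .\n\n"
    return header + "\n".join(picture_to_turtle(*row) for row in rows)


# Hypothetical picture URI, used only for illustration
doc = write_turtle([("http://example.org/pictures/1", "Roma", "423025")])
print(doc)
```

Because the Pleiades name and ID arrive as plain row values, the writer never needs to know about Pleiades changes; those stay confined to the lookup table.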