2008-07-14T07:57:40Z
Dave Pawson.
The family Album
Like many families, we have a small collection of prints dating back a hundred years or so, many fading, some damaged. This weekend I took hold of ours from my sister and digitized them. The original intent was to simply put them on a CD. I did more.
Whilst scanning a few hundred prints the mind wanders. I started to think about the comments on the back of the prints, mainly identifying the people. How to capture them? A natural move took me to a website. To take a break from scanning I wrote a bit of code to take care of that.
I had six directories of JPEGs and a Python script that lists directory contents. I tweaked that to create an entry like this for each file:
<file name="xxx.jpg">
<title>.</title>
<person nm=""/>
<!--
<table>
<row>
<td><person nm=""/></td>
<td><person nm=""/></td>
<td><person nm=""/></td>
</row>
</table>
-->
<desc> </desc>
<place> </place>
<date> </date>
</file>
The filename is lodged in the name attribute. The person name(s) are regimented to a lowercase maiden name (where known), used as a form of relax token. This allows the same name in two directories. A placeholder space in the desc and place elements saves me a touch of editing. desc also takes para children. For those photos having rows of people, the table provides a means of placing the names in a fairly similar arrangement. I named each scanned file after the people in it, or the event, so I have files such as fredDoreenAmelia.jpg.
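The tweaked directory-listing script is not shown here. As a minimal sketch of how such a script might look, assuming one flat directory of JPEGs and a trimmed version of the skeleton above (the `ENTRY` template and the `skeleton` function are hypothetical names, not the original script):

```python
import os
from xml.sax.saxutils import quoteattr

# Hypothetical template: a trimmed copy of the per-image skeleton,
# without the commented-out table.
ENTRY = """<file name={name}>
<title>.</title>
<person nm=""/>
<desc> </desc>
<place> </place>
<date> </date>
</file>"""


def skeleton(directory):
    """Return one empty <file> entry per JPEG in `directory`, sorted by name."""
    names = sorted(
        n for n in os.listdir(directory)
        if n.lower().endswith((".jpg", ".jpeg"))
    )
    # quoteattr adds the surrounding quotes and escapes any awkward characters.
    return "\n".join(ENTRY.format(name=quoteattr(n)) for n in names)
```

Calling `skeleton()` once per directory and pasting the results into the master file would leave only the titles, names, places and dates to fill in by hand.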
Even more boring, I then went through each image, with the photos alongside, filling in the person values. The description was used to provide those extras that the genealogists love, and also to account for people who are unconnected to our family.
As a rest from this I wrote three XSLT scripts. One generates HTML from the pawson.xml file, the master list of files. A second abstracts the names and points back at the images, so Amelia has an entry which points to all the photographs in which she is to be found. This is where the name values come in: they form the 'person' entry.
<person nm="ameliasmith">
<name>.</name>
<f href="ameliaArthur.html">
<ttl>Amelia and Arthur </ttl>
</f>
<f href="Anne.80th.html">
<ttl> </ttl>
</f>
</person>
The ttl value comes from the master file image title. The name is the shown name of the person. (I chose their present-day names, where they are known!) That job was relatively simple: filling in the ttl element using the nm attribute. I realised that the list of names (persons.xml) was unstable, so I created a file to extract just the person element and its @nm attribute, and added the ttl value as their full name. This is preserved as a separate file.
<persons> <person nm="dorisspeak" full="Doris Speak"/> ...
I then merged this data with the master file to create the persons file. A third XSLT script then transformed the persons.xml file into HTML, complete with index, to list all the people and link back to their photographs.
I can now hand back to my sister a digital copy of the album and wait for the errors to be discovered! The job was too repetitive not to have errors.
A good weekend's work. Two HTML files indexing by image and person, plus the images as they were scanned.
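The merge itself was done in XSLT, which is not reproduced here. Purely as an illustration, the same grouping can be sketched in Python with the standard library. The `build_persons` function is a hypothetical name, and the rule that each image page is the file name with `.html` in place of the extension is an assumption drawn from the href values shown earlier.

```python
import xml.etree.ElementTree as ET


def build_persons(master_root, names_root):
    """Group photographs by person token and attach each person's full name."""
    # Lookup from the separate names file: nm token -> full name.
    full = {p.get("nm"): p.get("full") for p in names_root.iter("person")}
    persons = ET.Element("persons")
    by_nm = {}
    for f in master_root.iter("file"):
        # Assumed page naming: fredDoreenAmelia.jpg -> fredDoreenAmelia.html
        page = f.get("name").rsplit(".", 1)[0] + ".html"
        title = f.findtext("title", default="")
        # iter() also finds <person> elements nested inside a <table>;
        # empty nm values (unfilled skeleton entries) are skipped.
        for nm in sorted({p.get("nm") for p in f.iter("person") if p.get("nm")}):
            if nm not in by_nm:
                person = ET.SubElement(persons, "person", nm=nm)
                # Fall back to the raw token when no full name is recorded.
                ET.SubElement(person, "name").text = full.get(nm, nm)
                by_nm[nm] = person
            link = ET.SubElement(by_nm[nm], "f", href=page)
            ET.SubElement(link, "ttl").text = title
    return persons
```

Given the two parsed documents, `ET.tostring(build_persons(master, names), encoding="unicode")` would give a persons file shaped like the excerpt above, ready for a further transform to HTML.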
Keywords: photography