Wednesday 1 February 2012

Beneša's trigrams

Here is an experimental page for researching trigrams in the De morte Christi, a neo-Latin epic by Damianus Benessa (Damjan Beneša).

What we did:
1. using a concordance program (our reliable AntConc), we found trigrams in Beneša's Latin text, which we obtained courtesy of our colleague Vlado Rezar, Beneša's modern editor

2. we reformatted the trigrams slightly, using tr and sed, to make use of the excellent PhiloLogic crapser function (it is hard not to laugh thinking about this function, because in Croatian "serem" means "I crap")

3. using curl and a simple bash script, we sent the trigrams to CroALa

4. again using sed, we filtered out the successful hits, i.e. those which produced results

5. with some more sed, the hits were turned into searches on the page linked to at the beginning: [X]. There you'll find the trigram which produced the hit, the link to a saved search, and a report on the number of occurrences found in CroALa.
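Steps 2 to 4 can be sketched roughly as follows. This is only an illustration under stated assumptions, not our actual script: BASE_URL and the "word" parameter are placeholders rather than the real CroALa / PhiloLogic address, the "+"-joined query form is a guess at a reasonable encoding, and the hit test uses grep where we used sed. The only detail taken from the real reports is the marker text "Your search found".

```shell
#!/usr/bin/env bash
# Sketch of steps 2-4. BASE_URL and the "word" parameter are hypothetical
# placeholders, not the real CroALa / PhiloLogic interface.
BASE_URL="${BASE_URL:-http://example.org/search}"

# Step 2: turn a trigram such as "arma virumque cano" into "arma+virumque+cano".
# tr maps spaces to "+" and squeezes repeats; sed trims a leading/trailing "+".
to_query() {
  printf '%s\n' "$1" | tr -s ' ' '+' | sed -e 's/^+//' -e 's/+$//'
}

# Step 4: succeed only if a saved report contains the hit marker.
# (grep -q is used here for its exit status; the original filtering used sed.)
is_hit() {
  grep -q 'Your search found' "$1"
}

# Step 3: send every trigram in a file (one per line) and keep only the
# reports that contain hits. Prints "trigram<TAB>report file" for each hit.
run_searches() {
  local trigram query n=0
  while IFS= read -r trigram; do
    [ -n "$trigram" ] || continue
    n=$((n + 1))
    query=$(to_query "$trigram")
    curl -s "${BASE_URL}?word=${query}" -o "report-${n}.html" || continue
    if is_hit "report-${n}.html"; then
      printf '%s\t%s\n' "$trigram" "report-${n}.html"
    else
      rm -f "report-${n}.html"
    fi
  done < "$1"
}
```

With a file of trigrams, one per line, something like `run_searches trigrams.txt > hits.tsv` would leave a list of the trigrams that produced results; nothing is sent until run_searches is called.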

The most interesting findings for us are occurrences from Marko Marulić and Jakov Bunić, close contemporaries of Beneša; Marulić and Bunić also wrote Biblical epic poems in Latin (and Marulić's epic remained in manuscript until the 1950s).

The useful sed snippet which prints each line matching the regex, together with the line immediately before it, is here:
sed -n '/Your search found/{x;1!p;g;p;};h' ben-filename


(Adapted from that goldmine, the Sed one-liners.)
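A small demonstration of the idea, using the command reduced to the parts that print the match and the line before it. The sample file is invented for illustration (only the marker text "Your search found" comes from the real reports).

```shell
# Build a small sample report; the contents are invented for illustration.
sample=$(mktemp)
printf '%s\n' \
  'header line' \
  'trigram: arma virumque cano' \
  'Your search found 2 occurrences' \
  'footer line' > "$sample"

# The hold space always carries the previous line (h at the end of each cycle).
# On a match: x swaps it in, 1!p prints it (unless we are on line 1),
# g brings the matching line back, p prints it.
context=$(sed -n '/Your search found/{x;1!p;g;p;};h' "$sample")
printf '%s\n' "$context"
rm -f "$sample"
```

Here the output is the "trigram:" line followed by the "Your search found" line; the header and footer are dropped.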

Thinking about PND

An important part of our research is finding the Personennamendatei (PND) numbers of Croatian Latin authors and adding each number to the author's personal data record. So far, 83 authors (of 241 from our experimental set) have been connected with their PND numbers.

Now we're looking into the ways Wikipedia (at least, German Wikipedia) uses the PND to uniquely identify persons and connect data on them. The BEACON format seems a nice start for a small catalogue like ours. And, of course, it would be nice if Croatian Wikipedia decided to adopt something similar to the PND scheme.
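To give an idea of how little a BEACON file needs, here is a minimal sketch that writes one from a plain list of PND numbers. Everything specific in it is an assumption: the #PREFIX and #TARGET values are placeholders (example.org is not our address), the header fields follow our reading of the BEACON draft and should be checked against it, and the numbers in the sample are dummies, not real PND identifiers.

```shell
# make_beacon FILE: print a BEACON file for the PND numbers listed in FILE
# (one number per line). Header values are hypothetical placeholders.
make_beacon() {
  printf '%s\n' \
    '#FORMAT: BEACON' \
    '#PREFIX: http://d-nb.info/gnd/' \
    '#TARGET: http://example.org/author/{ID}'
  # Drop blank lines, then sort and remove duplicate numbers.
  sed '/^[[:space:]]*$/d' "$1" | sort -u
}

# Dummy input: placeholder numbers only, with one duplicate and one blank line.
pnd_list=$(mktemp)
printf '%s\n' '000000002' '' '000000001' '000000002' > "$pnd_list"
beacon=$(make_beacon "$pnd_list")
printf '%s\n' "$beacon"
rm -f "$pnd_list"
```

Each body line is one identifier; a BEACON consumer substitutes it for {ID} in the target pattern to build the link to the matching author record.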