Tuesday 15 December 2009

Collation Software for Editing

A query came up on the Exlibris List about "Collation Software," by which the person posting the query meant the computer equivalent of a Hinman Collator: a device that allows the close comparison of type features of multiple copies of the same page. Famously, Hinman used his Collator to compare multiple copies of the First Folio of Shakespeare's works to identify many slightly different impressions of each sheet and page.

As is often the case on academic lists, the answers to this query veered off in a slightly different direction. This was partly because there is no Hinman-style collation software, but it was also because "Collation Software" can mean the collation (i.e. comparison) by textual editors of multiple copies of the same text.

Comparing different editions/issues/states of the same text is necessary if one is to create a single text from multiple—and conflicting—texts (or witnesses). This sort of comparison is also necessary because it establishes what differences exist between texts, which enables an editor to establish the relationships between them (whether Text C is a reprint of Text A or Text B) and the importance of these differences (whether Text B was corrected by the Author or by the printer). These differences are usually accounted for, and evaluated in, the critical apparatus of a critical edition, specifically in the list of variant readings, in a series of lemma, stemma and sigla. (See here for more on Copy-text editing.)

As it turns out, there is a free, open-source collation tool called Juxta that will generate a list of variant readings (i.e., a full list of lemma and stemma) from any number of witnesses. The software allows users to set any of the witnesses as the base text, to add or remove witness texts, and to switch the base text at will. The primary collation gives a split-frame comparison of a base text with a witness text. Juxta can also display a "heat map" of all textual variants, or a "histogram" to display the density of variations.
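For readers curious about what a collation against a base text involves, here is a minimal sketch in Python. It is not how Juxta itself works (I have not looked at its internals); it simply uses the standard library's difflib to compare each witness, word by word, against a chosen base text. The function name collate and the sigla "B" and "C" are my own, made up for illustration.

```python
from difflib import SequenceMatcher

def collate(base, witnesses):
    """Compare each witness with the base text, word by word.

    `witnesses` maps a siglum (e.g. "B") to that witness's text.
    Returns a list of (position, lemma, {siglum: reading}) entries,
    where `lemma` is the base reading and `position` is the index of
    its first word in the base text.
    """
    base_words = base.split()
    entries = {}
    for siglum, text in witnesses.items():
        words = text.split()
        matcher = SequenceMatcher(a=base_words, b=words, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                continue  # the witness agrees with the base here
            # Key by the span of base words so that witnesses varying
            # at the same place are gathered under the same lemma.
            entries.setdefault((i1, i2), {})[siglum] = " ".join(words[j1:j2])
    return [
        (i1, " ".join(base_words[i1:i2]), readings)
        for (i1, i2), readings in sorted(entries.items())
    ]

schedule = collate(
    "the quick brown fox",
    {"B": "the quick red fox", "C": "the slow brown fox"},
)
for position, lemma, readings in schedule:
    print(position, lemma, readings)
```

Switching the base text is just a matter of passing a different witness as the first argument and collating again. An empty lemma in the output means the witness adds words that the base text lacks; an empty reading means the witness omits them.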

I tested the software out very briefly on a few texts that I am editing and was delighted both with the split-frame comparison and the "lemmatized schedule" (the list of variant readings, in a series of lemma, stemma). My only concern thus far is that it seems that the texts must be stripped of all font-formatting before they can be compared and "lemmatized." So, every instance of italics or small caps being added, removed or reversed is lost. This is a huge loss, because the difference between "bite me" in roman type and "bite me" with italics is just as important as that between "bite me" and "boot me."
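One way around the formatting problem, at least in principle, is to compare words together with their styling rather than bare words. The sketch below is only an illustration of that idea, not a feature of Juxta. It assumes a transcription convention of my own invention, in which an italicized word is wrapped in underscores (so _me_ stands for "me" in italics), and the function names tokenize and variants are likewise hypothetical.

```python
from difflib import SequenceMatcher

def tokenize(text):
    """Split text into (word, style) pairs.

    Assumed convention: a word wrapped in underscores, like _me_,
    is italic; everything else is roman.
    """
    tokens = []
    for raw in text.split():
        if len(raw) > 2 and raw.startswith("_") and raw.endswith("_"):
            tokens.append((raw[1:-1], "italic"))
        else:
            tokens.append((raw, "roman"))
    return tokens

def variants(base, witness):
    """List the places where the witness differs from the base,
    counting a change of style as a difference."""
    a, b = tokenize(base), tokenize(witness)
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

print(variants("bite me", "bite _me_"))  # a change of style only
print(variants("bite me", "boot me"))    # a change of wording
```

Because each token carries its style, a word that moves from roman to italic shows up as a variant in exactly the same way as a word that is replaced by another.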

Nevertheless, the software will come in very handy when I am trying to establish the relationship between the twelve editions of the text I am working on. And it will be great to be able to generate a "lemmatized schedule" against which I can check the list of variant readings I have compiled the old way. And, at the price, who can complain?
