Literary and Linguistic Computing Advance Access originally published online on July 13, 2009
Literary and Linguistic Computing 2009 24(4):467-489; doi:10.1093/llc/fqp027
An exercise in non-ideal authorship attribution: the mysterious Maria Ward
Department of English, New York University, USA
Correspondence: David L. Hoover, Department of English, New York University, 19 University Place, 5th Floor, New York, NY 10003, USA. E-mail: david.hoover{at}nyu.edu
Abstract
The dangers of computational approaches to authorship attribution in the absence of an adequate set of training texts for the claimant authors are well known. This study aims to show, however, that significant progress can be made even where conditions are quite problematic. We investigate a difficult authorship question involving three texts, ostensibly by three authors, each of whom wrote nothing else. Only one of the texts can be unquestionably ascribed to a known author, and this author has been suggested as the true author of one of the two remaining texts. We investigate these three texts, along with similar texts by other authors, using cluster analysis, Delta analysis, t-testing, and PCA. We also create simulations of our authorship problem using sets of three texts of known authorship by one, two, and three authors. We test these sets using correct and incorrect assumptions of authorial difference, and then compare the results with analyses of our three texts based on the same range of assumptions. By combining information from all of these tests, we achieve what we believe is a persuasive, if not conclusive, solution to a significant and long-standing question concerning the authorship of Maria Ward's violently anti-Mormon Female Life Among the Mormons. At the same time, we demonstrate methods for making progress in cases where conditions are less than ideal.
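For readers unfamiliar with Delta analysis, the following is a minimal sketch of Burrows's Delta in its commonly described form: the mean absolute difference between the z-scores of the most frequent words in a test text and in each candidate text. It is an illustration only, not the procedure used in this study. The function names, the whitespace tokenization, the default word-list size, and the use of the population standard deviation are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(text, vocab):
    """Relative frequency of each vocab word in text (naive whitespace tokens)."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # guard against empty input
    return [counts[w] / total for w in vocab]


def burrows_delta(test_text, candidate_texts, n_words=150):
    """Return {author: Delta} for a test text; a lower Delta means more similar.

    candidate_texts is a hypothetical dict mapping author name -> text.
    n_words (assumed default) is the number of most frequent words compared.
    """
    # Word list: the most frequent words across all candidate texts.
    corpus_counts = Counter()
    for text in candidate_texts.values():
        corpus_counts.update(text.lower().split())
    vocab = [w for w, _ in corpus_counts.most_common(n_words)]

    freqs = {a: relative_freqs(t, vocab) for a, t in candidate_texts.items()}
    test = relative_freqs(test_text, vocab)

    deltas = {a: 0.0 for a in candidate_texts}
    used = 0
    for i, _ in enumerate(vocab):
        column = [freqs[a][i] for a in freqs]
        mu, sigma = mean(column), pstdev(column)
        if sigma == 0:
            continue  # the word does not discriminate among the candidates
        used += 1
        z_test = (test[i] - mu) / sigma
        for a in freqs:
            z_cand = (freqs[a][i] - mu) / sigma
            deltas[a] += abs(z_test - z_cand)

    # Average over the words that actually contributed.
    return {a: d / used for a, d in deltas.items()} if used else deltas


# Toy usage with made-up texts: the test text is a copy of candidate "A".
candidates = {
    "A": "the the the of of and a cat sat",
    "B": "the of of of and and and a dog ran",
    "C": "the and a a a a bird flew of",
}
scores = burrows_delta(candidates["A"], candidates, n_words=5)
closest = min(scores, key=scores.get)
```

With texts this short the scores mean little; in practice Delta is applied to texts of thousands of words, and the candidate with the lowest score is treated as the most likely author only in combination with other evidence.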