Literary and Linguistic Computing Advance Access originally published online on April 12, 2006
Literary and Linguistic Computing 2006 21(2):169-178; doi:10.1093/llc/fql019
A Prototype for Authorship Attribution Studies
Duquesne University, Pittsburgh
Correspondence: Patrick Juola, Duquesne University, Pittsburgh, PA 15282, USA. E-mail: juola{at}mathcs.duq.edu
Despite a century of research, statistical and computational methods for authorship attribution are neither reliable, well regarded, widely used, nor well understood. This article presents a survey of the current state of the art, together with a framework for the uniform and unified development of a tool that applies it despite the wide variety of methods and techniques in use. The usefulness of the framework is confirmed by the development of a tool, built on that framework, that researchers without a computing specialization can apply to authorship analysis. Using this tool, it may be possible both to expand the pool of available researchers and to enhance the quality of the overall solutions [for example, by incorporating improved algorithms as discovered through empirical analysis (Juola, P. (2004a). Ad-hoc Authorship Attribution Competition. In Proceedings 2004 Joint International Conference of the Association for Literary and Linguistic Computing and the Association for Computers and the Humanities (ALLC/ACH 2004), Göteborg, Sweden)].