© 2004 by Association for Literary & Linguistic Computing
Testing Burrows's Delta
New York University, USA
Delta, a simple measure of the difference between two texts, has been proposed by John F. Burrows as a tool for authorship attribution problems, particularly large open problems in which conventional methods of attribution cannot limit the claimants effectively. This paper tests Delta's effectiveness and accuracy and shows that it works nearly as well on prose as it does on poetry. It also shows that using much larger numbers of frequent words gives even more accurate results than the 150 that Burrows tested. Automated methods that allow for tests on large numbers of differently selected words show that removing personal pronouns, and words for which a single text supplies most of the occurrences, greatly increases the accuracy of Delta tests. Further tests suggest that large changes in Delta and Delta z-scores from the likeliest to the second likeliest author typically characterize correct attributions, that differences in point of view among the texts are more significant than differences in nationality, and that combining several texts for each author in the primary set reduces the effect of intra-author variability. Although Delta occasionally produces errors in attribution with characteristics that would normally lead to a great deal of confidence, the results presented here confirm its usefulness in the preliminary stages of authorship attribution problems.
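For readers unfamiliar with the measure, the following is a minimal sketch of Delta as it is commonly described: the mean absolute difference between the z-scores of the most frequent words in a test text and in each candidate author's text. It is not the procedure used in this paper. The function names (`relative_freqs`, `burrows_delta`), the use of the population standard deviation across candidates, and the skipping of words with zero variance are all assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary word in a token list."""
    counts = Counter(tokens)
    total = len(tokens)
    return {w: counts[w] / total for w in vocab}


def burrows_delta(test_tokens, candidates, n_words=150):
    """Return {author: Delta}; a lower Delta means a likelier author.

    candidates maps a (hypothetical) author label to a list of tokens.
    """
    # Most frequent words across all candidate texts combined.
    corpus = Counter()
    for tokens in candidates.values():
        corpus.update(tokens)
    vocab = [w for w, _ in corpus.most_common(n_words)]

    freqs = {a: relative_freqs(t, vocab) for a, t in candidates.items()}
    test = relative_freqs(test_tokens, vocab)

    sums = {a: 0.0 for a in candidates}
    used = 0
    for w in vocab:
        values = [freqs[a][w] for a in candidates]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            # Assumption: a word with no variation carries no signal.
            continue
        used += 1
        z_test = (test[w] - mu) / sigma
        for a in candidates:
            z_author = (freqs[a][w] - mu) / sigma
            sums[a] += abs(z_test - z_author)

    # Delta is the mean absolute z-score difference over the words used.
    return {a: s / used for a, s in sums.items()} if used else sums
```

Sorting the returned dictionary by value gives the likeliest and second likeliest authors, whose gap in Delta is the kind of quantity the abstract refers to. The `n_words` parameter corresponds to the number of frequent words tested, 150 in Burrows's original proposal.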
* Correspondence: David L. Hoover, Department of English, New York University, 19 University Place, 5th Floor, New York, NY 10003, USA. E-mail: david.hoover{at}nyu.edu