Literary and Linguistic Computing Advance Access originally published online on December 6, 2008
Literary and Linguistic Computing 2008 23(4):465-491; doi:10.1093/llc/fqn040
Reassessing authorship of the Book of Mormon using delta and nearest shrunken centroid classification*
Department of English, Stanford University, Stanford, CA 94305, USA
Department of Statistics, Stanford University, Stanford, CA 94305, USA
Department of Civil and Environmental Engineering, Stanford University, Stanford, CA 94305, USA
Correspondence: Matthew L. Jockers, Department of English, Stanford University, Stanford, CA 94305, USA. E-mail: mjockers{at}stanford.edu
Abstract
Mormon prophet Joseph Smith (1805–44) claimed that more than two dozen ancient individuals (Nephi, Mormon, Alma, etc.) living from around 2200 BC to 421 AD authored the Book of Mormon (1830), and that he translated their inscriptions into English. Later researchers who analyzed selections from the Book of Mormon concluded that differences between selections supported Smith's claim of multiple authorship and ancient origins. We offer a new approach that employs two classification techniques: delta, which is commonly used to determine probable authorship, and nearest shrunken centroid (NSC), a more generally applicable classifier. We use both methods to determine, on a chapter-by-chapter basis, the probability that each of seven potential authors wrote or contributed to the Book of Mormon. Five of the seven have known or alleged connections to the Book of Mormon; the other two do not and were added as controls based on their thematic, linguistic, and historical similarity to the Book of Mormon. Our results indicate that the likely nineteenth-century contributors were Solomon Spalding, a writer of historical fantasies; Sidney Rigdon, an eloquent but perhaps unstable preacher; and Oliver Cowdery, a schoolteacher with editing experience. Our findings support the hypothesis that Rigdon was the main architect of the Book of Mormon and are consistent with historical evidence suggesting that he fabricated the book by adding theology to the unpublished writings of Spalding (then deceased).
*Shortly after publication of the Advance Access version of this paper, Jockers discovered a minor preprocessing error in the data file. His original text-processing script failed to account for the possibility that two hyphens (- -) might be used as a substitute for the em dash. As a result, a very small number of word types were incorrectly tokenized and an even smaller number of words were miscounted. For example, the n-gram "age- -and" was tokenized as a unique word type instead of being counted as one instance each of the words "aged" and "and". Jockers became aware of this error on January 9, 2009, and immediately corrected the tokenization script and reprocessed the data. Witten then reran the winnowing algorithm and both the NSC and delta procedures. The minor corrections to the data file did not change the winnowed set of words used by NSC. In all but one case, the classification results given by NSC were also unchanged. The only change in classification occurred in chapter 147 (Alma 52) of the Book of Mormon. Instead of Rigdon being the most likely candidate and Spalding the second most likely, NSC reported the reverse: Spalding as most likely and Rigdon as second most likely. In the original results, NSC ranked Rigdon at 0.4646 and Spalding at 0.4628. With the corrected data, NSC ranked Rigdon at 0.4626 and Spalding at 0.46525. The corrected data file was uploaded to the supplementary materials URL on January 12.
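The kind of tokenization error described in this note can be illustrated with a minimal Python sketch. This is not the authors' script, whose language and logic are not given here. The regular expressions, the function names `naive_tokenize` and `tokenize`, and the sample sentence are all assumptions made for illustration. The sketch contrasts a tokenizer that treats every hyphen as word-internal with one that first replaces a double hyphen (with or without an intervening space) by a word break.

```python
import re
from collections import Counter
from typing import List

# Assumption: two hyphens, optionally separated by one whitespace character,
# stand in for an em dash and should act as a word boundary.
DASH_SUBSTITUTE = re.compile(r"-\s?-")

# Assumption: a word is a run of letters, optionally joined by single
# internal hyphens or apostrophes (e.g. "well-known", "smith's").
WORD = re.compile(r"[a-z]+(?:['-][a-z]+)*")

# Hypothetical pattern that reproduces the class of error: any run of
# letters, hyphens, and apostrophes is taken to be one word.
NAIVE_WORD = re.compile(r"[a-z'-]+")


def naive_tokenize(text: str) -> List[str]:
    """Tokenize without special handling of double hyphens."""
    return NAIVE_WORD.findall(text.lower())


def tokenize(text: str) -> List[str]:
    """Tokenize after turning em-dash substitutes into word breaks."""
    normalized = DASH_SUBSTITUTE.sub(" ", text.lower())
    return WORD.findall(normalized)


# Invented sample sentence, not taken from any of the analyzed texts.
sample = "He was old- -and weary--and yet well-known"

naive_tokens = naive_tokenize(sample)
fixed_tokens = tokenize(sample)
fixed_counts = Counter(fixed_tokens)

print(naive_tokens)
print(fixed_tokens)
print(fixed_counts["and"])
```

In the naive version the fragments around each double hyphen become spurious word types, so the occurrences of "and" are never counted as "and". In the corrected version both occurrences are counted, while the single hyphen in "well-known" is left intact. Because classifiers of this kind rely on relative word frequencies, even a few such miscounts can shift scores slightly, which is consistent with the small changes reported above.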