© 1993 by Association for Literary & Linguistic Computing
St Paul Revisited: Word Clusters in Multidimensional Space
Computer Centre, University of Keele, UK
H. H. Greenwood, Computer Centre, University of Keele, Keele, North Staffordshire ST5 5BG, UK.
Frequency counts of common words in the Greek texts of St Paul's letters separate the chapters into groups that correspond to individual epistles, and into clusters that correspond to the Missionary, Captivity and Pastoral letters. This significant agreement between the frequency counts and the classifications of the epistles accepted in New Testament scholarship as reflecting patterns of authorship is based on data for all fifty words that occur more than 100 times throughout the texts. A multivariate representation of the data is analysed by two methods: cluster analysis, which extracts clusters from within the distribution, and non-linear mapping, which also identifies clusters and determines whether they are genuinely separated or overlap in the multidimensional space. The clusters obtained from the common-word frequency data of the epistles are not recognizable within the distributions of the individual word variables; distinct, separated clusters emerge only from the cooperative effect of combining these distributions in the multivariate representation. Separated clusters therefore exist within the multidimensional space only.
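The claim that separated clusters can exist in the joint space without being visible in any single variable can be illustrated with a small synthetic example. The sketch below is not the paper's data or its clustering procedure: the two groups of points, the threshold-linkage rule, and the helper name `count_clusters` are all assumptions chosen for illustration. Two groups lie on parallel diagonal lines, so each coordinate taken alone forms one unbroken run of values, while the two-dimensional distances keep the groups apart.

```python
import numpy as np


def count_clusters(points, threshold):
    """Count groups of points chained together by distances below `threshold`.

    This is a simple single-linkage rule (hypothetical helper, used here
    only for illustration): two points share a cluster if a chain of
    neighbours, each closer than `threshold`, connects them.
    """
    points = np.asarray(points, dtype=float)
    if points.ndim == 1:
        points = points[:, None]  # treat a single variable as a 1-D space
    diff = points[:, None, :] - points[None, :, :]
    linked = np.sqrt((diff ** 2).sum(axis=-1)) < threshold

    unvisited = set(range(len(points)))
    n_clusters = 0
    while unvisited:
        stack = [unvisited.pop()]  # start a new cluster from any point
        n_clusters += 1
        while stack:
            i = stack.pop()
            neighbours = [j for j in np.flatnonzero(linked[i]).tolist()
                          if j in unvisited]
            unvisited.difference_update(neighbours)
            stack.extend(neighbours)
    return n_clusters


# Synthetic data (an assumption, not taken from the epistles):
# two groups on the parallel lines y = x + 1 and y = x - 1.
t = np.linspace(0.0, 10.0, 21)
group_a = np.column_stack([t, t + 1.0])
group_b = np.column_stack([t, t - 1.0])
data = np.vstack([group_a, group_b])

clusters_var0 = count_clusters(data[:, 0], threshold=1.0)  # first variable only
clusters_var1 = count_clusters(data[:, 1], threshold=1.0)  # second variable only
clusters_joint = count_clusters(data, threshold=1.0)       # both variables together

print(clusters_var0, clusters_var1, clusters_joint)
```

Each variable alone yields a single cluster, whereas the two variables combined yield two, which is the two-dimensional analogue of the cooperative effect described above for fifty word variables.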
Principal component analysis is used with base vector projections of the data to identify those words of the total set which largely account for the clustering properties of the multivariate distribution.
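As a rough illustration of how principal component analysis can point to the variables responsible for clustering, the sketch below computes component loadings for a synthetic data set. Everything in it is an assumption: the data are randomly generated rather than word counts, the function name `pca_loadings` is hypothetical, and reading "base vector projections" as the loadings of each variable axis on the principal components is one plausible interpretation, not a statement of the author's procedure.

```python
import numpy as np


def pca_loadings(X):
    """Return principal axes (rows) and the fraction of variance each explains.

    Row k of the returned matrix holds the projection of every variable's
    unit base vector onto principal component k, i.e. the loadings.
    """
    Xc = X - X.mean(axis=0)                      # centre each variable
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return Vt, explained


# Synthetic stand-in for a chapters-by-words table (not real frequencies):
# 40 "chapters", 6 "words"; only words 0 and 3 differ between two groups.
rng = np.random.default_rng(0)
n_per_group, n_words = 20, 6
X = rng.normal(0.0, 1.0, size=(2 * n_per_group, n_words))
X[:n_per_group, 0] += 3.0
X[:n_per_group, 3] -= 3.0
X[n_per_group:, 0] -= 3.0
X[n_per_group:, 3] += 3.0

components, explained = pca_loadings(X)

# Words with the largest absolute loading on the first component are the
# ones that contribute most to the separation of the two groups.
top_words = np.argsort(np.abs(components[0]))[::-1][:2]
print(sorted(top_words.tolist()), round(float(explained[0]), 2))
```

In this construction the first component is dominated by the two variables that were shifted between groups, which is the sense in which loadings can single out a subset of words from the full set.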