Literary and Linguistic Computing Advance Access originally published online on February 25, 2005
Literary and Linguistic Computing 2005 20(1):133-151; doi:10.1093/llc/fqh048
Documents and Data: Modelling Materials for Humanities Research in XML and Relational Databases
King's College London, UK
John Bradley, Centre for Computing in the Humanities, King's College London, Strand, London WC2R 2LS, UK. E-mail: john.bradley{at}kcl.ac.uk
In this paper we describe the mix of text-oriented and data-oriented materials that arose while conceptualising the Durham Liber Vitae (DLV) project. We have found such a mixture of text- and data-oriented materials to be common in our projects. We have also found that some aspects of the conceptual orientation of SGML and XML markup, particularly the strong preference for asserting associations between elements by hierarchy and containment (the OHCO model), have often obscured the presence of data-oriented (non-hierarchical) elements in the materials, or encouraged inadequate ways of representing them, or both. Although discussion of XML and its modelling abilities within the Computing Humanities community has tended to focus on issues arising in the OHCO model, the OHCO model is not the only modelling approach that XML markup provides. This paper demonstrates a way of taking conventional data modelling diagrams (inherently not OHCO in orientation) and modelling them for XML markup in a way that uses XML's preferred OHCO/containment approach wherever possible, and XML's link-oriented association approach (e.g. ID/IDREF) between different hierarchies when essential. It then touches on aspects of ownership and reference that seem to lie behind XML's containment and linking association strategies. Finally, it describes some of the difficulties that standard XML tools such as XSLT and XPath (obviously designed primarily with the OHCO model in mind) have when dealing with links in XML, and shows an example of where XQuery's syntax, born out of work with relational databases, better handles queries based around linking.
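The contrast between containment and link-based association can be illustrated with a minimal sketch. It is not taken from the paper: the element names (`dlv`, `person`, `page`, `entry`), the `id`/`ref` attributes, and the personal names are all hypothetical, and Python's standard `xml.etree.ElementTree` stands in for the XML tools the paper discusses. Entries are owned by pages through containment, while each entry refers to a person in a separate hierarchy through an ID/IDREF-style link.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: all names and values are made up for illustration.
DOC = """
<dlv>
  <persons>
    <person id="p1"><name>Eadwulf</name></person>
    <person id="p2"><name>Aelfgifu</name></person>
  </persons>
  <pages>
    <page n="1">
      <entry ref="p1">Eadwulf</entry>
      <entry ref="p2">Aelfgifu</entry>
    </page>
    <page n="2">
      <entry ref="p1">Eadulf</entry>
    </page>
  </pages>
</dlv>
"""

root = ET.fromstring(DOC)

# Containment (OHCO): a page owns its entries, so walking the hierarchy
# is enough to answer "which entries appear on each page?".
entries_by_page = {
    page.get("n"): [entry.text for entry in page.findall("entry")]
    for page in root.iter("page")
}

# Linking (ID/IDREF style): persons live in a separate hierarchy, so the
# reference must be resolved explicitly, much like a join on a key.
persons_by_id = {p.get("id"): p.findtext("name") for p in root.iter("person")}

pages_by_person = {}
for page in root.iter("page"):
    for entry in page.findall("entry"):
        name = persons_by_id[entry.get("ref")]
        pages_by_person.setdefault(name, []).append(page.get("n"))
```

The first query follows the tree directly. The second needs an index from identifier to person before the references can be followed, which is the kind of join-like step that the abstract suggests is more natural in a query language with roots in relational databases.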