
Literary and Linguistic Computing 1993 8(4):221-223; doi:10.1093/llc/8.4.221
© 1993 by Association for Literary & Linguistic Computing


Articles

Special Section on Corpora

Introduction to Part One

ANTONIO ZAMPOLLI and NICHOLAS OSTLER

Linguacubed Ltd, London

Nicholas Ostler, Linguacubed Ltd, 17 Oakley Road, London N1 3LL, UK.
In the 1990s the empirical study of language from large bodies of recorded documents has assumed a new importance, and this was reflected in the European Commission's decision to support the project NERC, the Network of European Research Corpora, led by Antonio Zampolli. Its aim was to study the need for, and the possible provision of, analysed corpora for European languages.

NERC's first action was to organize an International Workshop, held at Pisa in January 1992, and attended by invited scholars from Europe and North America, to gather and cross-fertilize a variety of experience and views on how to further the project's aims. Re-worked versions of some of the papers presented then, together with a small number describing related work by other scholars, are now published as a special supplement to this and the next number of Literary and Linguistic Computing.

They are presented in an order which corresponds to NERC's own structure. After a general statement from José Soler of the European Commission on the importance of this field of study, the papers follow a spectrum of interest: from examinations of the demand for corpora (McNaught) and the administrative complications in making them available (Hockey and Walker), through analysis of the conceptual (Biber) and practical (Crowdy, Part 1) problems in the selection of texts, to the issues that arise when designing (Sampson) and applying (Leech) a system of categories for annotating the language in the texts. A particular problem here is the treatment of spoken texts when reduced to written form: Ballester et al. offer a solution for Spanish, and Crowdy (Part 2) one for English. After these studies in annotation, the focus shifts to statistical techniques for exposing the semantics of uninterpreted text, sometimes known as ‘knowledge acquisition’ (Bindi et al. and Brown et al.). Finally, this supplement contains reports from some current projects which make essential use of large corpora and their annotation categories for particular applications: designing lexicons (Antoni-Lay et al., Khatchadourian and Modiano), multilingual text processing (Cowie et al.), and speech technology assessment (Fourcin and Gibbon).





