© 1993 by Association for Literary & Linguistic Computing
Transcription Conventions used for the Corpus of Spoken Contemporary Spanish
Laboratorio de Linguistica Madrid, Spain
Francisco Marcos-Marin, Laboratorio de Linguistica Informatica, Ap. 46348, E-28080 Madrid, Spain. E-mail: marcos{at}ccuam3.sdi.uam.es
Because speech is not organized in the same way as written language, it is difficult to transcribe. The main difficulty derives from one of the most distinctive characteristics of spoken language: spontaneity. We have addressed the resulting problems by establishing transcription conventions, using a tagging scheme to mark distinctive features of spoken language. Our linguistic intuitions have guided the choice of terminology for these tags.
In this paper we list and explain the tags used in the coding of the Reference Corpus of Spoken Peninsular Spanish, a corpus of more than one million words, transcribed orthographically.
This article has been cited by other articles:
L. J. Rodriguez and M. I. Torres. Spontaneous Speech Events in Two Speech Databases of Human-Computer and Human-Human Dialogs in Spanish. Language and Speech, September 1, 2006; 49(3): 333-366.
