Literary and Linguistic Computing Advance Access originally published online on September 6, 2006
Literary and Linguistic Computing 2006 21(4):477-492; doi:10.1093/llc/fql038

© The Author 2006. Published by Oxford University Press on behalf of ALLC and ACH. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

The Relative Contribution of Pronunciational, Lexical, and Prosodic Differences to the Perceived Distances between Norwegian Dialects

Charlotte Gooskens

Scandinavian Languages and Cultures, University of Groningen, Groningen, The Netherlands

Wilbert Heeringa

Humanities Computing, University of Groningen, Groningen, The Netherlands

Correspondence: Charlotte Gooskens, Scandinavian Languages and Cultures, University of Groningen, Postbus 716, NL-9700 AS Groningen, The Netherlands. E-mail: c.s.gooskens{at}rug.nl
In the period between 1999 and 2002, Jørn Almberg and Kristian Skarbø compiled a database which consists of recordings and phonetic transcriptions of translations of the fable ‘The North Wind and the Sun’ in about fifty Norwegian dialects. On the basis of fifteen of these recordings, Charlotte Gooskens carried out a perception experiment (Gooskens and Heeringa, 2004). In this experiment she investigated the distances between the fifteen dialects as perceived by the speakers themselves.

On the basis of the phonetic transcriptions, Wilbert Heeringa (2004) measured computational linguistic distances between the fifteen Norwegian varieties (Gooskens and Heeringa, 2004). Distances were calculated by means of Levenshtein distance, which finds the minimum cost of changing one pronunciation into another by inserting, substituting, or deleting phonetic segments. Gooskens and Heeringa (2004) correlated the perceptual distances with these computational distances and found a significant correlation of r = 0.67. The computational distances combine pronunciational, lexical, and morphological variation in a single measure; the separate levels were not studied individually.
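As a minimal sketch of the Levenshtein distance described above, the following Python function computes the minimum number of insertions, deletions, and substitutions needed to turn one sequence of phonetic segments into another. It assumes a unit cost for every operation and applies no length normalization; the cost scheme actually used in the study may differ. The function name `levenshtein` and the toy segment lists are invented for illustration and are not taken from the database.

```python
def levenshtein(source, target):
    """Minimum number of insertions, deletions and substitutions
    needed to change `source` into `target` (assumed unit costs)."""
    # previous[j] is the distance between the source prefix processed
    # so far and the first j segments of the target.
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        current = [i]  # deleting the first i source segments
        for j, t in enumerate(target, start=1):
            substitution = previous[j - 1] + (0 if s == t else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# Toy forms (hypothetical, not dialect data): one substitution (i -> e)
# plus one deletion (d) gives a distance of 2.
print(levenshtein(["v", "i", "n", "d"], ["v", "e", "n"]))  # 2
```

Working on lists of segments rather than raw strings keeps multi-character transcription symbols intact as single units.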

The contribution of this article is that we measure pronunciational, lexical, and prosodic distances separately. Within pronunciational distances we distinguish between consonants and vowels on the one hand, and between substitutions and insertions/deletions on the other. Correlating the separate levels with the perceptual distances and applying multiple linear regression analyses, we found that pronunciation is the most important level in perception, with vowel substitutions in particular playing a major role.
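One way to picture the split into vowels versus consonants and substitutions versus insertions/deletions is to recover the operations of an optimal alignment and tally them by type. The Python sketch below does this under several assumptions that are not taken from the article: unit costs, a toy vowel inventory (`VOWELS`), a separate "mixed" label for vowel-consonant substitutions, and hypothetical function names. Optimal alignments are not always unique; this version prefers the diagonal step when there is a tie.

```python
from collections import Counter

VOWELS = set("aeiouyæøå")  # assumption: toy vowel inventory


def edit_operations(source, target):
    """Return the non-match operations of one optimal unit-cost alignment."""
    n, m = len(source), len(target)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,
                             dist[i - 1][j] + 1,
                             dist[i][j - 1] + 1)
    # Walk back from the bottom-right cell, recording each operation.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        cost = 1 if (i > 0 and j > 0 and source[i - 1] != target[j - 1]) else 0
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + cost:
            if cost:
                ops.append(("substitution", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("indel", source[i - 1], None))  # deletion
            i -= 1
        else:
            ops.append(("indel", None, target[j - 1]))  # insertion
            j -= 1
    return ops[::-1]


def count_by_type(ops):
    """Tally operations as vowel/consonant/mixed substitutions or indels."""
    counts = Counter()
    for kind, a, b in ops:
        classes = {"vowel" if s in VOWELS else "consonant"
                   for s in (a, b) if s is not None}
        label = classes.pop() if len(classes) == 1 else "mixed"
        counts[f"{label} {kind}"] += 1
    return counts


# Toy forms (hypothetical, not dialect data).
ops = edit_operations(list("vind"), list("ven"))
print(dict(count_by_type(ops)))
# {'vowel substitution': 1, 'consonant indel': 1}
```

Per-type tallies of this kind, aggregated over word pairs, could then serve as separate predictors in a regression against the perceptual distances.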


1 See http://hyde.park.uga.edu/lamsas.

2 The recordings and the transcriptions (in IPA as well as in SAMPA) were made by Jørn Almberg in cooperation with Kristian Skarbø at the Department of Linguistics, NTNU, Trondheim and made available at http://www.ling.hf.ntnu.no/nos/. We are grateful for their permission to use the material.

3 The example should not be interpreted as a historical reconstruction of the way in which one pronunciation changed into another. We merely show that the distance between two arbitrary pronunciations is found on the basis of the least costly set of operations mapping one pronunciation onto the other.

4 See http://www.phon.ucl.ac.uk/home/wells/cassette.htm.

5 The program PRAAT is a free public-domain program developed by Paul Boersma and David Weenink at the Institute of Phonetic Sciences of the University of Amsterdam and is available at http://www.fon.hum.uva.nl/praat.

6 If there are fifteen dialects, there are (15 × (15 – 1))/2 = 105 dialect pairs. Per dialect pair, there are at most fifty-eight word pairs, so the reader may expect a total of 105 × 58 = 6090 Levenshtein distances. The higher number of 18801 results from the fact that some words appear more than once in the text; for example, nordavinden ‘the North wind’ usually appears four times, which increases the number of Levenshtein calculations per word pair.

7 In seven cases we found missing transcriptions, namely for the dialects of Herøy (two cases), Lesja (one case), Stjørdal (two cases), Trondheim (one case), and Verdal (one case).

8 Although our example is hypothetical, the pronunciations used here are existing ones, which are found in our set of fifteen Norwegian dialects.
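
The counting argument in note 6 can be checked with a few lines of Python. This is only an illustrative sketch: the function name is invented here, and it counts unordered dialect pairs without touching the actual data, so it cannot reproduce the figure of 18801, which depends on how often each word recurs in the text.

```python
from itertools import combinations


def count_dialect_pairs(n_dialects):
    """Number of unordered pairs among n_dialects, i.e. n(n - 1)/2."""
    return sum(1 for _ in combinations(range(n_dialects), 2))


n_pairs = count_dialect_pairs(15)
max_word_pairs = 58  # maximum number of word pairs per dialect pair (note 6)

print(n_pairs)                   # 105
print(n_pairs * max_word_pairs)  # 6090
```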





