English Pages 415 [416] Year 2012
Consonant Clusters and Structural Complexity
Interface Explorations 26
Editors
Artemis Alexiadou T. Alan Hall
De Gruyter Mouton
Consonant Clusters and Structural Complexity edited by
Philip Hoole Lasse Bombien Marianne Pouplier Christine Mooshammer Barbara Kühnert
De Gruyter Mouton
ISBN 978-1-61451-076-5
e-ISBN 978-1-61451-077-2
ISSN 1861-4167
Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.
Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.dnb.de.
© 2012 Walter de Gruyter GmbH & Co. KG, 10785 Berlin/Boston
Cover image: iStockphoto/Thinkstock
Typesetting: Royal Standard, Hong Kong
Printing: Hubert & Co. GmbH & Co. KG, Göttingen
∞ Printed on acid-free paper
Printed in Germany
www.degruyter.com
Table of contents

Introduction

Part I. Phonology and Typology
Structural complexity of consonant clusters: A phonologist’s view
  Theo Vennemann
On the relations between [sonorant] and [voice]
  Rina Kreitman
Limited consonant clusters in OV languages
  Hisao Tokizaki and Yasutomo Kuwana
Manner, place and voice interactions in Greek cluster phonotactics
  Marina Tzakosta
Consonant clusters in four Samoyedic languages
  Zsuzsa Várnai

Part II. Production: analysis and models
Articulatory coordination and the syllabification of word initial consonant clusters in Italian
  Anne Hermes, Martine Grice, Doris Mücke and Henrik Niemann
A gestural model of the temporal organization of vowel clusters in Romanian
  Stefania Marin and Louis Goldstein
Coupling of tone and constriction gestures in pitch accents
  Doris Mücke, Hosung Nam, Anne Hermes and Louis Goldstein
Tonogenesis in Lhasa Tibetan – Towards a gestural account
  Fang Hu

Part III. Acquisition
Probabilistic phonotactics in lexical acquisition: The role of syllable complexity
  Natalie Boll-Avetisyan
Acquiring and avoiding phonological complexity in SLI vs. typical development of French: The case of consonant clusters
  Sandrine Ferré, Laurice Tuller, Eva Sizaret and Marie-Anne Barthez

Part IV. Assimilation and reduction in connected speech
Articulatory reduction and assimilation in n#g sequences in complex words in German
  Pia Bergmann
Overlap-Driven Consequences of Nasal Place Assimilation
  Claire Halpert
The acoustics of high-vowel loss in Northern Greek dialects and typological implications
  Nina Topintzi and Mary Baltazani

List of contributors
Subject index
Language index
Introduction

This book is a selection of papers from a meeting we organized in Munich in the summer of 2008. We the editors have our main background in experimental phonetic approaches to speech production. But the attraction for us in setting up the meeting and subsequently producing this book was precisely the realization that workers with many different perspectives currently see a better understanding of complex sound sequences as an important challenge, and that it could be mutually fruitful to bring these different strands together, e.g. not just production but also perception, not just phonetics but also phonology and psycholinguistics. The meeting itself attracted some 50 contributions (all abstracts can be found at http://www.phonetik.uni-muenchen.de/institut/veranstaltungen/cluster/), so clearly only a small proportion could be included as full-length chapters in a book, but nonetheless we have done our best to do justice to the range of approaches in evidence at the meeting.

The book is divided into four main sections: ‘Phonology and Typology’, ‘Production: Analysis and Models’, ‘Acquisition’, and ‘Assimilation and reduction in connected speech’. It should be noted, however, that the dividing line between the different subfields working on the organisational principles of sound structure is gradual rather than clear-cut. Several contributions that use empirical data to reflect on theoretical cognitive issues therefore span more than one section.

The section on Phonology and Typology is opened by Theo Vennemann’s chapter entitled “Structural complexity of consonant clusters: a phonologist’s view”, which also, because of its review character, provides a background for the book as a whole. It introduces the regularities that underlie the composition of syllables, and in particular of consonant clusters. 
This is an updated version with substantial extensions of Vennemann’s seminal work “Preference Laws for Syllable Structure and the Explanation of Sound Change”. Vennemann argues that in many cases sound change is guided by two principles: syllable structure simplification towards the universally preferred CV syllable template, and avoidance of violations of Consonantal Strength Sequencing. As he points out, these and other regularities can describe why sound changes happen in certain synchronic stages of a particular language or dialect but it is impossible to predict whether, when and how complexity problems and violations are resolved. Vennemann’s claims are supported by a wealth of examples from diverse languages. Rina Kreitman’s paper also falls within this longstanding phonological tradition of elucidating the constraints on possible and preferred syllable structures. The specific focus here is on consonant clusters in the narrow sense of
consonant sequences with no intervening syllable or morpheme boundary, and more particularly on the patterns that occur in 2-element clusters with respect to the feature [sonorant] on the one hand and [voice] on the other hand. Kreitman considers the patterns for [sonorant] and [voice] first of all separately, demonstrating, for example, that of the 15 possible language types in terms of the possible combinations of [+sonorant] and [−sonorant] in 2-element clusters only four actually occur, resulting in a clear set of implicational relations. However, she is also at pains to point out that patterning with regard to [sonorant] is not predictable from the patterning with regard to [voice]. Thus a language may be complex with respect to the [sonorant] combinations it allows, but simple with respect to [voice], and vice-versa. The results are based on a typological survey of a large number of languages. The extensive appendices allow cross-referencing between languages, cluster types, and sources. This should prove a valuable resource for many questions in addition to the ones specifically addressed in the chapter itself.

The paper by Hisao Tokizaki and Yasutomo Kuwana is the only one in the volume that explicitly considers the possibility of interactions between syllabic and syntactic structure. Specifically, they examine the idea that languages with object-verb (OV) word order show a tendency towards simple syllable structure. While at first blush The World Atlas of Language Structures (WALS) fails to confirm this hypothesis, the authors argue that given the relatively rudimentary system for assessing syllable complexity, a more differentiated category system might render a different picture, and might indeed reveal a correlation between word order and syllable complexity. In particular the authors claim that in order to define syllable complexity, a broader variety of factors needs to be taken into account, such as the properties of boundaries in right branching vs. 
left branching structures, coda inventories and their geographical distribution, as well as cluster simplification processes. A preliminary data survey renders first support for their hypothesis. Marina Tzakosta’s paper addresses the issue of cluster formation within syllables in Greek as it relates to the notion of sonority and phonotactic constraints. Tzakosta proposes an account of Greek phonotactics which expands the concepts of the sonority scale and sonority distance and takes into consideration three distinct scales: a manner scale, a place scale and a voicing scale. Moreover Tzakosta argues that, instead of the common dichotomy between acceptable / unacceptable, consonant clusters fall within three categories, namely perfect, acceptable and non-acceptable depending on the satisfaction of the three different scales. All scales are satisfied in a rightward manner and cluster perfection and/or acceptability are not absolute but gradient notions. Tzakosta’s analysis is based on a variety of CL and CC production data
taken from a corpus of the major dialectal varieties of Greek, a corpus of Standard Greek developmental data, as well as a corpus of Greek as a second language.

Zsuzsa Várnai’s paper documents first of all the syllable structures of four endangered Samoyedic languages spoken in Northern Siberia. Overall, syllable structure in these languages can be regarded as moderately complex (probably the most typical case cross-linguistically), meaning that in fact no tautosyllabic clusters occur in onset position, and only a very restricted range at syllable boundaries. In the context of the present book this in turn makes these languages an interesting case-study for understanding how clusters in loan words are ‘repaired’, since all these languages are in extensive contact with Russian, which is a prime example of a language with complex syllable structure and a very rich set of clusters. Overall, in common with investigations of other languages, epenthesis was the most common repair strategy. The clusters most likely to be targeted for repair were Sibilant + Plosive and Plosive + Lateral, but the former were much more likely to be repaired by prosthesis than the latter. This is discussed in terms of auditory similarity to the Russian forms. An interesting observation was that the same cluster (even within the same language) was not necessarily always repaired by the same process.

The papers in the second section Production: analysis and models form a more homogeneous group than might appear at first sight, since they all frame their questions with close reference to the computational model of syllable structure currently being developed by Goldstein, Nam and colleagues. Much of the original work in this approach was devoted to understanding articulatory coordination as a function of position within the syllable (see e.g. 
work by Byrd and colleagues, and recently by Marin & Pouplier); with particular reference to clusters this led to concepts such as the c-center as an expression of the typical gestural coupling relations in syllable onsets. A further common feature of the papers in this second section is that they can all be seen as engaged in exploring whether and how this approach can now be applied to a wider range of structural phenomena. Three of the four papers use electromagnetic articulography as the primary source of data; the remaining paper (Marin/Goldstein) uses the – in a sense – complementary technique of articulatory synthesis as the primary tool. Anne Hermes and colleagues thus approach the phenomenon of Italian ‘impure s’ within this explicit gestural framework. By varying the cardinality of syllable onset clusters they find evidence for the c-center effect by examining the shift of the right-most consonant into the vowel. They explain this shift as the result of temporal reorganisation due to underlying timing constraints (competitive oscillator coupling) as they are hypothesized for syllable onsets.
Crucially however, if the first element of a cluster is a sibilant (‘impure s’) it does not take part in the temporal reorganisation. Within the applied framework the authors argue that Italian in general allows for complex syllable onsets. ‘Impure s’, however, is not part of the onset or maybe not even part of the syllable at all. Apart from contributing to the long-standing debate on the syllable affiliation of ‘impure s’, this study also demonstrates how morphological structure can be reflected in articulatory dynamics.

The chapter by Stefania Marin and Louis Goldstein further widens the book’s perspective on the structural properties of the syllable since the focus here is on vowel clusters rather than consonant clusters. The approach followed, namely the dynamic gestural approach proposed in articulatory phonology and its quantitative implementation within a computational task-dynamic model, has, as just mentioned, already been extensively used for consonant clusters. This paper represents one of its first applications to the coordination patterns found in sequences of vowels. Specifically, it shows for Romanian vowels how concepts from dynamical models such as coupling modes and gestural blending weight can be used to capture in a principled way the differences between tautosyllabic diphthongs and heterosyllabic (hiatus) vowel sequences on the one hand, and on the other hand the alternation between a diphthongal realization under stress and a monophthongal one in the absence of stress. Interestingly, both the diphthongal and monophthongal realization on the surface can be modelled as being the result, underlyingly, of two simultaneous vowel gestures.

The last two papers in this section contribute to the widening of perspective by moving beyond purely segmental aspects of articulatory coordination to consider the integration of tonal phenomena. 
Thus Doris Mücke and colleagues contribute to current efforts to extend the model to the coordination of oral gestures with so-called tonal gestures, i.e. they revisit the much-discussed topic of tonal alignment from an articulatory perspective. Data from Catalan and Viennese German are analyzed, leading to the suggestion that different coupling relations seem to underlie the coordination of oral gestures with high vs. low tones. Through comparison of lexical and prosodic pitch accent tones they find that the timing of consonants and vowels within the syllable is not perturbed by the prosodic tones – in contrast to lexical tones.

The final paper extends the range of tonal phenomena considered to the topic of tonogenesis. This might appear at first sight somewhat remote from the main concerns of this book. But in fact Fang Hu’s paper forms part of a lively ongoing discussion relating tonogenetic mechanisms in many languages to both structural complexity and to consonant clusters. This link is appropriate first in a general sense because there is evidence that emergence of tonal systems
(increase in laryngeal complexity) often parallels a decrease in segmental complexity (in particular, simplification of consonant clusters). But there may also be a much more specific link. Here Hu extends to Tibetan previous work by Gao on Mandarin. Using ideas on gestural coupling developed in the articulatory phonology/task dynamics framework evidence is found that in tonal syllables the initial consonant and the tone exhibit the coordination pattern with each other (and in turn with the vowel) that has previously been observed for complex onsets involving consonant clusters (the pattern referred to as the c-center effect). In terms of the gestural coupling, the tone acts rather like a consonant in a complex onset.

For the third section the focus moves to Acquisition, one of the classic topics of psycholinguistic research, and both papers in this section provide examples of typical experimental paradigms from this field. Underlyingly at issue is the nature of the mental representations that need to be assumed to predict performance in speech and language development.

The theme of lexical acquisition, and the usefulness of using stimuli of increased structural complexity, is addressed in the paper by Natalie Boll-Avetisyan. She examines the role of probabilistic phonotactics in facilitating short-term memory (STM) recall and lexical acquisition. While studies over the past decade have shown that probabilistic phonotactics influences lexical acquisition, it is still controversial whether probabilistic phonotactic knowledge is informed by abstractions from lexical entries, or also by sub-lexical representations. As Boll-Avetisyan points out, previous studies exclusively used structurally simple test items, in particular CVC syllables, which are subject to weaker co-occurrence constraints as compared to consonant clusters. 
She presents the results of two STM recognition experiments with adult native speakers of Dutch, a language allowing for highly complex syllable onsets and codas. Using a reaction-time paradigm, recognition was found to be faster for non-words of high than of low phonotactic probability. However, the effect was only present when complex syllables were used, but not when syllables were simple. Thus, the effect of probabilistic phonotactics increases with increasing syllable complexity. Boll-Avetisyan argues that sub-lexically represented probabilistic phonotactics are informed by abstract knowledge about phonological structure and suggests that sub-lexical knowledge might bootstrap lexical acquisition. In the only paper in the volume that is explicitly concerned with disordered populations Sandrine Ferré and co-authors approach the notion of phonological complexity by comparing the productions of French-speaking adolescents suffering from Specific Language Impairment (SLI) with the productions of
typically-developing children, addressing the question of whether the phonological notion of syllabic complexity can make valid predictions about the degree of difficulty in the online processing of consonant clusters. Adopting Angoujard’s (1997) syllable model, the so-called Rhythmic Grid model, phonological complexity is defined on the basis of the constraints that govern the association between segments (consonants) and syllabic position. Ferré et al.’s results of a series of repetition experiments show that while consonants are produced correctly when they appear in isolation, production errors increase in certain syllabic positions and with increasing length of the consonant cluster, in particular for speakers with SLI. Adolescents with SLI and typically-developing children also demonstrated very different repair strategies. Ferré et al. therefore conclude that the syllable-internal complexity can affect repetition accuracy independently of the number of segments and that phonological complexity lies above all in the constraints that govern the association between consonants and syllable, and not merely in the size of a consonant cluster.

In the final section Assimilation and reduction in connected speech the common characteristic of all three papers is that they are concerned with consonant sequences that are not lexically given, but which emerge when morphemes or words are concatenated to form longer utterances (Bergmann, Halpert), or when variable processes of reduction lead to the loss of intervening vocalic material (Topintzi and Baltazani). A key issue in assimilation, and in reduction processes in general, is the role of gestural overlap. Discussion of this role also forms a common thread to these papers, and links them to the gesturally oriented papers of the second section.

In their acoustic study, Nina Topintzi and Mary Baltazani deal with High Vowel Deletion/reduction in Kosani Greek. 
The process is shown to be quite variable and gradient, more frequent for [u] than for [i]. To some extent, it correlates with aspiration and duration in adjacent consonants. This contribution adds to the cross-linguistic typology of Vowel Deletion, with perhaps the most striking observations being those relating to cases of vowel deletion between voiced consonants. As a consequence of vowel deletion, a large range of derived consonant clusters emerges which outnumbers the inventory of clusters typically found in Standard Greek by far. An approach based on gestural overlap is argued to be overall the most successful route towards accounting for the observations. The paper by Claire Halpert contributes directly to the discussion of how gestural overlap may condition assimilation based on data from Zulu. She provides an analysis of nasal place assimilation in Zulu noun prefixation in terms of gestural overlap, arguing that changes in laryngeal features accompanying nasal assimilation can be accounted for under the assumption that
gestural duration is constrained in assimilation processes. An optimality-theoretic constraint is posited which demands that an assimilated NC sequence must not exceed the duration of a singleton C. The enforcement of this constraint may, however, result in a marked structure depending on the laryngeal features of the consonants involved. This by hypothesis explains why implosives and aspirated stops lose their laryngeal properties in NC assimilation. As evidence for the existence of an active durational constraint on assimilation, the paper presents pilot acoustic data showing that assimilated NC clusters are durationally comparable to singletons while non-assimilating mC clusters are twice as long as singletons.

Finally, the paper by Pia Bergmann extends a long tradition of articulatory study (here using electropalatography) of assimilatory behaviour of consonant sequences across word or morpheme boundaries. Earlier studies had made clear that the classical case of coronal consonants assimilating with respect to place of articulation to the following consonant can involve a whole continuum of articulatory reduction rather than being an all-or-nothing change of the place of articulation of the target consonant. The present paper adds to the evidence that the degree of reduction can be affected by frequency effects (perhaps interacting with word-class effects, e.g. more reduction when a function word is involved) and by prosodic strength. Moreover, the paper also identifies an effect about which much less is known, namely the influence of the vowel preceding the consonant sequence (more reduction following long than short vowels); this is discussed with respect to models of German syllable structure. Finally, the paper uses the EPG data to pay detailed attention to speaker-specific assimilatory strategies. 
In closing, it remains for us to thank the German Research Council (DFG) for supporting our own work on clusters (grants to Philip Hoole, Christine Mooshammer and Marianne Pouplier) and in particular to acknowledge the support of the DFG’s Priority Programme 1234 ‘Phonological and phonetic competence: Between grammar, signal processing, and neural activity’ and of its coordinator Hubert Truckenbrodt for the organization of the Munich meeting. We thank the Interface Explorations series editors Artemis Alexiadou and Tracy Hall for the opportunity to publish this book in their series. Last but not least we are extremely grateful to the large number of conscientious reviewers we were able to count on as the book took shape.
Part I.
Phonology and Typology
Structural complexity of consonant clusters: A phonologist’s view

Theo Vennemann

Abstract

This paper attempts a definition of consonant clusters, consonant cluster complexity, and cluster complexity reduction in a phonological perspective. In particular, since at the present stage of our knowledge a metrical (and thus: general) definition of consonant cluster complexity is not possible, a relative and structure-dependent concept is proposed: Only clusters within the scope of one and the same preference law can be compared, namely evaluated as the more complex the less preferred they are in terms of that preference law. This concept, as well as ways in which cluster complexity is reduced, are illustrated with examples from various languages. They include word-initial muta-cum-liquida reductions in Spanish and Portuguese, certain cases of “metathesis at a distance” (e.g. Spanish periglo > peligro ‘danger’), and slope displacements as in Old Italian ca.'pes.tro > ca.'pres.to ‘rope’, Tuscan pa.'dro.ne > pra.'do.ne ‘lord, employer’. The opposite kind of development, namely the formation and complexification of clusters, is argued for the most part not to be motivated by syllable structure preferences but (a) by a variety of syntactic and morphological processes and (b) in phonology itself by rhythmically induced copations (e.g. syncope in Latin periculo > Spanish periglo), or to result from borrowing.
1. What is a consonant cluster?

Let us begin with the question what we mean when speaking about consonant clusters. What would be a suitable definition? Since I am a phonologist rather than a phonetician, all the definitions that follow will be phonological rather than phonetic. The Oxford English Dictionary defines a cluster as ‘a collection of things of the same kind, as fruits or flowers, growing closely together; a bunch’, ‘originally of grapes’ [!]. The word is attested in the language as early as the year 800. It is assumed to be a -tro-derivate of the same root that we also have in clot, clout, and cleat, German Klotz and Kloß. In any event, a cluster consists of discrete elements, a consonant cluster of discrete consonantal elements. In traditional phonetics one learns that phonetic objects are continua. Hence a consonant cluster as a phonetic object would have to be a continuum, and that is what a cluster by definition is not. Philip
Hoole (p.c.) has assured me that modern phonetics can show that a degree of segmentation already occurs at the articulatory level, rather than only on the mental articulatory retina (for which cf. Tillmann/Mansell 1980), and that within the so-called gestural framework (Browman and Goldstein 1986, 1989, 1992), “gestures whose coordination is part of a word’s lexical representation bear a close relationship to those conglomerates of gestures that constitute what is traditionally considered to be a ‘segment’ ” (Byrd 1996: 160). However that may be, phonologists are dealing exclusively with discrete objects. Therefore in that regard they have no problem defining a consonant cluster, namely indeed as a set of consonants understood as discrete objects, but more precisely as an uninterrupted sequence of two or more consonants within some well-defined unit of language, such as a syllable, word, or phrase. And if phonologists do have a problem it is because they do not know for sure what a consonant is, an uncertainty which may also hold for phoneticians. For example, is the second speech sound in twist, twinkle, twine, twenty, twaddle, etc. and in quick, quest, quiet, quota, etc. a consonant or a vowel? If it is a consonant, then the words twist and quick begin with a consonant cluster. If the second speech sound is just the vowel /u/ in a syllable margin, namely in a complex syllable head, then those words do not begin with a consonant cluster, but rather with a sequence of consonant and vowel within a syllable head. Perhaps that is actually what phonologists mean when speaking of consonant clusters: an uninterrupted sequence of marginal speech sounds, i.e. a sequence of speech sounds not interrupted by a syllable nucleus (nor, of course, by a pause). 
And this may be the only legitimate meaning if we take seriously the idea that the speech sounds of any language can be arranged hierarchically on scales of increasing consonantality, or decreasing sonority, without any break-off point, as in (1).

(1) Consonantal Strength scale: No division between vowels and consonants

    ↑ increasing Consonantal Strength
    voiceless plosives
    voiced plosives
    voiceless fricatives
    voiced fricatives
    nasals
    lateral liquids (l sounds)
    central liquids (r sounds)
    high vowels
    mid vowels
    low vowels
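For readers who want to experiment with scale (1), it can be written down as a simple ordinal ranking. The following is only a minimal sketch: the names `STRENGTH_SCALE`, `STRENGTH` and `stronger` and the integer ranks are illustrative assumptions, not part of the original proposal, and only the ordering of the classes is taken from (1).

```python
# Scale (1) as an ordered list, weakest (most sonorous) class first.
# The integer ranks are an illustrative assumption: the scale is ordinal,
# so only the ordering is meaningful, not the distances between ranks.
STRENGTH_SCALE = [
    "low vowel",
    "mid vowel",
    "high vowel",
    "central liquid",
    "lateral liquid",
    "nasal",
    "voiced fricative",
    "voiceless fricative",
    "voiced plosive",
    "voiceless plosive",
]

# Map each class to its rank (1 = weakest, 10 = strongest).
STRENGTH = {name: rank for rank, name in enumerate(STRENGTH_SCALE, start=1)}


def stronger(a: str, b: str) -> bool:
    """Return True if class a has greater Consonantal Strength than class b."""
    return STRENGTH[a] > STRENGTH[b]
```

Because vowels and consonants sit on one list, `stronger("central liquid", "high vowel")` is evaluated in exactly the same way as a comparison between two obstruent classes, which is the point of having no break-off point in (1).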
This particular scale is that presented in Vennemann (1988: 9). There are other arrangements. Some authors use finer scales, for example scales which hierarchize obstruents and nasals by place of articulation, and vowels on the frontness parameter. Conversely there are less fine-graded scales, such as scales lumping all obstruents or all vowels together or not distinguishing lateral and central liquids in terms of strength. Thus one often sees the simple scale V L N F P (vowels liquids nasals fricatives plosives). For some languages even this scale may prohibit certain generalizations. The only scale to which I have never seen contrary language material is V R O (vowels sonorants obstruents). The above scale may be the most fine-graded that most linguists can agree on. When finer distinctions are made, language-specific differences begin to play a role, and linguists will begin to differ. The scalar nature of the consonantality, or conversely the sonority, of the speech sounds in any language is a venerable concept, much worked with by Sievers (1901), among others. The history of the concept is described in chapter 2 of Murray (1988). Turning now to the question of clustering, there follow some definitions, (2) to (7).

(2) A cluster is an uninterrupted sequence of cardinality greater than one.

Mathematicians would undoubtedly let the cardinality begin with zero, i.e., they would admit empty clusters and unit clusters. But in everyday usage a cluster of objects contains at least two objects. The Oxford English Dictionary expresses that much by defining a cluster as ‘a collection of things’. Indeed, we would not, except perhaps jokingly, call a single painting, or no painting at all, an art collection.

(3) A consonant cluster is a cluster of marginal speech sounds (i.e., a cluster of speech sounds not interrupted by a nuclear speech sound). 
With C for marginal speech sounds and V for nuclear speech sounds, and with $ (or a period, “.”) for a syllable boundary, CC, C$C, CCC, C$CC, CC$C etc. are consonant clusters, CVC, CV$C, CVCC, CCVCC, CV$CC etc. are not.

(4) A head cluster is a consonant cluster entirely within a syllable head.

(5) A coda cluster is a consonant cluster entirely within a syllable coda.

(6) An intersyllabic cluster is a consonant cluster containing both coda and head speech sounds. C$C, CC$C, C$CC etc. are intersyllabic clusters.
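Definitions (3) to (6) can be tried out on syllabified CV skeletons. The sketch below is an illustration only: the function name `clusters` is hypothetical, it assumes that every syllable has the shape C*V+C* (a head, a nucleus, a coda), and it reports only the maximal head, coda and intersyllabic clusters rather than every sub-sequence that would also satisfy the definitions.

```python
import re


def clusters(skeleton: str) -> list[tuple[str, str]]:
    """List the maximal head, coda and intersyllabic clusters of a skeleton.

    The skeleton uses C for marginal sounds, V for nuclear sounds and
    "$" (or ".") for syllable boundaries, as in the text.
    """
    syllables = skeleton.replace(".", "$").split("$")
    parts = []
    for syl in syllables:
        # Assumption: a syllable is a head, a nucleus and a coda, in that order.
        m = re.fullmatch(r"(C*)(V+)(C*)", syl)
        if m is None:
            raise ValueError(f"not a syllable skeleton: {syl!r}")
        parts.append((m.group(1), m.group(3)))

    found = []
    for i, (head, coda) in enumerate(parts):
        if len(head) >= 2:          # (4): cluster entirely within a head
            found.append((head, "head"))
        if len(coda) >= 2:          # (5): cluster entirely within a coda
            found.append((coda, "coda"))
        if i + 1 < len(parts):
            next_head = parts[i + 1][0]
            if coda and next_head:  # (6): both coda and head sounds present
                found.append((coda + "$" + next_head, "intersyllabic"))
    return found
```

For example, `clusters("CCVCC")` yields one head cluster and one coda cluster, while `clusters("CV$CV")` yields an empty list, since a single marginal sound is not a cluster under (2).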
(7) A contact cluster is an intersyllabic cluster of cardinality two. C$C clusters within VC$CV, VCC$CV, VC$CCV, VCC$CCV etc. are contact clusters.

2. What is consonant cluster complexity?

Phonologists have gathered a lot of information on consonant clusters and their structural complexity, and have formulated a number of generalizations. These are well-founded, inasmuch as they find support in the observation of numerous language systems, in which always the less complex structures are favored over the more complex, in the sense that the occurrence of complex structures almost always implies the occurrence of the less complex ones on a given structural parameter. They also find support in the observation of language change, in which always the more complex structures are eliminated before the less complex ones on the same parameter of complexity (cf. Vennemann 1989). For example, if a language has consonant clusters of three, it also has consonant clusters of two, but not conversely. Or more generally, cf. (8):

(8) If a language has consonant clusters of cardinality n (n > 2), it also has consonant clusters of cardinality n – 1.

No phonologist will doubt this generalization, which incidentally can be derived from the various preference laws in Vennemann (1988). (Those Laws relevant to my topic are cited in the Appendix below.) Since the Head Law says in part (a) that a syllable head is the more preferred the closer its cardinality is to one, and the Coda Law says that a syllable coda is the more preferred the closer its cardinality is to zero, it follows that clusters are everywhere dispreferred, but the more so the greater their cardinality. Changes reducing the cardinality of clusters can be found in many languages, and for some languages we know that the maximal cardinality of certain clusters has decreased in historical times, for example that of head clusters in Korean and Pali. 
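Generalization (8) amounts to a simple implicational check on the set of cluster sizes a language allows. The following sketch is an illustration under stated assumptions: the function name is hypothetical, and the sets passed to it stand for made-up inventories of attested cluster cardinalities, not for data from any particular language.

```python
def satisfies_generalization_8(cardinalities: set[int]) -> bool:
    """Check (8): clusters of cardinality n (n > 2) imply clusters of n - 1.

    `cardinalities` is the set of cluster sizes attested in a (hypothetical)
    language; by definition (2) every cluster has cardinality of at least two.
    """
    return all(n - 1 in cardinalities for n in cardinalities if n > 2)
```

An inventory such as `{2, 3}` passes the check, whereas `{2, 4}` does not, because clusters of four would then occur without clusters of three.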
In the early history of English there have been sporadic attempts at reducing triconsonantal word-initial syllable heads, for example in the words speak (German sprechen) and shut (German schließen, Germanic +skleutan, Lat. claudere), cf. Vennemann 2000. But these changes have not become general, as shown by spring, split, strand etc. Can we perhaps use the preference laws to define the structural complexity of consonant clusters? It depends on what we want “structural complexity of consonant clusters” to mean. If some measure is intended by which any two
clusters can be compared and judged more or less complex, or even some numerical scale or measure for the structural complexity for consonant clusters, then phonology cannot help. But we can do two things below that level of generality. First, we can compare any two consonant clusters in terms of structural complexity that are on one and the same quality scale of one of the preference laws, viz. the Head Law, the Coda Law, and the Contact Law, cf. (9). (9) The structural complexity of consonant cluster A is greater than that of consonant cluster B if B is more preferred than A in terms of one of the preference laws. Since the preference laws for syllable structure refer to structural aspects of syllables, it follows that cluster complexity is structure-dependent. For example, kl is less complex than lk in syllable heads but more complex in syllable codas. This is recognized in (9) by relativizing complexity comparisons to a particular preference law, in this case either to the Head Law or to the Coda Law. It makes no sense, in this framework, to ask which of the two clusters, kl or lk, is less complex an sich, i.e. without such structural relativization. Second, we can say what contributes to the structural complexity of consonant clusters, if we correlate this concept with that of linguistic quality in terms of preference, i.e. of graded naturalness, cf. (10). (10) Every property that makes a consonant cluster less preferred relative to some other consonant cluster contributes to the structural complexity of the given consonant cluster. 3. Consonant cluster complexity and the Head Law Let us illustrate (9) above with a straightforward example that everyone knows. When initial consonant clusters of a plosive and a sonorant are eliminated – eliminated because not only clusters of cardinality greater than two are complex but all clusters are, only single consonants are good – there is an order that may not be broken. 
Thus, in English, on the partial scale in (11),

(11) English:
     *kn    ?kl    kr
     —|———|———|—→ increasing quality of head clusters

all three clusters existed in Old English, as they do in Contemporary German. In Contemporary English, the worst of these clusters is gone – German Knie is English knee, where the k- is still spelled but no longer spoken, nor is it speakable; the cluster is ungrammatical as a word-initial head cluster. The next on
the quality scale, kl-, is unstable in some dialects, t or a glottal stop or something else, barely audible, being spoken instead of k (cf. Luick 1914: §§801, 802, Fisiak 1980, and Lutz 1991: 251–254 with further references). The cluster kr- is intact everywhere. As phonologists we can explain this by reference to the Head Law, part (c). If one compares (11) to (1) farther up, it becomes obvious that the sonorants in the clusters are arranged along the scale of Consonantal Strength. Part (c) of the Head Law, the preference law for the structure of complex syllable heads, says that a head cluster is the more preferred the sharper the Consonantal Strength drops from the first head speech sound to the next. And as one can see in (11.a), (11.a)
k n l r —|———|———|———|—→ decreasing Consonantal Strength
the drop from k to n is smallest, which means kn- is most disfavored. Many phonologists see this as an explanation for the loss of kn-: The English development instantiates a universal preference law. This preference law in turn is a generalization gained by many phonologists’ looking at the structure and changes in many languages (see for example Greenberg 1978).

Looking at the quality scale for initial consonant clusters of plosive and sonorant in more general terms – cf. the scale in (12) –

(12) A quality scale for CC heads (with plosives (P) as onset and plosives (P), fricatives (F), nasals (N), lateral liquids l, central liquids r, and semivowels V̯ on the slope)
     PP    PF    PN    Pl    Pr    PV̯
     ——|————|————|————|————|————|——→

we see in (13) to (19) how nicely languages range from instantiating the full scale to no such head clusters at all.

(13) Classical Greek
     PP    PF    PN    Pl    Pr    PV̯
     +     +     +     +     +     (+)

(14) Contemporary Greek, German
     PP    PF    PN    Pl    Pr    PV̯
     –     +     +     +     +     +

(15) Old and Middle English
     PP    PF    PN    Pl    Pr    PV̯
     –     –     +     +     +     +

(16) Contemporary English, Classical Latin
     PP    PF    PN    Pl    Pr    PV̯
     –     –     –     +     +     +

(17) Italian, Portuguese (but there are new words with Pl- through borrowing, for example Ital. blu ‘blue’)
     PP    PF    PN    Pl    Pr    PV̯
     –     –     –     –     +     +

(18) Korean
     PP    PF    PN    Pl    Pr    PV̯
     –     –     –     –     –     +

(19) Tahitian
     PP    PF    PN    Pl    Pr    PV̯
     –     –     –     –     –     –

These scales are historical snapshots, so to speak. They may change any time, and they do. New clusters may originate in a language through copation processes (syncope and apocope, see section 7 below), as happened massively in the history of Polish (cf. Rochoń 2000: chapter 2). New clusters may also be transported into a language with borrowed words. In Basque, “native words of any antiquity never contain initial clusters” (Trask 1997: 163). Older loanwords were made to abide by this constraint, for example Latin placet, planum, flammam, florem, gloriam → Basque laket ‘it is pleasing’, lau ‘plane’, lama ‘flame’, lore ‘flower’, loria ‘glory’. But in recent loanwords from Romance, “plosive-liquid clusters are tolerated in both initial and medial position” (Trask 1997: 194): Spanish plaza, plomo, precio, trono, clase, glorificar → Basque plasa ‘town square’, plomu ‘lead’, presio ‘price’, tronu ‘throne’, klasa ‘class’, glorifika ‘glorify’. In German, the cluster sk was eliminated by a kind of context-free consonantal monophthongization, sk > ṣχ > ʃʃ,
with ʃʃ degeminating first in syllable heads and codas, later generally: scāf, skāf- > Schaf [ ʃa:f ] ‘sheep’, wascan > waschen [vaʃən] ‘to wash’, tisc, tisk- > Tisch [tɪʃ ] ‘table’. The same cluster was soon reintroduced in loanwords: Skat (a card game, < Ital. scarto ‘discarded playing cards’), Skandal, Skrupel, Sklave, Maske, grotesk.
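The profiles in (13) to (19) can be written down as data and inspected mechanically. The following Python sketch is only an illustration under stated assumptions: the names `SCALE`, `PROFILES`, `is_upper_segment`, and `worst_permitted` are hypothetical, the slope label `PV` stands in for PV̯, and the parenthesized (+) of Classical Greek is counted as present. It checks that the permitted head clusters of each language form one unbroken stretch at the high-quality end of the scale in (12).

```python
# Slope types of scale (12), ordered from least to most preferred head cluster.
# "PV" stands in for the plosive + semivowel type (an ASCII label chosen for this sketch).
SCALE = ["PP", "PF", "PN", "Pl", "Pr", "PV"]

# Profiles (13)-(19): True = the cluster type occurs, False = it does not.
# Assumption: the parenthesized (+) of Classical Greek is counted as True.
PROFILES = {
    "Classical Greek": [True, True, True, True, True, True],
    "Contemporary Greek, German": [False, True, True, True, True, True],
    "Old and Middle English": [False, False, True, True, True, True],
    "Contemporary English, Classical Latin": [False, False, False, True, True, True],
    "Italian, Portuguese": [False, False, False, False, True, True],
    "Korean": [False, False, False, False, False, True],
    "Tahitian": [False, False, False, False, False, False],
}


def is_upper_segment(profile):
    """Return True if no absent type follows a present type on the scale."""
    seen_present = False
    for present in profile:
        if present:
            seen_present = True
        elif seen_present:
            return False  # a gap: a better cluster type is missing
    return True


def worst_permitted(profile):
    """Return the least preferred cluster type that occurs, or None if none occurs."""
    for slot, present in zip(SCALE, profile):
        if present:
            return slot
    return None


if __name__ == "__main__":
    for language, profile in PROFILES.items():
        print(language, is_upper_segment(profile), worst_permitted(profile))
```

A profile with a gap, such as PP present but PF absent, would be rejected by `is_upper_segment`; none of the seven profiles above has that shape.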
4. How is consonant cluster complexity reduced? There are numerous mechanisms which reduce the complexity of consonant clusters, thereby eliminating clusters from their positions on the scale (always the worst on the scale first, as already mentioned). In early English, initial clusters of velar plosives K (k, g) plus the nasal n were eliminated by deleting the plosive. Greek word-initial clusters of two plosives were eliminated by turning the first plosive into a homorganic fricative ( pt > ft, kt > xt), thereby enlarging the pre-existing FP set (with F = s, such as in spóggos ‘sponge’, stéphanos ‘wreath’, skopós ‘a look-out-man, guardian’): (20) ptéryx > ['fteriks] ‘wing’, ktízō > ['çtizo] ‘(I) build’ Apparently head clusters of two plosives are much dispreferred, even more so than fricative prependices. This well-known systematic exception to the preference for falling consonantal strength in syllable heads has been ascertained for a number of languages and established as an implicational universal by Morelli (1998, 1999): If a language has PP heads, it also has FP heads (as well as PF heads), whereas the converse is not true. The examples in (21) and (22) show the elimination of initial Pl- clusters (also of fl-) in Portuguese, in inherited words through weakening of l into the palatal glide with subsequent palatalization and assibilation of the anlaut obstruent into ch- [ʃ-] with loss of the glide (cf. 21), in more recent acquisitions from various sources through weakening of the l into r (cf. 22): (21) a.
      Latin         Portuguese
      planus        chao            ‘flat’
      plaga         chaga           ‘wound’
      platus        chato           ‘flat’
      plenus        cheio           ‘full’
      plorare       chorar          ‘deplore’
      plumbum       chumbo          ‘lead’
      pluvia        chuva           ‘rain’
 (b)  clamare       chamar          ‘to call’
      clavis        chave           ‘key’
 Also:
 (c)  flamma        chama           ‘flame’
      flagrare      cheirar         ‘to smell’

(22)  Spanish       Portuguese
 (a)  blanco, -a    branco, -a      ‘white’
 (b)  plancha       prancha         ‘board’
      plato         prato           ‘plate’
      plaza         praça           ‘place’
 (c)  clavo         cravo           ‘nail’
 Also:
 (d)  flaco, -a     fraco, -a       ‘weak’
      flauta        frauta, flauta  ‘flute’
      flecha        frecha, flecha  ‘arrow’
      flota         frota           ‘fleet’
      flojo, -a     froixo, -a      ‘slack’

A noteworthy feature of the scale in (12), viz. in (13) through (19), is that there occur only continuous ranges, never any gaps. That has to do with the fact that clusters on this quality scale, just as objects on any other quality scale, can only be changed by eliminating the most structurally complex consonant cluster first, then the next on the scale, and so on. Changing the range in any other order is not possible, except in the rare case that a change on a different parameter intersects the one in (12) at some point. For example, if in a language with these clusters the phonemes /l/ and /r/ merge everywhere into /r/, this change will eliminate the Pl- clusters, but not as an operation on clusters but as an effect of an intersecting change. 5. How well do we understand consonant cluster complexity reduction? Clearly in a manner of speaking we understand why and how the complexity of consonant clusters is reduced. Phonologists do because they can interpret consonant cluster changes as improvements on their quality scales; and phoneticians do because they can interpret those changes as articulatory simplifications. But rather than continue emphasizing how well phonologists have
organized their subject matter, I would like at this point to do just the opposite, i.e., point out that in reality we do not really understand how complexity problems of this sort are solved in any given case. Not only can we not predict whether or when a complexity problem comes under attack, we also cannot predict which of several possible solutions to the problem will be “chosen”, so to speak. For example, we understand perfectly that a head cluster C1C2 that is dispreferred according to the Head Law, part (c), is structurally complex and therefore likely to come under attack. But whether the problem is resolved by deleting the onset consonant C1 as too weak or the slope consonant C2 as too strong, or by manipulating the strength of one of the two, namely by strengthening C1 or by weakening C2, and with what result, or whether a vowel will be inserted to break up the cluster and achieve a nice C1V.C2V sequence, or whether the cluster will be partially removed from the head position by prosthesis and heterosyllabized as a medial cluster, VC1.C2V, we do not yet know. All of these measures are on record for various languages, see the partial illustration in (23) to (31).

(23)   Kn- > n- in English
(24.a) Cl- > Cr- in Portuguese, see (22) above
(24.b) Cl- > Ci̯- in Italian
(25.a) wl- > l- in English, German, Old Norse (Lutz 1997)
(25.b) wl- > bl- in English, German (Lutz 1997), also in Classical Greek
(25.c) wl- > fl- in German dialect (Lutz 1997)
(25.d) wr- > r- in Scandinavian, English, German dialects (Lutz 1997)
(25.e) wr- > br- in English, German (Lutz 1997)
(26.a) χn- > n- in almost all of Germanic (Lutz 1997)
(26.b) χn- > gn-, kn- in Scandinavian dialects (Lutz 1997)
(26.c) χn- > sn- in Swedish (Lutz 1997)
(27.a) χl- > l- in almost all of Germanic (Lutz 1997)
(27.b) χr- > r- in almost all of Germanic (Lutz 1997)
(28.a) χw- > w- in almost all of Germanic (Lutz 1997)
(28.b) χw- > kw-/kv- in Scandinavian (Lutz 1997)
(29.a) fn- > n- in almost all of Germanic (Lutz 1997)
(29.b) fn- > sn- in English (Lutz 1997)
(29.c) fn- > pfn- in Upper German dialect (Lutz 1997)
(30.a) sn- > sin-, kl- > kil-, ml- > mil-, ry- > riy- etc. in Pali (Murray 1982)
(30.b) CC- > Cə.C- in Biblical Hebrew
(31.a) sC- > es.C- in Spanish
(31.b) sm- > ’es.m-, gd- > ’eg.d-, etc. in Phoenician (Krahmalkov 2001: 32)

For syllable contact changes one may read Murray and Vennemann (1982). An entire catalogue of types of syllable contact change, with exemplifications for every one of them, may be found in Vennemann (1988: 50–55). The catalogue is reproduced below in the Appendix.

6. On understanding phonological changes

It was said in section 5 that phonologists do not understand certain things happening in phonological systems. If “understand” is used in the sense of ‘be able to explain’, then this holds true for everything going on in phonological systems. Phonologists describe sound systems and their changes and generalize over them, but they cannot explain them. Only phoneticians can explain certain aspects of phonological systems and their changes (and not all of them either), by establishing phonetic correlates of speech sounds in context and generalizing over those correlates. This is quite typical of phonology: Phonological descriptions are pure syntaxes – phonotaxes, so to speak, which describe their objects but do not explain them. It is also true of cross-linguistic phonological universals, including phonological preferences: They are syntactic generalizations – phonotactic generalizations – which describe the phonological regularities of the world’s languages but do not explain them. To be sure, they explain specific statements by deriving them from more general ones within a theory, either of a particular language system or of all language systems. But why the generalization is valid, or the generalization which that generalization derives from, etc. – in short, why the axioms of the theory are valid – we do not know. We have to ask specialists in other disciplines, especially phoneticians. Or we have to learn enough phonetics etc. to know by ourselves. But then we know it as would-be phoneticians etc., not as phonologists.
Phoneticians in particular are the semanticists in the endeavor to understand the sound systems
of the world’s languages. They connect the phonologists’ phonotaxes to objects and events in the real world. This is the situation as a philosopher of science would describe it. In actual fact, phonologists have already adopted some basic universal phonetic knowledge from the phoneticians, socio-phonological knowledge from the sociolinguists, etc. When they speak of fricatives and labials and nasals and voiced sounds etc., rather than labeling their distribution classes with abstract names, they already mix aspects of the phonetic semantics of their descriptions into the phonotaxis. That is all right, as long as they know what they are doing. It is similar to the syntacticians’ naming certain distribution classes nouns, others verbs, yet others sentence adverbs etc. It incorporates aspects of the semantics of the syntactic descriptions and cross-linguistic generalizations into the syntactic descriptions themselves. This reaching into the other domain to find suggestive labels for the most general and recurring classes of objects in their own domain facilitates communication, both with oneself and with others. 7. Slope metathesis as consonant cluster complexity reduction The following are some tricky cases of sound change which – in times before phonologists’ thinking in terms of graded naturalness, or preferences, developed – were simply dubbed “metatheses at a distance”. Let us look at (32). (32) Lat. periculum > Span. peligro ‘danger’ : r - l > l - r We see r and l changing places, a clear case of metathesis, if there ever was one. How do we explain it? Do r and l simply exchange position in Spanish? Certainly not, because the change does not always happen, not even in words of the same rhythmic structure as peligro, cf. (33). (33) Lat. alacrem > alegre ‘lively, merry’ : l - r > idem So is (32) a simple case of confusion? Certainly not, see (34). (34) Lat. miraculum > milagro ‘miracle’ : r - l > l - r Lat. 
parabola > palabra ‘word’ : r - l > l - r (32) and (34) apparently follow a rule. Is the rule then to change r - l into l - r but not conversely? Not either, cf. (35). (35) Lat. aprilem > abril ‘April’ : r - l > idem So both l - r and r - l may remain unchanged, and the question is still why r - l metathesizes precisely in the environment set up by the group in (32) and (34), and there unexceptionably.
The answer is not trivial. It follows from a careful analysis of consonant cluster complexity, viz. of the consonant clusters resulting from the syncope that changes the word-final sequence -PVlV into -PlV. The resulting head cluster of obstruent plus liquid is improved – i.e., its structural complexity is diminished – by reducing the Consonantal Strength of the second cluster element, namely replacing the lateral by the central liquid, cf. (36).

(36) periculum > periglo > peligro : gl > gr
     miraculum > miraglo > milagro : gl > gr
     parabola > parabla > palabra : bl > br

And in order to avoid the doubling of r in two successive syllable heads, the first of the two liquids is dissimilatorily replaced by the lateral liquid removed from the slope, cf. (37).

(37) periglo > *perigro, !peligro    dissim. tendency: r - Cr > l - Cr
     miraglo > *miragro, !milagro    dissim. tendency: r - Cr > l - Cr
     parabla > *parabra, !palabra    dissim. tendency: r - Cr > l - Cr

This process is named “slope metathesis” (in Vennemann 1988). That may be a nice descriptive term but it covers up the motivating factors: the reduction of structural consonant cluster complexity, the dissimilatory power of syllable head sequences, and the desire to preserve as much as possible of the original phonic substance of the word in order to retain its phonic identity and recognizability.

Since speakers apply such subtle mechanisms as slope metathesis to reduce cluster complexity one may ask how and why consonant clusters arise in languages in the first place. There are several mechanisms, some operating outside phonology such as loanword adaptation, the univerbation of constituents in syntax and composition, and the addition of affixes to stems in morphology, others operating inside phonology.
The commonest among the latter are the copations, namely syncope which of necessity creates new clusters (as in the examples discussed in this section) or complexifies existing clusters, and apocope which may transform intersyllabic clusters into the least preferred kind of cluster, coda clusters: syncope:
    -V.CV.CV- > -VC.CV- or -V.CCV-
    -VC.CV.CV- > -VCC.CV- or -VC.CCV-
    -V.CV.CCV- > -VCC.CV- or -VC.CCV-
    etc.

apocope:
    -VC.CV > -VCC
    -VC.CCV > -VCCC
    -VCC.CV > -VCCC
    etc.

The only copation process that does not complexify but may even improve syllable structure is procope (apheresis), which eliminates unaccented initial syllables, especially naked ones. Needless to say the operations just illustrated are not motivated by syllable structure preferences. They derive from lexical, syntactic, morphological, and rhythmic processes that are motivated by their own preferences; the syllable structure complexifications they may cause are merely incidental to their outcome, “collateral damage”, so to speak. All language changes are local and follow their own preferential parameters; structures on other parameters may thereby be affected negatively – differentially because they may interfere and curb the primary change where the damage would be unacceptable.

The copations, as is well known, arise in informal, especially rapid speech. They evidence a preference for brevity, favoring short words over long words, the measure of shortness being the number of syllables: Copations may be defined as processes reducing the number of syllables of an expression by one. They affect the lexicon when their results stabilize in frequent use and in first language acquisition. Of course, this stability is only temporary: To the extent that the results are consonant clusters they are from this incipient moment onward subject to reductive pressures.
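The apocope patterns and the definition of copations as processes that reduce the number of syllables by one can be replayed on CV skeletons. The sketch below is a hedged illustration: the functions `syllables`, `apocope`, and `coda_cardinality` are hypothetical names, the skeletons are written without the leading hyphen, and the rule that stranded head consonants join the preceding coda is simply read off the three apocope patterns listed above.

```python
def syllables(form):
    """Split a dot-syllabified CV skeleton such as 'VC.CV' into its syllables."""
    return form.split(".")


def apocope(form):
    """Delete the word-final vowel; stranded consonants join the preceding coda.

    Assumption of this sketch: the final syllable ends in V and has no coda.
    """
    parts = syllables(form)
    if len(parts) < 2 or not parts[-1].endswith("V"):
        raise ValueError("apocope needs at least two syllables and a final vowel")
    stranded = parts[-1][:-1]  # head consonants left without a nucleus
    return ".".join(parts[:-2] + [parts[-2] + stranded])


def coda_cardinality(form):
    """Number of consonants after the nucleus of the final syllable."""
    last = syllables(form)[-1]
    return len(last) - (last.rfind("V") + 1)


if __name__ == "__main__":
    for before in ["VC.CV", "VC.CCV", "VCC.CV"]:
        after = apocope(before)
        # Each apocope removes exactly one syllable and enlarges the final coda.
        print(before, ">", after,
              len(syllables(before)) - len(syllables(after)),
              coda_cardinality(after))
```

Syncope is left out on purpose: after the loss of a medial vowel the stranded consonants can be syllabified in more than one way, as the alternatives in the syncope patterns show.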
8. Slope displacement as consonant cluster complexity reduction As the examples in section 7 show, structural complexity with regard to consonant clusters is a very complex concept. It certainly involves more than the structure of the cluster itself. It has long been known that what makes a cluster complex in a syllable head may make it simple in a syllable coda and conversely, head and coda being to some extent mirror images of each other, see (38). (38) $trV rather simple, Vtr $ very complex (Vrt $ less complex) But whether a cluster is more or less complex depends not only on its position in the syllable but also on the position of that syllable in larger structures, especially the word. Please look at (39).
(39) Slope displacement (Vennemann 1988, 1997; examples from Rohlfs 1972: §§322–323, cf. Spanish examples in Lipski 1992)
     Slope displacement is the transposition of a speech sound from the slope of one syllable to the corresponding slope of a syllable in the neighborhood.
     [Lat. catedra > VLat. (Pompeii) catecra >] +cadegra > cadrega ‘chair’ (Lombardy)

In the example the r is transposed from the g (gr-) to the d (dr-). The Lombards had a gr cluster before the metathesis, they have a dr cluster after the metathesis. What have they gained? Where is the structural simplification, the complexity reduction? The difference is that before the change the cluster stands in an unstressed syllable, after the change, in a stressed syllable, cf. (39.a).

(39.a) +ca.'de.gra > ca.'dre.ga

This is in harmony with the Stressed Syllable Law (40), which is part of the General Syllabication Law, i.e. (109) in the excerpts from Vennemann 1988 in the Appendix below.

(40) The Stressed Syllable Law: All syllabic complexities are less disfavored in stressed syllables than in unstressed syllables (Vennemann 1988: 58, 1997: 318).

See the additional examples in (39.b).

(39.b) ca.'pes.tro > ca.'pres.to ‘rope’ (Old Italian)
       ot.'to.bre > at.'tru.fu ‘October’ (Lucania, Campania)
       in.'te.gro > in.'treg ‘complete’ (Milanese)

A supporting simplifying factor may have been that the cluster occurs earlier in the word after the change, because there is evidence for the law in (41) and its specialization (41.a), cf. Vennemann 1997: 318.

(41) The Early Syllable Law: All syllabic complexities are less disfavored the earlier they occur within the word.

(41.a) The First Syllable Law: All syllabic complexities are less disfavored in first syllables than in later syllables.

See the example in (42).
(42) ca.'pes.tro > cra.'pes.tu ‘rope’ (Calabria) – cf. (39.b)

The First Syllable Law may even win against the Stressed Syllable Law, as (42) shows. It wins automatically, so to speak, if a resulting cluster would not be allowed in the language on independent grounds, see (43):

(43) fi.'nes.tra > fri.'nes.ta ‘window’ (Calabria) – *fi.'nres.ta

or the resulting structure would lose the cluster effect, see (44):

(44) te.'a.tro > tri.'a.tu ‘theatre’ (Sicilian) – ?ti.'ra.tu

But the First Syllable Law is even prone to drain the Stressed Syllable Law, see (45).

(45) pa.'dro.ne > pra.'do.ne ‘lord, employer’ (Tuscan dialects)
     com.'pra.re > crom.'pa.re ‘to buy’ (Tuscan dialects)
     dot.'tri.na > drot.'ti.na ‘doctrine’ (Tuscan dialects)
     cas.'tra.to > cras.'ta.o ‘castrated’ (Old Ligurian)

By far the most numerous cases, as is to be expected, are those in which both Laws work together, i.e. where the first syllable of the word is stressed, see (46).

(46) 'den.tro > 'dren.to ‘within’ (written language)
     'stu.pro > 'stru.po ‘defilement’ (written language)
     'fab.bro > 'frab.bo ‘smith’ (Tuscan dialects, also Umbria, Lazio)
     've.tro > 'vre.to ‘glass’ (Tuscan dialects)
     'ca.pra > 'cra.pa ‘goat’ (widespread in dialects of the south)
     'fàb.bri.ca > 'fráb.bi.ca ‘factory’ (widespread in dialects of the south)
     etc., etc.
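Slope displacement as defined in (39) is a purely structural operation, so it can be imitated on syllabified strings. The Python sketch below makes several assumptions that are not part of the analysis: the function names `split_head` and `displace_slope` are hypothetical, stress is marked by a leading apostrophe as in the examples, the slope is taken to be the last consonant of a head of at least two consonants, and changes to vowels or other segments (as in cra.'pes.tu) are ignored.

```python
VOWELS = set("aeiouàèìòùáéíóú")


def split_head(syllable):
    """Return (stress mark, head consonants, remainder) of one syllable."""
    stress = "'" if syllable.startswith("'") else ""
    body = syllable[len(stress):]
    i = 0
    while i < len(body) and body[i] not in VOWELS:
        i += 1
    return stress, body[:i], body[i:]


def displace_slope(word, source, target, slope="r"):
    """Move the slope consonant from the head of syllable `source`
    to the head of syllable `target` (0-based syllable indices)."""
    syls = word.split(".")
    stress, head, rest = split_head(syls[source])
    if len(head) < 2 or not head.endswith(slope):
        raise ValueError("source syllable has no such slope")
    syls[source] = stress + head[:-1] + rest
    stress, head, rest = split_head(syls[target])
    if not head:
        raise ValueError("target syllable has no onset to carry the slope")
    syls[target] = stress + head + slope + rest
    return ".".join(syls)


def is_stressed(syllable):
    """True if the syllable carries the stress mark used in the examples."""
    return syllable.startswith("'")


if __name__ == "__main__":
    print(displace_slope("ca.'de.gra", 2, 1))  # ca.'dre.ga, cf. (39.a)
    print(displace_slope("pa.'dro.ne", 1, 0))  # pra.'do.ne, cf. (45)
```

The function only performs the transposition. Whether the result is an improvement has to be judged separately, for instance by asking whether the target syllable is stressed or word-initial, in line with the Stressed Syllable Law and the First Syllable Law.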
9. Conclusion In the preceding sections of this paper it has been shown what structural complexity of consonant clusters and what change – especially reduction – of consonant cluster complexity may mean in phonology. The phoneticians will clarify and illustrate these terms in their own language. Since it counts as the hallmark of a good scientific approach to be compatible with approaches in neighboring disciplines, and since phonetics is the closest neighbor of
phonology, relating to it much as semantics does to syntax, it is to be hoped that the results of phonetic research in this domain fit together with the laws and analyses discussed above, in the sense that they may be understood as interpreting the phonotactic, i.e. phono-syntactic descriptions by relating them to objects and processes in the real world. If they do, we are on a successful course in both disciplines. If they do not, corrections become necessary, perhaps even a different conceptualization of the relationship between the two disciplines.
Appendix

Excerpts from Vennemann 1988. The laws are also cited and illustrated in Restle and Vennemann 2001. – Numbers refer to pages in Vennemann 1988, except for the Early Syllable Law and the First Syllable Law where they refer to Vennemann 1997.

Head Law (6)
A syllable head is the more preferred:
(a) the closer the number of speech sounds in the head is to one,
(b) the greater the Consonantal Strength value of its onset, and
(c) the more sharply the Consonantal Strength drops from the onset toward the Consonantal Strength of the following syllable nucleus.

Coda Law (25)
A syllable coda is the more preferred:
(a) the smaller the number of speech sounds in the coda,
(b) the less the Consonantal Strength of its offset, and
(c) the more sharply the Consonantal Strength drops from the offset toward the Consonantal Strength of the preceding syllable nucleus.

Nucleus Law (42)
A nucleus is the more preferred:
(a) the steadier its speech sound, and
(b) the less the Consonantal Strength of its speech sound.

Contact Law (67)
A syllable contact A$B is the more preferred, the less the Consonantal Strength of the offset A and the greater the Consonantal Strength of the onset B; more precisely – the greater the characteristic difference CS(B) – CS(A) between the Consonantal Strength of B and that of A.

General Syllabication Law (109)
A syllabication is the more preferred:
(a) the better the resulting syllable contact is and
(b) the better that syllable contact is embedded.
Here “better” is short for ‘more preferred [= less complex] in terms of the Syllable Contact Law’. The quality of embedding is defined in terms of the Consonantal Strength of the environment of the contact and the distribution of stress on both sides of the contact; cf. (108) in Vennemann 1988.

The Stressed Syllable Law (100)
All syllabic complexities are less disfavored in stressed syllables than in unstressed syllables.

The Early Syllable Law (318)
All syllabic complexities are less disfavored the earlier they occur within the word.

The First Syllable Law (318)
All syllabic complexities are less disfavored in first syllables than in later syllables.

Types of Syllable Contact Change (87)
(1) Tautosyllabication: A.B > .AB
(2) Gemination: A.B > A.AB
(3) Calibration
    (a) Coda weakening: A.B > C.B, where C is weaker than A
    (b) Head strengthening: A.B > A.C, where C is stronger than B
(4) Contact epenthesis: A.B > A.CB, where C is stronger than A
(5) Strength assimilation
    (a) regressive: A.B > C.B, where the Consonantal Strength of C is less than that of A and greater than or equal to that of B
    (b) progressive: A.B > A.C, where the Consonantal Strength of C is less than that of B and greater than or equal to that of A
(6) Contact anaptyxis: A.B > AV.B, where V is a vowel
(7) Contact metathesis: A.B > B.A
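Part (c) of the Head Law and the Contact Law both compare Consonantal Strength values, so their effect can be illustrated with a few lines of Python. This is a sketch under loud assumptions: the table `STRENGTH` holds hypothetical ordinal ranks that only respect the ordering plosive > fricative > nasal > l > r > semivowel visible in (11.a) and (12), the numbers themselves are not taken from the source, and the names `head_drop`, `contact_difference`, and `rank_heads` are invented for the example.

```python
# Hypothetical ordinal ranks for Consonantal Strength (assumption of this sketch).
# Only the ordering plosive > fricative > nasal > l > r > semivowel is taken
# from scales (11.a) and (12); the numbers themselves carry no further meaning.
STRENGTH = {"p": 6, "t": 6, "k": 6, "s": 5, "n": 4, "l": 3, "r": 2, "j": 1}


def head_drop(onset, slope):
    """Head Law (c): drop in strength from the onset to the next head sound.
    A larger drop means a more preferred head cluster."""
    return STRENGTH[onset] - STRENGTH[slope]


def contact_difference(offset, onset):
    """Contact Law: the characteristic difference CS(B) - CS(A) for a contact A$B.
    A larger value means a more preferred contact."""
    return STRENGTH[onset] - STRENGTH[offset]


def rank_heads(clusters):
    """Order two-consonant heads from least to most preferred by Head Law (c)."""
    return sorted(clusters, key=lambda c: head_drop(c[0], c[1]))


if __name__ == "__main__":
    print(rank_heads(["kr", "kn", "kl"]))  # ['kn', 'kl', 'kr'], cf. scale (11)
    print(contact_difference("l", "t"))    # 3: contact l$t
    print(contact_difference("t", "l"))    # -3: contact t$l, less preferred
```

Because only the ordering of the ranks is used, any other strictly decreasing assignment of numbers would yield the same ranking of kn, kl, and kr.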
References Byrd, Dani 1996
A phase window framework for articulatory timing. Phonology 13: 139–169. Browman, C. P., and Louis Goldstein 1986 Towards an articulatory phonology. Phonology Yearbook 3: 219– 252. Browman, C. P., and Louis Goldstein 1989 Articulatory gestures as phonological units. Phonology 6: 201–251. Browman, C. P., and Louis Goldstein 1992 Articulatory phonology: An overview. Phonetica 49: 155–180. Fisiak, Jacek 1980 Was there a kl-, gl- > tl-, dl-change in Early Modern English? Lingua Posnaniensis 23: 87–90. Greenberg, Joseph H. 1978 Some generalizations concerning initial and final consonant clusters. In: Joseph H. Greenberg (ed.), Universals of human language, 4 vols, vol. 1: Phonology, 243–279. Stanford, California: Stanford University Press. Krahmalkov, Charles R. 2001 A Phoenician-Punic grammar (Handbook of Oriental Studies, Section one: The Near and Middle East 54). Leiden: Brill. Lipski, John M. 1992 Metathesis as template-matching: A case study from Spanish. Folia Linguistica Historica 11 (1990 [1992]): 89–104, and 12 (1991 [1992]): 127–145. Luick, Karl 1914–1940 Historische Grammatik der englischen Sprache, 2 vols. Leipzig: Bernhard Tauchnitz. [Reprint Stuttgart: Bernhard Tauchnitz, 1964.] Lutz, Angelika 1991 Phonotaktische gesteuerte Konsonantenveränderungen in der Geschichte des Englischen (Linguistische Arbeiten 272). Tübingen: Niemeyer. Lutz, Angelika 1997 Lautwandel bei Wörtern mit imitatorischem oder lautsymbolischem Charakter in den germanischen Sprachen. In: Kurt Gustav Goblirsch, Martha Berryman Mayou and Marvin Taylor (eds.), Germanic studies in honor of Anatoly Liberman, 439–462. (NOWELE 31/32.) Odense: Odense University Press. Morelli, Frida 1998 Markedness relations and implicational universals in the typology of onset obstruent clusters. Proceedings of the Annual Meeting of the North Eastern Linguistic Society [NELS] 28, vol. 2. Available on the Internet at http://ebookbrowse.com/roa-251-morelli-2-pdf-d6710926 (24 April 2011).
30
Theo Vennemann
Morelli, Frida 1999
The phonotactics and phonology of obstruent clusters in optimality theory. Ph.D. Dissertation, University of Maryland at College Park. Available on the Internet at http://roa.rutgers.edu/view.php3?id=432 (24 April 2011). Murray, Robert W. 1982 Consonant cluster development in Pāli. Folia Linguistica Historica 3: 163–184. Murray, Robert W. 1988 Phonological strength and Early Germanic syllable structure (Studies in Theoretical Linguistics 1.) Munich: Wilhelm Fink. Murray, Robert W., and Theo Vennemann 1982 Syllable contact change in Germanic, Greek, and Sidamo. Klagenfurter Beiträge zur Sprachwissenschaft 8: 321–349. Restle, David, and Theo Vennemann 2001 Silbenstruktur. In: Martin Haspelmath, Ekkehard König, Wulf Oesterreicher and Wolfgang Raible (eds.), Sprachtypologie und sprachliche Universalien: Ein internationales Handbuch, II.1310– 1336. (Handbücher zur Sprach- und Kommunikationswissenschaft 20.) 2 vols. Berlin: Walter de Gruyter. Rochoń, Marzena 2000 Optimality in complexity: The case of Polish consonant clusters. (Studia Grammatica 48.) Berlin: Akademie-Verlag. Rohlfs, Gerhard 1972 Historische Grammatik der italienischen Sprache und ihrer Mundarten. (Bibliotheca Romanica 5.) 3 vols. Vol. I: Lautlehre. 2nd unchanged ed. [1st ed. 1949.] Bern: Francke. Sievers, Eduard 1901 Grundzüge der Phonetik zur Einführung in das Studium der Lautlehre der indogermanischen Sprachen. 5th ed. Leipzig: Breitkopf & Härtel. [Reprint Hildesheim: Georg Olms 1976.] Tillmann, Hans G., with Phil Mansell 1980 Phonetik: Lautsprachliche Zeichen, Sprachsignale und lautsprachlicher Kommunikationsprozeß. Stuttgart: Klett-Cotta. Trask, R. Larry 1997 The history of Basque. London: Routledge. Vennemann, Theo 1988 Preference laws for syllable structure and the explanation of sound change: With special reference to German, Germanic, Italian, and Latin. Berlin: Mouton de Gruyter. Vennemann, Theo 1989 Language change as language improvement. 
In: Vincenzo Orioles (ed.), Modelli esplicativi della diacronia linguistica: Atti del Convegno della Società Italiana di Glottologia, Pavia, 15–17 settembre 1988, 11–35. Pisa: Giardini Editori e Stampatori. [Reprinted in:
Charles Jones (ed.), Historical linguistics: Problems and perspectives, 319–344. London: Longman, 1993.] Vennemann, Theo 1997 The development of reduplicating verbs in Germanic. In: Irmengard Rauch and Gerald F. Carr (eds.), Insights in Germanic linguistics II: Classic and contemporary, 297–336. (Trends in Linguistics, Studies and Monographs 94.) Berlin: Mouton de Gruyter. Vennemann, Theo 2000 Triple-cluster reduction in Germanic: Etymology without sound laws? Historische Sprachwissenschaft (Historical Linguistics) 113: 239–258.
On the relations between [sonorant] and [voice]

Rina Kreitman

Abstract

In previous literature it has been reported that the features [sonorant] and [voice] are closely related. Voicing has long been linked to the feature [sonorant] as one of its phonetic correlates, since voicing is one of the attributes common to all sonorant consonants. It has been suggested that the distribution of the feature [voice] in clusters can be predicted from the behavior of the feature [sonorant]. If sonority reversed clusters are prohibited, “voicing reversals”, a situation where voicing decreases within a cluster pre-vocalically, should not be tolerated either (Lombardi 1991). Here, I report on a cross-linguistic typological study of the distribution of these two features in word-initial onset clusters and how they relate to one another. The different typological patterning of the two features and their internal markedness imply that it is impossible to predict the typological patterning of clusters in terms of one of these features based on the other. A language can be of one type in terms of [sonorant] but of a different type in terms of [voice]. The typology presented can further predict language type shifts due to historical changes. The prediction is: no matter the stage the language is in, it must become a type of language predicted by the typology.
1. Introduction

In this paper I explore the relationship between two phonological features: [sonorant] and [voice].1 I focus on the typologies of biconsonantal word-initial onset clusters along these two dimensions. I begin section 2 by presenting the typology of onset clusters in terms of the feature [sonorant]. In section 3, I show that, despite claims in the literature to the contrary, the rare cluster type [+voice][−voice] ([+v][−v]) is empirically attested cross-linguistically; after establishing the existence of the [+v][−v] cluster type, I present the typology of the feature [voice] in clusters. I further address claims in the literature that the patterning of onset clusters in terms of [sonorant] on the one hand, and in terms of [voice] on the other, are closely correlated (Lombardi 1991, Morelli 1999, Steriade 1997). My own findings, to be reported here, do not support this position. Rather, I show that the organization of onset clusters in terms of the feature [sonorant] follows a different pattern from the organization of onset clusters in terms of the feature [voice]. I show that the claim that [+voice][−voice] clusters are closely correlated with SO clusters (Lombardi 1991) is untenable. While it is possible that the two features [sonorant] and [voice] are closely linked phonetically (Parker 2002, 2008), it is not immediately transparent that they are mutually dependent. As will become evident from the typologies presented here, the typological patternings of the two features are entirely independent of each other and therefore, these two features cannot be reduced to a single feature. Moreover, I show that the patterning of one feature does not provide any clues about the typological patterning of the other feature. Furthermore, the markedness relations of clusters in terms of the feature [sonorant] are quite different from markedness relations in terms of the feature [voice], which will become evident in the discussion in section 4.

1. Languages which are argued to rely on features other than [voice] to distinguish between obstruents were excluded from the survey, as will become evident in section 3.

The typologies I present are a result of a cross-linguistic survey, which included 63 languages from 22 language families. The typologies presented here are based strictly on the phonological features [sonorant] and [voice]. It is important to note that in this work I discuss the feature [sonorant], which partitions the consonant set into two classes: the class of obstruents and the class of sonorants. Following Zec (1995), I address only the classes of obstruents and sonorants and do not address any further distinctions within these classes. In other words, the phonological feature [sonorant] is not equated to the commonly used property sonority, expressed in terms of a scale.
This paper does not address the further fine-grained distinctions found in more elaborate sonority scales or the behavior of such sonority scales but rather, it explores the relationship of the feature [sonorant] and the feature [voice].
2. Clustering of sonorants (S) and obstruents (O)

In word-initial, bi-consonantal onset clusters there are four logical combinations of obstruents (O), standing for [–sonorant] consonants, and sonorants (S), standing for [+sonorant] consonants. The four logical possibilities for combining obstruent (O) and sonorant (S) consonants in an onset cluster are as in (1):

(1) a. OS
    b. OO
    c. SS
    d. SO

In the obstruent (O) class only consonantal segments specified for [–sonorant] are included; this includes both stops and fricatives. Conversely, only segments
specified for [+sonorant] are included in the sonorant (S) class. For the purpose of this survey only, this latter group consisted of liquids and nasals. Glides were excluded for reasons listed in (5).

Logically, a language can have any of the clusters in (1), or any combination of them, or none. A language that has none of the clusters listed in (1) is, of course, a language that does not allow any consonantal clusters. We examine only those languages which allow at least one of the clusters listed in (1). Given the cluster combinations in (1), a priori there are fifteen logical possibilities for combining these clusters into groups of one to four cluster types. Therefore, a priori there are fifteen logically possible language types, as in (2). If a language L has only one of the onset clusters listed in (1), it can, a priori, be any one of them, as in (2a). If a language has two of the onset clusters in (1), it can, a priori, be any of the sets listed in (2b). If a language has three of the onset clusters in (1), it can have any of the sets listed in (2c). Finally, it is logically possible for a language to have all four onset clusters listed in (1), as in (2d). A language that has no onset clusters constitutes an empty group, { }, which is a sixteenth logically possible language type and is excluded from this study.

(2) a. 1 cluster   b. 2 clusters   c. 3 clusters   d. 4 clusters
    {OS}           {OS,OO}         {OS,OO,SS}      {OS,OO,SS,SO}
    {OO}           {OS,SS}         {OS,OO,SO}
    {SS}           {OS,SO}         {OS,SS,SO}
    {SO}           {OO,SS}         {OO,SS,SO}
                   {OO,SO}
                   {SS,SO}

In sum, in (2) I list all fifteen logically possible language types (excluding the empty group). The question arises which of the logically possible language types in (2) are occurring language types. To address this, I conducted a cross-linguistic survey of languages which allow word-initial onset clusters. The methodology of the survey is outlined in section 2.1 and the results of the survey are presented in section 2.2.

2.1. The survey – methodology

The cross-linguistic typological survey I conducted is based on 63 languages from 22 language families. A complete list of the languages included in the survey is provided in appendix I of this paper. The survey includes languages which were included in Greenberg (1965), Levin (1985), Morelli (1999) and Steriade (1982), as well as 24 languages that had not been included in any
earlier cross-linguistic typological studies. My survey includes only those languages that have onset clusters, which is not the case with Greenberg’s survey. This automatically excluded Persian, for example, which is a language with no onset clusters included in Greenberg’s survey but not in this survey. Moreover, when I consulted the sources for some of the languages included in Greenberg’s, Levin’s, Morelli’s and Steriade’s studies, I decided that some (e.g. those for Eggon and Nisqually (Maddieson 1981)) did not contain enough information to be safely included in this survey. That is, it was not clear if all the languages fulfilled all the criteria listed in (3) and (5) below. The survey relies on descriptive grammars and grammar books as well as additional research material where available. Multiple sources were consulted, and data from several sources compared, whenever such data were available. The greatest challenge in this typological study was to distinguish between a sequence of two consonants and a cluster. Therefore, a host of criteria were assumed regarding the status of a sequence of consonants. The basic criterion for including a language in the survey is whether it allows consonantal clusters word initially. To be precise, the word initial consonant sequence CiCj is taken to be an onset cluster if it does not contain a morpheme boundary or any intervening phonological material as stated in (3): (3) Onset Cluster
Let CiCj be a word-initial sequence of consonants. The sequence CiCj is an onset cluster iff:
(i) There is no morpheme boundary between Ci and Cj: (Ci and Cj are tauto-morphemic). (ii) There is no segment Si such that CiSiCj (there is no intervening material between CiCj). (iii) CiCj are linked to the same syllable node. It should be noted that all sequences which conform to (3), including sequences of segments which violate the Sonority Sequencing Principle (Selkirk 1984), constitute regular onset clusters for the purposes of this survey. In this, I depart from proposals in the literature (Levin 1985, Steriade 1982 among many others) which grant a special status to clusters which violate the Sonority Sequencing Principle, for example, by associating the first member of a cluster with declining sonority to prosodic levels higher than the syllable. The Sonority Sequencing Principle (SSP) states the strong cross-linguistic tendency for syllables to rise in sonority towards the peak and fall in sonority towards the margins. The SSP as formulated in Selkirk (1984) is given in (4). However, while (4) constitutes a strong cross-linguistic tendency, it is not equally obeyed by all types of languages, as will be demonstrated in this work:
(4) Sonority Sequencing Principle (SSP) In any syllable, there is a segment constituting a sonority peak that is preceded and/or followed by a sequence of segments with progressively decreasing sonority values. (Selkirk 1984: 116) The structure of onsets and their role within the syllable has long been debated in the literature (Clements and Keyser 1983, Clements 1990, Davis 1985, Gordon 1998, Hyman 1985, Itô 1989, Kahn 1976, McCarthy and Prince 1986, Selkirk 1982, Zec 1988, among others). The question whether sequences which do not conform to the SSP can form a real cluster in the phonological sense is highly controversial. Solutions in the form of sesqui-syllables, headless syllables, extrametrical material and appendices, amongst others, have been proposed in the literature to account for deviant and non-compliant onsets (Cho and King 2003, Everett and Everett 1984, Goedemans 1998, Gussmann 1992, Levin 1985, Nepveu 1994, Rialland 1994, Steriade 1982, Thomas 1992 among others, also see an elaborate discussion and summary in Vaux 2004). However, this hotly debated and complex issue is outside the scope of this paper. As mentioned above, I diverge from these proposals in the literature by treating even sequences with sonority reversals as clusters. A language was excluded from the survey if its clusters did not conform to one (or more) of the conditions listed in (3). Moreover, any of the circumstances in (5) would exclude a language from the survey:2
2. Languages were also excluded for technical reasons, for example, if sources of data were incomplete or inconclusive. Some sources, for example, Matthews (1955) for Dakota, and Hoff (1968) for Carib, do not make a clear distinction between word-initial and word-medial clusters, which makes it impossible to distinguish them. Moreover, for Dakota, different grammars listed different possible clusters. Also excluded were languages for which data from different sources were inconsistent. One such example is Chukchee (Bogoras 1922, Kenstowicz 1981 and Levin 1985 among others). Some sources claim that Chukchee contains initial clusters (Levin 1985 following Bogoras 1922) while others (Kenstowicz 1981) claim that clusters in Chukchee are broken by vowel epenthesis. Skorik (1961) explains that in some Chukchee words consonantal sequences can appear either with or without a vowel word-initially, but when the same sequence appears in an onset position word-medially, it must appear with the vowel between the two segments or with a preceding vowel, suggesting that consonant sequences are not truly clusters underlyingly. This is confirmed in Asinovskii’s (1991) acoustic data.
(5) Additional conditions and criteria for excluding languages from the survey:

(i) A language was excluded if it had only obstruent + glide clusters. For example, Korean, which has obstruent + glide clusters such as py and gw, was not included.3

(ii) Also excluded from the survey were languages with only homorganic nasal + obstruent clusters such as mb and nd. For example, Babungo (Schaub 1985) has only simplex onsets and pre-nasalised onsets and no other clusters. The phonological status of pre-nasalised sequences is not immediately transparent. Such sequences can be a cluster or a pre-nasalised segment (Maddieson and Ladefoged 1993, Riehl 2008). Without more information about the phonological status of these sequences, it is impossible to determine whether a specific sequence is a cluster or simply a pre-nasalised unary segment. Languages which have non-homorganic nasal-obstruent sequences in addition to homorganic nasal-obstruent sequences were included in the survey. For example, if a language has mb clusters but also mt or mk clusters (Taba, Bowden 2001), then the language was included in the survey but the homorganic clusters were excluded (i.e. they were not counted as SO clusters since their underlying status is not always transparent, and they may or may not be clusters). The non-homorganic clusters were included in the survey.

(iii) Also excluded were languages which have only h + obstruent or ʔ + obstruent clusters, or obstruent + h and obstruent + ʔ clusters, such as Comanche (Riggs 1949), since these may function as pre- or post-aspiration or glottalization.4

In sum, the survey focuses on languages which allow bi-consonantal word-initial onset clusters. Some of the languages included in the survey, such as Chatino (McKaughan 1954), Georgian (Butskhrikidze 2002), and Polish (Sawicka 1974), to name a few, allow clusters longer than two consonants but those clusters were not the focus of this survey.

3. Clusters with glides as the second member are not included in this survey. Surface glides may have a different underlying status.
They may be underlying glides that surface as glides or they may be underlyingly vowels that surface as glides (Levi 2004, 2008). Due to the lack of transparency in the underlying status of glides, clusters with glides were excluded from the survey altogether.

4. Mazatec (Steriade 1994) and Temoayan Otomi (Andrews 1949) are examples of languages which have mostly pre- and post-aspirated and pre- and post-glottalised sequences as well as pre-nasalised sequences; therefore, they were excluded from the survey altogether.
2.2. Results of survey

According to the survey, of the fifteen logically possible language types listed in (2) only four emerge as occurring language types, as in (6):

(6) Type 1  {OS}
    Type 2  {OS, OO}
    Type 3  {OS, OO, SS}
    Type 4  {OS, OO, SS, SO}

Evident from (6), summarized in Table (1), are the implicational relations between the various clusters. If a language allows only one type of cluster it is OS. If a language has OO clusters, it will also allow OS clusters. If a language has an SS cluster, it will also have OO and OS clusters. And lastly, if a language has SO clusters it will allow all other clusters: SS, OO and OS.

Table 1. Attested language types: feature [sonorant].

Type     OS   OO   SS   SO   Language
Type 1   ✓                   Basque, Wa
Type 2   ✓    ✓              Kutenai, Modern Hebrew
Type 3   ✓    ✓    ✓         Greek, Irish
Type 4   ✓    ✓    ✓    ✓    Georgian, Russian, Pashto
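
The implicational reading of Table (1) can be checked mechanically. The following is a minimal sketch, not part of the survey itself: it enumerates every non-empty set of the four cluster types and keeps only those sets that respect the relations stated in the text (SO requires SS, SS requires OO, OO requires OS). The names `CLUSTERS`, `IMPLICATIONS`, `language_types` and `respects_implications` are illustrative assumptions.

```python
from itertools import combinations

# The four cluster types from (1).
CLUSTERS = ("OS", "OO", "SS", "SO")

# Implicational relations as stated in the text: the first member of each
# pair may only occur in a language that also has the second member.
IMPLICATIONS = (("SO", "SS"), ("SS", "OO"), ("OO", "OS"))


def language_types(clusters=CLUSTERS):
    """Return every non-empty set of cluster types (the a priori language types)."""
    return [
        frozenset(combo)
        for size in range(1, len(clusters) + 1)
        for combo in combinations(clusters, size)
    ]


def respects_implications(language, implications=IMPLICATIONS):
    """True if every cluster type in the language has its implied type as well."""
    return all(implied in language
               for implying, implied in implications
               if implying in language)


all_types = language_types()
predicted = [t for t in all_types if respects_implications(t)]

print(len(all_types))   # number of logically possible language types
for t in predicted:     # the types compatible with the implications
    print(sorted(t))
```

Under these assumptions the enumeration yields fifteen logically possible types, of which exactly the four sets in (6) survive the filter.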
In sum, evident from Table (1) are the implicational relations captured in (7). The implicational relations in (7) are all unidirectional and without exceptions in the languages of the survey. Next, I single out crucial asymmetries evident in Table (1) and the implicational relations in (7).

(7) SO ⇒ SS ⇒ OO ⇒ OS

First, there is an asymmetry between the right and left edges of the implicational relations. The presence of SO clusters implies the presence of all other clusters while OS clusters are implied by all other clusters. This asymmetry is expected given that SO is of falling sonority, that is, violates the SSP, while OS has a rise in sonority, i.e. conforms to the SSP. Based on the SSP we expect clusters with rising sonority to occur more frequently than clusters with reversed sonority. It is important to note that in this work an increase or a rise in sonority means an increase from a negative value of the feature [sonorant] to a positive one. That is, there is an increase in sonority from an
obstruent segment ([–sonorant]) to a sonorant ([+sonorant]). Similarly, a decrease in sonority denotes a shift from a positive value of the feature [sonorant] to a negative one, as in SO ([+sonorant][–sonorant]) sequences. For the purpose of the survey of the feature [sonorant] both nasals and liquids are treated as a single class of sonorants specified [+sonorant], without the finer distinctions often found in more elaborate sonority scales. This mirrors the treatment of stops and fricatives as a single class of obstruents [–sonorant] with no finer distinctions (Zec 1995).

Secondly, we observe an asymmetry between OO and SS clusters. The question that arises is: why does SS imply OO, while OO does not imply SS and the two do not symmetrically imply each other (*OO ⇔ SS)? This results in {OS, OO} being an occurring language type but *{OS, SS} and *{OS, SS, SO} being non-occurring language types. Given the SSP, which demands a rise in sonority, it is not immediately transparent why {OS, OO} is an occurring language type but *{OS, SS} is not. Both OO and SS are of flat sonority, so why is there a language type which adds only OO clusters to OS (Type 2), but no language type which adds only SS clusters? There are several reasons which, I suggest, make OO clusters less marked than SS clusters. One possible reason is the acoustic salience of obstruents as opposed to sonorants. Obstruents are perceptually more salient than sonorants, therefore their combinations are also more salient.
Ohala (1983: 193) notes: “Obstruents, especially those that involve a transient burst due to the rapid equalization of an appreciable difference in air pressure, create more rapid spectral changes and thus are able to carry more information and make more distinctive sounds than non-obstruents.” Thus, obstruents, due to their acoustic attributes, when released, carry more information, especially in onset (or word initial) position and are therefore easier to distinguish from non-obstruents. We may deduce that their combinations are also more acoustically salient than combinations of sonorants and are therefore perceptually more advantageous. This might also explain another cross-linguistic observation made by Lindblom and Maddieson (1988), which is also the second reason OO clusters may be less marked than SS clusters. According to Lindblom and Maddieson (1988), phonemic inventories of languages tend to have a distribution of roughly 70% obstruents and 30% sonorants. This results in greater clustering possibilities for obstruents than for sonorants simply because there are more obstruents than sonorants. Therefore, the greater markedness of SS clusters stems simply from the mathematical reality that there tend to be fewer sonorants than obstruents in cross-linguistic phonemic inventories.
Finally, Kreitman (2008) presents the sub-typology of both OO clusters, based on Morelli (1999), as well as SS clusters. She finds that while markedness in the typology of OO clusters is based on manner of articulation, markedness in SS clusters is a combination of both manner and place of articulation. In SS clusters differences in place become crucial to reinforce their perceptual salience. For obstruents, however, manner of articulation alone is sufficient to distinguish between possible members of a cluster. The additional layer of complication in the internal markedness of SS clusters may account for the fact that they are more marked than OO clusters. For a full account of the asymmetry between OO and SS clusters, see Kreitman (2006, 2008), where liquids and nasals and stops and fricatives are extensively addressed separately and the sub-typologies of OO and SS are presented in detail.

2.3. Distributional facts

Table 2. Cross-linguistic distribution of clusters: feature [sonorant].

             OS      OO      SS      SO
# of langs   63/63   54/63   32/63   19/63
%            100%    85.7%   50.8%   30%
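
The percentages in Table (2) follow directly from the language counts. As a small worked check, assuming nothing beyond the counts given in the table, the sketch below recomputes each share and the ratio of OO to SS languages; the variable names are illustrative only.

```python
# Language counts per cluster type, copied from Table (2).
TOTAL_LANGUAGES = 63
counts = {"OS": 63, "OO": 54, "SS": 32, "SO": 19}

# Share of surveyed languages admitting each cluster type, in percent.
shares = {cluster: round(100 * n / TOTAL_LANGUAGES, 1)
          for cluster, n in counts.items()}

# How many times more common OO languages are than SS languages.
oo_to_ss = counts["OO"] / counts["SS"]

for cluster, share in shares.items():
    print(f"{cluster}: {share}%")
print(f"OO/SS ratio: {oo_to_ss:.2f}")
```

To one decimal place the SO share comes out slightly above the rounded 30% shown in the table, and the OO/SS ratio is about 1.7, in line with the statement that OO clusters are more than one and a half times as common as SS clusters.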
Table (2) presents the distributional data regarding each cluster type. From Table (2) it is evident that if a language allows a consonantal cluster word initially it will allow an OS cluster. More surprising is the frequency of OO, SS and SO cross-linguistically. First, 30% of the languages in the survey admit one or more SO clusters. This number is quite significant making SO clusters much more common than previously assumed. They are not anomalies occurring rarely; rather they occur cross-linguistically in languages as varied as Russian (Indo-European) and Hua (Trans New-Guinea). Secondly, the asymmetry between OO and SS clusters is quite robust with OO clusters being more than one and a half times more common than SS clusters, although both constitute sonority plateaus. 3. Voicing typology In section 2 I presented a typology of word initial onset clusters based on the feature [sonorant]. To test previous claims about the correlation between voicing and sonority, we now turn to the typology of word initial biconsonantal onset clusters based on the feature [voice]. Since only obstruent clusters are
specified for [voice], only a subset of the clusters examined in section 2 will be the focus of this section. 3.1. Voicing combinations In word-initial, biconsonantal onset clusters there are four logical combinations of voiced ([+v]) and voiceless ([−v]) obstruents as in (8): (8) a. [–v][–v] b. [+v][+v] c. [–v][+v] d. [+v][–v] Previous studies of voicing in clusters (Lindblom 1983, Lombardi 1991, 1995, 1999, Wetzel and Mascaró 2001, Wheeler 2005, among others) accept (8a–c) as possible clusters, but there are conflicting reports in the literature regarding cluster (8d). While Blevins (2003), Greenberg (1965), and Steriade (1997) all report that (8d) is an attested cluster, Lombardi (1991, 1999) and Lindblom (1983) categorically reject it from the set of occurring clusters. Let us review these conflicting claims in depth. In phonological studies such as Lombardi (1991, 1999), clusters of type (8a–c) are assumed to be possible clusters but clusters of type (8d) are excluded from the set of occurring clusters. In other words, the configuration in (9), in which a voiced obstruent precedes a voiceless obstruent in prenuclear position, is taken to be ill-formed: (9) *Voiced Obstruent – voiceless obstruent – syllable nucleus
Lombardi claims that the prohibition against [+v][−v] onset clusters is universal. According to her, voiced segments may occur only before a sonorant segment, either a vowel or a sonorant consonant. Her argument continues that a [+v][−v] obstruent cluster cannot be an occurring cluster type because voiceless segments cannot intervene between a voiced obstruent and a vowel. She refers to the figure in (9) as a “Universal Sonority Constraint”, “. . . an absolute universal which no language can violate.” (1991: 59). Moreover, Lombardi correlates the prohibition in the figure in (9) with the prohibition on sonority reversed clusters. For her, “voicing reversals” are comparable to SO clusters. As we will see in the next section, this parallel is untenable. Likewise, Lindblom (1983) claims, based on the principle of gestural economy, that [+v][−v] clusters should be excluded on phonetic grounds.
According to Lindblom, [+v][−v] is an illicit structure for the following reasons: We should also mention here the absence of rapid intrasyllabic alternations of inspiration and expiration. The only universally preferred airstream mechanism appears to be expiratory. A similar no doubt energy-saving arrangement is observed in the distribution of phonation types. . . Clusters do not allow *[+voiced C] [−voiced C]V initially, nor its mirror image finally. (Lindblom 1983: 240) [my emphasis].
However, Greenberg’s (1965) survey based on 104 languages found the following statistics on voicing in initial clusters:

(10) (a) Voiceless + voiceless   [–v][–v]   66.7%
     (b) Voiced + voiced         [+v][+v]   21.65%
     (c) Voiceless + voiced      [–v][+v]   10.68%
     (d) Voiced + voiceless      [+v][–v]   0.97%
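
As a quick arithmetic check on (10), the sketch below sums the four reported shares and the two mixed-voicing rows. It uses only the figures quoted in (10); exact decimal arithmetic is used so that no floating-point rounding enters the totals, and the dictionary name is an illustrative assumption.

```python
from decimal import Decimal

# Shares of initial clusters by voicing pattern, as quoted in (10).
GREENBERG_SHARES = {
    "[-v][-v]": Decimal("66.7"),
    "[+v][+v]": Decimal("21.65"),
    "[-v][+v]": Decimal("10.68"),
    "[+v][-v]": Decimal("0.97"),
}

# All four rows together should account for every cluster.
total = sum(GREENBERG_SHARES.values())

# Mixed-voicing clusters are the two rows whose members differ in [voice].
mixed = GREENBERG_SHARES["[-v][+v]"] + GREENBERG_SHARES["[+v][-v]"]

print(f"total: {total}%")
print(f"mixed voicing: {mixed}%")
```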
From (10) it is clear that the majority, two thirds (66.7%), of the clusters in Greenberg’s survey are sequences of voiceless obstruents [–v][–v]. All other cluster types constitute the remaining third. Of these, almost 22% are [+v][+v] and just under 11% are [–v][+v] clusters. Under 1% of all clusters are [+v][–v] clusters. However, the numbers are somewhat misleading since Greenberg does not separate obstruent clusters from sonorant clusters. A tn cluster, for example, is considered a [–v][+v] cluster. This skews the numbers of [–v][+v] clusters and [+v][+v] clusters, making it difficult to interpret the statistical data correctly. Evident from this survey is that clusters in which both members are voiceless are preferable to clusters with any other voicing combination. Mixed voicing clusters are a small minority at just under 12% of all clusters (10.68% + 0.97% = 11.65%), but both [–v][+v] and [+v][–v] clusters exist. However, while Greenberg accepts the existence of [–v][+v] clusters, he doubts the existence of [+v][–v] clusters, although his survey lists two languages, Bilaan and Khasi, for which obstruent [+v][–v] clusters have been reported. Since Greenberg’s sources for Bilaan and Khasi (Dean 1955 and Rabel 1961 respectively) presented no phonetic evidence for [+v][–v] obstruent clusters, Greenberg allows for the possibility that the reported [+v][–v] clusters are phonetically realised as [–v][–v]; clusters like bt reported for Khasi and bs reported for Bilaan might actually be phonetically realised as pt and ps respectively. Since Bilaan does not distinguish between b and p, and contains only b in its phonemic inventory, it is possible that the cluster bs listed in the grammar is phonetically realised as
ps. With no phonetic evidence for Bilaan, it is impossible to determine how the cluster bs is realised. Blevins (2003) and Steriade (1997) also mention the existence of [+v][–v] clusters. Both recognise the existence of these clusters in Khasi based on descriptive grammars but do not pursue their phonetics. They use [+v][–v] clusters to argue for licensing by cue. That is, voicing distinctions are more likely to occur in perceptually advantageous environments in which the perception of voicing is enhanced. In an OO environment, cues for voicing are rather poor and therefore voicing is less likely to occur in this environment (particularly if voicing is contrastive within a cluster as in the case of [–v][+v] or [+v][–v]). Steriade (1997: 7) argues that lack of perceptual cues in an OO environment accounts for the rarity of languages with [+v][–v] clusters word initially. Although, both Blevins and Steriade examine the influence of perceptual cues on the distribution of voicing in obstruents they do not pursue the phonetic implementation of such clusters in languages that do have [+v][–v] clusters. To summarise, we encounter conflicting claims in the literature concerning [+v][–v] sequencing. On the one hand, based on grammatical descriptions, Blevins (2003), Greenberg (1965, 1978) and Steriade (1997) document the existence of [+v][–v] sequences word initially. On the other hand, Lindblom (1983) and Lombardi (1991, 1995) exclude [+v][–v] word initial sequences as permissible clusters on theoretical grounds, both in the phonology and in the phonetics. To address the conflict between phonetic and phonological theories on the one hand and scarce empirical evidence on the other, a cross-linguistic typological study was conducted to establish the distributional facts of voicing in word initial onset obstruent clusters. The cross-linguistic typological study is supported by acoustic evidence from at least three different languages, Khasi, Tsou and Modern Hebrew. 3.2. 
Evidence for the [+v][–v] cluster type The languages included in the survey are the same languages used in the survey outlined in section 2.1. However, while the earlier survey included languages with clusters containing both obstruents and sonorants, the present survey includes only languages with obstruent clusters. That is, only 54 languages of the 63 surveyed for the feature [sonorant] were surveyed for the feature [voice]. Some languages such as German, Irish, Klamath and Welsh, to name a few, although they do have OO clusters, were not included in this survey, bringing the number of languages included in this survey to 47. In these languages the distinction between orthographic p, t, k and b, d, g is
claimed to be based on the feature [spread glottis] (Iverson and Salmons 1995, Jessen 2001, Jessen and Ringen 2002 among others). That is, they are claimed to have a distinction between aspirated and unaspirated stops rather than voiced and voiceless stops. For some of these languages (German), there are conflicting claims regarding the proper distinctive laryngeal feature. Since the nature of the distinctive feature in these languages is controversial but is outside the scope of this work, these languages were excluded from the survey for the feature [voice]. The methodology I employed is the same as outlined in section 2.1. The languages included in this section are also listed in appendix I. Results of the survey indicate that in reality clusters of the [+v][−v] type do occur albeit they are rare. Six languages are reported to contain such clusters and three cases have been documented with supporting phonetic evidence in the literature: (i) Khasi in which, dk in dkar ‘tortoise’ is distinct from tk in tkor-tkor ‘plump and tender.’ According to Henderson (1991), dissimilation of voicing is a widespread feature in Khasi. However, few phonetic details are available, and, unfortunately, in the case of the only instrumental investigation (Henderson 1991 reproduced in Kreitman 2008, 2010) it is not clear that the material was produced by a native speaker. (ii) Tsou in which ɓs is distinct from ps (Wright 1996 reproduced in Kreitman 2008, 2010);5 (iii) Modern Hebrew in which dk in dkalim ‘palms’ is distinct from tk in tkarim ‘flat tires’ and dg in dgalim ‘flags’ (Kreitman 2008, 2010). Figure 1 is a spectrogram of the word dkalim ‘palm trees’ from Modern Hebrew (Kreitman 2008). It is an illustrative sample which provides acoustic phonetic evidence for the existence of [+v][−v] clusters in addition to the evidence available from Khasi and Tsou. 
A much wider range of utterances and many more examples of the occurrence of [+v][−v] clusters can be found in Kreitman (2008, 2010). Given these facts, Lombardi’s cross-linguistic prohibition against [+v][−v] clusters and Lindblom’s prediction that [+v][−v] clusters cannot be produced have no empirical basis.

5. The phonological classification of implosives is not always transparent. In some languages they pattern with obstruents, but in others they do not. Based on the phonological patterning of implosives in some languages, as well as articulatory evidence from the production of implosives, it has been proposed that implosives should be treated neither as obstruents nor as sonorants (Clements and Osu 2002). According to Clements and Osu, implosives are specified [−obstruent, −sonorant]. In this paper, however, they are treated as obstruents.
Figure 1. Modern Hebrew: dkalim ‘palms’ (Kreitman 2008)
3.3. Typology of the feature [voice]
Now that we have established the existence of the [+v][–v] cluster type phonetically, we are ready to address the typology of the feature [voice] in word initial onset clusters. In (11) all the fifteen logically possible language types that result from all possible groupings of the clusters in (8) are listed. Of the fifteen logically possible combinations, only six occurring language types emerge. Table (3) summarises all occurring language types.

(11) (a) {[−v][−v]}
         {[+v][+v]}
         {[−v][+v]}
         {[+v][−v]}
     (b) {[−v][−v], [+v][+v]}
         {[−v][−v], [−v][+v]}
         {[−v][−v], [+v][−v]}
         {[+v][+v], [−v][+v]}
         {[+v][+v], [+v][−v]}
         {[−v][+v], [+v][−v]}
     (c) {[−v][−v], [+v][+v], [−v][+v]}
         {[−v][−v], [+v][+v], [+v][−v]}
         {[−v][−v], [−v][+v], [+v][−v]}
         {[+v][+v], [−v][+v], [+v][−v]}
     (d) {[−v][−v], [+v][+v], [−v][+v], [+v][−v]}

Table 3. Attested language types: feature [voice].

         [−v][−v]   [+v][+v]   [−v][+v]   [+v][−v]   sample languages
Type 1   ✓                                            Dutch, Kutenai
Type 2   ✓          ✓                                 Greek, Romanian
Type 3   ✓          ✓          ✓                      Georgian
Type 4   ✓          ✓          ✓          ✓           Tsou, Khasi
Type 5   ✓                     ✓                      Biloxi, Camsa
Type 6   ✓                     ✓          ✓           Bilaan, Amuesha
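The count of fifteen logically possible language types in (11) can be checked mechanically. The following is a minimal sketch, not part of the original study: it enumerates every non-empty subset of the four cluster types and groups them by size, mirroring (11a) to (11d). The names CLUSTERS and candidate_inventories are illustrative assumptions.

```python
from itertools import combinations

# The four obstruent cluster types, written with ASCII signs.
CLUSTERS = ["[-v][-v]", "[+v][+v]", "[-v][+v]", "[+v][-v]"]


def candidate_inventories(clusters):
    """Return every non-empty subset of the cluster types, smallest first."""
    return [
        set(combo)
        for size in range(1, len(clusters) + 1)
        for combo in combinations(clusters, size)
    ]


inventories = candidate_inventories(CLUSTERS)

# Count how many candidate language types there are of each size,
# corresponding to groups (a), (b), (c) and (d) in (11).
by_size = {}
for inventory in inventories:
    by_size[len(inventory)] = by_size.get(len(inventory), 0) + 1

print(len(inventories))  # 15
print(by_size)           # {1: 4, 2: 6, 3: 4, 4: 1}
```

The four groups in (11) thus contain 4, 6, 4 and 1 candidate types respectively, of which Table (3) attests six.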
Clear implicational relations, stated in (12), arise from Table (3):
(12) [+v][+v] + [+v][−v] ⇒ [−v][+v] ⇒ [−v][−v]
Only [–v][–v] clusters are implied by all other clusters and imply no other cluster, making them the least marked cluster type.6 A Type 1 language has only [−v][−v] clusters. Languages such as Dutch and Kutenai belong to this type. The presence of a [+v][+v] cluster implies the presence of a [–v][–v] cluster, but no cluster type implies the presence of a [+v][+v] cluster. A language such as Greek, which contains only [+v][+v] and [–v][–v] clusters and no other cluster type, is a Type 2 language, as in (13):
(13) [+v][+v] + [−v][−v]
6. This is expected based on articulatory phonetics because voicing is more marked (“less natural”) in obstruents, particularly word initially, than in sonorants (Westbury and Keating 1986).
The presence of at least one varied voicing combination implies the presence of a [–v][–v] cluster. In a Type 3 language, both a [–v][+v] cluster and a [+v][+v] cluster are present, as in Georgian. By implication, a Type 3 language also contains a [–v][–v] cluster, as in (14):
(14) [+v][+v] + [−v][+v] ⇒ [−v][−v]
A Type 4 language has all possible voicing combinations, as in (12). Languages which belong to this type include Modern Hebrew, Tsou, Hua and Khasi.7 A Type 5 language, however, contains only one cluster type with varying voicing and, by implication, also a [–v][–v] cluster, as in (15) below. Languages which belong to this type include Biloxi and Camsa.
(15) [−v][+v] ⇒ [−v][−v]
A Type 6 language has both possible varying voicing clusters, [−v][+v] and [+v][−v], and therefore by implication also [−v][−v] clusters, as in (16):
(16) [+v][−v] ⇒ [−v][+v] ⇒ [−v][−v]
A Type 6 language is typologically predicted on the basis of the implicational relations in (12); in Table (3) this type is exemplified by Bilaan and Amuesha. The only available grammatical description of Bilaan (Dean and Dean 1955) lists [−v][−v], [−v][+v] and [+v][−v] as occurring clusters, making Bilaan a Type 6 language. However, as mentioned previously, in the absence of phonetic evidence the cases of [+v][−v] in Bilaan are suspect. Without further phonetic investigation it is impossible to determine whether the [+v][–v] clusters are realised as such in Bilaan or whether some other phonetic properties are used to distinguish these clusters.

7. Berber (Berber) and Moroccan Arabic (Semitic) may also be Type 4 languages in terms of the feature [voice], since they allow [+v][−v] clusters word initially. This could potentially increase the number of languages which permit [+v][−v] clusters to 8, and their share of the surveyed languages to 17%. However, these languages were not included in the survey for two reasons. The first is that the available sources did not give exhaustive coverage of permissible clusters. The second is that the syllabic status of the initial clusters in these languages is controversial (Dell and Elmedlaoui 2002, Shaw et al. 2009 and references therein).
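The implicational relations in (12) to (16) can be read as a filter on the fifteen candidate inventories in (11). The sketch below is an illustration under stated assumptions rather than the author's procedure: it encodes three pairwise implications read off (14) to (16), namely [+v][+v] ⇒ [−v][−v], [−v][+v] ⇒ [−v][−v] and [+v][−v] ⇒ [−v][+v], and checks that the inventories respecting them are exactly the six types of Table (3). The short names NN, PP, NP and PN and both function names are hypothetical.

```python
from itertools import combinations

# Hypothetical short names for the four cluster types.
NN, PP, NP, PN = "[-v][-v]", "[+v][+v]", "[-v][+v]", "[+v][-v]"

# (antecedent, consequent) pairs: if the first cluster type occurs in a
# language, the second must occur too. Read off (14)-(16) in the text.
IMPLICATIONS = [(PP, NN), (NP, NN), (PN, NP)]

# The six attested language types of Table 3.
ATTESTED_TYPES = {
    frozenset({NN}): "Type 1",
    frozenset({NN, PP}): "Type 2",
    frozenset({NN, PP, NP}): "Type 3",
    frozenset({NN, PP, NP, PN}): "Type 4",
    frozenset({NN, NP}): "Type 5",
    frozenset({NN, NP, PN}): "Type 6",
}


def respects_implications(inventory):
    """True if every implication whose antecedent occurs is satisfied."""
    return all(
        consequent in inventory
        for antecedent, consequent in IMPLICATIONS
        if antecedent in inventory
    )


def classify(inventory):
    """Return the Table 3 type label, or None for an unattested inventory."""
    return ATTESTED_TYPES.get(frozenset(inventory))


all_candidates = [
    frozenset(combo)
    for size in range(1, 5)
    for combo in combinations([NN, PP, NP, PN], size)
]
predicted = {inv for inv in all_candidates if respects_implications(inv)}

print(predicted == set(ATTESTED_TYPES))  # True
print(classify({NN, PP}))                # Type 2
print(classify({PP, PN}))                # None
```

Under these assumptions the three implications are sufficient to rule out the nine unattested combinations in (11) and leave the six attested ones.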
From these implications we may draw the following conclusions:
(i) if a language has only one cluster with the same voicing it will be [−v][−v];
(ii) if a language has a mixed voicing cluster it must have at least one cluster with the same voicing;
(iii) nothing implies the presence of [+v][+v] clusters; but
(iv) the presence of a [+v][+v] cluster implies the existence of a [−v][−v] cluster, and does not imply the existence of any other cluster.

3.4. Distributional facts
Table (4) summarises the distributional facts regarding voicing combinations in word initial obstruent clusters:

Table 4. Cross-linguistic distribution: feature [voice].

              [–v][–v]   [+v][+v]   [–v][+v]   [+v][–v]
# of langs    47/47      21/47      13/47      6/47
%             100%       44.6%      27.6%      12.7%
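The percentages in Table (4) follow directly from the language counts. The sketch below is an illustrative recomputation, with variable and function names chosen here for convenience. It shows that the one-decimal figures printed in the table correspond to truncation; conventional rounding would give 44.7%, 27.7% and 12.8% instead.

```python
import math

TOTAL_LANGUAGES = 47  # languages surveyed for the feature [voice]

# Number of surveyed languages containing each cluster type (Table 4).
COUNTS = {"[-v][-v]": 47, "[+v][+v]": 21, "[-v][+v]": 13, "[+v][-v]": 6}


def share(count, total=TOTAL_LANGUAGES):
    """Percentage of surveyed languages that contain a cluster type."""
    return 100 * count / total


def truncate_one_decimal(value):
    """Cut a percentage off after one decimal place, without rounding."""
    return math.floor(value * 10) / 10


rounded = {c: round(share(n), 1) for c, n in COUNTS.items()}
truncated = {c: truncate_one_decimal(share(n)) for c, n in COUNTS.items()}

for cluster, count in COUNTS.items():
    print(f"{cluster}: {count}/{TOTAL_LANGUAGES} "
          f"rounded {rounded[cluster]}%, truncated {truncated[cluster]}%")
```

The same calculation applied to the 8 languages mentioned in footnote 7 gives 8/47, or about 17%.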
The numbers presented in Table (4) for the distribution of the various voicing combinations in clusters differ considerably from the numbers found by Greenberg (1965), provided in (10). This is to be expected, since Greenberg calculated the distribution of each cluster type out of the entire set of cluster types, while the calculation presented here shows how many languages contain a certain cluster type out of the subset of obstruent clusters only. Surprisingly, the [+v][+v] cluster type is much rarer than initially expected. Conversely, the mixed cluster types are more common than initially expected.
4. Comparing the two typologies
We are now in a position to compare the implicational relations for the two typologies presented in section 2 (for the feature [sonorant]) and in section 3 (for the subset of obstruents specified for the feature [voice]). The implicational relations found for the feature [sonorant] are repeated in (17a) and the implicational relations found for the feature [voice] are repeated in (17b).

(17) (a) Sonority implicational relations: SO ⇒ SS ⇒ OO ⇒ OS
     (b) Voicing implicational relations: [+v][+v] + [+v][−v] ⇒ [−v][+v] ⇒ [−v][−v]

For the feature [voice] in obstruents, the preferred cluster type, the cluster that is implied by all other clusters, is [−v][−v], the one in which both segments are voiceless, as in (12), repeated in (17b). Thus, the members of the least marked cluster have the same voicing specification. For the feature [sonorant], the least marked cluster, the one implied by all other clusters, is OS, as in (7), repeated in (17a). Thus, the members of the least marked cluster have different specifications for the feature [sonorant]. It is clear from (17) that there is no basis for correlating the roles of the features [voice] and [sonorant] in onset clusters. In other words, Lombardi’s proposal to treat [+v][−v] as a sonority reversed cluster, comparable to SO, is not supported by the place of these clusters in their respective typologies. Even if we adopt Lombardi’s assumption that difference in voicing is comparable to difference in sonority, the least marked cluster in terms of [voice] has no “rise” in voicing,8 while the least marked cluster in terms of [sonorant] does. That is, the least marked cluster in terms of the feature [voice] is that in which both members of the cluster have the same voicing specification (a voicing plateau); the least marked cluster in terms of the feature [sonorant] is that in which the sonority values of the two members differ and the second member is more sonorous than the first. Clusters which have a sonority plateau, that is, clusters in which both members have the same value for the feature [sonorant], are marked. Moreover, while the typology based on the feature [sonorant] in (17a) yields SO as the most marked cluster, the typology based on the feature [voice] does not yield a single most marked cluster. Both [+v][+v] and [+v][−v] are candidates for this status.

8. By “rise” in voicing I mean a progression of the value assigned to the feature [voice] from a negative value to a positive one. That is, there must be a rise from [–v] to [+v] within the cluster for there to be a “rise” in voicing. Conversely, a “fall” in voicing is represented by a regression from a positive value for the feature [voice] to a negative one (from [+v] to [–v] within a cluster).

5. Predictions about language type shifts
Historical changes in a language’s cluster inventory can cause a language to shift types. For example, a language which does not allow clusters at one
stage, but allows them at another stage, is said to shift types. Clusters may become part of the grammar in several ways: borrowings, morphological or phonological processes such as syncope. Predictions regarding language type shifts follow from the implicational relations stated in (7) and (17a). A language L1 of type T1, can change membership and become a member of another type, T2, by changing the inventory of clusters allowed by the language’s grammar. It follows from (7) that if a language has no clusters then the first cluster type it will achieve is OS. Thus, a language with no clusters can shift to become a Type 1 language, i.e. a language with OS clusters. Examples of languages that shifted types are West Greenlandic (Fortescue 1984) and Popoluca (Elson 1947). Both languages disallowed consonantal clusters word initially at an earlier point in their history, and due to borrowing (from Danish and Spanish respectively), have shifted to become Type 1 languages; both now allow OS clusters. A language may also gain clusters through a process of vowel syncope. For example, a vowel may be consistently deleted in the first syllable of every word. That could result in a language gaining all types of clusters at once and becoming a Type 4 language. However, a language cannot gain only {OO} or only {SS} clusters as languages with only {OO} or {SS} clusters are not empirically attested and are therefore not part of the typology. It is also possible for a language to lose clusters. Once again, it is predicted that if a language loses one cluster type, it will lose the cluster type which implies all other clusters. Thus, a language of Type 4, which allows reversed sonority clusters, those that imply all other clusters, may disallow such clusters and shift to become a Type 3 language. The prediction is that no matter what stage the language is in, if it gains or loses clusters, it must become a language type which is predicted by the typology. 
A language will never gain only OO and SS clusters without having OS clusters as well, because the set *{OO, SS} cannot belong to an occurring language type.
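The predictions about type shifts can be made concrete with a small model of the sonority chain in (17a). This is a sketch under stated assumptions, not an implementation from the source: the attested inventories are taken to be the nested sets {OS}, {OS, OO}, {OS, OO, SS} and {OS, OO, SS, SO} (Types 1 to 4), and a language with no clusters is given the hypothetical label 0. The function names are illustrative.

```python
# Cluster types for the feature [sonorant], ordered from least to most
# marked, following the chain SO => SS => OO => OS in (17a).
MARKEDNESS_ORDER = ["OS", "OO", "SS", "SO"]


def language_type(inventory):
    """Return 0 for no clusters, 1-4 for an attested type, else None."""
    inventory = set(inventory)
    for n in range(len(MARKEDNESS_ORDER) + 1):
        if inventory == set(MARKEDNESS_ORDER[:n]):
            return n
    return None


def next_gain(inventory):
    """Cluster type predicted to be acquired first, if only one is gained."""
    current = language_type(inventory)
    if current is None or current == len(MARKEDNESS_ORDER):
        return None
    return MARKEDNESS_ORDER[current]


def next_loss(inventory):
    """Cluster type predicted to be lost first: the most marked one present."""
    current = language_type(inventory)
    if not current:  # unattested inventory, or no clusters to lose
        return None
    return MARKEDNESS_ORDER[current - 1]


print(language_type(set()))                    # 0
print(next_gain(set()))                        # OS
print(language_type({"OS", "OO"}))             # 2
print(next_loss({"OS", "OO", "SS", "SO"}))     # SO
print(language_type({"OO", "SS"}))             # None
```

The model covers single-step shifts only. A process such as syncope, which may add several cluster types at once, would correspond to jumping directly to a higher type, but the result must still be one of the nested inventories.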
6. Conclusion
While claims in the literature link the feature [sonorant] and the feature [voice], it has been shown here that they may not be so closely correlated, at least not typologically. This suggests that, although these two features may interact in complex ways, they are not mutually dependent and their typological patterning cannot be reduced to a single pattern. That is, the phonological patterning of one of these features in clusters cannot be conjectured based on
the other feature. The typological patterning of clusters based on the feature [sonorant] does not provide any clues about the phonological patterning of the feature [voice] in clusters. A language can be of one type with regard to one of these features, and of another type with regard to the other. For example, Russian exhibits all possible clusters of the feature [sonorant], OS, OO, SS and SO, making it a Type 4 language in terms of the feature [sonorant], yet only two combinations of the feature [voice] are permitted, [–v][–v] and [+v][+v], making it a Type 2 language in terms of the feature [voice]. Russian, thus, is elaborate in terms of the combinations it allows word initially for the feature [sonorant] but relatively simple in terms of the voicing combinations it permits. Modern Hebrew is the opposite example. It allows only two cluster types in terms of the feature [sonorant], OS and OO, making it a Type 2 language in terms of the feature [sonorant], but allows all possible voicing combinations, [–v][–v], [+v][+v], [–v][+v] and [+v][–v], making it a Type 4 language in terms of the feature [voice]. Modern Hebrew is simple in terms of the combinations it allows word initially for [sonorant] but quite complex in terms of the voicing combinations it permits. This suggests that the typological classification of languages based on each of these features should be explored independently.
Appendix I: Table of clusters by the [sonorant] and [voice] combinations

Columns: Language; [±sonorant]: OS, OO, SS, SO; [±voice]: [–v–v], [+v+v], [–v+v], [+v–v].

Languages (rows): Aguacatec, Aleut, Amuesha, Basque, Belarusian, Bilaan, Biloxi, Breton, Bulgarian, Cambodian, Camsa, Chami, Chatino, Cornish, Czech, Danish, Dutch, Embara – Catio, Frisian, Gaelic (Scots), Georgian, German, Greek, Hebrew (Modern), Hindi, Hixkaryana, Hua, Hungarian10, Icelandic, Inga, Irish, Khasi, Klamath, Kobon, Kutenai, Lithuanian, Macedonian, Manx, Mon (Burmese), Norwegian, Pashto, Polish, Popoluca, Romani, Romanian, Russian, Serbian, Seri, Slovak, Slovenian, Sorbian (lower), Sorbian (upper), Spanish, Swedish, Taba, Totonac, Tsou, Ukrainian, Wa, Welsh, Yiddish, Zapotec (Isthmus)11, Zoque.

[Table body: the per-language check marks are not recoverable from the extracted layout. Notes 9 and 12 to 14 below are attached to individual cells of the table.]
9. In Hungarian v is claimed to behave as a sonorant (Barkaï and Horvath 1978), similarly to the way it behaves in Russian; therefore, clusters with v as the second member of a cluster were not included.
10. Hungarian is listed as having only one SO cluster, ng.
11. Mitla Zapotec shows the same patterns.
12. In Danish (Hansen 1967), Gaelic (both Scots and Irish) (Green 1997, Harbert p.c.), German (Iverson and Salmons 1995, Jessen 2001 among others), Icelandic (Rögnvaldsson 1993), Welsh (Awbery 1984) and Klamath (Blevins p.c.) stops are claimed to be distinguished by aspiration and not by voicing. Therefore, none of these languages were included in the survey for voicing.
13. Chatino may or may not have [–v][+v]; this is not entirely clear from the source.
14. Tsou contains only one [+v][+v] cluster, zv, which may or may not be an obstruent [+v][+v] cluster, depending on the status of v in Tsou and whether it behaves as a sonorant. If v is a sonorant, then Tsou lacks obstruent [+v][+v] clusters and should be classified as a Type 6 language in terms of the feature [voice]. If, however, v is an obstruent, then Tsou is classified as a Type 4 language, as in Table (3).
Language database (An asterisk (*) next to the language name indicates that the language was not included in the survey as it either did not contain any clusters or did not conform to the conditions listed in (3) and (5)):

Aguacatec (Mayan) – McArthur and McArthur 1956
Aleut (Eskimo-Aleut) – Bergsland 1997
Amuesha (Arawakan) – Fast 1953
Armenian* (Armenian, Indo-European) – Werner 1962; Vaux 1998
Arabic* (Moroccan) – Shaw, Gafos, Hoole and Zeroual 2009; Dell and Elmedlaoui 2002
Asheninka* (Arawakan) – Dirks 1953
Basque (Basque) – Hualde 1991
Babungo* (Niger-Congo) – Schaub 1985
Belarusian (Slavic, Indo-European) – Sawicka 1974
Bengali* (Indo-Iranian, Indo-European) – Ferguson and Chowdhury 1960
Berber* (Berber) – Dell and Elmedlaoui 2002
Bilaan (Austronesian) – Dean and Dean 1955
Biloxi (Siouan) – Einaudi 1976
Breton (Celtic, Indo-European) – Ternes 1992
Bulgarian (Slavic, Indo-European) – Scatton 1984; Sawicka 1974
Burmese* (Sino-Tibetan) – Sun 1986
Cambodian (Mon-Khmer) – Nacaskul 1978
Camsa (language isolate) – Howard 1967
Carib* (Carib) – Hoff 1968
Chami (Choco) – Gralow 1976
Chatino (Oto-Manguean) – McKaughan 1954
Chukchee* (Chukotko-Kamchatkan) – Asinovskii 1991; Bogoras 1922; Kenstowicz 1981; Levin 1985; Skorik 1961
Comanche* (Uto-Aztecan) – Riggs 1949
Cornish (Celtic, Indo-European) – George 1993
Cuicateco* (Oto-Manguean) – Needham and David 1946
Czech (Slavic, Indo-European) – Kučera 1961; Kučera and Monroe 1968
Dakota* (Siouan) – Matthews 1955
Danish (Germanic, Indo-European) – Diderichsen 1964; Hansen 1967
Dutch (Germanic, Indo-European) – Booij 1995
Eggon* (Niger-Congo) – Maddieson 1981
Embara – Catio (Choco) – Mortensen 1999
French* (Romance, Indo-European) – Dell 1995
Frisian (Germanic, Indo-European) – Cohen, Ebeling, Fokkema and van Holk 1961
Gaelic (Scots) (Celtic, Indo-European) – Gillies 1993; Green 1997
Georgian (Kartvelian) – Butskhrikidze 2002; Chitoran 1998; Chitoran 1999; Chitoran, Goldstein and Byrd 2002; Gvarjaladze and Gvarjaladze 1974
German (Germanic, Indo-European) – Wiese 1996
Greek (Greek, Indo-European) – Eleftheriades 1985; Joseph and Philippaki-Warburton 1987
Greenlandic* – Fortescue 1984
Haida* (language isolate) – Sapir 1923
Hebrew (Modern) (Semitic) – Berman 1997; Kreitman 2008
Hindi (Indo-Iranian, Indo-European) – Gumperz 1958; Ohala 1983
Hixkaryana (Carib) – Derbyshire 1985
Hua (Trans New-Guinea) – Haiman 1980
Hungarian (Uralic, Finno-Ugric) – Siptár and Törkenczy 2000
Icelandic (Germanic, Indo-European) – Rögnvaldsson 1993
Inga (Quechuan) – Levinsohn 1979
Irish (Celtic, Indo-European) – Dochartaigh 1992; Green 1997
Kabardian* (North Caucasian) – Colarusso 1992
Keresan* – Spencer 1946
Khasi (Mon-Khmer) – Henderson 1991; Nagaraja 1990; Rabel 1961
Klamath (Penutian) – Barker 1964
Kobon (Trans New-Guinea) – Davies 1980
Kutenai (language isolate) – Garvin 1948
Leti* (Austronesian) – van Engelenhoven 1995
Lithuanian (Baltic, Indo-European) – Ambrazas 1997
Macedonian (Slavic, Indo-European) – Sawicka 1974
Manx (Celtic, Indo-European) – Broderick 1993
Mon (Burmese) (Mon-Khmer) – Huffman 1990
Mazatec* (Huautla) (Oto-Manguean) – Pike and Pike 1947; Steriade 2004
Norwegian (Germanic, Indo-European) – Næs 1965
Otomi* (Oto-Manguean) – Andrews 1949
Pashto (Indo-Iranian, Indo-European) – Penzl 1955
Polish (Slavic, Indo-European) – Gussmann 1992; Sawicka 1974
Popoluca (Oto-Manguean) – Elson 1947
Roma* (Austronesian) – Hajek and Bowden 1999; Steven 1991
Romani (Indo-Iranian, Indo-European) – Ventzel 1983
Romanian (Romance, Indo-European) – Agard 1958; Mallinson 1986
Russian (Slavic, Indo-European) – Sawicka 1974
Serbian (Slavic, Indo-European) – Hodge 1946; Sawicka 1974
Seri (Hokan) – Marlett 1988
Sinhalese* (Indo-Iranian, Indo-European) – Coates and da Silva 1960
Slovak (Slavic, Indo-European) – Sawicka 1974
Slovenian (Slavic, Indo-European) – Sawicka 1974
Sorbian (lower) (Slavic, Indo-European) – Sawicka 1974
Sorbian (upper) (Slavic, Indo-European) – Sawicka 1974
Spanish (Romance, Indo-European) – Chavarria-Aguilar 1951
Swedish (Germanic, Indo-European) – Sigurd 1965
Taba (Austronesian) – Bowden 2001; Hajek and Bowden 1999
Totonac (Totonacan) – Aschmann 1946; MacKay 1994; MacKay 1999
Tsou (Austronesian) – Hsin 2000; Wright 1996
Ukrainian (Slavic, Indo-European) – Rusanivskyi (ed.) 1986
Wa (Mon-Khmer) – Watkins 2002
Welsh (Celtic, Indo-European) – Awbery 1984; Thomas 1992
Yiddish (Germanic, Indo-European) – Jacobs 2005
Yuma* (Hokan) – Halpern 1946
Zapotec (Isthmus and Mitla dialects) (Oto-Manguean) – Briggs 1961; Marlett and Pickett 1987
Zoque (Mixe-Zoque) – Wonderly 1951
References Agard, Frederick B. 1958 Structural sketch of Rumanian. Language, 34(1): 7–127. Ambrazas, Vytautas 1997 Lithuanian Grammar. Vilnius: Baltos lankos. Andrews, Henrietta 1949 Phonemes and morphophonemes of Temoayan Otomi. International Journal of American Linguistics, 15: 213–222. Aschmann, Herman P. 1946 Totonaco phonemes. International Journal of American Linguistics, 12: 34–43. Asinovskii, Aleksandr Semenovich 1991 Konsonantizm Chukotskogo Jazyka [Consonantism of the Chukchee language]. Leningrad: Nauka. (In Russian). Awbery, Gwenllian M. 1984 Phonotactic constraints in Welsh. In Martin J. Ball and Glyn E. Jones (eds.), Welsh Phonology, Selected Readings, 65–104. Cardiff: University of Wales Press. Ball, Martin and James Fife (eds.) 1993 The Celtic Languages. London: Routledge. Barkaï, Malachi and Julia Horvath 1978 Voicing assimilation and the sonority hierarchy: Evidence from Russian, Hebrew and Hungarian. Linguistics, 212: 77–88. Barker, Muhammad A. R. 1964 Klamath Grammar. University of California Publications in Linguistics 32. University of California Press. Bat-El, Outi 1994 Stem modification and cluster transfer in Modern Hebrew. Natural Language and Linguistic Theory, 12: 571–596. Bergsland, Knut 1997 Aleut Grammar: Unangam Tunuganaan Achixaasix. Fairbanks: Alaska Native Language Center. Berman, Ruth 1997 Modern Hebrew. In Robert Hetzron (ed.), The Semitic Languages, 312–333. New York: Routledge.
Blevins, Juliette 2003
The independent nature of phonotactic constraints: An alternative to syllable-based approaches. In Caroline Féry and Ruben van de Vijver (eds.), The Syllable in Optimality Theory, 375–404. Cambridge: Cambridge University Press. Bogoras, Waldemar 1922 Chukchee. In Franz Boas (ed.), Handbook of American Indian Languages: Part 2. Washington: Smithsonian. Booij, Geert 1995 The Phonology of Dutch. Oxford: Oxford University Press. Bowden, John 2001 Taba: Description of a South Halmahera language. Pacific Linguistics 521. Canberra: Australian National University. Briggs, Elinor 1961 Mitla Zapotec Grammar. Mexico: Instituto Lingüístico de Verano and Centro de Investigaciones Antropológicas de México. Broderick, George 1993 Manx. In Martin Ball and James Fife (eds.), The Celtic Languages, 228–288. London: Routledge. Butskhrikidze, Marika 2002 The Consonant Phonotactics of Georgian. Utrecht: LOT. Chavarria-Aguilar, O. L. 1951 The phonemes of Costa Rican Spanish. Language, 27(3): 248–253. Chayen, Moshe J. 1972 The accent of Israeli Hebrew. Lenshonenu, 36: 212–219, 287–300. Chayen, Moshe J. 1973 The Phonetics of Modern Hebrew. The Hague: Mouton. Chitoran, Ioana 1998 Georgian harmonic clusters: Phonetic cues to phonological representation. Phonology, 15(2): 121–141. Chitoran, Ioana 1999 Accounting for sonority violations: The case of Georgian consonant sequencing. Proceedings of the 14th International Congress of Phonetic Sciences, 101–104. San Francisco, August 1999. Chitoran, Ioana, Louis Goldstein and Dani Byrd 2002 Gestural overlap and recoverability: Articulatory evidence from Georgian. In Carlos Gussenhoven and Natasha Warner (eds.), Laboratory Phonology 7, 419–447. Berlin, New York: Mouton de Gruyter. Cho, Young-mee Yu and Tracy Holloway King 2003 Semi-syllables and universal syllabification. In Caroline Féry and Ruben van de Vijver, (eds.), The Syllable in Optimality Theory: 183–212. Cambridge: Cambridge University Press. Clements, Nick G. 
1990 The role of the sonority cycle in core syllabification. In John Kingston and Mary Beckman (eds.), Papers in Laboratory Phonology I:
Between the Grammar and Physics of Speech, 282–333. Cambridge: Cambridge University Press. Clements, Nick G. and S. Jay Keyser 1983 CV Phonology: a Generative Theory of the Syllable. Cambridge: MIT Press. Clements, Nick G. and Osu Sylvester 2002 Explosives, implosives and nonexplosives: The linguistic function of air pressure differences in stops. In Carlos Gussenhoven and Natasha Warner (eds.), Laboratory Phonology 7, 299–350. Berlin, New York: Mouton de Gruyter. Coates, William A. and da Silva, M.W.S. 1960 The segmental phonemes of Sinhalese. University of Ceylon Review, 18: 163–175. Cohen, Antonie., C.L. Ebeling, K. Fokkema and A.G.F. van Holk 1961 Fonologie van het Nederlands en het Fries. Inleiding tot de Moderne Klankleer, Gravenhage: Martinus Nijhoff. Colarusso, John 1992 A Grammar of the Kabardian Language. Calgary: University of Calgary Press. Davies, John H. 1980 Kobon phonology. Pacific Linguistics B, 68. Canberra: Australian National University. Davis, Stuart 1985 Topics in syllable geometry. Ph.D. dissertation, Department of Linguistics, University of Arizona. Dean, James and Gladys Dean 1955 The phonemes of Bilaan. Philippine Journal of Science, 84(3): 311– 322. Dell, François 1995 Consonant clusters and phonological syllables in French. Lingua, 95: 5–26. Dell, François and Mohamed Elmedlaoui 2002 Syllables in Tashlhiyt Berber and in Moroccan Arabic. Dordrecht: Kluwer. Derbyshire, Desmond 1985 Hixkaryana and Linguistic Typology. Dallas: Summer Institute of Linguistics and The University of Texas at Arlington. Diderichsen, Paul 1964 Essentials of Danish Grammar. Copenhagen: Akademisk Forlag. Dirks, Sylvester 1953 Campa (Arawak) phonemes. International Journal of American Linguistics, 19: 302–304. Dochartaigh, Cathair Ó. 1992 The Irish language. In Donald Macaulay (ed.), The Celtic Languages. Cambridge: Cambridge University Press.
Einaudi, Paula 1976 A Grammar of Biloxi. New York: Garland. Eleftheriades, Olga 1985 Modern Greek: A Contemporary Grammar. Palo Alto: Pacific Books Publishers. Elson, Ben 1947 Sierra Popoluca syllable structure. International Journal of American Linguistics, 13(1): 13–17. Engelenhoven, Aone van 1995 A Description of the Leti language (as spoken in Tutukei). Ridderkerk: Offsetdrukkerij Ridderprint B.V. Everett, Daniel and Keren Everett 1984 On the Relevance of Syllable Onsets to Stress Placement. Linguistic Inquiry, 15: 705–711. Fast, Peter W. 1953 Amuesha (Arawak) phonemes. International Journal of American Linguistics, 19: 191–194. Ferguson, Charles and Munier Chowdhury 1960 The Phonemes of Bengali. Language, 36(1): 22–59. Fortescue, Michael D. 1984 West Greenlandic. London, Croom Helm. Garvin, Paul L. 1948 Kutenai I: Phonemics. International Journal of American Linguistics, 14: 37–42. George, Ken 1993 Cornish. In Martin Ball and James Fife (eds.), The Celtic Languages, 410–470. London: Routledge. Gillies, William 1993 Scottish Gaelic. In Martin Ball and James Fife (eds.), The Celtic Languages, 145–227. London: Routledge. Goedemans, Rob 1998 Weightless segments. The Hague: Holland Academic Graphics. Gordon, Matthew 1999 Syllable weight: Phonetics, phonology, and typology. Ph.D. dissertation, Department of Linguistics, University of California, Los Angeles. Gralow, Frances L. 1976 Fonología del Chamí [Chami Phonology]. Sistemas Fonológicos de Idiomas Colombianos 3, 29–42. Bogotá: Ministerio de Gobierno and Instituto Lingüístico de Verano. Green, Anthony 1997 The prosodic structure of Irish, Scots Gaelic, and Manx. Ph.D. dissertation, Department of Linguistics, Cornell University. Greenberg, Joseph 1965 Some generalizations concerning initial and final Consonant sequences. Linguistics, 18: 5–34. (reprinted as Greenberg 1978).
Greenberg, Joseph 1978 Some generalizations concerning initial and final consonant clusters. In Joseph H. Greenberg (ed.), Universals of Human Language, vol. 2: Phonology. Stanford, California: Stanford University Press. Gumperz, John 1958 Phonological differences in three Hindi dialects. Language, 34: 212–224. Gussmann, Edmund 1992 Resyllabification and delinking: the case of Polish voicing. Linguistic Inquiry, 23, 25–56. Gvarjaladze T’amar and Isidor Gvarjaladze 1974 English-Georgian Dictionary. Tbilisi: State Publication House. Haiman, John 1980 Hua, A Papuan Language of the Eastern Highlands of New Guinea. Amsterdam: John Benjamins. Hajek, John and John Bowden 1999 Taba and Roma: clusters and geminates in two Austronesian languages. In Proceedings of the XIVth Congress of Phonetic Sciences: San Francisco, 1–7 August, 1033–1036. American Institute of Physics. Halpern, Abraham Meyer 1946 Yuma I: Phonemics. International Journal of American Linguistics, 12(1): 25–33. Hansen, Aage 1967 Moderne Dansk [Modern Danish]. København: Forlag Harley. (In Danish) Henderson, Eugénie 1991 Khasi clusters and Greenberg’s universals. Mon-Khmer Studies, 18– 19: 61–6. Hoard, James E. 1978 Remarks on the nature of syllabic stops and affricates. In Alan Bell and Joan Hooper (eds.), Syllables and Segments. Amsterdam: NorthHolland. Hodge, Carleton T. 1946 Serbo-Croatian phonemes. Language, 22: 112–120. Hoff, Bernard J. 1968 The Carib Language: Phonology, Morphonology, Morphology, Texts and Word Index. The Hague: Martinus Nijhoff. Howard, Linda 1967 Camsa phonology. In Viola G. Waterhouse (ed.), Phonemic Systems of Colombian Languages, 73–87. Summer Institute of Linguistics Publications in Linguistics and Related Fields, 14. Norman: Summer Institute of Linguistics of the University of Oklahoma.
Hsin, Tien-Hsin 2000
Consonant clusters in Tsou and their theoretical implications. The Proceedings of the 18th West Coast Conference on Formal Linguistics, Cascadilla Press. Hualde, José Ignacio 1991 Basque Phonology. London, New York: Routledge. Huffman, Franklin E. 1990 Burmese, Thai Mon, and Nyah Kur: A synchronic comparison. Mon-Khmer Studies, 16–17: 31–64. Hyman, Larry 1985 A Theory of Phonological Weight. Dordrecht: Foris. Itô, Junko 1989 A prosodic theory of epenthesis. Natural Language and Linguistic Theory, 7: 217–259. Iverson, Gregory and Joseph Salmons 1995 Aspiration and laryngeal representation in Germanic. Phonology, 12: 369–396. Jacobs, Neil G. 2005 Yiddish: A Linguistic Introduction. Cambridge: Cambridge University Press. Jessen, Michael 2001 Phonetic implementation of the distinctive auditory features [voice] and [tense] in stop consonants. In Tracy Alan Hall (ed.), Distinctive Feature Theory, 237–294. Berlin, New York: Mouton de Gruyter. Jessen, Michael and Catherine O. Ringen 2002 Laryngeal features in German. Phonology, 19: 189–218. Joseph, Brian. D. and Irene Philippaki-Warburton 1987 Modern Greek. London: Croom Helm. Kahn, Daniel 1976 Syllable-based generalizations in English phonology. Ph.D. dissertation, Department of Linguistics Massachusetts Institute of Technology. [Published 1980 New York: Garland Press.] Keating, Patricia A. 1984 Phonetic and phonological representation of stop consonant voicing. Language, 60: 286–319. Kenstowicz, Michael 1981 The phonology of Chukchee consonants. In Bernard Comrie (ed.), Studies in the Languages of the USSR. Carbondale: Linguistic Research Inc. Kreitman, Rina 2003 Diminutive reduplication in Modern Hebrew. Working Papers of the Cornell Phonetics Laboratory, 15: 101–129. Kreitman, Rina 2006 Cluster buster: A typology of onset clusters. In J. Bunting, S. Desai, R. Peachy, C. Straughn and Z. Tomková (eds.), Chicago Linguistic Society, 42(1): 163–179.
64
Rina Kreitman
Kreitman, Rina 2008
Kreitman, Rina 2010
The phonetics and phonology of onset clusters: The case of Modern Hebrew. Ph.D. dissertation, Department of Linguistics, Cornell University. Mixed voicing word-initial onset clusters. In Cécile Fougeron, Barbara Kühnert, Mariapaola D’Imperio and Natalie Vallée (eds.), Laboratory Phonology 10: Phonology and Phonetics, 169–200. Berlin: Mouton de Gruyter.
Kučera, Henry 1961 The Phonology of Czech. The Hague: Mouton and Company. Kučera, Henry and George Monroe 1968 A Comparative Quantitative Phonology of Russian, Czech and German. New York: American Elsevier Publication. Ladefoged, Peter and Ian Maddieson 1996 The Sounds of the World’s Languages. Oxford: Blackwell. Laufer, Asher 1994 Voicing in contemporary Hebrew. Leshonenu, 57(4): 299–342. (in Hebrew). Levi, Susannah V. 2004 The representation of underlying glides. Ph.D. dissertation, Department of Linguistics, University of Washington. Levi, Susannah V. 2008 Phonemic vs. derived glides. Lingua, 118: 1956–1978. Levin, Juliette 1985 A metrical theory of syllabicity. Ph.D. dissertation, Department of Linguistics, Massachusetts Institute of Technology. Levinsohn, Stephen H. 1979 Fonología del Inga [Phonology of Inga]. In Marilyn E. Cathcart et al. (eds.), Sistemas Fonológicos de Idiomas Colombianos 4, 65–85. Bogota: Ministerio de Gobierno. (In Spanish) Lindblom, Björn 1983 Economy of speech gestures. In Peter MacNeilage (ed.), Speech Production, 217–246. New York: Springer-Verlag. Lindblom, Björn and Ian Maddieson 1988 Phonetic universals in consonant systems In Larry M. Hyman and Charles N. Li (eds.), Language, Speech and Mind, 62–78. New York: Routledge. Lombardi, Linda 1991 Laryngeal features and laryngeal neutralization. Ph.D. dissertation, Department of Linguistics, University of Massachusetts, Amherst. Lombardi, Linda 1995a Laryngeal features and privativity. The Linguistic Review, 12: 35– 59.
On the relations between [sonorant] and [voice]
65
Lombardi, Linda 1995b Laryngeal neutralization and syllable wellformedness. Natural Language and Linguist Theory, 13: 39–74. Lombardi, Linda 1999 Positional faithfulness and voicing assimilation in Optimality Theory. Natural Language and Linguist Theory, 17: 267–302. MacKay, Carolyn J. 1994 A sketch of Misantla Totonac phonology. International Journal of American Linguistics, 60(4): 369–419. MacKay, Carolyn J. 1999 A Grammar of Misantla Totonac. Salt Lake City: The University of Utah Press. Maddieson, Ian 1981 Unusual consonant clusters and complex segments in Eggon. Studies in African Linguistics, Supplement 8: 89–92. Maddieson, Ian and Peter Ladefoged 1993 Phonetics of partially nasal consonants. In Marie K. Huffman and Rena Krakow (eds.), Nasals, Nasalization and the Velum, 251–301. San Diego: Academic Press. Mallinson, Graham 1986 Rumanian. London: Croom Helm. Marlett, Stephen A. 1988 The syllable structure of Seri. International Journal of American Linguistics, 54: 245–278. Marlett, Stephen A. and Velma B. Pickett 1987 The syllable structure and aspect morphology of Isthmus Zapotec. International Journal of American Linguistics, 53: 398–422. Matthews, Hubert 1955 A phonemic analysis of a Dakota dialect. International Journal of American Linguistics, 21: 56–59. McArthur, Harry S. and Lucille E. McArthur 1956 Aguacatec (Mayan) phonemes within the stress group. International Journal of American Linguistics, 22: 72–76. McCarthy, John and Alan Prince 1986 Prosodic morphology. Ms., University of Massachusetts, Amherst, and Brandeis University, Waltham, Mass. McKaughan, Howard. P. 1954 Chatino formulas and phonemes. International Journal of American Linguistics, 20: 23–27. Morelli, Frida 1998 Markedness relations and implicational universals in the typology of onset obstruent clusters. In Proceedings of NELS 28: Volume 2. Morelli, Frida 1999 The phonotactics and phonology of obstruent clusters in Optimality Theory. Ph.D. 
dissertation, Department of Linguistics, University of Maryland at College Park.
66
Rina Kreitman
Morelli, Frida 2003
The relative harmony of /s+stop/ onsets: Obstruent clusters and the sonority sequencing principle. In Caroline Féry and Ruben van de Vijver (eds.), The Syllable in Optimality Theory, 356–371. Cambridge: Cambridge University Press. Mortensen, Charles A. 1999 A Reference Grammar of the Northern Embera Languages: Studies in the Languages of Colombia 7. Arlington: Summer Institute of Linguistics and the University of Texas, Publications in Linguistics, 118. Dallas: Summer Institute of Linguistics and the University of Texas at Arlington. Nacaskul, Karnchana 1978 The syllabic and morphological structure of Cambodian words. Mon-Khmer Studies, 7: 183–200. Nagaraja, Keralapura S. 1990 Khasi Phonetic Reader. Mysore: Central Institute of Indian Languages. Næs, Olav 1965 Norsk Grammatikk: Elementære Strukturer og Syntaks [Norwegian Grammar: Elementary Structures and Syntax]. Forlag: Fabritius & Sønners. (In Norwegian). Needham, Doris and Marjorie Davis 1946 Cuicateco Phonology. International Journal of American Linguistics, 12: 139–146. Nepveu, Denis 1994 Georgian and Bella Coola: Headless syllables and syllabic obstruents. MA thesis, UC Santa Cruz. Ohala, Manjari 1983 Aspects of Hindi Phonology. Delhi: Motilal Baarsidass. Okrand, Marc 1979 Metathesis in Costanoan grammar. International Journal of American Linguistics, 45: 123–130. Parker, Steve 2002 Quantifying the sonority hierarchy. Ph.D. dissertation, Department of Linguistics, University of Massachusetts, Amherst. Parker, Steve 2008 Sound level protrusions as physical correlates of sonority. Journal of Phonetics, 36: 55–90. Penzl, Herbert 1955 A Grammar of Pashto: A Descriptive Study of the Dialect of Kandahar, Afghanistan. Washington, D.C.: American Council of Learned Societies. Pike, Kenneth and Eunice Pike 1947 Immediate constituents of Mazateco syllables. International Journal of American Linguistics, 13(2): 78–91.
On the relations between [sonorant] and [voice] Rabel, Lili 1961
67
Khasi, a Language of Assam. Baton Rouge: Louisiana State University Press. Rex, Eileen and Mareike Schöttelndreyer 1973 Sistema fonológico del Catío [Phonological systems of Catio]. Sistemas Fonológicos de Idiomas Colombianos 2, 73–85. Bogotá: Ministerio de Gobierno. (In Spanish). Rialland, Annie 1994 The phonology and phonetics of extrasyllabicity in French. In Patricia Keating (ed.), Phonological Structure and Phonetic Form: Papers in Laboratory Phonology 3, 136–159. Cambridge: Cambridge University Press. Riehl, Anastasia 2008 The phonology and phonetics of Nasal-Obstruent sequences. Ph.D. dissertation, Department of Linguistics, Cornell University. Riggs, Venda 1949 Alternate phonemic analysis of Comanche. International Journal of American Linguistics, 15: 229–231. Rögnvaldsson, Eiríkur 1993 Íslensk Hljoðkerfisfræði [Icelandic Phonology]. Reykjavík: Málvísindastofnun Háskóla Íslands. (In Icelandic). Rusanivskyi, V. M. (ed.) 1986 Ukrainskaya Grammatika [Ukrainian Grammar]. Kiev: Naukova dumka. (In Russian). Sapir, Edward 1923 The Phonetics of Haida. International Journal of American Linguistics, 2(3/4): 143–158. Sawicka, Irena 1974 Struktura Grup Spółgłoskowych w Jezykach Słowiańskich [Structure of Consonantal Clusters in Slavic Languages]. Wrocław: Zakład Narodowy im Ossolińskich. (In Polish). Scatton, Ernest. A. 1984 A Reference Grammar of Modern Bulgarian. Cambridge: Slavica Publishers. Schaub, Willi 1985 Babungo. London: Croom Helm. Selkirk, Elisabeth 1982 The syllable. In Harry van der Hulst and Norval Smith (eds.), The structure of phonological representations. Dordrecht: Foris Publications. Selkirk, Elizabeth 1984 On the major class features and syllable theory. In Morris Halle, Mark Aronoff and Richard T. Oehrle (eds.), Language Sound Structure: Studies in Phonology, 107–136. Cambridge, Massachusetts: MIT Press.
68
Rina Kreitman
Shaw, Jason, Adamantios Gafos, Philip Hoole and Chakir Zeroual 2009 Syllabification in Moroccan Arabicː evidence from patterns of temporal stability in articulation. Phonology, 26: 187–215. Sigurd, Bengt 1965 Phonotactic Structure in Swedish. Lund: Berlingska Boktryckeriet. Siptár, Peter and Miklos Törkenczy 2000 The Phonology of Hungarian. Oxford: Oxford University Press. Skorik, Petr. J. 1961 Grammatika Čukotskogo Jazyka: Fonetika i Morfologija Imennyx Častej Reči [Grammar of the Chukchee Language: Nominal Parts of Speech]. Leningrad: Nauka. (In Russian). Spencer, Robert F. 1946 The Phonemes of Keresan. International Journal of American Linguistics, 12(4): 229–236. Steriade, Donca 1982 Greek prosodies and the nature of syllabification. Ph.D. dissertation, Department of Linguistics, Massachusetts Institute of Technology. Steriade, Donca 1994 Complex onsets as single segments: the Mazateco pattern. In Jennifer Cole and Charles Kisseberth (eds.), Perspectives in Phonology, 203–293. Stanford, California: CSLI Publications. Steriade, Donca 1997 Phonetics in phonology: The case of laryngeal neutralization. Ms., University of California Los Angeles. Steven, Lee Anthony 1991 The phonology of Roma: An Austronesian language of Eastern Indonesia. MA thesis. University of Texas. Sun, Hongkai 1986 Notes on Tibeto-Burman consonant clusters. Linguistics of the Tibeto-Burman Area, 9(1): 1–21. Ternes, Elmar 1992 The Breton language. In Donald MacAulay (ed.), The Celtic Languages, 371–452. Cambridge: Cambridge University Press. Thomas, Alan R. 1992 The Welsh Language. In Donald MacAulay (ed.), The Celtic Languages, 251–345. Cambridge: Cambridge University Press. Thomas, David 1992 On sesquisyllabic structure. Mon-Khmer Studies, 21: 206–210. Trubetskoy, Nikolai 1939 Grundzüge der Phonologie [Grounded Phonology] [Translated by C. Baltaxe (1969)]. Berkley: University of California Press. Vaux, Bert 1998 The Phonology of Armenian. Oxford: Oxford University Press. 
Vaux, Bert 2004 The appendix. Talk presented at CUNY. The CUNY Phonology Forum.
On the relations between [sonorant] and [voice]
69
Ventzel, Tatiana V. 1983 The Gypsy Language. Moscow: Nauka Publication. Watkins, Justin 2002 The Phonetics of Wa. Canberra: Pacific Linguistics. Werner, Winter 1962 Problems of Armenian phonology III. Language, 38(3): 254–262. Westbury, John and Patricia Keating 1986 On the naturalness of stop consonant voicing. Journal of Linguistics. 22: 145–166. Wetzels, W. Leo and Joan Mascaró 2001 The typology of voicing and devoicing. Language, 77(2): 207–244. Wheeler, Max 2005 Voicing contrast: licensed by prosody or licensed by cue? ROA – 769, Rutgers Optimality Archive, http://roa.rutgers.edu/. Wiese, Richard 1996 The Phonology of German. Oxford: Calderon Press. Wonderly, William L. 1951 Zoque II: Phonemes and morphophonemes. International Journal of American Linguistics, 17(2): 105–123. Wright, Richard 1996 Consonant clusters and cue preservation in Tsou. Ph.D. dissertation, Department of Linguistics, University of California Los Angeles. Yoshioka, Hirohide, Anders Löfqvist and René Collier 1982 Laryngeal adjustments in Dutch voiceless obstruent production. Annual Bulletin of the Research Institute of Logopedics and Phoniatrics, 16: 27–35. Zec, Draga 1988 Sonority constraints on prosodic structure. Ph.D. dissertation, Department of Linguistics, Stanford University. Zec, Draga 1995 Sonority constraints on syllable structure. Phonology, 12: 85–129.
Limited consonant clusters in OV languages
Hisao Tokizaki and Yasutomo Kuwana

Abstract
It has been claimed that the complexity of syllable structure is correlated to the order between verb and object in languages of the world: the syllable structure in OV languages is simpler than that in VO languages. However, our analysis of data in Maddieson (2005) and Dryer (2005) seems to show that a number of OV languages have (moderately) complex syllable structure. In spite of this result, we argue that the syllable structure in OV languages is simpler than has been reported, by considering the geographical gradience of coda variety, coda inventory, phonological simplification and particles attached to nouns, and complement-head orders other than OV/VO. We also discuss why OV languages have simple syllable structure: it is argued that juncture between constituents is stronger in left-branching structure (OV) than in right-branching structure (VO); strong juncture in left-branching structure makes words closely connected to each other; simple syllable structure such as CV fits nicely into the stronger juncture without making a consonant cluster.
1. Introduction
It has been pointed out that languages with object-verb order (OV) tend to have simple syllable structure (Lehmann 1973, Gil 1986, Plank 1998). This is the case in some OV languages such as Ijo, Yareba and Warao, whose syllable form is CV. However, examination of data in Haspelmath et al. (2005) (henceforth WALS) shows that a number of OV languages have (moderately) complex syllable structure. In this paper, we argue that the syllable structure in OV languages is simpler than has been reported, by showing that consonant clusters are limited at word boundaries and between words in OV languages. We base our argument on only a small number of example languages but hope that these will be sufficient to demonstrate the viability of our research proposal. From a conceptual and theoretical point of view, we also discuss the reason why OV languages should have simple syllable structure.
In Section 2, we review the previous studies of the correlation between syllable complexity and word order. We also examine the correlation hypothesis using data from WALS. In Section 3, we argue that syllable structure in OV languages is simpler than it looks if we consider geographical gradation, simplification processes and limited coda inventory. Section 4 discusses why OV languages have simple syllable structure; we argue that juncture between constituents is stronger in left-branching structure (OV) than in right-branching structure (VO). Section 5 concludes the discussion.
2. The correlation between syllable structure and OV order
2.1. Implicational universals
There have been a number of studies that try to show the correlation between phonology and syntax; Plank (1998) presents an overview of these. Here we concentrate on the relation between syllable structure and verb-object order. It has been pointed out that languages with object-verb order (OV) tend to have simple syllable structure (Lehmann 1973, Donegan and Stampe 1983, Gil 1986, Plank 1998). The Universals Archive lists two correlations, no. 196 and no. 207, with comments by Frans Plank as shown in (1).1
(1) a. OV languages tend to have simple syllable structure.
    b. IF basic order is OV, THEN syllable structure is simple (tending towards CV).
    c. Counterexamples: –
    d. Comments: Languages with flexive morphology (which tend to be OV) tend to have the ends of syllables closed, with consonant clusters occurring in this position as freely as in initial position (Lehmann 1973: 61).
This implicational relation is the case in some OV languages, such as Ijo (Niger-Congo), Yareba (Papua New Guinea) and Warao (Venezuela), whose syllable form is CV. The Universals Archive also shows another correlation between word order and syllable structure, as shown in (2).
(2) a. VO languages tend to have complex syllable structure.
    b. IF basic order is VO, THEN syllable structure is complex (permitting initial and final consonant clusters).
    c. Counterexamples: Old Egyptian (Afro-Asiatic): VO, only syllable types CV and CVC (F. Kammerzell, p.c.).
    d. Comments: –
1. http://typo.uni-konstanz.de/archive/intro/index.php (Accessed on August 25, 2009)
The two observations in (1) and (2) predict that there will be considerable differences between SOV and SVO languages with respect to syllable complexity. Gil (1986) tests the correlation between OV/VO order and syllable structure with his 170 sample languages. He reports that the average number of segments in the syllable structure templates is lower in SOV languages (4.04) than in SVO languages (4.93). However, this result is not very convincing because the difference between SOV and SVO is less than 0.9 (0.89). Moreover, the number of sample languages is not large enough to claim (1) and (2) as universals across languages; it is necessary, therefore, to test the hypothesis with more data.
2.2. Testing the correlation with data from WALS
Let us try to show the correlation between OV/VO order and syllable structure using data from WALS, which lists 2,561 languages, including 359 languages with data on both syllable structure and OV/VO order. Maddieson (2005) in WALS (Chapter 12) divides languages into three categories according to their syllable structure: simple, moderately complex and complex, as shown in (3).
(3) a. Simple
       CV: Hawaiian and Mba (Adamawa-Ubangian, Niger-Congo; Democratic Republic of Congo)
       (C)V: Fijian, Igbo (Niger-Congo; Nigeria), and Yareba (Yareban; Papua New Guinea)
    b. Moderately complex
       CVC
       CC2V: C2 = liquids (r/l) or glides (w/j)
       CC2VC: C2 = w in Darai (Indo-Aryan; Nepal)
    c. Complex
       (C)(C)(C)V(C)(C)(C)(C): English
Categorizing syllable complexity into three groups is effective in showing typological differences between languages. However, we will argue that, as Plank (2009) points out, the categorization is not fine enough to enable correlations between syllable complexity and other features to be identified.2
Dryer (2005) in WALS (Chapter 83) distinguishes three types of languages with respect to the order of object and verb: OV, VO and no dominant order. The third type, languages in which neither OV nor VO is dominant, falls into two classes. The first class is of languages with flexible word order, where both orders are common and the choice is determined by extragrammatical factors, such as many Australian languages (e.g. Ngandi (Gunwinyguan; Northern Territory, Australia)). In the second class are languages in which word order is primarily determined syntactically, but in which there are competing OV and VO constructions. This class includes German, in which VO order is used in main clauses without an auxiliary verb, while OV order is used in clauses with an auxiliary verb and in subordinate clauses introduced by a subordinator. Combining Maddieson’s and Dryer’s classification of languages by syllable structure (#12) and word order (#83) in WALS Online (http://wals.info/index) gives us the results shown in Table 1 below. The average complexity of syllable structure in each word order is calculated with simple = 1, moderately complex = 2 and complex = 3. For example, the average syllable complexity of languages with OV order is (18 × 1 + 93 × 2 + 60 × 3) ÷ 171 = 2.25.
2. Maddieson (2009) admits the crudity of this three-way distinction of syllable complexity, and proposes a refinement of syllable typology by scoring the complexity of onset, nucleus and coda, as shown in (i)–(iii).
(i) Contribution of Onset:
    0 = Maximal onset is single C
    1 = Maximal onset is C + liquid, glide (or nasal)
    2 = Maximal onset is CC where C2 may be an obstruent
    3 = Maximal onset is CCC or longer
(ii) Contribution of Nucleus:
    1 = Nucleus is only simple (monomoraic) V
    2 = Nucleus may be long vowel or diphthong
(iii) Contribution of Coda:
    0 = No codas allowed
    1 = Maximal coda is single C
    2 = Maximal coda is CC
    3 = Maximal coda is CCC
The refined syllable typology has eight steps on a scale (1–8). Maddieson claims that the distribution of languages across categories is approximately normal with N = 605 languages. According to this typology, ‘simple’ languages (maximal syllable CV) = 1, Japanese = 3 (maximal syllable CjVVC) and Dutch/English = 8. We expect to be able to see the correlation between syllable structure and word orders if we use this typological data on syllable complexity; however, these data are not available at present.
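The scoring scheme in (i)–(iii) can be illustrated with a small worked example. The following Python sketch simply adds the three contributions; the function name `syllable_complexity` and its argument names are illustrative assumptions rather than part of Maddieson's (2009) proposal, and the category values have to be assigned by hand from a language's maximal syllable.

```python
# Minimal sketch of the additive score in (i)-(iii).
# Assumption: the overall score is the plain sum of the three contributions,
# which yields the 1-8 scale mentioned in the footnote.

def syllable_complexity(onset: int, nucleus: int, coda: int) -> int:
    """Return the refined complexity score for hand-assigned category values."""
    if onset not in (0, 1, 2, 3):
        raise ValueError("onset contribution must be 0-3")
    if nucleus not in (1, 2):
        raise ValueError("nucleus contribution must be 1 or 2")
    if coda not in (0, 1, 2, 3):
        raise ValueError("coda contribution must be 0-3")
    return onset + nucleus + coda


# Maximal syllable CV: single-C onset, simple nucleus, no coda.
print(syllable_complexity(onset=0, nucleus=1, coda=0))  # 1

# Onset CCC or longer, long vowel or diphthong, coda CCC (the Dutch/English case).
print(syllable_complexity(onset=3, nucleus=2, coda=3))  # 8
```

The two calls reproduce the end points of the scale given in the footnote (1 for a maximal CV syllable, 8 for Dutch and English).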
Table 1. Syllable complexities and object-verb order: number of languages (total: 359 languages)

| Syllable structure \ Order of object and verb | OV (171) | VO (165) | No dominant order (23) |
|---|---|---|---|
| Simple (44) | 18 | 23 | 3 |
| Moderately complex (198) | 93 | 95 | 10 |
| Complex (117) | 60 | 47 | 10 |
| Complexity average | 2.25 | 2.15 | 2.30 |
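The complexity averages in Table 1 are weighted means, and they can be recomputed directly from the language counts. The Python sketch below does so; the dictionary layout and the name `average_complexity` are illustrative choices, while the counts and the weights (simple = 1, moderately complex = 2, complex = 3) are those given in the text.

```python
# Recompute the "complexity average" row of Table 1 as a weighted mean.
WEIGHTS = {"simple": 1, "moderately complex": 2, "complex": 3}

TABLE_1 = {
    "OV": {"simple": 18, "moderately complex": 93, "complex": 60},
    "VO": {"simple": 23, "moderately complex": 95, "complex": 47},
    "No dominant order": {"simple": 3, "moderately complex": 10, "complex": 10},
}


def average_complexity(counts: dict) -> float:
    """Weighted mean of syllable complexity for one word-order type."""
    total_languages = sum(counts.values())
    weighted_sum = sum(WEIGHTS[category] * n for category, n in counts.items())
    return weighted_sum / total_languages


for order, counts in TABLE_1.items():
    print(f"{order}: {average_complexity(counts):.2f}")
# OV: 2.25
# VO: 2.15
# No dominant order: 2.30
```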
These results do not seem to show the expected correlation between object-verb order and syllable structure that we have seen in (1) (i.e. OV → simple syllable) and (2) (i.e. VO → complex syllable) above. Even worse, the 23 languages with simple syllable structure and VO order outnumber the 18 languages with simple syllable structure and OV order. The 60 languages with complex syllable structure and OV order outnumber the 47 languages with complex syllable structure and VO order. These data are in fact the opposite of what we expected, given the previous studies we have seen above.
It may be that the results can be improved by refining our quantitative approach. First, Dryer (1992, 2009) argues that typological work should not be based on the number of languages, but on the number of genera. Genera are groups of languages whose similarity is such that their genetic relatedness is uncontroversial (Dryer 1992: 84). Dryer argues that counting genera rather than languages controls for the most severe genetic bias. Counting the numbers of genera instead of languages slightly improves the results, as shown in Table 2.

Table 2. Syllable complexities and object-verb order: number of genera (total: 272 genera)

| Syllable structure \ Order of object and verb | OV (132) | VO (117) | No dominant order (23) |
|---|---|---|---|
| Simple (36) | 17 | 16 | 3 |
| Moderately complex (140) | 67 | 63 | 10 |
| Complex (96) | 48 | 38 | 10 |
| Complexity average | 2.23 | 2.19 | 2.30 |
The 17 genera with simple syllable structure and OV order outnumber the 16 genera with simple syllable structure and VO order. However, the 48 genera with complex syllable structure and OV order still outnumber the 38 genera with complex syllable structure and VO order. Second, Dryer (1992, 2009) argues that genera should also be divided into six macro areas. He emphasizes that it is dangerous to use data from raw totals of languages without examining their distribution over areas. Dividing genera into macro areas gives Table 3.

Table 3. Syllable complexities and object-verb order: number of genera in six macro areas

a. Africa (58)

| Syllable structure \ Order of object and verb | OV (16) | VO (37) | No dominant order (5) |
|---|---|---|---|
| Simple (10) | 2 | 7 | 1 |
| Moderately complex (37) | 12 | 22 | 3 |
| Complex (11) | 2 | 8 | 1 |

b. Eurasia (45)

| Syllable structure \ Order of object and verb | OV (33) | VO (10) | No dominant order (2) |
|---|---|---|---|
| Simple (0) | | | |
| Moderately complex (15) | 13 | 2 | |
| Complex (30) | 20 | 8 | 2 |

c. South East Asia (39)

| Syllable structure \ Order of object and verb | OV (9) | VO (30) | No dominant order (0) |
|---|---|---|---|
| Simple (3) | | 3 | |
| Moderately complex (25) | 6 | 19 | |
| Complex (11) | 3 | 8 | |

d. Australia (45)

| Syllable structure \ Order of object and verb | OV (33) | VO (8) | No dominant order (4) |
|---|---|---|---|
| Simple (8) | 7 | 1 | |
| Moderately complex (25) | 17 | 6 | 2 |
| Complex (12) | 9 | 1 | 2 |

e. North America (46)

| Syllable structure \ Order of object and verb | OV (16) | VO (22) | No dominant order (8) |
|---|---|---|---|
| Simple (2) | | 2 | |
| Moderately complex (21) | 10 | 8 | 3 |
| Complex (23) | 6 | 12 | 5 |

f. South America (40)

| Syllable structure \ Order of object and verb | OV (25) | VO (11) | No dominant order (4) |
|---|---|---|---|
| Simple (13) | 8 | 3 | 2 |
| Moderately complex (19) | 10 | 7 | 2 |
| Complex (8) | 7 | 1 | |
Table 3 shows that there are more OV genera than VO genera with simple syllable structure in (d) Australia (7:1) and (f) South America (8:3). However, these areas also have more OV genera than VO genera with complex syllable structure, i.e. (d) Australia (9:1) and (f) South America (7:1). In the other areas, (a) Africa, (b) Eurasia, (c) South East Asia and (e) North America, the number of OV genera with simple syllable structure is not more than that of VO genera with simple syllable structure. These results show that the data in WALS do not give straightforward support for the hypothesis that OV languages have simple syllable structure. However, in the next section we argue that OV languages do have simple syllable structure if we consider the geographical gradation of the variety of word-final consonants, the fine classification of syllable complexity and head-complement orders, the coda inventory and the simplification of syllable structure within words and between words.
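The area-by-area comparison can be made explicit with a short script. The sketch below lists, for the simple and the complex rows of Table 3, the macro areas in which OV genera outnumber VO genera. The counts are the (OV, VO) pairs from Table 3; reading an empty cell as zero is an assumption of this sketch, as are the variable and function names.

```python
# (OV, VO) genus counts per macro area, taken from Table 3.
# Assumption: cells left empty in the table are read as 0.
SIMPLE = {
    "Africa": (2, 7),
    "Eurasia": (0, 0),
    "South East Asia": (0, 3),
    "Australia": (7, 1),
    "North America": (0, 2),
    "South America": (8, 3),
}
COMPLEX = {
    "Africa": (2, 8),
    "Eurasia": (20, 8),
    "South East Asia": (3, 8),
    "Australia": (9, 1),
    "North America": (6, 12),
    "South America": (7, 1),
}


def areas_with_more_ov(row: dict) -> list:
    """Areas in which OV genera strictly outnumber VO genera."""
    return [area for area, (ov, vo) in row.items() if ov > vo]


print(areas_with_more_ov(SIMPLE))   # ['Australia', 'South America']
print(areas_with_more_ov(COMPLEX))  # ['Eurasia', 'Australia', 'South America']
```

The first list matches the two areas singled out in the text; the second shows that the same areas, together with Eurasia, also have more OV than VO genera with complex syllable structure, which is why the raw counts give no straightforward support for the hypothesis.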
3. Reconsidering syllable structure in OV languages
3.1. Geographical gradation of coda inventory
First, as we saw in Section 2, Maddieson (2005) in WALS defines CV as “simple” syllable structure, (C)CVC [Onset CC limited] as “moderately complex” and others such as CCVC [CC free], CCCV . . . and . . . VCC as “complex.” However, this three-way distinction of syllable structure is not fine enough to enable us to see possible correlations with other features such as word orders. For example, syllable complexity should be defined on the basis of the number and variety of coda consonants. Hashimoto (1978) argues that both coda and tone are simpler in north Asia than in south Asia, as shown in Table 4.

Table 4. Number of tones and codas in Asian languages (cf. Hashimoto 1978)

| | Manchu | Gansu | Beijing | Nanshang | Guanzhou | Thai |
|---|---|---|---|---|---|---|
| # tones | 0 | 3 | 4 | 6 | 8 (9) | 8 |
| coda | n/ŋ | n/ŋ | n/ŋ | n/ŋ/t/k | m/n/ŋ/p/t/k | m/n/ŋ/p/t/k |

Direction (left to right): North → South
Southern languages have a wider variety of coda consonants than northern languages. Thai, a VO language, has the most complex syllable, and Manchu, an OV language, has the simplest syllable among these languages. However, both of them are classified as “moderately complex” in WALS. Japanese, another OV language, is classified as having a “moderately complex” syllable. However, its syllable is (C)V(n), which is quite close to the “simple” syllable structure (C)V. Interestingly, this geographical gradation of the coda inventory correlates with the variety of head-complement orders in these languages. The northern language Manchu has consistent head-final order in words and constituents of a variety of sizes: Stem-Suffix, Genitive-Noun, Adjective-Noun, Noun PhrasePostposition, Object-Verb, Clause-Adverbial Subordinator (complement underscored). We define head as a non-branching category and complement as a (potentially) branching category. Head-complement order, shown with (–), increases as we move south, and as the coda inventory and number of tones increase, as shown in Table 5.3 This table shows that the distinction between OV/VO languages is not sufficient to explain the correlation between head-complement orders and syllable complexity. We will try to show such fine correlation in the next section.4
3. We consider interesting the geographical gradation of coda inventory and head-complement orders because it might show us a relation between linguistics and anthropology. However, this topic is far beyond the scope of this paper and we leave the matter open.
4. It is an open question whether a similar geographical gradation of coda inventory can be found in languages other than Chinese dialects. Unfortunately, we do not have sufficient data about coda inventory in the world’s languages to identify such cases. We leave the problem for future research.
Table 5. Number of tones, coda variety and complement-head orders (+) (Stem-Suffix, Genitive-Noun, Adjective-Noun, Noun Phrase-Postposition, Object-Verb, Clause-Adverbial Subordinator) Language
#tones
coda
St-Suf
Manchu
0
n/ŋ
+
Gansu
3
n/ŋ
Beijing
4
n/ŋ
Nanshang
6
n/ŋ/t/k
Guanzhou
8 (9) 8
Thai
G-N
A-N
N-P
O-V
Cl-Sb
+
+
+
+
+
+
–
–
m/n/ŋ/p/t/k
+
+–
–
–
+–
m/n/ŋ/p/t/k
–
–
–
–
–
3.2. Number of segments and degree of head-complement order
In order to decide the degree of head-complement/complement-head order of a language, we checked the languages reported in Gil (1986) with the six head-complement orders in Table 5, which correspond to the features in WALS shown in (4).
(4) a. Prefixing vs. suffixing in inflectional morphology (#26)
    b. Order of object and verb (#83)
    c. Order of adposition and noun phrases (#85)
    d. Order of genitive and noun (#86)
    e. Order of adjective and noun (#87)
    f. Order of adverbial subordinator and clause (#93)
We assign 1 to each feature when it is a head-complement order (i.e. Prefix-Stem, Verb-Object, Preposition-Noun Phrase, Noun-Genitive, Noun-Adjective, Adverbial Subordinator-Clause) and –1 when it is a complement-head order (i.e. Stem-Suffix, Object-Verb, Noun Phrase-Postposition, Genitive-Noun, Adjective-Noun, Clause-Adverbial Subordinator). Then, the total score of a consistent head-initial language such as Bantoid and Mixtecan is 6; that of a consistent head-final language such as Turkic and Semitic is –6; that of a mixed language such as Baltic or Athapascan is 0 (–1 × 3 plus 1 × 3). For syllable complexity, we used the number of segments in 170 languages listed in Gil (1986), which is based on the Stanford Phonology Archive and the UCLA Phonological Segment Inventory Database. Following Dryer (1992, 2005), we counted the number of genera rather than languages.
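The head-complement (HC) score just described is a sum of six values of +1 or –1. A minimal Python sketch of that calculation follows; the feature keys and the function name `hc_score` are hypothetical labels introduced here for illustration, and the three example profiles are schematic rather than coded from WALS.

```python
# Sketch of the head-complement (HC) score: +1 for each head-complement order,
# -1 for each complement-head order, summed over the six features in (4).
HEAD_COMPLEMENT = 1    # e.g. Verb-Object, Preposition-Noun Phrase
COMPLEMENT_HEAD = -1   # e.g. Object-Verb, Noun Phrase-Postposition

# Hypothetical keys standing for the six features in (4a-f).
FEATURES = ("affix", "object_verb", "adposition", "genitive", "adjective", "subordinator")


def hc_score(orders: dict) -> int:
    """Sum the +1/-1 values of the six features for one language or genus."""
    missing = [f for f in FEATURES if f not in orders]
    if missing:
        raise ValueError(f"missing features: {missing}")
    return sum(orders[f] for f in FEATURES)


head_initial = {f: HEAD_COMPLEMENT for f in FEATURES}
head_final = {f: COMPLEMENT_HEAD for f in FEATURES}
# A mixed profile: three complement-head and three head-complement orders.
mixed = dict(zip(FEATURES, [-1, -1, -1, 1, 1, 1]))

print(hc_score(head_initial))  # 6
print(hc_score(head_final))    # -6
print(hc_score(mixed))         # 0
```

The three printed values correspond to the consistent head-initial, consistent head-final and mixed cases mentioned in the text.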
We grouped the genera according to the number of segments in a syllable, and calculated the average value of the head-complement orders. The result is shown in Table 6.

Table 6. The score of head-complement values sorted by number of segments in a syllable

| Number of segments | Average of HC score | Number of genera |
|---|---|---|
| 2 | 1.33 | 6 |
| 3 | –1.45 | 29 |
| 4 | –0.84 | 57 |
| 5 | 0.92 | 39 |
| 6 | –0.07 | 14 |
| 7 | 1.20 | 5 |
| 8 | 2.50 | 2 |
| 9 | 2.33 | 3 |
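The trend in Table 6 can be checked mechanically. The Python sketch below copies the table rows and tests whether the average HC score rises with the number of segments once the rows for two, five and nine segments, which the authors treat as exceptions, are set aside. The variable names are illustrative, and the check is only a restatement of the table, not an additional statistical analysis.

```python
# Rows of Table 6: (number of segments, average HC score, number of genera).
TABLE_6 = [
    (2, 1.33, 6),
    (3, -1.45, 29),
    (4, -0.84, 57),
    (5, 0.92, 39),
    (6, -0.07, 14),
    (7, 1.20, 5),
    (8, 2.50, 2),
    (9, 2.33, 3),
]

# Rows treated as exceptions in the discussion of Table 6.
EXCEPTIONS = {2, 5, 9}

scores = [score for segments, score, _ in TABLE_6 if segments not in EXCEPTIONS]
is_increasing = all(a < b for a, b in zip(scores, scores[1:]))
total_genera = sum(genera for _, _, genera in TABLE_6)

print(scores)         # [-1.45, -0.84, -0.07, 1.2, 2.5]
print(is_increasing)  # True
print(total_genera)   # 155
```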
Although the data are insufficient in some cases, Table 6 shows a tendency: as the number of segments increases, the value of head-complement orders increases. Except for the languages with two, five and nine segments in a syllable, which have head-complement scores of 1.33, 0.92 and 2.33 respectively, the HC score gradually increases from –1.45 to 2.50. This result at least shows that we can expect a fine correlation between syllable complexity and head-complement orders including OV/VO order.
3.3. Limited coda inventory in OV languages
The coda inventory is more limited in OV languages than in VO languages. A list of OV languages with possible coda consonants is shown in (5).5
(5) a. Japanese: n
    b. Kanuri (Saharan): n, m, l, ɹ
    c. Avar (Avar-Andic-Tsezic): n, m, w, j
    d. Tamil (Southern Dravidian): n, ɲ, ŋ, m, l, ɭ, ɾ, r, j
    e. Moghol (Mongolic): n, m, r, d
    f. Rutul (Lezgic): d, l, s, x
    g. Lezgian (Lezgic): m, b, k, l, z, r
    h. Chukchi (Chukotko-Kamchatkan): n, l, w, j, t, k
    i. Korean: n, ŋ, m, l, p, t, k
    j. Kurdish (Central) (Iranian): w, n, m, r, k, t, v, š, ž
5. The coda data for Kanuri, Korean, Tamil and Chukchi in list (5) are from VanDam (2004). We also checked the other languages by analyzing the data in Kamei et al. (1988–2001).
This list shows that there is a general order of consonants appearing in the coda position in OV languages. VanDam (2004) argues that languages tend to simultaneously prefer a manner hierarchy (nasal > liquid > obstruent > glide) and a place hierarchy (alveolar > velar > retroflex, tap). This tendency seems to be generally true of the languages in (5). We could argue that, at most, OV languages tend to have nasals, liquids, and some voiceless obstruents as a coda. In this sense, syllable structure in OV languages is simpler than in VO languages, which may have a full variety of obstruents and glides.6 Note that Kurdish (Central) in (5j) has a rich variety of coda consonants. However, apart from Stem-Suffix and OV order, this language has head-complement orders in its other constituents: Noun-Genitive, Noun-Adjective, Preposition-Noun Phrase, Adverbial Subordinator-Clause (complements underscored). Thus, Kurdish (Central) is more of a head-complement language than a complement-head language, even though it has OV order: its value of head-complement order is 2 (= (–1) × 2 + 1 × 4). This example again shows that we need to check word orders other than OV/VO in order to see the correlation between word orders and syllable structure, as we saw in Section 3.2. A question to ask is whether the coda inventory in VO languages is not as limited as in OV languages. As we will argue, our analysis predicts that syllable structure in OV languages is simple while that in VO languages may be either complex or simple. In fact, we find a number of VO languages or genera with no coda, such as Igbo (Igboid: Niger-Congo). However, these languages/genera are not counterexamples to our analysis. We will return to this point in Section 4.
6. One might argue that our selection of languages in this section and the next is arbitrary. We admit that we have not checked all languages in a principled manner. However, the point of our argument is to show that there are at least a number of OV languages whose syllable structure is simpler than previously reported, and that this is an area worthy of future investigation.
3.4. Limited consonant clusters within words in OV languages

Now let us consider consonant clusters within words. OV languages with “(moderately) complex” syllable structure may have phonological changes such as epenthesis and deletion, which simplify syllable structure. We propose (6) as working hypotheses.

(6) a. Consonant clusters are reduced in OV (head-final) languages.
    b. Consonant clusters are not reduced in VO (head-initial) languages.

For example, consonant clusters may be avoided by epenthesis of vowels, deletion of consonants and coalescence, as schematized in (7).

(7) Consonant clusters can be reduced by
    a. Epenthesis (CC → CVC)
    b. Deletion (CC → C)
    c. Coalescence (CC → C)
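The three schemas in (7) can be sketched on a bare CV skeleton. This is a minimal illustration, not an analysis of any particular language: the function names are hypothetical, words are reduced to strings of "C" and "V", and at this level of abstraction deletion and coalescence are indistinguishable because both map CC to C.

```python
def epenthesis(skeleton: str) -> str:
    """CC -> CVC: insert a V slot between any two adjacent C slots."""
    out = []
    for slot in skeleton:
        if slot == "C" and out and out[-1] == "C":
            out.append("V")
        out.append(slot)
    return "".join(out)


def deletion_or_coalescence(skeleton: str) -> str:
    """CC -> C: keep one C slot for every run of adjacent C slots."""
    out = []
    for slot in skeleton:
        if slot == "C" and out and out[-1] == "C":
            continue
        out.append(slot)
    return "".join(out)


def has_cluster(skeleton: str) -> bool:
    """True if the skeleton contains two adjacent C slots."""
    return "CC" in skeleton


print(epenthesis("CCVC"))               # CVCVC
print(deletion_or_coalescence("CCVC"))  # CVC
```

Either repair leaves a skeleton for which `has_cluster` returns False, which is the sense in which the changes in (7) simplify syllable structure.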
These phonological changes are found in such languages as Hindi and Basque, which are classified as having “complex” syllable structure in WALS, but should be called “moderately complex.” First, let us look at the case of epenthesis, which can be found in a number of OV languages, as shown in (8) (cf. Lee and Ramsey (2000) for Korean).7

(8) a. Nambiquara: w’aklsú → w’akəlisú ‘alligator’
    b. Persian: drožki (Russian) → doroške ‘droshky’
    c. Basque: libru (Latin) → liburu ‘book’
    d. Kannada: magal (Old) → magalu (New) ‘daughter’
    e. Japanese: drink (English) → dorinku
    f. Korean: text (English) → teyksuthu

In these examples, consonant clusters are reduced by epenthesis of a vowel. Second, we have cases of deletion within words, as shown in the Basque examples in (9) (Hualde and de Urbina 2003: 63).

7. We selected the languages in (8) from the OV languages with a description of syllable simplification in Kamei et al. (1988–2001). The examples in (8) are taken from: Price (1976) (8a), Rastorgueva (1964) (8b), Hualde and de Urbina (2003) (8c), Kamei et al. (1988–2001) (8d), and Lee and Ramsey (2000) (8f). Price (1976: 346) reports that in Nambiquara “a nondistinctive vowel occurs between all combinations of consonants that involves a change in the oral place of articulation.”
(9) a. gloria (Latin) → loria ‘glory’
    b. ecclesia (Latin) → eliza ‘church’

A consonant is deleted in word-initial position in (9a) and in medial position in (9b). Third, Korean used to have consonant clusters in onset position in the era of Middle Korean. These CC clusters changed into reinforced consonants in Modern Korean.8 The examples in (10) show the process of coalescence.

(10) a. stʌr → ttal ‘daughter’
     b. pskur → skur → kkul ‘honey’

Here tt and kk show reinforced consonants (cf. Lee 1975: 152).

On the other hand, VO languages seem to have few examples of deletion in consonant clusters. Although it is difficult to show that this is universally the case, there are examples showing that VO languages may delete vowels to make consonant clusters. For example, consider the English names of Japanese companies in (11), where vowels are deleted to make consonant clusters or codas.

(11) a. Matsuda → Mazda
     b. Yasukawa → Yaskawa
     c. Noritsu → Noritz

These examples show that VO languages such as English do not need to simplify consonant clusters. Note that we are not claiming that every VO language has the means of making consonant clusters and codas illustrated here. As we will discuss in Section 4, our analysis predicts that the syllable structure in OV languages should be simple while that in VO languages can be either complex or simple.

3.5. Limited consonant clusters between words in OV languages

Finally, we would like to point out that consonant clusters between words (as well as those within words) are also limited in OV languages. For example, Korean, which has a number of nouns ending in a coda consonant, in fact has particles attached to them to show their cases. Korean has two forms of particles, which are phonologically conditioned, as shown in (12).
8. We would like to thank John Whitman for discussion on Korean phonology.
(12) a. nominative: -i/ka
     b. accusative: -ul/lul
     c. instrumental: -ulo/lo
     d. comitative: -kwa/wa
     e. vocative: -a/ya
     f. topic: -un/nun
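The phonological conditioning of the pairs in (12) can be sketched as a lookup, on the understanding that the first form of each pair follows a consonant-final word and the second a vowel-final word. Everything beyond the particle forms themselves is an assumption made for illustration: the function names, the use of a Yale-style romanization in which a stem-final a/e/i/o/u/y signals a vowel-final stem, and the two sample stems salam and namu, which are not taken from the text.

```python
# Case particles from (12): (form after a consonant, form after a vowel).
PARTICLES = {
    "nominative": ("i", "ka"),
    "accusative": ("ul", "lul"),
    "instrumental": ("ulo", "lo"),
    "comitative": ("kwa", "wa"),
    "vocative": ("a", "ya"),
    "topic": ("un", "nun"),
}

# Assumption: in the romanization used here, a stem ending in one of these
# letters is vowel-final; any other final letter counts as a consonant.
VOWEL_FINAL_LETTERS = set("aeiouy")
VOWEL_INITIAL_LETTERS = set("aeiou")


def select_form(stem: str, case: str) -> str:
    """Pick the particle form conditioned by the final segment of the stem."""
    after_consonant, after_vowel = PARTICLES[case]
    return after_vowel if stem[-1] in VOWEL_FINAL_LETTERS else after_consonant


def attach_particle(stem: str, case: str) -> str:
    return f"{stem}-{select_form(stem, case)}"


def makes_cluster(stem: str, case: str) -> bool:
    """True if a consonant-final stem meets a consonant-initial particle."""
    stem_ends_in_consonant = stem[-1] not in VOWEL_FINAL_LETTERS
    particle_starts_with_consonant = select_form(stem, case)[0] not in VOWEL_INITIAL_LETTERS
    return stem_ends_in_consonant and particle_starts_with_consonant


for case in PARTICLES:
    print(case, attach_particle("salam", case), attach_particle("namu", case),
          makes_cluster("salam", case))
```

Under these assumptions the only pair for which `makes_cluster` returns True with a consonant-final stem is the comitative, whose post-consonantal form -kwa itself begins with a consonant.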
In (12), the first form of each pair attaches to a word ending in a consonant and the second form to a word ending in a vowel.9 Thus, particles and the words they attach to do not make a consonant cluster even if the words end in a consonant. Note also that these particles end in a vowel i/a/o or a consonant l/n. Thus, constituents consisting of a noun phrase and a particle end in a vowel or l/n. These features of Korean morpho-phonology make Korean more like a syllable-timed or moraic language with the form CVCV… Thus, Korean is not a real counterexample to the universal tendency for head-final languages to have simple syllable structure.

Similar examples can be found in Moghol (Mongolic), which has eight types of case suffixes (Weiers 2003: 254).

(13) a. genitive: -i/-ɑi
     b. accusative: -i/-’i
     c. dative: -du/-do/-tu [cf. du (preposition)]
     d. ablative: -sa/-sah, -asa/-asah [cf. sah (preposition)]
     e. instrumental: -ar
     f. comitative: -la/-lah
     g. vocative: -ɑ̊
In the ablative (13d), consonant stems normally require the presence of an extra vowel segment, which avoids making a consonant cluster with the preceding stem. These case suffixes end in vowels or h/r; constituents consisting of a noun phrase and a particle also end in a vowel or h/r.

Nivkh also has epenthesis in the case of third person singular pronouns (Shiraishi 2006: 39, 41) as shown in (14) and (15).

9. We need to consider the reason why -kwa instead of -wa is used after a word ending with a consonant to make a consonant cluster. Another remaining problem is why the genitive case marker -uy does not have another form with an onset consonant.
(14) a. ŋ-ɨmɨk ‘my mother’
        my mother
     b. cʰ-ɨtɨk ‘your father’
        your father

(15) a. ŋi-’zŋaj ‘my picture’
        my picture
     b. cʰi-’zŋaj ‘your picture’
        your picture
Pronominal clitics attach to a vowel-initial host in (14) and to a consonant-initial host in (15), where the vowel i is inserted.

In this section we argued that OV languages do have simple syllable structure if we consider the geographical gradation of the variety of word-final consonants, the fine classification of syllable complexity and head-complement orders, the coda inventory and the simplification of syllable structure within words and between words. We used examples from a range of languages to illustrate these points.
4. Why do OV languages have simple syllable structure?

We have argued that OV languages tend to have simple syllable structure with fewer consonant clusters between words and within words. In this section, we consider why word orders correlate with syllable structure. Tokizaki (2008) argues that left-branching structure has stronger juncture between its constituents than right-branching structure. The juncture between B and C in left-branching (16a) is stronger than the juncture between A and B in right-branching (16b).

(16) a. [[A B] C]
     b. [A [B C]]

In this sense, the juncture is asymmetrical between left-branching and right-branching structure. Tokizaki (2008) shows phonological and morpho-syntactic evidence for this junctural asymmetry. Let us review some of the arguments about Japanese and Korean presented there and discuss some new data from Dutch and German.

First, consider Rendaku (sequential voicing) in Japanese, which applies to the first consonant in a word preceded by another word ending with a vowel. For example, the first consonant in the second word in (17a) and (17b) is voiced when it is a part of a compound.
(17) a. nise tanuki → nise danuki
        mock badger ‘mock-badger’
     b. tanuki shiru → tanuki jiru
        badger soup ‘badger-soup’

The voicing rule also applies to three-word compounds if they have left-branching structure as in (18a), but it is blocked if they have right-branching structure as in (18b) (Otsu 1980).

(18) a. [[nise tanuki] shiru] → nise danuki jiru
        mock badger soup ‘mock-badger soup’
     b. [nise [tanuki shiru]] → nise tanuki jiru
        mock badger soup ‘mock badger-soup’

Let us assume that Rendaku is the process that assimilates a word-initial consonant to the preceding vowel with respect to the feature [+voice]. Then Rendaku is blocked when there is a left bracket between a word-final vowel and a word-initial consonant as in (18b). Thus Japanese Rendaku is a case of left/right-branching asymmetry with respect to blocking phonological change.

Another case of left/right-branching asymmetry is n-Insertion in Korean. In Standard Korean, n is inserted before a stem beginning in i or y when it is preceded by another stem or prefix which ends in a consonant. For example, sæk ‘color’ and yuli ‘glass’ may make sæŋ nyuli ‘colored glass’. This rule can apply in compounds with left-branching structure while it cannot in compounds with right-branching structure (Han 1994).

(19) a. [[on chən] yok] → on chən nyok
        hot spring bathe ‘bathing in a hot spring’
     b. [[mæŋ caŋ] yəm] → mæŋ jaŋ nyəm
        cecum bowel fire ‘appendicitis’

(20) a. [kyəŋ [yaŋ sik]] → kyəŋ yaŋ sik/*kyəŋ nyaŋ sik
        light Western food ‘a light Western meal’
     b. [myəŋ [yən ki]] → myəŋ yən gi/*myəŋ nyən gi
        fame play skill ‘excellent performance’

A left bracket in a compound blocks n-Insertion as in (20), and a right bracket does not, as in (19).
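The bracket-sensitivity shared by Rendaku and n-Insertion can be sketched with the Japanese data. The following is a toy model under stated assumptions, not the authors' formalization: compounds are written as nested Python tuples, the voicing table covers only the two alternations seen in (17) and (18) (t → d, sh → j), and the function names are hypothetical. The rule applies at a juncture unless a left bracket stands between the two words.

```python
# Voicing alternations attested in (17)-(18) only; "sh" is listed before
# single letters so that it is matched as a unit.
VOICING = {"sh": "j", "t": "d"}
VOWELS = set("aeiou")


def linearize(tree):
    """Flatten a nested tuple into words and bracket symbols."""
    if isinstance(tree, str):
        return [tree]
    tokens = ["["]
    for child in tree:
        tokens.extend(linearize(child))
    tokens.append("]")
    return tokens


def voice_initial(word):
    """Voice the word-initial consonant if the table has an entry for it."""
    for plain, voiced in VOICING.items():
        if word.startswith(plain):
            return voiced + word[len(plain):]
    return word


def apply_rendaku(tree):
    """Voice a word-initial consonant after a vowel-final word, unless a
    left bracket intervenes between the two words."""
    output = []
    left_bracket_since_last_word = False
    for token in linearize(tree):
        if token == "[":
            left_bracket_since_last_word = True
        elif token == "]":
            continue  # a right bracket does not block the rule
        else:
            after_vowel = bool(output) and output[-1][-1] in VOWELS
            if after_vowel and not left_bracket_since_last_word:
                token = voice_initial(token)
            output.append(token)
            left_bracket_since_last_word = False
    return " ".join(output)


print(apply_rendaku((("nise", "tanuki"), "shiru")))  # nise danuki jiru
print(apply_rendaku(("nise", ("tanuki", "shiru"))))  # nise tanuki jiru
```

The two calls reproduce the contrast in (18): in the left-branching compound only a right bracket separates tanuki and shiru, so voicing applies at both junctures, while in the right-branching compound the left bracket before tanuki blocks it there.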
The left/right-branching asymmetry is also seen in languages other than Japanese and Korean. According to Krott et al. (2004), interfixation in Dutch three-word compounds shows the left/right-branching asymmetry. In Dutch, the occurrence of interfixes, including -s-, in tri-constituent compounds matches the major constituent boundary better in right-branching compounds than in left-branching compounds. In (21) and (22), the numbers of compounds with -s- and with all interfixes are shown in parentheses after the examples.

(21) a. [arbeid-s-[vraag stuk]] (-s- 38; all 60)
        employment+question-issue
     b. [hoofd [verkeer-s-weg]] (-s- 3; all 11)
        main+traffic-road

(22) a. [[grond wet]-s-artikel] (-s- 25; all 39)
        ground-law+article, constitution
     b. [[scheep-s-bouw] maatschappij] (-s- 13; all 50)
        ship-building+company
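The counts in (21) and (22) can be compared by dividing the (a) figure, where the interfix sits at the major constituent boundary, by the (b) figure, where it sits inside the embedded constituent. The sketch below only redoes that division; the dictionary layout and names are illustrative, and the numbers are those given in parentheses in the examples.

```python
# Counts from (21) and (22): position (a) = interfix at the major boundary,
# position (b) = interfix inside the embedded constituent.
COUNTS = {
    "right-branching (21)": {"a": {"-s-": 38, "all": 60}, "b": {"-s-": 3, "all": 11}},
    "left-branching (22)": {"a": {"-s-": 25, "all": 39}, "b": {"-s-": 13, "all": 50}},
}


def position_ratio(structure: str, interfix: str) -> float:
    """Ratio of (a) to (b) counts, rounded to one decimal place."""
    a_count = COUNTS[structure]["a"][interfix]
    b_count = COUNTS[structure]["b"][interfix]
    return round(a_count / b_count, 1)


for structure in COUNTS:
    print(structure, position_ratio(structure, "-s-"), position_ratio(structure, "all"))
```

For both the -s- interfix alone and all interfixes together, the ratio comes out higher for the right-branching compounds than for the left-branching ones.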
The ratio of the unmarked interfix position (21a) and (22a) to the marked interfix position (21b) and (22b) is higher in right-branching (21) (-s- 38 ÷ 3 = 12.7; all 60 ÷ 11 = 5.5) than in left-branching (22) (-s- 25 ÷ 13 = 1.9; all 39 ÷ 50 = 0.8). That is, interfixes occur at the constituent break more often in right-branching compounds than in left-branching compounds. This result is expected if we assume that the juncture between constituents in right-branching structure is weaker than that in left-branching structure.

Moreover, Wagner (2005) shows that there is a phrasing asymmetry between OV and VO orders: OV is pronounced as a prosodic phrase while VO is pronounced as two prosodic phrases. In (23), parentheses show prosodic phrases.

(23) a. (Sie hát) (einen Tángo getanzt)
        she has a-Acc tango danced
        ‘She has danced a tango.’
     b. (Sie tánzte) (einen Tángo)
        she danced a-Acc tango
        ‘She danced a tango.’

The OV in (23a) [[einen Tángo] getanzt] is left-branching and is included in a prosodic phrase. The VO in (23b) [tánzte [einen Tángo]] is right-branching and is divided into different prosodic phrases.

These arguments support the idea of left/right-branching asymmetry. Now let us see how the asymmetry sheds light on the relation between word orders
and syllable structure in languages. Let us consider how simple syllable structure allows an object to move to the left of the verb to make left-branching structure. For example, a verb phrase tends to have right-branching structure in a head-initial language (24a), and left-branching structure in a head-final language (24b).

(24) a. [VP V [NP .. N ..]]
     b. [VP [NP .. N ..] V]

However, if we assume the left/right-branching asymmetry discussed above, head-final languages in fact have compound-like verb ‘phrases’.

(25) [V [ .. N .. ] V]

The object and the verb in (25), separated only by a weak bracket (represented by ] ), are more closely connected to each other than the object and the verb in (24a), which are separated by a strong boundary. Simple syllable structure such as CV fits nicely into the stronger juncture in (25) without making a consonant cluster, as in (26).

(26) [V [ .. CV ] CV]

Then VO languages are allowed to have complex syllable structure because strong boundaries separate the coda of the verb and the onset of the object as shown in (27).

(27) [VP .. CCCVCC [NP CCCVCC .. ]]

Thus, left/right-branching asymmetry gives us an interesting way to explain a correlation between syntax and phonology.10
10. Mehler et al. (2004) report experimental work showing the correlation between head-complement order and rhythm, i.e. head-complement = stress-timed vs. complement-head = mora-timed. Although it is based on data from only fourteen languages, the result seems to apply to other languages as well.

5. Conclusion

We have seen that data in WALS do not show a clear correspondence between OV languages and simple syllable structure. However, we have argued that this is partly due to the crude distinction between syllable complexity in WALS. We have pointed out that we should take into account the geographical gradience of coda variety, coda inventory, phonological simplification and particles attached to nouns, and complement-head orders other than OV/VO. These points limit consonant clusters within words and between words in OV languages. Thus, the correlation between OV order and simple syllable structure is more realistic than it seems. This correlation is predicted by the notion that left-branching structure has stronger juncture than right-branching structure. Needless to say, we need to investigate the points just mentioned more carefully and thoroughly. We hope that this research is a step toward a typology of syllable complexity and its relation to other components of grammar.
Acknowledgments

We would like to thank Theo Vennemann for invaluable comments and suggestions. We are also grateful to Bingfu Lu for his comments on Chinese dialects. This work is supported by Grant-in-Aid for Scientific Research (A20242010, C18520388) and Sapporo University.
References

Donegan, Patricia J. and David Stampe. 1983. Rhythm and the holistic organization of language structure. In: John F. Richardson, Mitchell Marks and Amy Chukerman (eds.), Papers from the Parasession on the Interplay of Phonology, Morphology and Syntax, Chicago: Chicago Linguistics Society, 337–353.
Dryer, Matthew S. 1992. The Greenbergian word order correlations. Language 68: 81–138.
Dryer, Matthew S. 2005. Order of object and verb. In: Haspelmath et al. (eds.), 338–339.
Dryer, Matthew S. 2009. Problems testing typological correlations with the online WALS. Linguistic Typology 13: 121–135.
Gil, David. 1986. A prosodic typology of language. Folia Linguistica 20: 165–231.
Han, Eunjoo. 1994. Prosodic structure in compounds. Doctoral dissertation, Stanford University.
Hashimoto, Mantaro. 1978. Gengo ruikei chiri-ron (Typological and geographical linguistics). Tokyo: Kobundo. Also in Hashimoto Mantaro Chosaku-shu vol. 1, Tokyo: Uchiyama-shoten, 29–190.
Haspelmath, Martin, Matthew S. Dryer, David Gil, and Bernard Comrie. 2005. The world atlas of language structures. Oxford: Oxford University Press.
Hualde, José Ignacio and Jon Ortiz de Urbina. 2003. A grammar of Basque. Berlin: Mouton de Gruyter.
Kamei, Takashi, Rokuro Kohno and Eiichi Chino (eds.). 1988–2001. Gengogaku Daijiten (The Dictionary of Linguistics). Tokyo: Sanseido.
Krott, Andrea, Gary Libben, Gonia Jarema, Wolfgang Dressler, Robert Schreuder and Harald Baayen. 2004. Probability in the grammar of German and Dutch: Interfixation in triconstituent compounds. Language and Speech 47: 83–106.
Lee, Iksop and S. Robert Ramsey. 2000. The Korean Language. Albany: State University of New York Press. The Japanese edition is published as Kankokugo Gaisetsu. Tokyo: Taishukan, 2004.
Lee, Ki-Moon. 1975. Kankokugo-no Rekishi (History of Korean Language). Supervised by Shichiro Murayama and translated by Yukio Fujimoto. Tokyo: Taishukan.
Lehmann, Winfred P. 1973. A structural principle of language and its implications. Language 49: 47–66.
Maddieson, Ian. 2005. Syllable structure. In: Haspelmath et al. (eds.), 54–55.
Maddieson, Ian. 2010. Correlating syllable complexity with other measures of phonological complexity. On-in Kenkyu (Phonological Studies) 13: 105–116.
Mehler, Jacques, Núria Sebastián-Gallés and Marina Nespor. 2004. Biological foundations of language acquisition: evidence from bilingualism. In: Michael S. Gazzaniga, Emilio Bizzi and Ira B. Black (eds.), The cognitive neurosciences III, Cambridge, MA: Bradford, MIT Press, 825–836.
Otsu, Yukio. 1980. Some aspects of rendaku in Japanese and related problems. MIT Working Papers in Linguistics Vol. 2: Theoretical Issues in Japanese Linguistics, 207–227.
Plank, Frans. 1998. The co-variation of phonology with morphology and syntax: A hopeful history. Linguistic Typology 2: 195–230.
Plank, Frans. 2009. WALS values evaluated. Linguistic Typology 13: 41–75.
Price, David P. 1976. Southern Nambiquara phonology. International Journal of Anthropological Linguistics 42: 338–348.
Rastorgueva, Vera S. 1964. A short sketch of the grammar of Persian (translated by Steven P. Hill; edited by Herbert H. Paper). Bloomington: Indiana University.
Shiraishi, Hidetoshi. 2006. Topics in Nivkh phonology. Groningen Dissertations in Linguistics 61. University of Groningen.
Tokizaki, Hisao. 2008. Symmetry and asymmetry in the syntax-phonology interface. On-in Kenkyu (Phonological Studies) 11: 123–130.
VanDam, Mark. 2004. Word final coda typology. Journal of Universal Language 5: 119–148.
Wagner, Michael. 2005. Asymmetries in prosodic domain formation. MIT Working Papers in Linguistics 49: 329–367.
Weiers, Michael. 2003. Moghol. In: Juha Janhunen (ed.), The Mongolic languages, London: Routledge, 248–264.
Manner, place and voice interactions in Greek cluster phonotactics

Marina Tzakosta

Abstract

This paper evaluates cluster formation and cluster well-formedness in Greek on the basis of three distinct scales, namely the scale of manner of articulation, the scale of place of articulation and the scale of voicing. The proposal of this paper is that the classical Sonority Scale (cf. Selkirk 1984, Steriade 1982) and the bi-dimensional model proposed by Morelli (1999) in which cluster formation is evaluated on the basis of two distinct scales, i.e. the manner and place scales, are not adequate to account for cluster formation and cluster well-formedness. According to the present proposal, in addition to the scales of manner and place, voicing is crucial for cluster well-formedness and needs to constitute a distinct scale. Voicing actually defines a cluster as an acceptable tautosyllabic sequence. Well-formedness is driven by the rightward satisfaction of the scales in combination with the Distance holding among cluster members. Different degrees of satisfaction of the scales and different distances holding among cluster members result in different degrees of cluster well-formedness. The theoretical claims expressed here are tested through Greek dialectal and developmental data but aim at having cross-linguistic value. The current proposal further contributes to the establishment of principles governing syllabification.
1. Introduction

Cluster formation is primarily investigated in connection with the ways it contributes to syllabic complexity. Consonant clusters add to syllabic weight and affect stress assignment, given that there are languages in which stress ‘selects’ its landing site depending on syllabic weight (Ewen and van der Hulst 2001, Hayes 1995, van der Hulst 1984).1 Cluster formation is considered to be driven by the Sonority Scale (hereafter SonS) and Sonority Distance (hereafter SD) (Selkirk 1984, Steriade 1982). The SonS determines cluster well-formedness in a progressive and rightward manner. More specifically, and as illustrated in Figure 1 below, phonemes
1. In stress-to-weight systems stress adds weight to the syllable that carries it while in weight-to-stress systems stress falls on heavy syllables.
rise in sonority from left to right; therefore, stops are the least sonorous segments whereas vowels are the most sonorous ones. The notion of sonority was first introduced by Sievers (1901) and further developed by Jespersen (1904). Jespersen proposes the classification of phonemes in terms of sonority. Sonority is considered to be a universal principle dependent on phonological grounds. Moreover, there are acoustic studies which further support its universal cross-linguistic character (cf. Jany et al. 2007). Sonority is a gradient notion in the sense that it is comparative; for example, stops are less sonorous than fricatives and both are less sonorous than vowels. Moreover, the more sonorous a segment is, the more likely it is to occupy a syllabic nucleus position. Conversely, the less sonorous a segment is, the more probable it is to be part of a syllabic onset or a syllabic coda. Given the above, a syllable is a contour schema rising in sonority towards the nucleus and falling in sonority towards the coda. Rightward satisfaction of the scale implies that, for example, stops may cluster with any consonant type to their right on the scale and result in well-formed clusters. However, fricatives can cluster with all consonant types except for stops, which are located to their left. Therefore, according to the SonS, FAFFR,2 FN, FL clusters are perfectly acceptable, but FS3 sequences are not.
Figure 1. The classical sonority scale
Fig. 1 offers a generalized version of the SonS, as proposed by Steriade (1982). However, there are parametrized versions of the SonS which are imposed by the phonotactic constraints of each language. Drachman (1989, 1990), Kappa (1995), and Malikouti-Drachman (1987) have proposed such parametrized SonSs for Greek.

2. S stands for stops, F for fricatives, AFFR for affricates, N for nasals, L for laterals and rhotics, G for glides, V for vowels and C for obstruent consonants, i.e. stops and fricatives.

3. Morelli (1999) suggests that the systematic occurrence of obstruent clusters must be explained in sonority-independent terms. She suggests that the sonority scale should be divided into two distinct scales, one for PoA and one for MoA, along which generalizations can be made. According to her proposal, FS sequences are the only well-formed clusters in Greek. /s/ clusters are also unmarked along both dimensions. However, Greek allows not only for FS clusters but also for SF, SS and FF sequences.
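Rightward satisfaction of the SonS can be sketched as a comparison of positions on the scale. Since Figure 1 is not reproduced in this text, the numeric positions below are an assumption: they follow the order in which the classes are listed in footnote 2 and are chosen to be consistent with the distances the chapter cites for /px/ (1), /pl/ (4) and /fl/ (3). The function names are hypothetical.

```python
# Assumed positions on the classical sonority scale (S < F < AFFR < N < L < G < V).
SONORITY = {"S": 1, "F": 2, "AFFR": 3, "N": 4, "L": 5, "G": 6, "V": 7}


def sonority_distance(first: str, second: str) -> int:
    """Signed distance from the first to the second cluster member."""
    return SONORITY[second] - SONORITY[first]


def satisfies_scale_rightward(first: str, second: str) -> bool:
    """True if the second member lies to the right of the first on the scale."""
    return sonority_distance(first, second) > 0


for cluster in [("S", "F"), ("S", "L"), ("F", "L"), ("F", "N"), ("F", "S")]:
    print(cluster, sonority_distance(*cluster), satisfies_scale_rightward(*cluster))
```

On these assumptions F+AFFR, FN and FL satisfy the scale while FS does not, and the distance for a stop plus liquid exceeds that for a stop plus fricative.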
SD, on the other hand, a notion qualitative in nature, determines the degree of cluster well-formedness (cf. Clements 1988, 1990, 1992). More specifically, cluster members marked by the biggest possible and sonority-rising distance between them make up the best-formed clusters. Numbers on the SonS signal the distance among cluster members. Consequently, an SF cluster like /px/ with SD (1) is less well-formed than an SL sequence like /pl/ with SD (4), though both are well-formed clusters. Therefore, SD presupposes that cluster well-formedness is marked by different degrees of cluster perfection and acceptability. Put differently, cluster perfection is signaled by the biggest possible sonority distance among cluster members (the minimal distance being (1)), while cluster acceptability is signaled by, in most cases, (0) distance among cluster members; (0) distance is attested when cluster members share the same manner of articulation, place of articulation or voicing. Gradience in cluster formation is one of the core issues of the present study and will be discussed in detail.

It is important to mention that Lass (1984) has proposed a mirror image of the SonS, namely, the Scale of Consonantal Strength (hereafter SConS). On the SConS, phonemes are evaluated and interrelated not with respect to sonority but with respect to their strength. Therefore, while vowels are the most sonorous and stops are the least sonorous segments on the SonS, stops are the strongest segments while vowels are the weakest segments on the SConS.

Claims like the ones discussed above allow us to make certain predictions regarding cluster realization and, implicitly, cluster perception. More specifically, if the SonS and SD govern cluster perfection, we expect that a perfect cluster would be perceptually more salient than an acceptable cluster; as a result, the former would have more chances to remain intact in its surface/phonetic realization.
In other words, we would expect that the SonS and SD drive ‘clarity’ of perception which, in turn, facilitates production. Consequently, CL rather than CC clusters are expected to emerge more frequently not only cross-linguistically but also in various aspects of a language (i.e. its dialectal varieties, L1 and L2 data, language disorders). The accuracy of the above assumptions is reinforced by the fact that multiple repair strategies, such as epenthesis, deletion or fusion, apply in clusters with small SD, like SF or FN, whereas clusters with big SD, like SL, are characterized by vowel anaptyxis. These assumptions have been tested and verified by Greek L1 and L2 experimental and developmental data in Tzakosta (2009) and Tzakosta and Vis (2009a, 2009b, 2009c). Although there is solid argumentation regarding the universal as well as (per language) parametric factors that determine the formation of consonant clusters at the level of the SonS and SD, little has been said regarding the internal coherence of consonant clusters and additional factors which drive
cluster acceptability and cluster perfection in different languages or different aspects of the same language. In this study, we investigate this issue focusing on Greek and drawing on data from dialectal varieties of Greek. Dialectal data will be further supported by L1 and L2 data. Our fundamental claim is that the SonS and SD do not suffice in rendering a consonant cluster as perfect or acceptable. We argue that clusters fall within three categories: i) perfect, ii) acceptable and iii) non-acceptable (cf. Tzakosta 2010, Tzakosta and Karra 2011). We propose that perfect, acceptable and non-acceptable cluster formation depends on and is evaluated in parallel by means of the satisfaction of three distinct scales of manner, place and voicing which must be satisfied in a rightward manner. Cluster perfection and acceptability are gradient notions due to SD. In other words, /pl/ and /fl/ are both perfect clusters, but /pl/ is better-formed than /fl/ because the SD is bigger for /pl/ (4) than for /fl/ (3). In sum, we argue that cluster formation is driven by the parallel satisfaction of multiple scales of manner, place and voicing in combination with Distance (hereafter D). We claim that what is important in cluster well-formedness is not SD but rather simple D given that, as will be shown, simple D without the presupposition of sonority is of major importance in cluster acceptability. A crucial advantage of this three-scales model is that the scales contribute not only to well-formed cluster formation but also to the establishment of principles which drive syllabification. This issue will be discussed further in the remainder of the paper.

The paper proceeds as follows: Section 2 presents the characteristics of cluster formation in Greek as well as the shortcomings of the current analyses. Section 3 recapitulates our research questions and working hypotheses while section 4 provides details regarding the data sources.
Section 5 unravels the features of the new proposal based on data from Greek dialects, native speakers of standard Greek and second language learners of Greek. Finally, section 6 concludes the paper and poses issues for future research.

2. The problem

Before we move to the development of our proposal we find it essential to provide some information regarding the phonotactic constraints of standard Greek. Standard Greek allows the formation of open and closed syllables. Syllabic codas consist of maximally one consonantal segment. The repertoire of coda segments is rather limited; only /s/ is allowed word finally and /n/, /l/, /r/ word medially. Therefore, when either /n/, or /l/, or /r/ is the first member of a sequence of consonants, then these consonants are heterosyllabic. Some representative examples of heterosyllabic sequences are given in the data in (1). In all data sets dots indicate syllabic boundaries.
(1) a. [pér.no] ‘take-1SG.PRES.’
    b. [pal.tó] ‘overcoat-NEUT.NOM.SG.’
    c. [ár.ma] ‘chariot-NEUT.NOM.SG.’
    d. [ál.mi] ‘brine-FEM.NOM.SG.’
    e. [án.θro.pos] ‘man-MASC.NOM.SG.’
Clusters appear only in onset position, word-initially and word-internally (Drachman 1989, Kappa 1995, Nespor 1997, and more references therein; see also Levin 1985 for more information regarding the power of universal principles driving syllabification), satisfying the Maximal Onset Principle (Selkirk 1984). Greek is rather free regarding the combination of consonants that may cluster together. However, it is conservative when it comes to the number of consonants a cluster may consist of. More specifically, Greek clusters may consist of at most three consonants. In three-member clusters the initial cluster member is in most cases /s/. The data in (2) provide some examples of Greek cluster phonotactics.4 Apart from perfectly formed CL sequences, obstruent CC clusters are allowed in all possible combinations; Greek clusters are made of [voiceless stop + voiceless stop], [voiceless stop + voiceless/voiced fricative], [voiced fricative + voiced fricative], [voiceless fricative + voiceless fricative], [voiceless fricative + voiceless stop] segments. [voiced obstruent + voiceless obstruent] sequences, like /bt/, are not attested in Greek.

(2) a. CL → a.plós ‘simple-ADJ.MASC.NOM.SG.’
           á.kri ‘edge-FEM.NOM.SG.’
           é.θri.os ‘clear-ADJ.MASC.NOM.SG.’
           γlá.ros ‘seagull-MASC.NOM.SG.’
    b. CC → fθi.nós ‘cheap-ADJ.MASC.NOM.SG.’
           vγá.zo ‘take out-1SG.PRES.’
           a.ktí ‘coast-FEM.NOM.SG.’
           o.pti.kós ‘ocular-ADJ.MASC.NOM.SG.’
           té.fxos ‘issue-NEUT.NOM.SG.’
           é.kθe.si ‘composition/display-FEM.NOM.SG.’
           é.kδo.si ‘publication-FEM.NOM.SG.’5
           pé.fko ‘pine-NEUT.NOM.SG.’
    c. CG → á.δjos ‘empty-ADJ.MASC.NOM.SG.’
    d. CN → a.kmí ‘acme/prosperity-FEM.NOM.SG.’
           é.θnos ‘nation-NEUT.NOM.SG.’
    e. NN → a.mne.si.a ‘amnesia-FEM.NOM.SG.’

4. This notice holds for the native vocabulary of the language.

5. /kθ/ and /kδ/ emerge in morpheme boundaries.
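The obstruent combinations listed before (2) can be sketched as a lookup over manner and voicing. This is a minimal illustration, not part of the analysis: the segment table is limited to obstruents that appear in the examples (plus /b/ for the unattested /bt/), the segment symbols follow the transcription used in (2), and the function name is hypothetical.

```python
# (manner, voiced?) for the obstruents used in the examples.
SEGMENTS = {
    "p": ("stop", False), "t": ("stop", False), "k": ("stop", False),
    "b": ("stop", True),
    "f": ("fricative", False), "θ": ("fricative", False), "x": ("fricative", False),
    "v": ("fricative", True), "δ": ("fricative", True), "γ": ("fricative", True),
}

# Combinations the text lists as attested in Greek obstruent clusters.
ATTESTED = {
    (("stop", False), ("stop", False)),
    (("stop", False), ("fricative", False)),
    (("stop", False), ("fricative", True)),
    (("fricative", True), ("fricative", True)),
    (("fricative", False), ("fricative", False)),
    (("fricative", False), ("stop", False)),
}


def is_attested_obstruent_cluster(first: str, second: str) -> bool:
    """True if the manner/voicing combination is one of those listed as attested."""
    return (SEGMENTS[first], SEGMENTS[second]) in ATTESTED


for cluster in ["fθ", "vγ", "kt", "pt", "fx", "kθ", "kδ", "fk", "bt"]:
    print(cluster, is_attested_obstruent_cluster(cluster[0], cluster[1]))
```

All CC clusters in (2b) come out as attested, while /bt/, a voiced obstruent followed by a voiceless one, does not.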
In this study, the focus is on two-member CL and CC consonant clusters because these cluster types, first, display a great variety of possible combinations in Greek, second, added up they are the most frequently attested (Protopapas et al. in press), and, third, they differ radically regarding their phonological representation. More specifically and regarding this latter parameter, CC sequences have tight phonological representations similar to those of complex segments, and, consequently, they are difficult to perceive and produce. On the contrary, CL clusters have a ‘loose’ phonological representation, therefore, they are assumed to be easy to perceive (Tzakosta and Vis 2009a). CC and CL phonological representations are depicted in figures (2a) and (2b), respectively.
Figure 2. Left panel: A phonological representation of CC clusters. Right panel: A phonological representation of CL clusters
In our survey, we do not consider three types of consonant clusters, namely CG, /s/ + C and NN clusters. CG clusters are not true clusters in Greek; rather, they are the product of vowel loss and/or raising of /i/. NN clusters, on the other hand, are the least frequently attested in Greek (Protopapas et al. in press). In addition, they behave similarly to CC clusters. Finally, /s/C clusters are not taken into consideration because of the special character that sibilants have in cluster formation (cf. Tzakosta 2009, Tzakosta and Vis 2009a). According to Morelli (1999), /s/C clusters are part of FS clusters. However, following Tzakosta (2009) and Tzakosta and Vis (2009a), we assume that, although sibilants are fricatives, they behave differently from other fricatives. It is no coincidence that sibilants appearing in onset position are considered to be extrasyllabic segments. Moreover, we assume that this flexible and extrasyllabic role of /s/ makes /st/ the most frequently attested cluster in Greek (Protopapas et al. in press). It is important to note that, though excluded, these cluster types reinforce our present account. For a relevant discussion see Tzakosta (in press).
The major question underlying this study concerns the types of consonant clusters emerging in various aspects of a language system. More specifically, standard Greek is characterized by constraints that limit the types of clusters it allows. However, dialectal as well as developmental Greek L1 and experimental L2 data reveal that clusters not allowed in the standard language are allowed in other aspects of Greek. It will be shown that segments which are unmarked under a theory of Markedness and, therefore, expected to surface earlier and more accurately in L1 and L2 are replaced by more complex segments/sequences. Such facts suggest that a theoretical account of the segmental composition of clusters based on Feature Geometry and Underspecification is explanatorily inadequate.
There are additional questions related to the above claims. For example, if CL clusters are perfect due to SD, why do non-perfect clusters, such as CC, emerge massively in Greek dialects and language development? Why do clusters not allowed by the phonotactics of standard Greek emerge in dialectal and developmental data? These topics will be addressed in the discussion that follows.
3. Goals of the present study and working hypotheses
Based on the questions raised in the previous section, the current study has the following aims: first, to investigate the production patterns of CL and CC clusters, with the additional aim of testing whether all cluster types have the same ‘survival’ chances in their surface realization, and, second, to provide a typological account of CL and CC cluster formation in dialectal varieties of Greek, L1 acquisition and L2 learning. Our major working hypothesis is that the SonS is not adequate to explain cluster formation. Rather, cluster formation should be evaluated on the basis of three distinct scales: manner of articulation (hereafter MoA), place of articulation (hereafter PoA) and voicing. More specifically, we propose that the MoA scale controls good cluster sonority (Clements 1988, 1990), the PoA scale registers the satisfaction of the fixed place hierarchy (Prince and Smolensky 1993), while voicing refines cluster formation.
4. Data sources
For the purposes of the present study we draw on data from three corpora. The first consists of indexed dialectal data (Tzakosta 2010, Tzakosta and Karra 2011) from the major dialectal zones of Greek, namely Dialects of Northern Greece (Epirus, Meleniko, Lesvos, Pontos, Thassos, Corfu, Attica, Thessalia, Kozani, Trikala, Samothraki, Thessaloniki, Koutsovlahika) and Dialects of Southern Greece (Cyprus, Crete, Dodekanese, Ikaria). Data indexation was achieved through the detailed study of grammars, atlases and dictionaries of Greek dialects. No oral speech dialectal data were recorded. The second corpus consists of naturalistic Greek L1 developmental data from 6 monolingual children whose ages ranged from 1;07 to 3;05 years. The data were collected through a) a semi-structured picture-naming technique and b) free interaction with the children (Tzakosta 2004). The data were recorded and broadly transcribed using IPA. The third corpus consists of naturalistic Greek L2 data drawn from two groups with different L1 backgrounds: first, 10 Dutch monolingual adults aged 25–60 years, of intermediate proficiency level, and, second, 3 Romanian monolingual adults aged 27–51 years, of intermediate and advanced proficiency level. The data collection technique used was structured questionnaires (cf. Tzakosta 2006). Data from both groups were recorded and broadly transcribed using IPA. It is important to mention that our study is qualitative in nature. Therefore, we focus on the patterns of consonant clusters that emerge in Greek varieties, L1 and L2, rather than on the frequencies of their surface realization. Consequently, we do not provide statistical analyses or discuss input frequency effects.⁶
5. The current proposal and the linguistic evidence
Following the characteristics of the SonS and D, we propose three types of consonant clusters: perfect, acceptable and non-acceptable. Perfect clusters are tautosyllabic consonantal sequences which satisfy the SonS with the maximum possible D between their members. For example, stops combined with liquids, like /pl/, form perfect clusters. However, perfection in cluster formation is gradient due to D; fricatives + liquids, like /fl/, make up perfect clusters, though ‘less perfect’ ones compared to stop + liquid clusters, like /pl/. The D between the members of /fl/ is (3), while it is (4) between the members of /pl/. A necessary condition for the formation of a perfect cluster is the minimal satisfaction of all scales, i.e. with D (1). Acceptable clusters are consonantal sequences whose members mostly share the same landing point on the SonS. In the cluster /pt/, for example, both members are voiceless stops; they differ only with respect to place of articulation. In the discussion of the current proposal, we will highlight the fact that acceptable clusters need to at least (vacuously) satisfy one of the three scales. Finally, non-acceptable clusters are consonantal sequences not respecting the SonS. In other words, non-acceptable clusters are formed by consonants selected in a leftward direction on the SonS, like /θp/, whose first member is a fricative and whose second member is a stop.
Following Tzakosta (2009), and as already mentioned, we assume that different patterns in the production of CL and CC clusters are due to differences in the clusters’ perceptual load. Different perceptual loads are due to distinct phonological representations. In other words, complex phonological representations mirror ‘heavy’ perceptual loads while non-complex representations mirror ‘light’ perceptual loads, as shown in figures (2a) and (2b) above. The problem arises because the SonS sees segments as inseparable wholes, providing information only about the principles which govern cluster formation, without giving any information about why certain clusters are better- or worse-formed than others. According to the current proposal, the SonS should be evaluated separately with respect to MoA, PoA and voicing in order to assess subtle cluster differentiations.
6. For statistical analyses related to the topic of the current study the interested reader may refer to Tzakosta (2009, 2010).
Given the cluster categorization suggested above, we suggest that perfect, acceptable and non-acceptable cluster formation depends on the degree of satisfaction of the scales of manner, place and voicing, which are illustrated in figures 3, 4 and 5, respectively. Like the classical SonS, all scales need to be satisfied in a rightward manner. However, not all clusters are perfect to the same extent, since, as already mentioned, cluster perfection is gradient; the bigger the D between cluster members on all scales, the better-formed the cluster. The minimal possible D for perfect clusters is (1) and the maximal is (4). The manner scale in fig. 3 draws heavily on the classical SonS. In the data in (3), (3d) is an example of a cluster which satisfies the manner scale, though with the minimal possible D; the stop is the leftmost cluster member, while the fricative is the rightmost one. In other words, /pθ/ in (3d) is a perfect cluster on the manner scale with the minimal possible distance (1) holding between its members. (3a–c), on the other hand, are instances of clusters which vacuously satisfy the manner scale because their members land at the same point on the scale, i.e. they are both either stops or fricatives. Cluster members sharing the same manner of articulation form acceptable clusters. In (3a–b) both cluster members are stops. It is interesting that in (3c) the stop /p/ changes to the fricative /v/ and, consequently, a minimally perfect cluster becomes, due to its fricative members, an acceptable one. It is important to mention again that the difference between a minimally perfect and an acceptable cluster is the D holding between their members. In a minimally perfect cluster D should be (1), while in an acceptable cluster it is (0).
Figure 3. The manner scale
(3) a. /a.ʎó.ti.kos/ → [a.ʎó.tkus] ‘different-ADJ.MASC.NOM.SG.’
    b. /a.po.ká.to/ → [a.pká.tus] ‘underneath-ADV.’ (Meleniko, Andriotes 1989)
    c. /pe.δí/ → [vδí] ‘child-NEUT.NOM.SG.’
    d. /pi.θa.mí/ → [pθa.mí] ‘span-FEM.NOM.SG.’ (Thessalia, Tzartzanos 1909)
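The distances discussed for the manner scale can be made concrete with a small sketch. It is illustrative only: the numeric positions are inferred from the D values quoted in the text (stop to fricative = 1, fricative to liquid = 3, stop to liquid = 4), the remaining positions of Figure 3 are left out rather than guessed, and the names `MANNER_SCALE`, `manner_distance` and `manner_status` are hypothetical.

```python
# Positions on the manner scale. Only these three positions can be inferred
# from the D values quoted in the text; this is an assumption of the sketch,
# not a transcription of Figure 3.
MANNER_SCALE = {"stop": 0, "fricative": 1, "liquid": 4}


def manner_distance(first: str, second: str) -> int:
    """Signed distance D from the first to the second cluster member."""
    return MANNER_SCALE[second] - MANNER_SCALE[first]


def manner_status(d: int) -> str:
    """Label a cluster at the manner level only (hypothetical helper)."""
    if d >= 1:
        return "perfect"         # rightward satisfaction with D of at least 1
    if d == 0:
        return "acceptable"      # vacuous satisfaction: same manner
    return "non-acceptable"      # members selected in a leftward direction


# Clusters mentioned in the text, described by the manner of each member.
EXAMPLES = {
    "pl": ("stop", "liquid"),
    "fl": ("fricative", "liquid"),
    "pθ": ("stop", "fricative"),
    "pt": ("stop", "stop"),
    "θp": ("fricative", "stop"),
}

for cluster, (c1, c2) in EXAMPLES.items():
    d = manner_distance(c1, c2)
    print(f"/{cluster}/: D = {d}, {manner_status(d)} at the manner level")
```

The labels produced here concern the manner scale alone; the overall status of a cluster also depends on the place and voicing scales.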
On the other hand, the place scale depicted in fig. 4 is equivalent to the fixed place hierarchy proposed by Prince and Smolensky (1993). According to this hierarchy, velars are more marked than labials and labials are more marked than coronals. Translating the fixed place hierarchy into the place scale proposed here means that a velar or a labial needs to be the leftmost member of a cluster if a coronal is the rightmost one. Accordingly, in order to form a perfect cluster, if the second member of a cluster is a labial, the first member needs to be a velar.
Figure 4. The place scale
The data in (4) provide evidence that the place scale is satisfied, though input clusters are slightly changed in their output realization due to D. More specifically, in (4a) the cluster /θl/, which is perfect at the manner level, becomes /fl/ in order to achieve perfection at the place level as well: /θ/ and /l/ make up an acceptable cluster given that both segments land at the same point on the place scale (they are both coronals) with D (0); however, replacement of /θ/ by /f/ creates D (1) on the place scale between the members of the newly formed cluster. Similarly, in (4b), although /v/ and /l/ make up a perfect cluster whose members are separated by D (1), /v/ is replaced by /γ/ in order to achieve an even bigger D (2). Again, data such as those in (4b) illustrate that cluster perfection and acceptability are gradient. Certain clusters are better than others due to D; the bigger the D between cluster members, the better-formed a cluster at the level of perfection and acceptability. In other words, clusters with members differing even minimally with respect to MoA and/or PoA are preferred to those sharing the same MoA and/or PoA. Finally, (4c) is a mirror case of those described in (4a) and (4b); more specifically, although (4a) and (4b) illustrate instances of ‘better’ perfect clusters compared to (4c), (4c) shows that perfect clusters may be replaced by acceptable ones. Acceptable clusters are characterized by a small (minimal or even zero) D between their members on at least one of the three scales. In /fθ/, the manner and voicing scales are vacuously satisfied, whereas the place scale is minimally satisfied with D (1).⁷
(4) a. /θli.ve.rós/ → [fli.vi.rós] ‘depressing-ADJ.MASC.NOM.SG.’
    b. /vlé.po/ → [γlé.po] ‘see-1SG.PRES.’
    c. /θlí.vo.me/ → [fθí.vo.me] ‘be sad-1SG.PRES.’ (Pontos, Oikonomides 1958)
(Meleniko, Andriotes 1989)
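The place distances behind the substitutions in (4) can be checked with a short sketch. It assumes the ordering velar, labial, coronal described for the place scale, treats /f/ and /v/ as labials as the discussion of (4) does, and uses hypothetical names (`PLACE_SCALE`, `PLACE_OF`, `place_distance`) that are not part of the original account.

```python
# Place scale as described in the text: velars leftmost, coronals rightmost.
PLACE_SCALE = {"velar": 0, "labial": 1, "coronal": 2}

# Place of articulation of the segments involved in example (4).
# Treating /f/ and /v/ as labials follows the discussion of (4a) and (4b).
PLACE_OF = {"θ": "coronal", "l": "coronal", "f": "labial", "v": "labial", "γ": "velar"}


def place_distance(cluster: str) -> int:
    """Signed place distance D for a two-member cluster such as 'fl'."""
    first, second = cluster
    return PLACE_SCALE[PLACE_OF[second]] - PLACE_SCALE[PLACE_OF[first]]


# Input cluster and attested output cluster for (4a), (4b) and (4c).
for source, target in [("θl", "fl"), ("vl", "γl"), ("θl", "fθ")]:
    print(
        f"/{source}/ (D = {place_distance(source)}) -> "
        f"/{target}/ (D = {place_distance(target)})"
    )
```

Under these assumptions the outputs in (4a) and (4b) increase D on the place scale, and /fθ/ in (4c) has a place distance of 1, in line with the values stated in the text.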
Finally, the voicing scale in fig. 5 is the least complex scale, given that segments may be either [–voiced] or [+voiced]. According to this scale, a perfect cluster is a cluster whose first member is [–voiced] and whose second member is [+voiced]. The converse voicing order is responsible for the formation of non-acceptable clusters. Consonants sharing the same voicing characteristics, i.e. both voiceless or both voiced, form acceptable clusters.⁸ Voicing has primarily been discussed in connection with voicing and devoicing alternations emerging mostly in Germanic languages (cf. Oostendorp 2004, 2006, among others) and assimilatory processes (cf. Al-Ahmadi Al-Harbi to appear, Arvaniti 1999, Baroni 1997, Grijzenhout 2000). Such phenomena have been accounted for mostly within OT by means of the *NC, ND, *ND constraints which allow or forbid NC or ND sequences to emerge (cf. Borowsky 2000, Grijzenhout 2000, Lombardi 1995, 1999, Pater 1999).⁹ In order to establish a voicing scale in our proposal, the motivating question was the following: if voice assimilation applies to non-adjacent consonants and within consonant clusters and, at the same time, [–voi] + [+voi] clusters like /kδ/ are acceptable and attested in the norm and dialectal data, why are [+voi] + [–voi] clusters, like /δk/, non-acceptable and, actually, non-emergent in any aspect of Greek? The data in (5a–c) illustrate the rightward satisfaction of the voicing scale; the first member of the cluster is voiceless while the second is voiced. Data (5d–e) highlight the creation of clusters which share the same voicing characteristics. Finally, the data in (5f–i) pinpoint cases of regressive voicing and devoicing assimilation; it is interesting that both voiced and voiceless segments may drive assimilation, as shown in (5f–g) and (5h–i), respectively. We assume that in languages like Greek, in which both voiced and voiceless segments are allowed in all word positions (which means that neither voicing nor devoicing is preferred), both voicing and devoicing assimilation are allowed. All clusters in (5) are acceptable because they all vacuously satisfy at least one scale; the exception is (5c), which is perfect because it minimally satisfies all scales. To be perfect, the remaining clusters in (5) would need to at least minimally satisfy all scales.
Figure 5. The voicing scale
7. At this point we need to make a crucial clarification based on a comment by an anonymous reader, to whom we are thankful. Our account does not rely on the Optimality Theory (hereafter OT) framework. The terms ‘perfect’, ‘acceptable’, ‘non-acceptable’ and ‘constraints’ mirror general well-formedness and not any OT principles. Therefore, we do not find it essential to cite extended OT studies, except for some major ones which are theoretically related to our survey.
8. Cf. also Malikouti-Drachman (1987, 2001).
9. In OT, the use of * in the formation of a constraint disallows the emergence of the *-marked structure.
(5) a. /ti.γá.ni/ → [tγán] ‘frying pan-NEUTR.NOM.SG.’ (Samothraki, Katsanis 1996)
    b. /ku.bá.ros/ → [kba.ré.ls] ‘bestman-MASC.NOM.SG.’
    c. /ku.δú.ni/ → [kδu.nél] ‘bell-NEUT.NOM.SG.’ (Thassos, Tombaidis 1967)
    d. /ku.fós/ → [kfós] ‘deaf-ADJ.MASC.NOM.SG.’ (Thessalia, Tzartzanos 1909)
    e. /tra.γu.δá.i/ → [tra.γδá.i] ‘sing-3SG.PRES.’
    f. /skou.dó/ → [gdó] ‘push-1SG.PRES.’ (Samothraki, Katsanis 1996)
    g. /ku.vá.ri/ → [gvár] ‘ball-NEUT.NOM.SG.’ (Kozani, Margariti-Roga 1989)¹⁰
    h. /δi.cé.li/ → [θcél] ‘grub hoe-NEUT.NOM.SG.’
    i. /po.di.kós/ → [pu.tkós] ‘mouse-MASC.NOM.SG.’ (Thassos, Tombaidis 1967)
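The voicing scale can be sketched in the same way as a binary scale. The snippet below is only an illustration: the set of voiced obstruents is limited to the segments needed for the clusters discussed around (5), and `VOICED`, `voicing_distance` and `voicing_status` are hypothetical names.

```python
# Voiced obstruents needed for the clusters below; every other segment in
# this sketch is treated as voiceless. This is an illustrative assumption.
VOICED = set("bdgvδγ")


def voicing_distance(first: str, second: str) -> int:
    """Signed voicing distance: +1 for [-voi][+voi], 0 for equal, -1 for [+voi][-voi]."""
    return int(second in VOICED) - int(first in VOICED)


def voicing_status(d: int) -> str:
    """Label a cluster at the voicing level only (hypothetical helper)."""
    if d > 0:
        return "perfect"
    if d == 0:
        return "acceptable"
    return "non-acceptable"


# Output clusters from (5), plus the unattested /δk/ discussed in the text.
for cluster in ["kδ", "tγ", "kb", "kf", "γδ", "gd", "gv", "tk", "δk"]:
    d = voicing_distance(cluster[0], cluster[1])
    print(f"/{cluster}/: D = {d}, {voicing_status(d)} at the voicing level")
```

On this reading only the [+voi] + [–voi] order, as in /δk/, yields a negative distance, which matches the claim that such clusters do not surface.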
A potential argument against the existence of the voicing scale would be that the latter can be considered part of the SonS or the manner scale, especially given that nasality and liquidity embody voicing. However, it is difficult to account for cluster-internal ‘voiceness’ without a distinct scale, particularly given that, if the voicing scale is not satisfied, clusters are not acceptable and, as a result, are subject to cluster assimilation. In other words, the voicing scale is the scale that always needs to be satisfied for a cluster to be at least acceptable.¹¹
To sum up, according to the three-scales model of cluster well-formedness, clusters are perfect if they satisfy all scales at least with minimal D (1). Clusters are acceptable under certain conditions: a) if they vacuously¹² satisfy all scales, b) if they violate one of the scales of manner or place and (vacuously) satisfy the other, or c) if they violate both the MoA and PoA scales but at least vacuously satisfy the voicing scale. Non-acceptable clusters emerge if a) all scales are violated, or b) the voicing scale is violated even if the manner and place scales are at least vacuously satisfied.
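The summary conditions can be combined into one decision procedure. The sketch below is one possible literal reading of them, not the author's own formalization: a cluster counts as non-acceptable whenever the voicing scale is violated, as perfect when all three distances are at least 1, and as acceptable otherwise. The feature table, the numeric scale positions and the names `Segment`, `SEGMENTS`, `distances` and `classify` are assumptions made for illustration. The overall verdict computed here is distinct from the per-scale labels used in the text (for instance ‘perfect at the manner level’).

```python
from typing import NamedTuple

# Scale positions. Manner values are those implied by the D values quoted in
# the text; place follows velar < labial < coronal; voicing is binary.
STOP, FRICATIVE, LIQUID = 0, 1, 4
VELAR, LABIAL, CORONAL = 0, 1, 2
VOICELESS, VOICED = 0, 1


class Segment(NamedTuple):
    manner: int
    place: int
    voicing: int


# Illustrative feature table for a handful of segments (an assumption).
SEGMENTS = {
    "p": Segment(STOP, LABIAL, VOICELESS),
    "b": Segment(STOP, LABIAL, VOICED),
    "t": Segment(STOP, CORONAL, VOICELESS),
    "k": Segment(STOP, VELAR, VOICELESS),
    "δ": Segment(FRICATIVE, CORONAL, VOICED),
    "l": Segment(LIQUID, CORONAL, VOICED),
}


def distances(cluster: str):
    """Return the signed (manner, place, voicing) distances of a CC cluster."""
    first, second = (SEGMENTS[c] for c in cluster)
    return (
        second.manner - first.manner,
        second.place - first.place,
        second.voicing - first.voicing,
    )


def classify(cluster: str) -> str:
    """Overall status under one reading of the summary conditions."""
    manner_d, place_d, voicing_d = distances(cluster)
    if voicing_d < 0:
        return "non-acceptable"   # voicing violated (covers 'all scales violated')
    if min(manner_d, place_d, voicing_d) >= 1:
        return "perfect"          # every scale satisfied with D of at least 1
    return "acceptable"           # some scale vacuous or violated, voicing intact


for cluster in ["pl", "kδ", "pt", "kb", "tk", "δk", "bt"]:
    print(f"/{cluster}/: D = {distances(cluster)} -> {classify(cluster)}")
```

With these assumptions /pl/ and /kδ/ come out as perfect, /pt/, /kb/ and /tk/ as acceptable, and /δk/ and /bt/ as non-acceptable.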
10. Cf. Blaho and Bye (2006) for equivalent cross-linguistic results.
11. For the conditions under which the voicing scale may be violated see Tzakosta (2009b).
12. Scale vacuous satisfaction is characteristic only of acceptable clusters.
There is still another important question to be addressed: why are data such as those in (6) attested in different aspects of Greek? More specifically, why are acceptable clusters preferred to perfect ones? First of all, all data in (6) except (6e) are the result of vowel loss. Apparently, the combination of the newly adjacent consonants is valid on the basis of the three scales. Therefore, acceptable clusters emerge. However, it is difficult for the present proposal to account for cases such as (6e), in which a perfect cluster is replaced by an acceptable one. We assume that (6e) is rather a case of cluster misperception which has become established in the dialect over time. This issue is apparently still open for discussion.
(6) a. /ku.fá.θi.ce/ → [kfá.θce] ‘become deaf-3SG.PAST’ (Thessalia, Tzartzanos 1909)
    b. /po.di.kós/ → [pu.tkós] ‘mouse-MASC.NOM.SG.’
    c. /ti.γá.ni/ → [tγán] ‘frying pan-NEUTR.NOM.SG.’ (Samothraki, Katsanis 1996)
    d. /ku.fós/ → [kfós] ‘deaf-ADJ.MASC.NOM.SG.’ (Thessalia, Tzartzanos 1909)
    e. /θlí.vo.me/ → [fθí.vo.me] ‘be sad-1SG.PRES.’ (Pontos, Oikonomides 1958)
    f. /pi.δó/ → [bδó] ‘jump-1SG.PRES.’
    g. /pe.δí/ → [vδí] ‘child-NEUT.NOM.SG.’
    h. /tu.fé.ci/ → [tfé.ci] ‘gun-NEUT.NOM.SG.’ (Thessalia, Tzartzanos 1909)
We will now provide representative developmental data which further support our argumentation. It has been argued that CC sequences undergo various repair strategies such as deletion, epenthesis, fusion, stopping, and various types of assimilatory processes, whereas CL clusters are correctly produced in Greek L1 and cross-linguistically (cf. Tzakosta 2006 and more cross-linguistic references therein). CL clusters may be produced even in cases where they are not contained in the target form, as illustrated in (7c). Having knowledge of, but not yet having acquired, the manner scale, Greek-speaking children have multiple outputs for one input form. As exemplified in (7a), together with the /ft/ cluster, which is non-acceptable at the level of manner, acceptable /pt/ is also realized. Simultaneously, the sibilant /s/ may be replaced by /θ/ (see examples (7b) and (7d)), the former occupying the same rank on the manner scale as the latter. (7b) and (7d) violate both the manner and place scales, but due to the vacuous satisfaction of the voicing scale the output clusters are acceptable.
(7) a. /a.ftó/ → [a.ptó], [a.ftó] ‘this-DEM.PR.’ (B: 1;11.27)
    b. /sxo.lí.o/ → [θxo.lí.o] ‘school-NEUT.NOM.SG.’ (D: 2;07.06)
    c. /o.bré.la/ → [ku.blé.la] ‘umbrella-FEM.NOM.SG.’ (Me: 1;11.22)
    d. /pá.sxa/ → [pá.θka] ‘Easter-NEUT.NOM.SG.’ (B.M.: 2;09.25)
Dutch and Romanian learners of Greek exhibit equivalent data, as exemplified in (8) and (9), respectively.
(8) a. /fo.to.γra.fí.a/ → [fo.to.xra.fí.a] ‘photo-FEM.NOM.SG.’ (S1)
    b. /gri.ɲá.ris/ → [kri.ni.á.ris] ‘nasty-ADJ.MASC.NOM.SG.’ (S2)
    c. /e.vδo.má.δa/ → [e.vdo.má.da] ‘week-FEM.NOM.SG.’ (S3)
    d. /θo.ri.któ/ → [fri.któ] ‘tanker-NEUT.NOM.SG.’ (S2)
    e. /u.ra.nós/ → [i.γra.nós] ‘sky-MASC.NOM.SG.’ (S3)
    f. /e.po.çí/ → [e.pó.ksi] ‘season-FEM.NOM.SG.’ (S4)
    g. /cí.ni.si/ → [klí.si] ‘circulation-FEM.NOM.SG.’ (S5)
(9) a. /fθó.ri.o/ → [fto.rá] ‘fluorine-NEUT.NOM.SG.’ (S3)
    b. /e.vδo.má.δa/ → [e.vdo.má.da] ‘week-FEM.NOM.SG.’ (S1)
    c. /a.vγó/ → [a.vgó] ‘egg-NEUT.NOM.SG.’ (S2)
    d. /xte.ní.zo/ → [kte.ní.zo] ‘comb-1SG.PRES.’ (S1)
    e. /γδí.no/ → [gdí.no] ‘denude-1SG.PRES.’ (S2)
We assume that the preference for acceptable clusters is an indication of the freer cluster formation mechanisms characteristic of Greek dialects but also of other aspects of the language; dialects, especially those of the northern dialectal zone, are less conservative regarding cluster synthesis given that clusters may appear in coda position due to the application of phonological rules according to which high vowel loss and/or raising apply in unstressed syllables (Newton 1972). This allows various acceptable clusters to appear extensively in the surface realization. In acceptable clusters, consonantal combinations are freer than those of a perfect cluster given that D (0) allows for a high number of consonantal sequences to emerge. Therefore, the number of acceptable clusters is higher than that of perfect clusters. Cluster formation gradience is illustrated in tables 1–3. Table 1 illustrates the segmental combinations which result in gradience in cluster formation at the level of manner of articulation. Table 2 displays gradience at the level of place of articulation, while table 3 presents gradience at the level of voicing.
Table 1. Gradience in cluster formation (MoA)
Cluster types (columns): Perfect, Acceptable, Non-Acceptable
Clusters (rows): Stop + L, Fricative + L, Stop + Stop, Fric + Fric, Stop + Fric, Fric + Stop, Stop + Affr, Affr + Stop, Fric + Affr, Affr + Fric, Affr + Affr
Table 2. Gradience in cluster formation (PoA)
Cluster types (columns): Perfect, Acceptable, Non-Acceptable
Clusters (rows): lab + lab, lab + cor, lab + vel, cor + cor, cor + lab, cor + vel, vel + vel, vel + cor, vel + lab
Table 3. Gradience in cluster formation (voicing)
Cluster type: status (example)
[+voi] + [+voi]: Acceptable (/gδ/)
[+voi] + [–voi]: Non-Acceptable (/δk/)
[–voi] + [–voi]: Acceptable
[–voi] + [+voi]: Perfect (/kδ/)
Gradience in cluster well-formedness is depicted in schema 1, where cluster types appear in hierarchical order. Schema 1 is a combined typological synopsis of perfect, acceptable and non-acceptable clusters with respect to all three dimensions of manner, place and voicing. PC1 clusters are the highest in the hierarchy and are the best of the perfect clusters; all scales are respected with the biggest possible D between cluster members. PC2 clusters are perfect sequences with smaller D between cluster members compared to PC1 sequences. In the same spirit, AC1 and AC2 clusters are acceptable combinations with small D and relative satisfaction and/or violation of the scales. Finally, non-acceptable N-AC clusters violating all scales occupy the lowest level of the schema. A major contribution of the scales of manner, place and voicing is that they shape tautosyllabic clusters and redefine heterosyllabic sequences in a new fashion. More specifically, as already mentioned in section 2 and exemplified in the data in (1), which are repeated as (10) here for ease of reading, all data in (10) vacuously satisfy the voicing scale but violate both the manner and place scales. Given that this is one of the conditions for forming acceptable clusters, all data in (10) form tautosyllabic sequences according to the three-scales model of cluster well-formedness. However, such a claim runs contrary to the fact that the sequences in (10) are considered to be heterosyllabic (cf. Kappa 1995, Nespor 1997). The model proposed here establishes new conditions for defining tautosyllabicity and heterosyllabicity. Our model implies that only non-acceptable clusters constitute heterosyllabic sequences. Therefore, because all sequences in (10) are acceptable, they are considered to be tautosyllabic. Some preliminary psycholinguistic evidence supporting these claims stems from Tzakosta and Vis (2009a, 2009b, 2009c); however, more psycholinguistic experimentation needs to take place.
Schema 1. A schematic representation of perfect, acceptable and non-acceptable clusters
(10) a. [pér.no] ‘take-1SG.PRES.’
     b. [pal.tó] ‘overcoat-NEUT.NOM.SG.’
     c. [ár.ma] ‘chariot-NEUT.NOM.SG.’
     d. [ál.mi] ‘brine-FEM.NOM.SG.’
     e. [án.θro.pos] ‘man-MASC.NOM.SG.’
6. Conclusions and future research
In this study, we presented a typological account of CL and CC clusters based on their production patterns in dialectal varieties of Greek as well as Greek L1 and L2 data. Our proposal is that the SonS and SD are not sufficient means to account for the acceptability and/or perfection of consonant clusters. We propose that clusters are categorized as i) perfect, ii) acceptable and iii) non-acceptable on the basis of the three distinct scales of manner, place and voicing introduced here. All scales are satisfied in a rightward manner. Clusters are perfect under one major condition: they must minimally satisfy all scales. On the other hand, clusters are acceptable under three conditions: a) if they vacuously satisfy all scales, b) if they violate one of the scales of manner or place and (vacuously) satisfy the other but always (and at least vacuously) satisfy the voicing scale, c) if the voicing scale is at least vacuously satisfied but both scales of manner and place are violated. Vacuous satisfaction is characteristic of acceptable clusters but never of perfect clusters. Non-acceptable clusters emerge if a) all scales are violated, or b) the voicing scale is violated even if the manner and place scales are at least vacuously satisfied. Acceptable clusters are mainly CC clusters and emerge massively in various language aspects because they are flexible and predicted by the typology. Non-acceptable clusters, on the other hand, are rarely and exceptionally attested because they are not predicted by the typology. Our assumption is that in many cases non-acceptable clusters are the result of cluster misperception and erroneous production. They are the most marked in theory and the least attested in empirical data. Cluster perfection and/or acceptability are not absolute notions; rather, they are gradient.
Greek dialects, especially those of the northern dialectal zone, are less conservative regarding cluster synthesis given that clusters may appear even in coda position due to the application of phonological rules according to which high vowel loss and raising apply in unstressed syllables (Newton 1972: 196 ff.).
A major advantage of the current proposal is that the establishment of three distinct scales driving cluster formation contributes to a clear reshaping and redefinition of the phonotactic constraints of a language. More specifically, the definition of non-acceptable clusters gives no other option than to consider such clusters as heterosyllabic. However, there is more psycholinguistic work to be done on this topic in order to test the practical validity of such a claim. The present account refines Morelli’s (1999) proposal according to which clusters should be evaluated on the basis of two scales, manner and place. The introduction of a voicing scale is imposed by the fact that no cluster can be acceptable if the voicing scale is violated. Given the parallel (vacuous) satisfaction of the scales, the present proposal succeeds in accounting for FS clusters which are not predicted by the classical sonority scale, though they emerge massively in Greek and are considered to be the most well-formed under Morelli’s (1999) account. This fact is related to some relevant interesting issues; first, there is a small set of FS clusters acceptable in the present model, like /fp/, /θf/, and /θt/, though ruled out by OCP given their adjacency on the manner and/or place scales. These clusters are still expected to emerge; therefore, more data need to be tested. Another topic is the ‘fate’ of non-acceptable clusters; it would be interesting to test the extent to which they are prone to ‘phonetic’ insertion or other repair strategies in order to ‘survive’. Such issues are left for future research.
References
Al-Ahmadi Al-Harbi To appear English voicing assimilation: Input-to-output [±voice] and Output-to-Input [±voice]. Journal of King Abdulaziz University 13.
Andriotes, Panagiotes 1989 The Dialect of Meleniko [Το γλωσσικό ιδίωμα του Μελένικου] [in Greek]. Thessaloniki: Publications of the Society of Macedonian Studies.
Arvaniti, Amalia 1999 Greek voiced stops: Prosody, syllabification, underlying representations or selection of the optimal? Proceedings of the 3rd International Conference of Greek Linguistics. 883–390. Athens: Ellinika Grammata. Baroni, Marco 1997 The representation of prefixed forms in the Italian lexicon: Evidence from the distribution of intervocalic [s] and [z] in northern Italian. M.A. Thesis, Department of Linguistics, UCLA.
Manner, place and voice interactions in Greek cluster phonotactics
113
Blaho, Sylvia and Patrick Bye 2006 Cryptosonorants and the misapplication of voicing assimilation in biaspectual phonology. ROA-759. Borowski, Toni 2000 Word faithfulness and the direction of assimilation. The Linguistic Review 17: 1–28. Clements, Nick G. 1988 The Role of the Sonority Cycle in Core Syllabification. Working papers of the Cornell Phonetics Laboratory 2: 1–68. Clements, Nick G. 1990 The role of the sonority cycle in core syllabification. In John Kingston and Mary E. Beckman (eds.), Papers in laboratory phonology I: between the grammar and physics of speech. 283–333. Cambridge: Cambridge University Press. Clements, Nick G. 1992 The sonority cycle and syllable organization. In Wolfgang U. Dressler, Hans C. Luschuetzky, Oskar E. Pfeiffer and John Rennison (eds.), Phonologica 1988. Proceedings of the 6th International Phonology Meeting. 63–76. Cambridge: Cambridge University Press. Drachman, Gaberell 1989 A remark on Greek clusters. Ms. Department of Linguistics, University of Salzburg. Drachman Gaberell 1990 Onset clusters in Greek. In Joao Mascaró and Marina Nespor (eds), Grammar in progress. Glow Essays for Henk van Riemsdijk. 113– 123. Dordrecht: Foris. Ewen, Colin and Harry van der Hulst 2001 The Phonological Structure of Words. An Introduction. Cambridge: Cambridge University Press. Grijzenhout, Janet 2000 Voicing and devoicing in English, German and Dutch: Evidence for domain-specific identity constraints. Working Papers Theorie des Lexikons 116. Heinrich-Heine-Universität Düsseldorf. Grijzenhout, Janet and Martin Kraemer 2000 Final devoicing and voice assimilation in Dutch derivation and cliticization. In Barbara Stiebels and Dieter Wunderlich (eds.), Lexicon in Focus. Studia grammatica 45: 55–82. Berlin: Akademie Verlag. Hayes, Bruce 1995 Metrical Stress Theory: Principles and Case Studies. Chicago: University of Chicago Press. Hulst van der, Harry 1984 Syllable structure and stress in Dutch. Ph.D. dissertation, University of Leiden.
114
Marina Tzakosta
Jany, Carmen, Matthew Gordon, Carlos M. Nash, Nobutaka Takara 2007 How universal is the sonority hierarchy? A cross-linguistic acoustic study. Proceedings of the 16th International Conference of Phonetic Sciences: 1401–1404. Jespersen, Otto 1904 Lehrbuch der Phonetik. Teubner: Leipzig. Kappa, Ioanna 1995 Silbenphonologie im Deutschen und Neugriechischen. Ph.D. dissertation, University of Salzburg. Katsanis, Nikolaos A. 1996 The Dialect of Samothrace [Το Γλωσσικό Ιδίωμα της Σαμοθράκης] [in Greek]. Thessaloniki. Κατσάνης, Νikolaos Α. 1983 Final consonants – consonant clusters of the dialect of Drimos: An attempt of their systemization [Τελικά Σύμφωνα – Συμφωνικά Συμπλέγματα του Ιδιώματος Δρυμού: Απόπειρα Συστηματοποίησής τους] [in Greek]. Proceedings of the 13th Annual Meeting of Greek Linguistics 8: 33–43. Lass, Roger 1984 Phonology: An Introduction to Basic Concepts. Cambridge: Cambridge University Press. Levin, J. 1985 A metrical theory of syllabicity. PhD. Massachusetts Institute of Technology. Lombardi, Linda 1995 Laryngeal features and privativity. The Linguistic Review 12: 35–59. Lombardi, Linda 1999 Positional faithfulness and voicing assimilation in optimality theory. Natural Language and Linguistic Theory 17: 267–302. Malikouti-Drachman, Angeliki 1987 Syllables in modern Greek. In Wolfgang U. Dressler, Hans Luschützky, Oskar E. Pfeiffer and John R. Rennison (eds.), Phonologica 1984. Proceedings of the Fifth International Phonology Meeting. 181–187. Cambridge: Cambridge University Press. Malikouti-Drachman, Angeliki and Gabriel Drachman 1990 Phonological government and projection: Assimilations, dissimilations [Φωνολογική Κυβέρνηση και Προβολή: Aφομοιώσεις, Ανομοιώσεις] [in Greek]. Working Papers in Greek Grammar. 1– 20. University of Salzburg. Malikouti-Drachman, Angeliki 2001 Greek Phonology: A Contemporary Perspective. Journal of Greek Linguistics 2: 187–243. Μargariti-Roga, Μarianna 1989 Weaving and clothing terms of Katafigion (prefecture of Kozani). 
[ Όροι Υφαντικής και Ενδυμασίας Καταφυγίου (νομός Κοζάνης)]
Manner, place and voice interactions in Greek cluster phonotactics
[In Greek]. Greek Dialectology 2: Thessaloniki: Kiriakidis Public. 173–180. Morelli, Frida 1999 The phonotactics and phonology of obstruent clusters in optimality theory. Ph.D. dissertation, University of Maryland at College Park. Nespor, Marina 1997 Phonology [in Greek]. Athens: Patakis. Newton, Brian 1972
The Generative Interpretation of a Dialect. A Study of Modern Greek Phonology. Cambridge: Cambridge University Press. Oikonomides, Demosthenes I. 1958 Grammar of the dialect of Pontos [Γραμματική της Ελληνικής Διαλέκτου του Πόντου] [In Greek] Dictionary Bulletin 1: Athens: Academy of Athens. Oostendorp van, Marc 2004 An exception to final devoicing. Rutgers Optimality Archives-656. Oostendorp van, Marc 2006 Incomplete devoicing in formal phonology. Ms. Amsterdam: Meertens Institute. Pater, Joe 1999 Austronesian nasal substitution and other NC effects. In Rene Kager, Harry van der Hulst and Wim Zonneveld (eds.), The ProsodyMorphology Interface. 310–343. Cambridge: Cambridge University Press. Prince, Alan and Paul Smolensky 1993 Optimality theory: Constraint interaction in generative grammar. Ms. Rutgers University, New Brunswick, N.J. and University of Colorado, Boulder. Protopapas, Athanasios, Marina Tzakosta, Aimilios Chalamandaris and Pirros Tsiakoulis In press IPLR: an online resource for Greek word-level and sublexical information. Language Resources and Evaluation. Selkirk Elisabeth O. 1984 On the major class features and syllable theory. In Mark Aronoff and R. Oerle (eds), Language sound structure. 107–136. Cambridge, MA.: MIT Press. Sievers, Eduard 1901 Grundzüge der Phonetik zur Einführung in das Studium der Lautlehre der Indogermanischen Sprachen. Breitkopf und Härtel: Leipzig. Steriade, Donca 1982 Greek prosodies and the nature of syllabification. Ph.D. dissertation. Massachusetts Institute of Technology. Tzakosta, Marina 2004 Multiple parallel grammars in the acquisition of stress in Greek L1. Ph.D. dissertation, LOT Dissertation Series 93, Leiden: ULCL/HIL.
Tzakosta, Marina 2006 Developmental paths in L1 and L2 phonological acquisition: consonant clusters in the speech of native speakers and Turkish and Dutch learners of Greek. In Andrianna Belletti, Elisa Bennati, Cristiano Chesi, Elisa di Domenico and Ida Ferrari (eds.), Language Acquisition and Development: Proceedings of GALA 2005, Generative Approaches in Language Acquisition. 536–549. Cambridge: Cambridge Scholars Press. Tzakosta, Marina 2009 Asymmetries in /s/ cluster production and their implications for language learning and language teaching. Proceedings of the 18th International Symposium of Theoretical and Applied Linguistics. 365–373. Department of English Language and Linguistics: Aristotle University of Thessaloniki. Tzakosta, Marina 2010 “The importance of being voiced”: cluster formation in dialectal variants of Greek. In Angela Ralli, Brian Joseph, Marc Janse and Athanasios Karasimos (eds.), E-proceedings of the 4th international Conference of Modern Greek dialect and Linguistic Theory. 213– 223. University of Patras. http://www.philology.upatras.gr/LMGD/ el/index.html (ISSN: 1792–3743). Tzakosta, Marina In press Consonantal interactions in dialectal variants of Greek: a typological approach of three-members consonant clusters. Greek Dialectology 6. Tzakosta, Marina and Athanasia Karra 2011 A typological and comparative account of CL and CC clusters in Greek dialects. In Marc Janse, Brian Joseph, Angela Ralli and Spyros Armosti (eds.), Studies in Modern Greek Dialects and Linguistic Theory I. 95–105. Nicosia: Kykkos Cultural Research Centre. Tzakosta, Marina and Jeroen Vis 2009a Αsymmetries of consonant sequences in perception and production: affricates vs. /s/ clusters. In Anastasios Tsangalidis (ed.), Selected Papers from the 18th International Symposium on Theoretical and Applied Linguistics. 375–384. Department of English Language and Linguistics: Aristotle University of Thessaloniki: Monochromia. 
Tzakosta, Marina and Jeroen Vis 2009b Perception and production asymmetries in Greek: evidence from the phonological representation of CC clusters in child and adult speech. Greek Linguistics 29: 553–565. Tzakosta, Marina and Jeroen Vis 2009c Phonological representations of consonant sequences: the case of affricates vs. ‘true’ clusters. In Georgios K. Giannakis, Mary Baltazani, Georgios I. Xydopoulos and Tassos Tsaggalidis (eds.), E-proceedings of the 8th International Conference of Greek Linguistics
(8ICGL). 558–573. Department of Greek Philology: University of Ioannina. (ISBN: 978-960-233-195-8). http://www.linguist-uoi.gr/ cd_web/arxiki_en.htm) Tzartanos, Achilleas 1909 On the modern dialect of Thessaly [Περί της Συγχρόνου Θεσσαλικής Διαλέκτου]. Reprinted as an appendix in Greek Dialectology 1 (1989): Thessaloniki: Kiriakidis Publ. Tobaidis, Dimitrios 1967 The dialect of Thassos [Το γλωσσικό ιδίωμα της Θάσου] [in Greek]. Ph.D. dissertation, University of Thessaloniki. Vennemann, Theo 1972 On the Theory of syllabic phonology. Linguistische Berichte 18: 1– 18. Vennemann, Theo 1978 Universal syllabic phonology. Theoretical Linguistics 12: 85–129. Vennemann, Theo 1988 Preference Laws for Syllable Structure. Berlin: Mouton de Gruyter.
Consonant clusters in four Samoyedic languages
Zsuzsa Várnai
Abstract
The purpose of this paper is to present a description of the clusters of four Samoyedic languages: Nenets (Tundra), Enets, Nganasan and Selkup (Taz dialect), which are endangered Uralic languages spoken in North-Siberia in Russia. In this paper I will give an account of the syllable types attested in root lexemes and discuss the constraints that apply to the constituents of the syllable in the four languages examined. Despite the fact that these languages are historically and geographically very close to each other, they have different syllable structures, and they choose different processes to adapt borrowed clusters from Russian. I will focus on the similarities and differences between these languages with respect to the processes affecting clusters in Russian loanwords. Russian is counted as having complex syllable structure, very different from the Samoyedic languages. After a brief description of the languages in question I define a syllable template and the representation of the syllable for each language. Then I specify the possible complexity of onset and coda, and I show what types of sequences exist in these languages and what types do not. Then I discuss what happens in these languages to relatively old Russian loanwords.
1. Introduction
A subfield of phonological research focuses on the syllable. It records the rules of syllable structure in individual languages and studies the factors that define syllable shapes, as well as the ways in which the elements of syllables are connected to each other. In this paper I will give an account of the syllable types attested in root lexemes and discuss the constraints that apply to the constituents of the syllable in four Samoyedic languages. I will focus on the similarities and differences between these languages with respect to the processes affecting clusters in Russian loanwords. First, after a brief description of the languages in question and their sociolinguistic situation, I define a syllable template and the representation of the syllable for each language. Then I specify the possible complexity of onset and coda, and I show what types of sequences exist in these languages and what types do not. Blevins (1995) proposes binary parameters to account for language-particular variation in syllable typology. I present these parameter settings, too; then I discuss what happens in these languages to relatively old
Russian loanwords. Russian has many clusters, not only in word-medial position across syllable boundaries, but also in onset position at the beginning of the word. My research questions are the following: How are Russian consonant clusters treated in Samoyedic? Which types of clusters are retained, and which ones are simplified in the course of borrowing from Russian? What happens to branching onsets in Samoyedic languages? Which way do they choose to adapt these clusters? Do they all choose the same way or different ways? Which types of sequences undergo simplification processes, and what processes do they undergo?
1.1. Sources
The purpose of this paper is to present a description of the clusters of Samoyedic languages, esp. of Nenets (Tundra), Enets, Nganasan and Selkup (Taz dialect), which are endangered Uralic languages spoken in North-Siberia in Russia (see map in Fig. 1). They have not yet been thoroughly investigated in the phonological literature. Despite the fact that the languages in question are historically and geographically very close to each other, they have different syllable
Figure 1. Geographical distribution of languages in Western Siberia
structures, and they choose different processes to adapt borrowed clusters from Russian. Russian is counted as having complex syllable structure (see WALS 2005), very different from the Samoyedic languages. It is remarkable that different repair mechanisms are found for the same Russian cluster type.
1.2. The languages under investigation: general description, demography, language status
Here we summarize the linguistic situation of the languages discussed above: we present geographical location, number of speakers, and dialectal distribution (based on Sipos et al. 2007).
NENETS, YURAK-SAMOYED
Territory / Region: Russia, Northeast Europe and Northwest Siberia; in the Tyumen Region: Yamal-Nenets, Khanty-Mansi Autonomous Area; Krasnoyarsk: Tajmyr Municipal District of Krasnoyarsk Region; in the Arkhangelsk Region: Nenets Autonomous Area
Dialect: Tundra and Forest Nenets
Ethnic population: 41,302
Total number of speakers: 29,052
ENETS, YENISEY SAMOYED
Territory / Region: Russia, Northern Siberia; Tajmyr Municipal District of Krasnoyarsk Region; in villages: Potapovo, Vorontsovo and Tukhard (close, nomadic in tundra); in city: Dudinka
Dialect: Tundra and Forest Enets
Ethnic population: 237
Total number of speakers: 84
NGANASAN, TAWGY SAMOYED
Territory / Region: Russia, Northern Siberia; Tajmyr Municipal District of Krasnoyarsk Region; in villages: Ust-Avam, Volochanka, Chatanga, Novaja, and in Dudinka
Dialect: Avam, Vadej
Ethnic population: 834
Total number of speakers: 391
SELKUP, OSTYAK SAMOYED
Territory / Region: Russia, Western Siberia; in the Tomsk Region, the Krasnoselkup and the Pur Districts of the Yamal-Nenets Autonomous Area, and the Turukhansk District of the Krasnoyarsk Region
Dialect: Taz-Turukhan; Tym; Narym; Ob; Ket
Ethnic population: 4249
Total number of speakers: 1230
In terms of the number of speakers, these undoubtedly count as small communities that have no autonomy. Nowadays they all live in autonomous areas where they form minorities. With the exception of the Nenets, these peoples add up to 1–2% of the inhabitants in their own autonomous areas. Moreover, the Selkup people live in three autonomous areas, so they are geographically dispersed and show considerable dialectal differences. The data of the 2002 census of Russia (www.perepis2002.ru) are unreliable in relation to knowledge of languages. The question concerning mother tongue was deleted, thus the data should be taken as estimates. The data of the 1989 and 2002 censuses also show that the proportion of those who speak their language decreased dramatically between 1989 (the date of the previous census) and 2002: Nenets: –2%, Selkup: –10%, Nganasan: –23%, Enets: –3%. According to our research the actual number of speakers is lower, and it depends on the definition of native-speaker proficiency (personal communication with other fieldworkers Valentin Goussev, Olesya Khanina, Andrej Shluinski, Florian Siegl, Sándor Szeverényi, Beáta Wagner-Nagy). The indigenous people of this territory are compelled to leave their homeland and move to villages and towns, giving up not only their traditional culture but also their language. The present-day linguistic situation is the following: the people still speaking these four Samoyedic languages belong to the oldest age groups.
That is to say, most members of these ethnic groups are not balanced bilingual speakers but their first language is Russian, while the use of the language of their parents and grandparents is strongly restricted, and intergenerational transmission has practically stopped. On the whole, these days the Siberian Uralic languages are used only in home contact, while Russian is spoken in every other domain, given that the Russian language is excellently known and spoken by almost 100% of the people belonging to the ethnic groups in question. Even if the parents register their children as indigenous, they do not find it important or preferable to teach them their native language as they are convinced that their children will need the Russian language for prosperity.
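The ethnic population and speaker counts quoted in the language profiles above can be turned into speaker shares with simple arithmetic. The following is only an illustrative sketch: the names `PROFILES` and `speaker_share` are hypothetical, and the figures are those cited from Sipos et al. (2007), which the text itself treats as estimates.

```python
# Figures as quoted in the language profiles (based on Sipos et al. 2007).
# Each entry: language -> (ethnic population, total number of speakers).
PROFILES = {
    "Nenets": (41302, 29052),
    "Enets": (237, 84),
    "Nganasan": (834, 391),
    "Selkup": (4249, 1230),
}


def speaker_share(ethnic_population, speakers):
    """Return the percentage of the ethnic population counted as speakers."""
    return 100.0 * speakers / ethnic_population


if __name__ == "__main__":
    for language, (population, speakers) in PROFILES.items():
        print(f"{language}: {speaker_share(population, speakers):.1f}%")
```

On these figures roughly 70% of the Nenets population is counted as speakers, against about 35% for Enets, 47% for Nganasan and 29% for Selkup.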
Finally, let me compare the linguistic situation of the four Samoyedic minorities under review with Fishman’s Graded Intergenerational Disruption Scale (GIDS) (1991, 2001). He has designed a framework to assist speakers of an endangered language in revitalizing their mother tongue and in reversing language shift. We have relied on the model when identifying the threatened status of the Uralic minority languages described above and assigned each to the following GIDS levels:
Stage 8 So few fluent speakers that community needs to re-establish language norms; often requires outside experts (e.g., mostly native speaker linguists).
Stage 7 Older generation uses language enthusiastically but children are not learning it. L1 is only taught as L2.
Stage 6 Language and identity socialization of children takes place in home and community.
Stage 5 Language socialization involves extensive literacy, usually including non-formal L1 schooling.
Stage 4 L1 used in children’s formal education in conjunction with national or official language.
Stage 3 L1 used in workplaces of larger society, beyond normal L1 boundaries.
Stage 2 Lower governmental services and local mass media are open to L1.
Stage 1 L1 used at upper governmental level.
Assigning the four Uralic minority languages described above to these levels, the following situation was found: their situation is alarming in general; Enets and some Selkup dialects are at Stage 8; Nganasan and some Nenets and Selkup dialects are at Stage 7. Only some reindeer herding Nenets communities are at Stage 6.
Most of the sources used in the study provide only word lists without context. They are usually written documents and dictionaries. The Nganasan and Enets dictionaries are written for pupils of primary schools, including approx. 3,000 entries, while the two others, the Selkup and Nenets ones, contain far more entries. Alternative sources may not be useful for loanwords. Even though there are many published texts of these languages, they are usually tales, folklore texts, and stories with very few loanwords.
Nenets: Tereščenko (1989), Tundra Nenets dialect
Enets: Sorokina & Bolina (2001), Tundra Enets dialect
Nganasan: Kosterkina, Momde & Ždanova (2001)
Selkup: only the Taz dialect will be under investigation here (Helimski 2007)
2. The survey
2.1. Consonant system and syllable structure
In this section we give a schematic discussion of the consonant systems and the phonotactics of the chosen languages. Each statement in this part of the article is based on the author’s own research, where no additional reference is given. The classifications of the consonants may appear to be oversimplified from a phonetic point of view but for a phonological classification they are quite adequate. It is important to consider the ways the segments are allowed to combine with each other in making longer structures, such as syllables. Some languages allow rather free combinations of segments, while in others the combinations are strongly restricted (Maddieson 2008). The complexity of sequencing of segments within syllables will be discussed in the four languages in question. I will show the constraints that apply within the constituents of the syllable and I will define the syllable templates focusing on consonant clusters. A consonant cluster is a group or sequence of consonants that appear together without a vowel between them. In many languages it is important to distinguish tautosyllabic consonant clusters at the beginning of the word from those occurring in word medial position or at the end of the word. In medial position there can be an intervening syllable boundary; i.e. these clusters are heterosyllabic. At the end of the word a morpheme boundary between the consonants is common (including in Russian). However, in our case it is not necessary to distinguish initial and final clusters since in the available sources for the present work words are mostly monomorphemic. The potential order of the segments inside the syllable is predictable and the non-occurring combinations can be regarded as results of the restrictions that the universal sonority hierarchy imposes.
The sonority hierarchy is a universal principle, generally valid for natural languages, though obviously having language-specific exceptions (Sonority Sequencing Principle = SSP). Sonority is a feature of segments, the value of which rises from the left edge towards the nucleus of well-formed syllables and falls from there to the right edge. More than one type of sonority hierarchy is known in the literature, but they only differ in minor respects (cf. e.g. Kenstowicz 1994: 254). Sequences of syllables are governed by a Syllable Contact Law (Clements 1990: 287; cf. Vennemann 1988: 40). According to the Syllable Contact Law (SCL), the final element of a syllable is not less sonorous than the initial element of an immediately following syllable. In Vennemann’s version of the SCL, the greater the positive difference in sonority between C1 and C2 the better the contact; thus
the sequence an.ta is preferred (more natural, less marked) to ap.ta. Consequently the most preferred heterosyllabic cluster is the sonorant-obstruent (SO) cluster, and the obstruent-sonorant (OS) cluster is less well-formed. Referring to the SSP and SCL I will determine the well- or ill-formedness of the clusters. In the next section we present the consonant systems and their distribution, the most significant phonotactic restrictions and regularities, and CC combinations for each of the languages.
2.1.1. Tundra Nenets
The classification of Nenets consonants is shown in Table 1.
Table 1. The consonant system of Nenets (Tereščenko 1966b, Salminen 1977)
         vless plosive  voiced plosive  vless affricate  vless sibilant  vless spirant  nasal  lateral  trill  glide
labial   p pj           b bj                                                            m mj                   w
dental   t tj           d dj            c cj             s sj                           n nj   l lj     r rj
palatal                                                                                                        j
velar    k                                                               x              ŋ
glottal  ʔ
Sibilants and spirants are fricatives; laterals and trills are liquids; nasals, liquids and glides are voiced.
The following table shows which consonants can occur in the various syllabic positions in this language: Table 2. The distribution of consonants in Nenets p pj t
t j k b bj d d j ʔ c c j s s j x m m j n n j ŋ l
lj r rj w j
+
+
+ + + –
– – –
– – –
+ + + +
+
+ + + + + –
V__V +
+
+ + + + + + +
+ + +
+ + + +
+
+ + + + + + + + +
__C
+
–
+ + + + – + +
+ – +
+ + – +
–
+ – + + + + –
C__
+
+
+ + + + + + +
+ + +
+ + + +
+
+ + + + + + + + +
__#
+
+
+ + + + + + +
+ + +
+ + + +
+
+ + + + + + + + +
#__
–
+ +
+ +
We now consider what kinds of phonotactic regularities apply in Nenets, i.e. what characterizes the constituents of the Nenets syllable: The Nenets onset is obligatory and it can be non-complex only. Thus only consonant-initial syllables
are possible. The nucleus may be simple or branching in Nenets. Complex nuclei have to involve the same constituent (there are no diphthongs). Codas in Nenets can be empty, simple or complex. Word-final cluster types are shown in Table 3.
Table 3. Complex codas in final position (CC#) in Nenets
C1 \ C2         plosive (O)  affricate (O)  spirant (O)  nasal (S)  liquid (S)  glide (S)
plosive (O)     ++           ++             –            ++         –           –
affricate (O)   –            –              –            –          –           –
spirant (O)     –            –              –            –          –           –
nasal (S)       ++           ++             +            –          –           –
liquid (S)      ++           ++             –            ++         –           –
glide (S)       ++           +              –            –          –           –
[shading] = ill-formed in terms of sonority
“–” = this type of cluster is nonexistent or very rare (only 1 or 2 examples)
“+” = several clusters of this type
“++” = numerous clusters of this type
“O” = obstruent
“S” = sonorant
The most frequent word-final clusters in Nenets are SO clusters. Obstruents occur frequently as the second constituent. Of the sonorants only the nasals can occur in C2 position; liquids and glides do not occur there at all. Affricates and spirants cannot form the first element of the cluster. Transsyllabic clusters in Nenets are shown in Table 4. They consist of adjacent segments belonging to two different syllables. The distribution of transsyllabic cluster types in Nenets differs slightly from that of final codas: affricates and spirants can be the first element of the cluster, and liquids and glides can occur in C2 position. There are also clusters of three elements in Nenets. They can occur in medial and final position. In medial position the syllable boundary falls after C2: C1C2$C3. They are generally all well-formed clusters from the viewpoint of sonority. In syllable contact (i.e., intervocalic) clusters of three elements, C1 is most often a plosive, a liquid, a glide or a nasal, C2 is a glottal stop, an obstruent, an affricate or a nasal and C3 is an obstruent, a nasal or a glottal stop. Their elements are never from the same class.
Table 4. Syllable contact: CC combinations (C$C) in Nenets
C1 \ C2         plosive (O)  affricate (O)  spirant (O)  nasal (S)  liquid (S)  glide (S)
plosive (O)     +            +              –            +          +           –
affricate (O)   +            –              –            –          –           –
spirant (O)     –            –              +            +          –           –
nasal (S)       ++           ++             +            +*         +           –
liquid (S)      ++           ++             +            ++         +           +
glide (S)       +            +              +            ++         ++          –
[shading] = ill-formed in terms of sonority; *only m
“–” = this type of cluster is nonexistent or very rare (only 1 or 2 examples)
“+” = several clusters of this type
“++” = numerous clusters of this type
“O” = obstruent
“S” = sonorant
2.1.2. Enets
The classification of Enets consonants is shown in Table 5.
Table 5. The consonant system of Enets (Tereščenko 1966d, Glukhij 1978)
         vless plosive  voiced plosive  sibilant  spirant  nasal  lateral  trill  glide
labial   p              b                                  m
dental   t              d               s         [θ]* ð   n      l        r
palatal  tj/tʃ          dj              sj        [θj]**   nj     lj              j
velar    k              g                         x        ŋ
glottal  ʔ
Sibilants and spirants are fricatives; laterals and trills are liquids.
* = free variant of s
** = free variant of sj
The following table shows which consonants can occur in the various syllabic positions in this language: Table 6. The distribution of consonants in Enets p t
tʃ k
ʔ
b d dj g
+ + +
+ –
+ + +
–
V_V + + +
+ –
+ + +
__C
+ + –
+ –
C__ __#
#__
s
sj x
m n nj ŋ l
lj r
j +
+ +
+ +
+ + + –
+ + +
+ + +
+ +
+ + + + +
+ + +
+ + +
+ + +
+ +
–
+ + + +
+ + +
+ + + + +
+ + +
+ + +
+ +
–
+ + + –
+ + +
+ + + + +
+ + +
–
+ +
–
+ + + +
+ +
–
ð
+ +
The following phonotactic regularities apply in Enets: The onset can be empty or filled, but it can be non-complex only. Thus both vowel- and consonant-initial syllables are possible. The nucleus may be simple or branching in Enets. Complex nuclei occur only when they dominate a single element as in Nenets. Codas in this language may be empty or simple. There are no CCC clusters in Enets. Enets syllable contact cluster types are shown in Table 7. In Enets, some intervocalic geminates can occur word medially: dd, gg, ðð, ss.
Table 7. Syllable contact: CC combinations (C$C) in Enets
C1 \ C2         plosive (O)  fricative (O)  nasal (S)  liquid (S)  glide (S)
plosive (O)     ++           +              +          +           –
fricative (O)   +            +              +          +           –
nasal (S)       +            +              +          +           –
liquid (S)      ++           ++             +          –           –
glide (S)       ++           ++             +          +           –
[shading] = ill-formed in terms of sonority
“–” = that type of cluster is nonexistent or very rare (only 1 or 2 examples)
“+” = there are several clusters of that type
“++” = there are numerous clusters of that type
The most frequent types of clusters are SO in Enets as well. Obstruents and sonorants, with the exception of glides, occur as the second constituent.
2.1.3. Nganasan
The classification of Nganasan consonants is shown in Table 8.
Table 8. The consonant system of Nganasan (Tereščenko 1966c, 1979; Helimski 1998; Várnai 2002)
         vless plosive  voiced plosive  sibilant  spirant  nasal  lateral  trill  glide
labial   [p]*           b                                  m
dental   t              d**             s         ð**      n      l        r
palatal  tʃ/c           dj              sj                 ɲ      lj              j
velar    k              ɡ**                       x        ŋ
glottal  ʔ
Sibilants and spirants are fricatives; laterals and trills are liquids.
* = occurs before voiceless obstruents only
** = occurs only in onsets of closed syllables
The following table shows which consonants can occur in the various syllabic positions in this language: Table 9. The distribution of consonants in Nganasan t #__V
tʃ k ʔ
+ +
+ –
b d
dj g
+
–
+
–
s
sj ð
+ +
–
h m n nj ŋ l
lj r
j –
+ +
+ +
+ + +
–
V__V ð
+
+ + +
–
+
+ + +
+ + +
+ +
+ + +
+ –
__C
–
–
–
–
–
–
–
–
–
+
+ +
+ + +
+ +
C__
+ +
+ –
+ + +
–
+ +
+ +
+ + +
–
V__#
–
–
–
–
–
–
+ –
+ +
–
+ p
+ nd +
+ –
–
–
– –
–
+
–
–
–
The following phonotactic regularities apply in Nganasan: The onset can be empty or filled, but it can be non-complex only. Thus both vowel- and consonant-initial syllables are possible. But there is a constraint for onsets: when the nucleus is complex, the onset has to be filled in initial position. The nucleus may be simple or branching as in Nenets and Enets, but complex nuclei occur only with different constituents in Nganasan. Codas in this language may be empty or simple as in Enets.
Nganasan CC combinations are shown in Table 10.
Table 10. CC combinations – syllable contact clusters (C$C) in Nganasan
C1 \ C2         plosive (O)  fricative (O)  nasal (S)  liquid (S)
plosive (O)     –*           –*             –*         –*
fricative (O)   –            –              –          –
nasal (S)       +            ++             +          –
liquid (S)      ++           ++             +          –
[shading] = ill-formed in terms of sonority; *except b and ʔ
“–” = that type of cluster is nonexistent or very rare (only 1 or 2 examples)
“+” = there are several clusters of that type
“++” = there are numerous clusters of that type
In Nganasan, clusters that are ill-formed in terms of sonority are very rare (except b + obstruent/sonorant and ʔ + obstruent/sonorant). The most frequent types of clusters are SO. There are no CCC clusters in Nganasan; after derivation or inflection a C#CC sequence could arise, but a simplification process derives C1C2C3 → C1C3.
2.1.4. Selkup
The Selkup consonant system is shown in Table 11.
Table 11. The consonant system of Selkup (Tereščenko 1966a)
         plosive  affricate  fricative  nasal  lateral  trill  glide
labial   p                              m                      w
dental   t                   s          n      l        r
palatal           tʃ         ʃ          nj     lj              j
velar    k                              ŋ
uvular   q
The following table shows which consonants can occur in the various syllabic positions in this language:
Table 12. The distribution of consonants in Selkup
      p   t   k   q   tʃ  s   ʃ   m   n   nj  ŋ   l   lj  r   j   w
#__   +   +   +   +   +   +   +   +   +   +   –   +   +   +   +   +
V__V  +   +   +   +   +   +   +   +   +   +   +   +   +   +   +   +
__C   +   +   +   +   +   +   +   +   +   +   +   +   +   +   +   –
C__   +   +   +   +   +   +   +   +   +   +   +   +   +   +   –   –
__#   +   +   +   +   +   +   +   +   +   +   +   +   +   +   +   –
The following phonotactic regularities apply in Selkup: The onset can be empty or filled, but it can be non-complex only. Thus both vowel- and consonant-initial syllables are possible. The nucleus may be simple or branching as in Nenets and Enets, and the complex nuclei occur only when they dominate a single element. Codas in this language may be empty or simple as in Enets and Nganasan. Selkup CC combinations are shown in Table 13.
Table 13. CC combinations – syllable contact clusters (C$C) in Selkup
C1 \ C2         plosive (O)  affricate (O)  fricative (O)  nasal (S)  liquid (S)  glide (S)
plosive (O)     ++           ++             ++             +          +           –
affricate (O)   ++           ++             –              –          –           –
fricative (O)   ++           –              ++             +          +           –
nasal (S)       ++           ++             +              +          +           –
liquid (S)      ++           ++             ++             ++         ++          –
glide (S)       ++           –              +              +          –           –
[shading] = ill-formed in terms of sonority
“–” = that type of cluster is nonexistent or very rare (only 1 or 2 examples)
“+” = there are several clusters of that type
“++” = there are numerous clusters of that type
In Selkup we find the same situation as in the other languages discussed above: the most frequent types of clusters are sonorant-obstruent clusters, and combinations in which C2 is a glide do not occur.
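The way the Syllable Contact Law and the O/S classification are applied to C$C clusters in this section can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the numeric sonority scale is one of several possible hierarchies, the segment inventory is a toy subset rather than the inventory of any of the four languages, and the function names are hypothetical.

```python
# Illustrative sonority scale (an assumption; several variants exist in the literature).
SONORITY = {"plosive": 1, "affricate": 2, "fricative": 3,
            "nasal": 4, "liquid": 5, "glide": 6}

# Toy segment-to-manner mapping used only for the examples below.
MANNER = {"p": "plosive", "t": "plosive", "k": "plosive",
          "s": "fricative", "m": "nasal", "n": "nasal",
          "l": "liquid", "r": "liquid", "j": "glide"}


def sonority(segment):
    """Sonority rank of a segment on the illustrative scale."""
    return SONORITY[MANNER[segment]]


def contact_drop(c1, c2):
    """Sonority of the coda C1 minus the sonority of the following onset C2."""
    return sonority(c1) - sonority(c2)


def satisfies_scl(c1, c2):
    """SCL as stated in the text: C1 is not less sonorous than C2."""
    return contact_drop(c1, c2) >= 0


def cluster_type(c1, c2):
    """Label a cluster OO, OS, SO or SS (O = obstruent, S = sonorant)."""
    def label(segment):
        return "S" if sonority(segment) >= SONORITY["nasal"] else "O"
    return label(c1) + label(c2)


if __name__ == "__main__":
    # an.ta versus ap.ta, plus an obstruent-sonorant contact for comparison
    for c1, c2 in [("n", "t"), ("p", "t"), ("t", "n")]:
        print(f"{c1}.{c2}", cluster_type(c1, c2),
              contact_drop(c1, c2), satisfies_scl(c1, c2))
```

Under these assumptions the contact in an.ta shows a larger sonority drop than the one in ap.ta, which is the sense in which Vennemann's version of the law prefers it, while an obstruent-sonorant contact such as t.n yields a negative value.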
To summarize, we observe a tendency to avoid ill-formed CC combinations in the studied languages since they prefer sonorant-obstruent clusters, where according to the Syllable Contact Law the final element of a syllable is more sonorous than the initial element of a following syllable. Clusters with glides as second element are nonexistent in Samoyedic languages (this is the most ill-formed CC combination). Now on the basis of this section we can define the syllable complexity in the four languages in question, and observe the basic syllable types.
2.2. Syllable complexity in Samoyedic languages
In this section, I specify the possible complexity of the onset and the coda in the languages in question. Blevins (1995) reviews the basic syllable and nucleus types in the languages of the world in her general typological paper and suggests a grouping. Table 14 below summarizes the basic syllable types of the four languages investigated here.
Table 14. The basic syllable types of the chosen languages
         Nganasan  Nenets    Enets     Selkup
V        +         +         +         +
CV       +         +         +         +
CVC      +         +         +         +
VC       +         +         +         +
CCV      –         –         –         –
CCVC     –         –         –         –
CVCC     –         +         –         –
VCC      –         +         –         –
CCVCC    –         –         –         –
CVCCC    –         –         –         –
Nucleus types
V        +         +         +         +
V1V1     –         +         +         +
V1V2     +         –         +         +
Blevins (1995) proposes binary parameters to account for language-particular variation in syllable typology; I show these parameter settings for the four Samoyedic languages. They are very restrictive in terms of complex edge components, e.g. they do not permit initial consonant clusters. There are only intersyllabic, medial clusters at syllable contacts, except in Nenets.
Table 15. Parameters of Blevins (1995) applied to Samoyedic languages
                  Nganasan     Selkup       Enets        Nenets
Complex Nucleus   yes          yes          yes          yes
Obligatory Onset  no*          no           no           yes
Complex Onset     no           no           no           no
Coda              yes          yes          yes          yes
Complex Coda      no           no           no           yes
Edge Effect       yes/Initial  no           no           yes/Final
* = the setting is yes when the nucleus is complex: the onset then has to be filled in initial position.
Consonants cannot appear as syllable nuclei in any of the four languages analysed here. There are no complex edge components in Samoyedic languages in any position, except for final complex codas in Nenets. After derivation and inflection a C#CC sequence could arise, but a simplification process applies, deriving C1C2C3 → C1C3. These languages have moderately complex syllable structure, which is the most frequent structure in the world’s languages (247 of 485 studied languages have moderately complex syllable structure according to WALS 2005), which means “they permit a single consonant after the vowel and/or allow two consonants to occur before the vowel, but adhere to a limitation to only the common two-consonant patterns” (Maddieson 2008, WALS 2005). Edge effects are active in Nganasan, where there is an obligatory initial onset whenever the nucleus is branching, and in Nenets, where there are medial and final branching codas and final clusters with three constituents (CCC#). Table 16 summarizes the information given in Tables 4, 7, 10, and 13 about contact clusters (i.e., those straddling a syllable boundary) in the languages investigated.
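The simplification of a three-member cluster to its first and last member (C1C2C3 becoming C1C3) can be sketched as follows. This is a minimal illustration under stated assumptions: the vowel set, the function name and the input strings are hypothetical schematic forms, not attested Samoyedic words, and the morpheme boundary in C#CC is simply ignored once stem and suffix are concatenated.

```python
VOWELS = set("aeiou")  # hypothetical vowel set for the schematic forms below


def simplify_triple_clusters(form):
    """Delete the middle consonant of every three-consonant run: C1C2C3 -> C1C3."""
    segments = list(form)
    output = []
    i = 0
    while i < len(segments):
        # Collect the run of consonants starting at position i (possibly empty).
        j = i
        while j < len(segments) and segments[j] not in VOWELS:
            j += 1
        run = segments[i:j]
        if len(run) == 3:
            run = [run[0], run[2]]  # keep C1 and C3, drop C2
        output.extend(run)
        # Copy the vowel that ended the run, if any, and continue after it.
        if j < len(segments):
            output.append(segments[j])
        i = j + 1
    return "".join(output)


if __name__ == "__main__":
    stem, suffix = "kar", "tka"  # schematic stem ending in C, suffix beginning with CC
    print(simplify_triple_clusters(stem + suffix))
```

Two-member clusters pass through unchanged, which matches the statement that C$C contact clusters are permitted while CCC sequences are repaired.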
Table 16. Syllable contact: C$C combination types in Samoyedic languages
         Nenets      Enets       Nganasan    Selkup
C1 \ C2  O     S     O     S     O     S     O     S
O        +     +     +     +     –     –     ++    +
S        ++    ++    ++    +     ++    +     ++    +
The most frequent type in the four languages in question is the sonorant-obstruent cluster, which is the most well-formed type from the viewpoint of sonority. Nganasan is more restrictive than the other languages since there are neither obstruent + obstruent nor sonorant + sonorant clusters.
2.3. Russian loanwords in the Samoyedic languages
To what extent do loanword adaptations result in phonetic similarity to the non-native form? It is a well-known fact that methods of loanword borrowing depend on the language status of the speech community. The data indicate that early and later borrowings are different. When the contacts are not so intensive, the repair processes act more radically than when the speech community is bilingual. Later loans are more similar to the donor language form. There are many loanwords exactly corresponding to the Russian form, more often the Russian written form (the following examples are from Nganasan, but analogous data apply to all Samoyedic languages):
колхоз     kolhoz           ‘collective farm’
машина     maʃina           ‘car’
молоко     molokoɁ [Plur]   ‘milk’
Москва     moskva           ‘Moscow’
первомай   pervomai         ‘1st of May’
These words date from the time when the Samoyedic speech community had become practically bilingual. In this situation bilingualism is very intensive, and loanword adaptation increasingly takes the form of direct adoption. A similar phenomenon was observed by Thomason and Kaufman (1988: 33): early and later Russian loans in Yupik (an Asian Eskimo language) differ in the same way:

Russian       Yupik early loan    Yupik later loan
[tabak]       [tavaka]            [tabak]             ‘tobacco’
[bljutcә]     [pljusa]            [bljutca]           ‘saucer’
Accordingly, I will analyse only early Russian loans in the four Samoyedic languages and will not deal with the later adoptions of the bilingual speech communities. Given the strong restrictions on onset and coda complexity in Samoyedic languages and the extensive range of clusters found in Russian, it is interesting to examine the processes affecting Russian loanwords. Not all clusters could be investigated in all languages; only those represented in the sources can be discussed here. Gaps in the picture are due to missing evidence, i.e. the relevant clusters do not occur in the dictionaries or are not represented in the sample. Because the available data are limited, we cannot make predictions but can only review the regularities.

2.3.1. Repair processes

Russian has many clusters, not only word-medially at syllable boundaries but also in onset position at the beginning of the word. I show which types of sequences undergo simplification, which processes they undergo, and in which positions they are retained. What happens to branching onsets in Samoyedic languages, and how are these clusters adapted? We can observe six different repair processes in the course of borrowing from Russian: epenthesis, C1-deletion, C2-deletion, CV-metathesis, syncope and substitution. A few examples of each repair process are given here; see the appendix for additional examples.

2.3.1.1. Epenthesis

The most frequent strategy is epenthesis. It is active in every position in all four languages examined here. This corresponds to cross-linguistic data: epenthesis appears to be the most frequent adaptation process across languages (Paradis and LaCharité 1997). When vowel epenthesis is used to break up a consonant cluster, there is often more than one location where the vowel could be placed to produce a phonotactically acceptable output.
For example, if a language has open syllable structure {CV, V}, hence disallowing CC clusters at the beginning of a word, an initial CCV could be broken up by putting a vowel before the consonants (VC.CV) – prothesis – or between the consonants (CV.CV) – anaptyxis. In a medial CCC cluster, the vowel could occur before the second or third consonant. The choice of epenthesis locations is language specific.
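The two insertion sites for an initial cluster can be illustrated with a minimal sketch. The function names, the vowel inventory and the choice of inserted vowel are assumptions made for illustration only; the two test words are loans cited in this chapter (Selkup istol, Enets palaʃ).

```python
# Illustrative vowel inventory (an assumption, not a phonemic analysis).
VOWELS = set("aeiouyɨәəε")

def starts_with_cluster(word: str) -> bool:
    """True if the word begins with two consonants (#CC)."""
    return len(word) >= 2 and word[0] not in VOWELS and word[1] not in VOWELS

def prothesis(word: str, vowel: str) -> str:
    """#CCV -> VC.CV: place the vowel before the initial cluster."""
    return vowel + word if starts_with_cluster(word) else word

def anaptyxis(word: str, vowel: str) -> str:
    """#CCV -> CV.CV: place the vowel between the two initial consonants."""
    return word[0] + vowel + word[1:] if starts_with_cluster(word) else word

print(prothesis("stol", "i"))   # istol
print(anaptyxis("plaʃ", "a"))   # palaʃ
```

Words without an initial cluster are returned unchanged, so both functions can be applied to a whole word list without a separate filter.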
Epenthesis affects clusters in all three positions, most often complex codas in final position and complex onsets in word-initial position, and it can occur at different locations in the word within the same language. The vowel inserted to resolve a Russian complex onset or syllable contact cluster is most often the same as the vowel of the next syllable; thus the epenthetic vowel generally matches the input vowel to the right of the epenthesis site:

onset (#CC):
Nenets      крупа       krupa       >  xurupa       ‘cereals’
            дробь       drobʲ       >  torob        ‘barrel’
            брезент     brezent     >  persent      ‘fly’
            класс       klass       >  xalas        ‘class’
Enets       ключ        klʲutʃ      >  kulʲutʃ      ‘key’
            плаш        plaʃ        >  palaʃ        ‘log’
            бревно      brevno      >  beremno      ‘beam’
Nganasan    ключ        klʲutʃ      >  kulʲutʃ      ‘key’
            бригада     brigada     >  birigadә     ‘brigade’
            кладовка    kladovka    >  kaladovka    ‘chamber’
            крест       krestʲ      >  kiristә      ‘cross’
Selkup      груз        gruz        >  kurus        ‘cargo’

syllable contact (C$C):
Nganasan    тюрьма      tʲurma      >  tʲyryma      ‘prison’
Selkup      стекло      stʲeklo     >  tʲekɨla      ‘glass’
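The vowel-copy pattern in the onset examples above can be sketched as follows. This is a simplified illustration under stated assumptions: the vowel set and the function name are hypothetical, only word-initial clusters are handled, and consonant substitutions that accompany epenthesis in the attested forms (e.g. k > x in Nenets xurupa) are deliberately ignored.

```python
# Illustrative vowel inventory (an assumption, not a phonemic analysis).
VOWELS = set("aeiouyɨәəε")

def copy_vowel_anaptyxis(word: str) -> str:
    """Break up an initial CC cluster with a copy of the first following vowel."""
    if len(word) < 3 or word[0] in VOWELS or word[1] in VOWELS:
        return word  # no initial cluster, nothing to repair
    for segment in word[2:]:
        if segment in VOWELS:
            # Insert the copied vowel between the first two consonants.
            return word[0] + segment + word[1:]
    return word  # no vowel found to copy

for source in ["krupa", "brigada", "plaʃ", "klʲutʃ"]:
    print(source, ">", copy_vowel_anaptyxis(source))
```

The outputs (kurupa, birigada, palaʃ, kulʲutʃ) match the cluster repair in the attested forms; the final vowel reduction in Nganasan birigadә is a separate adjustment that the sketch does not model.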
The epenthetic vowel after a word-final coda cluster is usually a. Otherwise the quality of the vowel inserted between the consonants of the cluster is language-particular: in Nganasan it is very often ә (~70%), and i (~10%) when there is a palatalised consonant in the Russian form.

coda (CC#):
Nenets      километр    kilometr    >  xilometra    ‘km’
Enets       шарф        ʃarf        >  ʃarpa        ‘scarf’
Nganasan    метр        metr        >  metәrә       ‘meter’
            ноябрь      nojabr      >  nәjabәri*    ‘november’
Selkup      спирт       spirt       >  pirta        ‘spirit’
* with two epenthetic vowels
2.3.1.2. Prothesis

A particular type of epenthesis is the insertion of a vowel before the consonant cluster at the beginning of the word; this is also known as prothesis. The process can be observed in the Nenets, Nganasan and Selkup data. The inserted vowel is a in Nenets and Nganasan, and i in Selkup and in some cases in Nganasan. Prothesis affects mostly sibilant + plosive clusters at the beginning of the word.

Nenets      школа       ʃkola       >  askola / xaskola   ‘school’
Nganasan    школа       ʃkola       >  askolә             ‘school’
            ржаной      rʒanoj      >  arsʲenәj           ‘rye’
            стул        stul        >  istuәlә            ‘chair’
Selkup      скамейка    skamejka    >  iskamεjka          ‘bench’
            стол        stol        >  istol              ‘table’
2.3.1.3. C1-deletion

In general, vowel epenthesis seems to be a heavily preferred repair type in loanword adaptation. Uffmann (2007) surveys case studies of loanword adaptation and concludes that consonant deletion is a marginal phenomenon compared to epenthesis. Adding segments is less undesirable than deleting segments from the word (Paradis and LaCharité 1997). C1-deletion affects only Russian tautosyllabic clusters in the onset. It is active in each of the four languages:

Nenets      стакан          stakan           >  takan          ‘cup’
            школа           ʃkola            >  kola           ‘school’
            втулка          vtulka           >  tulka          ‘lead shot’
Enets       школа           ʃkola            >  kola           ‘school’
            стекло          stʲeklo          >  tʲeklo         ‘glass’
Nganasan    скамейка        skamejka         >  kamejka        ‘bench’
            школа           ʃkola            >  kolә           ‘school’
Selkup      спирт           spirt            >  pirt           ‘spirit’
            здоровать-ся    zdarovatʲ-sʲa    >  tarowattɨ-qo   ‘to welcome’
It should be noted that the same cluster can be affected by several different processes; e.g. Russian word-initial sibilant + plosive clusters can be borrowed into Nenets with either C1-deletion or prothesis (see the discussion in 2.3.2).
2.3.1.4. C2-deletion

This is a very interesting repair process. In general, when truncation occurs it eliminates the first consonant of the cluster. We have only three examples of C2-deletion. This repair strategy is active only in Selkup, affecting two intersyllabic clusters and one onset cluster. The Russian complex sibilant + plosive onset cluster is resolved by two types of truncation in Selkup: zdarovatʲ-sʲa > tarowattɨ-qo (C1-deletion) and sarowattɨ-qo (C2-deletion). This dichotomy is dialectal. Unfortunately, we have very few data; more examples of C2-deletion would be useful.

Selkup      здоровать-ся    zdarovatʲ-sʲa    >  sarowattɨ-qo   ‘to welcome’
            кукла           kukla            >  kuka           ‘puppet’
            нужда           nuʒda            >  nuʃa           ‘poverty’
2.3.1.5. CV-metathesis

This adaptation strategy primarily affects initial onset clusters. It is not a common strategy; its goal is to restructure the complex onset and shift the cluster to the syllable boundary: CCVCV > CVCCV (truba > turba) or CVCCCVCV > CVCCVCCV (kastrulʲa > kosturlʲa).

Enets       платок       platok       >  poltok       ‘kerchief’
            труба        truba        >  turba        ‘chimney, pipe’
Selkup      крупа        krupa        >  kurpa        ‘cereals’
            крупчатка    kruptʃatka   >  kurtʃatka    ‘grits’
            кастрюля     kastrulʲa    >  kosturlʲa    ‘pot’
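The restructuring CCV > CVC can be sketched as a swap of the cluster-final consonant with the following vowel. This is a minimal illustration under assumptions: the vowel set and the function name are hypothetical, and the vowel changes seen in some attested forms (platok > poltok, kastrulʲa > kosturlʲa) are not modelled.

```python
# Illustrative vowel inventory (an assumption, not a phonemic analysis).
VOWELS = set("aeiouyɨәəε")

def cv_metathesis(word: str) -> str:
    """Swap the last consonant of the first cluster with the vowel that follows it."""
    for i in range(1, len(word) - 1):
        in_cluster = word[i - 1] not in VOWELS and word[i] not in VOWELS
        if in_cluster and word[i + 1] in VOWELS:
            # ...C C V... -> ...C V C...: the cluster now straddles a syllable boundary.
            return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    return word  # no consonant cluster followed by a vowel

for source in ["truba", "krupa", "platok", "kastrulʲa"]:
    print(source, ">", cv_metathesis(source))
```

The sketch yields turba and kurpa as attested, and paltok and kasturlʲa, which differ from the attested poltok and kosturlʲa only in the quality of the first vowel.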
2.3.1.6. Syncope

Syncope is an exceptional phenomenon in the sample: it operates only in Enets and produces (rather than removes) syllable contact clusters. Presumably the aim of this strategy is to make a trisyllabic word bisyllabic, because bisyllabic structures are the most frequent ones in the Samoyedic languages. Unfortunately we have very few data, only these two examples:

Enets       бумага      bumaga      >  bomga      ‘paper’
            молоко      malako      >  molka      ‘milk’
2.3.1.7. Substitution

Substitution differs somewhat from the other strategies. It is not a restructuring repair but a kind of assimilation in which non-native segments are mapped onto the phonetically closest ones that are well-formed in the native phonology. It affects mostly contact clusters in intersyllabic position, but it can also affect single segments.

Nenets      доктор      doktor      >  toxtur           ‘doctor’
            флаг        flag        >  plak             ‘flag’
Enets       лавка       lavka       >  lapka            ‘store’
Nganasan    конфеты     kanfety     >  kәmpetɨ          ‘candy’
            лавка       lavka       >  lapku            ‘store’
Selkup      почта       potʃta      >  poʃta / pocta    ‘post office’
            ровно       rovna       >  romna            ‘exactly’
Substitution can apply together with a restructuring repair (epenthesis):

Nenets      грамм       gram        >  xaram       ‘gramm’
            крупа       krupa       >  xurupa      ‘cereals’
            дробь       drobʲ       >  torob       ‘barrel’
            брезент     brezent     >  persent     ‘fly’
            книга       kɲiga       >  xyɲika      ‘book’
            класс       klass       >  xalas       ‘class’
            кровать     kravatʲ     >  xorovatʲ    ‘bed’
Tables 17 and 18 summarize these repair processes by language and position. The most frequent strategy is epenthesis, which is active in almost every position in all four languages. Substitution works mostly on contact clusters in intersyllabic position, but it affects single segments as well as clusters. It is noticeable that in the Selkup data onset clusters are resolved by truncation only, as opposed to the other languages, where onset clusters are repaired by both deletion and epenthesis. The tables also show that the most frequent strategies are epenthesis and substitution. Word-initial onset clusters are the most frequent site of repair, since such clusters are completely missing in Samoyedic languages.
Table 17. Repair processes in the four languages

            Selkup            Nganasan          Nenets            Enets
            On    Co    C$C   On    Co    C$C   On    Co    C$C   On    Co    C$C   CVC
epenth.           +     +     +     +     +     +     +           +     +
C1-del.     +                 +                 +                 +
C2-del.     +           +
metath.     +           +                                         +
subst.      +           +     +           +     +           +     +           +
sync.                                                                               +

“On” = onset, “Co” = coda
Table 18. Repair processes according to position

                onset    coda    C$C    CVC
epenthesis      3        4       2
C1-deletion     4
C2-deletion     1                1
metathesis      2                1
substitution    4                4
syncope                                 1

Numbers = in how many languages the strategy acts according to position
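The ranking of positions by how often they are repaired can be checked by summing the columns of Table 18. The sketch below is illustrative; the dictionary layout is an assumption, and rows with two figures are read as onset and C$C, which follows from the statement that coda clusters are resolved by epenthesis only.

```python
from collections import Counter

# Table 18: number of languages in which each strategy acts, per position.
# Assumption: two-figure rows are assigned to onset and C$C, not to coda.
TABLE_18 = {
    "epenthesis":   {"onset": 3, "coda": 4, "C$C": 2},
    "C1-deletion":  {"onset": 4},
    "C2-deletion":  {"onset": 1, "C$C": 1},
    "metathesis":   {"onset": 2, "C$C": 1},
    "substitution": {"onset": 4, "C$C": 4},
    "syncope":      {"CVC": 1},
}

totals = Counter()
for cells in TABLE_18.values():
    totals.update(cells)  # adds the counts position by position

for position, count in totals.most_common():
    print(position, count)
```

Under this reading the onset column sums to 14, C$C to 8, coda to 4 and CVC to 1, so transsyllabic clusters are the second most frequent repair site after onsets.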
Transsyllabic clusters are the second most frequent site of repair processes. It should be noted that, according to our data, coda clusters are resolved by epenthesis only.

2.3.2. Repair strategies and cluster types

The choice of epenthesis location is language specific, and the placement of the vowel depends on the kind of consonants in the cluster. Fleischhacker (2001) presents a typological study of epenthesis in initial CC(C) clusters in loanwords in many languages, focusing on the question of whether the vowel precedes the cluster (VCC) or breaks up the cluster (CVC). Generally, in a voiceless sibilant + stop cluster a vowel tends to be inserted before the cluster, while in an obstruent + sonorant cluster a vowel tends to be inserted into the cluster. Table 19 shows what kinds of clusters are affected by the various repair processes according to position in Samoyedic languages.

Table 19. Cluster types according to repair strategies
epenthesis
C1-deletion C2-deletion
On C$C Co
On
On
C$C
sP
sP
sP
sP
sP
sP
metathesis
syn- subOn C$C Co cope stitution sP
sP
FP
FP PP
OO
FF AP
OS
PL
PL
PN
PN
PL
PL
PL
PL
PL PN FL FN
SO
LP
LN
Ls
NP
LF
LP
NA
NP
NF
SS
“On” = onset “Co” = coda “O” = obstruent “S” = sonorant
“F” = fricative “L” = liquid “N” = nasal “P” = plosive “s” = sibilant
There is no SS repair, and SO repair is very rare. The most frequent types of cluster affected by repair mechanisms are sP and PL clusters: they are affected by both types of deletion, by epenthesis, by metathesis, and by substitution. They are the most unacceptable sequences in all three positions. The resulting order of repair strategies for the most frequent cluster types is:

sP:   epenthesis > deletion > substitution > metathesis
PL:   epenthesis > metathesis / substitution > deletion
FP:   substitution > deletion > syncope
LP:   epenthesis / syncope
PN:   epenthesis / substitution
We can observe a higher frequency of vowel insertion word-initially before sibilant + plosive clusters. These clusters provoke a variety of repair strategies: vowel epenthesis, with the inserted vowel located either before or inside the cluster, and consonant deletion. In word-initial clusters consisting of a voiceless sibilant + stop, it is cross-linguistically more common to insert a vowel before the first consonant (prothesis), while in word-initial clusters of an obstruent and a sonorant, it is more common to place the vowel between the consonants (anaptyxis) (Fleischhacker 2001). Fleischhacker argues that the reason for this pattern is that epenthetic vowels are inserted where they will cause the least perceptual difference between the foreign word and the epenthesized adaptation. Epenthesis is driven by the goal of maximal auditory similarity to the input. Experimental studies show that ST clusters are judged to be more similar to VST than to SVT sequences, while TR clusters are judged to be more similar to TVR than to VTR sequences (Kang 2003: 221). Our data confirm this to the extent that prothesis can resolve voiceless sibilant + stop clusters, but not obstruent + sonorant clusters.

The case of metathesis is especially interesting. Cross-linguistic surveys show that instances of rhotic metathesis have a perceptual basis associated with the phonetic cues of the rhotic segment (Blevins and Garrett 2004). Our data show that the clusters most frequently affected by metathesis are obstruent + rhotic clusters (kr, tr, str), with only one exception (platok > poltok).

It is notable that the same cluster can be affected by several different processes: Russian word-initial sibilant + plosive clusters can be borrowed into Nenets with C1-deletion or prothesis and, as mentioned above, Selkup sibilant + obstruent onset clusters are resolved by two types of truncation. Perhaps this dichotomy is dialectal.

Nenets    C1-deletion    школа           ʃkola            >  kola            ‘school’
          prothesis      школа           ʃkola            >  askola          ‘school’
Selkup    C1-deletion    здоровать-ся    zdarovatʲ-sʲa    >  tarowattɨ-qo    ‘to welcome’
          C2-deletion    здоровать-ся    zdarovatʲ-sʲa    >  sarowattɨ-qo    ‘to welcome’
3. Conclusion

The Samoyedic languages permit consonant clusters, but they are very restrictive in terms of complex edge components. For example, most of them do not permit initial consonant clusters, or more than two consecutive consonants in other positions, especially on the same side of the syllable boundary. There are no complex edge components in any Samoyedic language in any position, except for final complex codas in Nenets, so two consonants are not allowed in the onset position of a syllable at all. Viewing the sequences of the four languages under scrutiny here in terms of the sonority hierarchy, we can conclude that well-formed clusters are more frequent than ill-formed ones.

As we have seen, these languages choose similar ways to adapt clusters in early Russian loanwords: the most frequent repair strategy is epenthesis, in all four languages and in any position. Syncope is a highly unusual adaptation process found only in Enets, which presumably helps to preserve the bisyllabic template most common in these languages. Branching onsets are the least tolerated clusters, above all in initial position. The most frequent types of cluster to be affected by repair mechanisms are sP and PL clusters; they are affected by both types of deletion, by epenthesis, by metathesis and by substitution.

Of course, much more research is needed to show the phonetic nature of adaptation from Russian, and to test the correspondence between loanword adaptation and perceptual assimilation. The problem for such further investigations is that it is rather difficult to elicit adaptation data in speech communities at the level of bilingualism that present-day speakers of these four languages exhibit.
Appendix

Nenets

                    Russian                                  Nenets
V-epenthesis, onset (#CC)
OS   PL   gr    грамм             gram              xaram             ‘gramm’
          kr    крупа             krupa             xurupa            ‘cereals’
          kl    класс             klass             xalas             ‘class’
          kr    кровать           kravatʲ           xorovatʲ          ‘bed’
          dr    дробь             drobʲ             torob             ‘barrel’
          br    брезент           brezent           persent           ‘fly’
     PN   kɲ    книга             kɲiga             xyɲika            ‘book’
V-epenthesis, coda (CC#)
OS   PL   tr    километр          kilometr          xilometra         ‘km’
prothesis
OO   sP   ʃk    школа             ʃkola             askola, xaskola   ‘school’
C1-deletion, onset (#CC)
OO   sP   st    стакан            stakan            takan             ‘cup’
                стол              stol              tol               ‘table’
          ʃk    школа             ʃkola             kola              ‘school’
                шкаф              ʃkaf              kap               ‘cupboard’
     FP   vt    втулка            vtulka            tulka             ‘lead shot’
substitution, C$C
OO   FP   ft    кофточка          kaftotʃka         xoptocka          ‘blouse’
     PP   kt    доктор            doktor            toxtur            ‘doctor’
OS   PL   gr    фотографировать   fotografirovatʲ   potokrapirujasʲ   ‘to take a photo’
substitution, onset (#CC)
OS   PL   br    бригада           brigada           prigada           ‘brigade’
     FL   fl    флаг              flag              plak              ‘flag’
Enets

                      Russian                      Enets
V-epenthesis, onset (#CC)
OS   PL   pl       плаш       plaʃ        palaʃ         ‘log’
                   платье     platʲe      palatʲa       ‘dress’
          br       бревно     brevno      beremno       ‘beam’
          klʲ      ключ       klʲutʃ      kulʲutʃ       ‘key’
V-epenthesis, coda (CC#)
OS   PL   tr       метр       metr        metra         ‘meter’
SO   LF   rf       шарф       ʃarf        ʃarpa         ‘scarf’
C1-deletion, onset (#CC)
OO   sP   zd       здорова    zdarova     aɲ doroba     ‘health, welcome’
          ʃk       школа      ʃkola       kola          ‘school’
          sk       скучно     skutʃno     kuʃno         ‘boring’
          sp       спасибо    spasiba     pasiba ŋaj    ‘thanks’
          st       стакан     stakan      takan         ‘cup’
                   стол       stol        tol           ‘table’
                   стул       stul        tul           ‘chair’
                   столица    staliʦa     toliʦa        ‘capital’
                   столб      stolb       tolb          ‘pillar’
                   сторож     staroʒ      toroʒ         ‘guard’
          stʲ      стекло     stʲeklo     tʲeklo        ‘glass’
          ʃk       шкаф       ʃkaf        kap           ‘cupboard’
metathesis, onset (#CC)
OS   PL   pl       платок     platok      poltok        ‘kerchief’
          tr       труба      truba       turba         ‘chimney, pipe’
syncope (CVC → C$C)
SO   NP            бумага     bumaga      bomga         ‘paper’
     LP            молоко     malako      molka         ‘milk’
substitution, C$C
OO   FP   vk → pk  лавка      lavka       lapka         ‘store’
Nganasan

                      Russian                           Nganasan
V-epenthesis, onset (#CC)
OS   PL   br       бригада       brigada        birigadә       ‘brigade’
                   брюки         brʲuki         burukәɁ        ‘trousers’
          tr       труба         truba          turuba         ‘chimney, pipe’
          kl       кладовка      kladovka       kolodovka      ‘chamber’
          kr       крест         krestʲ         kiristә        ‘cross’
                   крупа         krupa          kyryhә         ‘cereals’
          pl       план          plan           holanә         ‘plan’
     PN   kɲ       книга         kɲiga          kiɲigә         ‘book’
V-epenthesis, coda (CC#)
OS   PL   dr       кедр          kedr           kedәrә         ‘pine’
          tr       метр          metr           metәrә         ‘meter’
          br       ноябрь        nojabr         nәjabәri       ‘november’
V-epenthesis, C$C
OS   PL   br       фабрика       fabrika        hu͡abirikә      ‘factory’
          tr       натруска      natruska       naturuska      ‘strainer’
          kl       уклад         uklad          ukulatә        ‘steel’
     PN   dʲm      седьмой       sʲedʲmoj       sʲedʲemәi      ‘seventh’
SS   LN   rm       тюрьма        tʲurma         tʲyryma        ‘prison’
SO   NP   nt       контора       kontora        kәnәtorә       ‘office’
prothesis
OO   sP   ʃk       школа         ʃkola          askolә         ‘school’
          st       стул          stul           istuәlә        ‘chair’
SO   Ls   rʒ       ржаной        rʒanoj         arsʲenәj       ‘rye’
C1-deletion, onset (#CC)
OO   sP   sk       скамейка      skamejka       kamejka        ‘bench’
          sp       спасибо       spasiba        hu͡aśiba        ‘thanks’
                   справка       spravka        horaapkә       ‘certificate’
          st       стакан        stakan         takanә         ‘cup’
                   стол          stol           tolә           ‘table’
          zd       здороваться   zdarovatʲsja   dәrәbatudja    ‘to welcome’
          ʃk       школа         ʃkola          kolә           ‘school’
substitution, C$C
SO   NF   nf → np  конфеты       konfety        kәmpetɨ        ‘candy’
OO   FP   vk → pk  лавка         lavka          lapku          ‘store’
Selkup

                    Russian                             Selkup
V-epenthesis, onset (#CC)
OO   sP   gr     груз          gruz            kurus           ‘cargo’
V-epenthesis, coda (CC#)
OO   sP   sp     спирт         spirt           pirta           ‘spirit’
OS   PL   tr     метр          metr            metra           ‘meter’
SO   LP   lʲdʲ   сельдь        sʲelʲdʲ         selʲtʲa         ‘Coregonus sardinella’
          lk     шёлк          ʃʲolk           ʃolka           ‘silk’
V-epenthesis, C$C
OO   sP   stʲ    стекло        stʲeklo         tʲekɨla         ‘glass’
prothesis (#CC)
OO   sP   sk     скамейка      skamejka        iskamεjka       ‘bench’
          st     стол          stol            istol           ‘table’
C1-deletion, onset (#CC)
OO   sP   sp     спирт         spirt           pirt            ‘spirit’
                 спутаться     sputatʲsja      putajʃi-qo      ‘to get mixed, to get confused’
          zd     здороваться   zdarovatʲsja    tarowattɨqo     ‘to welcome’
          st     стакан        stakan          takan           ‘cup’
          sk     скамейка      skamejka        kamejka         ‘bench’
          sp     спасибо       spasiba         paʃipo          ‘thanks’
          st     сторож        staroʒ          toruʃ           ‘guard’
C2-deletion, onset (#CC)
OO   sP   zd     здороваться   zdarovatʲ-sʲa   sarowattɨ-qo    ‘to welcome’
C2-deletion, C$C
OO   sP   ʒd     нужда         nuʒda           nuʃa            ‘poverty’
OS   PL   kl     кукла         kukla           kuka            ‘puppet’
metathesis, onset (#CC)
OS   PL   kr     крупа         krupa           kurpa           ‘cereals’
                 крупчатка     kruptʃatka      kurtʲatka       ‘grits’
metathesis, C$C
OO   sP   st     кастрюля      kastrulʲa       kosturlʲa       ‘pot’
substitution, C$C
OO   AP   tʃt    почта         potʃta          poʃta, potʲta   ‘post office’
     PP   tk     кадка         kadka           katka           ‘tub’
     FF   fh     совхоз        savhoz          sapko           ‘state farm’
     FP   fk     лавка         lavka           lapky           ‘store’
     sP   ʒd     нужда         nuʒda           nuʃta           ‘poverty’
OS   PN   dn     ладно         ladna           latno           ‘all right’
OO   sP   sk     натруска      natruska        natruʃka        ‘strainer’
OS   FN   vn     ровно         rovna           romna           ‘exactly’
SO   NA   nts    полотенце     palatʲenʦe      polotensa       ‘towel’
References

Всероссийская перепись населения 2002 года [All-Russian Census of Population 2002]. www.perepis2002.ru
Blevins, Juliette. 1995. The Syllable in Phonological Theory. In John A. Goldsmith (ed.), Handbook of Phonological Theory, 206–244. Cambridge, Mass.: Blackwell.
Blevins, Juliette and Andrew Garrett. 2004. The Evolution of Metathesis. In Bruce Hayes, Robert Kirchner and Donca Steriade (eds.), Phonetically Based Phonology, 117–156. Cambridge: Cambridge University Press.
Clements, George Nick. 1990. The role of the sonority cycle in core syllabification. In John Kingston and Mary E. Beckman (eds.), Papers in laboratory phonology I: between the grammar and physics of speech, 283–333. Cambridge: Cambridge University Press.
Fishman, Joshua A. 1991. Reversing Language Shift. Theoretical and Empirical Foundations of Assistance to Threatened Languages. Clevedon (England) and Philadelphia: Multilingual Matters.
Fishman, Joshua A. (ed.). 2001. Can Threatened Languages Be Saved? Reversing Language Shift, Revisited: A 21st Century Perspective. Clevedon, UK: Multilingual Matters.
Fleischhacker, Heidi. 2001. Cluster-dependent epenthesis asymmetries. In Adam Albright and Taehong Cho (eds.), UCLA Working Papers in Linguistics 7, Papers in Phonology 5, 71–116.
Glukhij, Ja A. 1978. Консонантизм энецкого языка (диалект бай) по экспериментальным данным [Enets consonantism with experimental data (Bai dialect)], АКД, Leningrad.
Hajdú, Péter. 1982. Chrestomathia Samoiedica. Budapest: Tankönyvkiadó (second edition).
Helimski, Eugen [Хелимский, Е.]. 1989. О морфонологии нганасанского языка [About the morphophonology of Nganasan]. Paper read at IV. Phonologisches Symposium Uralischer Sprachen, Hamburg.
Helimski, Eugen. 1998. Nganasan. In Daniel Abondolo (ed.), The Uralic Languages, 480–515. London: Routledge.
Helimski, Eugen [Хелимский, Е.]. Северно селькупский словарь [North Selkup Dictionary]. http://www.uni-hamburg.de/ifuu/Arbeiten.html
Kang, Yoonjung. 2003. Perceptual similarity in loanword adaptation: English postvocalic word-final stops in Korean. Phonology 20: 219–273.
Kazakevich, Olga. 2006. The functioning of the indigenous minority languages in the Yamalo Nenets autonomous area, Turukhansk district of the Krasnoyarsk territory and Evenki autonomous area. http://lingsib.iea.ras.ru/en/round_table/papers/kazakevich1.shtml
Kenstowicz, Michael. 1994. Phonology in Generative Grammar. Oxford: Blackwell.
Kenstowicz, Michael. 2003. Salience and Similarity in Loanword Adaptation: a Case Study from Fijian. To appear in Language Sciences.
Kosterkina, N. T., A. Č. Momde and T. Ju. Ždanova [Костеркина, Н. Т. – А. Ч. Момде – Т. Ю. Жданова]. 2001. Словарь нганасанско-русский и русско-нганасанский [Nganasan-Russian and Russian-Nganasan Dictionary]. Saint Petersburg.
Krigonogov, V. P. 1998. Этнические процессы у малочисленных народов Средней Сибири [Ethnic processes in Central Siberian minorities]. Красноярск: КПУ.
Kuznetsova, A. I., E. A. Helimskij and E. V. Grushkina [А. И. Кузнецова – Е. А. Хелимский – Е. В. Грушкина]. 1980. Очерки по селькупскому языку, тазовский диалект, Том I [Collection of Selkup language, Taz dialect]. Moscow: Издательство Московского Университета.
MacConnell, G. D. and V. Mikhalchenko (eds.). 2003. Письменные языки мира. Языки Российской Федерации [Written languages of the world, Languages of the Russian Federation]. Moscow: Академия.
Maddieson, Ian. 2008. Syllable Structure. In Martin Haspelmath, Matthew Dryer, David Gil, and Bernard Comrie (eds.), The World Atlas of Language Structures Online. Munich: Max Planck Digital Library, chapter 12. Available online at http://wals.info/feature/12
Murray, Robert W. and Theo Vennemann. 1983. Sound change and syllable structure in Germanic phonology. Language 59(3): 514–528.
Paradis, Carole and Darlene LaCharité. 1997. Preservation and minimality in loanword adaptation. Journal of Linguistics 33: 379–430.
Salminen, Tapani. 1997. Tundra Nenets Inflection. Mémoires de la Société Finno-Ougrienne 227. Helsinki.
Salminen, Tapani. 1998. A morphological Dictionary of Tundra Nenets. Societatis Fenno-Ugricae 26. Helsinki.
Sipos, Mária, Katalin Sipőcz, Zsuzsa Várnai and Beáta Wagner-Nagy. 2007. The Current Sociolinguistic Situation of some Uralic Peoples. Paper read at 11th International Conference on Minority Languages (ICML XI). Pécs, 5–6 July 2007.
Sorokina, I. P. and D. S. Bolina [И. П. Сорокина – Д. С. Болина]. 2001. Словарь энецко-русский и русско-энецкий [Enets-Russian and Russian-Enets Dictionary]. Санкт-Петербург.
Tereščenko, N. M. [Терещенко, Н. М.]. 1966a. Селькупский язык [Selkup]. In V. Lytkin (ed.), Языки народов СССР, Том 3. Финно-угорские и самодийские языки [Languages of the USSR, Volume 3. Finno-Ugric and Samoyedic languages]. Moscow: Наука.
Tereščenko, N. M. [Терещенко, Н. М.]. 1966b. Ненецкий язык [Nenets]. In V. Lytkin (ed.), Языки народов СССР, Том 3. Финно-угорские и самодийские языки [Languages of the USSR, Volume 3. Finno-Ugric and Samoyedic languages], 376–395. Moscow: Наука.
Tereščenko, N. M. [Терещенко, Н. М.]. 1966c. Нганасанский язык [Nganasan]. In V. Lytkin (ed.), Языки народов СССР, Том 3. Финно-угорские и самодийские языки [Languages of the USSR, Volume 3. Finno-Ugric and Samoyedic languages], 416–437. Moscow: Наука.
Tereščenko, N. M. [Терещенко, Н. М.]. 1966d. Энецкий язык [Enets]. In V. Lytkin (ed.), Языки народов СССР, Том 3. Финно-угорские и самодийские языки [Languages of the USSR, Volume 3. Finno-Ugric and Samoyedic languages], 438–457. Moscow: Наука.
Tereščenko, N. M. [Терещенко, Н. М.]. 1979. Нганасанский язык [Nganasan]. Leningrad: Наука.
Tereščenko, N. M. [Терещенко, Н. М.]. 1989. Ненецко-русский словарь [Nenets-Russian Dictionary]. Moscow: Наука.
Thomason, Sarah G. and Terrence Kaufman. 1988. Language Contact, Creolization, and Genetic Linguistics. Berkeley: University of California Press.
Uffmann, Christian. 2007. Vowel epenthesis in loanword adaptation. Tübingen: Max Niemeyer Verlag.
Várnai, Zsuzsa. 2002. Hangtan [Phonology and Phonetics]. In Beáta Wagner-Nagy (ed.), Chrestomathia Nganasanica. Studia Uralo Altaica Supplementum 10, 33–70. Szeged.
Várnai, Zsuzsa. 2003. Valóban morás nyelv-e a nganaszan? [Is Nganasan really a mora-counting language?] In Zoltán Molnár and Gábor Zaicz (eds.), Permistica et Uralica. FUP I, 268–271. Piliscsaba.
Várnai, Zsuzsa. 2004. A nganaszan nyelv fonológiai leírása [The phonological description of Nganasan]. Ph.D. dissertation, Department of Uralistics, Eötvös Loránd University Budapest.
Várnai, Zsuzsa. 2005. Some problems of Nganasan phonology: Mora or Syllable? In Beáta Wagner-Nagy (ed.), Mikola konferencia, 113–126. Szeged.
Várnai, Zsuzsa. Phonology, Phonotactics, Morphonology. In Beáta Wagner-Nagy (ed.), Descriptive Grammar of Nganasan [manuscript].
Vennemann, Theo. 1988. Preference laws for syllable structure and the explanation of sound change: With special reference to German, Germanic, Italian, and Latin. Berlin: Mouton de Gruyter.
Part II. Production: analysis and models

Articulatory coordination and the syllabification of word initial consonant clusters in Italian
Anne Hermes, Martine Grice, Doris Mücke and Henrik Niemann

Abstract
In this study we investigate the articulatory coordination of word initial consonant clusters in Italian. We show that these clusters are generally coordinated in a similar way to clusters in languages with complex syllable onsets, in that the timing of the rightmost consonantal gesture in relation to the vocalic gesture is adjusted according to the number of consonants in the cluster. However, clusters containing a sibilant, /s/ or /z/, are an exception and show a different coordination pattern altogether. Such clusters are referred to as having an ‘impure s’, mainly as a result of allomorphy of indefinite and definite articles (e.g. il premio, but lo studente). In such cases, the sibilant does not affect the coordination of the remaining consonants, indicating that it may not be part of the syllable onset.
1. Introduction

This study takes an articulatory approach to the syllabic parsing of word initial clusters in Italian within the framework of Articulatory Phonology (Browman and Goldstein 1988). In this model, the coordination patterns relating to consonants and vowels have been shown to reflect syllable structure in different languages (Browman and Goldstein 2000; Marin and Pouplier 2010 for American English, Goldstein et al. 2007 for Georgian and Tashlhiyt Berber, Shaw et al. 2009 for Moroccan Arabic). Articulatory Phonology models articulatory movements in terms of consonantal and vocalic gestures. These are coupled in relation to each other in specific ways, reflecting the status of the respective consonants and vowels within the syllable. In CV syllables, the C and V gestures are coupled in-phase, indicating a simultaneous initiation of these two gestures. This reflects the onset-nucleus relation. In VC syllables, by contrast, the V and C gestures are coupled in anti-phase relation, and are thus initiated sequentially. This reflects the nucleus-coda relation.

Crucially, syllables with complex onsets, CCV, are modelled as having two competing coupling modes. On the one hand both C gestures are coupled in-phase with the V gesture. On the other, the two C gestures are coupled in anti-phase to each other, such that they do not start simultaneously, aiding perceptual recoverability (Nam and Saltzman 2003). This is referred to as competitive coupling. In CCV syllables, the gesture for the rightmost consonant is closer to the vocalic gesture than in CV syllables, implying that the rightmost C adjusts its timing ‘to make room’ for the additional onset consonant. This adjustment is referred to as a shift of the rightmost consonant.

In Italian, when a word initial sibilant is followed by a consonant, it is referred to as ‘impure s’ (Baretti 1832), indicating a special status of this sibilant in a cluster compared to other clusters. The issue as to how /s/ is syllabified in such clusters has so far remained unresolved.¹ The present study attempts to shed light on this question from a kinematic point of view. First we start with an overview of the status of ‘impure s’ in Italian (see section 1.1.) and the link between articulatory coordination and syllable structure (see 1.2.). In the subsequent section, we summarise previous work on the articulation of /s/+C clusters (see 1.3.). Section 2 provides details of the experiment and section 3 presents the results. The summary and the discussion of the findings are dealt with in section 4, referring to coupling structure and its phonological implementations.

1.1. ‘Impure s’ in Italian

In Italian, consonant clusters with an ‘impure s’, /s/+C and /s/+CC, are treated differently in morphology from a single /s/ or a consonant cluster without a sibilant. For instance the masculine definite article alternates, depending on the consonants at the beginning of the word (e.g. il sale, il premio but lo studente, Davis 1990). Davis (1990) is specific about syllable structure, arguing that the definite article il occurs when the following C belongs to the syllable onset, while lo occurs when the following C is outside the onset. In fact he claims that it is not only outside the onset but also outside the syllable itself.
[1] In what follows we refer to /s/ and /z/ as /s/ for the purpose of simplification. Voicing is not distinctive in this position but rather conditioned by the voicing of the following consonant.
The syllabification of /s/ in clusters is an issue of much debate in a number of languages. For Dutch, Fikkert (1994) refers to /s/ in /s/+C clusters as extrasyllabic, while for English, Gierut (1999) treats it as an adjunct. Pan and Snyder (2004), on the other hand, propose that /s/ is a coda of a preceding syllable with an empty nucleus. According to Bertinetto (2004) the syllabification of Italian /s/+C clusters is still unresolved. In fact, Turchi and Bertinetto (2000) and Bertinetto (2004) go as far as to say that the syllabification of
/s/+C clusters is underdetermined for Italian, and may be speaker specific or context dependent. 1.2. Articulatory coordination and syllable structure Gestural coordination patterns have been proposed as a diagnostic for the affiliation of consonants to syllables based on the timing patterns between consonants and vowels (Browman and Goldstein 1988, Honorof and Browman 1995, Browman and Goldstein 2000, Goldstein et al. 2007). In Articulatory Phonology each gesture is associated with a planning oscillator (or clock, see Browman and Goldstein 2000, Nam and Saltzman 2003). The oscillators are coupled with each other in two basic modes. The most stable intrinsic mode is the in-phase relation (= initiated simultaneously), which is assumed for the coordination between consonantal gestures in syllable onsets and the following vocalic gesture (CV). Another, less stable, intrinsic mode is the anti-phase relation (= initiated sequentially). That mode is primarily employed to model the coordination of the consonant gestures in syllable codas (VC) with the preceding vowel. Figure 1 schematises the relation between syllable structure and coupling relations of articulatory gestures. Thus in onsets, consonantal gestures are coupled in-phase with the vocalic gesture (and therefore start simultaneously), whereas in codas, they are anti-phase, i.e. they are sequenced with the vocalic gesture.
Figure 1. Coupling of consonants and vowels with respect to syllable structure; in-phase relation = solid line; anti-phase relation = dotted line.
In complex onsets, consonants are in-phase with the vowel and at the same time anti-phase with each other (Nam and Saltzman 2003, Goldstein et al. 2007). This competitive coupling in complex onsets is present on the surface as the C-center effect (Browman and Goldstein 2000), where the mean of all consonantal targets (C-center) is aligned at a stable timing point relative to the vocalic target. Thus, the distance from the vocalic target to the mean of the consonantal targets for CC in CCV, or for CCC in CCCV, is comparable to the distance from the vocalic target to the midpoint of C in CV. As a result of this, the rightmost consonant within the cluster is shifted further towards the vowel with every added consonant. This rightward shift has recently been confirmed for Georgian (Goldstein et al. 2007). Other languages, such as Tashlhiyt Berber (Goldstein et al. 2007) and Moroccan Arabic (Shaw and Gafos 2008, Shaw et al. 2009), have been analysed as not allowing complex onsets. In these latter studies the rightmost consonant in a cluster has a stable timing with the vowel, regardless of the size of word initial clusters, thus confirming the analysis whereby the rightmost consonant is the only one included in the syllable onset. These studies indicate that it is possible to recover signatures of syllable structure from the timing of articulatory movements, especially from the gestural timing of the rightmost consonant in clusters relative to the vocalic anchor. 1.3. Articulation of /s/+C clusters There are many languages that have /s/+C clusters in word initial position. In English the coordination patterns for word initial /s/+C and /s/+CC clusters imply that /s/ forms part of a complex onset (see figure 2b), although such onsets can incur a sonority violation (e.g. spayed). The analysis of /s/ as a part of the onset in /s/+C clusters in English has recently been confirmed by Marin and Pouplier (2010). The original experiment by Browman and Goldstein (2000) involved triplets such as ‘sayed’, ‘spayed’ and ‘splayed’.
They illustrated the effect of competitive coupling (see figure 2a: in-phase relation of Cs with V and at the same time anti-phase relation with each other) in terms of the C-center effect (see figure 2b). Thus, it is the mean of the consonantal targets which is at a constant distance from the vocalic target rather than the rightmost C, which shifts, depending on the number of consonants in a cluster, such that the distance decreases with an increase in size of the cluster (see arrows in figure 2b). For a more detailed discussion of C-center measures see Hermes et al. (2008). In this study we concentrate on the distance between the rightmost consonant and the vocalic target. This is the variable that is used in Goldstein et al. (2007) to ascertain whether a sequence of consonants forms a complex onset. To measure this systematically, the rightmost consonant has to be
Figure 2. Coupling graph (a) and schematised articulatory patterns (b) for onsets in English, adapted from Saltzman et al. (2006).
kept constant (e.g. Berber: /mun – smun – tsmun/) as opposed to English (e.g. sayed – spayed – splayed) in earlier work (e.g. Browman and Goldstein 2000). The rightmost C variable is hypothesised to decrease (rightward shift) when single onsets are compared with non-sibilant clusters (where consonants are syllabified as part of the onset). For sibilant clusters, it is assumed that the rightmost consonant within the cluster is not shifted, but remains at a stable timing point, indicating that the sibilant is not part of the onset. 2. Method 2.1. Speakers We recorded two native Italian speakers, one female speaker (MS) in her mid-forties from Apulia in Southern Italy and one male speaker (AR) in his mid-thirties from Trentino, in Northern Italy. Both speakers spent their first thirty years in their hometowns. 2.2. Speech Materials The target words contain simplex onsets and clusters with and without sibilants. All words had a feminine (‘-a’) ending, ensuring that the definite article ‘la’ is constant (only the masculine article alternates between ‘il’ and ‘lo’). For the analysis we compared target words with a) C vs. CC, b) C vs. /s/+C and c)
CC vs. /s/+CC word initially, keeping the rightmost consonant constant. The word list is shown in table 1. The target words were embedded in the carrier sentence ‘Per favore dimmi la __ di nuovo’ (‘Please say the __ again’), ensuring an alternation of high and low vowels throughout the sequence.
Table 1. Wordlist
C                       CC                    /s/+CC
/rema/ (‘rheme’)        /prema/ (‘press’)     /sprema/ (‘squeeze’)
/rima/ (‘rhyme’)        /prima/ (‘first’)     /sprima/ (logatome)
/lina/ (proper name)    /plina/ (logatome)    /splina/ (logatome)

C                       /s/+C
/pina/ (proper name)    /spina/ (‘thorn’)
/fila/ (‘line’)         /sfila/ (‘s/he unthreads’)
/vita/ (‘life’)         /svita/ (‘s/he unscrews’)
A total of 300 sentences were recorded (15 target words × 10 repetitions × 2 speakers). The data are a subset of a more extensive corpus. 2.3. Recordings The recordings took place at the IfL Phonetics laboratory in Cologne. The speech material was displayed on a computer monitor. Target words were produced in pseudo-randomised order, each being spoken 10 times in total. Speakers were instructed to speak at a rate they considered to be comfortable. Acoustic and kinematic data were recorded simultaneously. We recorded the acoustic signal with a DAT-recorder (TASCAM DA-P1) using a condenser microphone (AKG C420 head set) and digitised it at 44.1 kHz/16 bit. The kinematic data were recorded with a 2D electromagnetic midsagittal articulograph (Carstens AG100; 10 channels). We placed two sensors on the upper and lower lip and three sensors on the tongue: tongue tip, tongue blade and tongue body (1 cm, 2 cm and 4 cm behind the tongue tip). Two additional sensors on the bridge of the nose and the upper gums served as references in order to correct for head movements during the recordings (see Hoole and Kühnert 1996). All kinematic data were sampled at 400 Hz, downsampled to 200 Hz and smoothed with a low-pass filter at 40 Hz. For displaying and labelling data, all
acoustic and kinematic data were converted to SSFF-format to enable the data to be analysed and annotated in the EMU Speech Database System (Cassidy & Harrington 2001). 2.4. Labelling Procedure All acoustic and articulatory landmarks were displayed and labelled by hand. We labelled the onset and offset of the target word and its acoustically defined segments. In the present study only the articulatory landmarks are reported on. The remaining labels were placed in relation to the articulatory record. We labelled movements in the vertical dimension, identifying minima and maxima in the respective velocity trace (zero crossings). For vowel-to-vowel articulation, we labelled the vocalic target for /i,e/. For consonants, we labelled the maximum targets of the primary constrictors (Byrd 2000), whereas labial consonants were identified by using the lip aperture index (LA, Byrd 2000). Figure 3 illustrates how the landmarks are annotated for those measures.
Figure 3. Labelling scheme for test word /plina/ in ‘Per favore dimmi la plina di nuovo’. From top to bottom: acoustic waveform, kinematic waveform for vertical tongue-tip position, inter-lip distance and vertical tongue-body position.
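The landmark definition in section 2.4 (minima and maxima of a position trace, located at zero crossings of its velocity) can be illustrated with a minimal sketch. This is not the authors' procedure, which was manual labelling in EMU; the function names and the toy trajectory below are hypothetical, and a forward difference is assumed as the velocity estimate.

```python
def forward_velocity(position, sample_rate_hz):
    """Forward-difference velocity (position units per second)."""
    return [(position[k + 1] - position[k]) * sample_rate_hz
            for k in range(len(position) - 1)]


def find_extrema(position, sample_rate_hz):
    """Return (sample_index, time_s, kind) wherever the velocity changes sign."""
    vel = forward_velocity(position, sample_rate_hz)
    landmarks = []
    for k in range(1, len(vel)):
        before, after = vel[k - 1], vel[k]
        if before > 0 and after <= 0:
            kind = "maximum"   # movement stops rising: positional peak
        elif before < 0 and after >= 0:
            kind = "minimum"   # movement stops falling: positional trough
        else:
            continue
        landmarks.append((k, k / sample_rate_hz, kind))
    return landmarks


# Hypothetical vertical sensor positions (mm), sampled at 200 Hz as in 2.3.
trace_mm = [0, 1, 3, 4, 3, 1, 0, -2, -3, -1, 0]
landmarks = find_extrema(trace_mm, sample_rate_hz=200)
```

On real articulograph data one would presumably smooth the signal first, as described in section 2.3, since unsmoothed traces produce spurious sign changes.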
Figure 4. Schematic diagram of articulatory measurements. Arrow indicating the distance between rightmost C and V target, comparing a) /lina/ and b) /plina/.
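The contrast between the two timing hypotheses behind the rightmost C measure can be made concrete with a small sketch. All numbers are hypothetical (a 200 ms lag for the single consonant, 100 ms between successive consonant targets), and the function names are illustrative only; the sketch simply places consonant targets relative to a vocalic target at 0 ms under a C-center organisation and under a right-edge organisation.

```python
def predicted_targets(n_consonants, singleton_lag_ms, spacing_ms, organisation):
    """Predicted consonant target times (ms), with the vowel target at 0.

    "c_center":   the mean of the consonant targets stays where the single
                  consonant was (complex onset).
    "right_edge": the rightmost consonant stays where the single consonant
                  was (only the rightmost C belongs to the onset).
    """
    if organisation == "c_center":
        return [-singleton_lag_ms + (i - (n_consonants - 1) / 2) * spacing_ms
                for i in range(n_consonants)]
    if organisation == "right_edge":
        return [-singleton_lag_ms - (n_consonants - 1 - i) * spacing_ms
                for i in range(n_consonants)]
    raise ValueError("organisation must be 'c_center' or 'right_edge'")


def c_center(targets_ms):
    """Mean of the consonantal target times."""
    return sum(targets_ms) / len(targets_ms)


def rightmost_c_to_v(targets_ms, vowel_target_ms=0.0):
    """Distance from the rightmost consonant target to the vowel target."""
    return vowel_target_ms - max(targets_ms)


# Hypothetical values: singleton lag 200 ms, consonant spacing 100 ms.
complex_onset = predicted_targets(2, 200, 100, "c_center")
simplex_onset = predicted_targets(2, 200, 100, "right_edge")
lag_complex = rightmost_c_to_v(complex_onset)   # shrinks relative to 200 ms
lag_simplex = rightmost_c_to_v(simplex_onset)   # stays at 200 ms
```

Under these assumptions a decrease in the rightmost C to V distance points to the C-center organisation, while a stable distance points to the right-edge organisation; this is the logic of the measure, not a model fitted to the data.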
Figure 4 schematises the rightmost C variable for a simple onset, ‘la lina’ (see 4a), and for a complex onset, ‘la plina’ (see 4b). We measured the distance between the target of the rightmost consonantal gesture and the target of the vocalic gesture. When a consonant is added to an onset, the distance of the rightmost C target relative to the V target is expected to decrease, due to the assumed C-center effect for complex onsets (marked by the horizontal arrow on a virtual time axis). 3. Results We measured the distance of the rightmost C to the V target in 293 tokens for both speakers; 7 utterances were discarded from the analysis, due to technical problems. An overall ANOVA with rightmost C as dependent variable revealed significance for the independent variable onset complexity (C, CC, /s/+C, /s/+CC; p < 0.05) and for speaker (p < 0.01; speaker as random factor). We therefore used one-way ANOVAs for each speaker separately, with rightmost C as the dependent variable and onset complexity as the independent variable. 3.1. Results: C vs. CC structure The results for single C and CC clusters are presented for each speaker separately (see table 2). Results of the ANOVAs are summarised in the last two columns (α-level is set at 0.05). Comparing C to CC clusters, we find the expected decrease in the rightmost C variable, i.e. a shift of the rightmost consonant to make room for the added consonant.
Table 2. Distance of rightmost C to V target in C and CC, both speakers, standard deviation in parentheses.
Rightmost C to V (ms)
Speaker  Word pair     C          CC         F-value   p-value
MS       rema-prema    151 (11)   124 (6)    47.255    ***
MS       rima-prima    166 (11)   117 (7)    141.699   ***
MS       lina-plina    203 (12)   165 (21)   22.693    ***
AR       rema-prema    189 (16)   140 (21)   27.279    ***
AR       rima-prima    182 (20)   122 (23)   40.574    ***
AR       lina-plina    227 (27)   155 (28)   33.812    ***
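As a quick arithmetic check, the rightward shift for each word pair is simply the C mean minus the CC mean from table 2. The sketch below recomputes it; the dictionary layout and function name are illustrative, while the values are the means reported in the table.

```python
# Mean rightmost-C-to-V distances (ms) from table 2: (C, CC) per word pair.
TABLE_2_MEANS_MS = {
    "MS": {"rema-prema": (151, 124), "rima-prima": (166, 117), "lina-plina": (203, 165)},
    "AR": {"rema-prema": (189, 140), "rima-prima": (182, 122), "lina-plina": (227, 155)},
}


def rightward_shifts(means_by_speaker):
    """Shift of the rightmost C towards the vowel: C mean minus CC mean."""
    return {
        speaker: {pair: c_mean - cc_mean for pair, (c_mean, cc_mean) in pairs.items()}
        for speaker, pairs in means_by_speaker.items()
    }


shifts = rightward_shifts(TABLE_2_MEANS_MS)
for speaker, pairs in shifts.items():
    for pair, shift_ms in pairs.items():
        print(f"{speaker} {pair}: {shift_ms} ms")
```

The resulting differences are the per-pair averages quoted in the running text for each speaker.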
In all cases (p < 0.001) it is shown that the consonant is shifted considerably towards the vowel (for speaker MS: /rema/ vs. /prema/ on average 27ms; in /rima/ vs. /prima/ on average 49ms; in /lina/ vs. /plina/ on average 38ms; for speaker AR: in /rema/ vs. /prema/ on average 49ms; in /rima/ vs. /prima/ on average 60ms; in /lina/ vs. /plina/ on average 72ms). In figure 5 the considerable decrease of the rightmost C variable in C vs. CC structured target words is shown graphically.
Figure 5. Rightmost C to V target in C vs. CC, speaker MS (above) and AR (below). Line-up point is the vocalic target for /i,e/.
The graphs are constructed in analogy to figure 4 (schematic diagram of distance between rightmost C and V target). Looking at the graph, the left edge of each bar plot represents the mean target of the rightmost C in a cluster or the only C (if it is a single consonant) measured as a distance in ms from the vocalic target /i,e/ (synchronised at 0 in the graph). The non-sibilant clusters show the expected decrease of the rightmost C (comparing C vs. CC) to make room for the added consonant to the left, which gives us an indication that these clusters form complex onsets in Italian. We now turn to the results for clusters involving a sibilant. 3.2. Results: C vs. /s/+C structure Table 3 provides the mean durations and standard deviations, separately for speaker MS and AR comparing target words beginning with C and /s/+C (α-level is set at 0.05).
Table 3. Distance of rightmost C to V target in C vs. /s/+C, both speakers, standard deviation in parentheses.
Rightmost C to V (ms)
Speaker  Word pair     C          /s/+C      F-value   p-value
MS       pina-spina    241 (13)   243 (13)   0.143     n.s.
MS       fila-sfila    189 (17)   184 (10)   0.617     n.s.
MS       vita-svita    163 (12)   169 (15)   0.823     n.s.
AR       pina-spina    267 (29)   271 (19)   0.053     n.s.
AR       fila-sfila    197 (24)   187 (26)   0.835     n.s.
AR       vita-svita    215 (16)   212 (9)    0.291     n.s.
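For readers who want to relate the F-values to the means and standard deviations in the tables, a two-group one-way ANOVA can be approximated from summary statistics alone. The sketch below assumes 10 tokens per cell (the number of repetitions; the actual counts are unknown because 7 tokens were discarded) and treats the rounded table values as exact, so it can only be expected to land near the published figures, not on them. The function name is illustrative.

```python
def f_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """One-way ANOVA F statistic for two groups, from means, SDs and counts."""
    grand_mean = (n_a * mean_a + n_b * mean_b) / (n_a + n_b)
    # Between-group sum of squares (1 degree of freedom for two groups).
    ss_between = n_a * (mean_a - grand_mean) ** 2 + n_b * (mean_b - grand_mean) ** 2
    # Within-group sum of squares, rebuilt from the sample standard deviations.
    ss_within = (n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2
    df_between = 1
    df_within = n_a + n_b - 2
    return (ss_between / df_between) / (ss_within / df_within)


# Speaker MS, pina vs. spina (table 3), assuming n = 10 per word.
f_pina_spina_ms = f_from_summary(241, 13, 10, 243, 13, 10)
# Speaker MS, rima vs. prima (table 2), same assumption.
f_rima_prima_ms = f_from_summary(166, 11, 10, 117, 7, 10)
```

Under these assumptions the pina-spina comparison gives an F of about 0.12 and the rima-prima comparison about 141, in the same range as the reported 0.143 and 141.699; the remaining gap is what one would expect from rounding and unequal token counts.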
For both speakers in all cases, there is no difference in the timing from the rightmost C to the vocalic target, when comparing C to /s/+C clusters (p > 0.05 n.s.). Although a sibilant is added to the beginning of the word, the rightmost C is not adjusted relative to the vowel, i.e. latencies remain stable. In figure 6 the results are presented graphically. Comparing the bars for each word pair (C vs. /s/+C), we found no decrease of the distance of the rightmost C to V target. The latencies remain the same. That was the case for speaker MS in /pina/ (Δ 241ms) vs. /spina/ (Δ 243ms), /fila/ (Δ 189ms) vs.
/sfila/ (Δ 184ms) and /vita/ (Δ 164ms) vs. /svita/ (Δ 169ms). Furthermore, no rightward shift was found for speaker AR in /pina/ (Δ 267ms) vs. /spina/ (Δ 271ms), /fila/ (Δ 197ms) vs. /sfila/ (Δ 187ms) and /vita/ (Δ 215ms) vs. /svita/ (Δ 212ms). Across these target words the rightmost C variable remains stable for both speakers (MS: on average 4ms, AR: on average 1ms), when the sibilant is added.
Figure 6. Rightmost C to V target C vs. /s/+C, speaker MS (above) and AR (below). Line-up point is the vocalic target for /i/.
3.3. Results: CC vs. /s/+CC structure We now examine the cases where /s/ is added to a complex onset. This is the case for /s/+CC clusters (see table 4). The rightmost C variable is stable for speaker MS (/prema/ (Δ 124ms) vs. /sprema/ (Δ 128ms), /prima/ (Δ 117ms) vs. /sprima/ (Δ 113ms), /plina/ (Δ 165ms) vs. /splina/ (Δ 158ms)). The same systematic pattern was found in all cases for AR (/prema/ (Δ 140ms) vs. /sprema/ (Δ 135ms), /prima/ (Δ 122ms) vs. /sprima/ (Δ 134ms), /plina/ (Δ 155ms) vs. /splina/ (Δ 158ms)). In this pattern, the rightmost C does not make room for the added sibilant.
Table 4. Distance of rightmost C to V target in CC and /s/+CC, both speakers, standard deviation in parentheses.
Rightmost C to V (ms)
Speaker  Word pair       CC         /s/+CC     F-value   p-value
MS       prema-sprema    124 (6)    128 (12)   0.835     n.s.
MS       prima-sprima    117 (7)    113 (9)    1.405     n.s.
MS       plina-splina    165 (21)   158 (15)   2.047     n.s.
AR       prema-sprema    140 (21)   135 (13)   0.424     n.s.
AR       prima-sprima    122 (23)   134 (21)   1.455     n.s.
AR       plina-splina    155 (28)   158 (15)   0.067     n.s.
Compared to the /s/+C clusters, we find a similar articulatory timing pattern. Whenever the ‘impure s’ is added to either a single C or a complex onset, the distance of the rightmost C target to the V target does not decrease to make room for the added sibilant, as would be expected for initial clusters where all consonants are part of the complex onset. This stable timing of the rightmost C variable, comparing CC with /s/+CC, is displayed in the graphs in figure 7.
Figure 7. Rightmost C to V target in CC vs. /s/+CC, speaker MS (above) and AR (below). Line-up point is the vocalic target for /i,e/.
4. Discussion These results on articulatory coordination in Italian provide evidence for complex onsets in Italian (CC clusters). In the analysis of the target words C and CC, we found a decrease in the distance between the rightmost C target and the V target. The second C target in the cluster is shifted towards the vowel. This supports the hypothesis of an underlying competitive coupling structure
for complex onsets. The articulatory coordination of the rightmost C in a complex onset is adjusted according to the number of consonants. The coordination pattern in /s/+C clusters is different. Here there is no decrease in the distance between the rightmost C target and the V target. These results provide additional, articulatory motivation for assigning a special status to the Italian ‘impure s’ in terms of its syllabic constituency. Table 5 summarises the results for the distance between the rightmost C and the V target in the investigated words.
Table 5. Summary of shift for rightmost C towards V for both speakers.
Structure         Shift of rightmost C   Cases
C vs. CC          YES                    all
C vs. /s/+C       NO                     all
CC vs. /s/+CC     NO                     all
These results show that /s/ does not exhibit the articulatory timing patterns required for membership of the syllable onset, in that the rightmost C target is at a constant distance from the V target. This is true for all analysed target words containing an ‘impure s’ for both speakers. In other words, adding the sibilant to the beginning of a word does not affect the timing of the other consonants relative to the vocalic target. Thus, there is no evidence for an underlying competitive coupling structure between /s/ and the other consonants. 4.1. Coupling structure The articulatory analysis reflects the different coupling structures proposed for word initial clusters in Italian. For non-sibilant consonants in word initial position we found evidence of underlying competitive coupling: the distance from the rightmost C to the V target decreases, reflecting a C-center like coordination (Hermes et al. 2008). This coupling is schematised in figure 8a, where all consonants are coupled in-phase with the vowel (ideally initiated simultaneously), but the consonants are anti-phase with each other (ideally initiated sequentially). By contrast, sibilants in clusters did not show a C-center like coordination. In the sequence /s/+C, the rightmost consonant is not shifted rightwards when compared to a simple C onset (see figure 8b).
Figure 8. Schematised articulatory pattern and coupling graphs for C vs. CC cluster (a), C vs. /s/+C clusters (b) and C vs. CC vs. /s/+CC (c) clusters in Italian.
The same holds for /s/+CC compared to CC (see Figure 8c). This implies that ‘impure s’ does not participate in the competitive coupling structures. 4.2. Phonological implications This is the first study to show that one and the same language can have different gestural timing patterns in word initial consonant clusters, depending on the identity of the consonants concerned.
1) We have provided evidence for C-center coordination in word initial non-sibilant clusters in Italian, supporting an analysis of these clusters as complex onsets.
2) In the case of /s/-clusters we have shown that the sibilant does not coordinate in the same way as other consonants. This is the case both in three-consonant clusters, /s/+CC (e.g. /spr/), and in two-consonant clusters, /s/+C (e.g. /sp/). This coordination pattern supports an analysis of /s/ as being outside the onset of the syllable, or even outside the syllable itself.
3) Specifically for Italian, morphological alternations before words beginning with /s/-clusters have fuelled a debate as to their syllabic structure. Our results add one more piece of evidence that the sibilant is not part of a complex onset in this language.
4) Comparing our results for Italian to those for English, we can show that it is not /s/-clusters per se that have this coordination, since sibilants are timed in the same way as other consonants in the latter. Thus /s/-clusters cannot be seen as having a universal gestural coordination.
Acknowledgements We would like to thank Hosung Nam (Haskins Laboratories) for the fruitful discussion on coupling structures for word initial consonant clusters in Italian with and without ‘impure s’.
References
Baretti, G. 1832. English and Italian Dictionary. Part the Second. Florence: Cardinal Printing Office.
Bertinetto, P.M. 2004. On the undecidable syllabification of /sC/ clusters in Italian: Converging experimental evidence. Italian Journal of Linguistics/Rivista di Linguistica 16, 349–372.
Browman, C.P. and Goldstein, L. 1988. Some Notes on Syllable Structure in Articulatory Phonology. Phonetica 45, 140–155.
Browman, C.P. and Goldstein, L. 2000. Competing constraints on intergestural coordination and self-organization of phonological structures. Les Cahiers de l’ICP, Bulletin de la Communication Parlée, 25–34.
Byrd, D. 2000. Articulatory vowel lengthening and coordination at phrasal junctures. Phonetica 57, 3–16.
Cassidy, S. and Harrington, J. 2001. Multi-level annotation in the Emu speech database management system. Speech Communication 33, 61–77.
Davis, S. 1990. Italian Onset Structure and the Distribution of il and lo. Linguistics 28, 43–55.
Fikkert, P. 1994. On the acquisition of prosodic structure. Ph.D. Dissertation, HIL dissertations 6, Leiden University. The Hague: Holland Academic Graphics.
Gierut, J.A. 1999. Syllable onsets: clusters and adjuncts in acquisition. Journal of Speech, Language, and Hearing Research 42, 708–726.
Goldstein, L., Chitoran, I. and Selkirk, E. 2007. Syllable structure as coupled oscillator modes: Evidence from Georgian vs. Tashlhiyt Berber. Proceedings of the 16th International Congress of Phonetic Sciences, Saarbrücken, Germany, 241–244.
Hermes, A., Grice, M., Mücke, D. and Niemann, H. 2008. Articulatory Indicators of Syllable Affiliation in Word Initial Consonant Clusters in Italian. Proceedings of the 8th International Seminar on Speech Production, Strasbourg, France, 433–436.
Hoole, P. and Kühnert, B. 1996. Tongue-jaw coordination in German vowel production. Proceedings of the 1st ESCA tutorial and research workshop on Speech Production Modelling/4th International Seminar on Speech Production, Autrans, 1996, 97–100.
Honorof, D.N. and Browman, C.P. 1995. The center or edge: how are consonant clusters organized with respect to the vowel? In K. Elenius and P. Branderud (eds.). Proceedings of the 13th International Congress of Phonetic Sciences, Stockholm, Sweden, 552–555.
Marin, S. and Pouplier, M. 2010. Temporal organization of complex onsets and codas in American English: Testing the predictions of a gestural coupling model. Motor Control 14(3), 380–407.
Nam, H. and Saltzman, E. 2003. A competitive, coupled oscillator model of syllable structure. Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, Spain, 2253–2256.
Pan, N. and Snyder, W. 2004. Acquisition of /s/-initial clusters: A parametric approach. In Proceedings of the 28th BUCLD. Boston, 436–446.
Saltzman, E., Nam, H., Goldstein, L. and Byrd, D. 2006. The distinctions between state, parameter and graph dynamics in sensorimotor control and coordination. In M. L. Latash and F. Lestienne (eds.). Motor Control and Learning. New York: Springer, 63–73.
Shaw, J. and Gafos, A. 2008. C-Center and Syllabification in Moroccan Arabic. Poster presentation: CUNY Conference on the Syllable, New York, January, 17–19.
Shaw, J., Gafos, A., Hoole, P. and Zeroual, C. 2009. Syllabification in Moroccan Arabic: evidence from patterns of temporal stability. Phonology 26, 187–215.
Turchi, L. and Bertinetto, P.M. 2000. La durata vocalica di fronte ai nessi /sC/: un’indagine su soggetti pisani. Studi Italiani di Linguistica Teorica e Applicata 29, 389–421.
A gestural model of the temporal organization of vowel clusters in Romanian
Stefania Marin and Louis Goldstein
Abstract This study proposes a task-dynamic gestural model of the Romanian hiatus sequence /e.a/ and of diphthong /ea/, starting from the hypothesis that the temporal organization of hetero- and tauto-syllabic vowel clusters can be modeled in terms of particular coupling relations. For modeling hiatus /e.a/, stimuli were created with the oscillators for vowels /e/ and /a/ coupled anti-phase (180-degrees) or on different cycles (360-degrees), resulting in their sequential production. These stimuli were classified perceptually by Romanian listeners as hiatus sequences. For modeling stressed diphthong /ea/ and its alternation with unstressed vowel /e/, stimuli were created with the oscillators for vowels /e/ and /a/ coupled in-phase (0-degree), resulting in their synchronous production, and with additional manipulations of dynamic parameters, intended to model stress effects. The perceptual results showed that vowels /e/ and /a/ synchronously coordinated were perceived as vowel /e/, when all dynamical parameters were kept constant, and that a diphthong percept was triggered when the blending weight for /a/ was greater than for /e/, causing vowel /a/ to achieve its target closer to its specification, to the detriment of vowel /e/. An acoustic analysis further showed a similarity between the modeled stimuli and corresponding stimuli produced by Romanian native speakers.
1. Introduction Languages often distinguish between vowel sequences or clusters that are parsed into different syllables (hetero-syllabic hiatus sequences, shown schematically in diagram 1a), and vowel sequences that are assigned to a single syllable (tauto-syllabic diphthongs, 1b–c). Many languages distinguish between hetero- and tauto-syllabic structures by treating one of the vowels in the tautosyllabic sequence as a consonantal glide, and by structurally assigning it to the onset or coda of the syllable, rather than its nucleus (1b). Other languages treat such tauto-syllabic sequences as syllable nuclei – or true vocalic clusters (1c), with the difference between hiatus sequences and such diphthongs being whether they are assigned to the same or different nuclei (1a vs. 1c). How the structural configuration of a nuclear diphthong (1c) is to be timed during speech production in a manner that would distinguish it from hiatus structures,
and secondarily also from non-nuclear diphthongs, is a question that has received little consideration. This paper seeks to address it by using a model that formally and systematically links linguistically significant structures (such as syllables and within-syllable constituents) to their temporal implementation at the production level. Specifically, it is proposed that the structural differences in (1) are captured by distinct temporal relations holding between these different vowel sequences.
(1) [Syllable-structure diagrams not reproduced: a. hetero-syllabic hiatus, with the two vowels in different syllables; b. tauto-syllabic diphthong with one vowel treated as a glide in the onset or coda; c. tauto-syllabic diphthong with both vowels in the nucleus.]
The test-case language selected for the temporal modeling of the structural distinctions illustrated in (1) is Romanian, with extensions to other crosslinguistic instances remaining a subject for future examination. Romanian provides an interesting case for investigating this question in that nucleus diphthong /ea/ contrasts both with the hiatus sequence /e.a/ and with the nonnuclear diphthong /ja/ (cf. Chitoran 2001, for a language description and a detailed discussion of these diphthongs’ phonotactics). Furthermore, the nuclear diphthongs have a mid quality, which makes them quite distinguishable from non-nuclear diphthongs (Chitoran 2002). The nuclear diphthong participates in a stress-conditioned alternation, shown in (2). An interesting experimental finding was that alternating /e/ in (2b) was realized acoustically more centralized than non-derived /e/ (3) (Marin 2005, accepted). This difference was observed both at vowel onset and at mid-point. At the same time, alternating /e/ was shown not to differ qualitatively from the onset part of diphthong /ea/, while non-alternating /e/ and the onset part of the diphthong differed significantly. At mid-point, the diphthong differed from both alternating and non-alternating /e/, exhibiting more centralized formant patterns than those of either alternating or non-alternating /e/. The qualitative difference between the diphthong and non-alternating /e/ is not surprising assuming a bi-vocalic representation of diphthongs, such as the one in (1c): the difference between diphthong-onset and non-alternating /e/ could be explained as a co-articulation effect of the diphthong’s offset part (vowel /a/) on its onset, an effect naturally absent in the case of non-alternating /e/. 
Following this reasoning, the absence of an acoustic difference between alternating /e/ and the diphthong’s onset suggested that their properties at onset were similar – namely in both cases their beginning consisted of vowel /e/ being co-produced with vowel /a/. Alternating /e/’s acoustic properties could
therefore be the result of vowels /e/ and /a/ being co-produced with each other, which could explain both the difference between alternating and non-alternating /e/, and the lack of difference between alternating /e/ and diphthong /ea/’s onset.
(2) Alternating roots:
    a. Diphthong: ['sea.ra] ‘the evening’
    b. Alternating /e/: [se.'ra.ta] ‘the evening party’
(3) Non-alternating roots:
    a. ['se.ra] ‘the greenhouse’
    b. [se.ri.'tʃi.ka] ‘the greenhouse-Diminutive’
Starting from this hypothesis, the current paper’s aim is to explore the extent to which the planning and execution of Romanian diphthong /ea/ (and potentially similar units cross-linguistically) can be modeled in a way that (a) is consistent with the kind of compositional phonological representation shown in (1c), while at the same time being distinct from hetero-syllabic /e.a/, (b) is capable of producing the acoustic patterns observed, and (c) can account in a principled way for the alternation between diphthong /ea/ and alternating /e/. In a preliminary gestural modeling study (Marin 2007), in which task-dynamic modeled stimuli were categorized by native speakers, an /e/ vowel percept was obtained when the constrictions/activation intervals for vowels /e/ and /a/ were fully overlapped, and a diphthong percept when the activation intervals for vowels /e/ and /a/ were overlapped for approximately 90% of their movement. When the activation intervals for /e/ and /a/ did not overlap at all, the resulting percept was hiatus /e.a/. These previous results suggested that both alternating /e/ and the diphthong could be modeled as vowels /e/ and /a/ whose constriction movements were (almost) fully overlapped, with the difference that in the presence of stress, vowel /a/ would presumably be realized slightly longer and spatially stronger, and hence not fully blended with the movement for vowel /e/. In contrast, the hiatus sequence /e.a/ could be modeled as two vowels fully sequential (rather than overlapped).
These temporal relations as a function of syllable organization can be formalized in terms of specific phasing relations (or coupling modes) between the respective vowels. For many types of skilled actions, it has been shown that two coupling modes, in-phase and anti-phase, require no learning and can be stably maintained (Haken et al. 1985; Turvey 1990). If the planning clocks responsible for triggering two actions are coupled in-phase, the actions will be triggered synchronously; if the two clocks are coupled anti-phase, one action will be triggered after the other, with a lag equal to half the clock
period; finally, if two actions are coupled in-phase but on different cycles (i.e. they are “360-degrees-coupled”), their onsets will lag by a complete clock cycle, and the two actions will be triggered fully sequentially. It has been hypothesized that speech employs these intrinsic coupling modes as well, and that syllable structure could be understood in terms of these specific modes of coordination (Browman and Goldstein 2000; Byrd et al. 2009; Goldstein et al. 2006; Krakow 1999; Marin and Pouplier 2010; Nam 2007; Nam et al. 2009). This approach provides a principled and economical way of understanding temporal organization in speech production, by making use of coupling relations between planning oscillators, assumed to play a role not only in speech but in coordinated human action in general. Thus, while in the study discussed above (Marin 2007) the distinction between alternating /e/, diphthong /ea/ and hiatus /e.a/ was achieved informally by manipulating temporal overlap, the present study aims to model these linguistic categories as arising from lawful consequences of specific inter-gestural coupling modes. Specifically, we hypothesize that the temporal pattern exhibited by hiatus sequences with little to no overlap between the vowel activations could be modeled as a 360-degree coupling such that movement for vowel /a/ begins roughly when movement for vowel /e/ ends. As for diphthong /ea/ and its stress-conditioned alternation with /e/, it is hypothesized that the overlap pattern shown previously (Marin 2007) to result in the percept of /e/ or /ea/ can be modeled as the result of in-phase coupling between the two vowel actions. Whether this coordination mode results in the percept of a vowel or of a diphthong should be determined by additional dynamic parameters, whose exact nature is the experimental focus of this paper. 
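The relation between coupling mode and onset lag described above can be illustrated with a minimal sketch. It assumes that both planning clocks share a single period; the function name and the 200 ms period are hypothetical values chosen for illustration and are not taken from TADA.

```python
def onset_lag(coupling_deg: float, clock_period_ms: float) -> float:
    """Lag between the onsets of two gestures for a given relative phase.

    Assumes both planning clocks run with the same period, so that a
    relative phase of 360 degrees corresponds to one full clock cycle.
    """
    return coupling_deg / 360.0 * clock_period_ms


PERIOD_MS = 200.0  # hypothetical planning-clock period, for illustration only

for label, phase in [("in-phase", 0), ("anti-phase", 180), ("360-degree", 360)]:
    # in-phase: synchronous onsets; anti-phase: half a period; 360-degree: a full period
    print(f"{label:>10}: lag = {onset_lag(phase, PERIOD_MS):.0f} ms")
```

Under these assumptions the three modes yield lags of zero, half a period and a full period respectively, which corresponds to the synchronous, partly sequential and fully sequential timing patterns discussed in the text.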
This analysis entails that the hiatus and the diphthong are compositionally similar, but distinguishable in terms of the coupling relations, and hence specific timing, holding between their composing vowel actions. It also entails that the alternation between diphthong /ea/ and alternating /e/ is not structural, but the result of different dynamical parameters governing the same vowel actions. To test this analysis, the current study presents a gestural model of diphthong /ea/, its alternation with vowel /e/, and its contrast with hiatus /e.a/. The modeled stimuli are evaluated both perceptually (Experiments 1 and 2), and by comparing the acoustic properties of modeled stimuli with those of corresponding stimuli produced by native speakers (Experiment 3). Non-nuclear diphthongs (1b) are not considered in this paper: as onset-nucleus or nucleus-coda structures (cf. Chitoran and Hualde 2007) they are assumed to be organized temporally as onsets/codas with a consonantal glide.
2. Description of the model

The computational model used in the current study, the Task-Dynamic Application (TADA), is a gesture-based system developed at Haskins Laboratories to test hypotheses formulated within dynamical speech production models, such as Articulatory Phonology (Browman and Goldstein 1990; Browman et al. 1984; Goldstein et al. 2006; Nam et al. n.d.; Saltzman and Munhall 1989). TADA generates speech outputs on the basis of dynamical specifications of articulatory gestures (as speech action units) and the coupling relations among their clocks, which serve as information for computing a gestural score with precise activation intervals for each gesture. Articulator movement then results from imposing a set of dynamical controls on the articulators. The resulting articulator trajectories are in turn used to compute vocal tract shapes, area functions, and ultimately, sound via the pseudo-articulatory synthesizer HLSyn (Hanson and Stevens 2002).

3. Experiment 1

3.1. Preliminary considerations on modeling word stress

As the current TADA implementation does not automatically model stress, and because the alternation in Romanian between diphthong /ea/ and alternating /e/ is stress-conditioned, several possible stress-relevant parameters were explored. Throughout the paper, we assume that vowel /a/ bears stress in the diphthong, as its most sonorous element (cf. Selkirk 1984). Alternatively, our argument remains similar conceptually if it is assumed that both vowels (and indeed the entire syllable) are stress-bearing, but that vowel /a/, as more sonorous, is affected more articulatorily by stress. Two established acoustic correlates of word stress are vowel duration and spectrum balance (cf. Sluijter and van Heuven 1996, for a review). These acoustic differences between stressed and unstressed vowels have been assumed to be “caused by increased physiological effort due to stress” (Sluijter and van Heuven 1996: 2473).
This view is consistent with evidence available from physiological studies showing that stressed vowels are produced with overall greater articulatory excursions (cf. for example Beckman and Cohen 2000; Beckman and Edwards 1994; Harrington et al. 1995; de Jong 1995; Mooshammer and Fuchs 2002), and that they are generally less affected by coarticulation (cf. Cho 2004; Fowler 1981). For our purposes, “physiological effort” was interpreted and modeled as articulatory “strengthening” of a given gesture relative to other gestures, by allowing the stressed gesture to take over a
shared articulator, and achieve its constriction closer to its underlying target, to the detriment of unstressed gestures (cf. also the insights in Lindblom 1963, and more recently de Jong’s 1995 model of stress as hyperarticulation). Because F0 movement is controlled primarily by placement of prosodic pitch accents rather than by lexical stress, per se (cf. Beckman and Edwards 1994; Sluijter and van Heuven 1996), it was not considered here. Vowel quality was also not assumed to be a relevant cue for encoding stress in Romanian, given previous impressionistic descriptions and empirical evidence showing that vowel /e/ in Romanian does not differ qualitatively as a result of stress (Marin accepted). On the basis of these considerations, three parameters were tested for modeling stress effects: activation interval of the vowel gestures (affecting the vowels’ relative duration), relative blending weight of the two vowel gestures (determining the vowels’ relative articulatory “strength”), and presence of a prosodic gesture (Byrd and Saltzman 2003) slowing the time course of speech production, and resulting in lengthening of the affected constriction. Each of these parameters will now be considered in more detail. The activation interval of the two relevant vowel gestures determines the time between activation onset and offset of each vowel. The coupled oscillator model specifies the phase at which a gesture is activated relative to another, while de-activation by default occurs at some regular phase of the gesture’s own clock (340-degree for vowels). Thus two vowels coupled in-phase are synchronous at activation onset, and by default (i.e. determined by their own internal clocks) also at offset. Activation offset was manipulated so that for some stimuli offset of /e/ occurred earlier than offset of /a/ resulting in a relatively shorter duration of /e/, mirroring the fact that in Romanian (and other languages, cf. 
Lindblom 1963) /e/ is slightly shorter than /a/ (Burileanu 2002). While differential duration of low vs. mid/high vowels is not per se a stress-related property of these vowels, it was assumed that stress could affect the movement, and hence the duration, of an intrinsically longer low vowel more than that of a shorter one. It must be noted that without a manipulation of activation offset, only very small intrinsic vowel duration differences would emerge automatically from the current implementation of the model. A second manipulation was the relative blending weight of the two vowels. In the prosodic component of TADA currently under development, stress is modeled, in part, by means of a spatial modulation gesture (so-called μ-gesture) which serves to make stressed gestures more extreme, achieving constrictions closer to their underlying target values, in comparison to unstressed gestures which may show more target undershoot (Saltzman et al. 2008). In the currently available version of TADA in which μ-gestures are not yet implemented, their effect can be approximated for the case of two synchronous
gestures by manipulating their relative blending weights (cf. Saltzman and Munhall 1989). This weight determines how the target parameters are to be averaged in the case when their time-overlapping actions control the same vocal tract variables (for vowels /e/ and /a/, Tongue Body Constriction Location and Degree). Thus, the stressed gesture can be given a higher blending weight than the unstressed one, so that its target specification would affect the final blended state more, much as would occur if the blending weights were equal but an enhancing μ-gesture were associated with the stressed vowel. For our experiment, we manipulated blending weight so that we had stimuli with both equal and unequal blending weights for /e/ and /a/.¹ Finally, a third manipulation had to do with the presence or absence of a prosodic gesture (π-gesture) which warps the underlying clock speed of all gestures produced concurrently with it, hence slowing their movement. π-gestures have been previously used for modeling lengthening at various prosodic boundaries (Byrd 2000; Byrd and Saltzman 2003), and this was assumed to be one possible way of modeling lengthening under stress as well.

3.2. Method

3.2.1. Participants

Twelve native Romanians, naïve to the purpose of the experiment, and with no reported speech, hearing or language deficits, participated in this auditory perception task.

3.2.2. Materials and procedure

We hypothesized that the temporal organization of hiatus /e.a/ could be modeled with vowels /e/ and /a/ coupled 360-degree, while diphthong /ea/ and its alternation with vowel /e/ could be characterized by in-phase coupling.
To test this hypothesis, we modeled stimuli with the planning oscillators for vowels /e/ and /a/ coupled in-phase (0-degree), resulting in their synchronous production, coupled anti-phase (180-degree), resulting in partly sequential relative timing of their activation intervals, and finally coupled 360-degree, resulting in completely sequential timing of their activation intervals. Stimuli with single vowels /e/ and /a/ were also modeled. The gestures for /e/ and /a/ were modeled throughout using the default TADA specifications for vowels [ε] and [a] respectively, matching the phonetic characteristics of Romanian /e/ and /a/ (cf. Chitoran 2001). All the stimuli had an initial and final labial stop /b/ flanking the relevant vowels (/b_b/).

1. The same (weight-)averaging mechanism is at work regardless of the identity of the gestures to be blended (a simple averaging for gestures with equal weights, a weighted averaging otherwise). The acoustic and perceptual consequences of such blending are an empirical question to be addressed experimentally.

In addition to the coupling relations between vowels /e/ and /a/, three additional parameters, assumed to play a role in modeling stress (and hence the stress-conditioned alternation /'ea/-/e/), were manipulated. One manipulation was changing vowel activation offset for some items so that offset of /e/ occurred earlier than offset of /a/; thus, for some stimuli, vowel de-activation occurred at 340 degrees on the cycle of either /e/ or /a/, while other stimuli were created with earlier de-activation of /e/, at 300 or 270 degrees, resulting in a shorter activation interval. De-activation for /a/ was kept constant at 340 degrees. A second manipulation was relative blending weight of the two vowels’ targets: for some stimuli both vowels /e/ and /a/ had the same blending weight (i.e. a blending weight ratio BWR of 1), while for the other stimuli /a/, as the vowel more affected by stress, had twice the blending weight of /e/ (resulting in a BWR of 2). A third manipulation was presence or absence of a prosodic gesture on a stimulus. When present, the π-gesture was active for the vowels’ entire activation duration, and its strength was flat throughout (when the two vowels had different activation durations, the π-gesture’s activation coincided with that of the longer vowel). Tables 1 and 2 provide a full description of the modeled stimuli. Acoustic outputs with an 11025 Hz sampling frequency were generated on the basis of these articulatory configurations, and they were classified on the basis of auditory perception by 12 listeners (five male). The experiment was carried out in a quiet room and the participants were fitted with headphones. DMDX software (K. Forster and J.
Forster 2003) was used for stimulus presentation and response recording. The stimuli included the bilabial closures flanking the vowel interval of interest. A forced-choice identification design was used, in which the listeners had to decide, by pressing an appropriately labeled computer key, whether the item heard was a) part of two syllables (BE AB), or contained b) diphthong /ea/ (BEAB), c) vowel /e/ (BEB), or d) vowel /a/ (BAB). None of the choices were real words in Romanian. In the written instructions, the participants were presented with real word examples of the categories and were told that they would hear fragments of computer synthesized words containing those categories in the context /b_b/. The program advanced to the next stimulus as soon as a response key was pressed or after 6.1s. Ten repetitions of each stimulus were included in the experiment, presented in random order.
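The set of in-phase stimuli (cf. Table 2) is a full crossing of the three manipulations described above: three de-activation phases for /e/, two blending weight ratios, and presence or absence of a π-gesture. A minimal sketch of this design follows. The field names are hypothetical, and the relative duration of /e/ is computed under the assumption, made here for illustration only, that both vowel clocks have the same period, so that activation duration is proportional to de-activation phase.

```python
from itertools import product

# Name fragments follow the stimulus labels used in Table 2.
OFFSETS_E = {340: "", 300: "30", 270: "27"}   # de-activation phase of /e/ (degrees)
BWRS = {1: "", 2: "_W2"}                      # blending weight ratio [a]/[e]
PI_GESTURE = {False: "", True: "_π"}          # prosodic gesture absent / present
OFFSET_A = 340                                # de-activation of /a/, held constant


def in_phase_stimuli():
    """Enumerate the 3 x 2 x 2 design of in-phase stimuli."""
    rows = []
    for (pi, pi_tag), (bwr, w_tag), (off, off_tag) in product(
        PI_GESTURE.items(), BWRS.items(), OFFSETS_E.items()
    ):
        rows.append({
            "name": f"ea{off_tag}{w_tag}{pi_tag}",
            "offset_e_deg": off,
            "bwr": bwr,
            "pi_gesture": pi,
            # Assumption: equal clock periods, both vowels activated at phase 0.
            "rel_duration_e": round(off / OFFSET_A, 3),
        })
    return rows


for row in in_phase_stimuli():
    print(row)
```

With these assumptions, de-activating /e/ at 300 or 270 degrees shortens its activation interval to roughly 88% and 79% of that of /a/, respectively.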
Table 1. Description of stimuli with single vowel gestures, and with two vowel gestures coupled anti-phase or 360-degree used in Experiment 1.

Stimulus       Vowel to vowel coupling   BWR [a]/[e]   π-gesture applied
a              n/a                       n/a           no
e              n/a                       n/a           no
ea180          180-degree                1             no
ea180_W2       180-degree                2             no
ea180_π        180-degree                1             yes
ea180_W2_π     180-degree                2             yes
ea360          360-degree                1             no
ea360_W2       360-degree                2             no
ea360_π        360-degree                1             yes
ea360_W2_π     360-degree                2             yes
Table 2. Description of stimuli modeled with two vowel gestures coupled in-phase used in Experiment 1.

Stimulus       Activation offset for /e/   BWR [a]/[e]   π-gesture applied
ea             340-degree                  1             no
ea30           300-degree                  1             no
ea27           270-degree                  1             no
ea_W2          340-degree                  2             no
ea30_W2        300-degree                  2             no
ea27_W2        270-degree                  2             no
ea_π           340-degree                  1             yes
ea30_π         300-degree                  1             yes
ea27_π         270-degree                  1             yes
ea_W2_π        340-degree                  2             yes
ea30_W2_π      300-degree                  2             yes
ea27_W2_π      270-degree                  2             yes
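The BWR column in Tables 1 and 2 can be read as the weighting in a weighted average of the two vowels' targets, as stated in the description of the blending mechanism. The sketch below illustrates only this arithmetic. The targets are normalized to 0 (for /e/) and 1 (for /a/), which is an assumption made for illustration; the actual TADA target values and implementation details are not reproduced here.

```python
def blend(target_e: float, target_a: float, bwr: float) -> float:
    """Weighted average of two targets; bwr is the weight of /a/ relative to /e/."""
    w_e, w_a = 1.0, float(bwr)
    return (w_e * target_e + w_a * target_a) / (w_e + w_a)


# Hypothetical normalized targets: 0.0 stands for the /e/ target, 1.0 for the /a/ target.
for bwr in (1, 2, 4, 6):
    position = blend(0.0, 1.0, bwr)
    print(f"BWR {bwr}: blended target lies {position:.3f} of the way from /e/ to /a/")
```

On this reading, equal weights place the blended target midway between the two vowels, and increasing the ratio moves it toward /a/ with diminishing steps (bwr / (1 + bwr)).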
3.3. Results

The perceptual results averaged across listeners showed that single vowel stimuli were perceived as either vowel /e/ or vowel /a/ over 90% of the time (Figure 1a). Stimuli with vowels timed non-synchronously were perceived as hiatus stimuli more than 85% of the time, with individual listeners ranging between 70% and 100% hiatus responses to ea180 stimuli and between 80% and 100% hiatus responses to ea360 stimuli. For the stimuli with vowels coupled in-phase (Figure 1b), the identification patterns showed that neither different activation intervals of the two vowels nor presence of a π-gesture alone (nor a combination of the two) made a difference in how they were perceived. Stimuli with these manipulations alone were overall identified as vowel /e/ 90% of the time, similar to the identification pattern of the stimulus with no manipulation (stimulus ea). As to the blending weight parameter, there was a trend towards increasingly identifying as diphthongs those stimuli for which /a/ had greater blending weight. Thus, W2-stimuli were identified as a diphthong on average 35–40% of the time, with the additional presence of a π-gesture slightly enhancing this effect. Individual participant patterns, shown in Table 3, indicated that differential blending weight, independent of the other manipulations, triggered a diphthong response at a 50% or greater level for about half of the participants, while it did not trigger a diphthong response for the other half of the participants. The perceptual pattern indicated therefore that greater blending weight for vowel /a/ was the manipulation most influencing a diphthong percept (albeit not for all listeners), independent of vowel activation duration or π-gesture.

To quantify these observations, we carried out a generalized linear mixed model analysis with the individual (non-averaged) classification responses as the dependent variable (two levels: diphthong vs.
any other response), stimulus as a fixed factor, and participant as a random factor. This analysis showed that stimuli with a blending weight ratio of 2 were classified as a diphthong significantly more than the base ea stimulus (cf. Table 4), confirming the trend in diphthong response observed on the averaged data. The shift from a vowel to a diphthong percept for those listeners exhibiting it was not due to stimulus duration. Stimuli with a π-gesture were the longest, but this duration difference alone did not trigger a predominant diphthong response (cf. the stimuli represented by circles in Figure 2). While the stimuli with a combined BWR of 2 and presence of a π-gesture were indeed longer, and more consistently perceived as diphthongs (the triangle-stimuli in Figure 2), so were some considerably shorter stimuli where only blending weight had been manipulated (the diamond-stimuli in Figure 2).
Figure 1. Perceptual classification results for Experiment 1, averaged across 12 listeners. Error bars represent one standard deviation from the mean. (a) Responses to single vowel stimuli and to stimuli with two vowels asynchronously timed; (b) Responses to stimuli with two vowels coupled in-phase.
Table 3. Individual diphthong responses (%) for Experiment 1 for the stimuli with vowels coupled in-phase (0-degree). Diphthong responses at or over 50% are marked with an asterisk (*).

Diphthong Response (%)
Stimulus     F4    F5    F6    F7    F8    F9    M1    M4    M5    M6    M7    M8
ea           0     0     0     0     0     0     20    0     0     0     0     10
ea30         0     0     0     0     0     0     20    10    0     0     0     20
ea27         0     0     0     0     0     0     20    10    0     10    0     33
ea_W2        50*   50*   50*   10    0     0     50*   60*   50*   0     0     90*
ea30_W2      40    40    40    0     0     10    50*   90*   40    10    10    90*
ea27_W2      40    40    40    25    0     0     40    70*   40    10    0     80*
ea_W2_π      60*   50*   60*   20    20    20    60*   50*   60*   13    10    100*
ea30_W2_π    50*   50*   50*   10    10    10    60*   30    50*   10    30    100*
ea27_W2_π    40    40    40    30    10    0     67*   30    50*   10    0     100*
ea_π         0     0     0     10    0     0     20    10    10    0     0     10
ea30_π       0     0     0     0     0     0     40    20    0     0     0     10
ea27_π       0     0     0     0     0     0     10    40    0     0     0     30
3.4. Discussion

The results of the classification showed that stimuli with vowels /e/ and /a/ coupled 180-degree or 360-degree were perceived as a hiatus, while stimuli with vowels /e/ and /a/ coupled in-phase were perceived as either vowel /e/ or diphthong /ea/, depending on further manipulations. The parameter separating a diphthong percept from a vowel percept, at least for some of the listeners, was the blending weight ratio between the two in-phase vowel gestures. When vowel /a/ received extra blending weight the resulting percept was, for about half of the listeners, predominantly a diphthong, while equal blending weight resulted in a single vowel percept. However, none of the stimuli created were classified consistently by all listeners as a diphthong, possibly because the blending weight ratio between /e/ and /a/ was not large enough. We hypothesized that an even larger blending weight ratio would result in a more robust diphthong percept. We investigated this possibility in Experiment 2.
Table 4. Statistical results (Generalized Linear Mixed Model) for the diphthong response comparison across stimuli with vowels coupled in-phase (Experiment 1). Positive Z-values indicate that there were more diphthong responses for the given stimulus than for the base stimulus (stimulus ea).

Stimulus          Z         p-value
Intercept (ea)    –6.119    0.000
ea30              0.733     0.464
ea27              1.369     0.171
ea_W2             5.525     0.000
ea30_W2           5.693     0.000
ea27_W2           5.206     0.000
ea_W2_π           6.383     0.000
ea30_W2_π         5.915     0.000
ea27_W2_π         5.651     0.000
ea_π              0.733     0.464
ea30_π            1.303     0.193
ea27_π            1.548     0.122
Figure 2. Relationship between diphthong responses averaged across listeners (%) and duration of vowel interval of stimuli (ms). Each /ea/ category is represented by three values, corresponding to the activation interval manipulation.

4. Experiment 2

4.1. Method

4.1.1. Participants

Sixteen native Romanians, naïve to the purpose of the experiment, and with no reported speech, hearing or language deficits, participated in this auditory perception task. Eleven of the listeners (M4–M8, F4–F9) also participated in Experiment 1.

4.1.2. Materials and procedure

This experiment manipulated blending weight systematically for synchronously coordinated vowels /e/ and /a/: while blending weight for /e/ was kept constant, blending weight for /a/ was increased in 10 steps (0.5 increments), from a base stimulus where both vowels had the same weight (i.e. a BWR of
1), to a stimulus with blending weight for /a/ six times greater than that of /e/ (i.e. a BWR of 6). The other specifications of the two vowels were otherwise kept constant. The vowels of interest were synthesized in the context /b_b/. The stimuli thus modeled were classified on the basis of auditory perception by 16 listeners. The same overall procedure as in Experiment 1 was used. This time the participants had to decide whether the item heard contained a) diphthong /ea/ (BEAB), b) vowel /e/ (BEB), or c) vowel /a/ (BAB). The hiatus option was excluded as a choice on the basis of the experimenter’s auditory evaluation of the stimuli. Eleven of the participants (M4–M8, F4–F9) also completed Experiment 1 in the same session. The stimuli were presented ten times in random order.

Figure 3. Perceptual classification results for Experiment 2, averaged across 16 listeners. Error bars represent one standard deviation from the mean.

4.2. Results

On average, listeners perceived stimuli with (near) equal weight as vowel /e/ over 90% of the time, stimuli with a BWR of 5 to 6 as vowel /a/ over 90% of the time, and stimuli with a BWR between 3 and 4 as diphthong /ea/ at least
50% of the time (Figure 3). Listeners varied slightly with respect to where in the continuum their perception switched to diphthong /ea/ (cf. the individual diphthong responses in Table 5); however 15 of the participants heard the item with a BWR of 4 as a diphthong at least 80% of the time. One participant (F3) showed a different pattern: for this listener, a BWR of 2 was enough to trigger a diphthong percept. The participants in both perceptual experiments showed a consistent response pattern to the common stimuli (ea and the stimulus with a BWR of 1, and ea_W2 and the stimulus with a BWR of 2 respectively). A generalized linear mixed model analysis with the individual classification responses as the dependent variable (two levels: diphthong vs. any other response), stimulus as a fixed factor, and participant as a random factor statistically corroborated our general findings. The stimuli with a BWR between 2 and 4.5 were classified significantly more as a diphthong than the stimulus with a BWR of 1 (Z > 5.29, p < 0.001, cf. Table 6). Additionally, stimuli with a BWR between 3 and 4 were more often classified as a diphthong, compared to the stimulus with a BWR of 2 (Z > 3.72, p < 0.001, Table 6). Finally, there were more diphthong responses to the stimulus with a BWR of 4 than to the stimulus with a BWR of 3 (Z = 6.03, p < 0.001), 3.5 (Z = 5.02, p < 0.001) or 4.5 (Z = 6.85, p < 0.001). These tests, which factored in the listener-specific differences, showed that the diphthong responses significantly increased starting from the stimulus with a BWR of 2, reached a maximum at
the stimulus with a BWR of 4, and then decreased again.² A one-sample t-test carried out on the percentage responses of each participant showed that the diphthong responses to the stimulus with a BWR of 4 were significantly higher than 50% (t(15) = 9.73, p < 0.001), indicating that for this stimulus the diphthong response consistently outnumbered either of the other two possible responses (vowel /e/ or vowel /a/).

2. Given the robust statistical results (p-values for most comparisons either under 0.001 or over 0.1), the alpha levels were not corrected for multiple testing. However, even by using the conservative Bonferroni correction, 50 tests would have to be carried out before an observed p-value of 0.001 would result in a familywise error rate of 0.05. Our main patterns of (non-)significance would remain the same even after using such a correction.

Table 5. Individual diphthong responses (%) for Experiment 2. Diphthong responses at or over 80% are marked with an asterisk (*).

Diphthong Response (%)
BWR [a]/[e]  F1    F2    F3    F4    F5    F6    F7    F8    F9    M2    M3    M4    M5    M6    M7    M8
1            0     0     20    0     30    0     0     0     10    0     0     0     0     0     0     20
1.5          0     0     50    0     0     0     0     0     0     0     0     0     0     0     0     10
2            60    20    90*   40    30    70    0     0     0     0     60    50    20    0     10    50
2.5          50    60    70    50    50    30    11    0     40    10    50    20    70    0     30    80*
3            80*   80*   60    60    40    30    80*   10    70    30    80*   20    50    20    40    80*
3.5          80*   80*   40    70    30    70    70    0     70    70    80*   20    90*   40    40    80*
4            90*   100*  40    80*   80*   80*   80*   90*   80*   90*   80*   90*   100*  90*   80*   100*
4.5          50    90*   10    30    50    20    33    100*  20    30    30    50    40    40    80*   50
5            20    50    0     0     0     0     0     50    0     50    0     20    0     0     20    0
5.5          20    10    10    0     0     0     0     10    0     20    0     0     0     0     0     0
6            0     40    0     0     0     0     0     0     0     10    0     0     0     0     0     0

4.3. Discussion

Experiment 2 showed that manipulating relative blending weight of two synchronously timed vowels triggered a perceptual switch from a monophthong to a diphthong. Equal blending weight for vowels /e/ and /a/ coupled in-phase resulted in an /e/ percept, while a blending weight ratio greater than 5 resulted in the percept of vowel /a/; finally, a blending weight ratio around 4 resulted
in the percept of diphthong /ea/. The stress-conditioned alternation between diphthong /ea/ (stressed) and vowel /e/ (unstressed) in Romanian was achieved by manipulating relative blending weight of the two vowels. Greater blending weight of one of the vowels resulted in a blended target closer to that of the stressed vowel and further from the unstressed one. For the specific case examined here (two synchronous vowels), this is equivalent to the effect of a spatial modulation (μ-) gesture affecting one of the gestures. Because the modeled alternation is conditioned by the presence or absence of stress, its successful modeling may implicitly provide a way of modeling stress in gestural terms. The results reported here suggest the possibility that stress could be modeled by means of a μ-gesture causing the spatial strengthening of the affected gesture. While gesture lengthening (i.e. the effect of a π-gesture, modeled in Experiment 1) did not seem to play a major role in modeling the specific stress effect examined here (i.e. the diphthong/monophthong alternation), it remains to be established by further explicit investigations of stress whether and to what extent spatial and temporal modulation gestures (cf. Saltzman et al. 2008) may play a role in modeling word stress production.

Table 6. Statistical results (Generalized Linear Mixed Model) for the diphthong response comparison across the stimuli tested in Experiment 2. Positive Z-values indicate that there were more diphthong responses for the given stimulus than for the base stimulus (stimuli with BWR of 1 and 2 respectively).

Base stimulus: BWR 1                        Base stimulus: BWR 2
Stimulus            Z         p-value       Stimulus            Z         p-value
Intercept (BWR1)    –7.937    0.000         Intercept (BWR2)    –3.727    0.000
BWR1.5              –1.228    0.219         BWR1                –5.292    0.000
BWR2                5.292     0.000         BWR1.5              –5.617    0.000
BWR2.5              6.172     0.000         BWR2.5              1.245     0.213
BWR3                7.843     0.000         BWR3                3.723     0.000
BWR3.5              8.567     0.000         BWR3.5              4.821     0.000
BWR4                11.474    0.000         BWR4                9.018     0.000
BWR4.5              7.198     0.000         BWR4.5              2.755     0.006
BWR5                2.153     0.031         BWR5                –3.723    0.000
BWR5.5              –0.462    0.644         BWR5.5              –5.479    0.000
BWR6                –1.501    0.133         BWR6                –5.595    0.000
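The one-sample t-test reported in Section 4.2 can be checked against the individual responses to the BWR 4 stimulus listed in Table 5. The following is a minimal sketch using only the standard library; the function name is hypothetical, and the computation is the textbook one-sample t statistic rather than a reproduction of the authors' analysis script.

```python
import math
from statistics import mean, stdev

# Diphthong responses (%) to the stimulus with a BWR of 4, one value per
# listener (F1-F9, M2-M8), as listed in Table 5.
BWR4_RESPONSES = [90, 100, 40, 80, 80, 80, 80, 90, 80, 90, 80, 90, 100, 90, 80, 100]


def one_sample_t(sample, mu0):
    """Return the one-sample t statistic against mu0 and its degrees of freedom."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(sample) - mu0) / standard_error, n - 1


t_value, df = one_sample_t(BWR4_RESPONSES, 50.0)
print(f"mean = {mean(BWR4_RESPONSES):.1f}%, t({df}) = {t_value:.2f}")
```

With the sixteen values above, the mean diphthong response is about 84% and the statistic agrees with the reported t(15) = 9.73.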
5. Experiment 3

Experiment 2 showed that modeling stress as a variation of blending weight (as an implementation of a μ-gesture) resulted in the expected perceptual pattern. If the model used in the perceptual experiment correctly captures key aspects of the production of diphthongs (and monophthongs), then the acoustics of natural and modeled stimuli should match well, and the model of diphthongs as in-phase coupled vowel gestures would gain further plausibility. Experiment 3 tests this hypothesis.

5.1. Method

For the comparison of the acoustic characteristics of natural and modeled stimuli, we used the word series in (4). The natural data were produced by 12 native speakers of Romanian (five male), who read the stimuli, embedded in a constant carrier phrase, ten times in random order, and at a self-selected casual speaking rate. The target words were separated by unrelated filler words, embedded in the same carrier phrase. All the recordings were sampled at 22.05 kHz. The same stimuli were modeled using TADA: Both diphthong and alternating /e/ words were modeled with the gestures for vowels [ε] and [a] coupled in-phase, either with equal blending weight for alternating /e/ (henceforth “blended” /e/) (similar to the stimulus with a BWR of 1 in Experiment 2), or with a BWR of 4 for the diphthong (similar to the stimulus with a BWR of 4 in Experiment 2). Non-alternating /e/ was modeled with a single gesture for vowel [ε] (similar to stimulus e in Experiment 1). Acoustic outputs were generated on the basis of these articulatory configurations.

(4) Diphthong:             ['sea.ra]      ‘the evening’
    Alternating /e/:       [se.'ra.ta]    ‘the evening party’
    Non-alternating /e/:   ['se.ra]       ‘the greenhouse’
The acoustic outputs of both natural productions and modeled stimuli were analyzed using Praat speech analysis software (Boersma and Weenink 2009). The vocalic interval was manually labeled from the onset to the offset of the vowel-specific formant contours, and formant frequencies for five formants were automatically calculated using Praat’s short-term spectral analysis function. The frequency values, in Hertz, for the first two formants at the onset of the measured interval, at its offset, and every 10% into the interval, totaling eleven measuring points, were used in the analysis. Onset and offset points were manually determined, while the other points were determined automatically on the basis of onset and offset landmarks. Following the methodology
described in Harrington et al. (2008), we reduced these time-varying formant trajectories to the first three coefficients of a discrete cosine transformation, shown to be proportional to a trajectory’s mean, slope and curvature. We thus compressed the formant information of each vowel/diphthong to a single point in a six-dimensional space (three coefficients for each vowel formant). To assess the degree of acoustic similarity between natural tokens and modeled stimuli, the Euclidean distance (E) from each natural token to every modeled stimulus was calculated on the basis of the six coefficients. Thus, three Euclidean distances – one to modeled ['sea.ra] (E['sea.ra]), one to modeled [se.'ra.ta] (E[se.'ra.ta]), and one to modeled ['se.ra] (E['se.ra]) – were calculated for every natural token. The prediction was that if natural and modeled tokens were acoustically similar, then naturally produced diphthongs should be significantly closer acoustically (i.e. they should have smaller Euclidean distances) to the modeled diphthong than to either blended or mono-gestural /e/, that natural alternating /e/ should be closest to modeled blended /e/, and that natural non-alternating /e/ should be closest to modeled non-alternating mono-gestural /e/. If however, natural and modeled stimuli were not acoustically similar, the proximity patterns were expected to be random, with no matching between natural and modeled categories. For the analysis, Euclidean distance values were averaged across repetitions, so that each speaker contributed one value per word.

5.2. Results

A comparison of the model stimulus formant trajectories with the trajectories averaged across male and female speakers’ productions showed comparable acoustic patterns for naturally produced tokens and model stimuli (Figure 4).
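The trajectory compression and distance measure described in Section 5.1 can be sketched as follows. This is an illustration under stated assumptions rather than the analysis actually run: it uses a DCT-II with 1/N scaling (the scaling convention in Harrington et al. 2008 may differ), all formant values are hypothetical linear tracks, and the function and variable names are invented for the example.

```python
import math


def dct_coeffs(traj, n_coeffs=3):
    """First n_coeffs DCT-II coefficients of a trajectory, scaled by 1/N.

    Coefficient 0 equals the mean; coefficients 1 and 2 vary with the
    trajectory's slope and curvature respectively.
    """
    n = len(traj)
    return [
        sum(x * math.cos(math.pi * k * (i + 0.5) / n) for i, x in enumerate(traj)) / n
        for k in range(n_coeffs)
    ]


def token_point(f1, f2):
    """Six-dimensional point: three coefficients each for F1 and F2."""
    return dct_coeffs(f1) + dct_coeffs(f2)


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def track(start_hz, end_hz, n=11):
    """Hypothetical linear formant track sampled at n equidistant points."""
    return [start_hz + (end_hz - start_hz) * i / (n - 1) for i in range(n)]


# Hypothetical modeled stimuli and one hypothetical natural token (values in Hz).
modeled = {
    "diphthong": token_point(track(450, 700), track(1900, 1400)),
    "blended_e": token_point(track(480, 520), track(1800, 1750)),
    "mono_e": token_point(track(450, 460), track(1950, 1940)),
}
natural = token_point(track(470, 680), track(1850, 1450))

distances = {name: euclidean(natural, point) for name, point in modeled.items()}
nearest = min(distances, key=distances.get)
print(distances, nearest)
```

In the study, one such distance per modeled stimulus was computed for every natural token and then averaged across repetitions for each speaker; the sketch shows the computation for a single token only.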
While precise values for F1 and F2 differed to some extent between productions by male speakers, by female speakers and by the model, the general patterns for the stimulus types were similar in that both F1 and F2 trajectories for alternating /e/ were (slightly) more extremely front (higher F2, lower F1) than those for diphthong /ea/, and less extreme than those for non-alternating /e/.

Figure 4. Vowel F1 and F2 trajectories of the words ['sea.ra], [se.'ra.ta], and ['se.ra], plotted on the basis of values measured at onset (0%), offset (100%), and every 10% into the vowel interval, as produced by the model, by male speakers and by female speakers.

The Euclidean distance analysis confirmed the acoustic similarity between natural and modeled stimuli. Naturally produced diphthong words were closest acoustically to the modeled diphthong: for the word ['sea.ra], the distance E['sea.ra] had smaller values (Median = 270) than either E['se.ra] (Median = 379) or E[se.'ra.ta] (Median = 294). Likewise, naturally produced alternating /e/ stimuli were closer to modeled blended /e/ (Median = 241) than to either the diphthong (Median = 277) or mono-gestural /e/ (Median = 338), and natural non-alternating /e/ stimuli were closer to modeled mono-gestural /e/ (Median = 246) than to either the diphthong (Median = 378) or blended /e/ (Median = 325). Paired-samples Wilcoxon Signed Ranks tests, summarized in Table 7, confirmed that for each word the smallest Euclidean distance (namely, the one matching in category) was significantly smaller than the distance next up in value, validating the consistency of the pattern across speakers.

Table 7. Statistical results (Wilcoxon Signed Ranks test) for the comparisons between Euclidean distances. Two-tailed significance is reported. Effect size (r) was calculated on the basis of Z-scores.

Word          Comparison                    Rank   N    Test statistics
['sea.ra]     E['sea.ra] – E[se.'ra.ta]     (+)    1    Z = 2.981, p = 0.001, r = 0.610
                                            (–)    11
['se.ra]      E['se.ra] – E[se.'ra.ta]      (+)    2    Z = 2.824, p = 0.002, r = 0.578
                                            (–)    10
[se.'ra.ta]   E[se.'ra.ta] – E['sea.ra]     (+)    1    Z = 2.824, p = 0.002, r = 0.578
                                            (–)    11

5.3. Discussion

The observed acoustic similarity between model stimuli and natural tokens could be taken as an indication of a comparable similarity at the production level, and thus the gestural configuration probably employed in natural production could be inferred from the known gestural configuration employed in the model. It is then plausible that natural tokens were produced similarly to the modeled ones, with the gestures for vowels /e/ and /a/ coupled in-phase both for alternating /e/ and diphthong /ea/, but with equal or different blending weights, as a function of absence or presence of stress. Additionally, the fact that natural alternating /e/ was acoustically more similar to the bi-gestural /e/ in modeled [se.'ra.ta] than to the mono-gestural /e/ in ['se.ra] suggests that production of alternating /e/ may indeed involve two gestures, rather than just one. Alternatively, the difference between [se.'ra.ta] and ['se.ra] could have been modeled as a difference in target specifications (specifically, the target for [se.'ra.ta] could be set to the post-blending targets of the BW1 model stimulus), rather than as a difference in gestural composition. The present model, while fitting the natural data reasonably well, nevertheless has the additional advantage of capturing the lexical relationship between [se.'ra.ta] and ['sea.ra] as a possible source for the difference between [se.'ra.ta] and ['se.ra].
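The role of blending weights mentioned above can be made concrete with a small sketch. It assumes that two simultaneously active gestures blend their targets by weighted averaging; this rule, the single abstract target dimension, and all numbers are illustrative assumptions rather than the parameter values of the model stimuli.

```python
def blend_targets(targets, weights):
    """Weighted average of the targets of simultaneously active gestures.

    Assumption for illustration: blending is a weighted mean, so a gesture
    with a larger weight pulls the blended target closer to its own target.
    """
    if len(targets) != len(weights):
        raise ValueError("one weight per target is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(t * w for t, w in zip(targets, weights)) / total

# Hypothetical targets for the /e/ and /a/ gestures on one abstract dimension.
E_TARGET, A_TARGET = 6.0, 12.0

equal = blend_targets([E_TARGET, A_TARGET], [1, 1])    # midway between the targets
unequal = blend_targets([E_TARGET, A_TARGET], [3, 1])  # pulled toward the /e/ target
print(equal, unequal)
```

Under this assumption, changing only the weights moves the realized target between the two specified targets while the gestural composition and the in-phase coupling stay the same.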
As to the diphthong production model, ours is the first to attempt to relate the diphthongs’ gestural, acoustic, and perceptual properties in a principled way to their structural properties. Similar nuclear diphthongs have been reported for Dutch (Collier et al. 1982), English (Davis and Hammond 1995), French (Kaye and Lowenstamm 1984) and Italian (Marotta 1988); however, the analyses proposed are almost exclusively phonological/structural, with no principled connection to their articulation or acoustics. One exception is Collier et al.’s (1982) acoustic and physiological study on Dutch diphthongs, where they describe nuclear diphthongs as being produced with synchronous activations of the muscles involved, with onsets and offsets differing acoustically from corresponding single vowels, and with minimal second formant frequency change. On the basis of this evidence, the authors suggest that nuclear diphthongs in Dutch are mono-gestural, and as such different compositionally from non-nuclear diphthongs (and by extension from hiatus sequences). No formal model is proposed, however, as to how they should be distinguished in production from other mono-gestural structures (vowels). Prima facie, the empirical evidence in that study is not incompatible with the model we proposed for nuclear diphthongs, specifically a bi-gestural synchronous structure with unequal blending rates. Whether and how exactly our model could be extended to the Dutch diphthongs, and potentially to nuclear diphthongs in other languages, is a question for future research.

Finally, speakers may produce the three categories guided by stored exemplars of the respective categories (cf. Exemplar Theory, Pierrehumbert 2002). If so, our proposal provides a formal model of the specific ingredients that may be involved in (re)producing such exemplars.
6. Conclusion

We argued that the temporal organization of hetero- and tauto-syllabic vowel clusters could be modeled in terms of particular coupling relations. The results in Experiment 1 showed that non-synchronous coordination (180 or 360 degrees) consistently resulted in the percept of hetero-syllabic clusters (hiatus /e.a/), while in-phase coordination resulted in the percept of tauto-syllabic vowel clusters (diphthong /ea/ or blended vowel /e/, depending on further manipulations). Our proposed model can be understood within the general tenets of Articulatory Phonology (Browman and Goldstein 1990, 1992; Goldstein et al. 2006), with the advantage that the hypothesized coupling relations between gestures would serve not only as units of action (encapsulating constriction production) but also as units of information (encapsulating linguistic contrast). Nevertheless, the model argued for here represents a principled way of accounting for the timing dimension in the production of hetero- and tauto-syllabic vowel clusters independent of assumptions on linguistic representation.

While Experiment 1 addressed the more general issue of the distinction between hetero- and tauto-syllabic vowel clusters, Experiments 2 and 3 specifically modeled the stress-conditioned alternation in Romanian between vowel /e/ and diphthong /ea/. This alternation has been modeled by articulatorily strengthening (spatially modulating) the vowel associated with stress such that it was realized closer to its specified target, to the detriment of the other vowel constriction, which as a result was articulatorily undershot. The successful modeling of this particular stress-conditioned alternation implicitly suggests that this phenomenon, and possibly word stress in general, may be modeled in terms of a spatial modulation gesture (μ-gesture) being applied to the vowel affected by stress. Thus, the results presented here suggest a possible analysis of word stress in gestural terms. This parameter may also explain (some of) the duration difference between stressed and unstressed vowels, since magnitude of spatial excursion (i.e. extent of target overshoot/undershoot) may affect at least in part the temporal domain (cf. Beckman and Cohen 2000 for a review). While spatial strengthening (achieved via differential blending weight) turned out to be the most important parameter in modeling the stress-conditioned vowel-diphthong alternation in Romanian, it remains to be determined by future explicit investigations of word stress whether and to what extent this parameter can model word stress and stress effects crosslinguistically, or whether a combination with additional parameters (such as temporal warping) may provide better models for other stress-related phenomena.
Acknowledgements

Work supported by the Deutsche Forschungsgemeinschaft (PO 1269/1-1) and by fellowships from Yale University to the first author.
References

Beckman, Mary E. and K. Bretonnel Cohen 2000 Modeling the articulatory dynamics of two levels of stress contrasts. In: Gösta Bruce and Merle Horne (eds.), Prosody: Theory and Experiment – Studies Presented to Gösta Bruce, 169–200. Dordrecht: Kluwer Academic Publishers.
Beckman, Mary E. and Jan Edwards 1994 Articulatory evidence for differentiating stress categories. In: Patricia A. Keating (ed.), Phonological Structure and Phonetic Form: Papers in Laboratory Phonology III, 7–33. Cambridge, UK: Cambridge University Press.
Boersma, Paul and David Weenink 2009 Praat: doing phonetics by computer [Computer program]. Version 5.1.05. Retrieved June 1, 2009 from http://www.praat.org/.
Browman, Catherine P. and Louis Goldstein 2000 Competing constraints on intergestural coordination and self-organization of phonological structures. Bulletin de la Communication Parlée 5: 25–34.
Browman, Catherine P. and Louis Goldstein 1992 Articulatory phonology: an overview. Phonetica 49: 155–180.
Browman, Catherine P. and Louis Goldstein 1990 Gestural specification using dynamically-defined articulatory structures. Journal of Phonetics 18: 299–320.
Browman, Catherine P., Louis Goldstein, J. A. Scott Kelso, Philip Rubin and Elliot Saltzman 1984 Articulatory synthesis from underlying dynamics. Journal of the Acoustical Society of America 75: S22–S23.
Burileanu, Dragos 2002 Basic research and implementation decisions for a text-to-speech synthesis system in Romanian. International Journal of Speech Technology 5: 211–225.
Byrd, Dani 2000 Articulatory vowel lengthening and coordination at phrasal junctures. Phonetica 57: 3–16.
Byrd, Dani and Elliot Saltzman 2003 The elastic phrase: modeling the dynamics of boundary adjacent lengthening. Journal of Phonetics 31: 149–180.
Byrd, Dani, Stephen Tobin, Erik Bresch and Shrikanth Narayanan 2009 Timing effects of syllable structure and stress on nasals: a real-time MRI examination. Journal of Phonetics 37: 97–110.
Chitoran, Ioana 2001 The Phonology of Romanian: A Constraint-Based Approach. Berlin, New York: Mouton de Gruyter.
Chitoran, Ioana 2002 A perception-production study of Romanian diphthongs and glide-vowel sequences. Journal of the International Phonetic Association 32: 203–222.
Chitoran, Ioana and José Ignacio Hualde 2007 From hiatus to diphthong: the evolution of vowel sequences in Romance. Phonology 24: 37–75.
Cho, Taehong 2004 Prosodically conditioned strengthening and vowel-to-vowel coarticulation in English. Journal of Phonetics 32: 141–176.
Collier, René, Fredericka Bell-Berti and Lawrence J. Raphael 1982 Some acoustic and physiological observations on diphthongs. Language and Speech 25: 305–323.
Davis, Stuart and Michael Hammond 1995 On the status of on-glides in American English. Phonology 12: 159–182.
Forster, K.L. and J.C. Forster 2003 A Windows display program with millisecond accuracy. Behavior Research Methods, Instruments, & Computers 35: 116–124.
Fowler, Carol A. 1981 Production and perception of coarticulation among stressed and unstressed vowels. Journal of Speech and Hearing Research 46: 127–139.
Goldstein, Louis, Dani Byrd and Elliot Saltzman 2006 The role of vocal tract gestural action units in understanding the evolution of phonology. In: Michael A. Arbib (ed.), Action to Language via the Mirror Neuron System, 215–249. Cambridge: Cambridge University Press.
Haken, H., J.A.S. Kelso and H. Bunz 1985 A theoretical model of phase transitions in human hand movements. Biological Cybernetics 51: 347–356.
Hanson, Helen M. and Kenneth N. Stevens 2002 A quasi-articulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLSyn. Journal of the Acoustical Society of America 112: 1158–1182.
Harrington, Jonathan, Felicitas Kleber and Ulrich Reubold 2008 Compensation for coarticulation, /u/-fronting, and sound change in standard southern British: An acoustic and perceptual study. Journal of the Acoustical Society of America 123: 2825–2835.
Harrington, Jonathan, Janet Fletcher and Corinne Roberts 1995 An analysis of truncation and linear rescaling in the production of accented and unaccented vowels. Journal of Phonetics 23: 305–322.
de Jong, Kenneth J. 1995 The supraglottal articulation of prominence in English: Linguistic stress as localized hyperarticulation. Journal of the Acoustical Society of America 97: 491–504.
Kaye, Jonathan D. and Jean Lowenstamm 1984 De la syllabicité. In: François Dell, Daniel Hirst, Jean-Roger Vergnaud (eds.), La forme sonore du langage, 123–159. Paris: Hermann.
Krakow, Rena 1999 Physiological organization of syllables: a review. Journal of Phonetics 27: 23–54.
Lindblom, Björn 1963 On vowel reduction (Report No. 29). Stockholm, Sweden: The Royal Institute of Technology, Speech Transmission Laboratory.
Marin, Stefania accepted Romanian ‘blended’ vowels: A production model of incomplete neutralization. In: Selected papers of the PaPI 2009. Mouton de Gruyter.
Marin, Stefania 2007 An articulatory modeling of Romanian diphthong alternations. In: Jürgen Trouvain and William J. Barry (eds.), Proceedings of the XVIth International Congress of Phonetic Sciences, 453–456. Saarbrücken, Germany.
Marin, Stefania 2005 Complex Nuclei in Articulatory Phonology: The Case of Romanian Diphthongs. In: Randall Gess and Edward J. Rubin (eds.), Selected papers of the Linguistic Symposium in Romance Languages 34th, 161–177. Amsterdam, Philadelphia: John Benjamins.
Marin, Stefania and Marianne Pouplier 2010 Temporal organization of complex onsets and codas in American English: Testing the predictions of a gestural coupling model. Journal of Motor Control 14: 380–407.
Marotta, Giovanna 1988 The Italian diphthongs and the autosegmental framework. In: Pier Marco Bertinetto and Michele Loporcaro (eds.), Certamen Phonologicum, 389–420. Torino: Rosenberg & Sellier.
Mooshammer, Christine and Susanne Fuchs 2002 Stress distinction in German: Simulating kinematic parameters of tongue tip gestures. Journal of Phonetics 30: 337–355.
Nam, Hosung 2007 A Gestural Coupling Model of Syllable Structure. PhD Dissertation, Department of Linguistics, Yale University.
Nam, Hosung, Louis Goldstein and Michael Proctor n.d. TADA (TAsk Dynamics Application). Retrieved from http://www.haskins.yale.edu/tada_download/
Nam, Hosung, Louis Goldstein and Elliot Saltzman 2009 Self-Organization of syllable structure: A coupled oscillator model. In: François Pellegrino, Egidio Marsico, Ioana Chitoran and Cristophe Coupé (eds.), Approaches to phonological complexity, 299–328. Berlin/New York: Mouton de Gruyter.
Pierrehumbert, Janet B. 2002 Word-specific phonetics. In: Carlos Gussenhoven and Natasha Warner (eds.), Papers in Laboratory Phonology VII, 101–139. Berlin: Mouton De Gruyter.
Saltzman, Elliot and Kevin G. Munhall 1989 A dynamical approach to gestural patterning in speech production. Ecological Psychology 1: 333–382.
Saltzman, Elliot, Hosung Nam, Jelena Krivokapić and Louis Goldstein 2008 A task-dynamic toolkit for modeling the effects of prosodic structure on articulation. In: Proceedings of the 4th Conference on Speech Prosody, 175–184. Campinas, Brazil.
Selkirk, Elisabeth O. 1984 On the major class features and syllable theory. In: Mark Aronoff and Richard T. Oehrle (eds.), Language Sound Structures, 107–136. Cambridge, Mass.: MIT Press.
Sluijter, Agaath M.C. and Vincent J. van Heuven 1996 Spectral balance as an acoustic correlate of linguistic stress. Journal of the Acoustical Society of America 100: 2471–2485.
Turvey, Michael 1990 Coordination. American Psychologist 45: 938–953.
Coupling of tone and constriction gestures in pitch accents

Doris Mücke, Hosung Nam, Anne Hermes and Louis Goldstein¹

¹ The first two authors contributed equally to this study.

Abstract

This study investigates the temporal coordination of tones and constriction gestures in Catalan and Viennese German using electromagnetic articulography. It is observed that nuclear rises are later in German than in Catalan. We model the difference in tonal alignment patterns using a coupled oscillator model, proposing that it can emerge from differences in the coupling relations between tones and oral constriction gestures. In Catalan, the high tone gesture is coupled in-phase with the accented vowel. In German, a low tone and a high tone gesture compete with each other to be in-phase with the vowel, resulting in a rightward shift of the high tone gesture and therefore in a delayed rise on the acoustic surface. We conclude with a comparison of lexical and prosodic pitch accent tones and their interaction with the syllable-level coupling graph. In contrast to lexical tones, prosodic tones do not perturb the within-syllable relations of consonant and vowel timing.

1. Introduction

This study describes the temporal coordination pattern between tones and oral constriction gestures in Catalan and German and attempts to analyze the temporal pattern using a planning model of intergestural timing grounded in Articulatory Phonology. We will show that this coordination follows the basic principles applied to consonant clusters, which have been reported in numerous studies (Browman and Goldstein 1988, Honorof and Browman 1995, Byrd 1995, Bombien et al. 2010, Goldstein, Chitoran, and Selkirk 2007, Goldstein et al. 2009, Hermes et al. 2008, Marin and Pouplier 2010, Nam 2007, Nam, Goldstein, and Saltzman 2009, Shaw et al. 2009).

Within the framework of Articulatory Phonology, speech can be decomposed into invariant phonological units, articulatory gestures that are temporally coordinated with one another (Browman and Goldstein 1989). A coupled oscillator planning model of speech timing has been developed that provides a possible way of modelling the coordination of gestures in time (Browman and Goldstein 2000, Goldstein et al. 2009, Nam and Saltzman 2003, Nam, Goldstein, and Saltzman 2009). In the model, gestures are associated with nonlinear planning oscillators (or clocks) that are coupled with each other in a pattern specified by a coupling graph, assumed to be part of an utterance’s phonological representation.

In the present study, we model the control of pitch to achieve a target in F0 as a tonal gesture and investigate the temporal coordination of tonal gestures with oral constriction gestures in Catalan and Viennese German (also referred to as Standard Viennese Austrian) bitonal LH pitch accents. It has been reported elsewhere that Catalan and German are expected to show different alignment patterns for nuclear rises (Prieto et al. 2007b for Catalan, Mücke et al. 2009 for Viennese German). We aim to test whether those alignment differences can be seen as phonological in nature in the sense that they emerge from topological differences in phonological coupling graphs.

Our results show that in the acoustic analysis, the accentual rise starts later with respect to segmental landmarks in Viennese German compared to Catalan. In the articulatory analysis, we focus on the start of the F0 rise movement (the L valley) as the start of the H tone gesture. In Catalan, the start of the H tone gesture is tightly synchronised with the start of the vowel gesture, while in the German variety the H tone gesture starts considerably later. We hypothesize that the difference lies in the coupling relations between tones and vowel gestures. Therefore, we propose a non-competitive coupling structure type for Catalan, and a competitive structure (usually known from those in consonant clusters) for Viennese German. The competitive coupling structure leads to a rightward shift of the H tone gesture (and therefore to later F0 rises on the acoustic surface).
We conclude with a discussion of the difference between prosodic tones (pitch accent tones) and lexical tones and of how they are supposed to interact with the syllable-level coupling graphs for consonant and vowel coordination.

1.1. What is a tone gesture?

In the autosegmental-metrical approach to intonation, tones are treated as tonal point-events: target values of pitch (H or L) that occur at some instant in time. An LH rising pitch accent is composed of two tonal targets or two tonal events: a low valley (L), where the rise starts, and a high peak (H), where the rise ends. According to the segmental anchor hypothesis (within the autosegmental-metrical phonology framework), those tonal events are anchored to acoustically-defined events associated with consonants and vowels (Arvaniti, Ladd, and Mennen 1998 for Greek, Ladd et al. 1999 for English, Ladd, Mennen, and Schepman 2000 for Dutch, Prieto and Torreira 2007a for Spanish, D’Imperio, Petrone, and Nguyen 2007 for Italian, Atterer and Ladd 2004 and Mücke et al. 2008b for different German varieties, Ladd 2008 for a general overview). Usually, tones occur in the vicinity of the lexically stressed syllable carrying the tone. Therefore, tones are hypothesized to be aligned with segments corresponding to the lexically stressed syllable.

Figure 1 shows the alignment properties of prenuclear rising pitch accents in different languages. The start of the rise (the L event) in English and Greek is consistently aligned with the left periphery of the accented syllable, at the beginning of the acoustic segment associated with the syllable-onset consonant. In fact, these are not the only two languages reported in the literature with this pattern for L (Ladd, Mennen, and Schepman 2000 for Dutch, D’Imperio 2002 for Italian, Prieto and Torreira 2007a for Spanish, Prieto et al. 2007b for Catalan). However, German has been shown to have later rises in prenuclear accents. In Standard Northern German (low Franconian speech area near Düsseldorf), the L occurs around the middle of the C1 segment, while in Southern German (Viennese German) L occurs even later, during V1.

Figure 1. Summary of alignment properties of prenuclear LH accents for Greek, English and German varieties adopted from Atterer and Ladd (2004); Düsseldorf (Standard Northern German) and Vienna (Southern German) added by Mücke et al. (2008b). C and V are stylized segments in the acoustic domain.

In Articulatory Phonology, speech gestures are modelled as invariant functional units of vocal tract constricting action, and speech can be decomposed into a constellation of gestures: articulatory events with extent in time that can temporally overlap with one another. The regularity and variability in intergestural timing have been described by many studies (Byrd 1994, 1996a,b; Cho 2001, Bombien et al. 2010). Such temporal patterns have been modelled using an intergestural timing model, where the intergestural temporal relationship (e.g. timing and connectivity) is specified in an inter-oscillator coupling network, or coupling graph (Browman and Goldstein 2000, Saltzman and Byrd 2000, Nam and Saltzman 2003, Nam 2007, Goldstein, Chitoran, and Selkirk 2007, Goldstein et al. 2009). In the model each gesture is associated with an oscillator (or clock) and the oscillators are coupled to one another in a pairwise, potentially competing fashion, rather than being serially arranged. The model incorporates the two stable modes of coupling that have been shown to be spontaneously accessible (without learning) when performers are asked to oscillate multiple limbs: synchronous or in-phase and sequential or anti-phase modes (Turvey 1990).

Within the coupling hypothesis of syllable structure proposed by Goldstein et al. (2000) and Nam, Goldstein, and Saltzman (2009), the two intrinsic modes are used to model the temporal relations within a syllable. The in-phase mode, which is stronger and more stable, is used to model the onset-nucleus relation. Figure 2a provides a coupling graph for ‘pa’, where the onset consonant is coupled in-phase to the vowel. Since gestures are triggered at phase 0° of their associated clocks, and since C and V clocks are in-phase, the consonant and vowel gestures will be initiated synchronously. In contrast, the nucleus-coda relation is defined by the anti-phase mode (sequential coupling). Figure 2b displays the coupling graph for ‘up’. The vowel and the consonant gestures are initiated sequentially because each is triggered at phase 0° of its associated clock, but those clocks are 180° out of phase in this case.

Figure 2. Coupling graphs for (2a) ‘pa’ (simple syllable onset, CV) and (2b) ‘up’ (simple syllable coda, VC) with in-phase (solid lines) and anti-phase (dotted lines) target specifications.

Because of the intrinsic strength of in-phase coupling (Nam et al. 2009), the intrinsically sequential consonants of a complex onset are all pulled into an in-phase coupling with respect to the vowel: the onset gestures are all coupled to the vowel gesture in a synchronous mode (in-phase), but this coupling competes with the sequential coupling between the two consonants (anti-phase).
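The coupling modes described above can be explored with a toy phase model. The sketch below is not the task-dynamic planning model cited in the text: it assumes each coupled pair contributes a cosine potential around its target relative phase, relaxes the phases by simple gradient steps, and uses equal coupling strengths. Function names, step size and starting phases are all invented for illustration.

```python
import math

IN_PHASE = 0.0
ANTI_PHASE = math.pi

def settle(n_osc, couplings, start, dt=0.01, steps=5000):
    """Relax oscillator phases under pairwise coupling (toy model).

    couplings: list of (i, j, target, strength); each pair contributes the
    potential -strength * cos(phase[i] - phase[j] - target). A shared natural
    frequency is omitted because it does not change relative phases.
    """
    phase = list(start)
    for _ in range(steps):
        rate = [0.0] * n_osc
        for i, j, target, strength in couplings:
            pull = strength * math.sin(phase[i] - phase[j] - target)
            rate[i] -= pull
            rate[j] += pull
        phase = [p + dt * r for p, r in zip(phase, rate)]
    return phase

def lead_deg(phase, a, b):
    """Degrees by which oscillator a is ahead of oscillator b, in [-180, 180)."""
    d = math.degrees(phase[a] - phase[b])
    return (d + 180.0) % 360.0 - 180.0

V, C1, C2 = 0, 1, 2

# Simple onset ('pa'): C in-phase with V, so both clocks end up synchronous.
pa = settle(2, [(C1, V, IN_PHASE, 1.0)], [0.0, 0.5])

# Simple coda ('up'): V and C anti-phase, so the clocks end up 180 degrees apart.
up = settle(2, [(V, C1, ANTI_PHASE, 1.0)], [2.0, 0.0])

# Complex onset ('spa'): both Cs in-phase with V, anti-phase with each other.
spa = settle(3, [(C1, V, IN_PHASE, 1.0),
                 (C2, V, IN_PHASE, 1.0),
                 (C1, C2, ANTI_PHASE, 1.0)], [0.0, 0.3, -0.1])

print(round(lead_deg(pa, C1, V), 1))    # close to 0: synchronous
print(round(abs(lead_deg(up, V, C1))))  # close to 180: sequential
print(round(lead_deg(spa, C1, V), 1), round(lead_deg(spa, C2, V), 1))
```

With equal strengths the competing couplings settle on a compromise in which C1 runs about 60° ahead of the vowel clock and C2 about 60° behind it, so neither target relative phase is fully achieved. The size of the shift depends on the assumed strength ratio and should not be read as a prediction of the cited models.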
Figure 3a shows a coupling graph for ‘spa’, where /s/ and /p/ are both coupled in-phase to the vowel and coupled anti-phase to one another. Thus, the C-V coupling competes with the C-C coupling. As a result, compared to the single C case, C1 is shifted leftwards with respect to the V target, and C2 is shifted rightwards to overlap the vowel more (the ‘c-center’ effect, Browman and Goldstein 1988; 2000, Bombien et al. 2010, Gao 2009, Goldstein, Chitoran, and Selkirk 2007, Goldstein et al. 2009, Hermes et al. 2008, Marin and Pouplier 2010, Nam 2007, Nam, Goldstein, and Saltzman 2009, Shaw et al. 2009). In many languages, complex codas are defined by a non-competitive coupling structure, because of the weaker strength of anti-phase coupling. A coupling graph for a VCC coordination in English (e.g. ‘ask’) is provided in Figure 3b. Only C1 is linked directly to the V gesture; the coupling is in an anti-phase relation. The following Cs are coordinated only with respect to each other (anti-phase), but not directly to the V gesture.

Figure 3. Coupling graphs for (3a) ‘spa’ (complex syllable onset, CCV) and (3b) ‘ask’ (complex syllable coda, VCC) with in-phase (solid lines) and anti-phase (dotted lines) target specifications.

In what follows we will apply the basic coupling modes to the coordination of Tone gestures with oral constriction gestures. A tone can also be understood as a coordinated articulatory action to achieve a tonal task goal and thus defined as a dynamical system in F0 space, a ‘tone gesture’ (Gao 2009). Considering a tone as a gesture enables one to model the tone-to-gesture timing within the intergestural timing model. In a rising pitch accent, for example, the H tone gesture (or H gesture) involves a tonal movement to an H target in F0 (schematised in Figure 4). The onset of a Tone gesture is taken to be the point in time at which F0 begins to move in the direction of that gesture's target. In an LH rise, the onset of the H tone gesture coincides with the offset of the preceding L tone gesture. In this example of pitch accents, the beginning of the L gesture is unclear. Note that Tone gestures (L and H gestures in Figure 4) are dynamical systems of control that have extent in time (their activation intervals), while in the autosegmental view, tones are events that occur at instants in time (H and L in Figure 4).

Figure 4. Analysis of a rising LH pitch accent contour: Tones as gestural action units (above), and tones as events (below).

Gao (2009) extended the coupled oscillator model for intergestural timing to the analysis of the temporal pattern of lexical tones in Mandarin Chinese. She investigated syllables with single onsets (CV and CVC) such as [ma] or [man]. For syllables with only one tone (Tone 1–H, Tone 3–L), she showed that the oral constriction gestures (C, V) and the Tone gestures (T) are activated in the temporal order C-V-T. The onset of the consonant gesture occurred considerably (~50 ms) before the onset of the vowel gesture, while the onset of the Tone gesture occurred after the vowel gesture, with about the same lag. She demonstrated that this timing pattern of tones and constriction gestures (C and V) can be predicted by hypothesizing that Tone gestures function like C gestures in the competitive coupling topology in Figure 3a: the C and T gestures are both coupled in-phase to the vowel, and C and T are coupled in anti-phase to one another. As a result, the C gesture is shifted leftwards with respect to V, while the Tone gesture is shifted rightwards (‘c-center’-like coordination of C, V and T). The coupling graph for tones and oral constriction gestures in Mandarin Chinese proposed by Gao (2009) is provided in Figure 5. This hypothesis was further supported by the results for Tone 4 (HL). Here, the H tone was synchronized with the V, while C preceded and L followed by substantial lags.
This pattern provided evidence that C-H-L are all coupled anti-phase to one another and in-phase to the vowel.

Figure 5. Coupling graph for Tone 1–H, Tone 3–L in Mandarin Chinese, syllable [ma], adapted from Gao 2009. The Tone gesture (T) behaves like an additional consonant (C).

In the present study, we extend the work on Tone gestures to pitch accent tones, and we examine how these Tone gestures are temporally coordinated with oral constriction gestures and with other Tone gestures. One hypothesis is that they will be coordinated to C and V gestures in a manner similar to that observed in Chinese, namely that they function as C gestures, triggering the ‘c-center’ effect (causing the C to lead the V), and providing evidence for the competitive loop structure in Figure 3a. However, there are also good reasons for thinking that the nature of the coordination will be different. The tone gestures in Chinese are part of the lexical representation of words/syllables, and for this reason they should be integrated into the network of coupling relations that define those syllables. Since the coupling of a pitch accent to syllables is post-lexical, the coordination of a Tone gesture with a particular syllable in the utterance might not modify the intrasyllabic coupling relations that define that syllable.

These possibilities will be evaluated in the present study, taking the following factors into account. In both languages, Catalan and German, we test the effect of syllable structure (open and closed syllables) on the alignment patterns. It is reported elsewhere that peaks (the end of the rise) are systematically placed later in closed than in open syllables (Prieto and Torreira 2007a, Mücke et al. 2009). However, the effect of syllable structure on the start of the rise is unclear. In addition, we test for the effect of place of articulation on the alignment pattern to test for intrinsic variation (Löfqvist and Gracco 1999). The variation is expected to be gradient and to be the result of articulator-level interactions among the particular gestures that the coupling graph regulates. In Catalan, we also test for effects of focus structure on the alignment pattern. Contrary to German varieties, where focus structure affects the alignment of nuclear rises (later peaks in contrastive than in broad focus, Braun 2007; Braun and Ladd 2003), Catalan is expected to mark focus structure by peak height and not by peak alignment.
Therefore, we expect a similar alignment pattern for broad and narrow focus in Catalan reflecting the same type of coupling graph. Foot size is taken into account to test for effects of polysyllabic shortening, which is known to affect the location of the peak but not necessarily the start of the rise up to the peak.
2. Method

2.1. Speech materials

For all speech materials, pairs of meaningful sentences were constructed so as to place the pitch accents in nuclear position.

2.1.1. Catalan speech materials

For Catalan, two rising pitch accent types were investigated: LH rises in broad focus and LH rises in contrastive focus (the latter one with a higher peak position, Prieto et al. 2007b).

Doris Mücke, Hosung Nam, Anne Hermes and Louis Goldstein

Therefore, mini-dialogues were designed in which target words (such as the fictitious name ‘Mimami’) would carry either broad or contrastive focus, as in answer (1) in broad focus or (2) in contrastive focus. There was always a distance of at least one syllable to the end of the utterance. The nuclear rise was followed by a low boundary tone.

(1) Q.: Qui va venir? (lit.: Who came?)
    A.: La Mimami (lit.: The Mimami)

(2) Q.: Va venir la Mimamila? (lit.: Came the Mimamila?)
    A.: No, la Mimamzi (lit.: No, the Mimamzi)
Table 1 shows the structure of the eight target words with the lexically stressed syllable as the target syllable. We varied syllable structure (open and closed, such as 'CV.CV and 'CVC.CV), place of articulation of the consonants (C = labial or alveolar), and foot length (2 vs. 3 syllables). Each target syllable was preceded and followed by a minimum of one unstressed syllable to avoid alignment variation due to time pressure or tonal crowding (Atterer and Ladd 2004, Mücke and Hermes 2007, Kleber and Rathcke 2008).

Table 1. Structure of Catalan target words.

            labial            alveolar
open σ      [mi.ma.mi]        [ni.na.ni]
            [mi.ma.mi.la]     [ni.na.ni.la]
closed σ    [mi.mam.zi]       [ni.nan.mi]
            [mi.mam.zi.la]    [ni.nan.mi.la]
2.1.2. Viennese German speech materials

For Viennese German, all target words were placed in contrastive focus since it has been shown for different German varieties that nuclear contrastive accents in declaratives are more likely to involve LH rises (Baumann, Grice, and Steindamm 2006) than non-contrastive ones. Furthermore, they are followed by a low boundary tone. Therefore question-answer pairs were designed in which a contrast is forced on the target word as in the answer in (3).

(3) Q.: Hat sie die Mammi oder die Nanni bestohlen? (lit.: Has she the Mammi or the Nanni robbed?)
    A.: Sie hat die Mammi bestohlen (lit.: She has the Mammi robbed)
Four target words were constructed with the lexically stressed syllable as the target syllable (see table 2). Analogously to the Catalan data we varied syllable structure (open and closed syllables) and place of articulation of the consonants (labial vs. alveolar). The phonological syllable structure was varied by varying phonological vowel length, 'CV:CV vs. 'CVCV. In German, short vowels do not occur in open syllables if they are stressed. Therefore, we assume ambisyllabicity for the intervocalic C in the 'CVCV sequence (as suggested by psycholinguistic experiments carried out by Schiller, Meyer, and Levelt 1997, who have shown that Dutch speakers tend to close syllables containing a short vowel).

Table 2. Structure of Viennese German target words.

            labial           alveolar
open σ      [di#ma:.mi]      [di#na:.ni]
closed σ    [di#mami]        [di#nani]
Each target syllable was flanked by a minimum of one unstressed syllable. In German, the definite article (preceding the target syllable) can be assumed to form a single prosodic word with the content word to its right (‘die Mahmi’, the Mahmi).

2.2. Speakers and recordings

For Catalan, one native (female) speaker of Central Catalan participated in the experiment. For Viennese German (Standard Viennese Austrian), another native (female) speaker who grew up in Vienna was recorded. All recordings took place at the IfL Phonetics laboratory in Cologne. For kinematic and acoustic recordings a 2D Electromagnetic articulograph (Carstens AG100) and a time-synchronised DAT recorder were used. One sensor was placed on the vermilion border of the lower lip, one on the tongue blade (1 cm behind the tip), and one on the tongue body (4 cm behind the tip) on the midsagittal plane for capturing the movements of consonants and vowels. Two additional sensors on the bridge of the nose and the upper gums served as references in order to correct for head movements during the recordings (for further details, see Hoole and Kühnert 1996). All physiological data was recorded at 400 Hz, downsampled to 200 Hz and smoothed with a 40 Hz low-pass filter. The acoustic signal was digitized at 44.1 kHz/16 bit. All data was converted to SSFF format to enable annotation and analysis in the EMU speech database system.
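The preprocessing of the kinematic signals (400 Hz recording, downsampling to 200 Hz, 40 Hz low-pass smoothing) can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the filter type, so a Hamming-windowed sinc FIR filter is assumed here, downsampling is done by keeping every second sample, and all function names are hypothetical.

```python
import math

def lowpass_fir(cutoff_hz, fs_hz, numtaps=41):
    """Hamming-windowed sinc low-pass kernel, normalised to unit gain at 0 Hz.
    The filter design is an assumption; the chapter only states '40 Hz low-pass'."""
    fc = cutoff_hz / fs_hz                      # cutoff in cycles per sample
    mid = (numtaps - 1) / 2
    taps = []
    for n in range(numtaps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (numtaps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def filter_same(signal, taps):
    """Centred convolution returning as many samples as the input;
    samples beyond the edges are treated as zero."""
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += t * signal[j]
        out.append(acc)
    return out

def preprocess(samples_400hz):
    """Follow the order given in the text: downsample 400 Hz -> 200 Hz,
    then smooth with a 40 Hz low-pass filter."""
    downsampled = samples_400hz[::2]            # assumed: keep every second sample
    return filter_same(downsampled, lowpass_fir(40.0, 200.0))
```

With this design a slow articulator movement (a few Hz) passes essentially unchanged, while components well above 40 Hz are strongly attenuated. In practice an anti-aliasing step would normally precede the downsampling.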
Target words were produced in pseudo-randomized order with five repetitions each. The target utterances (embedded in mini-dialogues) were displayed on a monitor and the subject was instructed to read them in an appropriate way at a normal speaking rate. A total of 120 target words went into the analysis (Catalan: 8 target words × 5 repetitions × 2 focus structures; Viennese German: 4 target words × 10 repetitions). Both speakers consistently realized the nuclear words with rising pitch accents.

2.3. Analysis procedures

The following labelling criteria were used, involving the annotation of the F0 pitch contour, as well as the identification of segmental boundaries (acoustics) and gestural landmarks (kinematics) for consonants and vowels. The acoustic analysis evaluates the tonal alignment patterns within the autosegmental-metrical framework (synchronisation of tonal targets with segments, see Figures 1 and 4), while the articulatory analysis evaluates the coordination of Tone gestures with oral constriction gestures within Articulatory Phonology.

F0 labels: For the F0 analysis, we extracted F0 values with a 7.5 ms correlation window and a 3 ms frame spacing. Around the rise contour area we identified local turning points in the F0 contour by hand. We labelled a low valley (L) at the beginning of the LH rise. At the end of the rise, we labelled a high peak (H). For the acoustic analysis, L and H were treated as two tonal targets in accordance with the autosegmental-metrical approach. For the gestural analysis (articulatory phonology approach), the onset of a Tone gesture was taken to be the point in time at which F0 begins to move in the direction of that gesture's target (see section 1.1). That is, the autosegmental-metrical L was defined to be the same point in time as the onset of the articulatory phonology H tone gesture. This point in time also coincided, in the materials we are examining, with the offset of gestural activation of the preceding Tone gesture.
Acoustic analysis: For the acoustic analysis, we identified segmental boundaries of the target word in the acoustic waveform. To do this, we displayed an oscillogram and a wide-band spectrogram simultaneously. All segmental boundaries of vowels and consonants were labelled at abrupt changes in the spectra at the time the closure was formed or released: this was the case for the nasals, the laterals (especially in the spectra for the intensity of higher formants) and the fricatives (at random noise patterns in the higher frequency regions). Based on these boundaries, we measured the temporal lags between the beginning of the
F0 rise (the L valley) and the beginning of the initial C1 segment of the tonic syllable (Tone-C1 segment).

Articulatory analysis: We identified articulatory labels for movements in the vertical position time function of the respective sensors: lower lip for /m/, tongue tip for /n/, and tongue body for the vowel. Algorithmically, we identified the time of onset and effective target achievement of consonant and vowel gestures (and also offset for consonant gestures) at zero-crossings in the velocity curve. Based on these algorithmically determined time points, we measured temporal lags between the tone gestures and the oral constriction gestures (V and C gestures) using the onsets of gestural activation, which are the time points when gestures begin to move toward their targets. The labels V and C gesture are used to refer to the onsets of the vowel and the initial consonant gestures. For both acoustic and articulatory landmarks, the temporal lag between pairs of landmarks is reported as A-B. Thus, a negative value implies that A occurs earlier than B, and vice versa for a positive value.
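The landmark and lag conventions described above can be illustrated with a minimal sketch. It is not the labelling algorithm actually used (which is not given in the text); the function names, the toy position trace and the choice to read a negative-to-positive velocity change as a gesture onset are assumptions for illustration.

```python
def velocity(position, fs_hz):
    """Central-difference velocity (position units per second);
    one-sided differences are used at the two ends of the trace."""
    last = len(position) - 1
    vel = []
    for i in range(len(position)):
        lo, hi = max(i - 1, 0), min(i + 1, last)
        vel.append((position[hi] - position[lo]) * fs_hz / (hi - lo))
    return vel

def zero_crossings(vel):
    """Return (index, direction) for every sign change of the velocity.
    direction is +1 for negative-to-positive and -1 for positive-to-negative.
    If exact zeros lie between the two signs, the first zero sample is reported."""
    events = []
    prev_sign = 0      # sign of the most recent non-zero sample
    pending = None     # first zero sample seen since prev_sign
    for i, v in enumerate(vel):
        sign = (v > 0) - (v < 0)
        if sign == 0:
            if prev_sign != 0 and pending is None:
                pending = i
            continue
        if prev_sign != 0 and sign != prev_sign:
            events.append((pending if pending is not None else i, sign))
        prev_sign = sign
        pending = None
    return events

def index_to_ms(index, fs_hz):
    """Convert a sample index to a time in milliseconds."""
    return 1000.0 * index / fs_hz

def lag_ms(a_ms, b_ms):
    """Lag reported as A-B: negative when A occurs earlier than B."""
    return a_ms - b_ms

# Toy vertical position trace sampled at 200 Hz (hypothetical values).
trace = [5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 4, 3]
events = zero_crossings(velocity(trace, 200.0))
onset_ms = index_to_ms(events[0][0], 200.0)    # start of the upward movement
target_ms = index_to_ms(events[1][0], 200.0)   # end of the upward movement
```

A tone onset at, say, 30 ms would then give `lag_ms(30.0, onset_ms)` as a positive Tone-C lag, meaning that the tone starts after the consonant gesture.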
3. Results

Section 3.1 reports the acoustic and articulatory alignment patterns for the nuclear LH rises in Catalan, and section 3.2 reports on the same for Viennese German. We included all stimuli in the statistical analysis (acoustics and articulation).

3.1. Acoustic and articulatory alignment patterns in Catalan

All alignment latencies (means in ms and standard deviations in parentheses) are reported in table 3 for all Catalan data. In addition, alignment latencies for the labial data are provided graphically in Figure 6. First, we present the acoustic alignment results. We calculated temporal lags between the L tone and the beginning of the initial C1 segment in the lexically stressed syllable (Tone-C1 segment, table 3, acoustic results). In Catalan, the L tone occurs before C1 (on average 28 ms before C1 in the labial data and 33 ms in the alveolar data). Figure 6a provides the corresponding mean alignment latencies (Tone-C1 segment) in the labial dataset. The zero line (dotted line) marks the beginning of the initial C1 segment. The negative values indicate that the L tone leads the initial C1 segment in all investigated conditions.
Figure 6. Catalan acoustic (a) and articulatory (b–d) alignment latencies in ms, bilabial data.
For within-speaker comparison, a three-way ANOVA (2 × 2 × 2) was conducted separately for the labial and alveolar dataset.2 We included the dependent variable Tone-C1 segment and the independent variables Focus Structure (contrastive/broad focus), Syllable Structure (open/closed) and Foot Size (2 syllables/3 syllables). There were no significant results in the ANOVA, neither for the labial dataset (p > 0.05) nor for the alveolar dataset (p > 0.05). The acoustic alignment was not affected by syllable structure or foot size.

2. We treated labial and alveolar datasets separately to avoid the effects of intrinsic variation (due to different organs) in timing patterns.

Table 3. Catalan mean lags (in ms) and standard deviations in parentheses for acoustic (Tone-C1 segment) and articulatory alignment measures, separately for broad and contrastive focus, all data. The articulatory measures include the lags Tone-V gesture, Tone-C gesture and C-V gesture.

                                   Acoustic (segments)   Articulation (gestures)
                                   Tone-C1 segment       Tone-V gesture        Tone-C gesture        C-V gesture
                                   broad      contr      broad      contr      broad      contr      broad      contr
labial, open σ      ['ma.mi]       –37 (18)   –25 (16)   –7 (19)    9 (13)     –4 (18)    9 (15)     –3 (2)     0 (7)
                    ['ma.mi.la]    –27 (8)    –25 (11)   7 (8)      5 (11)     8 (6)      11 (9)     –1 (5)     –6 (3)
labial, closed σ    ['mam.zi]      –31 (17)   –19 (25)   –7 (14)    14 (24)    –1 (15)    15 (26)    –6 (6)     –1 (4)
                    ['mam.zi.la]   –29 (9)    –28 (15)   4 (12)     6 (14)     1 (10)     7 (15)     3 (3)      –1 (3)
alveolar, open σ    ['na.ni]       –34 (6)    –28 (9)    –5 (7)     1 (8)      11 (8)     4 (5)      –14 (7)    –3 (9)
                    ['na.ni.la]    –29 (7)    –28 (7)    2 (3)      13 (8)     16 (6)     5 (7)      –14 (3)    8 (4)
alveolar, closed σ  ['nan.mi]      –35 (7)    –36 (17)   –5 (6)     2 (7)      8 (6)      6 (5)      –13 (5)    –4 (4)
                    ['nan.mi.la]   –30 (8)    –41 (16)   3 (9)      7 (6)      13 (8)     –3 (6)     –10 (3)    10 (9)

In the articulatory analysis, we measured temporal lags of the same tone point as in the acoustic analysis (onset of rise), only now lags were measured with respect to the onsets of the C and V gestures. The onsets of the C and V gestures with respect to each other were also measured. The lags are all very close to 0 (within 10 ms), indicating that the gestures are triggered synchronously. The Tone gesture lagged the V gesture slightly (by 4 ms in the labial data, Figure 6b, and by 2 ms in the alveolar data) and the C gesture by slightly more: 6 ms in the labial data, Figure 6c, and 8 ms in the alveolar data. Compatibly, the C gesture led the V gesture slightly (C-V lags of –2 ms in the labial data and –5 ms in the alveolar data). Thus, gestural onsets occur in the order C-V-T, but the lags are tiny.

As in the acoustic analysis, we tested the articulatory measures with three-way ANOVAs (2 × 2 × 2) conducted separately for the labial and alveolar dataset. There were no significant results for the labials. However, in the alveolar dataset we found a main effect of Focus structure on all measures: Tone-V gesture [F(1, 40) = 10.43, p < 0.01], Tone-C gesture [F(1, 40) = 17.45, p < 0.001] and C-V gesture [F(1, 40) = 62.83, p < 0.001]. In contrastive focus compared to broad focus the tone starts 5 ms later in the Tone-V measure, 9 ms earlier in the Tone-C measure and the V gesture starts 10 ms later in the C-V measure. Furthermore, there was also a main effect of Foot Size on the measures Tone-V gesture [F(1, 40) = 12.96, p < 0.001] and C-V gesture [F(1, 40) = 12.40, p < 0.01], but not on the measure Tone-C gesture (p > 0.05). In a two-syllable foot compared to a three-syllable foot
the tone starts 8 ms earlier in the Tone-V measure and the V gesture starts 7 ms earlier in the C-V measure. Table 4 gives an overview of the effects found in the articulatory analysis for Catalan.

Table 4. Summary of effects found for Catalan articulatory alignment measures.

                         bilabial                    alveolar
Catalan                  Tone-V   Tone-C   C-V       Tone-V   Tone-C   C-V
Syllable structure       ns       ns       ns        ns       ns       ns
Foot size                ns       ns       ns        ***      ns       **
Focus structure          ns       ns       ns        **       ***      ***
Place of articulation    ns       ns       ns        ns       ns       ns
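The overall Catalan lags quoted in the running text can be recovered, after rounding, by averaging the cell means of Table 3. The sketch below does this as an unweighted mean over the eight cells per place of articulation; this is an assumption about how the summary values relate to the table (the authors presumably worked from the individual tokens), and the names used are hypothetical.

```python
# Cell means (ms) transcribed from Table 3. Each row holds the
# (broad, contrastive) pair for Tone-C1 segment, Tone-V, Tone-C and C-V.
MEASURES = ("tone_c1_segment", "tone_v", "tone_c", "c_v")

TABLE3 = {
    "labial": [
        (-37, -25, -7, 9, -4, 9, -3, 0),      # ['ma.mi]
        (-27, -25, 7, 5, 8, 11, -1, -6),      # ['ma.mi.la]
        (-31, -19, -7, 14, -1, 15, -6, -1),   # ['mam.zi]
        (-29, -28, 4, 6, 1, 7, 3, -1),        # ['mam.zi.la]
    ],
    "alveolar": [
        (-34, -28, -5, 1, 11, 4, -14, -3),    # ['na.ni]
        (-29, -28, 2, 13, 16, 5, -14, 8),     # ['na.ni.la]
        (-35, -36, -5, 2, 8, 6, -13, -4),     # ['nan.mi]
        (-30, -41, 3, 7, 13, -3, -10, 10),    # ['nan.mi.la]
    ],
}

def grand_mean(place, measure):
    """Unweighted mean over all words and both focus conditions."""
    col = 2 * MEASURES.index(measure)
    values = [v for row in TABLE3[place] for v in row[col:col + 2]]
    return sum(values) / len(values)

for place in TABLE3:
    summary = {m: round(grand_mean(place, m)) for m in MEASURES}
    print(place, summary)
```

For the labial data this gives roughly –28 ms (Tone-C1 segment), 4 ms (Tone-V), 6 ms (Tone-C) and –2 ms (C-V), in line with the values reported above.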
To sum up, in the Catalan data, the C, V and Tone gestural onsets are very close to being synchronous, occurring in the order of C-V-T. Furthermore, the lags in the gestural analysis turned out to be less affected by prosodic factors in the labial dataset compared to the alveolar dataset. However, a one-way ANOVA on all data (labial and alveolar together) revealed no effect of Place of Articulation on the respective measures (Tone-V gesture, p > 0.05; Tone-C gesture, p > 0.05; C-V gesture, p > 0.05).

3.2. Acoustic and articulatory alignment patterns in Viennese German

Table 5 summarizes all alignment latencies for the Viennese German data with means in ms and standard deviations in parentheses. Figure 7 graphically provides medians and quartiles for the labial dataset. In the acoustic analysis (Tone-C1 segment) the L tone occurs after the onset of the initial C1 segment (on average 71 ms in the labial condition and 60 ms in the alveolar condition). That is also reflected in Figure 7a. The zero line marks the beginning of the C1 segment, and the positive values indicate that the L tone occurs considerably after the beginning of C1 in all conditions. A one-way ANOVA revealed an effect of Syllable Structure (open/closed) on the acoustic Tone-C1 segment measure in the labial dataset [F(1, 20) = 14.80, p < 0.01] as well as in the alveolar dataset [F(1, 20) = 6.12, p < 0.05]. In the labial data, the L tone occurs 21 ms later in open syllables than in closed ones. In the alveolar data, the L tone occurs 17 ms earlier in open than in closed syllables. Furthermore, there was an effect of Place of Articulation in an overall ANOVA [F(1, 40) = 6.32, p < 0.05]: the L tone occurred systematically later in the labial data than in the alveolar data (11 ms).
Table 5. Viennese German mean lags (in ms) and standard deviations in parentheses for acoustic (Tone-C1 segment) and articulatory alignment measures, contrastive focus, all data. The articulatory measures include Tone-V gesture, Tone-C gesture and C-V gesture.

                                   Acoustic (segments)   Articulation (gestures)
                                   Tone-C1 segment       Tone-V gesture   Tone-C gesture   C-V gesture
                                   (contrast)            (contrast)       (contrast)       (contrast)
labial, open σ      ['ma:.mi]      81 (16)               144 (16)         141 (15)         3 (7)
labial, closed σ    ['mami]        60 (8)                122 (8)          115 (10)         7 (7)
alveolar, open σ    ['na:.ni]      51 (19)               83 (11)          95 (10)          –12 (9)
alveolar, closed σ  ['nani]        68 (11)               70 (17)          76 (17)          –6 (6)
In the articulatory analysis, the Tone gesture onset occurred considerably after the start of the oral constriction gestures (Tone-V gesture, Tone-C gesture). For the labial set, lags were 133 ms after the V gesture (Figure 7b) and 128 ms after the C gesture (Figure 7c). In the alveolar data, the tone occurred 86 ms after the C gesture and 77 ms after the V gesture. However, there was still a tight synchronisation between the oral constriction gestures in the C-V measure (5 ms in the labial data, Figure 7d, and –9 ms in the alveolar data).

We conducted one-way ANOVAs separately for the labial and alveolar dataset (see also table 6 for a brief overview of the effects found for Viennese German). In both datasets, we found main effects of Syllable Structure on the Tone-V and Tone-C measures, but not on the C-V measures (p > 0.05). In the labial data, lags between the tone and the oral constriction gestures were smaller in closed syllables containing a phonologically short vowel compared to open syllables containing a phonologically long vowel (22 ms in the Tone-V measure [F(1, 20) = 15.14, p < 0.01], and 26 ms in the Tone-C measure [F(1, 20) = 22.18, p < 0.001]). In the alveolar data, lags were 19 ms smaller in closed syllables in the Tone-C measure [F(1, 20) = 8.58, p < 0.01], while in the Tone-V measure no effect of Syllable Structure was found (p > 0.05). In an overall analysis (labial and alveolar data together), the effect of Place of Articulation was significant for the measures Tone-V gesture [F(1, 40) = 121.46, p < 0.001], Tone-C gesture [F(1, 40) = 57.18, p < 0.001] and C-V gesture [F(1, 40) = 35.68, p < 0.001]. Lags between the Tone and oral constriction gestures were smaller in the alveolar condition compared to the labials (by
Figure 7. Viennese German acoustic (a) and articulatory (b–d) alignment latencies in ms, bilabial data.
57 ms in the Tone-V measure, and by 43 ms in the Tone-C measure). In the C-V measure the vowel starts 16 ms earlier in the alveolar condition. Table 6 gives a brief overview of the effects found in the Viennese German analysis.

Table 6. Summary of effects found for Viennese German articulatory alignment measures.

                         bilabial                    alveolar
Viennese German          Tone-V   Tone-C   C-V       Tone-V   Tone-C   C-V
Syllable structure       **       ***      ns        ns       **       ns
Place of articulation    **       ***      ***       **       ***      ***
To sum up, Tone gestures are substantially delayed in Viennese German, on the order of 100 ms, while the C and V gestures remain synchronized, as in Catalan.
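The size of the tone delay and the syllable-structure differences reported for Viennese German follow from simple arithmetic on the cell means in Table 5. The sketch below is illustrative only: it uses unweighted means of the four cells, which is an assumption about how the summary values were obtained, and the dictionary keys and function names are hypothetical.

```python
# Cell means (ms) transcribed from Table 5 (contrastive focus only).
TABLE5 = {
    ("labial", "open"):     {"tone_c1_segment": 81, "tone_v": 144, "tone_c": 141, "c_v": 3},
    ("labial", "closed"):   {"tone_c1_segment": 60, "tone_v": 122, "tone_c": 115, "c_v": 7},
    ("alveolar", "open"):   {"tone_c1_segment": 51, "tone_v": 83, "tone_c": 95, "c_v": -12},
    ("alveolar", "closed"): {"tone_c1_segment": 68, "tone_v": 70, "tone_c": 76, "c_v": -6},
}

def place_mean(place, measure):
    """Mean of the open- and closed-syllable cells for one place of articulation."""
    return (TABLE5[(place, "open")][measure] + TABLE5[(place, "closed")][measure]) / 2

def open_minus_closed(place, measure):
    """Positive when the lag is larger in open than in closed syllables."""
    return TABLE5[(place, "open")][measure] - TABLE5[(place, "closed")][measure]

def mean_tone_delay():
    """Average of all Tone-V and Tone-C cells: the overall delay of the tone
    relative to the oral constriction gestures."""
    values = [cell[m] for cell in TABLE5.values() for m in ("tone_v", "tone_c")]
    return sum(values) / len(values)

print(place_mean("labial", "tone_v"), place_mean("labial", "tone_c"))
print(open_minus_closed("labial", "tone_c"), open_minus_closed("alveolar", "tone_c"))
print(mean_tone_delay())
```

The labial means come out at 133 ms (Tone-V) and 128 ms (Tone-C), and the average over all Tone-V and Tone-C cells is about 106 ms, consistent with a delay on the order of 100 ms.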
4. Discussion of Catalan and Viennese German

First, we compare the acoustic alignment patterns in Catalan and Viennese German. Figure 8 summarises the main findings for the target foot [ma.mi] in Catalan (left) and [ma:.mi] in Viennese German (right). The figure is to scale and based on statistical means. The lexically stressed syllable is shaded grey.
Figure 8. Schematic acoustic alignment patterns for nuclear LH rises, contrastive focus, in Catalan (left) and Viennese German (right).
In both languages, L occurs in the vicinity of the C1 segment. In Catalan, L occurs on average just before the beginning of C1, whereas in Viennese German L occurs after it. The alignment difference amounts to 95 ms averaged across all data (112 ms in the ‘Ma(h)mi’ cases in Figure 8). The consistency of the tonal alignment in Catalan (across variations in syllable structure and accent type) supports the segmental anchor hypothesis proposed by Arvaniti, Ladd, and Mennen (1998) that F0 landmarks are aligned with segments. The observed differences between Catalan and Viennese German further support the hypothesis that alignment properties can vary across languages (among others Arvaniti, Ladd, and Mennen 1998 for Greek, Prieto and Torreira 2007a for Spanish, Ladd et al. 1999 for English, Ladd, Mennen, and Schepman 2000 for Dutch, D’Imperio, Petrone, and Nguyen 2007 for Neapolitan Italian) as well as across different dialects of a language (Atterer and Ladd 2004, Braun 2007, Mücke et al. 2008a,b, Kleber and Rathcke 2008, Mücke et al. 2009 for Northern, Southern and Eastern German varieties).

However, it is difficult to account for these alignment differences phonologically, i.e. within autosegmental-metrical theory, since they involve the assumption of different acoustic anchor types such as ‘the left edge of the onset consonant in the lexically stressed syllable’ for Catalan versus ‘the middle of the onset consonant in the lexically stressed syllable’ for Viennese German. Therefore, those alignment differences should not be regarded as reflecting differences in phonological association; they should be seen as phonetic detail (Atterer and Ladd 2004; Ladd 2008). In accordance with this view, our results can only point to the same segmental L anchor in both languages (the initial C of the
lexically stressed syllable) with the alignment differences between Viennese German and Catalan (later alignment in Viennese German) being phonetic or gradient in nature. It further raises the question of just how the alignment is being controlled by speakers. Surely what is being controlled is not a matter of the number of ms following the onset consonant. Note, too, that the H tone target in Viennese German is not reached during the accented vowel, and in some cases does not even occur in the stressed syllable at all. These issues begin to make sense when we consider alignment from the perspective that speakers coordinate the onset of gestures using simple coupling modes. The patterns of gestural timing in Catalan and Viennese German can be visualized with the aid of the gestural scores in Figures 9 and 10. They represent the activation interval of the gestures estimated from the onset and target of the gestural action. From top to bottom the figure displays the activation of the Tone gestures L and H, the labial closure gesture for [m], and the tongue body constriction for the vowels: pharyngeal wide [a] and palatal narrow [i]. The dotted lines for the L gestures indicate that the start of the L tone gesture in the LH pitch accents cannot be estimated from this data. Gestural scores for Catalan show for the production of the lexically stressed syllable that the onsets of the consonantal (lab clo), vocalic (phar wide) and tone (H) gestures all coincide. For both broad and contrastive focus, the onset of the H gesture is synchronous (within +/–15 ms threshold) with the onset of the constriction gestures for the accented V and the initial C during the lexically stressed syllable. This pattern can be explained by an ‘in-phase’ coordination of Tone and oral gestures. C, V, and T are initiated synchronously. (The gestures are initiated in the order C-V-T, but the lags are tiny, 0.05) for any of the three speakers. 
For potential effects on intergestural timing from syllable type, initial consonant, and tone, please refer to the details of the three-way ANOVAs which are given below.
Fang Hu
Table 2. Mean durations in milliseconds and standard deviations in parentheses of the CV and VT lags from Speaker 1.

syllable type   initial   CV lag     VT lag     p-value   n
CVS             p         76 (36)    74 (25)    0.5849    86
                m         80 (37)    75 (23)    0.1630    87
CVNɁ            p         85 (34)    77 (29)    0.1975    30
                m         92 (39)    70 (17)    0.0579    14
CVɁ             p         87 (32)    80 (31)    0.1721    29
                m         88 (41)    67 (22)    0.0017    30
CVh             p         86 (34)    75 (32)    0.0206    29
                m         93 (34)    73 (23)    0.0004    30
Table 3. Mean durations in milliseconds and standard deviations in parentheses of the CV and VT lags from Speaker 2.

syllable type   initial   CV lag     VT lag     p-value   n
CVS             p         24 (10)    26 (7)     0.4367    57
                m         26 (9)     25 (8)     0.607     60
CVNɁ            p         28 (7)     24 (8)     0.1928    18
                m         29 (10)    26 (9)     0.63
CVɁ             p         30 (8)     24 (5)     0.0137    20
                m         31 (12)    28 (8)     0.5333    19
CVh             p         25 (10)    23 (8)     0.5806    0
                m         27 (7)     27 (9)     0.8233    20
9
and low-toned syllables. Results from the paired t-tests, as listed by the p-values in the last column in the tables, show that there is no significant difference between the durations of CV lags and VT lags in most cases for Speakers 1 and 2. Two cases from Speaker 1 and one case from Speaker 2, as signified by the shaded cells in the tables, show that the difference is significant at the 95% confidence level. However, all cases from Speaker 3 exhibit a highly signifi-
Tonogenesis in Lhasa Tibetan – Towards a gestural account
Table 4. Mean durations in millisecond and standard deviations in parentheses of the CV and VT lags from Speaker 3. syllable type CVS
CVNɁ
CVɁ
CVh
initial
CV lag
VT lag
p-value
n
p
81 (33)
42 (26)
: more sonorous than. . .) [Gouskova 2004; Zec 2007]
Low Vs > Mid Vs > High Vs & Glides > Rhotics > Laterals > Nasals > Voiced Fricatives > Voiced Stops > Voiceless Fricatives > Voiceless Stops

In particular, we get sonority rises (pl, dv, zn, ðʎ), plateaus (pt, sx) and even reversals (θk, mk). Existence of plateaus and reversals is generally deemed not ideal for onset clusters, but proves unproblematic if one endorses Berent et al. (2007: 594) who state that: “In any given language: (a) The presence of a small sonority rise in the onset implies that of a large one. (b) The presence of a sonority plateau in the onset implies that of some sonority rise. (c) The presence of a sonority fall in the onset implies that of a plateau”.

Word-medial biconsonantal clusters on the other hand, form for the most part fine coda-onset sequences, in line with the Syllable Contact Law (Hooper 1976; Vennemann 1988; Baertsch 2002; Gouskova 2001, 2004), which asks that sonority falls across syllable boundaries. Exceptions are: tk, ss (sonority

13. See the Appendix for words containing these clusters.
The acoustics of high-vowel loss in a Northern Greek dialect
plateaus), as well as ɣð and ʃm. The latter comprise a sonority rise, indicating that they are instead complex onsets. This idea is corroborated by the fact that ɣð and zm – akin to ʃm – are also found word-initially as underlying clusters, e.g. ɣðérno ‘flay’, zmínos ‘flock’. Longer clusters, e.g. str may also constitute coda-onset sequences, containing either a complex onset (s.tr) or a complex coda (st.r).14

The majority of word-final clusters ends in /n/, followed by /s/ and then /r/ or /t/. Matters seem much more complicated here; assuming a syllabification that only involves complex codas, then we should only anticipate falling or, at worst, level-sonority codas. Instead, we also find rises, e.g. tr, skn. However, it is well-known that sequences of word-final consonants may commonly appear as extraprosodic or extrasyllabic (Vaux and Wolfe 2009; Goad 2011), thus escaping sonority considerations. Alternatively, and given that most of the clusters end in n, it is also possible to pursue an account that views such consonantal sonority-peaks as syllabic consonants.

The fact of the matter is that we cannot at present offer a clearer picture of KG syllabification, since this requires a number of resources we currently lack; these include among others: syllable-counting perception experiments and, of course, additional data. We believe this matter partly accounts for an issue raised by a reviewer, namely the distribution of the consonants surrounding the VD-undergoing vowel. Differences in that respect may relate to syllabification issues as well as token frequency which may skew the distribution; for example, the word [spit] from /spiti/ ‘house’ occurs several times in our data, therefore increasing the number of [t] tokens in C1 word-final position. One thing is nonetheless certain; KG not only admits a richer inventory of clusters in all positions (cf. (3)), but it also concedes a wider range of final singleton codas than SMG.
(5) lists those found word-finally in both dialects. It is evident that KG includes a much larger coda inventory. Besides [n, s, r] which occur as underived codas in both dialects, they also appear as derived ones after vowel deletion in KG, along with the remaining consonants below.

(5) Singleton codas in KG and SMG and representative examples

KG     n, s, r, m, ʎ, t, c, ts, ʃ, z, f, ð, v
       e.g. beθamén ‘dead-fem’, ðjavás ‘read-3sg’, záxar ‘sugar’, m ‘mine’, óʎ ‘all’, spít ‘house’, sukác ‘alley’, ts ‘of-fem-sg-gen’, ʒíʃ ‘live-3sg’, ríz ‘rice’, céf ‘exuberance’, láð ‘oil’, nív ‘bride’
SMG    n, s, r (very rarely)
14. On the special status of /s/ in clusters and various other possibilities of syllabification, see Goad (2011).
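The classification of two-consonant onsets into sonority rises, plateaus and reversals according to the sonority scale given earlier can be made explicit with a small sketch. The scale is taken from the text; the mapping of individual segments to sonority classes covers only the example clusters and is an assumption for illustration, as are the function and variable names.

```python
# Sonority classes from the scale in the text, most sonorous first.
SCALE = [
    "low vowel", "mid vowel", "high vowel or glide", "rhotic", "lateral",
    "nasal", "voiced fricative", "voiced stop", "voiceless fricative",
    "voiceless stop",
]
RANK = {name: len(SCALE) - i for i, name in enumerate(SCALE)}

# Assumed class membership, limited to the consonants in the examples.
SEGMENT_CLASS = {
    "p": "voiceless stop", "t": "voiceless stop", "k": "voiceless stop",
    "d": "voiced stop",
    "s": "voiceless fricative", "x": "voiceless fricative", "θ": "voiceless fricative",
    "v": "voiced fricative", "z": "voiced fricative", "ð": "voiced fricative",
    "m": "nasal", "n": "nasal",
    "l": "lateral", "ʎ": "lateral",
}

def sonority_profile(c1, c2):
    """Classify a two-consonant onset by comparing the sonority of C2 with C1."""
    r1 = RANK[SEGMENT_CLASS[c1]]
    r2 = RANK[SEGMENT_CLASS[c2]]
    if r2 > r1:
        return "rise"
    if r2 == r1:
        return "plateau"
    return "reversal"

for cluster in ("pl", "dv", "zn", "ðʎ", "pt", "sx", "θk", "mk"):
    print(cluster, sonority_profile(cluster[0], cluster[1]))
```

Under these assumptions the eight onset clusters cited in the text come out as four rises, two plateaus and two reversals, matching the classification given there.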
Nina Topintzi and Mary Baltazani
5. Typological observations

Gordon’s (1998) typological survey of VD compiles numerous properties of devoiced/voiceless vowels, many of which are also attested in KG VD (e.g. the gradience and usually allophonic nature of the phenomenon, token-by-token variation, the strong preference to devoice high vowels, etc.). Gordon additionally points out that voiceless vowels are usually favoured in particular positions within the word, primarily word-finally and then adjacent to voiceless consonants. To account for the attested patterns, he is inspired by work by Dauer (1980) and Jun and Beckman (1993, 1994) and offers two rather distinct explanations of vowel devoicing depending on the position within the word. Word-finally, where devoicing is most predominant, the low subglottal pressure characteristic of that position is held responsible. Word-medially however, a gestural overlap account is promoted instead, whereby unstressed high Vs, intrinsically quite short in duration, are more susceptible to having their glottal adduction gesture overlapped by the glottal abduction gestures of neighbouring voiceless consonants. The gradience observed in this phenomenon is captured by the extent of this overlap – the more extensive the overlap, the more complete the devoicing.

This split in accounts has a welcome result. The word-final explanation works regardless of voicing considerations and is independent of the word-medial voicing explanation. Indeed, as the empirical facts reveal (both our own and cross-linguistically), final vowel devoicing disregards the voicing of neighbouring consonants (see Table 1, rows e+f). On the other hand, the overlap account word-medially specifically predicts that devoicing will more likely occur when a vowel is flanked by voiceless consonants both ways. It is also corroborated cross-linguistically since VD is indeed more frequent if both Cs are voiceless, followed by cases where C2 is voiceless and finally where C1 is voiceless (Gordon 1998: 98).
One final possibility has not yet been discussed, namely the case where VD takes place between voiced consonants. In Gordon’s (1998) typological study of VD this pattern is omitted altogether, presumably because it never arises in any of the 55 languages in his survey.15 Furthermore, to our knowledge, none of the VD accounts that employ gestural overlap has addressed this possibility.

15. Dauer’s (1980) study on Standard Greek reports the same results regarding the distribution of VD, although she claims that instances where VD occurs after voiceless C1 are somewhat more frequent than those where C2 is voiced. She also states that reduction between voiced Cs happens but is highly rare, which is why she totally disregards it in the ensuing discussion.
The acoustics of high-vowel loss in a Northern Greek dialect
We propose however that VD of this type occurs and that gestural overlap can extend to it too. In fact, 12% of KG VD occurs between voiced consonants, e.g. /ðuʎa/ → [ðʎá] ‘work, job’, /duvarja/ → [dvárja] ‘walls’, /maɾiɣula/ → [maɾɣúla] ‘a female name’ (cf. Fig. 10).
Figure 10. Complete i-deletion between voiced Cs in the word [maɾiɣula] ‘a female name’.
Recall that in this paper VD has been used as a cover term and does not specifically refer to vowel devoicing or vowel deletion. The latter two are just a couple of the stages encompassed by the phenomenon in question. What we predict then is that between voiced consonants all stages of VD should be able to emerge, save one: vowel devoicing itself.16 This is because voiced consonants have a glottal gesture similar to that of vowels. Thus, neither of the consonants can be associated with a devoicing gesture that could overlap into the vowel. Consequently, VD, with the exception of the devoicing stage, may occur. Given the above, we hypothesise that word-medial VD as a phenomenon may appear between all types of consonants in terms of voicing. However, its possible realisations between voiced consonants form a subset of those emerging between other combinations of consonants. The hypothesised situation is schematised in (6). At present, we lack sufficient data to test this prediction adequately; nonetheless, initial examination of the data at hand seems to support our proposal. We anticipate that future work will be able to offer a more conclusive answer.
16. Such devoicing does arise in KG as shown in Figure 11.
Nina Topintzi and Mary Baltazani
Figure 11. Final i-deletion in the word [spit], accompanied by aspiration and formant structure, but no voice bar.
(6) Stages of VD in the environment C1 V[high unstr.] C2
    If C1, C2 = –voi: All stages are possible, but devoicing should be found here with the highest frequency
    If C1 = –voi, C2 = +voi or C1 = +voi, C2 = –voi: All stages are possible, but devoicing should appear less frequently
    If C1, C2 = +voi: All VD stages are possible with the exception of devoicing
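The three cases in (6) can be restated as a small lookup. The Python sketch below is illustrative only: the function name and the returned labels are hypothetical shorthand, and it encodes nothing beyond what (6) says about the devoicing stage.

```python
def devoicing_expectation(c1_voiced: bool, c2_voiced: bool) -> str:
    """Expectation for the devoicing stage of VD in C1 V[high unstr.] C2, per (6).

    The returned labels are hypothetical shorthand for the three cases in (6).
    """
    voiceless_flankers = [c1_voiced, c2_voiced].count(False)
    if voiceless_flankers == 2:
        # Both flanking consonants are voiceless.
        return "possible, highest frequency"
    if voiceless_flankers == 1:
        # Exactly one flanking consonant is voiceless (either order).
        return "possible, less frequent"
    # Both flanking consonants are voiced: no devoicing gesture to overlap.
    return "excluded"


def voicing_label(voiced: bool) -> str:
    """Render a voicing value in the notation used in (6)."""
    return "+voi" if voiced else "-voi"


if __name__ == "__main__":
    for c1 in (False, True):
        for c2 in (False, True):
            print(voicing_label(c1), voicing_label(c2), "->",
                  devoicing_expectation(c1, c2))
```

Note that the sketch only covers devoicing; the other stages of VD are, according to (6), possible in all three environments.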
Moreover, we contend that the gestural overlap account (GOA) alone is not sufficient to explain the full range of attested facts cross-linguistically. There are numerous other traits that it leaves unaccounted for, which should be further investigated. For example, GOA cannot explain why in Kozani Greek VD is much more frequent when C2 is voiceless (row c) than when C1 is (row b) (see Table 1 repeated here as Table 4), although the two patterns are identical in the sense that both share the presence of a –voi and a +voi consonant (but in different linear order).
Table 4. VD frequency in different voicing environments. The last three columns show, from left to right, frequency in word medial positions (% medial), in word final position (% final), and in all positions considered together (Total %).
     Pattern          i     U    Total #   % of medial   % of final   Total %
a.   –voi VD –voi    31     8      39        40.21                     20.31
b.   –voi VD +voi    10     2      12        12.38                      6.25
c.   +voi VD –voi    34     1      35        36.08                     18.23
d.   +voi VD +voi     6     5      11        11.34                      5.73
e.   –voi VD#        44    11      55                       57.9       28.65
f.   +voi VD#        35     5      40                       42.1       20.83
     TOTAL          160    32     192                                 100
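The percentages in Table 4 can be recomputed from the raw counts. The following Python sketch assumes that "% of medial" and "% of final" are each row's share of the tokens in that word position, and that "Total %" is the row's share of all 192 tokens; the variable and function names are hypothetical.

```python
# Raw token counts from Table 4: row -> (word position, count for i, count for U).
TABLE4_COUNTS = {
    "a": ("medial", 31, 8),   # -voi VD -voi
    "b": ("medial", 10, 2),   # -voi VD +voi
    "c": ("medial", 34, 1),   # +voi VD -voi
    "d": ("medial", 6, 5),    # +voi VD +voi
    "e": ("final", 44, 11),   # -voi VD#
    "f": ("final", 35, 5),    # +voi VD#
}


def table4_percentages(counts):
    """Return row -> (total tokens, % within its position, % of all tokens)."""
    row_totals = {row: i + u for row, (_, i, u) in counts.items()}
    grand_total = sum(row_totals.values())

    # Sum the row totals separately for medial and final position.
    position_totals = {}
    for row, (position, _, _) in counts.items():
        position_totals[position] = position_totals.get(position, 0) + row_totals[row]

    result = {}
    for row, (position, _, _) in counts.items():
        total = row_totals[row]
        result[row] = (
            total,
            round(100 * total / position_totals[position], 2),
            round(100 * total / grand_total, 2),
        )
    return result


if __name__ == "__main__":
    for row, values in table4_percentages(TABLE4_COUNTS).items():
        print(row, values)
```

Under these assumptions the recomputed figures agree with the printed ones to within 0.01. Row b comes out as 12.37 rather than the printed 12.38, which is presumably a rounding difference.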
A possible explanation for this asymmetry is already hinted at in Figure 9. Pattern c, i.e. +voi VD –voi, most frequently arises word-medially, contrary to pattern b, i.e. –voi VD +voi, which is only marginally manifested in that position. Results are more comparable for the word-final position, whereas word-initially c is found more often than b, but their difference is by no means as dramatic as it was word-medially.17 The word-medial results can be understood if sonority is taken into consideration.18 Recall from (4) that sonorants are more sonorous than voiced obstruents, which in turn are more sonorous than voiceless obstruents. As shown before, VD, at least when interpreted as full elision, creates a consonant cluster. We propose that the reason VD occurs more often when it creates C[+voi]C[–voi] rather than C[–voi]C[+voi] sequences is that such strings are analysed heterosyllabically and only the former offer good Syllable Contact (see Gouskova 2004),19 that is, the transition from coda to onset is one of falling sonority, rather than rising.

17. Only the left panel of Fig. 9 is used here, since the right panel contains too few data to support any claim. Also, reference is solely made to the word-medial position, since it is the one that shows the most systematic effects.
18. Thanks to Lasse Bombien for suggesting this line of thought to us.
19. The C[–voi] C[+voi] string implies either the sequence T-S or T-D where S = sonorant, T = voiceless obstruent, D = voiced obstruent. In a heterosyllabic analysis, both are ill-formed in terms of Syllable Contact; however, we cannot rule out the possibility of a tautosyllabic analysis in terms of complex onsets, e.g. TS. Such a cluster would be well-formed, but TD would not (for reasons having to do with consonant phonotactics in Greek). At present, we assume that heterosyllabic syllabification is preferred over a tautosyllabic one for derived consonant clusters, a matter that requires further investigation though.
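The Syllable Contact reasoning can be sketched in a few lines of Python. This is a minimal illustration, assuming the three-way sonority scale recalled from (4) and a heterosyllabic parse of the derived cluster; the numeric sonority values and the function name are hypothetical, and only the relative order of the values matters.

```python
# Three-way sonority scale from (4):
# voiceless obstruent (T) < voiced obstruent (D) < sonorant (S).
# The numbers are arbitrary placeholders; only their order is meaningful.
SONORITY = {"T": 0, "D": 1, "S": 2}


def contact_profile(coda: str, onset: str) -> str:
    """Classify the sonority transition across a coda.onset boundary."""
    change = SONORITY[onset] - SONORITY[coda]
    if change < 0:
        return "falling"   # good Syllable Contact
    if change > 0:
        return "rising"    # poor Syllable Contact
    return "flat"


if __name__ == "__main__":
    # Pattern c (+voi VD -voi) yields D.T or S.T; pattern b (-voi VD +voi) yields T.D or T.S.
    for coda, onset in [("D", "T"), ("S", "T"), ("T", "D"), ("T", "S")]:
        print(f"{coda}.{onset}: {contact_profile(coda, onset)}")
```

In this sketch the clusters created by pattern c come out as falling and those created by pattern b as rising, which is the asymmetry the text appeals to.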
Besides arguing that sonority is also at play in VD, we further speculate that its role is subordinate to that of GOA. Such an idea stems from the fact that in the current data pattern d is slightly less common than b. Based on the speculation outlined next, we believe that this difference will be bigger in a larger set of data. Evaluating the patterns of VD in Table 4 in terms of declining well-formedness, we get the order a > b = c > d for GOA and c > a = d > b for Syllable Contact preferences. Matching these to our frequency results, it must be that GOA is more important than Syllable Contact so that a is more widespread than c. Pattern c then follows, since it is second best in GOA terms, but perfect in terms of sonority. This leaves us with b and d. Pressures here are conflicting: b > d for GOA, but d > b for Syllable Contact. If our reasoning is correct, then pattern b should be more frequent than d. The present results are compatible with this prediction, but are by no means conclusive. Perhaps examination of additional data will shed light on this issue. All in all, the frequency of VD as regulated by the flanking consonants largely seems to be a matter of GOA and sonority considerations.

By the same token, GOA alone is incapable of accounting for differences in the nature of VD. In prototypical VD, only voiceless consonants drive VD (Gordon 1998), whereas in KG their voicing value seems to be irrelevant. We can thus perhaps have recourse to the spreading of [–voi] (cf. McCawley 1968 and Teshigawara 2002 for Japanese) vs. the spreading of [αvoi], respectively. Alternatively, we can say that traditional VD is more phonological in that it involves [–voi] spreading, whereas KG VD is more phonetic, in that it reflects a more general reflex of gestural overlap regardless of the phonological specification of voicing.
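The argument that GOA outranks Syllable Contact can be made concrete with a small Python sketch. It assumes one particular way of modelling "subordinate", namely a lexicographic comparison in which Syllable Contact only breaks ties left by GOA; this is an illustrative choice rather than a claim made in the text. The rank values encode the two orders given above (a > b = c > d and c > a = d > b), the counts are the word-medial totals from Table 4, and all names are hypothetical.

```python
# Rank 1 = best. Orders taken from the text:
# GOA: a > b = c > d; Syllable Contact: c > a = d > b.
GOA_RANK = {"a": 1, "b": 2, "c": 2, "d": 3}
SYLLABLE_CONTACT_RANK = {"c": 1, "a": 2, "d": 2, "b": 3}

# Word-medial token totals for patterns a-d in Table 4.
MEDIAL_COUNTS = {"a": 39, "b": 12, "c": 35, "d": 11}


def predicted_order():
    """Order patterns by GOA first, using Syllable Contact only to break ties."""
    return sorted(GOA_RANK, key=lambda p: (GOA_RANK[p], SYLLABLE_CONTACT_RANK[p]))


def observed_order():
    """Order patterns from most to least frequent in the medial data."""
    return sorted(MEDIAL_COUNTS, key=MEDIAL_COUNTS.get, reverse=True)


if __name__ == "__main__":
    print("predicted:", predicted_order())
    print("observed: ", observed_order())
```

Reversing the priority of the two rankings would place c above a, which is not what the counts show. As the text stresses, the small gap between b and d means the agreement is suggestive rather than conclusive.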
Whatever the answer to this issue, it is clear that GOA is quite successful in capturing the gradience of VD (unlike any purely phonological account of [voi] spreading), but it too fails to be entirely accurate. If GOA is right for KG, then why doesn’t VD between voiced consonants appear more frequently than it does? While we wouldn’t expect devoicing to occur, for reasons explained above, other realisations of VD, especially elision itself, should be able to emerge, and yet they do so only to a limited extent. Even more puzzlingly, Shiraishi (2003) finds that VD in Ainu applies to high vowels between voiceless consonants, yet the vowels in the syllables pi, pu, and tu never appeared devoiced or deleted in any of the 120 instances found in his recordings, even though they occur in the right devoicing environment. Consequently, GOA needs to be complemented by additional considerations, both phonological and language-specific.
6. Conclusion

In this paper, we have offered an acoustic analysis of Vowel Deletion / Devoicing in a dialect of Greek. We have principally investigated the environment, the acoustic correlates, the various realisation stages and the vowel quality differences in the application of VD. In doing so, we have examined the nature of the consonantal clusters in Kozani Greek and have found that a wider inventory than that of Standard Greek emerges. The range of codas is likewise much richer. Beyond the descriptive goals of the paper, we have also seen that the Kozani Greek data are of much empirical value and have theoretical implications for the typology of VD. The presence of VD between voiced consonants is extremely rare and has so far been left unaccounted for by gestural overlap theories of VD. However, we have tentatively argued that gestural overlap can extend to this case too and have hypothesised its specific effects, which await empirical confirmation.
References

Arvaniti, Amalia 1994 Acoustic features of Greek rhythmic structure. Journal of Phonetics 22: 239–268.
Arvaniti, Amalia 1999 Illustrations of the IPA: Standard Greek. Journal of the International Phonetic Association 29: 167–172.
Arvaniti, Amalia 2001 Comparing the Phonetics of Single and Geminate Consonants in Cypriot and Standard Greek. Proceedings of the Fourth International Conference on Greek Linguistics, 37–44. Thessaloniki: University Studio Press.
Baertsch, Karen 2002 An optimality-theoretic approach to syllable structure: the split margin hierarchy. Ph.D. dissertation, University of Indiana.
Baltazani, Mary 2006 Focusing, prosodic phrasing, and hiatus resolution in Greek. In Louis Goldstein, Douglas Whalen, Catherine Best (eds.), Laboratory Phonology 8, 473–494. Berlin/New York: Mouton de Gruyter.
Baltazani, Mary 2007a Prosodic rhythm and the status of vowel reduction in Greek. In Selected Papers on Theoretical and Applied Linguistics from 17th International Symposium on Theoretical and Applied Linguistics, 31–43. Thessaloniki: Monochromia.
Baltazani, Mary 2007b The effect of prosodic boundaries on syllable duration in Greek. Paper presented at Old World Conference in Phonology 4, Rhodes, 18–21 January 2007.
Berent, Iris, Donca Steriade, Tracy Lennertz and Vered Vaknin 2007 What we know about what we have never heard: Evidence from perceptual illusions. Cognition 104: 591–630.
Boersma, Paul and David Weenink 2009 Praat: doing phonetics by computer. Computer program; available at: http://www.praat.org/.
Browning, Robert 1991 Medieval and Modern Greek [Η ελληνική γλώσσα μεσαιωνική και νέα]. 1st edition 1962, 2nd edition 1983; Greek edition 1991. Athens: Papadima Publications.
Chatzidakis, Georgios 1905 Medieval and Modern Greek A' [Μεσαιωνικά και Νέα Ελληνικά Α']. Athens: P.D. Sakellarios.
Chitoran, Ioana and Ayten Babaliyeva 2007 An acoustic description of high vowel syncope in Lezgian. Proceedings of the 16th International Congress of Phonetic Sciences, 2153–2156. Saarbrücken, Germany.
Chitoran, Ioana and José I. Hualde 2007 From hiatus to diphthong: The evolution of vowel sequences in Romance. Phonology 24(1): 37–75.
Dauer, Rebecca 1980 The reduction of unstressed high vowels in Modern Greek. Journal of the International Phonetic Association 10: 17–27.
Delforge, Ann Marie 2008 Unstressed vowel reduction in Andean Spanish. In Laura Colantoni and Jeffrey Steele (eds.), Selected Proceedings of the 3rd Conference on Laboratory Approaches to Spanish Phonology, 107–124. Somerville, MA: Cascadilla Proceedings Project.
Eftychiou, Eftychia 2008 Lenition processes in Cypriot Greek. Ph.D. dissertation, University of Cambridge.
Fourakis, Marios 1986 An acoustic study of the effects of tempo and stress on segmental intervals in Modern Greek. Phonetica 43: 172–188.
Fourakis, Marios, Antonis Botinis and Maria Katsaiti 1999 Acoustic characteristics of Greek vowels. Phonetica 56: 28–43.
Gafos, Adamantios and Angela Ralli 2001 Morphosyntactic features and paradigmatic uniformity in two dialects of Lesvos. Journal of Greek Linguistics 2: 41–73.
Goad, Heather 2011 The representation of sC clusters. In Marc van Oostendorp, Colin Ewen, Beth Hume and Keren Rice (eds.), The Blackwell Companion to Phonology, vol. II, chapter 38. Oxford: Wiley-Blackwell.
Gordon, Matthew 1998 The phonetics and phonology of non-modal vowels: a cross-linguistic perspective. Berkeley Linguistics Society 24: 93–105. [Online at: http://www.linguistics.ucsb.edu/faculty/gordon/Nonmodal.pdf; accessed 28 July 2011].
Gouskova, Maria 2001 Falling sonority onsets, loanwords, and Syllable Contact. In Mary Andronis, Christopher Ball, Heidi Elston and Sylvain Neuvel (eds.), CLS 37: The Main Session. Papers from the 37th Meeting of the Chicago Linguistic Society. Vol. 1, 175–185. Chicago, IL: CLS.
Gouskova, Maria 2003 Deriving economy: syncope in Optimality Theory. Ph.D. dissertation, University of Massachusetts, Amherst.
Gouskova, Maria 2004 Relational hierarchies in OT: the case of syllable contact. Phonology 21(2): 201–250.
Han, Mieko Shimizu 1962 Unvoicing of vowels in Japanese. Onsei no Kenkyuu 10: 81–100.
Hooper [Bybee], Joan 1976 An Introduction to Natural Generative Phonology. New York: Academic Press.
Jannedy, Stefanie 1995 Gestural phasing as an explanation for vowel devoicing in Turkish. OSU Working Papers in Linguistics 45: 56–84.
Jun, Sun-Ah and Mary Beckman 1993 A gestural-overlap analysis of vowel devoicing in Japanese and Korean. Paper presented at the 67th Annual Meeting of the Linguistic Society of America. Los Angeles, CA.
Jun, Sun-Ah and Mary Beckman 1994 Distribution of devoiced high vowels in Korean. Proceedings of the 1994 International Conference on Spoken Language Processing, vol. 2, 479–482.
Kondo, Mariko 1994 Is vowel devoicing part of the vowel weakening process? In Proceedings of the Edinburgh Linguistics Department Conference 1994, 55–62. [Online at: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.8089; accessed 28 July 2011].
Kondosopoulos, Nikolaos 2000 Dialects and Idioms of Modern Greek [Διάλεκτοι και Ιδιώματα της Νέας Ελληνικής]. 3rd edition. Athens: Gregori Publications.
Kouziakis, Lazaros 2008 I’ve heard, I’ve been told and I’ve written [Άκ'σα, μ'είπαν κι έγραψα]. Kozani.
Loukina, Anastassia 2008 Regional phonetic variation in Modern Greek. Ph.D. dissertation, University of Oxford.
Maekawa, Kikuo 1983 On Vowel Devoicing in Standard Japanese [Kyootsuugo-ni Okeru Boin-no Museika-ni Tsuite]. Gengo-no Sekai 1: 69–81.
McCawley, John D. 1968 The Phonological Component of a Grammar of Japanese. The Hague: Mouton.
Mo, Yoonsook 2007 Temporal, spectral evidence of devoiced vowels in Korean. In Proceedings of the 16th International Congress of Phonetic Sciences, 445–448. Saarbrücken, Germany. [Online at: http://www.icphs2007.de/conference/Papers/1597/1597.pdf; accessed 28 July 2011].
Newton, Brian 1972 The Generative Interpretation of Dialect: A Study of Modern Greek Phonology. Cambridge: Cambridge University Press.
Nicolaidis, Katerina 2001 An electropalatographic study of Greek spontaneous speech. Journal of the International Phonetic Association 31: 67–85.
Nicolaidis, Katerina 2003 Acoustic variability of vowels in Greek spontaneous speech. Proceedings of the 15th International Congress of Phonetic Sciences, 3221–3224. Barcelona, Spain.
Papadopoulos, Anthimos 1927 Grammar of Modern Greek Northern Idioms [Γραμματική των Βορείων Ιδιωμάτων της Νέας Ελληνικής]. Athens: P.D. Sakellarios.
Protopapas, Athanassios, Marina Tzakosta, Aimilios Chalamandaris and Pirros Tsiakoulis 2010 IPLR: An online resource for Greek word-level and sublexical information. Language Resources and Evaluation, Online First™, 2 September 2010. [Online at: http://users.uoa.gr/~aprotopapas/CV/pdf/Protopapas_etal_LRE-IPLR.pdf; accessed 28 July 2011].
Shiraishi, Hidetoshi 2003 Vowel devoicing of Ainu: How it differs and not differs from vowel devoicing of Japanese. In T. Honma, M. Okazaki, T. Tabata and S. Tanaka (eds.), A New Century of Phonology and Phonological Theory, A Festschrift for Professor Shosuke Haraguchi on the Occasion of His Sixtieth Birthday, 237–249. Tokyo: Kaitakusha.
Teshigawara, Mihoko 2002 Vowel Devoicing in Tokyo Japanese. In G.S. Morrison & L. Zsoldos (eds.), Proceedings of the North West Linguistics Conference 2002, 49–65. Burnaby, BC, Canada: Simon Fraser University Linguistics Graduate Student Association.
Trudgill, Peter 2003 Modern Greek dialects: a preliminary classification. Journal of Greek Linguistics 4: 45–64.
Tsuchida, Ayako 2001 Japanese vowel devoicing: cases of consecutive devoicing environments. Journal of East Asian Linguistics 10: 225–245.
Turk, Alice and Lawrence White 1999 Structural effects on accentual lengthening in English. Journal of Phonetics 27: 171–206.
Vaux, Bert and Andrew Wolfe 2009 The appendix. In Eric Raimy and Charles Cairns (eds.), Contemporary Views on Architecture and Representations in Phonology, 101–143. Cambridge, MA: MIT Press.
Vennemann, Theo 1988 Preference laws for syllable structure and the exploration of sound change. Berlin: Mouton.
Zec, Draga 2007 The syllable. In Paul de Lacy (ed.), The Cambridge Handbook of Phonology, 161–194. Cambridge: Cambridge University Press.
Appendix

The following table gives examples of words illustrating clusters occurring in KG which are illicit in SMG. The columns show the cluster (column a), the word as it is pronounced in KG (b), as it is pronounced in SMG (c) and its gloss (d).

a. Cluster    b. Word in KG    c. Word in SMG    d. Gloss

WORD INITIAL
dv            [dvarja]         [duvárja]         ‘walls’
bs            [bso]            [misó]            ‘half’
ðʎ            [ðʎa]            [ðuʎá]            ‘work’
θk            [θkos]           [ðikós]           ‘my own’
zn            [zn]             [stin]            ‘at the’
mk            [mkros]          [mikrós]          ‘small’
ʃk            [ʃkóθkan]        [sikóθikan]       ‘rose 3-pl’

WORD MEDIAL
zt            [terʝázti]       [terʝázete]       ‘fit 2-pl’
nʃ            [ʝitónʃes]       [ʝitónises]       ‘neighbours’
ʃm            [meʃmér]         [mesiméri]        ‘noon’
θk            [ðéθkami]        [ðeθíkame]        ‘tied ourselves’
θc            [apulíθcin]      [apolíθice]       ‘was released 3-sing’
mpʃ           [ampʃós]         [anipsjós]        ‘nephew’

WORD FINAL
çn            [içn]            [íçe]             ‘had 3-sing’
ʃn            [plúʃn]          [pulúse]          ‘sold 3-sing’
ls            [bakáls]         [bakális]         ‘grocer’
ts            [pulíts]         [polítis]         ‘civilian’
rs            [tésers]         [téseris]         ‘four’
rsn           [çírsn]          [ʝírise]          ‘came back 3-sing’
lts           [vanɟélts]       [vanɟélis]        ‘Vangelis (male name)’
tʃn           [krátʃn]         [krátise]         ‘kept 3-sing’
skn           [tirjáscn]       [terjástice]      ‘fit 3-sing’
List of contributors

Editors

Philip Hoole
Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität, Munich

Marianne Pouplier
Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität, Munich

Lasse Bombien
Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität, Munich

Christine Mooshammer
Haskins Laboratories, New Haven

Barbara Kühnert
Institut du Monde Anglophone & Laboratoire de Phonétique et Phonologie, CNRS/Sorbonne-Nouvelle, Paris

Contributors

Mary Baltazani
Department of Linguistics, University of Ioannina

Marie-Anne Barthez
Language Reference Center, Clocheville Hospital, Tours Regional University Hospital Center, Tours

Pia Bergmann
Deutsches Seminar: Germanistische Linguistik, University of Freiburg

Natalie Boll-Avetisyan
Department of Linguistics, University of Potsdam, and Utrecht Institute of Linguistics, Utrecht University

Sandrine Ferré
INSERM, U930, Tours, and Université François-Rabelais de Tours, CHRU de Tours, UMR-S930, Tours

Louis Goldstein
University of Southern California and Haskins Laboratories, New Haven

Martine Grice
IfL – Phonetik, University of Cologne

Claire Halpert
Department of Linguistics and Philosophy, MIT, Cambridge, MA

Anne Hermes
IfL – Phonetik, University of Cologne

Fang Hu
Institute of Linguistics, Chinese Academy of Social Sciences, Beijing

Rina Kreitman
Columbia University, New York

Yasutomo Kuwana
Asahikawa Jitsugyo High School, Asahikawa

Stefania Marin
Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität, Munich

Doris Mücke
IfL – Phonetik, University of Cologne

Hosung Nam
Haskins Laboratories, New Haven

Henrik Niemann
IfL – Phonetik, University of Cologne

Eva Sizaret
Language Reference Center, Clocheville Hospital, Tours Regional University Hospital Center, Tours

Hisao Tokizaki
Department of English, Sapporo University

Nina Topintzi
Institut für Linguistik, University of Leipzig

Laurice Tuller
INSERM, U930, Tours, and Université François-Rabelais de Tours, CHRU de Tours, UMR-S930, Tours

Marina Tzakosta
Faculty of Education, University of Crete

Zsuzsa Várnai
Research Institute for Linguistics, Hungarian Academy of Sciences, Budapest

Theo Vennemann
Institut für deutsche Philologie, Ludwig-Maximilians-Universität, Munich
Subject index accentuation 311, 313, 316, 320, 322– 327, 329, 338–339, 341 acquisition 24, 90, 99, 115, 174–175, 285–286, 291, 293–294, 297, 301, 306–308 lexical acquisition 257–263, 269, 275–278, 281 phonological acquisition 116, 280 anaptyxis 95, 135, 142 contact anaptyxis 28 alignment 206–207, 212, 222, 224, 232, 247–249, 253, 345, 355, 358 peak alignment 211, 228 articulatory alignment 215, 217– 220 acoustic alignment 215–216, 221 tonal alignment 205, 214, 221, 226– 227, 229, 252 anti-phase, see coupling apheresis, see copation apocope, see copation articulatory coordination 157–159, 170–171 articulatory phonology 29, 157, 159, 174, 181, 198, 200, 202, 205, 207, 214, 226, 232, 250, 341, 345, 354, 356, 366–367 articulatory retina 12 aspiration 38, 55, 63, 235, 240–241, 249, 348, 357–360, 363, 369–370, 372–373, 375–376, 378, 390 assimilation 28, 58, 65, 104–105, 112– 114, 139, 143, 311, 313–315, 318, 320, 322, 324, 329, 337–343, 345– 348, 350–353, 355, 357–358, 360, 363–368 velar nasal assimilation 312 asymmetry 39–41, 85–88, 91, 225, 229, 252, 368–370, 380–384, 391 autosegmental-metrical approach 206, 214
bilingual 122, 134–135, 143, 280, 360 blending weight 177, 182–184, 186, 188–190, 192–194, 197, 199 borrowing 11, 51, 120, 134–135 calibration 28 case suffixes 84 C-center 160, 165, 171, 173, 175, 208, 210–211, 227, 231, 233, 245, 247–249 cluster coda cluster 13, 23, 136, 140 contact cluster 14, 128, 130–131, 133, 136, 138–139 derived cluster 385–386 head cluster 13–16, 18, 20, 23 intersyllabic cluster 13–14, 23, 138 onset cluster 33–36, 38, 41–42, 46, 50, 63–64, 113, 138–139, 142–143, 386 vowel cluster 177, 198–199 cluster formation 93, 95–96, 98–101, 107–109, 112, 116, 370–371 cluster well-formedness 93, 95–96, 105, 109 perfect, acceptable, non-acceptable clusters 95–96, 99–107, 109–112 coalescence 82–83 coda 13–15, 18, 23–24, 27, 73–74, 79, 83, 91, 94, 96, 107, 111, 119, 126, 128–129, 131–133, 135–136, 140–141, 143–146, 148, 157–159, 174, 177, 180, 202, 208–209, 228, 235, 261–264, 270, 286, 289, 306, 339, 369, 377, 386, 391, 393 coda weakening 28 coda inventory 71, 77–78, 80–81, 85, 88, 371, 387 Coda Law, see preference laws compensatory strategies 301, 303–304 complement-head order 71, 79, 89
complex onset 68, 133, 136, 138, 157, 160, 165, 167, 169, 171, 173–174, 202, 208, 228, 261, 263, 387, 391 Consonantal Strength 12, 16, 18, 23, 27–28, 95 consonantality 12–13 constraints of association 286 Contact Law, see preference laws copation apocope 17, 23–24 procope (apheresis) 24 syncope 11, 17, 23, 51, 135, 138, 140–141, 143, 146, 394–395 coupled oscillator 174–175, 182, 202, 205, 210, 227, 229, 252 coupling 174, 181, 184, 202, 207, 227, 248, 252, 368 coupling hypothesis of syllable structure 208 coupling mode intrinsic mode 159, 208 in-phase 157, 159–160, 171, 177, 179–180, 182–183, 185–189, 192, 194, 197–198, 205, 208–210, 222– 224, 233, 249 anti-phase 157, 159–160, 171, 177, 179, 183, 185, 208–210, 223– 224, 233, 249 coupling graph 161, 172, 205–206, 208–211, 223–226 competitive coupling 158, 160, 170– 171, 173, 206, 209–210, 224–225, 228, 231, 249 deletion 82–83, 95, 106, 135, 137–145, 147–149, 264, 311, 313, 318–319, 321–324, 328–329, 331–334, 337– 341, 352, 364, 369–370, 372–385, 387, 389–390, 393 diphthong 74, 126, 177–181, 183–184, 186, 188–195, 197–202, 314–315, 339–340, 394 dissimilation 45 duration increase 377–378
Early Syllable Law, see preference laws electropalatography (EPG) 226, 311, 313, 315, 330, 337, 342–343 electromagnetic articulography (EMA) 205, 231, 234–235, 250, 312, 342 epenthesis 37, 63, 82, 84, 95, 106, 135–137, 139–146, 148, 150, 152 contact epenthesis 28 feature feature geometry 99 feature [sonorant], see sonorant feature [voice], see voice laryngeal feature 45, 63–64, 114, 345–346, 351, 355, 357, 364 First Syllable Law, see preference laws frequency word frequency 311, 313, 341, 343 high-frequency 257, 259, 311–318, 320–325, 327, 329, 331–333, 337– 338, 340–341 low-frequency 257, 262, 270, 283, 311–315, 317–318, 320–323, 327, 329, 332–333, 352 fricative 12–13, 16, 18, 22, 34, 40–41, 94, 97, 99–102, 108, 125, 127– 131, 141, 214, 232, 235, 287, 290, 293–294, 348, 355–357, 361, 364, 373–374, 377–379, 386 gemination 28 General Syllabication Law, see preference laws geographical gradation 71, 77–78, 85 gesture 12, 29, 64, 157–158, 165, 181, 183–186, 188, 194, 197–198, 202, 208, 216, 218, 229, 232, 252, 318, 354, 361–362 gestural coordination 159, 173–174, 200, 226–227, 233–234, 241, 247, 250–251, 366–367 gestural model 177
gestural overlap 59, 345–347, 353, 364, 366–369, 388–390, 392–393, 395 intergestural timing 205, 207, 209– 210, 227, 231, 241–245, 248–249 oral constriction gesture 205–206, 209–210, 214–215, 219, 223, 347, 353, 355–358, 360, 363–365 spatial modulation gesture 182, 193, 199 tone gesture 205–206, 209–211, 214–215, 217, 219–220, 222–225, 227, 231, 233, 236, 238, 245, 248– 249, 251 gradience 71, 88, 95, 107–109, 258, 370, 388, 392
metathesis 25, 29, 66, 135, 138, 140– 143, 146, 149–150, 301, 303 at a distance 11, 22 contact metathesis 28 slope metathesis 22–23
head cluster, see cluster head-complement order 78–81, 85, 88 head-final language 79, 82, 84, 88 Head Law, see preference laws head strengthening 28 heterosyllabic 20, 96, 109, 112, 124– 125, 391 hiatus 177–180, 183, 186, 188, 190, 198, 200, 349, 381, 393–394
phonological acquisition, see acquisition phonological complexity 90, 202, 229, 252, 285–286, 290–293, 297, 304– 307 phonological word boundary 313 phonotactics 21–22, 27, 30, 58–59, 65, 68, 93–94, 96–97, 99, 112, 115, 124–125, 128–129, 131, 153, 178, 264, 269, 278–280, 287, 294, 360, 391 probabilistic phonotactics 257–263, 274–277, 281–282 phrasing 87, 393 place scale 93, 102–103, 105, 107, 109, 111–112 planning oscillator 180, 183, 206 plosive 12–13, 15–18, 125–131, 137– 138, 141–142, 348 preference 11, 14–16, 18, 21–22, 30, 107, 117, 153, 314, 372, 388, 392, 397 for brevity 24 preference law 11, 30, 117, 153, 397 Coda Law 14–15, 27 Contact Law 15, 27–28, 124, 132, 386 Early Syllable Law 25, 27–28
implicational universal 18, 29, 65, 72 impure s 157–158, 169, 171, 173 in-phase, see coupling interfixation 87, 90 juncture 71–72, 85, 87–89, 174, 200 left-branching compound 87 lexical acquisition, see acquisition loanwords 17–18, 23, 119–120, 123, 134–135, 137, 140, 143, 151–152, 395 manner scale 101–102, 105–106 markedness 29, 33–34, 40–41, 65, 99, 261–263, 285, 291–292, 306, 346–347, 352–353, 355–358, 363–364, 367
naturalness 69, 270, 276 graded naturalness 15, 22 n-Insertion 86 nuclear rise 205–206, 211–212 onset, see complex onset, onset cluster, syllable onset nucleus, see syllable nucleus Nucleus Law, see preference laws oral constriction gesture, see gesture
First Syllable Law 25–28 General Syllabication Law 25, 28 Head Law 14–16, 20, 27 Nucleus Law 27 Stressed Syllable Law 25–26, 28 prependix 18 procope, see copation prosthesis, prothesis 20, 135, 137, 142, 144, 147–148 repair 95, 106, 112, 121, 134–135, 137–143 rhythmic grid 288–289, 293 right-branching structure 72, 85–86, 88–89 right-branching compound 87 scale 13, 15–19, 34, 40, 74, 93–96, 99, 101–107, 109, 111–112, 123, 221– 223, 306, 320–321, 323–329, 337– 341, 386 Consonantal Strength scale 12 sequential voicing 85 short-term memory 257–258, 275, 278–282, 292 singleton 262–264, 360, 362, 377, 387 simplification 25, 71, 77, 82, 85, 88, 120, 130, 133, 135, 158, 231–232, 234, 248–249, 304 articulatory simplification 19 slope 11, 16, 22–23, 195 slope consonant 20 slope displacement 24–25 sonorant 13, 15–16, 35, 42–43, 45, 47, 53–55, 125–128, 130–132, 134, 141–142, 234–235, 240, 248, 287, 289–290, 293, 300–301, 377–379, 391 feature [sonorant] 33–34, 39–41, 44, 49–52 sonority 12–13, 33, 41–42, 49–51, 59, 69, 95–96, 99, 113, 126–128, 130– 131, 134, 150, 160, 288–290, 387, 391–392, 395
sonority scale 34, 40, 93–94, 112, 386 sonority hierarchy 58, 66, 114, 124, 143, 386 sonority sequencing principle (SSP) 36–37, 39–40, 66, 124–125, 262 speaker variation 314, 340 Specific Language Impairment (SLI) 285, 291–308 strength assimilation 28 strengthening 20, 28, 181, 199, 201, 334, 341–342, 382 stress 28, 61, 65, 88, 93, 113, 115, 177, 179, 181, 183, 197, 200–203, 287, 291–292, 314, 382, 394 modeling stress 182, 184, 193–194 stress-conditioned alternation 178, 180, 184, 193, 199 structural complexity 14–15, 23–24, 232, 250, 262–263, 275, 277 sub-lexical level 259–261, 276–277 substitution 103, 115, 135, 139–141, 143–144, 146–147, 149, 301–303, 305 syllabic parsing 157 Syllabic Structure 293–295 syllabification 59, 68, 93, 96–97, 112– 113, 115, 150, 158, 173, 175, 229, 288, 290, 300, 305, 349, 387, 391 syllable syllable boundary 13, 124, 126, 133, 138, 142 syllable coda 13–15, 24, 27, 159, 208–209, 261, 263 syllable complexity 73–74, 77–80, 85, 88–90, 132, 257, 268, 274, 277 syllable contact 27, 124, 126–128, 130–134, 136, 138, 386, 391–392, 395 syllable contact change 21, 28, 30 syllable head 12–16, 18, 23–24, 27 syllable margin 12, 257, 261–262 syllable nucleus 12, 27, 42
syllable onset 61, 157–160, 171, 174, 207–209, 261, 263, 312 syllable organization 113, 179 syllable simplification 82 syllable structure 11, 15, 24, 30, 61, 65, 69, 71–78, 81–85, 88, 90, 113, 117, 119, 121, 124, 133, 135, 151, 153, 157–160, 174–175, 180, 200, 202, 208, 211–213, 216, 218–221, 225–227, 229, 231–233, 241, 250, 252–253, 261, 263, 265–274, 285, 293, 301, 305–307, 314, 339–341, 368, 393, 397 naked syllable 24 syncope, see copation task-dynamics 177, 179, 203, 233 TADA (TAsk-Dynamics Application) 181–182, 184, 194, 202, 223, 229 tautosyllabic 28, 93, 100, 109, 124, 137, 391 temporal coordination 205–206 three-scales model 96, 105, 109 tone 78–79, 207, 209, 211–212, 214–220, 222–223, 227–229, 233, 237–241, 245, 247–248, 250–253 tone language 231–232, 234, 236, 249 lexical tone 205–206, 210, 224–225 prosodic tone 205 tone gesture, see gesture tonal alignment, see alignment tonogenesis 231–234, 240, 249–252 timing 29, 157–161, 167, 169, 171, 173, 180, 183, 199–200, 205, 207, 209–210, 216, 222, 224–227, 229, 231, 233, 241–245, 248–249, 252, 342, 345, 349, 355, 363, 368
typology 29, 33, 41, 46, 50–51, 60–61, 63, 65, 69, 73–74, 89–91, 111, 119, 133, 252, 261, 287, 293, 307, 345, 347, 363, 369, 371, 393 voice 42–43, 53–54, 63, 97, 104, 112– 113, 240, 359, 372, 379, 388, 390, 392 feature [voice] 33–34, 41, 44–52, 55, 86 voicing scale 103–105, 107, 109, 111– 112 vowel vowel deletion 323, 369–373, 375– 385, 387–393 vowel elision 370 vowel devoicing 370, 388–389, 395– 397 weak bracket 88 weakening 18, 20, 28, 313–314, 331– 332, 346, 395 word word-final 23, 77, 85–86, 151, 286, 288, 306–307, 315, 376–377, 380– 381, 386–388, 391, 398 word-initial 11, 14–15, 18, 33–38, 40–42, 44, 46–49, 51–52, 64, 83, 86, 137, 139, 142, 157–158, 160, 162, 171, 173–174, 226–227, 286, 298, 307, 315, 383–384, 386–387, 391, 398 word-medial 37, 96–97, 120, 124, 128, 135, 372, 377, 380, 386, 388– 389, 391, 398 word stress 181, 193, 199 word frequency, see frequency
Language index
See also Appendix I and the Language Database of the chapter by Rina Kreitman (pp. 53–58)
Ainu 373, 392, 396 Amuesha 47–48, 53, 56, 61 Arabic 56, 278 Moroccan Arabic 48, 60, 68, 157, 160, 175, 229 Athapascan 79 Avar 80 Babungo 38, 56, 67 Baltic 57, 79 Bantoid 79 Bantu Kinyarwanda 363, 367 Pokomo 358, 363, 367 Sukuma 363, 367 Zulu 345–360, 363–366 Basque 17, 30, 39, 53, 56, 63, 82, 90 Berber 48, 56, 60, 157, 160–161, 174, 227 Bilaan 43–44, 47–48, 53, 56, 60 Biloxi 47–48, 53, 56, 61 Camsa 47–48, 53, 56, 62 Carib 37, 56–57, 62 Catalan 205–207, 211–218, 220–226, 229, 249, 306 Chatino 38, 53, 55–56, 65 Chinese 78, 89, 211, 231, 250, 252–253 Mandarin Chinese 210, 227, 233, 236, 248–249, 251 Chukchee, Chukchi 37, 56, 58–59, 63, 68, 81 Comanche 38, 56, 67 Dakota 37, 56, 65 Darai 73
Dutch 47, 53, 56, 59, 69, 74, 85, 87, 90, 100, 107, 113, 116, 158, 198, 206–207, 213, 221, 228, 257, 261–264, 270 Eggon 36, 56, 65 Egyptian 72 English 11, 13–14, 16, 18, 20–21, 29, 62–63, 73–74, 82–83, 112–113, 116, 151, 157–158, 160–161, 173–174, 198, 201–202, 206–207, 209, 221, 226–228, 251, 261, 278, 286, 291–293, 313–315, 341–342, 360, 366–368, 382, 397 Contemporary English 15, 17 Middle English 17 Old English 15 Fijian 73, 151 French 56, 60, 67, 198, 250, 285–290, 292–295, 302, 306–307 Gansu 78–79 Georgian 38–39, 47–48, 53, 56, 59, 62, 66, 157, 160, 174, 227, 346, 366 German 11, 14, 16–17, 20, 30, 44–45, 53, 55–56, 63–64, 69, 74, 85, 90, 113, 153, 174, 202, 211, 226, 228, 249, 286, 311–313, 318, 338, 341–343 Contemporary German 15 Upper German 21 Viennese German 205–207, 212–215, 218–225 Standard German 315 Germanic 14, 20, 29–31, 56–58, 63, 104, 151, 153
Greek 18, 30, 39, 47, 53, 56, 61, 63, 68, 93–95, 97–98, 104, 106–107, 111–113, 115–117, 206–207, 221, 226, 375, 381, 391, 394–395, 397 Classical Greek 16, 20 Contemporary Greek 16 Standard Greek 96, 99, 369–370, 385, 388, 393 Northern Greek 369–370, 383 Kozani Greek (KG) 100, 105, 114, 369–373, 377, 379, 383–390, 392–393, 396, 398 Greenlandic 57 West Greenlandic 51, 61 Guanzhou 78–79 Hawaiian 73 Hebrew 53, 57 Biblical Hebrew 21 Modern Hebrew 39, 44–46, 48, 52, 58–59, 63–64 Hindi 53, 57, 62, 66, 82 Hua 41, 48, 54, 57, 62 Igbo 73, 81 Ijo 71–72 Irish 39, 44, 54–55, 57, 60–61 Italian 17, 20, 30, 112, 153, 157–159, 161, 167, 170–174, 198, 202, 207, 221, 227, 250, 289, 292 Old Italian 11, 25 Calabria 26 Lombardy 25 Lucania 25 Campania 25 Milanese 25 Tuscan dialects 26 Sicilian 26 Japanese 74, 78, 80, 82–83, 85–87, 90, 279, 373, 384, 392, 395–397 Kannada 82 Kanuri 80 Khasi 43–45, 47–48, 54, 57, 62, 66–67
Klamath 44, 54–55, 57–58 Korean 14, 17, 38, 80–87, 90, 151, 227, 367–368, 373, 395–396 Kurdish 81 Kutenai 39, 47, 54, 57, 61 Latin 11, 17–18, 30, 82–83, 153 Lezgian 81, 373, 394 Mazatec 38, 57 Manchu 78–79 Mba 73 Mixtecan 79 Moghol (Mongolic) 81, 84 Nambiqara 82, 90 Nanshang 78–79 Ngandi 74 Nisqually 36 Nivkh 84, 91 Otomi (Temoayan) Pali 14, 21 Pashto 39, 54, 57, 66 Persian 36, 82, 91 Phoenician 21, 29 Polish 17, 30, 38, 54, 57, 62, 67 Popoluca 51, 54, 57, 61 Portuguese 11, 17–20, 382 Romance 17, 56–57, 200, 202, 306, 394 Romanian 47, 54, 57, 100, 107, 177–179, 181–184, 189, 193–194, 199–200, 202, 382 Russian 39, 41, 52, 54–55, 57–58, 64, 67–68, 82, 119–122, 124, 134–138, 142–146, 148, 150–152 Rutul 81 Samoyedic 135, 141 Nenets 119–123, 125–129, 131–134, 136–137, 139–140, 142–144, 151–152
Enets 119–123, 127–129, 131–134, 136–140, 143, 145, 150, 152 Nganasan 119–123, 129–134, 136–137, 139–140, 146, 150–153 Selkup 119–120, 122–123, 130–134, 136–140, 142, 148, 150–152 Scandinavian Old Norse 20 Scandinavian dialects 20 Semitic 48, 57–58, 79 Spanish 11, 17, 19, 21–22, 25, 29, 51, 55, 57, 59, 64, 67, 207, 221, 229, 382, 394 Cusco Spanish 373 Swedish 20, 55, 57, 68
Taba 38, 55, 57, 59, 62 Tahitian 17 Tamil 80–81 Thai 63, 78–79 Tibetan 56, 232, 235, 250, 252 Old Tibetan 234, 248 Lhasa Tibetan 231, 234, 238, 240–241, 249, 251, 253 Amdo Tibetan (Xiahe dialect) 248 Tsou 44–45, 47–48, 55, 57, 63, 69 Turkic 79 Turkish 116, 373, 384, 395
Vietnamese 231, 249 Wa 39, 55, 57, 69 Warao 71–72 Welsh 44, 55, 58, 68
Yareba 71–73
Zulu, see Bantu