Linguistische Arbeiten 161
Edited by Hans Altmann, Herbert E. Brekle, Hans Jürgen Heringer, Christian Rohrer, Heinz Vater and Otmar Werner
Allan R. James
Suprasegmental Phonology and Segmental Form: Segmental Variation in the English of Dutch Speakers
Max Niemeyer Verlag Tübingen 1986
CIP short-title entry of the Deutsche Bibliothek
James, Allan: Suprasegmental phonology and segmental form : segmental variation in the English of Dutch speakers / Allan R. James. - Tübingen : Niemeyer, 1986. (Linguistische Arbeiten ; 161) NE: GT
ISBN 3-484-30161-9
ISSN 0344-6727
All rights reserved. No part of this book may be reproduced photomechanically without the permission of the publisher. Printed in Germany. Printing: Weihert-Druck GmbH, Darmstadt.
CONTENTS

CHAPTER 1: CONTEXTS OF SEGMENTAL VARIATION IN SECOND LANGUAGE PRONUNCIATION
1.0. Introduction
1.1. Segmental variation in contrastive phonological research
1.1.1. Segmental variants in interlanguage phonology
1.1.2. Segmental variant as segmental error in 'projective' contrastive phonology
1.1.3. Segmental variation within psycholinguistically oriented models of contrastive phonology
1.2. Segmental variation in contrastive linguistic phonetic research
1.2.1. Segmental variants in contrastive descriptive phonetics
1.2.2. Segmental variants in contrastive pedagogical phonetics
1.3. Segmental variation and context in other contrastive research
1.4. Formal and functional aspects of contrastive phonological analyses as applied to FL data

CHAPTER 2: TOWARDS A PHONOLOGICAL FRAMEWORK OF SUPRASEGMENTAL CONTEXT
2.0. Introduction
2.1. A general view of suprasegmental context within phonological structure
2.2. Hierarchical models of phonological structure
2.3. The syllable in a phonological hierarchy
2.4. Primary and secondary phonological marking systems within the hierarchy

CHAPTER 3: AN ANALYSIS OF THE SUPRASEGMENTAL CONTEXT OF SEGMENTAL VARIANTS IN FL DATA
3.0. Introduction
3.1. Suprasegmental context and cross-linguistic identifications
3.2. Illustrative data: a sentence utterance
3.3. Sample data from continuous speech

CHAPTER 4: STRENGTH RELATIONS IN SUPRASEGMENTAL PHONOLOGY
4.0. Introduction
4.1. Conceptions of strength in phonological theory
4.2. Strength and metrical phonology

CHAPTER 5: AN EXTENDED PHONOLOGICAL MODEL OF SUPRASEGMENTAL CONTEXT
5.0. Introduction
5.1. Features and contrasts in the suprasegmental hierarchy
5.2. Loci and foci in phonological representation
5.3. The phonetic representation
5.4. A typology of the phonological representation
5.5. The status of the phonological description
5.6. Grammar and processor in foreign language learning

Bibliography
CHAPTER ONE: CONTEXTS OF SEGMENTAL VARIATION IN SECOND LANGUAGE PRONUNCIATION
1.0. Introduction
The motivation for the present study derives in the first place from a theoretical dissatisfaction with the observational, descriptive and explanatory limitations imposed on the analysis of speech sound variation in second language pronunciation by established modes of phonological and phonetic description. These limitations are characterized in their extreme form by there being little regard for the systematic analysis of the place of occurrence of segmental variation in foreign language (FL) speech, while the form of the segmental variants themselves is explicated largely in terms of structural comparisons between native language (NL) and FL segmental phonological and phonetic systems. The descriptive categories of such structural comparisons, while adequate to the analysis of languages conceptualized as stable, 'static', more or less abstract linguistic systems "où tout se tient", are at the same time, as traditionally employed in the area of second language acquisition and performance, descriptively naive in the absence of a developed conception of the dynamics of the kind of linguistic contact situation between two (or more) language systems which foreign language learning represents. On the other hand, neglect of the position of occurrence of segmental variants in FL locutions is of course readily attributable to the paradigmatic orientation of the major phonological theories until recently, whose concern has been primarily the establishment of contrastive or distinctive structural relations between units, whether expressed directly in terms of phonemes or features, within a given paradigmatic 'slot' in structure. References to the further syntagmatic context within which segmental paradigmatic contrasts obtain are couched in terms of syllable or word position, which, failing any further phonological definition of either syllable or word, serve as no more than convenient, i.e. ad hoc, context labels.
Phonetic accounts of the place of segment variation in FL speech make reference to the well-established segmental phonetic properties of positional allophones cross-linguistically, again stated in conventional terms of syllable or word position (initial, medial, final). Concerning the form of the variants, phonological statements offer cross-linguistic descriptions of systemic, distributional and, rarely, phonotactic patterning of NL and FL phonological segments as the basis for an account of observed variation in learner speech, whereas phonetic statements typically make reference to differences in the phonetic 'realizational' form of segments cross-linguistically as an explanation of FL sound substitution. In all cases, as will be noted, the context serves as no more than a convenient 'backcloth' to the description of segmental form. By contrast, the main aim of the present study is to show that a suprasegmental context as a phonological context, defined in sufficient structural depth, can and does, in its own right, condition or constrain both place and form of segmental variation in FL speech and can thus, properly specified, lead to a more satisfactory and explanatory account of such variation than has seemed to be possible in received frameworks of description. It is the purpose of the present chapter to critically review attempts made in contrastive phonological and phonetic descriptions to account for the problem of segmental variation in FL speech, assessing the ways in which a notion of context has contributed to such descriptions, and to lay in outline the theoretical foundations for the subsequent treatment of these problems.
1.1. Segmental variation in contrastive phonological research
1.1.1. Segmental variants in interlanguage phonology
It is an obvious and much-noted property of FL speech that its form manifests a great degree of variation in the sense that one and the 'same' FL or target language (TL) phonological or phonetic unit may exhibit within-speaker variation either on the same or different verbal tasks, according to level of 'attention to speech', etc., and between-speaker variation, for example, with learners of different TL proficiency levels, as well as of course with learners of different NL backgrounds. Such variation must be considered characteristic of FL speech and occurs over and above the limits of natural variation in sound production as observed in any given speaker of an NL or FL. In a structural perspective, the variant forms produced may be attributed to, for example, systemic, distributional or realizational sources as a product of the differences between the NL and the FL. Note that the degree of segment variation is clearly measured in terms of deviance from a TL norm. The variation itself may be considered as occurring within the (defective) TL, or in the 'interlanguage' of the learner, i.e. as systemic, distributional or realizational variation of some form within the learner language itself. The construct of an interlanguage (IL), or its notional equivalent 'approximative system', has been developed over the last decade within applied linguistic research to provide a conceptual framework for the analysis of the language of the FL learner as a quasi-autonomous rule-governed system (see Selinker 1972, Nemser 1971b) - the linguistic reflection of an FL learner's developing 'transitional competence' (Corder 1967). Forms and rules within the IL may or may not substantively reflect properties of the source language (SL) or TL. In terms of their theoretical status, they may be viewed as independent. Within this line of research, variation is taken to be indicative of the inherent variability and instability of IL systems (cf. Tarone, Frauenfelder and Selinker 1976), as well as of its
'permeability' in that it allows the rules of other languages to operate within it (cf. Adjemian 1976). However, the problem remains, as in studies, e.g., of child language or disordered speech, as to the interpretation of such variant forms as manifestations of some 'same unit'. Despite paying due lip-service to the conceptual independence of an IL, in practice the great majority of analyses within this framework treat segmental variants as forms of the 'same' TL norm, describing their occurrence in terms of realization rules of the basic form TL/x/ → IL/y/ in the context TL/z/, whereas strictly speaking, of course, within the interlanguage framework the rule form should be IL/x/ → IL/y/ / IL/z/ (for further discussion on this point see Eckman 1981a). Deciding whether a realization of [t] for TL /θ/ (or [θ]) in English "three" should be given the rule form TL/θ/ → IL[t] or IL/θ/ → IL[t] or indeed IL/t/ → IL[t] depends crucially on a theoretically well developed conceptualization of an interlanguage and its relation to both TL and SL, particularly with regard to the question of underlying forms and the phonological or phonetic status of such segmental variants. It is fair to say that thus far within the IL research paradigm, these fundamental questions have not been satisfactorily addressed. Other work within the IL paradigm, influenced by sociolinguistic models of linguistic variation, has demonstrated that extralinguistic variables such as level of proficiency and type of speaking task involved can be seen as valid conditioning factors in IL segmental variation. To this end, correlations have been produced linking scalar values of these variables with
scalar values of individual segment forms. For example, it has been shown (by, e.g., Dickerson and Dickerson 1977) that in the speech of a Japanese learner of English the 'correct' target variant of the variable /r/, i.e. [ɹ], is most used in word list reading, less in dialogue reading and least in free speech. Wenk (1979), investigating variants for English 'th'-sounds produced by French learners according to proficiency level, concludes: "one might speak of a natural acquisition sequence as speakers move from predominantly labio-dental substitutions to plosives, then to sibilants, before attaining a high degree of success with the target variant itself" (1979: 208). Moreover, in these studies attention has also been paid to the immediate phonological and phonetic context of a segmental variant as a potential conditioning structural factor in variation. Dickerson (1975), for example, shows how degree of accuracy to the TL norm of Japanese learners' renderings of English /z/ correlates not only over time and according to verbal task, but also with position of the variant. Greatest accuracy in a dialogue reading task is achieved with /z/ in prevocalic position; least with /z/ before the set of consonants /θ ð t d ts dz/. Gatbonton (1978) demonstrates that the greatest percentage (43%) of correct variants for English /ð/ was achieved by French Canadian learners in the context after a [+ vocalic, - consonantal], i.e. vowel, segment and the smallest (14%) after a [- voice, - continuant, + consonantal] segment, i.e. /p t k/. However, in all these studies, specifications of "context" are limited either to immediate segmental environment, i.e.
to the position of a variant within its linear segmental string in which contiguous phonetic effects could be interpreted as conditioning the particular variant form, or to the wider suprasegmental context of a segment within a syllable or word. In both cases, though, any explanatory value of stating such a linguistic surrounding as conditioning the form of the particular variant is diminished by its purely 'identificational' status within such analyses, and by the fact that it is in the nature of such variationist studies that such statements are primarily satisfied with demonstrating statistically significant correlations between scalar values of independent variables. Causal or directly conditioning relations are imputed to such correlations rather than being projected by them. A conclusion that 'phonetic environment' constitutes a conditioning factor in TL segmental variation (Dickerson and Dickerson 1977) is thus primarily statistically based and, in the absence of any further defining relationship between context and variant, lacks any explicit explanatory power in this
form. Similarly, random or even systematically based statements to the effect that 'TL variant /A/ would seem to be more difficult to produce correctly in context B, less difficult in C and least difficult in D' may be observationally sound, but do little to increase our understanding of why a particular context is more, or less, conducive to a more, or less, accurately produced TL segment. Hence, one may conclude that an explanatory account of the influence of suprasegmental context on segmental variation is poorly served by this kind of approach. In a related vein, Brasington (1982), in a discussion of a possible universal relationship between segmental properties and positional properties, concludes that "to characterize a position in some relevant fashion is not simply to identify it ... . We should be prepared to find that our conventional identifications of phonological environments by reference to positions in a hierarchically structured set of units (syllables, words ...) or position within some string of segment types are in the present context also no more than useful labelling systems" (1982: 87).
1.1.2. Segmental variant as segmental error in 'projective' contrastive phonology
1.1.2.1. In the context of the study of segmental variation, the concept of 'segmental error' is of course subsumed under that of 'non-target variant', which in turn must be established by means of some kind of acceptability metric by which native speakers of the TL exercise a value judgement as to the TL-likeness of a given learner form. However, the majority of phonological and phonetic studies of segmental error or, here, segmental variation in FL speech, have in fact been concerned with establishing purely structural criteria which, it is assumed, condition the occurrence and form of TL segmental error. In contrast to the types of variation analyses discussed in the previous section, most phonological (and, to a lesser extent, phonetic) studies of this type purport overtly to exhibit predictive value concerning the segmental forms (erroneous and non-erroneous) to be expected to occur in a particular TL, given a particular SL. The basis of such predictive statements lies in the structural comparison of the two languages 'in contact' in an FL learning situation. In the case of phonology proper, these 'predictive' structural comparisons have led in their extreme form to the postulation of 'hierarchies of phonological difficulty' (with their associated likelihood of error) to be encountered by the learner in his acquisition and production of the FL sound system (see, in particular, Stockwell and Bowen 1965).
Such hierarchies of difficulty are established by comparing and contrasting the structural equivalences of the phonological systems of SL and TL. Compatibility or structural equivalence ratings are stated in terms of the presence/absence of phonological contrasts within and between the two languages, which in turn are formulated with reference to the occurrence of phonemes and their allophones which to an FL learner offer cross-linguistically relevant 'optional', 'obligatory' or 'zero' structural 'choices'. For example, an 'optional' phonological choice refers to 'the possible selection among phonemes': in English as opposed to Dutch, for instance, there is an 'optional' choice of /s/ or /ʃ/ in word-initial, medial and final positions - Dutch having only one phonetically intermediate /s/ phoneme by comparison. An 'obligatory' choice would in English involve the contextually determined allophonic realization of /l/ as [l] initially and [ɫ] finally and preconsonantally, whereas the realization of Dutch /l/ involves no such 'obligation' in form. 'Zero' choice, on the other hand, reflects the absence of a category in one of the languages in a contact situation while it is available in the other: for example, English lacks a /x/ phoneme, phonetically a velar fricative, which Dutch possesses. These different availabilities of choice in SL and TL allow eight kinds of structural relationship between the two languages, which in turn are said to represent an eight-point 'hierarchy of difficulty', which is then simplified to a scale of three orders of difficulty by coalescing the eight points into three bands. Stockwell and Bowen (1965) propose the following scale (Ø = zero, Op = optional, Ob = obligatory choice):

Order of Difficulty    Type    Comparisons of Choice
                               L1      L2
I (most)               1       Ø       Ob
                       2       Ø       Op
                       3       Op      Ob
II                     4       Ob      Op
                       5       Ob      Ø
                       6       Op      Ø
III (least)            7       Op      Op
                       8       Ob      Ob
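The scale above amounts to a lookup from a pair of choices (L1, L2) to a type and an order of difficulty. The following Python sketch encodes it in that way, purely as an illustration. The names `Choice`, `HIERARCHY` and `difficulty` are hypothetical, and the type numbers 1 to 8 and the grouping into orders I to III are assumed here to follow the usual presentation of Stockwell and Bowen's scale.

```python
from enum import Enum
from typing import Optional, Tuple


class Choice(Enum):
    """Kinds of structural 'choice' a language offers for a category."""
    ZERO = "Ø"
    OPTIONAL = "Op"
    OBLIGATORY = "Ob"


# (L1 choice, L2 choice) -> (type number, order of difficulty).
# The row labels and the banding into three orders are assumptions of this sketch.
HIERARCHY = {
    (Choice.ZERO, Choice.OBLIGATORY): (1, "I"),
    (Choice.ZERO, Choice.OPTIONAL): (2, "I"),
    (Choice.OPTIONAL, Choice.OBLIGATORY): (3, "I"),
    (Choice.OBLIGATORY, Choice.OPTIONAL): (4, "II"),
    (Choice.OBLIGATORY, Choice.ZERO): (5, "II"),
    (Choice.OPTIONAL, Choice.ZERO): (6, "II"),
    (Choice.OPTIONAL, Choice.OPTIONAL): (7, "III"),
    (Choice.OBLIGATORY, Choice.OBLIGATORY): (8, "III"),
}


def difficulty(l1: Choice, l2: Choice) -> Optional[Tuple[int, str]]:
    """Return (type, order) for an L1/L2 pairing, or None if the scale has no such row."""
    return HIERARCHY.get((l1, l2))


if __name__ == "__main__":
    # A category absent from L1 but obligatory in L2 falls in the most difficult band.
    print(difficulty(Choice.ZERO, Choice.OBLIGATORY))
```

Written out like this, the scale is visibly a context-free table: nothing in the lookup refers to the position in which a segment occurs.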
Clearly, there is much to criticize in approaches such as these. The main criticisms have revolved around: a) the reliability of their predictive power when checked against the findings of the empirical analysis of actual FL learner data; b) their explicit equation of structural difference: point of difficulty: FL error; and, related to this, c) their naive assumptions concerning the learning of an FL phonology, in which i) the contact situation between SL and TL in a learner is conceived of as a kind of "blinding flash" (Nemser 1971b), where the totality of two systems meet instantaneously, ii) the FL learning perspective is interpreted entirely in terms of a behaviourist Stimulus-Response learning theory, and iii) no distinction is made between production and perception aspects of SL-TL language contact (Nemser 1971c). However, restricting one's interpretation of such contrastive analyses to the actual structural comparison, devoid of any literal 'predictive' or even descriptive claims in the behavioural domain that constitutes FL learning, one notes that such hierarchical scales of structural difference, as they point up potential areas of phonological compatibility and incompatibility between two languages, are expressed in terms of traditional segment units, which are quasi-abstract and essentially context-free. Again it must be stressed that the forms second language (L2) segments take in a given FL learning situation are subject to a number of further structural influences not directly related to the first language (L1) itself, but which pertain to, it is claimed here, suprasegmental context, as well as to the nature of the developing interlanguage itself, and to a range of extralinguistic variables inherent in the L2 acquisition and performance process.
1.1.2.2. Structural phonological comparisons of context-free elements across two languages are equally characteristic of other, less ambitious, post-hoc contrastive analyses as applied to FL learning. For example, Ritchie (1968), in his attempt to account for the Japanese preference to substitute [s] for English [θ], as opposed to the Russian speaker's substitution of [t] for English [θ], in terms of the relative 'markedness' values attached to the generative phonological features 'continuity' vs. 'stridency' in the languages involved, makes no reference to contextual factors.
It is surely a very evident fact in NL-TL learning situations that while there may well be an overall preference for speakers of the same NL to substitute one particular NL phone in the TL, TL segmental performance is characterized by there being a whole variety of NL-related forms produced: for example, Dutch learners of English produce [s z f v t d r] for [ð] in varying contexts (cf. James 1983a, and the above reference to Wenk 1979 on French learners of English). Michaels (1974), in his analysis of the replacement of English /ŋ/ by /n/ and /š/ by /č/ by Spanish learners, employs an SPE-type minimal set of marking conventions (Chomsky and Halle 1968, ch. 9) to explain these particular forms and makes reference to context only in as far as the markedness values for features are universally specifiable in terms of contiguous feature specifications. Keel (1979), in an examination of why it is that Spanish, English, Thai and Arabic speakers substitute [f] and not [p] for German [pf], and [s] not [t] for German [ts], invokes the 'complement relation' of atomic phonology to explain such (context-free) substitutions via the notion 'minimal complement segment', which states that "Given two segments, both of which are complements of a third segment, that segment is the minimal complement which has the fewest nonidentical feature specifications with respect to the third segment" (1979: 222). Concretely, the L2 learner analyses those segments not present in his L1 as their minimal complements in his L1. Thus given the choice between [t] and [s] for [ts], he will choose [s] since this segment has the fewest nonidentical feature specifications with respect to [ts]. Eckman (1977, 1981b) has proposed a 'Markedness Differential Hypothesis' to predict areas of 'difficulty' in the learning of an FL, by which the relative degree of typological markedness of an item or structure in the TL, together with its degree of 'difference' to an 'equivalent' SL structure, determines the degree of (structural) difficulty experienced by the learner with respect to that item or structure. Applied to phonology, such implicationally ordered markedness relations between phonological contrasts are also ordered according to position in the word. Thus it is claimed, for example, that German speakers will have difficulty producing word-final voicing contrasts in English, since such a contrast is most marked finally (as opposed to least marked initially). Also here one notes that the idea of context has no role in determining the actual occurrence of TL segmental variants and only has a bearing on the abstract, independently established (unit) scale of the implicational occurrence of contrasts.
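Keel's 'minimal complement segment', as quoted above, is in effect a feature-counting procedure, and a short Python sketch can make the selection step concrete. The feature matrix below is a deliberately simplified illustration assumed for this sketch, not Keel's own specification. The names `SEGMENTS`, `nonidentical` and `minimal_complement` are hypothetical, and a feature marked `None` is treated as unspecified and left out of the count.

```python
from typing import Dict, List, Optional

Features = Dict[str, Optional[bool]]

# Simplified, illustrative feature specifications (an assumption of this sketch).
# None means the feature is left unspecified for that segment.
SEGMENTS: Dict[str, Features] = {
    "ts": {"continuant": False, "delayed_release": True, "strident": True, "coronal": True},
    "t": {"continuant": False, "delayed_release": False, "strident": False, "coronal": True},
    "s": {"continuant": True, "delayed_release": None, "strident": True, "coronal": True},
    "pf": {"continuant": False, "delayed_release": True, "strident": True, "coronal": False},
    "p": {"continuant": False, "delayed_release": False, "strident": False, "coronal": False},
    "f": {"continuant": True, "delayed_release": None, "strident": True, "coronal": False},
}


def nonidentical(a: Features, b: Features) -> int:
    """Count features specified in both segments whose values differ."""
    return sum(
        1
        for name, value in a.items()
        if value is not None and b.get(name) is not None and b[name] != value
    )


def minimal_complement(target: str, candidates: List[str]) -> str:
    """Pick the candidate with the fewest nonidentical specifications relative to target."""
    return min(candidates, key=lambda c: nonidentical(SEGMENTS[target], SEGMENTS[c]))


if __name__ == "__main__":
    print(minimal_complement("ts", ["t", "s"]))
    print(minimal_complement("pf", ["p", "f"]))
```

Under these assumed specifications the procedure selects [s] for [ts] and [f] for [pf], in line with the substitutions Keel reports. The function takes no argument for position in the syllable or word, so the selection is context-free by construction.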
To the extent that all such contrastive analyses claim to be 'predictive' or, perhaps more accurately, 'projective' with regard to the behavioural domain of L2 learning, they do not stand up to serious empirical validation. The analysis of FL learner data does not bear out such projections: it does not show that the form and occurrence of erroneous or non-target segmental variants correlate with areas of SL-TL structural difference, and non-erroneous forms with areas of cross-language structural similarity. Further, it has already been noted that FL speech is characterized by its variability and that even the most cursory phonetic analysis reveals many more segmental variants present in learner speech than predicted by SL-TL phonological projections. Even within variationist studies, it has been pointed out by, for example, Knowles (1977) that any serious sociolinguistically based statement of phonological variables in a particular language variety not only needs to be based on a more differentiated phonetic analysis of segmental form, but also must take into account variation within other components of 'the speech production system' apart from the strictly phonological one. He suggests taking a serious look at variation in, for example, articulatory setting, the monitoring system, the (articulatory) coordination system and in the motor commands themselves. From this discussion, assuming that such contrastive analyses do not constitute a completely vacuous undertaking and that they can and do make contact with empirically established data, one may conclude that: i) contrastive phonological analyses have as a main "raison d'être" the contribution they make to our knowledge of linguistic typology (cf. the notion of a 'theoretical' contrastive analysis as opposed to 'applied' contrastive analyses which are explicitly oriented to problems of FL learning, as espoused by, for example, Fisiak 1975, 1976 and Jackson 1976); and/or ii) with regard to their role in L2 learning, that perhaps the descriptive phonological frameworks themselves used in analysis, which theoretically constitute the so-called tertium comparationis of the description of two languages, must be revised in order to function as a more 'explanatory' means of comparison with relation to observed learner linguistic behaviour. Indeed, the function of a phonological framework employed in the explanation of FL segmental variation is not in the first place to provide substantive projections of learner phonological behaviour on the basis of a comparison of the structural properties of SL and TL, but rather, consistent with its role as a tertium comparationis, to function primarily as a formal framework which provides the context of phonological organization, albeit sufficiently rich in structure to accommodate not only substantive statements of comparative phonological and phonetic structure but also to serve as a framework for the analysis of observed FL sound data.
1.1.3. Segmental variation within psycholinguistically oriented models of contrastive phonology
Possible elements of such a framework can be deduced from the results of studies conducted within a more psycholinguistic than a priori structural perspective on TL phonology learning. The aim of such studies is to shed light on the nature of 'phonological processes' underlying observable FL data, as, for example, evidenced by interference phenomena which show up in the occurrence of non-target segmental variants. Within a cognitivist view of learning, characteristic of more recent work in the area, a recognition of such 'processes' has implications for an interpretation of the psycholinguistically relevant 'learning' or 'production strategies' the learner employs in the organization of his FL data.
Briere (1968), in a study of the learning of an artificially created FL phonological system constructed from elements of Arabic, French and Thai by American English speakers, concludes that the use of traditional phoneme and allophone categories of phonological description in comparing the TL and SL is difficult to justify in that patterns of observed phonological interference are not readily captured in terms of such categories. The traditional 'linguistic' (as opposed to 'psycholinguistic') definition of phonological interference, in which 'ease or difficulty of learning phonological categories experienced by a speaker of language X attempting to learn language Y' is attributable to "(1) the competing phonemic categories of the N [Native] and T [Target] systems, (2) the allophonic membership of the phonemic categories, and (3) the distributions of the categories within their respective systems" (Briere 1968: 13), does not correlate with the facts of observed learner performance. A resulting conclusion is that "the syllable is a better prime on which to base a contrastive analysis of AE (American English) with any T language and any ensuing prediction of the hierarchy of difficulty of learning involved should be based on the syllable" (1968: 73). In a related vein, Tarone (1980) focuses on the role of the syllable as the unit of description in an analysis of the phonological performance in English of Korean, Cantonese and Portuguese speakers (i.e. speakers of languages with rather dissimilar basic syllable structure types to English). It is within the syllable that phonological 'strategies' (perhaps, better, 'processes') such as epenthesis and consonant deletion are observed to operate in learner speech. Characteristic of the latter is the preference for a basic CV syllable form, it is claimed.
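The two syllable-based processes mentioned above, epenthesis and consonant deletion, can both be seen as ways of removing consonants that are not directly followed by a vowel, so that the output approaches a CV pattern. The Python sketch below illustrates this on toy letter strings. It is an assumption-laden simplification rather than a model of Tarone's data: the vowel inventory, the default epenthetic vowel and the function names are all hypothetical.

```python
# Toy vowel inventory, assumed for illustration only.
VOWELS = frozenset("aeiou")


def is_vowel(symbol: str) -> bool:
    """True for a single vowel symbol; the empty string counts as 'no vowel'."""
    return symbol in VOWELS


def epenthesize(word: str, vowel: str = "u") -> str:
    """Insert a vowel after every consonant that is not directly followed by a vowel."""
    out = []
    for i, symbol in enumerate(word):
        out.append(symbol)
        following = word[i + 1] if i + 1 < len(word) else ""
        if not is_vowel(symbol) and not is_vowel(following):
            out.append(vowel)
    return "".join(out)


def delete_consonants(word: str) -> str:
    """Drop every consonant that is not directly followed by a vowel."""
    out = []
    for i, symbol in enumerate(word):
        following = word[i + 1] if i + 1 < len(word) else ""
        if is_vowel(symbol) or is_vowel(following):
            out.append(symbol)
    return "".join(out)


if __name__ == "__main__":
    for form in ("sak", "strit"):
        print(form, epenthesize(form), delete_consonants(form))
```

Both functions leave no consonant in syllable-final position or in a cluster, which is the sense in which they favour a basic CV form. Which process a learner applies, and with which vowel, is an empirical matter the sketch does not address.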
A conclusion which may be derived from these observations is that the syllable may be considered a) as a convenient unit in the analysis of learner phonology in forming a focus of IL phonological processes; and b) as a possible prime unit in the learner's organization of his TL/IL phonology. Under a), the syllable has the status of a descriptive prime in the phonological analysis of FL data and as such might constitute a unit of 'interlingual identification' in Weinreich's terms (Weinreich 1953), devoid of any psycholinguistic (cognitive or otherwise) interpretation. Under b), the syllable takes on some kind of psycholinguistic validity as a structural unit in the FL learner's mind, i.e. as one of the phonologically relevant units of a "hypothesized latent psychological structure within which interlingual identifications exist" (Selinker 1972: 222). In this view, the syllable (together with certain other linguistic units) is elevated to the status of a 'psychologically real' element in that it is said to constitute a structural frame of reference for the FL learner as he 'identifies' structural elements of his SL with the TL, a process which in turn shapes his IL performance. At the same time, via a Chomskyan 'systematic ambiguity', in equating the linguist's grammar with the grammar in a speaker's mind, a unit of interlingual identification such as the syllable can be employed simultaneously for the description of 'parallel data' in the three systems of the NL, the IL and the TL. Even disregarding at this stage any claims for the 'psychological reality' of such structures, it should be clear that studies such as these supply valid evidence for the inclusion of the syllable - long neglected as a unit of description in mainstream theoretical phonology - in the conception of a phonological framework as outlined in 1.1.2.1 above. That is, the syllable may function as (part of) a structural context within which statements can be made as to realizational and distributional properties of segmental units within observed FL learner pronunciation, as well as constituting a framework within which contrastive structural comparisons can be made on an analytical level of description - whether 'interlingual identifications' are imputed to the learner or not as part of a heuristic in the explanation of observed IL phonological behaviour.
1.2. Segmental variation in contrastive linguistic phonetic research
1.2.1. Segmental variants in contrastive descriptive phonetics The most comprehensive cross-linguistic study of phonetic parameters to date has been undertaken by Delattre (1965), in which he compares and contrasts the prosodic, vocalic and consonantal features of (American) English, French, German and Spanish according to experimentally derived articulatory and acoustic criteria. The main motivation for the study is baldly 'the improvement of language teaching', for which purpose a knowledge of the phonetic characteristics of the commonly taught FL's French, German and Spanish in comparison with English is deemed to be essential for the FL teacher. In effect, single standard realizations of the individual vowel and consonant phonemes and their distributional properties are described. For the vowels, there are detailed observations on the influence of segmental· environment on duration and of stressed position on their quality. For the consonants there is some discussion of adjustment in consonantal quality and duration according to position in word. The separate treatment of prosodic features is formulated in terms of basic intonation patterns and the placement of stress in word and 1
sense-group', together with observations on typical syllable structure forms
in the four languages. The parameters of phonetic description, i.e. the
phonetic tertium comparationis, are expressed in terms of spectral properties of frequency, duration and intensity and dynamic articulatory properties of place and manner. Again here, statements concerning the suprasegmental 'conditioning' of segmental variation are basically of an 'identificational' type, using categories of context such as initial, medial or final position in word or syllable as taken over from phonological analyses of phonemic and allophonic distribution and realization.
1.2.2. Segmental variants in contrastive pedagogical phonetics
1.2.2.1. Analyses of the segmental phonetic attributes of SL's and TL's have been traditionally subsumed under the allophonic and/or realizational statements of contrastive phonological analysis. Such detailed comparisons as there are, with the obvious exception of Delattre's work just referred to, are to be found predominantly in the form of phonetics and/or pronunciation teaching handbooks for pedagogical use (representative of such studies are, for example, the recent contrastive accounts of English and Dutch for L2 English learners of Gussenhoven and Broeders 1976, and Collins and Mees 1981). The analyses are based on the individual description of segments according to the articulatory categories of the International Phonetic Alphabet, often containing no more than the traditional three-term articulatory labelling for vowels and consonants. Articulatory comparisons are then made with the 'nearest' SL or TL sound, where proximity is judged according to general 'common-sense' assessments of phonemic and phonetic equivalence. References are made to positional characteristics of particular phonemes in the SL and TL, and the major patterns of assimilation and elision word-internally and -externally may be compared together with the incidence cross-linguistically of 'weak forms'.
Statements as to the phonetic forms of segments in the TL (and SL) are restricted to the description of the major allophonic and/or realizational variants of phonemes. Suprasegmental features pertaining to intonation, accent and/or stress are contrasted separately. While it would be perhaps ungenerous to criticize such studies on their lack of theoretical or descriptive sophistication, which may be legitimately constrained by the pedagogical orientation of these works - since they offer in the first place a practical guide to the learning of the sounds of the FL -, they are nonetheless guilty of projecting the same kind of difference: difficulty: error equations on a phonetic level of comparison which are characteristic of types of phonological contrastive analysis discussed above
in section 1.1.2.1. Such hierarchies of (phonetic) difficulty as are set up are established in terms of articulatorily defined differences between isolated segments of the two languages SL and TL and often presented in the form of typical substitution errors. Thus for English /θ/, Collins and Mees (1981) note for Dutch learners: "Replaced by /s/ (or less commonly by /t/)" (1981: 191);
for /w/, "Replaced by /v/, lacking essential lip-rounding, and with labio-dental articulation. Confused with E /v/" (1981: 193); and for /æ/, "Replaced by D /ɛ/, giving a vowel which is too close and causing confusion of E /e ~ æ/" (1981: 194). 'Hierarchies of error' may be established on the basis of an admixture of 'communicative' and (phonologically) 'distinctive' criteria, mediated by a largely intuitive notion of 'acceptability'. In addition, such descriptions have been justifiably taken to task in recent years for promoting an over-idealized picture of the phonetic qualities of the TL by ignoring significant details of both casual and careful speech styles (cf., e.g., Brown 1977), and for their heavy reliance on phonemic analysis for their description and pedagogical presentation. Concerning the latter, in a plea for a 'text-phonetic' approach to FL phonetic description and correction, Thurow (1977) notes: "Die Gleichsetzung von phonematischen und phonetischen Parametern in der Unterrichtspraxis bedeutet eine irreführende Vereinfachung. Sie stellt zudem eine unangemessene Reduktion der tatsächlich gegebenen phonetischen Variabilität jeder Einzelsprache dar. Die Aneignung gerade dieser lautlichen Variabilität jedoch bildet das eigentliche Lernziel der Ausspracheschulung" (1977: 42). Indeed, comparative phonetic descriptions of this type employ essentially phonemic categories as a type of formal tertium comparationis (t.c.), while using (phonologically motivated) phonetic parameters as a substantive t.c. and in terms of such parameters projecting articulatory discrepancies between the SL and TL. 'Phonetic' substitutions noted of SL for TL forms are represented in phonemic brackets. However, comparatively rare observations in such studies of the type, again for Dutch-English contact, "/ð/: When initial, replaced by /d/. Medially, replaced by /d/ (or /z/). In final position, replaced by /t/ or /s/" (Collins and Mees 1981: 191) throw into relief the potential inadequacies of a 'single SL segment for single TL segment' type of analysis and at the same time point towards the typical variation within FL substitutions. As this example shows, suprasegmental context would seem to directly correlate with the occurrence of a particular IL variant of the TL /ð/. Assuming 'position' here refers to 'within a word', why is it that the alveolar fricative variants /z/ and /s/ are said to be found only in non-initial position, whereas the alveolar/dental stop variants occur within all three positions and
to the exclusion of any fricative word-initially? The ways in which the structure of the suprasegmental context itself may be seen as influencing the choice of particular segmental variants form the subject of subsequent chapters and will therefore not be pursued further here. However, it clearly has a bearing on the observation that in one case (initially), the choice of a variant [d] (or [d̪]?, reflecting a dental stop of Dutch) for English [ð] would seem to indicate the selection of an 'equivalence' value of the place of articulation parameter dental (-alveolar), while in other cases (medially and finally), the use of variants [s] and [z] is based on an equivalence value of the manner of articulation parameter fricative. At the same time, it must be noted that Dutch allows the occurrence of both dental stops and (post-)alveolar fricatives in all three positions (for further discussion see James 1983a).
1.2.2.2. Independent of any influence of the positional context on the choice of a particular IL segmental variant, Barry (1978), for example, notes in a discussion of the substitution by German speakers of English [θ] and [ð] by [s] and [z], respectively, that an analysis of FL sound substitutions must have recourse to both production and perception modes of feature associations, identifications or equivalences of TL's and SL's. In these particular examples, the 'perceptual' feature manner of articulation/fricative dominates the 'production' feature place of articulation/dental, contrary to the priorities one assumes are prevalent in the actual teaching of English [θ] and [ð] to German learners. He concludes, therefore, that a substitution error of this type "involves a confusion of behavioural patterns which belong to different areas from the point of view of scientific description but which necessarily meet in the learner's articulatory behaviour and are consequently not ultimately separable in the concrete learning process" (1978: 76).
However, while one may support the general spirit of this observation, Barry would seem to be falling into the descriptive trap of equating not only perhaps phonological with phonetic parameters (cf., e.g., Thurow's criticism above in 1.2.2.1.), but more seriously, descriptively established phonetic parameters with the behavioural domains of the production and perception of speech. A phonological or, more importantly, a phonetic conception of the parameters or features dental or alveolar place and fricative manner as defined primarily with reference to articulatory and acoustic/auditory criteria of description, respectively, does not of course confine their behavioural validity to respectively production and perception modes of speech behaviour. "The apparent independence of the perceptual and production modalities in
the interpretation of alien phonemes" is a similar conclusion drawn by Nemser (1971c: 95) in his study of Hungarian learners' rendering of English [θ] and [ð] (see also Nemser 1971a), in which he shows that when errors occurred, the interdentals were usually perceived as labial fricatives (cf. also Weiher 1975 for German learners of English), but produced as apical stops. Of greater direct significance for the above presented discussion of Dutch [d]/[t] or [s]/[z] for the English interdentals is Nemser's finding that [θ] was imitated in the order of preference sibilant [s], fricative [f], stop [t], whereas [ð] was imitated in the order of preference [d], [v], [z]. As has often been observed, of course, English [ð] manifests much less friction than [θ] and thus may not lend itself to a fricative 'identification' with an SL phone to the same extent as the latter. Clearly, then, caution is required in the adoption of phonologically motivated phonetic features in the analyses of IL phonological and phonetic behaviour (for further discussion see also Kohler 1981). While processes of identification between segments of SL and TL on the part of the learner may be captured analytically in terms of preferred feature equivalences and differentiated according to such non-linguistic variables as the behavioural modalities of production and perception, verbal task and level of learner proficiency (see above discussion of segmental variation in 1.1.1.), it is surely the prime task of a structural linguistic analysis of such imputed identifications to first account for language-internal variables which may be seen to condition the place and form of one or another IL variant of a TL segment. Such a major language-internal variable, it is
claimed here, is the suprasegmental context of such items (see further James 1981). It is the task of a phonological analysis applicable to FL learning data to provide a structural description of such a context within which these equivalences may be established. However, as has already been noted above (section 1.1.1.), a linguistic account of such a context must do more than identify structural positions such as 'word-initial', 'syllable-final', etc., on an ad hoc basis and correlate them with the occurrence of particular segmental variants, if it is to at least try to approximate to an explanatory description of such phenomena. Structural attributes of contextual positions must be shown to directly influence realizational attributes of IL segmental variants. Methodologically, this procedure may be expressed as "given the possibility of ranking a set of ENVIRONMENTS with respect to some property, we could establish the relative compatibility of various segments with degrees of that property by examining the relative likelihood of their occurrence,
emergence or disappearance in that set of environments" (Brasington 1982: 87). A second methodological step would be to examine in how far properties of the environments are equally shared by the occurring segments and manifested in the form they take. For example, it has been noted that phonetically "stronger" (measured in terms of degree of manner occlusion, type of phonation, place of articulation, presence of secondary articulation, etc.) segments, whether SL-, TL- (or IL-)derived, are more likely to occur at word-initial than word-final position in learner phonological performance (cf. James 1981, 1983a). It is equally clear, however, that the expression of these 'properties' necessitates a sufficiently abstract level of description to permit the formulation of 'significant generalizations' in this area.
1.3. Segmental variation and context in other contrastive research
The influence of structural context as a determinant of the form of IL segmental variants is recognized as an important area for further research by Walz (1979) in his longitudinal study of the acquisition by American learners of French of a number of TL vowels - including the nasal series - and the consonant [R]. Examples of [i] as a frequent variant of [e] (as opposed to a possible predicted [θ]) in the French word "cinema" and the frequent omission of (word-)final consonants constitute evidence that suprasegmental context, including non-contiguous context, must play a significant role in the conditioning of such variants. Walz concludes - portentously - that "the actual effect that environment has is much more complicated than is commonly believed" (1979: 78). Systematic evidence for the importance of suprasegmental context in the analysis of segmental variation is provided in a study of non-interference errors of English speakers learning Arabic, Polish and Russian as an FL by Garnica and Herbert (1979). They observe that a number of phonological errors are produced which cannot be directly related to the influence of the SL sound system. These errors, which they label 'sequential replacement' (e.g. [kʃeʃwo] for Polish [kʃeswo] "chair", [rozi] for Polish [robi] "to make" and [ru:sijjm] for Arabic [rursijjun] "a Russian") and 'metathesis' (e.g. [raguda] for Russian [raduga] "rainbow"), constitute substitution processes which take place in the IL and resemble phonological processes as found in child speech and as speech lapses in normal adult performance. The analysis of such segmental errors or variants will clearly have to come to terms with the suprasegmental context within which such variants occur and/or within which such processes operate. Interpreting their analysis within a psycholinguistic
perspective of language learning in which these processes are seen as evidence of 'strategies' used by the second language learner which are in part similar to those employed by first language learners (for further discussion on this point see, e.g., Hecht and Mulford 1982), they conclude: "We propose that the second language learner also attends to only select utterances and either does not perceive minor variations in the pronunciation of particular words (or expressions) or misperceives parts of words. In the latter case, the learner's native language influences the learner and interference errors are likely to occur. In the former case, however, other principles become operative. In some cases the second language learner may omit segments. In other cases he may perceive certain phonetic features of the utterance but not be aware of the exact location(s) of such features in the sequential arrangement of the word. Since the learner can identify some kind of 'schema', certain phonetic features appear where they do not in the adult model. The use of 'strongly articulated', repeated or pervasive aspects of the word in parts of the word where they do not appear in the model result in the types of errors which occur in the cases of phonological replacement illustrated in the examples discussed in this paper. Thus, it appears that an outline of certain characteristics of the lexical item is coded, e.g., number of syllables, syllable prominence, the presence of a 'strongly articulated' phonetic feature, etc." (1979: 14-15)
This conclusion merits an extensive citation since its implications are highly significant for the concerns of the present study and since it in effect provides a psycholinguistic processing account which makes contact with major elements of the cross-linguistically relevant view of linguistic structure to be proposed here. The 'strongly articulated', correlated with 'perceptually salient', features identified derive this behaviourally relevant attribute from properties of their phonetic quality and properties of the 'prominent' position with which they are associated within some higher-level unit such as the syllable or word. It is important to note, however, that such attributes as 'strongly articulated' and 'perceptually salient' here describe features within the behavioural domains of production and perception as they are mediated by the acquisitional situation and abstracted from properties of linguistic structure. They do not form part of any phonetic or phonological structure itself. However, such patterns of cross-linguistic identification by the learner make reference ultimately to patterns of linguistic structure. As noted in the previous section and as is supported by Garnica and Herbert's findings, the essence of a cross-linguistically relevant structure, it would appear, must be defined in terms of the intersection of realizational properties of units with the structural properties of their position. As will be shown in later chapters, this may be
achieved by a suitably abstract conception of relational 'strength' or 'prominence' as it is manifested in both unit, i.e. here segmental, properties and positional properties.
1.4. Formal and functional aspects of contrastive phonological analyses as applied to FL data
1.4.1. It is to be hoped that on the basis of the discussion so far, already some kind of picture is emerging of the requirements for a linguistic framework of analysis which is adequate to the description of cross-linguistic phonological and phonetic structural characteristics as they may serve as a basis for statements pertaining to the linguistic constraints on FL speech behaviour. It should also hopefully be emerging that such a framework at the same time can be interpreted as forming a structural description of the suprasegmental context within which segmental variation occurs, as well as itself serving as a t.c. for comparative analyses of the phonology of two or more languages. At this point, an understanding of the formal aspects of such a possible framework for cross-linguistic description could perhaps be illuminated by a consideration of certain functional and ontological consequences of this kind of viewpoint.
1.4.2. It has often been noted, and referred to in the previous discussion, that the 'predictive' power of contrastive phonological and phonologically-oriented contrastive phonetic studies is 'weak' in the sense that 'predictive' statements formulated on the basis of structural comparisons of SL's and TL's of the kind (degree of) difference in structure -> (degree of) difficulty experienced in the TL -> (relative) incidence of TL error are not borne out in the observation of attested FL learner behaviour. In the wake of these negative findings, the explanatory value of contrastive phonology in a theory of second language learning has been reduced to that of a 'diagnostic' or 'a posteriori' role in the explication of L2 pronunciation error. That is, knowledge of the structural similarities and differences between SL and TL will be usefully applied in the analysis of the sources of non-target forms in the IL.
However, it must be stressed that much criticism of the 'predictive' or 'explanatory' value of contrastive phonological analysis has been misplaced. For one thing, by no means all such analyses have claimed 'behavioural validity' for their findings and have restricted their pronouncements carefully to establishing areas of likely difficulty or error in TL learning
a structural contrastive analysis of SL and TL points up. Typical of criticisms of the application of particular theoretical phonological models in the area of FL learning are the statements by, for example, Kohler (1971): he concludes "Because of its failure (1) to predict mistakes and their gravity in all cases, (2) to distinguish between speaker and hearer explicitly; and (3) to explicate a grading of difficulties, taxonomic phonology cannot be regarded as an adequate theory for contrastive linguistics" (1971: 86). In the following sweeping statement, the contribution of generative phonology is rejected, since "To account for linguistic interference we have to know the rules of articulatory movement control, which cannot possibly be expressed in pseudo-psychological forms that are determined by historical alternates of different degrees of petrification" (1971: 87). Apart from the fact that phonological theories - taxonomic or generative - are in their conception neutral to speaker/hearer distinctions and do not purport to account for 'articulatory movement control', statements such as these (typical for a large group of "disillusioned" phonologist-language teachers) suffer from serious misconceptions of what the function of a phonological theory or description must be in relation to questions of the explication (structural and/or behavioural) of FL pronunciation data.
Aside from the basic question as to whether contrastive phonological studies should, or can ever be used or designed to make 'predictions' in a literal sense concerning FL learner speech behaviour on the basis of structural comparisons of SL's and TL's - which some phonologists are apparently still prepared to believe -, consonant with certain of the arguments already presented here, it is most definitely not the role of phonological theory or description, whether as applied to FL learning or equally ML learning, to be directly and uniquely 'accountable' - either 'predictively' or 'diagnostically' - for the explanation of learner data gathered in one or other behavioural domain of linguistic performance. A phonological theory, a phonetic theory, or indeed a linguistic theory or description in general, can at the most have a 'projective' function in such explanation, such that inherent structural properties of language - the specification of which constitutes the proper object of linguistic analysis - form the basis of or may be 'projected' as those properties relevant to the assessment of purported structurally related determinants of linguistic performance in whatever behavioural domain. In other words, a phonological framework can only serve as a structural basis of comparison, a t.c., between two languages within which suspected cross-linguistic structural influences deduced from the form of FL speech behaviour may be explicated.
1.4.3. A clear distinction must be drawn between structural comparison as a 'translation procedure' and comparison as a 'set of descriptive statements' in SL-TL contrastive analyses, the latter being made within a particular theoretical framework and ideally based on a conception of substantive universals with reference to the description of two or more languages. Significantly, Chomsky doubts whether a 'reasonable' translation procedure is possible, 'reasonable' in the sense of "one that does not involve extralinguistic information" (1965: 201-202, fn. 17). Thus a translation procedure converting the elements and structure of L1 into L2 as is practised in certain types of contrastive phonological analysis is methodologically unsound in the absence of extralinguistic information - for example, information which makes reference to 'real-world' meaning and/or sound determinants. As Chomsky further notes, in any case "The possibility of a reasonable procedure for translation between arbitrary languages depends on the sufficiency of substantive universals" (1965: 202, fn. 17), a point of course of equal significance for the making of comparative descriptive statements across languages (see, however, Catford 1965 for further discussion of the linguistic bases of translation procedures). The general procedural inadequacies of contrastive analyses for FL learning have been highlighted by several scholars, but notably by Whitman (1970). He distinguishes four steps in a procedure of comparison: description of SL and TL, selection of forms from the description, contrast of the forms selected, and prediction of difficulty on the basis of the contrast - all steps depending on a basis of equivalence between the forms of the languages compared. Related to the terms of the present discussion, Whitman distinguishes the problem of the basis of equivalence of 'structures' (e.g.
of the type noun, verb, adjective) at the stage of description, from the equivalence basis problem of 'realizations' (e.g. defined in terms of phonetic or semantic features) at the stage of contrast, and concludes that the former are not truly universal (espousing the classic view of individual language structuring as systems "ou tout se tient") and that the latter are dependent on 'metalinguistic' points of reference - currently lacking in his view for their proper definition. Instead, he suggests enriching the procedures of the selection (of forms to be contrasted) and prediction (of difficulty) stages by the adoption of a psychological perspective: concerning 'prediction', he characterizes this perspective as "describing the necessary psychological adjuncts of difficulty and then fitting the contrast to these adjuncts" (1970: 196); concerning the 'selection' of forms to be contrasted, this must
be based on what forms the learner considers to be equivalent, since "'equivalence' is not a linguistic primitive (i.e. a competence concept) but a psychological one" (1970: 194). The value of these statements lies in the explicit methodological separation Whitman proposes of linguistic and 'psychological' (or 'psycholinguistic'?) procedures in contrastive analysis and their claims for validity within structural or 'competence' areas of description as opposed to behavioural or 'performance' areas. However, it must be admitted that little progress has been made in describing the 'psychological adjuncts of difficulty' in relation to FL learning beyond the well-established measures of difficulty in terms of task similarity (cf., e.g., Osgood 1949). On the other hand, recent work on the 'transferability' of SL items and structures into TL (for phonology see James 1983b; for lexis, see Kellerman 1977, 1979) has come closer to a psycholinguistic, if not psychological, specification of learner judgements of 'equivalence' in an FL learning situation. These studies have shown that learner judgements of structural equivalences across SL and TL are, amongst other things, determined by the degree of 'language-specificness' which items are thought to possess in SL and TL - idioms, for example, are treated by learners as language-specific and thus non-equivalent (Kellerman 1977), likewise dialect-associated phones (James 1983b). Since the procedural inadequacies of contrastive analyses in Whitman's view revolve around the credibility of an adequate linguistic basis of equivalence, it is clearly a major task of linguistic analysis to provide an adequate tertium comparationis. Bearing in mind the above postulated distinction between comparison as a translation procedure and as a set of descriptive statements and the place of linguistic constructs within them, a basis of comparison couched in terms of substantive elements of universal validity, i.e. 'substantive universals' expressed as phonological categories and/or a
set of universal phonological features, could serve to resolve certain of the inadequacies. Both phonological 'structures' and 'realizations' (the latter perhaps as expressed in a set of universal phonetic features) represent substantive universals in Chomsky's (1965) sense and as such equally constitute 'metalinguistic points of reference' in cross-linguistic description, since it is surely the purpose of any linguistic theory - and par excellence of Universal Grammar (UG) - to provide 'metalinguistic' statements on language structure. In a conception of linguistic competence as propounded in UG, both 'structural' (e.g. phonological categories) and 'realizational' linguistic constructs (e.g. as represented by phonetic features) are equally
universal in their properties. The recent subsuming of 'formal' and 'substantive universals' under 'formal' (as opposed to 'functional universals') by Chomsky and Lasnik (1977) does not affect this. A theoretical objection often raised against contrastive analyses is that they employ the descriptive tools of a competence (or 'langue') model for explanation in the performance (or 'parole') area of language manifestation. This has led to suggestions such as those of Whitman above, that a psychological or psycholinguistic component be incorporated in the description to account for the realities of language processing inherent in an FL learning and production situation, i.e. providing a kind of dynamic filter on the observational validity of purely structural comparisons of SL and TL and thus increasing the power of their explanation of FL learner behaviour. However, fundamental to any discussion of the status and form of contrastive analyses relevant to FL learning is a proper conception of the relationship between models of competence and models of performance in linguistic descriptions. Following Chomsky (especially 1967, 1980), the conception adopted in the present study is that of a 'componential' view of the relationship between competence and performance, i.e. that a theory of performance includes a theory of competence in the most general sense. As far as any psychological reality can be imputed to a theory of UG, i.e. of competence, Chomsky concludes that there is no distinction between 'psychological reality' and "truth in a certain domain" (1980: 106-107). While accepting this basic theoretical position, in the context of the concerns of the present study a more moderate interpretation of 'the general psychological reality of linguistic concepts' may be adopted such as that formulated by Levelt (1974), i.e.
that "A linguistic concept is psychologically real to the extent that it contributes to the explanation of behaviour relative to linguistic judgements and nothing more is necessary for this" (1974: 70). The present study of the suprasegmental context of segmental variation in FL speech, having main reference to that of Dutch learners of English, has as its object of analysis a domain of linguistic performance, as its aim, however, the formulation of a partial theory of (phonological) competence. As such it not only must make reference to a set of substantive universals in the comparison of SL and TL, but also to a set of formal universals, in this case properties of phonological representations and rules. It will be shown that formal elements and substantive elements of the description are mutually dependent. However, essential to the conception underlying the present study, consistent with the 'componential' view of the relationship between competence and performance, is the axiom that a theory of linguistic performance
must make reference to both a grammar (derived from a theory of linguistic competence) and a (psycholinguistically related) theory of processing. A description of performance, then, as a description of the linguistic performance capacity will make reference to both the grammar of structure underlying 'use' and to the use itself made of that grammar in language processing. The capacity for linguistic processing, the 'processor', is a product of the cognitive and conceptual systems necessary to language performance, but also ultimately the product of the interaction of these systems with other non-language-specific human cognitive and conceptual endowments. The 'componential' view of the competence-performance relationship is here interpreted with Kean (1981), as: "in linguistic performance the systematic levels of representation generated by the grammar are realized. As the grammar is what characterizes the systematic levels of representation (i.e. provides structural descriptions of strings), then it follows from an internal assumption of the theory that the grammar is a component of the performance theory" (1981: 191). Further, "Accepting the assumption that the representations of a string generated by the grammar are realized in processing, it follows that all substantive partitions of items in a string which are captured by those representations are available to the processing mechanisms" (ibid.). On the position of substantive universals in a theory of performance, Kean concludes: "all and only those substantive partitions of linguistic elements which are captured in the grammatical theory of substantive universals will be the realized partitions of elements in linguistic performance" (1981: 200).
A clearly defined position such as this on the theoretical status of "structures' and 'realizations' identified in an analysis of FL speech behaviour would seem to be necessary, given the problems - referred to in previous discussion - concerning the interpretation of the findings of contrastive analyses within linguistic and/or behavioural domains of reference.
CHAPTER TWO: TOWARDS A PHONOLOGICAL FRAMEWORK OF SUPRASEGMENTAL CONTEXT
2.0. Introduction
The characteristics of a framework of suprasegmental context as outlined in Chapter One may be summarized as follows:
1) in terms of general function, it should serve as an adequate descriptive basis or structural t.c. within which to undertake comparative analyses of segmental forms in two or more languages, with a view to providing a linguistic explanation for observed variation in FL speech.
2) in terms of its general theoretical status, it will provide the grammar component within a theory of linguistic performance under the assumption that such categories of description as it provides are regarded as the 'substantive realization' of categories of a UG which form the structural basis or input to the processing mechanisms necessary to produce speech behaviour. As such, a framework for description of SL's and TL's cannot be accredited any directly 'predictive' or 'diagnostic' power in 'accounting for' learner FL behaviour, for the way in which an FL learner utilizes (his knowledge of) the structure of SL and TL in comparison and contrast is individually subject to the mechanisms of his processor, which contribute, for example, to determining the learner-specific identification and equivalence procedures between elements of SL's and TL's. However, the analysis of structural - in the present study in the first instance, positional - constraints on the form of segmental variants in FL speech serves as a legitimate heuristic by which "evidence about the actual organization of behaviour [i.e. performance, AJ] may prove crucial to advancing the theory of underlying competence" (Chomsky 1980: 226).
3) in terms of general form, it has been suggested that a structural description which allows a statement of (variation in) substantive or realizational properties of segments as a product of their structural position may be achieved within a suprasegmental framework of phonological structure, and that this relationship between unit properties and positional properties can best be captured within a system of relative 'strength' relations which directly relates scalar attributes of segments as 'fillers' to scalar attributes of structural positions as 'slots'.
In the present chapter, a number of phonological frameworks of description will be critically examined which offer the possibility for an incorporation of suprasegmental context within their structural specifications. Further, a preliminary attempt will be made to develop a number of descriptive primes which may serve as the basis for a subsequent theoretical development of the notion of suprasegmental context within phonological theory.
2.1. A general view of suprasegmental context within phonological structure
2.1.1. In the discussion in Chapter One relating to the linguistic context within which segmental variation is observed to occur in FL speech, reference was made to unit constructs of the order syllable and word, where for example it was noted that the non-target form of English [£] in Dutch learners' speech was realized differently according to position in the word (stop in initial position, fricative in final position). Further examples of forms of non-target variants correlating with word (and syllable) position in 'Dutch English' are noted in James (1981, 1982): for instance, English [pʰ] in stressed initial position is rendered as [pʰʰ], i.e. with 'exaggerated' aspiration, or as a lengthened stop [pp] - the former clearly TL-derived, the latter SL-derived; 'postvocalic r', for example, 'emerges' in stressed syllables and in phrase-final position; [v] occurs for [w] in stressed word-initial position, [ʋ] for [w] in unstressed initial position, etc. Assuming that there is a systematic basis to observations such as these and that such correlations of forms with positions are not merely coincidental (see, e.g., the data analysis of James 1981 for claims of 'statistical significance'), it clearly behoves any statement of suprasegmental context which aims to offer a structural explanation for this type of phenomenon to progress beyond mere identification of the context involved. Units such as syllable and word, as they constrain - in the first instance phonetic - choices of segments according to the position they take up in the former, are obvious candidates for inclusion as structural elements of the suprasegmental context and, by extension, of phonological description. The phonetic 'realization' of forms must be related to properties of the phonological structure in which they are embedded. This relation may of course be stated in the traditional terms of allophonic realizations of phonemes or as contextual segmental realizations conditioned 'indirectly' (e.g. with reference to an 'intermediate' phonemic realization) or directly by positional properties inherent in the larger phonological structures within which they occur.
Since in the later analysis it will be shown that segmental variation is equally conditioned by position of a segment within structures of a higher order than syllable or word (see, e.g., the reference above to phrase-level conditions on Dutch English 'postvocalic r'), these units too will have to be considered as elements of a (suprasegmental) phonological structure. These observations clearly point to the assumption of a hierarchically ordered phonological structure as necessary to the specification of suprasegmental context in the present study.
2.2. Hierarchical models of phonological structure
2.2.1. Hierarchically ordered models of phonology have until recently been the province of systemic (e.g. Halliday's) and tagmemic (e.g. Pike's) theories of linguistic structure.
2.2.1.1. Halliday (1967), for example, proposes a phonological rank scale comprising the units tone group, foot, syllable and phoneme in descending order, "related taxonomically as are the units of the grammatical rank scale: each one consists of one or more of the one below it" (1967: 12). The structure of the tone group is represented as:
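[Formula: tone group structure in terms of an optional pretonic P and an obligatory single or double tonic T, each with a foot count (1...n) or (2...n)]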
i.e., the tone group comprises two elements of structure, an obligatory tonic (single or double) T, and an optional pretonic P, with each element consisting of one or more feet (1...n), (2...n). The structure of the foot is represented as:
I (R 1...n)
i.e., the foot has a structure of two elements, an obligatory ictus I, which, however, may have a zero exponent as 'silent ictus', and an optional remiss R, with each element consisting of one or more syllables. The syllable itself displays the two primary classes salient and weak, with salient syllable operating at ictus and weak at remiss. In practice, however, Halliday has restricted the application of this phonological hierarchy to the analysis of the intonational and rhythm systems of English, which make reference primarily to the units foot and tone group, and in effect, one is left to reconstruct the form of the hierarchy at the ranks of syllable and phoneme. Following Abercrombie (1967) and the work of Firthian prosodic analysis, it is possible to claim that the syllable would have a structure of two elements, an obligatory V and one or more optional C's - for English showing the numerical restriction as represented in the formula:
(C 1...3) V (C 1...4)
which captures the upper limits of three prevocalic and four postvocalic 'consonants' within the syllable for English. Each element of C and V consists of a phoneme. However, for a more recent development of the phonological component of a systemic grammar - with reference to the analysis of English and Telugu - which shares something of the structural, if not directly theoretical, conception of the present study, cf. Prakasam (1979), who postulates a hierarchy of utterance, tone group, phrase, cluster, syllable and phonematic unit. Here the structure of each rank unit is expressed in terms of paradigmatic contrasts expounded by units of the next-lowest rank and syntagmatic contrasts as expounded by same-level prosodies. For Telugu, the structure of lower rank units may be represented as (Prakasam, 1979: 60):
Phonological unit    Structure
                     Paradigmatic         Syntagmatic
Cluster              Syllables            + Cluster prosodies
Syllable             Phonematic units     + Syllable (-Part) prosodies
Phonematic Unit      -                    -
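Returning to the English syllable formula (C 1...3) V (C 1...4) given above, its numerical restriction can be sketched as a simple check on CV skeletons. This is a minimal illustration only: the one-letter coding (C for a consonant phoneme, V for the vocalic element) and the name fits_english_template are assumptions introduced here, not part of the descriptions cited, and the check says nothing about which particular consonants may combine.

```python
import re

# Assumed coding: 'C' = one consonant phoneme, 'V' = the obligatory vocalic element.
# The pattern allows at most three prevocalic and four postvocalic consonants.
SYLLABLE_TEMPLATE = re.compile(r"C{0,3}VC{0,4}")


def fits_english_template(cv_skeleton: str) -> bool:
    """Return True if the CV skeleton respects (C 1...3) V (C 1...4)."""
    return SYLLABLE_TEMPLATE.fullmatch(cv_skeleton) is not None


# "blind", treated as two consonants + vocalic element + two consonants
print(fits_english_template("CCVCC"))
# four prevocalic consonants exceed the upper limit
print(fits_english_template("CCCCV"))
```

A skeleton with no V at all is rejected as well, which reflects the obligatory status of V in the formula.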
2.2.1.2. Pike (1967) proposes a phonological hierarchy of breath group, pause group, stress group, syllable and phoneme, which is paralleled by 'equivalent' and partly interlocking lexical and grammatical hierarchies comprising levels of notionally the same order of magnitude. Additionally, in his work on the phonetic analysis of speech in terms of 'rhythm waves', Pike (1962) relates the syntagmatic structure of such 'waves' as sequences of 'crests' and 'troughs', representing, respectively, nuclei and margins with 'slopes' connecting them, to that of phonological structure proper, in that the distribution of phonological elements (e.g., phonemes or their allophones) may coincide with the 'marking' of nucleus or margin points of the syntagmatic string. Moreover, rhythm waves may be analysed as 'etic' exponents of various levels of an 'emic' phonological hierarchy, such that phones, syllables and phonological phrases can all be interpreted in terms of a 'rhythmic' syntagmatic structure of nucleus, margin and 'slope'. A tagmemic interpretation of each unit-level of a phonological hierarchy would thus regard the phoneme, syllable, phrase or stress group, etc., as comprising a paradigmatic class within a functional slot in syntagmatic structure.
A complete statement of a phonological analysis of a language in these terms is offered by E. Pike and Scott (1962) with regard to Peruvian Marinahua in which: 1) contrasts and variants of both segmental and tonal phonemes are presented with reference to their occurrence in the syllable, word and phonological clause; and 2) constituent next-level units of the word and clause, i.e. syllable and word respectively, are analysed for their distribution in these higher level units. Significant for the present concerns are the theoretical possibilities opened up as thus demonstrated of not only formulating the phonological analysis of unit-levels of the hierarchy in statements of the occurrence and distribution of the 'immediate constituents' of such levels, but also of allowing statements of the contrasts and variants of the lowest-level 'ultimate' phonological constituents, i.e. phonemes, with reference to all levels of the hierarchy. Thus, depending on the limits one sets on allophonic variation, phonemic or sub-phonemic variation may be considered as structurally determined by occurrence in units of the order phonological clause and, one may add, phrase, sentence or even 'utterance'. Of course, the problems in defining what is a phonemic contrast per se as opposed to a segmental realization or feature of a higher level unit are considerable, without recourse to some kind of classical 'juncture phoneme': for example, does one posit a phonemic status for [ʔ] as it occurs (vs. zero) at the end of listed words, phrases or sentences (as in Marinahua), or between words ending and beginning with vowels (also in English), or clause-initial before a vowel (also in English), or does one consider it to represent a contrastive feature - perhaps a 'prosody' - solely of the unit it characterizes?
In practice, in Pike's "word", phonemic contrasts continue to be established on the basis of distribution and commutation within the monosyllable word or morph, whereas segmental variation within higher level units is considered to be 'allophonic', with segmental effects such as vocalic lengthening, etc., being attributed in a non-systematic way to influence of the suprasegmental context - e.g., occurrence in stressed syllable, in clause/sentence/utterance-final position, etc. However, Pike's framework does not show that such 'segmental effects' as length(ening) and quality distinctions in both vocalic and consonantal units may be systematically interpreted, in conjunction with 'non-segmental effects' such as pitch and stress distinctions, as realizing the structural positions margin or nucleus on various levels of the phonological hierarchy. The relationship between the phonemic or unit-relevance of these features and their non-phonemic position-relevance is summarized as: "certain components are subphonemic when viewed within a routine description
of the segmental phonemes, but contrastive in reference to the higher layers of phonological units. Loudness, length, or high pitch may be relevant as components contrasting and identifying nuclei and margins of contrastive rhythm units, while comprising subphonemic variants of the segmental phonemes themselves" (Pike 1962: 14). Phonologically, it is only at the level of the syllable and below that Pike has attempted a consistent tagmemic analysis defining structure and constituency in terms of the interplay of syntagmatic and paradigmatic contrasts. In a study of the Mazateco syllable (E. Pike and K. Pike 1949), margin and nucleus are regarded as its immediate constituents (IC's) which in turn have IC's of, respectively, consonants and vowels, which combine in clusters. Each cluster is defined as comprising a 'principal member' with one or more 'subordinate members'. In effect, this means that the IC's of clusters are defined in terms of principal and subordinate structural 'slots' which are 'filled' by consonants and vowels. Within the principal slot a large number of phonemes can occur; within the subordinate slot only a restricted number. Phonetically, the articulation of members within the subordinate slot "tends to be secondary, tertiary, or subprimary in relation to the primary articulation of the other members of the cluster" (Pike and Pike 1949: 80). In principle, this type of IC framework has been used in a number of more recent (non-tagmemic) phonotactic analyses of the English syllable (cf., e.g., Fudge 1969; Selkirk 1980a, 1982b), and its further phonological implications have been exploited by, for example, Hooper (1976), who with reference to Spanish has interpreted syllable-initial position as 'strong' (cf. 'principal member') in that it allows a maximum number of phonemic contrasts.
Thus 'principal member position' (Pike and Pike) or 'strong position' (Hooper) represent in their definitions correlates of syntagmatic properties of their structural position or 'slots' with paradigmatic properties of their content or 'fillers', the latter specified phonologically in terms of number of phonemic 'placeholders' possible (Pike and Pike) or - more abstractly - in terms of number of phonemic contrasts possible at a certain place (Hooper). This indeed constitutes a true phonological specification of an ordered structural hierarchy. However, as has already been indicated, such a - in another sense - 'strong' conception of a phonological hierarchy is only realizable within those levels of the hierarchy whose lower order constituent (unit-)positions are filled by elements which contrast paradigmatically with same-order elements (in absentia) at those particular positions within the higher order structure. However, within a 'weak' conception of phonological hierarchy, i.e. in which in
Halliday's terms, there is a coincidence between hierarchical constituent 'unit' and expounding structural 'element' - failing a phonemicization of, for example, stress and intonation or pitch as in traditional structuralist accounts - one has to be content with interpreting such structural positions within the phonological phrase, clause, etc., as the 'loci' of paradigmatic feature, not unit, contrasts as relevant to the characterization of units of a lower order. This 'weaker' conception is of course equally consistent with Hooper's view of syllable position as a locus of (feature-based) segmental contrasts. Within this conception of a phonological hierarchy, which underlies in large part the type of structural hierarchy proposed in the present study, a phonologically significant position within a particular level structure simply represents the locus or point of contrast of a set of paradigmatic features which via positional coincidence are associated syntagmatically with constituent elements of a next-lowest order structure (i.e. do not directly carry the contrasts between the constituent units themselves of the lower order as Hallidayan 'elements of structure'). The positional coincidence itself is further expressed via a paradigmatic-syntagmatic interpretation of the relational phonological property 'strength'. Hence there is no necessity in principle for the locus itself to be directly isomorphic with any particular lower level constituent unit. For example, in English one may claim that pitch direction features - basically rising vs. falling (possibly, 'level') - which expound phonologically relevant contrasts in intonation and are associated with the unit-level phonological clause, have as their locus of contrast the tonic syllable. While the immediate domain, i.e. unit of 'syntactic association', of such features is the phonological clause, choices of rise or fall (or level) may also be seen as being conditioned by the constituent status of the clause relative to other linearly co-occurring clauses within, e.g., the phonological sentence. Whereas Halliday would adduce the 'structural element' T to express this relation, independent of the hierarchical constituent 'foot', one could equally - within a more 'monosystemic' approach - adduce some structural measure of the relative syntagmatic status of constituent phonological clauses with regard to each other: e.g., a 'principal' or 'strong' constituent (-position) allowing a potential choice of rise or fall, and a 'subordinate' or 'weak' constituent (-position) offering no choice of rise vs. fall. The same approach could apply to the analysis of the stress feature at phonological phrase level as it is related to the relative syntagmatic status of constituent phonological words (as the domain of contrast), where
again the syllable constitutes the locus of such contrasts. Moreover, in an exclusively syntagmatic view of phonological structuring, such syllable loci constitute points of manifestation of the 'culminative' ("gipfelbildende", Trubetzkoy 1958) phonological function that sound properties possess. The phonological significance of this function is apparent in considering the role of the structural position at which the sound properties realizing it are manifested and the unit with which they are associated with regard to the syntagmatic unit within which the position/unit is located. Sound properties ("Schalleigenschaften") "geben an, wieviel "Einheiten" (=Wörter, bzw. Wortverbindungen) im betreffenden Satze enthalten sind" [indicate how many 'units' (= words or word combinations) the sentence in question contains] (Trubetzkoy 1958: 29). In the present view, they also indicate the relative status of linearly co-occurring units in terms of their 'strength'.
2.2.1.3. The previous discussion has attempted to demonstrate the structural motivation - paradigmatic and syntagmatic - for a type of hierarchical phonological description which will form a basis for statements on the suprasegmental context of segmental variation found in FL speech. Central to this conception of a phonological hierarchy is: i) the distinction between same level constituents in the hierarchy as relationally 'strong' and 'weak', i.e. as types of Hallidayan 'elements of structure' (one may add that where there is only one constituent present it will be 'strong'); and ii) the distinction between the domain of phonologically relevant structure-defining features and the exact locus of their realization.
The type of 'strong-weak' relationship between constituents of a hierarchy as outlined here has been interpreted by Tench (1976) as a relationship between 'basic unit' and 'expansion' within a 'double rank' (or Pike's 'paired level') view of phonological structuring, where the 'expanded unit', for example, cluster or 'rhythm group', has an internal structure of basic units, respectively for example, phonemes or syllables, one of which may be regarded as the 'nucleus' ('principal', 'head') and the others - if present - as the 'satellites' ('subordinate', 'modifier'). Basic units have a relationship of 'function' with other level basic units, allowing the expression of, for example, distribution, and of 'expansion' with the expanded unit. A reason given why, for instance, the rhythm group (or foot) is seen as an expanded unit of a syllable and not as a direct constituent of the intonation unit (tone group) is that tone contrasts fall on the syllable itself, not 'on' the rhythm group itself. However, intonational contrasts within the next highest level basic unit, the 'phonological paragraph', equally fall on the syllable,
although as Tench admits, "It is difficult to determine with any degree of certainty if and how the nucleus of higher phonological units may be identified" (1976: 16) - e.g., of the expanded units 'phonological discourse' and 'conversation'. The 'double rank' phonological hierarchy is summarized as (ibid.):
Process         Basic Unit                Expansion
DIALOGUE        phonological exchange     conversation
PARAGRAPH       phonological paragraph    discourse
INTONATION      intonation unit           intonation group
RHYTHM          syllable                  rhythm group
ARTICULATION    phoneme                   cluster
[Diagram labels: 'structure', 'function']
However, in the absence of other - paradigmatic - evidence for the status of 'nuclei' as opposed to 'satellites', a solution such as that proposed above in 2.2.1.2., which distinguishes 'strong' vs. 'weak' constituents according to the intersection of paradigmatic and syntagmatic distinctions between unit-levels and between the 'domain' and 'locus' of phonological contrasts, is to be preferred. Syntagmatic evidence alone is not sufficient to establish the structural status of a constituent unit in a phonological hierarchy, even though there may appear to be regular phonetic (or 'phonological', within a polysystemic approach to structuring) correlates which mark a particular element 'culminatively' or 'delimitatively' (i.e. in terms of its boundaries) (Trubetzkoy 1958). Such syntagmatic phonetic markers are indeed phonologically relevant, as will be shown in the further analysis below, but as 'correlative' markers they must in the first instance be of secondary rather than primary phonological relevance in the specification of hierarchical units. However, their relevance increases by the degree to which such phonetic correlates regularly and systematically characterize or mark subdivisions of constituent units as 'strong' or 'weak', the more so when the incidence of 'strong', i.e. markedly 'positive', values of phonetic features - e.g., a marked degree of pitch change in terms of a major change in direction or height, a marked degree of stress, extensive segmental or syllabic lengthening etc. - within a syntagmatic unit coincides with the locus of paradigmatic phonological contrasts, as defined above. For instance, in English (and Dutch), phrasal, clausal or sentence 'accent' is marked by 'strong' values of pitch, stress, quantity and quality features occurring on the 'nucleus' or 'tonic' syllable; the phrase or clause constitutes the domain of syntagmatic marking, the syllable itself the locus.
At the same time, the nuclear or tonic syllable occupies the position within a syntagmatic unit at which a potential paradigmatic contrast may be manifested - that, for example, of 'contrastive stress' or accent. In the sentence set
Mary had to leave
Mary had to leave
Mary had to leave
the paradigmatic contrast is manifested via the presence (or positive value) of a phonological feature [accent] at a particular syntagmatic position as associated with a particular syntagmatic element (domain - phrase within clause (or clause within sentence): locus - syllable), as opposed to its absence (or 'negative' value) at the same position as associated with an identical element in a minimally contrastive pair or set of sentences. It is this type of 'systematic coincidence' of syntagmatically and paradigmatically established phonological distinctions which forms the structural basis of the type of phonological hierarchy developed in the present study. The 'coincidence' is systematic positionally, in that a locus within structure is shared, and 'realizationally', in that the phonetic features which realize the syntagmatic and paradigmatic distinctions manifest shared values of 'strength'. To this extent, such phonetic markers have not merely an 'identificational' role in signalling phonological units, but a structurally defining role in the hierarchy.
2.3. The syllable in a phonological hierarchy
2.3.1. The previous analysis has shown - in a preliminary fashion - the importance of the syllable as a locus of phonological distinctions with reference to the unit-levels phrase, clause (and sentence) of a phonological hierarchy. However, within such a hierarchy, the syllable has the status of a constituent unit-level itself - of the phonological word or word-part ('formative') - as well as positionally providing the locus of higher level phonological patterning. At the same time, then, as a unit-level of the hierarchy, the syllable will also have its own constituent units. It is the purpose of the present section to examine in outline the constituency of the syllable with a view to pointing up the ways in which syntagmatic and paradigmatic phonological structuring at the level of, and below the level of, the syllable defines its constituent parts. It is within the syllable and its constituents that higher-level ('suprasyllabic') and lower-level ('subsyllabic') patterns of phonological structuring converge.
2.3.2. The conventional phonological IC structure of the syllable is analysed in terms of 'onset' + 'rhyme', with 'rhyme' itself having the IC's 'peak' or 'nucleus' + 'coda' (cf., e.g., Kuryłowicz 1948; Hockett 1955; Fudge 1969). Thus for the English syllable "blind" [blaind], the structure would be posited:
            SYLLABLE
           /        \
       ONSET        RHYME
                   /     \
                PEAK     CODA
       [bl       ai       nd]
Note that this view of IC structure differentiates Pike's 'margins' into prenuclear onset and post-nuclear coda. It has been structurally motivated by the existence of phonotactic constraints on the occurrence of phonemes within the syllable, where, for example, within the rhyme there are far fewer constraints on the occurrence of phonemes in peak and coda positions taken together, than within onset and rhyme (or peak, coda) positions together (cf. Kuryłowicz 1948; Pike 1967; Fudge 1969). However, purely phonotactic considerations for establishing these IC divisions and the recognition of a layered structural hierarchy within the syllable are not sufficient on their own to validate onsets and rhymes (peaks + codas) as constituent categories of such a hierarchy. For this kind of reason, Selkirk (1978, 1980b) in her generative 'prosodic' model of a phonological hierarchy - to which much of the conception of the present description is indebted - claims that such syllable parts cannot be assigned, in her terms, 'prosodic category labels' and therefore are excluded as structural elements from the hierarchy itself. With Fudge (1969), one may conclude that not only "is a systematic element not defined but only characterized by its distribution" (1969: 257), but also further that nor are the higher level units within which the distribution may be stated 'defined' in this way, only 'characterized'. However, a number of linguists working within 'metrical phonology' (e.g. Selkirk 1978, 1980b; Kiparsky 1979) have posited a binary-branching division of the syllable into nodes, labelled in terms of relational phonological 'strength' as s vs. w, in which at the first level the rhyme-equivalent node is marked strong, the onset node weak and at the second level, the peak node strong and the coda weak, terminal nodes corresponding to segments respectively as strong or weak. Thus Kiparsky (1979) proposes the syllable 'template' (1979: 432):
But it is clear that this principle of syllable division and its conception of phonological 'strength' has a different analytical basis to the one proposed here. Sub-syllabic 'strength' is associated with the relative 'sonority' of the segments that fill the terminal nodes, which Kiparsky expresses by means of a 'universal core rule' which requires an optimal matching of such a 'syllable template' with the traditional 'sonority hierarchy' of segment types (stops, fricatives, nasals, l, r, w, y, u, i, o, e, a). The sonority value in effect constitutes a typological-phonetic characterizing feature of structural position, whereby syllable rhyme and, specifically, syllable peak contains the 'strongest' position as filled by the most 'sonorous' type of segment. Similarly, Selkirk's (1978) division of the syllable "flounce" as:
reflects strength in terms of sonority, the peak being more strong than the onset and within each constituent, the s has been assigned to the more sonorant element (1978: 5). This view contrasts sharply with the one adopted here, which claims that, by analogy with the structural specification of units of the suprasyllabic hierarchy (formative, word, phrase, clause) as outlined in 2.2.1.3. above, syllable-initial position, i.e. onset, is a phonologically 'strong' position in English in allowing a greater number of phonemic contrasts than rhyme (i.e. peak and/or coda); cf. Hooper's (1976) similar analysis of Spanish. Thus it would appear that at subsyllabic levels of the hierarchy, there is a discrepancy between the strength markings of constituent units in terms of strictly phonological (paradigmatic contrast) arguments and phonetic (here, syntagmatic typological) considerations in that these distinctions no longer positionally coincide. In the present analysis, consistent with the constituent division of higher
level structures, it is proposed to establish the IC's of syllables as equal order units, termed 'sets' (van Buuren 1978). Thus a syllable such as "blind" [blaind] would comprise the sets [bl], [ai] and [nd], where the initial set [bl] would be marked 'strong' as the domain of maximal (paradigmatic) phonological contrast at this level - the details of which will be presented in a later chapter. (Note that Hooper treats segments (i.e. phonemes) as themselves immediate constituents of syllables and thus conflates syllable-initial as strong position directly with the locus of contrasts between phonemes.) The initial set position is marked syntagmatically as 'strong', as opposed to the medial and final sets as 'weak', via phonetic feature values of 'quality', predominantly manner of articulation. Syllable-initial sets are characterized by 'full' manner values - in [bl] in the present example, [b] is produced as a firm stop, [l] with full laterality at its primary place of articulation (cf., for example, the weak laterality often associated with syllable-final [l]). Note that a strong set may be segmentally realized as zero in syllables beginning with a vowel. However, here one may detect a 'trace' of the strong element in the presence of a prevocalic syllable-initial glottal stop. Thus in terms of strength labelling the syllable would be designated as:
           SYLLABLE
         /    |    \
SET     s     w     w
       [bl    ai    nd]
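The division of a syllable into sets and the strength labelling shown in the diagram can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: the toy vowel inventory, the one-symbol-per-segment transcription and the names split_into_sets and label_sets are all introduced here for the example.

```python
# Assumption: a toy vowel inventory, one character per segment.
VOWELS = set("aeiouæ")


def split_into_sets(segments):
    """Split a syllable's segments into initial, medial and final sets.

    The medial set spans the first to the last vowel symbol; the initial
    and final sets may be empty (a zero realization).
    Raises ValueError if the syllable contains no vowel symbol.
    """
    vowel_positions = [i for i, seg in enumerate(segments) if seg in VOWELS]
    if not vowel_positions:
        raise ValueError("syllable has no vowel symbol")
    first, last = vowel_positions[0], vowel_positions[-1]
    return [segments[:first], segments[first:last + 1], segments[last + 1:]]


def label_sets(segments):
    """Mark the initial set 's' (strong), the medial and final sets 'w' (weak)."""
    sets = split_into_sets(segments)
    return list(zip(["s", "w", "w"], ["".join(s) for s in sets]))


print(label_sets(list("blaind")))
print(label_sets(list("and")))  # strong initial set realized as zero
```

For a vowel-initial syllable the strong set is returned as an empty string, which corresponds to the zero realization mentioned in the text; the glottal stop 'trace' is not modelled.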
Consistent with constituent division within the suprasyllabic hierarchy, there is no structural motivation for strict binary branching (cf. also, e.g., Schane 1979 for a non-binary 'metrical' theory of English stress patterning). At the next-lowest level, initial positions of sets will be marked strong as the domain of maximal (paradigmatic) phonological contrast at this level; a greater number of phonemic contrasts are potentially available at initial position relative to non-initial position(s) - again, however, the details of these contrasts will be presented below in a later chapter. Single segment realizations of these set positions, such as in the syllable 'ban' [bæn], must be regarded as constituting strong initial positions in the present analysis. Thus the segment level specification of [blaind] may be compared to that of [bæn] as:
[Diagram: segment-level strength specification of [blaind] and [bæn] under SYLLABLE]
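The segment-level labelling just described, in which the initial position of each set is strong and a single-segment set counts as a strong initial position, can be sketched as follows. The sets are supplied by hand and the name label_segments is hypothetical; the sketch only restates the rule given in the text and is not a reconstruction of the original diagram.

```python
def label_segments(sets):
    """Label the first segment of each set 's' and any later segment 'w'."""
    return [
        [("s" if position == 0 else "w", segment)
         for position, segment in enumerate(segments)]
        for segments in sets
    ]


# [blaind]: three two-segment sets
blaind = label_segments([["b", "l"], ["a", "i"], ["n", "d"]])
# [bæn]: three single-segment sets, each a strong initial position
ban = label_segments([["b"], ["æ"], ["n"]])

print(blaind)
print(ban)
```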
37
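The set-level and segment-level strength labelling just described may be sketched computationally. The following Python fragment is offered only as an illustration under stated assumptions: the function names `mark_initial_strong` and `label_syllable` and the list-of-lists encoding of sets are hypothetical conveniences, not part of the analysis itself, and 'ae' stands in for the vowel of 'ban'. It marks the initial set of a syllable and the initial segment position of each set as s, and all other positions as w.

```python
def mark_initial_strong(units):
    """Return one strength mark per unit: 's' for the initial unit, 'w' otherwise."""
    return ["s" if index == 0 else "w" for index in range(len(units))]


def label_syllable(sets):
    """Label a syllable given as a list of sets, each set a list of segments.

    Returns a list of (set_mark, [(segment, segment_mark), ...]) pairs.
    The encoding is an assumption made for illustration only.
    """
    set_marks = mark_initial_strong(sets)
    return [
        (set_mark, list(zip(segments, mark_initial_strong(segments))))
        for set_mark, segments in zip(set_marks, sets)
    ]


# Hypothetical encodings of the two syllables discussed in the text.
BLIND = [["b", "l"], ["a", "i"], ["n", "d"]]
BAN = [["b"], ["ae"], ["n"]]

BLIND_LABELLED = label_syllable(BLIND)
BAN_LABELLED = label_syllable(BAN)
```

On this encoding a single-segment set, such as each of the three sets of 'ban', automatically counts as a strong initial position, as the analysis requires.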
However, returning to the analysis of [blaind], one notes that the set-initial position is marked as syntagmatically 'strong' in all cases via phonetic feature values of 'quality', predominantly manner of articulation (as vowel height within the syllable 'peak', i.e. syllable-medial set position), and phonation, which may be conceived of as a feature timing relation between segments constituting positions within a set. Considering the realization of [b] in the set [bl] in [blaind], a close phonetic analysis of the segments within the set reveals that the phonation characteristics of [b], i.e. 'whisper' (cf. van Buuren 1980), 'extend' into [l], as may be represented in a narrow transcription [££11]. In the medial set [ai], the vowel height - low - of the initial segment [a] 'extends' into the [i] in the sense of 'constraining' its height realization to a mid, not an ('ideal') high value. In the final set [nd], in transcription [nndd], the initial segment [n] 'extends' its phonation type voice into [d]. The manner feature associated with the initial segments in sets will in each case manifest an 'ideal' value: of full closure of [b] in [bl], full opening of [a] in [ai] and full closure of [n] in [nd].
2.4. Primary and secondary phonological marking systems within the hierarchy
2.4.1. In the previous section, the phonological relevance of syntagmatic marking within a hierarchical model of phonological structure has been judged according to whether the point of marking within a unit-level coincides with a paradigmatically defined locus of contrast: i.e., when the syntagmatic same-level marker is located within the 'strong' constituent (of the next-highest level structure) as its domain of relevance and the same constituent constitutes the domain of paradigmatic distinctive contrasts. These contrasts are operative at a particular position within the constituent (domain), which in turn coincides with the exact position of realization of the syntagmatic marking. This position is termed the 'locus' (of strength). The distinctive contrasts involved are between same-level units which 'fill' the same-level constituent 'slots'. Within unit levels of the suprasyllabic hierarchy - formative, word, phrase, clause, (sentence) - it has been suggested that the syllable forms the locus of such marking, whereas within levels of the subsyllabic hierarchy - syllable, set, segment - domain and locus coincide at the set or segment. Further, the position in structure at which syntagmatic and paradigmatic marking 'systems' coincide differs in the two sections of the hierarchy: within the suprasyllabic section the position falls as a rule on and within a final constituent, within the subsyllabic section as a rule
on the initial constituent. Thus a hierarchical representation of the sentence "John's gone now to London and Mary's returning to Amsterdam" ["dʒɒnz gɒn 'nau tə 'lʌndən ənd 'mɛəriz rɪ'tɜːnɪŋ tu "æmstədæm] could be for the suprasyllabic levels as follows:
Unit-levels: SENTENCE, CLAUSE, PHRASE, WORD, FORMATIVE, SYLLABLE
[Tree diagram: the higher-level branches are not legible; the s/w markings directly above the syllables read:]
John's          s
gone now to     s s s
London and      s w s
Mary's          s w
returning       w s w
to Amsterdam    s s w w
Note that of course the choice of exactly which syllable within a polysyllabic formative in each case will bear the s marking is constrained by lexical word stress or accent. Note further that s markings do not represent any kind of 'stress' or 'rhythm' measurement, as is often assumed in 'metrical' approaches to hierarchical phonological representation, but only a relational phonological measure of structural 'strength' (or 'dominance'), which phonetically may correlate in some - to be specified - way with 'prominence'. As has been mentioned earlier, the phonetic features realizing 'strength' are not confined to those of stress and pitch, but - also within the suprasyllabic hierarchy - make reference to features of quality and quantity. A subsyllabic representation of the same sentence would be:
[Diagram: strength labelling at the SYLLABLE, SET and SEGMENT levels; only fragments of the s/w markings survive]
As far as the role of cover features in the lexical representation of segments is concerned, they may be employed in their 'broadest form' sufficient to establish all (segmental) contrasts of the particular language. In more recent work, however, Ladefoged (1980) has discarded the idea that phonological features correspond in any isomorphic way to the phonetic features of a language, which thus effectively releases the specification of phonological features, whether 'prime' or 'cover', from defining phonetic constraints and allows considerable abstractness in their description. However, while the general notion of superordinate phonological features of the type 'cover feature' may be well motivated for the establishment of 'natural' classes of segments on a formal and functional basis, and thus enable 'linguistically significant generalizations' to be made concerning the phonological patterning of such units, a feature Strength, as Vennemann and Ladefoged note, must not only be specified in terms of variation of values within more than one phonological feature (where additional features to 'stop' and 'fricative' must presumably
also be specified), but also the theoretical status of the feature itself is clearly of a different order to that of other cover features such as Labial, Consonantal, etc. The notion 'resistance to diachronic assimilation/weakening' taken over from Foley, which determines the relative strength of a segment within a class - and this only with reference to diachronic development - may in any conventional conception of generative phonology only be expressed as a kind of theoretical meta-statement in as far as it cannot be expressed within the type of markedness conditions developed in SPE (Chomsky and Halle 1968, ch. 9). The inability of SPE-phonology to provide a framework for the statement of such intuitively valid 'strength' markings has of course in itself led to the expression of those relations within alternative frameworks such as those provided by Hooper's natural generative phonology and Foley's 'true phonological theory'.
4.1.2.5. It seems that an important weakness of such phonological strength scales or hierarchies lies in their restriction in application to segment units. However, any incorporation of the notions of a strength hierarchy within phonological theory thus far considered, expressing some kind of 'resistance' to or 'potentiation' for phonological 'strengthening' or 'weakening' processes, i.e. for phonological change, whether as correlated with structural position or not, whether primarily diachronically (cf. Foley) or synchronically motivated (cf. Hooper), would seem in the first place to indeed have main reference to the establishment of phonological universals of segment (or segment type) occurrence. In this sense, as Brasington (1982) notes, statements of strength hierarchies are notionally equivalent to statements of universal 'markedness' relations obtaining between segments which take the form 'the occurrence of segment B in the language presupposes the occurrence of segment A', etc. Just as the set of feature-based SPE universal marking conventions may be criticized as being no more than a formalization of an ad hoc set of observations on potentially universally valid patterns of segment occurrence essentially peripheral to the theory of generative phonology, so the sets of strength hierarchies expressed in terms of segment-based phonological features in the alternative theories of phonology thus far reviewed equally express little more than a series of typological regularities. A structural motivation for the expression of strength hierarchies between units, in that relations of this type may be seen to condition the patterning of phonological contrasts (paradigmatically and syntagmatically) and their phonetic realization within a language, is, with the partial exception of natural generative phonology, clearly lacking. The lack of structural insights offered by such approaches largely lies of course in the restriction of unit-bound strength measures to the segment, where feature hierarchies and segment (type) hierarchies are often confusingly coalesced, whereas at the same time, position-based strength measures are restricted to the syllable. Correlating strength measures with inherent properties of individual segments (or segment classes) raises the question of the status of the segmental properties themselves. Are the properties of segments to be expressed as combinations of binary marked 'distinctive' or 'prime' features (cf. Vennemann and Ladefoged) or as a total of numerical scalar values along different parameters of strength - e.g. of 'relative resonance', 'binding strength', 'vowel strength' (cf. Foley) - or as some kind of (numerically valued?) quasi-phonetic features (cf. Hooper)? That is, respectively, as relatively surface phonological, deep phonological or phonetic features? As Foley notes, any abstract statement of segment strength must have reference to a number of different parameter scales. In a more 'concrete' interpretation of phonetic attributes of segmental 'strength', Lass and Anderson (1975), for example, note that at least two major parameter scales are necessary for the explication of diachronic change - one defined as 'resistance to airflow', the other as 'sonorancy' (or 'opening', i.e. 'onset of periodic voicing'). Above all, even granting that scales of segment strength have to be separately established for each language (according to a broad set of universal principles), any attempt to relate segmental units to positional occurrence will have to come to terms with the fact that the same segment (type) as defined in a phonological or even phonetic feature specification will in one case evidence a 'weakening' process and in another a 'strengthening' process.
For example, word-finally, generally considered a 'weak' position, /b/ may become /p/, whereas intervocalically, also considered a 'weak' environment, /p/ may become /b/. At the same time, however, as Lass and Anderson point out, a voiced stop is weaker in airflow impedance and stronger in sonorancy, a voiceless stop stronger in airflow impedance and weaker in sonorancy. Furthermore, aspiration, e.g. of initial voiceless stops, may be considered diachronically as evidence of a weakening process (cf. Lass and Anderson 1975), synchronically, however, as a strengthening process - within the same language (cf. Sommerstein 1977). Considerations such as these would seem to point to the doubtfulness of postulating any (universal or not) set of concretely interpretable strength scales for individual segments or segment types. Either one defines, as Foley
does, the scalar parameters on a highly abstract (some would claim, thereby, vacuous) basis and relates them to phonological segment patterning per se, or one makes reference to a (universal or not) hierarchy of phonological or phonetic features and their incidence and coincidence with regard to particular segments in particular structural positions. However, problematic in all these approaches remains the (largely unaddressed) relation between phonological and phonetic feature specifications of strength. In Hooper's theory, conditions of 'naturalness' which apply to feature specifications as well as rules require that the same set of 'natural', i.e. phonetically motivated, features be employed "to capture the motivation for natural processes" (1976: 135). However, apart from her adoption of traditional segment labels such as 'nasal', 'liquid', 'voiceless stop', etc., to state the relative strength of segment types, Hooper takes no further stand either on the exact nature of phonetic or phonological features or on the specification of degree of strength within feature terms. In an assessment of the potential of feature systems to reflect the kinds of degrees of segment strength proposed by Hooper, Brakel (1979) concludes that via a procedure of counting and averaging positive (i.e., +) values per segment (type), a purely articulatorily based feature system correlates far more closely with Hooper's strength values than the articulatory-acoustic SPE system. Foley in any case sees no necessity for a phonological theory to be held responsible for the specification of phonetic features or phonetic (or 'physical' in his terms) correlates of phonological features. Hence the kind of view expressed by Brasington (1982) and Sommerstein (1977) that at some stage a specification of the 'physical' correlates of strength values (with regard to position or segment) will be provided by phonetic research.
There is, however, a crucial distinction between 'physical' measurement, whether pertaining to articulatory, (aerodynamic), acoustic or perceptual phases of speech transmission, and phonetic specification, and a definition of the latter has been considered to be within the brief of the great majority of phonological models of description from Baudouin de Courtenay onwards. This viewpoint forms of course the cornerstone of current 'natural' (or 'phonetically motivated') phonological descriptions. Moreover, it has been noted above that even purportedly abstract or 'relational' forms of phonological analysis need reference to phonetically motivated primes of description, if only in their feature systems (pace Foley, and also pace Fudge 1967). However, 'physical' specification and 'phonetic' specification are often lightly equated in phonological theory. The
contrast can either be expressed via a clear distinction in levels of representation within a phonological derivation between the 'systematic phonetic' and the 'physical phonetic' levels (present in Chomsky and Halle 1968, but most clearly articulated in Ladefoged 1972) or by an alternative conception of the ontological status of phonological derivation whereby the derivation does not 'lead to' or 'end in' the physical realization of phonological elements in the same (competence) description, but rather merely provides the substantive realization of elements at various levels of abstraction which form the input to the speech-processing mechanism as a part of performance. Physical measurements of phonological or phonetic elements represent abstractions from the same empirical domain that constitutes the data base for phonological and phonetic descriptions, but, crucially, within an entirely independent behavioural framework of interpretation. While it is a desirable constraint on the 'naturalness' of linguistic theories that such phonological and phonetic properties established within these theories correlate in some way with the physically established measurement of such properties, the relationship is one of possible correlation, not of direct realization of such properties. Strictly, the realization of phonological (and phonetic) elements within a physically relevant domain is a role of the processor, not the grammar, in a theory of performance.
4.1.2.6. A genuine alternative to the notions of 'strength/weakness' and 'strengthening/weakening' just discussed is offered by Stampe (1973; also Donegan and Stampe 1978, 1979) in his theory of 'natural phonology'. Here 'strengthening' and 'weakening' are interpreted as 'natural' phonological processes which form part of the actualization of speech, constraining the relationship between phonological representation (corresponding to the phonological intention of the speaker) and phonetic representation (i.e.
the sound forms actually produced) as "phonetic substitutions which limit articulation" (Wojcik 1981: 635). A theory of natural phonology aims to describe the characteristics of language in use, viewing language as "a natural reflection of the needs, capacities, world of its users rather than a merely conventional institution" (Donegan and Stampe 1979: 127). Phonological processes of strengthening or 'fortition' and weakening or "lenition" are a reflection of the general need to communicate with language and as such can be attributed both articulatory (being produced) and perceptual (being understood) teleologies. Strengthening and weakening processes are phonological, not exclusively phonetic processes, being "mental operations performed on behalf of physical systems
involved in speech perception and production" (Stampe 1973: 9), i.e. 'mental in occurrence, physical in teleology'. They are 'mental' or phonological, not purely 'peripheral' or phonetic, since their origin within the speech producing mechanism is claimed to be in the central nervous system, not in the articulatory organs. Further evidence for their mental status is provided by the observation that such 'natural' processes are suppressible in the acquisition of an ML or the learning of an FL. In speech production, fortition (or 'centrifugal' or 'paradigmatic') processes contribute to clarity of perception by emphasizing contiguous differences between segments; lenition (or 'centripetal' or 'syntagmatic') processes contribute to ease of articulation by reducing the degree of articulatory movement between contiguous segments. The former, examples of which are dissimilations, diphthongizations, syllabifications and epentheses, "intensify the salient features of individual segments and/or their contrast with adjacent segments" (Donegan and Stampe 1979: 142), whereas the latter, examples of which are assimilations, monophthongizations, desyllabifications, reductions and deletions, make segments and sequences of segments easier to pronounce "by decreasing the articulatory 'distance' between features of the segment itself or its adjacent segments" (ibid.: 142). Fortition and lenition processes have opposite effects: e.g., a fortition process may syllabify the [ɹ] of [pɹeid] 'prayed' to [pɹ̩eid] in emphatic speech, whereas a lenition process may desyllabify the [ɹ̩] of [pɹ̩eid] 'parade' to [pɹeid] in casual speech. Fortition processes, which always 'apply' before lenition processes in speech production, are observed to be particularly favoured in so-called 'strong' positions, e.g. vowels in syllable peaks, consonantal syllabic onsets and 'segments in positions of prosodic prominence and duration'. Lenition processes 'apply' particularly in 'weak' positions, e.g.
to 'consonants in 'blocked' and syllable-final position, to short segments, unstressed vowels, etc.'. All processes apply after the phonological rules of the language have been specified. The latter are crucially distinct from the former in that they have main reference to morphophonological alternants: thus the 'Velar Softening Rule' of English, which expresses the alternation between such pairs as "electric/electricity", is an example of a rule as opposed to a process. Similarly, syllable-final 'devoicing' or 'Auslautverhärtung' in German is an example of a (morphophonologically conditioned) rule. Rules are said to be learnt, obligatory (as opposed to processes, which may optionally apply according to speech style), and to lack synchronic motivation (see Donegan and Stampe 1979: 144-145 for a detailed comparison of rules and processes). In this sense, natural
phonology's clear separation of processes and rules has much in common with a natural generative phonology approach, which separates out 'true' P-rules on the one hand and 'MP-(Morphophonemic) Rules' and 'via-rules' (i.e. phonological rules expressing lexical alternations) on the other (Hooper 1976). Stampe's natural processes may take on the status of rules over time, i.e. become 'phonologized' - a case in point could be that of German final devoicing.
Many of the basic assumptions and findings of natural phonology seem of
course to make intuitively good sense and correlate well on important points with aspects of the model of phonological description being developed in the present study. In particular, the position and form of fortition and lenition processes described by Stampe tallies closely with the observations offered in previous chapters on the occurrence of strongly and weakly marked element positions in the suprasegmental hierarchy. However, one wonders if it is strictly necessary or even desirable to re-define the ontological status of a phonological theory (as one which reflects 'real' speech production) to express the kind of description of sound substitutions - processes and rules - that Stampe suggests. While accepting the descriptively desirable motivation for clearly distinguishing between (morpho-)phonological alternations (i.e. 'rules') and phonetically oriented phonological alternations (i.e. 'processes'), one doubts the validity of attributing the former exclusively to 'structural' or 'system-internal' constraints and the latter exclusively to realizational or 'system-external' constraints deriving directly from the behavioural context in which language is 'realized'. In Stampe's view, processes, which have a phonetic (articulatory or perceptual) teleology, depend for their actual realization on, for example, constraints of speech style ('casual', for instance). In the present study, by contrast, 'processes' of strengthening and weakening are interpreted as realizational effects of properties of the phonological structure itself, specifically of the primary and secondary structural marking systems - culminative and delimitative. The qualitative and quantitative actualization of such effects as constrained by linguistic structure is of course ultimately dependent upon the speech processing mechanism (i.e. the processor in performance), which in turn is sensitive to the external (i.e.
'extra-linguistic') conditions within which the speech event takes place and which define, for example, the level of speech style employed, etc. Stampe does not attach any further theoretical significance to the coincidence of process types with particular segmental positions. However, he claims that at some stage in a phonological description (phonological) units such as words, phrases and sentences are mapped onto 'prosodic structures', the 'onsets' of which are 'stronger' than their 'offsets'. Such prosodic
structures are organized in units of syllables and 'accent measures' (which seem to correspond notionally to 'feet'), which in turn reflect 'rudimentary patterns of rhythm and intonation' (Donegan and Stampe 1979: 142). Significantly, however, it is noted that "insofar as syllabicity, stress, length, tone and phrasing are not given in the linguistic matter, they are determined by prosodic mapping" [my italics] (Donegan and Stampe 1979: ibid.). A third set of phonological substitutions, 'prosodic processes', is responsible for this mapping. At the level of phonetic realization, then, phonological structure interacts with some kind of basic accentual structure, perhaps the product of a type of 'rhythm generator' (cf., e.g., Fromkin 1968). Of course, this conception of the mediation of linguistic structure in realization by an independent rhythm or accent system associated with some notion of the basic 'carrier' rhythms of speech has been proposed elsewhere (cf., e.g., Martin 1972). However, a structurally rich enough specification of 'the linguistic matter', in which not only 'prosodic' but also 'rhythmic' principles of phonological and phonetic organization are incorporated in a statement of sound patterning and reflected in a phonological derivation as intersecting types and levels of representation, is, as will hopefully be demonstrated in the following chapter, capable of representing the linguistic determinants of the temporal and accentual characteristics of speech. It is the task of a phonological theory to specify the elements of a 'rhythmic' organization or patterning, which qua their structural properties, e.g. their grouping into units of 'clitics' and 'heads' (cf. Knowles 1974) and their feature specifications, lay the basis for their interpretation in temporal terms. The intersection of such a defined 'rhythmic structure' with a 'prosodic (or suprasegmental) structure' (such as has been developed thus far in the present study), i.e.
as units specified in terms of clitics and heads coinciding with units of a prosodic structure with their strength markings on units such as segments, syllables, phrases, etc., provides the basis for the 'accentual' interpretation of speech. As Allen (1975) concludes in an extensive discussion of linguistic rhythm, "rhythmic universals act from within phonology rather than as external constraints on performance" (1975: 83).
4.2. Strength and metrical phonology
4.2.1. A major extension of the theoretical interpretation of strength relations between units and/or positions of a phonological description is provided by the recently developed 'metrical' or 'prosodic phonology' within generative
theory. The new theory has emerged out of a relational account of English lexical and phrasal stress (cf. Liberman and Prince 1977), which aimed to replace the SPE view of stress as a numerically specifiable segment-based feature assigned by Stress Cycle rules in a phonological derivation (Chomsky and Halle 1968). Instead, in metrical accounts, a limited set of (non-cyclic) rules or conditions determines the relative stress or 'prominence' marking of terminal constituents, i.e. syllables, within a word or phrase on the basis of a hierarchically ordered prosodic or 'metrical' phonological structure characterized as a binary (right or left) branching tree with sister nodes labelled s ('stronger than') and w ('weaker than'). The 'rhythm' of a linguistic unit is 'read off' a 'metrical grid' which captures the temporal regularity of the nodes or constituents thus specified. Thus, Liberman and Prince (1977) provide the following algorithm for assigning metrical structure: a stress rule assigns the feature [stress] to vowels from left to right in a string; then any sequence of a single [+ stress] vowel followed by a maximal sequence of [- stress] vowels is associated in a left-branching tree labelled s/w, called a 'foot'; the feet are then joined in a right-branching structure labelled w/s. The three steps may be represented thus for the stress assignment of the 'word' "hamamelidanthemum" (van der Hulst and Smith 1982):
i. stress assignment
      +    -    +    -    +    -    -
      ha   ma   me   li   dan  the  mum
ii. foot formation
      [s  w]    [s  w]    [[s  w]s  w]
      ha  ma    me  li    dan  the  mum
iii. word-tree formation
      [ [s  w]w   [ [s  w]w   [[s  w]s  w]s ]s ]
        ha  ma      me  li     dan  the  mum
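The three steps just illustrated may be sketched as a short Python procedure. This is a minimal sketch under stated assumptions, not the original authors' own formalization: the syllable division ha-ma-me-li-dan-the-mum, the tuple encoding of trees and all function names are hypothetical, and the stress values of step i are simply supplied as input rather than computed by a stress rule. The final helper picks out the terminal whose own label and all of whose dominating nodes are s.

```python
def make_feet(syllables, stressed):
    """Step ii (grouping): a stressed syllable opens a foot, which also
    takes in the unstressed syllables that follow it."""
    feet = []
    for syllable, is_stressed in zip(syllables, stressed):
        if is_stressed or not feet:
            # Assumption: a word-initial unstressed syllable also opens a foot.
            feet.append([syllable])
        else:
            feet[-1].append(syllable)
    return feet


def foot_tree(foot):
    """Step ii (structure): left-branching tree, sisters labelled s/w."""
    tree = foot[0]
    for syllable in foot[1:]:
        tree = ("sw", tree, syllable)
    return tree


def word_tree(foot_trees):
    """Step iii: join the feet in a right-branching tree, sisters labelled w/s."""
    tree = foot_trees[-1]
    for left in reversed(foot_trees[:-1]):
        tree = ("ws", left, tree)
    return tree


def terminals(tree, path=""):
    """Yield (syllable, labels met on the way down from the root,
    the syllable's own label last)."""
    if isinstance(tree, str):
        yield tree, path
    else:
        labels, left, right = tree
        yield from terminals(left, path + labels[0])
        yield from terminals(right, path + labels[1])


def main_stress(tree):
    """Return the terminal reached through s-labelled nodes only."""
    for syllable, path in terminals(tree):
        if set(path) <= {"s"}:
            return syllable
    return None


SYLLABLES = ["ha", "ma", "me", "li", "dan", "the", "mum"]  # assumed division
STRESSED = [True, False, True, False, True, False, False]  # step i, given as input

FEET = make_feet(SYLLABLES, STRESSED)
TREE = word_tree([foot_tree(foot) for foot in FEET])
```

For the example word this groups the syllables into the feet (ha ma), (me li) and (dan the mum), and `main_stress(TREE)` picks out 'dan', the third syllable from the end.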
'Main stress', as expressed by Chomsky and Halle's (1968) Main Stress Rule, here falls on the vowel that is exhaustively dominated by the nodes labelled s; i.e. in the example given, pre-penultimate 'a'. Descriptive equivalence to the SPE minimal specification of [stress], where [1 stress] designates the vowel manifesting 'main stress', [2 stress] the strongest non-primary stress, etc., may be achieved by counting the number of nodes dominating the lowest node labelled w (if any) and adding 1. Further developments within the theory have led to the postulation of a
number of competing but related conceptions of phonological structure (cf., e.g., Kiparsky 1979; Halle and Vergnaud 1980; Giegerich 1981, 1983; Prince 1983; and here especially Selkirk 1980b, 1982b) as being hierarchically organized in its derivation and thus manifesting a greater structural 'depth' than the original SPE version, which was characterized as a string of segments within phonological phrases complete with boundary markers. It is claimed that such a structural derivation largely obviates the need for phonological boundaries at a phonological level of representation. Also, in more recent work it has been shown that both an SPE-type feature [stress] as well as stress assignment rules employing it may also be rendered superfluous in the theory (cf., e.g., Selkirk 1982a).
4.2.2.
4.2.2.1. Of particular relevance for the kind of phonological framework being proposed in the present study, however, is the version of metrical theory proposed by Selkirk (esp. 1978, 1980b, 1982b), which accredits phonological constituent status to the majority of structural strength nodes in a phonological representation, thus effectively defining the form of a constituent hierarchy in terms of the strength relations obtaining within and between elements of different levels. It is to a critical discussion of Selkirk's 'prosodic phonology' that the present section is devoted.
4.2.2.2. Selkirk finds it necessary to posit the existence of a set of 'prosodic categories' in the theory, "which are isolable subunits of the prosodic structure, in associating 'labels' corresponding to these categories with the appropriate nodes in the tree, and in allowing for the possibility that nodes with different prosodic category labels may be interpreted differently by processes of the phonology" (1980b: 565).
The labelling of such categories and their further interpretation as constituents within hierarchical phonological structure is motivated by a 'systematic convergence' analogous to the labelling of, e.g., NP in syntactic representation and the status of NP as a constituent of syntactic structure: i.e., "Those units of phonological description which are motivated on distributional, i.e. phonotactic, grounds (that is, those which have their own principles of construction and prominence) are just those which play a role in processes, accentual or non-accentual in character, which look at or apply in terms of the phonological representation" (Selkirk 1980b: 576). Phonological units thus motivated and which represent constituents of
phonological status are the syllable (σ), stress foot (Σ), prosodic word (ω), phonological phrase (φ), intonational phrase (I) and utterance (U). An example of a syllable-level representation (1) for "flounce" would thus be:
(1) [Tree diagram of the syllable σ; the s/w markings directly above the segments read:]
      w  s  s  w  s  w
      f  l  a  u  n  s
of a prosodic word-level representation (2) for the word "irrespective":
(2) [Tree diagram: ω dominating s/w-labelled nodes and syllables σ over the segments of "irrespective"]
and (3) of an utterance-level representation:
(3) [Tree diagram: U dominating the utterance] The absent-minded professor has been avidly reading the latest biography of Marcel Proust
These examples show that not all structural nodes have prosodic category labels, i.e. represent prosodic constituents - cf. the unlabelled s nodes below word level in (2) and intonational phrase level in (3), and the complete lack of constituent labelling below the syllable in (1). In other words, certain nodes only exist by virtue of the branching structure of higher-level category nodes and do not have their own phonotactic motivation (i.e. own 'principles of construction and prominence') nor constitute independent units which phonological rules and processes make reference to. Possible constituents of the syllable such as sets (as in the present study) or onsets and rhymes, and within the latter, peaks and codas, are not freely distributable in the way, for example, stress feet are within the prosodic word or syllables within the stress foot. Although phonotactic restrictions on segments may be stated with regard to occurrence within sub-syllabic units such as the onset or rhyme (or sets) and restrictions on the occurrence of these units within syllables, these
structural conditions on the composition of syllables (or onsets and rhymes) are stated in Selkirk's theory as part of a set of 'well-formedness conditions', since it is claimed that the logical distinction between the phonological representation of, for example, a syllable and the notion 'possible syllable of a language' must be separately expressed within the theory. The latter is thus formulated as a so-called 'template', which for the syllable has the form:
[Syllable template (tree diagram); the segment slots in linear order read:]
      [-syll] ([+son])   [+syll] ([+son])   [+cons] ([-son])
The 'template' encodes the characteristics of syllable structure by i) representing the composition of the syllable in terms of segment types identified by the major class features [±syllabic], [±sonorant] and [±consonantal]; ii) representing the order of segment types within the syllable; iii) defining the structural relations between segment types in constituency terms; and iv) expressing the optionality of segments or groups of segments (i.e. the present 'sets') within the syllable. However, for each language, additional 'collocational restrictions' must be stated, which further limit the combinatory occurrence of segments and segment groups. These might take the form of such a restriction stated for English by Fudge (1969) as 'if a second position in onset is /w/, then first position is not [+labial]'. At the same time, it is a necessary condition for the well-formedness of a phonological representation that the branching structure of the representation itself is non-distinct from the branching structure of the template, as can be demonstrated in the above phonological representation of the syllable [flauns]. (For more recent proposals on syllable representations and templates, linking them to a 'sonority index' for segmental placeholders, see Selkirk 1982a.) Each prosodic category itself is defined in terms of a triplet of conditions: the 'principle of construction', 'principle of prominence' and the 'syntactic domain' within which such conditions are met. The 'principles of construction' specify the nature of the structure internal to the category, i.e. its constituency in terms of other categories, direction of branching, etc.; the 'principles of prominence' specify the s/w relations of the subtrees constituting the category; the 'syntactic domain' expresses the mode of relationship or 'mapping' between phonological categories and categories of syntactic structure (where, for example, syllables are related to (morpho-)syntactic stems plus
affixes, the phonological phrase to VP, NP, etc.), demonstrating that the phonological representation is of course by no means isomorphic with syntactic representation. Thus, instead of introducing a system of phonological Readjustment Rules to 'convert' or 'prepare' the surface syntactic representation for the phonological representation as in the SPE model, the links between syntactic categories such as phrases, formatives ('stems' and 'affixes') and sentences and phonological categories are directly expressed in the specification of the phonological categories themselves. This mapping relationship is captured in the formula:
$M_{s/p} = \{K_i / K_i = (C_{kl}, P_{kl}, D_{kl})\}$
where M stands for 'mapping relationship', s for syntax, p for phonology, K for category, and C, P, D for construction principles, prominence principles and syntactic domains, respectively. (However, for an alternative view of the relationship of syntactic-phonological mapping, cf. Giegerich 1981, 1983). These principles for the phonological phrase (φ), for example, take the form: Constituency (conflating 'construction' and 'syntactic domain') - (i) An item which is the specifier of a syntactic phrase joins with the head of the phrase, (ii) An item belonging to a 'non-lexical' category such as Det, Prep, Comp, Verb_aux, Conjunction, joins with its sister constituent; Prominence - "Given the two sister nodes of prosodic structure [N1, N2] within φ, N2 is s (and N1 hence weak)" (Selkirk 1978: 15-18). Just as in the SPE model underlying phonological representation is specified by the Phonological Readjustment Rules (and phonological redundancy rules), so in this model of 'prosodic phonology' the underlying phonological representation is effected by this triplet of conditions or principles. This means that the hierarchically ordered phonological constituency structure is defined at the outset of the phonological component of the grammar, i.e. is equally specified in phonological underlying representation and is not further specified in the course of phonological derivation. This amounts to claiming that phonological representation is a priori hierarchically structured.
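The constituency and prominence principles quoted above for the phonological phrase can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that are not part of Selkirk's formulation: the input is a flat list of words rather than a syntactic tree, a non-lexical item is taken to join the next lexical item to its right, and the binary statement 'N2 is s' is generalized to 'the rightmost node is s'. The names `Word`, `build_phrases` and `label_prominence` are hypothetical.

```python
from dataclasses import dataclass

# Assumption: the non-lexical categories named in the text, as plain labels.
NON_LEXICAL = {"Det", "Prep", "Comp", "Aux", "Conj"}


@dataclass
class Word:
    form: str
    category: str  # illustrative labels such as "Det", "N", "V"


def build_phrases(words):
    """Group each run of non-lexical items with the next lexical item."""
    phrases, current = [], []
    for word in words:
        current.append(word)
        if word.category not in NON_LEXICAL:
            phrases.append(current)
            current = []
    if current:  # trailing non-lexical items form a phrase of their own
        phrases.append(current)
    return phrases


def label_prominence(phrase):
    """Label the rightmost node s and every other node w."""
    last = len(phrase) - 1
    return [(w.form, "s" if i == last else "w") for i, w in enumerate(phrase)]
```

On this sketch, "the man in the garden" yields two phrases, "the man" and "in the garden", each with its final word labelled s.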
However, another perhaps even more radical break with traditional generative phonology is Selkirk's contention that since patterns of lexical stress in English can be accounted for in terms of the structure of the below-word phonological representation, with particular reference to the unit (stress-)foot, the phonological representation of this level and lower - i.e., including the unit levels word, foot and syllable - must be anchored in the lexicon (not in the phonology itself) and introduced into the phonological representation by lexical insertion, while
the above-word level representation - i.e., including the unit levels phonological phrase, intonational phrase and utterance - forms part of the phonological representation itself.
4.2.2.3. The kinds of rules which operate on the phonological representation thus specified are classified by Selkirk into four categories: i) (certain of the) principles of prominence which assign s/w relations in function of the configurations of prosodic structure; ii) prosodic structure transformations whose explicit function is to modify prosodic structure relations (but not the segmental compositions) of the representation; iii) rules of segmental phonology whose characteristic domains of application are defined in terms of the constituents of prosodic structure; and iv) rules of phonetic interpretation (cf. Selkirk 1980b:576). The 'rules' under i) 'apply' of course strictly at the same time as the hierarchical representation itself is established and take the generalized form 'Given a pair of nodes [N1 N2], N2 is s iff it branches' - this is equivalent to the original Lexical Category Prominence Rule of Lieberman and Prince (1977) for the determination of English word stress. Type ii) rules also operate on underlying representation to make minor category label and/or s/w marking adjustments to the internal composition of prosodic units where 'necessary'. An example of such a prosodic structure transformation would be the so-called Iambic Reversal Rule (Lieberman and Prince) or Rhythm Rule (Selkirk), which adjusts the s/w marking on word level constituents of phonological phrases, as evidenced, for example, in the stress patterns in English lexical compounds. Thus the representation of "Marcel Proust" as:
[Tree diagram: φ dominating a weak ω "Marcel", whose constituents are labelled w s, and a strong ω "Proust"]
which generates a stress pattern "Mar'cel 'Proust", is adjusted by the Iambic Reversal Rule to:
[Tree diagram: φ dominating a weak ω "Marcel", whose constituents are now labelled s w, and a strong ω "Proust"]
i.e., to "'Marcel 'Proust". The rule specifies that when a weak left sister has the internal composition w s, the internal prominence relations are changed to s w provided that the right sister is strong. Note that it only applies within the domain of the prosodic constituent phonological phrase (φ). Another transformational rule adjusting prosodic relations is that of Resyllabification operative within the domain stress foot (Σ), which 'moves' the onset consonant of a weak syllable in the foot to the coda position of the preceding syllable, accounting for the 'ambisyllabic' (Kahn 1976) or 'interlude' (Hockett 1955) nature of, for example, intervocalic 'flapped t' or unaspirated voiceless stops in (American) English. Thus the representation of the word 'writer' is changed from:
[Tree diagram: word (ω) with two syllables (σ), /t/ standing as onset of the second, weak syllable]
to:
[Tree diagram: word (ω) with two syllables (σ), /t/ standing as coda of the first syllable]
i.e., the 'ambisyllabic' /t/ is removed from syllable-initial position to syllable-final position and may ultimately be interpreted phonetically as [ɾ] in [ɹaiɾə]. This 'movement' of course presupposes some kind of original preferred basic CV syllable structure, which is indeed seen by Selkirk as a universal structural principle underlying the 'Basic Syllable Conditions' comprising the syllable template and collocation restrictions. This universal principle is by no means unknown in either phonological or phonetic analysis of the syllable, and Selkirk borrows her 'Maximal Syllable Onset Principle' from Pulgram (1970). It states: 'In the syllable structure of an utterance, the onsets of syllables are maximized, in conformance with the principles of basic syllable composition of the language' (Selkirk 1982b). Further, Selkirk notes that Resyllabifications are true transformations not only in that they are structure-preserving, but also in that they are optional as, for example, in lento speech. A logical ordering of stress assignment and (re-)syllabification processes must then be in the sequence 1) 'basic syllable composition', 2) stress, 3) resyllabification (Selkirk 1982b). Type iii) rules, those of segmental phonology, have as their domains of application the various constituents of prosodic structure, i.e. a rule may be limited in its application to strings of segments that are dominated by
a particular prosodic category. Selkirk (1980a) shows that the domains of application of a large number of Sandhi rules in Sanskrit have reference to the prosodic units utterance, word and foot. She further distinguishes between i) rules which apply to segments contained in the string of a domain regardless of any further sub-division of the domain corresponding to smaller prosodic constituents, i.e. 'domain-span rules'; ii) rules which apply on a particular domain, but whose operation is dependent on the segments belonging to particular sub-domains of the major domain, i.e. 'domain juncture rules'; and iii) rules which apply to segments occurring at one or the other end of a domain, i.e. 'domain limit rules' (Selkirk 1980a). Instances of, for example, foot-span rules in Sanskrit would be those of Ruki and Nati Sandhi in Vedic Sanskrit; of word-juncture within utterance domain rules those of final voicing, etc.; of utterance limit rules, that of Visarga or 'breathing' at pause. An example of a domain-limit rule in English would be the SPE rule tensing non-low vowels word-finally, as in "potato" and "spaghetti", i.e. $[+\text{vocalic}, -\text{low}] \rightarrow [+\text{tense}] \,/\, (_{\omega} \ldots \_\_\,)_{\omega}$; of a domain-juncture rule in (American) English that of 't flapping', which applies at a word-juncture internally within the intonational phrase (I) - cf. 'flapped t' in (1), no 'flapped t' in (2):
[Tree diagrams (1) and (2)]
(The tree labelling adopted here is in accordance with the conception of a phonological hierarchy proposed in the present approach.) An example of a domain-span rule would be that which is responsible for 'intrusive' or 'linking r' in British English, which applies within the domain utterance (U). (For the present examples, cf. Nespor and Vogel 1982). Finally, type iv) rules, those of phonetic interpretation, also make reference to phonological representation as specified by the labelled prosodic categories. For example, those determining the relative duration of units of the representation, i.e. 'timing rules', are claimed to make reference to syllable (σ) and intonational phrase
(I) domains; those determining rhythm - in the sense of stress-timing in English - make reference to (the regular occurrence of prominent syllables within) the domain stress foot.
4.2.2.4. This extended sketch of elements of a particular version of metrical phonology shows that there is much in this theoretical conception of a suprasegmentally motivated phonological framework that is notionally, if not in parts descriptively, equivalent to the structural model being developed in the present study. While it is not within the purview of the present discussion to offer an exhaustive detailed critique of Selkirk's prosodic phonology, a critical review of a number of its central principles from the standpoint of the present model of description may contribute to defining certain of the theoretical means and ends of the latter more clearly. Firstly, the a priori assumption of a hierarchically ordered phonological structure is shared. However, whereas in metrical phonology, but also in tagmemic (Pike, Tench) and systemic (Halliday) phonological approaches, the hierarchical ordering of a phonological description stands in some 'mapping' or 'correlative' relation with an equally hierarchically ordered syntactic description which partly motivates the postulation of a phonological hierarchy in the first place, in the present study no relations of this type are postulated. The hierarchical form of phonological structure is, though, in concord with these theories, present as the representation and is not arrived at via phonological derivation of any kind. The hierarchy itself is conceived of fundamentally as a relationship of constituency obtaining between unit-levels of a descending order. The representation of constituency is in terms of structural dominance relations between and within unit-levels of the system, which in metrical phonology and the present approach is expressed via the notion of structural 'strength'.
However, whereas in metrical phonology, suprasyllabic strength relations are primarily motivated by the assignment of relative degrees of stress on terminal constituents, i.e. syllables, and subsyllabically by the correlation of 'strength' with a 'sonority index' of constituent segment types (cf. Kiparsky 1979; Selkirk 1982a), here strength relations at all levels of the hierarchy are motivated 'primarily' by the systematic positional coincidence of paradigmatically defined distinctive (feature) relations and syntagmatically defined culminative and delimitative (feature) relations obtaining between and within the constituent unit-levels. The relative phonological strength status expressed by such feature values is a product of their unit-positional fusion and realized in phonetic interpretation. In
metrical phonology, by comparison, s and w node labelling is stated by the well-formedness conditions of 'principles of prominence', node branching by the 'principles of construction', i.e. according to a stress- or sonority-motivated 'a priori' geometric pattern. An extra 'grid structure' (Lieberman and Prince 1977; Hayes 1982; Prince 1983) is posited by certain metricalists - although not developed by Selkirk in the work here reviewed - independent of the hierarchical 'prosodic structure' as a necessary structural entity for the correlation of the relative s/w values assigned to nodes with a rhythmic interpretation of their values. This type of intersection of a 'prosodic' hierarchy and a rhythmic representation is in the present study developed in Chapter Five with reference to a parallel 'rhythmic' phonological hierarchy. Note that not all structural nodes in Selkirk's metrical phonological representation are assigned 'category labels', i.e. are assigned constituent status. However, this is perhaps difficult to reconcile with the kind of strictly 'monosystemic' view of phonological patterning adhered to in a metrical 'prosodic' representation, especially where a necessary condition for the well-formedness of the representation states that branching structure be non-distinct from the 'template'. One of the functions a syllable template has, for example, is the specification of structural relations obtaining between segment types in constituency. Such constituency-phonotactic relations obtaining between segments below syllable level, however, are defined at the same time via SPE-type major class feature values permissible at (unlabelled) onset and rhyme (peak and coda) structural nodes. Clearly, this strikes one as a structural anomaly: unlabelled nodes which express a constituency-phonotactic relation both at levels below (i.e. segment) and above (i.e. syllable) are themselves not accorded constituent status. Further, certain of the unlabelled nodes at levels above the syllable, e.g. between word, foot and phrase, would seem to be suspiciously the product of a strictly binary view of branching.
In the present study there is no structural motivation for this (cf. again Schane 1979 for a non-binary branching metrical analysis of English word-accentuation). These observations lead to a recognition of the fundamental distinction in explanatory orientation between a (metrical or prosodic type of) generative phonological theory and the descriptive model presented here. Whereas the majority of phonological theories from Prague phonology to metrical phonology have as their explanatory orientation the specification of sound patterning in terms of 'structural-typological' regularities underlying the occurrence of phonological entities of the order 'unit-type' (e.g. phoneme), the present
approach, with 'naturally' oriented models of description such as those of Stampe and Linell (1979, 1982) and to a lesser extent with Hooper's natural generative phonology, has as its explanatory orientation the specification of sound patterning in terms of 'structural-realizational' regularities underlying the occurrence of phonological entities of the order 'unit-form' (e.g., segment). As will be shown in the following chapter, phonotactic statements, for example, may be equally made within the present type of approach. In other words, the data of phonological analysis are provided by 'surface' manifestations of structure, i.e. produced observable forms whether segments, syllables or phrases, which are not related derivationally - transformationally or otherwise - to any 'underlying' or 'deep' base forms. 'Surface' forms thus produced and which are given a phonetic interpretation directly derive that interpretation from their own phonological properties. These in turn are a product of the relations they engage in within prosodic, rhythmic and lexical (phonological) representations which are expressed as values of (phonological) features. Within the prosodic representation such values are mediated by dominance or phonological 'strength' relations obtaining between unit-levels of a hierarchical structure and similarly with rhythmic representation, but in the absence of any 'distinctive' function accorded to such relations. The phonological lexicon is conceived of as comprising in the first instance the units words and phonemes, the former marked for word-accent, the latter for segmental distinctive features (cf. also in this respect Vennemann 1974). Feature values assigned to units of a total phonological representation and given a phonetic interpretation then reflect a fusion of the values assigned to them within these three types of representation (see Chapter Five for a development of this line of argument).
In this respect, it is crucial to maintain a distinction between lexical word and phonological word or formative on the one hand, and phonemic and hierarchical segment on the other. Particularly concerning the former, Selkirk would appear to conflate the notion 'lexical word' and 'phonological word' in her belief that since lexical stress patterns may be defined with reference to lower level phonological hierarchical units of the order stress foot, etc., these levels of phonological structure must be represented in the lexicon. Lexical accent or stress is given for lexical words in the phonological lexicon itself and integrated within phonological structure 'proper' via 'spell-out' or insertion where it is given a prosodic and rhythmic 'interpretation' within the unit-levels of the hierarchy. 'Adjustments' of lexical phrasal stress or accent that occur as in the case of "Mar'cel 'Proust" → "'Marcel 'Proust" must
be seen as being accounted for by various types of 'word-formation' conditions on lexical compounding (derivation and flexion) present in the lexicon itself and which must be reflected in a specification of the phonological properties of their patterning in the phonological lexicon - some of which may be statable in global terms, others of which per item (in this connection, cf. further, e.g., Kiparsky 1982 on 'lexical phonology'). In the case of the 'prosodic' or 'metrical adjustment' posited to be necessary to account for the Rhythm Rule, for example, where "thir'teen 'men" becomes "'thirteen 'men", in a phonological lexicon 'thirteen' as a cardinal number will have the accent gloss "thir'teen", and as a premodifying numeral the gloss "'thirteen", i.e. here there are separate lexical words involved. Regarding American English 'flapped t' in intervocalic, post-nasal and -lateral contexts, for example, if one posits - conventionally - an 'underlying', here lexical, phoneme /t/ in words containing the flap, the occurrence of the [ɾ] 'allophone' may be accounted for simply as a variant found as a product of suprasegmental context. While not denying that immediate phonetic environment of voicing must have an effect on the phonation value of the phone, its 'weakened' manner value (tap or flap for stop) is a product of the suprasegmental phonological context such that the constraints of prosodic representation (syllable-initial in unstressed position, for example) - and rhythmic representation (posthead position in word, for example) - as the total sum of its contextual embedding will influence its exact form. For instance, at clause or sentence level within the hierarchy as in "He's a writer!" where the /t/ is embedded in a prosodically relatively 'strong' context, the likelihood is that it will be produced as a stop rather than a tap or flap form. That is, there is no structural motivation within the present approach to posit a Resyllabification transformation to account for this as Selkirk and others do (see esp. Kahn (1976) on 'flapped t'). As will perhaps be clear, there is no necessary distinction made in the present approach between categories of 'extrinsic' and 'intrinsic allophones' of phonemes, although different degrees of 'phonemicization' of segments will exist at particular stages of the phonological development of a language, which may be said to occur when a shift of primary manner degree for consonantal and peripheral vowel area 'place' for vocalic 'variants' is no longer attested within suprasegmental context. Thus it follows from this line of argument that there is no distinction in the present approach between Selkirk's transformational (i.e. type ii) rules and (type iii) rules of segmental phonology.
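The account of 'flapped t' as a variant conditioned by its total contextual embedding, rather than by a Resyllabification transformation, can be illustrated with a minimal Python sketch. It is a deliberate simplification and not the model's own formalism: the text speaks of likelihoods, whereas the sketch makes a categorical choice, and the field names of the hypothetical `TContext` record are assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TContext:
    preceding: str               # "vowel", "nasal", "lateral" or "other"
    before_vowel: bool           # a vowel follows the /t/
    in_stressed_syllable: bool   # /t/ opens a stressed syllable
    strong_higher_context: bool  # e.g. emphatic clause-level prominence


def realize_t(ctx):
    """Return "flap" or "stop" from segmental plus suprasegmental context."""
    # Segmental environment named in the text: intervocalic, post-nasal,
    # post-lateral.
    segmental_ok = ctx.preceding in {"vowel", "nasal", "lateral"} and ctx.before_vowel
    # Suprasegmental weakness: unstressed position with no strong embedding.
    weak = not ctx.in_stressed_syllable and not ctx.strong_higher_context
    return "flap" if segmental_ok and weak else "stop"
```

Under these assumptions a neutral "writer" comes out with a flap, while the same word in an emphatic "He's a writer!" comes out with a stop, with no change to its syllabification.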
4.2.2.5. The present chapter has attempted to examine the definition and place of a concept of structurally defined 'strength' relations between units and constituents of a phonological system and to critically examine the success of approaches that have given formal expression to these relationships in a theory, with particular reference to generative or generative-related models of linguistic description. In the following chapter, a final outline will be given of a possible framework of description emerging from the present and previous discussion which is adequate for the formal expression of 'suprasegmental context' within the general type of phonological model here envisaged.
CHAPTER FIVE: AN EXTENDED PHONOLOGICAL MODEL OF SUPRASEGMENTAL CONTEXT
5.0. Introduction
By way of a conclusion to the present study, this chapter will a) present an extension of the descriptive phonological framework for the statement of suprasegmental context as it has been developed in the discussion of previous chapters, notably chs. 2 and 4; b) offer a concluding view as to the role and status of such a structural framework as defined within a particular conception of linguistic description in the explication of foreign language data; and c) by way of an analysis of a limited data sample practically demonstrate the explanatory role of such a framework in 'accounting for' observable regularities in second language pronunciation variation.
5.1. Features and contrasts in the suprasegmental hierarchy
5.1.1. 5.1.1.1. The type of phonological framework emerging from the discussion in chs. 2 and 4 which, it is claimed, is necessary for a structurally well-founded statement of suprasegmental context is in the first instance characterizable as a hierarchical and strength-based model of description. The hierarchical nature of phonological structuring has been shown to be well-motivated by the consideration of both systemic and metrical phonological descriptive models. The expression of structural relations between elements of the hierarchy has been seen to be statable in terms of 'dominance' relations between and within levels of the hierarchy by employing the concept of structural 'strength'. In the present conception, the key to such relations of strength resides in a unit-positional view of structure, such that relations obtaining between entities in terms of constituency are determined by the location within such entities of particular values of features which constitute the relata. More concretely, the positional occurrence of feature contrasts and values having reference to different constituent levels of the hierarchy determines the structural relationship obtaining between them as expressed in terms of
'strength'. Clearly it now behoves any further development of the model of description to state precisely the nature and status of such features and the manner in which they serve to relate elements of different levels of structure. It has been shown that in a hierarchical ordering of phonological constituents of the (descending) order sentence, clause, phrase, word, formative, syllable, set, segment, 'same-level' or 'sister' constituents are assigned the structural status strong or weak. Thus a hypothetical ordering with binary sister constituents could be represented in a tree as:
[Tree diagram: the unit-levels SENTENCE, CLAUSE, PHRASE, WORD, FORMATIVE, SYLLABLE, SET and SEGMENT in descending order, with the sister constituents at each level labelled s and w]
Note that since the tree constitutes a (partial) statement of the phonological representation of a particular string, it could equally be represented - albeit impracticably - as a labelled bracketed string of segmental elements of the form:
[Labelled bracketing: nested brackets labelled SENT, CL, PHR, WO, FO, SYL, SET and SEG, with s/w subscripts, enclosing the segments p, l, etc.]
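The equivalence of tree and labelled bracketing, together with the condition stated in this section that exactly one daughter of any constituent is strong, can be illustrated with a minimal Python sketch. The class `Node`, the functions `bracket` and `well_formed`, and the `LEVEL:strength` label format are hypothetical conveniences, not the notation of the study; the level abbreviations follow the bracketing above.

```python
from dataclasses import dataclass, field

# Unit-levels in descending order, abbreviated as in the labelled bracketing.
LEVELS = ["SENT", "CL", "PHR", "WO", "FO", "SYL", "SET", "SEG"]


@dataclass
class Node:
    level: str                  # one of LEVELS
    strength: str               # "s" or "w"
    children: list = field(default_factory=list)
    segment: str = ""           # filled only at SEG level


def bracket(node):
    """Render a node as a labelled bracketed string."""
    label = f"{node.level}:{node.strength}"
    if node.level == "SEG":
        return f"[{label} {node.segment}]"
    inner = " ".join(bracket(child) for child in node.children)
    return f"[{label} {inner}]"


def well_formed(node):
    """Check adjacent-level constituency and exactly one strong daughter."""
    if node.level == "SEG":
        return not node.children
    if not node.children:
        return False
    next_level = LEVELS[LEVELS.index(node.level) + 1]
    levels_ok = all(child.level == next_level for child in node.children)
    strong = sum(1 for child in node.children if child.strength == "s")
    return levels_ok and strong == 1 and all(well_formed(c) for c in node.children)
```

A set containing a strong /p/ followed by a weak /l/ would then be written `Node("SET", "s", [Node("SEG", "s", segment="p"), Node("SEG", "w", segment="l")])`; a constituent with a single daughter passes the check only if that daughter is s.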
Such a tree might represent the strength assignments of the 'primary' (i.e. culminative suprasyllabic; delimitative subsyllabic) marking system. Note however again, that there is no necessary motivation for strictly binary branching in the tree, nor for a uniform direction of branching. Within the primary system, right branching is typical, but not a defining characteristic of, the suprasyllabic hierarchy, whereas left branching by definition (initial delimitative marking) characterizes the subsyllabic hierarchy. That this is so is a function of the determinants of the strength relations obtaining between constituent units as sisters, lower-level 'daughters' or higher-level 'mothers' in the hierarchy. The only condition imposed on adjacent-level constituent units is that one immediately lower-level constituent of a higher level structure be marked strong, and if there is only one lower-level unit, that will be s. Thus of a combination of sister constituents, one is labelled strong and - in theory - nil to a total of n number of sisters will be marked weak. However, the actual number of w sisters is constrained by partly language-specific and partly universal phonotactic well-formedness conditions, there being limits, for example, imposed on the total possible number of segments in a set, sets in a syllable, etc.
5.1.1.2. It has been claimed in previous chapters that strong constituents were thus labelled in the primary marking system, since within their boundaries they include a locus with the greatest potential for a feature contrast with distinctive function, which in turn is relevant for the structural characterization of that particular level of constituent. Reference has been made in chs. 2 and 4 to the different phonological functions which elements of phonological structure may express. In particular, Trubetzkoy's trichotomy of distinctive, culminative and delimitative functions has been invoked in the discussion of structural properties of the suprasegmental hierarchy and its component unit-levels. However, it must be noted that in Trubetzkoy's conception of what he terms the 'secondary' functions of sound units, such culminative and delimitative functions as may be assigned to units derive only 'coincidentally' from the position of phonemes (or their variants) within a structural syntagma, i.e. from their phonotactic properties. Thus the phoneme combinations /s0; z#; it; cs; ss; ss/ in English for example, constitute "Grenzsignale", i.e.
have a delimitative phonological function in that between them must fall the boundary of a "Bedeutungseinheit" (Trubetzkoy 1958: 247); or further, the contrast between 'clear' and 'dark l' in English in the combination V + l + V has a delimitative function since "die 'dunkle Realisation' des l-Phonems in dieser Phonemfolge besagt, daß zwischen l und dem folgenden Vokal eine Wortgrenze liegt" (1958: 249). Suprasegmental features as such in phonological description have been proposed by Vanderslice and Ladefoged (1972) (cf. also, Ladefoged 1971, 1975 and Hirst 1977; the former in association with SPE Stress Cycle phonological rules, the latter within a 'syntactic approach to English intonation' consistent with an Aspects-type model of generative grammar). Vanderslice and Ladefoged's 'binary suprasegmental features' comprising [±heavy], [±accent], [±intonation],
[±cadence] and [±endglide] are employed primarily as descriptive phonological features characterizing phonological phrases in the way that the SPE feature [stress] is similarly used by Chomsky and Halle (1968). Syntactically derived phonological strings may be assigned binary values of these features much as they are assigned values of segmental distinctive features, with the difference of course that the suprasegmental features are in the first place syntagmatically motivated. However, as Hirst too notes (1977: 53), the features specified are to a certain extent indeterminate in interpretation as to their phonological (i.e. strictly 'classificatory') or phonetic (i.e. strictly 'descriptive') value. In later work (e.g. Ladefoged 1975, but also Ladefoged 1971), the suprasegmental features thus postulated serve as a potentially theory-neutral expression of possible sentence-level phonological contrasts to be found in the languages of the world. Hirst's (1977) system of 'intonative features', by contrast, represents a set of basic descriptive primes in a generative model of syntax and phonology in which such (distinctive) features 'are assigned from the syntactic surface structure of the sentence', the latter being a representation of strings in their 'non-reduced', i.e. non-ellipted, syntactic form. The intonative features are properly distinctive in function in that they may serve to disambiguate otherwise identical syntactic strings. Thus, for example, the feature [stress], marked ', distinguishes the two strings (Hirst 1977:30-31):
he 'ate a 'little pudding (i.e. 'he ate a small pudding')
and
he 'ate a little 'pudding (i.e. 'he ate a small quantity of pudding')
[stress] therefore is seen as "an abstract intonative feature which is assigned to formatives of the syntactic surface structure, and which subsequent rules assign to a given syllable or syllables of that formative" (1977:30). The feature [centre], marked °, is postulated to distinguish strings such as (1977:36-37):
I °thought he was 'married (i.e. 'He is married as I thought')
and
I 'thought he was °married (i.e. 'I thought, (wrongly it now seems), that he was married')
A phonological phrase, as opposed to a syntactic sentence, is consequently defined as 'a sequence of formatives one and only one of which carries the feature "centre"' and the limits of which are determined by the formatives carrying a further feature [boundary]. This feature, marked /, distinguishes the two strings (1977:38-39):
/She 'should have °phoned / her 'mother was °worried/
and
/She 'should have °phoned her / 'mother was °worried/
A fourth feature [terminal], in its negative value marked +, in its positive value //, is required for the disambiguation of (1977:40-42):
/'Would you 'like °tea + or °coffee // (i.e. 'Which would you like? Tea or coffee?')
and
/'Would you 'like °tea + or °coffee + (i.e. 'Would you like tea or coffee? or something else?')
Finally, a feature [contrast], marked _, distinguishes (1977:42):
/'If you °give it to me + I'll °mend it // (i.e. 'Give it to me and I'll mend it for you')
and
/'If you °give it to me + I'll °mend it // (i.e. 'I won't mend it unless you let me keep it')
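Hirst's definition of the phonological phrase quoted above (a sequence of formatives exactly one of which carries [centre], delimited by formatives carrying [boundary]) can be illustrated with a minimal Python sketch. The representation of a formative as a `(form, features)` pair and the convention that a [boundary]-marked formative closes the phrase at its right edge are assumptions made for illustration; the source does not specify on which side the boundary falls.

```python
def split_phrases(formatives):
    """Split a list of (form, features) pairs after each [boundary] formative."""
    phrases, current = [], []
    for form, features in formatives:
        current.append((form, features))
        if "boundary" in features:  # assumption: boundary closes the phrase
            phrases.append(current)
            current = []
    if current:
        phrases.append(current)
    return phrases


def is_phonological_phrase(phrase):
    """True if one and only one formative carries the feature [centre]."""
    return sum(1 for _, features in phrase if "centre" in features) == 1
```

Marking "phoned" for [boundary] in the [boundary] example above yields the phrases "She should have phoned" and "her mother was worried"; moving the mark to "her" yields "She should have phoned her" and "mother was worried". Each phrase contains exactly one [centre], so both divisions satisfy the definition.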
Hirst's view of phonological representation is that of Chomsky and Halle (1968), in which phonological phrases as labelled and bracketed segment strings are derived from surface syntactic structure via Phonological Readjustment and Redundancy Rules. The rules which assign the intonative features take as their input syntactically derived structural categories, much as the SPE Stress Cycle Rules. Thus the rule for assigning the intonative feature value [+stress] takes the form (1977:73):
$A \rightarrow \acute{A} \;/\; (_{Z}\, X + (_{W}\, \_\_ + B\,)_{W} + Y\,)_{Z}$
where A represents a formative, Á the formative assigned [+stress], B another formative which may be null, Z a non-lexical category, X and Y strings which may be null, and where W = (L; M; CL; Cn; NPoss; NPint; Cneg), i.e. W represents a specific category for a number of sub-rules. These specific categories are L = lexical category symbol (as in the SPE Stress Rules); M a modality category expressing 'eventuality', 'probability' and 'obstination', which expresses the stressing of the modals respectively in examples such as "They 'may go home early", "He 'must be mad" and "Robert 'will come late"; CL a deictic subcategory, e.g. "what"; Cn an interrogative morpheme subcategory, e.g. "what"; NPoss a possessive NP as, e.g., in "I saw 'his burn" in the reading: "I saw the one(s) he had burn"; NPint an intensive pronoun, e.g., as in "He doesn't know him'self", and Cneg a negative category such as "none, never", etc.
As this example of a rule shows, Hirst achieves a sophisticated level of 'delicacy' in his description of what in effect corresponds to the metrical phonological 'syntactic mapping'. Similarly fine-grained rule specifications are posited for the assignment of the other intonative features, and the analysis as a whole must count as the most successful attempt to incorporate suprasegmental distinctive features within traditional generative grammar to date. Further, a necessary and consistent distinction is maintained between 'classificatory' phonological 'intonative features' and 'descriptive' phonetic 'prosodic features' in the model (for a development of the latter see Hirst 1979). However, in contrast to the description developed in the present study, the conception of features and the contrasts they express is a restricted, if legitimate, one. Features are solely distinctive, and the distinctions they carry are between potential minimal sentence pairs, whose 'meaning' contrast is interpreted exclusively in terms of differences carried in syntactic information, consistent with an Aspects model of generative grammar. In the present study, features are seen to have culminative and delimitative as well as distinctive 'functions' in the phonology, the structure of which is defined by the locational intersection of the manifestations of these different functions. A limitation of the present analysis is of course that no defining structural link is posited between syntactic (or semantic) structures and the phonological representation. Rather, as a working basis it is assumed that phonological structure in the first instance must be internally motivated, and such structural relationships that may obtain between phonological and syntactic and/or semantic representations, whether as expressed in terms of 'realization' or simply 'correlation', for example, must be discovered and not a priori assumed.
'Distinctiveness' within the phonological system as manifested by features is in the first place system-internal, and such 'meanings' as are thereby distinguished may be interpretable in terms of lexical or word-level reference to sentence or utterance or even text or discourse levels of reference.

5.1.2.

5.1.2.1. To summarize again: in the view of phonological structuring presented here, by contrast, distinctive, delimitative and culminative functions of Trubetzkoy's 'sound properties' are attributes of features as they mark the structural status of the units with which they are associated. Within the phonological hierarchy, the occurrence of particular features and their values which have a potential distinctive function at one level 'systematically
coincides' with their occurrence in a culminative or delimitative function at the next highest level. This has been termed the 'primary marking system' in phonological structure. In other words, paradigmatically established distinctive features and values serve to define the syntagmatically established culminative and delimitative status of features. The 'systematic coincidence' is structure-defining in that the unit-positional coincidence of features is such that the location of greatest potential distinctive paradigmatic contrast coincides with the location of significant culminative syntagmatic or delimitative syntagmatic feature values, and that in doing so the next-highest unit-position which contains this 'locus' is thereby defined as the strong sister constituent of that level in the phonological hierarchy. Within the suprasyllabic hierarchy, this coincidence is primarily position-oriented and contributes to the definition of syntagmatic 'accent' at various levels of hierarchy (e.g., clausal accent, phrasal accent, word accent, etc.). Within the subsyllabic hierarchy, however, the coincidence is primarily unit-oriented and has direct relevance for the phonotactic substance of constituent units. That this is so is of course a product of the type of features contrasting at supra- and subsyllabic levels: within the former, features contrasting are typically 'prosodic' or 'suprasegmental', e.g. with reference to 'pitch-accent' patterns; within the latter, features contrasting are typically more 'segmental' in nature, e.g. consonantal/vocalic, etc.
An illustration of the system of (primary) marking, which via the systematic coincidence of feature values and contrasts defines the structural (strength) status of unit-positions of the phonological hierarchy, may be provided by an analysis of the following sentence: She went to the window and looked through the curtain.
On syntagmatic grounds, the hierarchical representation of the sentence above the syllable could be as follows, with strong constituent-levels in each case containing a 'primary accent' locus:

[Tree diagram not reproduced: hierarchical representation of the sentence at the unit-levels SENTENCE, CLAUSE, PHRASE, WORD, FORMATIVE and SYLLABLE]
i.e., at clause level, primary 'sentence accent' occurs within the second clause beginning "and looked"; at phrase level, primary 'clausal accent' occurs within the second phrase of each clause; at word level, primary 'phrasal accent' occurs within the second, fourth, sixth and eighth phonological word; at formative level, primary 'word accent' occurs within branching structures on the formative "curtain" as opposed to "the" in the phonological word "the curtain", similarly on "window" in "the window"; at syllable level, within branching formatives, 'formative accent' occurs obviously on the syllable "win" of "window" and "cur" of "curtain". Clearly, the 'primary accent' of each level-unit in the representation falls on a syllable as its locus. 'Accent' in this sense is usually thought of as being manifested by particular values of the phonetic features stress and pitch primarily, and segmental quantity and quality secondarily. Thus, 'sentence accent' or 'clausal accent' is usually characterized by a pitch change on the 'nuclear' or 'tonic' syllable with vowel lengthening and 'ideal' qualities of beginning consonant and medial vowel. Syntagmatically therefore, 'accent' (or 'tonicity'), i.e. the culminatively marked locus syllable within a constituent domain, is interpretable phonetically by positive feature values of pitch (height and direction), stress, quantity and quality. However, place of 'accent' is also paradigmatically motivated. Indeed, it must be for the assignment of structural strength marking to the relevant constituents. The traditional 'suprasegmental' phonetic features associated with 'accent' have a phonological basis in the paradigmatic distinctive dimension.

The locus of 'sentence accent' within a constituent s clause at the same time offers the greatest potential for phonological DIRECTION feature contrast, i.e. a distinctive feature characterized phonetically predominantly by a rising or falling pitch at this position. Phonologically, it has the values [rise] and [fall], and its primary meaning referents are in the area of sentence type (e.g. the expression of interrogative vs. declarative) or illocutionary force. In the example given, there is a potential choice of [rise] vs. [fall] on "curtain" as containing the locus of an s clause which is not available on "window" as containing the locus of a w clause. The locus of 'clausal accent' within a constituent s phrase offers the greatest potential of a phonological HEIGHT feature contrast, i.e. a distinctive feature characterized phonetically predominantly by marked high or low pitch. Phonologically, it has the values [high] and [low], where within an s phrase the locus may bear contrastively the value [high] or [low], whereas in a w phrase no such
choice is possible. For example, in the phrase "to the window" as opposed to "She went", [high] or [low] are possible on the locus "win(dow)", i.e. a marked pitch height, whereas in the w sister constituent "She went" this is not possible. The semantic referents of this potential distinction are predominantly in the area of 'attitudinal' meaning whether interlocutor- or text-addressed, e.g. for the expression of surprise, anger, etc., but also may be associated with so-called 'contrastive stress' or 'accent' within an appropriate text or discourse context. The locus of 'phrasal accent' within a constituent s word offers the greatest potential for a phonological MOVEMENT CHANGE feature contrast, i.e. a distinctive feature characterized phonetically by raised or lowered pitch relative to the previous pitch value. Phonologically, it has the values [up] and [down] where within an s phonological word the locus may bear contrastively the value [up] or [down], whereas in a w word this choice is not possible. For example, in the word "looked" a potential (contrastive) choice is possible between [up] and [down] which is not possible in "and" as a w word. The semantic referents of this potential distinction are again in the area of 'attitudinal' meaning. The locus of 'phonological word accent' within a constituent s formative offers the greatest potential for a phonological MOVEMENT feature contrast, a distinctive feature characterized phonetically by kinetic pitch within the particular formative. Phonologically, it has a positive and negative value of [movement], i.e. [±movement], where within an s formative the locus may or may not bear a 'kinetic tone', whereas in a w formative this potential choice is not possible. Thus in the branching word "the window", the formative "window" offers the choice of pitch movement or not, the formative "the" not.

Strength assignments on syllables within formatives, i.e. the establishment of 'formative accent', are of course entirely determined by lexical accent patterns which are given for each formative in the phonological lexicon. For polysyllabic formatives, lexical accent assignments may, however, be correlated with the segmental composition of such syllables and distinctions made between 'accentable' and 'non-accentable' syllables, which thus do point to a possible distinctive feature contrast potential at this level 'coinciding' with a positional interpretation of a syntagmatic (culminative) s marking. However, the further exploration of this type of correlation is beyond the scope of the present study. Similarly, as has already been noted in the discussion of the Rhythm Rule of metrical phonology in Chapter Four, lexical compound accent must also be taken account of in a consideration of lexical insertion in the phonological hierarchy. Thus, the culminatively motivated strength markings
established for the hierarchical representation of a phonological sentence such as the one given may be seen to be equally distinctively motivated in all cases.

5.1.2.2. The significance attached in this study to the positional coincidence of culminative 'sound attributes', such as that of 'accent', which may lend themselves to expression via a feature system (cf. the SPE [stress]), and distinctive 'sound attributes' as expressed by a paradigmatically established distinctive feature system, may be usefully compared to important elements of the type of metrical phonology discussed in the previous chapter. It has been a goal of current versions of metrical phonology to obviate the necessity of the SPE feature [stress] as a 'segmental' feature in phonological representation. In certain studies (predominantly those of Selkirk), this has been achieved by deriving stress or prominence relations obtaining between different phonological units via an enriched conception of the form of a phonological representation, whereby the hierarchical ordering of 'prosodic categories' itself expresses the relations of prominence between units as well as capturing the degrees of, or 'relational' nature of, stress, which in the SPE model was only derivable by means of the rules of the stress cycle. However, most analyses have by and large restricted their attention, much in the spirit of SPE, to the specification of patterns of lexical word and compound stress (see, e.g., especially Lieberman and Prince 1977 and Schane 1979), with little concern for the prominence relations obtaining within higher levels of a phonological hierarchy (cf., however, Selkirk 1978, 1980a). Furthermore, strength relations thus determined have exclusively culminative reference in the present terms, as is also evidenced in most versions of metrical theory by the dualistic representation of phonological structure in terms of metrical grids. As Lieberman and Prince put it: "first, we represent the notion relative prominence in terms of a relation defined on constituent structure; and second, we represent certain aspects of the notion linguistic rhythm in terms of the alignment of linguistic material with a 'metrical grid'. The perceived 'stressing' of an utterance, we think, reflects the combined influence of a constituent-structure pattern and its grid alignment" (1977:249). More will be said on the issue of 'grid alignment' below in the context of the type of rhythmic representation postulated here. However, the goal of such analyses is in the first place to account descriptively for the positional occurrence of units bearing greater and lesser degrees of 'prominence' in the
phonological syntagma. Apart from revealing the necessity of a hierarchical view of phonological structuring, with a concomitant revision of the nature of a phonological representation, metrical phonology has yet to come to full terms with the important consequences such revisions have for the specification of the primes of phonological description, which in generative theories include features. Selkirk (1982a) summarizes the implications for the position of features in the following terms: "an enrichment of the theory of representation has meant a reduction in the need for certain features in the representation of distinctions between particular forms" (1982a:1). Attention has concentrated on the phonotactic structure of phonological units, primarily the syllable, with recent suggestions that phonological (distinctive) features as developed in SPE and later work, particularly the 'major class features' [±consonantal], [±syllabic] and [±sonorant], are inadequate as well as superfluous for the specification of phonotactic well-formedness conditions imposed on syllable structure by the possibility of occurrence of different segment types in different positions (cf. Selkirk 1982a). For example, concerning the redundant status of the feature [±syllabic], Selkirk concludes, significantly, "Given that segments are organized into syllable structure, if 'syllabicity' is to be represented with a feature, that feature has the peculiar property of being syncategorematic: whether or not a segment is 'syllabic' depends on its position in a syllable, not on any inherent phonological property of the segment itself [my italics]" (1982a:3).

While making the obvious objection to this view that the SPE feature [±vocalic] and its successor [±syllabic] as a 'major class feature' was not primarily intended to capture phonotactic properties of segments and segment types with which it was associated, the latter part of this statement expresses part of the basic conception of the role of features in the present analysis, i.e. as properties in the first instance associated with a particular structural position. However, the present analysis further attempts to show that a feature-specified structural position is in a determining relationship to the constituent status of units that 'fill' that position. Further, the kind of phonotactically motivated feature properties as developed by Selkirk for the syllable are in the current model of a suprasegmental phonology systematically related to distinctively motivated feature properties (as indeed also in the original SPE model), as has been shown in the discussion of the suprasyllabic hierarchy above. The relationship will be demonstrated to be equally valid within subsyllabic levels of the constituent hierarchy. For phonotactic purposes a unitary feature [±sonority] with numerical values expressing a 'sonority
index' has been proposed by Selkirk (1982a) to capture the traditionally established 'sonority' and/or 'aperture' hierarchies in syllable structure as they account for the typical ordering of segment types. This development in itself of course takes up elements of the syllable hierarchy as proposed in a generative framework by Hooper (1976) and as discussed in previous chapters. Note again, however, that such a feature has reference to properties of a segment or segment type as it correlates with syllable position, without recourse to any structural unit intermediate between segment and syllable.

5.1.3.

5.1.3.1. It has been previously noted in chapter 4 that metrical phonologists have argued against the incorporation of the traditional syllabic immediate constituents onset and rhyme (and of the latter, peak and coda) within a phonological representation as such, instead at a syllable IC level of representation employing unlabelled structural nodes with s/w assignments. However, here it is claimed that a recognition of the unit-level 'set' (i.e. consonant/vowel cluster) as a 'daughter' constituent of the syllable is motivated by the same kind of inter- and intra-level positional-structural convergence as underlies the constituent unit-levels of the suprasyllabic hierarchy, as well as being motivated on phonotactic grounds. Again the convergence or coincidence hinges on the positional confluence of distinctive and non-distinctive phonological sound properties as expressible via a feature system. Moreover, sets as much as onsets and rhymes may be independently motivated structural entities with reference to which segment phonotactics of the syllable may be stated (cf., e.g., van Buuren 1978). However, no attempt has as yet been made to establish criteria by which the structural relations of sets within a syllable may be expressed. Within the present framework this is possible by again employing 'structure-defining' features.

5.1.3.2.
It has been demonstrated above that within the suprasyllabic hierarchy, the location of culminative marking at a particular unit-level converges with that of a potential distinctive (feature) marking at the next-lowest unit-level such that the sister constituent within which it occurs is marked as strong. Within the subsyllabic hierarchy on the other hand, the systematic convergence is between delimitative marking, i.e. delimitative syntagmatically relevant sound properties, at level x and distinctive marking, i.e. distinctive paradigmatically relevant sound (feature) properties, at level x - 1. More concretely, delimitative - here unit-initial - properties of syllables converge with distinctive feature properties at set level. The particular distinctive feature contrast which syllable-initial set position has the potential for is that of [+consonantal] vs. [+vocalic]. By contrast, syllable-medial sets only 'allow' the [+vocalic] feature (value) and syllable-final sets only [+consonantal]. The phonological features [+consonantal] and [+vocalic] represent basic types of 'major class feature' (Chomsky and Halle 1968) and are realized phonetically by quality, primarily manner, attributes. The phonetic correlates of the phonological features may be defined thus:

[consonantal] - perturbation of the airflow at a point in the supralaryngeal tract via close to complete approximation of the articulating organs in the mid-sagittal region such that local friction may be produced.

[vocalic] - perturbation of the airflow at a point in the supralaryngeal tract via narrow to wide approximation of the articulating organs in the mid-sagittal region such that (quasi-)periodic noise may be produced.
These definitions reflect traditional conceptions of the basic distinction between consonant-like and vowel-like types of articulations. (For definitions of 'close', 'narrow' and 'wide' closure see Catford 1977). At the same time, they make reference to strictly articulatory as well as acoustic (and aerodynamic) elements of the phonetic medium, which reflects the notion - here supported - that phonetic features as such, which include quality, in principle have correlates within articulatory, (aerodynamic), acoustic and perceptual modes of description. As with unit-levels of the suprasyllabic hierarchy, actualization of the feature contrast [+consonantal]/[+vocalic] depends of course on the phonotactic composition of the syntagma, the essence of distinctive relations carried by features and feature values being potential. The contrast between [+consonantal] and [+vocalic] at syllable-initial sets reflects the fact that in English (and Dutch) in this position a manner type 'approximant' ('semi-vowel', 'frictionless continuant') may be found, which in terms of the phonological features proposed would have the feature value [+vocalic]. Segment types in English associated with this manner type are [w j ɹ], in Dutch [ʋ j]. Thus in terms of features, a contrast is possible between [+consonantal] and [+vocalic] at this position, as, for example, in the English syllables "bell" vs. "well", etc. Such contrasts (and manner types) are clearly not found within non-initial syllable sets. Moreover, bearing in mind that both [consonantal] and [vocalic] have binary values, plus and minus, the combination [+consonantal] and [+vocalic], i.e. [+cons, +voc], is only found at syllable-initial set position. Otherwise, i.e. in syllable-medial and -final sets, only [-cons, +voc] (medial) and [+cons, -voc] (final) combinations are found. This observation captures the fact that in polysegmental initial sets a manner type approximant may co-occur with 'consonantal' manner types as in "twin" [twɪn], "pure" [pjʊə] and "dram" [dræm] in English, and "twee" ("two") [tʋeː], "sjaal" ("scarf") [sjaːl] in Dutch. Thus one may note that the feature value combinations [+cons, -voc], [+cons, +voc] and [-cons, +voc] are possible at syllable-initial sets, whereas only [+cons, -voc] - at final position - and [-cons, +voc] - at medial position - are possible elsewhere. The manner types (or 'segment types') lateral and nasal as occurring in syllable sets are associated with the phonological feature values [+cons, -voc]. The occurrence of 'syllabic laterals' and 'syllabic nasals', respectively, as in English "little" [lɪtl] and "button" [bʌtn], does not affect this positionally consonantal feature specification, there being strong phonotactic-phonological grounds for treating such 'syllabic consonantal' occurrences as sequences of 'vowel' plus 'consonant' (plus 'vowel'), i.e. V + l (+ V), V + n (+ V), etc., within the syllable (cf., e.g., Fudge 1976; also van Buuren 1980). By contrast, 'syllabic r' in British English may be analysed phonologically as (V +) r + V, i.e. as a syllable-initial (set) segmental unit, for example as in "temporary" ['temprri] → /'tempərəri/.

5.1.3.3. Turning now to the location of set-level features, it may be established that sound properties marking the beginnings of sets delimitatively merge with potentially distinctive properties associated with constituent segments, specifically with the positional realization of scalar-defined values of the feature [closure]. The phonological feature [closure] may have the scalar values [1 closure], [2 closure], [3 closure], [4 closure], [5 closure] and [6 closure]. Phonetically, these values are interpretable in terms of articulatory 'degree of approximation' of 'active articulator' to 'passive articulator': i.e., [1 closure] = 'complete approximation' (i.e. centrally and laterally together), [2 closure] = 'partially complete approximation' (e.g., only laterally), [3 closure] = 'near-complete approximation', [4 closure] = 'close approximation', [5 closure] = 'mid approximation' and [6 closure] = 'open approximation'. The scalar features have reference to - in terms of 'segment types' - stop, lateral, fricative, approximant/high vowel, mid vowel and open or low vowel, respectively.
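The [closure] scale just defined, together with the positional ranges stated for English in the remainder of this subsection, may be tabulated as a small lookup. The following Python sketch is illustrative only: the names `CLOSURE_SCALE`, `PERMITTED`, `is_permitted` and `contrast_potential` are assumptions introduced here, and the stated exception for /s/-initial syllable-initial sets is not modelled.

```python
# Scalar values of [closure] as glossed in the text:
# value -> (degree of approximation, associated segment type).
CLOSURE_SCALE = {
    1: ("complete approximation", "stop"),
    2: ("partially complete approximation", "lateral"),
    3: ("near-complete approximation", "fricative"),
    4: ("close approximation", "approximant/high vowel"),
    5: ("mid approximation", "mid vowel"),
    6: ("open approximation", "open or low vowel"),
}

# Permitted [closure] values by position, as stated in this subsection.
# Key: (position of the set in the syllable, segment is set-initial?).
# The table layout is an assumption made for illustration.
PERMITTED = {
    ("initial", True): {1, 2, 3, 4},
    ("final", True): {1, 2, 3},
    ("medial", True): {4, 5, 6},
    ("initial", False): {1, 2},
    ("final", False): {1, 2},
    ("medial", False): {4, 5},
}


def is_permitted(value, set_position, segment_initial):
    """True if the [closure] value may occur at the given position."""
    return value in PERMITTED[(set_position, segment_initial)]


def contrast_potential(set_position, segment_initial):
    """Number of [closure] values available for contrast at a position."""
    return len(PERMITTED[(set_position, segment_initial)])


if __name__ == "__main__":
    for position in ("initial", "medial", "final"):
        print(position,
              contrast_potential(position, True),
              contrast_potential(position, False))
```

Read this way, set-initial segment position offers more values than non-initial position in every type of set, which is the sense in which set-initial position carries the greater potential for distinctive contrast.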
Subject to the restrictions imposed by the phonotactic composition of sets within syllables, values 1 to 6 may be found in association with initial segment position in a set, whereas only a restricted range of values is found in non-initial segment position within sets. That is, for example, values 1 to 4 are found initially in syllable-initial sets, 1 to 3 initially in syllable-final sets and 4 to 6 initially in syllable-medial sets. By contrast, only values 1 and 2 are found non-initially in syllable-initial and -final sets and 4 and 5 non-initially in syllable-medial sets. Thus set-initial position is marked by the greater potential for distinctive feature (value) contrast associated with its constituent segments. The phonetic specifications of the phonological scalar values of the feature [closure], while here defined in purely articulatory terms, are assumed to have acoustic and perceptual correlates, the latter, for example, definable with reference to degree of 'sonority'. The scalar values of this feature [closure], then, capture potentially distinctive sound properties as they occur at initial position in sets in association with 'consonantal' and 'vocalic' segment types in English (and Dutch). In syllable-initial sets, the types stop, lateral, fricative and approximant may occur as initial segments, whereas in syllable-final sets, the types stop, lateral and fricative may occur as initial segments. Non-initially in syllable-initial sets, with the exception of /s/-initial syllable-initial sets, only a limited number of values or segment types are permitted (for English, cf., e.g. Gimson 1980:240-244). In syllable-medial sets in English, initial segment position may be 'filled' by a close, mid or open vowel, non-initial place only by close or mid ones, i.e. a limited number of feature values (cf. the 'diphthongal series' /aɪ aʊ ɔɪ eɪ əʊ ɪə ɛə ʊə/). In Dutch, non-initial position in syllable-medial sets may only be filled by a close vowel, i.e. that associated with feature value [4 closure] - cf. the 'diphthongal series' /εί

· head (enclitic) at all levels of the phonological hierarchy.
The kind of (phonological) lexical representation referred to in the previous discussion, comprising lexical words with (lexical) accent and their phoneme-composition
derived via a lexicon-internal 'fill-in' of elements of the phonological alphabet, provides via lexical insertion or 'spell-out' the basic 'substantive', or 'compositional' in this sense, units of the phonological hierarchy, which of course is as relevant for the rhythmic representation as for the prosodic representation.

5.2.2.

5.2.2.1. Thus far, even though a type of 'systematic coincidence' between loci and foci of prosodic marking and a syntagmatic 'rhythmic' division of units into (proclitic) peak → head (enclitic) has been established, for further structural justification of the rhythmic representation developed it will be necessary to examine the form of such 'rhythm units' within a rhythmic representation itself. Here there can be no distinctive or contrastive function accorded the units as in the 'systematic coincidence' positionally of culminative/delimitative and distinctive properties of sound units as expressed in prosodic representation via the features [CENTRE], [BOUNDARY], [Centre] and [Boundary]. Rather, structure-defining properties may be established which consistently distinguish the postulated sub-units proclitic, peak (head (+ enclitic)) from each other within the phonological syntagma. As Knowles (1974) has shown, such units and sub-units of what is here the rhythmic representation are at various levels of a phonological hierarchy characterized by particular rhythmic properties. 'Rhythmic grouping', in his terms, as represented by a division of a 'rhythm unit' into the type of 'constituents' as suggested here, forms together with 'rhythmic timing' an integral part of a general theory of rhythmic patterning in English. Whereas 'grouping' "depends partly on the different kind of stress in the sentence (and to a limited extent particular tones on accented syllables), and partly on surface syntax" (1974:125), 'timing' "depends partly on the rhythmic grouping, and partly on the tempo" (ibid.).
The 'rhythmicity' of a stretch of speech depends, thus, on an interpretation of temporally relevant regularities deriving from linear organizational properties of the units in a syntagma. Central to such rhythmic patterning is the relative contribution of the different sections of the 'rhythm unit', i.e. the respective proclitic, head and enclitic. Traditional measures of what Knowles terms 'beat', often equated with the 'rhythm' or 'rhythmicality' of speech, have made reference to the temporal regularity in occurrence of 'stressed syllables', as in the so-called 'stress-timing' analysis of English rhythm.
However, in this connection, Knowles draws a careful and significant distinction between 'beat analysis' and 'rhythm analysis' as exemplified by the different analyses (a) and (b) of the sentence "The curfew tolls the knell of parting day":

(a) The 'curfew - 'tolls - the 'knell - of 'parting - 'day
(b) The /curfew / tolls the /knell of / parting /day
where (a) constitutes a 'rhythm analysis', and (b) a 'beat analysis' (1974:125). Drawing an analogy with a musical score, Knowles observes that "we can divide an utterance up into rhythmically grouped stretches, or alternatively into 'bars' beginning with the stressed syllable" (ibid.). Clearly, it is the former grouping that is of relevance for the present description. In terms of phonological structure, it is the linear occurrence of heads at various levels of the hierarchy relative to each other syntagmatically that forms the structural basis for the establishment of overall (linguistic) rhythmic patterning. The imposition on a text or utterance of 'perceived' temporal regularities in terms of a 'beat analysis', as practised in metrical scansion or possibly by listeners in the interests of perceived isochrony (cf. Lehiste 1977), is not the concern here.
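The difference between the two analyses of the Knowles example may be made concrete with a short Python sketch. It works at word level only, and the per-word annotation (a stress flag plus a "pro"/"en" attachment label for unstressed words) is a hypothetical input format assumed here for illustration; in Knowles' account grouping also depends on surface syntax, which the sketch does not derive.

```python
from typing import List, Tuple

# (word, stressed?, attachment). The attachment label is a hypothetical
# annotation: "pro" = proclitic (leans on the following head),
# "en" = enclitic (leans on the preceding head), "" for stressed heads.
Word = Tuple[str, bool, str]

LINE: List[Word] = [
    ("The", False, "pro"),
    ("curfew", True, ""),
    ("tolls", True, ""),
    ("the", False, "pro"),
    ("knell", True, ""),
    ("of", False, "pro"),
    ("parting", True, ""),
    ("day", True, ""),
]


def beat_analysis(words: List[Word]) -> List[List[str]]:
    """'Bars' beginning with each stressed word; attachment is ignored."""
    bars: List[List[str]] = []
    for text, stressed, _ in words:
        if stressed or not bars:
            bars.append([text])      # a stress (or the upbeat) opens a bar
        else:
            bars[-1].append(text)    # unstressed material trails the beat
    return bars


def rhythm_analysis(words: List[Word]) -> List[List[str]]:
    """Groups of (proclitics +) head (+ enclitics), using the annotation."""
    groups: List[List[str]] = []
    pending: List[str] = []          # proclitics waiting for their head
    for text, stressed, attach in words:
        if stressed:
            groups.append(pending + [text])
            pending = []
        elif attach == "en" and groups and not pending:
            groups[-1].append(text)
        else:
            pending.append(text)
    if pending:                      # stranded proclitics, if any
        groups.append(pending)
    return groups


if __name__ == "__main__":
    print(" - ".join(" ".join(g) for g in rhythm_analysis(LINE)))
    print(" / ".join(" ".join(b) for b in beat_analysis(LINE)))
```

On this input the first function reproduces the grouping in (a) and the second the bars in (b); the two differ only in whether unstressed words are attached forwards to their head or left trailing the preceding beat.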
5.2.2.2. Rhythmic grouping of units within the syntagma has been demonstrated by Knowles to occur at various levels of a hierarchy of phonological units in his terms, from the 'tone unit' via the 'phrase' and 'word' to the syllable and syllable part. The phonological basis of 'rhythmicity' has also been alluded to by, for example, Allen (1975), who states that "Rhythmic movements act from within the phonology rather than as external constraints on performance" (1975:82). Similarly, Lehiste (1970) calls for a closer examination of the phonological basis of much of suprasegmental patterning. A phonological interpretation of rhythm is of course central to the postulation of a 'metrical grid' within metrical phonology (cf. Lieberman and Prince 1977). However, care must be taken in phonological analyses of rhythm not only to distinguish between 'beat', or literally, 'metrical' analyses and 'rhythm analyses', as Knowles clearly shows, but also - and perhaps more importantly - not to constrain the form of a phonological rhythmic structure, for example as that conceived of here, by imposing behaviourally established organizational units and a priori established principles of rhythmic organization (for instance, of 'stress-timing' vs. 'syllable-timing') on the description. Thus, for example, the postulation of the 'foot' (cf. Abercrombie 1964;
Halliday 1967; and further Tench 1976) as the central unit of rhythmic organization and at the same time as a legitimate phonological unit(-level) has been based not only on a 'beat analysis' analogous to the metrical analysis of music and verse, but also on a still poorly understood behavioural propensity of listeners (if not of speakers as in the original formulations) to impose a rhythmic structure on speech according to a principle of 'stress-timing' (for a recent useful discussion of 'stress-timing' vs. 'syllable-timing' see Wenk and Wioland 1982). Similarly, the 'metrical grid', in its original conception "a normalization of the traditional idea of 'stress-timing'" (Lieberman and Prince 1977:250), is motivated notionally by such behavioural concerns with the object of reducing to a phonetically 'realistic' number of SPE-type 'stress levels' the highly differentiated pattern of s/w 'prominence relations' obtaining between units as expressed in their syllable 'terminals' in a phonological hierarchy. Thus, instead of assigning a numerical value to syllables by their strength (i.e. 'prominence') status in the hierarchy to obtain an equivalent SPE stress numbering via the algorithm "For any terminal node s, determine the first w that dominates s. Count the number of nodes that dominates this w. Add 1. This is the SPE stress number of s." (Lieberman and Prince 1977),
it is suggested that a constituent tree and its strength relations may be 'flattened' by aligning this structure with a 'metrical grid'. Rather, the alignment to the grid, itself a formalization of 'hierarchies of intersecting periodicities', is achieved by a 'Relative Prominence Projection Rule' (RPPR) which has the form "for any pair of sisters (s, w), s must contain a node that holds a grid position stronger than any held by terminals of w" (Lieberman and Prince 1977:316). An example of the grid alignment of a metrical constituent tree is in the Lieberman/Prince schema (1977:259):
[Figure: metrical grid (columns of x) aligned above the metrical constituent tree for "reports of threats of violence"]
An x marks the grid alignment of each syllable terminal. A totalling of the x 'assignments' per syllable produces its relative metrical 'prominence' level. Similarly, Giegerich (1981) proposes a formal means for the 'pairing' of elements in the tree onto the grid to reflect 'stress-timed rhythm'. The regular alternation of stressed syllables is necessary, it is claimed in metrical phonological studies, to account for rhythmic adjustments of the type "thir'teen 'men" → "'thirteen 'men", as discussed above in ch. 4, where a potential 'stress clash' of adjacent stressed syllables (or, strictly, adjacent syllables with the same 'degree of stress') as in "thir'teen 'men" is prevented by a general principle of 'eurhythmy', formalized in Lieberman and Prince (1977) generally in terms of a 'Rhythm Rule' and specifically here, as a rule of 'Iambic Reversal'. Hayes (1982) further develops the eurhythmy principle in a detailed specification of the ideal permitted distance between syllables of an equal prominence level. While there are perfectly valid behavioural considerations in thus limiting the degrees of 'stress-levels' to a number which is conceivably producible or perceptible, and identifying those elements of a phonological representation whose temporal regularity contributes in large measure to at least perceived rhythm patterns in speech, i.e. in the sense of 'stress-timing', it would appear nonetheless methodologically dubious to directly impose on the form of a 'metrical' phonological structure constraints of this type. By contrast, in the present study, the postulation of a rhythmic representation - to be further developed below - is motivated by linear or syntagmatic structural properties and regularities of the phonological syntagma at various levels of a descriptive hierarchy.
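The two metrical procedures quoted above, the stress-numbering algorithm and the sister condition of the RPPR, can be made concrete in a short sketch. It is an illustration only and rests on several assumptions that are not in the text: the `Node` class, the function names, the example compound and the grid heights are all hypothetical; a terminal labelled w is treated as its own 'first w'; the root is counted among the dominating nodes; and a terminal dominated by no w receives the number 1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """A node of a metrical tree; label is 's', 'w', or None for the root."""
    label: Optional[str]
    word: Optional[str] = None            # set on terminal nodes only
    children: List["Node"] = field(default_factory=list)


def stress_numbers(root: Node) -> List[Tuple[str, int]]:
    """Apply the quoted numbering algorithm to every terminal, left to right."""
    out: List[Tuple[str, int]] = []

    def walk(node: Node, path: List[Node]) -> None:
        path = path + [node]              # root ... current node
        if node.children:
            for child in node.children:
                walk(child, path)
            return
        # Positions of all w nodes on the path (terminal included: assumption).
        w_depths = [i for i, n in enumerate(path) if n.label == "w"]
        # The lowest w sits at index i; exactly i nodes dominate it. Add 1.
        number = w_depths[-1] + 1 if w_depths else 1
        out.append((node.word, number))

    walk(root, [])
    return out


def terminals(node: Node) -> List[str]:
    """Words of all terminals under a node (words assumed to be unique)."""
    if not node.children:
        return [node.word]
    return [w for c in node.children for w in terminals(c)]


def satisfies_rppr(node: Node, grid: Dict[str, int]) -> bool:
    """True if, for every s/w sister pair, s holds a stronger grid position."""
    strong = [c for c in node.children if c.label == "s"]
    weak = [c for c in node.children if c.label == "w"]
    for s in strong:
        for w in weak:
            s_max = max(grid[t] for t in terminals(s))
            w_max = max(grid[t] for t in terminals(w))
            if s_max <= w_max:
                return False
    return all(satisfies_rppr(c, grid) for c in node.children)


# Hypothetical left-branching compound: [[[law degree] requirement] changes]
tree = Node(None, children=[
    Node("s", children=[
        Node("s", children=[Node("s", "law"), Node("w", "degree")]),
        Node("w", "requirement"),
    ]),
    Node("w", "changes"),
])

print(stress_numbers(tree))
# [('law', 1), ('degree', 4), ('requirement', 3), ('changes', 2)]
```

A grid is passed to `satisfies_rppr` as a mapping from each terminal to its number of x marks, e.g. `{"law": 4, "degree": 1, "requirement": 2, "changes": 3}`. Many different grids satisfy the condition for one tree, which is one way of seeing that the grid carries information not contained in the tree itself.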
While recognizing that a temporal interpretation of the alternation of rhythmic head (sub-)units at suprasyllabic levels of the hierarchy may relate to a 'rhythm', or more strictly, 'beat analysis' in terms of 'stress-timing', this must be seen as a behavioural correlate of a structurally established linguistic or phonological rhythm. The relative (phonological) 'prominence' of a 'syllable terminal' is in the present conception a product of structural values assigned to a syllable within prosodic, rhythmic and lexical representation and not, as in the metrical view, to be read off from the grid alignment of metrical tree structures. For one thing, phonological 'prominence' is realized by a variety of phonetic means in the behavioural domain, which include stress in the narrower sense but also other phonetic parameters such as tempo, segmental quality and quantity, and pitch. Indeed, as Knowles (1974) has observed, 'stress' (even defined as 'loudness') is but one factor in prominence judgements made by
listeners, and stressed syllables themselves need not necessarily be 'prominent'. For another, the 'rhythmic' structure of speech is organized at a variety of levels, which include patterning at syllable level but also between structures below and above syllable level (cf. in this connection, Port et al. 1980, who distinguish rhythmic 'micro-' and 'macro-structure'). For instance, in terms of phonological structure, positions of sub-syllabic prosodic strength with loci and foci on the rhythmic units onset, nucleus and coda, for example, may be 'directly realized' and perceptible behaviourally as in slow or careful speech and thus even lend themselves to a behaviourally 'rhythmic' or 'temporally regular' interpretation. A description of such 'temporally regular' alternations, assuming the patterning may have some perceptual 'reality' as such, is of course ontologically distinct from that of structurally established recurring phonological patterns of (linear) 'regularity' which lend themselves to interpretation as 'rhythm' or 'beat' regularities within an empirical behavioural domain. In at least one recent metrical account, this distinction is formally recognized: e.g., Giegerich (1983) reduces the metrical grid to the status of an 'interpretive device' in the theory, "which gives information about the stress timed performance of English metrical structures. In this use, it has the same status as abstract verse patterns or musical patterns: for each of them, mapping rules have to be defined that formally connect the (central) metrical structure with ... patterns of performance" (1983:26).
The status of the grid as a component of metrical theory and its relation to the prosodic or metrical representation itself is of course open to considerable debate and the potential redundancy inherent in a theory including a metrical structure as well as a metrical grid has not gone unnoticed: for example, Kiparsky (1979) cites arguments in favour of the sole inclusion of a structure or representation, whereas Prince (1983) argues for the sole inclusion of a grid and the abandonment of the metrical tree.
5.2.2.3. In the original formulation of metrical theory, Lieberman and Prince (1977) reject the possibility of formulating a separate rhythmic representation such as is proposed in the present study in favour of a metrical grid on the grounds that "Representing such grids as trees, although possible, [my italics] requires us to define rows and columns derivatively and also requires the imposition of constituent-structure relations that will have no relevance to our present purposes" (1977:313). However, in the present analysis, a rhythmic representation makes reference to the same axiomatic phonological hierarchy as
the prosodic representation and imposes a syntagmatically derived compositional 'constituency' structure on linear unit-levels via the generalized rhythmic division (proclitic) peak → head (enclitic). Degrees of relative prosodic strength (not 'stress' or 'prominence') of unit elements in a prosodic representation do, however, lend themselves to numerical measurement (rejected by Lieberman and Prince), as has been demonstrated above in Chapter Three (cf. also Halle and Vergnaud 1980 for an alternative algorithm of numerical measurement). Note, though, that in the present model such values are assignable to constituent units at different levels, and not solely to terminal syllables. The location of the 'realization' of the strength value within a unit, i.e. its locus and focus, may 'fall on' a syllable, syllable-part or set-part. Clearly, the status of the syllable as a 'carrier unit' and as a constituent must be carefully distinguished, as noted previously. In metrical theory, this distinction is of course fundamentally recognized and motivates the postulation of a grid structure as a type of 'carrier component'. While a 'carrying function' may be attributable to syllables, it is a product of a matching of structurally defined hierarchies - the prosodic and rhythmic - and not the work of a Relative Prominence Projection Rule linking a metrical structure and an 'interpretive' metrical grid, the latter framework motivated by general 'stress-timing' principles of rhythmic regularity and reflecting a restricted view of the equation of (phonological) 'prominence' and 'stress'. Rather, the link between prosodic and rhythmic representations may be conceived of notationally in terms of the 'association lines' linking different levels or 'tiers' of a phonological representation characteristic of, for example, autosegmental phonology (cf. Goldsmith 1976, 1979), which posits such links between suprasegmental and segmental tiers of representation.
From autosegmental phonology, the general well-formedness condition on representations may be taken over, namely that 'association lines must not cross'. Further, the notational apparatus of the lines of association between the two representations may be formally interpreted as expressing the set of previously suggested 'strength location conditions' obtaining between prosodic and rhythmic representations. Note, at the same time, that similar lines of association may be postulated 'projecting' or 'inserting' a lexical representation onto prosodic and rhythmic representations as expressing sets of 'lexical composition conditions' between the representations. The type of phonological structure emerging, then, from this discussion is of a triple set of intersecting representations which may be graphically depicted thus:
PROSODIC REPRESENTATION
- primary and secondary s/w marking at unit-levels: (sentence) - clause - phrase - word - formative - syllable - set - segment

LEXICAL REPRESENTATION
- lexical words (with accent) composed of phonemes

RHYTHMIC REPRESENTATION
- generalized structure: (pro) peak → head (enc), at unit-levels: sentence - clause - phrase - word - formative - syllable - set - segment

Links between the representations:
- lexical composition conditions (lexical representation to prosodic and rhythmic representations)
- strength location conditions (prosodic representation to rhythmic representation)

Fig. 2: Sub-components of a phonological representation
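The well-formedness condition taken over from autosegmental phonology, that association lines must not cross, can be stated as a simple check on pairs of links. The following is a minimal sketch under stated assumptions: each representation is reduced to a linearly ordered tier, a link is a pair of positions (one per tier), and the names `Link`, `lines_cross` and `is_well_formed` are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Tuple

# (position on one tier, position on the other tier); both tiers are ordered.
Link = Tuple[int, int]


def lines_cross(a: Link, b: Link) -> bool:
    """Two lines cross when their order on one tier is reversed on the other."""
    return (a[0] - b[0]) * (a[1] - b[1]) < 0


def is_well_formed(links: Iterable[Link]) -> bool:
    """True if no pair of association lines crosses."""
    return not any(lines_cross(a, b) for a, b in combinations(list(links), 2))


# One unit linked to two units on the other tier is allowed (no crossing).
print(is_well_formed([(0, 0), (0, 1), (1, 2)]))   # True
# Reversed order on the second tier: the lines cross.
print(is_well_formed([(0, 1), (1, 0)]))           # False
```

Links that share an end point give a product of zero and therefore never count as crossing, so one-to-many associations remain possible under this formulation.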
5.2.2.4. It remains at this point to specify the structure-defining properties associated with the generalized or 'canonical' rhythmic structure (proclitic) peak → head (enclitic) which consistently 'mark' it within the different levels of the phonological hierarchy. In the light of the preceding discussion on rhythmic attributes of elements of a phonological representation, it should be possible to assign phonologically relevant properties to those elements of rhythmic structure which, at each level of the hierarchy, by their linear (note, not temporal) 'regularity', appear to be responsible for phonological rhythmic 'grouping'. The obvious candidate for representing the rhythmic 'centre' of units is the head, or more precisely, the peak → head (enclitic) element, which functions as a focus of syntagmatic grouping as well as constituting the obligatory compositional element of next-highest level units in the hierarchy (cf. again Knowles 1974). The peak may be assigned the feature (value) [Rhythmic] in this capacity, the proclitic - by default - the feature (value) [Arhythmic].
The phonological function of this feature is comparable to that of the features [CENTRE] and [Centre] within the prosodic representation. However, for the sake of convenience, adopting an old-established term, it may be termed 'demarcative'. The feature [Rhythmic], then, characterizes the peak element at every level of hierarchical structure from 'vowel centre' via 'syllable nucleus' to 'sentence tonic'. Significantly in this connection, for example, syllable onsets (i.e. 'proclitics') have been variously considered to manifest 'arhythmic' properties both from an empirical viewpoint (Knowles 1974) and a theoretical one (Selkirk 1982a). Concerning its phonetic manifestation, the feature [Rhythmic] may correlate with positive values of the basic phonetic features [pitch], [stress], [quantity] and [quality] which 'realize' prosodic strength marking. However, the phonetic property which consistently manifests phonological 'rhythmicity' is that of utterance rate. Knowles (1974), for example, notes that local rate of utterance consistently marks the 'rhythmic grouping' of elements in English into - here - proclitics, heads and enclitics, such that 'arhythmic' proclitics are associated with increased rate (i.e. accelerando), heads and enclitics - here, together, peaks - with decreased rate (i.e. decelerando). In a rather different theoretical context, Crompton (1981) suggests - again, significantly for the present concerns - that a phonetic feature [rate of utterance] as a 'continuous feature of time' represents in phonetic representation the concept of 'linguistic timing'. Thus the phonological feature [Rhythmic] will be represented by low or 'decreasing' values of a phonetic feature [rate] (i.e. = decelerando), and [Arhythmic] by high or 'increasing' values of the phonetic feature [rate] (i.e. = accelerando).
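The mapping just described, from rhythmic role to phonological feature to value of the phonetic feature [rate], is regular enough to be written out as a lookup. The sketch below is illustrative only: the table and function names are hypothetical, and the three-way segmentation in the usage example is an assumed one, not an analysis given in the text.

```python
from typing import List, Tuple

# Rhythmic role -> phonological feature: head and enclitic together form the
# peak and are [Rhythmic]; the proclitic is [Arhythmic] by default.
RHYTHMIC_FEATURE = {
    "proclitic": "Arhythmic",
    "head": "Rhythmic",
    "enclitic": "Rhythmic",
}

# Phonological feature -> (value of phonetic [rate], tempo interpretation).
RATE_VALUE = {
    "Rhythmic": ("decreasing", "decelerando"),
    "Arhythmic": ("increasing", "accelerando"),
}


def rate_profile(elements: List[Tuple[str, str]]) -> List[Tuple[str, str, str, str]]:
    """Return (text, feature, rate value, tempo) for each (role, text) element."""
    profile = []
    for role, text in elements:
        feature = RHYTHMIC_FEATURE[role]
        rate, tempo = RATE_VALUE[feature]
        profile.append((text, feature, rate, tempo))
    return profile


# Hypothetical word-level grouping into proclitic, head and enclitic.
for row in rate_profile([("proclitic", "to-"), ("head", "mor-"), ("enclitic", "-row")]):
    print(row)
```

Because the same (proclitic) peak structure recurs at every unit-level, the same lookup could in principle be applied at each level in turn; how the resulting values combine across levels is left open here.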
While a discussion of the general issue of the role of 'time' or 'timing' in linguistic structure would go beyond the scope of the present study, it is clear that 'timing', for example, just as 'rhythm', as an aspect of the behavioural organization of speech is based linguistically on a temporal interpretation of the linearly realized properties of the phonological syntagma (for a 'linguistic-phonetic' interpretation of timing see, e.g., van Buuren 1980). Within phonological structure it may be seen - much as 'prominence' - on the one hand, as the product of the association of (elements and features of) prosodic and rhythmic representations, phonetically manifested by the interaction of values of the features [pitch], [stress], [quality] and [quantity] with those of the feature [rate] - perhaps with particular reference to the unit syllable -, and on the other, in the sense of 'phoneme'
or 'word timing', as a product of the association of (units and features of) lexical representation with both prosodic and rhythmic representations (for further recent phonological discussions of time and timing cf., e.g., Coates 1980; Hewlett 1981; Crompton 1981; Linell 1982). In summary, the kind of interrelationships between the representations as expressed in terms of features may be shown thus:
PROSODIC REPRESENTATION
- prosodic features: [CENTRE] [BOUNDARY] [Centre] [Boundary]

LEXICAL REPRESENTATION
- lexical features: word accent (primary, secondary)
- phoneme features: (manner, place, phonation, etc.)

RHYTHMIC REPRESENTATION
- rhythmic features: [Rhythmic] [Arhythmic]

Links between the representations:
- lexical composition conditions
- strength location conditions

Fig. 3: Types of phonological representation and their associated features
5.3. The phonetic representation
5.3.1. 5.3.1.1. The phonetic representation itself may be conceived of as a linear string of phones lacking any autonomous structural status, i.e. 'unlabelled' and 'unbracketed' as in the SPE phonetic representation, but with the significant difference that they serve only as 'realizational place-markers' for the total phonological representation and as such are devoid of any independent 'phonetic' context. In principle, other units could also fulfil this function. The motivation for postulating phones for this representation derives from traditional phonetic transcriptional practice in which phones have the role of notational primes, which itself of course reflects some kind of 'perceived' validity of phones in a behavioural domain. The phonetic properties accruing
to the phonological representation do so to the structure in its totality and diversity, to units of the prosodic and rhythmic representations, but also to those of the lexical representation. The descriptive primes, however, of the phonetic representation are a set of phonetic parameters, constant in presence in the syntagma, but linearly variable in value. Such a conception reflects very closely the schemas of phonetic representation and description proposed by Crompton (1981) and van Buuren (1980). Crompton, for example, characterizes his concept of a phonetic representation as a 'track' through "a multidimensional space, with each dimension corresponding to a separate phonetic feature" (1981:17). The features he distinguishes are 'pitch', 'loudness', 'timing', 'source features' and 'quality', each feature having correlates in the articulatory, acoustic and auditory domains of the speech signal. Thus the feature 'quality' has as its articulatory correlate configurations of the vocal tract, acoustically certain aspects of the frequency spectrum, and auditorily for vowels, IPA vowel diagram parameters and for consonants, the auditory correlates of place of articulation and secondary articulations (1981:20). In essence, van Buuren (1980) proposes a similar model of description, albeit within a non-generative framework, with a set of in total 11 articulatorily defined parameters including, for example, 'pitch', 'phonation', 'posture' (i.e. secondary articulation), 'nasality', 'consonant place', 'vowel place' and 'manner', whose values and segmental or 'phonemic' (transcriptional) associations are defined by rule and integrated with the phonemic transcription by a system of syllable- and phoneme-timing rules, which then order the phonemes along an (abstract) time scale.
Similarly, Crompton proposes that his own phonetic features are "quasi-continuous functions of time" (1981:17), the linguistic analogue of which is included as an extra dimension within the multi-dimensional feature space. However, the necessity for including in a linguistic description an 'abstract' time-scale (e.g. van Buuren 1980 defines the units of such a scale as 'moras') is debatable. An ordered phonological syntagma incorporating the structural alignments of lexical, prosodic and rhythmic representations with their associated phonological and phonetic features, itself, constitutes a statement of linear organization or abstract 'timing' which is available to a temporal interpretation in an empirical domain (cf. the discussion of 'rhythm' above). This view of the serial ordering of (rich) phonological structure as the necessary basis for a temporal interpretation also lies at
the basis of the recently developed theory of the 'intrinsic timing' of articulatory dynamics (cf., e.g., Fowler 1980; Fowler et al. 1980). As has been already noted above, the phonetic interpretation of phonological structure proposed here via the features [pitch], [stress], [quality] and [quantity] and [rate] applies to units of that structure, and positive and negative values of these features characterize structurally determined locations within the phonological syntagma. It is this locational determinant of feature application which constitutes the phonologically 'temporal' aspect of their occurrence.
5.3.1.2. The final form of the phonetic representation itself, in terms of its phonetic properties, is a product of the total integration of the 'basic' phonetic features 'realizing' prosodic and rhythmic phonological representation, i.e. [pitch], [stress], [quality] and [quantity], and [rate], respectively, with the phonetic interpretation of the phonemic and word-accent features of the lexical representation. In a sense then, the former supply the latter with a 'suprasegmental interpretation'. Since the four phonetic features realizing prosodic structure together contribute to specify the degree of 'ideal-ness' of particular segment and word-level phonetic values - strictly, serve to determine the exact phonetic specification of (lexical) phonemic and word(-accent) features - it may be convenient to conceive of them as specific dimensions of a generalized or 'cover feature' [precision]. Values of the cover feature [precision] must be viewed as constituting a scale, which may in principle lend itself to numerical expression, much in the way that Vennemann and Ladefoged's (1973) original cover feature [strength] is assigned numerical values. The 'highest degree' of the [precision] feature would then be equivalent to a (binary) phonetic feature specification [+pitch], [+stress], [+quality], [+quantity]; the 'lowest', equivalent to the feature specification [-pitch], [-stress], [-quality], [-quantity], with the possibility that the four basic features may also remain unspecified, expressing a 'neutral' or 'unmarked' phonetic value. The cover feature [precision] and the feature [rate] and their values may therefore be regarded as directly characterizing the 'phonetic quality' of the speech syntagma as it realizes phonological structure. In the previous discussion, it has been often noted that, for example, a particular phonological feature, say that of [CENTRE], is realized by 'high' or 'positive' values of the phonetic features - here, for instance, of all four [pitch], [stress], [quality] and [quantity] features. Thus in the example sentence, a prosodically strong element such as the clause-level culminatively marked locus "cur-" in "curtain" ("She went to the window and looked through
the curtain") is said to have positive values of these features; e.g. high pitch, 'tonic' stress, full aspiration and closure at velar place for [k], full 'openness' (i.e. a peripheral vowel place value) and lip-spreading for [ɜ], plus lengthening, i.e. maximum length, of both [k] and [ɜ]. In the phonetic representation this would be marked by, literally, positive values of all four features - [+pitch], [+stress], [+quality], [+quantity]. However, clearly, these 'basic' phonetic features are themselves each manifested along a number of further phonetic dimensions. Values of [pitch], for instance, must make further reference to pitch height and direction; values of [quality] to the dimensions manner and place (of 'articulation'), including vowel place, secondary articulation, nasality as well as phonation (i.e. voicing, breath, whisper, etc.), which includes 'voice quality'. These phonetic dimensions, which one may term 'parameters', 'fill out' in a multidimensional manner the fine-grained phonetic detail in a phonetic representation. Their precise values are a product of the phonologically defined suprasegmental context. Sets of phonetic parameters have been suggested, as already noted, by Crompton (1981) - as entities within a phonetic feature system - and by van Buuren (1980) as exclusively articulatorily defined attributes within descriptive phonetic analysis. Further, Brown (1977) has proposed a parameter set of 'paralinguistic features' including 'pitch span', 'placing in voice range', 'tempo', 'voice setting', 'articulatory setting', 'direction of pitch' and 'timing' in a discussion of the phonetic exponents of 'emotion' and 'attitude'.
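The idea that values of the cover feature [precision] form a scale open to numerical expression can be illustrated with a small sketch. The text does not fix any particular numbering, so the scheme below is purely an assumption: each of the four basic features contributes +1 when positive, -1 when negative and 0 when unspecified, with equal weight. The names `BASIC_FEATURES` and `precision` are hypothetical.

```python
from typing import Dict, Optional

BASIC_FEATURES = ("pitch", "stress", "quality", "quantity")


def precision(spec: Dict[str, Optional[bool]]) -> int:
    """Score a feature specification on an assumed scale from -4 to +4.

    True stands for a positive value ([+pitch] etc.), False for a negative
    value, and None or a missing key for an unspecified ('neutral') value.
    """
    score = 0
    for feature in BASIC_FEATURES:
        value = spec.get(feature)
        if value is True:
            score += 1
        elif value is False:
            score -= 1
    return score


# The prosodically strong "cur-" of the example: all four features positive.
strong = {"pitch": True, "stress": True, "quality": True, "quantity": True}
print(precision(strong))                          # 4
print(precision({f: False for f in BASIC_FEATURES}))  # -4
print(precision({}))                              # 0
```

An equal weighting of the four dimensions is the simplest choice; a weighted sum, or a separate scale per parameter, would be equally compatible with what is said about [precision] in the text.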
In the present analysis, the descriptive basis for the postulation of the four basic phonetic features [pitch], [stress], [quality] and [quantity] may be related to Crompton's (1981) schema (as noted already, he proposes the features pitch, loudness, timing, source features and quality, all with correlates in articulatory, acoustic and auditory domains), and that of the phonetic parameters to the schema of van Buuren (1980) with the distinction that all parameters, just as features, must be seen to have correlates within the three signal-phase domains. Van Buuren proposes in total the (articulatory) parameters 'airstream initiator' (including 'type' and 'force'), 'pitch' (including 'span', 'key' and 'configuration'), 'phonation', 'posture' (i.e. tongue and lip secondary articulation), 'lip-shape', 'tongue-shape', 'jaw-drop', 'nasality', 'manner', 'consonant place' and 'vowel place' (van Buuren 1980). The phonetic feature [rate], as the phonetic realization of phonological [Rhythmic/Arhythmic], whose values [+rate] and [-rate] may be interpreted, respectively, as 'increasing' and 'decreasing', may also for its further phonetic interpretation need reference to phonetic parameters, here of tempo
(fast/slow) and length, and possibly of manner ('of articulation'). Thus in the example sentence "She went to the window and looked through the curtain", the word "curtain", which within a sentence-level rhythmic structure constitutes the peak (head - "cur-", and enclitic - "-tain") and manifests the phonological feature [Rhythmic], has the phonetic feature value [-rate], i.e. 'decreasing', which in turn makes reference to the parameters tempo (here - slower/slow) and length (here - lengthened). The relationship between the prosodic and rhythmic representation on the one hand, and the lexical on the other, within the phonetic representation, is expressed by the integration at this level of features and feature values deriving from the different representations. Lexical phonological features define lexical words and their phoneme composition via a (lexical) accent feature and 'distinctive' phonemic features. It is of course these features which since The Sound Pattern of English (Chomsky and Halle 1968) have been considered the phonological (prime) features 'tout court'. In the present conception, an accent feature ([accent]) marks the syllable of a lexical item, i.e. lexical entry, which in isolation bears the 'primary' lexical accent and in connected speech constitutes the most 'accentable' syllable of the item. Phonemic features, on the other hand, express distinctions between the units of the 'phonological alphabet', i.e. phonemes, as they are shown by their occurrence in lexical words. The feature system which perhaps lends itself best to adoption in a more 'surface' and 'concrete' conception of phonological form such as that represented in the present study is one similar to that of Ladefoged (1975) (rather than that, e.g., of SPE).
With Ladefoged, one may reject the SPE principle of consistent binarity in the value specification of such 'prime' features; with Vennemann (1974) one may accept the possibility of redundancy in the feature specifications in opposition to the SPE reductionism in feature specification. Note that phonemes are distinguished as units of an 'independent' alphabet as well as in their capacity of 'carrying' the sound properties which serve to distinguish lexical words. Phonemic features are here conceived of - in nomenclature at least - much in the spirit of the traditional IPA-derived consonantal three-term 'labelling' system and cardinal vowel-derived vocalic three-term 'labelling' system; i.e., the former with reference to manner, place and phonation properties, the latter with reference to vowel height, frontness/backness and lip position. However, a full development of such a phonemic feature system would go beyond the confines of the present study. Suffice it to say, that, remaining with the same syllable "cur-" in the
example sentence "She went to the window and looked through the curtain", the phonological feature specification of the phoneme /k/ would include [plosive], [velar] and [voiceless] (Ladefoged has the feature specification including [stop] [velar] [-voice] (1975:268)), and the features for the /ɜ/ phoneme in British English would include [half-open], [central] and [spread] (Ladefoged would include the values [2 height] [+back] ([-round]) in his specification). The phonetic representation then interprets these features together with the [accent] feature, which in this example is assigned to syllable "cur-" within the lexical word "curtain", via the basic phonetic features of [pitch], [stress], [quality] and [quantity] and their descriptive parameters. It is at this point in the description that 'lexical' and 'contextual' determinants of sound attributes meet. The ultimate values of phonetic features and parameters derive from prosodic, rhythmic and lexical phonological features in combination, the 'contextual' values of phonetic features being determined by the constraints of prosodic and rhythmic representation. Thus the lexical phonological feature [accent] on "cur-" of "curtain" in the example is realized phonetically via positive [pitch] [stress] [quality] and [quantity] phonetic feature values as further manifested in parameter specifications, the values of which are in turn determined by the phonetic interpretation of prosodic- and rhythmic-derived phonological features. In this way, the phonetic representation of a phonological syntagma may be seen to directly reflect prosodic strength marking and rhythmic structural marking. Notationally, a means could be developed by which prosodic strength values - perhaps as proposed in Chapter Three - as expressed via the features [BOUNDARY], [CENTRE], [Boundary] and [Centre] and rhythmic structural values (e.g. increasingly greater values for increasingly higher level (proclitic) peak → head (enclitic) structures) as expressed via the feature [Rhythmic/Arhythmic] - are computed onto (probably numerically based) phonetic feature and parameter scales in the phonetic representation.
5.4. A typology of the phonological representation
5.4.1. 5.4.1.1. The structural relations of alignment obtaining between lexical, prosodic and rhythmic representations have been termed 'conditions' in the previous discussion in 5.2. Indeed, as noted before, these 'lines of association' linking the units of the different representations may be viewed as constituting
a set of well-formedness conditions on the relations between the representations of phonological structure. The alignment or association of units - phonemes and lexical words - of the lexical representation with (units of) the phonological hierarchy in prosodic and rhythmic representations, constitutes the 'lexical insertion' or 'spellout' process within the phonological component of a grammar. The structural relation expressed by this alignment may be considered one of 'composition', in the sense that the units of the lexical representation form the 'minimal' (i.e. phoneme) and 'maximal' (i.e. word) 'free form' components of the suprasegmental hierarchy. In effect, of course, lexical words are by and large (but, note, not invariably) compositionally 'equivalent' to the formatives of the suprasyllabic hierarchy, and phonemes equivalent to the segments of the subsyllabic hierarchy as reflected in both prosodic and rhythmic representations. Phonotactic statements on permissible concatenative patterns of phonemes (in, for example, syllables) may then be defined positionally and compositionally with reference to the association of elements of the syntagmatically motivated structure (proclitic) peak → head (enclitic) of the rhythmic representation, and the paradigmatically motivated s/w-labelled constituent structure of the prosodic representation - both at (syllable), set and segment levels within the phonological hierarchy. The specification may be implemented via the intersection of values of the phonemic features of lexical representation and those of the [Rhythmic] feature and prosodic feature [BOUNDARY] - with its 'disjunct' distinctive features [Consonantal/Vocalic] at syllable-set and [Closure] at set-segment levels. Thus it has been shown in 5.1. of the present chapter, that positional occurrence restrictions on segments within sets (and sets within syllables) give rise to a distinctive phonological interpretation of possible feature contrasts, available at particular locations within sets, and thus define the phonological feature [BOUNDARY] as marking as prosodically strong a position of maximal distinctive contrast at these levels. Furthermore, the feature [Rhythmic] within the rhythmic representation marks the structural element head, which in turn at set level (as 'nucleus'), being 'rhythmically' an obligatory constituent of syllables, imposes constraints on phonemic occurrence at that position. Of course, such phonotactically relevant prosodic and rhythmic feature definitions may have to be defined anew for each language, and in any case language-specific 'collocation restrictions' will have to be accounted for separately in the description. However, it does seem that the present framework offers the possibility of developing the type of integrated positional-compositional statement of segment or phoneme concatenation within syllables
based on a phonotactically sensitive feature system as demanded recently by, e.g., Selkirk (1982a). As has been previously noted, Selkirk herself proposes an analysis in terms of 'sonority indices' of segments, which at the same time incorporates an account of language-specific restrictions (in this context, cf. again Cairns and Feinstein 1982 for a universal markedness solution based on syllable positions). While in Selkirk's terms every syllable tree must be non-distinct from a syllable 'template' (i.e. the characterization of possible syllable structure (1982a:12)), which in turn includes the conditions '(a) a characterization of the internal structure of the syllable (into onset and rhyme), (b) a specification of the minimum and maximum number of the terminal positions in the syllable, (c) a set of conditions on terminal nodes' (1982a:13), the ultimate goal of such analyses, still to be attained, remains that of a matching of the terminal positions of the syllable template - whether defined in terms of segments, features or place-holding C and V elements (cf. Halle and Vergnaud 1980) - with the form of an actual representation of syllable structure. Again, it is claimed that an analysis within the present framework at least partly achieves this goal of matching. 'Syllabification', as another phonological condition, is in the present framework simply statable as the coincidence of phonemes (or segments) with syllables, the latter forming a constituent unit-level of the phonological hierarchy in their own right. Doubtless here too, there are universal conditions or tendencies on the 'coincidence', such as the 'Maximal Onset, Minimal Coda Principle' as formulated by Pulgram (1970) - for further discussion cf. Selkirk (1982a, 1982b) - as well as language-specific conditions such as those of 'resyllabification', which violate this universal principle. For example in English, certain types of potentially syllable-initial consonants in unstressed syllables (e.g. [t]) are claimed to be shifted to syllable-final position in the preceding syllable if that is stressed and within the same word (cf. Kahn 1976; Selkirk 1982b). In summary, one may claim that the alignment or association between the different phonological representations, expressed as 'lexical composition conditions' between lexical representations on the one hand and prosodic and rhythmic representations on the other, and as 'strength location conditions' between prosodic and rhythmic representations, and interpreted as well-formedness constraints, constitutes typologically the structure-forming processes of the phonology.
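The 'Maximal Onset, Minimal Coda Principle' referred to above may be illustrated by a minimal sketch. The vowel set, the inventory of legal onsets and the function name `syllabify` are hypothetical simplifications introduced here for illustration only; they are not part of the present framework or of Pulgram's formulation, and language-specific resyllabification is deliberately left out.

```python
# Hypothetical, heavily simplified inventories (assumptions for illustration).
VOWELS = {"a", "e", "i", "o", "u"}
LEGAL_ONSETS = {
    (), ("p",), ("t",), ("k",), ("s",), ("r",), ("l",), ("n",), ("m",),
    ("p", "r"), ("t", "r"), ("k", "r"), ("p", "l"), ("k", "l"),
    ("s", "p"), ("s", "t"), ("s", "k"), ("s", "t", "r"), ("s", "p", "r"),
}


def syllabify(phonemes):
    """Split a phoneme list into syllables, giving every non-initial
    syllable the longest onset found in LEGAL_ONSETS."""
    nuclei = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if not nuclei:
        return [list(phonemes)]
    starts = [0]  # index at which each syllable begins
    for prev, nxt in zip(nuclei, nuclei[1:]):
        split = nxt
        # Try the longest candidate onset first, then shorter ones.
        for start in range(prev + 1, nxt + 1):
            if tuple(phonemes[start:nxt]) in LEGAL_ONSETS:
                split = start
                break
        starts.append(split)
    starts.append(len(phonemes))
    return [list(phonemes[a:b]) for a, b in zip(starts, starts[1:])]
```

Under these assumptions a medial cluster is assigned to the following syllable as far as the onset inventory allows, and only the remainder is left as coda of the preceding syllable; each vowel symbol is treated as its own nucleus.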
5.4.1.2. In contrast to these structure-forming processes, the mechanisms, statable as rules, which constitute the phonetic interpretation may be termed the structure-realizing processes of the phonology. The rules which give a phonetic interpretation to the phonological representation via an integration of values of the prosodic phonological features [CENTRE], [BOUNDARY], [Centre] and [Boundary], the rhythmic feature [Rhythmic/Arhythmic] and the lexical features [Accent] and [Manner], [Place], etc., via the phonetic cover feature [precision] (subsuming [pitch], [stress], [quality] and [quantity]) and the phonetic feature [rate], are conceptually equivalent to the 'processes' of natural phonology (see, e.g., Stampe 1973; Donegan and Stampe 1978, 1979) and the P(honetic)-Rules of natural generative phonology (see Hooper 1976). In typology and function, they are truly 'generative' (as opposed to 'relational') rules (see Tiersma 1983). The result of these structure-realizing rules is to effect local 'strengthening' and 'weakening' processes in the phonetic representation as they 'realize' phonological structure, i.e. they achieve the 'suprasegmentally conditioned' phonetic form of the structurally defined units of the phonological hierarchy. Degrees of strengthening and weakening are directly reflective of values accorded to the phonological features. Given the linear composition of the phonetic representation, the realization rules 'strengthen' or 'weaken', i.e. 'augment' or 'reduce', the phonologically derived features associated with places in the syntagma and accord them a phonetic interpretation in terms of 'augmented' or 'reduced' values of phonetic features and their parameters. In general, one may observe that 'feature (value) spread' characterizes points of 'strong' phonological marking - for prosodic strength and rhythmic head elements, for instance; i.e., a 'spreadable' phonetic feature associated with a particular unit-position in the syntagma will 'impinge' on an equal-level adjacent sister unit (inasmuch as the latter is part of the same rhythmic structure as the former). By contrast, 'feature (value) shrink' may characterize points of 'weak' phonological marking - for prosodic weakness and rhythmic proclitic elements, for example; i.e., a 'retractable' feature associated with a particular unit-position will be reduced in value or lost under the influence of feature values accorded to adjacent unit-positions in the syntagma. Examples of both 'spreadable' and 'shrinkable' features are length, the phonation parameters voice and breath, and the manner parameter fricative. Clearly, these phenomena reflect, respectively, the familiar processes of assimilation and dissimilation. Again here there will be language-specific constraints as to which features may be augmented and reduced and/or spread and retracted;
e.g., whereas in Dutch, syllable-initial 'voiceless stops' [p t k] are 'strengthened' by an augmented length value, closure value (i.e. tight) and 'added' glottalization (vocal cord adduction), in English the same segment types in the same position in the syntagma are 'strengthened' by augmented manner and place values plus augmented and spread phonation parameter breath values (i.e. full aspiration).
5.4.1.3. Claiming the existence of conceptual similarity between the structure-realizing processes or rules proposed here and structural mechanisms found in certain other phonological theories does not, of course, imply that these theories share the same assumptions about the nature of phonological and phonetic description, against the background of which the theoretical status of such mechanisms must be assessed. While, for example, both the theories of Stampe and Hooper share the attribute 'natural', which requires that the major proportion of the structural regularities noted in the phonology are to be defined in terms of some kind of phonetic motivation (see, however, Anderson 1981 for a balanced critique of this view), their theoretical foundations and descriptive orientations differ considerably. Stampe defines phonological 'processes' as 'mental operations performed on behalf of physical systems involved in speech perception and production' (1973:9), 'mental in occurrence, physical in teleology'. While phonological processing in speech production is governed by phonetic teleologies, in speech perception it also represents a form of teleological analysis on the part of the listener, "projecting from what is heard to the phonological intentions of a speaker" (Donegan and Stampe 1979:158-159). Phonological representation is thus seen to be the 'phonological intention' (and 'perceived phonological intention') of speech, and a natural phonological system must be interpreted as the 'system of limitations which stand between the intention and the actualization of speech, i.e. between phonological and phonetic representation' (1979:163). As has been noted in the discussion in previous chapters, a crucial distinction is made in natural phonology between 'rules' and 'processes', the former characterizing only morphophonemic alternations, e.g. of the Velar Softening Rule type in English /elektrik/ → /elektrisiti/, the latter phonological substitutions which characterize phonetically motivated alternation and variation. 'Rules' are 'constraints that the language brings to the speaker'; 'processes' are 'constraints that the speaker brings to the language'. A list of dichotomies which distinguish processes and rules
respectively includes: 'synchronic phonetic motivation' vs. 'no synchronic phonetic motivation'; 'innate' vs. 'learnt'; 'optional or obligatory' vs. 'obligatory'; 'apply to tongue slips, Pig Latins and foreign words' vs. 'do not apply'; 'apply instinctively and unconsciously' vs. 'originally consciously formed' (Donegan and Stampe 1979:144). The three main types of processes that Stampe distinguishes are those of 'prosodic', 'fortition' and 'lenition'. While the specification of prosodic processes is left rather vague - they 'map words, phrases and sentences onto prosodic structures, rudimentary patterns of rhythm and intonation' (1979:142) - the discussion of fortition and lenition processes, as has been shown above in Chapter Two, is highly relevant to the view of phonological structure presented here. Fortition processes "intensify salient features of individual segments and/or their contrast with adjacent segments" [my italics] (Donegan and Stampe 1979:142): they have a perceptual teleology and "are found in 'strong' positions, applying especially to vowels in syllable peaks and consonants in syllable onsets, and to segments in positions of prosodic prominence and duration" [my italics] (ibid.). By contrast, lenition processes make segments "easier to pronounce by decreasing the articulatory 'distance' between features of the segment itself or its adjacent segments" [my italics] (ibid.): they have an articulatory teleology and "tend to be context-sensitive and/or prosody-sensitive, applying especially in 'weak' positions, e.g. to consonants in 'blocked' and syllable-final positions, to short segments, unstressed vowels, etc." [my italics] (1979:142-143). As previously noted, an example of a fortition process is the syllabification of 'pre-tonic resonants' as in English [pr̩eid], an 'emphatic' pronunciation of "prayed"; an example of a lenition process would be the de-syllabification of a 'pretonic' syllable as in [preid] for "parade".
The distinction which Stampe draws between 'rules' and 'processes' is of course not isomorphic with the distinction made in the present account between structure-forming conditions or processes and structure-realizing rules or processes, although the latter do offer a formal conception of the kinds of 'processes' Stampe identifies. Rules of the morphophonemic kind, which form an integral part of the SPE phonological component (e.g. the Velar Softening Rule, the Tensing Rule), express phonological regularities obtaining between well-formed lexically and/or morphologically related items. Indeed, there have been an increasing number of arguments put forward in recent years in favour of removing such rules from the phonology altogether to a word-formation or morphological (sub-)component of the grammar (cf. Linell 1979)
or removing them to the lexicon (cf. Kiparsky 1979). Linell's 'Morphophonological Rules Proper' (MRP's) and Hooper's 'Morphophonemic Rules' (MP-rules) and 'via rules' (relating phonological alternations between lexical items) have been characterized by Tiersma (1983) as examples of 'relational rules' - as opposed to 'generative rules' -, the main properties of which he summarizes as follows (1983:74):
"1. A relational rule is non-generative. It serves to relate lexicalized alternants to one another, rather than deriving surface alternants from underlying forms. Every generative rule may also function as a relational rule, but the converse is not true.
2. A relational rule is 'optional' in the sense that the awareness of such a rule may or may not be part of the competence of any particular speaker. In addition, speakers may or may not be cognizant of the relationship between any specific etymologically related lexical items."
Clearly, such relational rules have no place in the present model of description thus far developed, although in principle of course phonologically regular alternations between morpholexical units may need to be accounted for by means of a type of relational rule in any attempt to state grammatically (i.e. syntactically, semantically or phonologically) motivated inter-item regularities holding between either lexically or morphologically defined units. The type of structure-forming conditions posited here, which express associations between different types of phonological representation, are, however, broadly 'relational' in function, whereas the type of structure-realizing rules suggested are 'generative' in function. Stampe's 'processes', while embedded within a particular 'teleological' view of the status and role of phonological description and linguistic description in general, a view which is 'natural' in a broad sense, seeing language as 'a natural reflection of the needs, capacities, and world of its users, rather than as a merely conventional institution' (Donegan and Stampe 1979:127), closely parallel in function the role of the structure-realizing rules posited here. Again, both 'processes' and structure-realizing rules are 'generative' in Tiersma's typology. However, as they stand, Stampe's 'processes' require further formalization (as much as the present structure-realizing rules require an adequate notational representation) if they are to justify their place and status within a phonological theory. It is not clear, for example, what form the 'phonological representation' as 'intention' will take, what its structural primes are, etc. These considerations are crucial to the further definition and specification of 'processes'. Indeed, it would seem to
be characteristic of 'natural' theories of phonology that while much attention has been paid to rules (including 'processes'), little has been given to representations, this in contrast to metrical phonology where the emphasis in the description is precisely the converse.
5.4.1.4. Linell, a phonologist sharing much of Stampe's perspective on the role of phonological description, seeing phonology as 'concerned with the linguistic aspects of sound structure and articulatory and perceptual behaviour' (1979:31), and rules as 'properties of or [as] conditions on linguistic strings or the underlying constructing operations' (1979:16), proposes a typology of phonological rules relevant to the present concerns. He distinguishes Morphophonological Rules Proper (MRP's) from Phonotactic Rules (PhtR's), Perceptual Redundancy Rules (PRR's), Articulatory Reduction Rules (ARR's), and what he terms 'sharpening and elaboration rules' (1979: ch.10); PhtR's, PRR's, ARR's and sharpening rules together constitute the phonological rules of a language. An example of a PhtR is one that specifies the predictability of an /e/ in Spanish word-initially before /s/ followed by a consonant, statable as (Linell 1979:168):
IF:    # __ sC
THEN:     e
This obviously expresses a sequential well-formedness condition obtaining between concatenative elements - segments or phonemes - in the phonological syntagma. PRR's express, for example, 'extrinsic allophonic' realizations of phonemes such as lack of aspiration of /p t k/ in English following /s/ in syllable-initial position, statable as the rule (1979:170):
IF:    s  [-son, -cont]
THEN:     [-asp]
Condition: no intervening syllable boundary
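The condition stated in this rule may be made concrete by a minimal sketch. The function name `mark_aspiration`, the representation of syllables as lists of phoneme symbols and the three-valued result are assumptions introduced here for illustration; they are not Linell's notation, and the sketch covers only the segmental condition, not the prosodic and rhythmic interpretation discussed in the present chapter.

```python
# The stop class referred to by [-son, -cont] in the rule, limited here
# to the voiceless series /p t k/ that the text discusses.
VOICELESS_STOPS = {"p", "t", "k"}


def mark_aspiration(syllables):
    """Pair every segment with an aspiration value.

    True  : syllable-initial voiceless stop (aspirated)
    False : voiceless stop directly after syllable-initial /s/ ([-asp])
    None  : the rule makes no statement about the segment
    Working syllable by syllable encodes the condition that no syllable
    boundary may intervene between /s/ and the stop.
    """
    result = []
    for syllable in syllables:
        marked = []
        for i, seg in enumerate(syllable):
            if seg not in VOICELESS_STOPS:
                marked.append((seg, None))
            elif i == 0:
                marked.append((seg, True))
            elif i == 1 and syllable[0] == "s":
                marked.append((seg, False))
            else:
                marked.append((seg, None))
        result.append(marked)
    return result
```

On this sketch a stop that follows /s/ across a syllable boundary is treated as syllable-initial and therefore as aspirated, which is what the condition on the rule requires.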
However, as Linell himself admits, it is difficult to draw a hard and fast boundary between PRR's and PhtR's: is, for example, the rule for the distribution of /ç/ vs. /x/ in German a Phonotactic or a Perceptual Redundancy Rule? The problem of where in the phonological representation to accommodate rules specifying extrinsic allophones is of course a perennial one in various models of phonological description. In the present description, it may be claimed that syllable-initial post-/s/ /p/ as unaspirated in English may
be explained with reference to the phenomenon of 'feature retraction' referred to in a previous section of this chapter (as much as syllable-initial /p/ as aspirated is covered by the phenomenon of 'feature spreading'). At set level within /sp-/, it is the segment /s/ (or [s]) that will be prosodically marked for strength via the phonological feature [Boundary]. In a phonetic interpretation, values of the phonetic features and parameters pertaining to /s/ will be 'augmented' along, in particular, 'consonantal' dimensions and 'spreadable' features extended to other unit-positions within the immediate rhythmic structure, here onset. The length of /s/ will be augmented and 'impinge on' that of /p/, reducing the duration of the latter. In turn, the phonation value breath associated with /p/ as prosodically weak will 'retract' to the /p/ itself. Friction of /s/ may also 'spread' to /p/ in the sense that in a 'rapid colloquial' style of speech, often during the /p/ segment the lips are not fully closed, but rather approximated in a fricative-type manner gesture. This in turn is compounded by the low length value of /p/, which also derives from its [+rate] value as part of the onset of a syllable-level rhythmic structure. The same arguments may of course also apply to post-/s/ syllable-initial /t/ and /k/ in English. Another example of extrinsic allophony in English which potentially lends itself to interpretation either within Linell's PhtR's or PRR's is that of the distinction and distribution of 'clear' and 'dark l'. In articulatory terms, clear l's within syllable-initial positions manifest a firm medial closure at the alveolar ridge; dark l's within syllable-final positions show, in contrast, a looser primary alveolar closure (as is also evidenced by the increasing 'vocalization' of post-vocalic l in (British) English). By and large of course, syllable-final positions are weak prosodically, hence the 'weakening' of the manner value. Where, however, they form the focus of, for example, secondary suprasyllabic delimitative marking via the feature [Boundary], it is the 'vocalic' element of the phonetic characterization of syllable-final 'dark l's' that is lengthened, i.e. the secondary articulation of velarization/uvularization which actually defines its 'darkness'. This is equally true of lengthening brought about by a syllable-final l constituting (part of) an enclitic unit within rhythmic structure and being subject to [-rate], i.e. 'decreasing' (or 'slowing') tempo. Thus, it is possible within the present framework to accommodate the realization of 'extrinsic allophones' in terms of the phonetic interpretation of phonological representation via the concept of structure-realizing rules. By the same token, 'intrinsic allophony' is also accountable for in the
present framework - equally, in terms of the phonologically derived contextual constraints imposed on phonetic form. Indeed, within the terms of the present model there is no conceptual distinction between a phonetic context-sensitive intrinsic allophony and a phonological context-free extrinsic allophony of phones. Thus 'devoicing' of post-'voiceless stop' syllable-initial approximants in English, i.e. [w], [j] and [ɹ] after [p t k] as in "twenty" [tw̥enti], "cure" [kj̊uə] and "pray" [pɹ̥ei], as much as nasalization of inter-nasal vowels in unstressed syllables as in "moaning" [mounĩŋ], or indeed the well-noted examples of consonantal assimilations between and across words of the (regressive place) type "that man" [ðæpmæn], constitute a phonetic variation derivable from a structural phonological description as is proposed here. In each of these examples, a structurally weak position phonologically is marked by phonetic 'quality reduction'. The basis of feature spreading and retraction does not rest on a relationship of similarity or difference between features of adjacent phones as such, although the features involved in these 'strengthening' and 'weakening' processes must be 'compatible' with the specification of contiguous phones, but rather emanates from phonologically derived properties of unit-positions which affect the values of certain phonetic features and parameters associated with the compositional phones. Exactly which features and values are 'affected' and thus 'spread' (or 'augmented') or 'retracted' (or 'reduced') depends on language-specific subsets of possibly universal tendencies in 'assimilation' and 'dissimilation' phenomena (cf., e.g., Webb 1983), which express the particular syntagmatic phonetic 'bias' of a language. This, in turn, is subject to lexically derived constraints on degrees of permissible variation ('strengthening' or 'weakening') in the pronunciation of word forms - see below the discussion of Linell's 'sharpening rules'. However, at the same time, it must be noted that these arguments do not preclude the possibility of, in essence, extra-linguistic 'performance conditions' constraining phonetic form and with it, 'strengthening' and 'weakening' processes. Conditions of the speech situation may bring about changes in overall speed and accuracy of delivery, for example, producing 'allegro forms' of phonological units (cf., e.g., Zwicky 1972; Dressler 1975; also further Brown 1977). However, these constraints on phonetic form fall outside the competence-based, phonologically derived statement of sound variation proposed here. In effect, the phonetic representation in phonological theory must be conceived of as 'style-neutral', and while the quantitative and qualitative variation in phonetic form brought about by 'style-shifting' (casual - formal, etc.) may
in principle be of a different order to the permitted variation as constrained by phonological structure itself, ultimately the limits of such variation are set by the kind of system- or structure-internal conditions discussed here. Returning to Linell's rule typology, his Articulatory Reduction Rules "express generalizations about differences between different pronunciations of the same word forms, i.e. differences in degree of reduction" (1979:172), an example being a rule relating a full pronunciation of American English "winter" as [wintɚ] and a reduced pronunciation [wiɾɚ]. That is, such rules seem to express the language-specific limits on the reduction in articulation which in the present model derive from phonologically determined structural properties. Within the present analysis, the form such reduction can take can be accounted for by an examination of the effects of prosodic and rhythmic (and lexical) structure on the elements filling structural positions. It is significant, for instance, that in the example given, an [ɹ] 'trace' is left in the final vowel in the form of a rhotacized [ɚ] as a 'word-final' (i.e. delimitative) marker. Indeed, a close phonetic analysis of [ɚ] will reveal that it is in fact produced as a weak velar approximant, the velar 'place' being a reflection of a secondary articulation present in the 'full' pronunciation of the phone [ɹ] (which of course has palato-alveolar as its 'primary' place). Again, as with a reduced 'dark l', it is found that a 'vocalic-type' secondary articulation parameter 'remains for' prosodic strong delimitative and rhythmic peak (head/enclitic) marking, while primary place (and manner) as a 'consonantal-type' parameter is reduced or retracted as a reflection of prosodic weak culminative marking. The phone [ɚ] could equally be transcribed as [ɯ], also indicating its syllabic status. Concerning the reduction of [t] to [ɾ], although it is syllable-initial, it is in an 'unstressed' syllable, i.e. a suprasyllabically culminatively 'weak' syllable. Moreover, it seems to be a general fact of American (and British) English that syllabic phones 'weaken' their 'onsets': cf. "codling" vs. "coddling", where the [d] in the latter as onset of the syllable [dl̩] has a characteristically weak closure. In terms of rhythmic structure at the syllable level, it has been noted that [ɚ] or [ɯ] fills both nucleus and coda (i.e. head and enclitic) positions, and therefore is marked by the feature [Rhythmic] phonologically and [-rate] (i.e. 'decreasing') phonetically. Therefore one may conclude that at syllable level there is syntagmatic pressure on the onset [t] in "winter", which has a rhythmic feature [Arhythmic], phonetically [+rate] (i.e. increasing), to further 'weaken'. This balance between prosodic delimitative pressure for 'strengthening', prosodic culminative pressure and rhythmic pressure for 'weakening' is achieved by the maintenance of a closure manner, but 'reducing' the 'degree' of manner from stop to tap in the interests of locally high utterance rate (a tap articulation being intrinsically faster than a stop). Concerning the elision of [n] in the reduced pronunciation [wiɾɚ], it is of course in 'weak' syllable-final position culminatively, but 'strong' delimitatively. Rhythmically at syllable level it is a coda part of the rhyme (enclitic as part of peak) and locally will be assigned [Rhythmic], phonetically [-rate]. Elision of the place (and manner) of articulation leaving a nasal 'trace' is commensurate with these structural constraints: the nasal 'trace' which accrues to the vowel [i] and even 'spreads' to [ɾ] is reflective of the phonetic parameter nasality which is 'lengthenable' (or even 'spreadable') in the interests of prosodic delimitative marking and rhythmic peak marking.
Linell's 'sharpening' and 'elaboration' rules are said to typically apply in 'lexical pronunciation' (i.e. the pronunciation of the isolated form of words), affecting especially word-final segments, as well as in phrase-final position and/or 'under emphasis' (1979: 56-57, 169). In this sense they correspond conceptually to Stampe's 'fortition processes', as the Articulatory Reduction Rules correspond to Stampe's 'lenition processes'. Note that the contexts quoted are those in which within the present approach s-marked units would be expected to occur, delimitatively in the suprasegmental units word and phrase, culminatively in stressed (word) positions. Such rules (together with the ARR's) clearly correlate with the type of (phonetically interpreted) structure-realizing rules proposed here. Significantly, Linell gives as an example of a 'sharpening' rule German (and Dutch) word-final "Auslautverhärtung" or 'final tensing', whereby, e.g., /p - b/, /t - d/, /k - g/ in German 'neutralize' to phonetic [p], [t], [k], respectively, in word-final position. In effect, Linell treats 'final tensing' as a partly phonologized phonetically motivated rule. Again here, as with the discussion on Perceptual Redundancy Rules, the question arises as to the theoretical status of the rule, and one may conclude again that given a rich enough phonological structure of the type proposed here, a typological distinction between distributional allophonic rules and phonetic realization rules may disappear. Word-final tensing in German and Dutch as a phonetically 'strengthening' process has sometimes been seen as an anomaly, in that word-final position has traditionally been regarded as a 'weakening' environment. However, here
the positing of suprasyllabic delimitative marking in the prosodic representation may account for this. The choice of 'tense' (breathed or 'voiceless') forms word-finally is again in phonetic interpretation a product of the constraints of prosodic and rhythmic structures. Rhythmically, such segments would be included in the peak (as enclitic) at word level and marked phonetically [-rate]. A lengthenable feature to accommodate decreasing rate would here be the manner of articulation stop. Prosodically, there must be a delimitative phonetic interpretation - here completeness of closure (and length). Maintaining voicing (or whispery voicing) under these conditions is not ideally compatible with the requirements of lengthening, and the phonation type breath is produced. This interpretation would equally apply to the fricative series /f - v/, /s - z/, which likewise neutralize to the breathed form word-finally. There has of course additionally been evidence put forward that final tensing may constitute a universal phonetically motivated phonological process. (It has also been noted as a characteristic of the early language development of children.)
5.4.1.5. Extensive reference has already been made in chs. 2 and 4 to the natural generative work of Hooper (1976). In ch. 4, Hooper's concepts of positional and segmental strength and their relation to syllable structure were discussed with reference to the development of a phonological 'strength' metric within the framework currently presented. There, as here, there is much support to be gained for the present approach by a close examination of Hooper's ideas. First and foremost, the distinction Hooper makes between on the one hand P(honetic)-Rules as 'automatic rules of phonetic detail' and on the other, MP (Morphophonemic)-Rules and via-Rules corresponds of course to similar distinctions made by Stampe and Linell. The latter fall outside the scope of the present perspective, but may be typified as morphological and lexical 'relational rules' in Tiersma's typology (for a discussion of natural generative phonology see Tiersma 1983:72-73), and may perhaps be said - with Linell's MRP's and Stampe's 'rules' - to constitute sets of structure-regularizing conditions or rules (as opposed to the structure-forming and structure-realizing rules thus far distinguished). P-Rules correspond to the last type of phonological alternation: they are motivated phonetically by describing alternations "that take place in environments that are specifiable in purely phonetic terms" (Hooper 1976:14). P-Rules account for "the way surface contrastive features will be manifested in a phonetic environment" (1976:16) and in this sense conform to a general axiomatic principle of natural generative phonology, that of the
'True Generalization Condition', which states that "the rules speakers formulate are based directly on surface forms and these rules relate one surface form to another rather than relating underlying to surface form" (1976:13). This view is quite consistent with the present framework, in which phonological 'deep structure' in the SPE sense of 'underlying phonological representation' plays no role in the description offered. The only place in a model of phonological description where such 'underlying representation' could be motivated would be in a related component comprising the set of what have been termed structure-regularizing rules, which state phonological relations holding between morphological or lexical forms, i.e. phonological regularities 'correlating with' or 'manifesting' morphologically and/or lexically motivated alternations of the type Velar Softening (/ilektrik/ → /eliktrisiti/). Consistent too with Linell's, Stampe's and the views expressed here is Hooper's conception of lexical phonological representation taking the form of items (in Hooper's case, words and morphemes) consisting of some type of surface segments. It remains a matter of debate, however, how 'surface' such segments should be represented as being, for example, in terms of their feature specification. Hooper suggests an 'archisegmental', i.e. non-redundant, feature specification in the lexicon, which "only contains enough classificatory information to trigger the correct P-rules" (1976:126). Further relevant to this issue is of course the treatment of allophonic rules in the phonology. Note, however, that the present structure-realizing rules do not 'operate on' lexical representations directly but on associated units of lexical, prosodic and rhythmic representations as they constitute the output of structure-forming rules or conditions. It has been demonstrated above, for example, that English 'unaspirated p (t and k)', 'dark l' and American English 'flapped t' could all be derived via structure-realizing rules. Clearly, this does not support their specification in lexical representation as independent segments ('phonemes'), i.e. separate from, respectively, /p t k/ or aspirated [pʰ tʰ kʰ], /l/ or 'clear l' [l], and /t/ or aspirated [tʰ] or unreleased [t̚].
In this context, one may consider the descriptive problem of Spanish Spirantization in Hooper's analysis. This concerns the 'weakening' of Spanish /d/, /b/ and /g/ in intervocalic environment, or more precisely, when these phonemes occur immediately after a 'stressed vocalic nucleus' and are themselves followed by a vowel; examples are [armaða] "armada", [hueβo] "huevo" and [laɣo] "lago". The 'weakening' may also occur syllable-finally, but not syllable-initially. Since Hooper posits a defining link between the occurrence and form of P-Rules and structural conditions of the syllable, such that the former are sensitive
to the strength values of individual segments and individual syllable positions - in spirit much as is characteristic of the present model -, it must be seen as an anomaly that a weakening process of Spirantization occurs in an 'intrinsically strong' position, i.e. syllable-initially. Post-accent (-tonic) position - in particular intervocalically - has of course often traditionally been treated as a weakening environment, especially for obstruent consonants, in that it constitutes a type of locus of voicing assimilation and/or manner degree weakening (cf., e.g., Lass and Anderson 1975). This has prompted phonologists, including metrical phonologists, to suggest that obstruents in this position be 'resyllabified' to the (coda position of the) preceding syllable, as has been noted in previous discussion. However, an explanation is possible within the present framework without resorting to a resyllabification device. Treating [ð], [β] and [ɣ] as initial in their syllables, a prosodic representation would not mark them as strong suprasyllabically in the hierarchy, since for culminative marking purposes they are in an unstressed syllable. On the other hand, subsyllabically they are found in a delimitatively strong (i.e. syllable-initial) position. In the rhythmic representation, the segments take up enclitic position at word level, proclitic at syllable level. Prosodically, suprasyllabic strength values will outweigh subsyllabic ones per segment by the type of algorithm offered in ch. 3. Rhythmically, at word level the segments form part of the peak (as enclitic), whereas at syllable level they form part of the proclitic as onsets of their syllables. These various phonological constraints then combine to 'condition' the 'weakening' in manner degree from stop to fricative.
Exactly how they combine may be interpreted as follows: in prosodic terms, suprasyllabic 'weakness' and subsyllabic 'strength' combine to ensure that in a weak syllable, syllable-initial position must continue to be marked, i.e. there cannot be too radical a 'reduction' in the phonetic identity of the marking segment. In rhythmic terms, word-level enclitic status will favour a segmental form compatible with the pressures for reduced rate, whereas syllable-level proclitic status will favour the opposite, i.e. a segmental form for high rate. This is resolved by the production of a fricative form, which is conducive to lengthening and at the same time articulatorily faster to produce than a stop.
A further elaboration of the American English flapping reduction [t] → [ɾ] in the word types "water", etc., discussed above in 5.4.1.4., is now possible. Again, prosodically it fills the same position and is subject to the same 'pressures' as Spanish /d/, /b/ and /g/, i.e. suprasyllabically weak, subsyllabically strong. Rhythmically, it is equally comparable - enclitic within
word, proclitic within syllable. Here, however, voicing of [t] is commensurate with the demands of a 'lengthenable' feature of an enclitic segment; fast tap [ɾ] articulation, with the demands of a proclitic segment. These processes will clearly differ in their phonetic actualization depending on the segmental composition involved and it is likely that different languages may prefer different patterns. However, it is equally possible that there may exist a universal scale of preference or implication for such phonetic actualization, perhaps supported by diachronic evidence, as, e.g., suggested by Lass and Anderson (1975). Crucial to the implementation of such processes is, however, a lexical constraint on permitted segmental variation by which the (phonetic and) phonemic form of individual lexical items must remain clearly distinguishable. 'Phonemic overlapping' may even be allowed, as in the American 'flapped t', as long as, as in the English and Spanish examples discussed, the word forms are distinguishable also in their reduced form. Returning to the question of the specification of segments, i.e. phonemes, in lexical representation, a conclusion is indeed that these are to be described in terms of phonemic features of the kind that traditionally serve to distinguish phonemes as they occur in lexically determined contrastive environments (e.g. as in minimal pairs). With this view of the lexical representation of 'segments', there is no motivation to adopt the distinctive feature systems advocated by SPE (Chomsky and Halle 1968) with their sequential and simultaneous non-redundancies in specification, but rather a version of the three-way (-plus) labels of traditional descriptions (for comparable views, see also Linell 1979). In this respect, Hooper's claim for an archisegmentally defined lexical representation would not seem to be compatible with the requirements of a 'naturally' oriented phonological description.
A major limitation of Hooper's scheme is that by restricting attention to the syllable as the basic phonological unit, a definition of the structural determinants of phonological strength/strengthening and weakness/weakening relations is limited to reference to compositional and positional properties of just one unit. Clearly, this limitation severely restricts the explanatory power of the model. Other researchers, e.g. Selkirk (1982a), have also commented on the rather simple linear structure of the syllable that Hooper posits, the compositional elements of which are solely segments. An additional weakness is of course that the strength values accorded to segment types by Hooper only refer to consonants. However, the important observation made by Hooper that strong and weak positions in syllables correlate with the number of contrasts possible
in these positions is of course to be found in the present analysis, in modified form, as a major determinant of suprasegmental (specifically here, prosodic) structural patterning.
5.5. The status of the phonological description
5.5.1.
5.5.1.1. It remains to consider now briefly the claims of such 'concretely oriented' theories, including the present approach, concerning the role and status of phonological description and its relation to linguistic behaviour (a full discussion of these issues would go far beyond the confines of the present study). The theoretical status of metrical phonology is of course firmly anchored within generative grammar. By comparison, Hooper's natural generative phonology distinguishes itself from orthodox phonology in that it considers the role of phonetically conditioned P-Rules to be central to phonological description and to constitute the main focus of phonological structuring, this in contrast to, e.g., the SPE model of phonology, which pays scant attention to the form and function of phonetic realization rules linking phonological and phonetic representations. Indeed, general dissatisfaction with the inadequate treatment of these rules has led to increasing interest being taken in recent years in the nature of such phonetic 'implementation rules' by more concretely oriented phonologists (for discussion see, e.g., Crompton 1981; Hewlett 1981). Hooper's views on the general focus and status of phonological description are conveniently expressed in the 'True Generalization Condition' (see discussion above) and 'No Ordering Condition' on P-Rules (for a discussion of ordering conditions on natural processes see Donegan and Stampe 1979:145-158). However, a fully developed conception of the 'natural' status of Hooper's phonological model would seem to be lacking. One often has the impression that 'naturalness' represents more a declaration of faith and general focus of interest in the description than a conceptually well-developed 'metaphysic'. Further, the relationship between a competence-based model of analysis and the behavioural sound-producing domain it is intended to make reference to in the theory may be considered to be ontologically naive: viz.
the statements that P-rules describe 'processes governed by physical properties of the vocal tract' (1976:16) and that P-rules are seen as 'dynamic processes that describe the act of articulation' (1976:126).
By contrast, Linell's model of description is embedded in a well-developed metaphysic concerning the nature of linguistic communication and the role of a phonological description within it. Linguistic communication is conceived of as an instance of goal-directed behaviour - a view comparable to Stampe's conception of the 'naturalness' of linguistic communication - and accordingly, must be approached within a teleological mode of explanation. Within this perspective, a speaker's phonological knowledge is concerned with (phonological) conditions that must be met in 'correct linguistic behaviour': "These conditions are not causally related to the speech events. Rather they enter a teleological explication of speech acts" (Linell 1979:15) - cf. Stampe's 'mental operations on physical events'. Within a teleological theory of linguistic communication, Linell considers the concepts action and act to be of fundamental importance: "Actions may be defined simply as behaviour governed by intentions, and acts are a sub-class, i.e. units of behaviour which are, often consciously, directed towards a certain well-defined goal (end state)
Acts and operations are supposed to be real units of behaviour, something that speakers, and listeners, actually do, or can do. However, these behavioural units must be explicated in phenomenological or linguistic terms, since we cannot today provide a causal theory of how they are actually implemented in the central nervous system" (1979:15). Phonological rules must not be directly equated with acts or behavioural events, but are rather to be understood as 'properties of or as conditions on linguistic strings or the underlying constructing operations'. Relating this ontological conception of phonological description to constructing operations in the production of speech, Linell views phonological form or 'representation' as constituting a phonetic plan for behavioural acts and operations, i.e. "a plan to perform a certain type of phonetic act, i.e. to produce a sound signal with certain specific (phonetic/phonological) properties. Thus, I assume that for each word form there is (at least) one phonetic plan that specifies its phonological identity" (1979:48). Speech production viewed within an actional view of linguistic behaviour comprises the operations of 'plan construction' and 'plan execution', and it is to these processes that Linell relates the 'function' of phonological rules. Morphological operations, including 'Morphophonological Rules Proper' (MRP's), are part of plan construction; Perceptual Redundancy Rules (PRR's), Articulatory Reduction Rules (ARR's) and elaboration or sharpening rules are part of plan execution. Phonotactic rules occupy a position in between, although as Linell is at pains to stress, one cannot expect all rule types to be mono-functional. A derivation of 'the fully specified articulatory plan' (of a particular
'reduced' pronunciation) from the 'phonetic plan' (i.e. 'phonological form') of the word "winter" in American English - see also above - would look like (Linell 1982:49):

phonological form (phonological plan): /winter/
    |  derivation by (perceptual) redundancy rules
    v
fully specified articulatory plan of careful pronunciation: [win·tɚ·] (in reality much more detailed)
    |  derivation by articulatory reduction rules (e.g. nasal absorption, flapping, /r/ absorption)
    v
fully specified articulatory plan of "reduced" pronunciation: [wĩɾɚ·] (in reality much more detailed)
Fig. 4: Derivation of an articulatory plan (after Linell 1982)
Each of the rule types and levels of phonological form constitutes the 'grammar-theoretical' analogue of a different type of knowledge/ability of competent speakers. Thus, 'knowledge of how to pronounce specific words' in different forms corresponds to 'knowledge of phonological forms' - phonetic plans, fully specified articulatory plans, articulatory reduction rules; 'ability to perceive pronunciations differently (at different levels of detail)' corresponds to 'knowledge of perceptual redundancy rules'; 'knowledge of general conditions on phonological-phonetic structure' to phonotactic rules; and 'knowledge how to form new phonetic plans (word forms) from lexical structure' to 'morphological operations' (Linell 1979:38). Clearly, Linell's theory in its developed ontological conception deserves the label 'natural generative phonology' perhaps more than Hooper's model of description. In terms of its internal methodology and its confirmed belief in the goal of 'psychological validity' in phonological description, it shares essential fundamental characteristics of orthodox generative phonology. The obvious main difference from the latter is in Linell's argument for a teleological interpretation of phonological structure deriving from an 'actional' perspective on the ontology of language, which, in its broadest terms, demands a theory of linguistic practice as it is performed by speakers, listeners and thinkers acting meaningfully to achieve social (and individual) goals in different social contexts (Linell 1980:40). This 'actional' perspective on language is not only consistent with Stampe's 'natural' phonology, but is equally consistent with recently developed models of speech production in
phonetic research (cf., e.g., Fowler et al. 1980), as indeed Linell points out. This conception of language leads him to all but abandon the distinction between competence and performance: "One of the metaphysical (i.e. general) axioms to be adopted in this work is that there is a very close relationship between a speaker's linguistic competence and his actual communicative performance" (1979:18). His theory, then, constitutes a description of 'communicative competence', since "A theory of a speaker's linguistic competence must explicate what speakers must know in order to communicate linguistically their various intentions in different kinds of situations" (Linell 1979:32). Given such a broad definition of what this type of competence involves, it is then perhaps surprising that further elaboration as to 'what speakers must know' phonologically is restricted to the specification of the phonetic-phonological conditions on the behavioural acts of producing and perceiving utterances. It may be argued of course that the kind of teleological view of linguistic description adopted by Linell conceptually limits the focus of a teleological view of language to that of its production and perception rather than, say, its 'use' - however defined - within any other socially and psychologically 'relevant' domain. The different types of phonological knowledge/abilities correlated with their 'grammar-theoretical analogues' as described all make reference to the production/perception domains of language actualization. This is reminiscent of the conceptual orientation of those few extant phonological models of performance, which are equally exclusively concerned with the production (and in certain cases, perception) aspects of language - cf., e.g., Fromkin (1968).
However, the original Chomskyan conception of performance as "actual use of language in concrete situations" (Chomsky 1965:4) is of course open in principle to description making reference to a variety of domains of language behaviour. Within the domain of production, for example, performance may be conceived of as a set of processing restrictions on the 'realization' of the grammar, but these processing restrictions may in turn be defined with reference to different domains of empirical investigation - physiological, psychological, neurophysiological, neuropsychological, etc., making use of descriptive primitives of different orders such as memory, attention, grammaticality judgements, etc., which themselves may or may not make contact with 'real-time' constraints, for instance. It is not the case in performance description that one domain or the other enjoys a privileged status in explanation. Nor, one might add, will the 'ultimate truth' be revealed within an as yet to be perfected neurolinguistic area of explanation, for example. By extension, neither is there, as Kean (1981) points out, any privileged set
of data constituting the 'most-favoured' (internal or external) evidence for linguistic explanation within Universal Grammar or competence. In this light, Linell's production/perception-oriented competence model, with its actional equation of phonetic plan and phonological form and functional correlation of rule type and plan construction and execution, offers perhaps a conceptually too limiting (and concrete) interpretation of a speaker's 'implicit or tacit knowledge' of language. Parts of Linell's argument, and more consistently especially in his most recent work (e.g. Linell 1982), lend themselves more closely to an interpretation of the theory in terms of a model of speech production rather than a linguistic model of phonological communicative competence within a teleological view of language use. Perhaps a final observation could be made that it is neither necessary nor sufficient for a 'phenomenological' interpretation of language and language use to make exclusive reference to production and perception domains.
At this point in the discussion, it may be useful to examine again the implications of the 'componential' view of the relationship between competence and performance adopted as the theoretical basis of the present study. It is based on the three axiomatic assumptions adopted from Kean (1981), as noted at the end of Chapter One: i) that "human linguistic capacity is to be conceptualized in terms of a grammar and a processor, the grammar being a theory of the structure of the system of linguistic knowledge, and the processor being a characterization of the structure of the mechanisms which allow a person to exploit that knowledge" (1981:175); ii) that "in linguistic performance the systematic levels of representation generated by the grammar are realized" (in processing), from which it follows that "all substantive partitions of items in a string which are captured by these representations are available to the processing mechanisms" (1981:191); and iii)
'in the absence of any data to the contrary', that "(a) the only systematic levels of representation realized by the processor are the grammatical levels of representation, from which it follows that (b) the only substantive partitions of items available to the processor are those which are available in the levels of grammatical representation (e.g. made available by UG)" (ibid.). The processing mechanisms, or 'processor', to be accounted for in a theory of performance in this view, do not have sole reference to the language-specific capacities involved in the activities of speech production and perception as conceived of within a goal-directed view of the speech act as an individual event. Speech production and perception seen in this way, of course, may well constitute a legitimate type of conceptual abstraction from the raw
data of a theory of performance - with the goal of thereby sharpening the focus in a study of language-specific processing capacities and in the explanation of the 'interface' between levels of representation (and rules) of the grammar and the 'computational mechanisms for processing language'. Such a conceptual abstraction away from primary data involves taking an a priori stance on a general kind of 'behavioural legitimacy' of linguistic structures, which in Linell's case is then established in a particular form by an equation of representations and rules with properties - plans, construction and execution - of an actional theory of language use. This is of course acceptable as it stands: however, the legitimacy thus established for the rules and representations of the grammar is severely limited to their relevance within (the abstracted view of) the behavioural domain that serves as the context for their definition. Note too again that Linell offers a theory of (communicative) competence, not performance. The teleological view of language production (and perception) adopted by Linell represents of course only one rather restricted interpretation of a goal-directed view of linguistic behaviour. Moreover, the linguistic means for achieving goal-directed ends themselves may be conceived of in different ways: e.g., in terms of linguistic products, processes and/or underlying capacities or knowledge. Indeed, this kind of narrow interpretation seems rather at odds with Linell's call otherwise for methodological pluralism in linguistic theory formation (see also the critique of Drachman 1981).
By contrast, the approach followed here to the interpretation of language data within a linguistic theory is based on an a priori assumption concerning the relationship between a 'competence component' (i.e. a grammar) and a 'performance component' (i.e.
a processor) of the theory, and not on an a priori stance as to the general (behavioural) legitimacy of linguistic structure as established within a particular idealization of a behavioural domain. The latter assumptions characterize linguistic models of speech production purporting to be theories of 'performance' - such as that of Fromkin (1968) - in even greater measure than Linell's approach. In the present description, the grammatical representations which form the input to the processor are available for whatever ends the processor requires. As, for example, Crompton (1982) has shown in a performance study of speech production, different types of phonological representations of a grammar may function at different locations within a speech production programme; e.g. the 'text', i.e. the input to an articulatory programme, is claimed to be analogous to a phonemic representation and is hierarchically structured in terms of phonological phrases, syllables, syllable constituents and phones, whereas the
'library of articulatory routines' are syllable-sized and their 'addresses' are in the form of phonemic representations (Crompton 1982:136). Note that Crompton employs the term 'analogue' to represent the connection between units or representations of a performance model and those of a competence model. However, this interpretation reflects a perhaps unnecessarily cautious view of the relation between units and representations of a grammar and those of a processor. In the present view, this is not necessary: the ontological relation between competence and performance is one of inclusion, the latter including the former. The caution manifested in such work may conceivably be a product of a 'realist' view of performance as some kind of behavioural dimension, the determining parameters of which are multiple and where any postulation of the linguistic 'reality' of units must proceed carefully. Performance itself is of course a linguistic idealization or abstraction from, in principle, any and every behavioural domain - however specified - of language in use.
5.5.1.2. Little has been said so far on the status of rules (as opposed to representations) of the grammar in a theory of performance. A performance interpretation of grammatical rules has of course been severely discredited by the generally accepted failure of the Derivational Theory of Complexity to correlate grammatically established (transformational) rules with psycholinguistically defined production and perception mechanisms of on-line and/or acquisitional language processing. As has been noted, the kinds of rules proposed in the present framework - structure-regularizing, structure-forming and structure-realizing - all in the broadest sense constitute structural conditions on the well-formedness of a number of grammatical (i.e. phonological) representations. However, do they in any way constitute 'substantive partitions of the grammar'?
In the received generative dogma these rules constitute formal rather than substantive properties of grammars. However, significantly for the present argument, Chomsky and Lasnik (1977) propose that the distinction between formal vs. substantive universals may be conflated and a more fundamental distinction posited between what they term 'functional' and 'formal universals', the latter subsuming what were earlier termed 'substantive universals'. While functional universals specify the way in which rules apply to the data they are designed to describe, formal universals 'specify the form of rules in a grammar, the vocabulary in which they are stated and the way in which they interact' (Smith and Wilson 1979:253). Clearly, the form of phonological rules such as those suggested here constitutes as much a 'formal' (universal) property of the grammar as the representations they serve to relate. Thus any theoretically consistent
interpretation of 'substantive partitions' of the grammar available to a processor must also include reference to the position of rules as well as representations. Indeed, in a discussion of the 'psychological reality' of linguistic constructs within a generative grammar, Steinberg (1975:244ff.) has argued that one cannot assert the 'reality' of representations and at the same time deny the reality of the means by which they are connected. Linell (1980) labels such a view 'intermediate mentalism'.
5.6. Grammar and processor in foreign language learning
5.6.1.
5.6.1.1. It has often been asserted that certain types of rules equivalent to Stampe's processes, Hooper's P-Rules and Linell's Phonotactic Rules, Perceptual Redundancy and Articulatory Reduction (and sharpening) Rules - all notionally equivalent to the present structure-realizing rules, i.e. what Linell terms 'productive' (as opposed to non-productive) rules - 'apply' in certain language behaviour contexts. Stampe claims, as noted above, for example, that processes 'apply to' tongue-slips, Pig Latins and foreign words, whereas rules do not. Linell lists evidence from loan word nativization, child language, linguistic games, production tests, speech errors, misperceptions and transfer in foreign language learning to show that non-productive rules such as Morphophonological Rules Proper are not evidenced as 'processes' in such behaviours. However, there is clear evidence in foreign language learning, for example, that morphophonological rules (here a type of structure-regularizing rules) may be identified in transfer processes (cf., e.g., J. James 1977; Singh and Ford 1983). Obviously, evidence from these diverse sources of behaviours and behavioural contexts that a particular rule or rule-type has some kind of (behavioural) 'reality' must be 'filtered' via a defined theoretical position on the relationship between linguistic descriptions and behavioural domains of language. The tacit assumption behind the evaluation of this kind of evidence is of course that 'formal' properties (in the above sense) of a grammar are 'realized' in language in use or the use of language. However, in his discussion of transfer as evidence - which is of course of relevance to the concerns of this study - Linell (1979) notes that by no means all sub-types of PhtR's (Phonotactic Rules), PRR's and ARR's are observed 'to transfer' in foreign language learning situations and there are obvious non-linguistic conditions on the 'transfer' of rules, as Linell himself concedes (e.g.
time and type of exposure to the foreign language, etc.). On the face of it, then, this kind of 'evidence' for the
'reality' of grammatical rules in behavioural domains is less than totally convincing. However, within a well-defined conception of the relationship between linguistic competence and linguistic performance, such data may be interpreted in a more differentiated - and perhaps more convincing? - way. Even if one accepts as axiomatic the assumption that 'formal' (originally, 'substantive') partitions of the grammar are available to a processor, and that the sum statement of the 'grammar component' and 'processor component' constitutes a theory of performance, the question remains open as to in what way the processor avails itself of the grammar, i.e. representations and rules. Clearly, in foreign language learning there are external, i.e. non-linguistic, variables which in part condition the 'transfer' of elements of the grammar of the source language to the production of the target language, as there are in first language learning non-linguistic variables which 'determine' the 'employment', via generalization procedures, for example, of certain rules and representations of the developing grammar. However, a linguistic theory of performance, specifically that of the processor, couched ontologically within the terms of a generative grammar "would seek to establish the ways in which PF [Phonetic Form] and LF [Logical Form] are integrated in other cognitive systems in thought, expression of thought, interpretation and comprehension, and in such specific uses of language as communication with others" (Chomsky 1981:33). At the same time, Kean (1981) warns that "one must be cautious not to confound the contributions of various cognitive modules with that of linguistic capacity in considering data from language use" (1981:177). In practice, this may prove a difficult task.
Chomsky's answer is "to proceed to construct specific theories of cognitive systems, language amongst them, and to determine how they meet the task of explanation and providing insight into a variety of questions we may raise about thought, behaviour, and physical mechanisms" (1981:33). As far as foreign language learning is concerned, the kinds of cognitive and conceptual processes involved are those familiar from first language acquisition, namely those of abstraction and generalization - in the broadest definition. These processes can be found in any language behavioural domain or context where a speaker/hearer is confronted with new data (a second language, adult language, a playful form of language, a foreign word, data in the context of a language task, etc.), i.e. where there is a (language) learning situation or a (language) problem-solving task. Thus in task-defined behavioural contexts, the processes relevant in an account of performance - at
a general level of abstraction - are precisely those of 'abstraction' and 'generalization'. It is what Chomsky would call 'an empirical problem' to decide at what level and within which mode of abstraction to define the 'linguistic capacity', i.e. the 'performance capacity', as 'a contribution of various cognitive modules'. There are obviously different orders and modes of abstraction underlying the postulation of general and linguistic capacities of, for example, 'memory' and 'attention' as opposed to 'abstraction' and 'generalization'. Strictly more consistent with a general cognitive view of the linguistic capacity would be the latter mode of abstraction. However, failing constraints existing within generative theory as to the levels of conceptualization of 'cognitive systems' - for instance, various psychologically derived relevant behavioural constraints may be accorded a cognitive interpretation -, it appears to be permissible within the present state of the theory to adopt different levels of conceptualization and abstraction of 'cognitive modules' for different behavioural domains and contexts, as long as they (or a sub-set of them) may be convincingly shown to be 'language-related' or to 'determine language use'. Moreover, the definition of certain domains and contexts imposes constraints on certain types of conceptualizations and idealizations: linguistically-related explanation in the domains of speech production and perception as such in current models of idealization may need recourse to the language-related 'capacities' of memory, lexical access and retrieval, etc. The behavioural context of foreign language learning, on the other hand, in current models of idealization may need recourse to generalization and abstraction - or transfer in a cognitive interpretation.
5.6.1.2. This line of argumentation leads now to a concluding statement on the theoretical, and ultimately, ontological, status of the framework of description presented here.
It has been proposed in this study that a particular type of suprasegmentally-oriented phonological structure would appear to be well-motivated by the observational analysis of foreign language data. In the beginning of this study it was claimed that in the context of a second language learning situation, the role of a phonological framework is to serve i) as a basis for the structural comparison of the two languages involved in contact in functioning as a tertium comparationis for the making of comparative and contrastive statements of their relative phonologies; and ii) as the linguistic-structural basis for the making of causally-oriented statements on observed FL learner behaviour. In generative terms, a formulation of i) serves the end of the development of competence models
of phonology; a formulation of ii) serves the end of the development of performance models of phonology, as conceived of in the preceding discussion. Within the view of the relationship between competence and performance here espoused, the grammar proposed will simultaneously serve both ends. Accepting a generative interpretation of the goal of psychological validity, the grammar via a Chomskyan 'systematic ambiguity' represents, on the one hand, 'the mentally (ultimately physically) represented system that constitutes the state of knowledge attained by a given individual' and on the other, 'the theory constructed in an effort to capture and make explicit the properties of the internalized grammar' (Chomsky 1981:33). The linguistic interpretation of foreign language data depends crucially on a well-founded theory of performance. It is not enough to note compatibility or non-compatibility of linguistically derived primes of description with other primes derived from independently established language- (or non-language-) based theoretical abstractions and conceptualizations (such as is done in comparing acoustic measurements of L1-L2 differences or L1-L2 unit or attribute marges within linguistic-phonological-structural categories), or to assert simply that certain linguistically defined structures are "found" or "evidenced" in data. Neither is it of course enough to project a causal role in a behavioural context to the contact of an SL and TL such that the explication of L2 data may be achieved solely with reference to the structure of L1 imposing on the structure of L2 to "produce" the structure of the interlanguage or L2 learner variety. Nor is it enough to "test" the behavioural "validity" of linguistic primes of description with sole reference to one or other identified behavioural domain (such as production or perception).
In the absence of a conception as to the relationship and status of linguistic primes with regard to various behavioural contexts and domains, each of these undertakings (which together represent the bulk of empirical research on L2 data) lacks any theoretical motivation. A theory of performance of the kind adopted in this study is crucially defined as comprising a grammar (representations and rules) - here a phonology hierarchically ordered and including a triple set of phonological representations (lexical, prosodic, rhythmic) and a phonetic representation, defined and interrelated by sets of (structure-regularizing), structure-forming and structure-realizing rules or conditions -, and a processor which has access to the grammar and which in the context of foreign language learning and behaviour may be described in terms of the general language-related manifestation of the cognitive processing mechanism of transfer. In a dynamic conceptualization of second language development, the actual structural source of the transfer process may
be found in the source L1, the target L2 or in the developing learner interlanguage (IL), 'approximative system' or learner L2 variety. In every case, the process of transfer involves the abstraction from, and generalization of, data which confront the learner in his own learning experience. In an individual case, whether and which structures are transferred - representing linguistic knowledge - from L1 to L2, from L2 to L2 or IL.. to IL-, etc., may depend on linguistically related determinants, e.g. cross-linguistic typological factors and their assessment, cross-linguistic similarity/difference assessments, etc. (cf. James 1983a), which may of course in turn be correlatable with extra-linguistic - e.g. psychological or sociological - determinants of the particular language behaviour context. In general terms, one may conclude that while the place and general form of non-TL (e.g. segmental) variants in learner speech may be accounted for by the competence model of phonology developed, the specific form of the variants found must make reference to the performance component of the description.
5.6.1.3. Consider now the phonological explication of a sample of FL data in the light of the preceding discussion. The extract is taken from a conversation conducted under examination conditions between a native English speaker and a third year Dutch female undergraduate student of English at Amsterdam University. The student had had at this stage 11 years of formal instruction in English. The student is B, the native speaker interlocutor A. The topic of conversation is 'Religion'. The transcript offered constitutes a 'speech transcription' of the conversation (minor pauses indicated by -; major by +, i.e. 'pieces' delimited by -; 'locutions' by +; perceived stress on words marked ' 'primary' and , 'secondary') - see further sections 3.3.1. and 3.3.2. above.
Sample Four
A: [uh - have you any ideas on - uh - on religion + uhm - most people nowadays haven't a religion + have you got a religion +]
B: ' jJe s - aimI 'k°e6oirk + k J
A: [you are + are you +]
B: 'prektisiq + nst riii 'nou -
A: have I asked you that before +
B: ai ,daunt 'siqJt , sou +
A: [mm - sit a little bit closer +]
B: 'wel - 'ai 'brst - 'an - 'ai WDZ 'brot up - 'k°e6o4ik - an am stil - be'livrij - bed 'nAt - ''ik'sekli in 3s 'k°e9o4iik wei + if ju 'min - f j e 'nou wod ai 'rain + 'wel + 3o 'p°oup - 'end 3: cfe 3a 'harraki in 't/3r/ - a: daunt 'laik
Concerning the place and general form of the non-TL segmental variants in this extract, one may note the following: clearly the pronunciations [vroz] for [woz] "was", [am] for [am] or [m] "am" and [end] for [end] "and", are lexically derived substitutions, i.e. a word-form is pronounced which violates L2 conventions ('strong forms' for 'weak forms'). The last two examples show instances of L1-based transfer in that the vowel [a] is produced for [ae] (→ [a]) in "and". Equally lexically derived would seem to be ['k^Ooirk] for ['kaeOslik], i.e. unaspirated [k°], [ε] for [ae] and 'dark l' [ɫ] for 'clear l' [l] in "Catholic". The particular form of the word would appear to derive from word-level and phoneme-level lexical characteristics of the Dutch equivalent word "katholiek" [}£atou'iik], i.e. 'unaspirated k' from the Dutch lexeme and 'dark l' (cf. [ik'sekli] with a 'clear l'). Prosodic structure may be seen to influence the choice of [ς] for [e] in [jes] "yes", [s] for [θ] in [θɪŋk] "think", [ʌ] for [ɒ] in [not] "not", [συ] for [θυ] in [nou] "no", and [ou] for [au] in [powp] "Pope", [p°] for [phaup] "Pope", [3J] for [3] in [t/st/] "church" and [ε] for [ae] in "that". The syllables in which all these substitutions occur constitute strongly marked loci suprasyllabically (i.e. culminatively) and in this sense condition the occurrence of these deviant forms. Note, however, that the phonetic realization of phonologically determined structures - here, those pertaining to prosodic strength as implemented by structure-realizing rules, i.e., globally pertaining to 'strengthening' - in an FL learning and performing situation is a product of cross-linguistic structural influences (i.e. competence factors) as well as of the use of cognitive processing mechanisms, namely those of transfer (i.e. performance factors).
In the substitution [e] for [e] in "yes", 'pressure' of strengthening brings about an L1-L2 complex of transfer whereby the [g] form may be interpreted as a 'hypercorrect' rendering of English [e]. Since Dutch [ε] in comparable structures is noticeably lower in place than English [e], the 'pressure' for an ideal form of the English phone results in an over-high production. In [s] for [θ] in "think", the pressure for 'marking' the initial set of the syllable as strong in a sense prevents the correct rendering of [θ], a phone unfamiliar from Dutch, and 'triggers' an L1-derived substitution of [s] via transfer. The distinction [ʌ]/[ɒ] is an uncertain one for Dutch learners of English, there being no phonetically same [ʌ] phone in the L1. The problem is compounded by irregularity in the phonetic interpretation of the English letter "o" - as [ʌ] as in "front, monk" etc. or [ɒ] as in "font, mock" etc. The pressures of suprasegmental context (clear marking of the medial vowel) may
force a wrong choice between L2 [ɒ] and [ʌ] in the light of this uncertainty. Here transfer would seem to operate within the L2. Concerning the two instances of [ou] for [au] in "know" and "Pope" (cf. the correct TL production of the same vowel in [nau] "no", [sau] "so" and [daunt] "don't"), it is possible that these, too, are conditioned by suprasegmental pressure, with the effect of producing a 'stronger' imagined 'ideal', i.e. more (vowel area-)peripheral, quality of the diphthong, which at the same time approximates most closely to the Dutch [o ] quality as in "zo" ("so"). However, it will be clear that on the basis of a limited extract such as the one presented, it is in principle impossible to exclude the possible interpretation that these pronunciations (and others) are lexically conditioned in the above sense (cf. again target [au] in "no", for instance). With unaspirated [p°] in "Pope" it seems likely that this is the product of a transparent 'strengthening transfer' from Dutch, where [p] is lengthened and tightened in strong syllable-initial position. [3^], i.e. 'postvocalic r insertion', in "church" is the product of an L1-derived strengthening process whereby under the pressures for vowel lengthening on the syllable, an r 'emerges' to augment the quality, there being an auditorily comparable vowel quality with r in Dutch [far] as in "deur" ("door"), the vowel itself [