Linguistische Arbeiten
460
Edited by Hans Altmann, Peter Blumenthal, Hans Jürgen Heringer, Ingo Plag, Heinz Vater, and Richard Wiese
The Relation of Writing to Spoken Language Edited by Martin Neef, Anneke Neijt and Richard Sproat
Max Niemeyer Verlag Tübingen 2002
Die Deutsche Bibliothek - CIP-Einheitsaufnahme The relation of writing to spoken language / ed. by Martin Neef.... - Tübingen : Niemeyer, 2002 (Linguistische Arbeiten; 460) ISBN 3-484-30460-X
ISSN 0344-6727
© Max Niemeyer Verlag GmbH, Tübingen 2002. This work, including all of its parts, is protected by copyright. Any use beyond the narrow limits of copyright law without the consent of the publisher is prohibited and punishable by law. This applies in particular to reproduction, translation, microfilming, and storage and processing in electronic systems. Printed in Germany. Printed on ageing-resistant paper. Printing: Weihert-Druck GmbH, Darmstadt. Binding: Industriebuchbinderei Nädele, Nehren.
Table of Contents

Martin Neef, Anneke Neijt, and Richard Sproat: Introduction

Section 1: Consistency
Anneke Neijt: The Interfaces of Writing and Grammar
Richard Sproat: The Consistency of the Orthographically Relevant Level in Dutch

Section 2: Cross-Linguistic Studies
Susanne R. Borgwaldt & Annette M.B. de Groot: Beyond the Rime: Measuring the Consistency of Monosyllabic and Polysyllabic Words
Dorit Ravid & Steven Gillis: Teachers' Perception of Spelling Patterns and Children's Spelling Errors: A Cross-Linguistic Perspective

Section 3: Diacritics and Punctuation
Vincent J. van Heuven: Effects of Diaeresis on Visual Word Recognition in Dutch
Jochen Geilfuß-Wolfgang: Optimal Hyphenation
Ursula Bredel: The Dash in German

Section 4: Sharpening in German
Christina Noack: Regularities in German Orthography: A Computer-Based Comparison of Different Approaches to Sharpening
Martin Neef: The Reader's View: Sharpening in German
Thomas Lindauer: How Syllable Structure affects Spelling: A Case Study in Swiss German Syllabification

Addresses of Contributors
Martin Neef, Anneke Neijt, and Richard Sproat
Introduction
This collection of papers grew out of the workshop Writing Language, held at the Max Planck Institute Nijmegen, the Netherlands, on August 28-30, 2000. The purpose of the workshop was to bring together researchers of diverse backgrounds who share a common goal of achieving a better understanding of the role of writing in language behavior. The international grounding of this workshop is reflected by the present volume, which includes articles written by researchers working in six different countries (Belgium, Germany, Israel, the Netherlands, Switzerland, and the USA) and analyzing four different writing systems (Dutch, English, German, and Hebrew). The papers selected for the current volume represent several lines of research into the intricate relation between writing and spoken language: theoretical and computational linguists discuss the models that explain why orthographies are the way they are and the constraints that hold between writing and speaking a language; researchers in the area of special education deal with the question of how certain aspects of orthography can be learned; and psycholinguists discuss aspects of language processing affected by variation in orthographies. Among the theoretical papers, there is one pursuing a functional perspective on language, while the others adhere to the formal paradigm, supporting either a derivational or a non-derivational theory. By offering a forum of discussion to researchers in all these fields, we hope to stimulate research that takes all aspects of the written mode into account in order to gain a better understanding of the relation between writing and the spoken language. Several important general questions are raised by the papers to follow and we would like to review some of them briefly here.

Orthography and writing system

The terms orthography and writing system are used as near synonyms in this book. Both terms refer to the way a language is written. When a difference is intended, it will be that an orthography is the standardized set of spellings. These spellings may follow from the application of a conventional set of rules for writing a given language, or they may be singular cases that are in principle independent of such a rule system. A writing system, however, includes the regularities underlying the writing behavior of competent writers, which may in principle differ fundamentally from the conventional rule formulations. Most of the contributions to this volume deal with the standard spelling system, or with aspects of writing that are not explicitly standardized, in which case the more precise difference between orthography and writing system is irrelevant.

How natural is writing?

Given a modular approach to linguistic structure, one may assume that writing is a module of the grammar, with an interface level that defines the relation between the written and the spoken variant of a language. Then, the relation between, e.g., orthography and phonology could be in essence comparable to the relation between phonology and syntax, and the issues discussed could be similar. Also, one may then consider writing, even though it is an artefact, to be a natural system, obeying the constraints that hold universally for the architecture of human languages. Alternatively, writing and speaking might be considered parallel routes of processing, without a clear interface, but instead with writing being parasitic on speaking. In that case, there is no single level functioning as the interface between the oral and written modes. Given the latter point of view, one could of course also take the surface level of a language as the interface, given the assumption that language users derive all extra information to be encoded in the written mode from their knowledge of the language. And vice versa: that readers take the written form to be directly related to the language's surface structure and that extra knowledge needed to understand the written code derives from their knowledge of the language. In either case, the conclusion will be that writing is not related to the language system as if it were a natural component of the grammar.

Deep and shallow orthographies

One of the classic issues in orthographic research is the notion of orthographic depth. This notion is based on the ordering between the modules of the grammar, taking morphology to be 'deeper' than phonology. Within the modules, orthographic depth is based on rule ordering, classifying writing systems that encode abstract, more phonemic information as being deeper than writing systems that encode concrete, more phonetic representations. In modern models of phonology, rule ordering has been abandoned, but orthographic depth may still be a valuable notion: data do not change just because theories change, a point that is often lost in the rush to adopt new theories.
The term depth, it seems, turns out to be a descriptive notion that is in need of a theoretical foundation and re-interpretation in current constraint-based models of grammar.

What is the relation between orthography and the processes of writing and reading?

Theoretical linguistics aims at specifying the interrelations of the elements constituting a linguistic system, or, with regard to language users, at identifying the knowledge structures language users have to have in order to be competent. Psycholinguistics, on the other hand, deals with the question of how these knowledge structures are put to use, either in production or in reception. It is an important question how these two methodological approaches are connected. Does a convincing answer in one of these fields of linguistics automatically constitute a substantive answer in the other field, or must we be prepared for the findings in these fields to turn out to be quite independent of each other? This problem is also relevant for research on written language. In principle, knowledge of the set of rules defining the way a language is written must be distinguished from the processes involved in applying this knowledge. Theoretical models of writing systems differ in how many psycholinguistic findings concerning reading and writing they are willing to incorporate. On the other hand, psycholinguistic research strongly relies on theoretical assumptions, with the effect that any new trend in theoretical linguistics has strong repercussions in psycholinguistics.
Local, global, and transderivational constraints

Reflections on different kinds of constraints are suitable to further illustrate this point. Processing feasibility is one of the general constraints on language systems, and locality conditions function as baseline conditions on processing. The idea is that systems where language users need to collect information from non-local domains take too much time and effort to use, and thus cannot be psychologically real. Do such considerations apply to spelling systems as well? Recall that global constraints are constraints that refer to an earlier or later stage in the derivation. Their use led, for instance, to the introduction in syntactic theory of traces to mark the position of a moved constituent. Transderivational constraints are non-local constraints of another kind, in that they refer not only to the current derivation, but also to other, related derivations. The criticisms of non-local constraints are valid for speaking, less clearly so for writing: for example, writers may depend upon explicit instructions or conscious strategies for making the correct choice between a pair of differently spelled homophones. Writers may consciously reflect on the spelling variant needed in such cases, e.g. in choosing between the English verbs affect and effect, between dass and das in German, and between word and wordt in Dutch. One strategy for the English case, for example, is to remember that effect means 'to bring about'; so if one merely means 'to influence in some way', one probably wants affect. Such considerations seem to be non-local, since in such cases the writer seems to be invoking alternative scenarios. So, writers consciously decide on spellings of homophones, and spelling instruction includes warnings for the writer about homophones. Furthermore, in cases of uncertainty, writers may decide to change their wording so as to avoid a potentially embarrassing mistake. However, one should not be led to the conclusion that non-local language behavior is restricted to spelling and homophony: writers also labor over matters of lexical choice (in this particular situation what is the mot juste ...), phrasing, morphological form (should I write octopuses or octopi), and whether, for example, a conditional is the right way to express a particular point. Even when speaking, people often think carefully about what they are saying: consider the situation where you are about to complain about something to someone and you are deciding exactly how to say it, e.g. which words and tone of voice to use, so that they won't get offended. Language use may thus include non-local processing, under special circumstances. Non-local language behavior is certainly not restricted to issues of spelling. In speaking, however, indications of non-local behavior in language use have not led to the assumption that non-local constraints are available for the lexicon, morphology, syntax, and semantics. Rather, one of the basic assumptions has been that language systems are constrained by severe locality conditions, excluding global and transderivational constraints from the description of language. Similar considerations may be taken as a point of departure for spelling research, but the facts that writing is a more conscious process and that learning to write requires explicit instruction may give us some indication that writing systems are essentially different from natural language, and that these may exhibit non-locality, such as global or transderivational constraints.

The dependency hypothesis versus the autonomy hypothesis

Theories of orthography usually follow a conception that seeks to derive written forms from spoken forms. This is most obvious for the relation between sounds and letters. The sounds
represented in phonological structures are taken as the primary elements on which the respective letters are dependent. Under this view, the English word beat has the initial letter ⟨b⟩ because the underlying spoken form [bi:t] begins with the sound [b]. However convincing this approach is at first sight, there are several aspects of written forms that cannot be explained straightforwardly in this way. For example, there are letters that have no basis in the pronunciation, as in the case of the mute ⟨h⟩ in German, and there are sounds that have no reflection in the spelling, as in the case of short vowels in unvocalized Semitic writing systems. The use of graphotactic constraints in Dutch further illustrates autonomy: there is no phonological difference between ⟨a⟩ and ⟨aa⟩ in manen 'moons' and maan 'moon', and the difference can be described with rules that refer to the string of letters only. It will be an issue of future research to decide on the balance between the dependency hypothesis, which highlights aspects of spellings that have a clear base in the spoken forms, and the autonomy hypothesis, which focuses on those elements of spellings that seem to have a status independent from the spoken forms. When both kinds of rules are needed for a proper understanding of writing systems, three sets of information about a given writing system are implied: well-formedness constraints on the output (the strings of letters, the use of spaces, punctuation, and perhaps layout); rules governing the relation between the spoken language and its written output; and rules governing the opposite relation, between writing and spoken language.

Readability versus writability

This theoretical dichotomy can be subsumed under a more general view on orthography: What is the balance between reading and writing in orthography? At the design phase of a writing system, the need to express what can be spoken is present, but in the case of spelling reforms, the needs of the readers may become more important. This may explain why spaces in between words were invented relatively late. Perhaps also the general tendency of spelling systems to develop from more phonologically based to more morphologically based can be explained this way. Approaches to explain orthographies predominantly stem from the perspective of the writer. This is understandable given that learning to write is much more difficult than learning to read. Hence, the didactics of orthography are the didactics of writing. Efforts to reform a specific orthography also predominantly stem from the perspective of the writer. This may be because lecturers in teaching methods have been given the main responsibility for spelling reforms. But it may lead in the wrong direction, if it turns out that the main function of a writing system is not to make writing as easy as possible but to make reading as effective as possible. Thus, the question of readability may be one of the central aspects of future research on writing systems.

The contributions in this volume

The contributions to this volume all deal with one or more of the general questions posed above. The first section is devoted to the discussion of a theoretical conception introduced by Sproat (2000), the Consistency Hypothesis. Embedded in a derivational conception of grammar, this approach makes the substantive claim that for each language there is one fixed point in the grammatical derivation where the derivation of the writing system of that language branches off. Sproat terms this point in the derivation the Orthographically Relevant Level. As a consequence, the effects of some linguistic rules should be consistently mirrored in the
respective written forms while the effects of other rules should consistently not be visible in the written forms. In her paper The Interfaces of Writing and Grammar, Anneke Neijt challenges the Consistency Hypothesis. On the basis of Dutch, she claims that only the first step in the translation of sounds into letters can be restricted to one consistent level. For the graphotactic rules defining the well-formedness of strings of letters and for other aspects of written forms, more than this single level is necessary in defining the relation between speaking and writing. Among these other aspects is punctuation, which requires global information from morphology, syntax, and semantics. Furthermore, morphological information has to be invoked. Classes of morphemes may form exceptions to an otherwise consistent spelling system, with depth in terms of their derivations having no bearing on the issue. In The Consistency of the Orthographically Relevant Level in Dutch, Richard Sproat carefully examines the data presented by Neijt and concludes that, given certain assumptions about rule formulations, a specific phonological level (a level somewhere in between phonemes and phones) can be taken as the input of the writing system for a language such as Dutch. Sproat supplies an explicit analysis of a fragment of Dutch phonology that gives a clear localization of the Orthographically Relevant Level in Dutch, dealing with questions like stress, final devoicing, and different rules for native vs. non-native morphemes. In his conclusion, Sproat reflects on the naturalness of the Consistency Hypothesis.

The second section presents cross-linguistic studies. Susanne Borgwaldt and Annette de Groot base their paper Beyond the Rime: Measuring the Consistency of Monosyllabic and Polysyllabic Words on a close inspection of the writing systems of Dutch, English, and German. Their focus is the notion of consistency in a psycholinguistic tradition. Usually, research on phonological consistency focuses on monosyllabic words, which are split up into onset and rime. Subsequently, the mappings between written and spoken rimes are compared. Words sharing the same written rime are then considered feedforward consistent if the corresponding spoken rimes are pronounced in the same way. Words sharing the same spoken rime are called feedback consistent if their rimes are written in the same way. Borgwaldt and de Groot offer a method for determining the degree of bidirectional consistency that is applicable to monosyllabic and polysyllabic data alike. It is shown that by taking not only the consistency mappings between rimes into account but also those between other (overlapping) subsyllabic units, the accuracy of the description of consistency increases considerably. In Teachers' Perception of Spelling Patterns and Children's Spelling Errors: A Cross-Linguistic Perspective, Dorit Ravid and Steven Gillis illustrate that the complexity of orthographies as different as Hebrew and Dutch must be valued from different perspectives. They examine the teachers' perception of morphologically-mediated spelling patterns, compared with children's actual spelling performance on items spelled according to these same patterns. The study focuses on teachers' explicit knowledge of the role of morphological and morphophonological cues in spelling homophonous graphemes in Hebrew and Dutch, with alternative spellings for the same sound. In general, Ravid and Gillis find that teachers' metalinguistic knowledge of spelling patterns is a mirror image of children's performance. The authors explain their findings in terms of consciousness: explicit metalinguistic formulation of spelling patterns operates differently than natural information processing in language use.

Section 3 deals with elements of writing systems other than mere letters. One kind of such element is the diacritic, which modifies the content of letters. A specific type of diacritic is
the topic of Effects of Diaeresis on Visual Word Recognition in Dutch by Vincent van Heuven. Usually, efforts of the writer make the reading process easier. For instance: when writers bother to signal nouns with capital letters (as in German), the reading process will be facilitated. This is not what has been found by Van Heuven for the use of diaereses in Dutch. Such diaereses signal orthographic syllable boundaries and sometimes prevent homography. In a lexical decision task manipulating words with and without diaereses and with a transposed diaeresis, however, reaction times were not faster, nor was the accuracy of lexical decision different. This shows that not all information on the pronunciation of words needs to be encoded in the written form. A certain amount of abstractness will lead to a system which is more efficient for the writer and nonetheless equally efficient for the reader.

Punctuation marks have a less clear foundation in spoken language than letters. The theoretical debate revolves around the question to what extent their distribution can nevertheless be explained with recourse to the phonological structure of linguistic units. Jochen Geilfuß-Wolfgang in his article Optimal Hyphenation intends to find supporting evidence for an autonomous approach to orthography. According to his analysis, hyphenation is sensitive to orthographic syllables, a notion that is related to, but not identical with, the phonological syllable. Employing the constraint-based Optimality Theory, Geilfuß-Wolfgang formulates some constraints specific to the orthographic component of grammar. If these constraints are adequately ranked, the account enables the computation of the hyphenation data in German. Geilfuß-Wolfgang concludes that orthographic syllables and phonological syllables have many of the same properties and are governed by many of the same structural constraints. Other punctuation marks like the comma or the full stop cannot be explained in relation to word phonology. As Ursula Bredel in The Dash in German shows, analyses that assume a dependence of written forms on spoken forms argue over whether intonation or syntax primarily guides the distribution of punctuation marks. Since intonation itself is grounded in syntax, however, these approaches can both be regarded as syntactical in nature. Bredel herself opts for a different approach that focuses on the characteristics of written language and can, thus, be subsumed under the autonomy paradigm. In general, she takes punctuation marks as means for the steering of language processing. Based on a historical reconstruction of the functions the dash has had in German orthography, Bredel is able to reduce the diverse manners of use of the dash in contemporary German to one main function, namely to prepare the reader for a shift of focus.

Section 4, the final section of the book, gives different perspectives on one particularly noteworthy phenomenon of the writing system of German, namely sharpening, which is also known as consonant doubling. The core of this subject can be illustrated by the word Neffe 'nephew': Its pronunciation [nɛfə] contains only one fricative, but the corresponding letter appears twice in the written form. Many different approaches have been proposed to explain this phenomenon. Christina Noack compares three different rule systems for sharpening, spanning a period of more than two hundred years, in her paper Regularities in German Orthography: A Computer-Based Comparison of Different Approaches to Sharpening. These rule systems differ in their central focus, which is either the segment or the syllable or the morpheme. Noack's main concern is to present a computational tool to evaluate the consistency of alternative linguistic analyses on the basis of large corpora.
This computer program, called ORTHO 3.0, enables her to give explicit lists of exceptions that each of the rule systems generates. With respect to the number of exceptions, the segment-based approach shows the worst results, while the other two approaches do not differ significantly. The approaches compared by Noack all share the dependency perspective in that they derive the written forms from the spoken forms. Martin Neef in The Reader's View: Sharpening in German offers an alternative analysis within the autonomy paradigm. He follows the idea that the function of orthography is to give the reader instructions on how to read an unknown text. The basic tenet of this approach is therefore the Readability Principle, which demands that spellings should guarantee an unambiguous access to spoken forms. Against this background, Neef re-examines the sharpening data. His analysis reveals slightly different exceptions from the derivational theories discussed in Noack's article. Furthermore, Neef uncovers areas of the vocabulary that show orthographic underspecification, and he defends a different position on the question of to what extent sharpening is stress-based. The final paper of the volume, How Syllable Structure affects Spelling: A Case Study in Swiss German Syllabification by Thomas Lindauer, takes up the theme of dialectal variation. It is well-known that language communities are non-homogeneous. This holds especially for the German-speaking societies in Germany, Switzerland, and Austria. Nevertheless, one writing system is agreed upon for these non-homogeneous groups of speakers. The question arises, then, how this writing system can be taught most effectively, given that explicit learning rules rely on phonological awareness of the learners. Most difficult are, of course, those rules that refer to phonological information not available for a group of language users (because a certain distinction is lacking in this variant of German). Lindauer proposes to present explicitly different spelling rules for the different communities, nevertheless leading to the same spelling output.
He illustrates his assumptions with the example of sharpening and the related phonological phenomenon of ambisyllabicity. Since the phonological structures related to ambisyllabicity are different in Standard German and in Swiss German, the rules teaching sharpening should be different for these language communities.

Most of the papers presented here grew out of oral presentations at the workshop Writing Language. Other talks of the same workshop will be published separately in an issue of the Journal of Written Language and Literacy, edited by Rob Schreuder and Ludo Verhoeven. The workshop was organized by Harald Baayen, Martin Neef, Anneke Neijt, Rob Schreuder and Ludo Verhoeven, and sponsored by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO) and the Center for Language Studies (CLS). We greatly appreciate this support, which helped in producing this book. Finally, we would like to thank Richard Wiese, the editor of Linguistische Arbeiten, for many helpful comments, and Moritz Neugebauer and Jessica Schwamb (University of Cologne, German Department) for their help during the final stages of the preparation of this book.
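The rime-based notions of feedforward and feedback consistency mentioned in the summary of Borgwaldt and de Groot's paper can be made concrete with a small sketch. It covers only the traditional rime-based definition given in this introduction, not the method the authors themselves develop. The toy lexicon, the ad hoc ASCII transcriptions, and all function names are assumptions introduced purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> (written rime, spoken rime).
# The transcriptions are ad hoc ASCII stand-ins, not a standard notation.
LEXICON = {
    "mint": ("int", "Int"),
    "hint": ("int", "Int"),
    "pint": ("int", "aInt"),
    "beat": ("eat", "i:t"),
    "feet": ("eet", "i:t"),
    "desk": ("esk", "Esk"),
}


def consistent_keys(pairs):
    """Return the keys that are paired with exactly one distinct value."""
    mapping = defaultdict(set)
    for key, value in pairs:
        mapping[key].add(value)
    return {key for key, values in mapping.items() if len(values) == 1}


def feedforward_consistent(lexicon):
    """Words whose written rime is always pronounced the same way."""
    ok = consistent_keys((written, spoken) for written, spoken in lexicon.values())
    return {word for word, (written, _) in lexicon.items() if written in ok}


def feedback_consistent(lexicon):
    """Words whose spoken rime is always written the same way."""
    ok = consistent_keys((spoken, written) for written, spoken in lexicon.values())
    return {word for word, (_, spoken) in lexicon.items() if spoken in ok}


print(sorted(feedforward_consistent(LEXICON)))  # ['beat', 'desk', 'feet']
print(sorted(feedback_consistent(LEXICON)))     # ['desk', 'hint', 'mint', 'pint']
```

Within this toy lexicon, the written rime of mint, hint, and pint has two pronunciations, so these words are feedforward inconsistent, while the spoken rime shared by beat and feet has two spellings, so these words are feedback inconsistent. The results hold only relative to the six listed words.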
Section 1: Consistency
Anneke Neijt
The Interfaces of Writing and Grammar
1. Introduction
Spoken language, sign language, and written language - three modes of expression, but one underlying system? The answer will be negative for sign languages. Studies reveal that sign languages need not be derived from spoken languages and that if they happen to be derived from spoken languages, they tend to develop characteristics not present in their spoken origins (Wilbur 1987, Boyes Braem 1995). For younger generations, sign language can be acquired in a way that is similar to how spoken languages are learned. Hence, there is evidence that sign language forms a system on a par with spoken language and is not dependent on it. This is not the case for the written mode. Writing seems to be secondary to oral language, being derived from it, and fundamentally different from sign language. Each new generation learns the written variety at school after most of the spoken language has been acquired. Whereas for children the acquisition of a spoken or sign language is an unconscious process, acquisition of writing requires explicit learning strategies, of which teachers and pupils are well aware. Writing should be considered another code for the language acquired, which is why spelling is called secondary. The existence of spelling pronunciations, however, shows that this secondary mode of expression influences speaking, the primary mode (Van Haeringen 1962, Wells 1982: 106-9, Carney 1994, Maas 2000: 33). Other evidence for this influence on the primary mode comes from psycholinguistic experiments (cf., for instance, Seidenberg & Tanenhaus 1979, Schreuder et al. 1998) and from language change (Jespersen 1909). In this paper, the question of how both modes of expression are related is investigated from a theoretical point of view. The close relationship between a spoken language and its written variant has led to the hypothesis that the major part of the system is shared by both modes of expression.
For instance, the semantic component provides the interpretation of scope-bearing elements, whether written or spoken; the syntactic component provides word order for both. Morphology creates words and inflection for both, and even some part of phonology is common, e.g. phonological segments correlate closely with letters. Some writing systems are called 'deeper' and others more 'shallow', reflecting the derivational level relevant for writing. Systems based on morphosyntactic structure are called deeper than systems based on phonological or phonetic representations (Haas 1976, Sampson 1985, Sgall 1987, Asher & Simpson 1994, Daniels & Bright 1996, Meisenburg 1996). The claim is that the written mode of expression follows a route different from the oral mode only in the final stage of processing. In reading, it is only the first stage of processing that follows a different route, according to this hypothesis. Speaking and writing thus share a large number of derivational stages, as do hearing and reading. Schematically:
[Figure 1: General model of the relation between spoken and written language. Common stages of processing (semantics, syntax, morphology, part of phonology) branch into two routes: phonology and phonetics for speaking / hearing, and phonology and orthography for writing / reading.]
This view on how spoken and written language relate to each other has been worked out for Dutch by Nunn (1998). Dutch orthography is known to be based on a deep phonological stage of processing, cf. Van Heuven (1978) and Booij (1987). Nunn (1998) adds to this the conclusion that the derivation from phonology to orthography consists of two steps. After the first step of phoneme-to-grapheme conversion for morphemes, a second step takes care of grapheme co-occurrence restrictions by way of graphotactic rules, i.e. grapheme-to-grapheme conversion rules. Nunn calls such rules 'autonomous spelling rules', claiming that the rules refer to orthographic information only, although some of the phonological characteristics (the distinction between consonants and vowels, for instance) are carried over to the orthographical representation. Of course, in defending the claim of a derivation in two steps, Nunn emphasizes the differences between the two steps, i.e. the difference between phonologically and orthographically based rules. It is from this perspective that Nunn tries to find evidence for the orthographic nature of autonomous spelling rules and to restrict the amount of phonemic information necessary for the second step in the derivation from phonology to writing. From this perspective, it is not surprising that Nunn's analysis of Dutch has been used in Sproat (2000: 16) to illustrate the Consistency Hypothesis.

(1) Consistency
The Orthographically Relevant Level for a given writing system (as used for a particular language) represents a consistent level of linguistic representation.
This hypothesis, a direct reflection and strict interpretation of the model sketched in figure 1, states that there is one consistent Orthographically Relevant Level for a given writing system, not more than one, cf. figure 2. Notice that 'Consistent' here must not be understood as 'without exceptions'. Where alphabetic writing systems concern the spelling of finite sets of elements, the opportunity is present to store exceptional orthographic forms in memory. It seems that exceptions occur in many alphabetic writing systems.
The Interfaces of Writing and Grammar
Figure 2: The claims of the Consistency Hypothesis: one consistent level shared by the oral and written modes. [Schematic: a single ORL located between the (deep) underlying level and the (surface) level.]
In this paper, evidence will be presented to show that the processes of speaking and writing share more information than can be provided by a single derivational level. The claim made in this paper is that the phoneme-to-grapheme conversion rules are based on information from different levels, as are the grapheme-to-grapheme conversion rules. Of course, the distinction between the two sets of rules will be valid even when more than just one linguistic level provides input to the orthographic representation. Therefore, the two-step analysis of Nunn can be maintained, though defined in a less rigorous fashion. The Consistency Hypothesis, however, cannot be maintained as a universal principle. The layout of this paper is as follows. First, the arguments by Nunn (1998) in favor of a two-step derivation of orthography will be reviewed. Then, in section 3, the Orthographically Relevant Level according to Nunn will be discussed. It will be shown that the hypothesis that there is only one such level can be maintained only at the cost of storage. Sections 4 and 5 show that a native Orthographically Relevant Level must be distinguished from a non-native Orthographically Relevant Level and that punctuation is based on other levels than the phonemic representation of morphemes. Section 6 presents the linguistic information necessary for the autonomous spelling rules. Section 7 finally summarizes the evidence gathered in the preceding sections about the linguistic levels needed for writing and presents the overall conclusion. Information from different levels of language processing is collected in writing. In the presentation that follows, most arguments are based on writing and virtually no arguments are presented about reading.
2. An outline of Dutch orthography
Detailed information on the orthography of Dutch can be found in Nunn (1998). She distinguishes several orthographic components that are relevant for Dutch. Conversion of native morphemes needs to be distinguished from conversion of non-native morphemes, and a set of autonomous rules forms part of the orthographic derivation. Figure 3 is Nunn's analysis in a nutshell. Observe that she assumes one level with information on the underlying, phonemic, representations of the segments of morphemes at which all information of the spoken mode is translated into information on the written mode. Nunn's proposal for Dutch therefore confirms Sproat's Consistency Hypothesis. According to Nunn, there is one Orthographically Relevant Level, the level of morphemes in their phonemic form:
Figure 3: Nunn's model of the relation between phonetic and orthographic form. [Schematic: the phonemic descriptions of morphemes feed (i) the phonological rules, which yield the phonetic form, and (ii) native and non-native conversion, followed by the autonomous spelling rules, which yield the orthographic form.]
The remainder of this section will present explanatory notes on this model. Dutch has a so-called deep orthography. Underlying rather than superficial sound segments are spelled; i.e., morphemes tend to receive a uniform spelling, irrespective of the application of certain phonological rules that generate sets of allomorphs. Frequently used examples to illustrate this are hond and heb, with final obstruents spelled in accordance with their underlying forms /hɔnd/ and /hɛb/ instead of their phonetic forms [hɔnt] and [hɛp]. These underlying forms are detectable for the writer on the basis of plural inflection: [hɔndə] and [hɛbə] with voiced obstruents. Other examples are zuinigheid, aanmelden, hoofddoek 'carefulness, to announce, head-shawl', for which a more superficial spelling would be *zuinigeit, *aamelde, *hoofdoek, derived by h-deletion, final devoicing, nasal assimilation, final n-deletion, and degemination. Furthermore, Dutch is a language with two sets of words: native ones, such as kunstzinnigheid, and non-native ones, such as artisticiteit, both meaning 'artisticity'. The difference has its origins in the earlier stages at which Dutch imported words from Latin or French, but new borrowings follow this distinction as well. Non-native words can be distinguished from native ones on the basis of systematic differences in present-day phonology and morphology (Van Heuven et al. 1994, Nunn 1998: 155 ff.). One of the most important characteristics is the
number of full vowels present in morphemes: when more than one full vowel is present, the morpheme will be non-native. Exceptions to this rule are only a handful of frozen compounds such as aardbei 'strawberry' which behave as native words, notwithstanding the presence of more than one full vowel. The distinction between native and non-native morphemes takes the native morphemes as point of departure, such that all morphemes not in accordance with the constraints that hold for native morphemes are non-native. Therefore, the fact that only one full vowel is present in a morpheme is a necessary but not a sufficient criterion for this morpheme being a native morpheme. Further constraints concern consonant clusters (for instance, only a limited set of clusters occurs in native morphemes, not the clusters /sk/, /sf/, and /tm/, which predicts that skelet 'skeleton', sfeer 'sphere', and ritme 'rhythm' are non-native words, even though only one full vowel occurs) and morphology (for instance: plural -s is restricted to native words ending in /a, o, u/ and native words ending in a syllable with schwa; hence, the plural forms trams and e-mails indicate that these words are non-native). On the basis of such criteria, the etymological distinctions are recoverable from the synchronic spoken mode even for language users without any knowledge of foreign languages. The orthography reflects the difference between native and non-native words, since partly different sets of phoneme-to-grapheme conversion rules are used (indicated in figure 3 by the two routes for native and non-native morphemes) with, for instance, the graphemes c, q, th, y, and x for non-native words only, cf.: (2)
non-native                  native             sounds
camera 'camera'             kamer 'room'       /k/
quasi 'quasi'               kwaad 'angry'      /kw/
ether 'ether'               eter 'eater'       /t/
hypothese 'hypothesis'      hier 'here'        /i/
examen 'test'               heks 'witch'       /ks/
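The phonological criteria for non-native morphemes described before (2) can be sketched in code. This is a minimal illustration and not Nunn's formalization: the function name, the one-symbol-per-vowel transcriptions, and the inventory of full vowels are assumptions made for the example, and only the three clusters named in the text are checked. A result of False means only that these two criteria give no evidence of non-native status; frozen compounds such as aardbei are not handled.

```python
# Hypothetical helper; transcriptions use one symbol per vowel (an assumption).
FULL_VOWELS = set("aeiouyɑɛɪɔʏ")           # schwa (ə) is deliberately excluded
NON_NATIVE_CLUSTERS = ("sk", "sf", "tm")    # the clusters named in the text


def looks_non_native(phonemes: str) -> bool:
    """Return True if the morpheme violates a constraint on native morphemes."""
    # Criterion 1: more than one full vowel in the morpheme.
    n_full = sum(1 for segment in phonemes if segment in FULL_VOWELS)
    if n_full > 1:
        return True
    # Criterion 2: a consonant cluster that does not occur in native morphemes.
    return any(cluster in phonemes for cluster in NON_NATIVE_CLUSTERS)
```

For example, the assumed transcription "kaməra" (camera) contains two full vowels and is flagged, whereas "kamər" (kamer) is not.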
Literacy therefore leads to awareness of the distinction between native and non-native morphemes. The general model in figure 1 of how speaking and writing can be related is not only complicated by the difference between the spelling of native and non-native words, but also by the existence of autonomous spelling rules. One of the reasons to incorporate such rules in the model of Dutch orthography is the presence of allography in examples such as: (3)
stem     derived form        spelling
bak      bak+er ->           bakker 'baker'
judo     hij judo+t ->       hij judoot (third person ending of the verbal stem to judo)
laan     laan+en ->          lanen 'lanes'
vers     iets vers+s ->      iets vers 'something fresh'
No phonological alternation is involved here. In order to account for such forms of allography, Nunn (1998: 183 ff.) proposes a set of autonomous graphotactic rules, i.e. rules that operate on grapheme sequences, such as the following ones for gemination and degemination. The formulation of Nunn's rules has been simplified for expository reasons. C abbreviates consonant letters, V vowel letters, and dots indicate syllable boundaries. The distinction between short and long vowels is not one of phonetic duration, but rather expresses the fact that short vowels may combine with a coda that consists of more consonants than the coda following long vowels. (4)
a. Orthographic gemination
   C -> CC    after a short vowel at the end of the syllable
   V -> VV    for long vowels when a C follows within the syllable
b. Orthographic degemination
   VV -> V    when syllable final
   CC -> C    when syllable final
The derivation of the words presented in (3) runs as follows (backslashes indicating the underlying orthographic forms): (5)
a. Conversion of morphemes
   /bɑk/ -> \bak\      /ər/ -> \er\
   /jydo/ -> \judo\    /t/ -> \t\
   /lan/ -> \laan\     /ən/ -> \en\
   /vɛrs/ -> \vers\    /s/ -> \s\
b. Concatenation of morphemes and syllabification
   \ba.ker\   \ju.dot\   \laa.nen\   \verss\
c. Application of orthographic (de)gemination rules, cf. (4)
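Step (5c) can be made concrete with a small sketch that applies the simplified rules in (4) to the syllabified strings of (5b). This is an illustration under assumptions, not Nunn's implementation: the Syllable record, its field names, and the function name are hypothetical, vowel length is assumed to be carried over from phonology as a flag, and schwa is treated like any other short vowel followed by a coda.

```python
from dataclasses import dataclass, replace


@dataclass
class Syllable:
    onset: str      # consonant letters before the vowel
    vowel: str      # vowel letters as delivered by conversion
    is_long: bool   # length information carried over from phonology (assumed)
    coda: str       # consonant letters after the vowel


def apply_graphotactics(syllables):
    """Apply the simplified (de)gemination rules of (4); return a dotted string."""
    out = [replace(s) for s in syllables]  # copy so the input is not modified
    for i, syl in enumerate(out):
        following = out[i + 1] if i + 1 < len(out) else None
        if syl.is_long and syl.coda and len(syl.vowel) == 1:
            # V -> VV for long vowels when a C follows within the syllable
            syl.vowel = syl.vowel * 2
        elif (syl.is_long and not syl.coda and len(syl.vowel) == 2
              and syl.vowel[0] == syl.vowel[1]):
            # VV -> V when syllable final
            syl.vowel = syl.vowel[0]
        elif (not syl.is_long and not syl.coda
              and following is not None and following.onset):
            # C -> CC after a short vowel at the end of the syllable
            syl.coda = following.onset[0]
        if len(syl.coda) >= 2 and syl.coda[-1] == syl.coda[-2]:
            # CC -> C when syllable final
            syl.coda = syl.coda[:-1]
    return ".".join(s.onset + s.vowel + s.coda for s in out)
```

Under these assumptions, \ba.ker\, \ju.dot\, \laa.nen\, and \verss\ come out as bak.ker, ju.doot, la.nen, and vers, matching the spellings in (3) once the syllable dots are removed.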
As a result of these orthographic rules, vowel letters for short vowels are always followed by a consonant within the syllable, whereas syllable-final vowel letters represent long vowels. It is because of this pattern that short and long vowels in the literature on Dutch orthography are called 'covered vowels' and 'free/uncovered vowels' (Dutch gedekte and ongedekte/vrije vocalen). Covered vowels are always followed by a consonant letter within the syllable, whereas uncovered vowels may occur at the end of syllables: (6)
covered/short vowels               covered by C-gemination            uncovered/long vowels
[kɑntə] kan.ten 'sides'            [kɑ.nə] kan.nen 'cans'             [manə] ma.nen 'moons'
[kɛldər] kel.der 'cellar'          [bɛ.lə] bel.len 'bells'            [benə] be.nen 'legs'
[plɔftə] plof.te 'plumped'         [plɔ.fə] plof.fen 'to plump'       [pokər] po.ker 'poker'
This generalization holds in orthography but is present in phonology as well: intervocalic consonants after short vowels are ambisyllabic, as demonstrated in experiments in which speakers of Dutch are forced to explicitly syllabify such examples (cf. Rietveld 1983 and Sandra et al. 1996). The experiments show that speakers' judgments are influenced by orthography. However, interestingly, illiterate speakers of Dutch and pre-school children also present analyses with ambisyllabic consonants, though significantly less than the literate participants for whom the spelling rules seem to enhance ambisyllabic responses.
The only exceptions to the generalization that short/covered vowels are followed by a consonant within the syllable are loan words such as sjwa [sjwɑ] 'schwa' and exclamations such as bah [bɑ] 'ugh!' and joh [jɔ] (an exhortative word, presumably derived from jongen 'boy'). In the latter cases, <h> has the function of covering the short vowel at the level of orthography. With the introduction of autonomous orthographic rules, the process of writing becomes a two-step derivation. The first step is conversion from phonemes to graphemes and the second step is the set of autonomous spelling rules for the conversion from graphemes to graphemes. Arguments in favor of this position are based on the observation that the two sets of rules display different characteristics (Nunn 1998: 131): (7)
                               phoneme-to-grapheme conversion rules    autonomous spelling rules
context                        phonological                            orthographic
domain                         morpheme                                word
native/non-native sensitive    yes                                     no
The following short summary of spelling /i/ in Dutch will illustrate the characteristics of the conversion rules (backslashes again indicate underlying orthographical forms): (8)
Conversion rules for /i/
a. /i/ -> \ie\ in native morphemes (kietel 'tickle')
b. /i/ -> \ie\ in the last syllable of non-native morphemes (komiek 'comic', natie 'nation')
c. /i/ -> \i\ in non-native morphemes, when not the last syllable (titel 'title')
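The three conversion rules in (8) amount to a decision on two pieces of information: whether the morpheme is native, and whether /i/ stands in its last syllable. A minimal sketch, with a hypothetical function name and parameters chosen for the illustration:

```python
def convert_i(native: bool, in_last_syllable: bool) -> str:
    """Underlying spelling of /i/ within one morpheme, following (8a-c)."""
    if native:
        return "ie"          # (8a) kietel
    if in_last_syllable:
        return "ie"          # (8b) komiek, natie
    return "i"               # (8c) titel
```

The function inspects only morpheme-internal information, which reflects the claim that conversion rules take the morpheme as their domain.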
These rules take phonological information as their input and are restricted to the morpheme domain. Rules (a) and (c) show that the native/non-native distinction is relevant. Both kietel and titel are monomorphemes, and hence, only the conversion rules can be responsible for the spelling difference. Rule (b) shows that information on morpheme boundaries is essential. Final syllables in native and non-native morphemes are spelled <ie>. Diaeresis placement may illustrate the characteristics of autonomous rules. This rule applies to ambiguous letter strings: because aa, oo, and uu encode one sound in (9a), a diaeresis should be used in (9b) where the two vowels indicate two sounds (resp., uncovered/long and covered/short ones). Because ii, eo, and ue are not in use as a digraph, no diaeresis should be used for these letter pairs, cf. (10). (9)
a. [a]   baal 'bale'                       b. [aɑ]   Baäl 'biblical name'
   [o]   koor 'choir'                         [oɔ]   coördinatie 'coordination'
   [y]   postuum 'posthumous'                 [yʏ]   vacuüm 'vacuum'
(10)
a. [ii]  kopiist 'copyist'                 b. *kopiïst
   [əo]  geolied 'oiled'                      *geölied
   [yə]  ambigue 'ambiguous + inflection'     *ambiguë
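The diaeresis rule as described here can be sketched as a check on the letter pair that straddles a syllable boundary. The sketch below is an illustration under assumptions: the function name is hypothetical, the syllable division is taken as given input, and the digraph inventory is an assumed list that goes beyond the three digraphs (aa, oo, uu) named in the text.

```python
# Assumed inventory of vowel digraphs; only aa, oo, uu are named in the text.
DIGRAPHS = {"aa", "ee", "oo", "uu", "ie", "ei", "oe", "eu", "ui", "au", "ou"}
DIAERESIS = {"a": "ä", "e": "ë", "i": "ï", "o": "ö", "u": "ü"}


def place_diaeresis(syllables):
    """Join syllables, marking the second letter of a pair that reads as a digraph."""
    out = [syllables[0]]
    for previous, current in zip(syllables, syllables[1:]):
        pair = previous[-1] + current[0]
        if pair in DIGRAPHS:
            # The pair could be misread as one sound, so mark the second vowel.
            current = DIAERESIS[current[0]] + current[1:]
        out.append(current)
    return "".join(out)
```

With the syllable divisions co.or.di.na.tie and va.cu.um the function returns coördinatie and vacuüm, while ko.pi.ist and ge.o.lied are returned unchanged because ii and eo are not in the digraph set.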
The following examples show that diaereses occur in native and non-native forms alike:
(11)
native words                    non-native words
geënt 'grafted'                 geëmotioneerd 'emotional'
knieën 'knees'                  manieën 'manias'
beïnvloed 'influenced'          geïllustreerd 'illustrated'
Moreover, these examples are morphologically complex, which shows that the diaeresis rule also applies across morphological boundaries, at the level of the word. Diaeresis placement will be discussed below in section 6.1, where the fact that the rule makes use of phonological information will lead to the conclusion that such graphotactic rules are not autonomous. In sum: Nunn finds evidence for her two-step hypothesis in the clustering of characteristics of the rules involved. Her two-step analysis will be taken as a point of departure for the remainder of this paper, but arguments will be presented against the claim that the context of autonomous spelling rules consists purely of orthographic information. First, evidence will be presented that phoneme-to-grapheme conversion rules are based on information from different levels (section 3) and that the Orthographically Relevant Level is different for native and non-native words (section 4).
3. Phonological rules expressed in Dutch writing
In this section, rules will be discussed that show that some morphemes are spelled according to the phonemic level, but that a more superficial level must be assumed for other morphemes. Nunn's conclusion was that the more superficially spelled allomorphs are stored in the lexicon, even though phonological rules predict their distribution. In the absence of independent arguments for this position, one might claim equally well that the cases discussed are counterexamples to the Consistency Hypothesis and that there are several Orthographically Relevant Levels for Dutch.
3.1. Voice assimilation
As illustrated above, Dutch spelling is based on a deep phonological level at which, for instance, the rule of Final Devoicing has not been applied. Hond and heb are written, even though [hɔnt] and [hɛp] are pronounced. Dutch orthography, however, reflects Perseverative Devoicing in past tense suffixes, cf.: (12)
Dutch past tenses
[d]   stem - stemde      'vote - voted'
      tob - tobde        'worry - worried'
      kano - kanode      'canoe - canoed'
[t]   lek - lekte        'leak - leaked'
      hoop - hoopte      'hope - hoped'
      straf - strafte    'punish - punished'
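The pattern in (12) can be stated as a choice conditioned by the stem-final segment: -te after a voiceless obstruent, -de elsewhere. The sketch below is an illustration only. The function name and the set of voiceless obstruents are assumptions, and the underlying stem-final segment is passed in explicitly because the final letter is not always a reliable guide to underlying voicing.

```python
# Assumed set of underlying voiceless obstruents.
VOICELESS_OBSTRUENTS = {"p", "t", "k", "f", "s", "x"}


def past_tense(stem_spelling: str, underlying_final: str) -> str:
    """Spell the past tense singular with -te or -de, as in (12)."""
    suffix = "te" if underlying_final in VOICELESS_OBSTRUENTS else "de"
    return stem_spelling + suffix
```

For instance, past_tense("lek", "k") yields lekte and past_tense("kano", "o") yields kanode.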
The fact that Perseverative Devoicing in past tenses is expressed in spelling comes as a surprise, given that Dutch spelling generally expresses the underlying form of d/t-allophony. The inconsistent spelling of past tenses has been explained in the literature by referring to Readability, an output constraint that requires lekte, hoopte, and plofte instead of *lekde, *hoopde, and *plofde. The Readability Requirement has been incorporated in the Principle of Uniformity by Te Winkel (1863: 12) as follows: (13)
Principle of Uniformity (Regel der Gelijkvormigheid) Give the same orthographic form to a word and to its constituent parts, as far as pronunciation allows this.1
This explanation has been repeated in later publications (for instance, in Booij et al. 1979), but the Readability condition has never been explicitly formulated (but cf. Neef, this volume). There are reasons to doubt that Readability can be so formulated that it accounts for the spelling of past tenses in Dutch. First, observe that the reading process is quite robust, as illustrated by examples such as politie 'police' and politiek 'politics' (with <ti> indicating [tsi] in the first word and [ti] in the second one) and diminutives such as cremepje (lit. 'small cream', i.e. cream in small pots or tubes), written with three syllables and pronounced with only two. Such examples show that the relation between spelling and pronunciation may be a loose one, as long as the morphemes are recognized and get a stable spelling. The second argument comes from English. Observe that past tenses in English are also subject to Perseverative Devoicing, but that these verbs receive a morphological spelling. (14)
English past tenses
[ɪd]   lift - lifted
[d]    puzzle - puzzled
[t]    look - looked
If these forms are not problematic for English readers, why then would the deep spellings *lekde, *hoopde, and *strafde be problematic for Dutch readers? Presumably, Readability is a universal requirement, related to the language processing capacities available to human beings. When languages differ, the differences should be explainable on the basis of other characteristics of the languages, and no such explanation seems to be available for these cases. In order to maintain the hypothesis that phonemic representations of morphemes form the input for spelling, Nunn proposes a lexical approach to past tense allography. She assumes that these suffixes are stored in their more superficial forms -te and -de (cf. Nunn 1998: 63 and 136) and that not only storage of underlying forms, but also the option of what she calls 'competing allomorphs' is available in Dutch orthography. Evidence from the spoken mode for the special status of past tense suffixes is then called for. As long as such evidence is lacking, these instances might as well illustrate that some morphemes (be it a finite list) get a more superficial spelling, whereas the spelling of most morphemes is in agreement with the underlying phonological representation. But this alternative would be in conflict with the Consistency Hypothesis.
1 Geef, zooveel de uitspraak toelaat, aan een zelfde woord en aan ieder deel, waaruit het bestaat, steeds denzelfde vorm.
Another way to maintain the Consistency Hypothesis, more in line with Sproat (2000), would be to assume that Dutch orthography is based on some intermediate level, after the application of Perseverative Devoicing but before all other phonological rules apply. This proposal conflicts with the traditional, derivational approach of Dutch voicing assimilation present in the literature on Dutch phonology; cf. Zonneveld (1983), who claims that Final Devoicing is ordered before all other assimilation rules. Of course, other analyses of Dutch voicing assimilation can be provided. For instance, analyses with another domain of application for Final Devoicing (not the word, but the syllable), with another underlying form of the past tense suffix, or with a lexically governed rule of Perseverative Devoicing, different from the general rule of Progressive Assimilation in Dutch. But it seems hard to find independent evidence to choose between these alternative approaches, which is one of the reasons why the derivational approach is no longer the predominant model of phonological research. As long as derivational models do not provide independent evidence for a specific rule ordering, the conclusion must be that the Consistency Hypothesis cannot be tested. Rather than forwarding claims about some intermediate level and more in line with newer insights in the interaction of components, one should conclude that the underlying phonemic representation is the input to Dutch orthography, except for a finite list of morphemes of which the allomorphs are distinguished in orthography.
3.2. Diminutive allomorphy and d-insertion
Nunn (1998: 62) proposes the competing allomorph analysis also for the spelling of diminutives and for the spelling of agentive and comparative -er, cf.: (15)
allomorphy/allography of diminutive suffixes -etje, -tje, -pje, -kje, -je
bloem - bloemetje    'flower'
laan - laantje       'lane'
oom - oompje         'uncle'
koning - koninkje    'king'
koek - koekje        'cookie'
(16)
allomorphy/allography of -er, -der
roep - roeper        'call - caller'
hoor - hoorder       'hear - hearer'
mooi - mooier        'beautiful - more beautiful'
raar - raarder       'weird - weirder'
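The examples in (16) are consistent with a simple conditioning factor: -der appears after stems ending in r, -er elsewhere. The following sketch encodes that reading. The conditioning environment is inferred from the four examples and from the general description of d-insertion, not quoted from the text, and the function name is hypothetical. Graphotactic adjustments such as gemination are left out.

```python
def attach_er(stem: str) -> str:
    """Attach agentive/comparative -er, inserting d after stem-final r (assumed context)."""
    return stem + ("der" if stem.endswith("r") else "er")
```

Applied to the stems in (16), the function returns roeper, hoorder, mooier, and raarder.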
The choice between diminutive suffixes is predictable on the basis of phonological contexts, but some diminutives get an idiosyncratic meaning which may form an argument for considering diminutive allomorphy as a lexicalized process. The rule of d-insertion before agentive and comparative -er, on the other hand, is productive and fully predictable. Therefore, the competing allomorph analysis is not more likely to be present for these morphemes than it is for any other morpheme. Hence, as long as no additional evidence is provided, -er/-der allography forms an argument against the Consistency Hypothesis.
3.3. Nasal assimilation
Nasal Assimilation is usually not expressed in orthography, indicating that the phonemic level is the Orthographically Relevant Level: (17)
/n/ -> [n]    onaardig 'not nice'
/n/ -> [m]    onprettig 'unpleasant'
/n/ -> [ŋ]    onklaar 'out of order'
In non-native morphemes, however, Nasal Assimilation is present in orthography for the labial nasals, but not for the velar ones: (18)
/n/ -> [m]    implosie 'implosion'
/n/ -> [ŋ]    incapabel 'incapable'
The same pattern holds in English and German. At first sight, the Consistency Hypothesis is faced with two problems: the difference between native and non-native words, and within the non-native words, the difference between labial and velar nasals. The latter problem, however, can be discarded by an autonomous spelling rule that forbids the strings ngk and ngc within words. The existence of this rule can be shown by diminutive formations such as koning - koninkje ('king - small king'). The different reflection in orthography of Nasal Assimilation in labial contexts could be accounted for if Nasal Assimilation in non-native words could be shown to be lexicalized. However, contractions and emphatic use show that the underlying form is in when the context for the phonological rule is absent: (19)
in- en export 'in- and export' (existing phrase, next to import)
in- en exploderende stoffen 'in- and exploding substances' (possible phrase, next to imploderend)
in-, in-, implausibel 'very implausible' (possible phrase, with emphatic repetition)
ik zei in-plausibel 'I said in-plausible' (corrective use, no Nasal Assimilation)
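The autonomous spelling rule that forbids the strings ngk and ngc within words can be sketched as a single rewrite on letter strings. This is a minimal illustration with a hypothetical function name. The input koningkje is the concatenation of koning and -kje, and ingcapabel is a hypothetical intermediate string in which the velar nasal has been written out before the rule applies.

```python
import re


def repair_ng_before_velar(letters: str) -> str:
    """Delete g from the strings ngk and ngc, which are not allowed within words."""
    return re.sub(r"ng(?=[kc])", "n", letters)
```

The rule refers to letters only, which is why it can serve as an example of an autonomous spelling rule: koningkje becomes koninkje, and strings without ngk or ngc pass through unchanged.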
It is again possible to maintain the Consistency Hypothesis by the claim of differences in storage. Native in- is stored in its phonemic form, non-native in- is stored in its three phonetic forms in-, im-, and ing- (with subsequent deletion of <g> by a graphotactic rule). In sum, some affixes receive a more superficial spelling than provided by the phonemic level. Nunn proposes storage of so-called competing allomorphs for such cases. This allows for lexical idiosyncrasies, which indeed occur. The Consistency Hypothesis claims that there exists one level that provides this information: the phonological rules involved in these affixes should all precede the phonological rules not expressed in orthography. As argued at the end of section 3.1, it will be difficult to find evidence for the rule ordering required by the Consistency Hypothesis. On the other hand, storage of allomorphs leaves unanswered the question why some allomorphs are stored and others are not. Booij (p.c.) suggested another route of explanation instead of ordering, based on the observation that some phonological rules are general and others are restricted to specific morphemes. Rule ordering then need not be the explaining factor. Rather, some economy principle would be at work, such that orthography neglects general, 'unavoidable' or 'automatic' rules. This may indeed be the case, but it cannot be the
full answer. Observe that according to this hypothesis, other instances of Dutch orthography will be inconsistent. For instance: devoicing of fricatives at the end of words is a general rule of Dutch phonology, but still, the superficial form is spelled in words such as huis and leef (which have underlying voiced fricatives, witnessed by the inflected forms huizen and leven). Other examples illustrating that there is no tendency to avoid representation of general phonological rules in Dutch are vowel reduction in words such as apostel, cirkel ('apostle, circle', with derived forms apostolisch and circulair), nasal assimilation in monomorphemic words such as ramp 'disaster', and degemination at the end of words. Perhaps all these counterexamples can be explained on the basis of graphotactic rules, but the question to be answered then is why such graphotactic rules violate an otherwise sensible constraint on the orthographic system for Dutch.
4. Native and non-native morphology
The spelling of non-native words in Dutch differs systematically from the spelling of native words. Above, in (2), examples are presented with c, q, th, y, and x, letters that are not in use for the sounds /k, t, i, ks/ in native words. More subtle differences exist in the spelling of vowels. In native words, long (or uncovered) vowels are spelled with digraphs and short (or covered) vowels are written with a single letter, cf. (20). In non-native words, however, all vowels are written with a single letter, cf. (21a), except when they occur in the final syllable of the word, cf. (21b):
(20)
native words
short vowel          long vowel
[ɑ] handel           [a] vaandel
[ɛ] verder           [e] meerder
[ɪ] mispel           [i] kietel
[ɔ] koster           [o] klooster
[ʏ] durven           [y] huurder
(21)
a. non-native words, nonfinal syllables
short vowel          long vowel
[ɑ] apotheose        [a] apotheek
[ɛ] echo             [e] mechanisch
[ɪ] distichon        [i] diploma
[ɔ] comité           [o] homoniem
[ʏ] muskiet          [y] stucwerk
b. non-native words, final syllables
short vowel          long vowel
[ɑ] sesam            [a] amalgaam
[ɛ] rebel            [e] gareel
[ɪ] passim           [i] muziek
[ɔ] complot          [o] piloot
[ʏ] museum           [y] minuut
The difference between long and short vowels does not seem to be a phonemic difference in non-native words, which is why variation of pronunciation may occur. For instance: apotheek with a first long vowel and apotheose with a first short vowel occur, though perhaps less frequently than the other way around. Only a keen listener will notice when muskiet and stucwerk are pronounced [myskit] and [stʏkwɛrk] instead of the more usual [mʏskit] and [stykwɛrk].
Minimal pairs based on vowel length (such as komma 'comma' and coma) are hard to find in the set of non-native words, presumably because vowel length distinctions played a minor role in the donor language Latin, in which liber 'book' and līber 'free' is one of the few examples of a minimal pair based on this distinction. The above examples illustrate that different conversion rules apply to the two classes of words. The following examples show that the domain at which conversion takes place differs also (Nunn 1998: 93): (22)
a. stem      b. native suffix    c. non-native suffix
Fries        Friezin             frisisme       'Frisian - Frisian woman - frisism'
limiet       limieten            limiteer       'limit - limits - to limit'
trochee      trocheeën           trocheïsch     'trochee - trochees - trochaic'
station      stationnetje        stationair     'station - small station - stationary'
Starting with a native or non-native stem, cf. (22a), a native morpheme added to it results in a spelling without adaptation, cf. (22b). This is what the Consistency Hypothesis predicts in combination with the assumption that morphemes form the domain of phoneme-to-grapheme conversion. When, however, a non-native suffix is added, the stem is spelled as if the word were monomorphematic: long vowels are written with a single letter, cf. (22c). Nunn accounts for this spelling behavior by assuming that non-native morphology is ignored. Complex derivations with non-native affixation are treated as if they were monomorphematic. The solution proposed by Nunn meets some difficulties. First, the above examples of contraction and emphatic use presented in connection with Nasal Assimilation (in- en export, inplausible etc.) show that morphological structure is present in non-native derivations. Second, a set of correspondence rules is needed to account for spelling idiosyncrasies that occur in non-native sets of words such as context - contextueel, tekst - intertekst - intertekstueel, medievist - medievistiek, quaestor - quaestrix. Morphemes of non-native complex words receive a constant, though sometimes idiosyncratic spelling, but the spelling of sets of morphologically related non-native words cannot be considered completely ad hoc. Context and tekst form the basis of the two sets of consistent spellings; ae is replaced by e in ether (< aether) and forms derived from ether, but not in quaestor and its derived forms. Third, consonant geminates in non-native words are the reflection of morphological structure, cf. acclamatie - declamatie, adduceren - deduceren, collocatie - dislocatie. When writers are aware of this kind of morphology, this shows that non-native morphology is present in the language system and reflected in orthography. On the other hand, some distributional facts will receive an explanation by a level in between non-native and native morphology (cf. 
Van Beurden 1987): given such a level, non-native morphemes would be closer to roots and stems than native morphemes, which is generally true, although productive formations to the contrary exist (cf. Haas & Trommelen 1993: 459 ff.) (in (23), boldfaced sub-, hyper-, and -eer are non-native affixes, the other affixes are native):
(23)
a. General pattern
   spelling          morphology
   groepering        (((groep) eer) ing) 'grouping'
   verdisconteer     (ver((dis(cont))eer)) 'negotiate'
b. Exceptions
   spelling          morphology
   subafdeling       (sub((af(deel))ing)) 'subsection'
   hypergevoelig     (hyper((ge(voel))ig)) 'hyper-sensitive'
Perhaps sub-, hyper-, and the like are to be grouped together with the native ones. (For English, non-native prefixes are claimed to belong to Class II, cf. Giegerich 1999 and previous literature.) In that case, the different spelling behavior of native and non-native morphemes can be combined with the Consistency Hypothesis when a level in between non-native and native morphology is assumed to form the input for phoneme-to-grapheme conversion, and the elements converted are native morphemes and non-native complex forms. In that case, a new solution must be found for idiosyncratic spellings of related non-native formations and for consonant geminates in contexts where orthographic gemination does not apply. The level ordering hypothesis and stratum-oriented models never succeeded in adequately describing the morphological patterns available in languages such as English, German, and Dutch. Instead, approaches with restrictions for individual morphemes seem to be more successful, cf. Fabb 1988, Neef 1996, Plag 1999, and Hay 1999. In line with these more recent approaches, the competing allomorph analysis forwarded by Nunn, and hence storage of spelling forms for individual morphemes, seems to be more promising than the search for one level as the input for writing.
5. Punctuation
The orthographic rules that mirror segmental phonology take morphological words as their maximal domain. For other aspects of orthography, i.e. punctuation, larger domains are relevant. For instance: words in a phrase are separated by spaces, words in compounds are written together. When a phrase is embedded in a word, the spaces are eliminated, which offers the opportunity to disambiguate in writing what may be ambiguous in speaking, cf.: (24)
phrase                                         compound
klein kind 'small child'                       kleinkind 'grandchild'
vuile grondaffaire 'dirty affair about land'   vuilegrondaffaire 'affair about polluted land'
oude mannen 'old men'                          oudemannenhuis 'old men's house'
Moreover, larger syntactic or prosodic constituents can be distinguished in orthography. Capital letters and dots surround utterances, and commas or semicolons separate the parts of enumeration (as in English), with subtle distributional differences that may signal coordination embedded within coordination: Jan, Karel en Harry; Kees, Marie en Piet; en Susan, John en Martin. There are indications that the distribution of punctuation signs relates to the
The Interfaces of Writing and Grammar
thematic structure of the text (cf. Bredel, this volume). The following list of orthographic means for representing information from different linguistic levels illustrates the issue, but is not meant to be exhaustive: (25)
orthography               linguistic level
spaces                    syntax: syntactic words
capital letters           semantics: proper names
capital letters           syntax: German nouns
capital letters and dots  syntax or prosody: utterances
commas                    syntax: clauses, coordinated constituents
semicolons                syntax: coordination with embedded coordination
indents or white lines    syntax: domain of pronominal reference
dash                      semantics: change of focus
This list shows that the Orthographically Relevant Level needs to be an all-encompassing representation of the utterance, including syntactic and semantic information. The Consistency Hypothesis claims, however, that writing is based on a single level of information, e.g. that it translates sounds of a certain level into letters.
6. Linguistic information for graphotactic rules
According to the null hypothesis, the output of phoneme-to-grapheme conversion would be a string of letters and nothing more. However, there is abundant evidence in the formulation of autonomous spelling rules that the representation is more articulate. Word and morpheme boundaries are retained, as is some information about the connection between successive words and morphemes, witnessed by the use of spaces, for instance. In this section, the kind of information necessary in the formulation of graphotactic rules will be discussed. It will be shown that graphotactic rules and phonological rules of the later stages of the derivation share characteristics which make it necessary to assume a close relationship between the two sets of rules. The discussion will take as its point of departure the proposal by Nunn (1998: 32-34) to carry over to orthography the morphological structure and specific parts of the segmental phonological information, e.g. the distinction between vowel letters and consonant letters and information about length of vowels. Nunn proposes representations such as (26) to handle the fact that <ee> and <e> may be long vowels (VV) and that <e> may represent a short vowel (V) as well. (26)
letter tier   ee    e     e
CV-tier       VV    VV    V

Nunn assumes that these two tiers are sufficient:
Anneke Neijt
The use of orthographic CV-structure based on the pronunciation accounts for the fact that spelling needs more phonological information than can be encoded by letters only, without stating that all phonological information has to be available. (Nunn 1998: 34)
However, in her formulation of autonomous rules, she uses the CV-structure both as a way to distinguish long vowels from short ones and as a generalization for the set of underlying consonant letters and vowel letters that form the input of autonomous rules. In actual fact, thus, she uses two CV-tiers, a phonological one and an orthographic one (for the underlying representation of the orthography). Below, arguments will be presented that both kinds of CV-tiers are needed.
6.1. Capitalization and Diaeresis Placement

The rules of orthography that show the need for a more elaborate, three-tiered, representation are Capitalization and Diaeresis Placement. First, look at Capitalization. Dutch has one special letter in its alphabet: <ij>. Despite its appearance, <ij> is one letter, not a digraph, as shown by capital use: (27)
letter     IJs, *Ijs 'ice'
digraphs   *AArde, *AUto, *CHaos, *IEmand, ...
           Aarde, Auto, Chaos, Iemand, ... 'earth, car, chaos, someone, ...'
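The capitalization facts in (27) can be illustrated with a minimal Python sketch. The function name `capitalize_dutch` is hypothetical, and the sketch assumes that <ij> is the only multi-character unit that behaves as a single letter; it is an illustration of the pattern, not a full model of Dutch capitalization.

```python
def capitalize_dutch(word: str) -> str:
    """Capitalize the first orthographic letter of a Dutch word.

    Assumption of this sketch: <ij> counts as one letter, so both of its
    characters are capitalized; every other digraph counts as two letters,
    so only its first character is capitalized.
    """
    if word.lower().startswith("ij"):
        return "IJ" + word[2:]
    return word[:1].upper() + word[1:]


for w in ["ijs", "aarde", "auto", "chaos", "iemand"]:
    print(capitalize_dutch(w))
```

Treating <ij> as a special case of a general rule mirrors the point made in the text: the capitalization rule has to know how many orthographic positions a letter sequence occupies.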
In former days, ij was one touch on Dutch typewriters, and if necessary, y was used where ij was meant (or vice versa: the birth registration officer once wrote down Neijt instead of Neyt). The distinction between the letter and the digraphs must be captured in the orthographical CV-tier: (28)
orth-CV-tier    V      VV     CC    VV
letter tier     ij     aa     ch    ie
phon-CV-tier    /VV/   /VV/   /C/   /VV/
The second rule that needs more than one kind of CV-tier is Diaeresis Placement. The two dots above vowel letters that function as umlaut in many languages (and in some Dutch loans such as löss, überhaupt) are used in Dutch productively as separators for strings of vowel letters that could have been interpreted as digraphs (cf. (9b) above and Van Heuven, this volume): (29)
     Digraph                                          Two monographs
a.   baal, geen [bal, χen] 'bale, none'               Baäl, geënt [ba-ɑl, χə-ɛnt] biblical name, 'grafted'
b.   reus, blazoen [røs, bla-zun] 'giant, blazon'     reünie, kanoën [re-jy-ni, ka-no-wən] 'reunion, to canoe'
In Nunn's notation, long vowels get the same notation as a pair of short vowels. The left-hand and right-hand examples above will thus get the same representation for the relevant vowel letters (Nunn 1998: 32-4), and there is no basis for the diaeresis rule to distinguish the two sets: (30)
orth-CV-tier   C V V C     C V V C
letter tier    b a a l     B a ä l
- segment-based: one consonant only after a short lax vowel
- morpheme-based: one consonant only after a short lax vowel plus related form with a vowel following the consonant: [v a l ə n] wallen 'to surge'
- syllable-based: related form in which the consonant is ambisyllabic or is tightly connected to a preceding heterosyllabic vowel respectively: [v a l ə n] wallen 'to surge'

walten: sharpening: no
- segment-based: root [v a l t] -> two different consonants after a short lax vowel
- morpheme-based: root [v a l t] -> two different consonants after a short lax vowel, no related form with a vowel after the consonant
- syllable-based: no related form in which the consonant is ambisyllabic or is tightly connected to a preceding heterosyllabic vowel respectively

Table 1: Necessary conditions in segment-, morpheme- and syllable-based approaches
In German orthography, sharpening is much more regular than lengthening. There are, however, many words without sharpening even though the stressed vowel is lax and short and followed by a single consonant, cf. an, um, mit (prepositions), Damhirsch 'fallow deer', Brombeere 'blackberry'. Spellings like these violate the segment-based rule but not the morpheme- and syllable-based ones. They do not violate the syllable-based condition, as there is no related form in which there is ambisyllabicity or tight syllable cut respectively. Moreover, they do not violate the morpheme-based condition, since there is no related form which has a vowel after the root.
3. Three different concepts
As already mentioned, there are different approaches to explain regularities of German orthography, which are partly based on different domains. In the following, a morpheme-based (Adelung 1788a), a segment-based (Prussia 1902) and a syllable-based approach (Maas 1997b) will be tested. The sharpening rule serves as an
Christina Noack
example for the respective orthography-theoretic conceptions in general since these approaches hold for the other special spellings like lengthening or the i-graphemes as well.
3.1. Adelung's rule (morpheme-based)

The works of the librarian, scholar and grammar school teacher Johann Christoph Adelung (1732-1806)4 are still regarded as outstanding in modern research. His expositions on German grammar temporarily rounded off a development of grammatical systematization for the German language that started in the 16th century with the writings of Valentin Ickelsamer.5 In older grammatical descriptions orthography has always been included, and there is a long tradition of orthography-theoretic work which attended to the vernaculars not later than the Renaissance. Adelung's expositions are characterized by a high degree of argumentative consistency. This may be one reason why he had an immense influence on linguistic usage in his own lifetime. As Maas (1992: 229) points out, his 'Deutsche Sprachlehre' was compulsory reading in Prussian schools, and great writers like Schiller or Goethe consulted his 'Vollständige Anweisung zur Orthographie'. In Adelung (1788a), he formulates his condition for consonant-doubling: (7)
Sharpening rule 1 (Adelung 1788a: 218) "If a vowel is followed by a simple consonant and the extension of the word shows that this consonant is doubled and, as a consequence, the vowel is sharpened, then double the consonant also for the eye, both inside and outside of the extended word."6
Adelung's rule focuses on the root morpheme which for him shows a deficit if it consists (after an onset of any form) of a short vowel followed by a single consonant only. In this case, the graphemic counterpart is mapped twice (= geminated) and transmitted into all other forms of the word family: (8)
[Tree diagram: W over the segments [br ɛ n ə n], spelled brennen 'to burn'7]

4 The most important of these are 'Deutsche Sprachlehre', Berlin 1781, 'Umständliches Lehrgebäude der deutschen Sprache', 2 vols., Leipzig 1782, 'Vollständige Anweisung zur deutschen Orthographie nebst einem kleinen Wörterbuche', Leipzig 1788.
5 'Ein Teütsche Grammatica', 1534. Further commendable grammarians are Chr. Gueintz, J.G. Schottel, K. Stieler (17th century), H. Freyer, and J.Chr. Gottsched (18th century).
6 „Wenn auf einen Vocal ein einfacher Consonant folgt, und die Verlängerung des Wortes zeiget, daß er gedoppelt und folglich der Vocal geschärft lautet, so verdoppel ihn auch für das Auge, so wohl in der Verlängerung als außer derselben."
7 W = word, R = root, Sf = suffix.
Regularities in German Orthography: A Computer-Based Comparison
In the 3rd section of Adelung (1788a; 'On syllables'), the author distinguishes between a 'lengthened' and a 'sharpened' accent:

In German, as well as in other languages, each vowel can be pronounced in two ways, depending on whether the voice rests on it for a longer or a shorter period of time, or rather depending on whether the opening with which it is pronounced takes more time or the mouth is closed sooner. [...] The longer resting of the voice on a vowel makes the lengthened accent, the shorter resting makes the sharpened accent. (Adelung 1788a: 212f.)8
Since Adelung's domain for orthographic special spellings is the root of the word, he assumes an optimal structure for it concerning the segmental build-up: When the vowel is 'sharpened', two consonants follow in the optimal case. When it is 'lengthened' (long vowel or diphthong), one consonant is preferred.
3.2. Official rule Prussia 1902 (segment-based)

The Official rule in the edition of Prussia stands in place of the editions of the other German states of that time (such as Bavaria, Saxony or Hesse). After the spelling reform of 1901, in which German orthography was standardized for the entire German-speaking area, these editions were the first all-German rulebooks. This reform concluded a development in which the effort for standardization stood in the foreground. Up to this point, there had been no valid standard, and even after the first attempts at regulation in the 19th century, several differing spellings existed alongside one another in the sovereign territories. At the same time, the centuries-old tradition of systematizing and bringing out the regularities of German orthography was pushed into the background.9 The official rule for sharpening in the Prussian rule book reads as follows: (9)
Sharpening rule 2 (Prussia 1902: 11) "The shortness of a vowel is only marked in stressed syllables that have a single consonant in final position, namely in such a way that this consonant is spelled double."10
Note that 'syllable' here means morpheme, as in the older literature no difference had generally been made between syllables in a prosodic sense and as morphological units. The grammatical foundation of this rule is rather superficial; the domain for sharpening is a morpheme with a short vowel followed by a single consonant in final position. This constitutes merely a necessary, but not a sufficient condition. The further prosodic or morphological context is left unmentioned. What consequences this has for the consistency of the analysis will be shown in chapter 4.

8 „Jeder Vocal kann im Deutschen so wie in anderen Sprachen auf eine gedoppelte Art ausgesprochen werden, je nachdem die Stimme länger oder kürzer auf ihm verweilet, oder vielmehr, je nachdem die Öffnung, mit welcher er ausgesprochen wird, länger dauert, oder der Mund früher geschlossen wird. [...] Die längere Verweilung der Stimme auf einem Vocale gibt den gedehnten, und die kürzere den geschärften Accent."
9 Cf. Schlaefer 1980 and 1981; Maas 1997a; Noack (2000: 14ff).
10 „Die Kürze des Selbstlautes wird überhaupt nur in betonten Silben, die nur auf einen Mitlaut ausgehen, bezeichnet, und zwar dadurch, daß dieser Mitlaut doppelt geschrieben wird."
3.3. Maas' rule (syllable-based)
As already mentioned, syllable-based approaches ground orthographic regularities in the phonological structures of spoken language. Maas bases his considerations on the opposition of the connection of a stressed vowel to a following consonant: (10)
Sharpening rule 3 (Maas 1997b: 30) "Tight connection occurs also without a tautosyllabic consonant (as a connection, hence, to a heterosyllabic consonant in the following onset). This case is marked by means of sharpening, i.e. by a copy of the symbol for the tightly connected consonant."11
The terminology of 'loose' and 'tight connection' is equivalent to the older dichotomy of 'smooth' and 'abrupt syllable cut'. The former goes back to Jespersen (1913) and has been established especially by the works of Trubeckoj, who offered the following definition:

[The correlation of syllable cutting is] really nothing more than an opposition between the so-called 'tight' and 'loose' connection of a vocalic syllable peak to a following consonant. If the vowel with the tight connection is thereby shorter than the vowel with the loose connection, this is simply the result of a phonetic consequence. With the tight connection, the consonant begins at such a moment when the vowel has not yet passed the summit of its usually alternating course, whereas with the loose connection, the vowel goes off completely before the consonant begins.12 (Trubeckoj 1989: 196)

The condition for sharpening formulated by Maas is founded on prosody (syllable cut opposition); the form in which the short vowel of the strong syllable is tightly connected to the consonant of the following syllable is the so-called 'supporting form';13 the principle of 'constant spelling' leads to the transmission of the special grapheme within the word-family: (11)
[Syllable-structure diagram: 'σ σ over brennen 'to burn'14]

11 „Fester Anschluß findet sich auch ohne tautosyllabischen Konsonanten (also als Anschluß an einen heterosyllabischen Konsonanten im folgenden Anfangsrand). Dieser Fall wird durch die Schärfung notiert, also durch eine Kopie des Zeichens für den festangeschlossenen Konsonanten."
12 „[Die Silbenschnittkorrelation ist] eigentlich nichts anderes als eine Opposition zwischen dem sogenannten 'festen' und dem 'losen' Anschluß eines vokalischen Silbenträgers an einen folgenden Konsonanten. Wenn dabei der Vokal mit festem Anschluß kürzer als der Vokal mit losem Anschluß ist, so ist dies nur eine phonetische Folgeerscheinung. Beim festen Anschluß setzt der Konsonant in einem solchen Augenblicke ein, wo der Vokal noch nicht den Höhepunkt seines normalerweise steigend-fallenden Ablaufes überschritten hat, während beim losen Anschluß der Vokal noch vor dem Einsatz des Konsonanten zur Gänze abläuft."
13 Maas (1992: 244f). The German expression is 'Stützform'.
14 'σ = stressed syllable, °σ = reduced syllable, O = Onset, R = Rhyme, N = Nucleus, C = Coda.
Because of this pattern, there is also sharpening in the related forms, e.g.: (12)
[Syllable-structure diagram: σ σ over brannte 'burned']
4. Computer-based analysis of sharpening: technical requirements
4.1. Corpus

The data for the investigation are based upon the 'Kleines Wörterbuch' by Adelung (1788b), which was published as an appendix to his treatise on orthography (Adelung 1788a) in which his approach is developed. The spelling of the data, however, is modern, i.e. it conforms to the norm which was valid for German from 1901 up to 1996. The input-list contains 10,044 word forms.15 Both foreign words and compounds in Adelung's dictionary were filtered out. The spelling of the former quite often does not correspond to the orthographical conventions of the German language, while the latter are problematic because of multiple stems and stresses, for which ORTHO is not yet equipped. For the analysis of sharpening, all 10,044 word forms have been used. This, of course, means that there is a certain amount of redundancy, as the majority of forms are not sharpening words. On the other hand, generating a partial list would have required committing oneself from the beginning to certain phonological assumptions, which had to be avoided. Adelung's approach and the Official rule demand a modification of the morphological encoding in comparison with Maas' rule. This is necessary, as not only the treatises on sharpening but the entire rule-concepts are based on different morphological analyses: lemmas which end in schwa plus sonorant like Himmel 'sky' or offen 'open', and which are atomic from a modern view, are analyzed in such a way that these 'endings' do not belong to the root but form a suffix: Himm + el, off + en. According to Adelung (1816: 36ff.), this is because suffixes such as +el or +en definitely exist in words in which their status as a suffix is etymologically provable, e.g.: (13)
Schlüssel 'key'    related to    Schloß 'lock'
Flügel 'wing'      related to    Flug 'flight'

15 The complete list is printed in Noack (2000).
4.2. The program ORTHO 3.0

4.2.1. Program architecture
The computer program used for the experiment, ORTHO 3.0, is a rule-based data processing system. It was developed specially for the modeling and evaluation of orthographic theories.16 The program is able to generate orthographic words out of phonological data, as it is provided with a rule apparatus which corresponds to artificial knowledge structures. The phonological forms can either be keyed in manually or be generated automatically by the program (which gives suggestions that can be edited afterwards). For the actual run of the program, i.e. the generation of the written forms, diverse grammatical information is needed, namely:
- the phonological structure of the word
- syllable boundaries
- the morphological structure
- lexical information like part of speech, inflection, and derivation
- assignment to a word family
The following lines give an example for the coding of the input data (Irrtum 'error'; irren 'to be mistaken'): (14)
?Ir+tu:m    {NN-nS}       Irrtum    irrtum
?Ir+@n      {VV-13PIE}    irren     irren
The first column shows the phonological word with morpheme-boundaries but without information on syllable structure, which the program determines by itself in a later step. In addition, stress markers are noted. The transcription is in SAM-PA,17 which is especially adequate for data processing, as it uses exclusively ASCII symbols. In the second column, grammatical information is encoded. 'NN-nS' indicates a Noun derived from a Noun in the nominative Singular; 'VV-13PIE' denotes a Verb derived from a Verb in the 1st and 3rd person Plural Indicative present tense.18 The third position shows the lemma, which identifies the paradigm of the word. Finally, the correct spelling of the word is given.
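The four-column coding described above can be read into a small record structure. The following Python sketch is only an illustration: the class `Entry`, the function `parse_line`, and the assumption that the columns are separated by whitespace are hypothetical and are not taken from ORTHO itself.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    phonemes: str         # SAM-PA string, '+' marks a morpheme boundary
    morphemes: list[str]  # the phonological word split at '+'
    grammar: str          # grammatical code, e.g. 'NN-nS'
    lemma: str            # identifies the paradigm of the word
    spelling: str         # the correct spelling


def parse_line(line: str) -> Entry:
    """Split one input line into its four columns (whitespace-separated
    columns are an assumption of this sketch)."""
    phon, gram, lemma, spelling = line.split()
    return Entry(phon, phon.split("+"), gram.strip("{}"), lemma, spelling)


entry = parse_line("?Ir+@n {VV-13PIE} irren irren")
print(entry.morphemes, entry.grammar, entry.lemma)
```

Grouping entries by their `lemma` field would give the word families to which the rules discussed below refer.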
4.2.2. Rule apparatus

The part of the program the user can modify for evaluating different theories consists of two different types of rules:
i. 'Special' rules (lengthening, sharpening etc.)
ii. Phoneme-grapheme-correspondence rules.

16 The development was promoted by the German Research Council (DFG) as a three-year project called 'Computational Modeling of Orthographic Processes (CMP)' at the University of Osnabrück, conducted by Prof. Dr. Utz Maas and Dr. habil. Helmar Gust.
17 A SAM-PA - IPA correspondence table is given in Köhler (1995).
18 For further information on the notations in ORTHO, cf. Maas et al. (1999).
The rules are context-sensitive, which means that they match only when the focussed element appears in a certain context of other elements. These contexts are laid down in each rule. In rules of type i., certain conditions are formulated which correspond to the respective theory. If such a condition is fulfilled in a word form of the input file, it leads to a 'marking' within this word. These markings are later on translated into special graphemes. For example, the following rule adds a marker ('length: +') to a phoneme [:] when it appears immediately after [i] in a stressed syllable ('): (15)

Lengthening_rule_1: strong_syllable: {' * [p: i] [p: : length: +] | *}
'p:' stands in front of a phoneme to indicate that a phoneme is following (length in this case counts as a phoneme as well). An asterisk ('*') denotes a variable for any phoneme(s). Variables with names ('*consonant') have definite values, e.g. a phoneme out of the class of consonants. A vertical line stands for the boundary of a unit (morpheme or syllable). Each rule in ORTHO has a name (e.g. 'Lengthening_rule_1') which distinguishes it unambiguously from all other rules. Finally, there are constants with definite values. The values are identical with the names of the constants (e.g. 'strong_syllable' or 'root_morpheme'). Apart from this, there are phoneme-grapheme-correspondence rules which are used to 'translate' a phonemic representation into a graphemic representation. Here, again, there are two different types: rules which lead to 'special graphemes', i.e. lengthening, sharpening etc., and rules which translate all remaining unmarked phonemes into their corresponding graphemes. In (16), example a. translates [:] into <e> when it appears immediately after [i] as in [li:t] Lied 'song' (cf. 'Lengthening rule 1' above), whereas b. leads to a default translation of [i] into <i> when no special case is given: (16)
a. Translation_rule_76: : length -> e / i _.
b. Translation_rule_88: i -> i.
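The two-step procedure (first marking, then translation) can be sketched in Python. This is a simplified illustration and not ORTHO's rule engine: the functions `mark_length` and `translate` are hypothetical, stress is taken for granted, unmarked length is assumed to receive no letter, and every other phoneme symbol is simply copied as its default grapheme.

```python
def mark_length(phonemes: list[str]) -> list[tuple[str, bool]]:
    """Step 1 (cf. the lengthening rule in (15)): flag the length sign ':'
    when it immediately follows 'i'."""
    marked = []
    for k, p in enumerate(phonemes):
        flag = p == ":" and k > 0 and phonemes[k - 1] == "i"
        marked.append((p, flag))
    return marked


def translate(marked: list[tuple[str, bool]]) -> str:
    """Step 2 (cf. (16)): a marked ':' becomes <e>; an unmarked ':' yields
    no letter (assumption); all other phonemes are copied by default."""
    out = []
    for p, flag in marked:
        if p == ":":
            out.append("e" if flag else "")
        else:
            out.append(p)
    return "".join(out)


# [li:t] yields the letter string 'liet'; the final <d> of Lied depends on
# rules that are outside this sketch.
print(translate(mark_length(list("li:t"))))
```

Separating the marking step from the translation step keeps the theory-specific conditions in one place, which is what allows different orthographic theories to be exchanged while the correspondence rules stay the same.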
4.3. Implementation of different sharpening-rules in ORTHO

In the next step of the investigation, the three different sharpening-rules explained in chapter 3 are implemented in the program (for a better understanding, each rule is also given as a paraphrase): (17)
Implemented rule according to Adelung
Sharpening_rule_2: root_morpheme: { [p: *consonant sharpening: +] [*vowel] | *}
    if_word_variant following_morpheme: { # @ I i | *}.
(If a root-morpheme ends with a short vowel plus a single consonant, the latter is given a marker if a word-variant exists in which a suffix with a vowel in first position follows the root.)
(18)
Implemented Official rule (Prussia 1902)
Sharpening_rule_3: root_morpheme: { [p: *consonant sharpening: +] [*vowel] [*consonant] | *}.
(In a root morpheme which ends with a short vowel plus a single consonant, the latter is given a marker.) (19)
Implemented rule according to Maas
Sharpening_rule_1: strong_syllable: { * * * [p: *consonant sharpening: +] | *}
    if_word_variant strong_syllable: { * * * [ ] | *}.
(A consonant in the coda of a strong syllable is given a marker (= 'sharpening: +') if there is a word-variant in which the coda of this syllable is empty. 'Empty coda of strong syllable' implies tight connection to a heterosyllabic consonant.)
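The paraphrases of the three rules can be compared in a small Python sketch. It is a strong simplification under stated assumptions and does not reproduce ORTHO's rule language: phonemes are single characters, `SHORT_VOWELS` is an assumed SAM-PA-like inventory, long vowels carry a following ':', and the names `Form`, `official_rule`, `adelung_rule` and `maas_rule` are hypothetical. A word family is given as a list of forms.

```python
from dataclasses import dataclass

SHORT_VOWELS = set("IEaOUY9@")  # assumed inventory of short vowels


@dataclass
class Form:
    root: str    # root morpheme, one character per phoneme
    suffix: str  # following morpheme, '' if there is none
    strong: str  # stressed (strong) syllable of the form


def root_fits(root: str) -> bool:
    """Root ends in a short vowel plus exactly one consonant."""
    return (len(root) >= 2 and root[-2] in SHORT_VOWELS
            and root[-1] not in SHORT_VOWELS and root[-1] != ":")


def official_rule(family: list[Form]) -> bool:
    # Segment-based: the shape of the root alone decides.
    return any(root_fits(f.root) for f in family)


def adelung_rule(family: list[Form]) -> bool:
    # Morpheme-based: additionally, some variant has a vowel-initial suffix.
    return official_rule(family) and any(f.suffix[:1] in SHORT_VOWELS for f in family)


def maas_rule(family: list[Form]) -> bool:
    # Syllable-based: some variant has a strong syllable ending in a short
    # vowel, i.e. tight connection to a heterosyllabic consonant.
    return any(f.strong[-1:] in SHORT_VOWELS for f in family)


brennen = [Form("brEn", "@n", "brE"), Form("bran", "t@", "bran")]
mit = [Form("mIt", "", "mIt")]
walten = [Form("valt", "@n", "val")]

for name, fam in [("brennen", brennen), ("mit", mit), ("walten", walten)]:
    print(name, official_rule(fam), adelung_rule(fam), maas_rule(fam))
```

In this sketch the three predicates agree on the brennen family and on walten, but only the segment-based predicate marks the preposition mit, because it does not consult any other form of the word family.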
5. Results of the computer-based investigation
5.1. Qualitative analysis

In this section, different groups of words are analyzed that seem to be of varying difficulty as far as sharpening is concerned. This differentiation is made because the doubling of consonant letters does not seem to be as unproblematic as the literature quite often suggests. After that, a purely quantitative analysis of the efficiency of the three rules follows in section 5.2.
5.1.1. First group: mono- or disyllabic words with short vowel (20)
[... V C (V...)]root      sharpening!
... V C (+) V ...:        'sky', 'to bark', 'gods', 'masters'
... V C + C ...:          'barked', 'divine', 'rule'
... V C # :               'god', 'master'19
There is no sharpening, however, if two different consonants follow within the root: (21)
[... V C1 C2 (V...)]root      no sharpening!
'cold' (n.), 'break' (n.), 'edge'; 'shirt', 'hard', 'west'
19 + indicates morpheme boundary, # word final position.
In this group, all three rules have attained good results, i.e. ORTHO has only generated sharpening if the respective conditions are fulfilled:
- Adelung: Root ending with short vowel plus single consonant and suffix with initial vowel in at least one form of the word family
- Official rule: Root ending with short vowel and single consonant
- Maas: Tight connection of root-vowel to heterosyllabic consonant in at least one form of the word family
5.1.2. Second group: Monosyllabic words without disyllabic (= trochaic) variant in word family

i. Part of speech (22)

a. <Mitlaut> 'consonant' (Mit- = preposition)
b. <Mittwoch> 'Wednesday' (Mitt- = noun)
Spelling generated by ORTHO with implemented
- Adelung rule: correct (no supporting form within the word family which motivates sharpening)
- Official rule: incorrect in accordance with the rule (*)
- Maas rule: correct (no supporting form within the word family which motivates sharpening)
The false spellings generated by the Official rule can only be corrected through a rule expansion: (23)
"The consonant is written single in monosyllabic, usually weakly stressed words such as an, am [= prepositions]."20 (Prussia 1902: 12)
The ORTHO-rule in (18) has to be expanded correspondingly: (24)
Implemented Official rule (expanded) (Prussia 1902)
Sharpening_rule_3: root_morpheme: { [p: *consonant sharpening: +] [*vowel] [*consonant] | *}.
    syntax: part_of_speech: #a n v

Now the rule applies to adjectives ('a'), nouns ('n'), and verbs ('v') only. The rules according to Adelung and Maas generate mistakes in the opposite case, i.e. words which are written with sharpening even though they have no supporting form: (25)
* for 'quits'; * for 'despite'; * for 'suddenly'; * for 'well-built, imposing'; * for 'back' (adverb)

20 „Man schreibt aber den Mitlaut einfach in einsilbigen, gewöhnlich schwach betonten Wörtern wie an, am."
Adelung, however, has a rather wide view of relatedness between words; in this fashion, he nevertheless obtains the correct spelling in these cases: (26)
A consonant is doubled "in some prepositions, as long as they descend from nouns or are real nouns such as statt [= instead of], sammt [= together with]."21 (Adelung 1788: 223)
ii. Semantic opaqueness: 'obscured' roots

In German, as in other languages, there exist several compounds which are semantically opaque, e.g.: (27)

<Damhirsch> 'fallow deer'; <Brombeere> 'blackberry'; <Himbeere> 'raspberry'
Dam-, Brom-, and Him- are no longer productive roots; the respective compounds are lexicalized. In this respect, these morphemes are not subject to morphological processes. Therefore, they do not build paradigms with both mono- and disyllabic forms which are characteristic for German grammar. This fundamentally differentiates them from words like those in (20), which is why it is interesting to include them in the investigation. The results obtained with semantically opaque roots are the following:
- Adelung rule: correct (no supporting form within the word family)
- Official rule: incorrect in accordance with the rule (short vowel plus single consonant -> *)
- Maas rule: correct (no supporting form)
Here, again, the Official rule needs an expansion: (28)
"The consonant is written only simple in the qualifying element of some compounds which no longer occurs independently."22 (1902: 12)
On the other hand, there is one example of an unproductive root with sharpening: (29)
Bollwerk 'bulwark'
This case is complementary to those in (27), i.e. it is spelled correctly by the Official rule but incorrectly by both Maas' and Adelung's rule (*). What is interesting here are the different viewpoints of the concept of regularity of sharpening: The Official rules describe spellings like denn, wann, daß 'for, when, that' etc. as regular, but an, ab, bis, was 'on, off, until, what' or Brombeere 'blackberry', Damhirsch 'fallow deer' etc. as exceptional. With the syllable- and morpheme-based concepts of Adelung and Maas it is just the opposite.
21 Die Verdopplung findet statt „in manchen Präpositionen, so fern sie von Substantiven abstammen oder wahre Substantive sind, statt, sammt." Cf. the related words bestatten 'to bury', and Stätte 'place', or zusammen 'together'.
22 „Man schreibt den Mitlaut nur einfach in dem Bestimmungswort einiger Zusammensetzungen, das selbständig in dieser Form nicht mehr vorkommt."
iii. Morphological intelligibility: 'Extended' stems

There are some words that are spelled without sharpening despite a related form containing a sharpening marker: (30)

a. 'smack' but 'click'
b. '(smell) of burning' but 'spirits'
This is a special problem: Sharpening is related to the question of how such forms are to be analyzed morphologically and what supporting forms are permitted. In the older spelling of Adelung, there is always sharpening in such words (, , ), as for him, the domain of sharpening is strictly the root to which he counts neither in the former nor and in the latter words.23 The Official rules describe this differently by referring to the word-stem:

One must consider whether the word forms are built by adding inflectional or derivational endings to the stem, or if the stem itself is expanded by consonants like st, t, d. [...] Depending on this the word must be written gebrannt ['burned'], Branntwein ['spirits'], but Brand ['fire'], gekannt ['known'], kenntlich ['distinguishable'], Kenntnis ['knowledge'], but Kunde ['client'].24 (Prussia 1902: 12)
iv. Weakly stressed syllables

Sharpening, just like the other special spellings, is considered to be restricted to strong syllables. Therefore, sharpening spellings should not appear in weakly stressed syllables. Even though there are some words without sharpening in this context, such as (31a.), there also exist several words which have sharpened consonants in the 'lengthened' form, i.e. in the plural, but not in the singular (31b.). What makes it even more problematic is the existence of a third type with sharpening in both numbers (31c.): (31)

a. - 'pilgrim' sing. / pl.
b. - 'queen' sing. / pl.
   - 'polecat' sing. / pl.
   - 'pumpkin' sing. / pl.25
c. - 'anvil' sing. / pl.
The spellings of these orthographically different types generated by ORTHO are of course homogeneous with each rule, whereas the three types are phonologically unique:
- Adelung rule: always with sharpening in roots - ; * , without sharpening in suffixes -> *
- Official rule: same as Adelung rule
- Maas rule: always without sharpening, as it is not the strong syllable -> * *,26 - *
23 Cf. Adelung (1788b).
24 „Zu beachten ist hier, ob die Wortformen durch das Hinzufügen von Biegungsendungen und Ableitungssilben an den Stamm gebildet sind, oder ob der Stamm selbst durch Mitlaute, wie st, t, d, erweitert ist. [...] Demnach ist zu schreiben: gebrannt, Branntwein, aber Brand, gekannt, kenntlich, Kenntnis, aber Kunde."
25 In the spelling that was valid up to 1996, <ss> in final position was substituted by <ß>.
26 The ß-grapheme is obtained here by another rule concerning the spelling of vs. in German.
Adelung always spells words of this kind with sharpening in the singular as well as in the plural "...because they grow when inflected, since the doubling is all the more important because of their weak stress" 27 (1788a: 224). Words with the feminine ending -in / -innen occur only once in the wordlist with a single representative, as they are merely different tokens of the same type.
5.2. Quantitative analysis The second analysis displays the number of mistakes concerning sharpening which were produced by ORTHO with the different implemented rules. The mistakes are sorted by their causes. Since the rules have different premises, the causes for the mistakes are different as well. For instance, Maas' rule does not generate sharpening when the tightly connected vowel is not stressed, cf. 'Unfall 'accident', 'ankommen 'to arrive'. If, on the other hand, the vowel is stressed, sharpening is generated only if this vowel is tightly connected to a heterosyllabic consonant in at least one form of the word family. For the first and second run of the program, the input-data (i.e. the words in phonetic transcription) had to be modified with regard to modern morphological analysis: as already expounded on p. 155, according to Adelung's morphological analysis (as well as that of the Official rules), stems with final schwa plus sonorant (e.g. Himmel 'sky', Mutter 'mother') are split up into root plus (pseudo-)suffix: Himm + el, Mutt + er. The reason for this modification is that the rules are based on this analysis. The first run resulted in the following numbers (a '+' in front of a line indicates cases in which ORTHO has generated sharpening by mistake, whereas ' 0 ' means that ORTHO has mistakenly generated no sharpening):
5.2.1. Adelung's rule This rule produced the following misspellings:
(32)
+ -> single consonant in root-coda / trochaic supporting form / main stress (e.g. * for 'april'): 6
+ -> single consonant in root-coda / trochaic supporting form / second stress (e.g. * for 'practice'): 15
0 -> not coda of root-morpheme (* for 'appetite'): 53
0 -> no supporting form (* for 'now', * for 'for'): 18
total: 92

27 "weil sie in der Biegung wachsen, da denn wegen ihres Nebentones die Verdoppelung noch nothwendiger wird."
Explanation: Adelung's rule generates unnecessary sharpening when the main- or second-stressed root has a 'short' vowel plus a single consonant in final position and when there is at least one related form in the word family in which this consonant is in the onset of the syllable. Sharpening mistakenly failed to be generated when the short vowel was not given the main-stress but preceded the strong syllable, which is the case especially in foreign words, e.g. appe'tit 'appetite', palli'sade 'palisade', parallel 'parallel'. Here, the consonant is not in the final position of the root, which is, however, a condition formulated in Adelung's rule. The question is whether these cases should be counted as sharpening anyway, since they do not show native German spellings, and their prosodic contours are marked in German as well. Sharpening was also not produced in such cases where there was no change between mono- and disyllabic trochaic structures in the paradigm, e.g. * for 'now', * for 'when' (non-inflecting parts of speech), * for 'litmus'.
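The triggering conditions can be made concrete with a small sketch. It is not ORTHO's implementation: the `RootProfile` record, its field names, and the hand-made annotations are hypothetical, and Maas' condition is simplified to "main-stressed syllable with a supporting form", following the description in 5.2. It only illustrates why a secondary-stressed root is sharpened by Adelung's rule but not by Maas' rule, and why both rules fail without a supporting form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RootProfile:
    """Hypothetical hand annotation of one root (not ORTHO's input format)."""
    short_vowel: bool             # 'short' (tightly connected) vowel in the root
    single_final_consonant: bool  # exactly one consonant in the root coda
    stress: str                   # "main", "secondary" or "none"
    has_supporting_form: bool     # a related form puts the consonant in a syllable onset


def adelung_sharpens(root: RootProfile) -> bool:
    # Morpheme-based: main- or secondary-stressed root, short vowel,
    # single root-final consonant, and a supporting form in the word family.
    return (root.short_vowel
            and root.single_final_consonant
            and root.stress in ("main", "secondary")
            and root.has_supporting_form)


def maas_sharpens(root: RootProfile) -> bool:
    # Syllable-based (simplified assumption): only the strong, i.e. main-stressed,
    # syllable qualifies, and a supporting form must exist.
    return (root.short_vowel
            and root.stress == "main"
            and root.has_supporting_form)


# Assumed annotations for two words mentioned in the text.
fall_in_unfall = RootProfile(True, True, "secondary", True)  # 'Unfall 'accident'
denn = RootProfile(True, True, "main", False)                # non-inflecting, no supporting form

for name, root in [("Unfall", fall_in_unfall), ("denn", denn)]:
    print(name, adelung_sharpens(root), maas_sharpens(root))
```

Under these assumptions the sketch sharpens Unfall only with Adelung's condition and leaves denn unsharpened with both, which matches the error types listed for the two rules.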
5.2.2. Official rule (Prussia 1902)
(33)
+ -> single consonant in root-coda / main stress (e.g. * for 'in what'): 40
+ -> single consonant in root-coda / 2nd stress (e.g. *28 for 'tobacco'): 29
+ -> problem of morpheme-boundary (e.g. * for 'schnapps'): 4
0 -> non-final position of root-morpheme (e.g. * for 'last'): 65
total: 138
Explanation: The Official rule generates unnecessary sharpening in words that belong to non-inflecting parts of speech when the root has a short vowel plus a single consonant in final position. The type Schnaps 'schnapps' is problematic because the final -s is synchronically not interpretable as a suffix. Etymologically, the form is related to schnappen 'to snap'. This is why Adelung still prefers the spelling Schnapps (as in the English loan-word today); further tokens of this type in the tested corpus are Klaps 'smack' and Klecks 'stain'. The difficulty is that in all these forms the root is recognizable and exists in other words without the ending -s (cf. schnappen 'to snap', klappen 'to clap', klecke(r)n 'to spill'). The derivation with -s, however, had already become unproductive in Adelung's time. It is not even always semantically intelligible. We have quite an analogous case with words like (ge)samt 'total' in contrast to Erkenntnis 'discovery', which are related to zusammen 'together' and kennen 'to know' respectively. For Adelung, these words are unambiguous: he demands sharpening in all these forms if the root also exists in etymologically related forms without the ending -s or -t. Thus, he spells these words Schnapps, Klapps, and sammt as well as Klecks and Erkenntnis. In comparison, the Official rules prescribe that there be no sharpening when "the stem is expanded by consonants such as st, t, d" 29 (cf. 5.1.2.iii.). Nevertheless, the standardized spelling is inconsistent here, as, on the one hand, it prescribes words without sharpening like Klaps and Schnaps, but, on the other hand, forms with such a spelling like Klecks even though they all belong to the same type. There is no explanation for this difference in the Official rules.30 In the end, the correct or incorrect spelling for this type is a question of morphological analysis. This is why it is listed as a separate case in the analysis above.

28 Tobak ['to:bak] is an obsolete form which nowadays only exists in idioms such as 'starker Tobak' = '(that's a) bit thick'.
5.2.3. Maas' rule
(34)
+ -> incorrectly sharpened in accordance with rule: 5
0 -> not strong syllable (e.g. * for 'peewit', * for 'alone', * for 'marshal'): 79
0 -> no supporting form: 32, of this:
    0 -> no synchronic supporting form (e.g. * for 'instead of', * for 'waiter'): 18
    0 -> no supporting form at all (e.g. * for (conjunction), * for 'litmus'): 14
total: 116
Explanation: While Maas' rule rarely produces unnecessary sharpening, the number of missing sharpenings is quite high (111). The number of foreign spellings (like Batterie 'battery') is 37. Those words in which sharpening has not been generated because there is no supporting form have to be differentiated: there are cases without any supporting form throughout the word family (e.g. plötzlich 'suddenly', denn 'for') as well as cases in which there is synchronically no supporting form even though etymologically related words (which often have vanished) can be found, e.g. zurück 'back' (adverb) - Rücken 'back' (noun). The former group exclusively consists of words which are non-inflecting. As they do not form paradigms, there is - unlike in verbs, nouns and adjectives - no opposition between trochaic and non-trochaic feet. If one excludes forms like Batterie as foreign spellings which are not regular at all in German orthography, the number of incorrectly omitted sharpenings reduces to 79. Table 2 shows a synopsis of the results of the quantitative analysis:
29 „wenn der Stamm selbst durch Mitlaute, wie st, t, d, erweitert ist." [§ 13].
30 Such an explanation is also missing in the modern editions.
Table 2: Total number of sharpening mistakes generated by the different rules

Rule       Unnecessary sharpening   Missing sharpening     Σ
Adelung                        21                   71    92
Official                       73                   65   138
Maas                            5                  111   116
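As a cross-check, the totals in Table 2 can be recomputed from the per-cause counts given in (32), (33) and (34). The short sketch below does only that arithmetic; the dictionary layout and the names `ERRORS`, `summarize` and `TABLE_2` are illustrative choices, not part of ORTHO.

```python
# Per-cause error counts copied from (32), (33) and (34).
# "+" = sharpening generated by mistake, "0" = sharpening mistakenly missing.
ERRORS = {
    "Adelung": {"+": [6, 15], "0": [53, 18]},
    "Official": {"+": [40, 29, 4], "0": [65]},
    "Maas": {"+": [5], "0": [79, 18, 14]},
}


def summarize(errors):
    """Return {rule: (unnecessary, missing, total)} from per-cause counts."""
    rows = {}
    for rule, causes in errors.items():
        unnecessary = sum(causes["+"])
        missing = sum(causes["0"])
        rows[rule] = (unnecessary, missing, unnecessary + missing)
    return rows


TABLE_2 = summarize(ERRORS)
for rule, (unnecessary, missing, total) in TABLE_2.items():
    print(rule.ljust(9), str(unnecessary).rjust(4), str(missing).rjust(5), str(total).rjust(5))
```

The recomputed rows agree with the table: 21/71/92 for Adelung, 73/65/138 for the Official rule, and 5/111/116 for Maas.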
5.3. Conclusion The analyses have shown that the three different conditions of the implemented rules lead to different kinds of mistakes as well. Adelung's rule produces the highest number of correct forms, mainly because it allows sharpening in secondary stressed syllables as well, as long as they parse to the root morpheme. This is especially important for words that have affixes attracting stress (e.g. 'Unglück 'bad luck'). This is why Adelung's morpheme-based rule is more effective than Maas' syllable-based rule. On the other hand, the generating of sharpening obviously depends upon a 'supporting form' which has a vowel tightly connected to a heterosyllabic consonant. This condition restricts sharpening to words that belong to the inflecting parts of speech and makes both Adelung's and Maas' rule more effective than the Official one. The Official rules need expansions for some spellings in order to explain so-called 'exceptions'. This, however, requires more studies for the learners. Such rule-expansions are unnecessary in the concepts of both Adelung and Maas; certain parts of speech are generally excluded from special graphemes like sharpening, as they are non-inflecting and thus show no change between trochaic and non-trochaic contours. Therefore, they do not supply forms to support sharpening. This is why prepositions, conjunctions, or pronouns as well as 'obscured' roots as in Brombeere, Damhirsch etc. are excluded.31 These cases are produced correctly by both Adelung's and Maas' rules respectively without any rule expansions. Altogether, the analyses have demonstrated that sharpening in German orthography is not as unproblematic as it is often declared to be because of its apparent 'regularity'. It is a 'special grapheme' that does not simply mark 'shortness' of the preceding vowel. Thus, it has not a local function but marks units which embrace several phonemes, namely syllables. 
This should have been made clear by the mistakes produced by the Official rule which reduces sharpening to this function.
31 In contrast, some words with sharpening are irregular, such as denn, wann 'for, when'. However, these cases are much less frequent than those without sharpening, such as in, man, von 'in, one (pron.), from'.
6. Summary
The aim of the investigation reported in this article was to show the regularities that are basic to the so-called 'phonographic' part of German orthography and to see if there are any inconsistencies within its standardization. Sharpening has been investigated as a representative for other 'special graphemes' in German such as 'syllable-dividing-h', the 's-graphemes', and 'lengthening'. Computer-based modelling has proved an effective method for such problems because it makes it possible to evaluate different linguistic concepts. On the one hand, alternative concepts can be evaluated reliably while associative aspects and individual mnemonics are excluded. On the other hand, these concepts can be tested on data corpora of any size in quite a short time. For the area of sharpening which has been investigated here, the linguistic concepts by Adelung (1788a) and Maas (1997b) have been implemented, that is, a morpheme-based and a syllable-based concept respectively. These concepts turned out to be to a large extent equivalent. They are, so to speak, 'two sides of one coin'. The differences between a syllable-based and a morpheme-based orthographic analysis are rather marginal and are almost restricted to such forms in which, as an exception, the root does not bear main stress (because of an affix which attracts stress). In addition to this, the segment-based concept of the Official rules (Prussia 1902) has been implemented. Here, special graphemes are reduced to the local level of orthographic encoding. Through this, it emerged that sharpening does not mark the quantity relationship in the first place, but rather the prosodically marked cases of spoken language. One noteworthy indicator of this is the behaviour of non-inflecting parts of speech like prepositions, conjunctions, or pronouns as well as semantically opaque roots. These have no consonant-doubling after short stressed vowels in the majority of cases.
While these spellings are regular within both the morpheme- and the syllable-based concepts, they have to be learned as exceptions from the rule with the Official rules. Conversely, the syllable- and the morpheme-based concepts generate mistakes in words like denn, wann, quitt 'for, when, quits' etc., which are produced correctly by the Official rule. But those cases are not very numerous. So, there is no reason to assume that in German orthography, double-consonants tend to mark primarily shortness of vowels. The primary function of sharpening is the encoding of prosodic and morphological structures. Apart from these prosodic regularities, there are further reasons to focus on syllable cut opposition instead of speaking of 'long' and 'short' vowels. On the one hand, vowel quantity is relative, even in stressed syllables; phonetic analyses show that the quantitative difference can be neutralized in colloquial speech. On the other hand, there is a didactic problem: As spoken language differs strongly throughout the German-speaking area, the terms 'long' and 'short' can be extremely confusing for dialect-speakers in certain regions where this feature hardly plays a role for distinguishing vowels.32 Since the reforms of the 19th century, it has been characteristic of the majority of orthographic concepts to reduce the German spelling system to a linear 'phoneme-grapheme-correspondence' (PGC), i.e. to an exclusively local level of encoding. However, this PGC is only one part within the 'alphabetic' part of orthography and cannot be regarded without taking into account the appearance of graphemes in different syllable positions. Cf. e.g.:

(35)
s = [ʃ] before p and t in the syllable onset: [ʃpi:l] Spiel 'game';
s = [z] in simple syllable onset: [zaɪn] sein 'to be';
s = [s] in all other cases: [lo:s] Los 'lot'.
r = [ɐ] in syllable nucleus and coda: [fa:.tɐ] Vater 'father'; [fi:ɐ] vier 'four';
r = [ʁ] in syllable onset: [ʁo:t] rot 'red'.

32 Cf. Röber-Siekmeyer & Spiekermann (2000) and Lindauer (this volume).
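The position-dependence illustrated in (35) can be sketched as two small lookup functions. This is a simplified illustration, assuming that the two letters at issue in (35) are s and r (as the example words suggest); the function names, the `position` labels and the vowel-letter set are hypothetical helpers, and the sketch covers only the cases listed in the example.

```python
VOWEL_LETTERS = set("aeiouäöüy")


def recode_s(position: str, following: str = "") -> str:
    """Phone for the letter s, given its syllable position and the next letter."""
    if position == "onset":
        if following in ("p", "t"):
            return "ʃ"          # Spiel
        if following in VOWEL_LETTERS:
            return "z"          # simple onset: sein
    return "s"                  # all other cases: Los


def recode_r(position: str) -> str:
    """Phone for the letter r, given its syllable position."""
    return "ʁ" if position == "onset" else "ɐ"   # rot vs. Vater, vier


print(recode_s("onset", "p"), recode_s("onset", "e"), recode_s("coda"))
print(recode_r("onset"), recode_r("nucleus"), recode_r("coda"))
```

The point of the sketch is that the same letter cannot be recoded from a context-free letter-to-sound table: the syllable position has to be an argument of the mapping.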
Because of certain prosodic peculiarities of the German language, especially the syllable cut opposition, the encoding of purely segmental sequences of sounds by means of graphemes is not sufficient. The prosodic oppositions mentioned have to be taken into consideration as well. The attempt to force the global (= prosodic) level of encoding into a local model must inevitably lead to problems, both within the particular linguistic concepts and in didactic models that arise from them. Graphemic linearity is an abstraction of the spoken language that does not work for languages like German. This is why problems occur especially in attempting to describe special graphemes with a strictly linear approach. One way to solve these problems in didactics is to produce artificial pronunciations, e.g. to pronounce [ge:.hən] instead of [ge:.ən] in order to make the 'syllable-dividing-h' audible which is actually not pronounced, or to pronounce [mʊt.tɐ] instead of [mʊtɐ] to indicate double-consonants even though there is only one consonant phonologically. In the linguistic literature, there are attempts to catalogue all possible PGCs, which makes quite a long list in German.33 This is, at the same time, an attempt to motivate the 'phonemic principle'. A similar matter is the names which are given to those graphemes that cannot be covered by 'linearity': Quite often, they are simply labelled 'mute', like 'mute h' (lengthening grapheme). But 'mute' or 'pronounced' is not the issue and, what is more, does not even describe its function. One language in which the opposition of long and short vowels in open and closed syllables is consistently orthographically marked is Dutch. As there is no constant spelling, there is also no unnecessary marking in forms that do not actually 'need' a special grapheme, as is the case in German: if one follows Maas' assumption that sharpening marks a tightly connected vowel to a heterosyllabic consonant, sharpening is necessary in bellen 'to bark' but not in bellt '(he) barks'. The latter is only sharpened because of its relationship to words like the former. In Dutch, however, sharpening only occurs where it is needed, cf. bellen 'to ring' vs. belt '(he) rings'; open stressed syllables with short vowel like in bellen [bɛ.lən] always get orthographically marked while closed stressed syllables with short vowel do not (belt). Lengthening is analogous in Dutch, cf. (ik) speel [spe:l] '(I) play' (closed stressed syllable with long vowel) vs. spelen [spe:.lən] 'to play' (open stressed syllable with long vowel).
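The Dutch pattern described here (marking only where it is needed) can be sketched as a tiny spelling function. It is a deliberately reduced illustration: it assumes a monosyllabic stem with a single coda consonant, the function name `spell_dutch` and its parameters are hypothetical, and it ignores everything else in Dutch orthography.

```python
VOWELS = set("aeiou")


def spell_dutch(onset: str, vowel: str, coda: str, long_vowel: bool, suffix: str = "") -> str:
    """Spell stem (onset + vowel + one coda consonant) plus an optional suffix."""
    # A vowel-initial suffix pulls the stem-final consonant into the next
    # syllable's onset, so the stressed stem syllable would be open.
    open_syllable = suffix[:1] in VOWELS
    if long_vowel:
        # Lengthening: the vowel letter is doubled only in a closed syllable.
        nucleus = vowel if open_syllable else vowel * 2
        return onset + nucleus + coda + suffix
    # Sharpening: the consonant letter is doubled only where the syllable
    # with the short vowel would otherwise look open.
    consonant = coda * 2 if open_syllable else coda
    return onset + vowel + consonant + suffix


print(spell_dutch("b", "e", "l", long_vowel=False, suffix="en"))  # bellen
print(spell_dutch("b", "e", "l", long_vowel=False, suffix="t"))   # belt
print(spell_dutch("sp", "e", "l", long_vowel=True))               # speel
print(spell_dutch("sp", "e", "l", long_vowel=True, suffix="en"))  # spelen
```

A German counterpart would have to carry the doubled consonant over to bellt as well, which is exactly the morpheme constancy that the Dutch system, on this account, does without.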
33 Cf. Nerius et al. (1989).
References
Adelung, Johann Christoph (1788a): Vollständige Anweisung zur Orthographie. Leipzig. Reprint: Heidelberg: Olms 1971.
- (1788b): Kleines Wörterbuch für die Aussprache, Orthographie, Biegung und Ableitung. Leipzig. Reprint: Heidelberg: Olms 1971.
[Arbeitskreis 1995] Internationaler Arbeitskreis für Orthographie (ed.) (1995): Deutsche Rechtschreibung: Regeln und Wörterverzeichnis. Vorlage für die amtliche Regelung. Tübingen: Narr.
Eisenberg, Peter (1998): Grundriß der deutschen Grammatik: Das Wort. Stuttgart: Metzler.
Jespersen, Otto (1913): Lehrbuch der Phonetik. Leipzig: Teubner.
Kohler, Klaus J. (1995²): Einführung in die Phonetik des Deutschen. Berlin: Erich Schmidt.
Lindauer, Thomas (this volume): How Syllable Structure affects Spelling: A Case Study in Swiss German Syllabification.
Maas, Utz (1992): Grundzüge der deutschen Orthographie. Tübingen: Niemeyer (= Germanistische Linguistik 120).
- (1997a): Orthographische Regularitäten, Regeln und ihre Deregulierung. Am Beispiel der Dehnungszeichen im Deutschen. In: Gerhard Augst et al. (eds.): Zur Neuregelung der deutschen Orthographie. Begründung und Kritik. Tübingen: Niemeyer (Germanistische Linguistik 179), 337-364.
- (1997b): Grundstrukturen der deutschen Orthographie. Ein Arbeitspapier. Ms. University of Osnabrück.
Maas, Utz, Helmar Gust, Christian Albes, Christina Noack & Tobias Thelen (1999): Computerbasierte Modellierung orthographischer Prozesse. Abschlußbericht. University of Osnabrück.
Neef, Martin (this volume): The Reader's View: Sharpening in German.
Nerius, Dieter et al. (1989²): Deutsche Orthographie. Leipzig: Bibliographisches Institut.
Noack, Christina (2000): Regularitäten der deutschen Orthographie und ihre Deregulierung. Eine computerbasierte Untersuchung zu ausgewählten Sonderbereichen der deutschen Rechtschreibung. Ph.D. thesis, University of Osnabrück. Published electronically at: http://elib.ub.uni-osnabrueck.de/publications/diss/EDissl 58_thesis_l .ps.
[Prussia 1902] Regeln für die deutsche Rechtschreibung nebst Wörterverzeichnis. Berlin: Weidmannsche Buchhandlung.
Röber-Siekmeyer, Christa & Helmut Spiekermann (2000): Die Ignorierung der Linguistik in der Theorie und Praxis des Schriftspracherwerbs. Überlegungen zu einer Neubestimmung des Verhältnisses von Pädagogik und Phonetik/Phonologie. Zeitschrift für Pädagogik 46: 753-771.
Schlaefer, Michael (1980): Grundzüge der deutschen Orthographiegeschichte. Sprachwissenschaft 5: 276-319.
- (1981): Der Weg zur deutschen Einheitsorthographie vom Jahre 1870 bis zum Jahre 1901. Sprachwissenschaft 6: 391-438.
Trubeckoj, Nikolaus S. (1989⁷): Grundzüge der Phonologie. Göttingen: Vandenhoeck & Ruprecht.
Martin Neef
The Reader's View: Sharpening in German

No matter how closely I study it
No matter how I take it apart
No matter how I break it down
It remains consistent
King Crimson, Indiscipline
1. A reader-oriented orthography
Writing systems are predominantly examined from the perspective of the writer. This is natural from a didactic point of view because learning to write is burdened with many more problems than learning to read. Efforts to reform or simplify the orthography consequently aim at the difficulties faced when writing. From a functional perspective, however, this approach is unreliable. According to Maas (2000: 37), orthography gives "instructions for a reader especially how to read an unknown text."1 Specific characteristics of writing systems, like punctuation, use of small-versus-capital initial letters, or morpheme constancy, are functionally explainable from the reader's point of view only. Therefore, the consistency of a writing system should not be valued against the background of how straightforwardly a writer can encode his linguistic knowledge in orthographic forms but how straightforwardly a reader is enabled to decode grammatical structure (including lexical information) from the orthographic form. Focusing on the relation between letters and sounds, the central question is how a written form can be translated into a spoken form, a process that I want to term 'recoding'. This change of perspective parallels trends in theoretical linguistics. The writer-based view can be judged as being input-oriented since corresponding models derive the written form from the spoken form. There are many examples of this 'encoding' approach, like Bierwisch (1972), Prinz & Wiese (1990), Nunn (1998), Sproat (2000, this volume), and Lindauer (this volume). The reader-based view, on the other hand, is output-oriented in nature: The theoretical analysis takes orthographic surface-forms as the point of departure and studies how relations to grammar and the lexicon can be established, how the linguistic structure can be decoded.
A reader-based orthography in the sense that I envisage asks for constraints on written forms that enable a correspondence between spelling and knowledge structures of the spoken language. This branch of research has notable precursors like Venezky (1970, 1999).2 Recent research in text-to-speech systems takes a similar approach; cf. e.g. van Heuven & Pols (eds.) (1993).
1 „Instruktionen für einen Leser, wie er insbesondere auch einen unbekannten Text erlesen kann."
2 The overview in Rahnenführer (1980) contains hints to some other older reader-based approaches.
In a reader-based approach to orthography, the following Recoding Principle is constitutive for alphabetical and syllabic writing systems:3
(1) Recoding Principle
The written form has to make possible an unambiguous recoding of the spoken form.
In the face of regional and stylistic variation, the written form of a standard orthography can only represent one of the possible pronunciations.4 In German, this specific form is that of Standard High German, actually a high stylistic register that can be characterized as being especially clear. Usually, this stylistic level is termed 'explicit pronunciation' ('Explizitlautung'; cf. Duden 1998: 45, Neef 1997). From a grammatical point of view, this explicit pronunciation has a designated position because lower stylistic forms are characterized by greater simplicity. This notion of 'greater simplicity' can be given a clear meaning in a declarative conception of grammar (cf. e.g. Neef & Neugebauer 2002): Explicit forms are subject to a maximal set of constraints or conditions of well-formedness while lower stylistic forms are subject to only a subset of these constraints. The Recoding Principle governs the assimilation of foreign words. Based on their pronunciation, the spelling of foreign words can be assimilated in such a way as to make their pronunciation immediately obvious from their written form. This means that the word in question can then be read regularly in the target language. The Norwegian word ski,5 for example, is pronounced [ʃi:] both in Norwegian and in German. The conditions of German orthography, however, would regularly relate to the spelling the pronunciation [ski:]. Therefore, the spelling has been changed to which guarantees a regular correspondence of spelling and pronunciation. The pronunciation of a foreign word can also be modified based on the Recoding Principle. This is what happened with the same word ski in both Dutch and English, resulting in the pronunciation [ski:]. German also provides examples for this development.
The English word humbug differs in its pronunciation from the English form ['hʌm.bʌg] both in German, where it is pronounced ['hʊm.bʊk], and in Dutch, where it is pronounced ['hʏm.bʏx], following the regular correspondences of the target languages. Hence, such spelling pronunciations take the Recoding Principle literally (cf. Neef 2000). Frequently, there are also mixed cases where both the pronunciation is assimilated by the phonological target system and the spelling is modified. For example, the English word shawl cannot be pronounced as [ʃɔ:l] in German because lax vowels must not be long. Following the spelling, the vowel in question has been changed into [a:] (although the phonological form would suggest a change into [o:], yielding the permissible pronunciation [ʃo:l]). Additionally, the spelling has been adjusted to .
3 The Recoding Principle is related both to the compatibility requirement and the readability requirement (discussed e.g. in Nunn 1998: 17f.). In the respective models, these requirements constrain the application of general principles.
4 This restriction is known as the 'Principle of Received Pronunciation', cf. Nunn (1998: 3).
5 Since I am almost exclusively dealing with formal aspects of words, I will provide glosses only when the meaning is relevant, e.g. for distinguishing homographs with different grammatical properties, or when the word in question is quite scarce in German.
Subordinated to the Recoding Principle are correspondence relations determining which phonological elements may be represented by which orthographic elements. The following terminological commitments are necessary for a reader-based orthography:
• The primary elements are letters, i.e. in the case at issue the letters of the Latin alphabet (in a colloquial meaning). Letters constitute a level of first-order information and thus meet the condition of epistemological priority in Chomsky's sense (1981: 10). I stipulate that a distinction between vowel letters and consonant letters has already been drawn on this level.
• In alphabetical writing systems, the basic function of letters is to establish a correspondence relation to phonological segments.6 A reader needs linguistic knowledge to recognize the specific function of a letter of the alphabet. If a letter has the functional capacity of establishing a correspondence relation to some phone, it constitutes a graph. Usually, the letter |t| corresponds to the phone [t] as in the word tax, pronounced as [tæks]. Therefore, from a reader-perspective the letter |t| has the function of the graph . Likewise, the letter |a| in this word has the function of establishing a correspondence relation to the phone [æ], thereby constituting the graph . If the relation between letters and graphs were always isomorphic, the need for the latter term would be questionable. However, single letters may also correspond to sequences of phones in alphabetic writing systems. In the same word tax, the letter |x| corresponds to the phone sequence [ks], thus constituting the graph . On the other hand, a graph can also consist of more than one letter if the corresponding phonological segment is a single phone. The word ship contains the complex graph corresponding to the single phone [ʃ]. Because pronouncing a word is a surface matter, the relevant correspondences hold, in a reader-based model, between graphs and surface phonological segments, instead of underlying segments as in at least some writer-based models.
• Whereas the notion of graph only plays a minor role in theoretical approaches to writing systems, the notion of grapheme is usually the center of attention. For the purposes of a reader-based model, it seems reasonable to couple the abstract notion of grapheme to the capacity of graphs to establish correspondences. Different graphs that correspond to the same phone or to the same sequence of phones constitute one grapheme. Phonologically, both book and luck end in [k]. The graphs that represent this phone, however, are different. Thus, the graphs and in the examples given both correspond to the same phone and constitute allographs of an abstract grapheme which may be named « k » . In order to prevent a letter sequence like |spr| as in spring from being judged as a grapheme corresponding to the phone sequence [spR], a prerequisite for the definition of grapheme is that either one of its allographs must be a single letter or the corresponding phonological segment must be a simple phone.
Given this background, the consistency of a writing system can be valued according to the number of spellings that cannot be read regularly. The Recoding Principle can be violated in two respects:
6 In strictly syllabic writing systems, the primary elements correspond to phonological syllables. In logographic systems, the correspondence holds between the respective primary elements and lexemes or morphemes respectively.
• The productive correspondence relations lead to a wrong pronunciation of a specific written word. In English, the ending regularly corresponds to the phonetic form [eɪd] as in aid, maid, braid, afraid. The pronunciation [sɛd] of the spelling is not covered by these relations but is a singular exception (cf. Muthmann 1999: 32), which has to be learned by rote. From a writer-based perspective, Sproat (this volume) terms this kind of lexical inconsistency 'idiosyncratic spelling'. From a reader-based perspective, spellings like these are (at least partly) irregular. This kind of irregular orthographic knowledge is assumed to be part of the lexical entries in question. In the spirit of the Elsewhere Principle, lexically stored orthographic forms have priority over the set of regular correspondences.
• The productive correspondence relations lead to more than one possible pronunciation of a given spelling. The orthographic ending in polysyllabic English simple words corresponds to five different pronunciations when preceded by a consonant letter (cf. Muthmann 1999: 348). Three of these pronunciations occur quite frequently, namely [i:n] as in gasoline, [ɪn] as in crinoline, and [aɪn] as in alkaline. The spelling alone does not provide sufficient information to decide between these correspondences but allows all three of them. This is a case of orthographic underspecification. A comparable case in German is discussed in section 6 below.7
The qualification of a given spelling as either irregular or underspecified depends on a thorough analysis of the underlying spelling system. This is because the question whether an attested correspondence is regular or not can only be answered as a result of conducting a theoretical investigation and not from an a priori classification.
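The two ways of violating the Recoding Principle can be illustrated with a toy recoder. This is only a sketch under stated assumptions: the two-entry correspondence table, the exception list, and the names `REGULAR_ENDINGS`, `LEXICAL_EXCEPTIONS` and `recode_ending` are hypothetical, the endings are taken to be the letter sequences aid and ine suggested by the example words, and only the three frequent pronunciations of the second ending are listed.

```python
# Toy fragment of English ending correspondences (assumed for illustration).
REGULAR_ENDINGS = {
    "aid": ["eɪd"],                 # aid, maid, braid, afraid
    "ine": ["iːn", "ɪn", "aɪn"],    # gasoline, crinoline, alkaline
}

# Lexically stored pronunciations of whole words.
LEXICAL_EXCEPTIONS = {"said": "sɛd"}


def recode_ending(word: str):
    """Return (status, candidate pronunciations) for the ending of a word."""
    # Elsewhere Principle: the more specific lexical entry takes priority
    # over the general correspondence relations.
    if word in LEXICAL_EXCEPTIONS:
        return "irregular", [LEXICAL_EXCEPTIONS[word]]
    for ending, phones in REGULAR_ENDINGS.items():
        if word.endswith(ending):
            status = "regular" if len(phones) == 1 else "underspecified"
            return status, phones
    return "unknown", []


for word in ("maid", "said", "gasoline"):
    print(word, recode_ending(word))
```

In this toy setting maid is read regularly, said is irregular because its stored form overrides the regular ending, and gasoline is underspecified because the spelling licenses several pronunciations.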
2. Writer-based analyses of sharpening
In the following analysis, I will deal with an intensely discussed phenomenon of German orthography. This phenomenon is known under different names like 'consonant doubling' ('Konsonantenverdopplung; cf. Eisenberg 1998, Ramers 1999), 'double-consonant graphemes' ('Doppelkonsonantengrapheme'; cf. Duden 1998: 68), or 'written geminate' ('Schreibgeminate'; cf. Prinz & Wiese 1990: 82). Since I regard the doubling of consonant letters as a special case of a more general phenomenon, I will retain the traditional term 'sharpening' ('Schärfung'; cf. also Maas 1992,2000). The transfer of the Latin alphabet to the German language entails the problem that only five vowel letters (|a|, |e|, |i|, |o|, |u|)8 are available to represent the much higher number of German vowel phonemes.9 The introduction of umlaut letters (|ä|, |ö|, |ü|) improved the proportion of letters to phonemes, but it did not solve the problem. Like Eisenberg (1998: 94f.), I assume 7
8 9
Borgwaldt & de Groot (this volume) find that German is more consistent in the spelling-to-sound direction than in the sound-to-spelling direction. However, their distinction between consistent and inconsistent spellings is not isomorphic to the distinction between irregular and underspecified spellings. The letter |y| is not employed in the core vocabulary of German. Similar problems hold e.g. for Dutch; cf. Nunn (1998: 12).
that there are fifteen full vowels in German, passing over for the remainder of this paper the problems of the reduced vowels schwa and vocalic /r/ (cf. also Neef 1996: 91). Theoretical attempts to reduce the number of vowel phonemes (recently Becker 1998, Lenerz 2000) cannot do without a lexical marking of the vowels in differently pronounced minimal pairs like Wall and Wal. Hence, the fundamental difference between these approaches and the one pursued in this paper is not in the number of vowel phonemes but in the choice of the feature distinguishing the two relevant vowel series. Some approaches, like Wiese (2000), regard vowel quantity as the decisive distinction between these two vowel series. On the surface, however, long vowels like [e:] in leben ['le:.bən] appear short in an unstressed position as in the derived form lebendig [le.'bɛn.dɪç]. Still, this short [e] is different from [ɛ], which is short both in stressed and unstressed positions. Therefore, I will follow the tradition that takes the qualitative difference as the primary one. Usually, the phonological feature [± tense] is used for this purpose. Since this is phonetically not convincing, at least for the necessary distinction of the two /a/-sounds, I find it more promising to term the two vowel series 'centralized' versus 'peripheral', following Moulton (1962: 62),10 with the vowel in Wall being centralized and the one in Wal being peripheral. The corresponding phonological feature is [± peripheral]. In my approach, only full vowels, but not reduced vowels, are marked for the feature [± peripheral]. Since the fifteen phonological full vowels of German cannot be distinguished orthographically by means of simple vowel letters, German orthography makes use of special graphemic signs, which are called sharpening signs and lengthening signs respectively. In principle, it would be sufficient to mark only one of the vowel series explicitly.
For example, the peripheral vowels could be marked throughout with lengthening signs whereas the unmarked cases would represent centralized vowels. However, the situation in German orthography is more complex. Only in the case of the peripheral [i] is lengthening marked more or less consistently by the complex graph <ie> as in Tier (with a number of exceptions like Igel or Liter). The typical lengthening sign for the other peripheral vowels, namely the so-called lengthening-h,11 only appears in front of graphs corresponding to sonorants, and even in this sub-group of spellings, the lengthening-h appears in only 200 of 500 relevant cases (according to Augst & Stock 1997: 120; cf. also Ossner 2001). The marking of sharpening is more regular than the marking of lengthening, and it also carries over to the foreign vocabulary. The two most important approaches to this subject follow an input-oriented perspective in that they ask how the written form can be derived from the phonological form.12 The first of these analyses can be found in the Official rules of German orthography (cf. Duden 1996: 863):
10 Actually, Moulton calls the two classes of vowels 'centralized' versus 'decentralized'. His main motivation for this terminology is the observation that the articulatory difference between the two vowels in Wall versus Wal is that the latter is produced with a lower jaw position than the former.
11 In Dutch, the letter |h| is employed to mark the shortness of word-final vowels as in bah; cf. Nunn (1998: 12).
12 Noack (this volume) gives a computer-based comparison of three different input-oriented approaches to sharpening in German.
(2) Segment-based approach
"If in a word stem, only a single consonant follows a stressed short vowel, then the shortness of this vowel is marked by doubling of the consonant letter."13
This wording is somewhat vague as to which aspects are of a phonological nature as opposed to an orthographic nature. Despite this shortcoming, the distribution of sharpening signs is defined in phonological terms only, and the marking of sharpening itself is described as a doubling of a consonant letter. Because a segmental property of the vowel phoneme is said to be the trigger of the sharpening spelling, I will characterize this approach as 'segment-based'.14 In contrast, Peter Eisenberg, among others, pursues a different strategy. The core of his sharpening analysis is contained in the following formulation (cf. Eisenberg 1998: 297):
(3) Syllable-based approach
"A double-consonant grapheme appears every time there is an ambisyllabic consonant (syllable link) in the phonological word. The grapheme that corresponds phonographically to the ambisyllabic consonant has to be doubled."15
13 „Folgt im Wortstamm auf einen betonten kurzen Vokal nur ein einzelner Konsonant, so kennzeichnet man die Kürze des Vokals durch Verdopplung des Konsonantenbuchstabens." This rule was already part of the efforts towards a unified orthography of German in the late 19th century; cf. Bramann (1987: 88).
14 Augst (1991) and Ramers (1999) call this position 'stress-based' ('akzentbasiert'), which is misleading given that the competing approach turns out to be stress-based as well.
15 „Ein Doppelkonsonantengraphem erscheint immer dann, wenn im phonologischen Wort ein ambisilbischer Konsonant (Silbengelenk) auftritt. Verdoppelt wird das Graphem, das dem ambisilbischen Konsonanten phonographisch entspricht."
According to Eisenberg (1997: 332), phonological ambisyllabicity only appears with a stressed front syllable, which indicates that this approach is as stress-based as its competitor. According to a comparison of these two approaches in Ramers (1999), the empirical differences between them are minimal (including necessary supplementary rules). Since my own reader-based approach, introduced in the next section, is more closely related to the segment-based approach, I will point to some problematic aspects of Eisenberg's assumptions here. Eisenberg approaches the function of sharpening in an indirect manner. According to him, sharpening does not have the function of encoding a vowel quality, but of encoding an aspect of syllable structure. Nevertheless, Eisenberg (1998: 129) regards ambisyllabicity in German as motivated by a constellation of a single consonant following a centralized vowel and preceding a further vowel. This consonant, then, both closes the front syllable and fills the onset of the back syllable. In this way, ambisyllabicity marks the vowel quality of the preceding vowel because it never shows up after peripheral (nor after reduced) vowels. A central claim of Eisenberg's analysis is that ambisyllabicity always triggers the doubling of a consonant whereas not every doubled consonant derives from ambisyllabicity. Here, the principle of morpheme constancy comes into play. In German as in many other languages, stem morphemes are written in a constant form, irrespective of the grammatical context. From a theoretical point of view, however, stems are usually construed as lexical codings of lexemes that are reduced from predictable information. Therefore, the spelling of a word cannot straightforwardly depend on the information contained in a lexical stem because stems include neither information about syllable structure (hence, no information about ambisyllabicity) nor information about the context needed for ambisyllabicity throughout. The stem of the verb schaffen, e.g., is /ʃaf/, which gives no hint to the ambisyllabic character of the final fricative. Consequently, Eisenberg relates the concept of morpheme constancy to a specific surface form which he calls 'prosodically determined explicit form'. According to this conception, stems are always written the way they are in a form that ends in a reduced syllable. Thus, the imperative form of the verb schaffen is written with a double consonant because there exists a word form with the same stem and a final reduced syllable where the fricative in question is in ambisyllabic position. Such explicit forms can be found with differing degrees of ease depending on the part of speech:
(4)
prosodically determined explicit form (following Eisenberg 1998)
a. verbs: infinitive
b. adjectives: attributive form
c. nouns: nominative plural
          genitive singular
Verbs always have such an explicit form at their disposal, namely with the infinitive, which is the citation form. For adjectives like straff, the attributive form gives the context needed. The situation is slightly more complicated with nouns. Many nouns belonging to the core vocabulary have a suitable form of the nominative plural, which is characterized by a final reduced syllable (cf. Neef 1998). This is true for nouns like Schiff with its plural form Schiffe, but neither for nouns like Hass or Schmuck, which have no plural forms at all, nor for words like Scheck, which take the s-plural (Schecks) that provides no final reduced syllable. In these cases, the genitive singular may stand in, but with some crucial qualifications: First, only masculine and neuter nouns mark their genitive singular at all, whereas for feminines the genitive singular is identical with the nominative singular. Second, only non-feminine nouns ending in a sibilant regularly mark their genitive singular with a final reduced syllable (des Hasses), whereas the regular ending for other non-feminine nouns is simply -s (des Schmucks). Third, some of the latter words have a form for the genitive singular that ends in a reduced syllable (des Schmuckes), but this syllable is a marked or obsolete inflection form. Fourth, this kind of genitive marking is not available at all for polysyllabic nouns and foreign nouns (*des Kuckuckes, *des Scheckes). Finally, the genitive singular is suspect as a prosodically determined explicit form because of its textual infrequency. In language acquisition, the genitive is the last case to be acquired. According to Wegener (1995), many children do not acquire this category at all. The concept of a prosodically determined explicit form is a concept for the writer only. It is conceptually problematic because it predicts that while writing a specific word form, a different word form has to be present in thought.
In some cases, this related form has a designated status as a citation form, but in other cases it is quite marginal. Moreover, there exists a number of words ending in a double consonant that have no suitable explicit form at all. To substantiate this claim, I present a list of words ending in <ff> (based on Muthmann 1988) that have, according to Duden (2000), no plural form ending in a reduced syllable. Furthermore, words that have no genitive form containing schwa (according to the same source) are preceded by an arrow in the following list.
(5) Consonant doubling in nouns lacking a reduced syllable plural form (with the example <ff>, based on Muthmann 1988 and Duden 2000)
a. no plural
   →Effeff (no inflection)
   Gewaff ('canine tooth of a wild boar')
   Kaff ('chaff')
   Muff ('mould')
   →Off
   Puff (neuter, dice)
   Suff
   →Töff (Swiss 'motorbike'; zero plural)
   →Töfftöff (child language)
   →Zoff
b. only s-plural
   →Bluff
   →Mastiff
   →Puff (masculine, 'brothel')
   Reff (nautical)
   Riff (jazz)
   →Sheriff
   →Treff (neuter, 'clubs')
   →Treff (masculine, 'meeting')
   →Tuff ('tuft')
c. also s-plural
   →Ganeff (zero-genitive)
   Haff
   →Kabuff
   →Kaff ('dump')
These words do not allow the explanation of consonant doubling through ambisyllabicity. Therefore, Eisenberg's analysis falls short in cases like these or is in need of some further device yet to be developed. The segment-based approach is capable of explaining these cases, except the reduplicated forms Effeff and Töfftöff and the words Sheriff and Mastiff; the exceptions in these four cases are due to the consonant doubling that appears in an unstressed syllable.
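The segment-based rule in (2) can be sketched as a small writer-based procedure, which also shows why a word like Mastiff falls outside it. This is only an illustration under strong assumptions: the `Segment` class, the function `spell_segment_based`, and the hand-coded phonological descriptions (including which vowel is stressed) are hypothetical, and each segment is simply paired with one default letter.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    letter: str             # default letter for this segment (assumed given)
    is_vowel: bool = False
    short: bool = False     # only meaningful for vowels
    stressed: bool = False  # only meaningful for vowels


def spell_segment_based(stem):
    """Spell a stem, doubling a single consonant after a stressed short vowel."""
    letters = []
    for i, seg in enumerate(stem):
        letters.append(seg.letter)
        prev = stem[i - 1] if i > 0 else None
        nxt = stem[i + 1] if i + 1 < len(stem) else None
        # "Single consonant": preceded by a vowel, not followed by a consonant.
        is_single = (
            not seg.is_vowel
            and prev is not None
            and prev.is_vowel
            and (nxt is None or nxt.is_vowel)
        )
        if is_single and prev.short and prev.stressed:
            letters.append(seg.letter)  # the doubling demanded by rule (2)
    return "".join(letters)


# Hand-coded inputs (assumptions for illustration only).
kaff = [
    Segment("k"),
    Segment("a", is_vowel=True, short=True, stressed=True),
    Segment("f"),
]
mastiff = [
    Segment("m"),
    Segment("a", is_vowel=True, short=True, stressed=True),
    Segment("s"),
    Segment("t"),
    Segment("i", is_vowel=True, short=True, stressed=False),
    Segment("f"),
]
```

Under these assumptions the sketch yields the doubled spelling for Kaff but leaves the final consonant of Mastiff single, because the preceding vowel is unstressed.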
3. A reader-based sharpening constraint
From the perspective of the reader, it is relevant to ask under which circumstances a simple vowel graph refers to a peripheral and to a centralized vowel respectively. The general correspondence relations allow both possibilities. This claim can be illustrated with the pair Herde/Horde, where the orthographically unmarked vowel graph of the first example corresponds to a peripheral (tense) vowel phone, and the one of the second example corresponds to a centralized (lax) vowel phone. A substantial restriction of the possible correspondences of simple vowel graphs is given in the following sharpening constraint. The extent to which stress is relevant for the subject of sharpening will be discussed in section 7, where a modified version of the sharpening constraint will be presented.
(6) Restriction of the correspondences for centralized vowels ('sharpening constraint') (first version)
If in a word fewer than two consonant letters immediately follow a simple vowel graph, this vowel graph must not correspond to a centralized vowel.16
Simple vowel graphs consist of exactly one vowel letter, as opposed to complex vowel graphs which may also contain lengthening signs. The sharpening constraint restricts the general reader-based correspondences which state that as a default every simple vowel graph may represent both a peripheral and a centralized vowel. The sharpening constraint is a context-sensitive restriction on this default. In a spelling like <Wald> there are two consonant letters following a simple vowel graph, which may therefore correspond either to a peripheral or to a centralized vowel. The actual pronunciation of this word is [valt] with a centralized vowel. The spelling <Wal>, on the other hand, has only one consonant letter following the vowel graph, which must therefore correspond to a peripheral vowel; its pronunciation is [va:l]. A problematic situation arises if the correspondence relations motivate only one consonant letter following a vowel graph that is meant to correspond to a centralized vowel as in [val]. If only one consonant grapheme follows a simple vowel graph that corresponds to a centralized vowel, this consonant grapheme is in a constellation that I will refer to as a 'sharpening position'. A consequence of the sharpening constraint could be that there is no orthographic representation available at all for this phonological form since the sharpening constraint demands that there be two consonant letters in order to enable the vowel graph to correspond to a centralized vowel. In order to establish the intended correspondence without violating the sharpening constraint, German orthography makes available specific allographs for consonant graphemes, which I call 'sharpening allographs'. These allographs thus enlarge the stock of graphs (the special sharpening allographs will be discussed in the next section):
(7) Sharpening allographs
a. Consonant graphemes have double-consonant allographs.
b. Exceptions: Special sharpening allographs are:
   <ck> for «k»
   <tz> for «z»
   <chs> for «x»
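From the reader's side, the first version of the sharpening constraint amounts to counting the consonant letters that immediately follow a simple vowel graph. The following Python sketch illustrates this; the names `possible_qualities` and `consonant_letters_after` are hypothetical, and the sketch assumes that the caller passes the index of a simple vowel graph only (complex vowel graphs, diphthongs, stress and closed-class words are ignored).

```python
VOWEL_LETTERS = set("aeiouäöü")


def consonant_letters_after(word, index):
    """Count the consonant letters immediately following position `index`."""
    count = 0
    for ch in word[index + 1:]:
        if not ch.isalpha() or ch in VOWEL_LETTERS:
            break
        count += 1
    return count


def possible_qualities(word, index):
    """Vowel qualities the simple vowel graph at `index` may correspond to."""
    word = word.lower()
    if word[index] not in VOWEL_LETTERS:
        raise ValueError("index does not point to a vowel letter")
    if consonant_letters_after(word, index) < 2:
        # Fewer than two consonant letters: a centralized reading is excluded.
        return {"peripheral"}
    # Two or more consonant letters: the default applies, both readings remain.
    return {"peripheral", "centralized"}
```

Note that the function never returns the centralized reading alone: the constraint only excludes readings, so a spelling with two following consonant letters remains open to both vowel qualities.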
The default realization of a consonant grapheme is a simple graph. The double form appears under specific circumstances only, namely, to fulfil the restriction in (6). Since there seems to be no other reason for the appearance of sharpening allographs,17 an unambiguous correspondence between a simple vowel graph and a centralized vowel is given if the former precedes a sharpening allograph. Double-consonant graphs do not hinder the Recoding Principle because
16 This constraint borrows from the traditional two-letter-rule which is for example implicitly given in Augst (1985: 61), admittedly in the form of a writer-based rule and restricted to stressed syllables.
17 In proper names like Schrempp or Württemberg there seems to be no motivation for the double consonant at all. The readability of these words is not disturbed, however, because the respective double-consonant graphs are established as allographs of simple consonant graphs.
German has no phonological geminates, neither tautosyllabic nor heterosyllabic when both syllables belong to the same phonological word. Therefore, double-consonant graphs can be read neither as two distinct phonological consonants nor as long consonants. In the following, I will first have a look at the advantages of the sharpening constraint before I turn to areas which behave as exceptions with respect to this constraint. As a consequence of the analysis, it will be discovered that German orthography contains instances of orthographical underspecification. Finally, I will investigate the influence of stress on sharpening and thereby slightly modify the given constraint.
4. Achievements of the sharpening constraint
Usually, consonant graphemes in a sharpening position are realized as double-consonant graphs or as special sharpening allographs. In certain cases, however, the default allograph is sufficient: (8)
Polygraphs without doubling [table rows a.-c. not recoverable from the source; the first row concerns [x], and starred forms mark unattested spellings]
Given a writer-based perspective, an explanation of this distribution is not straightforward in phonological terms because the corresponding phonological segments do not form a natural phonological class. The respective input-oriented models are thus forced to postulate some additional devices supplementary to the original rule of consonant doubling like Eisenberg's graphematic subregularities (1998: 298) or Nunn's autonomous spelling rules (1998: Ch. 5). From a reader-based perspective, however, the absence of consonant doubling in these cases is absolutely regular: A consonant graph does not have to be doubled if the default allograph already consists of more than one consonant letter. This correlation is explained by the sharpening constraint that refers to the number of letters following the vowel graph in question. Since a consonantal polygraph consists of at least two consonant letters, there is no need to add a further consonant letter because the sharpening constraint has already been fulfilled. In this sense, the cases in (8) fall under the scope of the sharpening constraint and have to be regarded as strictly regular cases. There are also examples for the opposite constellation: a sharpening allograph appears even though there are already two consonants following a centralized vowel on the phonological level. From an input-oriented perspective, there is thus no motivation for a sharpening allograph. In order to understand these cases, it is first of all necessary to realize that sharpening allographs do not necessarily consist of a doubled consonant letter. Since the sharpening constraint only demands that there be two consonant letters following a simple vowel graph, a sharpening allograph consisting of two different consonant letters would also suffice. The clearest example for this case is the sharpening allograph of the grapheme «k»:
(9) irregular sharpening allographs I [example rows a. and b. not recoverable from the source; starred forms mark unattested spellings]
For words of the indigenous vocabulary, the sharpening allograph of the grapheme «k» is not a double-consonant graph but the exceptional graph <ck>. This irregular case, which has to be learned individually under any theoretical perspective, fits well into the sharpening scenario since the digraph <ck> appears in only those constellations where the monograph <k> would violate the sharpening constraint.18 The double-consonant graph <kk> can only be found in the foreign vocabulary. This means that although double-consonant spelling is a central characteristic of German orthography, the double-consonant graph <kk> clearly marks a word as foreign. Two further consonant graphemes have, as indicated in (7b), an irregular sharpening allograph. They both consist of more than one consonant letter but are different from a mere double-consonant graph:
(10)
irregular sharpening allographs II
a. <tz>  indigenous: Katze, Hitze
   <zz>  foreign: Skizze, Pizza
b. <chs> indigenous: Achse, Fuchs, wechseln, sechs
   <x>   indigenous: ausbüxen, Hexe, Jux
         foreign: Affix, Exemplar, faxen
As in the case of the grapheme «k», the grapheme «z» has a special sharpening allograph for the indigenous vocabulary: <tz>. The double-consonant graph <zz> can only be found in a number of foreign words, with about ten of them belonging to the regular vocabulary. The picture is less clear for the sharpening spelling of the grapheme «x». The sharpening constraint raises the expectation that, at least for the indigenous vocabulary, this grapheme would have an allograph consisting of more than one consonant letter. This allograph actually exists in the form of <chs>. However, the simple allograph <x> appears in sharpening positions not only in foreign words but also in a number of indigenous words. In fact, the Official rules prefer to regard the simple graph <x> as the regular spelling of the consonant cluster [ks] (Duden 1996: 871) because of its higher type frequency (cf. Augst & Stock 1997: 117). Since indigenous words predominantly use the sharpening allograph <chs>, with the allograph <x> being mostly restricted to foreign words and proper names, I find it more convincing to regard <chs> as the regular sharpening allograph for the grapheme «x». In a different respect, the spelling of <x> is definitively irregular. In sharp contrast with the general regularities of sharpening, any vowel graph preceding the graph <x> corresponds to a centralized vowel. This regularity leads to problems for those morphologically simple words that contain a sequence of a non-centralized vocalic segment and the consonant cluster [ks]. Such cases are rare but existent:
18 The status of <ck> as a sharpening allograph had been even clearer in the older orthographic norm up to 1998 because in this norm the graph <ck> alternates with <k-k> in a hyphenation position.
(11) simple words with a peripheral vowel or diphthong plus [ks] (complete list, based on Duden 2000)
a. Keks, Koks, piksen, quieksen, staksen
b. Deichsel, Weichsel
c. Wuchs
The examples in (11a) demonstrate that the constellation in question can be represented orthographically without any problem. Here, there is neither a sharpening position nor a sharpening allograph. In two items, however, the sharpening allograph follows a diphthong, hence in a position where sharpening allographs are not to be expected (cf. (11b)). The same constellation also allows spelling with <x> as in Deixel or feixen. In any case, the spellings in (11b) do not cause any problems for reading because diphthongs have unambiguous correspondences. The spelling in (11c), however, is misleading because here a sharpening allograph follows a vowel graph corresponding to a peripheral vowel. Morphologically, the noun Wuchs is derived from the verb wachsen (containing a centralized vowel) with the 3rd ps. sg. form wuchs (containing a peripheral vowel), which may motivate the irregular spelling. The important thing with the data in (10) is the following: These cases are spelled with a sharpening allograph although the phonological representations do not contain a sharpening position (the centralized vowels are followed by a consonant cluster).19 Hence, the writer-based approaches in (2) and (3) are not able to explain the appearance of these sharpening allographs. From a reader-based perspective, on the other hand, the sharpening spelling is regular because only the phonological clusters [ts] and [ks] have a default spelling with a monograph. Orthographically there is thus a sharpening position in words like Katze or Fuchs because a simple vowel graph, which is meant to correspond to a centralized vowel, is followed by a single consonant grapheme which has a monograph as its default realization. Therefore, in these cases the existence of a sharpening polygraph is not motivated by phonological considerations but by orthographic considerations: to satisfy the sharpening constraint.
Admittedly, there is still a certain degree of irregularity concerning spellings with «x», but these are mostly restricted to the foreign vocabulary.
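The choice of graph for a consonant grapheme in a sharpening position, as described in this section for the indigenous vocabulary, can be summarized in a short Python sketch. The function name `sharpening_graph` and the table `SPECIAL_SHARPENING` are hypothetical; the input is assumed to be the default allograph of a consonant grapheme, and the polygraphs used in the check below (ch, sch, ng) are illustrative assumptions.

```python
# Special sharpening allographs of the indigenous vocabulary, cf. (7b).
SPECIAL_SHARPENING = {"k": "ck", "z": "tz", "x": "chs"}


def sharpening_graph(default_graph):
    """Return the graph used for a grapheme in a sharpening position."""
    if len(default_graph) >= 2:
        # A polygraph already supplies two consonant letters, so the
        # sharpening constraint is satisfied without any doubling.
        return default_graph
    # Monographs: use the special allograph if there is one, else double.
    return SPECIAL_SHARPENING.get(default_graph, default_graph * 2)
```

The point of this design is that the decision depends only on the number of letters in the default graph and not on any phonological class of the corresponding segments.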
5. Systematic and unsystematic exceptions to the sharpening constraint
The reader-based sharpening constraint marks specific data as being exceptional. In part, these data are systematic in nature. First, closed classes, like function words, prepositions and affixes, fall out of the scope of the sharpening constraint:
(12) Exceptions I: closed classes
a. determiner: das
   pronouns: es
   prepositions: ab, bis, in
   conjunctions: ob, um
b. affixes: un-; -in, -nis
19 The orthographic peculiarities cannot be explained through reference to the status of [ts] as an affricate because the cluster [ks] is usually not valued as an affricate while the other affricate in German, [pf], is always spelled with the digraph <pf>.
In all these examples, a simple vowel graph corresponds to a centralized vowel although it is followed by one consonant letter only. Hence, the sharpening constraint can be relevant only for the lexical categories adjective, noun, and verb. This does not prevent some closed-class items like denn, wenn, and dass from containing sharpening allographs. The segment-based approach shares this view while the syllable-based view takes most of these data as being regular: those that have no related forms showing ambisyllabicity. I think that closed classes in German behave fundamentally differently from lexical classes. Functionally, it makes sense to subject the lexical classes to different rules or constraints than the closed classes because the former contribute to the meaning of an utterance whereas the latter contribute to the grammatical structure. Closed classes not only differ in the application of the sharpening constraint, but they also lack, for the most part, lengthening signs (cf. the pronoun denen as opposed to the homophonous verb dehnen), which leads to significantly shorter spelling forms for closed-class items than for lexical items. This different behaviour should be reflected in the theoretical analysis as it is in the present study.20 Furthermore, it is to be expected that the sharpening constraint may be violated by foreign words. The foreign character of these items is in part manifested by their deviant spelling. Especially words that show no sharpening spelling in their language of origin may violate the sharpening constraint. As an example, the following list gives the complete data of words containing the monograph <f> in sharpening position, based on Duden (2000).
(13)
Exceptions II: non-assimilated foreign words (with the example of <f>, complete list based on Duden 2000)
a. clear cases: Chef, Rel'ief [Rel'jef], 'Rififi, Tar'tufo, vif
b. phonological variation: 'Mafia, 'Trafo
c. proper names: Sif, 'Stefan
d. abbreviations: cif, Prof
The masculine noun Chef lacks an inflected form that shows ambisyllabicity (due to its s-plural), but it has a derived female form, which is to be spelled <Chefin> according to Duden (2000). The plural of Relief is Reliefe, which makes a sharpening allograph strongly expected. In the cases in (13b) there is regional variation between a pronunciation with a centralized and
20 For similar reasons, Borgwaldt & de Groot (this volume) exclude closed-class items from their investigation.
a peripheral vowel. Altogether, these exceptions are small in number and inconspicuous in general. Conversely, there are both indigenous and foreign words that have a sharpening allograph in a position where it is not motivated by the sharpening constraint: in front of a further consonant graph. Again, I have counted as an example words containing <ff>, based on Duden (2000). It is important to notice that none of these words violates the sharpening constraint. The only peculiarity is that the sharpening constraint cannot straightforwardly be made responsible for the respective sharpening allographs. I will show, however, that there are some more or less convincing arguments that deem these sharpening allographs necessary.
(14)
Exceptions III: presumably unexpected sharpening allographs (with the example of <ff>, complete list based on Duden 2000)
a. ausgebufft, Fuffziger, Schaffner, versifft
b. Daffke, Raffke; Mufflon, Töfftöff
c. Chiffre, Dufflecoat, Shuffle(board), Souffle, soufflieren
d. Affrikate, Affront, Diffraktion, Effloreszenz, Suffragan, Suffragette
According to the sharpening constraint, a sharpening allograph is only motivated to guarantee two consonant letters to follow a simple vowel graph that is meant to correspond to a centralized vowel. A further consonant letter following this sharpening allograph is therefore an indication of a morphological boundary in between these two consonant graphs. In certain cases as in the noun Hoffnung (based on the verb stem hoff-) or in the verb waffnen (based on the noun Waffe) the motivation of the consonant graph following the sharpening allograph is unclear. The sharpening allograph itself, however, is motivated by stem constancy. Therefore, these words do not belong in the list in (14). In the cases in (14a), the respective affixes indicate morphological complexity, but the bases of these words have the synchronic status of bound roots. For the cases in (14b), morphological complexity is somewhat harder to argue for. The words ending in -ke are typical for the dialect spoken in Berlin, with -ke itself arguably being a diminutive suffix. Mufflon derives from the French word mouflon. In German, it may have been reanalyzed as a derivation to the base Muff 'muff'. The child-language noun Töfftöff can be said to be a reduplication based on the noun Töff, which motivates the sharpening allograph. The foreign words in (14c and d) (plus some related words like Souffleuse or effloreszieren) either have retained their sharpening spellings from their language of origin (cf. (14c)) or they must be analyzed as prefixed structures with the prefix-final consonant being assimilated (e.g. Affrikate being derived from Ad-Frikate). An even more interesting case can be seen in recent tendencies to assimilate English words. In the context of the present study, assimilation is realized by the introduction of a sharpening allograph if the respective graph appears in a sharpening position:
(15) Assimilation of English loans
     old      new      (related verb)
a.   Mop   →  Mopp     (moppen)
     Step  →  Stepp    (steppen)
     Stop  →  Stopp    (stoppen)
     Tip   →  Tipp     (tippen)
b.   Bob   →  *Bobb    (bobben)
     Job   →  *Jobb    (jobben)
     Lob   →  *Lobb    (lobben)
     Mob   →  *Mobb    (mobben)
The spelling reform of 1996 changed the spelling of the nouns in (15a) in accordance with the related verbs that have been in use with the sharpening spelling for a longer time. The nouns in (15b), however, also have related verbs showing sharpening spelling, but the reform insisted on retaining their original spelling in clear violation of the sharpening constraint. Phonologically, there is hardly any difference between the cases in (15a and b), with Mop and Mob even being homophones. Orthographically, the different behaviour is explainable with recourse to a hitherto only implicitly known constraint on German spellings that prohibits words from ending in <bb>. In fact, there are no German words showing this constellation (based on Muthmann 1988 and Duden 2000). The existence of a verb like jobben, however, leads to the expectation that there may exist inflectional forms ending in <bb>: the imperative form jobb. I am not aware whether forms like these can actually be found. An interesting related case is the partly-assimilated form of the English noun club, which is pronounced [klup] in German but is written <Klub>, without a sharpening allograph, violating the sharpening constraint but following the constraint on final <bb>. It suggests itself that this constraint extends to the graphemes «d» and «g». Duden (2000) actually contains no words ending in <dd> while Muthmann (1988) lists the rare words Odd and Mudd, the latter being a spelling variant to Mud, which is to be pronounced with a centralized vowel. The data are slightly less clear for the grapheme «g»:
(16)
Words with «g» in final sharpening position (bases: Muthmann 1988 and Duden 2000)
a. Brigg, Lagg, Rigg
b. Beg, Clog, Grog (but: groggy), Log (only e-plural), Reg, Smog
Besides some proper names like Scheidegg and Rigg, there are three words ending in ⟨gg⟩ in German (cf. (16a)). While Lagg has been borrowed from Swedish together with the sharpening allograph, the two other cases are assimilations from the English words brig and rig respectively. On the other hand, there is a larger number of foreign words lacking the sharpening allograph in word-final sharpening position (cf. (16b)). The indigenous vocabulary shows at least one case where word-final ⟨gg⟩ is suppressed: in words ending in ⟨ig⟩ like König (cf. also (23) below), where the ⟨g⟩ appears in a sharpening position. The discussion of these data seems to confirm the following constraint: (17)
Constraint on word-final sharpening allographs
The sharpening allographs ⟨bb⟩, ⟨dd⟩, and ⟨gg⟩ must not appear in word-final position.
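The interaction between the sharpening constraint and the constraint on word-final sharpening allographs can be illustrated with a small sketch. It is not part of the analysis itself: the function names are hypothetical, and the sketch assumes loans that end in one simple vowel letter plus one consonant letter, so that the final consonant stands in a sharpening position.

```python
# Sketch of the word-final constraint and the loan pattern in (15).
# Assumption: each loan ends in a simple vowel letter plus one consonant
# letter, so doubling the last letter yields the sharpening spelling.

BLOCKED_FINAL = ("bb", "dd", "gg")  # allographs barred in word-final position


def violates_final_constraint(spelling: str) -> bool:
    """Return True if the spelling ends in a blocked sharpening allograph."""
    return spelling.lower().endswith(BLOCKED_FINAL)


def assimilate(loan: str) -> str:
    """Double the final consonant letter unless the result is blocked."""
    candidate = loan + loan[-1]
    return loan if violates_final_constraint(candidate) else candidate


for word in ["Mop", "Step", "Stop", "Tip", "Bob", "Job", "Lob", "Mob"]:
    print(word, "->", assimilate(word))
```

Under these assumptions the nouns in (15a) receive the sharpening allograph while those in (15b) keep their original spelling. Words such as Brigg, Lagg and Rigg would be flagged by the check and have to be treated as lexical exceptions.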
The relevance of this constraint is visible in attempts to assimilate foreign words (cf. Neef 2002 for further discussion). Neither the segment-based nor the syllable-based approach is capable of explaining these data without adding a similar constraint to their analyses.
6. Consequence: Orthographic underspecification
Theoretically, it is important that the sharpening constraint makes reference only to information of actual word forms without recourse to some other 'explicit form' or to some abstract underlying form. Nevertheless, in the reading process it is necessary to identify those graphs that refer to the lexical stem in order to understand the words. Through reference to stems, homographs of the following kind can be distinguished: (18)
non-homophonous homographs
a. simple nouns with        b. inflected verbs with
   centralized vowels          peripheral vowels
   Front                       fron-t (3rd ps sg pres)
   Kost                        kos-t (2nd ps sg pres)
   Last                        las-t (2nd ps pl past)
   Pult                        pul-t (3rd ps sg pres)
   Rast                        ras-t (2nd ps sg pres)
   Spurt                       spur-t (3rd ps sg pres)
The verbs in (18b) predictably contain a peripheral vowel because there is only one consonant letter following a simple vowel graph in the stem sequence. The morphologically simple words in (18a) contain a centralized vowel, but this vowel quality is not predictable from the spelling. This is because the sharpening constraint does not make any predictions as to what quality a vowel that is followed by at least two consonant letters has to have. The spellings in (18a) could also correspond to words containing a peripheral vowel. This well-known blurredness of German spellings constitutes a case of orthographic underspecification. This phenomenon is exemplified by the following minimal pairs in (19a). The pairs in (19b) show the same blurredness with polygraphs. (19)
Variable vowel quality for vowel graphs followed by at least two consonant letters
   peripheral    centralized
a. Wert          Wort
   Herde         Horde
   Mond          Mund
   Trost         Frost
   Ostern        Astern
   Schuster      Muster
b. Dusche        Tusche
   Wuchs         Wachs
   nach          noch
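The role of the stem in distinguishing the homographs in (18) can be sketched as a simple count of the consonant letters that follow the vowel graph within the stem. The function name is hypothetical, and the sketch assumes a stem with exactly one simple vowel letter followed only by consonant letters; polygraphs and complex vowel graphs are ignored.

```python
# Sketch: which vowel qualities the sharpening constraint leaves open for a stem.
# Assumption: the stem contains exactly one simple vowel letter, and every
# letter after it is a consonant letter.

VOWEL_LETTERS = set("aeiouäöü")


def possible_qualities(stem: str) -> set:
    """Return the vowel qualities compatible with the stem spelling."""
    lower = stem.lower()
    # Position of the simple vowel graph in the stem.
    pos = next(i for i, ch in enumerate(lower) if ch in VOWEL_LETTERS)
    following = len(lower) - pos - 1  # consonant letters after the vowel
    if following < 2:
        return {"peripheral"}  # a centralized reading is excluded
    return {"peripheral", "centralized"}  # the spelling is underspecified


# Hyphens mark the stem boundary, as in (18b).
for form in ["Front", "fron-t", "Kost", "kos-t", "Spurt", "spur-t"]:
    stem = form.split("-")[0]
    print(form, sorted(possible_qualities(stem)))
```

For the verb forms the stem allows only the peripheral reading, whereas the simple nouns remain open to both readings, which is the underspecification described above.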
This variability is constrained by regularities of German phonology. Within a syllable, a maximum of one consonant may follow a peripheral vowel while two consonants may follow a centralized vowel. In this respect, two consonant graphs at the end of a syllable indicate that the preceding vowel graph corresponds to a centralized vowel. There is, however, one crucial exception to this generalization: coronal obstruents may follow the maximal syllable in any case. Consequently, if there are two consonant graphs following a simple vowel graph with the latter consonant graph corresponding to a coronal obstruent, the respective vowel may either be peripheral or centralized. The monosyllabic examples in (19a) reflect these interrelations in that they all end in a coronal obstruent. Admittedly, the number of such words containing a peripheral vowel is much smaller than the number of words containing a centralized vowel, a fact that leads the official spelling to take only the latter case as being regular and the former not being in need of an explanation (cf. Duden 1996: 863). However, words with a vowel graph for a peripheral vowel followed by two consonant graphs form part of the core vocabulary of German, and their number is too big to be ignored. How many such words exist in German is not easy to determine. In trying to find an answer, I have searched through all entries contained in the reverse dictionary of Muthmann (1988) that end in a graph for coronal obstruents: this is the only constellation that allows a peripheral vowel to be followed by more than one consonant. In this way, I ended up with 56 words; I have listed the common ones in (20a) below. The data in (20b) derives from the phonological dictionary of Muthmann (1996) where I looked up all words beginning with a vowel graph for peripheral vowels followed by at least two consonant graphs.21 Of the 34 relevant cases, I have listed the common ones in (20b) below. 
However, in both groups there are some cases that may also be pronounced with a centralized vowel. For a more complete inquiry, medial positions have to be taken into account (cf. the pairs Schuster vs. Muster and Herde vs. Horde in (19a)). (20)
Words with a vowel graph for peripheral vowels plus at least two consonant graphs
a. Jagd, Magd, Mond, Herd, Pferd; Vogt, Wert, Schwert, Geburt, Obst, prost, Trost, Papst, Probst, erst, Wust, wüst, Arzt; Krebs, Wuchs, Keks, Koks, damals, Krams, Gedöns, Pups, stets; Harz, Quarz; Knatsch, Tratsch, ätsch
b. Adler, atmen, Art, Arzt, April, Aprikose, adrett, Adresse; ebnen, Erde, Erz, erst; Iglu; Obrigkeit, Obst, Ostern
My investigation revealed only two true minimal pairs, i.e. homographs that are not homophonous but have the same morphological structure (minimal pairs with different morphological structure are given in (18)). (21)
non-homophonous homographs
a. Latsche   [la:.tʃə]     'slipper'
   Latsche   [lat.ʃə]      'mountain pine'
b. Hochzeit  [ho:x.tsaɪt]  'high time'
   Hochzeit  [hɔx.tsaɪt]   'marriage'

21 With the letter |a| Muthmann does not distinguish two different vowel qualities, which made the search somewhat laborious.
Reading experiments with made-up words may reveal a preferred pronunciation of spellings with the structure discussed. I would expect that readings with peripheral vowels constitute the marked case and would tend to be reduced. At present, there is dialectal variation in the pronunciation of some of the words in (20), like Jagd and Magd, that may also be pronounced with a centralized vowel, presumably as a reflex of the orthographic underspecification. These cases call for the use of lengthening signs to explicitly mark the peripheral vowel quality. Actually, only a few relevant words are marked accordingly. On the basis of Muthmann (1988), the following list gives all verbs containing a complex vowel graph that is followed by at least two consonant letters: (22)
Lengthening signs preceding at least two consonant letters in verbs (base: Muthmann 1988)
a. ahnden, öhmden, fahnden; dreeschen
b. drieschen, kieksen, knietschen, piepsen, pietschen, quieken, quietschen
Whereas lengthening signs are, for the most part, redundant, the data in (20) constitute an area where they would be functionally sensible. In the present state, German orthography is highly inconsistent in the marking of the vowel quality if the vowel grapheme is followed by more than one consonant letter.
7. Additional remark: Is sharpening stress-sensitive?
Both the segment-based and the syllable-based approach agree that sharpening is restricted to stressed syllables. This is not an indispensable consequence of these approaches: in principle, both would work well without this restriction. The motivations for including this restriction, however, are different for these analyses. In fact, there is a large number of words in which a short vowel is followed by a single consonant without sharpening marked in their corresponding spellings. Predominantly, these words end in a reduced syllable with the vowel schwa as their peak, like Pinsel, pronounced [pɪn.zəl]. If schwa is analyzed as belonging to the class of short vowels, as is the case in the official rules (cf. Duden 1996: 862), then unstressed syllables have to be eliminated from the scope of the sharpening constraint. The segment-based approach in (2) is therefore capable of explaining the non-appearance of sharpening allographs in words like Pinsel but not the appearance of sharpening allographs in words like 'Amboss. Most of the recent studies on the phonology of German, however, assume that reduced vowels have a different status than full vowels. This position is held by Eisenberg (1998), who claims that ambisyllabicity is found only after centralized full vowels, hence not after schwa. Therefore, the fact that sharpening is never marked when the preceding vowel graph corresponds to schwa has already been explained in Eisenberg's syllable-based approach in (3) without recourse to stress. The reason Eisenberg restricts ambisyllabicity (and hence sharpening) to stressed syllables must therefore be different from the motivation for the segment-based approach. On closer inspection it turns out that the empirical base for this restriction is not very strong. This is because in the indigenous vocabulary that is the focus of Eisenberg's analysis there are not many words containing two stressable syllables. Only in this constellation could sharpening in an unstressed full syllable possibly appear. In order to find an empirical base for the evaluation of the stress-basedness of sharpening, it seems necessary to define the borderline between indigenous and foreign words, which is a problematic task. Many analyses regard morphologically simple words that contain more than one full syllable automatically as foreign, which means that there would be no data base at all to assess Eisenberg's hypothesis. Therefore, I have looked through different sources to find words containing more than one full syllable that are valued as indigenous in one source or another. Of the 180 words resulting from this search, the following list contains all those presumably non-foreign words that contain a sharpening position. The respective relevant syllable is underlined. (23)
Sharpening in simple words containing more than one full syllable
a. Sharpening marked in stressed syllable
   Ballast, billig, Bottich, Bücking, Bussard, Eppich, Essig, Fittich, Forelle, Hallig, Hornisse, Kaffee, Kamille, Kartoffel, Karussell, Kassette, Klamotte, Klassik, Koralle, Kuckuck, Lakritz, Lattich, mannig(fach), Marille, Melisse, Mennige, Messing, Narzisse, Pfennig, Rettich, Schabracke, Schaluppe, Scharmützel, Schatulle, Schlamassel, schmarotzen, Sittich, stibitzen, Teppich, Wittib, Zwillich
b. Sharpening marked in unstressed syllable
   Amboss, Elritze, Fassade, Girlitz, Herlitze, hurra, Insasse, Kannibale, Karussell, Kassette, Kiebitz, Kuckuck, Lotterie, Mumpitz, Nachtigall, passieren, Porzellan, Pumpernickel, Schabernack, Stieglitz
c. Sharpening unmarked in stressed syllable
   Ananas, Anorak, April, Hotel, Januar, Kamera, Kapitel, Mama, Mini, Papa, Relief, Witib
d. Sharpening unmarked in unstressed syllable
   Albatros, Ananas, Anorak, billig, Bischof, Bräutigam, Diskus, Eidam, Essig, Globus, Hallig, Honig, hurtig, Iltis, Käfig, Kamerad, König, Kürbis, ledig, Mennige, Pfennig, wenig, Zeisig
For stressed syllables, these data verify a clear predominance for marked sharpening as opposed to unmarked sharpening. For unstressed syllables, however, the picture is less clear. Therefore, independent arguments (which go beyond the existing vocabulary) are called for to decide whether sharpening is relevant for unstressed syllables. Interestingly, Eisenberg implicitly distinguishes final from non-final syllables. Ambisyllabicity is only possible in non-final syllables because a consonant at the end of a word must naturally belong to one syllable only. Bearing in mind stem constancy, final stressed syllables with a centralized vowel should not need the marking of sharpening if there is no related form showing ambisyllabicity. Therefore, Eisenberg's analysis marks the cases in (5), like Kaff, as exceptions whereas those cases in (23 c) in which the final syllable is concerned are regarded as regular. In this connection, the segment-based approach values the conditions the other way round. Regarding the other cases in (23c) and the cases in (23b), the segment-based approach and the syllable-based approach both assess them as irregular. Possibly, the crucial cases are the following ones, which are expectable neither from a segment-based perspective nor from a syllable-based perspective:
(24) Sharpening alternation in unstressed syllables
        singular    plural
     a. Ärztin      Ärztinnen
        Kenntnis    Kenntnisse
     b. Ananas      Ananasse
        Kirmes      Kirmessen
        Iltis       Iltisse
        Albatros    Albatrosse
        Globus      Globusse
As opposed to the endings in the left column in (24b), those in the left column in (24a) are affixes. All cases in (24) violate morpheme constancy. A relevant feature of these examples is that the final syllables in the left column are unstressed and remain unstressed in the derived forms in the right column. But here, sharpening is marked, which should not be the case given their unstressed position. Based on these data, I would like to draw the conclusion that sharpening is independent of stress except for the word-final syllable. In other words, if there is a sharpening position, sharpening always has to be marked, except for word-final syllables. Only in the case of word-final syllables is stress relevant, in that sharpening also has to be marked in stressed word-final syllables regardless of whether there is a related explicit form showing ambisyllabicity. This assumption still shows some exceptions that are partly irregular (e.g. Bus) and partly regular (e.g. Job; cf. section 5). Another means of verifying this assumption would be to test the reading behaviour of competent language users regarding unknown words. From a lexical perspective it is no problem that there are exceptions to a general condition like the sharpening constraint. A word that violates the sharpening constraint can still be read correctly if the whole word image is learned as a lexical exception. While reading this word, the reader has to recognize the word pattern as a whole instead of relying on his regular recoding competence. If the word in question is already known as a lexeme, recoding it is no problem. If the word is unknown, however, its pronunciation has to be derived from its spelling. Therefore, while reading unknown words, readers probably rely on their regular orthographic knowledge; they employ the regular correspondence relations. The given analysis predicts that a spelling like , which is currently not used in German, should have six possible pronunciations.
If the first syllable is regarded as stressed, both the first vowel and the second vowel may be either centralized or peripheral. If the second syllable is regarded as stressed, however, these two possibilities only apply for the vowel of the first syllable whereas the vowel of the second syllable can only be read as a peripheral one. These considerations lead to a modified version of the sharpening constraint that incorporates the assumptions concerning stress: (25)
Restriction of the correspondences for centralized vowels ('sharpening constraint') If in a word less than two consonant letters follow adjacently a simple vowel graph, this vowel graph must not correspond to a centralized vowel, unless this vowel is in an unstressed word-final syllable.
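The count of six readings can be reproduced with a short enumeration. The following is only a sketch: the function names are hypothetical, and the letter pattern is an assumption chosen to match the stress cases just described, namely a disyllabic form whose first vowel graph is followed by two consonant letters and whose second vowel graph is followed by one.

```python
from itertools import product


def allowed(n_cons_after: int, stressed: bool, word_final: bool) -> tuple:
    """Vowel qualities the constraint in (25) permits for one simple vowel graph."""
    if n_cons_after >= 2:
        return ("centralized", "peripheral")  # no prediction either way
    if word_final and not stressed:
        return ("centralized", "peripheral")  # exemption clause of (25)
    return ("peripheral",)  # centralized reading is excluded


def readings(cons_after: list) -> list:
    """Enumerate (stressed syllable index, vowel qualities) for a word.

    cons_after[i] is the number of consonant letters that directly follow
    the vowel graph of syllable i.
    """
    last = len(cons_after) - 1
    result = []
    for stress in range(len(cons_after)):
        options = [allowed(n, i == stress, i == last)
                   for i, n in enumerate(cons_after)]
        result.extend((stress, combo) for combo in product(*options))
    return result


# Assumed pattern: two consonant letters after the first vowel graph,
# one consonant letter after the second.
forms = readings([2, 1])
print(len(forms))
for stress, qualities in forms:
    print("stress on syllable", stress + 1, qualities)
```

With stress on the first syllable both vowels are free (four readings); with stress on the second syllable only the first vowel is free (two readings), which gives six in total under the assumed pattern.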
8. Conclusion: Recoding and stem constancy
The sharpening constraint assigns sharpening to the function of making explicit the correspondences to a vowel phone that cannot be distinguished sufficiently by the letters of the Latin alphabet. Thus, sharpening has a function for the reader. In general, I construe orthography with respect to the reader, with the Recoding Principle in (1) being the core of the approach. Given this perspective, the principle of stem constancy can be stated more precisely. According to Maas (2000: 330), stems have to be written in a maximally constant form. However, the exact meaning of the word 'maximally' remains unclear. Given a reader perspective, stems are written in a constant form as long as their spelling does not hinder the recoding of a word. In a verb like sammeln, the double-consonant graph is motivated by sharpening: there have to be at least two consonant letters between two vowel letters if the first vowel letter is a simple vowel graph that is meant to correspond to a centralized vowel. In the derived noun Sammlung, stem constancy is violated at one point because the vowel graph of the basic verb stem sammel- is missing. This vowel graph must not be written in Sammlung because any vowel letter that is surrounded by consonant letters has to correspond to a syllable peak. Hence, the spelling ⟨Sammelung⟩, which follows the principle of stem constancy, corresponds to a trisyllabic word only. The noun Sammlung, however, can only be pronounced bisyllabically, which is reflected in the spelling under violation of stem constancy. The double-consonant graph ⟨mm⟩, on the other hand, is preserved in accordance with stem constancy because it does not disturb the recoding of the word. Of course, the double-consonant graph is not motivated by the actual word form Sammlung because this word could also be spelled ⟨Samlung⟩ without prohibiting the correspondence of the simple vowel graph with a centralized vowel. Writing systems aim at being consistent not for the writer but for the reader.
This is expressed by Maas's fundamental maxim for determining the function of orthographies: "Write like you want to be read!"22 (Maas 2000: 44).23
References
Augst, Gerhard (1985): Regeln zur deutschen Rechtschreibung vom 1. Januar 2001. Entwurf einer neuen Verordnung zur Bereinigung der Laut-Buchstabenbeziehung. Frankfurt: Lang.
- (1991): Alternative Regeln zur graphischen Kennzeichnung des kurzen Vokals im Deutschen - ein historischer Vergleich. In: Gerhard Augst et al. (eds.): Festschrift für Heinz Engels zum 65. Geburtstag. Göppingen, 320-344.
Augst, Gerhard, Kurt Blüml, Dieter Nerius & Horst Sitta (eds.) (1997): Zur Neuregelung der deutschen Orthographie: Begründung und Kritik. Tübingen: Niemeyer.
Augst, Gerhard & Eberhard Stock (1997): Laut-Buchstaben-Zuordnung. In: Augst et al. (eds.) (1997): 113-134.
22 „Schreib, wie du gelesen werden willst!"
23 I wish to thank Utz Maas, Anneke Neijt, Beatrice Primus, Derk Quiggle, Richard Sproat, and Richard Wiese for help and advice.
Becker, Thomas (1998): Das Vokalsystem der deutschen Standardsprache. Frankfurt: Lang.
Bierwisch, Manfred (1972): Schriftstruktur und Phonologie. Probleme und Ergebnisse der Psychologie 43: 21-44 (also in: Ferenc Kiefer (ed.) (1975): Phonologie und Generative Grammatik. Frankfurt: Athenaion, 11-51).
Borgwaldt, Susanne R. & Annette M.B. de Groot (this volume): Beyond the Rime: Measuring the Consistency of Monosyllabic and Polysyllabic Words.
Bramann, Klaus-Wilhelm (1987): Der Weg zur heutigen Rechtschreibnorm. Frankfurt a.o.: Lang (= Theorie und Vermittlung der Sprache 6).
Chomsky, Noam (1981): Lectures on Government and Binding. Dordrecht: Foris.
Duden (1996): Rechtschreibung der deutschen Sprache. 21st edition. Mannheim/ Leipzig/ Wien/ Zürich: Dudenverlag (= Duden 1).
- (1998): Grammatik der deutschen Gegenwartssprache. 6th edition. Mannheim/ Leipzig/ Wien/ Zürich: Dudenverlag (= Duden 4).
- (2000): Die deutsche Rechtschreibung. Das Standardwerk zu allen Fragen der Rechtschreibung auf CD-ROM. Version 2.1. Mannheim: Bibliographisches Institut & Brockhaus.
Eisenberg, Peter (1997): Die besondere Kennzeichnung der kurzen Vokale - Vergleich und Bewertung der Neuregelung. In: Augst et al. (eds.) (1997): 323-335.
- (1998): Grundriss der deutschen Grammatik. Das Wort. Stuttgart, Weimar: Metzler.
Heuven, Vincent van & Louis Pols (eds.) (1993): Analysis and Synthesis of Speech. Strategic Research towards High-Quality Text-to-Speech Generation. Berlin: Mouton de Gruyter.
Lenerz, Jürgen (2000): Zur sogenannten Vokalopposition im Deutschen. Zeitschrift für Sprachwissenschaft 19: 167-209.
Lindauer, Thomas (this volume): How Syllable Structure affects Spelling: A Case Study in Swiss German Syllabification.
Maas, Utz (1992): Grundzüge der deutschen Orthographie. Tübingen: Niemeyer (RGL 120).
- (2000): Orthographie. Materialien zu einem erklärenden Handbuch der Rechtschreibung des Deutschen. Ms. University of Osnabrück.
Moulton, William G. (1962): The Sounds of English and German. Chicago: University of Chicago Press.
Muthmann, Gustav (1988): Rückläufiges deutsches Wörterbuch. Tübingen: Niemeyer (= RGL 78).
- (1996): Phonologisches Wörterbuch der deutschen Sprache. Tübingen: Niemeyer (= RGL 163).
- (1999): Reverse English Dictionary. Berlin, New York: Mouton de Gruyter (= Topics in English Linguistics 29).
Neef, Martin (1996): Wortdesign. Eine deklarative Analyse der deutschen Verbflexion. Tübingen: Stauffenburg (= Studien zur deutschen Grammatik 52).
- (1997): Die Alternationsbedingung: Eine deklarative Neubetrachtung. In: Christa Dürscheid, Monika Schwarz & Karl Heinz Ramers (eds.): Sprache im Fokus. Festschrift für Heinz Vater. Tübingen: Niemeyer, 17-31.
- (1998): The Reduced Syllable Plural in German. In: Ray Fabri, Albert Ortmann & Teresa Parodi (eds.): Models of Inflection. Tübingen: Niemeyer (= Linguistische Arbeiten 388), 244-265.
- (2000): Die Distribution des [h] im Deutschen: Schriftaussprache und Phonologie. Convivium 2000: 271-286.
- (2002): Graphematische Beschränkungen und Fremdwortintegration im Deutschen. To appear in: Estudios Filológicos Alemanes.
Neef, Martin & Moritz Neugebauer (2002): Beschränkte Korrespondenz: Zur Alternation von Schwa und silbischen Sonoranten im Deutschen. To appear in: Zeitschrift für Sprachwissenschaft.
Noack, Christina (this volume): Regularities in German Orthography: A Computer-Based Comparison of Different Approaches to Sharpening.
Nunn, Anneke (1998): Dutch Orthography. A Systematic Investigation of the Spelling of Dutch Words. The Hague: Holland Academic Graphics.
Ossner, Jakob (2001): Das ⟨h⟩-Graphem im Deutschen. Linguistische Berichte 187: 325-351.
Prinz, Michael & Richard Wiese (1990): Ein nicht-lineares Modell der Graphem-Phonem-Korrespondenz. Folia Linguistica 24: 73-103.
Rahnenführer, Ilse (1980): Zu den Prinzipien der Schreibung des Deutschen. In: Dieter Nerius & Jürgen Scharnhorst (eds.): Theoretische Probleme der deutschen Orthographie. Berlin: Akademie-Verlag (= Sprache und Gesellschaft 16), 231-259.
Ramers, Karl Heinz (1999): Vokalquantität als orthografisches Problem: Zur Funktion der Doppelkonsonanzschreibung im Deutschen. Linguistische Berichte 177: 52-64.
Sproat, Richard (2000): A Computational Theory of Writing Systems. Cambridge: Cambridge University Press (= Studies in Natural Language Processing).
- (this volume): The Consistency of the Orthographically Relevant Level in Dutch.
Venezky, Richard (1970): The Structure of English Orthography. The Hague: Mouton (= Janua Linguarum 82).
- (1999): The American Way of Spelling. New York, London: Guilford.
Wegener, Heide (1995): Die Nominalflexion des Deutschen - verstanden als Lerngegenstand. Tübingen: Niemeyer.
Wiese, Richard (2000): The Phonology of German. 2nd edition. Oxford: Clarendon.
Thomas Lindauer
How Syllable Structure affects Spelling: A Case Study in Swiss German Syllabification
1. Introduction
This article is based on two key questions: (1)
Question 1: How and to what extent do writers make use of their implicit phonological knowledge (knowledge in the sense of Knowledge of Language)?
Question 2: In what way would the answers to question 1 influence the formulation of explicit spelling rules?
In order to answer these questions, I will take the double consonant spelling1 in German as an example. German has both long and short vowels.2 Whereas the graphic representation of short vowels is relatively regular, that of long vowels - apart from long [i:]3 - is less straightforward: There are two possibilities, namely the insertion of the so-called 'Dehnungs-h' ('lengthening h') or the doubling of the vowel letter (with a special variant for long [i:] that normally takes ⟨ie⟩ as its written form; see e.g. Ossner 1996, Augst & Stock 1997). In most cases, however, vowel length is left unmarked, but it can be indirectly deduced from the fact that only a single consonant follows. This indirect way of marking length is possible because short vowels are marked by doubling the following consonant letter if the stem ends in a single consonant (for details see Neef, this volume). Since vowels followed by two or more consonants are usually short, they lack an additional graphic marking.
1 Another common term for this phenomenon I will subsequently use is 'written geminate'. This term is, however, exclusively limited to orthographical terminology and does not denote actual consonant length. On the terms used in the literature, see Neef (this volume).
2 At least phonetically. The question of whether to analyze German vowels principally in terms of quantity or quality is a vexed one and there is no agreed approach.
3 In order to avoid confusion, I will employ standard symbols to distinguish phonemic, phonetic, and orthographic transcriptions: Slashes for phonemic transcriptions, square brackets for phonetic, and angle brackets for orthographic transcriptions.
(2) short vowel: Stall, toll, Fell 'stable', 'great', 'fur'; Hals, fest, Bart 'neck', 'firm', 'beard'
    long vowel: Stahl, Tal, Saal 'steel', 'valley', 'hall'
    [Syllable-structure diagrams (onset, nucleus, coda) for the examples above]
The adequate wording to regularize the use of double consonant letters has been the object of a recent debate. In both the theoretical and the didactic discussion, it has proved difficult to state a comprehensive spelling rule. There are basically two conflicting accounts: On the one hand, we find a stress-based position as is, for instance, advocated by the official rules for correct spelling (Amtliche Regelung der deutschen Rechtschreibung, reprinted in Duden 1996: 861-910): (3)
If a stressed short vowel is followed by a single consonant in a word stem, its shortness is marked by doubling the consonant letter. (Duden 1996: 863)4
On the other hand, Eisenberg (1997), amongst others, formulates a syllable-based rule:5 (4)
Consonant letters are doubled if the corresponding consonant is ambisyllabic. (Eisenberg 1997: 332)6
In spite of the differences, both formulations cover the central aspects of written geminates relatively well. In both the central and peripheral fields of German spelling, the rules are nearly identical in their effects and approximately equal in number of exceptions (see Ramers 1999 for discussion). In the following, I will present two phenomena which may shed some additional light on the topic. They both concern the syllabification in Swiss German and the problems that arise when Swiss writers apply the rules of German Standard Spelling quoted above. (See section 3 for a further illustration of the linguistic situation in Switzerland.) Since in spoken language, dialect pronunciation influences the standard pronunciation, there is not one uniform standard pronunciation but a (finite) variety of dialectally influenced standard pronunciations. In the remainder of the current study, I will use the term 'standard variety' to capture this fact. I will argue that spelling rules should be based on the respective varieties of Standard German. How many of these orthographically relevant varieties there are and to what extent they differ is subject to further research. However, if there are relevant differences between the varieties actually spoken, it is supposedly more logical to take these varieties into consideration. Spelling rules which are essentially based on an idealized and contrived pronunciation7 seem rather pointless, especially when phoneme-grapheme correspondences are concerned. On the other hand, it might then prove difficult to state uniform spelling rules applicable to the entire German speech community.

4 „Folgt im Wortstamm auf einen betonten kurzen Vokal nur ein einzelner Konsonant, so kennzeichnet man die Kürze des Vokals durch Verdopplung des Konsonantenbuchstabens."
5 Maas (1997) suggests a somewhat different but also syllable-based version. Since this is of no further interest to the issue at hand, I will not discuss it in this study. A comparison between different approaches to double consonant spelling is given in Noack (this volume).
6 „Ist ein Konsonant ein Silbengelenk, so wird er durch Verdoppelung des Buchstabens für den Konsonanten dargestellt."
2. The phenomena
The first phenomenon under consideration is the spelling of the s-sound in Switzerland as opposed to Germany and Austria. In the latter countries, [s] after long vowels is spelt with a single graphic element, the so-called 'Eszett' ⟨ß⟩: [gry:sə] → ⟨Grüße⟩ 'greetings', [gro:s] → ⟨groß⟩ 'big', and thus in accordance with the rule for written geminates. Swiss Standard Spelling (which normally does not deviate from the orthography used in Germany and Austria), however, lacks the 'Eszett'. Instead, ⟨ss⟩ has been used for decades, even after long vowels and diphthongs. Due to this convention, Swiss pupils learning Standard German might be expected to pronounce long vowels preceding ⟨ss⟩ as short. Surprisingly and luckily, such incorrect readings have not been reported in any grade (the transcription for Swiss Standard Pronunciation is given in square brackets):8 (5)
a. Swiss Standard Spelling: ⟨gross⟩ [gro:s], ⟨Grüsse⟩ [gry:sə]
   Standard Spelling: ⟨groß⟩, ⟨Grüße⟩ 'big', 'greetings'
b. Swiss Standard Spelling: ⟨Flösse⟩ [flø:sə], ⟨flossen⟩ [flɔsən]
   Standard Spelling: ⟨Flöße⟩, ⟨flossen⟩ 'rafts', 'floated'

7 The Atlas zur Aussprache des Schriftdeutschen in der BRD (Atlas on the pronunciation of Standard German in the FRG) has been designed by its author on the assumption that the German actually spoken does not correspond to what has been considered standard by dictionaries on pronunciation. This holds regardless of the speaker's education (cf. König 1989).
8 The distribution of long vowels/diphthongs vs. short vowels followed by [s] is more or less identical in Swiss Standard German and in Standard German. The occurrence of ⟨ss⟩ after long vowels and diphthongs is mainly restricted to the core vocabulary and includes around 40 root words.
The second phenomenon is a misspelling that occurs systematically: In texts written by Swiss pupils - and adults - , we often find written geminates after long vowels and diphthongs. Such spelling errors are particularly frequent in connection with [f], and to a somewhat smaller extent also with [p] and [t]. The doubling of the consonant letter is, as pointed out before, no violation in Standard Swiss Spelling.9 Accordingly, misspellings such as in (6a) and (6b) occur strikingly often; those in (6c) are a little less frequent:10 (6)
a.
b.
*
'to snatch' * < Strafe) 'punishment' * < Häute) 'skins'
* < pfeifen) 'to whistle' *