Annual Review of South Asian Languages and Linguistics: 2009 9783110225600, 9783110225594

South Asia is home to a large number of languages and dialects. Although linguists working on this region have made sign


English Pages 257 Year 2009


Table of contents :
Frontmatter
Contents
General Contributions
Strategies and their Shadows
A Taxonomy of EAT Expressions in Marathi
Morpheme-specific Exceptional Processes and Emergent Unmarkedness in Vowel Harmony
Special Contribution: Indian Sign Language (ISL)
Typology of Indian Sign Language Verbs from a Comparative Perspective
Regional Reports
The Middle East:
Japan:
Nepal:
Reviews
Dialogue
Backmatter

Annual Review of South Asian Languages and Linguistics 2009



Trends in Linguistics Studies and Monographs 222

Editors

Walter Bisang Hans Henrich Hock (main editor for this volume)

Werner Winter

Mouton de Gruyter Berlin · New York

Annual Review of South Asian Languages and Linguistics 2009

edited by

Rajendra Singh

Mouton de Gruyter Berlin · New York

Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.

Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.

ISBN 978-3-11-022559-4
ISSN 1861-4302

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.d-nb.de.

© Copyright 2009 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin
All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording or any information storage and retrieval system, without permission in writing from the publisher.
Cover design: Christopher Schneider, Laufen.
Printed in Germany.

Contents

Editorial Preface

General Contributions
  Strategies and Their Shadows
    Probal Dasgupta
  A Taxonomy of EAT Expressions in Marathi
    Peter Hook and Prashant Pardeshi
  Morpheme-specific Exceptional Processes and Emergent Unmarkedness in Vowel Harmony
    Shakuntala Mahanta

Special Contribution: Indian Sign Language (ISL)
  Typology of Indian Sign Language Verbs from a Comparative Perspective
    Michael W. Morgan

Regional Reports
  The Middle East: Sinhala in Contact with Arabic: The Birth of a New Pidgin in the Middle East
    Fida Bizri
  Japan: Research on South Asian Languages in Japan: 2000–2008
    Kazuyuki Kiryu and Prashant Pardeshi
  Nepal: A Morphosyntactic Categorisation Scheme for the Automated Analysis of Nepali
    Andrew Hardie et al

Reviews
  Shishir Bhattacharja – Word Formation in Bengali: A Whole Word Morphological Description and its Theoretical Implications
    by Niladri Sekhar Dash
  Yamuna Kachru and Larry Smith – Cultures, Contexts, and World Englishes
    by Graeme Cane
  Pingali Sailaja – Indian English
    by Claudia Lange
  Tove Skutnabb-Kangas – Linguistic Genocide in Education
    by Otto Ikome
  K.G. Vijaykrishnan – The Grammar of Carnatic Music
    by Nirmalangshu Mukherjee

Dialogue
  Aspects of Assamese Morphonology Revisited: Reflections on Mahanta
    by Luc Baronian
  Nativeness, Deviance and Ownership: A Response to Singh
    by Pingali Sailaja

Appendices
  Announcements: The Gyandeep Prize / Housekeeping
  Notes on Contributors

Editorial Preface

Annual Review of South Asian Languages and Linguistics (ARSALL) is devoted to bringing out what is currently being explored in South Asian linguistics and in the study of South Asian languages in general. South Asia is home to a wide variety of languages, structurally and typologically quite diverse, and has often served as a catalyst and testing ground for theories of various kinds. Although linguists working on South Asia have made significant contributions to our understanding of language, society, and language in society, and their numbers have grown considerably in the recent past, until recently there was no internationally recognized forum for the exchange of ideas amongst them or for the articulation of new ideas and approaches grounded in the study of South Asian languages. The Yearbook of South Asian Languages and Linguistics, of which this annual is a direct descendant, played that role during the last decade, but in 2007 we decided to go a bit further and incorporate a slightly modified form of such a forum into Trends in Linguistics. This is the third issue of ARSALL as part of the series Trends in Linguistics: Studies and Monographs.

Each volume of this annual has five major sections:

i. General Contributions, consisting of selected open submissions that focus on important themes and provide various viewpoints.
ii. Special Contributions, consisting of single or inter-related or easily relatable, invited contributions on important issues, ranging from the narrowly grammatical to the wide-scope socio-linguistic/socio-political. This section will in effect constitute a mini-symposium, albeit in the written form, on the issue chosen for a given year. It will serve the function of familiarizing the reader with current thinking on issues seen as salient in the study of South Asian languages.
iii. Regional Reports, consisting of reports from around the world on research on South Asian languages.
iv. Reviews and Abstracts, consisting of reviews of important books and monographs and abstracts of doctoral theses.
v. Dialogue, consisting of a forum for the discussion of earlier work, preferably previously published in this annual, comments, reports on research activities, and conference announcements.

Other than excellence and non-isolationism, ARSALL has no theoretical agenda and no thematic priorities.

The first, general section of this, the third, issue of ARSALL contains three contributions: Dasgupta's Strategies and their Shadows, Hook and Pardeshi's A Taxonomy of EAT Expressions in Marathi, and Mahanta's Morpheme-specific Exceptional Processes and Emergent Unmarkedness in Vowel Harmony. The Special Contributions section is dedicated to Indian Sign Language (ISL), and contains an important paper by Michael W. Morgan on ISL verbs from a comparative perspective. We are fortunate this year to have some rather exceptional regional reports from the Middle East, Japan, and Nepal, areas normally underrepresented in publications devoted to research on South Asian languages.

The Review section of this issue contains reviews of Y. Kachru and Smith's recent book on World Englishes, Sailaja's monograph on Indian English, Skutnabb-Kangas's thought-provoking book on linguistic genocide, and Vijaykrishnan's ground-breaking formal analysis of Carnatic music. These have been written by Cane (Karachi), Lange (Dresden), Ikome (Montreal) and Mukherjee (Delhi), respectively. I am particularly happy to note that the Dialogue section of this issue contains responses to two pieces published in ARSALL itself. These contributions establish that ARSALL has in fact become the forum I wanted it to be.

I am grateful to Prof. Hans Henrich Hock, Dr. Ursula Kleinhenz, Mr. Wolfgang Konwitschny, and Mr. Mooli Ikome for help far beyond the call of duty in the preparation of this issue.

Rajendra Singh

General Contributions

Strategies and their Shadows

Probal Dasgupta

Sentences are assembled. Words are retrieved (and occasionally coined). The structuralist exaggeration that retrieval goes all the way up appears in a formalist generative inversion as the thought that assembly goes all the way down, halting at the ‘arbitrary’ items. But an ‘intersecting economies’ approach enables the alternative programme, substantivism, which characterizes grammatical representations in terms of distinct lexical and syntactic economies, to attain descriptive adequacy vis-à-vis irregularity and anti-irregularity, and to show that Word Formation Strategies, constitutively non-iterative, cast a ‘shadow’ that points up the non-assembled character of words.

1. Introduction

Section 1 presents the notion of intersecting economies associated with arbitrariness and with transparency. Section 2 shows how these tools are used in substantivist inquiry, highlighting the role of multiple validation. Multiple validation issues connect section 3, on strategies and their shadows, with section 4, on the proper treatment of certain constructions that elude the regular syntax.

When we take up the formal study of language, we have to choose between two different conceptualizations of the task of characterizing linguistic phenomena in terms of several modules of the grammar. The formalist conceptualization of this task in formal linguistics favours the unique allocation of particular characterization tasks to one module or to another. For example, if one proposes to derive the English comparative adjective taller from a syntactic representation isomorphic to the phrasal comparative more complex, one is expressing a formalist desire to make ‘the’ statement about English comparatives only once in the grammar.

The substantivist conceptualization of the task of formal characterization of linguistic phenomena (Dasgupta 1977, 1989, 1993, 2005, 2006, Dasgupta, Ford and Singh 2000) permits and often encourages co-characterization or multiple characterization. Substantivism works with the intersecting economies of distinct modules to find, by triangulation, what principles of economy are actually operative in particular phenomena one is
describing. Unsurprisingly, the first explicit articulation of the substantivist programme (Dasgupta 1989) emerged in a translation studies context. Substantivist inquiry mediates, translates, between formal apparatuses, between modules, between languages, to co-characterize particular phenomena in keeping with the principles of economy governing them. For English comparatives substantivist principles favour a morphological description of taller and a syntactic description of more complex, with a shared grammatical feature [+comparative] connecting the two.

In all frameworks, linguists agree that descriptions have to implement a preference for specific forms over general forms, for instance by arranging for the more specific form taller to preempt or ‘block’ the more general form more tall (if we set aside cases like more tall than wide). There are certain unresolved questions, however, about what does or doesn’t get preempted, and why. For instance, consider (1) and (2):

(1) What say you?
(2) What do you say?

If we had an extensive irregular pattern on our hands – but the starred status of (3) and (4) shows that we do not:

(3) *What think you?
(4) *What want you?

– we might have wanted to subsume (1) and (2) under an account based on two competing subsystems. But (1) is isolated; so is (5); and their presence in English does not block the regular questions (2) or (6):

(5) What have we here?
(6) What do we have here?

How does the intersecting economies idea manage the coexistence of isolated irregular forms with the rest of the system? The syntax proper, however a particular framework may choose to run its version of the Do-Support routine, licenses (2) and (6) as the routine forms. Elsewhere, some idiosyncratic expressions built around the verb to say – such as Says who?, What say you?, plus another expression or two – live in the suburbs of the syntax. They do not have the grossly irregular look of rote-learnt opaque expressions like Easy come, easy go or every
which way. Our say-expressions are easy to parse. At the same time, the syntax does not license formulaic expressions. It only issues a special suburban pass for them, so to speak. Moving from this informal metaphor of issuing passes to serious formalization involves breaking this theme down into operational questions.

The licensing of semi-regular or irregular entities is a problem that the syntax and the morphology both have to deal with. Traditionally, inquiry in the domain of how irregular forms preempt regular forms or coexist with them has focused on the morphology – to which we now turn.

Our discussion of morphology assumes some familiarity with the approach known as Whole Word Morphology, WWM (see the standard expositions in Ford, Singh & Martohardjono 1997 and Singh 2006, and a survey of specific elaborations in Singh & Starosta 2003). We return shortly to the task of contextualizing its basic assumptions within the substantivist approach; see our exposition from (10) onwards.

Under WWM assumptions, it is the Word Formation Strategies (WFSs) of English morphology that license regular plurals: foxes, girls, fans, persons, brothers. The lexicon extrastrategically authorizes irregular plurals like oxen, children, men, people, brethren. Our metaphoric statement that the morphology issues suburban passes to these translates into a proposal that the morphology should bring these items under a rigorous regime of secondary licensing by applying a wild card type Word Formation Strategy. Accordingly, we hereby propose the following addition to the WWM toolkit:

(7) Secondary Licensing (morphology)
    [X]α ↔ [Xw]β, where: X is a word, Xw is a word that wildly differs from X formally, and α and β are feature bundles that figure in regular WFSs in the language (see (10) below)

The empirical content of our proposal is that, since English has no regular optatives or desideratives or duals, it cannot have irregular optatives or desideratives or duals.¹

In the syntax, the equivalent move is for a lexico-phrasally authorized formation of clausal rank to receive Secondary Licensing in the wild card mode, as in (8). What this amounts to is summarized at (9):

(8) Secondary Licensing (syntax)
    Assign wild card features, drawing from UG features for illocutions
(9) a. Secondary Licensing in the syntax will target root sentences
    b. Formations so licensed will not block regular competitors

Why does (9) follow? Because, (a), the clause type information of an ordinary, non-‘root’-type embedded clause is so completely specified by matrix forces that wild card respecification gets preempted. And (b), Secondary Licensing routes the irregular sentence straight from lexico-phraseological storage into sentential use. Bypassing the syntax machinery means that that system continues to generate its routine products without disturbance or upstaging.

How does syntactic assembly cooperate with lexical storage to give such items a semi-transparent look? Careful answers will be framework-specific. On ‘left periphery’ assumptions, one may underspecify nodes in the C[omplementizer] region of the clausal architecture, assigning an interrogative feature to one of them, perhaps to Force⁰. UG principles oblige some interrogative in the body of the clause (the what in What say you?, for example) to interact featurally with the C system, but the auxiliary-fed special effects characteristic of English will go missing because of underspecification in the C system. Translating this account into other frameworks is a straightforward matter. Issues of transparency and opacity reappear on our screen in section 2.

We can now situate the WWM framework of morphology (Ford, Singh and Martohardjono 1997) in the overall substantivist research programme that frames the present study. One fundamental manoeuvre that distinguishes substantivist from formalist inquiry is the substance-focused use of multiple characterization. Here the term substance refers – along the lines of Chomsky & Halle (1968: chapter 9) – to factors that determine patterns of marked vs unmarked, natural vs unnatural, easy vs difficult, basic vs non-basic, basilectal vs acrolectal, and even spoken vs written.

The leading question in the substantivist study of language connects substance with multiple characterization as follows. Lexical storage maximizes ease of retention and access; syntactic processes maximize ease of assembly; there are also other maximizations at work. How do linguistic representations, in explicit compliance with imperatives emanating from several modules, manage these intersecting economies in such a way that a base language – easy for the
child to acquire – can exist? What extensions associate this base with the full richness of human language? The point about explicit compliance obliges each linguistic string to wear multiple representations accountable to multiple sources of validation. All doctrines agree that every string must dress up for sound and for meaning and be validated in both of those dimensions. Substantivism says, in addition, that a string must receive validation from Mod1, Mod2, …, Modn – the morphology, the syntax, and any other relevant modules. This connects with the markedness legacy at the tenet that managing a viable base involves, for every i, showing the string’s full Modi-face to the principles of the module Modi. For example, even if the syntax guarantees that there is a determiner with every noun, nonetheless semantic representations must fully specify every determiner instead of leaving blanks to be filled in by the syntax. Formalist grammarians, who despite a declaration or two have not taken markedness on board, have consistently found it convenient to apply premarkedness ‘formal economy’ criteria that seek mechanical generalities and accordingly postulate ‘roots’ and ‘affixes’ in their morphology. They also continue the structuralist habit of subsuming these under a superordinate category of ‘morphemes’, which embodied the claim that ‘roots’ and ‘affixes’ have serious properties in common. Even ‘distributed morphology’ solutions of formalist devising, while their precompilations sometimes resemble the substantivist’s integral words at a superficial level, invoke ‘vocabulary items’ and keep the spirit of morphemics alive. Thus, all versions of formalist linguistics leave open the possibility that languages may exist which suspend or reverse the natural asymmetry between contentdenoting lexical materials and function-signalling modifications of these. What makes this point about asymmetry empirical? 
To see what is at stake, imagine a world in which some languages reverse the asymmetry between the formalist’s ‘roots’ and ‘affixes’, i.e. between lexical material and its function-signalling modifications. The exercise may run as follows. In the real world, a believer in ‘morphemes’ finds that in English the ‘past tense affix’ is manifested in sighed as the segment /d/ but in blew, threw as a ‘replacive’ – as a substitution of /u:/ for /ou/. Suppose we turn this around; and note that the exercise remains materially unaltered if we move from morphemics to precompilation accounts of the ‘distributed’ kind. Imagine a world, then, in which what we shall call the Spenglish past tense ‘affix’ has the shape /éd/ nearly everywhere and carries primary stress. This involves imagining that Spenglish words are mostly spellED as in our English but spelling-pronouncED with primary stress on /éd/. Now

8

Probal Dasgupta

we get to the hard part of this exercise: let there be a quirky Spenglish verb /tu óug/ ‘to eat’, and let this verbal ‘root’ morpheme be manifested as a replacive when it interacts with the ‘affix morphemes’ /éd/ and /íŋ/. Thus we get /eid/ for ‘ate’, where the ‘root morpheme’ appears as a ‘replacive allomorph’ that substitutes /ei/ for /e/. Likewise, in order to express ‘eating’, a Spenglish speaker would say /i:ŋ/, where the ‘root morpheme’ again shows up as a ‘replacive’, as the substitution of /i:/ for /i/. To flesh the picture out, we add another Spenglish verb /tu éi/. It means ‘to drink’ and interacts opaquely with the ‘affixes’ to yield /eəd/ ‘drank’ and /iəŋ/ ‘drinking’. Thus, an /e/ → /eə/ ‘replacive allomorph’ manifests this ‘root morpheme’ in the past form; there is an /i/ → /iə/ manifestation in the gerund form.

The reader knows, of course, that languages in fact never give lexical material so little space and function-signalling modifications so much. But surely some languages could have done so, if ‘roots’ and ‘affixes’ had indeed been equal, and thus validly subsumable under a superordinate ‘morpheme’ notion. How are we to respond to the emphatic absence, in the real world, of such interchangeability between ‘roots’ and ‘affixes’?

Substantivist inquiry’s core commitment keeps it focused on optimality or ‘markedness’ factors that determine fundamental asymmetries in linguistic phenomena. This focus makes it important to formulate the basic concepts of morphology in a way that makes “Can ‘affixes’ in language B behave the way ‘roots’ do in language A?” an unformulable question – for instance, by prohibiting reference to ‘roots’, ‘affixes’, ‘morphemes’, or euphemisms like ‘vocabulary items’ (if by this one means anything smaller than words). One framework with this property, WWM, is built around essentially the following conceptualization of what constitutes a Word Formation Strategy:

(10)
a. [X]α ↔ [X′]β
b. Schema (a) states that at least two pairs (X1, X′1; X2, X′2) of words in the mental lexicon of speaker S anchor a correspondence that has the properties specified here;
c. /X/ and /X′/ are words, the prime and the arrow indicate a bidirectional X-X′ mapping, and the form of each side as well as the mapping is specified with appropriate maximization of specificity and generality;
d. in particular, the representations of /X/ and of /X′/ specify only those phonic features that automatic phonology cannot predict;
e. α and β are bundles of grammatical features;
f. formal correspondences as in (a) are associated with interpretation mappings.

Scholars interested in developing some other morphological framework so that it meets the ‘root’-‘affix’ asymmetry challenge at least as seriously as (10) does will no doubt formulate such proposals. WWM assumptions predict the impossibility of a Spenglish by making the ‘generalizations’ governing /oug, eid, i:ŋ, ei, eəd, iəŋ/ unformulable. In the absence of other viable proposals, it is Word Formation Strategies in the sense of (10) that substantivist inquiry will stay focused on.

The WWM literature standardly adds to (10a-f) the further assertion that, if X = X′, then α ≠ β. Formulation (10) omits that line and makes lexical correspondence a reflexive relation. Preventing the vacuous or trivial use of mechanisms is a topic to which we return in section 3. Right now our task is to begin to situate WWM in the substantivist programme.

Substantivism’s debate with formalism turns on the notions of language as code and language as discourse. Even radical forms of formalist linguistics – with multiple spell-out from anarchically plural work-spaces, warps, and other apparently open processes – seek closure, at the level of the finally assembled output of such processes, in the structuralist notion of a composite sign consisting of structured simple signs. For a formalist even a multiclausal sentence is a huge composite sign; only above the sentence level does discourse begin. It is in this sense that the formalist research programme in linguistics is committed to viewing language as a code, an array of signs, under the assumption of signifier-signified colligation at the sign, and under the generative extension of this view to a non-trivially infinite array of sentence-length composite signs. Lexicalist/representationalist alternatives to the transformational/derivational mainstream – including such work on lexical integrity as Aronoff 1976, 1993 – cleave to the language-as-code assumption even when they question the morpheme, or word-internal derivations, or other proposals that have been made within particular implementations but do not drive formalism’s core agenda.

Substantivism keys language at all levels into discourse while continuing to use formal mechanisms. In substantivist inquiry, discourse is the domain of encounter between potential speaker A and potential speaker B, of contact between possible speaking P and possible speaking Q. This makes even word to word relations discursive, for an utterance can be as short as a word. WWM per se offers a formal account of word relatedness patterns. Substantivism chooses the WWM account in part for reasons of its
own (such as the argument from Spenglish), and uses this account to conceptualize the continuous access that a speaker using word P has to paradigmatic neighbours Q, R, S – in a space whose situatedness in discourse independently invites cognitive science inquiry. The substantivist take on paradigmatic relations in discursive space also looks at phrases and clauses, as will become clearer in an empirical context in section 3. This is why the formal workings of (7) in the morphology and of (8) in the syntax raise the stakes for all linguists. In the morphology, irregular plurals or pasts sometimes preempt regular forms (men in English upstages mans, and went blocks goed) and sometimes do not (people coexists with persons, as does dreamt with dreamed). In the syntax, however, Says who? and What say you? do not block the regular Who says? and What do you say?. We return in section 2 to some reasons for complicating this empirical picture. For the moment, though, we take it that this contrast simply polarizes the morphology and the syntax. To the extent that it does, what sense may we make of the phenomena in terms of intersecting economies? We have already – at the discussion of (8) above – answered the question of why secondary licensing in the syntax does not preempt the primarily licensed regular form. We turn now to the morphological question. Cases like persons vs people, or brothers vs brethren, resemble elder vs older. Semantically differentiated doublets provide a niche for special forms. People and brethren exemplify the pluralia tantum phenomenon. Elder is similar. You can say the third and the fifth persons/ brothers, but not the third and the fifth people/ brethren. You can say older than, but not elder than. Clear examples like went upstaging *goed, or gave blocking *gived, are to be contrasted with dreamt and learnt freely alternating, for some speakers, with dreamed and learned. 
Such clear cases of irregularity, devoid of semantic doublet properties, call for comment. The following initial generalization suggests itself. Whenever a lexically isolated irregular form receives secondary licensing in the morphology, it blocks the regular template: men, women, children, oxen, hurt (past), went block the expected regular forms mans, womans, childs etc. When a WFS competes with a more general WFS, there is sometimes free variation (dreamt, leant, leapt ~ dreamed, leaned, leaped; learnt, burnt ~ learned, burned) and sometimes blocking (meant, crept, slept). In other words, a free variation pattern implies competing WFSs; morphological secondary licensing of a lexical loner implies blocking.
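
The initial generalization just stated can be restated as a small decision procedure. The sketch below is purely illustrative and forms no part of the WWM formalism: the class name `IrregularForm`, the flag `backed_by_wfs`, the outcome labels, and the regular form `sleeped` are assumptions introduced here, and the classifications simply follow the forms cited in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IrregularForm:
    """Hypothetical record for one irregular form (labels are illustrative)."""
    form: str            # attested irregular form, e.g. "men"
    regular: str         # what the general WFS would yield, e.g. "mans"
    backed_by_wfs: bool  # True if a less general WFS supports the form


def allowed_outcomes(item):
    """Outcomes permitted by the first-approximation generalization."""
    if item.backed_by_wfs:
        # Competing WFSs: free variation and blocking are both attested.
        return {"free variation", "blocking"}
    # Lexical loner with secondary licensing: the regular form is blocked.
    return {"blocking"}


men = IrregularForm("men", "mans", backed_by_wfs=False)
dreamt = IrregularForm("dreamt", "dreamed", backed_by_wfs=True)
slept = IrregularForm("slept", "sleeped", backed_by_wfs=True)

for item in (men, dreamt, slept):
    print(item.form, sorted(allowed_outcomes(item)))
```

On this encoding a lexical loner such as men can only block its regular competitor, whereas a WFS-backed form such as dreamt or slept is compatible with either outcome, which is all that the first approximation claims.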

This first approximation does not quite work, though. In Bangla verb morphology (Dasgupta 2001: 166, 171), certain free alternations such as guchono ~ gochano ‘to arrange neatly’, bulono ~ bolano ‘to stroke’, upRono ~ opRano ‘to uproot’, Sudhrono ~ Sodhrano ‘to correct’ are indeed associated with a WFS. But pechono ~ pichono ‘to step back’ is a lexically isolated free alternation – we find no WFS competition here. The claim we need to make requires a more careful formulation – “an irregular form bearing a syntactically significant feature freely alternates with a regular form only if a less general WFS supports the irregular alternant”. The Bangla data just mentioned do not counterexemplify this claim, for in the case of guchono ~ gochano ‘to arrange neatly’ neither alternant is more regular than the other. Where we can check this claim in Bangla – a future tense paradigm given in (11), where WFS (11g) supports the irregular variants (11c, f) – the claim is confirmed, whereas the lexically isolated past at (12f) blocks the regular template, a correlation our conjecture leads us to expect:

(11)
a. de ‘give!’
b. debo ‘(I) will give’ (regular)
c. dobo ‘(I) will give’ (irregular)
d. ne ‘take!’
e. nebo ‘(I) will take’ (regular)
f. nobo ‘(I) will take’ (irregular)
g. [Ce]V,Imp ↔ [Cobo]V,Fut,1p

(12)
a. kha ‘eat!’
b. khelo ‘(s/he) ate’ (regular)
c. pa ‘get!’
d. pelo ‘(s/he) got’ (regular)
e. ja ‘go!’
f. gElo ‘(s/he) went’ (irregular)
g. *jelo (supplanted regular form)

If this generalization is able to handle the gross patterns considered so far, we are ready to deal with the more intricate facts that a closer look brings into view – in section 2.
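
WFS (11g) can be made concrete as a pair of inverse string mappings. What follows is a minimal sketch under stated assumptions: C is taken to be exactly one consonant, the romanization is that of (11), and the function names, together with the `is_anchored` check that mimics the two-pair requirement of (10b), are hypothetical helpers rather than anything proposed in the WWM literature.

```python
import re

# Hypothetical encoding of WFS (11g): [Ce]V,Imp <-> [Cobo]V,Fut,1p.
# Assumption: C stands for exactly one consonant in the romanization of (11).
IMPERATIVE = re.compile(r"([^aeiou])e")
FUTURE_1P = re.compile(r"([^aeiou])obo")


def imperative_to_future(word):
    """Map an imperative of shape Ce to the first person future Cobo."""
    match = IMPERATIVE.fullmatch(word)
    return match.group(1) + "obo" if match else None


def future_to_imperative(word):
    """Inverse direction: the mapping in (10a) is bidirectional."""
    match = FUTURE_1P.fullmatch(word)
    return match.group(1) + "e" if match else None


def is_anchored(pairs):
    """Mimic (10b): at least two lexicon pairs must instantiate the mapping."""
    hits = sum(1 for imp, fut in pairs if imperative_to_future(imp) == fut)
    return hits >= 2


# The two pairs from (11) that anchor the strategy.
LEXICON_PAIRS = [("de", "dobo"), ("ne", "nobo")]

print(imperative_to_future("de"))    # form (11c)
print(future_to_imperative("nobo"))  # form (11d)
print(is_anchored(LEXICON_PAIRS))
```

The pattern deliberately fails on imperatives such as kha or ja in (12), which do not have the shape Ce; nothing in this sketch models the lexically isolated past gElo, which on the account in the text is licensed outside any such strategy.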


2. Arbitrariness It is not enough to say that syntax assembles sentences. That such assembly counts as primary licensing of the products so assembled is implicit in the statement that secondary licensing in the sense of (8) takes precompiled material like What say you? straight from lexico-phrasal storage into actual use as a sentence – bypassing the regular syntactic assembly process, avoiding competition with its mechanisms, and ensuring that its normal outcome What do you say? also counts as well-formed. However, even a regular sentence cannot be compositionally assembled all the way down to its phonological segments. What preexisting items does the syntactic process assemble when it composes a sentence? What non-transparent input serves as the point of departure for this transparent process of assembly? Linguists who offer the answer that the non-transparent input is essentially items like John, loves, and Mary do not speak of ‘opacity’ – the natural antonym for ‘transparency’ – to describe them. One says instead that John loves Mary differs from Jean aime Marie because the linguistic sign is ‘arbitrary’ by nature. There is more to this than meets the eye. The concept pair ‘arbitrary/ motivated’ has much in common with the ‘opaque/ transparent’ pair. But it pays to also take a careful look at some differences – one of our tasks in this section. Formalism’s code approach inherits structuralism’s tendency to stress relations between signs within a formally structured whole. While substantivism’s discourse approach does bring the concepts ‘opaque vs transparent’ to the fore, our proposal is not that these should supersede the codefocused ‘arbitrary vs motivated’, but that there should be a division of explanatory labour. We need ‘arbitrary/ motivated’ to manage lexical storage and ‘opaque/ transparent’ to drive the fresh assembly of utterances. Clarifying this division of labour, and the proposal that the two economies intersect, will involve some theorizing. 
But the moves made here will lead to empirical consequences before the section is over. The belief that simple signs like French chien ‘dog’ or Bangla kukur ‘dog’ are arbitrary because no biological or other foundation underwrites their concrete forms is often loosely held. “Signs are ungrounded” easily gives way to “simple signs constitute the ground.” One makes this slide and begins to regard kukur as unmotivated – as carrying no clues – in contrast to the ‘relatively motivated’ kukurer ‘dog’s’ which invites comparison with kukur ‘dog’, beRaler ‘cat’s’, beRal ‘cat’. In such a perspective, arbitrariness and motivation count as natural opposites. One regards kukurer

Strategies and their Shadows


‘dog’s’ as relatively motivated because it shares something with kukur ‘dog’ and something else with beRaler ‘cat’s’. But kukur ‘dog’ counts as arbitrary – it shares nothing with any other word. Notice that one is talking about items in lexical storage, not commenting on assembly. The differently conceptualized pair ‘transparent / opaque’ does involve the process of assembling an utterance. Transparency refers to its undistorted compositionality. An utterance is compositional if no opaque barrier within it (such as a world-creating predicate or modal operator) halts the cumulation of part-interpretations assembling the interpretation of the whole. On this view an utterance consists of constructions that consist of minimal utterances – independent words – and of those dependent words that work with them to build viable utterances. Now, how is the concept pair ‘arbitrary/ motivated’, which helps us make sense of inter-sign relations in the lexicon, to be articulated vis-à-vis the pair ‘transparent/ opaque’ that has to do with sentence assembly? Merging the two pairs does not in fact help. But formalists – thanks to the structuralist legacy – feel a strong temptation to conflate them. Why are they tempted to do this? And why should substantivists object? Under structuralist assumptions, relatively motivated signs like kukurer ‘dog’s’ count as composite signs. Structuralism never noticed the heterogeneity of the processes composing these composite signs. A linguist maximizing formal generality ends up wishing to fuse the two concept pairs as follows: “A language is anchored in a basic vocabulary consisting of simple (entirely arbitrary/ unmotivated) signs. Every relatively motivated sign is a composite sign, a construction composed of simple or composite signs. The patterns of the composition phenomena of a language are exhaustively describable in terms of rules. 
Rules specify opacity factor effects where necessary and implement transparent compositionality elsewhere.” Our portrayal of the conflation stresses an appeal to the ‘rule of grammar’ device since we are focusing on the illegitimacy of conflating word complexity and sentence assembly. Is our picture of what the formalists are doing rendered obsolete by the age of ‘principles’? We respond to this objection by noting that the distinction between description and explanation is old. The formalist programme in generative grammar set itself the initial goal of examining just how the abstract cross-rule patterning in particular grammars supports ‘principled’ UG characterization – termed ‘explanatory’ rather than ‘descriptive’ since at least 1962. Serious scrutiny of what the formalists have produced after syntax moved from rules to principles shows no break with transformational grammar’s structuralist beginnings, despite repeated claims that such a break has been made. ‘Principle’-focused formalistic writing does switch off descriptive devices like construction-specific rules. But it still portrays a sentence as an array of minimal meaning-bearers. That structuralist picture postulates a hierarchy of arbitrary atomic units niched into substructures, these niched into larger substructures, right up to the sentence itself.

What is the problem with this? Does substantivism not recognize the fact that language involves wholes containing parts? The substantivist is worried about the role of novelty in sentence assembly. Sentences are freshly assembled. Words are their starting point. Motivation/ arbitrariness handles supplies. Transparency/ opacity is about assembly. To run these together, in the name of a unified treatment of constituent-constitute relations, is surely to let the generative revolution down. We were supposed to be making sense of the constitutive novelty of sentence assembly. Substantivism is about keeping faith with the core commitment of generative grammar, instead of celebrating the structuralist residue in our legacy. A unified treatment of constitute-constituent relations is an inappropriate generalization. A sentence is a fresh assembly, not a stale constitute. Language is an Erzeugung, not an Erzeugtes.

We now explore some cases where opacity/ transparency differs empirically from arbitrariness/ motivation. If the opacity/ arbitrariness conflation were valid, there should be only one operative economy involved. But we find intersecting economies at work. Thus we do need a division of labour between the two concept pairs. We begin with irregular verbs. Consider the case of Bangla causatives, beginning with regular forms:

(13) a. rina duTo SaRi kacbe.
        ‘Rina will wash two saris’
     a’. jitu rinake diye duTo SaRi kacabe.
        ‘Jitu will make Rina wash two saris’
     b. korim tomake almari debe.
        ‘Karim will give you a cupboard’
     b’. mOheS korimke diye tomake almari deWabe.
        ‘Mahesh will make Karim give you a cupboard’

The causative verbs of (13a’,b’) have a regular formal correspondence with the base verbs of (13a,b). The causative exhibits an additional /(W)a/ within the verbal word. Some causatives are termed irregular because they do not match this template:


(14) a. morle to Ek bari morbo.
        ‘If I die, well, I’ll only die once’
     a’. prane marle (*mOrale) to Ek bari marbe (*mOrabe).
        ‘If they take my life, well, they’ll kill me only once’
     b. kaMcer baTi poRle to bhangbei.
        ‘A glass bowl, if it falls, will of course break’
     b’. kaMcer baTi tumi phelle (*pORale) to bhangbei.
        ‘A glass bowl, if you drop it, will of course break’
     c. eSOb rastaY gaRi Oto jore colbe na.
        ‘Cars won’t go so fast on these roads’
     c’. eSOb rastaY tumi Oto jore gaRi calabe ki? (*cOlabe)
        ‘Will you drive so fast on these roads?’
     d. tOrkariTa aSche.
        ‘The vegetable is on its way’
     d’. ora tOrkariTa anche (*aSacche).
        ‘They are bringing the vegetables’
     e. chatrira e ghOre thakuk.
        ‘Let the girl students stay in this room’
     e’. ora chatrider e ghOre rakhuk (*thakak).
        ‘Let them keep the girl students in this room’

As we see at (14a’-e’), the irregular causatives not only fail to match the base verbs (of (14a-e)) along the lines of (13), but actively prevent the regular causative counterparts from surfacing (to show this we present those forms and star them). However, in certain ‘sarcastic’ contexts, the regular causatives normally blocked by such irregular causatives make a cameo appearance – calling for theoretical commentary. Consider (15a”-e”), examples of the Sarcastic Causative:

(15) a”. tumi bujhi morbe bhabcho? mOracchi!
        ‘You think you’ll die, do you? I’ll make you die!’
     b”. Etogulo baTi poRe gElo, eTao poRbe bujhi? pORacchi!
        ‘So many bowls fell and broke, now it’s this one’s turn? I’ll make it fall!’
     c”. tomar gaRi eSOb rastaY cOle? cOlacchi!
        ‘Your car goes on these roads, does it? I’ll make it go!’
     d”. OboSeSe Ekhón tOrkari aSche? aSacchi!
        ‘Now the vegetable arrives at last? I’ll make it come!’
     e”. chatrira e ghOre thakbe? thakacchi!
        ‘The girl students will stay in this room, will they? I’ll make them stay!’

In the pragmatically marked context just exemplified, Bangla makes available Sarcastic Causative verbs, which carry a characteristic intonation contour. These verbs flaunt the regular causative templates that (14a’-e’) take plenty of trouble to avoid. What really is going on at (15a”-e”)? A Sarcastic Causative form in Bangla is always a single, unbroken word. We would expect it – if arbitrariness and opacity were identical – to exhibit idiosyncrasy simply because it is a word rather than a phrase. But the sarcastic causative is manifestly as predictable as periphrastic causatives (as in make him do it), with which it shares three properties that lexical causatives do not:

(16) a. phonological fidelity: the sarcastic causative mimics the base word closely;
     b. semantic invariance: this causative ranges over all the uses of the base;
     c. device independence: the mapping from the base onto this causative is consistent and does not have different shapes for different base-causative dyads.

Irregular forms standardly block the regular templates. But ‘our’ phenomenon, suspending the Blocking effect, exhibits what has been called Deblocking. En route to an explanatory analysis, it may help if we add another case of Deblocking – this time from English – to the basket of pertinent data:

(17) a. life      a’ lives
     b. wife      b’ wives
     c. knife     c’ knives
     aa Life      aa” Lifes

The irregular plurals lives, wives, knives face no competition in English in default contexts, where they routinely block the regular lifes, wifes, knifes.


However, when we say Life – the name of a popular magazine from the fifties – some sort of anti-irregular context seems to get switched on. The regular Lifes surfaces here, unblocked by the irregular plural. The association of anti-irregular plurals with the proper noun niche is demonstrably robust. The French proper noun Ciel ‘Sky’ also exhibits the anti-irregular plural Ciels. But the irregular plural cieux ‘skies’ always blocks ciels when used as a common noun.

Tentatively, here is a descriptive summary of the anti-irregularity facts. In certain marked contexts, a syntactic specification like ‘causative’ is able to elicit forms exemplifying the most general plural/ causative templates that the morphology of the language provides. The phenomenon does not merely set aside normal irregular causatives in favour of otherwise unobserved regular causative schema instantiations. It even elicits causatives of verbs that are otherwise hard to causativize. The word for ‘cause to sneeze’ is marginal in ordinary Bangla: (18a) verges on ungrammaticality; speakers normally say (18b) instead. But the sarcastic context brings even that verb into currency, as in (19):

(18) a. ???ei mOSlaTa SObaikei haMcaY.
        ‘This spice makes everybody sneeze’
     b. ei mOSlaTate SObari haMci aSe.
        ‘Given this spice, everybody sneezes’
(19) sophar nice lukiye tumi haMcbe? haMcacchi!
     ‘You hide under the sofa and sneeze, do you? I’ll make you sneeze!’

Likewise, the Sarcastic Passive, as in (20a), also elicits impersonal passives of verbs that resist passivization, as in (20b):

(20) a. abar chobi aMka hocche!
        ‘And now (one) is painting, is one!’
     b. abar puliSer hate dhOra pORa hocche!
        ‘And now (one) is getting caught by the police, is one!’

What gives anti-irregular morphology such swift access to forms otherwise unavailable in the lexicon? The answer, we suggest, is a process bypassing the lexicon.
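As a reading aid, the contrast between a stored irregular causative and a causative formed without consulting storage can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not an implementation proposed in this paper: the dictionary standing in for lexical storage, the function names, and the treatment of words as plain strings are all hypothetical. The word pairs are copied from (14) in the inflections cited there, and the string mapping from the schema Xe to the schema Xacchi follows the sarcastic forms in (15).

```python
from typing import Optional

# Base/causative pairs copied from (14a, b, d, e), in the inflections cited
# there. Using a dictionary as a stand-in for lexical storage is an
# assumption of this sketch.
STORED_IRREGULAR_CAUSATIVES = {
    "morle": "marle",
    "poRle": "phelle",
    "aSche": "anche",
    "thakuk": "rakhuk",
}


def default_causative(base: str) -> Optional[str]:
    """Default context: consult storage; a stored irregular form is returned.

    Returns None when storage has no irregular entry (the regular template
    is not modelled here).
    """
    return STORED_IRREGULAR_CAUSATIVES.get(base)


def sarcastic_causative(base_3p: str) -> str:
    """Marked context: map a word of shape Xe onto Xacchi.

    The function never looks at STORED_IRREGULAR_CAUSATIVES, which is the
    sense in which this route bypasses storage in the sketch.
    """
    if not base_3p.endswith("e"):
        raise ValueError("expected a word matching the schema Xe")
    return base_3p[:-1] + "acchi"


if __name__ == "__main__":
    print(default_causative("aSche"))    # stored irregular form
    print(sarcastic_causative("cOle"))   # formed on line, cf. (15c”)
```

The two routes are deliberately kept as separate functions: nothing the first route stores can block what the second route produces, which is the descriptive point of Deblocking.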
One way to implement this idea is as follows, in the case of the anti-irregular causative:

(21) [Xe]T, PresSimp, 3p ↔ [Xacchi]T, PresProg, Caus, 1p


This syntactically deployed Word Formation Strategy specifies the two sides in terms of the syntactic node T[ense] rather than the lexical category V[erb] and thus induces word formation on line, during syntactic tree assembly. For a similar analysis of the anti-irregular plural see Dasgupta (2003: 69), where these forms are termed ‘transparent’. We now add that anti-irregular formations are ‘transparent’ in that they reflect UG processes that can bypass the ‘arbitrary’ workings of the lexicon even when generating a word.

The material for which section 1 proposed ‘wild card’ solutions was unusually irregular; it pertained to arbitrary lexical storage. In contrast, the phenomenon we are now looking at is unusually regular; it has to do with transparent syntactic assembly. This contrast between the storage of arbitrary material and the compositional assembly of transparent utterances marks two distinct economies. To demonstrate that these economies not only coexist but intersect, we turn to nominal classification in Bangla.

Bangla displays noun classification phenomena that invite description in terms of classification formats, not distinct ‘classifiers’ (for a thicker description, see Dasgupta & Ghosh 2007, Dasgupta 2008):

(22) a. EkTa meye jabe.
        ‘One girl will go.’
     b. Ekjon meye jabe.
        ‘One girl will go.’
(23) a. duTo meye jabe.
        ‘Two girls will go.’
     b. dujon meye jabe.
        ‘Two girls will go.’
(24) a. duTo ghOr khali ache.
        ‘Two rooms are vacant.’
     b. *dujon ghOr
        two.Hum room
(25) a. *duTo bhOdromohila jabe/jaben.
     b. dujon bhOdromohila jaben.
        ‘Two ladies will go.’

The noun meye ‘girl’ can occur either with the general numerals EkTa ‘one.Gnl’, duTo ‘two.Gnl’, or with the human numerals Ekjon ‘one.Hum’, dujon ‘two.Hum’. But ghOr ‘room’ cannot take a human numeral, hence the star at (24b); bhOdromohila ‘lady’ rejects a general numeral, see (25a).
Notice that the honorific future verb form jaben ‘will go’ at (25b) contrasts with the default or non-honorific jabe; this will turn out to matter. Classification formats are also available at Det, as in (26)-(28), or N, as in (29) and (30). Note the absence of meyejon ‘girl.Hum’ – N has fewer formats than Num does. The distinctive shade of meaning associated with the XTi format at (28) is hard to gloss. We gloss it as NuanIndiv, for


‘nuanced individuation’; lexical semanticists will eventually address the issue.

(26) konTa ‘which one?’
(27) konjon ‘which one?’
(28) konTi ‘which one?’
(29) ei meyeTa ‘this girl’
(30) ei meyeTi ‘this girl’

On the basis of a richer set of data it has been shown that formalist accounts cannot be made to work. A morpheme-based analysis must assign separate feature submatrices to a Det/ Num/ Q/ N base and to a Classifier affix morpheme. It has been shown (Dasgupta 2007) that even the simple grouping of the common ‘Classifier morphs’ into ‘Classifier morphemes’ is unfeasible. At that stage, we had provisionally made (Dasgupta 2007) the assumption that a ‘Classifier morph’ can be separated from a ‘base’ in featural terms. Later, even that limited separability was shown to be unsustainable (Dasgupta & Ghosh 2007).

What classification formats can a noun exhibit in order to mark definite/ specific readings (specific with a demonstrative, definite elsewhere)? No noun appears in a human /Xjon/ format, as shown in (31a-c), but the /XTa/ format is widely used for singulars and /Xgulo/ for plurals, see (31d-g):

(31) a. *meyejon ‘the girl’
     b. *bhOdromohilajon ‘the lady’
     c. *upacarjojon ‘the vice-chancellor’
     d. meyeTa ‘the girl’
     e. meyegulo ‘the girls’
     f. ei meyeTa ‘this girl’
     g. ei meyegulo ‘these girls’

Some readers will need more clarity about how classification format exponence interacts with N, with Num/Q, with Det. One account of that


traffic is provided in Dasgupta & Ghosh (2007). To summarize, in a single Bangla nominal structure, at most one of the sites Det, Num/Q, and N carries classification features. We turn now to issues related to the way verbs agree with nominals for honorificity (recall (25a, b)). ‘These five students’ translates two different Bangla phrases:

(32) a. ei paMcjon chatro
        this five.Hum student
     b. ei paMcTa chatro
        this five.Gnl student

The numeral paMcjon in (32a), specified Human, contrasts with (32b)’s general numeral paMcTa. Numerals appear either skeletally, when we count Ek dui tin car paMc ‘one two three four five’, or in a format carrying classification features. WWM describes (32a,b) in terms of two Word Formation Strategies:

(33) WFS for Human Numerals
     [X] Num ↔ [Xjon] Num, Cla, Hum
(34) WFS for General Numerals
     [X] Num ↔ [XTa] Num, Cla, Gnl

Bangla verbs agree with their subject for Person and Honorificity. A pronoun must formally commit itself to an Honorificity value (Intimate, nHon, or Hon); a noun is, within limits, free to refer to individuals of varying degrees of honour:

(35) ‘You will go tomorrow’, three variants:
     a. tui kal jabi.
        you.Intim tomorrow go.Fut.Intim
     b. tumi kal jabe.
        you.nHon tomorrow go.Fut.nHon
     c. apni kal jaben.
        you.Hon tomorrow go.Fut.Hon
(36) ‘My student will go tomorrow’, two variants:
     a. amar chatro kal jabe.
        my student tomorrow go.Fut.3p.nHon
     b. amar chatro kal jaben.
        my student tomorrow go.Fut.Hon

Grammatically, any noun can take either Hon or nHon agreement. When the noun means ‘baby’ or ‘goat’, Hon agreement signals irony. If the noun


means ‘president’ or ‘queen’, nHon agreement indicates a speaker’s intention of expressing disrespect. Pronouns trigger feature-driven agreement. Strings that violate this are neither ironic nor disrespectful, but ungrammatical. Compare (35) with:

(37) *tumi kal jaben.
     you.nHon tomorrow go.Fut.Hon
(38) *apni kal jabe.
     you.Hon tomorrow go.Fut.nHon

Particular nouns have no lexically specified absolute Hon values. Formally the freely assigned Hon value a given nominal phrase carries triggers agreement. Does the noun control this Hon value?

(39) ‘These five students will go tomorrow’, variants:
     a. ei paMcjon chatro kal jabe.
        this five.Hum student tomorrow go.Fut.3p.nHon
     b. ei paMcjon chatro kal jaben.
        this five.Hum student tomorrow go.Fut.3p.Hon
(40) a. ei paMcTa chatro kal jabe.
        this five.Gnl student tomorrow go.Fut.3p.nHon
     b. *ei paMcTa chatro kal jaben.
        this five.Gnl student tomorrow go.Fut.3p.Hon
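The grammaticality judgments in (35)-(40) can be restated as a small compatibility check. The Python sketch below is a hypothetical illustration, not the paper's formalism: the function name, the two lookup tables, and the decision to treat a subject as either a pronoun or a noun phrase with an optional numeral format are assumptions made for the example. It records only the star judgments; pragmatic effects such as irony or disrespect are not modelled.

```python
from typing import Optional

# Honorificity values of the pronouns in (35) and of the verb forms used
# throughout (35)-(40). The table layout is an assumption of this sketch.
PRONOUN_HON = {"tui": "Intim", "tumi": "nHon", "apni": "Hon"}
VERB_HON = {"jabi": "Intim", "jabe": "nHon", "jaben": "Hon"}


def agreement_ok(verb: str,
                 pronoun: Optional[str] = None,
                 numeral_format: Optional[str] = None) -> bool:
    """Return True if the subject/verb pairing is not starred in (35)-(40).

    pronoun: one of the keys of PRONOUN_HON, or None for a noun subject.
    numeral_format: "Hum", "Gnl", or None (noun subject without a numeral).
    """
    value = VERB_HON[verb]
    if pronoun is not None:
        # Pronouns are rigid: the verb must carry the pronoun's own value.
        return PRONOUN_HON[pronoun] == value
    if value not in ("nHon", "Hon"):
        # Noun subjects are only shown with nHon or Hon verbs in the data.
        return False
    if numeral_format == "Gnl":
        # A General numeral excludes Hon agreement, as in (40b).
        return value == "nHon"
    # Bare nouns and Human numerals accept both values, as in (36) and (39).
    return True


if __name__ == "__main__":
    print(agreement_ok("jaben", pronoun="tumi"))          # (37)
    print(agreement_ok("jaben", numeral_format="Hum"))    # (39b)
    print(agreement_ok("jaben", numeral_format="Gnl"))    # (40b)
```

The check is a flat list of conditions on purpose; it summarizes the data and makes no claim about how the contrast should be derived.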

Human classification features carried by the numeral are compatible with both nHon and Hon agreement, we find at (39a,b). But (40) shows that a General numeral triggers nHon agreement, sharply excluding Hon. What form should the proper description of this contrast take? We have seen at (36) that a noun accepts both values of Hon in principle. The (39b)(40b) contrast shows that the General feature matrix resists Hon agreement, while the Human nominal phrase in (39b) permits it. How can we allow for this without forgetting the rigidity of pronouns? Leaving some details to Ghosh (2006), we focus on examples built around upacarjo ‘vice-chancellor (university president, rector)’:


(41) a. ??ei dujon upacarjo kal jabe.
        ‘These two vice-chancellors will go tomorrow.’
     b. ei dujon upacarjo kal jaben.
        ‘These two vice-chancellors will go tomorrow.’
(42) a. ei duTo upacarjo kal jabe.
        ‘These two vice-chancellors will go tomorrow.’
     b. *ei duTo upacarjo kal jaben.
        this two.Gnl VC tomorrow go.Fut.3p.Hon

While chatro ‘student’ is a neutral noun, upacarjo ‘vice-chancellor’ carries an Hon expectation, which (41b) meets. If a speaker intends disrespect, we expect the casualness to go all the way; in (42a), both the General features of the numeral and the nHon agreement on the verb express disrespect. (41a) is so puzzling as to sound like an error; the choice of the Human numeral, plus the pragmatic default of honour for vice-chancellors, leads us to expect an Hon verb, but what we get is puzzlingly nHon. To rescue (41a), we can imagine the speaker to be a senior figure who is so far above all vice-chancellors in status that s/he can use nHon verbs casually, but who wishes to avoid disrespect and sticks to the Human format. It is the need to imagine such a special viewpoint for (41a) that makes it nearly uninterpretable.

Suppose you are a disrespectful speaker and would use (42a). You would then utter (43) for the singular. If you wish to show respect as in (41b), however, your choices are (44a,b). The Nuanced Individuation form (44a) carries mild irony. The unformatted noun in (44b) can be diagnosed as a case of UG imposing a transparent stopgap in a niche left unoccupied by the arbitrary logic of classification formats in the particular grammar of Bangla. In support of our diagnosis we note that even mild pejoration at duTi contradicts the Hon verb so severely as to nearly star (45a):

(43) upacarjoTa kal jabe.
     VC.Gnl tomorrow go.Fut.3p.nHon
(44) a. upacarjoTi kal jaben.
        VC.NuanIndiv tomorrow go.Fut.3p.Hon
     b. upacarjo kal jaben.
        VC tomorrow go.Fut.3p.Hon


(45) a. ??upacarjo-duTi kal jaben.
        VC-two.NuanIndiv tomorrow go.Fut.3p.Hon
     b. upacarjo-dujon kal jaben.
        VC-two.Hum tomorrow go.Fut.3p.Hon
     c. upacarjo-duTo kal jabe.
        VC-two.Gnl tomorrow go.Fut.3p.nHon
     ‘The two vice-chancellors will go tomorrow.’

In other words, speakers choose between the respect-preserving default (45b) and the overtly disrespectful alternative (45c); they have no use for (45a), except perhaps to convey extreme irony. But (44b) sounds normal and (44a) comes out as an only slightly ironic variant. What does this indicate? Our reading is that the UG default at (44b) and Bangla’s Nuanced Individuation format NTi at (44a) step in to fill a language-particular system gap. Revisiting (29) and (30) helps identify the gap in question – recall that Njon is starred; but the point of interest is the availability of two fillers, not the gap itself. The language-particular system offers a limited extension of NTi at (44a) (limited in that (45) makes NumTi’s non-participation evident); UG offers the option that an unformatted Hon noun can take on the definiteness features normally associated with a classification formatted noun (Dasgupta & Ghosh 2007); neither of the fillers blocks the other.

How do these arbitrary, word-carried particular realities of a language interact with UG-driven sentence assembly? How are we to make sense of the fact that, in a context where resources are stretched to meet unusual needs, what the arbitrary face of Bangla has to offer by way of a minimal stretching of the logic of classification formats in order to fill the gap – namely, (44a) – neither blocks nor gets blocked by (44b), the form that the transparent face of Bangla offers as a filler for the same gap? Notice that this fact contradicts formalist predictions. Formalists set up a single dimension ranging from ‘most arbitrary/ opaque’ to ‘most motivated/ transparent’. Their view thus weakly predicts that, of the two choices (44a, b), one shall block the other.
Given the requirement in Bangla that a definite nominal should use a classification format on the noun, one might argue that the formalist view strongly predicts that only (44a) should count as well-formed, for (44b) violates that requirement. The facts of (44) disconfirm both variants of the prediction. Thus, the mutually unprevented availability of (44a, b) confirms the substantivist conjecture that lexical arbitrariness and sentence assembly transparency manage intersecting


economies. Both of these economies, intersectingly, determine such matters as whether word A or word B is available to speaker S in context X or context Y. The point that arbitrariness and transparency run distinct economies was made earlier (Dasgupta 2007: 170), in relation to the consequences of the maximization of ‘compact’ arbitrariness in lexical storage and of the distinct maximization of transparency in the compositional assembly process of the syntax. This time we concentrate instead on how these economies intersect – on the logic of multiple validation. If language is driven by intersecting economies associated with lexical storage and with fresh assembly, parallel validation vis-à-vis distinct sets of constraints is bound to raise formal issues. What do these look like on the ground? This question comes to the fore in section 3.

3. Strategies and Shadows

Some advances in our understanding of validation in phonology and syntax have emerged from the study of ill-formed strings. What we have said about Word Formation Strategies so far concerns only the possible words that they serve to illuminate. But we need also to investigate what a strategy pushes into the shadow – its invalidations. To this end, consider WFS (46a), which interrelates the Bangla words shown at (46b-d):

(46) a. [X]N ↔ [XWala]N, ‘someone professionally concerned with X’
     b. aiskrim ‘icecream’          aiskrimWala ‘icecream seller’
     c. baRi ‘house’                baRiWala ‘landlord’
     d. baMSi ‘flute, pipe’         baMSiWala ‘flautist, piper’
     e. *baMSiWalaWala ‘piper seller’

If we come across a vendor selling toy Pied Pipers of Hamelin, an unconstrained application of (46a) might make us call him a *baMSiWalaWala ‘a piper seller’; but this word, we note at (46e), is morphologically unavailable. Likewise, English does not enable the use of *flautistist to describe a social scientist who studies the category of flautists. We propose to call this phenomenon the ‘strategy shadow’ cast by a WFS. Along the same lines, (47a)-(49a) in Bangla enable (b, c) in each set, but disable the strategy shadow forms at (d, e):


(47) a. [X]N ↔ [Xoj]A/N ‘(something) originating from X’
     b. jOl ‘water’                 jOloj ‘water-born (organism)’
     c. bon ‘forest’                bonoj ‘forest-produced/ product’
     d. jOloj (see (b))             *jOlojoj ‘aquan-born’
     e. bonoj (see (c))             *bonojoj ‘sylvan-born’
(48) a. [X]N ↔ [Xhin]A/N ‘(someone) lacking X’
     b. griho ‘home’                grihohin ‘(someone) homeless’
     c. bitto ‘wealth’              bittohin ‘(someone) penniless’
     d. grihohin (see (b))          *grihohinhin ‘homelessless’
     e. bittohin (see (c))          *bittohinhin ‘pennilessless’
(49) a. [X]N ↔ [Xbhoji]A/N ‘(organism) feeding on X’
     b. trino ‘grass’               trinobhoji ‘grass-eating/ eater’
     c. pipilika ‘ant’              pipilikabhoji ‘ant-eating/ eater’
     d. trinobhoji (see (b))        *trinobhojibhoji ‘grass-eater-eater’
     e. pipilikabhoji (see (c))     *pipilikabhojibhoji ‘ant-eater-eater’

The fact that WFSs in general invalidate double application products needs to be contrasted with the different behaviour of freshly assembled phrases. A minimal pair useful for this purpose is available. At (51d, e) one finds the syntactic means to express the notions that the morphology is unable, at (50d, e), to format as single words – and we should add that superimposing (51d, e)’s contrastive stress on (50d, e) does not remove their ill-formedness:

(50) a. [X]N ↔ [Xantor]N ‘another X’
     b. deS ‘country’               deSantor ‘another country’
     c. gram ‘village’              gramantor ‘another village’
     d. *deSantorantor ‘another other country’
     e. *gramantorantor ‘another other village’
(51) b. onno EkTa deS
        other one country
        ‘another country’
     c. onno EkTa gram
        other one village
        ‘another village’
     d. ónno EkTa onno deS
        other one other country
        ‘anóther other country’
     e. ónno EkTa onno gram
        other one other village
        ‘anóther other village’

The reason that (51) goes through is that the syntax performs fresh and free assembly. (50d, e) fail because a strategy cannot operate in its own shadow – a phenomenon that calls for a formal account. But is it indeed the case that every WFS casts a shadow systematically invalidating what the double application of the strategy would have produced? Do (52) and (53) from Bangla and (54) and (55) from English not counterexemplify the claim that this phenomenon is perfectly general?

(52) a. pitamOho ‘grandfather’
     b. propitamOho ‘great-grandfather’
     c. propropitamOho ‘great-great-grandfather’
(53) a. SOmaj ‘society’
     b. SOmajbirodhi ‘(an) anti-social (element)’
     c. SOmajbirodhibirodhi ‘(an) anti-anti-social (element)’
(54) a. communist
     b. anti-communist
     c. anti-anti-communist
(55) a. language
     b. meta-language
     c. meta-meta-language

We comment on these cases in Appendix 1, in order to preserve the flow of this discussion and to make it easy for readers to develop their own solutions. Setting these aside, we now propose a general analysis of (46)-(51). Strategy (46a) as stated does not say only that one can move from X to the schema specified phonically as XWala and semantically as ‘an X-concerned person’. Its bidirectional arrow also lets a user move, phono-semantically, from XWala to X. A foreign learner of Bangla who has not heard kulpi ‘coolfi, a cold sweet’, on hearing kulpiWala ‘coolfi seller’, will infer that kulpi is the word for whatever such a person is professionally concerned with. Notice, then, that a WFS works, with reference to some set of paired examples underwriting the strategy – for instance, (46b-d) – by applying the bischematic template of (46a) to a word fitting either the X schema or the XWala schema. On the basis of this template matching, the strategy’s action proceeds either left to right, adding Wala plus its semantics to an X that lacks it, or right to left, subtracting Wala-plus-semantics from an XWala that has it. A strategy is a toggle switch. In its bischematic design, the specification of XWala on the right-hand side implies that the X on the left does not have a Wala-plus-semantics in it. Applying this strategy to an XWala form yields an X form by reversing the ‘add Wala’ instruction, i.e. subtracting Wala-plus-semantics. There is no way to obtain an XWalaWala form by applying (46a). In other words, the formal operation of word formation strategies itself entails the strategy shadow effect; we are looking at a theorem.

Does this account invite comparison with Aronoff’s (1976: 95-97) discussion based on Isačenko (1972) on what they both took to be “truncation rules which prevent surface suffix doubling”? Or perhaps with Aronoff’s (1976: 37n4) observation that “Systematically, -ly does not attach to adjectives which themselves end in -ly (silly/*sillily)”? The status of Aronoff’s truncation rules or of his constraint on -ly seems not to have been elaborated into a full-blown formalist analysis of a ‘suffix doubling’ filter. In order to take the debate further – regardless of particular approaches – morphologists will need to consider the possibility of paradigmatically relating the strategy shadow phenomenon to reduplication. To see the point, imagine that language design were to let word formation strategies apply iteratively and to produce ‘accidental reduplication’. Such traffic would get in the way of reduplication existing as a distinct phenomenon. But natural language seems to have some use for reduplication (for recent serious work on reduplication, see Singh 2005, Montaut 2008). It follows that reduplication needs space, and therefore must be visible. Thus language design must have features guaranteeing the non-generation of accidental reduplicants – features such as strategy shadow, if our account of the phenomenon is on the right track.
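The toggle-switch reading of a Word Formation Strategy described above can be made concrete with a short Python sketch. It is offered only as an illustration under stated assumptions: the class name Strategy, the method name toggle, and the reduction of a bischematic template to a string suffix are hypothetical simplifications, and the semantic side of the strategy is not modelled. The cases set aside for Appendix 1, namely (52)-(55), fall outside what the sketch covers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """A bischematic template [X] <-> [X + suffix], reduced to string shape.

    Treating the right-hand schema as 'ends in suffix' is an assumption of
    this sketch; a real strategy pairs phonological and semantic schemas.
    """
    suffix: str

    def toggle(self, word: str) -> str:
        """Apply the strategy to a word matching either of its two schemas.

        A word fitting the right-hand schema (X + suffix, X non-empty) is
        mapped to X by subtraction; any other word fits only the left-hand
        schema and is mapped to X + suffix by addition.
        """
        if word.endswith(self.suffix) and len(word) > len(self.suffix):
            return word[: -len(self.suffix)]
        return word + self.suffix


if __name__ == "__main__":
    wala = Strategy("Wala")
    print(wala.toggle("baMSi"))       # baMSiWala
    print(wala.toggle("baMSiWala"))   # baMSi
    print(wala.toggle("kulpiWala"))   # kulpi
```

Because a word that already fits the right-hand schema can only be sent back to the left-hand one, repeated application alternates between the two shapes and never stacks the suffix. In this toy setting that alternation is the strategy shadow effect.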
Whether we are on the right track is something we can check by triangulating – by asking whether strategies other than WFSs cast a shadow. At this juncture readers need to acquaint themselves with substantivist proposals for the proper treatment of arbitrariness in lexico-phrasal storage. The most arbitrary material takes the form of words. Substantivist lexical entries are connected by Word Formation Strategies, already exhibited in this paper. Within this realm, other schools of thought formally demarcate degrees of arbitrariness by using either lexical strata or the word boundary/ morpheme boundary distinction. But substantivism adopts the WWM working hypothesis that there are no morpheme boundaries, and that postulating a word-internal word boundary, as in Aronoff (1976: 121-


9) on the ‘productive affix #able’, is not a descriptively adequate solution. It is perhaps only fair to specify what our take is on these matters. At the formal level, analyzing the material within WWM is a straightforward exercise. Those ‘unproductive +able/ +ible’ words that appear in pairs like perceptive, perceptible, suggestive, suggestible, division, divisible, derision, derisible are amenable to highly specified WFSs like Xtiv~Xtibl, Xžən~Xzibl, etc. But Aronoff’s ‘productive affix #able’ corresponds to a simple, general WFS X~Xəbl. Where does WWM or substantivism say this distinction should predict that the intricately arbitrary ‘unproductive’ cases shall contrast with the more iconic phonology and semantics of the ‘productive’ ones? WWM does not say; it excludes the matter from the morphology. Substantivism comments that (non-)iconicity properties of words – and the contrast between general WFSs, which by preserving shape maximize iconicity, and highly specified WFSs, which do not – are handled by semiotics. Natural language words, as well as images and other objects of semiotic inquiry, invite semiotic description and explanation, in addition to formal linguistic analysis. This is what multiple validation is all about. Inter-word paradigmatic relations hold in the space of discourse, a domain that semiotic analysis has long claimed for its own.2 What substantivists object to is not the decision by other linguists to present some semiotic results – in terms of strata, distinct boundaries, or other devices – within what they package as morphology, for this is a question of nomenclature. We object to the claim that these devices are continuous with the formal mechanisms of syntactic assembly. And it is that claim that lies at the heart of the formalist programme. It is important to demarcate the lexical domain of arbitrariness from the process of fresh assembly in the sentence. 
If the relations between a word and its neighbours were subject only to the laws of the syntax, the task would be simple, and the substantivist account of lexico-phrasal storage would refrain from comment. But linguists recognize the special relations of a ‘clitic’ with its ‘host’ word, and constructions that in various ways elude syntactic generalization. Substantivist work has accordingly made formal proposals for the adequate treatment of the lexico-phrasal specification of these properties. To handle clitics, we have proposed Word Extension Strategies, WESs (Dasgupta 2005: 61). Although WESs do not formally define the notion ‘clitic’ – this reticence is akin to what prevents WWM from formally registering the semiotically distinguishable degrees of arbitrariness – they in


effect postulate one WES per clitic. In this paper, we have nothing to say about Word Extension Strategies. In order to handle constructions that either resist general treatment in the syntax or correspond so directly to morphological devices that the continuity with them needs to be formalized, we have proposed Phrase Formation Strategies (Dasgupta, Ford & Singh 2000: 171).3

To return to the main thread, in our bid to check whether our strategy shadow proposals are on the right track, it is to Phrase Formation Strategies (PFSs) that we now turn. In addition to adjectival comparison WFSs, English also has PFSs introducing the comparative functor more and the superlative functor most. The formulations provided below abstract away from syntactic framework-laden details that will need to appear in any fleshed-out version of strategies (56a, b):

(56) a. [X]Adj → [more [X]Adj]Compv
     b. [X]Adj → [most [X]Adj]Superlv

The comparative WFS in English associates light adjectives with comparative adjective words like higher, lower, brighter. In contrast, PFS (56a) associates heavier adjectives like strenuous, intelligent, effective with phrasal comparatives like more strenuous. The question now is whether (56a) casts a strategy shadow. The logic of comparison – given the grammaticality of (57c) – would lead us to expect (57d) to be fine, but it is in fact ungrammatical, and this looks like a strategy shadow fact:

(57) a. A is more effective than B (as a manager).
     b. C is more effective than D.
     c. A is as much more effective than B as C is more effective than D.
     d. *A is more more effective than B than C is more effective than D.

Unless some other account of the ungrammaticality of (57d) is shown to be more persuasive, data set (57) stands as evidence for the claim that PFS (56a) is the right analysis of phrasal comparison in English, that it casts a shadow, and that (57d) falls within this shadow. At this stage, issues of intermodular traffic arise. Compare (57) with (58) – we omit the (a)-(c) examples here to save space:

(58) d. *A is more taller than B than C is taller than D.


Under substantivist assumptions, a word such as taller does not in any sense arise from a syntactic structure of the er MUCH tall type. Thus, the syntax per se cannot monitor interactions between the shadow of PFS (56a) and the shadow of the WFS responsible for taller. What, then, prevents (58d)? The strategy shadow phenomenon has to do with what the strategy is adding/subtracting – here the Comparative specification, which actual instantiations must systematically lack in order to match the left-hand side schema in (56a) and thus to qualify as acceptable left-hand side input for that strategy. (58d) then fails because taller, though an Adj, fails to match this feature of the schema, since it bears Comparative feature/s. The syntactic composition of the Comparative feature submatrix is where the action is. This submatrix, shared by PFS (56a) with the WFS that handles light adjective comparatives like taller, ensures that (58d) will not be generated through the action of (56a).

For this account to work, it is essential to distinguish syntax from semantics. Consider (59d):

(59) d. ?John is more senior to Bill than Susie is to Mary.

This is, if not perfect for all speakers, clearly far more acceptable than (58d). If semantics and syntax were closely matched, we would expect the adjective senior to be a lexically irregular comparative (given secondary licensing by the morphology as a comparative), and we would expect (59d) to be just as bad as (58d). This expectation is not met. We conclude that the lexical item senior does not carry the syntactic feature submatrix that specifies true comparatives, even though its semantics must be extremely close to that of a true syntactic comparative such as older. We cannot conclude, however, that the semantics per se has no direct bearing on strategy shadow phenomena. Consider (60a-h) from Bangla and (61) from English:

(60) a.  bhu                  ‘earth’
     a’. bhutOtto             ‘geology’
     b.  SOmaj                ‘society’
     b’. SOmajtOtto           ‘sociology’
     c.  cikitSa              ‘treatment’
     c’. cikitSaSastro        ‘the science of medicine’
     d.  gonit                ‘mathematics’
     d’. gonitSastro          ‘the study of mathematics’
     e.  bhutOttoSastro       ‘geology-science’
     f.  *SOmajtOttoSastro    ‘sociology-science’
     g.  *cikitSaSastrotOtto  ‘medicine-ology’
     h.  *gonitSastrotOtto    ‘mathematics-ology’

(61) a. *geographology, *geologography, *economicology, etc.
     b. *driverist, *chauffeurist, *conductorist, etc.

The unavailability of double marking of discipline status observed at (60e-h) in Bangla and (61a) in English, and of double marking of professional status observed at (61b) in English, suggests that there is a specific semantic factor to reckon with. Next to cases of the strategy shadow phenomenon itself, such as *Xerer, *Xistist, *Xographography, the semantic clustering of certain strategies gives rise to a ‘strategy cluster shadow’ phenomenon. The way it works is that iterative applications of any two strategies that are clustermates sound just as bad as iterative application of one single strategy. What needs to be mapped with care is the differentiation of effects. Do these semantic effects seem to parallel certain doubling avoidance effects that appear at the word formation level but arise from phonological factors, as in Aronoff’s (1976: 37n4) footnote cited above? Or does the semantics work in a more intimate association with the syntax? At the present moment in such inquiry, the best we can do is sharpen these questions and cast our empirical net wider.

When parallel validation – vis-à-vis different components of the language faculty – works smoothly, we do not notice what each component contributes to the process of validating a legitimate linguistic structure. The present section, focusing as it does on ill-formed cases, gives us some access to points at which the multiplicity of validation becomes visible, and the contributions of the various components fan out.
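The logic of the strategy shadow and of the strategy cluster shadow, as described in this section, can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions and not the formalism itself: each strategy is reduced to a category requirement, a single feature it adds, and a function that builds the output form; clustermates are modelled simply as strategies that add the same feature; and the names `Word`, `Strategy`, `apply_strategy`, the feature labels "Comparative" and "Discipline", and the four sample strategies are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional


@dataclass(frozen=True)
class Word:
    form: str
    category: str
    features: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Strategy:
    name: str
    category: str                   # category the input must have
    adds: str                       # feature the strategy contributes
    build: Callable[[str], str]     # how the output form is made


def apply_strategy(strategy: Strategy, word: Optional[Word]) -> Optional[Word]:
    """Return the output item, or None if the input fails to match.

    An input that already bears the feature the strategy adds lies in the
    strategy's shadow and does not match the left-hand side schema.
    """
    if word is None or word.category != strategy.category:
        return None
    if strategy.adds in word.features:
        return None
    return Word(strategy.build(word.form), word.category,
                word.features | {strategy.adds})


# Assumed toy strategies. The first two share the Comparative feature,
# the last two share an assumed semantic cluster feature, Discipline.
MORE_PFS = Strategy("PFS (56a)", "Adj", "Comparative", lambda f: "more " + f)
ER_WFS = Strategy("comparative WFS", "Adj", "Comparative", lambda f: f + "er")
TOTTO = Strategy("XtOtto", "N", "Discipline", lambda f: f + "tOtto")
SASTRO = Strategy("XSastro", "N", "Discipline", lambda f: f + "Sastro")

if __name__ == "__main__":
    effective = Word("effective", "Adj")
    once = apply_strategy(MORE_PFS, effective)
    print(once.form)                                   # phrasal comparative
    print(apply_strategy(MORE_PFS, once))              # cf. (57d)
    taller = apply_strategy(ER_WFS, Word("tall", "Adj"))
    print(apply_strategy(MORE_PFS, taller))            # cf. (58d)
    senior = Word("senior", "Adj")                     # no syntactic Comparative
    print(apply_strategy(MORE_PFS, senior).form)       # cf. (59d)
    sociology = apply_strategy(TOTTO, Word("SOmaj", "N"))
    print(apply_strategy(SASTRO, sociology))           # cf. (60f)
```

The sketch treats senior as lacking the Comparative feature, following the conclusion drawn from (59d). It deliberately leaves open the question raised above, namely whether the cluster feature belongs to the semantics alone or works in closer association with the syntax.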

4. Verbal Constructions

When we approach the study of constructions that resist general treatment in the syntax, we face a frustratingly heterogeneous list of quirky phenomena. The Phrase Formation Strategy device – or devices from other frameworks that specify the properties of a syntactic construction – can provide initial coverage of some facts. But it leads to no general recipe for advancing our understanding of the niche that these distinctive phenomena occupy


either in the process of sentence assembly, or in the realm of lexico-phrasal storage, or even in some intermediate zone. We thus need to focus on the problems that all inquiry in this domain must address. Recent years have seen the development of formalisms that manage the interface with the formal semantics, or that make it possible to harvest corpus data for phraseological listing. While these developments are welcome, the fundamental issues of the traffic between lexico-phrasal arbitrariness and syntactic transparency in these semi-transparent constructions do not come to the fore. It is to be hoped that the relatively unadventurous formal properties of PFSs will make it possible for the substantivist framework to highlight questions that others are also going to have to face.

In this context, we explore here a few constructions based on verbs. These constructions are amenable to analysis on the basis of the proposals in section 1 about secondary licensing in the syntax. Further inquiry on constructions based on items other than verbs is welcome – that is where advances beyond the mechanisms proposed in this paper can be expected.

The regular syntax of embedded non-finite constructions is exemplified at (62); the embedded verb appears in the infinitive form. At (63) we observe the conditional adverbial participle. The sentences in (64) show that these sequences are reversible in certain contexts. At (62c) we see that negation is admissible; this is also true for the other cases. These constructions also permit interrogation; to save space, we ask you to take our word for this:

(62) a. tOndra gan gayte pare
        ‘Tandra can sing (songs)’
     b. bijon SaMtar kaTte ceYechilo
        ‘Bijan wanted to swim’
     c. robin projitke baRi jete debe na
        ‘Rabin won’t let Prajit go home’
(63) a. tOndra gan gayle projit cole jabe
        ‘If Tandra sings, Prajit will leave’
     b. projit baRi gele robin chOTphOT korbe
        ‘If Prajit goes home, Rabin will fidget’
(64) a. tOndra páre gan gayte
        ‘Tandra cán sing (songs)’
     b. bijon céYechilo SaMtar kaTte
        ‘Bijan wanted to swim’
     c. prójitke baRi jete robin debe na


        ‘Prájit going home is something Rabin won’t allow’
     d. projit cóle jabe tOndra gan gayle
        ‘Prajit will léave if Tandra sings’
     e. robin chÓTphOT korbe projit baRi gele
        ‘Rabin will fídget if Prajit goes home’

We now consider special constructions – some embedded conditional participles that call for PFS treatment:

(65) a. projit baRi gele pare
        ‘It’s best if Prajit goes home’
     b. tumi gan Sikhle paro
        ‘It’s best if you learn music’
(66) a. projit baRi gelé i pare
        ‘Prajit may as well go home, why doesn’t he’
     b. tumi gan Sikhlé i paro
        ‘You may as well learn music, why don’t you’

We do not provide actual PFS formulations here as it is not the formalism that is of interest. We note at (67) that permutations that break the construction up are systematically excluded (special intonation add-ons have no impact on the ill-formedness of (67a, b), and the presence or absence of the Emphatic particle makes no difference either), and at (68) that negation and interrogation are excluded:

(67) a. *projit pare baRi gele (i)
         Prajit can home go.Cnd (Emph)
     b. *tumi paro gan Sikhle (i)
         you can music learn.Cnd (Emph)
     c. baRi gele (i) pare projit
        ‘As for Prajit, he may as well go home (why doesn’t he)’
     d. tumi Sikhle (i) paro gan
        ‘As for music, you may as well learn it (why don’t you)’
(68) a. *projit baRi gele (i) pare na
         Prajit home go.Cnd can Neg
     b. *projit kothaY gele (i) pare?
         Prajit where go.Cnd (Emph) can
     c. *tumi gan Sikhle (i) paro na
         you music learn.Cnd (Emph) can Neg


     d. *tumi ki Sikhle (i) paro?
         you what learn.Cnd (Emph) can

Why should these be excluded? We take it that the conditional verbs in the special constructions of (65) and (66) are doing work normally reserved for infinitives. However, regular conditional adjuncts, as in (64d-e), and regular infinitival complements, as in (64a-c), are not averse to permutation breaking the verbal sequence up; negation and interrogation are routine options for clauses. What then prevents permutation in (67a-b) or negation and interrogation in (68)? In what way does the construction, however specified, affect the fundamental properties of these verb forms in the syntax?

Another verb-based construction has the form of a fortified future tense that involves sandwiching an Emphatic /i/ between two copies:

(68) e. rajar SOngge dEkha korbo i korbo
        ‘(We) absolutely sháll meet the king’
     f. o tomake churiTa pherot debe i debe
        ‘S/he definitely wíll give you the knife back’

This construction allows permutation but not negation/interrogation:

(69) a. dEkha korbo i korbo rajar SOngge
        ‘Meet the king we absolutely sháll’
     b. *rajar SOngge dEkha korbo (na) i korbo na
         king with meet will.do (Neg) Emph will.do Neg
     c. *tomra kar SOngge dEkha korbe i korbe?
         you who with meet will.do Emph will.do

In the face of different demands from different constructions, how can we so dovetail the construction’s special demands with the syntactic system’s general traffic that our description will replicate just the right degree of traffic dislocation? Our response is to appeal to the secondary licensing device proposed in section 1 and to look at the specific zone of clausal architecture that the construction targets. Secondary licensing locates the drama in the root sentence. The fortified future is a positive polarity construction and targets the zone where options of negation or interrogation would have surfaced; hence (69b, c). The embedded conditional participle targets a zone lower in the clausal architecture and thus leaves negation/interrogation options unaffected, but freezes the non-finite plus finite sequence, whose constituents therefore cannot be separated.

Have we been able, though, to effect a neat separation of lexical idiosyncrasy from constructional irregularity? Can we claim to have put all truly arbitrary material in a lexical box whose lid we know how to shut? As we conclude this study, we draw your attention to some evidence that the lexical box does not have a lid we can shut.

One type of lexical idiosyncrasy is described in terms of ‘bound’ words. Such a word is bound to a particular neighbourhood. For instance, in contemporary English, the bound words betwixt ‘between’ and let ‘hindrance’ occur only in the fixed locutions betwixt and between and without let or hindrance. One would imagine that bound words represent the peak of arbitrariness in natural language. It is noted in Dasgupta (2006: 155-57) that a systematic class of bound words in Bangla are associated with a ‘productive’ WFS:

(70) ki choTánTa i na chuTechi!
     what run.x Emph Particle I’ve.run
     ‘What a running I’ve run!’
(71) ami ja ThOkán Thokechi tar juRi nei!
     I what cheat.x have.been.cheated its match isn’t
     ‘Nobody can match the deception I’ve been through!’
(72) lokTa amader ki bhogán bhugiyeche!
     the.man us what harassment has.harassed
     ‘What harassment that man has put us through!’

These words choTan, ThOkan, bhogan are bound – they always serve as cognate objects that must co-occur with a verb with which the WFS associates them. They carry a characteristic intonation contour expressive of frustration and must appear in a clause with an exclamatory wh-phrase. If we are able to refine sufficiently the formalism of PFSs, perhaps such data will lend itself to statement in terms of a WFS embedded in a PFS, though this lies beyond our current means. To the extent that words bound hand and foot to a particular context count as especially arbitrary, there is something paradoxical about the fact that a language-wide pattern should be able to sponsor a systematic class of bound words – something oxymoronic about the expression ‘systematic class of bound words’. B.N. Patnaik (personal communication) informs us that (70)-(72) are not unique to Bangla – that similar facts obtain in Oriya, a sister language.


We can get around the paradox, technically, by so defining the notion of ‘bound word’ that only a word obliged to co-occur with a specified neighbour shall count as bound. Exclamatory cognate objects would then stop being bound – they are obliged only to co-occur with a specified type of neighbourhood, not with any specified neighbouring word or words. Nevertheless, it is surely odd that a language should produce an entire class of words only for use in such restricted contexts; surely no known theory of arbitrariness in natural language predicts such a phenomenon. To this extent, we should conclude that we do not yet know how to put a lid on the lexicon as a repository of arbitrariness.

One striking feature of the exclamatory cognate object phenomenon in Bangla pulls together some of the earlier strands in our discussion and helps bring our deliberations to a close. Not only does the phenomenon target the word level and thus trigger an unusual lexical mechanism – specifically, if we are on the right track, a WFS embedded in a PFS. The phenomenon also contributes to the formation of a syntactic exception of the type that attracts what we have called secondary licensing in the syntax, and therefore directly hits the syntactic roof – it helps compose a root sentence. Unusual word type meets unusual clause type at a picnic of exceptions. For us, it is time to celebrate linguistic theory’s ability to find space for a very wide range of facts within a moderately restrictive and well-understood theory of how the modules cooperate. We can look forward to more illumination – especially in the study of strategy shadow.

Appendix 1

We now revisit examples (67)-(71) from section 3; they appeared to be a problem for the strategy shadow hypothesis. Space constraints prevent us from repeating the data here. Readers will have come up with their own conjectures. We would hazard the guess that these examples show formal/mathematical game-playing at work in language. It has long been known that academic users of a language deploying its verbal resources for formal/mathematical purposes – such as mediaeval logicians reshaping Sanskrit to make the work of navya nyaaya (‘the new logic’) possible – routinely stretch these resources beyond what natural language use would permit. Their utterances, usually in the written mode, violate constraints that ordinary language use, outside the context of mathematical game-playing, consistently adheres to.


We realize of course that anti-anti-communist and SOmajbirodhibirodhi are not themselves words invented by mathematicians. We would nonetheless like to suggest that they are playful, constraint-violating imports into natural language from the formal/mathematical realm. Specifically, we are claiming that the (c)-forms in 3.(7)-(10) are not words obtainable by normal morphological means, but loans from a special mathematical register of human activity that lies at the edge of language per se.

Some readers will jump to the conclusion that by making this move we are introducing an escape hatch that amounts to the destruction of falsifiability for our account. To preempt that jump, let us briefly point out that even in a language where much of the morphology is an explicitly mathematics-type exercise, the artificial language Esperanto, certain expected outcomes do not occur. Esperanto allows the formation of words like ŝafido ‘lamb’ from ŝafo ‘sheep’, kaprido ‘kid’ from kapro ‘goat’. Users of the language are creative and playful. Thus, one would have expected filo ‘son’ and nepo ‘grandson’ to have given rise to nepido ‘great-grandson’, nepidido ‘great-great-grandson’ and so on. But what one says in fact is pranepo for ‘great-grandson’ with the same praX device that appears in praavo for ‘great-grandfather’ based on avo ‘grandfather’. Esperanto iterates pra the way English iterates great, but that is about all; it does not permit socialismismo for ‘doctrinal attachment to socialism’ or tajpististo for ‘someone who professionally deals with typists’. It would be a big mistake to imagine that the formal imagination, when left unfettered, does in fact run wild. It does not, and inquiry is needed to find out exactly what constraints it spontaneously observes.

One problem for the account has to do with double causativization, which several languages permit. That only causatives pose a problem indicates that the account as a whole is on the right track; but the problem of causatives remains unsolved.

Acknowledgements

Research for this paper was supported by the Indian Statistical Institute under projects on the generation of a differentiated electronic lexicon for Bangla and on the structure of nominal expressions in Asamiya and Boro. The notion of strategy shadow is inspired by the urban shadow concept that plays a key role in the geographer Malasree Dasgupta’s 1999 doctoral dissertation ‘Process of urbanization and inter-urban relationships: a case


study of selected small and medium towns and metropolitan Hyderabad’ (Centre for Economic and Social Studies/ B.R. Ambedkar Open University, Hyderabad). The assistance of colleagues at the Linguistic Research Unit, including especially the project assistants Sarita Panda and Sanghamitra Khan, is gratefully acknowledged. These results were presented at ICPR lectures (delivered as the national visiting professor of philosophy) at Visvabharati and at GLOW Asia, Hyderabad. Thanks to audiences there and to Noam Chomsky, Alice Davison, Sanjukta Ghosh, Peter Hook, Manfred Sailer and Sundar Sarukkai for responses. The usual disclaimers apply.

Notes

1. Was, were, and am may strike some readers as counterexamples. But number agreement is available in English morphology, as is agreement marking for the first person singular (I go contrasts with he goes).

2. Readers unfamiliar with the place of semiotics in substantivist theory may wish to revisit Dasgupta, Ford & Singh (2000: 177).

3. The formulation of PFSs by Dasgupta, Ford & Singh (2000: 171) stands, but the ‘WFS’ (91b) at 2000: 172 would be formalized today as a Word Extension Strategy. Our ‘WFS’ there gets the phonology wrong – it wrongly predicts /bhulbabe/ with a short [u], whereas the correct output is /bhul#bhabe/ with a phonetically long [u:].

References

Aronoff, Mark. 1976. Word Formation in Generative Grammar. Cambridge: MIT Press.

Aronoff, Mark. 1993. Morphology by itself: Stems and Inflectional Classes. Cambridge: MIT Press.

Chomsky, Noam, and Morris Halle. 1968. The Sound Pattern of English. New York: Harper and Row.

Dasgupta, Probal. 1977. The internal grammar of Bangla compound verbs. Indian Linguistics 38:2.68-85.

Dasgupta, Probal. 1989. Outgrowing Quine: towards substantivism in the theory of translation. International Journal of Translation 1:2.13-41.

Dasgupta, Probal. 1993. The Otherness of English: India’s Auntie Tongue Syndrome. New Delhi/ London/ Thousand Oaks: Sage.

Dasgupta, Probal. 2001. On a vowel template asymmetry in Bangla verbs. In Linguistic Structure and Language Dynamics in South Asia: Papers from the Proceedings of the SALA XVIII Roundtable, Anvita Abbi, R.S. Gupta, Ayesha Kidwai (eds.), 164-81. Delhi: Motilal Banarsidass.

Dasgupta, Probal. 2003. Antiopacity and Bangla causatives: a substantivist approach. In Yearbook of South Asian Languages and Linguistics 2003, Rajendra Singh et al. (eds.), 47-70. Berlin/ New York: Mouton de Gruyter.

Dasgupta, Probal. 2005. Q-baa and Bangla clause structure. In Yearbook of South Asian Languages and Linguistics 2005, Rajendra Singh et al. (eds.), 45-81. Berlin/ New York: Mouton de Gruyter.

Dasgupta, Probal. 2006. Bikiron, aakoronpokkho aar kaayaapokkho. Sahitya Parishat Patrika 113:1-2.147-57.

Dasgupta, Probal. 2007. Advances in substantivist grammatical research. In Research Trends in Lexicography, Sanskrit and Linguistics: Proceedings of the Professor S.M. Katre Birth Centenary Seminar, K.S. Nagaraja et al. (eds.), 151-81. Pune: Deccan College Research Institute.

Dasgupta, Probal. 2008. Transparency and arbitrariness in natural language. In Annual Review of South Asian Languages and Linguistics 2008, Rajendra Singh (ed.), 3-19. Berlin/ New York: Mouton de Gruyter.

Dasgupta, Probal, Alan Ford, and Rajendra Singh. 2000. After Etymology: Towards a Substantivist Linguistics. München: Lincom Europa.

Dasgupta, Probal, and Rajat Ghosh. 2007. The nominal left periphery in Bangla and Asamiya. In Annual Review of South Asian Languages and Linguistics 2007, Rajendra Singh (ed.), 3-29. Berlin/ New York: Mouton de Gruyter.

Ford, Alan, Rajendra Singh, and Gita Martohardjono. 1997. Pace Panini. New York: Peter Lang.

Ghosh, Sanjukta. 2006. Honorificity-marking words of Bangla and Hindi: Classifiers or not? Bhashacintana 1.21-27.

Isačenko, A.V. 1972. Rolj usechenija v russkom slovoobrazovanii [‘the role of truncation in Russian word formation’]. International Journal of Slavic Linguistics and Poetics 15.95-125.

Montaut, Annie. 2008. Reduplication and ‘echo words’ in Hindi-Urdu. In Annual Review of South Asian Languages and Linguistics 2008, Rajendra Singh (ed.), 21-61. Berlin/ New York: Mouton de Gruyter.

Singh, Rajendra, and Stanley Starosta (eds.). 2003. Explorations in Seamless Morphology. New Delhi/ London/ Thousand Oaks: Sage.

Singh, Rajendra. 2005. Reduplication in Modern Hindi and the theory of reduplication. In Studies in Reduplication, Bernhard Hurch (ed.), 263-81. Berlin/ New York: Mouton de Gruyter.

Singh, Rajendra. 2006. Whole Word Morphology. In Encyclopaedia of Language and Linguistics, Second Edition, Vol. 13, Keith Brown (ed.), 578-79. Oxford: Elsevier.

A Taxonomy of EAT Expressions in Marathi*

Peter Edwin Hook and Prashant Pardeshi

This paper is modeled on the ‘lexicographic portraits’ pioneered in the Nineteen-Seventies by Apresjan and Mel’chuk. It consists of three parts: 1. As complete a listing as possible of the EAT-expressions in the Indo-Aryan language Marathi. The argument structures of their closest competing alternates are used to determine their groups and subgroups. 2. A characterization of them along a scale of idiomaticity. 3. An exploration of their syntactic properties. The chief innovation of the paper is the recognition of EAT-expressions as a lexical means for diathesis with power to reshuffle syntactic roles among coreferential actants in a set of synonymous or otherwise closely related constructions.

1. Introduction

The Indo-Aryan language Marathi abounds in idiomatic expressions featuring the verb khā- EAT.1 There are at least 70 of them.2

(1) bāpā-na khastā khāllyā mhaN-un sāra ghar don veLes poT-bhar jevala3
    father-ERG labours ATE say-CP whole house two time stomach-full dined4
    ‘If we were able to fill our stomachs every day it was due to Dad working himself to the bone.’ (IN)5

(2) to asambaddh baD.baD.at dzhokāNDyā khā-t tsāl-at hotā
    he unconnected muttering oscillations EAT-ing walk-ing was
    ‘He staggered along muttering incoherently.’ (IN)


(3) malā je māhit āhe te-ts sāng-un mādzha Doka khā-u nakos
    me what known is that-EMPH say-CP my head EAT-INF not
    ‘Don’t bug me by telling me things I already know.’ (IN)

In this paper we examine on-line and elicited data from Marathi in an attempt to create a taxonomy of idiomatic EAT-expressions.6 In particular we are interested in how notions of correspondence and alternation may be recruited to identify the properties that distinguish one type of EAT-expression from another. Although the taxonomy put forth here is limited to the Indo-Aryan language best known to the co-authors, much of the analysis may be extended to parallel constructions in Hindi-Urdu, Panjabi, Bangla, Oriya, Gujarati and other Indo-Aryan languages of the Indo-Gangetic plain. In comparison to Marathi the phenomenon seems to be more prolific in Hindi-Urdu and Panjabi and more limited in Kashmiri (Hook & Koul 2008) and in Dravidian (Pardeshi et al. 2006). For exploratory remarks on the origin and diachrony of EAT-expressions in South Asian languages, see Hook & Pardeshi (2009).

2. Taxonomizing EAT-expressions in Marathi

We examine two different ways in which the cloud of EAT-expressions can be taxonomized: in Section 2.1. by the scope of idiomaticization; and in Section 2.2. by the type of diathetic reconfiguration that results from their use.

2.1. VP-idioms versus V-idioms

An initial two-way distinction can be made between the semantic extension of the entire VP versus semantic extension limited to the verb itself. Examples of the former are the English EAT CROW or EAT ONE’S HAT in which the expressed relation between the action denoted by the verb and the referent of its object is the normal one of consuming an object per os (‘via the mouth’). It is the meaning of the entire verb phrase that is idiomatic. Marathi also has a number of these:

(4) mi āplā manā-t-lyā manā-t māNDe khā-t hoto
    I self’s mind-in-ADJ mind-in cakes EAT-ing was

‘I was building sand castles in my mind.’ (IN) [Lit: ‘I was eating cakes in my mind.’] In (4) the literal meaning of māNDe khā- is the normal one of ‘eat cakes’. When manāt ‘in the mind’ is added the meaning of the overall VP is metaphorically extended to ‘dream (unrealistically) of something’. (5)

    mājhyā shikShakān-kaDun āLshi.paNā-baddal, murkh.paNā-baddal vagaire orDe khā-t aste
    my teachers-from laziness-for foolishness-for etc. scoldings EAT-ing be.HAB
    ‘I am often scolded by teachers for my laziness, foolishness, etc.’ (IN)

The referent of the direct object orDe in (5) is (unlike cakes) inedible and the relation between it and the verb khā- EAT has little if anything to do with consumption per os. The word orDe in (5) retains its normal lexical sense of ‘scoldings’. The semantic extension is limited to the verb khā-. There is a third and rarer kind of EAT-expression in which the word for an edible object itself is explicitly identified as referring to an abstraction [in (6) by the modifier lāThyāntsā ‘of night-sticks’]:

(6)

    prasangi lāThyān-tsā prasād khā.t mumbai-ne sahā-pāsun-ats vimān-taLā-lā gher-la
    on.occasion batons-GEN prasad EAT-ing Bombay-ERG six-from-EMPH plane-field-DAT surround-PST

‘Occasionally braving blows (from police), the citizens of Bombay surrounded the airfield from 6 AM on.’ (IN)

However, it remains the case that [as in (4)] it is the meaning of the entire VP that is extended as the relation between the verb and its object prasād ‘consecrated food distributed to worshippers at a temple’ refers to the normal one of taking something per os. In example (5) in order to accommodate the direct object orDe ‘scoldings’ the meaning of khā- is not ‘eat’ but ‘suffer; endure’ whereas in (4) and (6) the meaning of khā- remains ‘consume’ in accord with the edibility of the direct objects māNDe and prasād.

2.2. Lexical diathesis (syntactic role change): “Fire caught (in) the hay.” “The hay caught fire.”

In addition to general constructional means of diathetic alternation (such as the choice of the active versus the passive voice), specific lexical items may be used to present the same external situation under more than one syntactic guise (or disguise). This phenomenon has been referred to as lexical conversion (Apresjan 1974: 256ff).

(7) a. There were many changes in immigration policy in America this year. ⇔
    b. Immigration policy in America saw many changes this year.
    c. America saw many changes in immigration policy this year.
    d. This year saw many changes in immigration policy in America.

While various non-subject constituents of (7a) are represented in (7b-d) as the agent-subjects of their clauses, the overall situation-in-the-world referred to is, if not completely the same, essentially so. In English the power to reassign syntactic roles from one constituent to another inheres in a limited number of lexical items (catch, see, witness…). It is not found in near-synonymous or co-hyponymous counterparts (grab, hear, observe):7

(8) a. On Sunday there was music from the wonderful Richard Walters. ⇔
    b. Sunday saw music from the wonderful Richard Walters.8
    c. *Sunday heard music from the wonderful Richard Walters.

Marathi EAT also has the power to reshuffle syntactic roles.9 In the rest of this section we examine what the syntactic role of the agent-subject in each EAT-expression is in its closest competing expressions with a view to discovering how those syntactic roles can be used for grouping and classifying EAT-expressions.

2.2.1. Unintransitives

The term “unintransitive” refers to a class of EAT-expressions in which the subject is a syntactic agent that corresponds to the subject of a synonymous or nearly synonymous intransitive construction.10

2.2.1.1. Transparent unintransitives [i.e. where the contained noun is morphologically related to the competing verb]

In (9) bhāv ‘price’ is the syntactic agent-subject of the grammatically transitive EAT-expression usaLi khā- ‘jump up; (suddenly) increase’.

(9) yā bhāvā-na tsāngli-ts usaLi khā.lli
    this price-ERG good-EMPH spurt ATE
    ‘This price made a sharp jump.’ (IN)

As far as their relation to the referents of bhāv in the real world goes, there is no difference between the transitive usaLi khā- in (9) and its intransitive counterpart usaL- in (10):

(10) telā-tse bhāv usaL-le
     oil-GEN prices spurt-PST
     ‘The prices of oil jumped.’

The near-synonymy of usaLi khā- and usaL- is also reflected in the morphological similarity of the contained noun usaLi and the verb root usaL-. Including usaLi khā- and usaL- ‘jump; suddenly increase’, there are about a dozen EAT-expressions in Marathi exhibiting an alternation with a verb that corresponds morphologically to the contained noun: āpaTi khā- vs. āpaT- ‘drop, suddenly fall’ (said of prices, markets), Dulkyā khā- vs. Dulak- ‘nod off; doze’, helpāTā khā- vs. helpāT- ‘go on a wild goose chase; wander about pointlessly’, Dubkyā khā- vs. Dubak- ‘dip; drown’, butskaLyā khā- vs. butskaL- ‘dip; bob’, kats khā- vs. k(h)ats- ‘fink out; lose heart’, gatske khā- vs. gatsak- ‘be jolted; suffer setbacks’, palaTi khā- vs. palaT- ‘flip, turn over’ (said of vehicles or opinions), X-shi meL khā- vs. X-shi miL- ‘resemble, match X’, gandz khā- vs. gandz- ‘rust’:

(11) lokhanDi khal-batte gandz khā-t paD-lele dis.tāt
     iron mortar-pestle rust EAT-ing fall-en seem
     ‘Iron mortars and pestles are seen lying around rusting.’ (IN)

(12) lokhanDi sāngāDā purN-paNe gandz-lā āhe to kadhi-hi kosaLu shak-el
     iron skeleton complete-ly rust-ed is it ever-also topple can-FUT3sg
     ‘…iron skeleton is rusted out. It can fall down any time…’ (IN)

2.2.1.2. Opaque unintransitives [i.e. where the contained noun is morphologically unrelated to competing verbs]

Not every unintransitive subject of an EAT-expression can be linked to a subject in a transparently related intransitive. For example, the contained noun kolānTyā in (13) is not closely related to an intransitive root:

(13) moTar have-t var uDāli va tin-cār kolānTyā khā-un rastyā-cyā kaDe-lā dzā-un paD.li
     car air-in up flew and 3-4 somersaults EAT-CP road-GEN side-to go-CP fell
     ‘The car flew up in the air, turned 3 or 4 somersaults and fell down by the side of the road.’ (IN)

To identify the subject moTar in (13) as being an unintransitive subject we must have recourse to comparing (13) with a morphologically relatable pair [(14-15)] which itself features close synonyms:11

(14) tyā-lā kāhi kaL-Nyā purvi injinā-ne palTi khālli hoti
     him-DAT anything realize-INF before engine-ERG turn.over ATE was
     ‘Before he knew it the engine had flipped over.’ (IN)

(15) tevaDhyāt bas palaT-li
     just.then bus turn.over-PST.Fsg
     ‘Just then the bus flipped over.’ (IN)

In addition to kolānTyā khā- Marathi has another dozen expressions of this type: Tappā khā- ‘bounce’, gaTāŋgaLi khā- ‘bob up and down’, dzhoke khā- ‘jiggle; swing’, dzhokānDyā khā- ‘stagger’, dam khā- ‘pause, rest’, khasta khā- ‘labour’, TakkeToNape khā- ‘suffer hardship’, Thokrā / lap khā- ‘suffer hardship’, heLkāve khā- ‘swing’, gote khā- ‘founder’, etc.

2.2.2. Undatives

In Indo-Aryan languages the dative case has a number of functions. For the purpose of developing a suitable taxonomy it is sufficient to distinguish three of them: the dative of recipient, the locative dative, and the dative of experience. The locative dative occurs in two-valent and in three-valent constructions.

2.2.2.1. Undative of recipient [three valents {(Z) X-from Y EAT}]

There are a number of EAT-expressions that correspond to alternate expressions in which the syntactic agent position in the EAT-expression has the real-world or participant role of Patient and the syntactic or clausal role of recipient [see the set (16) through (19)]:

(16) (mi) āi-kaDun tsāŋlā tsop-hi khāllā āNi rāgāvaNi-hi
     I mother-by good beating-too ATE and anger-too
     ‘I got a good beating from Mom and a dose of her anger, too.’ (IN)

(17) tyā-lā tsāŋlā tsop paDlā
     him-DAT good beating befell
     ‘He got a good beating.’ (IN)

(18) dādar bhāgā-t uttar bhāratiyān-nā tsop deNyā-t ālā
     Dadar area-LOC north Indians-DAT beating giving-LOC came
     ‘In the area of Dadar, too, North Indians were given a beating.’ (IN)

(19) huks yān-nā ashā-ts vartanā-muLe mare-paryant tsop miLā-lā hotā
     Hooks them-DAT such-EMPH behavior-root die-until beating GOT-PST be.PST
     ‘The Hooks got beaten to death for similar behavior.’ (IN)

Using as criterion the alternate collocability of their contained nouns with de-, paD-, mār-, bas- or miL- as operator, the following EAT-expressions can be put in this class:12 mār khā- ‘be beaten’, bolNi / rāgāviNi / phāyring / dam khā- ‘be scolded’, shivyā khā- ‘be sworn at’, orDā khā- ‘be shouted at’, TomNe khā- ‘be taunted’, X-kaDun hār khā- (vs. hār-) ‘lose to X’, lāth khā- ‘be kicked’, dhakke / ātske khā- ‘be knocked about’, guddi / bukki khā- ‘be punched’, thappaR / tsaprāk / tsāpaT / dhapāTā / taDākhā / tsaTke / phaTke / raTTā / daNkā khā- ‘be slapped, be struck with the (open) hand’.13

2.2.2.2. Instrumental undatives [i. three valents {(Z) X-from Y EAT}]

This set of EAT-expressions is similar to the preceding set. The main difference is that the contained noun in the preceding set refers to an action (mār / tsop ‘beating’) or to some non-tangible entity (shivyā ‘curses’), while in this set it refers to a tangible object, usually a weapon (goLi ‘bullet’, lāThi ‘baton, night-stick’, etc.) or part of the body used as a weapon (bukki ‘fist’). As with the expressions in the previous set, there is an Agent, either overt or conceptually present. If overt, the Agent is usually expressed by a genitive (23) or by an ablative (5). The agent-subject of the EAT-expression corresponds to the real-world Patient and to the syntactic recipient in competing expressions in de- ‘give’ (21), mār- APPLY (22), miL- GET (24), etc.14

(20) lāThyā-kāThyā khāt gujjarān-ni āpli māgNi soDli nāhi
     batons-sticks EAT-ing Gujars-ERG self’s demand gave.up NEG
     ‘Braving cudgels & batons the Gujars did not give up their demand.’

(21) tyān-ni (āmhālā) kāThyā dilyā tar āmhi talvāri de.u
     they-ERG (us.DAT) sticks gave then we swords give
     ‘If they beat (us) with sticks then we’ll resort to swords.’ (IN)

(22) ādzu.bādzu-tse lok tyā-lā … kāThyā mār-u lāg.le
     side-side-GEN people him-DAT … sticks APPLY-INF began
     ‘People around began to beat him with sticks...’ (IN)

(23) mi marāThi-cyā sarān-cyā chaDyā khāllyā hotyā
     I.ERG Marathi-GEN teachers-GEN canes EAT.PRF be.PST
     ‘I had gotten canings from the Marathi teachers.’ (IN)

(24) ek.dā-cyā āmhā sarvān-nā hātā-var don chaDyā miLālyā
     once-GEN us.OBL all.OBL-DAT hand-on two canes GET.PST
     ‘Then each of us got a final two canes on the hand.’ (IN)

Count nouns like chaDi ‘cane’ and lāThi ‘baton’ normally refer to tangible entities. But when they are in EAT-expressions they refer to the blows or injuries delivered, not to the physical weapons themselves. Thus, the phrase sarān-cyā chaDyā in (23) refers to acts of caning administered by the teachers, not to canes made or owned by them. When seen in this light the EAT-expressions that have words denoting weapons as contained nouns can be linked conceptually to those in the previous section (tsāpaT khā- ‘be slapped’, etc.). In addition to lāThi khā- {baton EAT} ‘be cudgeled’, kāThi khā- {stick EAT} ‘be hit with a stick’, and chaDi khā- {cane EAT} ‘be caned’, we find goLi khā- {bullet EAT} ‘be shot’, lāTNa khā- {rolling-pin EAT} ‘be hit with a rolling pin’ > ‘be henpecked’, and dzoDe / caplā khā- {shoes / sandals EAT} ‘be shoe-beaten’ > ‘humiliated; insulted’.

2.2.2.3. Locative undatives [ii. two valents {(Z) Y EAT}]

There is a quasi-delimitable set of EAT-expressions in which the subject corresponds to a locative-dative in the closest competing constructions and the contained noun in the former [for instance, un, vārā, pāus in (25)] corresponds to an unaccusative subject in the latter. The subject samādhi ‘tomb’ of (25) corresponds in participant role to the locative-dative samādhi-lā in (26), while the contained nouns un, vārā, pāus ‘heat, wind, rain’ in (25) correspond to the subject of lāg- ‘attach; reach; touch’ in (26):

(25) samādhi un vārā pāus khā-t ekāki paD.li hoti
     tomb heat wind rain EAT-ing alone fallen was
     ‘Exposed to heat, wind, and rain the tomb lay in solitude...’ (IN)

(26) samādhi-lā un vārā pāus lāgat navhtā
     tomb-DAT heat wind rain ATTACH-ing NEG.PST
     ‘The heat, wind, and rain couldn’t reach the tomb.’ (PP)15

A half dozen EAT-expressions of this type exist in Marathi. The general meaning is ‘be exposed to X’. That exposure may lead to damage: un / dhuL / vārā / Taplyā / dhur / pāus / gārā / lāTā khā- ‘be exposed to (often damaged by) sunlight / dust / wind / drops / smoke / rain / hail / waves’:

(27) mhātāri-na dhuL khā-un māt.kaT dzhāleli tsumbaL rikāmyā pāTi-t Tāk.li
     old.woman-ERG dust EAT-CP soiled be.PSTPART head.ring empty bowl-form.basket-in threw
     ‘The old woman tossed her head ring soiled from dust into the empty bowl-form basket.’ (IN)

2.2.2.4. Undative of experiencer [three valents {(Z) X-GEN Y EAT}]

Like many South Asian languages (Masica 1991: 346-56) Marathi can use the dative case for the experiencer in constructions expressing emotion:

(28) tyā parikShe-ci sarvān-nā dhāsti vāT-at ase
     that exam-GEN everyone-DAT anxiety feel-ing was
     ‘Everyone used to feel anxious about that exam...’ (IN)

Use of a competing EAT-expression allows the speaker to present the experiencer as agent-subject:

(29) tyān-ci dahashat khā-un nānā … paL-āle hote
     them-GEN terror EAT-CP Nana … flee-PST was
     ‘In fear of them Nana had run away ...’ (IN)

Marathi has only two or three of these. Besides dahashat khā- and dhāsti khā- ‘take fright’ there is the (nonce?) hybrid DāuT khā- ‘have doubts’.16

2.2.3. Unagentives

An additional type of syntactic role change is no change at all. We refer to EAT-expressions of this type as unagentives.

2.2.3.1. Unagentives [isolates: two valents {(Z) Y EAT}]

Among Marathi’s EAT-expressions there are two in which the agent-subject (30) has the same agent role that it has in corresponding constructions in ghe- ‘take’ (31) and (33): havā khā- ‘get fresh air’ and shapath khā- ‘swear’:

(30) unhāLyā-t havā khāy-lā dzāy-tsa as-el tari gange-var
     summer-in air EAT-DAT go-GEN be-FUT still Ganges-on
     ‘In the summer if I wanted to get fresh air I would still (go) to the Ganges.’ (IN)

(31) tāji havā ghyāy-lā kāts khāli ke.li
     fresh air take-DAT glass down did
     ‘I rolled down the window to get fresh air.’ (IN)

(32) tyā-ne parat mājhyā vāTe-lā na dzā-Nyā-ci shapath khālli
     he-ERG again my.OBL way-DAT NEG go-INF-GEN oath ATE
     ‘He promised never to get in my way again...’ (IN)

(33) ti-ne tyā vāTe-lā punhā na dzā-Nyā-ci shapath ghetli
     she-ERG that way-DAT again NEG go-INF-GEN oath took
     ‘She promised never to go that way again.’ (IN)

2.2.3.2. Unagentives [cephalophagous {(Z) X-GEN HEAD/SOUL EAT}]

Unlike the preceding set of unagentive EAT-expressions this set features synecdochal use of the contained noun. These are to be seen as VP idioms:

(34) phār Doka khā.lla kā ga mi tudzha ?
     a.lot head ATE QM PHAT I.ERG your
     ‘Did I nag you enough, dear?’ (IN)

Unagentives of this kind in Marathi number a half dozen or so: Doka khā- {head EAT}, bhejā / mendu khā- {brain EAT}, jiv khā- {life EAT}. All mean ‘pester / bother / torment X’ (mostly verbally).

2.2.4. Residue

There are a small number of EAT-expressions that either lack alternating counterparts or out-frequent them. Among these are X-ci hāy khā- {X-GEN alas EAT} ‘take X to heart’, sheN khā- {dung EAT} ‘misbehave’, khār khā- {thorn EAT} ‘be jealous’, khoT khā- {loss EAT} ‘take a loss’, penalTi khā- {penalty EAT} ‘incur a penalty’, (X tsā) bhāv khā- {(X’s) price EAT}. The last of these has ramified into a complex of meanings ‘rise in price; succeed; become proud; upstage, outshine’:17

(35) māgaNi-t vāDh dzhālyā-ne jhenDu-ne tsānglā-ts bhāv khāllā
     demand-in growth became-IN marigold-ERG good-EMPH price ATE
     ‘With a spurt in demand marigolds went up nicely in price.’ (IN)

(36) sinemā bhārat.ā-pekShā-hi vilāyate-t adhik bhāv khā.un ge.lā
     film India-than-also abroad-in more price EAT WENT
     ‘The film enjoyed greater success abroad than even in India.’ (IN)

(37) mi itkā bhāv khāllā tyā divashi ki kāhi vicār.u nakā
     I-ERG so.much price ATE that day-LOC that anything ask NEG
     ‘That day I became so proud that, well, don’t even ask!’ (IN)

(38) hā phoD.Ni-tsā bhāt muL jevaNā-pekShā adhik bhāv khāun dzāto
     this “curry”-GEN rice basic meal-than more price EAT-CP GOES
     ‘This “curried” rice even outshines the basic (part of the) meal!’ (IN)

2.2.5. Summary

Table 1. Types of Marathi EAT-expressions and their populations18

Syntactic type            Subtype                   Example                              N     %
Unintransitive (2.2.1)    Transparent (2.2.1.1)     usaLi khā- (↔ usaL-) ‘jump’          11    15
                          Opaque (2.2.1.2)          kolānTi khā- ‘flip over’             12    16
Undative (2.2.2)          Recipientive (2.2.2.1)    X-tsā tsop khā- ‘be beaten by X’     23    30
                          Instrumental (2.2.2.2)    X-ci chaDi khā- ‘be caned by X’      7     9
                          Locative (2.2.2.3)        vārā khā- ‘be exposed to wind’       8     11
                          Experiential (2.2.2.4)    dhāsti khā- ‘become anxious’         3     4
Unagentive (2.2.3)        Isolated idioms           havā khā- ‘go for a walk’            2     3
                          Cephalophagous            X tsa Doka khā- ‘pester X’           3     4
Residue (2.2.4)                                     bhāv khā- ‘be conceited’             6     8
Total                                                                                    75    100
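
The percentage column of Table 1 can be rechecked from the counts alone. The following is a minimal Python sketch that assumes nothing beyond the N column; the dictionary labels are shorthand introduced here for illustration and are not the authors' terms. Rounding each share independently gives 31% for the recipient undatives and a column sum of 101, whereas the table prints 30, presumably so that the column adds up to 100.

```python
# Counts (N) per subtype, copied from Table 1.
# The labels are illustrative shorthand, not terminology from the paper.
counts = {
    "Unintransitive, transparent": 11,
    "Unintransitive, opaque": 12,
    "Undative, recipient": 23,
    "Undative, instrumental": 7,
    "Undative, locative": 8,
    "Undative, experiencer": 3,
    "Unagentive, isolated idioms": 2,
    "Unagentive, cephalophagous": 3,
    "Residue": 6,
}

total = sum(counts.values())

# Each share is rounded independently to the nearest whole percent.
percentages = {label: round(100 * n / total) for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label:30s} {n:3d} {percentages[label]:4d}%")
print(f"{'Total':30s} {total:3d} {sum(percentages.values()):4d}%")
```

Whichever rounding convention is used, the ranking of the types is unaffected: recipient undatives are the largest class and isolated unagentives the smallest.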

3. EAT-expressions as idioms

In this section we turn our attention to the question of where Marathi’s EAT-expressions fall on a putative scale of idiomaticity. In an ideal world of descriptive precision it would be possible to assay each EAT-expression and assign to it some “index of idiomaticity”. Even though we have not yet developed the analytical tools required for that, we can demonstrate that the syntactic, semantic, and stylistic properties of EAT-expressions place them at various points on an imagined scale of idiomaticity:

1. They are idioms.
2. They are not all idioms to the same degree.
3. Some EAT-expressions are basic. That is, they score close to EAT when used in its most basic sense of consumption of edibles per os.

The expressions termed idioms form a heterogeneous category, and the various attempts to describe it have led to a proliferation of terms, types, and subtypes. In the words of Nunberg et al. (1994: 492): “In actual linguistic discourse and lexicographical practice, ‘idiom’ is applied to a fuzzy category defined on one hand by ostension of prototypical examples like English kick the bucket, take care of NP, or keep tabs on NP, and on the other by implicit opposition to related categories like formulae, fixed phrases, collocations, clichés, sayings, proverbs and allusions – terms which, like ‘idiom’ itself, inhabit the ungoverned country between lay metalanguage and the theoretical terminology of linguistics.”

Asserting that a categorical, single-criterion definition of idiom is misleading, Nunberg et al. propose a prototypical definition with one necessary feature, viz. conventionality, and a list of other typical features: inflexibility, figuration, proverbiality, informality and affect. The features proposed by them are briefly summarized below:

a. Conventionality: ‘Idioms are conventionalized: their meaning or use can’t be predicted, or at least entirely predicted, on the basis of knowledge of the independent conventions that determine the use of their constituents when they appear in isolation from one another.’
b. Inflexibility: idioms are syntactically constrained (e.g. *breeze was shot, *the breeze is hard to shoot, etc.).
c. Figuration: idioms involve various kinds of figuration such as metaphor (take the bull by the horns), metonymy (lend a hand, count heads), hyperbole (not worth the paper it’s printed on), etc.
d. Proverbiality: idioms typically describe social activities (e.g. become restless, talking informally, divulging a secret) in terms of concrete activities (climb the wall, chew the fat, spill the beans).
e. Informality: idioms are typically associated with informal speech styles or registers, popular speech, oral culture.
f. Affect: idioms typically imply a certain evaluative or affective stance toward the things they denote.

Another feature may be added to their list:

g. Markedness: Compared to their non-idiomatic alternants idioms are more marked, hence less frequent.

Feature a (conventionality). Since we have limited our study to those EAT-expressions which do not refer to the literal consumption of edible (or inedible) objects per os this feature may be taken for granted. However, some EAT-expressions are further from the basic lexical sense of EAT than others. Compare (39) with (40):

(39) paN ma-lā ek cintā khā-te āhe
     but me-DAT one worry EAT-ing is
     ‘Worry about one thing, though, is eating away at me…’ (IN)

(40) ma-lā hā phon ālā tevhãã-ts mi DāuT khāllā hotā
     me-DAT this phone came then-EMPH I doubt ATE was
     ‘I felt suspicious at the very time I got this call...’ (IN)

For speakers of many languages including those of Western Europe the idiom in (39) is both similar to ones in their mother tongues and easy to relate to EAT’s basic meaning of consumption, while the idiom in (40), in which EAT has assumed the sense of ‘feel’, lies at several removes.

Feature b (inflexibility). EAT-expressions display some degree of syntactic flexibility. For instance, they do allow the passive:

(41) mārā.māri-madhye mār dilā-khāllā dzāto to donhi bādzun-ni
     fighting-in beating given-EAT.en goes that both sides-INSTR
     ‘In a fight the beating that is given and taken is on both sides.’ (IN)

However, (with the exception of EAT in its basic sense of consumption and unagentives like utsal khā- ‘revive, recover’ and shapath khā- ‘take an oath’), passives of EAT-expressions are limited to the passive of indefinite agency [as in (41)]. The normal prototypical passive [as in (42)] does not apply to non-agentive EAT-expressions.

(42) bhān Thev-un-ats hi utsal khālli geli
     awareness put-CP-EMPH this jump EAT.en WENT
     ‘It was in full awareness (of that) that this overture was made.’ (IN)

Features c and d (figuration and proverbiality). Though metaphor is seen in EAT-expressions of the VP type [see (4) and (34)], most of the EAT-expressions discussed here do not usually involve figurative speech (metaphor, metonymy, hyperbole, etc.). Nor are they felt to be proverbs or pressed into service as such.

Features e and f (informality and affect). There is evidence that EAT-expressions are more vivid and less formal than most of their competing constructional counterparts. As such they tend to appear in lively headlines of news stories:

(43) “pavārān-ni kats khālli”
     Pawar-ERG cornering ATE
     ‘“Pawar finks out.”’ [headline in 28 Oct 2004 Lok Satta (IN)]

(44) jorj bush-ne khālle dzoDe
     George Bush-ERG ATE shoes
     ‘George Bush “eats” shoes.’ [headline in mywebduniya.com (IN)]

Feature g (markedness). This feature is linkable to features e and f. EAT-expressions are almost without exception less frequent than their closest competing counterparts. For example, palTi khālli [as in (14)]: 7 hits vs. palaTli [as in (15)]: 19 hits; gandz khāt [as in (11)]: 3 hits vs. gandzat: 23 hits. (Counts of Google hits totaling less than twenty-five have been ‘verified’.)19

Individual EAT-expressions may exhibit varying degrees of idiomaticization. For instance, the expression dhuL khā- {dust EAT} may express simple exposure to dust as in (27). As an idiom, however, dhuL khā- means more than just ‘gather dust’. It is a metaphor for ‘be ignored / unused’:

(45) pālike.tse anek prakalp dhuL khā-t paD.le āhet
     municipality’s many projects dust EAT-ing fallen are
     ‘Many municipality projects lie gathering dust….’ (IN)

4. Modulation of transitivity

It is not hard to see that an EAT-expression (46) is higher in transitivity than the corresponding intransitive (47). Note the presence of the ergative case in (46) versus its absence in (47):

(46) apaghātā-nantar ya jip-ne palTi khālli hoti
     accident-after this jeep-ERG turn.over EATen PSTPRF
     ‘After the accident the jeep had flipped over.’ (IN)

(47) indirānagar yethe rikShā khaDDyā-t palaT-li
     Indiranagar here rickshaw-NOM ditch-in turned.over
     ‘A rickshaw flipped over in Indiranagar.’ (IN)

However, as a group the transitivity of EAT-expressions is lower than that of EAT when used in its basic sense, even though the verb EAT itself is a ‘semi-transitive’ (Masica 1991: 476, fn 16). The differences between the real-world situations denoted by EAT in its basic sense as opposed to the situations represented by non-agentive EAT-expressions are observable in their differential preferences for vector verb. (Vector verbs in Indo-Aryan are optional auxiliary verbs, more or less grammaticalized reflexes of basic verbs of motion, that function to express perfective aspect, manner, and speaker attitude.)31 The vector ghe- TAKE, itself derived from a transitive verb, characteristically occurs with transitive main verbs: pakaD- ‘catch’, māg- ‘ask for’, kāDh- ‘take out’, etc., while dzā- GO is the vector normally used with intransitives: uTh- ‘get up’, nigh- ‘go out’, dzaL- ‘burn’, etc. When the verb khā-, itself semi-transitive, is used in its basic lexical sense, it can occur with vector dzā- GO [as in (48)]:

(48) ādhi manasvi iDli khā-un geli hoti
     before Manasvi idli eat-CP WENT PSTPRF
     ‘Manasvi had already eaten an idli.’ (IN)

However, it occurs more frequently with vector ghe- TAKE (49):

(49) tyā-ne punhā cimNi-sārkha thoDasa khā-un ghetla
     he-ERG again sparrow-like a.little.bit eat-CP TOOK
     ‘Again he ate a tiny bit, like a sparrow.’ (IN)

However, in non-agentive EAT-expressions like bhāv khā- ‘get proud’, even though co-occurrence with vector ghe- TAKE is not excluded [see (50)], vector dzā- GO is the one preferred [as in (51)]:

(50) asmādikān-ni ... nehami-pramāNe khup bhāv khā-un ghetlā
     egoists-ERG ... always-as a.lot price EAT-CP TOOK
     ‘The egoists, as always, swelled up with pride.’ (IN)

(51) sarvān-t bhāv khā-un geli ti ... hi lāvNi
     all-LOC price EAT-CP WENT that ... this lavani
     ‘The one that outshone all the others was this lavani.’ (IN)

This difference can be clearly seen by counting. Googling (and verifying20) relevant tokens yields the counts in Table 2.

Table 2. Differential effect of idiomaticity on vector choice

                                                 co-occurring vector
                                                 ghe-      dzā-
basic verb ‘eat’     X khā- ‘eat [X = food]’     24        3
idiomatic EAT        bhāv khā- ‘EAT [price]’     3         34
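
The asymmetry in Table 2 can be expressed as proportions. The sketch below is only an illustrative recalculation from the four cell counts; the variable names and the choice of an odds ratio as a summary measure are assumptions made here, not part of the authors' analysis. The ASCII keys "ghe" and "dza" stand for the vectors ghe- TAKE and dzā- GO.

```python
# Token counts copied from Table 2.
# Rows: sense of the verb; columns: co-occurring vector verb.
# "ghe" = vector ghe- TAKE, "dza" = vector dza- GO (ASCII keys for convenience).
counts = {
    "basic": {"ghe": 24, "dza": 3},       # X kha- 'eat [X = food]'
    "idiomatic": {"ghe": 3, "dza": 34},   # bhav kha- 'EAT [price]'
}


def share(row, vector):
    """Proportion of a row's tokens that occur with the given vector."""
    return row[vector] / sum(row.values())


ghe_share_basic = share(counts["basic"], "ghe")          # 24 / 27
dza_share_idiomatic = share(counts["idiomatic"], "dza")  # 34 / 37

# Odds ratio of the 2x2 table: (24 * 34) / (3 * 3).
odds_ratio = (counts["basic"]["ghe"] * counts["idiomatic"]["dza"]) / (
    counts["basic"]["dza"] * counts["idiomatic"]["ghe"]
)

print(f"ghe- with basic 'eat':      {ghe_share_basic:.1%}")
print(f"dza- with idiomatic EAT:    {dza_share_idiomatic:.1%}")
print(f"odds ratio (2x2 table):     {odds_ratio:.1f}")
```

On these counts roughly nine tokens in ten of basic ‘eat’ take ghe-, while roughly nine in ten of idiomatic bhāv khā- take dzā-, which is the reversal described in the text. With only 67 tokens in all, the figures are indicative rather than precise.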

5. Concluding remarks

EAT-expressions, especially unintransitive and undative ones, are rare in the languages of Western Europe. As we (working together with many others) have shown elsewhere (Pardeshi et al. 2006), EAT-expressions abound in the languages of South, Central, and Northeast Asia. Indeed, they show a geographical distribution that is largely congruent with Masica’s Indo-Turanian linguistic area (Masica 2001). Lexical parallels with EAT-expressions in Persian point to that language as being their most likely source in Marathi and other Indo-Aryan languages, either through direct borrowing or through stimulus diffusion (Hook and Pardeshi 2009, Kachru 1982). The properties and functions of EAT-expressions in Marathi include the power to reshuffle syntactic roles of coreferential NPs, creating thereby sets of closely related (competing or conversive) constructions, the power to increase their syntactic transitivity, and the power to enhance the vividness and salience of expression.

Notes

* The genesis of this paper goes back almost a decade as a section in http://www.umich.edu/~pehook/marKa.html. Input and encouragement came from Terry Verma, Tahsin Siddiqi, Kusum Jain, Alice Davison, Griff Chausee, Elena Bashir, Mehr Farooqi and her father. Some Marathi data were checked by Hari Damle and Shrikant Atre. Needless to say, not everyone agrees with what we have written here and we alone are responsible for errors. We are grateful to Jim Nye for special technical assistance. Improvements in clarity of presentation are due to the careful reading of an earlier draft by Terry Verma. The research reported here was supported in part by a grant from the Mellon Foundation and a Grant-in-aid (#18520314) from the Ministry of Education, Culture, Sports, Science and Technology (MOMBUSHO), Govt. of Japan.

1. The NIA root khā- is the modern reflex of OIA khād- ‘chew, bite; eat’ (Turner #3865). It is used in the basic lexical sense of ‘consume (edibles)’ in nearly all of NIA including Romani, Dumaki, Kashmiri, Nepali, Sinhala, and Dhivehi. The few exceptions are NIA languages in northern Pakistan and Afghanistan: Kalasha, Khowar, Pashai. (p.c. Elena Bashir)

2. Most of our data are from the Internet or from Molesworth (1857).

3. In transcribing data we use capitalization to indicate retroflexion; the digraphs sh, Sh, and dz, and the trigraph dzh we use for fricatives and affricates; and the macron for contrastive length in vowels (ãã, ĩĩ, and ũũ are the nasalized equivalents of ā, ī, and ū).

4. Abbreviations used in this paper: 3 = third person, ACC = accusative, ADJ = adjectivizer, CAUS = causative, CONT = continuative, CP = conjunctive participle, CTF = counter-to-fact, DAT = dative, DEF = default, EMPH = emphatic, ERG = ergative, F = feminine, FUT = future, GEN = genitive, HAB = habitual, IMPER = imperative, INF = infinitive, INSTR = instrumental, LOC = locative, M = masculine, NEG = negative, NOM = nominative, OBL = oblique, PART = participle, PASS = passive, PHAT = phatic particle, pl = plural, PRES = present, PRF = perfect, PROG = progressive, PST = past, QM = question marker, SBJNCTV = subjunctive, sg = singular.

5. The abbreviation IN indicates that the data comes from the Internet.

6. In this paper we are not concerned with and will not discuss EAT-expressions that denote consumption per os of non-edibles (eating poison, dirt, coins, etc.) or that refer to consumption (not per os) of time, money, fuel:

(a) sadar kasoTī sāmnyā-t andhūk prakāshā-ne barāts veL khā-llā hotā
    said test match-in dim light-ERG quite.a.lot time EAT.PRF be.PST
    ‘In the said test cricket match dim light ate up a lot of game time.’ (IN)

or to accepting or living off wealth provided by others (“eating” bribes, interest, pimping, etc.):

(b) tsāngale posting miLaviNyā-sāThi paise khā.lle dzātāt ashī vandatā āhe
    better posting get-for money EATen go.PRES.pl such rumour is
    ‘It is rumoured that to get a better posting bribes are being taken.’ (IN)

(c) tumcyā-dzavaL thoDā.phār paisā asel tar bank-et Thevā aaNī vyādz khā-un madze.t dzagā
    your-close.by some money be.FUT then bank-in keep.IMPER and interest EAT-CP happily live.IMPER
    ‘If you have some money, keep it in the bank & live off the interest.’ (IN)

or in which the verb khā- denotes destruction:

(d) vyavahār ashā-prakāre āpale sāre bhānDval bagh.tā-bagh.tā khā-un Taak-u shakto
    business such-manner.LOC our all capital watching-watching EAT-CP throw-INF can
    ‘In this way a business can quickly destroy all our capital.’ (IN)

7. The cohyponym of X is a non-synonym that shares a hypernym with X. Thus see and witness are (near) synonyms while see and hear are cohyponyms of the hypernym perceive. The line dividing ‘near-synonym’ from ‘cohyponym’ may be fuzzy.

8. If this phrasing seems rather British to North American and Indian readers, it is because it is British: (www.oxfordmail.co.uk/leisure/musics).

9. For an earlier discussion of shifting syntactic roles from one actant to another see section 19 of Patanjali's mahābhāShya.

10. The term “syntactic agent” may require some explanation and illustration. A syntactic agent is to be distinguished from the participant role “Agent”. Unlike definitions of the latter there are no necessary semantic conditions (such as volitionality or causality) for designating a noun or noun phrase as syntactic agent. The only requirement is that it should be the subject of a transitive verb (or predicate) that is not in the passive voice. Thus in (a) the noun phrase [what is pleasurable] is the syntactic agent-subject of the clause even though it meets none of the conditions needed to merit the Agent participant role:

(a) If [what is pleasurable] exceeds [what is painful]…

To be syntactic agent-subject of (a) it is sufficient that [what is pleasurable] be the object of by in the corresponding passive:

(b) If [what is painful] is exceeded by [what is pleasurable]…

11. For example, there is a verb kol- ‘toss up; throw aside’ which may be related to kolānTi ‘somersault’.

12. Marathi uses the root mār- ‘strike’ as a verbalizer more commonly than Hindi-Urdu does: H-U tānā mār-: tānā de- = 3:2, whereas Mar tomNā mār-: tomNā de- = 5:1. Use of a verb SIT (Marathi bas-) as verbalizer is absent in Hindi:

(a) ātā ravivāri sakāLi mi fon kela asta tari malā-ts shivyā bas.lyā astyā
    now Sunday morning I.Erg phone made had.CTF still me-Emph curses SAT had.CTF
    ‘Now if I'd made the call Sunday morning I’d still have been the one to get chewed out.’ (parag-blog.blogspot.com/)

13. As can be seen from this list Marathi has many expressions for hitting and striking with the hand and its parts (palm, fist, knuckles, etc.), a repertoire that English is insufficiently rich to render.

14. Constructions in de- GIVE do not always appear as close alternates to those in khā- in this set because with the contained noun referring (as it seems to do) to a tangible object the interpretation of de- is only the basic lexical sense of ‘give’. For a discussion of why verbs of striking in Indo-Aryan take direct objects of the instrument (and therefore should be thought of as meaning something like ‘apply’ rather than ‘hit’) see Hook and Koul (2002).

15. The abbreviation “PP” indicates that Prashant Pardeshi is the data source.

16. The Marathi speakers that we have consulted all reject DāuT khā- ‘have doubts’. However, there are instances of it in Bokil’s memoir shāLā (School):

(a) ti.cyā bahiNi-na nakki.ts DāuT khāllā as-tā
    her sister-ERG certainly doubt EAT-en be-CTF
    ‘Her sister would certainly have been suspicious.’

17. In the sense of ‘rise in price’ an “ungenitive” analysis may be possible for bhāv khā-. Compare (35) with (a):

(a) ādz bādzār-āt jhenDu-tsā bhāv dupTī.ne vāDhlā
    today market-in marigold-Gen prices twofold increased
    ‘Prices of marigolds have gone up twofold in the market today.’ (PP)

But an ungenitive analysis will not accommodate the other meanings of bhāv khā- seen in (36), (37) and (38).

18. Our lists show EAT-expressions to be about twice as numerous in Hindi-Urdu as in Marathi. That difference is in large part a matter of differing histories (Hook and Pardeshi 2009).

19. In relying on Google for quantitative comparisons of this kind it is necessary to verify that 1) all the hits are relevant (no Nepali, Marwari, Chhattisgarhi, Haryanvi, etc.) and 2) there are no double or triple or n-tuple countings of the same passages.

20. See footnote 19.

References

Apresjan, Juri
  2000 Systematic Lexicography. Oxford University Press.
  1974 Leksičeskie konversivy [Lexical conversives]. In Leksičeskaja semantika [Lexical semantics], 256-83. Moscow: “Nauka”.
Bhaskararao, Peri, and K. V. Subbarao (eds.)
  2001 South Asia yearbook 2001: Papers from the symposium on South Asian languages: contact, convergence and typology. Delhi: SAGE Publications.
Hook, Peter
  1991 The emergence of perfective aspect in Indo-Aryan. In Approaches to Grammaticalization, B. Heine and E. Traugott (eds.), 59-89. Amsterdam: John Benjamins.
  1979 Hindi structures: intermediate level. Ann Arbor: CSSEAS, University of Michigan.
  1974 The Compound Verb in Hindi. Ann Arbor: CSSEAS, University of Michigan.
Hook, Peter, and Omkar Koul
  2002 The verb is not an exception. In Topics in Kashmiri Linguistics, Omkar N. Koul and Kashi Wali (eds.), 143-52. Delhi: Creative Publishers.
Hook, Peter, and Hsin-hsin Liang
  Ms. EAT-expressions in Hindi-Urdu and Mandarin.
Hook, Peter, and Prashant Pardeshi
  2009 The Semantic Evolution of EAT-Expressions: Ways and Byways. In The Linguistics of Eating and Drinking [TSL 84], John Newman (ed.), 153-172. Amsterdam: John Benjamins.
Kachru, Yamuna
  1982 Conjunct verbs in Hindi-Urdu and Persian. In South Asian Review 6.3: 117-126.
Masica, Colin P.
  2001 The definition and significance of linguistic areas. In Bhaskararao & Subbarao (eds.), 205-267.
  1991 The Indo-Aryan Languages. Cambridge University Press.
  1981 Identified object marking in Hindi. In Topics in Hindi Linguistics, O.N. Koul (ed.), 16-50. Delhi: Bahri Publications.
  1976 Defining a Linguistic Area: South Asia. Chicago: University of Chicago Press.
Molesworth, James T.
  1975 Molesworth’s Marathi-English Dictionary (Reprint). Poona: Shubhada-Saraswat. Original edition, Bombay Education Society, 1857.
Nunberg, G., I. A. Sag and T. Wasow
  1994 Idioms. Language 70: 491-538.
Pardeshi, Prashant; Peter Hook, Colin Masica, Hajar Babai, Shinji Ido, Kaoru Horie, Jambalsuren Dorjkhand, Joungmin Kim, Kanako Mori, Dileep Chandralal, Omkar N. Koul, Hsin-hsin Liang, Yutaro Murakami, Kingkarn Thepkanjana, Qing-Mei Li, Prasad Vasireddi, Terry Varma
  2006 Toward a Geotypology of EAT-expressions in Languages of Asia: Visualizing Areal Patterns through WALS. Gengo Kenkyu 130: 89-108.
Patanjali’s mahābhāShya
  1880-85 Franz Kielhorn (ed.). 3rd revised edition by K. V. Abhyankar, 1962-1972. Poona: Bhandarkar Oriental Research Institute.
Turner, Ralph L.
  1966-69 A Comparative Dictionary of the Indo-Aryan Languages. 3 vols. Oxford: Oxford University Press.

Morpheme-specific Exceptional Processes and Emergent Unmarkedness in Vowel Harmony

Shakuntala Mahanta

In this paper I discuss exceptional occurrences in the nominal and verbal morphology of Assamese and the role of indexed markedness constraints in accounting for these exceptions. Assamese vowel harmony is a ‘directional’ right-to-left regressive harmony system, which normally ignores morphological boundaries. In contrast to the general process of harmony, this work shows that some morphemes can influence vowel harmony and result in exceptional patterns. Following Pater (2006), it is shown that indexation of markedness constraints can account for these exceptional occurrences. This work also shows that the caveat in Pater (2006) about such constraints’ ability to subvert the universal constraint ranking ROOT FAITH >> AFFIX FAITH is indeed borne out. Consequently, the Assamese examples show that indexed constraints lead to an exceptional alternation where [±Back] harmony occurs only in the root and the suffixal [±Back] values remain unaltered. I argue that this reversal is the result of a confluence of several factors, leading to the theoretically motivated observation that some unexpected processes in OT may be emergent. The paper also deals with another exceptional pattern, where the combined outcome of vowel harmony and a morphological requirement of avoiding vowel-vowel sequences results in an exceptional pattern of syllabification. Contextualising these occurrences to the need for indexed markedness constraints, it is shown that the indexation of *HIATUS is able to analyse these occurrences. Furthermore, it is shown that hiatus avoidance can be the result of varying markedness requirements and need not be the result of a single motivation of unmarkedness dictated by the overarching grammatical structure. *HIATUS is satisfied by both epenthesis and vowel deletion, and in one instance a morpheme even prefers a hiatal situation over non-hiatus. The exceptional occurrences show that markedness requirements and reversals may often be emergent, and therefore the result of a particular relation that a specific morpheme construes to be the least marked.


1. Introduction

In the first part of the paper I present data and an analysis of the exceptional triggering of harmony by the two morphemes /–iyɑ/ and /–uwɑ/ in Assamese. This exceptional triggering can be characterised as morphologically induced harmony, which is obtained at the cost of flouting the highly ranked phonological constraint IDENT[Low] (which prevents any alteration of the low vowel /ɑ/). In the first case, a morpheme triggers exceptional alteration of the otherwise non-participating vowel /ɑ/. The application of harmony in this context violates the constraint IDENT[Low], which is otherwise highly ranked in the phonology of Assamese. In the second case, the presence of /–iyɑ/ and /–uwɑ/ leads to the emergence of front harmony whenever there are preceding front vowels, under the same conditions as described in the first case. In the rest of the paper I describe these morpho-phonological interactions and discuss some aspects of exceptional morpho-phonemic interactions, which have consequences for a theory aiming to handle exceptions in general, and for Optimality Theory (Prince and Smolensky 1993/2004) in particular.

Section 1 presents a general background to Assamese vowel harmony and provides an overview of the conditions which lead to exceptionality. Section 2 deals with data and problems relating to the exceptional triggering of harmony by the affixes /–iyɑ/ and /–uwɑ/ in Assamese. Section 3 presents an elaborate analysis of the exceptional problems discussed in section 2. This section lays down the theoretical precept of locality in exceptionality, and deals with the problems keeping locality as the guiding criterion in a theory of constraint indexation. This section also shows how emergent front harmony can be handled within a theory of lexical indexation. Section 4 briefly discusses alternative theoretical approaches and the paper ends with a conclusion in section 4.

1.1. Introduction to Assamese vowel harmony

Assamese has an eight-vowel inventory consisting of /i, u, ʊ, e, o, ɛ, ɔ, ɑ/. Assamese harmony is regressive and always triggered by an immediately following /i/ or /u/. The harmony constraint produces the alternations /ɛ/ → [e], /ɔ/ → [o], and /ʊ/ → [u], while the vowel /ɑ/ is opaque to vowel harmony. However, /ɑ/ exceptionally undergoes harmony under the influence of the morphemes /–iyɑ/ and /–uwɑ/. Assamese vowel harmony is typically word-based, excluding compounded words and larger morpho-syntactic domains. Vowel harmony in Assamese is therefore a right-to-left process and there are no morphologically significant positions which either trigger or target it. The Assamese vowel harmony examples are given below:

(1)

Vowel harmony triggered by the /i/ suffix
   Root     Gloss     Suffix  Derivation  Gloss
a. tɛl      ‘oil’     i       teli        ‘oily’
b. ʊpɔr     ‘above’   i       upori       ‘in addition’
c. kʰɔrɔs   ‘spend’   i       kʰorosi     ‘spendthrift’

In this section I present a basic OT analysis of Assamese vowel harmony and I define the relevant constraints below:

(2) *[–ATR][+ATR]
    Assign a violation mark to a candidate containing a [–ATR] segment followed by a [+ATR] segment.

(3) IDENT[ATR]
    A candidate containing a segment in the output and its correspondent in the input must have identical specifications for [ATR].

(4) IDENT[High]
    A candidate containing a segment in the output and its correspondent in the input must have identical specifications for [High].

(5) *[+High, –ATR, –Back]
    Assign a violation mark to a candidate containing vowels with the feature values [–ATR], [+High] and [–Back].

(6) Vowel harmony in Assamese

/kɔr/+/i/ ‘do’ INF  IDENT[±High]  *[–ATR, +High, –Back]  *[–ATR][+ATR]  IDENT[±ATR]
a.   kɔri                                                *!
b. ☞ kori                                                               *
c.   kuri           *!                                                  *
d.   kʊrɪ                         *!                                    *
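The evaluation in tableau (6) can be replayed mechanically. The following Python sketch is only an illustration and not part of the original analysis: the feature table, the function names (`ident`, `agree_atr`, `evaluate`) and the treatment of a ranking as a left-to-right tuple comparison are assumptions introduced here. In particular, the two undominated constraints are placed in an arbitrary order relative to each other, and input and output vowels are assumed to correspond position by position.

```python
# Assumed binary feature values for the vowels used in tableau (6).
FEATURES = {
    "i": {"ATR": True,  "High": True,  "Back": False},
    "u": {"ATR": True,  "High": True,  "Back": True},
    "ɪ": {"ATR": False, "High": True,  "Back": False},
    "ʊ": {"ATR": False, "High": True,  "Back": True},
    "e": {"ATR": True,  "High": False, "Back": False},
    "o": {"ATR": True,  "High": False, "Back": True},
    "ɛ": {"ATR": False, "High": False, "Back": False},
    "ɔ": {"ATR": False, "High": False, "Back": True},
    "ɑ": {"ATR": False, "High": False, "Back": True},
}


def vowels(form):
    """Return the feature bundles of the vowels in a form, left to right."""
    return [FEATURES[ch] for ch in form if ch in FEATURES]


def ident(feature):
    """Build an IDENT[feature] constraint: one mark per changed vowel."""
    def constraint(inp, out):
        return sum(a[feature] != b[feature]
                   for a, b in zip(vowels(inp), vowels(out)))
    return constraint


def no_high_lax_front(inp, out):
    """*[+High, -ATR, -Back]: one mark per output vowel of that type."""
    return sum(v["High"] and not v["ATR"] and not v["Back"]
               for v in vowels(out))


def agree_atr(inp, out):
    """*[-ATR][+ATR]: one mark per [-ATR] vowel directly followed by a [+ATR] vowel."""
    vs = vowels(out)
    return sum(not a["ATR"] and b["ATR"] for a, b in zip(vs, vs[1:]))


# Ranking as in tableau (6), highest-ranked constraint first.
RANKING = [ident("High"), no_high_lax_front, agree_atr, ident("ATR")]


def profile(inp, out):
    """Violation counts of a candidate, ordered by the ranking."""
    return tuple(c(inp, out) for c in RANKING)


def evaluate(inp, candidates):
    """Strict domination: the lexicographically smallest profile wins."""
    return min(candidates, key=lambda out: profile(inp, out))


winner = evaluate("kɔri", ["kɔri", "kori", "kuri", "kʊrɪ"])
print(winner)
```

Comparing violation profiles as tuples is one convenient way to model strict domination, since a single mark on a higher-ranked constraint outweighs any number of marks further down the hierarchy.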


The tableau above shows that the constraints IDENT[High] and *[–ATR, +High, –Back] are undominated in Assamese1. The high-ranking constraint IDENT[High] does not allow height values to alter. The harmony-driving constraint *[–ATR][+ATR] is ranked above IDENT[ATR], leading to the alteration of [ATR] values only. The constraint *[–ATR][+ATR] plays a crucial role in prohibiting output sequences with an [ATR] mismatch in their feature specifications.

1.2. Blocking by /ɑ/ and /ɑ/-raising

Whereas /ɛ/ alternates with [e], /ɔ/ alternates with [o] and /ʊ/ alternates with [u], /ɑ/ is a non-alternating vowel in the inventory. Therefore, /ɑ/ behaves as a phonologically opaque vowel. It is protected by the faithfulness constraint IDENT[Low], as /ɑ/’s involvement in harmony would also result in the violation of *[+ATR +Low], an undominated constraint, because low [+ATR] vowels are absent from the surface inventory2.

(7)

Assamese trisyllables with medial /ɑ/ and final /i/
   Root    Gloss     Suffix  Derivation  Gloss
a. kɔpɑh   ‘cotton’  –i      kɔpɑhi      ‘made of cotton’
b. zʊkɑr   ‘shake’   –i      zʊkɑri      ‘shake’ (inf)

The examples in (7) represent words in which /ɑ / occurs word-medially and there is no agreement with the [+ATR] value of the triggering suffixal vowel. Instead, the leftmost vowel is [–ATR] and has not been influenced by the [+ATR] vowel in the right periphery. There are also various suffixes with /ɑ /, which result in opacity and the ones shown below are some such examples: (8)

/–ɑru/ and /–ɑli/ block harmony
   Root   Gloss    Suffix  Derivation  Gloss
a. lɛkʰ   ‘write’  –ɑru    lɛkʰɑru     ‘writer’
b. gɔz    ‘grow’   –ɑli    gɔzɑli      ‘sprout’

The tableau below shows how IDENT [Low] is responsible for blocking harmony.

(9) /ɑ/ remains unaltered in the presence of a following trigger

Input: /kɔpɑh/+/i/ ‘cotton’  *[+ATR +Low]  IDENT[Low]  *[–ATR][+ATR]  *[–High +ATR]  IDENT[ATR]
a. ☞ kɔpɑhi                                            *
b.   kopohi                                *!                         *              *
c.   kopæhi                  *!                                       *              *

The inertness of /ɑ/ to the harmony process is accounted for by high-ranked IDENT[Low] and *[+ATR +Low]. These constraints are ranked higher than the harmony-driving constraint *[–ATR][+ATR], and therefore the candidate in (9)-a, which does not undergo any /ɑ/ alteration, is the winning candidate.

1.3. /ɑ/-raising: Local exceptional triggering

In this section, I will first show in detail the environments in which the exceptional morpho-phonological patterns alluded to above occur. Before going into the details of exceptionality, I will draw examples from the regular morphology to show the operation of vowel harmony in a regular derived environment domain. In the examples in (10) and (11) below, the [High] vowels in the suffixes trigger [+ATR] harmony in the preceding root/stem.

(10) Monosyllabic roots and regular vowel harmony
   Root   Gloss    Suffix  Derived   Gloss
a. mɛr    ‘wind’   –uwɑ    meruwɑ    ‘wind’ (causative)
b. dʰʊl   ‘drum’   –iyɑ    dʰuliyɑ   ‘drummer’
c. tɛl    ‘oil’    –iyɑ    teliyɑ    ‘oily’

(11) Regular vowel harmony in bisyllabic stems
   Root    Gloss       Suffix  Derived     Gloss
a. bɔyɔx   ‘age’       –iyɑ    boyoxiyɑ    ‘aged’
b. bɔsɔr   ‘one year’  –i      bosori      ‘yearly’
c. gʊbɔr   ‘dung’      –uwɑ    guboruwɑ    ‘fly’ (spread like dung)

The examples above show that there is ample evidence that the adjectival suffixes /–iyɑ/ and /–uwɑ/ trigger regular [ATR] harmony in the preceding [–ATR] vowels /ɛ/, /ɔ/ and /ʊ/.


2. /ɑ/-raising: Local exceptional triggering

/ɑ/-raising occurs when the two affixes /–iyɑ/ and /–uwɑ/ trigger harmony in morphemes containing /ɑ/. In monosyllabic stems, /ɑ/ always adapts itself to /o/ when followed by /–iyɑ/ or /–uwɑ/.

(12) /ɑ/-raising in monosyllabic roots
   Root   Gloss      Suffix  Derivation  Gloss
a. sɑl    ‘roof’     –iyɑ    soliyɑ      ‘roof-ed’
b. dɑl    ‘branch’   –iyɑ    doliyɑ      ‘branch-ed’
c. dʰɑr   ‘debt’     –uwɑ    dʰoruwɑ     ‘debtor’
d. mɑr    ‘beat’ (v) –uwɑ    moruwɑ      ‘beat’ (causative)

The data below show that /ɑ/-raising is restricted to the vowel adjacent to the triggering morpheme. /ɑ/-raising does not occur when /ɑ/ is not adjacent to the triggering vowel:

(13) /ɑ/ does not change when it is not adjacent to the triggering vowel3
   Root    Gloss     Suffix  Derivation  Gloss
a. pɑtɔl   ‘light’   –iyɑ    pɑtoliyɑ    ‘lightly’
b. ɑpɔd    ‘danger’  –iyɑ    ɑpodiyɑ     ‘in danger’
c. ɑlɑx    ‘luxury’  –uwɑ    ɑloxuwɑ     ‘pampered’
d. ɑdʰɑ    ‘half’    –uwɑ    ɑdʰoruwɑ    ‘halved’

Examples (13) a–b have the segmental composition /CɑCɔ../; harmony triggered by /–iyɑ/ affects only the immediately preceding [–ATR] vowel /ɔ/, and the non-adjacent /ɑ/ does not undergo harmony. However, this is not different from the behaviour of similar sequences when harmony is triggered by suffixes other than /–iyɑ/ and /–uwɑ/ (see examples in (18) and (19)), as they would all produce the same result. The local triggering behaviour of /–iyɑ/ and /–uwɑ/ is exemplified very clearly by examples (13) c–d. In these cases, there are two instances of /ɑ/, but only the vowel adjacent to the triggering vowel undergoes harmony. /ɑ/-raising triggered by /–iyɑ/ and /–uwɑ/4 violates IDENT[Low], but IDENT[Low] violations are as minimal as possible, because /ɑ/-raising is restricted to the smallest possible domain.

The participation of only two morphemes, /–iyɑ/ and /–uwɑ/, in triggering the exceptional realisation of harmony can be characterised as morphologically induced harmony, which is obtained at the cost of flouting the highly ranked phonological constraint IDENT[Low] (which prevents any alteration of the [Low] vowel /ɑ/). This violation leads to the harmonising behaviour of the normally opaque vowel /ɑ/ in such a way that it alters to a vowel which is already present in the surface phonetic inventory. Exceptional triggering of the type discussed in this paper cannot be deemed to be the same as dominance in vowel harmony or other kinds of exceptionality recorded in the literature. In Assamese, there are no instances of exceptional root or suffixal morphemes which undergo harmony under special circumstances, or cases where morphemes do not undergo harmony because they are opaque to the spreading process. The Assamese data are unique cases of exceptional triggers. However, they are only unique as far as exceptionality in vowel harmony is concerned. Such cases of local exceptionality are found in other morpheme-specific phonology as well (see Pater 2006 for an example from Finnish).

2.1. Further domain related issues in exceptional [+ATR] harmony

The examples below show that /ɑ/-raising does not occur when the stem is longer than two syllables (the final /ɑ/ in trisyllabic roots is deleted).

(14)

No /ɑ/-raising in trisyllables with final /ɑ/
   Word     Gloss                       Suffix  Derivation  Gloss
a. kɛtɛrɑ   ‘spoken harshly’            –iyɑ    keteriyɑ    ‘peevish’
b. sɔkɔlɑ   ‘a round flat piece’        –iyɑ    sokoliyɑ    ‘slice’
c. pɔhɔrɑ   ‘guarding’ (VERBAL NOUN)    –iyɑ    pohoriyɑ    ‘guard’

These examples have been presented to show that there is a minimal domain in which /ɑ/-raising can occur, and that it is limited to the first two syllables of a word. In all likelihood, there is a constraint which limits /ɑ/-raising to the foot which bears primary prominence in Assamese (Assamese follows a strong-weak or trochaic rhythm, Goswami 1982).

2.1.1. /ɑ/-raising and prefixes

The examples below show how the prefixal vowels /ɛ–/ and /ɔ–/ change their feature value for [±ATR] in an environment where there is an /i/ or /u/ on the right side of the morphological word:

(15) Prefixal participation in [+ATR] harmony
   Prefix  Root   Gloss    Suffix  Derivation  Gloss
a. ɔ–      gʰɔr   ‘home’   –i      ogʰori      ‘homeless’
b. ɛ–      kʰʊz   ‘steps’  –iyɑ    ekʰuziyɑ    ‘slowly’

Similarly, a process of /ɑ/-raising similar to the one observed in examples (18) and (19) applies when /ɑ/ belongs to the root and /ɛ–/ or /ɔ–/ are prefix vowels.

(16) /ɑ/-raising and prefixes
   Prefix  Gloss  Root   Gloss     Suffix  Derivation  Gloss
a. ɛ–      ‘one’  sɑl    ‘roof’    –iyɑ    esoliyɑ     ‘one roof-ed’
b. ɛ–      ‘one’  dɑl    ‘branch’  –iyɑ    edoliyɑ     ‘one branch-ed’
c. ɛ–      ‘one’  pɑt    ‘leaf’    –iyɑ    epotiyɑ     ‘one branch-ed’
d. ɛ–      ‘one’  dʰɑl   ‘slope’   –iyɑ    edʰoliyɑ    ‘sloping’
e. sɔ–     ‘six’  mɑh    ‘month’   –iyɑ    somohiyɑ    ‘6 months old’

In the examples in (16), the root /ɑ/ does not change its value for the feature [±Back] to that of the preceding prefixal vowel. The reason for this behaviour lies in the affiliation of /ɛ–/ and /ɔ–/ as prefixal vowels. Under such circumstances, the [±Back] value of the vowel that /ɑ/ alternates with depends on the intrinsic [+Back] value of /ɑ/, so that it invariably changes to [o] instead of [e]. Given this description of the pattern of alternation, I will now discuss the locality requirements in these exceptional environments.

2.2. /ɑ/-raising in the presence of preceding mid vowels – dual exceptionality

In the examples presented till now, we have seen that local /ɑ/-raising shows up in words of the following configuration:

(17) Schema of /ɑ/-raising
a. (C)iCɑ+iyɑ
b. (C)ɑCɑ+iyɑ

In words with the vowel sequences above, the locality of the process always results in words where only the /ɑ/ adjacent to the triggering vowel undergoes raising, giving words of the type (C)VCoCiyɑ. This is only a partial analysis of the exceptional occurrences in Assamese vowel harmony. I have not yet presented the data where the consequences of the exceptional morphemes /iyɑ/ and /uwɑ/ are seen in the presence of mid vowels preceding the low vowel, i.e. sequences like /CɛCɑ+iyɑ/. Assamese vowel harmony is iterative and regressive; hence, if /ɑ/ undergoes harmony and produces a [+ATR] output, it is plausible that the resultant [+ATR] vowel will also trigger harmony in the preceding /ɛ/ and /ɔ/. Hence in instances of /CɛCɑ+iyɑ/, it could be predicted that the outcome of /ɑ/-raising would be /CeCoCiyɑ/, because /CɛCoCiyɑ/ would violate the constraint driving iterative vowel harmony, i.e. *[–ATR][+ATR]. A quick look at the data below shows that this prediction is true, but the hypothesis falls short of predicting another exceptionality in the actually observed data. In disyllabic stems, apart from [+ATR] harmony, the [±Back] feature that /ɑ/ assumes for raising is determined by the root-initial vowel5.

(18) /ɑ/-raising triggered by /–iyɑ/
   Word      Gloss      Suffix  Derivation  Gloss
a. kɔpɑl     ‘destiny’  –iyɑ    kopoliyɑ    ‘destined’
b. dʰɛmɑli   ‘play’     –iyɑ    dʰemeliyɑ   ‘playful’
c. gʊlɑp     ‘rose’     –iyɑ    gulopiyɑ    ‘pink’
d. misɑ      ‘lie’      –iyɑ    misoliyɑ    ‘liar’

(19) /ɑ/-raising triggered by /–uwɑ/
   Word    Gloss          Suffix  Derivation  Gloss
a. ɛlɑh    ‘laziness’     –uwɑ    elehuwɑ     ‘laziness’
b. bɔzɑr   ‘marketplace’  –uwɑ    bozoruwɑ    ‘cheap’
c. kɛsɑ    ‘raw’          –uwɑ    keseluwɑ6   ‘raw(ness)’
d. bʰʊl    ‘daze’         –uwɑ    bʰuluwɑ     ‘mislead’
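The generalisations illustrated in (1), (7), (12), (13), (18) and (19) can be summarised procedurally. The sketch below is a rule-based Python approximation and not the constraint-based analysis developed in this paper; the function name `derive` and the simplifications are assumptions made for illustration. It covers only consonant-final roots of one or two syllables, ignores prefixes, and does not model the deletions and insertions seen in forms such as /misoliyɑ/ or /dʰemeliyɑ/.

```python
ATR_PAIR = {"ɛ": "e", "ɔ": "o", "ʊ": "u"}   # [-ATR] vowels and their [+ATR] counterparts
PLUS_ATR = set("iueo")                       # [+ATR] vowels
VOWELS = set("iuʊeoɛɔɑ")                     # the eight-vowel inventory
INDEXED = {"iyɑ", "uwɑ"}                     # the two exceptional triggers


def derive(root, suffix):
    """Return the surface form of root + suffix under the simplified rules."""
    segs = list(root)
    idx = [k for k, s in enumerate(segs) if s in VOWELS]  # vowel positions

    # Exceptional, strictly local step: only the /ɑ/ adjacent to an indexed
    # suffix raises. It surfaces as [e] after a root-initial /ɛ/, else as [o].
    if suffix in INDEXED and idx and segs[idx[-1]] == "ɑ":
        prev = segs[idx[-2]] if len(idx) > 1 else None
        segs[idx[-1]] = "e" if prev == "ɛ" else "o"

    # Regular step: regressive [+ATR] harmony, moving right to left.
    # An unraised /ɑ/ is opaque: it stays [-ATR] and stops the spreading.
    trigger = suffix[0] in PLUS_ATR
    for k in reversed(idx):
        if trigger and segs[k] in ATR_PAIR:
            segs[k] = ATR_PAIR[segs[k]]
        trigger = segs[k] in PLUS_ATR

    return "".join(segs) + suffix


for root, suffix in [("tɛl", "i"), ("kɔpɑh", "i"), ("mɑr", "iyɑ"),
                     ("ɑlɑx", "uwɑ"), ("ɛlɑh", "uwɑ"), ("bɔzɑr", "uwɑ")]:
    print(root, "+", suffix, "->", derive(root, suffix))
```

The two steps are kept separate on purpose: the first applies only next to /–iyɑ/ and /–uwɑ/, while the second is the general harmony process that any suffix beginning with /i/ or /u/ sets off.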

The pattern observed above shows that when /–iyɑ/ and /–uwɑ/ trigger harmony, /ɑ/ alters to either [e] or [o], depending on the [±Back] value of the root-initial vowel. /ɑ/-raising along with front harmony is the combined result of two processes. In the kind of exceptional /ɑ/-raising demonstrated by the presence of preceding mid stem vowels, the stem vowel ([–Low, –Back] or [–Low, +Back]) determines the [±Back] feature that /ɑ/ assumes, so that it becomes [+ATR] ([e] or [o]). It is clear from this behaviour that the stem-initial vowel is responsible for initiating a type of progressive front harmony, where the triggers /–iyɑ/ and /–uwɑ/ provide the morphological environment for this exceptional Front/Back harmony. The highlight of this process is, again, that this morpheme-specific [±Back] harmony focuses on a specific domain, which includes the vowel in the immediately following syllable of the trigger.

3. Background to exceptional phenomena in the generative literature

In recent theoretical discussion in the OT framework, there has been considerable interest in the way exceptional morphological interferences in phonology can be modelled (Pater 2000; Anttila 2002; Inkelas and Zoll 2003). It is of special interest in an OT framework, where all constraints are universal and individual grammars are the result of permutation of these constraints. The interest, then, lies in how morphologically conditioned phonological ‘aberrations’ can be handled in an OT approach. In the co-phonology approach of Anttila (2002), morphemes select their own ranking from a set of partially ordered constraints. Accordingly, only constraints that are unranked in the grammar can have lexically specified rankings. I will not go into the details of the co-phonology approach (cf. Anttila 2002 and Inkelas and Zoll 2003 for an elaboration of the framework, and Pater 2000, 2006 for arguments against the constraint ranking approach and in favour of constraint indexation). Among the diacritic approaches, the ones favouring faithfulness constraint indexation are many and varied (e.g. Fukuzawa 1999; Itô and Mester 1999, 2001; Kraska-Szlenk 1997, 1999; Benua 2000; Alderete 2001). It has been argued that morpho-phonological processes are the result of the grammar at large. The indexation of Faithfulness constraints only was supposed to lend force to the argument that such grammar-dependent processes have manifest limits on their range of occurrences (e.g. Benua 1997; Itô & Mester 1999; Alderete 2001). It was argued that lexical indexation can pose limitations on the scope and extent of exceptional occurrences simply because of its restriction on the indexation of markedness constraints. Therefore Faithfulness-only indexation would lead to languages which could not vary in the markedness patterns in their repertoire of exceptional processes.

3.1. Constraint Indexation and morphologically conditioned exceptions in phonology

Pater (2006) shows that most of the problems tackled in morpheme-specific constraint ranking, as well as in Faithfulness-only constraint indexation theories, can be analysed in terms of constraint indexation of both markedness and Faithfulness constraints. At the same time, however, the fact that exceptional triggering or blocking by morphemes is never an unbounded phenomenon is only predicted by lexically indexed constraints. Constraint indexation is of special relevance in this paper because the predicted ‘local’ behaviour of morphemically indexed constraints is borne out in the exceptional data of Assamese. In the constraint indexation approach, morphemes that trigger a process are indexed for a lexically specific faithfulness or markedness constraint. It is assumed that these indexed constraints are cloned from already existing constraints, which are ranked lower in the hierarchy.

3.2. An analysis of exceptionality in Assamese

‘Locality’, or the application of a phonological process to the smallest possible domain, is of special relevance in this paper. In Assamese, the two morphemes /–iyɑ/ and /–uwɑ/ exceptionally trigger harmony in the otherwise opaque vowel /ɑ/. This kind of triggering behaviour is exceptional, as it is confined to these two morphemes, but it is also systematic: /ɑ/ systematically changes to /o/ only when it is adjacent to the harmony-triggering morpheme. If /ɑ/ does not occur in immediate proximity to the triggering vowel, the fact that it does not harmonise can be captured by a locally applicable markedness constraint in Assamese. The constraint which I suggest is active here is the indexed version of the contextual markedness constraint *[–ATR][+ATR]L1.

(20)

*[–ATR][+ATR]L1 No instance of [–ATR] followed by [+ATR] includes a phonological component of the morpheme lexically specified as L.

The locality convention manifests itself in this constraint in the form of a condition on the position of violation of this constraint. This constraint is violated only in the absolutely adjacent syllabic position of the triggering morpheme specified as L1. Any further instantiations of [–ATR][+ATR] are not under the jurisdiction of this constraint. The full ranking of Assamese exceptional triggering is given below in (21), the indexed morphemes are listed in (22), and the corresponding tableau exemplifying the analysis is in (23):

(21) Ranking: *[–ATR][+ATR]L1 >> IDENT[Low] >> *[+ATR –High] >> IDENT[ATR]

(22) Indexed morphemes in the Lexicon: /–iyɑ/ L1, /–uwɑ/ L1

(23) /ɑ/ harmonises in the presence of /–iyɑ/

Input: /mɑr/+/iyɑ/L1  *[–ATR][+ATR]L1  IDENT[Low]  *[–ATR][+ATR]  *[+ATR –High]  IDENT[ATR]
a.   mɑriyɑ           *!                           *
b. ☞ moriyɑ                            *                          *              *
c.   moriyo                            **!                        **             **

The lexically indexed constraint *[–ATR][+ATR]L1 penalises a sequence where [ɑ ] is followed by the triggering [i]. Note that the constraint *[– ATR][+ATR]L1 does not refer to the entire morphemic sequence of /–iyɑ / and /–uwɑ /, but only to a portion of it. (23)-a. is ousted because it violates the highly ranked lexically indexed constraint. The choice between the two remaining candidates (23)-b. and c. is determined by the faithfulness constraint IDENT[Low] which is violated twice by the failed candidate in (23)c. In the tableau below, I show how this constraint hierarchy works when there are two instances of /ɑ / in the input. The tableau below shows that *[–ATR][+ATR]L1 inhibits occurrences of [–ATR][+ATR] only in the minimal domain. (24)

Local alternation of /ɑ/ when followed by /iyɑ/ or /uwɑ/

Input: /ɑlɑx/+/uwɑ/L1  *[–ATR][+ATR]L1  IDENT[Low]  *[–ATR][+ATR]  *[–High +ATR]  IDENT[ATR]
a.   ɑlɑxuwɑ           *!                           *
b. ☞ ɑloxuwɑ                            *           *              *              *
c.   oloxuwɑ                            **!                        **             **


This tableau shows the markedness requirement of the exceptional trigger /–uwɑ/, i.e. its local application. The indexed constraint *[–ATR][+ATR]L1 does not apply to the initial /ɑ/ of the stem /ɑlɑx/. (24)-a violates the top-ranked *[–ATR][+ATR]L1. Multiple violations of the faithfulness constraint IDENT[Low] lead to the disqualification of candidate (24)-c. I now move on to discuss exceptional front harmony in Assamese, where a completely innovative Front/Back harmony is triggered by the presence of the /–iyɑ/ and /–uwɑ/ morphemes, without these morphemes themselves undergoing any change.

3.3. Exceptional front harmony in Assamese: emergent and local

As discussed in section 2.2, exceptional triggering in Assamese also involves simultaneous changes of other features, i.e. it is not only the [–ATR] quality of the mid vowel which changes, but also the [±Back] quality of the vowel /ɑ/. When there are no preceding vowels in the presence of the /–iyɑ/ and /–uwɑ/ triggers, /ɑ/ assumes its own inherent [+Back] quality while adapting itself to a raised [+ATR] value. Therefore, IDENT[±Back] remains unviolated in such circumstances, as shown by the tableau below:

(25)

/ɑ/ is faithful to IDENT[+Back] when it is the stem-initial vowel

/mɑr/+/iyɑ/L1  *[–ATR][+ATR]L1  IDENT[±Back]  IDENT[Low]  *[–ATR][+ATR]  *[–High +ATR]  IDENT[ATR]
a.   mɑriyɑ    *!                                         *
b. ☞ moriyɑ                                   *                          *              *
c.   meriyɑ                     *!            *                          *              *

The example above also shows that the [±Back] specification of the low vowel is not influenced by the [±Back] quality of the triggering vowel. The vowel /ɑ/ retains its [+Back] value despite the fact that the triggering morpheme is [–Back]. IDENT[+Back] is ranked higher than IDENT[Low], as faithfulness to the [Back] value of /ɑ/ is substantially more important than faithfulness to the [Low] value7. This accounts for the failure of candidate (25)-c, where a [Back] value alteration occurs. However, this is the straightforward case of /ɑ/-raising. The second type of exceptionality pertains to the raising of /ɑ/ to the [±Back] vowel quality of the preceding mid vowel, such that, if the preceding mid vowel is [–Back], /ɑ/ is also [–Back], and vice versa. This phenomenon, which can be called ‘emergent front harmony’8, only influences the immediately following vowel, thereby exhibiting a local process. This kind of exceptionality is also strictly limited to the domain of a bisyllabic root (also recall section 2.1, where /ɑ/-raising in general does not affect trisyllabic domains). To analyse these cases where the [±Back] specifications change as a result of a stem-initial vowel, I will propose a sequential markedness constraint which requires the agreement of [±Back] values in the smallest possible domain of the root.

Before presenting a formal analysis, we have to modify our understanding of the locus of violation that was postulated in the preceding section. The trigger of exceptional /ɑ/-raising was the /–i/ or /–u/ portion of the morphemes /–iyɑ/ and /–uwɑ/, but in the case of emergent front harmony, /–iyɑ/ and /–uwɑ/ only provide the environment for the application of this constraint. /–iyɑ/ and /–uwɑ/ do not provide a ‘direct’ phonological motivation for the occurrence of front harmony, at least not in the same way that they provide the phonological factor for the occurrence of /ɑ/-raising. Taking the instance of /elehuwɑ/ again, the [+Back] feature of /u/ in /–uwɑ/ cannot be held responsible for the spreading of [–Back] features in the root of /elehuwɑ/. This cannot be dissimilation either, given that there are derived lexical items like /dʰemeliyɑ/ (cf. (18) b.). Obviously then, the spreading of [±Back] features in the root seems to have ‘nothing to do’ with the phonological components of the morphological environment which provides for this exceptional process. There perhaps need not always be an intimate phonological motivation for morpheme-specific phonological phenomena, but whenever there is one, it undoubtedly leads to a tighter and more unified analysis. The fact that there is no such clear-cut motivation in exceptional front harmony gives way to some other insights which would not otherwise have caught our attention. Exceptional front harmony involves two processes: one is /ɑ/-raising and consequently [ATR] spreading, the other is front harmony. What we have not stressed till now is the special status of the vowel most adjacent to the exceptional morpheme. In this analysis, I suppose that the dual exceptionality of the kind encountered in Assamese arises because the vowel most adjacent to the exceptional morpheme is specially licensed to undergo exceptional behaviour. Therefore, the constraint responsible for exceptional front harmony needs to be effective in excluding non-adjacent segments from the domain of exceptional front harmony. The constraint is formulated as below:

(26) *[–Back –High][+Back –High]L1
     Assign a violation mark to the minimal string containing a [–Back –High] vowel followed by a [+Back –High] vowel, only if a portion of the string is the vowel which is adjacent to a morpheme indexed as L1.

I assume that the constraint *[–Back –High][+Back –High] is a universal constraint which drives Front/Back harmony. However, this constraint exceptionally selects only those lexical items which have an /–iyɑ / or a /– uwɑ / suffix. The constraints *[–ATR][+ATR]L1 and *[–Back –High][+Back –High]L1 are ranked together and any optimal candidate has to respect both of these constraints. (27)

Co-dependent *[–ATR][+ATR]L1 and *[–Back –High][+Back –High]L1 I

/ɛlɑh/+/uwɑ/L1  *[–ATR][+ATR]L1  *[–Back –High][+Back –High]L1  IDENT[–Back]  IDENT[+Back]  IDENT[Low]  *[–ATR][+ATR]
a.   ɛlɑhuwɑ    *!                                                                                      *
b.   elɑhuwɑ    *                *!                                                                     *
c. ☞ elehuwɑ                                                                  *             *
d.   olohuwɑ                                                    *!                          *
e.   elohuwɑ                     *!                                                         *
f.   olehuwɑ                                                    *!            *             *

The tableau in (27) shows how constraint indexation can satisfactorily capture two processes which are indexed to the same morpheme. The selected candidate [elehuwɑ] satisfies the highly ranked lexically indexed markedness constraint *[–Back –High][+Back –High]L1. It also simultaneously satisfies *[–ATR][+ATR]L1. Both these processes require the same environment, i.e. exceptional triggering by the vowels in /–iyɑ/ and /–uwɑ/, but they do not contravene the principle of locality that constraint indexation espouses. Both the processes of exceptional /ɑ/-raising and Front/Back harmony are concentrated on the absolutely adjacent syllable or ‘minimal string’. The evaluation shows how this constraint ranking prohibits candidates (27)-b and (27)-e because they violate *[–Back –High][+Back –High]L1. Candidates (27)-d and (27)-f are barred from being selected in this evaluation because of their multiple violations of IDENT[–Back]. The tableau below shows an input which contains a /CɔCɑ–/ sequence.

(28) Co-dependent *[–ATR][+ATR]L1 and *[–Back –High][+Back –High]L1 II

/bɔzɑr/+/uwɑ/L1 | *[–ATR][+ATR]L1 | *[–Back –High][+Back –High]L1 | IDENT[–Back] | IDENT[+Back] | IDENT[Low] | *[–ATR][+ATR]
a. bɔzɑruwɑ | *! | | | | | *
b. bozɑruwɑ | *! | | | | | *
c. ☞ bozoruwɑ | | | | | * |
d. bezeruwɑ | | | | *!* | * |
e. bozeruwɑ | | | | *! | * |
f. bezoruwɑ | | *! | | * | * |
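The evaluations in (27) and (28) can be sketched procedurally. The fragment below is only an illustration under stated assumptions: the feature values, the function names, the restriction of candidates to root-vowel pairs, and the treatment of the left-to-right column order as a strict ranking (the text ranks the two indexed constraints together) are mine and not part of the analysis itself.

```python
# Assumed feature values for the surface vowels discussed in the text.
FEATURES = {
    "i": {"high": True,  "low": False, "back": False, "atr": True},
    "u": {"high": True,  "low": False, "back": True,  "atr": True},
    "ʊ": {"high": True,  "low": False, "back": True,  "atr": False},
    "e": {"high": False, "low": False, "back": False, "atr": True},
    "o": {"high": False, "low": False, "back": True,  "atr": True},
    "ɛ": {"high": False, "low": False, "back": False, "atr": False},
    "ɔ": {"high": False, "low": False, "back": True,  "atr": False},
    "ɑ": {"high": False, "low": True,  "back": True,  "atr": False},
}
SUFFIX_VOWELS = ["u", "ɑ"]  # vowels of the L1-indexed suffix /-uwɑ/


def atr_l1(inp, out):
    """*[-ATR][+ATR]_L1: the suffix-adjacent root vowel is [-ATR] before a [+ATR] suffix vowel."""
    return int(not FEATURES[out[-1]]["atr"] and FEATURES[SUFFIX_VOWELS[0]]["atr"])


def front_back_l1(inp, out):
    """*[-Back -High][+Back -High]_L1, checked only on the pair ending in the suffix-adjacent vowel."""
    v1, v2 = FEATURES[out[-2]], FEATURES[out[-1]]
    return int(not v1["back"] and not v1["high"] and v2["back"] and not v2["high"])


def ident_back(value):
    """Build IDENT[+Back] (value=True) or IDENT[-Back] (value=False)."""
    def constraint(inp, out):
        return sum(1 for i, o in zip(inp, out)
                   if FEATURES[i]["back"] == value and FEATURES[o]["back"] != value)
    return constraint


def ident_low(inp, out):
    """IDENT[Low]: input and output vowels must agree in [Low]."""
    return sum(1 for i, o in zip(inp, out) if FEATURES[i]["low"] != FEATURES[o]["low"])


def atr_general(inp, out):
    """General *[-ATR][+ATR] over the vowel tier of the whole word."""
    tier = list(out) + SUFFIX_VOWELS
    return sum(1 for a, b in zip(tier, tier[1:])
               if not FEATURES[a]["atr"] and FEATURES[b]["atr"])


# Column order of the tableaux, read here as a strict ranking (an assumption).
RANKING = [atr_l1, front_back_l1, ident_back(False), ident_back(True), ident_low, atr_general]


def evaluate(inp, candidates):
    """Return the candidate with the lexicographically smallest violation profile."""
    return min(candidates, key=lambda out: [c(inp, out) for c in RANKING])


# Root vowels of the candidates in (27) and (28).
CANDS_27 = [("ɛ", "ɑ"), ("e", "ɑ"), ("e", "e"), ("o", "o"), ("e", "o"), ("o", "e")]
CANDS_28 = [("ɔ", "ɑ"), ("o", "ɑ"), ("o", "o"), ("e", "e"), ("o", "e"), ("e", "o")]

print(evaluate(("ɛ", "ɑ"), CANDS_27))  # root vowels of /ɛlɑh/
print(evaluate(("ɔ", "ɑ"), CANDS_28))  # root vowels of /bɔzɑr/
```

Under these assumptions the sketch selects the root vowels of [elehuwɑ] and [bozoruwɑ] respectively, the outputs the text identifies as optimal. Violation counts computed this way follow the constraint definitions as read here and need not coincide cell by cell with the printed tableaux.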

The constraint hierarchy is able to generate the right output [bozoruwɑ] because the selected output violates neither *[–Back –High][+Back –High]L1 nor *[–ATR][+ATR]. All the other candidates incur fatal violations either of the two high-ranking constraints or of the constraint demanding faithfulness to the [±Back] values. The modification of the locality principle in exceptional morphemic triggering shows that only a segment adjacent to a triggering morpheme is the subject of the exceptional process. The difference in Assamese is that the adjacent segment ends up bearing two exceptionalities, and one of them is not the direct result of the phonological properties that the exceptional morpheme bears. Despite this modification in our approach, we have been able to save the crucial generalisation that exceptional morphemic triggering involves as minimal a domain as possible.

Although locality is salvaged, there is another fundamental tenet of OT which is undermined by the indexed markedness constraint *[–Back –High][+Back –High]L1. The constraint demanding front harmony only in the root subverts a central tenet espoused in OT (McCarthy and Prince, 1993) that FAITH ROOT always outranks FAITH AFFIX. In effect, therefore, the caveat that Pater (2006) had mentioned about the potential of lexically indexed morpheme-specific constraints to subvert this ‘universal’ metaranking can be a reality. In fact, the data itself shows that front harmony applies to the root but fails to apply to the suffix. Although we are not yet certain whether this is something to be hailed as an achievement of morpheme-specific indexed constraints, it may be instructive to look at the possible reasons for this kind of exceptionality. All Front/Back harmony systems are triggered from the root outwards (see Kaun 1995 and Walker 2001 for a cross-linguistic survey), and emergent Front/Back harmony in Assamese falls in line with this universal pattern. The next question which confronts us is why this exceptional front harmony does not neutralise the Front/Back values of the suffixes /–iyɑ/ and /–uwɑ/ as well. The reason lies in what we have been arguing for exceptional patterns all along: all exceptional patterns need to be local, affecting a minimal string, and in this case the minimal string comprises the value of one root vowel only. This particular root vowel is the vowel adjacent to the exceptional triggering vowel, which is specially licensed to display exceptional behaviour. To recapitulate from the discussion in section 2.1 of this paper, the bisyllabic root is important in characterising the nature of exceptional occurrences in Assamese, for the following reasons. Firstly, the domain of application of /ɑ/-raising is always the bisyllabic root (see examples in (14)). Secondly, there is no /ɑ/-raising when /ɑ/ is the final vowel of a trisyllabic word (the /ɑ/ deletes in those cases). Finally, prefixes are ruled out from the domain of /ɑ/-raising.
Front harmony is locally applicable only to the [±Back] values of the vowels of the root. All these combined factors keep suffixes out of the influence of the process of front harmony. The surfacing of front harmony plays a significant role in emphasising the aspect of emergent unmarkedness that may sometimes be underscored by exceptional processes. The constraint demanding front harmony is operative in many harmony languages of the world. Given OT’s architecture, it is only natural that the markedness constraint responsible for triggering front harmony in some languages is also present but not active in the constraint hierarchies of other unrelated languages. Therefore, the emergence of this unmarked pattern in the exceptional phonology of Assamese confirms the existence of The Emergence of the Unmarked (TETU, McCarthy and Prince 1994a) effects even in exceptional phenomena.

By restricting the exceptional process to the root domain only, the process affects the most prominent vowels. The universal tendency for the triggers of front harmony to be root vowels also lends support to our account of emergent front harmony in Assamese as a process which abides by universal tendencies. It is perhaps in contexts like these that the metaconstraint FAITH ROOT >> FAITH AFFIX needs reconsideration.9 In a morphologically driven exceptional pattern, various factors pertaining to the minimal domain and to universally well-attested root-outward harmony conspire to produce an output which fatally violates this particular metaconstraint. Hence the metaconstraint, though useful in describing the many phonological processes which show more faithfulness in roots and more neutralisation in affixes, can lead to wrong predictions in cases where multiple factors converge to produce more faithfulness in affixes than in roots.

3.4. Alternative theoretical approaches to morpho-phonemic alternations

We can compare this derived environment domain with the processes of derived environments presented in Kiparsky (1993), Lubowicz (2003) and elsewhere. Taking as a case in point the most recent treatment of derived environments in Lubowicz, we can try to draw a parallel and see if we need a similar treatment of the processes discussed in this paper. To get around the problem of blocking effects in DEE (Derived Environment Effects), Lubowicz conjoins two constraints so as to obtain results which are seen only in a derived environment. Consequently, we can offer a Lubowicz-style conjoined-constraint solution, *[–Back –High] & IDENT (Low) >> IDENT (Back), to the problem described here. The problem with this constraint is that it still does not capture the strictly morpheme-specific environment of front harmony in Assamese. Pater (2006) argues that all morphologically derived environment domains can be analysed with the aid of constraint indexation.
The added advantage of lexical indexation in morpheme-specific phonology is that it offers a way of capturing phonological processes which are observed in a derived environment domain and yet are morpheme-specific. A theoretical approach which would founder with respect to the Assamese patterns of morpho-phonological interaction is that of cophonology, more specifically in the manner argued by Anttila (2002), one of the main proponents of the cophonology approach. Cophonology in general requires exceptionalities in the lexicon and morphology to be analysed with the aid of lexically specified rankings of constraints which are already present in the grammar. In the approach advocated by Anttila (2002), only pairs of constraints whose ranking is unspecified in the grammar can have lexically specified rankings. When a lexical item is unspecified for the respective ranking, it shows variation. In the case of exceptions of the emergent variety, no such pairing is possible. In these cases, a low-ranked constraint supersedes all other markedness and faithfulness constraints and takes precedence over them. Anttila (2002) also presents this model to establish a connection between exceptionality and variation, so that whenever the constraints are unranked the resultant output has the potential to vary on either the markedness or the faithfulness count. This kind of variation is not attested in Assamese, and therefore this attribute of cophonology cannot apply to all instances of exceptionality in the grammars of languages. Further, as argued in Pater (2006), the approach of Anttila (2002) cannot elegantly capture the local effects of constraints which are active in exceptional triggering and blocking.

4. Vowel harmony in verbs

Assamese does not allow the presence of [e] and [o] without a following /i/ or /u/.10 In such a scenario, the result of harmony when the verbal root /rɔ/ ‘wait’11 is suffixed with /–il/ is expected to be /roil/ (or maybe even /royil/ because of hiatus avoidance). The existence of the apparently impossible sequences /rol/, /gol/ (after the deletion of /i/),12 etc. is therefore unpredictable, given the phonological grammar of the language. The eventually occurring surface output forms /rol/ and /gol/ are realised with the harmonised segment /o/ even though the triggering segment /i/ has been deleted. I present a sample of the vowel harmony pattern displayed in verbs:

(29) Vowel harmony in the verbal paradigm

Verbal root | rɔ ‘wait’ | lɔ ‘take’ | dʰʊ ‘wash’ | kʰɑ ‘eat’
Past perfect | il+ʊ/i/ɑ/ɛ | il+ʊ/i/ɑ/ɛ | il+ʊ/i/ɑ/ɛ | il+ʊ/i/ɑ/ɛ
1P | rolʊ | lolʊ | dʰulʊ | kʰɑlʊ
2P(fam) | roli | loli | dʰuli | kʰɑli
2P(ord) | rolɑ | lolɑ | dʰulɑ | kʰɑlɑ
2P(hon)&3P | rolɛ | lolɛ | dʰulɛ | kʰɑlɛ
Future | im/ib+i/ɑ+ɔ | im/ib+i/ɑ+ɔ | im/ib+i/ɑ+ɔ | im/ib+i/ɑ+ɔ
1P | rom | lom | dʰum | kʰɑm
2P(fam) | robi | lobi | dʰubi | kʰɑbi
2P(ord) | robɑ | lobɑ | dʰubɑ | kʰɑbɑ
2P(hon)&3P | robɔ | lobɔ | dʰubɔ | kʰɑbɔ

In the paradigms above, the [+high +ATR] vowel /i/ always triggers a change in the preceding [–ATR] vowels /ɛ/, /ɔ/ and /ʊ/. Verbs inflect in the following order:

(30) Root + Aspect (Perfective/Progressive) + Tense + Person
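The surface forms in the paradigm in (29) can be reproduced with a small procedural sketch. This is a rule-based shortcut for illustration only, not the OT analysis developed in this paper: the function name, the two lookup tables, and the restriction to open monosyllabic roots are assumptions. The non-deleting suffix /–is/, taken up in the discussion that follows, is included for contrast.

```python
# [-ATR] root vowels that alternate before /i/; /ɑ/ is deliberately absent.
HARMONY = {"ɛ": "e", "ɔ": "o", "ʊ": "u"}
VOWELS = set("iuʊeoɛɔɑ")
# Suffixes whose initial /i/ is lost after a vowel-final root (assumed list).
DELETING = {"il", "ib", "im"}


def inflect(root, suffix, person=""):
    """Harmonise the root-final vowel before /i/, then drop /i/ for deleting suffixes."""
    stem = root
    if suffix.startswith("i") and root[-1] in HARMONY:
        stem = root[:-1] + HARMONY[root[-1]]
    if suffix in DELETING and root[-1] in VOWELS:
        suffix = suffix[1:]  # the trigger is deleted, but its effect on the root remains
    return stem + suffix + person


for root in ("rɔ", "lɔ", "dʰʊ", "kʰɑ"):
    past = [inflect(root, "il", p) for p in ("ʊ", "i", "ɑ", "ɛ")]
    future = [inflect(root, "im")] + [inflect(root, "ib", p) for p in ("i", "ɑ", "ɔ")]
    print(root, past, future)
```

The ordering of the two steps is what lets harmony survive the loss of its trigger: the root vowel is adjusted while /i/ is still present, and only then is /i/ removed.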

The pattern of inflection of the open monosyllables /dʰʊ/ ‘wash’ and /kʰɑ/ ‘eat’ deserves attention because only open monosyllables provide the context for vowel deletion. Therefore, only these monosyllables have been taken into consideration. Note that monosyllabic verbs like /rɔ/ ‘wait’ appear to inflect for their future and past perfect forms without the presence of the harmony-triggering vowel, while still undergoing the alternation that the deleted vowel triggers. That is, in the past perfect and future forms of all the verbs above, the vowel /i/ is deleted, so that the initial vowels of /im/, /ib/ and /il/ are invisible after inflection, but the altered forms exist in the verbal morphology as a result of vowel harmony triggered by the underlying presence of /i/. However, the paradigm in (29) is not representative of the entire verbal morphology of Assamese. In other words, /i/ deletion under hiatal conditions is not attested across the board in the verbal morphology of the language. In the following paradigm, affixation of the perfective suffix /–is/ results in neither deletion nor epenthesis.

(31) Affixation of /–is/

Root | kʰɑ | rɔ | dʰʊ
Pres. Prog. | kʰɑ+is+ʊ/ɔ/ɑ/ɛ | rɔ+is+ʊ/ɔ/ɑ/ɛ | dʰʊ+is+ʊ/ɔ/ɑ/ɛ
1P | kʰɑisʊ | roisʊ | dʰuisʊ
2P(fam) | kʰɑisɔ | roisɔ | dʰuisɔ
2P(ord) | kʰɑisɑ | roisɑ | dʰuisɑ
2P(hon)&3P | kʰɑisɛ | roisɛ | dʰuisɛ
Past-Prog. | kʰɑ+is+il+ʊ/ɔ/ɑ/ɛ | rɔ+is+il+ʊ/ɔ/ɑ/ɛ | dʰʊ+is+il+ʊ/ɔ/ɑ/ɛ
1P | kʰɑisilʊ | roisilʊ | dʰuisilʊ
2P(fam) | kʰɑisili | roisili | dʰuisili
2P(ord) | kʰɑisilɑ | roisilɑ | dʰuisilɑ
2P(hon)&3P | kʰɑisilɛ | roisilɛ | dʰuisilɛ

The set of examples above shows that the morphological extension /–is/ does not require hiatus resolution. The constraint on hiatus resolution is violated by the verbal derivations produced as a result of the addition of /–is/.

4.1. Hiatus resolution – deletion and epenthesis

It is a well-observed phenomenon that segments may be either deleted or inserted, thereby linking two adjacent segments which may be present at the edges of a morphological domain. It is therefore only natural that hiatus avoidance is present in the phonology of Assamese, independent of vowel harmony. Apart from the verbal paradigm discussed above, hiatus avoidance13 is robust in Assamese derived forms. Some examples from the nominal paradigm of Assamese are presented below:

(32) [–ATR] initial suffixes

Root | Ergative /ɛ/ | Accusative /k/ | Dative /loi/ | Genitive /r/ | Locative /t/ | Instrumental /rɛ/
bʰɑt | bʰɑt-ɛ | bʰɑt-ɔk | bʰɑt-oloi | bʰɑt-ɔr | bʰɑt-ɔt | bʰɑt-ɛrɛ
mɑ | mɑ-yɛ | mɑ-k | mɑ-loi | mɑ-r | mɑ-t | mɑ-rɛ

These examples show that hiatus avoidance14 is indeed present as a strategy in the morpho-phonology of Assamese when morphology provides the context of a juncture. Hiatus avoidance measures like epenthesis kick in to preserve the ideal phonological shape of a morpho-phonological word. There are some other examples from adjective formation where the epenthetic element may be /l/ or /r/, as shown in the examples below:

(33) Epenthesis in adjective formation

a. kɛsɑ ‘raw’ | –uwɑ | keseluwɑ | ‘raw(ness)’
b. misɑ ‘false’ | –uwɑ | misoliyɑ | ‘liar’
c. dɛkɑ ‘young’ | –uwɑ | dekeruwɑ | ‘young-ish’

As argued in the OT literature, a constraint against hiatus need not be involved in the analysis of all such instances. A requirement on onsets can easily account for the cases shown in (32) and (33). The following two are the contending constraints:

(34) *HIATUS: “Assign a violation mark to heterosyllabic vocalic sequences”

(35) ONSET: “Syllables must have Onsets”

The constraint *HIATUS is a prohibition against heterosyllabic vowel-vowel sequences. As the examples from the nominal paradigm show, hiatus is normally resolved by inserting an epenthetic element. Epenthesis is possible because of the ranking *HIATUS >> DEP. It may be noted here that the requirement of ONSET can also lead to the same results. Hence our tableaux reflect our indecision as to the choice of the proper constraint at this point (this problem is taken up again in (38)–(40), where the untenability of ONSET for all cases of hiatus avoidance is discussed). The two tableaux below show how epenthesis surfaces in Assamese, in the light of the examples from the case extensions.

(36) *HIATUS >> DEP

mɑ+ɛ | MAX | *HIATUS/ONSET | DEP
a. ☞ mɑyɛ | | | *
b. mɑɛ | | *! |
c. mɑ | *! | |
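The role of the ranking *HIATUS >> DEP in (36) can be illustrated with a minimal evaluation sketch. The candidate records, with hand-supplied counts of deleted and inserted segments, and all function names are hypothetical and chosen only for this illustration. Hiatus is counted directly on the output string.

```python
VOWELS = set("iuʊeoɛɔɑ")


def star_hiatus(cand):
    """*HIATUS: one mark per pair of adjacent vowels in the output."""
    out = cand["output"]
    return sum(1 for a, b in zip(out, out[1:]) if a in VOWELS and b in VOWELS)


def max_io(cand):
    """MAX: one mark per deleted input segment (count supplied with the candidate)."""
    return cand["deleted"]


def dep_io(cand):
    """DEP: one mark per inserted segment (count supplied with the candidate)."""
    return cand["inserted"]


def winner(candidates, ranking):
    """Strict domination: pick the lexicographically smallest violation profile."""
    return min(candidates, key=lambda c: [con(c) for con in ranking])["output"]


# Candidates for /mɑ/ + /ɛ/ as in (36).
CANDIDATES = [
    {"output": "mɑyɛ", "deleted": 0, "inserted": 1},
    {"output": "mɑɛ", "deleted": 0, "inserted": 0},
    {"output": "mɑ", "deleted": 1, "inserted": 0},
]

print(winner(CANDIDATES, [max_io, star_hiatus, dep_io]))  # *HIATUS >> DEP
print(winner(CANDIDATES, [max_io, dep_io, star_hiatus]))  # DEP >> *HIATUS
```

With *HIATUS above DEP the epenthetic candidate is selected; reversing the two constraints selects the hiatal candidate instead, which is why the ranking matters for the epenthesis pattern.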

(37) *HIATUS >> DEP

mɑ+k | MAX | *HIATUS/ONSET | DEP
a. mɑɔk | | *! |
b. ☞ mɑk | | |

Unlike the cases discussed above, where hiatus is resolved by inserting an epenthetic element, morphemes in the verbal paradigm present exceptions to this normal routine of syllabification. The vowel /i/ triggers regular harmony in the verbal paradigm, but a few morphemes show quirks in the way hiatus is resolved: the morphemes /–il/ and /–im/ trigger vowel deletion, and the morpheme /–is/ prefers hiatus over either epenthesis or deletion, in blatant violation of all constraints demanding hiatus resolution. In the light of this, a discussion of the alternatives and their untenability is also in order. In OT, syllabification has been shown to be the outcome of constraints like ONSET and DEP, to the exclusion of constraints like *HIATUS, which make demands on proper syllabic shape and are deemed superfluous. We try to analyse the consequences of an alternative analysis which does not depend on *HIATUS. Insofar as the following example in (33) with an epenthetic element /l/ is concerned, a constraint set which does not include *HIATUS does not seem to be counter-productive.

(38) ONSET >> MAX >> DEP

/rɔ/+/ilɑ/ | ONSET | MAX | DEP
a. rolɑ | | * |
b. roilɑ | *! | |
c. royilɑ | | | *

ONSET is required in this hierarchy as the exclusion of ONSET will produce */misoiya/. However, in the evaluation of the two candidates which exhibit exceptional syllabification, this constraint set falters and leads to the selection of the wrong output candidates.

(39) Analysis of deletion with ONSET >> MAX >> DEP

/rɔ/+/is/L3+/ilɑ/ | MAX L3 | ONSET | MAX | DEP
a. roisilɑ | | * | |
b. royisilɑ | | | | *
c. rosilɑ | *! | | * |

Another instance where ONSET, if ranked above MAX and DEP, is unable to lead to the actually occurring output is when an indexed MAX L constraint is ranked highest. This is shown in the evaluation of the candidate /roisilɑ/, where /–is/ prefers a hiatal context at the site of the juncture instead of hiatus resolution. The ranking ONSET >> DEP pathologically selects /royisilɑ/. The indexed constraint MAX L3 fails to give a verdict in this case.

(40) Failure of MAXL3 >> ONSET >> MAX >> DEP ranking

misɑ+iyɑ | ONSET | MAX | DEP
a. misoiyɑ | *! | |
b. misoliyɑ | | | *
c. misoyɑ | | *! |

This discussion has provided significant empirical motivation for the constraint *HIATUS. Therefore, in the following sections we will proceed with an analysis where *HIATUS is used as a constraint to analyse instances of epenthesis in Assamese.

4.2. /i/ deletion in the verbal paradigm

In this paper it is shown that two sets of morphemes in the verbal paradigm idiosyncratically make one of two choices: in the first, the morphemes /–il/, /–ib/ and /–im/ opt for unmarked syllabification by choosing deletion over epenthesis, and in the second, the morpheme /–is/ arbitrarily chooses a marked syllabification where the addition of /–is/ results in onsetless syllables. Before providing a complete analysis of the exceptions discussed till now, I will present the constraints which are required for an analysis of this pattern of deletion in Assamese verbs. This type of deletion, with subsequent fusion of a featural quality, is also known as coalescence. The constraint which prohibits coalescence in OT is the following:

(41) UNIFORMITY: “No Coalescence”
No element of S2 has multiple correspondents in S1 (McCarthy and Prince 1995)

This faithfulness constraint requires an output segment to correspond to only one input segment. The constraint UNIFORMITY is violated by output segments in which multiple elements of the input representation are fused. In the evaluation of the Assamese data it will be shown that this constraint is violated by alternations of the following kind:

(42) /r₁ɔ₂/+/i₃l₄/ → /r₁o₂,₃l₄/
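The correspondence indices in (42) can be made concrete in a short sketch. The data layout, a list of output segments each paired with the set of input indices it corresponds to, and the function names are assumptions made for illustration. A plain-deletion candidate is added only for contrast.

```python
def uniformity(output):
    """UNIFORMITY: count output segments with more than one input correspondent."""
    return sum(1 for _, sources in output if len(sources) > 1)


def max_io(input_segments, output):
    """MAX: count input segments that have no output correspondent."""
    realised = {i for _, sources in output for i in sources}
    return sum(1 for i in input_segments if i not in realised)


# Input /r1 ɔ2/ + /i3 l4/ with the indices used in (42).
INPUT = {1: "r", 2: "ɔ", 3: "i", 4: "l"}

# Coalescence: output o corresponds to both ɔ2 and i3.
COALESCED = [("r", {1}), ("o", {2, 3}), ("l", {4})]

# Plain deletion (hypothetical contrast candidate): i3 has no correspondent.
DELETED = [("r", {1}), ("ɔ", {2}), ("l", {4})]

print(uniformity(COALESCED), max_io(INPUT, COALESCED))
print(uniformity(DELETED), max_io(INPUT, DELETED))
```

On this encoding the fused output violates UNIFORMITY but not MAX, whereas plain deletion does the reverse, which is the distinction the coalescence analysis relies on.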

In this type of alternation /ɔ/ and /i/ are fused in the output to be realised as /o/. Another faithfulness constraint which is relevant in the analysis of the type of deletion encountered here is the IDENT IO [F] constraint proposed in Pater (1999). This constraint was proposed to deal with an asymmetry in the IDENT family of constraints in the correspondence model of faithfulness (McCarthy and Prince 1995), as opposed to the MAX and DEP constraints, where MAX[F] penalises deletion and DEP penalises insertion. In the IDENT family, an IDENT[F] constraint can be violated only in the presence of a segment’s feature value in the output, and not in its absence. The faithfulness constraint required to prohibit featural deletion in Assamese is IDENT IO [ATR], which is stated below:

(43) IDENT IO [+ATR]
Output correspondents of a feature specified as [+ATR] must be [+ATR]

This faithfulness constraint will evaluate the faithfulness of [+ATR] values in the output. In other words, an output representation in which a corresponding input [+ATR] value has been deleted would incur a violation mark. Again we will see that an indexed *HIATUS constraint determines the emergence of syllabification patterns which can be considered exceptional in Assamese. I formulate this lexically specified markedness constraint as below:

(44) *HIATUS L2: “Avoid heterosyllabic vocalic sequences”
Constraint lexicon: *HIATUS L2: /il/ /ib/ /im/

The analysis to be presented holds that the morpheme-specific surface well-formedness constraint *HIATUS L2 is crucial in determining the output candidate when the triggering segment is deleted.

(45) *HIATUS L2 and faithfulness of the deleted feature

Input: /rɔ/+/im/L2
Constraints (left to right): *HIATUS L2, IDENT IO [+ATR], UNIFORM, *[–ATR][+ATR], *HIATUS
Candidates: a. rɔim, b. roim, c. rɔm, d. ☞ rom

In the tableau above, *HIATUS L2 effectively bars candidates (45)-a and (45)-b from being the winners in the evaluation. In the absence of an indexed constraint, the candidate in (45)-b, [roim], would offer the most competition, as it satisfies UNIFORM, which the candidate selected as a result of satisfying the highest-ranking *HIATUS L2 does not. IDENT IO [+ATR] prohibits (45)-c from emerging as the winner, as it does not preserve the [+ATR] quality of an input segment.

(46) *HIATUS L2 drives hiatus resolution in some parts of the verbal morphology

/rɔ/+/–il–ɑ/L2 | *HIATUS L2 | IDENT IO [ATR] | UNIFORM | *[–ATR][+ATR] | *HIATUS
a. roilɑ | *! | | | | *
b. rɔilɑ | *! | | | * | *
c. rɔlɑ | | *! | | |
d. ☞ rolɑ | | | * | |

By evaluating another candidate, one with a suffix of the shape /VCV/, we can see that the same process applies to all morphemes indexed as L2. While hiatus resolution drives deletion, requirements of featural faithfulness result in the expression of the morpheme’s [+ATR] feature on the preceding vowel. In the tableau below, while the high-ranking *HIATUS L2 requires vowel deletion, the constraint IDENT IO [ATR] preserves the [ATR] feature in the output form, resulting in the optimal candidate which satisfies both constraints.

In the evaluation in the tableau above, the resultant output form /rolɑ/ is a product of the combined forces of IDENT IO [ATR] and *HIATUS L2. Candidates (46)-a and (46)-b violate *HIATUS L2. Candidate (46)-c violates IDENT IO [ATR], which demands faithfulness to the feature value of the deleted segment, resulting in a failed candidate. On the one hand, /ɑ/-raising is one of the factors which stands out in the morpheme-triggered exceptional processes of Assamese; on the other hand, /ɑ/ does not undergo alternation as a result of exceptional hiatus resolution in the verbal paradigm. The segment /i/ is deleted altogether, without any corresponding coalescence or preservation of the [+ATR] value of the deleted segment. The tableau below shows that a high-ranking IDENT [low] prevents any change in the feature specification of /ɑ/.

(47) Inertness of /ɑ/ under morpheme deletion

/kʰɑ/+/il+ɑ/L2 | *HIATUS L2 | IDENT [low] | IDENT IO [ATR] | UNIFORM | *[–ATR][+ATR] | *HIATUS
a. kʰoilɑ | *! | * | | | | *
b. kʰolɑ | | *! | | * | |
c. kʰɑilɑ | *! | | | | * | *
d. ☞ kʰɑlɑ | | | * | | |

The optimal output candidate /kʰɑlɑ/ violates IDENT IO [ATR], as it does not preserve the features of the deleted segment faithfully. The higher-ranked status of IDENT[low] prevents the selection of candidates (47)-a, b and c. The hierarchy proposed till now is ranked as below:

(48) *HIATUS L2, IDENT IO [ATR] ≫ UNIFORM ≫ *[–ATR][+ATR] ≫ *HIATUS
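How indexation restricts *HIATUS L2 to particular morphemes under the hierarchy in (48) can be sketched as follows. This is an illustrative fragment under several assumptions: the candidate encoding (an output string plus a flag recording whether the suffixal /i/ was deleted), the reading of the co-ranked top constraints as an ordered pair, and all names are mine. Roots are assumed to be open monosyllables such as /rɔ/.

```python
ATR = {"i": True, "u": True, "e": True, "o": True,
       "ʊ": False, "ɛ": False, "ɔ": False, "ɑ": False}
INDEXED_L2 = {"il", "ib", "im"}  # constraint lexicon in (44); /is/ is not indexed


def profile(output, suffix, i_deleted):
    """Violation profile in the order of (48)."""
    tier = [s for s in output if s in ATR]  # vowel tier of the output
    hiatus = sum(1 for a, b in zip(output, output[1:]) if a in ATR and b in ATR)
    root_atr = ATR[tier[0]]                 # is the root vowel [+ATR] in the output?
    return [
        hiatus if suffix in INDEXED_L2 else 0,   # *HIATUS L2 (indexed morphemes only)
        int(i_deleted and not root_atr),         # IDENT IO [ATR]: [+ATR] of /i/ lost
        int(i_deleted and root_atr),             # UNIFORM: /i/ fused with root vowel
        sum(1 for a, b in zip(tier, tier[1:]) if not ATR[a] and ATR[b]),  # *[-ATR][+ATR]
        hiatus,                                  # general *HIATUS
    ]


def winner(candidates, suffix):
    """Pick the candidate with the lexicographically smallest profile."""
    return min(candidates, key=lambda c: profile(c[0], suffix, c[1]))[0]


# (output, suffixal /i/ deleted?)
IM_CANDIDATES = [("rɔim", False), ("roim", False), ("rɔm", True), ("rom", True)]
IS_CANDIDATES = [("roisɑ", False), ("rosɑ", True), ("rɔisɑ", False)]

print(winner(IM_CANDIDATES, "im"))
print(winner(IS_CANDIDATES, "is"))
```

Because /–is/ is absent from the assumed constraint lexicon, the indexed constraint is vacuous for it, and the hiatal candidate survives. For /–im/ the same hierarchy forces deletion with coalescence.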

Examples like /roisʊ/, in which hiatus is not resolved, also need to be accounted for in this analysis. These instances of non-applicability of exceptional hiatus resolution are a result of the non-indexation of these verbal morphemic extensions to any constraint demanding exceptional hiatus resolution. The proposed analysis therefore correctly predicts that hiatus resolution is prevented if the morpheme concerned is not indexed. In those cases, hiatus resolution is entirely absent at the juncture where the morphemes come together.

(49) No hiatus resolution when the morpheme is /–is/

Input: /rɔ/+/is/+/ɑ/
Constraints (left to right): *HIATUS L2, IDENT IO [ATR], UNIFORM, *[–ATR][+ATR], *HIATUS
Candidates: a. ☞ roisɑ, b. rosɑ, c. rɔisɑ

Finally, recall that in Assamese there are instances of epenthesis in other parts of the lexicon. The hierarchy proposed till now successfully accounts for these instances as well.

(50) *HIATUS and epenthesis

misɑ+iyɑ | *HIATUS L2 | IDENT IO [ATR] | UNIFORM | *[–ATR][+ATR] | *HIATUS
a. misɑliyɑ | | | | *! |
b. ☞ misoliyɑ | | | | |
c. misoyɑ | | | *! | |
d. misoiyɑ | | | | | *!

In these examples, hiatus is undone by epenthesising /l/ and /r/ respectively, and even though *HIATUS is low-ranked it plays an active role in choosing the candidate without any hiatal gap. There are instances of epenthetic /l/ and /r/ in Assamese, probably because coronals are unmarked epenthetic segments. Although I have nothing substantial to add with regard to the choice between /l/ and /r/, the featural value of epenthetic segments has been given a phonological interpretation in earlier work, for example in Yawelmani Yokuts harmony (Kuroda 1967 and Archangeli 1985). More work on the featural content of the epenthetic elements in the examples in (33) may throw light on this area in future. It may be mentioned, however, that since Prince & Smolensky (1991) the idea that epenthetic elements are not physically added as full-fledged segments has gained currency. Phonological epenthesis is only a place-holder for prosodic structure, and phonological constraints determine the featural specifications of epenthetic elements. For the time being, setting aside the featural content of the epenthetic elements, the hierarchy discussed till now shows how epenthetic behaviour can be fully accounted for by the constraint hierarchy posited. The tableau above shows that even though *HIATUS is low-ranked, its presence in the hierarchy is responsible for ensuring that hiatus can still be resolved by epenthesis. The activity of the *HIATUS constraint here is again reminiscent of the emergence of the unmarked. A point of significance here is also that whenever there is scope for possible interaction between the determinants of exceptional behaviour, there is none. In the tableau in (50), for /misɑ/ ‘lie’ + /–iyɑ/ SUFFIX, the output /misoliyɑ/ violates IDENT LOW but exhibits the unmarked pattern of syllabification, i.e. epenthesis. At the same time, in the verbal pattern, emergent unmarkedness shows up when /kʰɑ/ ‘eat’ + /–il/ + /–ɑ/ leads to /kʰɑlɑ/, an output where deletion is preferred over epenthesis but IDENT LOW remains unviolated. In all these instances of the morphology-phonology interface at stem-suffix boundaries, there is no overlap which would result in highly marked patterns of syllabification and alternation, which reinforces one of the primary assumptions of OT, that markedness is relative.

4.3. Conclusion

In the final reckoning, exceptional behaviour is dependent on the morpheme’s selection of the relatively unmarked. This leads us to advocate a proposal of the emergence of the relatively unmarked in the light of the exceptional processes discussed in this paper. Therefore this paper posits the view that there can be no default unmarkedness, but only emergent unmarkedness depending on the relevant morphological contexts. This view therefore repudiates the one put forward by Alderete (1999, 2001a), where the outputs of morpho-phonological processes are supposed to be constrained by the grammar of the individual language (see also Inkelas and Zoll 2003).
As we have seen from the Assamese examples, the attribute of ‘grammar dependence’ does not hold water insofar as the exceptional morpho-phonological interactions are concerned. A theoretical approach which would founder with respect to the Assamese patterns of morpho-phonological interaction is that of cophonology, more specifically in the manner argued by Anttila (2002), one of the main proponents of the cophonology approach. Cophonology in general requires exceptionalities in the lexicon and morphology to be analysed with the aid of lexically specified rankings of constraints which are already present in the grammar. In the approach advocated by Anttila (2002), only pairs of constraints whose ranking is unspecified in the grammar can have lexically specified rankings. When a lexical item is unspecified for the respective ranking, it shows variation. In the case of exceptions of the emergent variety, no such pairing is possible. In these cases, a low-ranked constraint supersedes all other markedness and faithfulness constraints and takes precedence over them. Anttila (2002) also presents this model to establish a connection between exceptionality and variation, so that whenever the constraints are unranked the resultant output has the potential to vary on either the markedness or the faithfulness count. This kind of variation is not attested in Assamese, and therefore this attribute of cophonology cannot apply to all instances of exceptionality in the grammars of languages. Further, as argued in Pater (2006), the approach of Anttila (2002) cannot elegantly capture the local effects of constraints which are active in exceptional triggering and blocking.

To summarise the instances of exceptionality discussed in this paper, we can recall that exceptionality was divided between exceptional patterns in the nominal morphology and in the verbal morphology. In the first case, a morpheme is expressed on the otherwise non-participating vowel /ɑ/. In traditional terms, this function of the morpheme can be seen as an overapplication of harmony, which otherwise applies only to the [+High –ATR] vowel /ʊ/ or the [–High –ATR] vowels /ɛ/ and /ɔ/, but never to the [+Low –ATR] vowel /ɑ/. The application of harmony in this context violates the constraint IDENT [Low], which is otherwise highly ranked in the normal phonology of Assamese. Furthermore, by initiating progressive front harmony from the root outwards, the morphemes /–iyɑ/ and /–uwɑ/ faithfully observe universally attested principles of front harmony.
Even though this exceptional pattern apparently violates the universal metaconstraint ranking root faithfulness above affix faithfulness, it is argued in this paper that the resultant exceptional processes are not determined by language-specific determinants of markedness, but rather support a more holistic approach to the markedness of exceptional patterns in languages. In exceptional patterns relating to hiatus resolution, it is shown that morphemes differ in their choice of syllabification because unmarkedness is again not a static choice, and its dynamism may be relative, governed by various factors which determine unmarkedness.

Acknowledgements

I have benefited from the observations of an anonymous reviewer in presenting the subject matter clearly. On various other occasions, the critical observations of Janet Grijzenhout, Joe Pater and Wim Zonneveld have also led to substantial improvements. Remaining errors of understanding and analysis are mine.

Notes

1. The status of these and other lexical items which are deemed to be roots in this paper is not subject to any dispute. They also do not alternate in the presence of non-ATR suffixal vowels. For instance, /ʊpɔr/ ‘top’ + /ɔt/ → /ʊpɔrɔt/ ‘on top’, /khɔrɔs/ + /ɔr/ → /khɔrɔsɔr/ ‘of expenses’, /tɛl/ + /ɑl/ → ‘having a lot of oil’.
2. The ranking of *[–ATR][+ATR] below IDENT [High] and *[–ATR, +High, –Back] is required because of blocking by /ɑ/.
3. This does not imply that I am arguing for a structure-preserving (Kiparsky 1973) approach to Assamese harmony. The very fact that the outputs of harmony, i.e. [e] and [o], have an allophonic status shows that such an approach will not reflect the actual harmonic process of Assamese.
4. These two suffixes trigger another type of alternation when they are preceded by words with a mid-low vowel followed by a low vowel. These are illustrated in (18) and discussed immediately thereafter.
5. /–iyɑ/ and /–uwɑ/ behave in a largely equivalent manner insofar as their involvement in exceptional patterns is concerned. Therefore any reference to only one of them implies reference to the other one as well.
6. In contrast, /ɑ/ in a root/stem position never alters the [±Back] quality of the prefixal vowel. /ɑ/ following a prefixal position always shifts to /o/.
7. Nothing much can be said about the epenthetic /l/ here. There are a few instances of epenthetic /l/ and /r/ in Assamese.
8. The ranking is not reflected here, but it will be shown in all other instances where /ɑ/ alters to /e/ when the mid vowel precedes it.
9. Front harmony is not instantiated anywhere in Assamese.
10. See also other work which argues against the metaconstraint FAITH ROOT >> FAITH AFFIX, especially Karvonen and Sherman (1997), Krämer (2003), Revithiadou (1998).
11. I refer to the surface inventory of Assamese which, undoubtedly, also consists of [e] and [o], albeit with restrictions. Although there may be some circumspection whether segments like [e] and [o] should be considered to be a part of the language’s vowel inventory, this approach was necessitated because of the existence of [e] and [o] in some non-alternating contexts as well (see Mahanta 2008). Although these occurrences can very well be called exceptional, the significance of gradient phonological behaviour displayed by [e] and [o] needs to be highlighted. These reasons have led to the tacit assumption in this paper that [e] and [o] are perhaps in an intermediate stage between allophony and non-allophony.
12. Verbal roots which are considered for analysis are the barest forms of the verb without any additional person, number, tense, aspect or modal markers. As such, these bare forms without any markers are used in the imperative (in the sense of a command or a direction) and can be used only non-honorifically.
13. In his typological study, Casali (1997) notes that in a root and suffix boundary, if the suffix is VC, a ranking of MAX MS (a constraint preserving all input segments) over MAX LEX (a faithfulness constraint protecting lexical words) would produce a deletion pattern, such as the one instantiated in Assamese. A discussion of various typological issues, such as the ones raised by Casali, is outside the scope of this paper.
14. A referee has pointed out the need to defend a hiatus avoidance analysis instead of an allomorphy analysis. I contend that even an allomorphy analysis is bound to address and analyse the same questions posited here. In other words, the phonological conditioning of the allomorphs /–l/, /–b/, /–m/ vis-à-vis /–il/, /–ib/, /–im/ will also stumble into the problem of hiatus avoidance. My approach in this paper argues for lexical indexation of the concerned morphemes instead of storing them as different allomorphs, as lexical indexation comes with built-in mechanisms for factors such as locality, which the phenomena under discussion comply with, whereas other analyses lack those means of analysing locality.
15. The underlying forms that are hypothesised for the suffixes show that hiatus avoidance results in epenthesis. The strategy involved in these forms shows two types of epenthesis: /y/ epenthesis in the presence of two vowels and /ɔ/ epenthesis in the presence of two consonants. /ɔ/ harmonises to /o/ if there is a following /i/.

References

Alderete, John 2001 Dominance effects as Transderivational Anti-Faithfulness. Phonology 18: 201-253.
Anttila, Arto 2002 Morphologically conditioned phonological alternations. Natural Language and Linguistic Theory 20(1): 1-42.
Archangeli, Diana 1985 Yokuts harmony: evidence for coplanar representations in nonlinear phonology. Linguistic Inquiry 16: 335-72.
Bakovic, Eric 2000 Harmony, Dominance and Control. Doctoral dissertation, University of California, San Diego.
Benua, Laura 2000 Phonological relations between words. New York: Garland Press.
Fukuzawa, Haruka 1999 Theoretical implications of OCP Effects on features in Optimality Theory. Doctoral dissertation, University of Maryland, College Park.
Gelbart, Ben 2005 The Role of Foreignness in Phonology and Speech Perception. Doctoral dissertation, University of Massachusetts, Amherst.
Goswami, Golok Chandra 1982 Structures of Assamese. Gauhati University: Department of Publication.
Horwood, Graham 1999 Anti-faithfulness and subtractive morphology. Ms, Rutgers University, New Brunswick, NJ. ROA-466.
Inkelas, Sharon and Cheryl Zoll 2003 Is grammar dependence real? Ms, UC Berkeley and MIT. ROA-587.
Itô, Junko and Armin Mester 1999 The Phonological lexicon. In Handbook of Japanese Linguistics, N. Tsujimura (ed.), 62-100. Oxford: Blackwell.
Itô, Junko and Armin Mester 2001 Covert generalizations in Optimality Theory: the role of stratal faithfulness constraints. Studies in Phonetics, Phonology, and Morphology 7: 273-299.

Karvonen, Daniel and Adam Sherman 1997 Sympathy, opacity and u-umlaut in Icelandic. Phonology at Santa Cruz 5: 37-38.
Kiparsky, Paul 1973 Abstractness, opacity, and global rules. In Three Dimensions of Linguistic Theory, O. Fujimura (ed.), 57-86. Tokyo: Taikusha.
Kiparsky, Paul 1993 Blocking in non-derived environments. In Studies in Lexical Phonology, S. Hargus and E. Kaisse (eds.), San Diego: Academic Press.
Krämer, Martin 2003 Vowel harmony and correspondence theory. Berlin: Mouton de Gruyter.
Kraska-Szlenk, Iwona 1997 Exceptions in phonological theory. Proceedings of the 16th International Congress of Linguists. Pergamon, Oxford: Paper No. 0173.
Kuroda, S. Y. 1967 Yawelmani Phonology. Cambridge, Mass: MIT Press.
Lubowicz, Anna 2002 Derived environment effects in Optimality Theory. Lingua 112: 243-280.
Mahanta, Shakuntala 2008 Directionality and Locality in vowel harmony: with special reference to vowel harmony in Assamese. Utrecht: LOT.
McCarthy, John and Alan Prince 1993 Generalized alignment. In Yearbook of Morphology, G. E. Booij and J. van Marle (eds.), 79-153. Dordrecht: Kluwer.
McCarthy, John and Alan Prince 1995 Faithfulness and reduplicative identity. UMOP 18: 249-384.
McCarthy, John and Alan Prince 1999 Faithfulness and identity in prosodic morphology. In The Prosody-Morphology Interface, Rene Kager, Harry van der Hulst and Wim Zonneveld (eds.), 218-309. Cambridge: Cambridge University Press.
Pater, Joe 2000 Nonuniformity in English stress: the role of ranked and lexically specific constraints. Phonology 17: 237-274.
Pater, Joe 2006 The locus of exceptionality: Morpheme-specific phonology as constraint indexation. In University of Massachusetts Occasional Papers in Linguistics 32: Papers in Optimality Theory III, Leah Bateman, Michael O'Keefe, Ehren Reilly, and Adam Werle (eds.), Amherst: GLSA.
Prince, Alan and Paul Smolensky 2004 Optimality Theory: Constraint interaction in Generative Grammar. Oxford: Blackwell.

Revithiadou, Anthi 1998 Headmost accent wins: Head dominance and ideal prosodic form in lexical accent systems. The Hague: Holland Academic Graphics.
Tunga, S.S. 1995 Bengali and other related dialects of South Assam. New Delhi: Mittal Publications.

Special Contribution: Indian Sign Language (ISL)

Typology of Indian Sign Language Verbs from a Comparative Perspective

Michael W. Morgan

1. ISL General Background

Indian Sign Language (hereafter ISL) has been characterized as the only indigenous pan-Indian language (Zeshan 2006: 322). It has also been argued (Zeshan 2000) that its usage extends beyond the political boundaries of India, at least into Pakistan and perhaps also Nepal, if not also Bangladesh and even Sri Lanka – which would make it the only indigenous pan-South Asian language as well.

ISL does not, of course, exist in isolation. Genetically, it may or may not be related, perhaps quite distantly, to other sign languages outside the subcontinent. That issue will not concern us here. In addition, like all minority languages, it is surrounded – and at times overwhelmed – by the majority (spoken) languages used around it. Although ISL is clearly not a signed variety of a spoken language (Signed Hindi, for example, or Signed English), as a minority language, it would be strange if it were not influenced in parts by the majority languages. South Asia being the kind of linguistic area that it is with its special areal features, we might expect ISL, as a pan-Indian language, to be a part of this South Asian Linguistic Area as well.

For example, the “basic” word order of ISL is clearly SOV. However, given (1) that ISL is a topicalizing language, and (2) that in normal discourse it is rare to find both subject and object expressed lexically in a single sentence – both features shared perhaps with many other sign languages around the world – it is questionable how meaningful the term “basic” is. Still, if there is no context, if there is no pragmatic marking, if the signer is forced to express both arguments lexically, and if the verb is a verb not inflecting for subject and object agreement, then we can indeed expect that the word order will be SOV. Although not frequent, postpositions (locatives, actually) are found instead of prepositions. Noun attributes go before the noun if “genitive” (possessive), as we can expect for a South Asian language; however, both adjectives and demonstratives tend to be postposed. Other word-order features, such as placement of modal verbs and negatives after the main verb, also make ISL seem right at home areally speaking, though these latter features are perhaps Modern Indo-Aryan rather than South Asian. If we look at two fairly typical ISL sentences (ignoring the non-manuals, and transcribing just the “words”), they both appear quite South Asian:

1. CAT TABLE UNDER SIT, TV WATCH
‘The cat sat under the table and watched the television’

2. I TEA DRINK CANNOT; TASTE LIKE NOT
‘I cannot drink tea, I don’t like its taste.’

In addition to existing in such a linguistic area, to the linguist at least, ISL also exists in typological space – together with other sign languages, and/or perhaps with spoken languages which are typologically similar in some way(s). In the present paper, we will attempt in a small way to place ISL in the wider typological and areal perspective. In order to do this, we present a classification of verbs, originally developed for Japanese Sign Language (Morgan 2005), and here slightly refined and applied to a corpus of 250-plus ISL verbs. This classification typology focuses on how core verbal arguments are encoded in the verb form itself. Representative lists of ISL verbs from the corpus are assigned to their proper class, and then compared with corresponding verbs in three other sign languages: British (BSL), American (ASL), and Japanese (JSL). Finally, we will examine the question of whether ISL has a class of "dative-subject" experiencer verbs, which then also reflects on the question of how well ISL fits into the South Asian Linguistic Area.

2. Introduction to Sign Language “Phonology”

Like other signed languages, ISL can be described at the sub-morphemic level in terms of approximately six aspects, the units of which serve to distinguish meaning. These are the handshape used to form the sign (often referred to in sign language literature as designator or dez), location in signing space (tabula or tab), movement (signation or sig), orientation (of both fingertips and palm), handedness (that is, whether the sign is produced with one hand or both), and contact (both its presence or absence, and the manner and place of contact between the hand(s) and either the other hand or any other part of the body).¹ As with spoken languages, minimal meaning-determining units can be established by “structuralist” methods such as minimal pairs, each member differing from the other by a single feature. Minimal pairs for the handshape and movement features are given in Fig. 1:

Figure 1. Minimal Pairs for Handshape and Movement Features (panels: GREEN, ENJOY, MY, SORRY)

GREEN and MY differ only in handshape, as do also ENJOY and SORRY (ignoring concomitant facial expressions). In addition, for GREEN and MY there is no movement involved, while for ENJOY and SORRY there is circular movement. All other aspects of the sign pairs are the same. For the location feature, an example of a minimal pair is THINK versus WOMAN:

Figure 2. Minimal Pairs for Location Feature (panels: THINK, WOMAN)

In these two signs the meaning-distinguishing element is the location: at the side of the nose for WOMAN and at the temple for THINK.
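The minimal-pair test used in this section can be sketched computationally. The following is only an illustration in Python, assuming that each sign is a bundle of values for the six aspects listed above. All value labels (`hs_1`, `loc_1`, `or_1`, and so on) are placeholders invented for the sketch, not transcriptions of the actual ISL signs, and the assumption that GREEN/ENJOY and MY/SORRY each share a handshape is one reading of Fig. 1 rather than something stated outright in the text.

```python
from itertools import combinations

# The six sub-morphemic aspects named in Section 2.
ASPECTS = ("handshape", "location", "movement",
           "orientation", "handedness", "contact")


def make_sign(**values):
    """Build a sign as a mapping from every aspect to a value label."""
    missing = [a for a in ASPECTS if a not in values]
    if missing:
        raise ValueError(f"missing aspects: {missing}")
    return {a: values[a] for a in ASPECTS}


def differing_aspects(sign_a, sign_b):
    """Return the aspects in which two signs have different values."""
    return [a for a in ASPECTS if sign_a[a] != sign_b[a]]


def minimal_pairs(lexicon):
    """List (gloss, gloss, aspect) for signs differing in exactly one aspect."""
    pairs = []
    for gloss_a, gloss_b in combinations(sorted(lexicon), 2):
        diff = differing_aspects(lexicon[gloss_a], lexicon[gloss_b])
        if len(diff) == 1:
            pairs.append((gloss_a, gloss_b, diff[0]))
    return pairs


# Hypothetical placeholder values; only the pattern of sameness and
# difference between signs is meant to follow the description in the text.
_shared = dict(orientation="or_1", handedness="one", contact="yes")
LEXICON = {
    "GREEN": make_sign(handshape="hs_1", location="loc_1", movement="none", **_shared),
    "MY": make_sign(handshape="hs_2", location="loc_1", movement="none", **_shared),
    "ENJOY": make_sign(handshape="hs_1", location="loc_1", movement="circular", **_shared),
    "SORRY": make_sign(handshape="hs_2", location="loc_1", movement="circular", **_shared),
    "THINK": make_sign(handshape="hs_3", location="temple", movement="none", **_shared),
    "WOMAN": make_sign(handshape="hs_3", location="side_of_nose", movement="none", **_shared),
}

if __name__ == "__main__":
    for gloss_a, gloss_b, aspect in minimal_pairs(LEXICON):
        print(f"{gloss_a} / {gloss_b}: differ only in {aspect}")
```

Under these placeholder values the function picks out GREEN/MY and ENJOY/SORRY as handshape pairs and THINK/WOMAN as a location pair, while pairs such as GREEN/SORRY, which differ in two aspects at once, are not reported.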

Two of the remaining three aspects, orientation and contact, are addressed in the following sections, which present a typology of how verbal arguments are expressed through ISL verb morphology and comment on how two South Asian areal features (dative-subject constructions and ‘explicator’ verb constructions) might also be found in ISL.

3. Typology of ISL Verbs: Argument Encoding

One of the major tasks of anyone decoding a linguistic message, whether that message is in a sign language or in a spoken language, is to identify who does what to whom; that is, to identify the various arguments of the verb and their roles (grammatical and other). These arguments and their roles can be coded in several ways: by (more or less) strict word order, by case marking on nouns, by coreferencing on the verb itself, or by combinations of the above. Indic and Dravidian languages typically have fairly unambiguous and transparent case marking on nouns (in the form of postposed case particles), fairly strict word order, and coreference on the verb of at least the grammatical subject role (subject of intransitives, logical patients of ergative-marked transitives). Sign languages seem universally to mark argument status (grammatical role) on the verb instead of marking on the noun. ISL is no exception.

In the classical classification of ASL verbs, developed by Padden (1988) and subsequently built upon, ASL verbs can be classified as: (a) plain verbs, which are unmodified for verbal argument, (b) agreement verbs, which agree with the subject and/or object, and (c) locational and directional verbs, which ‘agree’ with locative or ablative and allative objects. The second and third classes both involve movement from one point in space to another, but differ based on what type of role is encoded; in agreement it is the grammatical roles of subject and object (agent and patient), while in locational and directional verbs it is the roles of location, source and goal. A further subtype of locational verbs are verbs that can be located on the body (e.g. to indicate such things as ‘shave one’s head’ versus ‘shave one’s beard’), and in fact, in this group the location can be interpreted as the object (patient) as well.
In a fourth class of verbs, typically presented as separate from this classification framework, (d) classifier verbs, the verbal argument (usually the object) is incorporated into the form of the verb, not as movement but rather as handshape which stands for a class of objects (which are in turn arguments of the verb).

Although this classification has, with only minor modification, been fruitfully applied to a large range of sign languages, since the purpose of classificatory schemes is to allow for maximal generalizations, and to both group together things that are similar and to keep separate things that are different, the present author proposed a modification which he applied to JSL (Morgan 2005). The main motivation for this new classificatory typology was the fact that JSL manifests certain active-stative properties which were not picked up in the classic classification scheme. It was felt that this is because the above scheme does not differentiate sufficiently between types of objects (i.e. direct versus indirect objects), overdifferentiates between others (indirect versus allative objects), and ignores others altogether (incorporated objects). It was felt that a more detailed model, which included both the formal distinctions and the semantic distinctions simultaneously, might be more useful. (As the focus in this paper, like Morgan (2005), is on core verbal arguments, the class of Locational and Directional Verbs is not treated.)

A slightly modified form of this latter classificatory scheme is presented below, with details on the various sub-classes which can be identified, as well as a representative sample of ISL verbs in each class. As the purpose of this paper is to place ISL in a comparative context, comparisons are made with ASL, BSL and JSL. Data for these comparisons are based on: Baker-Shenk & Cokely (1980) for ASL; Brien (1992) for BSL; and the author’s 15 years as an active member of the Niigata, Kobe and Osaka signing communities for JSL, and 14 months as a member of the Bombay signing community for ISL.

3.1. Plain verbs

Plain verbs are verbs which do not modify to show agreement with either subject or object.
A sampling of ISL plain verbs follows, together with an indication of whether analogous verbs in ASL, BSL and JSL similarly belong to the plain verb class. The ‘=’ sign indicates that the verb is treated the same; the ‘≠’ sign indicates that it belongs to another class, with the class indicated in parentheses; a question mark indicates that there is uncertainty or complications which prevent clear attribution to a given class at this time; lack of indication means either that there is no analogous verb in the language in question, or that no data is available as to its class; indications such as JSL1, JSL2 mean that there is more than one sign with a given meaning:

BE-ABLE/CAN (=ASL, =BSL, =JSL), BE-BLIND (=ASL, =JSL), BE-COLD (=ASL, =BSL, =JSL), BE-DEAF (=ASL, =BSL, =JSL), BE-DIZZY (=ASL, =BSL, =JSL), BE-HOT (=ASL, =BSL, =JSL), BE-ILL (=ASL, =BSL, =JSL), BE-LATE (=ASL, =BSL, =JSL), BE-SURPRISED (=ASL, =BSL, =JSL1, =JSL2), BE-UNABLE/CANNOT (=ASL, =BSL, =JSL1, =JSL2), BEGIN (=ASL, =BSL, =JSL), BREATHE (=ASL, =BSL, =JSL), CELEBRATE (=ASL, =BSL, =JSL), CONTINUE (=ASL, =BSL, =JSL), CRY (=ASL, =BSL, =JSL1, =JSL2), DANCE (=ASL, =BSL, =JSL1, =JSL2), DECIDE (=ASL, =BSL, =JSL), DIE (=ASL, =BSL, =JSL), EXERCISE (=ASL, =BSL, =JSL), EXPERIENCE (=ASL, =JSL), FEAR/BE-AFRAID (=ASL, =JSL1, =JSL2), FEEL (=ASL, =BSL, =JSL), FINISH1 and FINISH2 (=ASL, =BSL, =JSL), FORGET (=ASL, =BSL, =JSL), HAVE-FEVER (=JSL), GROW-UP (=ASL, =BSL, =JSL), INTERPRET (=ASL, =BSL, =JSL), KNOW (=ASL, =BSL, =JSL), LAUGH (=ASL, =BSL, =JSL), LEAVE (=ASL, =JSL), LIKE (=ASL, =BSL, =JSL), LIVE (=ASL, =BSL, =JSL1, =JSL2), LOVE (=ASL, =BSL, =JSL), MAKE (=ASL, =BSL, =JSL), MISS (=ASL, =BSL, =JSL), NEED (=ASL, =BSL, =JSL), PLAN (=ASL, =BSL, =JSL1, =JSL2), PREPARE (=ASL, =JSL), REST (=ASL, =JSL), RIDE-BICYCLE (=ASL, =BSL, =JSL), SEARCH (=ASL, =JSL), SLEEP (=ASL, =BSL, =JSL), SMILE (=ASL, =BSL, =JSL), STOP (=ASL, =BSL, =JSL), SUFFER (=BSL, =BSL, =JSL), SWIM (=ASL, =BSL, =JSL1, =JSL2), TASTE (=ASL, =JSL), TELL-LIE (=ASL, =BSL, =JSL), THINK (=ASL, =BSL, =JSL), UNDERSTAND (=ASL, =BSL, =JSL), VIDEO1 (=ASL, =JSL1), WAKE-UP (=ASL, =JSL1, =JSL2), WANT (=ASL, =BSL, =JSL), WORK (=ASL, =BSL, =JSL), WORRY (=ASL, =BSL, =JSL)
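The comparison notation used in these lists is regular enough to be read mechanically. The following Python sketch is offered only as an illustration of the notation, not as part of the original analysis: it assumes entries of the form `GLOSS (=ASL, ≠BSL (class), ?JSL)`, and the names `TOKEN`, `parse_entry` and `same_class_counts` are invented here. The sample strings are taken from the verb lists in Section 3.

```python
import re
from collections import Counter

# One comparison token: a mark (=, ≠ or ?), a language label such as ASL or
# JSL2, and optionally the class named in parentheses after the label.
TOKEN = re.compile(r"([=≠?])\s*([A-Z]+\d?)(?:\s*\(([^)]*)\))?")


def parse_entry(entry):
    """Split 'GLOSS (=ASL, ≠BSL (plain))' into the gloss and per-language marks."""
    gloss, _, rest = entry.partition(" (")
    marks = {lang: (mark, cls or None) for mark, lang, cls in TOKEN.findall(rest)}
    return gloss.strip(), marks


def same_class_counts(entries):
    """Count, per language, the entries with at least one sign marked '='."""
    counts = Counter()
    for entry in entries:
        _, marks = parse_entry(entry)
        # JSL1 and JSL2 are variants within one language, so count JSL once.
        langs = {lang.rstrip("0123456789")
                 for lang, (mark, _) in marks.items() if mark == "="}
        counts.update(langs)
    return counts


SAMPLE = [
    "BE-BLIND (=ASL, =JSL)",
    "CRY (=ASL, =BSL, =JSL1, =JSL2)",
    "FIRE-FROM-WORK (=JSL; ≠BSL (plain))",
    "TEACH1 (=ASL, =BSL, ?JSL)",
]

if __name__ == "__main__":
    for entry in SAMPLE:
        print(parse_entry(entry))
    print(same_class_counts(SAMPLE))
```

A language that is simply absent from an entry (no analogous verb, or no data) produces no key at all, which keeps "no indication" distinct from an explicit ‘≠’ or ‘?’.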

Figure 3. Examples of Plain Verbs (A: DIE; B: LIKE; C: DOUBT)

As we see from this sampling of plain verbs, many of them are intransitive verbs. Since distinguishing the roles of multiple arguments is not generally a problem with intransitive verbs (as there typically is just a subject argument), it is not surprising that these verbs should be plain verbs, and plain verbs in all four sign languages. Likewise, as in English, some sign language verbs (CONTINUE, INCREASE, STOP, for example) can be either transitive (‘He stopped the bus’) or intransitive (‘He stopped before crossing the street’). Although this might give reason for confusion in transitive sentences, the fact that these verbs have an intransitive reading gives us less reason to be surprised at their lack of agreement. What remains are the anomalous verbs (like LOVE, or LIKE) which are always transitive, and whose status as plain verbs allows for confusion as to relative grammatical roles of verbal arguments (Does Rama love Sita? Or is it the other way around?). For ASL, and for numerous other sign languages, it has been suggested that the lack of agreement is motivated by phonological constraints; to wit, “[i]f the object has a combination of features that enables it to control direct-case agreement, the verb will take direct-case agreement. If the object does not have one of the necessary combinations of features (or if any of the nominals has an experiencer S[emantic]R[ole] and the verb is body-anchored), the verb will not agree” (Janis 1995: 217). Although it is possible, as in DOUBT pictured above, to indicate object agreement through non-manual means (directionality of eye gaze, shift in upper torso, etc.), this is equally the case with verbs which do agree, and so is not the determining factor.

3.2. Agreement Verbs

Agreeing verbs are those which modify their movement and/or orientation parameters to show agreement with subject and object. Typically, the motion is iconic, from subject to object.
This scheme can be nuanced by distinguishing full agreement verbs (with subject and object) from semi-agreement verbs (with object only), since (at least for ASL) it has been observed that “even if the subject has the appropriate features, it cannot control agreement unless the object also does” (Janis 1995: 217). In addition, there are a few reverse agreement verbs, where the motion is from the object to the subject, rather than the normal subject-to-object direction. Additionally, it should be pointed out that while all urban sign languages represented in the literature have agreement verbs, at least two village sign languages do not: Kata Kolok in Bali (Marsaja 2008: 168) and Al-Sayyid Bedouin Sign Language in Israel (Meir et al. 2007: 555f). This is important to remember since, as pointed out above, India too has village sign languages, which have yet to be examined.

3.2.1. Full Agreement Verbs

ASK2 (=ASL, =BSL, =JSL), CHAT-WITH (=ASL, =JSL), FIRE-FROM-WORK (=JSL; ≠BSL (plain)), FORCE (=ASL), GIVE-TO (=ASL, =BSL, =JSL), HELP (=ASL, =BSL, =JSL), INFLUENCE (=ASL, =BSL, =JSL), INFORM (=ASL, =BSL, =JSL), LOOK-AT (=ASL, =BSL, =JSL), SAY-NO-TO (=ASL, =JSL), SCOLD (=ASL, =JSL), SEND-TO (=ASL, =JSL; note: JSL has separate signs for SEND-BY-POST-TO, SEND-FAX-TO, SEND-EMAIL-TO, all being Full Agreement verbs), SHARE (=ASL, =JSL), SHOW-TO (=ASL, =BSL, =JSL), TEACH1 (=ASL, =BSL, ?JSL), TEASE (=ASL), VIDEO2 (=JSL2), WATCH (=ASL; ≠JSL (plain))

Figure 4. Example of Full Agreement Verb (A: YOU-INFLUENCE-HIM; B: HE-INFLUENCE-ME; C: I-INFLUENCE-YOU)

It should be noted that while the majority of these verbs show agreement through both motion and orientation, a few (e.g. WATCH, LOOK-AT) may optionally show agreement through orientation alone. And at least the sign VIDEO2 shows agreement only by orientation. It should also be noted that a few of these verbs (e.g. WATCH, TEACH2) may not always (or by all signers) be used as full agreement verbs. Although the details of such non-agreement await further study, for these two verbs at least it seems that, as they are quasi-body-anchored signs (i.e. both are normally produced in the vicinity of the face), they can agree if sufficient context is present to make clear the lexical meaning; then the verb is free to detach itself from its normally anchored location and agree.

3.2.2. Reverse Agreement Verbs

In this group movement is from object to subject, the reverse of what is normally the case:

BORROW-FROM (=ASL, =BSL, =JSL), COPY (=ASL, =BSL, =JSL), INVITE (=ASL, =BSL, =JSL)

Figure 5. Example of Reverse Agreement Verb (A: I-COPY-YOU; B: YOU-COPY-ME)

In all verbs in this class the reverse movement is iconic not of ‘subject acts on object’ directionality but rather of the ‘transfer’ of object: ‘I copy from you (to me)’.

3.2.3. Object-Only Agreement Verbs

As noted above, some verbs in other sign languages (ASL, for instance) agree solely with their object. While this may also be the case for ISL, no instances were noted in the corpus. Instances of a few verbs, such as TELL, WATCH, LOOK-AT, etc., do at times occur with only object-agreement. In fact, with some (WATCH, for example) subject-agreement is quite rare. However, since full agreement is possible, I have chosen to classify them as Full Agreement Verbs rather than, for example, as two separate verbs WATCH1 which has full agreement and WATCH2 which has only object agreement, which is the approach that Liddell (2003) has chosen.

3.2.4. Subject-Only Agreement Verbs

It should be noted again that no verbs agree only with the subject (at least not in the sign languages dealt with here). This is important to remember in the discussion of Dative-Subject Verbs in Section 4.2 below.

3.3. Plain Verbs with Object (Classifier) Incorporation

Some signs, although “plain” by the traditional definition, do in fact encode one or more of the main verbal arguments, not by agreement, but rather by incorporation. In a reexamination of the nature of sign language, Stokoe (the father of modern Sign Linguistics) proposed a ‘semantic phonology’ model (Stokoe 1991), whereby the structural elements of signs, normally taken to be equivalent to the phonological level in spoken languages, are not merely meaning-distinguishing, but also meaning-determining. The parts of the sign are described as grammatical: the handshape represents the agent and the movement represents the verb, and thus the sign itself is a predication:

Sign languages unite noun classifier and verb theme in signing in a way that speech does not necessarily follow… In a language like ASL, the classifier furnishes the handshape that denotes the object, and moving that handshape in the appropriate way expresses the verb’s theme. (Stokoe 2001: 189-190)

While in fact in some signs the handshape does represent the agent, in the verbs in question here, the handshape represents the object. Examples of Plain Verbs with Object Incorporation include:

BREAK (=ASL, =BSL, =JSL), DRINK (=ASL, =BSL, =JSL), EAT (=ASL, =BSL, =JSL), GIVE-BIRTH-TO (≠ASL (plain), ≠JSL (plain); note however JSL signs SON, DAUGHTER with object incorporation), SMOKE-CIGARETTE (=ASL, =BSL, =JSL)

The handshape, analyzed as being a classifier, can be of two types: it can represent the shape (or size and shape) of the object (= entity classifier) or the manner in which the object is handled (= handling classifier):

In ASL, to relate that someone gave something, movement expresses … the theme; that is, the hand moves from the assumed location of the giver toward the recipient. The handshape making the movement often appears as it would if it were actually holding the object being given: a book, a glass of water, a ball, a basket with handle, or something tall and thin like a staff, and so on. Thus the sign verb’s handshape is a stem. It is also a classifier because it denotes a salient characteristic of what is being given or held or whatever. (Stokoe 2001: 182)

Thus:

Figure 6. Example of Plain Verb with Object-Incorporation (DRINK)

This example, DRINK, and numerous others like it, could also be classified as representing another class: Plain Verbs with Instrument Incorporation. However, this complication appears unnecessary at this point. What is being represented iconically by the incorporated handshape may in fact be the instrument argument. While the handling classifier handshape is that for the GLASS and not the WATER or MILK in the glass which is what is really being drunk, if we were to set up a separate category, then we would need to analyze EAT as sometimes being in one class (with the classifier being object incorporation: such as EAT-SANDWICH) and at other times in the other (with the classifier being instrument incorporation: such as EAT-WITH-FORK = ‘eat spaghetti’, EAT-WITH-SPOON = ‘eat soup’, etc.). However, as the choice of instruments is dictated by the direct object (e.g. whiskey is usually drunk from small shot glasses rather than tumblers; spaghetti is eaten with a fork rather than a spoon, chopsticks or one’s hands, etc.), indirectly the object argument is also encoded. And it is this latter analysis that we apply here.

3.4. Agreement Verbs with Object Incorporation

In the previous section we saw that there is a group of Plain verbs where the object argument is in fact encoded in the verb, not through the normal agreement pattern, but rather through incorporation of a classifier handshape. Similarly, there is a group of Agreement verbs which also employ classifier handshape incorporation to indicate an object. Thus one object (the indirect object) is encoded through agreement, and the other (direct object) is encoded through incorporation. In this group we have such verbs as:

ANSWER (≠JSL (agreement), ≠ASL (agreement)), ASK1 (=JSL; ≠ASL (agreement)), GIVE-BOX-TO (=ASL, =JSL), GIVE-GIFT-TO (≠ASL (agreement)), GIVE-TEACUP-TO (=ASL, =JSL), TELEPHONE (=BSL, =JSL)

For example, various signs for GIVE, depending on the object given, are shown in Fig. 7:

Figure 7. Example of Agreement Verb with Object Incorporation (A: I-GIVE-TO-unmarked-YOU; B: I-GIVE-BOX-TO-YOU; C: I-GIVE-TEACUP-TO-YOU)

If we now compare the sign GIVE-GIFT-TO (not shown) with the other signs listed for giving (GIVE-TO-unmarked, GIVE-BOX-TO, and GIVE-TEACUP-TO), there is clearly structural similarity. However, while, for instance, the handshape used for the sign GIVE-TEACUP-TO is the same as in TEA, the handshape in GIVE-GIFT-TO is the fingerspelled letter ‘G’ – indexic² to the object, a gift (which word begins with the letter ‘G’). If we now look at the signs for ASK1 and ANSWER, what we find is that the handshapes are those of the finger alphabet Q and A, respectively. If we treat the handshape for tea as being incorporated (as a handling classifier) into the sign GIVE-TEACUP-TO, and by analogy the G handshape in GIVE-GIFT-TO, then it seems reasonable to analyze the Q and A handshapes as being the “classifiers” for the concepts QUESTION and ANSWER incorporated into a sign, where the other morpheme is the subject-to-object-directed-motion morpheme.

Figure 8. Agreement Verbs with Fingerspelled Object Incorporation (A: I-ASK1-YOU; B: I-ANSWER-YOU)

What made the analysis of the classes of verbs with argument incorporation interesting in the earlier study of JSL (Morgan 2005) was the ergative patterning of what was incorporated: subject with intransitive verbs and objects with transitive verbs. Now, looking at these same verbs (and particularly those which also show agreement) within a South Asian Areal context, another possible interesting interpretation presents itself. Namely, given that the handshape is itself a sign (either a full sign as in PHONE, or a classifier as in DRINK, or a fingerspelled letter as in ASK1), and that the movement is the manifestation of verbal agreement, then if we subtract these two we are left with the ‘verb’ … which is formally and semantically void. Although explicator verbs common in South Asian languages are not themselves semantically void (e.g., Nepali garnu means ‘to do’), in the context of explicator verb constructions (fon garnu ‘to phone’) they contribute nothing except the verb frame onto which we can ‘hang’ agreement and the incorporated object as well. ISL Agreement Verbs with Object Incorporation might thus be taken to be, mutatis mutandis, a type of explicator verb construction.

3.5. Plain Verbs with Object plus Instrument Incorporation

As we saw above, some verbs like DRINK or EAT-WITH-SPOON may be analyzed as cases of instrument incorporation, but for simplicity’s sake we chose to treat them as object incorporation. However, there are further verbs where the handshape of one hand incorporates the object and that of the other incorporates the instrument. All these verbs appear to be Plain Verbs. For example:

CUT/SLICE (=ASL, =JSL), KICK-BALL (=ASL, =JSL), SAW (=ASL, =JSL)

Depending on what is being cut (its size and shape, and thus the handling classifier handshape) and what it is being cut with (the size of the blade), a number of combinations are possible.

3.6. Subject Incorporation

In JSL with its human gender classifiers there is a long list of verbs in which the handshape represents neither an instrument nor an object of a transitive verb, but rather the subject of an intransitive verb (and in very exceptional cases, the agent of a transitive verb). Such verbs are much less common in either BSL or ASL, and even less so in ISL. Nevertheless, there are a few examples:

Typology of Indian Sign Language Verbs 117

CRASH: airplane (=ASL, =BSL, =JSL), CRASH: car (=ASL, =JSL), DESCEND (=ASL, =BSL, =JSL), FLY: airplane (=ASL, =BSL, =JSL), STAND (=ASL, =BSL, =JSL), WALK (=BSL, =JSL)

This group of verbs is also important because, while it has been argued (for ASL at least) that verbs show agreement with subject arguments only if they also show object agreement, this restriction does not apply to argument incorporation.

3.7. Plain Verbs with Subject and Object Incorporation

Although subject arguments are typically incorporated only with intransitive verbs, there is one type of verb which is an exception. For example:

PASS-(SOMEONE) (=ASL, =JSL)

What characterizes this group (illustrated here by a single verb, but no doubt also containing others) is that there is a certain equality (and reciprocity) between subject and object. (As noted above, this group is much more abundantly represented in JSL than in the other sign languages dealt with.)

4. ISL and the South Asian Sprachbund: the Dative-Subject Construction

4.1. Is the Dative-Subject Construction Areal?

One of the features Masica describes as area-defining without going beyond “India” is the dative-subject construction – which has, in various contexts, been called non-nominative subject, oblique subject, experiencer subject, etc. “It may well be that this [Dative Subject Construction – MWM] is a criterion that sets off the “Indian area” more sharply than any other … present in all the major languages to a degree that seems to be unparalleled elsewhere” (Masica 1976: 164). The construction has been described elsewhere as “a feature of some significance in defining India and South Asia as a linguistic area” (Verma & Mohanan 1990: vii) and as “a distinctively South Asian trait” (Masica 2001: 254). Such a characterization seems to be more than just an exaggeration, and the claim that “it is not characteristic of the immediately surrounding languages, including those in Central Asia [and t]he closest case geographically seems to be Georgian” demonstrates perhaps an insufficient examination of all the languages of the “immediate” area. And, in fact, this view has been challenged “in view of the fact that this type of construction is widely attested among areally and genetically distinct languages around the world” (Shibatani & Pardeshi 2001: 313). As Jelinek & Willie note:

Across languages, canonical transitive verbs are said to assign an Agent theta role to the Subject, and a Patient theta role to the Object argument. In contrast, dyadic psych verbs are more varied in thematic structure; they assign Experiencer and Theme theta roles, and the Experiencer may be either Subject, Object, or an Oblique argument. While psych verbs seem idiosyncratic in thematic structure when compared to ordinary transitive verbs in a given language, they fall into similar classes across languages, and thus present important data for the theory of theta roles and argument structure in universal grammar. (1996: 15)

Even if we exclude languages where dative case for logical subjects of predicate expression is only found for expressing possession, which as Masica (2005 [1976]: 168) notes “cuts a wider swath” than other experiencer predicate types, roughly similar phenomena are found in almost every continent and major region of the world (see note 3):

1) Africa: Yoruba* (Rowlands 1976: 127-130);
2) Asia: Chantyal (Noonan 2003), Japanese, Korean, Lhasa Tibetan (DeLancey 2003), Oroqen (Avrorin & Boldyrev 2001: 126), Tamang (Mazaudon 2003), Tatar;
3) Caucasus: Chechen (Aliroev 1999: 54, 96), Kabardian (Colarusso 1992: 179), Lezgi (Haspelmath 1993: 280-283; Alekseev & Shejxov 1997: 40, 90), Tabasaran (Alekseev & Shixalieva 2003: 38, 97), Tzakhur (Kibrik 1999: 351-352);
4) Central America: Kaqchikel (Lolmay & Pakal B’alam 1997: 189);
5) Europe (limiting ourselves to two per subfamily!): Basque, Bosnian/Croatian/Serbian, German, Icelandic, Irish(*), Italian, Russian, Spanish, Welsh*;
6) Middle East: Hebrew (Givón 2001: 206);
7) North America: Chickasaw (Givón 2001: 210), Choctaw (Broadwell 2006: 142, 145), East Pomo* (McLendon 1996: 354), Haida* (Mithun 1999: 214-216), Osage (Quintero 2005: 261-265);
8) Pacific: Tuvaluan (Besnier 2000: 271-275), Waris (Foley 1986: 109-110), Yimas (Foley 2007 [1985]: 373);
9) South America: Damana (Adelaar 2004: 72), Kogui (Adelaar 2004: 72), Tariana* (Aikhenvald 2003: 239-241), etc.

Although some represented areas are perhaps underrepresented (Africa, Central America, Middle East), only Australia and Southeast Asia seem to be lacking in this type of construction (though this may be a result of my very limited and unscientific convenience sampling). As Masica notes, the Caucasus area is a good parallel for South Asia, but genetically closer at hand, dative experiencer subjects seem to be present in abundance in every major Indo-European language family: Slavic (Russian parallels can be found for almost all the same types of verbs as in South Asia, as we see in the list below) and Germanic (the literature for the phenomenon in Icelandic is especially abundant) stand out as exceptionally well represented in dative-subject constructions.

Among his suggestions for further work, Masica (2005 [1976]: 185-186) indicates that compilation of some sort of “Swadesh-style” list of potential dative-subject verbs and predicates, together with a +/- testing of each in each language, is desirable to refine our knowledge of the distribution of the feature in question. Then, if for example, a given language has only one or two pluses for a feature, or if the pluses are all limited to one type of verb/predicate (for example, modals but not possession or psychological or physiological state predicates), or if one class of predicates (for example, involuntary action) is present only among a very limited range of languages in the area in question, then we have reason to question, at least, whether the given language manifests the same phenomenon (rather than maybe a more general “universal” tendency) and is part of the proposed linguistic area. To my knowledge, nothing like such a checklist has ever been developed and applied to the full range of languages in South Asia (and, for comparison, out of South Asia as well).
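The +/- checklist procedure described above can be sketched in code. What follows is only a minimal illustration in Python, not an implementation of any existing tool: the small predicate subset, the two languages LangA and LangB, their +/- judgments, and the threshold min_pluses are all invented for the example and are not data from any of the sources cited.

```python
from collections import defaultdict

# Hypothetical mini-checklist: predicate -> predicate type.
# Only a handful of predicates are listed; a real checklist would be far longer.
PREDICATE_TYPES = {
    "HAVE": "possession",
    "B/F-HAPPY": "psychological",
    "FEAR": "psychological",
    "B/F-HUNGRY": "physiological",
    "MUST": "modal",
    "CAN/CANNOT": "modal",
}

# Invented +/- judgments (True = a dative subject is possible with the predicate).
JUDGMENTS = {
    "LangA": {"HAVE": True, "B/F-HAPPY": True, "FEAR": True,
              "B/F-HUNGRY": True, "MUST": True, "CAN/CANNOT": False},
    "LangB": {"HAVE": False, "B/F-HAPPY": False, "FEAR": False,
              "B/F-HUNGRY": False, "MUST": True, "CAN/CANNOT": True},
}


def profile(judgments):
    """Count the pluses of one language per predicate type."""
    counts = defaultdict(int)
    for predicate, plus in judgments.items():
        if plus:
            counts[PREDICATE_TYPES[predicate]] += 1
    return dict(counts)


def is_doubtful(judgments, min_pluses=3):
    """Flag a language that has very few pluses overall,
    or whose pluses are all confined to a single predicate type."""
    counts = profile(judgments)
    total = sum(counts.values())
    return total < min_pluses or len(counts) <= 1


for name, judgments in JUDGMENTS.items():
    print(name, profile(judgments), "doubtful" if is_doubtful(judgments) else "ok")
```

In this toy data LangB has pluses only for modal predicates, so it is flagged as a doubtful member of the area, whereas LangA, with pluses spread over four predicate types, is not. Where exactly to set such a threshold is a matter for empirical work, not something the sketch decides.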
Masica’s claim that “[c]ertain other languages include varying small parts of this territory in their domain of the construction” (2005: 164) is something which surely can be empirically tested; it seems, for example, that the domain of this construction in Russian (and other Slavic languages) is hardly less than in the average (or above average) South Asian language. And, if we define the feature in question as oblique subject rather than dative subject (thereby allowing Bengali into the South Asian fold), then certainly the presence of the phenomenon in Welsh and Irish (and other modern Celtic languages) could also not be described as ‘small’.


Michael W. Morgan

I do not intend here to devise or present such a thorough and complete list of experiencer verbs/predicates represented with dative / non-nominative subjects in South Asia, but in order to test whether we can make any judgments of the category for ISL, below is a very tentative list, drawn from examples found in various works on the subject (Masica 1991: 347ff; Verma & Mohanan 1990; Shibatani & Pardeshi 2001; Sjoberg 2001). First we will just present the list, and check off whether the sources from which the list was compiled indicate that any Indo-Aryan or Dravidian languages in fact use the dative to code logical subjects for each verb/predicate concept, and, for comparison, we add Russian, a non-South Asian, Indo-European language where dative experiencer subjects are abundantly attested (indicating also for Russian cases where the subject is expressed by a non-nominative case other than dative):

Table 1. Classification and List of Representative Experiencer Predicates

Predicate type Possession/ Existence Psychological states

Example verb HAVE GET/RECEIVE LOSE B/G-ANGRY B-ASHAMED B-FED-UP-OF B/F-HAPPY B/F-SAD B/F-SHY B-SORRY B-SURPRISED BELIEVE FEAR FEEL-LIKE FORGET G/H-COURAGE HOPE KNOW LAUGH LIKE LOVE MIND MISS REALIZE

in Indic YES YES (genitive) YES YES

in Dravidian YES YES

YES YES YES

YES YES YES

YES

YES YES YES YES YES YES YES YES YES YES YES

YES YES YES

YES YES YES YES

in Russian (locative) NO NO NO YES YES YES YES (reflex) YES (reflex) NO (reflex) YES NO NO (reflex) NO (reflex) YES NO YES NO (reflex)


Physiological states

Visual / auditory perceptions

Modal states of necessity (including obligation) and wanting

Modal states of potentiality (including ability and permission) Other Involuntary Uncontrolled

/

REMEMBER UNDERSTAND WANT WORRY B/G-FULL B/F-HUNGRY B/F/G-SLEEPY B/F-TIRED B/F-THIRSTY FALL-ILL F-BAD F-COLD B/F/G-HURT G/H-COLD G/H-COUGH G/H-FEVER G/H-(head)ACHE F/H-PAIN APPEAR-(adj) FIND (CAN)-HEAR (CAN)-SEE SEEM SMELL-(adj) WANT-TO HAVE-TO MUST NEED NEED-TO OUGHT-TO SHOULD CAN/CANNOT KNOW-HOW-TO MANAGE-TO MAY/MIGHT SUCCEED-IN BE-IN-HABIT-OF HAPPEN-TO (involuntarily, on

YES YES YES

YES YES YES YES YES YES YES YES YES YES YES YES YES

YES YES YES YES YES YES

YES YES YES YES YES YES YES YES

YES YES YES NO NO YES YES YES YES NO YES YES (accusative) (locative) (locative) (locative) (accusative) YES

YES YES YES

NO YES YES YES

YES YES YES YES YES YES YES

YES YES YES YES YES YES YES YES NO YES NO YES NO NO

YES

YES YES YES YES YES

YES

action

impulse, by accident)

KEY: B- = BE/BECOME, H- = HAVE, F- = FEEL, G- = GET. YES = concept can be (though not necessarily always is) expressed by dative subject; NO = cannot be expressed by dative subject; a blank space means that no example was found in the sources I examined (but not necessarily that it is not found in the language family in question). (locative), (accusative) = logical subject is non-nominative, but locative or accusative rather than dative; (reflex) = reflexive, i.e. logical subject is encoded both by nominative on noun and by non-nominative (i.e. reflexive/reciprocal) on verb which is ambiguous as to accusative vs. dative.

That the third and fourth columns have more ‘YES’es than the last column does not necessarily mean that dative experiencer subjects are more a phenomenon of South Asian Indic and Dravidian languages than they are of Russian, since a YES in these two columns means that dative subjects are found with a given experiencer predicate in one or more, but not necessarily every, Indic and Dravidian language. Probably if we examined each and every Indic and Dravidian language, we would find many with fewer ‘YES’es; no doubt we would also find a few with more. On average though, Russian is at least as prone to dative experiencer subjects as the average South Asian language, if not more so.

4.2. Does ISL have a Dative-Subject Construction?

Unlike Indo-Aryan languages or Dravidian languages, neither ISL nor any other sign language I am aware of has any nominal cases. So, the question must be: is there anything about how experiencer verb signs are formed that is identical (or at least similar) to how dative verbs are formed? Although there is no separate class of “Dative verbs” in ISL, verbs which have indirect objects (GIVE-TO, SHOW-TO, etc.) are mostly agreement verbs, and the prime characteristic of agreement verbs is movement of the sign towards the (indirect) object, and/or orientation (of palm or fingers) towards the object. Table 2 reproduces the complete list of Experiencer Predicates from Table 1, and presents data on the appropriate ISL verb which might give us a clue as to whether we should classify these verbs as Dative Verbs: whether it is an agreement verb, whether it is Anchoring (which we saw above is a counter-indicator of agreement), plus whether the movement and orientation of the sign is towards the signer or not.

Table 2. Experiencer Predicates in ISL (see note 4)

Predicate type

Example verb

Possession/ Existence

HAVE GET/RECEIVE LOSE B/G-ANGRY B-ASHAMED B-FED-UP-OF B/F-HAPPY B/F-SAD B/F-SHY B-SORRY B-SURPRISED BELIEVE FEAR FEEL-LIKE G/H-COURAGE HOPE KNOW LAUGH LIKE LOVE MIND MISS REALIZE REMEMBER UNDERSTAND WANT1 WANT2 WORRY B/G-FULL B/F-HUNGRY B/F/G-SLEEPY B/F-TIRED B/F-THIRSTY FALL-ILL F-BAD

Psychological states

Physiological states

SubjectObject agreement NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO NO

Anchoring (contact) NO NO NO YES YES YES YES YES YES YES YES YES YES YES YES NO YES YES YES YES NO NO YES YES YES NO YES YES YES YES YES YES YES YES

Motion towards signer NO YES NO NO NO NO NO NO NO NO NO NO NO --NO NO +\YES NO +/+/NO YES NO NO +/NO NO NO NO +/NO NO YES

Orientation towards signer NO YES NO NO NO NO NO YES YES YES NO NO NO --YES NO YES YES NO YES YES NO YES YES YES NO YES YES NO YES +/NO YES YES


Visual / auditory perceptions

Modal states of necessity (including obligation) and wanting

Modal states of potentiality (including ability and permission) Other Involuntary / Uncontrolled action

F-COLD B/F/G-HURT G/H-COLD G/H-COUGH G/H-FEVER G/H-HEADACHE F/H-PAIN APPEAR-(adj) FIND (CAN)-HEAR (CAN)-SEE SEEM SMELL-(adj) WANT-TO HAVE-TO MUST NEED NEED-TO OUGHT-TO SHOULD CAN/CANNOT KNOW-HOW-TO MANAGE-TO MAY/MIGHT SUCCEED-IN BE-IN-HABIT-OF HAPPEN-TO (involuntarily, on impulse, by accident)

NO NO NO NO NO NO NO NO NO NO NO

NO YES YES NO YES NO NO +/NO +/+/-

NO YES NO NO NO NO NO NO NO YES NO

NO YES YES NO NO NO YES YES NO YES YES

NO NO NO NO

NO NO NO YES

YES +/NO NO

YES YES NO NO

NO NO NO NO NO NO --

NO YES NO NO NO YES --

NO +/NO NO NO NO --

NO YES NO NO NO YES --

KEY: B- = BE/BECOME, H- = HAVE, F- = FEEL, G- = GET; signs are entered only once, so for instance the same sign is used for both NEED and NEED-TO, and so is entered only for the former and the latter is left blank

The first thing that will be apparent is that none of the experiencer verb signs show subject-object agreement; without exception they are plain verbs in the classical classificatory scheme. A number of these verbs are intransitive and for this segment we could expect agreement to be lacking. However, the fact that none of these verbs shows agreement demands explanation. For the overwhelming majority of these verbs, the fact that there is body contact (anchoring) – especially for the overwhelming majority of signs in the sub-classes of psychological and physiological states – may be a reason for this lack of agreement.

Although there is a lack of classical agreement verbs – verbs showing the motion from subject-to-object type – there is a small group which shows motion towards the signer and a much larger group which shows orientation (usually of palm) towards the signer. If we take ‘directionality towards’ (either motion or orientation) as an indicator of dative (or at least object/oblique) status, then what we have is the signer marked as dative recipient (experiencer). Motion, if you will, is not towards the logical subject of the sentence, but rather towards the signer. That the signer is the prototypical subject, and that a consequence of this is the class of body-anchored plain verbs, has been pointed out by Meir et al. (2007). The fact that in the group of experiencer verbs (and in fact the overwhelming majority of all verbs that typically have dative subjects in South Asian languages) the orientation of signs is towards the body of the signer seems to indicate that the subjects here are objects of a sort – to wit, datives. Thus, ISL seems to fit in well with the South Asian model.

This same applies, mutatis mutandis, to other sign languages as well. For JSL there is one exception, the verb HAVE, but even here the exception proves the rule. The sign HAVE in JSL optionally shows agreement, but agreement with only one of its arguments. Above (Section 3, Class B) we saw that if verbs agree with only one argument, then it is the object argument that they agree with. The verb HAVE, however, agrees not with the object, but with the subject, and does so through motion and orientation. That is to say, the motion and palm are both directed at the (logical) subject. In Morgan (2005), I argued that HAVE was an affect verb, rather than a simple transitive verb. And affect verbs typically, be it in South Asian languages or in Caucasian languages, have dative subjects.
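The ‘directionality towards’ criterion discussed in this section can be stated as a simple test over the columns of Table 2. The following Python sketch is only an illustration under stated assumptions: the class and function names are hypothetical, the four rows are the values as they appear to read for the first four entries of Table 2, and the intermediate ‘+/-’ values found elsewhere in the table are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignFeatures:
    """One row of a Table-2-style feature matrix (hypothetical encoding)."""
    gloss: str
    agreement: bool              # subject-object agreement
    anchoring: bool              # body contact
    motion_to_signer: bool
    orientation_to_signer: bool


# Values as they appear to read for the first four rows of Table 2.
ROWS = [
    SignFeatures("HAVE", False, False, False, False),
    SignFeatures("GET/RECEIVE", False, False, True, True),
    SignFeatures("LOSE", False, False, False, False),
    SignFeatures("B/G-ANGRY", False, True, False, False),
]


def directed_at_signer(row):
    """'Directionality towards' = motion OR orientation towards the signer."""
    return row.motion_to_signer or row.orientation_to_signer


def dative_like(rows):
    """Glosses of signs that meet the directionality criterion."""
    return [row.gloss for row in rows if directed_at_signer(row)]


def anchored(rows):
    """Glosses of body-anchored signs, for which agreement is not expected."""
    return [row.gloss for row in rows if row.anchoring]


print("directed at signer:", dative_like(ROWS))
print("body-anchored:", anchored(ROWS))
```

Treating motion and orientation as alternatives (a logical "or") follows the wording of the criterion in the text; a stricter test requiring both would single out a much smaller group of signs.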

Conclusion

If we return to the question of what kind of language ISL is, for some features (various word order features, for example) it seems it could probably be described as a not overly aberrant South Asian language (at least with regard to non-phonological features), and thus a member of the South Asian Linguistic Area. If we allow for a certain creativity in analysis, certain other features characteristic of the wide swath of South Asian languages (such as, as argued, dative-subject constructions and ‘explicator’ verbs) might also be seen as present in ISL.

Not surprisingly, ISL manifests much less ambiguous typological affinities with other sign languages, and, of those examined, especially with JSL. Word order features, for example, are so similar that one feels one could sign ISL with one hand and JSL with the other and never get confused. On the other hand, this impression, based on one feature, may be contradicted by conclusions based on another feature (for example, although they both have basically the same classes of verbs as described above, JSL is much more highly incorporating).

So again: what ‘type’ of language is ISL? The present work has been an addition to the small but growing body of descriptive and comparative work trying to answer that question. But, as with the ‘South Asian language type’, much more work needs to be done.

Acknowledgments

The author would like to thank his Deaf ISL informants and co-workers: Sujit V. Sahasrabudhe, Sunil V. Sahasrabudhe, Prakash Khairnar, Mona Shah, Bhavini Modi and Smita Patel – especially the first two, with whom he has had numerous discussions on various issues related to ISL and sign linguistics.

Notes

1. These six, however, only concern the manual aspects of sign language production. In addition, sign languages use nonmanual features as well – features such as eye-gaze, mouthing and various mouth shapes, eyebrow-raising or lowering, head-tilt, upper torso-shift, etc. Whereas many of these function in sign languages primarily on the grammatical (and pragmatic) level, some, perhaps all, are used on the lexical level as well. Prime examples are the BSL signs BOSS and GOD, which are distinguished solely by the upward eye gaze accompanying the latter (Sutton-Spence & Woll 1998: 94), and ASL LATE and NOT-YET, which are distinguished by the protruding tongue accompanying the latter of the two.

2. For those perhaps unfamiliar with the Peircean notions of icon, index and symbol, as interpreted and applied to linguistics by R.O. Jakobson: “[O]ne may say that for the interpreter an index is associated with its object by a factual, existential contiguity and an icon by a factual similarity, whereas there is no compulsory existential connection between symbols and the objects they refer to. A symbol acts “by virtue of a law”. Traditional rules underlie the relations between the diverse symbols of one and the same system. The connection between the sensuous signans of a symbol and its intelligible (translatable) signatum is based on a learned, agreed upon, customary contiguity. Thus the structure of symbols and indexes implies a relation of contiguity (artificial in the former case, physical in the latter), while the essence of icons consists in similarity. On the other hand, the index, in contradistinction to the icon and symbol, is the only sign which necessarily involves the actual copresence of its object” (Jakobson 1964: 217).

3. In the list, languages with experiencer subjects marked by an oblique case (or “prepositional” marker) other than dative – including those which do not distinguish a separate dative – are marked with an asterisk after the name of the language; if dative subjects are present though outnumbered by non-dative oblique subjects, the asterisk is in parentheses. Languages where “dative” (i.e. indirect objects) and experiencer subjects are marked with the same preposition are treated as dative. Languages where only the verb have has an oblique subject have been excluded from consideration, although their inclusion would greatly increase the list of languages and areas represented.

4. In some cases more than one experiencer predicate concept is represented by a single sign and vice versa. Thus, there are two signs for WANT (listed as WANT1 and WANT2), and there are altogether two signs for HAVE-TO, MUST, NEED, NEED-TO, OUGHT-TO, SHOULD (a fact not indicated on the table).

References

Adelaar, Willem F.H. (with Pieter C. Muysken) 2004 [2007] The Languages of the Andes. Cambridge, UK: Cambridge University Press.
Aikhenvald, Alexandra Y. 2003 A Grammar of Tariana, From Northwest Amazonia. Cambridge, UK: Cambridge University Press.
Alekseev, M.E. & É.M. Shejxov (= Алексеев, Михаил Егорович & Энвер Магомедалиевич Шейхов) 1997 Лезгинский язык. Москва: Academia. [Lezginskij jazyk. Moscow: Academia.]
Alekseev, M.E. & S.X. Shixalieva (= Алексеев, Михаил Егорович & Сабрина Ханалиевна Шихалиева) 2003 Табасаранский язык. Москва: Academia. [Tabasaranskij jazyk. Moscow: Academia.]
Aliroev, I.Ju. (= Алироев, Ибрагим Юнусович) 1999 Чеченский язык. Москва: Academia. [Chechenskij jazyk. Moscow: Academia.]
Aliroev, Ibragim (= Алироев, Ибрагим Юнусович) 2004 Самоучитель чеченского языка. Москва: Academia. [Samouchitel' chechenskogo jazyka. Moscow: Academia.]
Andronov, Mixail Sergeevich (= Андронов, Михаил Сергеевич) 1978 Сравнительная грамматика дравидийских языков. Москва: Наука. [Sravnitel’naja grammatika dravidijskix jazykov. Moscow: Nauka.]
Avrorin, V.A. & B.V. Boldyrev (= Аврорин, В.А. & Б.В. Болдырев) 2001 Грамматика орочского языка. Новосибирск: СО РАН. [Grammatika orochskogo jazyka. Novosibirsk: Siberian Division, Russian Academy of Sciences.]
Baker-Shenk, Charlotte & Dennis Cokely 1980 American Sign Language: a Teacher's Resource Text on Grammar and Culture. Washington DC: Clerc Books / Gallaudet University Press.
Besnier, Niko 2000 Tuvaluan: A Polynesian Language of the Central Pacific. London & New York: Routledge.
Bhaskararao, Peri & Karumuri Venkata Subbarao (editors) 2001 The Yearbook of South Asian languages and Linguistics, 2001: Tokyo Symposium on South Asian languages: Contact, Convergence and Typology. New Delhi, Thousand Oaks & London: Sage Publications.
Brien, David (editor) 1992 Dictionary of British Sign Language / English. London & Boston: Faber and Faber.
Broadwell, George Aaron 2006 A Choctaw Reference Grammar. Lincoln & London: University of Nebraska Press.
Colarusso, John 1992 A Grammar of the Kabardian Language. Calgary: University of Calgary Press.

DeLancey, Scott 2003 Lhasa Tibetan. Chapter 17 in Graham Thurgood & Randy LaPolla, eds. 270-288.
Foley, William A. 1986 The Papuan Languages of New Guinea. Cambridge, UK: Cambridge University Press.
Haspelmath, Martin 1993 A Grammar of Lezgian. Berlin & New York: Mouton de Gruyter.
Jakobson, Roman O. 1964 On Visual and Auditory Signs, Phonetica 11, 216-220.
Janis, W.D. 1995 A Crosslinguistic perspective on ASL Verb Agreement. In Karen Emmorey & Judy S. Reilly (eds.), Language, Gesture, and Space. Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers. 195-223.
Jelinek, E. & M.A. Willie 1996 “Psych” Verbs in Navajo. In E. Jelinek, S. Midgette, K. Rice, & L. Saxon (eds.), Athabaskan Language Studies: Essays in Honor of Robert W. Young. Albuquerque: University of New Mexico Press. 15-34.
Kegl, Judy 1990 Predicate Argument Structure and Verb-Class Organization in the ASL lexicon. In Ceil Lucas (ed.), Sign Language Research: Theoretical issues. Washington, DC: Gallaudet University Press. 149-175.
Kibrik, A.E. (ed.) (= Кибрик, Александр Евгеньевич) 1999 Элементы цахурского языка в типологическом освещении. Москва: Наследие. [Élementy caxurskogo jazyka v tipologicheskom osveshchenii. Moscow: Nasledie.]
Liddell, Scott K. 2003 Grammar, Gesture, and Meaning in American Sign Language. Cambridge, UK: Cambridge University Press.
Lolmay & Pakal B’alam (= Pedro Garcia Matzar & José Obispo Rodríguez Guaján) 1997 Rukemik ri Kaqchikel Chi’ (Gramatica Kaqchikel). Guatemala: Cholsamaj.
McLendon, Sally 1996 Sketch of Eastern Pomo, a Pomoan Language. In Ives Goddard (ed.), Handbook of North American Indians, Volume 17: Languages. Washington, D.C.: Smithsonian Institution. 507-579.
Masica, Colin P. 1976 [2005] Defining a Linguistic Area: South Asia. New Delhi: Chronicle Books. [First edition (1976) by Chicago: Chicago University Press.]

Masica, Colin P. 1991 The Indo-Aryan Languages. Cambridge, UK: Cambridge University Press.
Masica, Colin P. 2001 The Definition and Significance of Linguistic Areas: Methods, Pitfalls, and Possibilities (with Special Reference to the Validity of South Asia as a Linguistic Area). In P. Bhaskararao & K.V. Subbarao (eds.). 205-267.
Mazaudon, Martine 2003 Tamang. Chapter 18 in Graham Thurgood & Randy LaPolla, eds. 291-314.
Meir, Irit, Carol A. Padden, Mark Aronoff & Wendy Sandler 2007 Body as Subject. J. Linguistics 43 (2007), 531–563. [accessed 5 Jan 2009 at: sandlersignlab.haifa.ac.il/pdf/Body_as_Subject.pdf]
Mithun, Marianne 1999 The Languages of Native North America. Cambridge, UK: Cambridge University Press.
Morgan, Michael W. 2005 A Whole-Language Typology of Japanese Sign Language, Japanese Journal of Sign Linguistics, vol. 16: 31-43.
Noonan, Michael 2003 Chantyal. Chapter 19 in Graham Thurgood & Randy LaPolla, eds. 315-335.
Padden, Carol A. 1988 Interaction of Morphology and Syntax in American Sign Language. Outstanding Dissertations in Linguistics, Series 4. New York: Garland Publishing. [Based on Padden (1983). Interaction of morphology and syntax in American Sign Language. Unpublished doctoral dissertation, University of California at San Diego, La Jolla.]
Quintero, Carolyn 2004 Osage Grammar. Lincoln & London: University of Nebraska Press.
Rowlands, E.C. 1969 [1976] Yoruba. New York: David McKay.
Shibatani, Masayoshi & Prashant Pardeshi 2001 Dative Subject Constructions in South Asian Languages. In P. Bhaskararao & K.V. Subbarao (eds.). 311-347.
Sjoberg, Andrée F. 2001 Convergence and Resistance to Morphological Change in Agglutinative Languages of South and Central Asia. In P. Bhaskararao & K.V. Subbarao (eds.). 369-390.
Stokoe, William C. 1991 Semantic phonology. Sign Language Studies 71: 107-114.

Stokoe, William C. 2001 Language in Hand: Why Sign Came Before Speech. Washington, D.C.: Gallaudet University Press.
Sutton-Spence, Rachel & Bencie Woll 1999 The Linguistics of British Sign Language: An Introduction. Cambridge, UK: Cambridge University Press.
Thurgood, Graham & Randy J. LaPolla (eds.) 2003 The Sino-Tibetan Languages. London & New York: Routledge.
Verma, Manindra K. & K.P. Mohanan (eds.) 1990 Experiencer Subjects in South Asian Languages. Stanford, CA: Stanford Linguistics Association.
Zeshan, Ulrike 2000 Sign Language in Indo-Pakistan: A Description of a Signed Language. Philadelphia & Amsterdam: John Benjamins Publishing Company.
Zeshan, Ulrike 2006 Regional Variation in Indo-Pakistani Sign Language – Evidence from Content Questions and Negatives. In Ulrike Zeshan (ed.), Interrogative and Negative Constructions in Sign languages. Nijmegen: Ishara Press. 303-323.

Regional Reports

Sinhala in Contact with Arabic: The Birth of a New Pidgin in the Middle East

Fida Bizri

This report presents a thumb-nail sketch of a new Arabic-based language born out of the linguistic contact between Sinhala and Arabic. Some of what seem to me to be the most remarkable morpho-syntactic features of this language are outlined below. I shall refer to it as Pidgin Madam and present it as a pidgin. Pidgin Madam is a language born out of the contact between Sri Lankan female domestic workers and their Arab employers in many countries of the Middle East (note 1). I choose to call this language Pidgin Madam because it is mainly spoken between Arabic-speaking "Madams" (as maids refer to their female employers) and their maids. "Madam" is also the most frequent form of address used by the maids answering their female employers. In Arab countries, this language is referred to by the name of "Sri Lankan maids' Arabic", or simply "Maids' Arabic". The data presented here were gathered in Lebanon; thus, the Arabic involved in this specific contact is Lebanese Arabic. While in Sri Lanka this phenomenon of domestic migration towards the Middle East is common for women from both of Sri Lanka's main linguistic communities, Tamil and Sinhala, in Lebanon most of the Sri Lankan maids are Sinhala-speaking. On the other hand, although Sri Lankan Sinhalese maids are not the only ones to head for the Middle East and to learn Arabic informally, this language, known in Lebanon as "Sri Lankan Arabic", is typical of them (note 2). Ethiopians and Filipinas are the two largest female domestic servant communities after the Sinhalese one in Lebanon, and neither of them speaks Pidgin Madam. This is probably due to the fact that Ethiopians, for instance, speak Amharic, a Semitic language very close to Arabic, and pidginization develops only between languages from distant families. Ethiopians end up speaking a form of Arabic that is much closer to the target language than Pidgin Madam is. On the other hand, domestic servants from the Philippines usually speak English with their Arab employers, which is a good context for the development of an English-based pidgin.


Pidgin Madam is therefore a contact language that has developed in a context where only two linguistic communities are involved. In that sense, it is different from the extensively studied major pidgins which were born in a multilingual context, and where there was a need for a lingua franca between different substrate languages. This is not a novelty, since other pidgins with only one substratum are attested: one can mention for instance Pidgin Delaware (Goddard 1996), which was first born in the contact between the Delawares and the Dutch, before it was used by other European linguistic communities, and the Ndyuka-Trio pidgin (Huttar and Valentie 1996), a contact language between Ndyuka and Trio. Whatever conditions are necessary for the birth of a pidgin, it is important to highlight the fact that the term "pidgin" used here is understood in the sense of "a process of pidginization" being recognized as a norm (Hymes 1971: 84: "A pidgin is the result of such a process [pidginization] that has achieved autonomy as a norm"). The difference that distinguishes a pidgin from a process of pidginization is, therefore, not a matter of greater complexity that would make it more "real" as a language, but rather of the social reality of its acceptance as a norm by a given community. A pidgin is thus a means of communication that is judged as being at a "sufficient competence stage" by its own speakers.

The Two Languages in Contact (note 3)

Sinhala is a special case amongst Northern Indo-Aryan languages: it has been separated from other languages of the same family, and has long been in contact with a Dravidian language, Tamil. Sinhala is also known to be a clearly diglossic language, the so-called "literary" or "formal" language (Gair 1998) being taught at school.
Almost all the Sri Lankan maids who come to Lebanon (although they are from very poor villages) have received elementary education at school, and are familiar with the more complex grammar of the "literary" language. Their mother tongue and everyday language is, however, Colloquial Sinhala (CS). Thus, all references to Sinhala in this paper concern CS. The main aspects of CS that merit attention in this report are the following: the existence of a prenominal relative clause using the present or past participial forms of verbs (CSa); and the various uses of the Sinhala conjunctive participles: alone in a sentence where a sequence of actions is expressed (CSb), the conjunctive participle indicating all actions preceding the final and main verb (the subject of which can be, as in Tamil, different
or identical to that of the preceding actions), or followed by the inanimate existential verb -tiyenavâ- denoting experience (CSc). It is important to remind the reader that CS has no verb agreement and no feminine-masculine gender. Sinhala does, however, have animate-inanimate gender.

(CSa)
mama hadana/hadapu bat
me made(pr.part)/ have made (ps.part) rice
The rice that I cook/ that I have cooked.

(CSb)
gedara gihin, bat kâlâ, nidâgattâ,
house go (conj.part) rice eat(conj.part), sleep (past)
I went home, had something to eat, and slept (or: having gone home, having eaten rice, I slept)

(CSc)
sudu nônâ lankâva-Ta gihin tiyenavâ
white madam Lanka-dative go (conj.part) there is(inan.)
The Western lady has already been to Sri Lanka.

Lebanese Arabic (LA) belongs to the Eastern dialects of Arabic, a Semitic language. Arabic is also a diglossic language. Although in Lebanon everyday issues are expressed in the Lebanese dialect, many matters are often dealt with in the so-called "literary" or "standard" Arabic (Ferguson 1960): serious TV programmes, news, announcements, or even any serious learned discussion between friends about politics or philosophy tend to be on a linguistic continuum going from formal to informal forms of Arabic. Due to this diglossia, Sri Lankan maids, upon arriving in Lebanon (or, for that matter, in any other Arab country), thus encounter a further difficulty in making sense of everything they hear around them. The main aspects of LA that deserve mention here are the following: LA has a sharp feminine-masculine gender distinction apparent in demonstratives, adjectives, pronouns (even for 2nd person singular), and verbal agreement. Pronouns can be either independent or suffixed. When suffixed, they can be added to nouns (LAa), denoting the possessor of the element expressed by the noun, to verbs (LAa), to prepositions (LAa), and to particles denoting intention, badd- (LAa), and capacity, fî-. The main characteristic of the verbal system is its aspectual distinction, where imperfective implies any action that has not yet been completed, and perfective denotes an action that has been completed. In the imperfective form, one can differentiate between imperfective (LAb) and modal imperfective (here glossed as m.impf. Modal imperfective is constructed by omitting the prefix b- from
the imperfective form). The prohibitive expression (LAc) and the imperative mood (LAd) are constructed on the modal imperfective verbal base.

(LAa)
ebn-ik badd-o yHabr-ik chi عann-e
Son-your(fs) intention-his tell(impf.3sm)-you(fs) thing about-me
Your [feminine addressee] son wants to tell you something about me.

(LAb)
b-y-êkol, b-t-êkol, b-têkol, b-t-êkl-e.
eat (impf.3ms), eat (impf.3fs), eat (impf.2ms), eat(impf.2fs)
He eats, she eats, you (mas) eat, you (fem) eat.

(LAc)
ma y-êkol. ma t-êkol. ma t-êkle
neg eat (m.impf.3ms), neg eat(m.impf.2ms), neg eat(m.impf.2fs)
Let him not eat. Don't eat (m). Don't eat (f.)

(LAd)
kôl, kele, kelo
eat(imp.2ms), eat(imp.2fs), eat(imp.2p)
Eat (to man), eat (to a woman), eat (to plural)

Some Grammatical Features of Pidgin Madam

What follows is not a grammatical sketch of the language so much as a brief presentation of what seem to me to be some of its most remarkable features (listed below from 1 to 12). The examples presented here are all glossed according to LA grammar, so as to show the degree of difference between the two languages.

(1) Pidgin Madam shows an extensive use of Arabic imperatives as verbal stems. The Sri Lankan maid uses imperatives for her affirmative present or past tenses. That is, instead of saying "I'm going to sleep" or "I went to sleep", she would say "I, do go to sleep" (1a). However, although these verbal forms (that are imperatives in LA) are quantitatively the most common in Pidgin Madam, other verbal forms do occur: the LA modal imperfective aspect (used in the construction of the prohibitive expression) also occurs, giving sentences like "me, don't you eat anything today" instead of "I haven't eaten anything today" (1b). LA perfective forms denoting past accomplished actions in Arabic are much less frequent, but they do occur (10b, 10c, 10d).

(1a)
ana rûyi nêmi
me go(imp.2sf) sleep(imp.2sf)
I am going to sleep.

(1b)
ana ma têkle si lyôm
me neg eat(m.impf.2fs) thing today
I haven't eaten anything today

(2) An extensive use of the LA second person singular feminine suffix (-ik: you, your, yours) or, to a lesser degree, the third person singular feminine suffix (-a/-ya: she, her, hers), instead of the first person suffix (-e/-ne: I, me, my). This results in sentences like "your son (is) a soldier" for "my son (is) a soldier" (2a), or "your husband no there is hit her" for "my husband has never hit me" (2b). For an explanation of the "no there is" part of the latter sentence, see feature (8) below.

(2a)
ana ebn-ik askari
me son-your(fs) soldier
My son is a soldier

(2b)
ana saws-ik mapi drobi-ya
me husband-your(fs) there-is-no you hit(imp.2fs)-her(fs)
My husband has never hit me

(3) Although there is no gender or number distinction in Pidgin Madam, in most of the cases where a maid uses an adjective to describe any entity, the adjective has the LA feminine singular form. This gives sentences that sound to LA speakers like "all the children she's good" for "all the children are good" (3a), and "your husband she's good" for "my husband is good" (3b).

(3a)
abel maDam lête bêbi kellu hilwi
before madam 3 baby all beautiful(fs)
All of my previous Madam's children were good-looking [i.e. the Madam in the previous house where I was before coming here]

(3b)
saws-ik nîya
husband-your(fs) good(fs)
My husband is good

(4) Many LA expressions that are normally transformed or conjugated in LA become frozen expressions in Pidgin Madam, like iconized clichés. This gives sentences like "Starting from now, I do-you-want-me-to-put-the-money-for-you-in-the-bank" (4a), meaning "Starting from now, I want to save some money in the bank", or "you-love-me coming" for "I loved to come, I wanted to come" (4b). In fact, for the verb "to love/to like", the informants often use the Arabic phrase "do you love me" or "you love me" as the bare verbal stem expressing the idea of "love" or "like", equivalent to English "to love".

(4a)
halla sway badd-ik-nhott-o bank
now a little intention-you(fs)-put(m.impf.1p)-it bank
(Starting from) now, I want to save some money in the bank. [The informant had spent all the money she had earned so far by sending it to her family on a monthly basis. From now on, she does not want to send them anything anymore; she wants to save it in the bank for her future plans]

(4b)
bethebbî-ni jêye
love(impf.2sf)-me coming (pr.part.s)
I felt like coming. [In answer to "why did you come to Lebanon?", she said "I just felt like doing it", meaning that she is the one who decided, and nobody in her family forced her to.]

(5) Some Arabic causative verbs are also used in the pidgin as non-causatives, and some transitive verbs are used as intransitives: "to wake someone up" is used in the pidgin instead of "to wake up (oneself)", giving sentences like "I wake-me-up at seven" for "I wake up at seven" (5a). The sign > is used to transcribe an intonation contour that will be discussed under (10).

(5a)
ana> sitti nôs payyî-ne,
me 6 half wake(imp.2fs)-me
I wake up at six thirty. [More precisely in the context, and if we translate the > intonation gloss: "As for me, well, it's at 6 thirty that I wake up". The informant was comparing her status to that of 2 Ethiopian maids who live with her in the same house. The Ethiopian maids wake up at 8 thirty, whereas she has to wake up at 6 thirty].

(6) Some interrogative forms (minus the interrogative intonation) are used in a non-interrogative context, either to express a relative clause, as in "the dog what's his name Bobby" for "the dog's name is Bobby" (6a), or to focus on an element, as in "I what-do-you-want that" for "it's that that I want" (6b).

(6a)
hône kaleb s-usm-o bôbi
here dog what-name-his Bobby
This dog's name is Bobby.

(6b)
ana su badd-ik hayda
me what intention-your(fs) that
What I want is that thing.

(7) The relative clause is also often expressed by means of a pidgin-invented usage of the LA particle fî, pronounced as pî in Pidgin Madam. In standard Arabic, fî is a preposition meaning "in, inside", but in LA it is used to denote the existence either of people or of objects, with the meaning of "there is, there are". In Pidgin Madam, it is used pre-nominally, as if it were the Sinhala animate and inanimate existential verbs (respectively innavâ and tiyenavâ) in their present or past participial forms, meaning "being, having been" (7a, 7b).

(7a)
hôn pî benet
here there is girl
The girl who is here

(7b)
sir lanka pi sogol no gûD
Sri Lanka there is work no good (Eng)
The jobs available in Sri Lanka are not good

(8) This usage of pî is further extended so as to cover all usages of the Sinhala tiyenavâ mentioned above under CS main characteristics, and more specifically its appearance after a conjunctive participle denoting experience, as in "I have done that previously in my life", or "I have been to Kuwait" (8a). The example (2b) previously mentioned offers a negative expression of pi, as mapi, giving the meaning of "having never done something". This particle can also combine both usages explained under feature (7) and feature (8): that of expressing a relative clause, and that of expressing an experience (8b, 8c).

(8a)
ana pi rûhi kuwêT
me there is go(imp.2fs) Kuwait
I have been to Kuwait

(8b)
ana pi amali sogol
me there is do(imp.2fs) work
The work that I do/ I have done

(8c)
hiyi abel pi jîbi masâre
she before there is bring(imp.2fs) money
The money which she had brought earlier

(9) Arabic prepositions are reinterpreted in Pidgin Madam as postpositions. Moreover, the prepositions in question have a complex Arabic form that is treated as a bare form in the pidgin. That is, from the preposition maع "with", one can say in LA maع-ik "with you(fs)", whereas in Pidgin Madam mayik means simply "with" and is used as a postposition (9a to 9c). The preposition pô, meaning "above", appears, however, without the suffix (9d). One should note that in LA "above" fô' + suffix -ik = faw'ik.

(9a)
tarî hadd-ik opis
road next to-you(2fs) office(Eng)
The office that is by the road

(9b)
ana may-ik tayi
me with-you(2fs) come (imp.2fs)
Come with me

(9c)
ana masâre ana saws-ik may-ik
me money me husband-your(2fs) with-2fsg
My money is with my husband

(9d)
misTer ana pô nêmi
Mister me above sleep(imp.2sf)
Mister lay down on top of me

(10) Another important linguistic device in Pidgin Madam is a particular intonational contour. This specific contour, marked here by the sign >, is characterized by a rise on the element preceding the sign, followed by a gradual fall over the rest of the sentence. This contour fulfils a wide range
of functions: it is used to focus on the element preceding the > sign (10a), something very similar to the intonation in English sentences like "As for my son> well... he's with my mother"; it can also introduce a consequence or an implication concerning the element or action preceding the sign > (10b); it precedes an element that occurred chronologically after the element before the sign ("Having done this> I proceeded further to do that", i.e. very similar to the Sinhalese use of conjunctive participles) (10c); and it follows subordinate clauses (time, manner, condition, or cause) (10d, 10e).

(10a)
bêbi >ana mâma bêt
baby me mum house
As for my son, [well] he's in my mother's house

(10b)
ana supt-a hiye > ana tarip hiye no gûD
me see(perf.1s)-her she me know(perf.1s) she no good(Eng)
As soon as I saw her, I knew she was not a good [person]

(10c)
kallasit > ana isit
finish(perf.3fs) me come(perf.3fs)
Having finished [my contract abroad], I came [back].

(10d)
supt-a > tikki, no sûp-a > assan
I see(perf.1s)-her cry(m.impf.3fs) no see(m.impf.1s)-her better
If I see it, then I cry. If I don't see, then it's better [for me]. [The informant was explaining why she does not want to have any pictures of her children while in Lebanon for three years. If she sees them in a picture, she cries, so she prefers not to see them].

(10e)
badên hêk, ana sêwi akel> hiyi rûyi
then like this me do(imp.2fs) food she go(imp.2fs)
Then [it was all] like this: If I cook, she would go away. [The informant was talking about a problem she has with another maid in the same house. They had had an argument, after which the other maid refused all contact with the informant. She even refused to eat, because it was the informant who had prepared the food.]

(11) Pidgin Madam makes extensive use of reduplication of adjectives (11a), adverbs (11a), and nouns (11b). Reduplication serves either to pluralize a noun, or to intensify its meaning.

(11a)
Tîr Tîr bîr bêt. bîr bîr
very very big house big big
The house is just so big [the informant was complaining about the enormous size of the Lebanese house where she works. A big house means a lot of work]

(11b)
ana kil yôm sogol sogol
me every day work work
I have so much work to do every day.

(12) One last important linguistic feature of this language is the creation of modal particles that modify the verbs. In fact, distinctions related to time and continuity of action are either understood from the context or indicated by adverbs, or by a set of preverbal modal morphemes created from some LA particles. We have already seen the particle pi, but there are two additional particles, kalas and badde. kalas comes from the LA particle HalaS meaning "it's over, it's finished" or simply "stop". In Pidgin Madam it is used as a preverbal particle giving the verb an accomplished aspect, implying that the action is "finished" or "already done" (12a, 12b). badde, on the other hand, is formed from the LA expression "I want"; it can be decomposed into badd- (volition, intention) + the first person suffix -e "me, my". When used pre-verbally in LA, the whole phrase means "it is my desire to...+ verb". In Pidgin Madam its preverbal appearance implies that the subject of the verb is or was about to start the action, or that the subject has engaged in an action (12c, 12d).

(12a)
bent-ik kalas jawwase
daughter-yours(2fs) finished get married (imp.2fs)
My daughter is already married

(12b)
kullu kalas sêwi bil* bêt
all over do(imp.2fs) in house
We have finished all work at home

*bil is from LA bi (in) + el, the definite article "the". But in Pidgin Madam, the article is interpreted as being part of the preposition.

(12c)

âdi, badde tekkî-ni awiyyi
sitting(pr.part.fs) I want talk(m.impf.2fs)-me strong
Sitting, they start talking aloud. [The informant was talking about two women who bother her. Every time they meet, they both sit and start talking for hours in a loud voice]

(12d)

badde rûyi badde sûpi sûra
I want go(imp.fs) I want see(imp.fs) picture
So they went and they started looking at pictures. [The informant was telling me how her Lebanese employers chose her to be their maid: they decided to go to a maids' placement agency, and they started looking at pictures of Sri Lankan maids that the agent showed them. Her picture was one of them.]

Conclusion

A thorough study of the language briefly described above is still to be done in order to determine the underlying linguistic processes at stake, and the impact of each of the substrate and target languages in these processes. However, it seems clear that this language's morpho-syntactic profile is marked by highly economic strategies typical of situations where pidginization is involved: reduction, simplification, improvement and functionalization of some linguistic devices (such as intonation), reduplication, preference for explicit morphemes, analogical extension of rules, and creation of new linguistic devices. In fact, nouns in Pidgin Madam are usually used without an article, gender is disregarded, and number is inferred from the context. Adjectives are almost always feminine. Personal pronouns, although they follow LA explicit independent pronouns (as opposed to dependent pronominal suffixes), are often not used, reference being inferred from the context, except in possessive constructions, where personal pronouns are useful (ana mâma "me mother" for "my mother"). The Pidgin Madam verb very frequently derives from the LA imperative singular feminine, ending with the sound [e/i]. In a few instances, a related LA noun functions as a verb (ana sogol "me work" for "I work" as well as "my job"). Time and aspect are inferred from the context or indicated by means of adverbs of time or pidgin-institutionalized modal particles.
Since a large number of LA grammatical tools have been dropped from Pidgin Madam, some words are overworked (some adverbs, and some LA prepositions reinterpreted as postpositions cover more usages than in LA). Pidgin Madam lacks conjunctions, and uses juxtaposition plus intonation instead of
subordination. Reduplication (of adjectives or adverbs) is quite common for focussing or for pluralizing. This sort of pidginization phase does not always lead to the establishment of an autonomous pidgin. It could appear at an early stage in language learning. According to Schumann (1978), acquisition of a second language begins with the pidginization of it, followed by expansion and complexification of the interlanguage system. It is thus somewhat difficult to draw a clear dividing line between pidgins and some other linguistic systems, such as the different spectra of imperfectly acquired foreign languages. However, one may argue that this difference lies in the fact that in pidgins these syntactic devices become institutionalized and fossilized, which means that they become a recognisable norm. In fact, Pidgin Madam is a socially recognized and well-defined way of speaking. It seems that when LA speakers talk in Pidgin Madam, they try to intuitively apply a certain set of "rules" that caricature Pidgin Madam's grammatical profile: the phonological transformation according to the Sinhala phonological system (that LA speakers largely guess), the intonation, an overuse of imperatives as verbal stems, an exclusive use of feminine adjectives and suffixes, use of 2nd and 3rd singular feminine suffixes (you, she) instead of 1st person singular (me), use of the modal particles pi, badde and kalas, use of LA 'word + personal suffix' referring to 'word' (ebn-ik, literally "your son" in LA, but "son" in Pidgin Madam), use of English doublets of some words (husband-sawsik, bêbi-ebnik, masâre-money), etc. Moreover, Pidgin Madam is not only a language used by Sri Lankan maids and their Arab employers, it is also used by speakers of the target language amongst themselves, as a joke, to claim their innocence, or to comically plead in favour of a victimized person.⁴
Although I present it as a pidgin, I must in conclusion acknowledge (considering the degree of fluctuation attested in this language) that it has not yet reached a stable linguistic state. In fact, imperatives for instance, although they quantitatively constitute the bulk of the verbal system, are not the only verbal forms attested. We have many verbs in the perfective and imperfective aspects. Moreover, the use of the second and third person feminine singular does not completely erase the existence of the first person singular (10b, 10d). On the other hand, the study of the recordings showed one important difference between the language spoken by Sri Lankan housemaids and that spoken by Sri Lankan free-lance workers: while the former has, as described above, an almost exclusive use of LA feminine and singular forms, the latter shows a large variety of masculine and plural forms completely unattested in Pidgin Madam (and not described in this
report). This observation raises the question of the impact that the input received by the Sri Lankan maids, and the context in which the informal learning takes place, have on the constitution of the language itself (a confined context in the former case, and a more open context in the latter, i.e. with greater access to the target language norm). A deeper study of the structures of Pidgin Madam, both in the Sri Lankan maids' speech and in that of their Arab employers, is therefore necessary in order to better describe the processes of linguistic creation involved in this contact. Hopefully, this report can serve the function of inviting interested scholars to undertake studies of that sort.

Notes

1. It is difficult to give an exact appreciation of the number of speakers of Pidgin Madam without having an estimation of the number of families living with Sinhala-speaking housemaids, and speaking with them in Arabic (as opposed to English).

2. The data from which I draw my examples were collected in Lebanon and in Sri Lanka from women who had previously been to Lebanon. This study was mainly conducted on a set of free recordings of Sinhala maids talking about their life (total amount of time: 25 transcribed hours; total number of informants: 14). Seven out of those 25 hours were recordings of discussions between Sri Lankan maids and their "Madams", in order to study the Arabic used by the employer addressing the maid, but also the interaction between the two parties. Two of the 14 informants were not housemaids: one was working as a free-lancer in a big hospital, and the other in an office; both were living independently in a rented apartment. As a consequence, the context was different from that of housekeeping: much less confined, and with much more external contact, especially with men. Another set of recordings involved translating into LA a questionnaire written in CS. Some additional recordings of free speech (5 hours) were conducted in Sri Lanka, back home with 3 of the 14 informants. However, to the best of my knowledge, no study on the influence of Arabic on Sinhala has yet been undertaken. I have no such recordings yet. Moreover, this study did not include any Tamil-speaking Sri Lankan domestic servant, although it is clear that a comparative study with the Tamil case in the same context may prove extremely valuable. Some Sri Lankan maids (4 out of the 14) had first learned Arabic in other Arab countries with different dialects (Saudi Arabia, Kuwait, Dubai, Jordan). And it seemed that, in spite of minor structural differences between Arabic dialects, the Arabic spoken by Sri Lankan domestic servants throughout the Middle East is quite homogeneous. The dynamics in the grammatical creation proved to be the same. Nonetheless, it would be more prudent to conduct a study in the Gulf area for a more accurate assessment of the situation.

3. In the examples related to CS and LA throughout the report, the system of transcription adopted is not conventional. Therefore, for CS, capital letters refer to retroflex sounds (T, D), while for LA they refer to pharyngealized emphatic sounds (T, D, S). However, for LA, the capital letter H refers to the Arabic voiceless fricative post-velar sound.

4. In fact, due to the image of the exploited Sri Lankan maid, this language has come to be synonymous with "weakness, helplessness".

References

Ferguson, Charles A. (ed.). 1960. Contributions to Arabic Linguistics. Cambridge, Massachusetts: Harvard University Press.
Gair, James W. 1998. Studies in South-Asian Linguistics: Sinhala and other South-Asian Languages. New York/Oxford: Oxford University Press.
Goddard, Ives. 1996. Pidgin Delaware. In Sarah G. Thomason (ed.), Contact Languages, a Wider Perspective, 43-124. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Huttar, G. L. and Valentie, F. J. 1996. Ndyuka-Trio Pidgin. In Sarah G. Thomason (ed.), Contact Languages, a Wider Perspective, 125-172. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Hymes, Dell (ed.). 1971. Pidginization and Creolization of Languages. Cambridge: Cambridge University Press.
Schumann, John H. 1978. The pidginization process: a model for second language acquisition. Rowley, Massachusetts: Newbury House Publishers.

Research on South Asian Languages in Japan: 2000–2008*

Kazuyuki Kiryu and Prashant Pardeshi

1. Introduction

Japan is one of the world’s major centers for the study of languages. Every year hundreds of papers and books related to language studies and linguistics are published in Japanese. The popularity of linguistics as an academic field has translated into the compilation of the world’s largest encyclopedia of languages and linguistics, Gengogaku Daijiten (The Sanseido Encyclopedia of Linguistics), edited by Takashi Kamei et al., which was published in seven volumes by Sanseido Publishers between 1988 and 2001. The first five volumes are descriptions of about 3500 languages and language families and their sub-groups; the sixth is devoted to technical terminology in linguistics, while the seventh is a dictionary of writing systems, which includes about 300 of the world’s scripts. This large-scale work on languages and linguistics is no less important than Grierson and Konow’s splendid volumes of the Linguistic Survey of India. The Sanseido Encyclopedia of Linguistics contains many entries related to South Asian languages as well. Though the levels of description differ, the linguistic information on these languages is very useful for getting a broad overview. However, thanks to the recent achievements in the framework of linguistic description and basic grammatical theory, more detailed information on South Asian languages is available to us through descriptive works published by scholars worldwide. Japanese scholars have also made significant contributions in this regard. While the Japanese scholars who work on South Asian languages may not be numerous, the last decade has seen some important contributions from them. In this report, we attempt to give a summary of the current trends in the study of South Asian languages conducted by scholars in Japan, based on those articles and books published between 2000 and 2008 that are available to us.
We should, however, make it clear at the very outset that the survey presented here is neither exhaustive nor comprehensive since it is not possible to cover all the works given the vast amount of research and
the constraints they comprise. Nonetheless, we hope that this report will give an overall picture of the SAL research landscape in Japan between 2000 and 2008. The report is organized into four sections, with one section devoted to each of the four major language families in South Asia, viz. Indo-Aryan, Dravidian, Tibeto-Burman and Austro-Asiatic. An attempt has been made to report on as many research works as possible. A critical appraisal of the contents of all of the works reported herein, however, falls outside the scope of this report.

2. Indo-Aryan languages

Before moving on to the research on SALs between 2000 and 2008, a brief sketch of the history of studies on SALs in Japan is in order. Studies of Hindi/Urdu and Tamil in Japan boast a history of more than a century. The teaching of Hindustani and Tamil started in 1908 at the former Tokyo School of Foreign Languages, which later became the current Tokyo University of Foreign Studies (TUFS). In 1921 the Osaka School of Foreign Languages was founded, which later became Osaka University of Foreign Studies (OUFS). OUFS recently merged with Osaka University and has been renamed the Research Institute of World Languages (RIWL). TUFS and the former OUFS (current RIWL) have played a major role in the research on Hindi/Urdu in Japan. These two institutions provide undergraduate, master’s and Ph.D. degree programs in Hindi and Urdu (for a detailed chronological survey of Hindi/Urdu studies in Japan, see Rituparna 2008). There are a few other institutions (Daito Bunka University, Kyoto Sangyo University, Toyo University, Bouei University among others) where South Asian languages are taught. The study of Sanskrit in Japan has an even longer history. The Department of Sanskrit Language and Literature was founded at the University of Tokyo in 1901 (http://www.l.u-tokyo.ac.jp/indlit/index.html). At Kyoto University, the Department of Sanskrit Language and Literature was established in 1910 and has the only named chair in Japan. Apart from Tokyo and Kyoto University, departments of Sanskrit studies are found at Hokkaido University, Tohoku University, Osaka University, Kyushu University, Hiroshima University, and Tokyo Sangyo University. The 14th World Sanskrit Conference will be held from 1 September (Tue.) to 5 September (Sat.) 2009 in Kyoto. The conference will be hosted jointly by the International Association of Sanskrit Studies (IASS) and the Department of Indological Studies, Graduate School of Letters, Kyoto University.

As we have just seen, research on Sanskrit has a long history in Japan. From the linguistic perspective, recent works by Eijiro Doyama merit special mention. Doyama (2005) is a monograph on first person conjunctives in the Rigveda which won the Japan Association of South Asian Studies Award for its excellence. Doyama (2008) is a treatment of the functions of the root aorist participle.

As for Hindi and Urdu, applied linguistic works such as the preparation of textbooks and the compilation of dictionaries for language pedagogy account for a substantial portion, probably due to the fact that two universities, viz. the Tokyo University of Foreign Studies and the Research Institute for World Languages at Osaka University, provide undergraduate and postgraduate level programs in Hindi and Urdu. Among the dictionaries, the following recent works merit special mention: Hindi-Japanese Dictionary compiled by Katsuro Koga and Akira Takahashi (2006, 1473 pages, Taishukan Publishers), Japanese-Hindi Dictionary compiled by Katsuro Koga (published by the compiler himself), and Urdu-Japanese Dictionary compiled by Kagaya (2005, 1592 pages, Daigakushorin Publishers). Kazuhiko Machida has developed a full-text KWIC (key word in context) search for Hindi using the texts of Munshi Premchand’s Godaan and the Constitution of India. While the instructions for these two resources are given in Japanese, one can easily use them if one knows the conventions for inputting searches.

Miki Nishioka has been engaged in Hindi and Japanese contrastive studies. Nishioka (2000) is a detailed descriptive account of the use of the dative case marker –ko in Hindi within the framework of functional grammar. In this paper Nishioka highlights functional similarities between the dative case marker –ko in Hindi and the particle wa in Japanese.
Nishioka (2001), her doctoral dissertation, offers an in-depth description of complex sentences in Hindi focusing on the multi-tiered structure of non-finite embedded clauses. Nishioka (2002) is a contrastive study of aspectual complex predicates in Japanese and Hindi while Nishioka (2003) is a contrastive study of issues related to transitivity and the derivation of causatives in Japanese and Hindi. In both of these works Nishioka offers a detailed description of the phenomena in question and documents similarities and differences between Japanese and Hindi. Nishioka (2004) is a detailed descriptive study of {verb+verb} combinations in Hindi.

Pardeshi (2000) and Pardeshi (2009) are noteworthy contributions pertaining to the passive constructions in Marathi. Pardeshi (2000) highlights the differences between the hitherto neglected passive construction using COME as a passive marker (“COME-passive”) and the widely discussed passive construction using GO as a passive marker (“GO-passive”). Pardeshi (2000) claims that COME passives imply meticulously planned intentional acts while GO passives are neutral with respect to planning and intentionality. Pardeshi (2009) takes up a construction in Marathi that uses a concatenation of {Action Noun/Predicative Adjective implying agency + BECOME} as its predicate and argues that it is a bona fide passive construction. Christening this construction the “BECOME-passive”, Pardeshi attempts to shed light on the functional division of labor among the three passives in Marathi, namely the BECOME, GO and COME passives. Shibatani and Pardeshi (2001) and Pardeshi (2004) deal with the so-called dative subject construction (DSC). Shibatani and Pardeshi (2001) argue that DSCs are intransitive and advance the hypothesis that they are variants of double-subject constructions involving two subjects: the ‘large subject’ and the ‘small subject’ (preverbal noun). Refuting the polar categorization of the DSC as either transitive or intransitive, Pardeshi (2004) argues that DSCs should be analyzed in a scalar way in terms of a continuum, with some DSCs leaning more toward the transitive end while others lean more toward the intransitive end. Pardeshi (2003) is a discussion of the definitional issues and identification criteria for the compound verb in Marathi. In this paper Pardeshi offers criteria to distinguish grammatical auxiliaries from the so-called vector verbs. Pardeshi et al.
(2006) is a geo-typological study by a group of researchers working on the experientially basic verb EAT, which is used in a large number of idiomatic expressions in South, Central, Southeast and East Asian languages. Through the analysis of a large pool of EAT idioms, this work sheds light on the geographical distribution patterns of EAT-expressions in this vast area. One of the spinoffs of this large-scale study is Hook and Pardeshi (2009), which is an in-depth study of the semantic evolution of EAT-expressions in Hindi and Marathi. In this paper Hook and Pardeshi explicate the polysemy network of EAT in Hindi and Marathi from two complementary perspectives, (1) a cognitive perspective and (2) a language-specific lexical perspective, with a view to exploring both the general (= “universal”) and the particular (= “language-specific”) factors underlying the extended uses of the verb EAT.

Pardeshi (2008) deals with constructions in four South Asian languages (Marathi, Hindi, Telugu and Tamil) which necessarily involve an agent but cannot encode it linguistically. Pardeshi claims that these constructions have received only cursory treatment and argues that they should be treated as quasi-passive constructions with the highest degree of agent-defocusing.

Among the few Japanese scholars working on Marathi, Hideaki Ishida merits special mention. Ishida (2001) is a pedagogical conversation book and Ishida (2004) is a concise pedagogical grammar of Marathi written for Japanese learners.

There are a few people working on the languages of eastern India. Junji Yamabe has been working primarily on Oriya. He has done extensive field work, which crystallized in his unpublished 1998 Ph.D. dissertation from the University of Tokyo entitled “The relative and interrogative pronouns of Oriya.” Masayuki Onishi is a typologist who has been working on Bangla, among other languages. One of his prominent contributions is Onishi (2001), which offers a detailed description and analysis of the non-canonically marked transitive and intransitive subjects in Bangla.

Chakma is a Bengali variety spoken in the Chittagong Hill Tracts by ethnically Tibeto-Burman people. Keisuke Huziwara, in his MA thesis (2001), discusses the distinctive accents in Chakma from both synchronic and diachronic perspectives, arguing that they can be traced back historically to the presence versus absence of aspiration. Huziwara (2002a) is a sociolinguistic essay on the people’s indifference to the minority languages of Bangladesh in contrast to Bangla, the national language, which assumed symbolic value in the formation of the country. He has also written a number of essays on Bangladesh and its languages (Chakma and Mru) for general readers.

Nozomi Kodama has been working on language contact between Indo-Aryan and Dravidian languages. Kodama (2001) is an important contribution to the study of contact-induced change; it examines prolonged contact between some dialects of Konkani and Marathi spoken by immigrant communities in Tuluva or Tululand (the coastal area around Mangalore) and local Dravidian languages such as Tulu and Kannada. Kodama proposes a comparative method for reconstructing language changes possibly induced by language contact and leading to areal convergence.

3. Dravidian languages

There has been some noteworthy work on Dravidian languages. In collaboration with his then colleague Tsuyoshi Nara, Peri Bhaskararao has worked on the documentation of Toda vocabulary, texts and songs. The outcome of the research is reported in Nara and Bhaskararao (2001, 2002, 2003). The texts and songs are available in audio format (CD) as well. Along with Bhaskararao (2006, 2007a, 2007b) and Bhaskararao & Ladefoged (2007), these works are important contributions to our knowledge of phonetic and grammatical aspects of Toda, a minority language spoken in the Nilagiri Hills of South India by about one thousand speakers. Toda is well known for its large inventory of fricatives, trills, and vowels.

The papers presented at the first international symposium, “Contact, Convergence and Typology in South Asian Languages”, held in December 1999, appeared in The Yearbook of South Asian Languages and Linguistics 2001, which was guest-edited by Peri Bhaskararao and K. V. Subbarao. The second international symposium, on “Non-nominative Subjects”, was held in December 2001. The papers presented at this symposium appeared in two volumes in the Typological Studies in Language series (volumes 60 and 61) from John Benjamins (Bhaskararao, Peri and K. V. Subbarao (eds.) 2004a, b). The third international symposium, on “Indic Scripts: Past and Future”, was held in December 2003, and the papers presented at this symposium appear in Bhaskararao (ed.) (2003).

4. Tibeto-Burman languages

The Tibeto-Burman languages constitute a large linguistic family spoken in a vast area of Eurasia, from Gansu, Qinghai, Sichuan and Yunnan Provinces of China in the east, through the Himalayan regions, to Baltistan (Pakistan) in the west. The most populous Tibeto-Burman languages are Tibetan, spoken in the Tibetan Plateau of Central Asia and China, and Burmese, spoken in Southeast Asia. South Asia is also home to many Tibeto-Burman languages, but many of those in this region have small numbers of speakers. In this report, the classification of Tibeto-Burman languages is based on Matisoff (1991, 2003), which sub-divides Tibeto-Burman languages into seven subgroups (Kamarupan, Himalayish, Jingpho-Nungish-Luish, Lolo-Burmese, Tangut-Qiangic, and Karenic) and two isolated languages (Tujia and Bai). The first four sub-groups are fully or partially included in South Asia.

During the last couple of decades, the study of Tibeto-Burman languages has advanced greatly, thanks to scholars such as James Matisoff, George van Driem, the late Mickey Noonan, David Bradley, Randy LaPolla and Carol Genetti, to name a few, and the research groups led by them. Japanese scholars have also contributed substantially to the study of Tibeto-Burman languages, especially in research on languages spoken in what Matisoff (1991:485) calls the Sino-sphere, such as Tibetan and its dialects, the Lolo-Burmese languages, and so forth. Among them, Tatsuo Nishida is one of the pioneers; he has made major contributions to the study of the Tangut language and its scripts. Another pioneering Japanese scholar is Yoshio Nishi. Nishi (2000) is a brief summary of the general trends in studies of Tibeto-Burman languages in the Himalayan region, listing the names of scholars and the languages studied by them before 1991, and providing more detailed information on scholars and their research languages after 1991. He also attempts to define the linguistic area for Himalayan languages, and discusses the distinction between language and dialect as well as genetic affiliation. Shiro Yabu, who has been working on Burmese, hosts an informal circle, the Tibeto-Burman Linguistic Circle, where TB scholars gather and exchange ideas three times a year. This circle plays an important role in promoting exchanges among scholars working on Tibeto-Burman languages, not only those in Japan but also those from abroad.

Although the majority of the Tibeto-Burman linguists in Japan have worked on Tibetan and other Tibeto-Burman languages spoken in China and Southeast Asia, there are only a few who have worked on the Tibeto-Burman languages of South Asia. The reason for this imbalance, one may suppose, is that in many cases Tibetan studies are closely related to studies of Buddhism, one of the two major religions of Japan.
In the 1980s and 1990s, there were some linguists who carried out research on languages in Nepal: Yoshio Nishi (Tamangic languages), Mantaro Hashimoto (the Bhaktapur dialect of Newar), Yasuhiko Nagano (Manang, Kathmandu Newar), and Michiyo Hoshi (Prakaa). Sueyoshi Toba is another active Japanese linguist; he carries out field work in Nepal and has conducted research on Newar, Dhimal, Khaling, Toto and Kusunda. There are more Japanese scholars currently working on Tibeto-Burman languages than there were in the early 1990s, but those working on the TB languages spoken in South Asia seem to be as few as before the 1990s. In the following subsections, I will summarize as many works as possible by Japanese scholars working on Tibeto-Burman languages in South Asia, including articles, books, and reports published between 2000 and 2008. Although there may be more in the form of papers presented at conferences and symposia, I do not include them because the materials were not available to me.

4.1. Tibetic (Himalayish)

Tibetic is a subgroup of Tibeto-Kanauri in Himalayish and comprises Tibetan, Tamangic and Bodish languages. The linguistic area of Tibetic is quite large and runs from the eastern to the western end of South Asia. Tibetan is clearly the largest and historically the most important of the Tibetic languages. Although Standard Tibetan is spoken in Central Asia, there are a number of Tibetan varieties spoken in South Asian countries at the foot of the Himalayas, such as Nepal and Bhutan. In Japan, as mentioned before, the study of Tibetan languages is more popular than studies of other Tibeto-Burman languages (except for Burmese), and the varieties of Tibetan mainly studied in Japan are the Central, Amdo and Kham dialects. Since they are spoken outside of South Asia, I do not review any works related to them but limit myself to Tibetan varieties spoken in South Asia.

Dzongkha is a southern variety of Tibetan, spoken in Bhutan as the national language. There are three Japanese scholars who have published materials on this language. Suzuki (2004) discusses the phonology and the phonetic inventory of the standard dialect of Dzongkha, pointing out some problems in the descriptions of the consonants in previous studies. He argues that some complex consonants (the syllable-initial nasals preceded by a glottal stop in high tone, and the phonemes comprising a bilabial plosive and an alveo-palatal affricate) should be regarded as consonant clusters, and further that the semi-voiced or voiced-aspirate consonants are not phonemic but rather positional variants of their voiceless-unaspirated counterparts occurring in low-tone syllables under a certain condition.
He claims that the phonological system of Dzongkha is closer to that of Kham Tibetan than to that of Lhasa Tibetan. Nishida (2004) is a study of the phonology and phonetics of the Gasa dialect of Dzongkha in comparison with the standard dialect, based on data collected from a native speaker of Gasa living in Hawaii. The paper includes discussion of phonemic inventories, pitch and intonation, as well as a number of illustrations of phonological changes that have taken place in the development of Dzongkha as a language separate from Written Tibetan. A textbook for learning Dzongkha has also recently been published in Japan. Imaeda (2006) is a teach-yourself textbook of spoken Dzongkha, in which the use of Tibetan script is avoided and an alphabetical transcription is employed so that readers can learn spoken Dzongkha more easily.

Among the Tamangic languages spoken in western Nepal, the major ones are Tamang, Gurung, Thakali, Seke, Tangbe and Nar-phu. Isao Honda has studied some of them. Honda (2002a) discusses imperative affixes that are found in different forms among Tamangic languages, and presents a hypothesis for their historical development. Honda (2007) is another comparative study of Tamangic languages that focuses on historical change in demonstratives and plural markers. Honda provides a list of demonstratives and noun plural markers in other Tamangic languages, suggesting that their origin and historical development indicate some relation to corresponding Tibetan morphemes. In Honda (2002b), he compares the phonological systems of three Seke dialects (Tangbe, Chuksang, and Tetang) on the basis of his fieldwork data. Honda (2004) compares deictic motion verbs in Seke with those in other Tamangic languages, and argues that deictic motion verbs in Seke have developed more grammaticalized functions. For instance, ‘come’ has developed into a future marker. Honda (2003) is an overview of Tangbe, spoken in Mustang. His description covers the phonology, nouns and numerals, verbs, adjectives, and copulas. Based on his analysis, he suggests that Tangbe is a dialect of Seke.

Kaike is spoken in the Dolpā district of Nepal. It has been suggested that Kaike is a Tamangic language. Honda (2008), based on his own field work data, discusses lexical correspondences between Kaike and other Tamangic languages, and concludes that Kaike is genetically closely related to Tamangic languages, though it differs from other Tamangic languages in its tone system and pronominals.

4.2. Western Himalayish (Himalayish)

Western Himalayish (or West Himalayish) languages are closely related to Tibetan and are spoken in India along the Indo-Tibetan border and on both sides of the north-west Indo-Nepalese border. Yoshiharu Takahashi has written reports on his descriptive study of the Pangi dialect of the Kinnauri language, spoken in the Kinnaur district of Himachal Pradesh, India. Takahashi (2001) gives a brief overview of Pangi with respect to phonetics, phonology and some morphosyntactic phenomena. Takahashi (2004) is a revised version of his 2001 report, and includes a discussion of a reflexive/reciprocal suffix and a general tense suffix. Takahashi (2002) is a detailed description of case forms in Kinnauri. He identifies seven case markers: absolutive (zero marking), ergative/instrumental (-ɨs,-s), dative (-pɨn), locative (-ō,-ē), ablative (-č), genitive (-ū) and comitative (-ran). In Takahashi (2008), he discusses the verb morphology in terms of person, tense, aspect, and the intransitive-causative alternation; a verbal suffix -ši; as well as Pangi case forms and postpositions. He also describes a grammaticalized function of the locative marker -ō, which expresses an on-going action, and explicates the pattern of ergative case marking and the conditions on split ergative marking. Takahashi (2007) discusses the verb inflection, the deictic pattern of motion verbs and the case-marking system in Kinnauri, arguing that they fundamentally reflect the distinction between speech-act participants and everyone else, and the parallel distinction between speech-act location and everywhere else.

Katsuo Nawa is an anthropologist who has been working on the Byansi people, whose communities reside on both sides of the western border between India and Nepal. Nawa (2000) discusses a couple of sociolinguistic issues related to the identity of the mother tongues of the people in the Byans district of Nepal. One is the self-denomination of their language as rang boli (Rang’s language), which refers both to the language spoken in the area (Byansi) and to the ones spoken in the adjacent areas (Chaudansi and Darmiya); the other is how the Byansi people see their mother tongue in terms of linguistic purity and multilingualism.

4.3. Newari (Himalayish)

Newar (aka Nepāl Bhāśā, Newari) is the language of the indigenous people of the Kathmandu Valley, the Newars, who had kingdoms there until the late eighteenth century. The language has been studied by several western scholars, including Austin Hale, Carol Genetti and David Hargreaves. The Newar languages constitute a sub-family, Newaric, and are divided into three major varieties: Kathmandu-Patan-Bhaktapur, Dolakha, and Pahari. Two Japanese scholars, Ikuko Matsuse and Kazuyuki Kiryu (one of the authors of this report), have been working on the Kathmandu and Patan dialects.

The main topic of Matsuse (2000, 2008) is motion verbs in Newar. Both papers treat motion verbs meaning ‘go’, ‘come’, ‘bring’ and ‘take’ and their extended uses. These papers are mainly descriptive and analyze the functions of the motion verbs from a cognitive linguistic point of view. Matsuse (2007) is a detailed description of the functions of the deictic verb waye ‘to come’, in which she tries to present a uniform treatment of the verb’s semantics. Matsuse (2004) analyzes the functions of the auxiliary verb biye. The verb has the basic meaning ‘give’ and is also used as a multifunctional auxiliary verb with benefactive, malefactive, and causative senses. She tries to explicate the functions in a unified way based on a semantic continuum of the GIVE schema.

Kazuyuki Kiryu has also been working on Newar within a functional descriptive framework. Kiryu (2000a) is the result of Kiryu’s and Matsuse’s contrastive study of basic verbs in Newar and Japanese. The volume contains about 350 Newar basic verbs provided with meanings in Japanese as well as examples and notes on their use. Kiryu (2000b, 2001a) are descriptive studies of the tense and aspect system in Newar, discussing a perfect marker dhune/dhũːke, and the aspectual auxiliary verbs cwane (progressive/resultative) and taye (resultative). Newar has a productive causative suffix -k, and Kiryu (2001b) describes its functions in detail. He discusses its functions not only as a canonical causative marker but also in marking an action oriented toward others (verbs of dressing), in showing the subject’s indirect involvement in an event expressed by a causativized verb in conjunctive participle form, and in turning a non-intentional verb into an intentional verb without increasing valence.
Kiryu (2004a) discusses the semantic contrast between the past disjunct and the stative form of a verb in the context of negation. He shows that simple past negation is expressed by the negated stative form, while the negated past disjunct form does not serve as simple negation but indicates a negative attained situation (a ‘not X any more’ sense). He distinguishes five types of negative attained situations in Newar and illustrates from a typological point of view how they are expressed in Japanese, Chinese, Thai and English. Kiryu (2004b), a concise introduction to numeral classifiers in Newar, provides a list of them in addition to a discussion of their syntactic features and the dimensional adjectives formed from them. Kiryu (2007a, 2008b) are papers that discuss case marking patterns and ergativity in Newar, focusing on two-place predicates and the case markers that their arguments bear. Kiryu (2008a) gives a detailed description of the verb naye ‘to eat’ in Newar, based on the descriptive framework presented by Pardeshi et al. (2006).

Kiryu has also compiled three volumes of descriptive materials on the Newar language. Kiryu (2000a) is a booklet that contains about 350 entries for Newar basic verbs, with details on usage and with examples. Kiryu (2002a) is a reference grammar of the Newar language and Kiryu (2002b) is a selected lexicon that includes about 2000 head words in Newar with Japanese translations. Both (2002a) and (2002b) were used in a six-week intensive course in Newar held in 2002 at the Research Institute for Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies.

4.4. Luish (Jingpho-Nungish-Luish)

The Luish languages, also known as the Sak languages, are spoken in Bangladesh. Huziwara (2002b) reports on Cak (or Sak), a Luish language spoken in the Chittagong Hill Tracts. He provides a 1000-word basic vocabulary list and an analysis of its phonetics and phonology. He identifies eight vowels, twenty-six consonants and two tones (high and low) in Cak, pointing out the existence of implosives and partial tonal changes. Huziwara (2005) is an annotated text in Cak, with a full translation of a story titled “The tiger’s dream” and grammatical analyses. Huziwara has recently received his Ph.D. from Kyoto University for a comprehensive descriptive study of Cak (Huziwara 2008a). The thesis also includes a list of about 800 Cak words with corresponding words in Bangla (Indo-Aryan), Marma (Lolo-Burmese) and Burmese (Lolo-Burmese).

4.5. Burmish (Lolo-Burmese)

On the South Asian side of the Arakan mountains, Marma, a sub-dialect of Arakanese (Burmish), is spoken in the Chittagong Hill Tracts, Bangladesh. Huziwara (2003) describes the phonetics and phonology of Marma. He presents a phonetic inventory of Marma, concluding that it has seven vowels, twenty-eight consonants and four tones (high, low, rising and glottal). The paper also includes a list of 1000 basic words in Marma.

4.6. Bodo-Garo (Kamarupan)

Huziwara and Kiryu also work on languages in the Bodo-Garo group of the Kamarupan languages. Huziwara has recently started working on another Tibeto-Burman language, Usoi, a southern dialect of Tripura (Kokborok). Huziwara (2008b) provides an overview of its phonetics, phonology, morphology and syntax, and also includes the annotated text of a folktale. Kiryu is working on Meche (also known in India as Mech, Mechi), spoken in the Jhapa District (southeastern Nepal). It is closely related to the Boro spoken in Assam. Meche is one of the indigenous languages that the government of Nepal recognizes as endangered and therefore in need of protection. Kiryu (2004c) is a concise field report, which summarizes some basic grammatical information about Meche. Meche has an interesting particle, chəi, that marks a change of situation (or change of state), and Kiryu (2007b) describes its functions, showing how it serves not only as a change of situation marker but also as a discourse particle that marks sequential events in a procedural discourse. Kiryu (2008c), the result of three years of research on Meche, includes a grammatical sketch, one annotated story and a glossary of about 1300 words.

4.7. Austro-Asiatic languages

Toshiki Osada has been working on Mundari. Osada (2001a, b) are learning materials for Mundari (textbook and reader) written in Japanese. Osada (2001c) and Osada (2005) treat personal pronouns in Mundari in a comparative perspective with other South Asian languages. In these works Osada demonstrates that Mundari personal pronouns resist influence from or convergence with neighboring languages. Osada, Kobayashi and Murmu (2003) is a report on a pilot survey of the dialects of Kherwarian languages. Evans and Osada (2005) is an extremely important contribution on the vexing issue of word classes in Mundari. Mundari has often been cited as an example of a language without word classes, where a single word can function as noun, verb, adjective, etc. according to the context.
They refute such claims of word class fluidity and argue that Mundari does in fact have clearly definable word classes. Osada (2007) is a detailed description of reciprocals in Mundari, while Osada (2008) is a descriptive overview of the language. Apart from the works mentioned here, Osada has written many articles in Japanese on various grammatical aspects of Mundari.

Makoto Minegishi has been working on Santali. One of his major contributions is Minegishi and Murmu (2001), a compilation of 1000 basic words of the Singhbhum dialect of Santali along with examples and grammatical notes. The Santali examples are rendered in Devanagari as well as in phonemic transcription and are provided with English and Japanese translations. Unlike the dialect described in Bodding (1932-1936), which has eight vowels, the Santali of Singhbhum has only six vowels.

Note *

Studies on Tibeto-Burman languages have been reported on by Kiryu and others by Pardeshi. We would like to thank Peri Bhaskararao, Eijiro Douyama, Keisuke Huziwara, Isao Honda, Hideaki Ishida, Ikuko Matsuse, Makoto Minegishi, Toshiki Osada, Tomio Mizokami, Miki Nishioka, Yoshiharu Takahashi and So Yamane for their valuable input, without which the completion of the report would not have been possible. Special thanks are due to Peter Hook for his pertinent suggestions on stylistic matters and careful proofreading. Needless to say, responsibility for any remaining errors lies solely with us.

References

(The Japanese transliteration of the works cited here, unless provided with the original, is based on a modified Hepburn system (Hebon-shiki), in which long vowels are represented by doubling the same vowel.)

Bhaskararao, Peri (ed.)
2003 Working Papers of International Symposium on Indic Scripts: Past and Present. Tokyo: Research Institute for Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies.

Bhaskararao, Peri
2003 Elements of Indian Indic Scripts. In Peri Bhaskararao (ed.) Working Papers of International Symposium on Indic Scripts: Past and Present, 382-391. Tokyo: Research Institute for Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies.
2004 Phonetic Documentation of Endangered Languages—Creating a Knowledgebase Containing Sound Recording, Transcription and Analysis. Acoustic Science and Technology (Journal of the Acoustical Society of Japan) 25(4): 219-226.
2006 Toda Verbal Paradigms—Past, Non-past and Negative. In Peri Bhaskararao (ed.) Research on Minority Languages of South and South-East Asia—Working Papers, 126-144. Tokyo: Research Institute for Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies.
2007a Toda Verbs ‘to be’ and ‘to become’. In Peri Bhaskararao (ed.) Research on Minority Languages of South and South-East Asia—Working Papers 2: 112-118. Tokyo: Research Institute for Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies.
2007b Toda Verbal Stem Alternants. In Peri Bhaskararao (ed.) Research on Minority Languages of South and South-East Asia—Working Papers 2: 119-141. Tokyo: Research Institute for Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies.

Bhaskararao, Peri and Peter Ladefoged
2007 Timing constraints within gestures: A re-examination of Toda sibilants. In Peri Bhaskararao (ed.) Research on Minority Languages of South and South-East Asia—Working Papers 2: 106-111. Tokyo: Research Institute for Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies.

Bhaskararao, Peri and K. V. Subbarao (eds.)
2004a Non-nominative Subjects. Volume 1. Amsterdam/Philadelphia: John Benjamins.
2004b Non-nominative Subjects. Volume 2. Amsterdam/Philadelphia: John Benjamins.

Bodding, P. O.
1932-36 A Santal Dictionary (5 Volumes, reprinted in 2003). India: Gyan Publishing House.

Doyama, Eijiro
2005 A morphological study of the first person subjunctive in the Rigveda. In Machikaneyama Ronso 39, 1-19. Osaka: Osaka Daigaku Bungakubu (Faculty of Letters, Osaka University).
2008 On the Function of the Root-Aorist Participle. In Journal of Indian and Buddhist Studies 56-3, 1043-1048.

Evans, Nicholas and Toshiki Osada
2005 Mundari: The myth of a language without word classes, Linguistic Typology 9: 351-390.

Honda, Isao
2002a On Tamangic imperatives, Gipan 2: 67-80. Central Department of Linguistics, Tribhuvan University.
2002b Seke Phonology: Comparative Study of Three Seke Dialects, Linguistics of Tibeto-Burman Area 25(1): 191-210.

2003 A sketch of Tangbe. In Tej Ratna Kansakar & Mark Turin (eds.) Themes in Himalayan Languages, 49-64. Heidelberg: South Asia Institute & Kathmandu: Tribhuvan University.
2004 Grammaticalization of deictic motion verbs in Seke. In Anju Saxena (ed.) Himalayan Languages: Past and Present, 285-310. Berlin / New York: Mouton de Gruyter.
2007 A comparative and historical study of demonstratives and plural markers in Tamangic languages. In Roland Bielmeier & Felix Haller (eds.) Linguistics of the Himalayas and Beyond, 97-118. Berlin / New York: Mouton de Gruyter.
2008 Some observations on the relationship between Kaike and Tamangic, Nepalese Linguistics 23: 83-115.

Hook, Peter and Prashant Pardeshi
2009 The Semantic Evolution of EAT-Expressions: Ways and Byways. In John Newman (ed.) The Linguistics of Eating and Drinking, 153-172. Amsterdam: John Benjamins.

Huziwara, Keisuke
2001 Chakumago no akusento ni kansuru koosatsu (A study of accent patterns in Chakuma), MA thesis. Kyoto: Kyoto University.
2002a Gengo minsyusyugi kara gengo têkokusyugi e -- syôsû gengo kara mita banguradesyu no gengo mondai -- (From linguistic democracy to linguistic imperialism: issues on languages in Bangladesh from the perspective of minority languages), Syakai Gengogaku (Sociolinguistics) Vol. II: 99-117.
2002b Chakkugo no onsee ni kansuru koosatsu (A phonetic analysis of Cak), Kyooto Daigaku Gengogaku Kenkyuu (Kyoto University Linguistic Research) 21: 217-73.
2003 Marumago no onsee ni kansuru koosatsu (A phonetic analysis of Marma), Kyooto Daigaku Gengogaku Kenkyuu (Kyoto University Linguistic Research) 22: 237-300.
2005 Chakkugo no shiryoo to bunpoo kaishaku -- tora no yume -- (Text and grammatical analysis of Cak -- “A Tiger’s Dream”), Kyooto Daigaku Gengogaku Kenkyuu (Kyoto University Linguistic Research) 24: 131-52.
2008a Chakkugo no kijyutsu gengogaku teki kenkyuu (A descriptive study of the Cak language), Ph.D. dissertation. Kyoto University.
2008b Usuigo bunpoo no gaiyoo (An outline of Usoi Tripura grammar), Kyooto Daigaku Gengogaku Kenkyuu (Kyoto University Linguistic Research) 27: 81-124.

Imaeda, Yoshiro
2006 Zonkago Koogo Kyoohon (A Textbook of Colloquial Dzongkha). Tokyo: Daigaku Shorin Publishers.

Ishida, Hideaki
2001 Jitsuyoo Maraatiigo Kaiwa (Practical Marathi Conversation). Tokyo: Daigakushorin Publishers.
2004 Kiso Maraatiigo (Basic Marathi). Tokyo: Daigakushorin Publishers.

Kiryu, Kazuyuki
2000a Newaarugo Kihondooshi Yooreeshuu (A Collection of Examples and Usages of Basic Newar Verbs). A Report of the Contrastive Study of Basic Predicates Between Japanese and Newari with a Grant-in-Aid from Mimasaka Women’s Junior College.
2000b A note on perfect aspect in Newari, Mimasaka Joshidaigaku Mimasaka Joshidaigaku Tankidaigakubu Kiyoo (Bulletin of Mimasaka Women's College and Mimasaka Women's Junior College) 45: 45-50.
2001a Newaarugo no tensu/asupekuto o megutte (Issues of tense and aspect in Newar), Mimasaka Joshidaigaku Mimasaka Joshidaigaku Tankidaigakubu Kiyoo (Bulletin of Mimasaka Women's College and Mimasaka Women's Junior College) 46: 45-56.
2001b Types of verbs and functions of the causative suffix -k in Newar, Kobe Papers in Linguistics 3: 1-9. Kobe: Department of Linguistics, Kobe University.
2002a Newaarugo Bunpoo (A Grammar of Newar). Tokyo: Research Institute for Languages and Cultures of Asia and Africa.
2002b Newaarugo Goishuu (A Newar Lexicon). Tokyo: Research Institute for Languages and Cultures of Asia and Africa.
2004a Hitee teki jyookyoo eno henka o arawasu hyoogen no taishoo kenkyuu -- newaarugo, nihongo, eego, taigo, chuugokugo o hikakushite (A contrastive study of expressions that depict changes into negative attained situations: a comparison in Newar, Japanese, English, Thai and Chinese). In Taro Kageyama & Hideki Kishimoto (eds.) Nihongo no Bunseki to Gengo-ruikee: Shibatani Masayoshi Sensee Kanreki Kinen Ronshuu (Analyses of Japanese and Language Typology: Festschrift for Professor Masayoshi Shibatani), 186-215. Tokyo: Kuroshio Publishers.
2004b Newaarugo no ruibetsushi (Numeral classifiers in Newar). In Yoshihiro Nishimitsu & Shinobu Mizoguchi (eds.) Ruibetsushi no Taishoo (Contrastive Studies of Classifiers), Ch. 10, 186-215. Tokyo: Kuroshio Publishers.
2004c Mechego choosa nooto (A field report on the Meche language), Mimasaka Daigaku and Mimasaka Daigaku Tankidaigakubu Kiyoo (Bulletin of Mimasaka University and Mimasaka Junior College) 49: 31-39.

2007a
Newaarugo ni okeru kaku to tadoosee (Case marking and transitivity in Newar), Tadoosee no Tsuugengoteki Kenkyuu (Crosslinguistic Studies in Transitivity), 191--203. Tokyo: Kuroshio Publishers. Kiryu, Kazuyuki 2007b Mechego no chəi no kinoo ni tsuite (The functions of a particle chəi in Meche), Kobe Papers in Linguistics 5: 79--92. Kobe: Department of Linguistics, Kobe University. 2008a EAT expressions in Kathmandu Newar, Mimasaka Daigaku and Mimasaka Daigaku Tankidaigakubu Kiyoo (Bulletin of Mimasaka University and Mimasaka Junior College) 53: 1--9. 2008b Ergativity and case in Newar. In Tokusu Kurebito (ed.) Ambiguity of Morphological and Syntactic Analysis, 175--92. Tokyo: Research Institute for Languages and Cultures of Asia and Africa. 2008c An Outline of the Meche Language: grammar, text and glossary, A report of the research project supported by Grant-in-Aid for Young Scientists (B), Ministry of Education, Culture, Sports, Science and Technology, Japan (MEXT Grant, No. 177200093). Tsuyama: Mimasaka University. Kodama, Nozomi 2001 Convergence patterns in Tuluva: A new scope for comparative studies. In Peri Bhaskararao and K. V. Subbarao (eds.) The Yearbook of South Asian Languages and Linguistics 2001, 185-203. New Delhi: Sage Publications. Mamiya, Kensaku 1994 Sindiigo Bunpoo Gaisetsu (An outline of Sindhi Grammar). Gengoshiryo (Language materials) 15: 51-136. Tokyo: Tokyo University of Foreign Studies. 1996 Sindiigo Kiso 1500 go (1500 Basic Words in Sindhi). Tokyo: Daigakushorin Publishers. 1998 Sindiigo ni okeru goi-shakuyoo (Lexical borrowing in Sindhi). Minamiajia Gengobunka (Languages and Cultures of South Asia) 3: 124. Tokyo: Tookyoo Gaikokugo Daigaku Minami-ajia Gengo-bunka Kenkyuukai (Tokyo University of Foreign Studies, Research Circle on Languages and Cultures of South Asia). Matisoff, James A. 1991 Sino-Tibetan linguistics: present state and future prospects. Annual Review of Anthropology 20: 469--504. 2003 Handbook of Proto-Tibeto-Burman. 
Berkeley: University of California Press. Matsuse, Ikuko 2000 On ‘wane (to go)’ and ‘waye (to come)’ in Newar: their basic use and extensions. KLS (Proceedings of the Twenty-fourth Annual Meeting of Kansai Linguistic Society) 20: 175--85.

2004
The ‘give’ verb and its auxiliary uses in Newar. Taro Kageyama & Hideki Kishimoto (eds.) Nihongo no Bunseki to Gengo-ruikee: Shibatani Masayoshi Sensee Kanreki Kinen Ronshuu (Analyses of Japanese and Language Typology: Festschrift for Professor Masayoshi Shibatani), 455--72. Tokyo: Kuroshio Publishers. 2007 Newaarugo ni okeru waye no sukiima to imikakuchoo ( A schematic approach to ‘waye (to come)’ in Newar). Kobe Papers in Linguistics 5: 131--41. Kobe: Department of Linguistics, Kobe University. 2008 Newaarugo ni okeru sieki-idoo-dooshi no kakuchoo-hyoogen (Extended uses of deictic causative motion verbs in Newar). KLS (Proceedings of the Thirty-Second Annual Meeting of Kansai Linguistic Society) 28: 152--62. Minegishi, Makoto and Ganesh Murmu 2001 Santali Basic Lexicon with Grammatical Notes. Tokyo: Institute for Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies. Nara, Tsuyoshi and Peri Bhaskararao 2001 Toda Vocabulary—A Preliminary List (ELPR Publication Series A3002). Osaka: Osaka Gakuin University. 2002 Toda Texts (ELPR Publication Series A3-005). Osaka: Osaka Gakuin University. 2003 Songs of the Toda (ELPR Publication Series A3-011). Osaka: Osaka Gakuin University. Nawa, Katsuo 2000 Nepaaru byansu ni okeru ‘bogo’ o meguru shomondai -- gengo mee no yoohoo to shiji taishoo o megutte (Some issues concerning ‘mother tongue’ in Byansi, Nepal: how the language name is used and what it refers to). Kotoba to shakai (Language and Society) 4: 201--220. Tokyo: Sangen Publishers. Nishi, Yoshio 2000 Himaraya chiiki no chibetto biruma-kee gengo kenkyuu no dookoo -kaisoo to genjyoo (Trends in studies of Tibeto-Burman languages in the Himalayan region: past and present), Kokuritsu Minzokugaku Hakubutsukan Kenkyuu Hookoku (Bulletin of the National Museum of Ethnology) 25 (2): 203--33. 
Nishida, Fuminobu 2004 Zonkago Gasa hoogen no on’in taikee (A Phonology of the Gasa dialect of Dzongkha), Reitaku Daigaku Kiyoo (Reitaku University Journal) 78: 13--29. Nishioka, Miki 2000 Hindiigo no koochishi ‘ko’ no kinoo ni tsuite—danwa bunpoo no kanten kara (The Function of Dative Postposition in Hindi: From the Viewpoint of Discourse Grammar). EX ORIENTE 4: 123-151.

2001
Hindiigo no jyuusooteki toogo koozoo (Multi-layered Syntactic Structure in Hindi), Unpublished Ph.D. dissertation. Osaka University of Foreign Studies. 2002 -te-kee dooshi ga kakawaru hukugoo-dooshi no hindiigo kara no koosatsu—asupekuto hyoogen o chuushin ni—(Japanese verb compounds with ‘-te’ form, analyzed from the viewpoint of Hindi: focused on aspectual expressions). Nihongo Nihonbunka (Japanese language-Japanese Culture) 28: 95-121. Osaka: Oosaka Gaikokugo Daigaku Ryuugakusee Nihongokyooiku Sentaa (Osaka University of Foreign Studies, Foreign Student Japanese Language Education Center). 2003 Nihongo, hindiigo ni okeru tadoo-shieki-see (Transitivity and causativity in Japanese and Hindi). Nihongo Nihonbunka (Japanese language-Japanese Culture)” 29: 85-113. Osaka: Oosaka Gaikokugo Daigaku Ryuugakusee Nihongokyooiku Sentaa (Osaka University of Foreign Studies, Foreign Student Japanese Language Education Center). 2004 Hindiigo no iwayuru hukugoo-dooshi ni tsuite (On the so-called compound verb in Hindi)”. EX ORIENTE 10: 207-253. Osaka: Oosaka Gaikokugo Daigaku Gengo-shakai Gakkai (Research Circle on Language and Society, Osaka University of Foreign Studies Society). Onishi, Masayuki 2001 Non-canonically marked A/S in Bengali. In Alexandra Y. Aikhenvald, R.M.W. Dixon and Masayuki Onishi (eds.), Non-canonical Marking of Subjects and Objects (Typological Studies in Language 46) , 13-148. Amsterdam: John Benjamins. Osada, Toshiki 2001a Mundago Kyoohon (Munda Textbook). Tokyo: Institute for Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies. 2001b Mundago Tokuhon (Munda Reading Materials). Tokyo: Institute for Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies. Osada, Toshiki 2001c Personal pronouns and related phenomena in South Asian linguistic area: convergent features or convergence-resisting features? The Yearbook of South Asian Languages and Linguistics 2001, 269-287. 
New Delhi: Sage Publications. 2005 A historical note on inclusive/exclusive opposition in South Asian languages—borrowing or retention or innovation? Mon-Khmer Studies 34:79-96. 2007 Reciprocals in Mundari, In Vladimir. P. Nedjalkov (ed.) Reciprocal constructions, 1575-1590. Amsterdam: John Benjamins.

2008
Mundari. In Gregory Anderson (ed.) Munda languages, 99-164. New York: Routledge. Osada Toshiki, Kobayashi Masato and Ganesh Murmu 2003 Report on a preliminary survey of the dialects of Kherwarian languages. Journal of Asian and African Studies 66: 331-364. Pardeshi, Prashant 2000 The passive and related constructions in Marathi. In Rajendra Singh (ed.) The Yearbook of South Asian Languages and Linguistics 2000, 147-171. New Delhi: Sage Publications. 2003 The compound verb in Marathi: definitional issues and criteria for identification. Indian Linguistics 64: 19-36. 2004 Dative subject construction: a semantico-syntactic kaleidoscope. In Kageyama, Taro and Kishimoto, Hideki (eds.) Nihongo no Bunseki to Gengo-ruikee: Shibatani Masayoshi Sensee Kanreki Kinen Ronshuu (Analyses of Japanese and Language Typology: Festschrift for Professor Masayoshi Shibatani), 527-541. Tokyo: Kuroshio Publishers. 2008 No smoke without fire: invisible agent construction in South Asian languages. In Rajendra Singh (ed.) Annual Review of South Asian Languages and Linguistics 2008, 63-82. Berlin: Mouton de Gruyter. 2009 A tale of three passives in Marathi--a glimpse into the subtle interplay between agent defocusing and intentionality. CLS 42: 199-210. Chicago: Chicago Linguistic Society. Pardeshi, Prashant et al. 2006 Toward a Geotypology of EAT-expressions in Languages of Asia: Visualizing Areal Patterns through WALS. Gengo Kenkyu (the Linguistic Society of Japan Journal) 130: 89-108. Rituparna, Suresh 2008 Jaapaan mein hindi-urdu shikShaN: sau saal kaa saphar (Hindi-Urdu Education in Japan: A journey of 100 years). In Suresh Rituparna (ed.), iii-xxxvii. Tokyo: Tokyo University of Foreign Studies.

Shibatani, Masayoshi and Prashant Pardeshi 2001 Dative Subject Constructions in South Asian languages. In Peri Bhaskararao and K.V. Subbarao (eds.) The Yearbook of South Asian Languages and Linguistics 2001, 311-347. New Delhi: Sage Publications. Suzuki, Hiroyuki 2004 Zonkago no onsee tokuchoo to onso settee (Phonological features and the phonetic inventory in Dzongkha) Nidaba (A journal of Linguistic Society of West Japan) 33: 109--118. Takahashi, Yoshiharu 2001 A descriptive study of Kinnauri (Pangi dialect): a preliminary report. In Yasuhiko Nagano & Randy J. LaPolla (eds.) New Research on Zhangzhung and Related Himalayan Languages: Bon Studies 3, Senri Ethnological Report 19: 97--119. Osaka: National Museum of Ethnology. 2002 A report on case forms in Kinnauri (Pangi dialect). In Yasuhiko Nagano (ed.), Shanshungo no Saikoochiku to Chibetto Bungo Keesee ni kansuru Soogooteki Kenkyuu (Reconstruction of the Zhangzhung Language and Formation of Written Tibetan), A report for Grant-in-Aid for Scientific Research supported by the Ministry of Education of Science, Sports and Culture, Japan, 1--13. Osaka: National Museum of Ethnology. 2004 Kinaurugo no Kijutsu Oyobi Keetai-toogoron-teki Kenkyuu (A Descriptive and Morphosyntactic Study on Kinnauri), a report of research project, Grant-in-Aid for Scientific Research (C) No.12610556. Nagoya: Aichi Prefectural University. 2007 On the deictic patterns in Kinnauri (Pangi dialect). In Roland Bielmeier & Felix Haller (eds.) Linguistics of the Himalayas and Beyond, 341--354. Berlin / New York: Mouton de Gruyter. 2008 Kinaurugo no Genchi Choosa niyoru Kijutsu oyobi Keetaitoogoronteki Kenkyuu (A descriptive and morphosyntactic study on Kinnauri (2)), A report of the research project supported by Grant-in-Aid (C), Ministry of Education, Culture, Sports, Science and Technology, Japan (MEXT Grant, No.16520250). Nagoya: Aichi Prefectural University.

A Morphosyntactic Categorisation Scheme for the Automated Analysis of Nepali

Andrew Hardie, Ram Raj Lohani, Bhim N. Regmi and Yogendra P. Yadava

1. Introduction

In this report, we outline the linguistic rationale underlying the morphosyntactic tagging system used within the recently-completed Nepali National Corpus (NNC). This tagset is referred to as the Nelralec Tagset for Nepali, taking its name from the Nelralec[1] project which constructed the NNC (see Yadava et al., in press).

Morphosyntactic tagging, also known as part-of-speech (POS) tagging, is the most common form of analytic annotation applied to corpus data (for an overview, see van Halteren 1999). It has a range of applications. For instance, it can be used for disambiguation of homonymous word-types (e.g. English modal verb can versus English noun can) in corpus searches, or to allow searching for abstractly-defined grammatical patterns. The frequency of tags in different texts has also been found to be a useful metric of text-type variation (Rayson, Wilson and Leech 2002) and even variation among groups of speakers (Rayson, Leech and Hodges 1997). Other applications include information retrieval, spelling- and grammar-checking, speech processing, handwriting recognition, machine translation, production of corpus-based dictionaries and grammars, and applications in the teaching of foreign languages and knowledge of grammar (see Leech and Smith 1999). Morphosyntactic tagging is also a first step in many more complicated procedures of corpus annotation, such as parsing and semantic tagging.

Tagging may be done either manually, by a trained analyst, or automatically, by a dedicated piece of software called a tagger. However, in either case, an absolute prerequisite to the tagging process is the availability of some grammatical categorisation scheme for the words of the language being tagged. Creating a tagset for a language that has not previously been annotated in this way represents a challenge unlike any other form of grammatical analysis. In most forms of tagging, every word in the text that is tagged is assigned to one, and only one, category (that is, each word is given a single tag).[2] This means that the categories within the tagset must be exhaustive, and the boundaries between them must be clearly delineated, so that it is possible to make decisions with regard to borderline cases. Furthermore, in creating a tagset it is necessary to make decisions regarding which linguistic features or grammatical categories will be distinguished within the analysis, and which will not. A tagset which makes too few distinctions among different classes of words will be less than optimally useful; a tagset which makes too many distinctions becomes unwieldy and impractical, and resistant to the creation of useful generalisations across tags.

The aim of this paper is to describe and explain the process by which an analysis of Nepali grammar was devised which was amenable to being expressed in the form of a tagset. As well as explaining how the general issues enumerated above were addressed, it will also report on some specific issues relating to the grammar of Nepali that proved problematic for the process of compiling the Nelralec Tagset.

It is important to be clear on the limits of such a scheme of analysis. This tagset has been compiled on the basis of a number of widely published works on Nepali (including the grammars of Acharya 1991 and Adhikari 1998, and the dictionary of Schmidt 1993), as well as the insights of native-speaker linguists. It should be noted that the cited grammars describe, in the main, standard Nepali. It follows that the tagset may not be sufficient for other dialects of Nepali, although we would anticipate that no more than minor modifications would be required in this case.

The Nelralec Tagset as presented here represents the end-point of a process of development. To ensure the usability of our categorisation scheme, we tested it at each stage of development by manually applying the tags to sample texts. Feedback from this process – on issues such as whether the definitions of the categories were clear, whether any phenomena had been encountered in the text which could not be analysed using the tagset, and so on – was then used to inform the next round of revisions to the tagset, before further testing. Sometimes these revisions were quite major, with the result that the tagset we present here is very different from our initial draft. It is clear from our experience that this process of practical testing is absolutely indispensable in the creation of such a scheme of analysis.
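The requirement that every token receive exactly one tag from an exhaustive inventory can be checked mechanically on a manually tagged sample. The following is a minimal sketch of such a check, not the procedure the project used. It assumes a `word_TAG` notation with an underscore separator, and the tag inventory shown is only an illustrative subset of tags mentioned in this paper; the function name `check_tagging` is hypothetical.

```python
# Illustrative subset of tags mentioned in this paper (an assumption:
# the full inventory is given in the published tagset listing).
TAGSET = {"NN", "NP", "II", "IH", "IKM", "IKF", "IKO", "IE", "IA",
          "JM", "JF", "JO", "JX"}


def check_tagging(text, tagset=TAGSET):
    """Return the tokens that do not carry exactly one known tag.

    Tokens are assumed to be whitespace-separated and written as word_TAG.
    """
    problems = []
    for token in text.split():
        word, sep, tag = token.rpartition("_")
        has_one_tag = bool(sep) and bool(word) and "_" not in word
        if not has_one_tag or tag not in tagset:
            problems.append(token)
    return problems
```

A non-empty result points either to an annotation slip or to a phenomenon the current draft of the tagset cannot yet describe, which is the kind of feedback the testing rounds described above rely on.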

The paper is structured as follows. In section 2, we outline the impact of issues relating to tokenisation on our analysis of Nepali words, in particular case affixes/postpositions. Section 3 explores another problematic issue relevant to nouns and adjectives, namely that of gender. In section 4, we explain the model we used to characterise the very great diversity of inflected verb forms in Nepali within the tagset. Section 5 compares and contrasts the tagset to some other recently-devised annotation schemes for closely related (Indo-Aryan) languages. Finally, section 6 briefly outlines the work that has been done to date in terms of putting this tagset into use, and suggests some future avenues for research in this field.

There is, unfortunately, no space within the scope of this paper for a full explanation of every category within the tagset. However, some idea of our approach to the categories that we do not discuss in depth may be obtained from the full listing of the tagset, available as Hardie et al. (2005), which lays out the definition of each category and its mnemonic tag, together with example words for each category.

2. Tokenisation and postpositions

The first step in both manual and automated tagging is the division of the text into tokens. Tokens are often described as “words” but in this context it may be better to think of them as “appropriately-sized units for morphosyntactic analysis”; such a unit may be larger than, or smaller than, an orthographic word (i.e. a string of letters bounded by white space or punctuation). Usually, each token will receive a single tag to indicate the grammatical category of that token. Tokens which are larger than the orthographic word are typically called multi-word units or multi-word expressions; examples in English include phrases such as because of, in order to, all right, with regard to, and so on, which are often treated as single units for morphosyntactic analysis and thus given single tags. The initial work on Nepali POS tagging described here has not yet investigated the nature and extent of such multi-word units in Nepali. In this section, therefore, we will concentrate on issues of tokenisation below the level of the orthographic word.

When a single graphical word contains multiple elements which would ideally be analysed separately, they must be tokenised separately, i.e. it is necessary to break the graphical word apart into two or more tokens, each of which can receive a tag. In the context of POS tagging, the form which is removed from the start or end of another word and made into a separate token of its own is sometimes called a clitic. This is parallel to, but conceptually distinct from, the use of the term clitic in morphology for a morpheme that is syntactically independent but phonetically attached to another word. There are two main distinctions between the senses of the term. Firstly, tagging is performed on the written form of language and has no access to the spoken form. Thus, when assessing the dependence of one form on another, we have to analyse that dependence in terms of whether or not the two forms are written with a space between them; we cannot use pronunciation as a guide, as is possible in non-automated linguistic analysis. Secondly, while in morphology a clitic can be considered as having an “in-between” status between an affix and a free morpheme, in tokenisation and POS tagging, there is no possible “in-between”: either two morphemes are tokenised and tagged as a single unit (as if the clitic were an affix), or they are tokenised and tagged as two units (as if the clitic were an independent word).

Although there are various clitics (in the POS-tagging sense) in Nepali, the most prominent, and the focus of the remainder of this section’s discussion, are the forms which will be referred to here as postpositions. Like many other Indo-Aryan languages and indeed most SOV-order languages, Nepali possesses exclusively postpositions with no prepositions. They are used for marking all oblique cases and also for the marking of core grammatical relations (subject and object). They are also, in keeping with Nepali’s generally strong tendency to compound and agglutinate, typically (though not always) written as part of the same orthographic word as the noun, adjective or other word whose case they mark. For this reason, they are variously referred to in the literature as postpositions or suffixes. In fact, some grammarians have actually made a distinction between suffixes on the one hand and postpositions on the other.
Some Nepali grammatical traditions consider the following forms to be suffixes: haruu, ko/kii/kaa, le, laaii (whose meanings are, respectively, plural/collective; genitive; ergative/instrumental; and accusative/dative), and consider all other postposition/suffix forms to be postpositions. An attempt to investigate the validity of such an intra-categorical distinction on an empirical basis has been made by Hardie (2007, 2008), but this issue goes beyond the scope of the present paper.

The question which arises here is how best to describe these elements within the framework of a POS tagset. They are very probably best considered clitics in the morphological sense (that is, intermediate between affixes and independent words); and, in terms of the three morphological “layers” in the New Indo-Aryan languages identified by Masica (1991), they are clearly “Layer II” elements – agglutinative markers that have arisen following the breakdown of the Old Indo-Aryan inflectional marking of case and number, as seen for example in Sanskrit. As Masica notes, Layer II elements are referred to variously as affixes and as postpositions “[d]epending on both the language and on scholarly predilections” (1991: 231). But as noted above, an intermediate state is impossible to indicate in POS tagging. So the issue at hand is whether these “Layer II” postpositions are better analysed as inflectional elements on the noun, or as separate tokens (clitics in the POS-tagging sense); or whether different considerations should apply to the so-called suffixes on the one hand (plural, genitive, ergative, accusative) and the other postpositions on the other.

It is superficially attractive to consider haruu, ko/kii/kaa, le, and laaii as constituting a part of the nominal paradigm, namely as number and case inflections. These inflections could then be elegantly captured within the tags for nouns – for instance, using NN1E for singular ergative nouns (NOUN-le), NN2A for plural accusative nouns (NOUN-haruu-laaii), and so on. Tagset standards such as EAGLES (see Leech and Wilson 1999) recommend that where case and number are present in the grammar of a language, they should be indicated as features of nouns in this fashion. However, there are certain problems with this approach when applied to Nepali. Firstly, it is hard to know where to stop treating postpositions as suffixes and where to start treating them as clitics which must be tokenised separately. Should maa (locative) be treated as a separate word? What about other frequent spatial-temporal postpositions such as baaTa “from”, sanga “with”, dekhi “from” and so on? Secondly, and more seriously, the association of even haruu, ko/kii/kaa, le, and laaii with nouns is neither as straightforward nor as exclusive as one might expect nominal inflections to be.
With regard to exclusivity, these morphemes can follow (and attach to) many things other than nouns, including pronouns, adjectives, certain adverbs, and other postpositions. With regard to straightforwardness, the combinations in which these morphemes can occur are far more various than the simple number-case combinations outlined in the previous paragraph. Particularly where the genitive ko/kii/kaa is involved, a single noun root can be followed by multiple postpositions in a range of configurations. To create tags for all these possible combinations, not only for nouns but for adjectives and possibly adverbs as well, would inflate the tagset beyond the point of manageability, and still leave open the possibility of some combination occurring that had not been anticipated in the tagset. So this superficially attractive approach must be abandoned.

Instead, all the forms mentioned above – including the genitive, ergative, and accusative markers, and also including plural-collective haruu, as well as spatial-temporal postpositions such as maa, baaTa etc. – are considered in this scheme of analysis to be clitics in the POS-tagging sense. As such they are tagged separately from the noun, adjective, pronoun or other form to which they are attached. This necessitates the use of powerful (but computationally not problematic) tokenisation software to separate the postpositions from their bases. In order to preserve some indication of the “special” nature of the postpositions most often considered to be elements of the nominal inflection system (that is, haruu, ko/kii/kaa, le, and laaii), these are assigned to subcategories within the postposition category, as follows:

– The general category of postpositions is tagged as II.[3]
– The plural-collective marker haruu (which may or may not be considered a postposition depending on one’s theoretical stance, but which is formally very similar to a postposition and therefore classed with them for purposes of the tagset) is tagged as IH.
– The genitive postpositions ko, kii and kaa are tagged as IKM, IKF and IKO respectively (see also the discussion of gender below).
– The ergative-instrumental postposition le is tagged as IE.
– The accusative-dative postposition laaii is tagged as IA.
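The separation of postpositions from their bases can be illustrated with a short sketch. This is not the Nelralec tokeniser. It is a naive suffix-stripping routine that works on the romanised forms used in this paper; the function name `split_postpositions` and the `min_base` parameter are hypothetical. The tags are those listed above, and the general tag II is assumed for maa, baaTa, sanga and dekhi.

```python
# Postposition forms and tags as listed above. Assigning the general tag II
# to the spatial-temporal forms is an assumption made for this sketch.
POSTPOSITION_TAGS = {
    "haruu": "IH",
    "laaii": "IA",
    "baaTa": "II",
    "sanga": "II",
    "dekhi": "II",
    "kii": "IKF",
    "kaa": "IKO",
    "maa": "II",
    "ko": "IKM",
    "le": "IE",
}

# Try longer forms first so that a long postposition is never cut short.
FORMS = sorted(POSTPOSITION_TAGS, key=len, reverse=True)


def split_postpositions(word, min_base=2):
    """Peel known postpositions off the end of a romanised word.

    Returns (base, clitics), where clitics is a list of (form, tag) pairs
    in their original left-to-right order. min_base is a crude guard that
    stops the base from being stripped away entirely.
    """
    base = word
    clitics = []
    stripped = True
    while stripped:
        stripped = False
        for form in FORMS:
            if base.endswith(form) and len(base) - len(form) >= min_base:
                clitics.insert(0, (form, POSTPOSITION_TAGS[form]))
                base = base[: -len(form)]
                stripped = True
                break
    return base, clitics
```

Because the loop repeats until nothing more can be removed, stacked postpositions such as haruu followed by laaii come out as separate tokens. A purely string-based rule of this kind will also strip a final ko or le that belongs to the base itself, so a practical tokeniser would need a lexicon or other evidence to block such splits.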

Assigning haruu to its own subcategory, in particular, allows scholars using text tagged according to this scheme to exclude it from consideration with the other postpositions, if they would prefer to do so. While the tokenisation-based approach to Nepali case suffixes/postpositions was settled on for the reasons given above, it should be acknowledged that it does leave some residual inconsistencies. For example, while nouns do not have a genitive case tag (since ko/kii/kaa is treated as a separate token), some personal pronouns do require genitive case tags, since the first-person, second-person and reflexive possessive pronouns (mero, tero and aaphno respectively) are clearly not composed of a pronoun + postposition combination. The “possessive” (i.e. genitive) feature of these pronouns is indicated in their tags with the letter K (mero is PMXKM, tero is PTNKM, and aaphno is PRFKM) to show the parallel with the postposition tag IKM for ko. But still, there is an inconsistency in how two parallel linguistic realisations of the genitive are tagged, as schematised below:

NOUN1_NN ko_IKM NOUN2_NN    “NOUN1’s NOUN2”

versus

mero_PMXKM NOUN_NN    “my NOUN”

However, inconsistencies of this degree are probably unavoidable in any extensive scheme of analysis; language as a phenomenon is itself rarely entirely consistent. Certainly, a similar inconsistency may be found in the tagging of any language which has specifically possessive pronouns alongside a clitic or adpositional genitive on nouns.

While in many languages case, gender and number are best analysed together, in Nepali case and number are indicated by postpositions, whereas gender is marked by an inflectional affix. Thus, the considerations that apply to gender are somewhat different to those discussed in this section. The following section outlines the descriptive treatment of Nepali gender adopted within the Nelralec Tagset.
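Because the letter K appears in both the postposition tags and the possessive pronoun tags, a corpus user can still retrieve both realisations of the genitive with a single tag pattern. The sketch below is illustrative only. It assumes the `word_TAG` notation of the schematic example, and it assumes that possessive pronoun tags always have five letters with K in fourth position, which is a generalisation from the three tags cited (PMXKM, PTNKM, PRFKM) rather than a documented property of the tagset. The helper names are hypothetical.

```python
import re

# Assumption: genitive postpositions are IKM/IKF/IKO, and possessive pronoun
# tags follow the shape P..K. as in PMXKM, PTNKM and PRFKM.
GENITIVE_TAG = re.compile(r"IK[MFO]|P..K.")


def parse_tagged(text):
    """Split 'word_TAG word_TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    for token in text.split():
        word, _, tag = token.rpartition("_")
        pairs.append((word, tag))
    return pairs


def genitives(text):
    """Return the words whose tag marks them as genitive or possessive."""
    return [word for word, tag in parse_tagged(text)
            if GENITIVE_TAG.fullmatch(tag)]
```

Run over the two patterns schematised above, with real words in place of the placeholders, the function returns ko in the first case and mero in the second, so the difference in tokenisation need not stand in the way of a unified search.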

3. Gender on nouns and adjectives

Like many Indo-European languages, Nepali has grammatically-marked gender. The genders distinguished by Nepali are masculine and feminine. The assignment of gender to nouns is natural: nouns denoting female humans are feminine; nouns denoting male humans are masculine; the default for all other nouns is masculine. Those categories that mark gender do so by means of a three-way distinction between the suffixes o/ii/aa, where o is masculine, ii is feminine, and aa is “other” (it can indicate masculine plural, or feminine plural, or oblique case marking motivated by a following postposition, or honorific marking). For example, the adjective raamro “good” is a masculine form, the corresponding feminine is raamrii, and the corresponding “other” is raamraa. A similar pattern is found on some nouns, for example, keTo “boy”, keTii “girl”, and keTaa “boy (plural/oblique/honorific)”. Other categories that are marked in this way include various determiners, ordinal numbers, some verbal forms, and one postposition, genitive ko/kii/kaa.

However, this gender marking is not found on all members of the gendered categories. Many words are “unmarked” – that is, they have a single invariant form regardless of gender. For example, adjectives are a gendered category, marked for gender agreement with the noun they modify. But many, many adjectives are actually unmarked – that is, there is nothing in their morphology to indicate their gender. This set includes adjectives that are highly frequent in the NNC such as vibhinna “different, various”, and sampuurNa “all, complete”. Many of these adjectives are loanwords from Sanskrit. In most cases gender is not marked on nouns, but, for a minority of nouns, there are pairs of masculine and feminine nouns related through the o/ii distinction, for instance the related forms keTo “boy” and keTii “girl”, mentioned above. But there are also numerous feminine nouns that end in ii without there being a masculine equivalent in o (e.g. aaimaaii “woman”), and many, many more nouns with no gender marking at all.

Even in the case of the non-lexical categories that mark gender, there are often parallel forms in the same category that do not mark gender. For example, the possessive reflexive pronoun has different gendered forms: aaphno, aaphnii, aaphnaa. However, this pronoun is often found in combination with the emphatic affix ai to form aaphnai. This form can no longer inflect for gender, and is therefore, in effect, unmarked. To give another example, the category of demonstrative determiner contains the gender-marked forms yatro “of this size (masc.)” and yasto “of this kind (masc.)”, but also the unmarked forms yo, yi, and yati. So, in sum, while gender marking is found on many categories, it is not found consistently across any category: in any class of words where gender is marked, there are also many words where gender is not marked.

The question becomes how this is best represented in a tagset. One option is to ignore the gender inflection altogether. For instance, it would be possible to tag all adjectives as JJ, all nouns as NN, all demonstrative determiners as DD, and so on, regardless of whether they have an o, ii, or aa inflection, or no inflection at all. This approach has the advantage of simplicity. However, it also has disadvantages.
From the point of view of the corpus linguist, the main application of POS tags is to facilitate searching of a corpus. Without tags, a researcher who is interested in, say, the distribution of feminine-marked adjectives across genres has to search for a word-pattern rather than a tag: in this case, the researcher would search for words ending in ii. But many words end in ii which are not feminine-marked adjectives; several examples have already been given in this section. Even if the search was for “words ending in ii that are tagged as adjectives” many false positives will be retrieved, for instance dhanii “rich”, which is an unmarked adjective that happens to end in ii.[4] From a more computational-linguistic point of view, POS tagging is often the basis of further analysis such as parsing or semantic tagging. Gender may be a category marked on nouns or noun phrases in either of these levels of annotation. Clearly, it would be absurd to exclude from the outset a feature whose utility for at least some purposes can be foreseen.

However, the opposite solution, of including gender as a distinguishing feature of the tags for the categories of words on which it appears, is equally unattractive. If, for example, gender was a feature across the tags for the adjective category, then an automatic tagging system would be in the difficult position of attempting to decide the gender of invariant-form adjectives. This could only be done contextually, and some of the relationships that might be needed to identify the correct tag may be rather long-range. It is, it should be noted, extremely important in tagset design that the challenge that the tagset implicitly lays down for any automatic tagger should not be intractably complicated.

These various considerations are resolved in the tagset described here by making only the four-way distinction evident in the morphology: there is one tag for each form of the words that take the o/ii/aa inflection, and a single tag for words that do not, indicating that these are unmarked. So, for instance, for adjectives there are the following tags: JM, JF, JO and JX. JM and JF indicate adjectives with the suffixes o and ii respectively (masculine and feminine). JX indicates an unmarked adjective (including any derived from a marked adjective by addition of the ai particle). In JO, the “O” is an abbreviation for “other”, a cover term for the various meanings that the aa ending may have. Attempting to distinguish between plural, oblique and honorific would be another potentially computationally intractable task, as this is an ambiguity which in some cases (those relating to honorificity) can probably only be resolved with reference to pragmatics – that is, at a far more abstract linguistic level than that at which POS tagging software operates.
Furthermore, classifying together the different uses of the “other” inflection obviates the need to have number (plural) and case (oblique) as distinguishing features of tags at the word level; as noted above, these features are both present in the tagset anyway, in the form of various postposition tags; and so the conceptual structure of the tagset is kept as simple as possible.
The system outlined above is the one the tagset uses for adjectives, pronouns, determiners, and other categories of words where the gender distinction is an inflectional feature based on agreement. Nouns represent a slightly different case. As noted above, in the first place the assignment of nouns to gender categories is natural rather than grammatical; in the second place there are nouns such as aaimaaii which are feminine but have no masculine equivalent. That is, nouns do not exist in “sets” of lexically-equivalent forms in the way that a word like raamro/raamrii/raamraa does. It seemed, then, odd to assign a feminine tag to a word like aaimaaii when that lexeme had no possibility of being masculine. It would result in a parallelism between the behaviour of nouns and the behaviour of adjectives being drawn by the tagset which goes beyond what is actually the case. For nouns, gender is simply not an inflectional category in the same way that it is for adjectives. Nouns have gender; other categories may agree with the gender of some noun.
The descriptive basis of the treatment of noun gender in the tagset is, then, that while some nouns carry explicit gender marking, this is deemed to be lexical/derivational rather than inflectional. Since the aim of the tagset is to mark morphosyntactic features of the language, the tagset does not indicate the difference between masculine and feminine nouns. Therefore, in terms of the final categories within the tagset, there is one tag for proper nouns (NP), and another for common nouns (NN). Feminine common nouns such as keTii are tagged NN, and feminine proper nouns such as siitaa are tagged NP, i.e. just as equivalent masculine nouns (keTo, raam) would be tagged. Neither the fact that keTii is gender-marked, nor the fact that siitaa is not, is indicated in the tagging. Likewise, where a noun has a separate oblique form (for instance keTo/keTaa), it is treated as a variant realisation of the noun base and does not receive a different tag. So, in summary, while gender, like number and case, is a relevant grammatical category for Nepali nouns, it is not marked on the tags. By contrast, adjectives, determiners, and other gender-inflected words are put into different categories depending on their gender inflection, and so there are multiple tags. While this introduces an inconsistency into the tagset, in that the feminine and “other” marking is tagged on adjectives but not on nouns, it is a motivated inconsistency.
The tags used for adjectives and some other categories with adjective-like inflection are exemplified in Table 1.

Table 1. Some tags for words with adjective-like inflection

Category                         Examples                             Tags used
Adjectives                       raamro, raamrii, raamraa, asal       JM, JF, JO, JX
Personal possessive pronouns     mero, merii, meraa, merai            PMXKM, PMXKF, PMXKO, PMXKX
Reflexive possessive pronouns    aaphno, aaphnii, aaphnaa, aaphnai    PRFKM, PRFKF, PRFKO, PRFKX
Demonstrative determiner         yasto, yastii, yastaa, yati          DDM, DDF, DDO, DDX
Interrogative determiner         katro, katrii, katraa, ko            DKM, DKF, DKO, DKX
Relative determiner              jasto, jastii, jastaa, jo            DJM, DJF, DJO, DJX
Genitive postposition            ko, kii, kaa, kai                    IKM, IKF, IKO, IKX
Ordinal number                   pahilo, pahilii, pahilaa, paa~cau    MOM, MOF, MOO, MOX
d-participle verb                gardo, gardii, nagardaa, gardai      VDM, VDF, VDO, VDX
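The four-way scheme for adjectives, and the search problem it addresses, can be illustrated with a minimal Python sketch. This is not the Nelralec tagging software: the function names and the small exception list are hypothetical, and the sketch assumes that each word is already known to be an adjective. The exception list contains only invariant adjectives cited in this paper (dhanii, saphaa).

```python
# Illustrative sketch only; names and the exception list are assumptions.
# Invariant adjectives that merely happen to end in ii or aa (from the text).
UNMARKED_ADJECTIVES = {"dhanii", "saphaa"}

# Inflectional endings and the adjective tag each one selects.
SUFFIX_TAGS = (("ii", "JF"), ("aa", "JO"), ("o", "JM"))


def adjective_tag(word: str) -> str:
    """Return JM, JF, JO or JX for a word already known to be an adjective."""
    if word in UNMARKED_ADJECTIVES:
        return "JX"
    for suffix, tag in SUFFIX_TAGS:
        if word.endswith(suffix):
            return tag
    return "JX"  # no o/ii/aa ending: unmarked form


def search_by_pattern(words, suffix="ii"):
    """Word-pattern search: every word with the given ending."""
    return [w for w in words if w.endswith(suffix)]


def search_by_tag(tagged_words, tag="JF"):
    """Tag search: every word carrying the given tag."""
    return [w for w, t in tagged_words if t == tag]


adjectives = ["raamro", "raamrii", "raamraa", "asal", "dhanii", "saphaa"]
tagged = [(w, adjective_tag(w)) for w in adjectives]

print(search_by_pattern(adjectives))  # includes the false positive dhanii
print(search_by_tag(tagged))          # feminine-marked adjectives only
```

The pattern search retrieves dhanii alongside raamrii, whereas the tag search does not, which is the advantage of tag-based searching described above.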

4. Modelling Nepali verb inflections

The main issue facing an analysis of the Nepali verb for purposes of POS tagging is the very great multiplicity of inflected forms. These arise because complex verbal structures are present within single tokens, due to the extremely productive nature of the compounding process in the Nepali verb system. On the one hand, Nepali possesses the verbal structures, common in Indo-Aryan languages, typically referred to as “compound verbs” and consisting of a main verb plus one of a set of frequent, semantically general “vector verbs” or “light verbs”, which adds some shade of meaning to the sense of the main verb. In these structures, it is the light or vector verb which occurs second and which carries the tense marking, agreement, etc. of the clause. In Nepali, these types of combinations are written as a single token; for example the combination garidiyo, in which gari and diyo are inflections of the verbs meaning “do” (main verb) and “give” (light or vector verb) respectively. Nepali also has many verb structures where a verb token consists of a noun or adjective followed by a verb (these will be considered
here as also constituting “compound verbs”, although they are of course linguistically distinct from the two-verb structure mentioned previously). On the other hand, many tense-aspect-mood combinations in Nepali are created by the use of other auxiliary verbs (most prominently various forms of hunu “be”), which are also written together with the main verb as compound words, that is as single tokens. This is quite aside from those verbal inflections which are not compounded forms of auxiliary verbs. Examples from the paradigm of garnu “do” include forms such as garthyo “(he) used to do”, garcha “(he) does”, garirahyo “(he) continued to do”, and garnecha “(he) will do”. Such forms will also be treated in the following discussion as compound verbs, like the two types mentioned above. Thus, for present purposes, the term “compound verb” is used in a wide sense, to refer to any verb token that consists of two or more elements that are identifiable as belonging to separate independent verb lexemes. The problem presented by compound verbs is that the large number of compounded forms, combined with the inflections for categories like person and gender that are not indicated by compounded auxiliaries, means that the full list of grammatical features that may be marked on a single token is rather long. A single verb token may be marked for passivity, causativity, person, number, gender, honorificity, and/or tense-aspect-mood (or as possessing one of a set of non-finite forms) in various combinations. The negative particle is also compounded into the verb token, sometimes as a prefix and sometimes as a suffix. 
It would theoretically be possible to indicate the value of a given verb token in each of these categories: for example, verbs could be tagged as V _ _ _ _ _ _ _ _ _, where the first letter indicates the voice of the verb, the second the tense, the third the mood, the fourth the aspect, the fifth the person, the sixth the gender, the seventh the number, the eighth the honorificity, and the ninth the polarity. (This is, of course, without even taking into account those compounds made up of a main and a vector verb.) However, this leads to a situation where vast numbers of tags are required: thousands of distinct tags for the verbs alone, in the system just outlined. While information at this level of granularity may be of use in a morphological analysis, at the level of POS tagging it is a disadvantage rather than an advantage. When the number of tags is so great, creating meaningful generalisations across tags becomes difficult. For example, one of the main approaches to automated POS tagging is the probabilistic or stochastic approach, where a statistical model such as a hidden Markov model is used to deduce what tag a given token is most likely to have. In this approach, the linguistic knowledge of the system takes the form of a matrix of tag transition probabilities, that is, the probability of tag A following tag B, for each pair of tags in the tagset. These probabilities are derived from a tagged corpus of training data, and the amount of training data required is proportional to the square of the size of the tagset.5 So as a tagset grows to contain thousands of tags, this kind of probabilistic training very quickly becomes impractical due to the huge amount of training data that would be required. Thus, a tagset of the size that would be needed to indicate in the tags every feature marked on the Nepali verb would be in significant respects less than optimal for the purposes to which a tagset is put.
One way to solve the problem of the complexities of verbal inflection would be to apply a tokenisation strategy, as employed in the analysis of postpositions, with each verb element within the compound being treated as a separate token. However, this solution is not as straightforward for verbs as for nouns and postpositions, since there are some forms within some verbal compounds that are not identical to a freestanding element. For instance, the verb form huncha “is” is composed of hun and cha, but while cha is also a freestanding element, hun is not (the independent equivalent would be the root form hu “be”). For this reason, the retokenisation approach has been avoided.
The other option is to incorporate simplifying assumptions of some description into the descriptive model of the Nepali verb underlying the tagset. This is the approach that our tagset adopts. The remainder of this section is devoted to outlining these assumptions. To avoid the problem of compound verbs creating an unmanageable number of categories, the first simplifying assumption was that compounding would effectively be ignored. The way to do this consistently was to determine, as a basic principle of the tagging, that every compound verb should be tagged according to the last element in the compound.
This is because, as noted above, it is the last element of the compound that carries the person-number-gender inflection in a finite clause, these being categories which are not ignored.6 Only the last identifiable verb in a compound verb is taken into account when deciding the tag. If there is only one identifiable verb, then the whole token is taken into account. This means that only those grammatical categories which are inflected on an individual verb form need to be taken into account in the tagset. Categories which are only indicated through compounding do not need to be tagged. An example is honorificity; while the non-honorific and medial-honorific levels are indicated as inflections on individual verb elements, the higher levels of honorificity are indicated with compounds. For example, the high-honorific level is indicated by compounding with a form of the verb hunu “be”. A high-honorific verb is thus never tagged as high-honorific; rather, it receives whatever tag that form of hunu would have received in isolation.
As a further simplifying measure, certain other features of the verbal morphology system are ignored. These are the passive, causative, negative, and also tense-aspect-mood. The passive and causative affect the root in a way that is deemed to be lexical-derivational rather than strictly inflectional (see the discussion in the previous section). The negative is problematic because it is realised variously as a prefix and as a suffix, meaning it may or may not be expected to be marked on the last element of the verb; thus including it in the tagset would complicate the “last-element-only” principle. A similar consideration applies to tense-aspect-mood. Those features of tense-aspect-mood which are marked on individual verb elements are not indicated in the tags because there would then be an inconsistency in the tagging (since those other features of tense-aspect-mood which are indicated via compounds cannot be indicated in the tags due to the “last-element-only” principle). The effects of these simplifying assumptions are exemplified in Table 2.
The tagset makes, for convenience of reference only, a distinction between finite (person-marked) verb forms, and non-finite (non-person-marked) verb forms. Many of the non-finite forms occur embedded at the start of a longer verb (e.g. gardai-thyo with an embedded d-participle, garnu-huncha with an embedded infinitive). However, in accordance with the general rule, they are not tagged separately in this case. This means that non-finite tags are only used if the non-finite form is written as a separate word, or if the non-finite form is at the end of the longer verb.
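The earlier observation that training data requirements grow with the square of the tagset size can be made concrete with a short calculation. The sketch below is illustrative only: the function name is hypothetical, the figure of 3,000 tags is an arbitrary stand-in for “thousands of tags”, and the sizes 112 and 350 are taken from the tag counts reported in the notes to this paper for the Nelralec Tagset and (as a lower bound) the Urdu tagset of Hardie (2003).

```python
def transition_cells(tagset_size: int, order: int = 2) -> int:
    """Number of tag-sequence probabilities a Markov model must estimate.

    order=2 corresponds to bigram transitions (tag A followed by tag B),
    order=3 to trigram transitions.
    """
    if tagset_size < 1 or order < 1:
        raise ValueError("tagset_size and order must be positive")
    return tagset_size ** order


# 3000 is a hypothetical size standing in for "thousands of tags".
for label, size in (("Nelralec Tagset", 112),
                    ("Urdu tagset (at least)", 350),
                    ("fully specified verb tags (hypothetical)", 3000)):
    print(f"{label}: {size} tags, "
          f"{transition_cells(size):,} bigram transitions, "
          f"{transition_cells(size, 3):,} trigram transitions")
```

Even under the bigram assumption, moving from about a hundred tags to several thousand multiplies the number of transition probabilities to be estimated by several hundred.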

Table 2. Examples of simplifying assumptions in the treatment of verbs

Verb form (from paradigm of garnu “do”, third person)                  Would be tagged identically to…
garnubhaena “did not do” (honorific)                                   bhaena “was not”
garnecha “will do”                                                     cha “is”
garirahibaksanthyo “used to continue to do (high honorific)”           baksanthyo “was/used to (high honorific)”
garidiyo “did”                                                         diyo “gave”
garirahanuhuncha “does, continues to do (honorific)”                   cha “is”
garibaksanuhu~dorahecha “does, continues to do (high honorific)”       cha “is”
garnuhune “be doing”                                                   hune “being”
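The “last-element-only” principle shown in Table 2 can be sketched in a few lines of Python. This is a simplified illustration, not the procedure actually used by the authors or by any tagger: it assumes a lexicon of possible final elements (here just the five forms in the right-hand column of Table 2) and uses longest-suffix matching, which is an assumption introduced for the example. The function name is hypothetical.

```python
# Final elements taken from the right-hand column of Table 2.
# A realistic lexicon would be far larger; this one is illustrative.
FINAL_ELEMENTS = ("bhaena", "cha", "baksanthyo", "diyo", "hune")


def last_element(token: str, lexicon=FINAL_ELEMENTS):
    """Return the longest lexicon entry that the token ends with, or None.

    Under the last-element-only principle, the token would then receive
    whatever tag that final element receives in isolation.
    """
    matches = [form for form in lexicon if token.endswith(form)]
    return max(matches, key=len) if matches else None


for verb in ("garnubhaena", "garnecha", "garirahibaksanthyo", "garidiyo",
             "garirahanuhuncha", "garibaksanuhu~dorahecha", "garnuhune"):
    print(verb, "->", last_element(verb))
```

Because only the end of the token is inspected, the sketch never needs to segment the earlier elements of the compound, some of which (as noted above for hun in huncha) do not correspond to any freestanding form.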

The non-finite forms distinguished by the tagset are as follows (examples of the different forms can be seen in the tagset listing, Hardie et al. 2005):
– The infinitive
– A group of participles7:
  – the e(ko)-participle, sometimes called the perfect participle
  – the d-participle (do/dii/daa/dai), which is used for three functions:
    – the converb/participle function (also known as the conjunctive participle, the progressive participle, and the simultaneous converb)
    – the modifier function
    – as an element of compound verbs (where it does not receive a separate tag)
  – the ne-participle (the imperfect participle or infinitival participle)
  – the sequential converbs, also called absolutive participles, of which there are three, which all receive the same tag:
    – the era-participle
    – the ii-participle
    – the iikana-participle
– Two other forms:
  – the e-form (often referred to as subjunctive or conditional)
  – the i-form, sometimes referred to as the passive root
– Three command forms (also called the imperative)

Each separate non-finite form essentially receives a tag of its own in the tagset. This has the benefit of allowing distinctions among the different syntactic structures in which each appears to be made by reference to the tags. By contrast, the tags for finite forms group together numerous different forms within the paradigm of a verb. Given the simplifying assumptions noted above, tags do not indicate the tense-aspect-mood of a verb token, but rather only its agreement and honorificity features. So the distinctions operating on finite verbs indicated by the tagset are as follows:
– Person: first, second, third
– Gender: masculine (default), feminine
– Number: singular, plural8
– Honorific level: non-honorific, medial-honorific9

Finite verbs that differ in tense-aspect-mood but have the same person, number, gender and honorific level receive the same tag. So, for instance, the third person singular masculine non-honorific form of a verb may be indicated by a number of suffixes, most notably -yo, -thyo, -echa, -cha, -necha, and -laa. Verbs terminating in any one of these suffixes would be allotted the same tag. Given this list of distinctions and values, in theory there should be 3 × 2 × 2 × 2 = 24 tags for finite verbs. However, not all possible combinations of features correspond to an actually distinct verb form. There is, for instance, no specific form in any part of the paradigm for the second-person non-honorific feminine plural. There is just a single second-person plural form. In general, only second and third person singular verbs are marked for gender and honorific level. Plural verb forms are not marked for gender or for honorific level. First person verbs are not marked for gender or for honorific level. Similarly, at the medial-honorific level, there is no distinction between singular and plural, except for feminine verbs. Finally, while third person singular non-honorific has distinct masculine and feminine forms (for example, cha (masculine) versus che (feminine), both forms of hunu “be”), the corresponding negative does not: chaina is the negative that corresponds to both cha and che. Taking all such phenomena together, if the tagset treats alike those categories which are merged together within the Nepali verb paradigm, only ten tags are necessary to cover all finite verb forms. So, in summary, the different positions in the tags for finite verbs contain the following symbols:

            Person10    Honorific    Number    Gender
VV…         M           N            1         ( )
            T           M            2         F
            Y           X

The six “masculine” tags (VVMX1, VVMX2, VVTN1, VVTX2, VVYN1, VVYX2) are indicated by the absence of the F for feminine, and the four “feminine” tags (VVTN1F, VVTM1F, VVYN1F, VVYM1F) by its presence. This asymmetry reflects the fact that the “masculine” tags often do not indicate masculinity, but rather a neutralisation of gender, as noted above; it also reflects the reported usage of “masculine” verbs with feminine subjects in some varieties of Nepali, i.e. non-usage of the limited gender marking that exists. There are separate tags for optative verbs, as these verbs behave differently in many ways to the other finite verbs (e.g. by taking a prefix to indicate the negative, rather than a suffix). However, the person-number-honorific categories in the optative paradigm are directly parallel to those of the general finite paradigm. There are no feminine forms in the optative, and thus no tags for them. Tags for optative verbs begin VO… in contrast to the VV… tags for general finite verbs. A full list of the verbal tags can be found in the tagset listing (Hardie et al. 2005).
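The relationship between the 24 theoretically possible feature combinations and the ten finite verb tags actually used can be sketched as follows. The code is an illustration only: the function and variable names are hypothetical, and the gloss given for the honorific symbol X (“not distinguished”) is an assumption, since this section does not define it explicitly. The person symbols M, T and Y follow the explanation given in the notes to this paper.

```python
from itertools import product

PERSON = {"M": "first", "T": "second", "Y": "third"}
# The gloss for X is an assumption made for this example.
HONORIFIC = {"N": "non-honorific", "M": "medial-honorific",
             "X": "not distinguished"}
NUMBER = {"1": "singular", "2": "plural"}

# The ten finite verb tags listed in the text.
FINITE_TAGS = {"VVMX1", "VVMX2", "VVTN1", "VVTX2", "VVYN1", "VVYX2",
               "VVTN1F", "VVTM1F", "VVYN1F", "VVYM1F"}

# Every combination of person, gender, number and honorific level.
theoretical = list(product(("first", "second", "third"),
                           ("masculine", "feminine"),
                           ("singular", "plural"),
                           ("non-honorific", "medial-honorific")))


def decode(tag: str) -> dict:
    """Split a finite verb tag into its positional features."""
    if tag not in FINITE_TAGS:
        raise ValueError(f"not a finite verb tag: {tag}")
    return {
        "person": PERSON[tag[2]],
        "honorific": HONORIFIC[tag[3]],
        "number": NUMBER[tag[4]],
        # Absence of F signals neutralised gender, not masculinity.
        "gender": "feminine" if tag.endswith("F") else "unmarked",
    }


print(len(theoretical), "theoretical combinations;",
      len(FINITE_TAGS), "tags in use")
print(decode("VVTM1F"))
```

Decoding by position in this way is possible because each character of a tag stands for one grammatical feature.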

5. The Nelralec Tagset and other tagsets: a brief comparison

As the discussion above may indicate, a great deal of morphosyntactic detail is included in the Nelralec Tagset, although not as much detail as the Nepali language actually affords. Although this tagset is the first to be created for Nepali, other tagsets have been defined for closely-related languages (especially Hindi-Urdu) which include relatively more or less morphosyntactic detail. In this section, some points of comparison between these various tagsets will be discussed.
The closest point of comparison with the Nelralec Tagset is the Urdu tagset of Hardie (2003, 2004). Indeed, since the experience of some Nelralec team members of working with this Urdu tagset informed the development of the Nelralec Tagset, this scheme of analysis for Urdu is the nearest thing the Nelralec Tagset has to a direct antecedent. On a superficial level, the same abbreviations (such as J for adjective, I for adposition) are utilised in each. More notably, both tagsets aim to represent the morphosyntax of the target language to a reasonable depth of detail, beyond simply the top-level POS categories; both enshrine the principles of decomposability (where each one-letter or two-letter substring within a tag represents some grammatical feature) and hierarchy (where tags may be seen as instantiating a hierarchy of categories, generally moving from left to right).11 For example, in the Nelralec Tagset NN (common noun) and NP (proper noun) are subdivisions of N… (noun); VV… (finite verb) and VO… (optative verb) are subdivisions of V… (verb); and IKM (ko), IKF (kii) and IKO (kaa) are subdivisions of IK… (genitive postposition). Decomposability can be seen at its most extensive in Nepali tags such as PTNKF (pronoun – second person – non-honorific – possessive – feminine) or VVTM1F (verb – finite – second person – medial-honorific – singular – feminine).
The differences between the Urdu tagset of Hardie (2003) and the Nepali tagset described here are, despite these similarities, significant, and go beyond those dictated by the differences between the two languages. In particular, the Nepali tagset was designed with the intent of minimising the amount of word-form ambiguity, that is, minimising the number of tags that a given type can possibly have, which ultimately serves the purpose of simplifying the task of automated tagging. This was not the case in Hardie’s (2003) tagset for Urdu, which incorporates, for example, four tags for different case-number categories of feminine adjectives, even though Urdu, like Nepali, has only a single feminine suffix (ii in both cases).
The principle underlying the inclusion of such distinctions was that these features, while not evident on the individual word, might be deducible at the phrase or clause level, and thus could possibly be tagged; moreover, since these distinctions are present on masculine adjectives, it was seen as desirable to include them on feminine adjectives for consistency. However, in the end these features did not prove to be deducible in practice, and most feminine-marked words are left ambiguous between the different case-number tags by the Urdu tagger that employs this tagset (see Hardie 2004, 2005). Similar considerations apply to other distinctions made by that tagset that are not evident in the surface forms of the word (for instance, the distinction between auxiliary and main-verb usages of the same verb-types, and the distinction between proximal and distal demonstratives). These distinctions serve to inflate the tagset,12 and also to complicate the task of tagging. The ambiguity of the tagger using the Urdu tagset has not yet been reduced beyond the level of around 2.6 tags per token (see Hardie 2004, 2005). By contrast, automatic tagging using the Nelralec Tagset, which does not attempt to tag such invisible distinctions for the sake of an unattainable level of consistency, can eliminate all ambiguity, labelling each token with exactly one tag, and still retain a level of accuracy above 90% (although this is with a difference of an order of magnitude in the amount of available training data; see further the discussion of computational implementation in the following section). So the ambiguity-minimising approach taken in the design of the Nelralec Tagset has proven to have significant advantages, while still producing an analysis embodying a considerable level of morphosyntactic detail.
However, many other tagsets for Indo-Aryan languages have gone much further in reducing the complexity of the analysis. Sticking with Urdu, we may observe that Sajjad’s (2007: 15-19) Urdu tagger uses fewer than 40 tags. This is achieved by excluding all consideration of inflectional categories (such as tense-aspect-mood or agreement inflections on verbs, or case-number-gender on nouns and adjectives) and only including major category distinctions and some subcategory distinctions such as proper versus common noun. A tagset of only 26 tags, using similar simplification strategies, is proposed by Bharati (2006). This scheme of analysis reaches a sufficient level of generality that it may actually be used for different languages, including Hindi, Bengali and Telugu (see Bharati and Mannem 2007). It might be questioned whether a more radically simplifying approach like that of Sajjad (2007) and Bharati (2006) would have been a more productive (and less work-intensive) approach to take in the tagging of Nepali than the more morphosyntax-oriented approach which was actually taken. However, we must bear clearly in mind the purposes to which a tagset is intended to be put. The simpler tagsets are typically used by computational linguists and software developers, for whose purposes a minimal number of distinctions often suffices.
By contrast, theoretical and descriptive linguists exploiting the corpus as a tool for linguistic investigation are more likely to require the finer morphosyntactic distinctions. It is easy to imagine a situation where the ability to search for feminine possessive pronouns, or first-person as opposed to third-person verbs, to give just two examples, might be useful in a descriptive study. As the Nepali National Corpus is intended for use by linguists and lexicographers as much as by language engineers, it would seem that the use of a morphosyntactically rich tagset is justified.
With regard to the opposing requirements for both simplified and morphosyntactically-rich annotation, the most recent edition of the British National Corpus (BNC) may point to a way forward. As well as detailed C5 tags,13 a second, simplified tagset is used. The two different forms of annotation are encoded as separate XML attributes on each token in the corpus, so both are present and available for analysis at any time. The second tagset14 is very greatly reduced, consisting only of the following ten tags: ADJ, ADV, ART, CONJ, INTERJ, PRON, SUBST (nouns), VERB, STOP (punctuation) and UNC (unclear, unclassifiable, other). However, the detailed C5 tags map directly, on a many-to-one basis, to this simplified set of tags. The simultaneous use of a simple and a detailed tagset may be the way to reconcile the two approaches to POS analysis, and the needs of all potential users. In the context of the study of South Asian languages, a further benefit of such a reconciliation is that the simpler type of tagset is likely to be cross-linguistically consistent (as demonstrated by Bharati’s 2006 multilingual tagset) and thus a potentially highly useful tool for cross-linguistic comparative analysis.
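A many-to-one mapping of the kind described for the BNC could, in principle, be derived for a hierarchical tagset from the first letter of each detailed tag. The following Python sketch shows the idea. It is purely hypothetical: no simplified tagset is defined for Nepali in this paper, the coarse category labels and function names are invented for the example, and the first-letter glosses follow the abbreviations explained in the notes (N noun, J adjective, R adverb, M numeral, F non-word element, D determiner, P pronoun, V verb, I adposition).

```python
# Hypothetical coarse categories keyed by the first letter of a detailed tag.
TOP_LEVEL = {
    "N": "noun", "J": "adjective", "R": "adverb", "M": "numeral",
    "F": "non-word", "D": "determiner", "P": "pronoun", "V": "verb",
    "I": "adposition",
}


def coarse_tag(detailed_tag: str) -> str:
    """Map a detailed tag to its top-level category (many-to-one)."""
    return TOP_LEVEL.get(detailed_tag[:1], "unclassified")


def annotate(tagged_tokens):
    """Attach both levels of annotation to each (word, detailed tag) pair."""
    return [{"word": w, "detailed": t, "simple": coarse_tag(t)}
            for w, t in tagged_tokens]


sample = [("merii", "PMXKF"), ("raamrii", "JF"), ("keTii", "NN"), ("che", "VVYN1F")]
for token in annotate(sample):
    print(token)
```

Keeping both values on every token, as the BNC does with separate attributes, means that neither level of analysis has to be recomputed when the corpus is searched.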

6. Conclusion: Subsequent and future work

Work involving the tagset described in this paper has progressed a long way since the category system was initially defined, and is ongoing. Most notably, the team of analysts whose work contributed to the design of the tagset have applied this annotation system to the manual tagging of text; a 300,000 word subsection of the Nepali National Corpus has been annotated in this way. This amount of text is sufficiently great to be used as training data for a probabilistic tagger. The Unitag system (see Hardie 2004, 2005) has been employed for this purpose. Using a combination of hand-crafted and corpus-derived lexical resources, and a Markov model for contextual disambiguation, the Nepali version of Unitag achieves an accuracy rate of around 93% on written texts. We have not yet had the opportunity to test it on the spoken text in the NNC, but anticipate that the accuracy rate will be slightly lower, due to the known problems that probabilistic taggers have when working on a text-type different to that of their training data. Given the amount of training data available and the level of detail in the tagset, 93% represents a good rate of accuracy; this system has been used to tag the version of the written NNC that has now (as of 2007) been released.
Some further work on both the tagset itself, and its implementation in manual or automatic tagging, would clearly be desirable. With regard to the latter, one thing we have yet to test is inter-rater reliability: that is, if two analysts are given the same text to tag, what percentage of the tags that they assign will be identical? We cannot expect this to be 100%, but on the basis of previous work (see for example Baker 1997) for a tagset of this type, it is not unreasonable to expect in excess of 97% agreement between annotators who are well-versed in the annotation scheme. However, this test has yet to be put into practice. In terms of longer-term advances, we hope to develop a system for constituent parsing of Nepali text, using the POS tagging as a basis.
With regard to the tagset itself, one feature in particular which may require further work is the apparent inconsistency between the treatment of nouns on the one hand, and verbs on the other, in that while inflectional elements compounded onto nouns (i.e. postpositions) are split off and tagged separately, inflectional elements compounded onto verbs (i.e. auxiliary verbs) are not, and the entire compound verb complex is given a single tag. While this inconsistency was introduced in response to the various descriptive and pragmatic factors cited in sections 2 and 4 above, it is still less than desirable, and additional work on Nepali tagging, revisiting in particular the tagging of verbs, may allow it to be eliminated.
In summary, then, we have in this report briefly described the development of a tagset for the morphosyntactic annotation of Nepali, and discussed in detail three areas of particular conceptual complexity: tokenisation, especially as applied to postpositions; the modelling of gender within the tagset; and the modelling of the extremely complex Nepali verbal inflection system. We have also explored some similarities and differences between this Nelralec Tagset and some other recently-devised tagsets for POS annotation of Indo-Aryan languages. There are two points which we would wish to underline. In the first place, the creation of a morphosyntactically-detailed tagset requires that an extensive analysis of the grammar of the language in question be done afresh, with the task of tagging specifically in mind.
Even if the grammar of the language has already been studied in detail (as, indeed, the grammar of Nepali has long been studied), issues often emerge that are specific to the very particular task of creating a tagset and that other types of grammatical study need not deal with. This was, for example, the case with regard to the issue of whether Nepali postpositions should be treated as separate tokens or not. Secondly, the design of such a tagset must necessarily incorporate the undertaking of actual annotation of actual texts, manually in the first instance, to test the validity of the categorisation scheme and to make sure that all problematic phenomena have been considered. Some of the conceptually complex areas discussed above were not even considered in detail by us until it became evident in the process of manual annotation that they were potentially problematic; others we did consider, but the approach we took to them was much reformed by the experience of attempting to apply the tagset to actual text. A tagset based on such analysis and such trial-and-error implementation will almost certainly incorporate at least some inconsistencies, as our description of the Nelralec Tagset has indicated. However, inconsistencies that are motivated by the needs of the POS tagging process, or by the uses that the tagging will ultimately be put to, are not necessarily problematic in a scheme of morphosyntactic analysis such as that which has been outlined here.
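The inter-rater reliability test proposed above amounts to a simple percentage calculation, sketched here in Python. The function name and the five-token example are hypothetical; the tags in the example are drawn from those discussed in this paper, but the two tag sequences are invented and are not real annotation data. The same calculation, applied to a tagger's output and a manually corrected version, would give an accuracy rate of the kind reported above.

```python
def percent_agreement(tags_a, tags_b) -> float:
    """Percentage of tokens to which two annotators assigned the same tag."""
    if len(tags_a) != len(tags_b):
        raise ValueError("both annotators must tag the same tokens")
    if not tags_a:
        raise ValueError("no tokens to compare")
    identical = sum(a == b for a, b in zip(tags_a, tags_b))
    return 100.0 * identical / len(tags_a)


# Invented example: the annotators disagree on one adjective out of five tokens.
annotator_1 = ["NN", "IKM", "JM", "NN", "VVYN1"]
annotator_2 = ["NN", "IKM", "JX", "NN", "VVYN1"]
print(percent_agreement(annotator_1, annotator_2))
```

Raw percentage agreement is the measure described in the text; it makes no correction for agreement that would arise by chance.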

Notes

1. Nepali Language Resources and Localization for Education and Communication; funded by the EU Asia IT&C programme, reference number ASIE/2004/091-777.
2. It is possible in tagging for a word to be given more than one tag, i.e. assigned more than one analysis. However, this is usually only done in cases where there is some ambiguity about which category a word actually belongs in. It does not imply that the categories themselves are not clearly defined.
3. The practice of using I as an abbreviation for adpositions derives from the Lancaster tradition of tagsets for English; it has the benefit of not being confusable with adjectives or pronouns, as abbreviations derived more transparently from “adposition”, “preposition” or “postposition” are prone to be. Other abbreviations of this type used in the Nelralec Tagset include J for adjective, R for adverb, M for numeral and F for non-word elements; by contrast, N for noun, D for determiner, P for pronoun and V for verb are fairly universal abbreviations. An example of the type of tagset for English which utilises these abbreviations may be seen at http://ucrel.lancs.ac.uk/claws6tags.html.
4. There are likewise adjectives that happen to end in aa which are actually unmarked, e.g. saphaa “clear, clean”.
5. This is true for bigram tag transition probabilities, as used by most Markov model taggers. Some taggers use trigram probabilities (the probability of tag C following the sequence of tag A then tag B); in this case, the amount of training data required is proportional to the cube of the tagset size. See El-Beze and Merialdo (1999) for a detailed discussion of Markov model tagging.
6. A corollary of this practice is that if a form which can be written as a single orthographic word is written as two orthographic words, then each is considered individually. For example, garidiyo would usually be written as one orthographic word, but if it were written as two orthographic words (gari diyo), each part would receive a separate tag.
7. These forms should probably not all be considered “participles” in the most precise grammatical sense; the label is one of convenience. Some are “converbs” and some may be more precisely analysed as infinitival.
8. Note that number here is very different to number as marked on nouns, adjectives, etc. On those words, number relates solely to the presence or absence of haruu. By contrast, for finite verbs, number is an inflectional category indicated by the same suffixes that indicate person and (sometimes) gender/honorificity.
9. The higher honorific levels are conveyed through compound verbs.
10. M stands for ma (= first person), T for ta (= second person), and Y for yo (= third person). Letters are used instead of numerals to avoid confusion with singular/plural (which are indicated by numerals).
11. See Hardie (2004: 48) for more on the concepts of hierarchical and decomposable tagsets.
12. The overall number of tags in the Urdu tagset of Hardie (2003) is over 350. This is many more than the Nelralec Tagset, which contains 112 tags, for a language which is arguably less complex morphologically, given the much lesser extent of auxiliary verb and postposition compounding in the written form of Urdu than the written form of Nepali.
13. The C5 tagset for English may be seen at http://ucrel.lancs.ac.uk/claws5tags.html
14. This second BNC tagset has no particular name; it may be seen at http://www.natcorp.ox.ac.uk/XMLedition/URG/codes.html#klettpos.
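The note on bigram and trigram tag transition probabilities can be made concrete with a small calculation. The sketch below is illustrative only and is not part of the Nelralec project's tooling: the function name `transition_counts` is hypothetical, and the only figures taken from the notes are the tagset sizes (112 tags for the Nelralec Tagset, and 350 as a lower bound for the Urdu tagset, which is said to contain "over 350" tags). It assumes that a bigram model has one transition parameter per ordered pair of tags and a trigram model one per ordered triple.

```python
def transition_counts(tagset_size):
    """Return the number of possible tag transitions for a tagset.

    Assumption: a bigram model stores one probability per ordered pair
    of tags (size squared) and a trigram model one per ordered triple
    (size cubed). Real taggers smooth or prune these tables.
    """
    return {
        "bigram": tagset_size ** 2,
        "trigram": tagset_size ** 3,
    }


# Tagset sizes quoted in the notes; 350 is only a lower bound for Urdu.
tagsets = {"Nelralec Tagset": 112, "Urdu tagset (at least)": 350}

for name, size in tagsets.items():
    counts = transition_counts(size)
    print(f"{name}: {size} tags, "
          f"{counts['bigram']:,} bigram transitions, "
          f"{counts['trigram']:,} trigram transitions")
```

On these assumptions, moving from bigrams to trigrams multiplies the number of transitions to be estimated by the size of the tagset itself, which is why a smaller tagset eases the demand for training data.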

References

Acharya, Jayaraj
1991 A descriptive grammar of Nepali. Washington, D.C.: Georgetown University Press.

Adhikari, H. R.
1998 Samasāmayik Nepalī Vyakarana [Contemporary Nepali Grammar]. Kathmandu: Vidyarthi Pustak bhandar.

Baker, Paul
1997 Consistency and accuracy in correcting automatically tagged data. In Corpus Annotation, Roger Garside, Geoffrey Leech and Tony McEnery (eds.), 243-250. Longman Addison-Wesley.

Bharati, Akhshar
2006 Part-of-speech tagger for Indian languages. http://shiva.iiit.ac.in/SPSAL2007/iiit_tagset_guidelines.pdf


Hardie, Lohani, Regmi and Yadava

Bharati, A. and Prashanth R. Mannem
2007 Introduction to the Shallow Parsing Contest for South Asian Languages. In Proceedings of the workshop on Shallow Parsing for South Asian Languages (SPSAL-2007), http://shiva.iiit.ac.in/SPSAL2007/proceedings.php

El-Beze, Marc and Bernard Merialdo
1999 Hidden Markov models. In: van Halteren (1999).

van Halteren, Hans (ed.)
1999 Syntactic wordclass tagging. Dordrecht: Kluwer Academic Publishers.

Hardie, Andrew
2003 Developing a tagset for automated part-of-speech tagging in Urdu. In Proceedings of the Corpus Linguistics 2003 conference, Dawn Archer, Paul Rayson, Andrew Wilson and Tony McEnery (eds.) (UCREL Technical Papers Volume 16.) Department of Linguistics, Lancaster University. http://eprints.lancs.ac.uk/103/
2004 The computational analysis of morphosyntactic categories in Urdu. Ph.D. diss., Linguistics and English Language, Lancaster University. http://eprints.lancs.ac.uk/106/
2005 Automated part-of-speech analysis of Urdu: conceptual and technical issues. In Contemporary issues in Nepalese linguistics, Yogendra Yadava, Govinda Bhattarai, Ram Raj Lohani, Balaram Prasain and Krishna Parajuli (eds.), 73-90. Kathmandu: Linguistic Society of Nepal.
2007 Collocational properties of adpositions in Nepali and English. In Proceedings of the Corpus Linguistics conference, CL 2007, Matthew Davies, Paul Rayson, Susan Hunston and Pernilla Danielsson (eds). University of Birmingham. http://www.corpus.bham.ac.uk/corplingproceedings07/
2008 A collocation-based approach to Nepali postpositions. Corpus Linguistics and Linguistic Theory 4(1): 19-61.

Hardie, Andrew, Ram Lohani, Bhim Regmi and Yogendra P. Yadava
2005 Categorisation for automated morphosyntactic analysis of Nepali: introducing the Nelralec Tagset (NT-01). Nelralec/Bhasha Sanchar Working Paper 2. http://www.bhashasanchar.org/pdfs/nelralec-wptagset.pdf

Leech, Geoffrey and Nick Smith
1999 The use of tagging. In: van Halteren (1999).

Leech, Geoffrey and Andrew Wilson
1999 Standards for tagsets. In: van Halteren (1999). [Edited version of EAGLES Recommendations for the Morphosyntactic Annotation of Corpora (1996): available on the internet at http://www.ilc.cnr.it/EAGLES96/annotate/annotate.html.]

A Morphosyntactic Categorisation Scheme for Nepali


Rayson, Paul, Geoffrey Leech and Mary Hodges
1997 Social differentiation in the use of English vocabulary: some analyses of the conversational component of the British National Corpus. International Journal of Corpus Linguistics 2(1): 133-152.

Rayson, Paul, Andrew Wilson and Geoffrey Leech
2002 Grammatical word class variation within the British National Corpus sampler. In New frontiers of corpus research: Papers from the Twenty First International Conference on English Language Research on Computerized Corpora, Sydney 2000, Pam Peters, Peter Collins and Adam Smith (eds.), 295-306. Amsterdam: Rodopi. http://www.comp.lancs.ac.uk/computing/users/paul/publications/rwl_lc36_2002.pdf

Sajjad, Hassan
2007 Statistical Part of Speech Tagger for Urdu. Unpublished MSc thesis, National University of Computer and Emerging Sciences, Lahore, Pakistan. http://www.crulp.org/Publication/theses/2007/part_of_speech_tagger.pdf

Schmidt, Ruth Laila (ed.)
1993 A practical dictionary of Modern Nepali. Delhi: Ratna Sagar.

Yadava, Yogendra P., Andrew Hardie, Ram Lohani, Bhim Regmi, Srishtee Gurung, Amar Gurung, Tony McEnery, Jens Allwood, and Pat Hall
In press Construction and annotation of a corpus of contemporary Nepali. Corpora 3(2).

Reviews

Shishir Bhattacharja
Word Formation in Bengali: A Whole Word Morphological Description and its Theoretical Implications
2007. München: Lincom Europa. 454 pp. ISBN 978 3 89586 356 1

Reviewed by Niladri Sekhar Dash

1. Objective of the Book

The basic objective of the book under review is to provide an exhaustive description of Bengali morphology within the theoretical framework of Whole Word Morphology (WWM), postulated by Ford, Singh and Martohardjono (1997) and elaborated in Singh and Agnihotri (1997), Singh and Dasgupta (1999), Singh (1999), and Singh (2006). WWM proposes that words do not have any internal hierarchical structure, since there is neither a list of word-parts nor a set of directions on how they are to be concatenated. In WWM, therefore, the description of the words of a language is nothing but an exhaustive list of morphological rules, called Word Formation Strategies, on the basis of which one can describe how words belong to different categories and are lexically related to each other. Following this line, the author provides a description of Bengali morphology. To achieve his goal he carries out the study on a large lexical database compiled manually from Eastern Standard Bengali, the variety used in Bangladesh. He believes that the conclusions drawn from his study are more or less equally valid for the Western Standard Bengali spoken in West Bengal and other states of India.

2. Contents of the Book

After a short Preface (pp. ix-x), Acknowledgements (pp. xiii-xiv), Abbreviations (pp. xv-xvii), and List of tables and figures (pp. xviii-xix), the book is divided into four parts.

Part 1 (Introductory Matters) contains four chapters. Chapter 1 presents aims and objectives and some theoretical rudiments. Chapter 2 outlines the word-based morphological theories proposed by earlier scholars. Chapter 3 describes the basic theoretical framework of WWM and provides explications and illustrations. Chapter 4 explains what the author means by the term ‘Bengali’, gives an account of previous work on word formation in Bengali, highlights the problems and deficiencies of that work with respect to the morphology of Bengali, and justifies the selection of WWM as the theoretical framework for describing Bengali morphology.

Part 2 (The Morphological Description) represents the core of the work in two chapters. Chapter 5 describes how the author formulates and classifies morphological strategies. Chapter 6 records a list of 1207 word formation strategies distributed across two broad types.

Part 3 (Extensions and Conclusions) contains four chapters. Chapter 7 presents an account of reduplication in Bengali and shows how the so-called reduplicated words are analyzed within WWM. Chapter 8 highlights the basic properties of Bengali morphology on the basis of the descriptions given in earlier chapters. Chapter 9 draws a morphological profile of Bengali based on statistical descriptions, and Chapter 10 highlights the problem areas of Bengali word formation.

Part 4 contains two appendices. Appendix 1 describes patterns of potential strategies or relics of dead strategies (pp. 389-411), while Appendix 2 discusses the basic concept of the word, which remains an unresolved issue in linguistics (pp. 413-429). The book ends with a list of references (pp. 431-446) followed by indices of authors (pp. 447-448), languages and language families (p. 449), and subjects (pp. 450-454).

3. Critical Analysis

The new approach to morphology on which the book under review is based is a much-needed breakthrough, since we have long been stagnating with age-old models and theories of morphology that can hardly address many of our queries about Bengali word formation and Bengali morphology.
We become curious when the author argues that, compared to other existing theories, WWM can explain most of the facts of word formation in Bengali in a more satisfying manner and so provide a better morphological description of the language.

The author presents sketches of morphological theories and approaches proposed by earlier scholars in Chapter 2 (“Outline of some word-based approaches to morphology”, pp. 11-17). Starting with the ‘atomistic’ theory proposed by Panini and adopted in Whitney (1889) and Bloomfield (1933), the author systematically reviews the basic arguments and the limitations of the neo-grammarian model (Saussure 1988), Government and Binding theory (Chomsky 1970), the Structuralist and Transformational model, Full Entry/Lexicalist Theory (Jackendoff 1975), Word-based Morphology (Aronoff 1976), Semiotic Primacy Theory (Dressler 1988), and A-morphous Morphology (Anderson 1992). Finally he settles on WWM.

In Chapter 3 (“The chosen theoretical framework: WWM”, pp. 19-40) the author divides his discussion into two sections. In the first section he describes the rudiments of the theoretical framework of WWM and their consequences. Here he refers to the seven maxims on which WWM (Ford et al. 1997) stands: (a) no multiple morphologies, (b) no morphological operations on units other than the word, (c) unity of the morphological operation, (d) operation has no privileged direction, (e) no subcategory of strategies is determined morphologically, (f) morphological integrity of the word, and (g) morphology has little or no architecture. These maxims help us to understand that the morphology of a language is nothing but the study of formal relationships between words. In the second section the author provides explications and illustrations of WWM in order to present a fuller description of it. To this end he explains morphological operations and the formal mechanisms involved in them; explores the word and its sub-components; argues that two pairs of words are needed for WWM to operate fruitfully; highlights the fuzzy frame of so-called compound and reduplicated words; and recalculates the formal differences and semantic relatedness among the word pairs used in an operation.

In Chapter 4 (“WWM and Bengali”, pp. 41-86) the author subdivides his discussion into four sections. In the first section he explains what he means by the term ‘Bengali’ and identifies the varieties (Eastern and Western) of the language he has used in his study.
In the second section he presents a general overview of previous works on Bengali morphology (Chatterji 1926, Dasgupta 1987, Sen 1992, Chakrabarti 1992, Sarkar and Basu 1994, Bhattacharya 1993, Bhattacharja 1998, and Chakrabarti 2000), draws general observations on these works, probes into the treatment of Bengali verb morphology by other scholars, and distinguishes between traditional, structuralist and generative descriptions of Bengali morphology. In the third section the author identifies several problems and inadequacies in the previous descriptions. He points out the limitations not only of the traditional and structuralist approaches but also of the generative approaches used to account for Bengali morphology. This critical and insightful groundwork was necessary to redirect readers to an altogether new approach to Bengali morphology. In the fourth section he justifies the selection of WWM as the most suitable theoretical framework for describing Bengali morphology.

202

Reviews

In Chapter 5 (“Bengali morphology: on describing and classifying the strategies”, pp. 89-121) the author describes how he employs different morphological strategies and classifies them for his work. After accepting the traditional notion of the word and its categories, he designs morphological strategies for Bengali; classifies the strategies into two broad types; discusses the methods of word formation by intercategorical and intracategorical morphology; refers to the types of mechanism used in Bengali, such as identity, suffixation, prefixation, subjunction and prejunction; identifies partially specified variables; highlights the process of segmental modification; critically investigates the nature and patterns of morphophonological changes; and refers to the difficulties involved in the process of classification. In subsequent sections he focuses on phonological changes that cause morphological change in words and shows how his model is better suited to address the problems we come across in Bengali word formation. Finally, he presents a brief outline of Bengali phonology and shows how the phonemic inventory is used in word formation and how well-formedness conditions operate in Bengali through various phonological processes and constraints: gemination, aspiration in coda, consonant clusters in coda, consonant clusters in onset, vowel placement as nucleus, redundancy of the velar nasal at onset, regressive voicing, non-occurrence of two consecutive nasal vowels, etc. His minute and systematic observations are valuable for understanding the morphophonological interfaces operating within Bengali words, although WWM pays no attention to morphophonology.

In Chapter 6 (“Morphological Description”, pp. 123-292) the author presents two types of morphological strategy used in word formation in Bengali, intercategorical (460) and intracategorical (747), and shows how these actually work in the language.
He argues that at least two pairs of words are required to make a strategy work, although there may exist more forms worth mapping into the word pairs. He shows quite elaborately how pronouns and adjectives, although they constitute a closed lexical category, can be mapped into his morphological strategies. For intercategorical strategies he uses the main syntactic categories (noun, pronoun, adjective, verb, adverb, postposition, conjunction and interjection) and some marginal categories (numeral, ordinal, quantifier, measure words, and date words). For intracategorical strategies he uses nominal subcategories such as case, gender, number, and definiteness. In the description of intracategorical morphology, he uses the traditional labels for morphological categories rather than inventing a new set of terms. The deficiency of the chapter, however, lies in the number of categories, which appears to be very large for users. Moreover, each strategy requires a supporting explanation if we are to understand how it maps onto word pairs and what kind of cognitive interface is at work in our mind while the mappable pairs are retrieved.

In Chapter 7 (“Reduplication in Bengali and WWM”, pp. 295-316) the author presents a short discussion of Bengali reduplication and explains how reduplicated words can be formed and analyzed in WWM. After a short introduction to reduplication as observed by earlier scholars, he makes a distinction between patterns and processes, identifies the morphological strategies used in the formation of reduplication, and highlights some of the problematic examples in this area. He is right when he argues that “it is possible to give a better description of this phenomenon in the light of WWM compare to other approaches which nevertheless pay special attention to reduplication and consider the latter as an indispensable field of research” (p. 296). After successfully defending his argument, he observes, following Ford, Singh and Martohardjono (1997: 3), “There is nothing in reduplication which makes it radically different from other bits of morphology, except the fact that strategies activated for forming or retrieving the so-called reduplicated words consist of repeating the (partly specified or totally unspecified) variable” (p. 312).

In Chapter 8 (“On morphological categories and operations in Bengali: some generalizations”, pp. 317-364) the author focuses on the intracategorical morphology of Bengali nouns and verbs on the basis of the morphological description presented in earlier chapters. In dealing with Bengali nouns and pronouns he focuses on definiteness, gender, number, and case morphology; in dealing with Bengali verbs he addresses finite and non-finite forms. Finally, he shows how the conjugated forms of Bengali general verbs as well as Bengali compound verbs can be treated better within the framework of WWM.
The author draws a general sketch of the morphological profile of Bengali, based on statistical descriptions presented in tabular form, in Chapter 9 (“Morphological profile of Bengali”, pp. 365-380). He counts the intracategorical and intercategorical strategies by operation and mechanism type. Finally, he presents an overview of Bengali morphology derived from a comparative study of the strategies, supported by tabulated statistics.

In Chapter 10 (“Some problematic cases and a general conclusion”, pp. 381-386) the author highlights some of the problematic areas of Bengali word formation, which call into question some of the axioms of WWM. After discussing problems related to ‘strategies of two or more variables’ and ‘strategies that allow morphological metathesis’, he draws his work to a conclusion. The author thus succeeds in showing that WWM is an adequate model for presenting a morphological description of Bengali. He shows that Bengali words do not have any internal hierarchical structure, that units smaller than the word cannot really be said to exist, that there is no need to postulate separate morphological categories like derivation, compounding, inflection or reduplication, and that almost all morphologically complex words of the Bengali lexicon can be analyzed or formed with the rule schema /X/α-/X’/β (p. 385).

4. Conclusion

The book offers an exhaustive and elaborate description of word formation in Bengali. Since it is really difficult to account for the entire scheme of lexical generativity of a language like Bengali, the work draws our admiration both for its coverage of the issues and for the fresh insights it brings to the interpretation of the problems. The notable limitation of the book, as I see it, is its total silence on the issue of generating new words in Bengali. The strategies and examples furnished in the book deal with words already available in the vocabulary, but the book fails to show how new words are formed in the language and what kinds of strategy we use when we intend to form new words. In other words, the morphodynamics that leads a language user to form new words in accordance with the rules and strategies available in a language is not as clearly presented as we would like. Whenever we form a new word, we adopt the productive means that best fits the morphophonemic structure of an existing word. Consider the following examples:

(a) jagat ‘world’ + -ik (adj. suffix) = jaagatik ‘relating to world’
(b) sharat ‘Autumn’ + -ik (adj. suffix) = shaaratik ‘relating to Autumn’

The example in (a) is an existing word in Bengali while the example in (b) is not. We can form this word (i.e., shaaratik) by adding the suffix -ik, and the final form is an acceptable addition to the vocabulary of Bengali.
We would like to know whether WWM can suggest why we have selected this particular suffix (i.e., -ik), out of the many other productive suffixes available in the language, to form this word. We would be delighted if WWM could account for this.

The nearly exhaustive description of Bengali morphology presented in this book will help us understand the morphology of Bengali and will allow us to verify whether other theories of morphology can challenge it in the goal of better representing the language. In essence, the author successfully shows how WWM can be used as an adequate model for the morphological description of Bengali and how this work can provide sensible solutions to most of the problems of Bengali morphology that still remain unresolved. Like the author, I hope that the present work will lead us to consider WWM a suitable alternative way of interpreting Bengali morphology, with newly gained insights into and understanding of the domain. Although the author does his job exceptionally well, it still remains a puzzle, perhaps one to be solved by psychology, how a language user ‘knows’ and forms new words by using the rules of word formation.

5. References

Anderson, S.R.
1992 A-morphous Morphology. Cambridge: Cambridge University Press.

Aronoff, M.
1976 Word Formation in Generative Grammar. Cambridge, Mass.: MIT Press.

Bhattacharja, S.
1998 Sanjanani Byakaran (Generative Grammar). Dhaka: Tcarou.

Bhattacharya, K.
1993 Bengali-Oriya Verb Morphology: a Contrastive Study. Kolkata: Dasgupta and Co.

Bloomfield, L.
1933 Language. Chicago: University of Chicago Press.

Chakrabarti, U.K.
1992 Bangla Padaguccher Sangathan (Structure of Noun Phrases in Bengali). Calcutta: Prama Prakashani.
2000 Bengla Sangbartani Byakaran (Generative Grammar of Bengali). Kolkata: Shri Aurobindo Publication.

Chatterji, S.K.
1926 The Origin and Development of the Bengali Language. Kolkata: Calcutta University Press (Reprint in 1993 by Rupa).

Chomsky, A.N.
1995 The Minimalist program. Cambridge, Mass.: The MIT Press.

Dasgupta, P.
1987 Kathar Kriya Karma (Activities of Words). Kolkata: Dey’s.

Dressler, W.U.
1988 Preferences vs. strict universals in morphology: word-based rules. In M. Hammond and M. Noonan (eds.), Theoretical morphology, pp. 143-154. San Diego: Academic Press.


Ford, A., R. Singh and G. Martohardjono
1997 Pace Panini, towards a word-based theory of morphology. New York: Peter Lang.

Jackendoff, R.
1975 Morphological and semantic regularities in the lexicon. Language 51: 639-71.

Sarkar, P. and G. Basu
1994 Bhasa Jijnasa (Language Queries). Kolkata: Vidyasagar Pustak Mandir.

Saussure, F. de
1988 Cours de linguistique générale. Paris: Editions Payot (Originally published in 1915).

Sen, S.
1992 Bhashar Itivrittva (History of Language). Kolkata: Ananda.

Singh, R.
1999 Towards a word-based approach to morphological typology: an illustration. Indian Linguistics 65: 183-195.
2006 Whole word morphology. In Elsevier Encyclopedia of Linguistics. 2nd Edition, pp. 1413-1417.

Singh, R. and P. Dasgupta
1999 On So-called Compounds. The Yearbook of South Asian Languages and Linguistics 1999: 265-292.

Singh, R. and R.K. Agnihotri
1997 Hindi morphology, a word-based description. New Delhi: Motilal Banarsidass.

Whitney, W.D.
1889 Sanskrit grammar. Cambridge, Mass.: Harvard University Press.

Yamuna Kachru and Larry E. Smith
Cultures, Contexts, and World Englishes
2008. London: Routledge. $41.95. ISBN 978-0-8058-4733-8

Reviewed by Graeme Cane

The authors state in the preface that the book’s main objectives are (1) to sensitize users of English to the varieties of the language operating across different cultures and (2) to emphasize that effective cross-cultural communication in English can be achieved by cultivating an awareness of the variation in the language with regard to its cultural, social and ideational functions. The book thus aims to make English scholars and speakers more aware of the linguistic and sociolinguistic contexts that have brought about the development of different varieties of the language in both the Inner and Outer Circles. Cultures, Contexts, and World Englishes is organized in three parts. Part One, ‘Verbal Interaction and Intelligibility’, discusses the background necessary for appreciating variation in language. It aims to establish the relevance of cultural context and discusses the concepts necessary to view verbal interaction as a dynamic process where all parties contribute to the outcome. Concepts from pragmatics, sociolinguistics, conversation analysis and artificial intelligence are used in the authors’ integrated approach to investigating cross-cultural exchanges between users of different varieties of English. The fascinating question of what exactly is involved in determining intelligibility across English varieties is dealt with in some detail. On page 59, the authors point to prior research (Smith, 1987) which indicated (1) native English speakers are often not intelligible to fluent non-native speakers; (2) native English speakers are not better than non-native speakers in understanding varieties of English which are different from their own; and (3) even if ESL users can understand one Inner Circle variety, they may not be able to understand other varieties from any Circle unless they have interacted with users of the variety. 
Looking at the wide range of English speakers and the multiple functions of the language across the world today, one has to agree with Kachru and Smith’s point that it is not necessary ‘for every user of English to be intelligible to every other user at all times’ (p. 60) or even at any time. We need to be intelligible only to those with whom we need to communicate, and we may, of course, switch from one variety of English to another depending on who our interlocutors are. A speaker may use an international form of English in a business meeting and, two minutes later, switch to a localized form of the language when talking with friends. The two forms will not be intelligible to all parties and they do not need to be.

The authors make useful distinctions between intelligibility, comprehensibility and interpretability. Intelligibility is defined as ‘the recognition of a word or another sentence-level element of an utterance’ (p. 61). To check the intelligibility of an utterance, even a fairly deviant one such as e.e. cummings’ ‘anyone lived in a pretty how town’, we could ask a listener to repeat or write down what we have said. For Kachru and Smith, comprehensibility refers to ‘the recognition of meaning attached to a word or utterance’ (p. 62). If, for example, one hears the word ‘please’, and realizes the function of the utterance is a polite request to do something, then, according to the authors, there has been high comprehensibility of the utterance. The term interpretability is used to refer to a hearer’s understanding of the purpose of an utterance. Unlike intelligibility or comprehensibility, to achieve interpretability one needs to know something about the cultural context. To demonstrate what is involved in interpretability, the authors give an insightful example of a text (p. 64) which is intelligible and comprehensible in terms of the words used but which seems impenetrable as to its overall purpose as a text. However, once we are given a title for this passage, we begin to relate the words and sentences to a known context, allowing us to produce a meaningful interpretation of the whole text.
The authors then present a genuine conversational example from African English (cited in Bokamba, 1992: 132) which is intelligible (we can recognize the words and repeat them) and comprehensible (we can recognize the meaning of individual words) but which will have low interpretability (knowing the speaker’s intentions) if we are not familiar with African English. Read the following and try to interpret whether the President has left for Nairobi yet or not.

‘Hasn’t the President left for Nairobi yet?’
‘Yes’.

The meaning here is that the President has not yet left for Nairobi, but those of us unfamiliar with the sociolinguistic context would probably give the opposite interpretation. As Kachru and Smith write at the end of Part One of their book, ‘With the global spread of English and the development of multiple varieties of English, issues of intelligibility will continue to be matters of concern’ (p. 68).

Part Two, ‘Sound, Sentence and Word’, is made up of three chapters, ‘Sounds and Rhythm’, ‘Phrases and Sentences’, and ‘Words and Collocations’. The first of these discusses the notion that stress assignment in the Outer and Expanding Circle varieties follows different rules from those which operate in Inner Circle varieties. The chapter goes on to ask whether English speakers from non-Inner Circle speech communities use a syllable-timed rhythm rather than a stress-timed rhythm when speaking. In discussing the attempts currently being made by linguists to define a core for the pronunciation of English as a lingua franca (e.g. Jenkins, 2000), Kachru and Smith argue that success in cross-cultural communication will be more effectively achieved through developing greater sensitivity to the different types of variation found in World Englishes. ‘Those who interact with other variety users accommodate to the variation they notice in each other’s speech or writing and gradually learn to communicate more effectively’ (pp. 82-3). Thus, the authors suggest, greater awareness of the contrasting linguistic, sociolinguistic and contextual features of different varieties is the key to achieving success in communication between speakers across all the circles of English.

The second chapter of Part Two looks at some of the grammatical differences found among varieties and how they may interfere with cross-cultural communication. The authors discuss, for example, the lack of verb inflection for tense in many Southeast Asian languages and how tense and aspect are marked in these languages and in the varieties of English spoken in Southeast Asia compared with the system of verb inflection usually used in Inner Circle varieties.
They note that it is common for native speakers of many Southeast Asian languages to mark tense in English with an adverbial rather than in the verb itself, as they would normally do in their own language (e.g. ‘I talk to her yesterday’). In an utterance such as ‘Her fiancé at that time brought over some canned ribs, pork ribs, yes, about 28 cans…and then we return about 14 of them’, the speaker seems to feel that, once the past tense has been established by the adverbial ‘at that time’, tense marking can become optional (p. 92). Kachru and Smith then go on to discuss a court case in the United States quoted by Gumperz (1982) involving two Filipino nurses in which the nurses were perceived to be untruthful because what they said appeared to be contradictory.

Q: Would you say that the two of you were close friends during that period of time?
A: I would say that we are good friends but we are really not that close because I don’t know her and we don’t know each other that much. (Gumperz, 1982: 173-74)

It seems that the Filipino nurses had become good friends during the course of the trial but did not know each other very well prior to that. In answering the question above, the nurse presumably felt that past time had already been established by the adverbial ‘during that period of time’ in the question and so she was free to use any tense in her answer. For speakers of Standard American English, however, the use of present tense by the nurse in the quote above implies that her answer was inconsistent and thus untruthful.

In Part Three, the authors set out the conventions of language use in speech and writing across different cultures. The section has three main chapters: Conversational Interaction, Interaction in Writing, and Contextualizing World Englishes Literatures. After discussing the spread and the functions of English across different cultures and the resulting variation in pronunciation, grammar and lexis in Parts One and Two, the authors examine language use in Part Three. In this section, they focus on how the capacity to use English in speech and writing differs across cultures. For example, in discussing turn taking (p. 121), the authors propose that the ‘one party at a time’ convention for conversations in the Inner Circle varieties has produced a system where children are told that it is rude to interrupt and people feel they have a right to continue talking until they have said what they want to: ‘Let me finish what I’m saying’. This ‘one party at a time’ convention, the authors claim, may not be held so strictly in the Hindi speech community in India or in Japan or in some communities in the Middle East. Strategies to indicate agreement or disagreement may also vary across English-speaking communities around the world.
In the following conversation between three female speakers of Indian English, the third speaker wishes to agree with speaker two but begins with a ‘No’, which would be an extremely unusual way to indicate agreement in an Inner Circle context.

A: Do you think it (wife abuse) is common?
B: In India? In rural families, this is common.
C: No, it’s common. Very much common even in very literate families.
(Data from Valentine, 1995: 243-244)


The book concludes by estimating that more than a billion learners are enrolled in English classes across the world today. Learners of English thus constitute the largest group of language learners in the history of humanity. This means that countries whose citizens are learning English have to make decisions about what kind of English is to be taught and about the relationship between English and the other languages taught and spoken in the country. Issues of the use of English in education in Outer and Expanding Circle communities and of ideology (linguistic imperialism, linguistic human rights, etc.) are also briefly discussed in the concluding pages.

While I would not recommend Cultures, Contexts, and World Englishes as a core text for a university course in language variation, it would be useful as a supplementary text for any course in World Englishes. Each chapter includes useful suggestions for further reading and has follow-up activities for students to do. The activities are generally stimulating and well constructed, but some of them require video materials which may not be available to teachers and students in developing countries. For example, in the activities for Chapter 6, readers are told to watch The Story of English, Part 7 and discuss the characteristics of Australian English. Most libraries in the universities I am familiar with in Pakistan and Southeast Asia would not have copies of The Story of English, a TV series which is now over twenty years old. Having said that, Cultures, Contexts, and World Englishes is clearly written and well structured, and it contains a wealth of examples from around the world to illustrate the points made. While many of the examples are not new and will be known to those familiar with the literature of World Englishes, they provide illuminating insights for students into the different linguistic and sociolinguistic features which exist today in the varieties of English spoken around the world.


References

Bokamba, E. G. 1992 ‘The Africanization of English’ in B. B. Kachru (ed.) The Other Tongue: English Across Cultures (2nd edn). Urbana, IL: University of Illinois Press.
Gumperz, J. (ed.) 1982 Language and Social Identity. Cambridge: CUP.
Jenkins, J. 2000 The Phonology of English as an International Language. Oxford: OUP.
Smith, L. E. (ed.) 1987 Discourse Across Cultures: Strategies in World Englishes. London: Prentice Hall.
Valentine, T. 1995 ‘Agreeing and Disagreeing in Indian English Discourse: Implications for Language Teaching’ in M. L. Tickoo (ed.) Language and Culture in Multilingual Societies: Viewpoints and Visions (pp. 227-250). Singapore: SEAMEO Regional Language Centre.

Pingali Sailaja Indian English Edinburgh: Edinburgh University Press (Dialects of English). 172pp. Price: £19.99. ISBN 978 0 74 862595 6

Reviewed by Claudia Lange

The latest addition to Edinburgh University Press’ series “Dialects of English” is Indian English by Pingali Sailaja. All titles in the series aim to “provide a starting point for anyone wishing to know more about a particular dialect”, as the back-cover blurb states, and in so doing, they all follow “a common structure, covering the background, phonetics and phonology, morphosyntax, lexis and history of a variety of English, and conclude[s] with an annotated bibliography and some sample texts.” The accompanying website http://www.lel.ed.ac.uk/dialects/india.html will host the audio samples that are given as transcriptions in the book; they were, however, not available online at the time of my writing this review (end of April 2009).

Since the books in the series are intended as concise quick-reference guides for the variety in question, they have to offer broad coverage of their topic without burdening the general reader with too many theoretical intricacies or too much technical terminology. Sailaja handles this task admirably, steering clear of linguistic jargon as far as possible without sacrificing precision or theoretical sophistication.

In the introductory chapter, Sailaja unfolds a panoramic view of the range and depth of English in India, discussing demographic factors such as the actual number of Indian speakers who consider English as their first or second language as opposed to “the tremendous, even disproportionate, significance that English carries” (2) in the country. The issue of ‘Indian English’ versus ‘English in India’ (13-15) is touched upon, and Sailaja takes a firm stand for ‘Indian English’: “In this book, ‘Indian English’ is used without apology because there is a variety of English that is identifiable as Indian; this variety has several different facets to it. There are Indian Englishes no doubt but they are Indian English first” (15).
Consequently, the main focus of the book is “on those features that are pan-Indian” (viii), and then predominantly on standard Indian English as opposed to non-standard and informal varieties.


The second chapter on phonetics and phonology contains two huge surprises for the student of Indian English. First, Sailaja states that “Standard IE pronunciation (SIEP) is non-rhotic, in which feature it matches RP” (19). This is a rather striking claim to make, given that practically everybody writing on the topic thinks otherwise, as the following barrage of quotations might indicate:

EIE [Educated Indian English] is a rhotic accent, that is to say ‘r’ is pronounced wherever it occurs, unlike BRP [British Received Pronunciation] where post-vocalic ‘r’ is not pronounced. (Nihalani, Tongue, and Hosali 1979: 211)

In Indian English /r/ is often retained in all positions. (Bansal and Harrison 1991: 70)

Despite the long-standing influence of Received Pronunciation and other generally non-rhotic British accents, English in India is almost universally rhotic: that is, r is pronounced in all positions. (McArthur 2003: 320)

Although postvocalic realizations of /r/ might be an instance of spelling pronunciation, it must be conceded that the English brought to India from the earliest times is likely to have its postvocalic r’s intact. (Gargesh 2008: 238)

To my mind, the burden of proof would then lie on Sailaja to argue convincingly for the non-rhoticity of Standard Indian English. Her pronouncement does not seem to be founded on her own research, but rather on comments made by Agnihotri and Sahgal (1985) as well as Sahgal and Agnihotri (1988), to the effect that an “/r/-less accent is a prestige marker in India” (19). To quote a newspaper headline from D’Souza (2001), ‘Sonia…and yet so far’ (referring to Sonia Gandhi’s election campaign in 1997), as evidence for non-rhoticity falls a bit short of a convincing argument. Sailaja continues by saying that “Most non-standard varieties of IE are rhotic … There are those whose speech would be somewhere in the middle of the cline but they may still have non-rhotic speech” (20). This brings us back to the question of how exactly Sailaja conceptualizes the opposition between standard and non-standard pronunciation. Turning back to page 18, we find that she rather idiosyncratically refers to the model provided by All India Radio’s newsreaders as the standard accent:

An acquired variety modelled on RP is that of the newsreaders of All India Radio …. The variety that is described in this work, and one that is called the standard here, is close to but does not precisely match All India Radio’s newsreaders’ speech. It is not the generalized IE that is mentioned above either [as described by Nihalani et al.]. Generalised IE is the variety of speech that has more Indian features in it and is the second variety of the three types mentioned above [i.e. standard – non-standard – informal]. (18)


If I understand it correctly, Sailaja assigns Nihalani et al.’s account, which explicitly aims at capturing a general educated Indian English accent, to the realm of the non-standard. This point of view will doubtless be met with great interest by the specialist reader, but is likely to confuse the intended audience, namely the general reader. One more piece of received wisdom about Indian English is rejected in this chapter: Sailaja maintains that Standard Indian English is not syllable-timed (34). Again, a host of studies could be mustered against her pronouncement, and again, one would like to see more evidence to substantiate her point in this respect.

Chapter three deals with morphosyntax, an area less likely to exhibit notable divergence from British or American English:

When a language is learnt as a second or foreign language, the focus on ‘correctness’ is much greater than when the language under consideration is the native language. There are features that are Indian in Standard Indian English but usually native varieties become the benchmark for correctness. … Since there is no written grammar for IE, in case of doubt, an English grammar is consulted. (40)

Sailaja considers many of the morphosyntactic features she discusses in this chapter as non-standard (e.g. article use, use of the progressive, word order in interrogatives, topicalization, focus marker only); their inclusion, however, is warranted by the fact that speakers of standard IE may resort to non-standard features in informal contexts: “Syntax, like phonology, illustrates the point that standard and non-standard in IE are not water-tight compartments” (65). Chapter four on lexis and discourse gives a comprehensive overview of lexical items that are unique to Indian English, including a section on words of Indian origin that have made their way into other varieties of English, notably British English. While the specialist reader may well have Singh’s dictum in mind that lexical innovation in Indian English is “nothing to write home about” (Singh 2007: 38), the general reader will find this chapter highly informative, especially when discourse features such as address forms, politeness and style are discussed and situated in their cultural context. Chapter five is devoted to the history of English in India. As a concise overview of the processes and protagonists that were instrumental in the institutionalization of English in India, this chapter is much more balanced and informative than e.g. Kachru (1994). The volume is rounded off by an extensive annotated bibliography and a fair number of (annotated) sample texts, drawn from different registers and representing Indian English from 1794 into the twenty-first century.


As said above, Sailaja has definitely met the task set before her by the “Dialects of English” series editors. Indian English is a welcome addition to the literature on Indian English, and it is likely to assert its place as the most concise undergraduate textbook on the topic.

References

Agnihotri, Rama Kant, and Anju Sahgal 1985 Is Indian English retroflexed and r-ful? Indian Journal of Applied Linguistics XI (1): 97-108.
Bansal, R. K. and J. B. Harrison 1991 Spoken English: A Manual of Speech and Phonetics. Mumbai: Orient Longman.
D’Souza, Jean 2001 Contextualizing range and depth in Indian English. World Englishes 20 (2): 145-159.
Gargesh, Ravinder 2008 Indian English: Phonology. In Varieties of English 4: Africa, South and Southeast Asia, Rajend Mesthrie (ed.), 231-243. Berlin: Mouton de Gruyter.
Kachru, Braj B. 1994 English in South Asia. In The Cambridge History of the English Language. Vol. 5. English in Britain and Overseas: Origins and Developments, Robert Burchfield (ed.), 497-553. Cambridge: Cambridge University Press.
McArthur, Tom 2003 Oxford Guide to World English. Oxford: Oxford University Press.
Nihalani, Paroo, R. K. Tongue, and Priya Hosali 1979 Indian and British English: A Handbook of Usage and Pronunciation. New Delhi: Oxford University Press.
Sahgal, Anju, and Rama Kant Agnihotri 1988 Indian English phonology: A sociolinguistic perspective. English World-Wide 9 (1): 51-64.
Singh, Rajendra 2007 The nature, structure, and status of Indian English. In Annual Review of South Asian Languages and Linguistics 2007, Rajendra Singh (ed.), 33-46. Berlin, New York: Mouton de Gruyter.

Tove Skutnabb-Kangas Linguistic genocide in education – or worldwide diversity and human rights. 2008. New Delhi: Orient Longman Private Limited. ISBN 81-250-3461-7

Reviewed by Otto M. Ikome

As the title of Tove Skutnabb-Kangas’ (TS-K) provocative and challenging work suggests, languages, like living species, cannot be allowed to disappear from the globe without a concerted and conscientious fight to save them. Their right to survival is, she suggests, pegged to human rights and the consolidation of our rich human patrimony of knowledge. She does more than sound an alarm by calling on the world community and individual speakers to gain knowledge about language in order to successfully promote linguistic and cultural diversity. Writing from the powerful Western Center with unwavering determination for the oft-marginalized periphery, TS-K sees herself as a marginalized sociopolitical advocate who has chosen to research, document and empirically argue for minority language rights, against a few so-called “standardized” languages. To back her precondition for meaningfully protecting human languages, she has painstakingly documented her arguments, claims and recommendations. Her writing style is particularly engaging and there is no escaping her scathing critique of tokenism and of paying lip service to the need to save languages through research and teaching. She wonders why well-meaning teachers working with good intentions still produce disastrous results (xx). She sets the tone of her challenge and invitation to act using a series of “why-questions”. While the mammoth 785-page book may appear a long and tedious read, its illustrative and cross-referencing style accounts for its volume – a style that is typical of the author’s no-stone-left-unturned approach to making a tough case in an academy that prides itself on empirical proof, comparative analysis and scientific and theoretical argument.
Although she writes from a locality in the West against which she addresses her initial challenge, she has certainly set foot in many a battleground where weaker languages, minority language groups and languages threatened with immediate extinction are marginalized, and where educational systems are tailored to maintaining the status quo and to systematically ignoring the rights of minority


children to be taught in their mother tongues. The use of numerous inserts, elaborate footnotes, information, address and definition boxes, reader tasks and tables makes reading and understanding the book a lot more enjoyable and rewarding. These complementary additions to the critique and argument help provide clarity and context for the many metaphors and analogies that characterize TS-K’s unapologetic writing style. This very affordable paperback should be of obvious interest to the readers of ARSALL, for it serves the less powerful in the periphery and empowers them to join the struggle by acquiring the knowledge necessary to do the groundwork that can bring about change in their own communities. Upon reading the encyclopedic treatise woven intricately by TS-K, the usual readers of ARSALL will see that they are being invited to research, document and expose the facts about all languages of the world, beginning with those in their own locality, and there are, as is well known, many threatened and menaced languages in South Asia.

The book is divided into three sections and has a total of nine chapters, along with a comprehensive bibliography, author/person index, languages/peoples index, country/state index, and a subject index. A guiding preface provides a clear outline, including some advice to the reader on how to use the book and who might want to read it. The lengthy list of acknowledgements attests to the scope of the author’s consultations and documentation. Luckily for the reader, the many typos that dot the book’s complex landscape do not detract from the author’s message. Section I sets the scene in four chapters that examine the languages of the world, linking biological, linguistic and cultural diversity to mother tongue, culture, ethnicity and self-determination, and questioning whether linguistic diversity is a good or a bad thing.
Section II delves into the death of languages as genocide perpetrated by state policies and the effects of globalization. Its two chapters suggest how schools promote discrimination and biased educational practices which, based on two paradigms, are shown to ultimately contribute to language death and linguistic genocide. Section III draws the reader into the author’s prescribed struggle against the ‘killing off’ of the world’s weakest and most vulnerable languages along with many of their cultures and speakers, and argues for fighting for linguistic human rights within a coercive educational system. In these last three chapters, linguistic human rights are introduced and the case for their acknowledgement in bilingual/multilingual education is made, while alternatives to the destructive, ignorance-prone attitude towards mankind’s best


asset – language – can help stave off the threat of extinction for an ever-growing number of the world’s languages and cultures. One of the inherent causes of the gradual abandonment of indigenous languages is the fact that children do not ‘learn’ their own mother tongues; in California, for instance, indigenous languages are currently dominated by powerful languages, and will be abandoned faster than in regions where these dominant languages haven’t made significant inroads. Other factors, including the limited number of speakers, may be responsible for their ultimate demise (cf. the ‘moribund’ languages of the Adamawa/Mambila region of Cameroon). If the ‘bloodied hand’ of education were to be sanitized by an ‘additive’ bilingual education that promotes a balanced democratic use of both the indigenous mother tongue and the dominant language, this might slow the rate of abandonment or ‘disuse’ of the weaker language. Like dominant languages of education (especially in areas of European colonization), a lingua franca – with its purported ‘neutral’ interethnic mode of communication – can command an ‘unhealthy’ following which may entice speakers to abandon their mother tongues in its favor (cf. Fulfulde, KiSwahili). TS-K suggests that any advantages these dominant languages may offer should be ‘additive’ to the advantages of using indigenous mother tongues. TS-K shows that even with good intentions, many mother tongues (MTs) still disappear because there is a lack of knowledge about both the subtle and overt actions that contribute to their loss. The parallels drawn between biological and linguistic diversity help with focusing on how to account for and protect the languages of the world.
A similar parallel is drawn by Mufwene’s (2001) “ecology of language evolution” in which he “develops a general model for language evolution which draws heavily on parallels to biological evolution: languages are analogous to species and population thinking is critical.” TS-K has made knowledge, including scientific data on biodiversity, central to her assessment of linguistic diversity. The detailed review of the facts about biological species and their survival makes it possible for her to establish the relevant associations, measurements and predictions about the survival of human languages. In doing so, she goes beyond mere metaphor and analogy and firmly establishes the correlates between language and human species survival. TS-K shows that using a non-native code additively ensures that the original creative potential of the indigenous language will have its chance to contribute the cultural baggage it bears to humanity. Given the natural dictates of diversity, one cannot promote the killing of one set of species in favor of a select dominant species. The


“high” or “thick” occurrence of biological and language species around the tropics is not an accident, according to Luisa Maffi (1996). Such diversity is warranted by the environment wherein these species are expected to thrive interdependently. An artificial process of annihilation through urbanization, globalization, colonization, etc. (Harmon 1995) destabilizes both the biological and linguistic ecosystems. The biological ability of whales to use sonar and sound to hunt, mate and socially interact intrinsically merges biology and communication to ensure survival within their water habitat. TS-K uses a similar example from Nils Jernsletten to firmly tie biodiversity to linguistic and cultural diversity: “… the ‘prerequisite for making a living from nature’ is for the hunter, fisherman/woman and reindeer herder [to have] ‘an intimate knowledge of the landscape’.” (Jernsletten 1997: 90); the professional Sámi terminology on reindeer, salmon and snow encapsulates local “traditional knowledge” needed to ensure survival of its people; the disappearance of this language will be accompanied by the possible disappearance of this specific knowledge as well as some of its best human guarantors. Language Human Rights (LHR) are addressed with conventions in mind, and TS-K uses them here to encourage institutions to protect MTs and MLs that neither enjoy automatic recognition nor hold the necessary power to foster Human Factor Development (Adjibolosoo 2000). To prevent ‘linguistic genocide’ and promote linguistic diversity worldwide, we must protect children’s MTs and encourage their use as the primary mode of learning in school. Ethnicity is a culturally shared concept that has misguidedly been restricted to minorities, thus bringing about discrimination between minority and majority cultures.
While the cultural ‘core’ of the majority is secure in a given educational policy, that of the minority is often relegated to the status of a ‘private ethnicity’ (see John Edwards 1984) – meaning that the individual’s or group’s private cultural business has to be discussed or carried out in the privacy of their homes! This leaves the minority cultural group only one true choice, that of assimilating/integrating into the dominant culture and ultimately using the dominant language, whether under coercion or voluntarily. The question as to whether language diversity is a curse or a blessing can only be made sense of by those who would condone the “invisibilization” of the numerous languages, cultures and peoples of the world that remain a thorn in the thigh of colonizers, assimilationists, elitists, and maintainers of the dominant language status quo. The true curse is “the exploitation of man by man” in capitalist power dynamics – a dynamic that makes it “acceptable” to label large swaths of, say, the African continent,


into Francophone Africa, Anglophone Africa, Lusophone Africa (TS-K 232-233), when the phony “phones” associated with these regions are hardly known to or used by the indigenous population. These labels are upheld by the indigenous elite and promoted as successful, cost-effective aspects of the educational system – an act of ‘linguistic suicide’ indeed. Does the investment in the development of indigenous, minority and less powerful languages have to be pegged to political posturing? Take away the “power” component, and the threat, rivalry and competition between languages will dissipate significantly. Colonizing Germans somehow had the vision to promote literacy in local/indigenous languages before investing in the limited teaching of German to the indigenes. Had this model been systematically practiced by all colonizers, explorers and evangelists, TS-K’s current efforts would have had a head start, albeit a small one. If within small national boundaries the conditions for catering to small, powerless languages seem so difficult, imagine what the fate of minority languages really is in an interconnected globalized economy in which “the education many minorities receive [to be able to compete, actually] serves to maintain an unequal division of power and resources in the world”. TS-K insists on the precondition that the world’s material resources be redistributed fairly, as a human right, for substantive changes to occur. A good test of this principle is to establish an educational system that promotes linguistic and cultural diversity. The right to an education in one’s MT through bilingual and multilingual schools also directly addresses the issue of protecting minority languages beyond international conventions and declarations.
TS-K proposes alternative actions that will allay the current trend of “genocide” and “dystopia”, including a redistribution of resources that will help everyone contribute to the maintenance of programs through rational language policies that respect linguistic human rights. Tackling “linguicism” and standing up to governments that still foster neocolonial subjugation of developing countries will not only break the stranglehold on poor Southern countries but may also bring about change in the Western educational policies that continue to trample on the language rights of minorities by discouraging cultural and linguistic diversity and killing off several indigenous languages.


References

Adjibolosoo, Senyo B-S.K. 2000 The Human Factor in Shaping the Course of History and Development. University Press of America.
Edwards, John 1984 Language, Diversity and Identity. In J. Edwards (ed.) Linguistic Minorities, Policies and Pluralism. London: Academic Press. 277-310.
Harmon, David 1995 Losing Species, Losing Languages: Connections Between Biological and Linguistic Diversity. Paper presented at the symposium on Language Loss and Public Policy, Albuquerque, N.M., June 30-July 2 1995. In press, in Southwest Journal of Linguistics 15.
Jernsletten, Nils 1997 Sami Traditional Terms Concerning Salmon, Reindeer and Snow. In Gaski (ed.) Sami Culture in a New Era. The Norwegian Sami Experience. Karasjok: Davvi Girji, 86-108.
Maffi, Luisa 1996 Language, Knowledge and the Environment: Threats to the World’s Biocultural Diversity. Terralingua Newsletter. 2 December 1996.
Mufwene, Salikoko S. 2001 The Ecology of Language Evolution. Cambridge: Cambridge University Press.

K. G. Vijayakrishnan The Grammar of Carnatic Music Phonology and Phonetics Series, Volume 8. Berlin: Mouton de Gruyter. 2008.

Reviewed by Nirmalangshu Mukherji

The book under discussion is a field-opening work in a variety of ways and can in fact be said to herald the dawn of music theory for Indian classical music. This becomes clear when we compare developments in music theory in the West with studies of Indian classical music. First, the demands of orchestral and polyphonic music in the West opened the tradition of written scores in Western (classical) music centuries ago. This textuality not only allowed a wider dissemination of Western classical music in Europe and beyond, it also enabled the development of theoretical investigations into the history, form and (musical) content of the music. In contrast, the tradition of classical music in India, though traceable to Vedic times, has been essentially an oral tradition, with an emphasis on improvisation. The situation did not improve despite the subsequent institutionalization of classical music through the university system. Music departments in India essentially played the role of widening the base of performance of classical music, which had until then been restricted to the gharana traditions. Drawing upon inadequate earlier work, Vijayakrishnan’s book goes a long way in establishing a theoretically salient notational scheme for displaying salient passages from Carnatic music; the scheme could be easily extended to accommodate Hindustani classical music as well. It is unclear if the scheme transfers adequately to the Western notational system, say, on the piano, but there are enough examples in the book (Chapter 4) to suggest that the organization of the well-tempered scheme of current Western music may at least be suitably compared with the tonal organisation of Carnatic music (76-88). In any case, theoretical investigation into the structure and the content of Indian classical music can now be pursued in earnest.
Second, for the purposes of this review, music theory in the Western sense can be broadly classified, until very recently, into two major efforts. With the advent of written scores, as noted, a rich tradition of musicology emerged in the Western tradition. Until about the late 19th century, musicology


was almost exclusively concerned with systematic archiving of Western classical music into delineable periods such as baroque, classical, romantic, etc. It focused on individual composers and the development of their music, engaged in analysis and criticism of various traditions, styles and composers, musical valuation of compositions and their performances, and the like. Although the study of musical form and tonal organisation was implicit in much of this work, a direct study of tonal structure – music theory proper – was, not surprisingly, a much later development. To mention just one of the important theoretical moves in that direction, the work of Heinrich Schenker proposed novel tools for displaying the hierarchy of tonal organisation across large chunks of tonal music. However, even there, the focus was restricted to the study of individual composers, most notably Beethoven. A general theory of tonal organisation designed to capture aspects of musical interpretation by the audience of tonal music was still missing.

Third, in that sense, the work of Lerdahl and Jackendoff (1983) could be viewed as a pioneering attempt to develop a cognitive theory of music along the lines of the generative theory of language proposed by Noam Chomsky three decades earlier. An interest in a language-theory-governed theory of music could be traced to some ideas proposed by the composer-conductor Leonard Bernstein (1976). Impressed with the development of the Chomskyan theory of language, Bernstein suggested that a similar attention be directed to music cognition since, according to him, the system of music displayed many of the central features of human languages: species-specificity, universality, hierarchically articulated structure, and the like. Lerdahl and Jackendoff’s work was arguably the first major articulation of Bernstein’s project.
Building on Schenker’s work and on much else, Lerdahl and Jackendoff showed in careful detail how the listeners of Western classical music perceive hierarchical grouping structures, while aligning them with rhythmic structures, as pieces of music progress through ever-growing complexity. Two restrictions of this otherwise groundbreaking work are worth mentioning here. For one, the work is essentially restricted to Western classical tonal music with its emphasis on harmony, modulation and counterpoint. Although some attempts have been made to extend the model to other forms of music, the results are insufficient and unclear. Second, the authors basically set aside a central aspect of Bernstein’s project, namely, to study the abstract relationship between language and music. They held that music theory relates to language theory more in style and methodology than in actual content, except for unsurprising points of contact in prosodic and rhythmic structures which are generally assumed to be non-specific to these systems in any case. In fact, according to later

work (Ramus et al. 2000), it is known that prosodic and rhythmic structures are not even species-specific.

Vijayakrishnan deserves much admiration for covering these phases essentially single-handedly in the space of a single book. Being a trained musician himself, apart from being a first-rate linguist, he has been able to encode many of the nuances of Carnatic music, including its raaga and taala systems, to present a rich view of the variety and the complexity of this music from actual examples. However, in doing this, he very consciously stays away from either listing the raaga system or engaging in aesthetic evaluation. Much like the enterprise in Lerdahl and Jackendoff, his basic focus is to study the progression of this music through individual performances to extract some general features that seem to apply to the whole of this form of music, and probably beyond. Yet, he veers away significantly from the work of Lerdahl and Jackendoff in actually incorporating a specific formal theory—optimality theory—that has so far been applied almost exclusively to the study of human languages. In the process, the author is able to make some interesting general comments on the very character of Carnatic music vis-à-vis human languages; as noted, these comments may well apply beyond Carnatic music.

Turning briefly to his specific proposals, Vijayakrishnan holds, in the spirit of optimality theory, that musical progression in Carnatic music is the result of a competition between a set of mutually independent and formally specifiable constraints. The reader will benefit much from the foreword to the book, in which the noted linguist Paul Kiparsky gives an excellent introduction to the framework of optimality theory and its relevance for musical analysis. The two basic sets of constraints are called Markedness constraints and FaithLex constraints, with individual members specified within each set.
For example, a Markedness constraint says that the notes of the supposedly universal twelve-tone scale are less marked and hence highly ranked in terms of cognitive preference. Carnatic music, however, requires notes such as E flat and B flat, which are highly marked and hence have a low rank with respect to the twelve-tone scale. This conflicts with a FaithLex constraint which requires that the specifics of a raaga—its tonal structure, ascent/descent conditions, typical phrases, etc.—that belong to the Lexicon of Carnatic music be satisfied: a musical structure must be faithful to its lexicon. Interesting details aside, the conflict is addressed in the music such that these low-ranked notes are achieved by lowering the targets for them, say, by deflecting the string on the preceding note (63). Such theoretical resources have been used through a rich store of examples

across a variety of compositional styles in Carnatic music. The discursive originality of Vijayakrishnan’s work is that a very wide survey of Carnatic music is in fact achieved through theoretical moves such as the one just described. This is in sharp contrast to the tiring exegesis of hundreds of raagas usually found in treatises on Indian classical music. Not surprisingly, Vijayakrishnan’s work is likely to be most contentious theoretically exactly at these innovative points and in the general conclusions about the design of music he attempts to derive from them. Consider the very choice of optimality theory for studying the structure of music. Over two decades of work in optimality theories of human languages arguably suggest that the model applies more convincingly to the phonological aspects of language than to its syntax and semantics. Not only can very few of the core problems in syntax be directly addressed with the resources of optimality theory (Barbosa et al. 1998), there are also concerns that even these restricted resources are psychologically implausible; for example, they are often too ‘costly’. Be that as it may, the point is that, if the preceding scenario regarding the applicability of optimality theory is roughly correct, then optimality theory may apply at best to the ‘phonological’ (that is, sound) aspects of music, not to its syntax and semantics. Vijayakrishnan seems to agree with this idea since, according to him, Carnatic music has no syntax and semantics, only ‘phonetics’. In effect, Carnatic music has no hierarchic structures which are computationally interpreted. However, the author seems to allow that Western classical music may have syntax, unlike Carnatic music. If that is so, then the picture raises difficult questions about the universality of music as a species-specific device.
Limited psychological experiments on melodic expectancy and tonal hierarchies have found considerable agreement between listeners from within the music’s cultural context and those from outside it. Thus, ‘the inexperienced listeners were able to adapt quite rapidly to different musical systems’ (Krumhansl et al. 2000). If the tonal structures of Carnatic and Western classical music differ as sharply as Vijayakrishnan suggests, then either the phenomenon just cited remains unexplained or Carnatic and other systems of music somehow fall out of the universal set. In general, we may ask if the author wants to hold that Carnatic music somehow fails to allow for the unending embeddings typically found in music (and language) across the world (Fitch 2006). If so, then how do we account for this remarkable specificity? My own preliminary hunch is that this otherwise unsavory non-universalistic result is a consequence of the general framework the author adopts. For example, his suggestion that the Lexicon of Carnatic music

contains all of the marked and unmarked pitches, raaga structures, characteristic passages and the like perhaps helps in creating the no-syntax picture. Since much of the input information to the system is already structured, progression works by simply iterating and re-iterating these inputs in a flat structure. But then a raaga system, viewed as a collection of pitches—a pitch set—is itself a structured object. Which device in the system constructs the pitch set, not to speak of the more elaborate characteristic phrases of the raaga system? The simplest way of constructing an unordered set in the recent minimalist program in linguistic theory (Chomsky 1995) is the basic operation merge, which puts α and β together to generate {α, β}, incorporating the No Tampering Condition (NTC) which leaves α and β intact, and results in a hierarchy—embedding—when a third element γ is added. In this alternative picture, the ‘atomic’ objects α, β, γ belong to the lexicon and the resulting set is generated by the computational system; the complex object is then interpreted by systems external to the core computational system. In a sense then, syntax and semantics follow from the very requirements of set-construction, which seems to be an absolutely primitive requirement in any symbolic domain we wish to look at. Since Vijayakrishnan takes these sets themselves to be primitive, it is no wonder that in his scheme Carnatic music fails to have either syntax or semantics.
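The set-construction step just described can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the function names `merge` and `depth` are hypothetical, atomic lexical items are modelled as plain strings, and `frozenset` stands in for the unordered set {α, β}. It is not drawn from Chomsky (1995) or from the book under review.

```python
def merge(alpha, beta):
    """Form the unordered set {alpha, beta}.

    Neither input is modified (a stand-in for the No Tampering Condition):
    frozensets and strings are immutable, so the inputs survive intact
    as members of the result.
    """
    return frozenset({alpha, beta})


def depth(obj):
    """Depth of embedding: 0 for an atomic item, else 1 + the deepest member."""
    if isinstance(obj, frozenset):
        return 1 + max(depth(member) for member in obj)
    return 0


# Hypothetical atomic items taken from a lexicon.
ab = merge("α", "β")    # {α, β}
abc = merge(ab, "γ")    # {{α, β}, γ}: adding a third element yields embedding

print(depth(ab), depth(abc))
```

On this modelling, hierarchy is a by-product of applying the same operation to its own output; nothing beyond set formation has to be stipulated.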

References
Barbosa, P., D. Fox, P. Hagstrom, M. McGinnis and D. Pesetsky (Eds.) 1998 Is the Best Good Enough?: Optimality and Competition in Syntax. Cambridge: MIT Press.
Bernstein, L. 1976 The Unanswered Question. Cambridge: Harvard University Press.
Chomsky, N. 1995 The Minimalist Program. Cambridge: MIT Press.
Fitch, W. 2006 The biology and evolution of music: A comparative perspective. Cognition, 100, 173–215.
Krumhansl, C., P. Toivanen, T. Eerola, P. Toiviainen, T. Järvinen and J. Louhivuori 2000 Cross-cultural music cognition: Cognitive methodology applied to North Sami yoiks. Cognition, 76, 13–58.

Lerdahl, F. and R. Jackendoff 1983 A Generative Theory of Tonal Music. Cambridge: MIT Press.
Ramus, F., M. Hauser, C. Miller, D. Morris and J. Mehler 2000 Language discrimination by human newborns and by cotton-top tamarin monkeys. Science, 288, 349–351.

Dialogue

Aspects of Assamese Morphonology Revisited: Reflections on Mahanta

Luc Baronian

1. Introduction

In this note, I invite the reader to reflect on the tendency in contemporary generative theories, whether constraint- or derivation-driven, to treat different modules of grammar in a uniform way. Transformational syntacticians venture South into morphology with Head Movement and Affix Lowering, while Optimality-theoretic phonologists head North with Output-Output, Sympathy and the like. We shall leave the former aside for another day and map out an alternative to Mahanta’s northern expedition (this volume). The point is certainly not to criticize Mahanta herself, as her analysis is clear, detailed and consistent with the framework of assumptions shared by OT phonologists. In fact, I ask for the reader’s indulgence in not anticipating a “better” analysis. Space constraints and my ignorance of Assamese prevent me from even trying that, but I believe the remarks offered below merit attention, particularly from phonologists working on Assamese, including Mahanta, of course.

Since Aronoff (1976), several theories of “morphology by itself” have been proposed, among which is the radically amorphous and truly word-based theory of Whole Word Morphology (WWM). This theory (Ford and Singh 1991, Ford, Singh and Martohardjono 1997) holds that all of morphology, which includes morphonology, can be accounted for by what its architects refer to as Word-Formation Strategies (WFS). These strategies relate word descriptions and allow the speaker to project an already acquired word onto new forms (e.g. create a noun from an existing verb or a plural from a singular, and vice versa). The schema in (1) describes the general form of WFS.

(1) /X/a ↔ /X′/b
where:
a. /X/a and /X′/b are words, and X and X′ are abbreviations of the forms of classes of words belonging to categories a and b (with which specific words belonging to the right category can be unified or onto which they can be mapped).
b. ′ represents all the form-related differences North of automatic phonology between /X/ and /X′/.
c. a and b are syntactic categories.
d. ↔ is a bidirectional implication (if X, then X′ and if X′, then X).

In the next section, I propose a WWM alternative to some parts of Mahanta’s analysis that is not a possibly unwarranted extension of a phonological analysis but a truly autonomous morphological one.
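The bidirectional character of schema (1) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not part of WWM as published: the class name `Strategy`, its field names, and the category labels `N.sg` and `N.pl` are hypothetical, the example is an English singular/plural pair rather than Assamese, and the form-related difference between /X/ and /X′/ is simplified to word-final material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """A toy Word-Formation Strategy relating /X + end_a/ (category a)
    to /X + end_b/ (category b), where X is a variable over whole-word forms."""
    cat_a: str
    end_a: str
    cat_b: str
    end_b: str

    def project(self, word, category):
        """Map a known word onto its counterpart, in either direction.

        Returns (new_word, new_category), or None if the word cannot be
        unified with either side of the strategy.
        """
        if category == self.cat_a and word.endswith(self.end_a):
            x = word[: len(word) - len(self.end_a)]   # the shared variable X
            return x + self.end_b, self.cat_b
        if category == self.cat_b and word.endswith(self.end_b):
            x = word[: len(word) - len(self.end_b)]
            return x + self.end_a, self.cat_a
        return None


# Hypothetical strategy: /X/N.sg <-> /Xs/N.pl
plural = Strategy(cat_a="N.sg", end_a="", cat_b="N.pl", end_b="s")

print(plural.project("cat", "N.sg"))   # left to right
print(plural.project("dogs", "N.pl"))  # right to left
```

The same object licenses both directions, which is the sense in which ↔ in (1d) is an implication both ways rather than a derivation from a base.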

2. The alternative

I shall work in reverse from Mahanta by first accounting for the hiatus avoidance facts she reports in example (32) of her paper with a simple pair of WFS. The two WFS in (2a) and (2b) vary the suffixes according to whether a vowel or a consonant precedes:

(2) a. /XVy /Erg ↔ /XVk/Acc ↔ /XVloi/Dat ↔ /XVr/Gen ↔ /XVt/Loc ↔ /XVr /Inst
    b. /XC /Erg ↔ /XC k/Acc ↔ /XColoi/Dat ↔ /XC r/Gen ↔ /XC t/Loc ↔ /XCV r /Inst

Mahanta points out in her endnote 14 that an allomorphy analysis would also have to deal with hiatus avoidance. While WWM offers competing WFS instead of allomorphs, it is clear in (2a) and (2b) that hiatus is avoided. Even if hiatus avoidance is general in the language, it is necessary to specify somewhere in the grammar that the ergative and instrumental cases avoid hiatus differently, something WWM does straightforwardly. Mahanta further points out that lexical indexation comes with built-in locality, but I would note that lexical indexation itself is not integrally built into OT; it is simply a tool that allows one to do morphology within OT. Could WWM express the “suffix allomorphy” in (2) as a result of an initial (non-local) segmental difference? Yes, as the fictional but legitimate WFS in (3) suggest, but the historical facts that would lead to such a situation would make it very rare. While WWM morphologists argue for an independent morphology, we all recognize that phonology also exists and that it definitely has built-in locality. Since historical phonological alternations are typically responsible for this kind of synchronic morphological allomorphy, (3) is an implausible (if not impossible) WFS pair.

(3) /VXy /Erg ↔ /VXk/Acc
    /CX /Erg ↔ /CX k/Acc

Secondly, it is possible to account for vowel harmony facts in WWM if we incorporate a multi-tier representation. We can thus account for Mahanta’s Assamese facts in example (1) of her paper with (4) below:

(4) [yATR –ATR]         [yATR +ATR]
       /X/N        ↔       /Xi/Adj

In (4), at the same time as one suffixes /i/ to a noun (making it an adjective), one changes the –ATR feature shared by the rightmost vowels to +ATR. The ATR condition will apply vacuously when the final vowel of the noun is a low vowel, by virtue of this vowel not being specified for ATR. (Note that I am not sure from Mahanta’s paper whether the nominal root is bound or not in Assamese; even if it is, it should not be too complicated to adjust (4) accordingly.)
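The effect of (4) can be sketched procedurally. The Python below is a rough illustration under assumptions that are not stated in this note: the vowel inventory and the –ATR/+ATR pairings (ɛ/e, ɔ/o, ʊ/u) are assumed for the sake of the example, the function name `noun_to_adjective` is hypothetical, consonants are treated as transparent, and the test strings are invented forms, not attested Assamese words.

```python
# Assumed pairing of -ATR vowels with their +ATR counterparts.
MINUS_TO_PLUS_ATR = {"ɛ": "e", "ɔ": "o", "ʊ": "u"}
# Assumed vowel inventory; /a/ is low and carries no ATR specification.
VOWELS = set("ieɛaɔoʊu")


def noun_to_adjective(noun):
    """Suffix /i/ and switch the rightmost run of -ATR vowels to +ATR."""
    segments = list(noun)
    # Scan leftwards from the end of the word.
    for i in range(len(segments) - 1, -1, -1):
        seg = segments[i]
        if seg not in VOWELS:
            continue  # consonants are skipped in this sketch
        if seg not in MINUS_TO_PLUS_ATR:
            break     # a low or +ATR vowel ends the rightmost -ATR span
        segments[i] = MINUS_TO_PLUS_ATR[seg]
    return "".join(segments) + "i"


print(noun_to_adjective("pɔlɛ"))   # both vowels belong to the -ATR span
print(noun_to_adjective("kɔpat"))  # final low vowel: the change is vacuous
```

When the final vowel is low, the loop stops at once and only the suffix is added, which mirrors the vacuous application described above.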

3. Conclusion

I have certainly not provided as detailed an account of the Assamese facts as Mahanta has, but I have, hopefully, shown that a purely morphological account of morphonological facts such as CV-sensitive allomorphy and vowel harmony is possible under a unified theory of morphology such as WWM. I view this as a double opportunity. First, I hope morphologists working in other autonomous theories will see through our divergences and seize the opportunity to demonstrate their frameworks’ capabilities in this regard. Second, I invite OT (and other) phonologists to lift a burden off their otherwise respectable theory by giving morphology a little more space. OT phonologists in particular should, in my view, seriously consider the implications of adding lexical indexation to their toolkit: as with constraint weight, the “concern is well founded” that the loss of restrictiveness entailed by it could, in the words of Prince and Smolensky (1993: 215-216), remove “the hope of principled explanation.”

References
Aronoff, Mark 1976 Word Formation in Generative Grammar. Cambridge MA: The MIT Press.
Ford, Alan, and Rajendra Singh 1991 Propédeutique morphologique. Folia Linguistica 25: 549-575.
Ford, Alan, Rajendra Singh and Gita Martohardjono 1997 Pace Panini: Towards a Word-based Theory of Morphology. New York: Peter Lang.
Mahanta, Shakuntala This volume Morpheme-specific Exceptional Processes and Emergent Unmarkedness in Vowel Harmony.
Prince, Alan, and Paul Smolensky 1993 Optimality Theory: Constraint interaction in generative grammar. Rutgers University Center for Cognitive Science Technical Report 2.

Nativeness, Deviance and Ownership: A Response to Singh

Pingali Sailaja

R. Singh’s (2007) article on Indian English (IE) raises questions regarding the nature of Indian English, which is still a much-debated topic today. The views that there are unique structures in Indian English and that it is a deviant variety have been stated in the literature both explicitly and implicitly. Singh attempts to provide a counter-point to these ideas. In recent times, the question of nativeness has taken hold, primarily because speakers of many varieties of English feel a sense of ownership with regard to the language. There are those who believe that IE is a native variety, and such a belief, in a scientific enterprise such as linguistics, must be backed by argument and evidence. It is in these that Singh’s article provides some directions. The issues raised are basic ones, and clarity on them is essential for a discussion or description of IE. Of particular interest is the idea of the native speaker, especially in the context of English.

I am in agreement with much of what Singh says in his article. I do, however, believe that there is some amount of complexity that linguists and those attempting to describe a variety such as this must take into account. My response to Singh’s article is restricted to three aspects of IE: nativeness, Indian English as a variety of English and the notion of deviance, and the socio-political considerations that lead to ownership.

A major portion of Singh’s article focuses on whether IE is a ‘non-native variety’, and how the ‘native speaker’ of a language is to be identified. In order to determine this, we need a definition of what constitutes a native variety, what constitutes a non-native variety, and who a native speaker is. Unless we have definitions for these ‘ideas’ we cannot classify a variety in any way. It would be to put the cart before the horse.
We do not have an unambiguous definition of what constitutes a native variety/speaker. Since we are not clear on this score, I believe Singh’s attempt at trying to understand this is important. This is a complex notion and is by no means a settled issue. Native and L1 (first language) versus non-native and L2 (second language) come from different sources. Native and non-native come from linguistics and L1 and L2 are from pedagogy. And mother

tongue is used in common parlance. These sets of terms are not necessarily used synonymously in the literature. Without confusing theoretical aspects with pedagogical ones (a confusion that Singh identifies as the problem with those who write on IE), it is important to keep in mind that one’s first language is not necessarily one’s ‘native language’. Moreover, the first language changes for a given individual depending on the use a language is put to, much in the manner of passive and active vocabulary. When Singh says that a ‘native speaker’ of a language must use it, this is probably what is meant. As for who a native speaker may be, Singh’s working definition is a good starting point (p. 36). The definition correctly emphasises the necessity of stable and consistent judgements. I am not, however, fully convinced about the comparison that Singh warns us against, that of non-native speaker performance with native speaker competence, not because such a caution is unwarranted but because there is greater complexity here than is usually acknowledged. It is true that ‘non-native speakers’ do not necessarily accept the structures they produce; this is also noticed by Suman (2005), who studies Hindi-English code-switching. The Hindi-English or Hinglish structures produced by the speakers were rejected by the very people who used them in the first place. But the difference between acceptability and grammaticality appears to be lost sight of in studies that make use of judgements. The acceptability of an utterance is determined by several factors, of which grammar is just one. This difference between acceptability and grammaticalness goes back to Chomsky (1965: 10-12). It is extremely unlikely that speakers have perfect access to the grammar of structures and are able to pronounce judgements on grammaticality. Rather, they are able to comment on their acceptability. Therefore, such judgements cannot always be used to determine the grammar of a variety.
Furthermore, in test conditions especially, the purist cap is donned. We need to take ‘grammaticality judgements’ with a great deal of caution, although we cannot abandon them. As a consequence, it is important to keep in mind which structures speakers are USING consistently, not just what they are judging consistently. As to the question of whether IE is a ‘deviant’ variety, this is closely related to whether one believes that it is non-native. Those who would characterise it as native cannot then call it deviant. Irrespective of whether Indian English is native or not, it has been the endeavour of many to find common patterns. This attempt is not complete and is ongoing. It is acknowledged by many that there are features that are typically Indian, at all

levels—phonetics, phonology, lexis, syntax and discourse. The point therefore is: if these are consistent and systematic, and are unique to India, then surely a ‘legitimate’ variety of English, which we may call Indian, exists? And since only Indians use it, it must be their native language. And therefore, IE must be a native variety. Let us examine the idea that Indian English is a deviant variety. Let us use the example of Wh-questions, which most people favour when classifying IE as deviant (for example, Verma 1978). The structure-type:

(1) Where you are going?

is quite common in IE. While this is ‘deviant’ from standard British or American English, it is stable and consistent in use and in judgements for many Indian English speakers. This makes it a native form for many Indians. In turn, however, there are Indians for whom the structure

(2) Where are you going?

is the appropriate form, equally stable and consistent as the earlier example. It is evident that these Indians do not speak a ‘deviant’ form as defined above, nor by any stretch of imagination can it be claimed that they speak standard British or American English (more on this below). This must be seen rather as a standard versus non-standard form of Indian English, for both of which there are consistent judgements and one finds consistent use. Bhatt (2000, 2004) maintains that Indian users of English move between the standard and the non-standard and both are available to them. Another example of such a difference is illustrated below. (3) and (4) exemplify indirect questions. The former is normal for some speakers while the latter is correct for others.

(3) I asked him what did he want for his birthday.

(4) I asked him what he wanted for his birthday.

Indian English speakers do not just move from standard to non-standard, but actually have structures from both varieties. Thus, there are also speakers who would consistently use: Where are you going? but I asked him what did he want for his birthday. I have heard these two forms being used consistently by the same speaker.

It is indeed interesting that many Indian users of Indian English believe that they speak a variety that is not Indian. Further, it is those who believe that their form is close to or the same as native English(es) who consider IE to be deviant. In a classroom, the structures that a conscientious teacher would employ would be (2) and (4) above, and she would assiduously correct a student who uses (1) and (3). We thus need to distinguish between standard and non-standard varieties of Indian English, the former being used and imparted in the classroom, as is done for any language. The problem arises because we do not have written grammars of IE. Until we begin to write these grammars, especially for syntax because that seems to be the crux of the problem, such dichotomies will persist. Nobody disputes that IE is phonologically unique, and that its lexis is unique too in that it is different. Now, the next question is whether standard Indian English has features that make it different from native varieties of English. Take the example of the verb gift. Those IE speakers who consider their English to be equal to standard British or American English would be surprised to find that a structure such as

(5) I gifted him a new bicycle for his birthday.

is actually very Indian. The process of standardisation (not Indianisation) of this verb is almost complete. In their studies of verb complementation patterns, Olavarria de Ersson & Shaw (2003) and Mukherjee & Hoffmann (2006) demonstrate the differences between British and Indian standard Englishes. These works also provide a direction for the writing of the grammars of IE. As for the lexis of IE, while Singh says that the unique lexical items are not unexpected, I would go so far as to say that they must exist. If there were no typical IE lexical items at all, then we would have to claim that Indians speak standard British English. Variation in a language is seen at many levels and lexis is one of the most important of these. Indian English lexis is different not only because of the Indian words that have been assimilated into IE, but also because there are words that are used differently in IE. An example is shift, which is used in the context of moving house. (Incidentally, Tyneside English also uses this verb in the way seen in IE; Trousdale 2006.) More examples are seen in Nihalani et al. (2005). With regard to morphology, while it appears prima facie that IE morphology is no different from the morphology of standard Englishes, this requires further investigation. There are for example some

preferred structures such as Delhiite, Bombayite, hostelite etc., in IE that are not the preferred structures in standard British English. The question is (assuming that the rules for the construction of these adhere to standard British English), whether IE must conform to such rules. Let us assume for the sake of argument that we will discover structures of morphology (we know that many such do exist for syntax) that do not conform to native structures. Such a discovery, in and of itself, should not make one categorise IE as deviant. Why not, as demonstrated above, determine the acceptability (as standard or non-standard) in India? The problem of ‘native’ versus ‘non-native’ and ‘deviant’ versus ‘pure’ arises because one is constantly looking to the norms of native varieties to legitimise (or delegitimise) Indian English. While there may be specific differences (as there should be), what is important is the point made by Singh on pp. 37-38 that there are no structural features that characterise all native varieties as opposed to all non-native varieties. If Indians spoke English EXACTLY as the British then that would hardly be ‘native’ to Indians and there would be no ‘Indian English’ at all. I now come to the issue of ownership. To categorise IE as non-native because it was not originally of this country is extremely problematic. If history as we know it today is true, then Sanskrit, the mother of many Indian languages, is also not of this country, but the ownership of Sanskrit is zealously guarded! How many thousands of years must pass before a language may be called one’s own? Just because we know in recent history that English is an outsider’s language, does that make it any less Indian? For the modern generation, it is as much their language as any other language they may speak. 
Having said that, it must however be noted that while many Indians consider English to be their first language, they do not return this as their mother tongue in the census reports (Sailaja 2009). This stems from various factors: what one is told to believe, a sense of belonging, and one of identity in a country where community politics plays a very important role, and ‘mother’ is worshipped in various forms (Brass 2004). These are often not really concerns of a linguist who purports to describe a variety as form. Identity politics has its place but must not be confused with linguistics. Identity is not only about a sense of belonging but also being part of a culture and a social group. Unfortunately, English does not offer that in a serious way in India. There are urbanised and westernised classes who seem to consider English to be their language and belong to a secular culture, but this phenomenon is still to be examined in detail. Sreetilak’s (2007) study of IE films is an attempt to portray the social life of English,

which is very real for many outside of films. Yet, it is not enough. As of now, we must consider English as a language (with its linguistic features) as the sole determining factor for nativeness or otherwise.

References
Bhatt, Rakesh M. 2000 Optimal expressions in Indian English. English Language and Linguistics 4: 69-95.
Bhatt, Rakesh M. 2004 Indian English: Syntax. In The Handbook of Varieties of English: vol. 2, Morphology and Syntax, Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar W. Schneider and Clive Upton (eds.), 1016-30. Berlin: Mouton de Gruyter.
Brass, Paul R. 2004 Elite interests, popular passions, and social power in the language politics of India. Ethnic and Racial Studies 27: 353-375.
Chomsky, Noam 1965 Aspects of the Theory of Syntax. Cambridge: MIT Press.
Mukherjee, Joybrato & Sebastian Hoffmann 2006 Describing verb-complementational profiles of New Englishes: A pilot study of Indian English. English World-Wide 27: 147-73.
Nihalani, Paroo, R.K. Tongue, Priya Hosali & Jonathan Crowther 2005 Indian and British English: A Handbook of Usage and Pronunciation. New Delhi: Oxford University Press.
Olavarria de Ersson, Eugenia and Philip Shaw 2003 Verb complementation patterns in Indian Standard English. English World-Wide 24: 137-61.
Sailaja, Pingali 2009 Indian English. Edinburgh: Edinburgh University Press.
Singh, Rajendra 2007 The nature, structure, and status of Indian English. Annual Review of South Asian Languages and Linguistics 2007, Rajendra Singh (ed.): 31-44. Berlin: Mouton de Gruyter.
Sreetilak, S. 2007 Fiction in Films, Films in Fiction: The Making of New English India. New Delhi: Viva Books Private Limited.
Suman, Mickey 2007 An Acceptability Study of Hinglish. Unpublished M. Phil. Dissertation, University of Hyderabad.
Trousdale, Graeme 2006 How many Englishes are there? Talk given at the University of Hyderabad. 27 February.
Verma, S.K. 1978 Syntactic irregularities in Indian English. In Indian Writing in English: Papers Read at the Seminar on Indian English held at CIEFL, Hyderabad, July 1972, Ramesh Mohan (ed.), 207-20. Bombay: Orient Longman.

Appendices

Announcements

The Gyandeep Prize

We are happy to re-announce the continued availability of this annual prize of CDN$400. It is awarded by The Gyandeep Foundation (Tupper Avenue, Montreal) to the most outstanding student contribution to ARSALL.

Housekeeping

As it is still our intention to bring out future issues earlier than November/December of each year, potential contributors to ARSALL should get in touch with the editor as soon as possible. Our new deadlines are:

November 1: Initial submission. Must be in pdf format, preferably made from the letter-size template (for contributions to edited volumes) furnished by Mouton (see below).
March 1: Final submission. The final version of an accepted paper MUST be done on the Mouton template (letter-size) and fully respect the Mouton style-sheet. It should be sent to the editor, after all revisions and corrections, as a pdf file together with the Mouton-template word-file from which the pdf was made.

Papers submitted after these deadlines will be processed, but only for a later issue. A paper initially submitted after November 1, 2010, for example, will be considered only for the 2011 issue of ARSALL.

Mouton web-site: http://www.degruyter.com/cont/imp/mouton/moutonAuthors.cfm

We want to emphasize that we reserve the right not to process papers requiring unnecessary editorial work. We would also like potential contributors whose primary language is not English to have their initial submissions looked at by a competent writer of English.

Notes on Contributors

Luc Baronian, Ph.D. (Stanford) is Assistant Professor of linguistics at Université du Québec à Chicoutimi. He has published on theoretical issues in phonology (consonant shift, clitic contractions) and morphology (verb systems, paradigm gaps), as well as in dialectology (North-American French and Armenian).

Fida Bizri, Ph.D. (EPHE, Paris) teaches Sinhala at the Institute of Oriental Languages and Civilizations, Paris. Her main research interests are Semitic languages (more specifically Arabic), Sinhala, Buddhist literature, and diglossia.

Graeme Cane, Ph.D. (Strathclyde) has taught English language and linguistics at universities in Colombia, Saudi Arabia, Papua New Guinea, Brunei, Japan, Oman and Singapore. He is currently the Head of the Centre of English Language at the Aga Khan University in Karachi.

Probal Dasgupta, Ph.D. (New York University) is Professor of linguistics at the Indian Statistical Institute, Kolkata. His best known books include Kathaar kriyaakarmo (1987), The otherness of English: India's auntie tongue syndrome (1993), Primico (1977), and Projective syntax: theory and applications (1989).

Niladri Sekhar Dash, Ph.D. (Calcutta University) works at the Linguistic Research Unit, Indian Statistical Institute, Kolkata. He has published several books and papers in various international and national journals. He is particularly interested in corpus linguistics.

Andrew Hardie is a lecturer in the Department of Linguistics and English Language at Lancaster University. He specialises in corpus linguistics and his recent research interests include automated part-of-speech tagging and other forms of corpus-based analysis of the languages of South Asia (especially Urdu and Nepali).


Peter Hook, Ph.D. (Univ. of Pennsylvania) is Professor Emeritus at the University of Michigan and Visiting Scholar at the University of Virginia. His research interests include Indo-Aryan languages, linguistic typology, semantics and world poetry. [[email protected]].

Otto M. Ikome, Ph.D. (Montreal) is Associate Professor of Linguistics and Translation at Télé-université, Université du Québec à Montréal. He has developed a series of distance learning courses on English for Specific Purposes, Varieties of English, Business English, as well as French, English and Spanish translation courses distributed by Université du Québec. [[email protected]].

Kazuyuki Kiryu is Associate Professor in the Department of Environmental Design for Special Needs, Mimasaka University, Japan. He has been working on Tibeto-Burman languages, Newar and Meche from descriptive and functional perspectives. He is also doing research on tense and aspect within a contrastive linguistic approach, comparing Japanese with English and Asian languages. [[email protected]].

Ram R Lohani is a lecturer in the Linguistics Department at Tribhuvan University, Kathmandu. His interests are in theoretical linguistics and quantitative studies of linguistic phenomena.

Shakuntala Mahanta, Ph.D. (UiL-OTS, The Netherlands) is a Senior Lecturer in Linguistics at the Department of Humanities and Social Sciences of the Indian Institute of Technology, Guwahati, and is the author of Directionality and Locality in Vowel Harmony (2008, LOT: Netherlands).

Nirmalangshu Mukherji, Ph.D. (Waterloo) is Professor of Philosophy at the University of Delhi. He has held visiting appointments at several European institutions and has published extensively on the philosophy of language and on the connection between language and music. He holds the view that the universal grammar of human language extends to the formal structure of music. In his forthcoming M.I.T. Press monograph The Primacy of Grammar he suggests that the scientific understanding of human languages is restricted to the grammars of these languages. [[email protected]].

Michael W. Morgan, Ph.D. (Indiana University) has been involved in comparative, historical, and typological sign language research for fifteen years. Most recently, he has also been managing director of Ishara Foundation (Mumbai), an NGO dedicated to Deaf bilingual higher education, but will be returning to academia in the near future. [[email protected]].

Prashant Pardeshi, Ph.D. (Kobe University) is a lecturer in the Graduate School of Humanities, Kobe University. The focus of his research is on issues related to transitivity, voice and compound verbs. He is also interested in and has written about linguistic typology, areal linguistics, Japanese linguistics, and pedagogy. [[email protected]].

Bhim Narayan Regmi is a Teaching Assistant in the Central Department of Linguistics at Tribhuvan University, Kathmandu. His research interest is spoken language studied through corpus linguistics, including the basic requirements for corpus development, such as orthography, encoding systems, and issues of standards in language applications, as they apply to the languages of Nepal.

Pingali Sailaja, Ph.D. (CIEFL, Hyderabad) is Professor of English at the University of Hyderabad. She is interested in the historical, educational and linguistic aspects of English in India, and in morphology. Her previous publications include English Words: Structure, Formation and Literature (2004, Pertinent Publishers, Mumbai) and Indian English (2009, Edinburgh University Press). [[email protected]].

Yogendra P. Yadava, Ph.D. (EFLU, Hyderabad) is professor of linguistics at Tribhuvan University, Kathmandu (Nepal). His areas of interest include generative linguistics, endangered language documentation, language localization, and multilingual education. Presently he is involved with the linguistic survey of Nepal. He has half a dozen books and several articles to his credit. [[email protected]].