Chomskyan Linguistics and its Competitors
Pius ten Hacken
Published by UK: Equinox Publishing Ltd., Unit 6, The Village, 101 Amies St., London SW11 2JW USA: DBBC, 28 Main Street, Oakville, CT 06779 www.equinoxpub.com First published 2007 Paperback edition 2009 © Pius ten Hacken 2007 All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage or retrieval system, without prior permission in writing from the publishers. British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library. ISBN-13
978 1 84553 054 9 (hardback)
978 1 84553 554 4 (paperback)
Library of Congress Cataloging-in-Publication Data Hacken, Pius ten. Chomskyan linguistics and its competitors / Pius ten Hacken. p. cm. Includes bibliographical references and index. ISBN-13: 978-1-84553-054-9 (hb) 1. Chomsky, Noam. 2. Linguistics--Research--Methodology. 3. Linguistics--History--20th century. 4. Language acquisition. I. Title. P85.C47H24 2007 410’.92--dc22 2006039446 Typeset by Catchline, Milton Keynes (www.catchline.com) Printed and bound in Great Britain and the USA
Contents

Introduction
1 Research programmes
  1.1 The empirical cycle
  1.2 The role of research programmes
    1.2.1 Problems with the empirical cycle
    1.2.2 Approaches to the problems of the empirical cycle
  1.3 Truth, progress, and revolutions
    1.3.1 Truth
    1.3.2 Progress
    1.3.3 Revolutions
  1.4 Research programmes in linguistics
    1.4.1 From natural science to linguistics
    1.4.2 From paradigms to research programmes
2 The research programme of Chomskyan linguistics
  2.1 The nature of individual languages
    2.1.1 Competence versus performance
    2.1.2 Grammatical versus pragmatic competence
    2.1.3 I-language versus E-language
    2.1.4 Conclusion
  2.2 The nature of data
    2.2.1 Grammaticality judgements
    2.2.2 Corpus data
    2.2.3 Psycholinguistic experiments
    2.2.4 Conclusion
  2.3 The function of grammars
    2.3.1 Grammar and competence
    2.3.2 Idealisations
    2.3.3 The problem of indeterminacy
  2.4 The role of language acquisition
    2.4.1 Language acquisition versus use of language
    2.4.2 Linguistic universals
    2.4.3 Extension of the model
    2.4.4 Additional idealisations
  2.5 The unity of the research programme of Chomskyan linguistics from its emergence to the 1980s
    2.5.1 From Standard Theory to Principles and Parameters
      2.5.1.1 The treatment of a passive sentence in ST and P&P
      2.5.1.2 Implications of the differences
    2.5.2 The early stages of Chomskyan linguistics
    2.5.3 Conclusion
  2.6 The position of the Minimalist Program in Chomskyan linguistics
    2.6.1 Continuity and its problems
    2.6.2 Two additional questions about language
      2.6.2.1 Linguistics and brain science
      2.6.2.2 The evolutionary origin of language
    2.6.3 Extension of the model
    2.6.4 Perfection
    2.6.5 Conclusion
3 The Chomskyan revolution
  3.1 The research programme of Post-Bloomfieldian linguistics
    3.1.1 The nature and boundaries of a language
      3.1.1.1 Bloomfield’s conception of language
      3.1.1.2 Post-Bloomfieldian conceptions of language
    3.1.2 The nature of the data
    3.1.3 The status of a grammar
      3.1.3.1 Classification and prediction
      3.1.3.2 Reality of structure
    3.1.4 Approaches to non-uniqueness
  3.2 A comparison of the two research programmes
    3.2.1 Continuities
    3.2.2 Differences
      3.2.2.1 Mentalism
      3.2.2.2 Indeterminacy
    3.2.3 Incommensurability
      3.2.3.1 Expressions of puzzlement
      3.2.3.2 Observational adequacy
      3.2.3.3 Descriptive and explanatory adequacy
  3.3 Has there been a Chomskyan revolution?
4 Some modern competitors
  4.1 Lexical-Functional Grammar
    4.1.1 The crisis: psychological reality
    4.1.2 A new research programme
    4.1.3 The ‘Competence Hypothesis’ in Chomskyan linguistics
    4.1.4 Comparison of the models of LFG and Chomskyan linguistics
    4.1.5 Interaction of LFG and Chomskyan linguistics
      4.1.5.1 The interpretation of psycholinguistic data
      4.1.5.2 Language acquisition
      4.1.5.3 Some theoretical notions
    4.1.6 Conclusion
  4.2 Generalised Phrase Structure Grammar
    4.2.1 The crisis: generative grammar
    4.2.2 A new research programme
    4.2.3 Comparison of the models of GPSG and Chomskyan linguistics
    4.2.4 Theoretical discussions and incommensurability
      4.2.4.1 Wanna-contraction
      4.2.4.2 X-bar theory
    4.2.5 Conclusion
  4.3 Head-driven Phrase Structure Grammar
    4.3.1 The crisis: a ‘meta-crisis’?
    4.3.2 A new research programme?
    4.3.3 Comparison of the model of HPSG with other models
    4.3.4 Interaction of HPSG with other frameworks
    4.3.5 Conclusion
  4.4 Jackendoff’s linguistics
    4.4.1 The crisis: integrating semantics
    4.4.2 Architecture and research programme
      4.4.2.1 Syntactocentrism versus parallel architecture
      4.4.2.2 Jackendoff’s presentation of the research programme of Chomskyan linguistics
      4.4.2.3 Theory versus research programme
    4.4.3 The debate on the evolution of language
      4.4.3.1 Recursion
      4.4.3.2 Adaptation
      4.4.3.3 Evolution and architecture
    4.4.4 Conclusion
  4.5 Conclusion
5 Aspects of language development and use
  5.1 The nature of named languages
    5.1.1 Why English cannot exist
    5.1.2 Why English is a problematic notion
    5.1.3 English as a phenomenon
  5.2 Empirical aspects of language acquisition
    5.2.1 Language acquisition as parameter setting
    5.2.2 Learning strategies
    5.2.3 The critical period hypothesis
    5.2.4 Maturation versus continuity
  5.3 Second language acquisition
    5.3.1 The difference between first and second language acquisition
    5.3.2 The logical problem of second language acquisition
    5.3.3 The critical period in second language acquisition
    5.3.4 The initial state of second language acquisition
  5.4 Language change
    5.4.1 A history of I-languages
    5.4.2 An example: change of word order
  5.5 Language and communication
  5.6 Conclusion
References
Author index
Subject index
Table of figures

Figure 1.1 Data and knowledge in science
Figure 1.2 Theories and observations in the scientific process
Figure 1.3 The empirical cycle
Figure 1.4 Hierarchical organisation of observations, laws, and theories
Figure 1.5 Tree representation of the empirical cycle
Figure 1.6 Man with a suitcase (?)
Figure 1.7 The research programme in relation to the empirical cycle
Figure 2.1 Language as a mental component and its oppositions
Figure 2.2 Classification of the oppositions of competence
Figure 2.3 Competence, grammar, and data in the model of Chomskyan linguistics
Figure 2.4 The language acquisition device
Figure 2.5 Language acquisition as a process
Figure 2.6 The position of UG in the model of Chomskyan linguistics
Figure 2.7 Model of the research programme of Chomskyan linguistics
Figure 2.8 Simplified deep structure and surface structure of (79b)
Figure 2.9 Simplified D-structure and S-structure of (79b)
Figure 2.10 Model of the research programme of Chomskyan linguistics as used in the Minimalist Program
Figure 3.1 Bloomfield’s stimulus-response model
Figure 3.2 The model of Post-Bloomfieldian linguistics
Figure 3.3 Problems and mysteries for Bloomfield and Chomsky
Figure 4.1 The research programme of LFG
Figure 4.2 The research programme of GPSG
Figure 4.3 The HPSG model of science
Figure 4.4 The agreement feature of Latin bonus
Figure 4.5 The research programme of HPSG
Figure 4.6 Syntactocentric architecture
Figure 4.7 Parallel architecture
Figure 4.8 Recursion in visual grouping
Figure 5.1 Parameter setting in language acquisition
Figure 5.2 The Subset Principle
Figure 5.3 The Strong Continuity Hypothesis
Figure 5.4 Two versions of the Maturation Hypothesis
Figure 5.5 The Full Access Hypothesis for L2 acquisition combined with the Strong Continuity Hypothesis for L1 acquisition
Figure 5.6 The Incompleteness Hypothesis in L2 acquisition combined with a Maturation Hypothesis for L1 acquisition
Figure 5.7 Parameter setting as a source of change
Figure 5.8 Gradual change
Introduction

Language is a subject that many academics and students are interested in. For most of them, however, generative linguistics is a rather exotic, highly specialised area of research. The most common associations evoked by the term are a highly complex formalism, a distinction between deep and surface structure, and the name of Noam Chomsky. If the complexity of the formalism is not sufficient to put them off, they will soon discover that the label of generative linguistics covers a variety of theories in various, mutually incompatible formalisms. In the course of his career, Chomsky has proposed different theories that make different assumptions about how grammars have to be formulated. Others have proposed variants or competing theories, elaborating Chomsky’s ideas or in conflict with them. How can the results of generative grammar be used in adjacent fields if they emerge from such a complex, confusing set of theories?

The most typical way people are introduced to generative linguistics is by means of a textbook. This is ideal for undergraduate students taking a course taught by a lecturer who presents his or her own approach. They will have to accept a number of assumptions and are rewarded by being able to do research within one particular variety of generative grammar. For many research students or researchers in adjacent fields, however, textbooks do not provide the most appropriate introduction. Researchers tend to have their own background assumptions about language and want to know what distinguishes the various theories. If textbooks treat such issues explicitly at all, it is usually in a brief introduction. The present book is intended for such readers. It will explain some of the basic assumptions made in generative grammar and address the question of how different approaches within generative grammar can be grouped together.

A crucial distinction to be made in the description and comparison of different varieties of generative grammar is that between research programmes and theories. The following analogy will give the reader a first impression of this difference.
In the Netherlands, amateur football is organised in two different competitions, the Saturday competition and the Sunday competition. Traditionally, Protestants were not allowed to play on Sunday, whereas for Catholics, Sunday after church was the preferred time for sports. As a consequence, each year there are two amateur champions. It is easy to determine the national champion, however. At the end of the season, the two champions play two matches to determine which team is better.

It is not always so easy to determine which of two teams is better, however. Consider for instance the comparison of a football team, e.g. Ajax, and a handball team, e.g. Hellas. We would like to organise a match between them, but there is no meaningful framework for such a match. If Ajax play football and Hellas handball, they are continually violating each other’s rules. If both play handball, Hellas is likely to win, but Ajax is at a disadvantage because they are not able to mobilise their true potential. If both play, for instance, hockey, the match is fairer, but the result is in a sense irrelevant to the original question.

The situation of the Saturday and Sunday amateurs is typical of different theories that work on broadly the same background assumptions. The situation of Ajax and Hellas is typical of theories in different research programmes.

The concept of research programme should of course first be elaborated somewhat more before we can start using it. Therefore, Chapter 1 outlines the context in the philosophy of science that gave rise to its emergence, and relates it to a number of central philosophical issues such as truth and progress. Chapter 2 then describes the research programme of Chomskyan linguistics. The advantage of starting with Chomskyan linguistics is that it is by far the best-documented research programme. It will be shown that various stages in the theoretical development of Chomskyan linguistics can be interpreted as part of the same research programme. We can organise a match between them and designate a generally accepted winner. Chapter 3 describes and analyses the Chomskyan Revolution. This is the process by which Chomskyan linguistics emerged and gained the upper hand against its immediate predecessor, Post-Bloomfieldian linguistics. It will be shown that this was not because they confronted each other in a direct match, but because people stopped playing one of the games. Chapter 4 describes a number of variants within generative grammar. Here another advantage of starting with Chomskyan linguistics comes to the fore, because each of them defines itself in opposition to Chomskyan linguistics. For each of the approaches, the question is what their research programme is and how it differs from Chomskyan linguistics. Finally, Chapter 5 considers a number of areas of linguistic research that are not directly concerned with the description of language and shows to what extent and how they can be integrated with Chomskyan linguistics.
Each analogy has its limitations. When we use an analogy to elucidate a complex concept in simpler terms, we will invariably reach a point where the parallel ends. In the case of sports and science, a central issue that is not covered by the analogy is that of progress. Science is a rational enterprise and it results in increasing insight. The invention of a new sport or the loss of interest in an obsolete one can in general be attributed to fashion. The emergence of a new research programme cannot be explained in such terms. Instead, it is a response to a crisis in the old one. The abandonment of a research programme, too, owes less to fashion than to rational decisions. Therefore, the issues of progress, crisis, and rationality will play a central role throughout this book.

In describing different research programmes, I have attempted to approach each research programme in its own terms, rather than looking for analogies for central concepts in Chomskyan linguistics. My point of departure was that a framework which attracts a number of intelligent people for a prolonged period cannot be incoherent or irrational. I hope therefore that each research programme will come across as a rational approach to linguistics.

In many cases, discussions about research programmes can be very lively. I have therefore made an effort to present the arguments in the original terms so that the reader can form their own idea of how convincing they are. To this end I have made extensive use of quotations. These quotations are numbered for ease of reference. The same type of numbering is used for examples, definitions, and lists. I therefore always enclose quotations in quotation marks to avoid confusion with other numbered items. Together the numbered items will also indicate the structure of each section. In quotations I always use the exact typographical devices of the original. The meaning of small caps, italics, single or double quotation marks, etc., is not explicitly marked in a text. Normalisation to, for instance, italics for emphasis would have imposed a certain interpretation on the text. The only operations applied to quotations are the division into a, b, etc. for ease of reference, and the use of […] to mark skipped passages. The way footnotes in the quoted text are dealt with is always described explicitly.

The research leading to this book started as part of my Habilitation at the Universität Basel (1995–2000). I am grateful to my colleagues in Basel for discussion and support. I benefited also from various discussions with Fritz Newmeyer and critical remarks by John Joseph on my early conceptualisation of the material. I am grateful to Equinox Publishers for reviving my interest in the topic by asking me to write this book. Their guidelines prompted me to describe my findings in a way that has little in common with the concise treatment in the first two chapters of my Habilitationsschrift. Without the support of my colleagues at Swansea University, this book would not exist.
The research leave I had in 2005/06 was crucial in the writing process. I count myself very fortunate to have been able to show earlier drafts of (parts of) this book to Ray Jackendoff, Sarah Kennedy, Ellen-Petra Kester, Sixta Quaßdorf, Andrew Rothwell, Peter Sells, and Alona Soschen. I am grateful for their helpful remarks, criticism, and encouragement, and for providing me with literature I could not access otherwise. The approach I take in this book was formed in large part by my teachers of linguistics at the Universiteit Utrecht. In particular Riny Huybregts gave me the tools to develop in this direction. Of course the viewpoints taken as well as any remaining errors in the final result are my own responsibility.
1 Research programmes
The study of science is a large field encompassing both epistemological and historical aspects. An ideal approach to this field would on the one hand state what qualifies as a science and how a science is built up, on the other hand show how this matches actual historical development. There is a large volume of studies addressing these questions in philosophy and history of science. The purpose of this chapter cannot be to provide a full introduction to these fields. For general introductions, see for instance Couvalis (1997) and Ladyman (2002). At the same time, the study of Chomskyan linguistics and its competitors cannot be undertaken without the use of some concepts emerging from these fields. The purpose of this chapter is to introduce the relevant terms, explain their meaning and contextualise the corresponding concepts. The central concept to be introduced here is that of research programme. This concept emerged as a response to certain problems with the common-sense view of science. Therefore, Section 1.1 spells out some of the common assumptions as to how science is supposed to work. Section 1.2 starts by showing the logical problems raised by this view of science and proposes a solution in the form of research programmes. As will be shown, assuming research programmes does not only solve the logical problems but also explains why these problems are not usually observed as such. Section 1.3 then addresses a number of other key concepts in the philosophy of science which play an important role in the present study. They include truth and its relation to the goal of science, progress as a measure of success, and revolutions as perhaps the most controversially discussed historical phenomenon in science. Finally, Section 1.4 narrows the focus to the study of language, considering to what extent it poses specific problems in the application of these terms.
1.1 The empirical cycle
A prototypical example of a science is astronomy. Astronomy is concerned with planets, stars, and galaxies. It will be useful to refer to examples from astronomy to illustrate statements about science, before drawing parallels with linguistics, because astronomy is one of the areas to which the philosophy of science has devoted most analysis and discussion. Considered at a highly abstract level, a science can be characterised as in Figure 1.1.
[Figure: Data → Scientific process → Knowledge]
Figure 1.1: Data and knowledge in science
Applied to astronomy, the three elements in Figure 1.1 can be identified as follows. The data are the results of various types of observation of the sky, including naked-eye observations, telescope observations, and observations with various other instruments. Knowledge includes theories covering these observations, for instance a model of the solar system with the orbits of the planets, a model of the physical processes involved in a black hole, or a model of the Big Bang at the start of the universe. The scientific process is the procedure for producing knowledge, i.e. theories, on the basis of data, i.e. observations. It is the structure of this process that we should try to understand.

In a classical exposition, Nagel (1961) describes science as in (1).

(1) a. ‘Scientific thought takes its ultimate point of departure from problems suggested by observing things and events encountered in common experience;
    b. it aims to understand these observable things by discovering some systematic order in them;
    c. and its final test for the laws that serve as instruments of explanation and prediction is their concordance with such observations’. [Nagel (1961: 79)]¹

In (1a), Nagel establishes observations as the starting point of the scientific process. That these observations should be ‘encountered in common experience’ is true if we look at the ‘ultimate point of departure’, but once the scientific process gets underway, it is the entire body of existing knowledge which informs the problems and the observations referred to. It is not possible
to observe black holes unless we already have a sophisticated knowledge of astronomy. An unskilled observer will not see them. In (1b), the focus is the nature of the theories as formulations of the knowledge resulting from the scientific process. If we have sufficiently many and sufficiently accurate observations of the position of Mars, we may discover that it is in an elliptical orbit around the sun. This orbit is a model describing the systematic order underlying the observed positions. In (1c), attention shifts to the scientific process itself. When we have discovered a pattern, it does not only cover the observations that were used to find it, but also suggests new observations. We can then check whether these new observations are correct. If we have a theory of the movement of the moon and of the sun, we can predict the exact time of an eclipse and check whether it occurs at that time. Given this interaction between theories and observations, a good starting point for the representation of the scientific process is Figure 1.2.
[Figure: Observations1 → (arrow 1) → Theory1 → (arrow 2) → Observations2 → (arrow 3) → Theory2]
Figure 1.2: Theories and observations in the scientific process
In Figure 1.2, one and a half cycles in the interaction of theories and observations are represented. The arrow marked 1 represents the formulation of a first hypothesis, theory1, about the underlying system for a set of observations, observations1. The arrow marked 2 represents the prediction of further observations, observations2, suggested by theory1. In step 3, the evaluation of these observations2 leads to an improved theory, theory2. Further interactions will in the optimal case lead to more and more observations covered by a sequence of ever more sophisticated theories. An example of the interaction depicted in Figure 1.2 in astronomy would be the calculation of the orbit of Mars. First a number of observations are made, leading to theory1. On the basis of theory1, further observations are predicted. The result of checking these predictions is observations2. To the extent that the
predictions are not entirely accurate, they can be used to improve the original theory, leading to theory2. In the case of the orbit of Mars, a large number of cycles were necessary. The interaction of the two elliptical orbits of Mars and the Earth around the Sun, which are furthermore not in the same plane, leads to a very complex set of observations of the position of Mars from the Earth.²

In Figure 1.2, no distinction is made between theories of different levels of generality. It is common, however, to distinguish at least two levels, empirical laws and theories in the more restricted sense. Nagel (1961) does so in (2).

(2) ‘the distinction between experimental laws and theories is based on the contention that laws subsumed under the first of these labels, unlike laws falling under the second one, formulate relations between observable (or experimentally determinable) traits of some subject matter’. [Nagel (1961: 81)]

According to (2), ‘laws’ and ‘theories’ are not essentially different types of entity. In Figure 1.2, we referred to both of them as ‘theories’, while in (2) Nagel calls both of them ‘laws’. The difference is that empirical or, as Nagel calls them, experimental laws are closer to observations. A typical example of an empirical law is the generalisation that Halley’s comet returns every 76 years. It contrasts with a theory of comets in the narrow sense, which sets out the nature and orbits of comets in general. In the scientific process, empirical laws play a role by mediating between observations and more abstract theories. This is often represented in terms of the empirical cycle, as in Figure 1.3.
[Figure: cycle of four boxes: Observations → Empirical laws1 → Theories → Empirical laws2 → Observations]
Figure 1.3: The empirical cycle³
In the empirical cycle, theories as represented in Figure 1.2 are separated into three boxes. The two boxes labelled ‘empirical laws’ are different in function but not in nature. Empirical laws1 are used as a basis to formulate a theory, whereas
empirical laws2 are predictions by the theory, used as a test. The generalisation about Halley’s comet can only appear in one of the two roles, but which one depends only on ‘accidents’ of the history of science. If the generalisation is used to discover a theory, it is an empirical law1. If the generalisation follows from the theory, it is an empirical law2. It is not possible to predict the role of an empirical law on the basis of what it says.

Another difference in comparison to Figure 1.2 is that in Figure 1.3 the development of the sets of observations and the versions of the theory are not represented. Obviously, the empirical cycle is not meant as static, but should rather be seen as ‘spiralling up’, i.e. leading to an ever larger set of observations and a more and more sophisticated theory.

So far, we have considered science as a process. Another perspective on science is as a body of knowledge. In this perspective we also find theories, empirical laws, and observations, but their relationships are static. Thagard (1988: 14) represents them as in Figure 1.4.

[Figure: tree with Theory at the top, two Laws below it, and two Observations below each Law]
Figure 1.4: Hierarchical organisation of observations, laws, and theories
In Figure 1.4, science is modelled as consisting of observations and generalisations of different levels. In replacing the cycle in Figure 1.3 by a hierarchy, Figure 1.4 highlights the fact that several observations are covered by an empirical law and several laws by a theory. What is striking in Figure 1.4 is the parallelism between the relations of law to observations and theory to law. In 1682, when Edmond Halley observed the comet that would later be named after him, he combined this observation with earlier reports from 1456, 1531 and 1607 and formulated a law about the regular occurrence of this comet. He predicted that it would reappear in 1758 (cf. Taylor (1998: 95) for more details). In the same way, a theory of comets is based on a large number of observations and laws and predicts other observations and laws. Once the generalisations and the theory have been formulated and tested, the difference between empirical laws1 and empirical laws2 in Figure 1.3 is no longer important. It only plays a role in the discovery process. We can see Figure 1.3 as a representation of the scientific process and Figure 1.4 of the system of knowledge resulting from it.
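
The step from observations to an empirical law, and from the law to a predicted observation, can be illustrated with a small sketch. It uses only the four sighting years mentioned above and assumes, purely for illustration, that the law is the mean interval between successive sightings. The function names `formulate_law` and `predict_next` are hypothetical, and the averaging is not a reconstruction of Halley’s own calculation.

```python
from statistics import mean

# Sighting years combined by Halley, as given in the text.
observations_1 = [1456, 1531, 1607, 1682]

def formulate_law(years):
    """Toy empirical law: the mean interval between successive sightings."""
    intervals = [later - earlier for earlier, later in zip(years, years[1:])]
    return intervals, mean(intervals)

def predict_next(years, period):
    """Toy prediction: the last sighting plus the period stated by the law."""
    return years[-1] + period

intervals, period = formulate_law(observations_1)
prediction = predict_next(observations_1, period)

print(intervals)          # [75, 76, 75]
print(round(period, 1))   # 75.3
print(round(prediction))  # 1757
```

Under this deliberately naive assumption the predicted return falls in about 1757, within a year of the 1758 that Halley predicted. The sketch only shows how a regularity extracted from a handful of observations yields a new, checkable observation.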
As Figure 1.3 and Figure 1.4 give different perspectives on the same model, it is interesting to see how the directionality of the empirical cycle can be represented in the hierarchical structure, yielding Figure 1.5.

[Figure: tree with Theory at the top; Law1 and Law2 below it, each with Observation1 and Observation2 below; arrows marked 1 and 2 attached to Law2]
Figure 1.5: Tree representation of the empirical cycle
In Figure 1.5 the subscripts have been used in much the same way as in Figure 1.3, but their use has been extended to observations. For each law, observation1 stands for the observations used in formulating it and observation2 for observations predicted by it. In the case of Halley’s comet, the observations in 1456, 1531, 1607 and 1682 were observations1, and the appearance in 1758 was an observation2.

The arrows marked 1 and 2 in Figure 1.5 point to an interesting difference in status of law2, depending on which of the two arrows was there first. As an example let us consider the phases of Venus. As described in more detail by Moore (2002: 34–41), Venus has phases in the same way as the moon. If it occurs as the Morning Star, it is waxing and if it appears as the Evening Star, it is waning. These phases can be explained as a consequence of the orbits of Venus and the Earth around the Sun. The generalisations about the phases of Venus take the role of law2 in Figure 1.5.

If arrow 1 was historically first, this means that the phases of Venus were discovered and their regular occurrence described, for instance, in a table, before any explanation of the phenomenon was available. At that point, the table of phases is an empirical law without a theory covering it. If subsequently a theory of planetary movement is formulated on the basis of other observations, this table can become a law2. It provides so-called ‘external evidence’, i.e. evidence that was not used in formulating the theory.⁴

An alternative historical possibility is that the theory of planetary movement precedes any observation of the phases of Venus. In this case nobody has seen these phases before a theory has been formulated from which their existence follows as a consequence. In this case, arrow 2 appears first and arrow 1 does not appear at all, because all observations corresponding to law2
are triggered by the law. In this case, these observations also provide external evidence for the theory.

The importance of historical precedence, while undoubtedly significant for individual researchers’ careers, is not so great if we consider a science as a body of knowledge. In Figure 1.4 the difference between the alternatives sketched here is entirely neutralised. There is no temporal sequence in Figure 1.4, but only a static picture of the relations between theories, laws, and observations as it is at a particular moment.

The empirical cycle is central to how most scientists perceive science and their role in it. Figure 1.2, Figure 1.3, Figure 1.4 and Figure 1.5 all represent the empirical cycle. They are not equivalent: each leaves out some aspects and emphasises others. Figure 1.2 represents the sequence of theories, but leaves out empirical laws. Figure 1.3 represents the two roles of empirical laws in the scientific process, but leaves observations as a generic category. Figure 1.4 represents the state of knowledge in a science at a particular time, but leaves out the process. Figure 1.5 reintroduces this process, but does not represent the feedback function of predictions. Together they give a fairly complete picture of the nature of scientific knowledge and of the scientific process as perceived by most scientists.
1.2 The role of research programmes
Science is generally considered to be the most rational human enterprise. As such, it is assumed to be based on objective observations and repeatable experiments, from which theories are derived by a process of sound logical argumentation. If the empirical cycle is taken as a model for the scientific process, we expect to find these rigid conditions on observation and reasoning incorporated in the model. In fact, matters are not so simple. As shown in Section 1.2.1, the empirical cycle raises a number of problems. Section 1.2.2 presents some approaches to handling these problems.
1.2.1 Problems with the empirical cycle
If we look at the empirical cycle as represented in Figure 1.3 from a strictly logical point of view, we are faced with questions such as (3). (3)
a. Which of the many possible observations are worth recording as data?
b. Which aspects of the selected data should be taken as a basis for a generalisation?
c. When is a theory deep enough to be considered explanatory?
Question (3a) seems to be simple enough. In an objective scientific approach, everything we can observe should be accounted for in some way. There are a number of problems involved in this assumption, however. First, as argued extensively by Jackendoff (1983), any act of perception involves an act of interpretation.
Figure 1.6: Man with a suitcase (?)
In Figure 1.6 we see a man with a suitcase. In doing so, we interpret the torso, arms and head as belonging together with the legs. We assume the body of the man continues behind the suitcase. In the picture, however, nothing of the kind can be detected. Nevertheless, we cannot help seeing a man with a suitcase in Figure 1.6. Only on reflection do we see that the elements of the picture we interpret as the body of the man are not a continuous whole but are actually interrupted by the suitcase. This kind of automatic inference is the result of the structure of our cognition and of our knowledge of the world. Observation cannot be objective in the sense of being determined by the object observed rather than the subject observing, because observation depends crucially on the subject’s activity. If knowledge of the world determines how we observe reality, it is clear that previous experience influences our observations. Becoming an astronomer involves, among other things, learning how to see through a telescope and interpret the visual input it gives.
The concept of ‘possible observation’ as mentioned in (3a) also refers to the active role of the observer in producing the observed phenomena by carrying out experiments. The idea of carrying out controlled experiments to be able to make specific observations seems to have emerged in the Renaissance period (cf. Rossi, 1997). We should distinguish experiments from the use of instruments for observations. Using a telescope to see more details on the Moon is not an experiment. An experiment produces phenomena that would not be there otherwise. Determining the composition of a Moon rock by bringing it into contact with indicator substances is an experiment because it produces controlled observations of events that do not occur in this way in nature. One of the problems highlighted by (3a) is that we have to know in some way which experiments to carry out.
Problem (3b) emerges when we have established a set of data. There are indefinitely many generalisations compatible with a finite set of data. They will imply different things about what is not in the initial set of data, but none of them is refuted by the initial set. To illustrate this, consider the correlating pairs in (4). (4)
a. 5 green
b. 8 red
c. 12 red
d. 15 green
Some possible generalisations are given in (5). (5)
a. Even numbers are red, odd numbers are green.
b. Only numbers divisible by 5 are green and only those divisible by 4 are red.
c. Numbers between 7 and 14 are red, higher and lower numbers are green.
d. Numbers with three prime factors are red, prime numbers and products of two primes are green.
All generalisations in (5) are compatible with the data in (4), but they make different predictions as to the colour associated with numbers not given in (4). For 6, (5a) predicts red, (5c-d) green, and (5b) neither red nor green. Adding more pairs to our set of data in (4) will eliminate some generalisations, but as there are infinitely many generalisations, we can never reduce them to a manageable set. We have to have recourse to a notion of ‘sensible’ generalisation. The example in (4) highlights this problem by presenting the starting point as a set of decontextualised pairs.
The final problem, (3c), concerns the depth of explanation. If we consider the appearance of Mars in different positions in the sky at different points in time as a set of data, a theory consisting of elliptical orbits of Mars and the Earth around the Sun can be said to explain the data. This is the level of explanation reached by Kepler, who calculated the orbits and proposed an elliptical rather than circular form. What Kepler did not explain is why Mars and the Earth are in these orbits. A deeper explanation addresses this question. Newton introduced the concept of gravity to explain how the planets remain in their orbits. This is a deeper explanation of the phenomena than Kepler could give, but it raises further questions, for instance, how the planets got into these orbits, what the nature of gravity is, etc. For several centuries after Newton, these questions were at most the subject of speculation. Even if we have answers to them, they will inevitably raise further questions. In fact, every explanation is incomplete and subject to deeper questions. When an explanation is acceptable, this is not the result of these questions bottoming out but of the scientific community being satisfied with the level of depth.
In conclusion, there is a discrepancy between the general acceptance of the empirical cycle by practitioners of science as a basis for the rational conduct of scientific research and the logical problems highlighted by the questions in (3). This discrepancy emerges quite clearly once we try to emulate scientific reasoning in a computer program. Thagard (1988) describes such an approach, in which all hidden knowledge used in the various steps of the empirical cycle has to be made explicit to the computer.
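The problem raised by (3b) can be made concrete in a few lines of code. The following Python sketch is an illustration only, not part of the original argument: it encodes the data in (4) and the four generalisations in (5), checks that each generalisation fits the data, and then asks each one for the colour of 6. The function names, the reading of ‘between 7 and 14’ as excluding the end points, the reading of ‘three factors’ in (5d) as three prime factors counted with repetition, and the treatment of cases a rule does not decide are all assumptions made for the sake of the example.

```python
# Data set (4): each number paired with its observed colour.
DATA = {5: "green", 8: "red", 12: "red", 15: "green"}


def prime_factor_count(n):
    """Count the prime factors of n with repetition (12 = 2*2*3 gives 3)."""
    count, divisor = 0, 2
    while n > 1:
        while n % divisor == 0:
            n //= divisor
            count += 1
        divisor += 1
    return count


def rule_5a(n):
    # Even numbers are red, odd numbers are green.
    return "red" if n % 2 == 0 else "green"


def rule_5b(n):
    # Only multiples of 5 are green, only multiples of 4 are red.
    # Assumption: numbers the rule does not decide (multiples of
    # neither, or of both) get no colour, represented as None.
    by_five, by_four = n % 5 == 0, n % 4 == 0
    if by_five and not by_four:
        return "green"
    if by_four and not by_five:
        return "red"
    return None


def rule_5c(n):
    # Numbers between 7 and 14 are red (end points excluded by
    # assumption), higher and lower numbers are green.
    return "red" if 7 < n < 14 else "green"


def rule_5d(n):
    # Three prime factors: red. Primes and products of two primes: green.
    # Assumption: any other number gets no colour (None).
    count = prime_factor_count(n)
    if count == 3:
        return "red"
    if count in (1, 2):
        return "green"
    return None


RULES = {"5a": rule_5a, "5b": rule_5b, "5c": rule_5c, "5d": rule_5d}

# A rule is compatible with (4) if it reproduces every recorded pair.
compatible = {
    name: all(rule(n) == colour for n, colour in DATA.items())
    for name, rule in RULES.items()
}

# The rules agree on the data but not on an unseen number such as 6.
predictions_for_6 = {name: rule(6) for name, rule in RULES.items()}

print(compatible)
print(predictions_for_6)
```

Nothing in the four data points selects one of these rules over the others. Any preference among them has to come from outside the data.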
1.2.2 Approaches to the problems of the empirical cycle
If we want to account for scientific practice, we have to explain why the empirical cycle is perceived as governing it, when questions such as (3) are not usually considered. One approach to this problem is to deny any special status to science as an activity. Woolgar (1988), for instance, analyses science as a social activity like any other social activity. He argues, among other things, that a proof by means of logical rules is in fact conventional (1988: 45–48), and that what is presented as the discovery of an object is rather its creation in the communicative interaction of scientists (1988: 61–65). In this approach, science is studied in the same way as an ethnographer studies a ritual in a remote tribe. In both cases there is a set of activities governed by some conventions. Participants believe that these conventions are necessary, rational, logical, or inevitable. An outside observer sees their relativity. Therefore this approach has been called relativism.
Relativism in the philosophy of science as exemplified by Woolgar’s ethnographic approach was highly popular in the early 1990s. It came under heavy attack, however, when a parody of the approach, written by the physicist Alan Sokal, was accepted for publication in one of the leading journals, Social Text. Dubois (2001: 205–256) gives an overview and an analysis of this so-called Sokal Affair.
A less radical approach to the problem of the discrepancy between the logical and the perceived status of the empirical cycle is the one by Thomas Kuhn (1922–1996). He was a physicist who had moved into philosophy of science through work on the history of science. This explains his concern to produce a practically and historically correct (rather than normative) account of science. His accessible writing style gained him a large readership but also caused serious misunderstanding.
On the one hand misunderstandings were due to his informal and sometimes imprecise way of expressing his ideas, on the other hand to the fact that his readership has sometimes been keener on discovering inconsistencies and literal falsehoods than on understanding
the intended meaning of his texts. After the publication of his The Structure of Scientific Revolutions in 1962 a large part of Kuhn’s energy and time was taken up by efforts to counter misunderstandings it had generated, both in writings and in discussions at conferences. Kuhn (1970a) is the second edition of this work, with an afterword reacting to some of the criticism that had appeared by then.
Kuhn (1970a: 10) introduces the concept of paradigm to refer to the shared, largely tacit consensus of scientists working in a particular field. This consensus restricts the choices in each step of the empirical cycle in such a way that the questions in (3) do not block the scientific process. The concept of paradigm has been at the centre of much confusion and many heated discussions and Kuhn is at least in part to blame for this himself. Masterman (1970: 61–65) identifies 21 different descriptions of what a paradigm is in the text of the first edition. Although she observes that ‘not all these senses of ‘paradigm’ are inconsistent with one another’ (1970: 65), it is obvious that the lack of a clear definition does not help the clarity of the ensuing discussion. It seems as if Kuhn is trying to evoke a concept that the reader is already supposed to know.
In the postscript added to the 1970 edition, Kuhn addresses this problem and identifies two main senses, which he labels disciplinary matrix and exemplar (1970a: 181–191). It is the disciplinary matrix that contains the elements of shared commitment which constitute the consensus necessary for the empirical cycle. In Kuhn’s words, the fact that scientists share a disciplinary matrix accounts for ‘the relative fullness of their communication and the relative unanimity of their professional judgments’ (1970a: 182). He then continues to discuss four components of different types present in the disciplinary matrix (1970a: 182–187), as listed in (6). (6)
a. Symbolic generalisations
b. Metaphysical paradigms
c. Values
d. Exemplars
The status of the list in (6) is somewhat unclear. Kuhn (1970a: 182) introduces it as ‘not […] an exhaustive list, but […] the main sorts of components of a disciplinary matrix’ and adds when reaching the fourth component that it is ‘not the only other kind but the last I shall discuss here’ (1970a: 186). To my knowledge, however, Kuhn has never suggested any other components. In the authorised exposition of his theory by Hoyningen-Huene (1989), the same four items are discussed ‘without any claim to be exhaustive’ (1989: 146). 5 With symbolic generalisations, (6a), Kuhn refers to expressions such as f = ma or ‘action equals reaction’ (1970a: 183). Generalisations such as Newton’s law relating force to mass and acceleration ‘function in part as laws
but also in part as definitions of some of the symbols they deploy’ (1970a: 183). The term metaphysical paradigms, (6b), illustrates Kuhn’s pervasive use of paradigm in slightly different senses. It refers to what Kuhn (1970a: 184) calls ‘heuristic models’ which serve as a source of ‘preferred or permissible analogies or metaphors’. They guide scientists to look for certain types of solutions. One of his examples is ‘the molecules of a gas behave like tiny elastic billiard balls in random motion’ (1970a: 184). The values in (6c) refer to the standards for assessing a theory. Kuhn (1970a: 185) mentions that theories should be ‘simple, self-consistent, plausible’, but such criteria can only become operational if there exists a certain degree of agreement about their elaboration. 6 Finally, the exemplars in (6d) are for Kuhn (1970a: 186) ‘the complete problem-solutions that students encounter from the start of their scientific education’ as well as examples of good practice in technical literature. If we take the empirical cycle as a basis, it is obvious how the elements in (6) can contribute to a solution of the problems in (3). The symbolic generalisations of (6a) provide a solid basis for explanation. As they are not questioned, the urge for an ever deeper explanation mentioned in (3c) can bottom out. The metaphysical paradigms of (6b) suggest a way of looking at the data, which will contribute to a sensible selection of data, the problem in (3a). The values and exemplars of (6c-d) will constrain any generalisation step as referred to in (3b), i.e. when we go from data to an empirical law and from empirical laws to a theory. Their roles are in a sense opposite. Whereas exemplars make positive suggestions on the basis of previous successes, values penalise proposals that are too complex or inconsistent with other accepted laws or theories. 
In brief, a paradigm or disciplinary matrix provides a solution for the main problems confronting an account of scientific practice in terms of the empirical cycle. One of the most problematic properties of paradigms is that there may be more than one of them for the same science. These may be valid at different times. Thus, until the end of the Middle Ages, astronomy was governed by the Ptolemaic paradigm, which put the Earth at the centre of the universe and had the Moon, the Sun, the planets and the fixed stars revolve around it in circular orbits. After Copernicus, Kepler, and Newton, a new paradigm had emerged which put the Sun at the centre of the solar system and assumed elliptical orbits for the Earth and the planets. Different paradigms may also flourish simultaneously. In optics, for instance, the corpuscular theory of light, which considers light as particles and was proposed by Newton in his Opticks, published in 1704, and the wave theory of light, developed by Christiaan Huygens in the seventeenth century, co-existed for a long time. As described by Hoyningen-Huene (1989: 143f.), Kuhn first assumed that a paradigm must be adopted by all practitioners of a particular field. If there is no consensus in this respect, there is no paradigm. This state of mind is what
underlies Kuhn’s (1970a: 11–15) description of the crucial distinction between the pre-paradigmatic period and the period following the emergence of the first paradigm. In the pre-paradigmatic period, scientists concentrate on discussions about fundamental concepts. When a paradigm has been reached, discussion can turn to productive use of the fundamental concepts. A pre-paradigmatic situation is illustrated by the early Greek philosophers. While Thales of Miletos considered water as the primitive element, Herakleitos of Ephesos proposed fire, and Anaximandros of Miletos the ápeiron (‘boundless’).
Later, however, Kuhn changed his mind. In the postscript to the second edition, he assumes that in what he used to call a pre-paradigmatic situation all different schools have their own paradigm (1970a: 178–179). He also mentions ‘the relative scarcity of competing schools in the developed sciences’ (1970a: 209), which implies that two or more paradigms can coexist in the same field. Although there is a tendency for one paradigm to emerge in a particular field, universal adoption by all scientists in a field is not a condition for recognising a paradigm.
The existence of different paradigms in the same science gives rise to a particular problem, the incommensurability of paradigms. The problem can be described as follows. If we have two theories within the same paradigm, we can determine which of the two is better by applying the evaluation criteria derived from (6c) of the paradigm. However, this method does not work if the two theories are based on different paradigms. If theory T1 is, for instance, based on the paradigm P1 that sees light as particles and theory T2 on the paradigm P2 that sees light as waves, T1 will be better than T2 if we take the evaluation criteria of P1, but T2 will be better if we take the evaluation criteria of P2. 
There is no ‘neutral ground’ that is at the same time sufficiently general to be independent of P1 and P2 and sufficiently powerful to express a judgement on the difference between theories such as T1 and T2. It is important to see the exact extent of the problem. Incommensurability makes it impossible to achieve a completely rational, decisive judgement about the comparative value of T1 and T2 in the above example, but it does not exclude communication. It is not impossible to explain T2 in terms of P1, but the theory will not appear at its best. If T2 refers to symbolic generalisations made by P2 that are not available in P1, they will have to be translated into terms accessible in P1. This will make certain statements look strange and unconvincing. As noted already, evaluation criteria may also differ and as T2 was developed with a different set of criteria in mind, it will not usually fare very well in view of P1. Most importantly, perhaps, there is no guarantee that the data and generalisations incorporated in T2 are significant in the context of P1. What counts as convincing evidence for T2 in view of P2 may be largely irrelevant in view of P1. It is this problem which most of all complicates meaningful communication across paradigm boundaries.
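The ranking reversal described above can be mimicked in a toy model. The following Python sketch assumes, purely for illustration, that each theory receives a score on two unnamed criteria and that each paradigm weights these criteria differently. All names and numbers are hypothetical and carry no historical claim. The sketch shows only the reversal of the ranking, not the problems of translation and of the relevance of data discussed above.

```python
# Hypothetical scores of two theories on two evaluation criteria (0 to 1).
THEORY_SCORES = {
    "T1": {"criterion_x": 0.9, "criterion_y": 0.2},
    "T2": {"criterion_x": 0.4, "criterion_y": 0.9},
}

# Hypothetical weights: P1 values criterion_x most, P2 values criterion_y most.
PARADIGM_WEIGHTS = {
    "P1": {"criterion_x": 0.8, "criterion_y": 0.2},
    "P2": {"criterion_x": 0.2, "criterion_y": 0.8},
}


def evaluate(theory, paradigm):
    """Return the weighted score of a theory by one paradigm's criteria."""
    weights = PARADIGM_WEIGHTS[paradigm]
    scores = THEORY_SCORES[theory]
    return sum(weights[name] * value for name, value in scores.items())


def preferred_theory(paradigm):
    """Return the theory that scores highest by one paradigm's criteria."""
    return max(THEORY_SCORES, key=lambda theory: evaluate(theory, paradigm))


for paradigm in PARADIGM_WEIGHTS:
    print(paradigm, "prefers", preferred_theory(paradigm))
```

In this toy model a verdict is available only once a set of weights has been chosen, and the model contains no third set of weights that both sides would accept as neutral.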
Incommensurability effects occur whenever theories developed in different paradigms are compared. Concrete examples of situations of this kind tend to become rather technical. Instead of elaborating an example from astronomy here, I will therefore postpone detailed discussion of examples to Chapters 3 and 4, where linguistic cases will be studied. Kuhn’s paradigms are both social and intellectual constructs. Our discussion so far has only highlighted the intellectual properties. It is the content of elements such as (6) which saves the empirical cycle while at the same time generating the problem of incommensurability. For Kuhn, however, social aspects seem to have been even more important than these. He insists, for instance, that it is not properties such as (6) which provide the primary access to the identification of a paradigm or disciplinary matrix. Instead, the identification should start with the study of social groups (1970a: 176–179). For Kuhn, the scientific community is structured hierarchically in the sense that smaller, more specialised groups are included in larger groups with a more general subject field. The highest level is the community of all natural scientists, followed by, for instance, chemists, organic chemists, protein chemists, etc., down to research groups of less than 100 people. Criteria for group membership are, for instance, subject of highest degree, membership of professional societies, journals read, and conferences attended. The approach Kuhn takes to paradigms here implies that a paradigm is what a previously identified group of scientists shares. For the purpose of this book, I would like to emphasise the intellectual component at the expense of the social one. This means that I will not adopt all aspects of Kuhn’s notion of paradigm. In order to make this difference clear, I will introduce a new name, research programme, for the concept defined as in (7). (7)
Research programme
A research programme is the set of assumptions, tacit or explicit, which make research along the lines of the empirical cycle possible.
The definition in (7) makes research programme much more general than paradigm or disciplinary matrix. Kuhn’s theory of paradigms can be seen as a specific elaboration, at least as far as the discussion of (6) is concerned. Other attempts to develop concepts that fulfil more or less the same function as Kuhn’s paradigms include Lakatos’s research programmes and Laudan’s research traditions. Lakatos (1970) uses the term research programme in a much more specialised sense than I do here. His theory is intended as a compromise reconciling the main insights of Kuhn’s theory with Popper’s (1959, 1963) view. Laudan (1977) introduces the concept of research tradition
as an alternative to Kuhn’s paradigms in order to account for the historical development in a different way. My choice of the name research programme is independent of its use by Lakatos. The name has in fact also been used pre-theoretically elsewhere, e.g. by Kasher (1991). As long as it is clear that I do not want to commit myself to all aspects of Lakatos’s theory, but take the term as in (7), this should not be problematic. Given the definition in (7), the position of the research programme with respect to the empirical cycle can be represented as in Figure 1.7.
Figure 1.7: The research programme in relation to the empirical cycle (diagram labels: Research programme; Theories; Empirical laws1; Empirical laws2; Observations)
As illustrated in Figure 1.7, the research programme delimits a particular space for the empirical cycle. As such, it influences every step in the cycle, whereas there is very little if any influence in the reverse direction.
1.3 Truth, progress, and revolutions
This section introduces three central concepts in the philosophy of science that play an important role in later chapters of this book. The relationships between them and the position they occupy with respect to the empirical cycle and research programmes will become clear in the course of their presentation.
1.3.1 Truth
A statement is true if it corresponds to reality. This is easy to apply to simple statements such as (8).
(8) The door is open.
If there is agreement about which door is meant, (8) is true when this door is indeed open and false otherwise. For scientific theories, it is often much more difficult to evaluate them in these terms. Nevertheless, an important range of theories in the philosophy of science is based on the assumption that the purpose of science is to come up with theories that represent truth. The strongest representative of this idea in the twentieth century is no doubt the logical positivism of the Wiener Kreis (Vienna Circle). In Verein Ernst Mach (1929) they present their manifesto, which contrasts their view of science with metaphysics. Ayer (1946 [1935]) explains the logical positivist view of science for a British audience. Logical positivism aims for a scientific method that by its very nature guarantees that scientific statements are true. The basic idea is to start from true observations and use only truth-preserving operations on them to arrive at scientific theories. The problems this approach encounters have a large overlap with the problems of the empirical cycle discussed in Section 1.2.1. They include the interpretation component inherent in any observation and the inevitable degree of uncertainty in the step from observations to theories. Arguably, the principal merit of the Wiener Kreis is that they made these problems explicit by their rigorous approach. Popper (1959 [1934]) can be seen as a reaction to the problems encountered in this enterprise. For Popper, it is illusory to avoid false statements in science, but properly scientific theories should be falsifiable, i.e. make predictions that may turn out to be false. This means that a scientific theory is only a hypothesis, not a verified truth. Like the Wiener Kreis, Popper was particularly concerned with the demarcation issue (1959: 34–39), i.e. determining whether a theory is scientific. 
His solution is that in order to count as scientific, a theory should specify under which conditions it has to be given up (1959: 78–92). In terms of the empirical cycle, a theory should make some of its predictions crucial, such that if they turn out to be false, the entire theory is withdrawn. This view led him to look for historical examples of crucial experiments, i.e. experiments that test crucial predictions and show that they are wrong, so that the theory making these predictions has to be abandoned. As an example of a crucial experiment, Popper (1959: 108) mentions the Michelson-Morley experiment. This experiment proved that the theory of light as a wave was not correct. If light is a wave, it consists of disturbances transmitted through some type of substance. Sound waves can be transferred through air or water, but not through a vacuum. As opposed to sound, however, light passes through space, for instance when we observe a distant star. Therefore, it was assumed in the nineteenth century that space was filled with a substance, ether, which transferred light waves in the same way as air transfers sound waves.
However, if ether has the same effect on light waves as air on sound waves, it must be measurable in similar ways. For air and sound waves, a well-known phenomenon is the Doppler effect. The siren of an approaching ambulance has a higher pitch than the same siren when the ambulance has passed and moves away from the observer. This is explained by the fact that the sound waves are compressed by the movement of the approaching ambulance and stretched out when it is moving away. The physicists A. A. Michelson and E. W. Morley devised a similar experiment for light, which they carried out in the 1880s. They compared the velocity of light when it is emitted in the same direction as the Earth’s motion with its velocity when emitted in a direction perpendicular to this motion. They were not able to measure any difference. On the basis of this result, in Popper’s interpretation, the theory of light as waves transmitted through ether was given up.
The Wiener Kreis and Popper share the conviction that truth is the ultimate standard for evaluating scientific theories. Theories in the philosophy of science which are based on the truth of scientific theories are called absolutist. The alternative to absolutist is relativist. An example of a relativist theory is Woolgar’s (1988) sociological approach mentioned in Section 1.2.2. In a relativist approach, there is no absolute standard of truth. As we have seen, Woolgar claims that logical rules of proof are conventional and objects are created in the communicative interaction of scientists rather than discovered. This has serious implications for the status of science. In particular, it implies that there is no way to establish that current science, both as a practice and as a body of knowledge, is any better than alternatives based on different assumptions, e.g. folklore. 
Kuhn’s theory of paradigms was motivated at least in part by the observation that the model of science that follows from Popper’s falsificationism, which uses falsifiability as a necessary condition on scientific theories, is far removed from scientific practice. He expresses this as in (9). (9)
‘[T]here is no such thing as research without counter-instances.’ [Kuhn (1970a: 79)]
Whereas Popper expects scientists to advance a hypothesis and reject it when it turns out to be falsified by further observations, Kuhn sees anomalies (or ‘counter-instances’), data that cannot be explained by the theory, as a normal situation. Within a paradigm, these anomalies are treated as puzzles. Puzzles are problems that are guaranteed to have a solution. It is the function of a paradigm to identify puzzles and guarantee that they can be solved (1970a: 36). The solution consists of a manipulation of the theory within the space and according to the rules laid out by the paradigm. Experiments should rather be seen as testing the researcher than as testing the paradigm. A researcher who
blames the paradigm will be seen in the community as the proverbial carpenter who blames his tools (1970a: 79).
This approach to the status of theories makes it far from straightforward to classify Kuhn’s approach as either absolutist or relativist. The concept of truth has at most a peripheral role in Kuhn’s view of science. Kuhn (1970b: 266) argues that it has significance only within a particular paradigm. The incommensurability of paradigms implies that there is no neutral language in which to discuss their claims. Like proof, truth is thereby limited in its scope of application to discussion inside a paradigm. This has led to charges of relativism by, for instance, Shapere (1964: 392), Popper (1970: 56), Lakatos (1971: 120), and Laudan (1977: 138–146). Kuhn has always countered such charges. We will come back to this issue when discussing progress and revolutions, because they are central to his argument that his theory is not relativist.
It is interesting to note here that the adoption of research programmes in the sense of (7) and Figure 1.7 is not incompatible with an absolutist view of science. In fact, the main difference between Lakatos’s (1970) conception of research programmes and Kuhnian paradigms is that Lakatos assumes truth as the aim of science (cf. Larvor (1998: 102)). As discussed by Zahar (2004), Lakatos’s theory can be seen as an attempt to save Popper’s falsificationism in the light of the problems raised by Kuhn’s observation (9).
1.3.2 Progress
The notion of progress introduces a direction in the history of science. If science makes progress and we are given two theories about the same topic, T1 and T2, we can in general see which of the two theories is newer. There have been different ideas about how we can observe progress. A popular set of criteria is the one given in (10). (10)
Progress (to be revised)
T2 is seen as representing progress with respect to T1 if
a. T2 is closer to the truth than T1;
b. T2 covers more data than T1.
As an example, suppose we have two theories of the movement of Mars, one from 1550 and one from 1650. The older one is based on Copernicus, the newer one on Kepler. Both have Mars moving in an orbit around the Sun, but in one of them, TC, this orbit is characterised in terms of a system of circles, in the other, TK, it is characterised in terms of an ellipse. How do we apply (10) to find out which one is the more progressive T2? At first sight, (10a) seems to be the most relevant criterion. We can argue that as we know that planetary orbits
are elliptical rather than based on circles, TK must be T2. This argument is flawed, however, because rather than comparing the two theories to the truth, we are comparing them to our current state of knowledge. A proper application of (10a) requires an omniscient observer.
In a view of science based on Popper (1959), (10b) is more or less an automatic consequence of the fact that T2 is newer than and based on T1. The older theory T1 was discarded at some point in the light of falsifying data. When the newer theory T2 was proposed, the set of available data included at least all data covered by T1 plus the data that falsified T1. If T2 was not immediately rejected, it must cover at least these data.
In a view of science based on research programmes, the empirical cycle generates an ever greater set of data, which also suggests (10b). The idea of progress is in fact illustrated in Figure 1.2. As theories suggest further observations of a particular kind, each subsequent theory is based on a larger set of data than its predecessor. The research programme has the function of making this work in practice. While there are many different possible theories for a particular set of data, the research programme should provide a heuristic for selecting one that suggests interesting new data.
Given Kuhn’s concept of anomalies, cf. (9), there is no absolute guarantee of progress of the (10b) type. To be sure, there is progress in coverage as a definite tendency, but if counterexamples do not immediately disqualify a new theory, no logical argument for linear progress can be given as in Popper’s theory.
A further criterion to identify progress is suggested by the following example. Suppose we take TK from the previous example and compare it to a theory from 1750, based on Newton’s work. The latter theory, TN, also assumes an elliptical orbit for Mars. The difference between TK and TN is not so much in their coverage. 
However, TN offers an explanation for the elliptical orbit in terms of gravity. Newton was heavily criticised for proposing gravity as an invisible force working at a distance, but he was able to provide reasonably accurate calculations of the strength of this force based on the mass of the bodies concerned. Kepler did not manage to provide anything similar. The difference between TK and TN is not one of coverage, but one of depth of explanation. TK only explains appearances of Mars in terms of an orbit that can be calculated precisely. TN also explains the orbit of Mars in terms of a force, gravity, that can be calculated precisely. It is on this basis that TN is seen as representing progress compared to TK, leading to a new formulation of progress as in (11). 7 (11)
Progress in a research programme (revised)
T2 is seen as representing progress with respect to T1 if
a. T2 covers more data than T1;
b. T2 offers a deeper explanation of the data than T1.
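The Popperian argument for the coverage criterion discussed above can be spelled out a little more explicitly. What follows is only a sketch in set notation, which is not part of the original discussion: D(T) for the set of data covered by a theory T, F for the data that falsified T1, and A for the data available when T2 was proposed are labels assumed here purely for illustration.

```latex
% Assumed notation (introduced for illustration only, not in the original text):
%   D(T) : the set of data covered by theory T
%   F    : the data that falsified T_1, with F \neq \emptyset and F \cap D(T_1) = \emptyset
%   A    : the data available when T_2 was proposed
% Requires amsmath for align* and \text.
\begin{align*}
  A      &\supseteq D(T_1) \cup F && \text{(data available when $T_2$ was proposed)} \\
  D(T_2) &\supseteq D(T_1) \cup F && \text{($T_2$ was not immediately rejected)} \\
  D(T_2) &\supsetneq D(T_1)       && \text{(because $F \neq \emptyset$ and $F \cap D(T_1) = \emptyset$)}
\end{align*}
```

On this reading, the second line is the step that Kuhn's concept of anomalies, cf. (9), puts in doubt: if counterexamples do not immediately disqualify a new theory, the inclusion need not hold, so the strict growth of coverage in the third line is a tendency rather than a logical consequence.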
Another plausible candidate as a criterion for progress, not included in (11), is simplicity. According to a principle widely known as Occam’s Razor, if two theories cover the same set of data, the simpler one should be preferred. The problem with Occam’s Razor is that there are many different ways of measuring simplicity and they do not usually yield the same results. To some extent, research programmes will specify what counts as simpler, but simplicity is much more difficult to use in practice than the two criteria listed in (11). Given this notion of progress, we can understand why Kuhn considers his philosophy of science as non-relativist. Kuhn (1970a) sees progress as an essential property of ‘normal science’, i.e. science taking place in the context of a research programme. He states this in (12). (12)
‘it is only during periods of normal science that progress seems both obvious and assured. During those periods, however, the scientific community could view the fruits of its work in no other way.’ [Kuhn (1970a: 163)]
Implicit in (12) is the reference to revolutions to be discussed in Section 1.3.3. At least for normal science, however, (12) expresses Kuhn’s commitment to the reality of progress as a goal of scientific work and as its successful outcome. The so-called Strong Programme in the sociology of scientific knowledge, as represented for instance by Woolgar (1988), does not include the idea of progress in science. According to this view, changes in scientific theories come about in a way not essentially different from changes in fashion, by a mixture of community orientation and authority. If we take this view on progress as characteristic of any relativist theory of science and assume that relativism implies a commitment to the lack of a specific direction in the historical development of science, Kuhn is not a relativist.
1.3.3 Revolutions
In order to characterise the concept of revolution in science, it is useful to look at the concept of the same name used in politics. Kuhn (1970a: 92f.) suggests the same parallelism. For the sake of concreteness, let us compare events in France in 1830 and in 1981. As described by Louessard (1999), the uprising of the people of Paris in July 1830 led to the end of the reign of Charles X. Although he was replaced by another king, Louis-Philippe, and changes in political orientation were relatively minor, there is no doubt that the events of 1830 marked a revolution. In 1981, François Mitterrand was elected president. In the second round of the election, on 10 May, he gained 51.76% of the votes, against 48.24% for the outgoing president Valéry Giscard d’Estaing. In the elections for the Assemblée Nationale on 14 and 21 June, his Parti Socialiste gained an absolute
majority of 269 of the 491 seats. This ended a period of uninterrupted Gaullist and right-wing governments which had started in 1958. It is not surprising, then, that Knecht (1982: 392) starts his coverage of these events ‘The year 1981 marked a turning point in the history of modern France.’ The crucial difference between 1830 and 1981 is not in the extent of their repercussions. It can certainly be argued that these were bigger in 1981. In many respects, Louis-Philippe continued the policies of his predecessor. He was finally deposed in the 1848 revolution, which led to the Second Republic. 1981 saw a political fight between two men with very different political views. The outcome marked a turning point with far-reaching consequences for economic, social, and foreign policy in France. Yet 1830 was a revolution whereas 1981 was not. The reason is that there were no provisions for the way Charles X was removed from power, whereas Giscard d’Estaing’s defeat and the rise of the Parti Socialiste were entirely within the constitutional arrangements in place. In the same way as the course of events in politics is regulated by a constitution or another body of laws or conventions, developments in science are constrained by the research programme. As noted in Section 1.2.2, there is no feedback mechanism by which results of work within the empirical cycle would bring about modifications in the research programme. This means that there is a crucial difference between changes in the theory and changes in the research programme. There are established procedures for the former, indicated in the research programme, but not for the latter. This is formulated in (13). (13)
Revolution
A revolution is a change of research programme.
In some cases, changes characterised as revolutions in (13) may seem less radical in their consequences than theoretical changes. The reason for considering them nevertheless as being of a different type is the evaluation procedure. As explained in the discussion of (6) in Section 1.2.2, values used in the evaluation of theories are bound to research programmes. The normal scientific process, in which new theories are based on modifying previous theories, does not encourage radical changes. We expect progress by relatively small extensions of the theory and occasionally minor modifications of existing parts. There is no principled reason for excluding more radical changes, however. As long as proponents of both theories use the same evaluation criteria, as laid down in the research programme, there is enough common ground for deciding which one is better. No such common ground exists for theories in different research programmes. This is the problem of incommensurability discussed in Section 1.2.2. The concept of revolution as defined in (13) is rather different from the analysis of revolutions in Popper’s (1959) falsificationist theory. For Popper, as
described in 1.3.1, crucial experiments can falsify theories. While he imposes the condition that crucial experiments are reproducible (1959: 86), he does not discuss them in terms of competing theories. In principle, a crucial experiment could falsify a theory without any alternative theory taking over. In his discussion of the Michelson-Morley experiment, however, Popper (1959: 108) presents it as ‘the experiment which led to the theory of relativity’. This implies that at least in practice there was no danger of returning to a state of complete predicament, i.e. without any theory. The existence of revolutions in science as a historical phenomenon raises the question of what triggers them. This question is important because the rationality of science depends on a proper explanation. As we have seen in Section 1.3.1, anomalies occur at any time, cf. (9). Therefore, the mere existence of anomalies cannot motivate a revolution. This excludes Popper’s approach and makes it an urgent problem to find an alternative rational motivation. Kuhn (1970a) solves this problem by referring to a crisis, e.g. as in (14). (14)
‘the sense of malfunction that can lead to crisis is prerequisite to revolution.’ [Kuhn (1970a: 92)]
Whereas the ‘sense of malfunction’ cannot be reduced to purely rational factors, it does exclude the idea of revolutions taking place because of the whim of an influential individual or for reasons of fashion. As Kuhn puts it, ‘few scientists will easily be persuaded to adopt a viewpoint that again opens to question many problems that had previously been solved’ (1970a: 169). Therefore, revolutions are not likely to be met with enthusiasm by the scientific community, unless the alternative of continuing to work with the old paradigm is even less attractive. Hoyningen-Huene (1989: 226) lists three properties Kuhn ascribes to a crisis in order to clarify its nature and role. First, not every acknowledged anomaly has to lead to a crisis. As long as the paradigm indicates avenues of progress, i.e. ways to enlarge the set of data accounted for, cf. (11), there is no compelling reason to concentrate on anomalies. As Kuhn (1970a: 81f.) describes, scientists are generally prepared to trust that the anomalies will eventually disappear unless they occur in areas of immediate practical significance. Second, the crisis may be experienced differently in different parts of the scientific community. This is an immediate consequence of the previous point, because what has immediate practical significance differs from one part of the scientific community to the next. Kuhn (1970a: 92f.) gives the example of X-rays. Their discovery raised many essential problems in radiation theory, because they did not fit into any of the accepted classes of radiation. In astronomy, however, they were simply another phenomenon to be observed about objects in space.
Third, the crisis does not have to be recognised consciously by the members of a research community. Kuhn (1970a) formulates this as in (15–16). (15)
‘the effects of crisis do not entirely depend upon its conscious recognition.’ [Kuhn (1970a: 84)]
(16)
‘Often a new paradigm emerges, at least in embryo, before a crisis has developed far or been explicitly recognized.’ [Kuhn (1970a: 86)]
The ‘effects of crisis’ in (15) refer to the emergence of an attitude among scientists which makes them consider accepting a new paradigm. The acceptance of a new paradigm depends on the sense of crisis, but, as (16) suggests, the spread of the new paradigm can go hand in hand with the spread of the sense of crisis in the community. Kuhn (1970a: 86) gives the example of Thomas Young’s work on the wave theory of light. In the early nineteenth century, Newton’s particle theory of light was more or less generally accepted, but Young concentrated on a few effects that were more easily explained in a wave theory. 8 Throughout his entire lifetime (1773–1829), his ideas were rejected by most of his colleagues, but by the middle of the century the wave theory had become almost generally accepted and a paradigm shift had taken place. In the meantime, the anomalies of the particle theory had developed into a crisis for which Young’s theory seemed to provide just the right answer. By taking a crisis as a necessary condition for the climate in which a scientific revolution can take place, a rational account of the historical development of science becomes possible again. If the rationality of the account depends on progress, however, the occurrence of revolutions is still a potential threat. In Section 1.3.2, (11) was proposed as a measure of progress, highlighting the quantitative criterion of the coverage of data. Kuhn’s (12) conspicuously restricted this type of progress to periods of normal science. In a revolution, explanations may be lost. As Kuhn puts it, ‘The transition from a paradigm in crisis to a new one from which a new tradition of normal science can emerge is far from a cumulative process’ (1970a: 84). 
This point is highlighted in the conditions for acceptance of a new paradigm in the scientific community, which include that it ‘must promise to preserve a relatively large part of the concrete problem-solving ability that has accrued to science through its predecessors’ (1970a: 169). If it is only a relatively large part that is preserved, this implies that another part of what had been achieved before the revolution cannot be preserved. This is often called the ‘Kuhn loss’, e.g. by Chen (1997: 267). Nevertheless, there is a general sense of progress through revolutions. To a limited degree, this is the consequence of a change of perspective. If a revolution takes place, the history of science will be written by the representatives of the new paradigm, because the old paradigm is no longer attracting young researchers. According to the evaluation criteria of this new paradigm, the theories of the old paradigm are not as good as the ones developed in the new paradigm (cf. Section 1.2.2). Kuhn (1970a: 167) emphasises, however, that this does not lead to arbitrary or irrational developments because of the special nature of a scientific community. He concludes that despite the losses in a revolution ‘the nature of such communities provides a virtual guarantee that both the list of problems solved by science and the precision of individual problem-solutions will grow and grow’ (1970a: 170). The spectacular nature of many revolutions has attracted a lot of attention to these episodes in the history of science. What should be called a revolution is not straightforward, however, even if we observe incompatible research programmes. Three possibilities are listed in (17).
(17)
a. A revolution occurs when a new research programme emerges.
b. A revolution occurs when the old research programme loses its currency.
c. A revolution is a period marked by a particular type of argumentation, different from the one characteristic of normal science.
The three possibilities each have their problems. In (17a) the problem is that the emergence of a new research programme is not certain to lead to a revolution. In fact, Kuhn (1970a: 84) refers quite explicitly to different possible outcomes of a crisis. Only one of them is the replacement of the old paradigm by a new one. It is also possible that the old paradigm holds its case. The problem with (17b) is that if the new paradigm proves stronger, it may still take a long time for the old one to disappear. There is no crucial experiment with limelights highlighting the significance of the event, as suggested by Popper, but at most a slow passage into oblivion of the defeated paradigm, so that the end of a paradigm is difficult to observe. The approach in (17c) seems more attractive. Hoyningen-Huene (1989: 245–251) actually gives an overview of the characteristics of the type of argumentation current in revolutionary periods. The problem here is that if two or more research programmes exist simultaneously for a prolonged period of time, not all of the discussion is directed against the opposing research programme. An example of such a situation is found in optics, where the wave and particle theories co-existed for a long time. Of course, proponents of the two opposing views of light were trying to show that their opponents were wrong. At the same time, however, a significant portion of the work within each of these research programmes was of the type we expect to see in a period of normal science. In this work they basically ignored the other research programme and simply tried to extend the coverage of their own theory.
In many contexts, it is in fact more important to show that there are different research programmes than that one of them replaced the other one at a particular point in time. The succession of research programmes may be obvious with sufficient historical distance, but is in general hard to observe for contemporaries. When ’t Hooft (1996) describes the competition between different models in particle physics, the ultimate outcome is as yet unknown. What can be usefully established, however, is to what extent these models represent different research programmes. The answer to that question has major consequences for the interpretation of scientific arguments because of the incommensurability of different research programmes. Therefore, it is arguably more important to identify research programmes than to identify revolutions.
1.4 Research programmes in linguistics
So far, the discussion of research programmes has been illustrated with examples from prototypical sciences, in particular astronomy and optics. It is mainly on the basis of sciences like these that the philosophy of science develops its theories. In this section, a number of problems are discussed that arise when we turn to the field of linguistics. These problems can be divided into two categories. The first concerns the justification for applying research programmes to the field at all. The second includes more practical problems linked to linguistics as a field in which to apply insights of the philosophy of science.
1.4.1 From natural science to linguistics
The concept of research programme as developed here is strongly based on Kuhn’s concept of paradigm or disciplinary matrix. It is not meant as an independent contribution to the philosophy of science, but only as a variant of Kuhn’s concept. The reason for introducing it is mainly to avoid being distracted by problems that are not central to the purpose of this book. Kuhn’s (1970a [1962]) book and the ideas of paradigms and revolution were soon discovered by a wider audience, and applied not only in the context of the philosophy of science, but also in a range of other areas. One of the areas where this happened was linguistics. The publication of the first edition of Kuhn’s book coincided with the early stages of the rise of Chomskyan linguistics and it was not long before the latter was seen as a Kuhnian paradigm. Thorne (1965) seems to have been the first to make this link. An important part of the discussion of whether Kuhn’s theory could be applied to linguistics turned on the question as to whether the rise of Chomskyan linguistics could
be seen as a scientific revolution in the sense of Kuhn (1970a). This question will be addressed in detail in chapter 3. In this section we will only consider whether the arguments against the application in principle of Kuhn’s concept of paradigms to linguistics are valid and, to the extent they are, whether research programmes encounter the same problems. Oesterreicher (1977: 266–271) makes the point that Kuhn’s theory was crucially devised for natural sciences only. His conclusion is (18). (18)
‘The validity of Kuhn’s theory of the development of science, which was conceived for natural sciences, should therefore be absolutely rejected for social sciences.’9 [Oesterreicher (1977: 270)]
A first point to be made with respect to (18) is that it only applies to linguistics if linguistics is considered a social science. Oesterreicher seems to make this assumption without explicitly arguing for it. It is of course true that language has social aspects, in particular in its use. The question is, then, how much weight these social aspects should have in a linguistic theory. It is not a new viewpoint that in Chomskyan linguistics the social aspects of language are not the ones that determine the theoretical outlook. Chomsky on various occasions emphasises the biological basis of language. Chomsky (1976b), for instance, refers approvingly to Lenneberg (1967) in this respect. Therefore, even if (18) is valid, it is not obvious that it should apply to linguistics of the type pursued by Chomsky. Another aspect of the question whether (18) should apply to linguistics concerns Kuhn’s reasons for distinguishing natural and social sciences. The question is discussed in detail by Kuhn (1959). The point highlighted by Kuhn is the way knowledge is passed on, as formulated in (19). (19)
‘The single most striking feature of this education [i.e. education in the natural sciences] is that, to an extent totally unknown in other creative fields, it is conducted entirely through textbooks.’ [Kuhn (1977 [1959]: 228)]
The reference to textbooks in (19) should be read in conjunction with the list of features of a paradigm in (6). (6d) refers to the existence of exemplars used in the transfer of knowledge in a paradigm. These exemplars provide the examples and the exercises in textbooks. As Kuhn (1970a: 189) describes, the student’s understanding of a point comes about typically not through the reception and understanding of the explanation provided in the text, but through the successful solution of the exercises. This success is possible by discovering and exploiting analogies between these exercises and examples or problems the student has already solved before.
The importance of textbooks in the field of linguistics has increased considerably in the past decades. Matthews (1993) describes two points in this development in (20). (20)
a. ‘Before 1960, genuine textbooks in linguistics scarcely existed, and students were obliged from the beginning,
b. as ideally they should be,
c. to read works of original scholarship. […]
d. From the later 1960s this kind of education has become increasingly rare.’ [Matthews (1993: 98)]
The opposition between (20a, 20c) and (20d) suggests that only from the late 1960s has linguistics been taught in a way Kuhn considers typical of natural sciences. In the period before then, there were few textbooks, but, more importantly, their use was not general and if one was used, one ‘would have been forced very rapidly to go beyond it’ (Matthews, 1993: 98). As the value judgement in (20b) suggests, this development is not described by someone who highlights it in order to advance the status of linguistics as a science. In the same way as students of physics do not read original works by Newton or Einstein, students of linguistics do not read original works by Bloomfield or Chomsky. The reasons are the same. The theory has become so complex that a gentle introduction is necessary. Moreover, these original works are directed to peers at the time of their publication. The way the theory is presented depends on shared educational background and knowledge of the then-current scientific discussions. These aspects are not necessarily crucial for the theory but they complicate the reception of the original works. Therefore, the use of textbooks rather than original works in linguistics, at least in the past few decades, aligns it with the sciences for which Kuhn developed his theory. In conclusion, even if we accept Oesterreicher’s (1977) restriction of the domain of Kuhn’s theory of paradigms in (18), there are at least two reasons why it should not apply to linguistics of the type studied here. First, there is no immediate reason to consider linguistics of this type as a social science. Second, the reason why Kuhn distinguishes natural and social sciences at all is that only in the former do textbooks play an essential role in education and largely replace the direct reception of original works. The rise of textbooks has been documented also by opponents of this development, as Matthews’s (20) illustrates. 
Therefore, Oesterreicher’s (18) is not a convincing argument for excluding linguistics from the scope of Kuhn’s theory. There have been many discussions of the relevance of Kuhn’s theory to the field of linguistics. Oesterreicher (1977) is exceptional in giving a generally
well-informed summary of the theory. Percival (1976), though published in Language and much more widely read and cited, contains a number of strange errors in this respect. They seem to be due to ‘uncooperative reading’, i.e. looking for points in Kuhn’s text that can be attacked rather than trying to interpret it as a coherent set of ideas. An example is (21). (21)
a. ‘out-of-date theories are no less scientific than those current today:
b. all one can say is that
c. the canons of scientific theory and practice vary from period to period.
d. In other words, Kuhn proposes to relativize the notion of science.’ [Percival (1976: 285)]
The problem with (21) is that it contains some correct statements about Kuhn’s theory but leads to a conclusion Kuhn would certainly not have accepted. There is no reason to assume that Kuhn would have objected to (21a) or (21c). One of the motivations for the concept of paradigm was in order to account for the nonlinear, non-cumulative development of science. Thus, Kuhn (1970a: 2) states that ‘once current views of nature were, as a whole, neither less scientific nor more the product of human idiosyncrasy than those current today.’ However, the way (21b) modifies (21a, 21c) is not in accordance with Kuhn. It suggests that there is no progress and that paradigm changes are of the same nature as changes in fashion. This is in direct conflict with statements such as (14) which make a revolution dependent on a crisis and would be more in place in the context of the Strong Programme in the sociology of science than of Kuhn’s theory. As for (21d), some problems with the label of relativism were discussed in various parts of Section 1.3. Although when (21b) is taken out, the conclusion (21d) no longer follows, it is also worth considering Kuhn’s (1970b: 264) discussion of two types of relativism. In one sense, the one related to truth (cf. Section 1.3.1), Kuhn is happy to agree to the label. In the other sense, the one implied in (21d), he cannot accept it. He formulates this as in (22). (22)
‘For me, therefore, scientific development is, like biological evolution, unidirectional and irreversible. One scientific theory is not as good as another for doing what scientists normally do. In that sense I am not a relativist.’ [Kuhn (1970b: 264)]
Percival’s representation of Kuhn’s theory contains further divergences from what Kuhn must have intended by it. Thus, he characterises a paradigm as ‘the striking achievement of a single scientific genius’ and as ‘universally recognized’ (1976: 286). In fact, Kuhn (1970a: 55) rejects the idea of attributing, for instance, the discovery of oxygen to an individual and to a moment in time. Kuhn’s change of mind as to the issue of competition between paradigms was discussed in Section 1.2.2. The explicit denial of universal recognition in
the scientific community as a condition for the existence of a paradigm in the postscript of Kuhn (1970a) means that there is no excuse for continuing to ascribe this condition to him. Against this background of misunderstandings of Kuhn’s notion of paradigm, Percival then develops an argument that Kuhn’s theory of science is a danger for scientific standards in linguistics. He sees these dangers as in (23). (23)
a. ‘an unhealthy situation might arise if linguists began to look upon all theoretical disagreements within their profession as conflicts between rival paradigms, i.e. incommensurable viewpoints, and used this as an excuse not to observe the ground rules of rational discussion.
b. Moreover, since (according to Kuhn) any genuine paradigm is destined inevitably to be accepted by the entire profession, some linguists might feel impelled to give premature assent to any novel theory which they observed gaining wide support, for fear of ending up as isolated adherents of a discarded paradigm.’ [Percival (1976: 292)]
Implicit in (23) is the assumption that the philosophy of science should provide criteria for good science. In this respect Kuhn’s theory of science differs from Popper’s and from that of the logical positivists, who imposed criteria on good science such as falsifiability of theories. Kuhn does not do so. Kuhn’s philosophy of science is an approach to describing how science works and explaining why it works in that way. The objections in (23) arise only if Kuhn’s theory is not used in its original role, but in a role assumed by Popper’s theory. While Kuhn’s theory cannot be blamed for any unintended use, it is worth pointing out why the problems in (23) should not normally arise according to Kuhn. Incommensurability in (23a) is not something that scientists resort to but something that happens to them. We will see some examples in the following chapters in which it leads rather to genuine misunderstanding and mutual frustration than to casual claims to belong to a superior or otherwise immune paradigm. As already noted, revolutions do not lead to premature assent to a novel theory as alleged in (23b), because a new paradigm only has a chance if the field is in a crisis. Again, as in (21), the context of a crisis is the crucial element left out of the picture. Moreover, even if there is a crisis and a new paradigm emerges, there is no guarantee that the new paradigm will win. Kuhn (1970a: 84) recognises three possible outcomes of a crisis. Only if the old paradigm is not able to solve the anomalies that gave rise to the crisis and the new paradigm provides a convincing solution without losing too much of the explanatory power of the old paradigm will the new paradigm come out of the process as the winner. 10 From the discussion of Percival (1976) we can conclude that he does not formulate any compelling reason not to use Kuhn’s theory of science as a basis
for explaining the history of the field of linguistics. According to Koerner (1995b: 8), ‘This discussion [i.e. about the applicability of Kuhn (1970a) to linguistics] appears to have subsided during the later 1970s, possibly as a result of Percival’s (1976) paper’. A possible interpretation is that Percival (1976) took away the basis for linguists who intended to use Kuhn’s theory in the sense of (23). It should be noted, however, that Koerner (1995a) uses paradigm and refers to Kuhn in his introduction to the history of the field of linguistics.
1.4.2 From paradigms to research programmes
For the recognition of paradigms in the sense of disciplinary matrices, Kuhn (1970a) gives clear guidelines in his postscript. They are based on scientific communities. ‘A paradigm is what the members of a scientific community share,’ but, in order not to be circular, ‘Scientific communities can and should be isolated without prior recourse to paradigms’ (1970a: 176). The orientation of research programmes as explained in Section 1.2.2 is slightly different. A research programme is what is necessary to make the empirical cycle work, which is not quite the same as what the members of a scientific community share. While in both cases, elements such as (6) are intended, research programmes focus on them whereas in Kuhn’s view they emerge from and depend on the community. In considering this difference, a first observation that can be made is that the study of the community structure of a field is a very elaborate way of arriving at the assumptions required for the empirical cycle. It may be ultimately the best one, but it is certainly not the most efficient. There is also another problem. The community structure is a complex sociological network. There is no simple hierarchical structure of groups that can be represented as a tree. Kuhnian paradigms are associated with each of these levels of community. This means that at the highest levels, not enough may be shared to ensure successful research following the empirical cycle. At lower levels of the hierarchy, much more may be shared than what is minimally necessary. It is in view of these considerations that the introduction of research programme as an alternative to paradigm has advantages. By concentrating on what is logically necessary to make the empirical cycle work as it is perceived to work, the identification of research programmes becomes much more feasible in practice and the level of generality is determined more clearly than in a model based on community structure. 
Given the function of a research programme as defined in (7) and illustrated in Figure 1.7, a research programme needs to contain the elements listed in (24).
(24)
a. Heuristics for the collection of interesting data
b. Heuristics for grouping observations into useful generalisations
c. Heuristics for finding plausible explanations for generalisations
d. Criteria for selecting relevant data
e. Criteria for evaluating theories
f. Absolute assumptions as a basis for explanations
These elements correspond to the tasks of the research programme identified in Section 1.2. It is instructive to compare this list to Kuhn’s list of elements of a paradigm, given in (6) and repeated here for convenience. (6)
a. Symbolic generalisations
b. Metaphysical paradigms
c. Values
d. Exemplars
The different types of heuristics (24a-c) correspond at least in their function to the exemplars in (6d). They guide the steps of the empirical cycle by giving a general sense or intuition of what is promising. Calling them ‘guidelines’ would be misleading because they are not usually formulated explicitly. Margolis (1993) writes about ‘habits of mind’ in a similar sense. For Kuhn, these heuristics result from exemplars and he is definitely right that it is easier to find exemplars than explicit heuristics. Among the other three elements in (24), a further match can easily be identified between the evaluation criteria in (24e) and the values in (6c). This leaves (24d) and (24f) to be accounted for and a suggestion that (6a-b) may account for them. The symbolic generalisations in (6a) are not necessarily expressed in the form of purely mathematical symbols. Kuhn (1970a: 183) gives the examples of ‘elements combine in constant proportion by weight’ and ‘action equals reaction’ from the natural sciences. What is important about them is that although they are formulated as laws, they are not interpreted as empirical generalisations or theoretical statements, but as definitions. Such definitions can then take on the role required by (24f). Because they are immune to further questions, they can provide a rock-bottom basis for explanations. Kuhn (1970a: 184) uses ‘metaphysical paradigms’ as in (6b) to refer to models. They include models of ‘the relatively heuristic variety’ such as the example of billiard balls for the molecules of a gas mentioned in Section 1.2.2. The first set of examples Kuhn gives, however, includes such ideas as ‘heat is the kinetic energy of the constituent parts of bodies’. Models of this type determine how a subject field is conceptualised. Many of the relevant terms are
subsequently developed and defined in the symbolic generalisations of (6a). Therefore the model provides not only further elements of the basic assumptions needed in (24f), but also an orientation of observations as required in (24d). In Kuhn’s example, the statement about heat tells how this phenomenon should be considered. In linguistics, we expect a model of this type to specify what language is and what should be explained about it. There is a clear sense in which (6a-b) can be said to underlie (6c-d). The model and definitions in (6a-b) provide a background for evaluation criteria (6c) and determine what counts as an example of good scientific practice to be emulated (6d). In what Kuhn calls normal science, (6c-d) are what determines scientific practice most directly. However, in a crisis (6c-d) may be challenged. At that point, the discussion links them to the model as determined by (6a-b). This has consequences for the way we should describe research programmes. Exemplars/heuristics and values/evaluation criteria determine what people do, but if we want to understand why they do it, the model should be taken as central. This also has a practical advantage. Kuhn (1970a: 46) refers to ‘the severe difficulty of discovering the rules that have guided particular normal-scientific traditions.’ This difficulty arises much more when we try to formulate heuristic guidelines and evaluation criteria than when we turn to the underlying models. Whereas the role of (6c-d) is often suppressed in academic writing, this is much less so for (6a-b). This motivates my approach to research programmes as based on a model. In the context of linguistics, this model tells us what language is and how it should be studied.
SUMMARY
• Science is perceived to work according to the empirical cycle. Generalisations about observations are used to formulate a theory. The theory is tested on the basis of its predictions. The tests generate additional data which can be used to improve the theory.
• The empirical cycle is not logically sufficient. There is a discrepancy between what scientists think they do and what they must be doing.
• Kuhn’s (1970a) theory of science explains this discrepancy in terms of paradigms. A scientific community shares assumptions that are not logically necessary but are powerful enough to make the empirical cycle work.
• Whereas for Kuhn the identification of a paradigm has to start from the community sharing it, the identification of a research programme starts from what is necessary to make the empirical cycle work.
• Progress in science is seen in the increase of the set of data covered by the current theory and of the depth of explanation, not in terms of truth.
• A revolution is the replacement of one paradigm by another. In the study of research programmes, the identification of oppositions is more interesting than the identification of revolutions.
• Arguments against the application of paradigms to the field of linguistics are not compelling.
• At the heart of a research programme is the model representing the connections between the central notions of the field. In the case of linguistics, the model should tell what language is and what should be explained about it.
Notes
1 The division of a quotation into sections marked a, b, etc. is here, and generally throughout the book, a device to facilitate reference to individual parts of a quote. The sections do not correspond to a division in the original quoted.
2 The historical process is described by Kuhn (1957). The recognition that Mars is in an elliptical orbit around the Sun, rather than an orbit based on circles around the Earth is one of the prototypes of a revolution (cf. 1.3.3).
3 Figure 1.3 is adapted from Koningsveld (1976: 66). Koningsveld’s figure has arrows in both directions, suggesting a balance. In his text he reinforces this suggestion by stating that he presents the elements of the figure in clockwise direction, as if the other direction were equally valid. In fact, the counterclockwise arrows represent derived relationships dependent on the corresponding clockwise arrows, as follows from the description here.
4 Of course, this table can also be used as a law1 for the formulation of a theory of planetary movement. In that case, it will not be a law2 at all.
5 ‘Ohne Anspruch auf Vollständigkeit’, my translation.
6 Simplicity, for instance, can be measured in terms of the number of concepts a theory introduces, the number of rules, the variety of concept types, the number of background assumptions, etc. Thagard (1988: 161) gives the example of creationism, which is simple in the sense that it has a single explanatory principle (‘God’s will happens’), but complex in the sense that it needs many auxiliary hypotheses (for each observation X, ‘God wants X’).
7 The reason for restricting progress in (11) to ‘progress in a research programme’ will become clear in Section 1.3.3.
8 Zeilinger (2003: 30–35) gives a detailed and accessible explanation of the experiments. Other parts of the same book indicate their significance from a current perspective.
9 ‘Die Gültigkeit der für die Naturwissenschaften konzipierten Entwicklungstheorie Kuhns ist daher grundsätzlich für die Sozialwissenschaften abzulehnen.’ My translation. In the original the entire statement is underlined for emphasis.
10 It is interesting to note in this context that Matthews (1974: 216) describes the dangers indicated in (23) in similar terms, while noting that ‘Of course, Professor Kuhn is not responsible for the way his book has been seized on in linguistics.’
2 The research programme of Chomskyan linguistics
Starting the study of research programmes in linguistics with Chomskyan linguistics is not meant as a sign that I consider Chomskyan linguistics as representing the truth or as better than any other type of linguistics. Given the discussion of the nature and function of research programmes, it would be anomalous to propose that one of them would be true, because truth is not a relevant property of scientific theories. There are two main reasons for taking Chomskyan linguistics as our starting point. First, within the domain of linguistics covered in this book, it takes a central position. Other approaches, to be discussed in Chapter 4, define themselves with respect to Chomskyan linguistics or used to do so at their origin. Of course Chomskyan linguistics also defined itself in opposition to another approach, but this approach, to be discussed in Chapter 3, is no longer a major force in current linguistics. Apart from this central position, Chomskyan linguistics has the advantage of an extensive documentation of its scientific underpinnings and an abundance of discussion attacking, defending, and elaborating them. Therefore, the literature about Chomskyan linguistics contains more material to support a systematic elaboration of the concept of research programme in the area of linguistics than could be found for any of the alternative approaches. The purpose of this chapter is to explain the research programme adopted by linguists working in the framework of Chomskyan linguistics. It is not my intention to discuss all the objections that have been raised against it. Some of these objections will be presented in later chapters, because they play a role in the discussions between Chomskyan linguistics and its competitors. Other objections originate from considerations outside of linguistics, e.g. philosophy or psychology. They will not generally be discussed here, because they do not emerge from a competing system of linguistics. Botha (1989) gives a
useful systematic overview of the discussions in particular of philosophical and psychological issues. As explained in Section 1.4.2, the starting point for the identification of research programmes is the model. This model provides answers to foundational questions such as, in the case of linguistics, (1) and (2). (1)
a. What is a language?
b. What is language?
(2)
a. What is the status of a theory of a language?
b. What is the status of a theory of language?
In Section 2.1 we start with question (1a). This is followed in Section 2.2 by a discussion of the nature of the data implied by the answer Chomskyan linguistics gives to this question. In Section 2.3 we turn to question (2a). After this discussion of the way Chomskyan linguistics approaches individual languages, Section 2.4 considers the status and role of universals, as implied by questions (1b) and (2b). A crucial assumption throughout these sections is that it is possible to determine what counts as Chomskyan linguistics. Botha (1989: 1–11) discusses the related concepts of generative grammar, Chomskyan generative grammar, Chomskyan linguistics, and Chomsky’s linguistics, pointing out subtle differences in their meaning. The name Chomskyan linguistics is used here to refer to the research programme in linguistics developed by Noam Chomsky. 1 Chomsky’s own work can be used as a reference point in this respect. Some other works, for instance Lyons (1970), Newmeyer (1983), Uriagereka (1998), present the research programme in much the same terms. Inevitably, individual presentations include instances of underspecification or ambiguity. If we take the entire body of texts, however, most of these instances can be eliminated when we assume that the research programme is internally consistent. Since its first appearance in the late 1950s, Chomsky’s theoretical framework has gone through a number of major changes. Chomsky (1965) provided a standard for much subsequent discussion, but marked a break with some of the assumptions adopted in earlier work. One of its innovations was the introduction of the lexicon. Chomsky (1981a) describes a rather different grammatical framework, in which individual rules are replaced by general principles with language-specific parameter settings. More recently, Chomsky (1995b) proposed his Minimalist Program (MP). 
Given the extent of the theoretical differences between these stages, the question of the unity of the research programme has to be asked. Sections 2.5 and 2.6 address this question, showing that there is a large degree of continuity in the research programme. Its development takes the form of a gradual specification rather than a substitution of older assumptions by new ones.
2.1 The nature of individual languages
Language is a complex phenomenon about which many different questions can be asked. Examples include how and why language changes, how languages are related, how they are used to build up social hierarchies and nations, etc. Depending on the choice of research questions, different perspectives on the nature of language can be adopted. Of course, a full theory of language should have something to say about all these aspects. What is said about each aspect may vary. Some aspects may be selected as central and constitute the starting point for an explanation. Others may be taken to be peripheral. Generally it is at least indicated either how they are linked to the theory or why they are irrelevant. The starting point for Chomskyan linguistics is how to explain the observation in (3). (3)
‘The central fact to which any significant linguistic theory must address itself is this: a mature speaker can produce a new sentence of his language on the appropriate occasion, and other speakers can understand it immediately.’ [Chomsky (1964: 7)]
The creative aspect of language is highlighted in (3) by the reference to a ‘new sentence’. For an adequate knowledge of a language it is not sufficient to have a stored list of sentences. The reference to ‘the appropriate occasion’ excludes the random generation of sentences. Speakers formulate sentences corresponding to what they intend to say. The approach taken in Chomskyan linguistics is to assume that the mind of a speaker of a language has a component that contains the knowledge necessary to explain the observation in (3). Chomsky has discussed the nature of this component in terms of three oppositions. The relationship between these oppositions is represented in Figure 2.1.
[Figure 2.1: Language as a mental component and its oppositions. Labels: performance; competence / grammatical competence / I-language; E-language; pragmatic competence.]
What Figure 2.1 shows is that competence, grammatical competence, and I-language are three names for the same thing. They are used in three different oppositions. In this book I will in most cases use competence as the normal term. If it is necessary to refer to a plural, however, I will use I-language instead. In direct oppositions I will also use grammatical competence in opposition to pragmatic competence and I-language in opposition to E-language. The different directions of the oppositions in Figure 2.1 are meant to suggest that the entities being compared to competence are very different in nature. The discussion of the individual oppositions will clarify this.
2.1.1 Competence versus performance
The distinction between competence and performance is probably one of the best known elements of Chomskyan linguistics. Chomsky gives various equivalent descriptions of it. The one in (4) has the advantage of being concise and independent of anaphoric references to the context. (4)
‘A distinction must be made between what the speaker of a language knows implicitly (what we may call his competence) and what he does (his performance).’ [Chomsky (1966a: 3)]
The action referred to by ‘what he does’ in (4) is described by Chomsky (1965: 4) as ‘the actual use of language in concrete situations’. The contrast between knowledge and action characterises the opposition as one between empirical entities of a different nature. Competence is a real entity in the mind of the speaker. It is as real as the orbit of Mars around the Sun. Performance is a real event in space and time. It is as real as eating and drinking. As noted by Chomsky (1980a: 205), the distinction described in (4) is a very general, conceptual distinction. It is a starting point of investigations rather than the answer to significant questions. Competence is a real entity so that we can find out properties it has and hypothesise a model of what it is like. There can hardly be any question that there is knowledge which underlies linguistic performance. It is an empirical question what this knowledge is like. A first indication of the nature of competence and its relationship to performance can be obtained by considering fragments of performance such as (5). (5)
A: Oh Carol [eh] about that telephone call [eh] would you [eh] I mean …
B: No problem, let’s forget about it. No one can do anything about it anymore.
The discourse in (5) is a made-up example, but it could well be the start of a conversation. In (5A) we see hesitation, a change of plan and the absence
of any grammatical sentence. The naturalness of the reply in (5B) shows that this does not prevent successful understanding. Without more knowledge of the background, for instance what kind of telephone call is referred to, we can only get a very vague idea of the content of the communication. We can immediately see, however, in what sense (5A) is ungrammatical and how the individual fragments can be used in grammatical sentences. This knowledge derives from our competence. The creative aspect of language is taken in (3) as the central problem of linguistics. Chomsky (1961: 222) refers in this respect to ‘the ability of a speaker to produce and understand an indefinite number of new sentences’. Here ‘new’ refers to the absence of pure imitation. We do not just repeat sentences we have heard before. The ‘indefinite number’ refers to the absence of any language-internal bounds on how many sentences we can produce. Any constraint in this respect stems from factors external to language. This point is made more explicit in (6). (6)
a. ‘Putting aside irrelevant limitations of time, patience, and memory, people can in principle understand and use sentences of arbitrary length and complexity. Correspondingly, as these limitations are relaxed in practice, our ability to use language increases in scope – in principle without bound.
b. A sentence that is incomprehensible in speech may be intelligible if repeated several times or presented on the printed page where memory limitations are less severe. But we do not have to extend our knowledge of language to be able to deal with repeated or written sentences that are far more complex than those of normal spoken discourse. Rather, the same knowledge can be applied with fewer extrinsic constraints.’ [Chomsky (1980a: 220–221)]
The point in (6a) is illustrated in detail by Jackendoff (1993: 8–20), who presents various patterns for constructing indefinitely long sentences. While every sentence has a finite length, it is always possible to construct a longer sentence by putting ‘She thinks that’ in front of it (or replacing she by any proper name and thinks by any appropriate verb). As an example of the type of sentences referred to in (6b), consider the sequence of examples in (7). (7)
a. The pilot slept.
b. The pilot whom the company had fired slept.
c. The pilot whom the company that the American investment group had taken over had fired slept.
d. The pilot whom the company that the American investment group that two ministers belong to had taken over had fired slept.
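The way each sentence in (7) is built from the previous one can be made concrete with a small program. The following Python sketch is an illustration only and is not taken from Chomsky or Jackendoff: the names `CLAUSES`, `noun_phrase`, `sentence` and `think_that` are hypothetical, and the clause list simply reuses the wording of (7). It shows how a single rule that invokes itself yields (7a-d), and how the ‘She thinks that’ pattern yields ever longer sentences.

```python
# Toy sketch, not a grammar of English. All names here are assumptions
# introduced for illustration; the clause wording is copied from (7).

# Each pair is (subject of a relative clause, the verb group that closes it).
CLAUSES = [
    ("the company", "had fired"),
    ("the American investment group", "had taken over"),
    ("two ministers", "belong to"),
]


def noun_phrase(head, clauses, relativiser="whom"):
    """Return `head`, optionally followed by an embedded relative clause."""
    if not clauses:
        return head
    (subject, verb), rest = clauses[0], clauses[1:]
    # The rule invokes itself: the subject of the relative clause is again
    # a noun phrase, which may contain a further relative clause.
    return f"{head} {relativiser} {noun_phrase(subject, rest, 'that')} {verb}"


def sentence(depth):
    """Build the sentence of (7) with `depth` levels of embedding."""
    return noun_phrase("The pilot", CLAUSES[:depth]) + " slept."


def think_that(text, times):
    """Prefix 'She thinks that' to a sentence `times` times, recursively."""
    if times == 0:
        return text
    inner = think_that(text, times - 1)
    return "She thinks that " + inner[0].lower() + inner[1:]


if __name__ == "__main__":
    for depth in range(4):
        print(sentence(depth))
    print(think_that(sentence(0), 2))
```

Nothing in the procedure itself limits the depth of embedding. The only bound is the length of the toy clause list, which is a fact about this sketch and not about English.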
Central embedding as illustrated in (7) rapidly makes sentences difficult to understand. Nevertheless, the process is part of English. The same process that produces (7b) as an extension of (7a) is used in constructing (7c) on the basis of (7b) and (7d) on the basis of (7c). A sentence such as (7d) is stylistically awkward and unlikely to occur, but by writing it down and taking it apart in the way suggested by the sequence of (7a-d), we are able to understand it. The claim in (6) is that there is no new type of knowledge involved in doing this. We use the same syntactic knowledge in interpreting (7d) in this way as we use in interpreting (7b) when it occurs in normal speech. Given the inherent finiteness of the human brain, in which the mind is implemented in some way, the only possibility to encode the infinity of language in competence is for competence to use a system of rules that involves recursion. Recursion occurs when a rule invokes itself, as central embedding does in (7). It is an empirical question how the rule system of competence can best be described. A hypothesis making this somewhat more concrete is found in (8). (8)
‘The I-language consists of a computational procedure and a lexicon. The lexicon is a collection of items, each a complex of properties (called “features”), such as the property “bilabial stop” or “artifact”. The computational procedure selects items from the lexicon and forms an expression, a more complex array of such features.’ [Chomsky (1995a: 15)]
In (8), Chomsky uses ‘I-language’ instead of competence, but as suggested in Figure 2.1, the two are equivalent. The computational procedure referred to in (8) is the system of rules producing expressions, e.g. sentences, by combining elements from the lexicon, e.g. words. These lexical items are associated with features carrying information about how to pronounce them and about what they mean. As stated in (4), Chomsky considers competence as a type of knowledge. Knowledge is meant here as contrasting with ability and skill. Every human being that is not seriously disabled has the competence of a natural language. Individual differences are minimal. Where differences do arise is in the skill of using this knowledge. Not everyone is a great poet or a good journalist or a fluent reader of legal prose. There are courses to extend one’s skills in such respects. They do not teach language competence, but techniques of using this competence to greater effect. The difference between knowledge and ability is discussed for instance by Chomsky (1988: 10f.). There have been cases of people losing their ability to speak after an accident that involves brain injury. If they regain this ability, this is not by the same process as learning a language in the first place. If Brian, a monolingual speaker of English, loses his ability to speak
because of brain injury in an accident in France and gradually recovers and starts speaking again, he will regain the ability to speak English. Even if the episode happens in a French hospital, and everybody around him speaks only French, there is no reason to expect that he will suddenly start speaking French instead of English. If his ability to speak can be regained relatively easily, it is because his competence was not affected. Cases like these have happened, showing that competence and the ability to use it are separate components of the brain functions. The use of ‘know’ in (4) does not imply a commitment to every aspect of the meaning of this verb in English. In the relevant sense, knowledge does not imply, for instance, conscious access. Every mature speaker of English will be able to work out the meaning of examples such as (7) by using their knowledge of English. Exactly what knowledge is used and how it is used is a different question. In case this causes confusion, Chomsky (1980a: 92) proposes to replace it by cognise to make this dissociation clear. As long as we accept that know is a technical term, however, there is no need to change our usage. The distinction between competence and performance is one between two real-life entities. As such they are both important as objects of research. However, while both have to be accounted for in the end, there is a clear sense of priority given to competence. Performance is seen as derived from competence, as expressed in (9). (9)
‘Performance, that is, what the speaker-hearer actually does, is based not only on his knowledge of the language, but on many other factors as well – factors such as memory restrictions, inattention, distraction, nonlinguistic knowledge and beliefs, and so on.’ [Chomsky and Halle (1968: 3)]
Already Chomsky (1965: 15) states that ‘There has been a fair amount of criticism of work in generative grammar on the grounds that it slights study of performance in favor of study of underlying competence.’ Giving priority to the study of competence does not logically imply the neglect of performance. In fact, competence is one of the key elements of performance. Chomsky even claims that ‘There seems to be little reason to question the traditional view that investigation of performance will proceed only so far as understanding of underlying competence permits’ (1965: 10). He does not specify the tradition this view is based on, but goes on to claim that ‘To my knowledge, the only concrete results […] concerning the theory of performance […] have come from studies of performance models that incorporate generative grammars […]’ (1965: 10). The reference to ‘performance models’ suggests that in 1965 Chomsky was only thinking of performance as language processing, i.e. procedures of interpretation and production. Another aspect of performance is what is studied in sociolinguistics. In a more recent discussion, Chomsky (1995a:
50) states that ‘As for sociolinguistics, it is a perfectly legitimate inquiry, externalist by definition. It borrows from internalist inquiry into humans, but suggests no alternative to it.’ Here ‘externalist’ means considering the human being from the outside and ‘internalist’ refers to the inner workings of the human mind. Therefore, Chomskyan linguistics is not incompatible with the study of performance, but it assumes that a theory of competence is a major factor in a theory of performance.
2.1.2 Grammatical versus pragmatic competence
The distinction between grammatical competence and pragmatic competence came up as a result of further discussion and analysis of the nature of the distinction between competence and performance. The reason for distinguishing the two is that although competence underlies performance, performance is not a straightforward reflection of competence. This raises the question of how performance diverges from this ideal and which factors are responsible for this divergence. We can distinguish at least three cases, listed in (10). (10)
a. Performance contains errors, violations of the constraints implied by competence and in most cases easily recognised on reflection.
b. Performance does not contain certain sentences although they are allowed by competence.
c. Performance draws on other types of knowledge to constrain the selection of what is possible according to competence.
An example of (10a) is what we see in the discourse in (5). Incomplete sentences and hesitation are the result of what in (9) is subsumed under inattention and distraction. There may be more felicitous characterisations of the underlying causes than inattention and distraction, but the phenomenon is clear. An example of (10b) is the difficulty in processing central embedding, as illustrated in (7). Theoretically, this category is highly important because it means that competence cannot be explained as a result of the imitation of perceived input. It also raises difficulties for data collection that will be discussed in Section 2.2. In (9), the reason that (10b) occurs is attributed to memory restrictions. As an example of (10c), consider the question in (11a) and the possible answers (11b-d). (11)
a. Do you know what time it is?
b. Yes.
c. It’s four thirty.
d. You’re right, we should go.
In most situations, (11b) will be used only jokingly and it will not be accepted as appropriate by the person asking (11a), even when it is logically correct. It may be used appropriately, for instance, when A and B are on a shopping trip and have to be back in time to pick up their children from school. In (11a), A seeks confirmation that B is taking responsibility for keeping an eye on the time, and in (11b) B gives this confirmation. The most common type of answer to (11a) is illustrated in (11c), although it is not a literally correct answer. Like (11b), however, (11c) can also be inappropriate and unhelpful in certain situations. Consider, for instance the situation where A and C are at a party and A wants to remind C that it is time to leave. Here (11d) would be more appropriate than (11c). The best choice among (11b-d) depends on various types of situational knowledge that Chomsky and Halle (1968) refer to as ‘nonlinguistic knowledge and beliefs’ in (9). As Newmeyer (1983: 35) states, ‘many things we “know” about language do not fall under competence’ in the sense intended by Chomsky. This observation has given rise to different extensions of the concept of competence. One approach is exemplified by Hymes (1971). He discusses this issue in the context of the assessment of the linguistic competence of socially advantaged and disadvantaged children. He introduces the term communicative competence, and although he does not give a formal definition, the quotations in (12) suggest what his intended use of this term is. (12)
a. ‘We have to account for the fact that a normal child acquires knowledge of sentences, not only as grammatical, but also as appropriate.’ [Hymes (1971: 10)]
b. ‘The grammatical factor is one among several which affect communicative competence.’ [Hymes (1971: 12)]
c. ‘competence must be the central concern. But arbitrary restriction of the domain of underlying knowledge can be ended.’ [Hymes (1971: 11)]
The claim in (12a) seems fairly uncontroversial. The question is how such an account should be structured. Hymes introduces communicative competence to cover all of the knowledge in (12a) and in (12b) makes competence as introduced by Chomsky in (4) a hyponym of communicative competence. In (12c), the restriction of competence to the concept in (4) is deemed ‘arbitrary’, which implies that his new concept of (communicative) competence should take over the role of leading principle in the study of language. Chomsky also noticed that there are other types of knowledge that are relevant to the use of competence, but he approaches the problem indicated by (12a) in a different way. He introduces the opposition between grammatical and pragmatic competence in (13).
(13)
‘For purposes of inquiry and exposition, we may proceed to distinguish “grammatical competence” from “pragmatic competence”, restricting the first to the knowledge of form and meaning and the second to knowledge of conditions and manner of appropriate use, in conformity with various purposes.’ [Chomsky (1980a: 224)]
Instead of introducing a superordinate term, as Hymes does in (12b), Chomsky introduces a co-hyponym in (13). Pragmatic competence is at the same level of generality as grammatical competence. This has immediate consequences for the role of (grammatical) competence in the study of language. Instead of a holistic approach, considering all types of knowledge reflected in performance as basically of the same kind, Chomsky maintains a difference between two types of knowledge that can be studied separately. Although Chomsky does not explicitly refer to Hymes (1971), the way the distinction between grammatical and pragmatic competence is formulated in (13) suggests that it is at least in part a reaction to the kind of attitude reflected in (12c). In (14), he describes these restrictions in more detail. 2
(14)
‘By “grammatical competence” I mean the cognitive state that encompasses all those aspects of form and meaning and their relation, including underlying structures that enter into that relation, which are properly assigned to the specific subsystem of the human mind that relates representations of form and meaning.’ [Chomsky (1980a: 59)]
The case for distinguishing different components of competence is of course strengthened if it is shown that they can appear individually. Yamada (1990) actually describes the case of a woman who is seriously mentally retarded and has a very limited pragmatic competence, although her grammatical competence is not affected. As a result she produces grammatical but inappropriate sentences. Conversely, Gopnik and Crago (1991) describe the case of a family of which half of the members over three generations suffer from a type of aphasia. They cannot produce grammatical sentences, struggling in particular with verb and noun endings, but otherwise they can communicate normally. The existence of cases in which only one type of competence is affected provides a strong argument for making the distinction between them and rejecting Hymes’s suggestion in (12). A problem associated with every introduction of a new conceptual division is the question of where to draw the boundary between the two concepts. Newmeyer states in this context that ‘for any given linguistic phenomenon, no hard-and-fast criterion exists to decide which aspects of that phenomenon should fall under competence’ (1983: 37), where ‘competence’ refers to grammatical competence. The question is how serious this problem is in the present
context. Ultimately, the goal is to account for as many aspects of performance as humanly possible. The introduction of the distinction between grammatical competence and pragmatic competence is meant to bring this goal closer. By identifying two more internally coherent domains, the idea is to make it easier to account for each of them. In the end the theories developed for each domain should determine where individual phenomena can be accounted for more easily.
2.1.3 I-language versus E-language
The distinction between I-language and E-language was introduced by Chomsky (1986a) in order to clarify the notion of competence as developed up to then. Without giving a precise definition or concise description, he suggests the meaning of E-language in (15).
(15) a. ‘Structural and descriptive linguistics, behavioral psychology, and other contemporary approaches tended to view a language as a collection of actions, or utterances, or linguistic forms (words, sentences) paired with meanings, or as a system of linguistic forms or events.’ [Chomsky (1986a: 19)]
     b. ‘Let us refer to such technical concepts as instances of “externalized language” (E-language), in the sense that the construct is understood independently of the properties of the mind/brain.’ [Chomsky (1986a: 20)]
The long disjunction in (15a) shows that E-language is by no means a unified concept. What its different manifestations have in common, as (15b) states, is that they are independent of the mental concept of competence. The two statements in (15) are separated by half a page of exemplification. Of these examples, we will encounter the theories proposed by Bloomfield and Zellig Harris in Chapter 3. The corresponding concept of I-language is introduced in (16).
(16) ‘Let us refer to this “notion of structure” as an “internalized language” (I-language). The I-language, then, is some element of the mind of the person who knows the language.’ [Chomsky (1986a: 22)]
The ‘notion of structure’ refers to an expression Jespersen (1924: 19) uses to refer to what we have called competence here. 3 Although Smith (1999: 38) warns against ‘confusion concerning the difference between competence and I-language’, the two names refer to what is basically the same concept. Smith argues that competence is informal usage and I-language is a formal term, but neither is formally defined and standardised and either can be treated in this way. The main difference between competence and I-language is that as
words in English they have different grammatical and selectional properties, so that we can speak, for instance, of several I-languages but not several competences. While I-language refers to a familiar concept, the distinction between I-language and E-language is of a fundamentally different type compared to the oppositions discussed in previous sections. Chomsky explains this point in (17a, c).
(17) a. ‘The technical concept of E-language is a dubious one
     b. in at least two respects. In the first place, as just observed,
     c. languages in this sense are not real-world objects but are artificial, somewhat arbitrary, and perhaps not very interesting concepts.’ [Chomsky (1986a: 26)]
The other of the ‘two respects’ in (17b) is not relevant here. It refers to the context of (17) in a larger argument concerning the development of the field of linguistics in the history of thought. Confusion about the nature of E-language has taken two forms. First, its introduction has been interpreted as a change of mind on Chomsky’s part, amounting to a reorientation of Chomskyan linguistics. Secondly, E-language has been interpreted as another name for performance. While these two positions are mutually incompatible, neither corresponds to Chomsky’s intention in introducing E-language as a new term. The suggestion that in earlier work Chomsky had taken a different attitude to the object of linguistics is based on statements such as (18), which occur in various places in older publications.
(18) ‘We may think of a language as a set of sentences, each with an ideal phonetic form and an associated intrinsic semantic interpretation.’ [Chomsky and Halle (1968: 3)]
In (18), ‘language’ corresponds to a ‘collection of linguistic forms paired with meaning’ in (15a), i.e. to E-language. At the same time, as we have seen in Section 2.1.1, Chomsky proposes that the object of linguistics should be competence. These two positions are not in contradiction, however, as long as we admit that (18) does not claim that language in the sense defined here is the object of linguistics. Chomsky makes this explicit in (19).
(19) ‘The fundamental cognitive relation is knowing a grammar; knowing the language determined by it is derivative.’ [Chomsky (1980a: 70)]
The term grammar used in (19) will be discussed in more detail in Section 2.3.1. Here it refers to the system in the mind of the speaker of a language. This system enables a speaker to produce sentences of the language, understand sentences produced by others, and determine whether a particular sentence is part of the language. The reason for using ‘grammar’ rather than ‘competence’ in (19) is that the English word competence cannot be a direct object of the verb know in this context. By the introduction of the new pair of terms in (15) and (16), it is now possible to reformulate (19) as ‘The fundamental cognitive relation is knowing an I-language; knowing the E-language determined by it is derivative.’ Thus, ‘language’ in (18) and in (19) corresponds to E-language. This interpretation is supported by Chomsky’s (1986a: 29) explanation that ‘In the literature of generative grammar, the term “language” has regularly been used for E-language in the sense of a set of well-formed sentences’ and coincides with Matthews’s analysis of this change as merely a ‘change of terminology’ (1993: 237). The second type of confusion interprets the introduction of I-language and E-language as a change of terminology for the old distinction between competence and performance. We have already seen that there is a change of terminology involved and that competence corresponds to I-language, but performance and E-language are quite different entities. If we take (15) as determining the meaning of E-language, it is difficult to make sense, for instance, of Lust’s (1999: 136) reference to ‘the length constraint on children’s language (affecting their E-language)’ in the context of the study of language acquisition. A child does not have an E-language, but a child’s performance is indeed marked by short utterances. 
Ooi (1998: 4) even explicitly equates the two when he states that ‘the E-language (‘externalised language’) or linguistic performance may be regarded as everyday speech and writing’ in his introduction to corpus linguistics. Given the description of performance in (9) and of E-language in (15), we can describe the difference between them as follows. Performance is a real-life entity. It can be embodied in sound waves that can be recorded on a tape, or in ink blots on a piece of paper, or in a digital code on a computer, but in any case there is a natural entity to turn to for verification. What can be verified, for instance, is whether a particular sentence occurs in a particular book or speech. E-language, by contrast, is not a real-life entity. E-language is ‘understood independently of the properties of the mind/brain’, as (15b) states, in a sense that performance is not. Performance depends on competence. E-language is an abstract entity, not embodied in any way. This also implies that there is no natural authority to turn to in order to verify whether a particular sentence is part of an E-language.
2.1.4 Conclusion
The discussion of three oppositions involving the concept referred to as competence, grammatical competence, or I-language has clarified the nature of this concept as well as the nature of the oppositions it is involved in. It is this concept that is the answer to question (1a) in Chomskyan linguistics. Based on Figure 2.1, the representation in Figure 2.2 shows the way performance, pragmatic competence, and E-language are related to the central object of study in Chomskyan linguistics.
Figure 2.2: Classification of the oppositions of competence [diagram not reproduced; its labels are Performance, Competence / grammatical competence / I-language, pragmatic competence, E-language, Mind, and World]
The main distinction between competence and performance is that competence is a mental object and performance is not. Grammatical competence and pragmatic competence are similar in the sense that they are both mental objects and they both underlie performance. Compared to E-language, the other three objects are similar because they are real-life objects, existing in the concrete world. E-language has no place in the concrete world and is an abstract object. For the study of language, this means that (grammatical) competence, pragmatic competence, and performance are all to be accounted for in the end. Grammatical competence and pragmatic competence can be used to account for many aspects of performance. The boundary between grammatical competence and pragmatic competence is not given a priori. It will ultimately follow from the best theories for these two entities. E-language does not have any direct relationship to the study of natural language, because it is not a natural object.
SUMMARY
• Competence is the speaker’s knowledge of language. It is the focus of attention in Chomskyan linguistics.
• Performance is the result of using language. It is the product of the interaction of competence with other types of knowledge, filtered and sometimes distorted by various cognitive constraints.
• Grammatical competence is a more precise name for competence, in particular when it is opposed to pragmatic competence.
• Pragmatic competence is knowledge about how to make communication successful. It interacts with grammatical competence but does not concern the structure of language as such.
• I-language is an alternative name for competence. Its different distributional properties in English make it more suitable in some statements (e.g. ‘a number of I-languages’).
• E-language is a notion of language, for instance as a set of sentences with their associated meanings, that is independent of any mental knowledge component.
• Grammatical and pragmatic competence are mental objects. Performance is a non-mental entity. They are all entities in the real world. E-language is an abstract entity with no real-world correlate.

2.2 The nature of data
The selection of data for a particular scientific investigation depends on the questions investigated. Once a particular topic has been determined, all data that are accessible and can give information about the topic are taken to be relevant. Chomsky stated this on many occasions, e.g. (20).
(20) ‘As in the case of any inquiry into some aspect of the physical world, there is no way of delimiting the kinds of evidence that might, in principle, prove relevant.’ [Chomsky (1986a: 37)]
Data collection methods can in general be divided into gathering naturalistic data and experimentation. Nagel (1961) refers to the former as ‘observing things and events encountered in common experience’ in (1a) of Chapter 1. The systematic designing and carrying out of controlled experiments has become common from the sixteenth century onwards (cf. Rossi, 1997).
Both methods are in principle available to linguistics. The main restriction on carrying out experiments is ethical. Experimentation on humans is limited to procedures that do not affect the life of the subjects negatively. Thus, it is not possible in practice to ‘inquire into the operative mechanisms by intrusive experimentation’, as Chomsky (1980a: 197) states. Despite this general view of data collection, Chomskyan linguistics has come to be associated with work on introspective grammaticality judgements. This method is often thought of as the direct opposite of corpus-based work. Therefore, Sections 2.2.1 and 2.2.2 will address these two methods in detail. The main difference between the two is that the former is experimental and the latter naturalistic. Another type of experimental data is what is collectively referred to as psycholinguistic experiments. They will be discussed in Section 2.2.3.
2.2.1 Grammaticality judgements
A grammaticality judgement is an intuitive assessment by a speaker of whether a particular linguistic unit, typically a sentence, is grammatical. An example of the use of grammaticality judgements is (21).
(21) a. There is a book on the table.
     b. *There is the book on the table.
Whereas (21a) is grammatical, the star in front of (21b) indicates ungrammaticality. The examples in (21) are typical of the use of grammaticality judgements in the sense that they often appear as contrasts, used to explore the borderlines of grammaticality in a particular area. The person giving the judgements is a speaker of the language, either the linguist doing the research (introspection), or an informant. Chomsky’s statement in (22) describes this state of affairs.
(22) ‘linguistics as a discipline is characterized by attention to certain kinds of evidence that are, for the moment, readily accessible and informative: largely, the judgments of native speakers. Each such judgment is, in fact, the result of an experiment, one that is poorly designed but rich in the evidence it provides.’ [Chomsky (1986a: 36)]
As is clearly recognised in (22), grammaticality judgements constitute an experimental method for gathering data. The advantage of this method, in particular when the informant is the linguist doing the research, is that it is possible to collect the most relevant data in any desired quantity and in an efficient way. By coming up with the pair in (21) the linguist indicates that the definiteness of the article in the book is what makes (21b) ungrammatical. Other, similar pairs can be constructed at will to confirm this result. Further
exploration can concentrate on minimal variation in the examples, looking at different ways definiteness is expressed or different discourse functions definiteness can have. In introducing the concept of grammaticality, Chomsky (1957: 15) emphasises the difference from acceptability. He gives the examples in (23) to illustrate the contrast. 4
(23) a. Colorless green ideas sleep furiously.
     b. *Furiously sleep ideas green colorless.
     c. Have you a book on modern music?
     d. *Read you a book on modern music?
Whereas (23a-b) are difficult to interpret, (23c-d) are readily understandable. Nevertheless, (23a) is grammatical and (23d) not. The example in (23a) has become so famous that various attempts have been undertaken to assign a meaning to it. The fact that this is possible illustrates its contrast with the ungrammatical (23b). The fact that specific attempts are necessary highlights the contrast with (23d), which is not difficult to interpret but violates grammatical rules. The existence of examples such as (23a) and (23d) shows that grammaticality and acceptability are independent. Chomsky links them to the contrast between competence and performance in (24).
(24) ‘The notion of “acceptable” is not to be confused with the notion of “grammatical”. Acceptability is a concept that belongs to the study of performance, whereas grammaticalness belongs to the study of competence.’ [Chomsky (1965: 11)]
What is meant by (24) is that a grammaticality judgement implies an abstraction from aspects not relevant to competence. The difference between (23d) and (23a) is that the former violates a constraint imposed by competence, whereas the latter only violates other types of constraint. As another example, while the repeated use of central embedding, illustrated in (7), rapidly makes a sentence unacceptable, no degree of central embedding can make a sentence ungrammatical. Of course, there are many constraints on sentences, both in competence and in other factors underlying performance. It is therefore not surprising that ‘Like acceptability, grammaticalness is, no doubt, a matter of degree’, as Chomsky (1965: 11) states. It is clear, for instance, that (23b) is more ungrammatical than (23d), because it violates more constraints. Other constraints may be relatively weak, leading to a kind of ‘semi-grammaticality’. Therefore, there is no clearly delimited set of grammatical sentences constituting the E-language. Chomskyan linguistics wants to account for the constraints, not for the sentences. This is captured by Lyons in (25).
(25) ‘for Chomsky the intuitions of the speaker (that is to say his mental representation of the grammar of the language), rather than the sentences themselves, are the true object of description.’ [Lyons (1970: 87)]
Somewhat confusingly, Lyons uses the expression ‘the intuitions of the speaker’ in (25) to refer not to any kind of data, but to the ‘mental representation of the grammar of the language’, i.e. the competence. Grammaticality judgements and other types of data only give evidence for this competence, which is the ‘true object of description.’ For many people it is difficult to accept that intuitive judgements can be used as scientific data. The obvious problem is that the data cannot be evaluated objectively. In a science like medicine, much emphasis is placed on validating results, for instance by means of large-scale, double-blind trials. The purpose of the special measures surrounding such tests is to increase the objectivity of the resulting data. The reason for the difference in emphasis as to the objectivity of the data lies in a number of differences between medicine and linguistics. First, the economic impact of the results is more important in medicine. A pharmaceutical company can have significant gains or losses depending on the outcome of the experiment. Moreover, public health could be jeopardised by releasing the wrong or not releasing the right medicine. Neither of these reasons applies to linguistics, which is not an applied science and whose results do not have great commercial value or large-scale social impact. A third reason is of a different kind. In medicine, what is investigated is a chemical or biochemical process. Arguably, linguistic competence is ultimately to be reduced to biochemical properties of the brain, but as yet no causal relations between observations concerning competence and observations of biochemical properties of the brain have been discovered. This means that statistical evaluation of large numbers of grammaticality judgements is of little value. It would average quantities for which no physical correlate is known. 
There is no physical object corresponding to the ‘common competence’ of speakers of English which would underlie similarity in grammaticality judgements (cf. also Section 5.1). Opinion polls would not strengthen the results in (21). There are two potential problems with the use of intuitive judgements, whether introspective or based on informants. The first is that it is not possible to check whether the judgements are honest and genuine. If I propose a theory and you give a counterexample there is no way I can prove your counterexample is wrong. I can try to explain your judgement in terms of performance factors, but there is no formal procedure to demonstrate that it is not a matter of competence. We have to keep in mind, however, that data alone can never refute
a theory. They can only become a threat when there is an alternative theory that explains them. This is a general point of research programmes, discussed in Section 1.3.1 and highlighted in Kuhn’s (9) in that section. Hornstein and Lightfoot (1981: 15), among others, restate this point for linguistics. The second problem with the use of intuitive judgements is that they can be rather subtle. An example is data about so-called parasitic gaps. The examples in (26) are taken from Chomsky (1986b: 58).
(26) a. he’s a man that [everyone who [gives presents to e]] likes t
     b. *he’s a man that [any present [they’ll give to e]] will please t
The examples in (26) contain brackets to indicate constituents, and non-pronounced elements e and t to indicate positions where a noun phrase coreferring with man is to be interpreted. Some speakers do not feel the contrast in (26) at all. Others observe a fairly robust contrast. Importantly, the judgements of the two groups about various sentences with parasitic gaps are internally consistent. This means that the phenomena are real. Unless there are clear indications to the contrary, there is no reason to assume that disagreement about grammaticality judgements is a matter of dishonesty or bad practice on the part of one of the parties. They may be real differences of the I-language. Different people cannot share one I-language, because every I-language is materialised in one mind/brain. Alternatively, the disagreement may be caused by different analyses of what is in the competence and what belongs to other performance factors. Newmeyer (1983: 51–54) argues that most of the conflicts over grammaticality judgements are not conflicts about data but about whether certain factors that influence the judgements are attributed to competence or not. Finally, it should be emphasised that grammaticality judgements and other linguistic intuitions of native speakers obtain their prominence only because of the factors mentioned in (22). The experiments are practical and rich in output, but in no way privileged, as implied by (20). Chomsky states this explicitly in (27).
(27) ‘The perceptual judgments called ‘linguistic intuitions’ are also just data, to be evaluated alongside other kinds: they do not constitute the data base for the study of language.’ [Chomsky (1997: 12)]

2.2.2 Corpus data
A corpus is a collection of performance data. In classical American linguistics, corpora of spoken utterances were widely used in the study of American Indian languages. From the 1980s onwards, corpora of written texts, stored in
electronic form, have been used for the study of various linguistic questions, especially in Europe. It is widely assumed that Chomskyan linguistics rejects the use of corpus data because it is not interested in performance. In Section 2.1.1 we have seen that Chomskyan linguistics does not deny that performance is interesting and should be explained as far as possible. We should therefore not be too surprised to find statements such as (28).
(28) ‘Clearly, the actual data of linguistic performance will provide much evidence for determining the correctness of hypotheses about underlying linguistic structure, along with introspective reports.’ [Chomsky (1965: 18)]
In (28) the use of corpus data is at least in principle placed at the same level as the use of grammaticality judgements. In fact, (28) is no more than an explicit statement of a special case of (20). A stronger claim seems to appear in (29).
(29) a. ‘The English language, like all the natural languages, consists of an indefinitely large number of sentences, only a small fraction of which have ever been uttered or will ever be uttered.
     b. The grammatical description of English may be based upon a corpus of actually attested utterances,
     c. but it will describe these, and classify them as ‘grammatical’, only incidentally as it were, by projecting them on to the indefinitely large set of sentences which constitutes the language.’ [Lyons (1970: 38)]
(29b) suggests the possibility of a ‘Chomskyan corpus linguistics’. The relationship between corpus and language stated in (29c) is generally accepted in corpus linguistics. The corpus is derived from the language as a sample, which makes it an adequate tool for studying the language. An essential element in making (29b) compatible with Chomskyan linguistics is the definition of ‘English’ in (29a). ‘The English language’ as described in (29a) is an E-language. According to (29b), a corpus-based grammatical description of this E-language is possible. However, as we saw in Section 2.1.3, Chomskyan linguistics is not interested in E-language. Therefore, (29) does not open the perspective of corpus linguistics in a Chomskyan framework. There are two problems with using a corpus. One problem is illustrated in (30).
(30) ‘A full service of your toilets are at 18.30’ [notice in some toilets at Swansea University]
Examples such as (30) occur in a corpus. We see immediately that (30) is ungrammatical. The treatment of such examples is determined by what is considered more important, their occurrence or their ungrammaticality. It is clear
that the ungrammaticality reflects competence whereas the occurrence is caused by other factors in performance. It is for this reason that Chomsky relativises the use of corpus data in (31).
(31) a. ‘A corpus may contain examples of deviant or ungrammatical sentences, and any rational linguist will recognize the problem and try to assign to observed examples their proper status. […]
     b. insofar as a corpus is used as a source of illustrative examples, we rely on the same intuitive judgments to select examples as we do in devising relevant examples with the aid of an informant (or ourselves).’ [Chomsky (1980a: 198–199)]
As stated in (31a), it is not possible to use a corpus as a source of data on competence without also using intuitive judgements to determine the status of the examples found in the corpus. We can use ungrammatical sentences found in the corpus, but only in combination with the information that they are ungrammatical. A second problem in the use of corpora is indicated in (31b). Not every example is equally informative. This aspect is elaborated in (32), a later statement by Chomsky, taken from an interview.
(32) a. ‘the corpus doesn’t matter, it’s like the phenomena that you see out of the window.
     b. If you can find something in the corpus that is interesting, great. Then you’ll explore that with what amounts to doing experiments.
     c. But in fact, a lot of the most interesting work has been on things that nobody ever says, like parasitic gaps, for example.’ [Chomsky (2002: 128)]
In order to account for competence it is not enough to have a number of isolated examples. As (32b) states, such examples are at most a starting point. Consider again the examples in (21). The grammatical (21a) might occur in a corpus. What is important is then to explore how small variations such as (21b) affect the grammaticality. In fact, it is because of the contrast in grammaticality that the examples in (21) are remarkable. In (32c) this limitation of the role of the corpus is emphasised even more. Parasitic gaps, illustrated in (26), are of particular interest because they do not occur in a corpus but still yield robust enough grammaticality judgements. The crucial problem of working with corpora is to extract relevant examples. As Chomsky states, ‘The problem of determining what data is valuable and to the point is not an easy one. What is observed is often neither relevant nor significant, and what is relevant and significant is often very difficult to observe’ (1964: 28, fn.).
The role of the corpus is summarised in (32a) by a parallel with physics. The phenomena we see around us are related to the data in physics, but physicists do not sit and wait for something to happen before they start work. Moreover, they prefer experiments to the naturally observed phenomena because in experiments the different factors involved can be controlled. Therefore a corpus has no specific value as an authority. Nevertheless, many Chomskyan linguists will recognise Hoekstra’s feeling when in the acknowledgement section of his PhD thesis he apologises to his wife ‘for the countless conversations I interrupted to ask her a question relating to the form of what she said, when the contents deserved my undivided attention’ (1984: x). Naturalistic data, including data from a corpus, are widely used as a source of inspiration, but they are at most a starting point for research on competence.
2.2.3 Psycholinguistic experiments
The use of grammaticality judgements is a quick and reliable method to get a lot of relevant data for a particular phenomenon. However, the conscious, introspective nature of these data may make them problematic. In doubtful cases, for instance parasitic gaps, judgements may become somewhat indeterminate. They may also, consciously or unconsciously, be influenced by the theory one adopts. It is rather easy to convince oneself that a particular sentence is grammatical if that fits the theory. In addition, there are the problems of interpersonal differences and judgement fatigue. Interpersonal differences do not constitute a theoretical problem. Every speaker has their own competence yielding an in principle independent set of judgements. They do constitute a practical problem, however. In doubtful cases, asking a colleague or another informant is helpful, but only if the general picture that emerges is consistent. Judgement fatigue strikes especially with doubtful examples. Asking for too many judgements in short succession may blur the informant’s intuition. It is not surprising, then, that other types of experimental techniques have been tried to supplement the collection of grammaticality judgements. While it is obvious that (20) allows for such techniques in principle, they have to be calibrated in some way. An early formulation of this need is found in (33).
(33) ‘we evaluate the success and relevance of an operational test (just as we evaluate the success of a generative grammar) by asking how well it corresponds to the given data.’ [Chomsky (1961: 227)]
A well-documented example of the method of calibration concerns the use of the so-called click paradigm to determine constituent boundaries. The research question in this case is to determine the boundary between major constituents in a sentence. Some examples of relevant cases are (34) and (35).
(34) a. Adèle [α loves Benjamin]
     b. [β Adèle loves] Benjamin
(35) a. Charles forces [α Daphne to stay]
     b. [β Charles forces Daphne] to stay
     c. Eliana believes Floris to be open-minded
In (34) the distinction between the two divisions is clear. In (34a), α is the VP. In (34b), β is not a constituent. For the examples in (35), the choice between two analyses is less obvious. In a sense, Daphne in (35a-b) is at the same time the subject of the embedded verb, suggesting that it is part of the embedded sentence α in (35a), and the object of the matrix verb, which would make it part of the matrix sentence β in (35b). In (35c), a similar decision has to be taken, but the semantics of the verb suggest that a structure analogous to (35a) is more likely. The ideal use of an experimental technique in a case like this would then be to show that it gives the expected result in (34) and clear results in (35).
Fodor et al. (1974: 252) ascribe the discovery of click effects to Ladefoged and Broadbent (1960). They carried out experiments in which the subjects heard a stretch of speech through one channel of their headphone and a noise (the ‘click’) through the other one. They were then asked to tell at what position of the stretch of speech they heard the click. Ladefoged and Broadbent noticed that errors were significantly larger when the input was a sentence than when it was a sequence of numbers. Fodor and Bever (1965) were the first to link this effect to constituent boundaries. They formulate the hypothesis in (36a) and derive (36b).
(36) a. ‘The unit of speech perception corresponds to the constituent.’ [Fodor and Bever (1965: 415)]
     b. ‘Noise heard during speech should tend to shift perceptually towards the boundaries of constituents. This shift should occur in such fashion as to minimize the number of constituents the noise is perceived as interrupting.’ [Fodor and Bever (1965: 416)]
What is described in (36b) is a prediction that can be tested by means of click experiments. It implies that people will tend to perceive clicks at major constituent breaks, e.g. α in (34a), even when objectively they are simultaneous with a preceding or following syllable. However, before concluding that click experiments demonstrate (36a) it should be established that the results do not reflect other factors of perception than constituent boundaries. Fodor and Bever address the question of pauses between words and show that ‘the pausal characteristic of the speech signal cannot be the sole factor tending to determine the subjective placement of clicks’ (1965: 419). Fodor et al. (1974: 329–335)
describe a variety of other factors tested, including memory constraints and prosody. They conclude that the click experiments do indeed correlate with constituent structure. This part of the calibration effort only discards explanations of the results of click experiments that would claim that they show nothing at all about constituent structure. The next question to ask is exactly what it is that they show about constituent structure, i.e. we can now test whether (36b) holds. This question is still part of calibration. Fodor and Bever (1965) hypothesise that all constituent boundaries are equally important, which implies that in (34), the position before the start of α attracts clicks more strongly than the position after loves, because in the latter position the click interrupts the VP and in the former it does not. Fodor et al. (1974: 336–338) describe various efforts to delimit the type of boundary found by click experiments. In the course of these experiments, (36b) was largely disproved. It was found, for instance, that the contrast in (34) could not be demonstrated with click experiments. Fodor and Bever (1965) worked with examples such as ‘That he was happy was evident from the way he smiled’ and found a major point of attraction after happy. This suggested that instead of boundaries of all constituents, it might be only clausal boundaries that attract clicks. Bever et al. (1969) explored this hypothesis. They used sentences similar to the ones in (35) to test it. This means that they assumed a particular analysis of (35) as correct and tested their hypothetical interpretation of the click experiments against this assumption. At this point it is clear that click experiments have a very limited value in giving evidence about constituent structure. Instead of telling us whether (35a) or (35b) is correct, we have to assume one of them in order to test the validity of the experimental technique. 
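What (36b) predicts for a bracketing such as (34) can be spelled out in a short sketch. It is only an illustration of the hypothesis, which, as noted above, was largely disproved; the bracket notation and the function names parse_brackets, interruptions and predicted_gap are assumptions made here, not part of the experimental procedure of Fodor and Bever.

```python
def parse_brackets(text):
    """Split a bracketed string into words and constituent spans.

    Spans are (start, end) word indices, with `end` exclusive.
    """
    words, spans, open_starts = [], [], []
    for token in text.replace("[", " [ ").replace("]", " ] ").split():
        if token == "[":
            open_starts.append(len(words))
        elif token == "]":
            spans.append((open_starts.pop(), len(words)))
        else:
            words.append(token)
    return words, spans


def interruptions(text):
    """Map each gap between adjacent words to the number of bracketed
    constituents that a click in that gap would interrupt."""
    words, spans = parse_brackets(text)
    return {
        (words[gap - 1], words[gap]): sum(
            1 for start, end in spans if start < gap < end
        )
        for gap in range(1, len(words))
    }


def predicted_gap(text):
    """Gap towards which (36b) predicts clicks to shift: the one that
    interrupts the fewest constituents (the first such gap on a tie)."""
    counts = interruptions(text)
    return min(counts, key=counts.get)


# Hypothetical usage with the bracketing of (34a), labels omitted.
print(interruptions("Adèle [loves Benjamin]"))
print(predicted_gap("Adèle [loves Benjamin]"))
```

For the bracketing in (34a), the gap before loves interrupts no bracketed constituent and the gap after loves interrupts one, so the sketch selects the former. Under the bracketing in (34b) the counts are reversed, which is why the outcome of a click experiment could in principle have distinguished the two analyses.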
The status of the hypotheses in (36) is what Botha (1981: 327) calls a bridge theory. A bridge theory explains the relationship between the object of investigation, i.e. the speaker’s competence, and the phenomena observed in the experiment. Arguably, a bridge theory is used whenever data are taken to be relevant. For the use of grammaticality judgements the bridge theory is small and relatively uncontroversial. For click experiments, the discussion never transcended the level of the validation of the bridge theory. Until recently, all psycholinguistic experiments were based on the analysis of reactions of a subject to a certain stimulus. Subjects were exposed to linguistic input, e.g. sentences, and asked to react according to certain instructions, e.g. press a button or write down where they heard a click. A more recent development is neuroanatomical research. With the help of fMRI and PET imaging techniques, 5 the activity of individual areas of the brain can be made visible.
In principle, brain images promise to be a very useful source of linguistic evidence. Chomsky (1986a: 39f.) suggests that evidence of this type may be used to choose between two theories that are compatible with the same set of grammaticality judgements. Chomsky (1997: 12) refers to ‘observations of electrical activity of the brain, which has become quite suggestive in recent years.’ For the moment, however, this potential is not quite realised yet. On the basis of a large-scale analysis of experiments that have been carried out in this domain, Friederici (2002) proposes a processing model. This model involves steps such as the identification of word forms and word categories as well as syntactic structure building and thematic role assignment. These steps are placed in a structure (2002: 79) identifying them with a time slot (measured in milliseconds after the input has been given) and a brain area. Measurement of event-related brain potentials (ERPs) leads to the identification of a number of typical patterns associated with the recognition of certain types of errors. Thus, a pattern called ELAN ‘correlates with rapidly detectable word-category errors’, another pattern, called LAN, ‘correlates with morphosyntactic errors’, and a third, P600, ‘correlates with outright syntactic violations’ (2002: 81). 6 What the overview by Friederici (2002) shows most of all is the type of question addressed by current neuroanatomical study of language. The questions are related to the major components of information and their interconnections. They take components such as syntax and phonology as units, without considering their internal organisation. As a consequence, the results have no bearing on, for instance, the structure of sentences such as (35). 
Neuroanatomical research may be more promising than classical psycholinguistic research as exemplified by the click experiments, but whereas the latter never managed to deliver results beyond the calibration of the method, the former has not even entered this stage yet. It is for the moment concerned only with processing at a rather general level. In Chomskyan linguistics, however, as we will see in Section 2.4.1, processing is considered as secondary to the nature of the knowledge of language. We have to know what the competence is before we can study how it is used.
2.2.4 Conclusion
Everything that can give us insight into the nature of competence can be used as evidence. To the extent that grammaticality judgements have a special status in practice, this is because of their easy accessibility and the richness of the evidence they provide. Using them, a linguist can quickly delimit the nature of a phenomenon by identifying whether sentences with minimal differences are grammatical or not.
The use of naturalistic data, i.e. a corpus, is in general much less rewarding. There is no principled objection to its use, but it can never be used without supplementing it with grammaticality judgements. The most fruitful use is as a source of inspiration. Grammaticality judgements constitute a type of experimental data. Other types of experimental data can be collected by more elaborate test setups. Psycholinguistic experiments have to be validated against intuitive data before they can be used. In the case of click experiments, this validation proved so problematic that the actual results did not add much to what could be found by means of simpler experiments such as grammaticality judgements. Brain research may in future supplement the data available to the linguist. At this point it has not reached the level of sophistication that is required to address the questions linguists working in Chomskyan linguistics would be interested in.
SUMMARY
• In principle, any type of data can be used in Chomskyan linguistics.
• Grammaticality judgements constitute a powerful and efficient source of experimental data, but they are not a priori privileged as data.
• The difference between grammaticality and acceptability parallels the one between competence and performance.
• Corpora are useful as a source of inspiration for setting up experiments. They can only be interpreted by using intuitions, because they may contain ungrammatical sentences without special marking.
• Certain types of psycholinguistic experiment measure the reaction of subjects to particular input after specific instructions. Click experiments are an example.
• The value of click experiments is very limited, because it is difficult to determine the theoretical correlate to what they measure.
• Neuroanatomical experiments measure the activity of sections of the brain when particular input is processed. Currently the results are too crude to be of much theoretical value.
2.3 The function of grammars
The position of grammars in the framework of Chomskyan linguistics is summarised in (37).
(37) ‘A grammar can be regarded as a theory of a language; it is descriptively adequate to the extent that it correctly describes the intrinsic competence of the idealized native speaker.’ [Chomsky (1965: 24)]
Given the discussion of the nature of competence in Section 2.1 and of data in Section 2.2, (37) indicates that at a certain level of abstraction we can represent the model of the research programme of Chomskyan linguistics as in Figure 2.3.
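[Figure 2.3 (diagram): ovals on the left for Competence and Observable facts, rounded rectangles on the right for Grammar and Observations. The grammar ‘describes’ competence, observations ‘describe’ observable facts, and grammar and observations are linked by ‘test’ and ‘explains’.]
Figure 2.3: Competence, grammar, and data in the model of Chomskyan linguistics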
Figure 2.3 summarises the answer to question (2a) on the status of a theory of a language. As stated in (37), the grammar describes the competence and can be regarded as a theory. The two rounded rectangles on the right in Figure 2.3 correspond to the theory and observations in the empirical cycle represented in Figure 1.3. Empirical laws are not represented in Figure 2.3, but they can be thought of as part of the processes marked ‘test’ and ‘explains’. As an example, if we study the phenomenon of presentational there exemplified in (21), empirical laws may take the form of generalisations over contexts in which presentational there is possible. The possibilities of testing the grammar and explaining the observations follow from the empirical cycle, as embedded in the research programme in Figure 1.7. The ovals on the left in Figure 2.3 represent the parts of the world corresponding to the theoretical entities on the right. The label
‘observable facts’ was chosen to highlight that every observation corresponds to a fact and that it is not possible to limit these facts to, for instance, grammaticality judgements. The choice of this label rather than, for instance, linguistic facts is not meant to suggest that competence underlies every fact in the world, but acknowledges that facts do not come with labels indicating whether they are relevant for the study of competence. The research programme of Chomskyan linguistics leads researchers to select certain types of observations because the facts they are based on are taken to give evidence about competence. The relationship between grammar and competence raises some further issues, elaborated in Section 2.3.1. Two other points are suggested by the formulation in (37). First, it refers to the ‘idealized native speaker’. This concept is explained in Section 2.3.2. Second, it refers to ‘descriptive adequacy’. This suggests the possibility of different types of adequacy, a question taken up in a general sense in Section 2.3.3 and in more detail in Section 2.4.
2.3.1 Grammar and competence
One of the issues arising from the use of grammar as the name for the theoretical description of competence is the relationship to the traditional notion of a descriptive grammar. Chomsky describes this difference in (38).
(38) ‘A traditional grammar has serious limitations so far as linguistic science is concerned. Its basic inadequacy lies in an essential appeal to what we can only call the ‘linguistic intuition’ of the intelligent reader.’ [Chomsky (1962a: 528)]
The statement in (38) and many similar statements elsewhere should not be interpreted as a complete rejection of traditional grammar. First, Chomsky (1965: 5) considers the descriptions in traditional grammars as a valid starting point for linguistic research. Second, Chomsky (1980a: 133) recognises that traditional grammars have a different purpose, for which the appeal to ‘the linguistic intuition of the intelligent reader’ is entirely appropriate. An example illustrating the difference in goals and in treatment is the discussion of the construction with there in sentences such as (21). Quirk et al. (1985: 1402–1411) treat this construction in some detail. They concentrate on the discourse function of the construction and give extensive exemplification of its use. The contrast illustrated in (21) is dealt with by the constraint that ‘the clause concerned has an indefinite subject (but cf Note [a])’ (1985: 1403). Note [a] gives a number of contexts in which definite noun phrases are allowed. They are described in terms of examples and loose generalisations of the type ‘This limitation can be waived, however, where the definite noun phrase conveys new information’ (1985: 1404). Reference grammars such
as these are on the one hand rich sources of information, on the other not explicit enough in their descriptions to be used as a grammar in the context of Figure 2.3. The descriptions of contexts in which the there-construction is used are adequate for speakers of English who want to know how they can use it effectively, but not as an account of the implicit knowledge these speakers have in their competence.
Another point concerning the use of grammar as a term in Chomskyan linguistics is its ambiguity. In writings up to the mid 1980s the term is used in two different senses, as stated, for instance, in (39).
(39) a. ‘We use the term “grammar” with a systematic ambiguity.
     b. On the one hand, the term refers to the explicit theory constructed by the linguist and proposed as a description of the speaker’s competence.
     c. On the other hand, we use the term to refer to this competence itself.’ [Chomsky and Halle (1968: 3)]
The sense of grammar referred to in (39b) corresponds to the way it is used in Figure 2.3. The sense in (39c) is an alternative name of competence in Figure 2.3 that we encountered also in (19). In terminology, ambiguity, as explicitly referred to in (39a), and synonymy, as implied by (39c), are generally considered bad practice. 7 Chomsky (1986a) introduces the term I-language to replace the second sense of grammar. In earlier texts, however, the ambiguity signalled in (39) can be confusing. An example is (40).
(40) a. ‘the notion language is a much more abstract notion than the notion of grammar.
     b. The reason is that grammars have to have a real existence, that is, there is something in your brain that corresponds to the grammar. That’s got to be true.
     c. But there is nothing in the real world corresponding to language.’ [Chomsky (2004: 131)]
A first striking point about (40) is that although it appears in a book published in 2004, the text dates from before the introduction of the term I-language. The book in question is a reissue of a book first published in 1982, with new material added. The 1982 book consists of interviews, which according to the preface took place in 1979 and 1980 (2004: 29). The statement in (40a) may be felt as counterintuitive, but this intuition will probably be based on a sense of grammar and language not intended by Chomsky. If we take grammar in the sense of the linguist’s grammar, as in (39b), (40a) is strange. It is usually held that a linguist tries to come up with a grammar for a particular language that is given. The reference to the brain in (40b) points to the correct interpretation of grammar, which
corresponds to (39c) here. We have come across this somewhat confusing use of grammar and language in (18) and (19) in Section 2.1.3. If we replace these terms by the ones introduced by Chomsky (1986a), we get the much less obscure (41).
(41) the notion [of] E-language is a much more abstract notion than the notion of I-language. The reason is that I-languages have to have a real existence, that is, there is something in your brain that corresponds to the I-language. That’s got to be true. But there is nothing in the real world corresponding to E-language.
The new terminology also makes it possible to state in a fairly obvious way in which sense a grammar is supposed to be ‘real’. Chomsky states this in (42).
(42) ‘The statements of a grammar are statements of the theory of mind about the I-language, hence statements about structures of the brain formulated at a certain level of abstraction from mechanisms. These structures are specific things in the world with their specific properties.’ [Chomsky (1986a: 23)]
The philosophical question of realism is concerned with the nature of the relationship between a theory and the corresponding real-world object. In the context of grammar, it is often called the question of the psychological reality of a grammar. It has generated a lot of debate, some of which will be presented in Chapter 4. The ambiguity of grammar has no doubt obscured this point. Chomsky’s position is first of all that I-language is a real entity, as we have seen in Section 2.1. As for the grammar, he holds that it has a status similar to a theory in physics. Chomsky (1980a: 106f.) argues that there is no reason to distinguish between the psychological reality and the truth of a theory in linguistics, because in physics no distinction is made between ‘physical reality’ and truth. However, this simplifies matters too much, as Harman (1980) states in (43).
(43) ‘given any theory we take to be true, we can always ask what aspects of the theory correspond to reality and what aspects are mere artifacts of our notation. Geography contains true statements locating mountains and rivers in terms of longitude and latitude without implying that the equator has the sort of physical reality the Mississippi River does.’ [Harman (1980: 21)]
In his reaction, Chomsky admits that in (43) Harman ‘correctly points out an error in my formulation: there is a question of physical (or psychological) reality apart from truth in a certain domain’ (1980b: 45). At an earlier stage, however, Chomsky had already indicated which elements of his theory were intended to be like the Mississippi River and which like the equator. We find this at least implicitly in (44).
(44) a. ‘a generative grammar is not a model for a speaker or a hearer. It attempts to characterize in the most neutral possible terms the knowledge of the language that provides the basis for actual use of language by a speaker-hearer.
     b. When we speak of a grammar as generating a sentence with a certain structural description, we mean simply that the grammar assigns this structural description to the sentence.
     c. When we say that a sentence has a certain derivation with respect to a particular generative grammar, we say nothing about how the speaker or hearer might proceed, in some practical or efficient way, to construct such a derivation.’ [Chomsky (1965: 9)]
The term generative grammar in (44a) refers to a grammar in the sense of a theory of a speaker’s competence, as opposed to a traditional type of grammar relying on the reader’s linguistic intelligence. The reference to ‘the knowledge of the language’ in (44a) has to be taken in a restrictive sense corresponding to grammatical competence to the exclusion of pragmatic competence, two terms introduced only later. Thus (44a) describes part of Figure 2.3 and states that the grammar describes the competence underlying the facts of performance. The contrast between (44b) and (44c) is remarkable because it distinguishes two aspects of what a grammar does. A grammar links sentences to particular representations of their structure. According to (44c), the process by which this link comes about, i.e. the derivation, does not necessarily correspond to a process that goes on in the speaker’s mind. The resulting structural description, however, as suggested by (44b), does in fact correspond to knowledge represented in the speaker’s mind. 8 This means that representations have the same status as the Mississippi River, whereas derivations are more like the coordinate system used in geography. This interpretation of (44) is supported by later writings. Thus, Chomsky (1988: 90) concludes after a long discussion of Spanish data that ‘We now have evidence for the presence of two traces in mental representation’, which implies that the ‘traces’, elements in the structural representation of certain sentences, are psychologically real. However, the relative priority of derivations and representations has been the subject of much debate in the history of Chomskyan linguistics. The shifts in theoretical orientation within the research programme, briefly discussed in Sections 2.5 and 2.6, have also affected the relative importance of representational and derivational perspectives. Nevertheless we can safely assume the conclusions in (45).
(45) a. The hypothesis that a grammar G is a correct representation of the competence of a speaker S does not imply that every aspect of G has a physical correlate in the brain of S.
     b. The way S uses G to produce and interpret sentences is not encoded in G.
As in good software design, (45b) implies that the data are separated from the procedures. A grammar includes the information (i.e. data in the software design metaphor) required for the link between a sentence and its syntactic representation(s), not the procedures involved in establishing this link. At the same time, (45b) shows how Lees was mistaken in his review of Chomsky (1957), when he claimed that ‘while this type of grammar has been constructed to permit the automatic generation of all sentences, there is, of course, no provision for correctly analyzing any given utterance in the presence of the grammar’ (1957: 404). In fact, the grammar is not constructed to account for generation any more than for analysis. Analysing and generating sentences is a task performed by a separate component, which makes use of the knowledge described by the grammar.
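The software-design metaphor behind (45b) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration only: the four-rule grammar, the names GRAMMAR, generate and analyse, and the tuple format of the structures. It makes no claim about how competence or processing is actually organised; it merely shows one body of data being consulted by two independent procedures.

```python
import itertools

# The "grammar": pure data linking categories to their possible expansions.
# It contains no instructions for producing or interpreting sentences.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["Adèle"], ["Benjamin"]],
    "V": [["loves"]],
}


def generate(grammar, symbol="S"):
    """Procedure 1: list every word sequence derivable from `symbol`.

    Only suitable for non-recursive toy grammars such as the one above.
    """
    if symbol not in grammar:  # a word, not a category
        return [[symbol]]
    results = []
    for expansion in grammar[symbol]:
        parts = [generate(grammar, child) for child in expansion]
        for combo in itertools.product(*parts):
            results.append([word for part in combo for word in part])
    return results


def analyse(grammar, words, symbol="S"):
    """Procedure 2: list the structures the grammar assigns to `words`."""
    if symbol not in grammar:
        return [symbol] if words == [symbol] else []
    trees = []
    for expansion in grammar[symbol]:
        for children in _split(grammar, expansion, words):
            trees.append((symbol, *children))
    return trees


def _split(grammar, symbols, words):
    """Yield lists of subtrees, one per symbol, that together cover `words`."""
    if not symbols:
        if not words:
            yield []
        return
    first, rest = symbols[0], symbols[1:]
    for cut in range(1, len(words) - len(rest) + 1):
        for head in analyse(grammar, words[:cut], first):
            for tail in _split(grammar, rest, words[cut:]):
                yield [head] + tail


# Hypothetical usage: both procedures read the same GRAMMAR.
print(generate(GRAMMAR))
print(analyse(GRAMMAR, ["Adèle", "loves", "Benjamin"]))
```

Neither generate nor analyse is part of GRAMMAR, and either could be replaced by a different procedure without changing the grammar. This is the sense in which the grammar is no more constructed for generation than for analysis.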
2.3.2 Idealisations
In scientific research it is often useful to make idealisations. An idealisation temporarily ignores complicating factors found in the real world. A description of the ‘idealized native speaker’ referred to in (37) is given by Chomsky in (46).
(46) a. ‘Linguistic theory is concerned primarily with an ideal speaker-listener,
     b. in a completely homogeneous speech community,
     c. who knows its language perfectly
     d. and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance.’ [Chomsky (1965: 3)]
There has been much discussion of this passage, but much of the criticism of (46) is based on misunderstandings. Indeed, Newmeyer writes about the paragraph of which (46) is the first part that ‘no paragraph in any generativist work has engendered as many misunderstandings’ (1983: 73). Most of these misunderstandings concern (46a-b). To some critics, the function of an idealisation is not clear. By appealing to an ideal speaker-listener in (46a), it should be obvious that no claim is made as to the actual occurrence of (46b-d). Moreover, ‘ideal’
is meant in a technical sense, not in the evaluative sense. Nothing is said about the desirability of (46b-d). An idealisation is a proposal to analyse reality into different underlying factors and to make abstraction of some of these factors in order to arrive at an explanation of the most important factors first.
Chomsky (1980a: 24–26) discusses the status of idealisations with the example of (46b). His perspective is to consider what would be a well-motivated basis for objections against this idealisation. He observes that it ‘cannot be that real speech communities are not homogeneous’ (1980a: 25) because although this is obvious as a fact, it has no bearing on the well-foundedness of an idealisation. A serious objection should be based on an argument that ‘the idealization so falsif[ies] the real world that it will lead to no significant insight’ (1980a: 25). This would be the case, for instance, if language could only function in a non-homogeneous speech community. This is hard to imagine, however. He therefore concludes that ‘it is hard to see how anyone could reject the idealization’ (1980a: 26).
The case of (46d) is of a somewhat different nature. Although there is no reason to object to an idealisation along these lines, stating it as such is redundant. The distinction between competence and performance, discussed in Section 2.1.1, assigns the factors listed in (46d) to the properties of performance not evolving from competence. The reason for stating them in (46) seems to be purely expository. This passage is on the first page of the book and precedes the introduction of competence and performance.
The status of idealisations is also highlighted by Botha (1981) when he discusses the use of dynamic aphasia data as a source of information about competence in (47). 9
(47) a. ‘a grammar need not explain the properties of the products of the linguistic performance of dynamic aphasics.
     b. The ideal speaker’s linguistic performance is perfect and unimpaired by, for instance, dynamic aphasia.
     c. Data about products of the linguistic performance of aphasics are therefore, strictly speaking, qualitatively irrelevant with regard to hypotheses about the form of an ideal speaker’s linguistic competence.’ [Botha (1981: 326–327)]
The statement in (47b) is a straightforward extension of the idealisations in (46). It implies (47a), which is unobjectionable, but not (47c). What Botha seems to do in (47) is to build up a wall around the ideal speaker’s linguistic competence in such a way that all data not directly pertaining to the area within the wall are irrelevant. The actual function of idealisations, however, is not to exclude data that are potentially useful, but to protect the theory from problematic data. The
emphasis in (47a) should be on need. Categories of data that do not have priority can still be used with the same force as other categories of data when they give information about our focus of attention. The focus of attention is the speaker’s competence. Any idealisation is only an auxiliary measure, not a restriction. Perhaps the most remarkable element of (46) in view of the preceding discussion is the reference to the ‘language’ of a ‘speech community’ in (46c). We have seen earlier that language should often be interpreted as E-language. In (46c), this interpretation is reinforced by ‘its’, implying that the language is a property of the speech community rather than of the speaker. The use of ‘perfectly’ implies that there is a standard which is determined by (or at least relative to) the speech community and against which the speaker’s competence is measured. As such it seems at odds with statements such as (40a) about the priority of grammar/I-language over (E-)language. The explanation of this apparent discrepancy is again based on the nature of idealisations. It is (46c) which makes it possible for linguists to say that they are investigating data from a particular language, e.g. French, rather than from a particular speaker and to use data from different informants as relevant to the same object. This is common practice and as an idealisation it is harmless. When problems with the notion of French as a language arise, we can defuse them by invoking (46c). Ultimately, we also need an account for the fact that I-languages tend to converge to such ideals as French. At that point we undo the idealisation and focus on the relationship between the I-languages and the less well-defined, ontologically more problematic notion of E-language. This means that we raise the question of why French is perceived as a language. 
Assigning to the use of ‘language’ the status of an idealisation in the way this is done in (46c) licenses common research practice without changing epistemological commitments. This interpretation is supported by the elaboration of the point in (48).
(48) ‘We may imagine an ideal homogeneous speech community in which there is no variation in style or dialect. We may suppose further that knowledge of the language of this speech community is uniformly represented in the mind of each of its members, as one element in a system of cognitive structures. Let us refer to this representation of the knowledge of these ideal speaker-hearers as the grammar of the language.’ [Chomsky (1980a: 219–220)]
In this way, Kayne (1975), for instance, can correctly claim to study French and use data from different speakers of French as his informants to support his analysis.
2.3.3 The problem of indeterminacy
One of the problems a research programme is a response to is the problem of the indeterminacy of the data. As explained in Section 1.2.1, this problem arises because an infinite number of different generalisations is possible for a particular set of data. In the same way, an indefinitely large number of theories is compatible with the initial set of data. Obviously, this problem also arises for the theory of grammar. For every set of data taken as observations in Figure 2.3, an indefinite number of grammars compatible with these data can be devised. Chomsky formulates this problem as (49).
(49) a. ‘Gross coverage of many facts can undoubtedly be obtained in many different ways. What we want in a grammar is not mere coverage of facts, but insightful coverage, something much more difficult to define or to attain.’ [Chomsky (1962a: 549)]
     b. ‘What we must demand, in other words, is that the general theory of linguistic form lead to the selection of grammatical descriptions that are true and insightful.’ [Chomsky (1962a: 550)]
In (49a) Chomsky highlights the futility of approaching this problem only by collecting more and more data. Instead, we have to aim for ‘insightful coverage’ of the data. Chomsky makes the same general point when he states that the ‘Choice of a descriptively adequate grammar for the language L is always underdetermined (for the linguist, that is) by data from L’ (1966a: 11fn.). As (49b) states, this involves the choice among alternative grammars. Chomsky approaches this problem of choice by drawing a parallel between the linguist and the child, as in (50). (50)
‘The problem for the linguist, as well as for the child learning the language, is to determine from the data of performance the underlying system of rules that has been mastered by the speaker-hearer.’ [Chomsky (1965: 4)]
The parallel between the linguist and the child draws attention to a number of conspicuous differences. Whereas children learn a language, which involves the acquisition of competence, without conscious effort, linguists find it much more difficult to devise a grammar describing this competence. There are also interesting similarities, as indicated by (51). (51)
‘As a pre-condition for language learning, he must possess, first, a linguistic theory that specifies the form of the grammar of a possible human language, and, second, a strategy for selecting a grammar of the appropriate form that is compatible with the primary linguistic data.’ [Chomsky (1965: 25)]
The antecedent of ‘he’ in (51) is ‘the child’, but the activity attributed to the child here is described in terms of what the linguist does. In this sense, (51) exploits the intentional ambiguity of ‘grammar’ we have seen in Section 2.3.1 and extends it to apply also to ‘linguistic theory’. Although a grammar in Figure 2.3 is a theory, ‘linguistic theory’ as used in (51) is a theory of a different kind. It is a higher-order theory which encodes general criteria for the selection of a lower-order theory referred to as a grammar.
The model of grammar represented in Figure 2.3 suffers from a certain degree of indeterminacy. This is a general problem of all research. It is worth solving in Chomskyan linguistics because it is assumed that the entity described by the grammar is real. As Chomsky (1986a: 250) argues, if different grammars describe the same set of data, e.g. grammaticality judgements obtained by introspection and from informants, there is more to be said than that both are possible grammars. There is a real entity, the competence as it is actually organised in the speaker’s mind/brain, that may correspond to a particular theory to a larger or smaller degree.
As discussed in Section 1.3.2, there are two possibilities for the choice between different theories. One is that the research programme defines a particular simplicity measure. The competing theories that are compatible with the data are compared for simplicity and the simplest theory wins. As noted in the discussion, simplicity can be defined in different ways and the choice between them is made by the research programme. A second and in general more satisfactory way to choose the best theory is to aim for a deeper level of explanation. This is what (11) in Section 1.3.2 calls progress.
SUMMARY
• A grammar is a theory of a speaker’s competence. It explains the observations of facts deriving from this competence. Conversely these observations can test the grammar.
• A grammar in Chomskyan linguistics differs from a traditional grammar in that it specifies the linguistic intuition presupposed by the latter.
• Until the introduction of the term I-language, the term grammar was used ambiguously to refer to the knowledge in the speaker’s competence and the linguist’s theory of this knowledge.
• A grammar describes the knowledge used to formulate and understand sentences, not the way this knowledge is used in the processes of generating or analysing sentences.
• Idealisations allow the linguist to make abstraction of some complicating factors underlying the actual observations in order to gain insight into the central mechanisms of grammar.
• An often used idealisation in Chomskyan linguistics is the homogeneous speech community.
• Grammars are underdetermined by observations.
2.4 The role of language acquisition
In order to resolve the indeterminacy in an approach to language guided only by the model illustrated in Figure 2.3, Chomsky considers the questions in (52). (52)
a. What constitutes knowledge of language?
b. How does such knowledge develop?
c. How is such knowledge put to use?
These questions appear in Chomsky (1981b: 32) as ‘a number of fundamental questions’ guiding the study of language and in Chomsky (1986a: 3) as ‘The three basic questions that arise’. The model in Figure 2.3, and our discussion so far, consider only (52a). In principle, both (52b) and (52c) could be used to increase explanatory depth and solve the indeterminacy of grammars. Of these, Chomsky chooses language acquisition rather than the use of language. This section first considers the empirical motivation for this choice (Section 2.4.1) and its consequences for the interpretation of linguistic universals (Section 2.4.2). It then turns to the consequences of the choice for the research programme: Section 2.4.3 extends the model represented in Figure 2.3 to include the function of universals in language acquisition, and Section 2.4.4 discusses the additional idealisations involved in this extension.
2.4.1 Language acquisition versus use of language
The indeterminacy problem described in Section 2.3.3 is a problem of the research programme. It is the research programme which determines the criteria to be fulfilled by good theories. While some properties, e.g. internal inconsistency, will always be seen as marking bad theories, different and partly incompatible considerations can determine what are the most important positive qualities of a theory. The choice between language acquisition and use of language, i.e. (52b) or (52c), as guiding questions for the evaluation of
grammars can therefore only be motivated in part by rational arguments. The arguments for (52b) make the choice rationally plausible but cannot make it rationally compelling. Chomsky (1957: 51) considers three ways a corpus can be used as a basis for finding a grammar for the language L from which the corpus is drawn. They are listed in (53). (53)
a. A discovery procedure is a mechanism that derives the grammar of L from the corpus.
b. A decision procedure is a mechanism that, given a grammar G provided by some unspecified source, decides whether G is the best grammar for L.
c. An evaluation procedure is a mechanism that, given two grammars G1 and G2 provided by some unspecified source, decides which of the two is better as a grammar for L.
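The weakest of the three procedures in (53), the evaluation procedure, can be made concrete with a small sketch. It is a toy illustration under stated assumptions, not Chomsky's formalisation: a grammar is modelled as a hypothetical `Grammar` record with an acceptance test, 'compatible with the corpus' is taken to mean 'accepts every sentence in it', and the number of rules is an assumed simplicity measure chosen only for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Grammar:
    """Toy stand-in for a grammar (hypothetical, for illustration only)."""
    name: str
    n_rules: int                    # assumed simplicity measure: fewer rules is better
    accepts: Callable[[str], bool]  # assumed test: does the grammar admit this sentence?


def compatible(grammar: Grammar, corpus: Sequence[str]) -> bool:
    """A grammar is compatible with a corpus if it accepts every sentence in it."""
    return all(grammar.accepts(sentence) for sentence in corpus)


def evaluate(g1: Grammar, g2: Grammar, corpus: Sequence[str]) -> Grammar:
    """Evaluation procedure in the sense of (53c): compare two given grammars.

    Compatibility with the corpus is checked first; if the two grammars do not
    differ on that point, the one with fewer rules wins. Ties go to g1.
    """
    c1, c2 = compatible(g1, corpus), compatible(g2, corpus)
    if c1 != c2:
        return g1 if c1 else g2
    return g1 if g1.n_rules <= g2.n_rules else g2


corpus = ["the dog barks", "the cat sleeps"]

# Two hypothesised grammars, supplied by "some unspecified source".
listing = Grammar("listing", n_rules=2, accepts=lambda s: s in corpus)
pattern = Grammar(
    "pattern",
    n_rules=1,
    accepts=lambda s: len(s.split()) == 3 and s.split()[0] == "the",
)

best = evaluate(listing, pattern, corpus)
print(best.name)  # prints "pattern"
```

Nothing in the sketch produces the candidate grammars: they have to be hypothesised before `evaluate` can be called. A discovery procedure (53a) would have to construct them from the corpus, and a decision procedure (53b) would have to compare a grammar against every conceivable competitor.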
Chomsky (1957) discusses the procedures in (53) from the perspective of the linguist looking for a grammar for L. He argues that a discovery procedure as in (53a) is totally unrealistic. This conclusion is supported by the insight that a grammar is a theory and theory choice is not deterministic. Even a decision procedure as in (53b) is unrealistic, because it would imply a complete overview of all possibly relevant competing grammars. Therefore only an evaluation procedure for the type of task described in (53c) is realistic. This means the linguist has to hypothesise the competing grammars before comparing them. As suggested by (50), the problem of selecting a grammar on the basis of a corpus arises also for the child learning their first language. Chomsky models language acquisition in this way when he states that ‘The task for the language-learning device is to select the highest-valued grammar that is compatible with these data’ (1962a: 535). There are two striking differences between the linguist and the child. First, as discussed in Section 2.2, the linguist has a variety of data collection techniques not available to the child. Second, every normal child manages to select a grammar, but generations of intelligent linguists have not managed to formulate one. This suggests that selecting a grammar as a child is supported in a way not available to the linguist. Chomsky formulates this as in (54). (54)
‘What is required for feasibility is that given data, only a fairly small collection of languages be made available for inspection and evaluation (e.g., languages might be sufficiently “scattered” in value so that only few are so available).’ [Chomsky (1986a: 55)]
It is this precondition to language learning that is of interest as a procedure to solve the indeterminacy of grammar choice. The availability of only a relatively
small number of possible grammars (I-languages) to the child makes an evaluation procedure as in (53c) possible. The idea that data available to the child are not sufficient for a discovery procedure is often called the argument of the poverty of the stimulus. Chomsky (1986a) calls it ‘Plato’s problem’. Hornstein and Lightfoot (1981: 9f.) describe three levels of ‘deficiency of the data’ available to children that are not reflected in the competence when they grow up. First, the corpus of utterances children hear contains ungrammatical sentences or fragments, as illustrated by (5). After language acquisition they are able to distinguish grammatical and ungrammatical sentences. Second, the corpus is finite but ‘the child comes to be able to deal with an infinite range of novel sentences, going far beyond the utterances actually heard during childhood’ (1981: 9). Grammaticality judgements such as the contrast in presentational there in (21) or the examples in (23) do not depend on previous exposure to precisely these grammatical sentences. Third, the corpus does not contain certain rare constructions, e.g. parasitic gaps illustrated in (26). Nevertheless children attain consistent and reliable knowledge about such constructions. A restriction not explicitly listed by Hornstein and Lightfoot (1981) is the restriction to positive evidence. As Chomsky states, ‘There is good reason to believe that children learn language from positive evidence only (corrections not being required or relevant)’ (1986a: 55). While the linguist works with minimally different pairs of grammatical and ungrammatical sentences, nothing of the kind is necessary for children in the language acquisition process. Even if negative evidence, in the form of overt corrections, is presented to the child, it is in many cases ignored. There are many anecdotes about children’s reactions to corrections illustrating this point. 
Jackendoff (1993: 22, 104) gives some amusing examples of how a child resists correction even when the parent goes to great lengths to make the child produce the right construction. The importance of the argument from the poverty of the stimulus is summarised in (55). (55)
a. ‘An investigation of the final states attained, that is, the grammars, reveals that the knowledge acquired and to a large extent shared involves judgments of extraordinary delicacy and detail.
b. The argument from poverty of the stimulus leaves us no reasonable alternative but to suppose that these properties are somehow determined in universal grammar, as part of the genotype. There is simply no evidence available to the language learner to fix them, in many crucial cases that have been studied.’ [Chomsky (1980a: 66)]
In (55a) Chomsky summarises the point made in more detail by Hornstein and Lightfoot (1981). He uses ‘final state’ to refer to an I-language as found in adults. The device that is in this final state is the mechanism the child uses
in the language acquisition process. This device is referred to as ‘universal grammar’ in (55b). It is part of the ‘genotype’ in the sense that it is genetically determined. The reasoning is that if every child has the possibility to learn a language in a way that cannot be explained by means of properties of the input, the child must have a kind of tool independent of the input. This tool is then part of the properties specified for the human species in the genetic code realised in every individual. The poverty of the stimulus argument means that an important part of language is genetically determined. This is the only solution of Plato’s problem acceptable to the modern state of scientific knowledge. As a consequence, language acquisition works in a way quite different from what is traditionally considered as learning. Chomsky formulates this in (56). (56)
a. ‘knowledge of grammar, hence of language, develops in the child through the interplay of genetically determined principles and a course of experience. […]
b. in certain fundamental respects we do not really learn language; rather grammar grows in the mind.’ [Chomsky (1980a: 134)]
There are two further properties of language acquisition that correspond more closely to growth than to learning. First, the child does not have to make a special effort. In this sense language is different from writing. Second, the child does not have a choice whether to do it or not. Given the minimal input conditions, the ‘course of experience’ in (56a), what (56b) calls ‘grammar’ (i.e. I-language) develops independently of the child’s control. So far, we have concentrated on reasons why question (52b), language acquisition, should be used as a basis for solving the indeterminacy of the grammar. For a complete argument, we should also find reasons for not adopting (52c), use of language, in this role. In this context, it is interesting to see that Katz (1964) comes to a different conclusion in this respect. In (57) his formulation of three questions corresponding to (52) is given. (57)
a. ‘What is known by a speaker who is fluent in a natural language? […]
b. How is such linguistic knowledge put into operation to achieve communication? […]
c. How do speakers come to acquire this ability?’ [Katz (1964: 130)]
Katz (1964: 131) argues that (57a) is ‘logically prior to the others’ and that (57b) ‘is, in the same sense, logically prior to’ (57c). His reasoning is that knowledge of language is necessary for language use and use of language in communication is necessary for language acquisition. There are two reasons why this reasoning is not adopted in Chomskyan linguistics. 10
First, if we accept Katz’s argument, we have to take ‘the course of experience’ in (56a), i.e. the input the child gets, to be identical to use of language in communication. There is evidence, however, that the input for language acquisition can be much less than this. An example is the formation of creoles. Creoles come into existence when a pidgin language becomes the first language of a group of children. Pidgins are auxiliary languages used by people who do not share a common language for basic, functional communication. They are in essential ways incomplete and unsystematic. When children grow up getting pidgin as their linguistic input, the language they acquire is a creole. A creole is a full-fledged language, much richer in expressive potential and much more systematic than a pidgin. Therefore, children can learn a language (creole) without getting as input the use of that language in communication. Rather they only get input from a defective language (pidgin), different from the language they learn. 11 One of the consequences of the (56b) perspective of growth rather than learning is that Katz’s argument for the logical connection of the three questions in (57) is no longer compelling. A second reason why (57) is not taken as a basis in Chomskyan linguistics is related to a subtle difference in the formulation of (57b) and (52c). Instead of ‘use’ of language in (52c), Katz refers to putting it ‘into operation to achieve communication’ in (57b). In the most common sense of communication, it does not include certain types of language use that are quite important. While writing this book or making a shopping list, for instance, I continually structure my thoughts by using language. I use language to express thoughts without actually communicating. 12 This illustrates that (52c) is more general than (57b). Chomsky formulates the conditions for a proper answer to (52c) as in (58). (58)
‘The answer to the third question would be a theory of how the knowledge of language attained enters into the expression of thought and the understanding of the presented specimens of language, and derivatively, into communication and other special uses of language.’ [Chomsky (1986a: 4)]
According to (58) the question of the use of language consists of two parts. Chomsky (1988: 4f.) calls these the production problem and the perception problem. They correspond to ‘expression’ and ‘understanding’ in (58), respectively. The production problem has to do with the creativity of language, i.e. how and why we choose to formulate certain sentences. It is this problem that he contrasts with the acquisition problem in (59).
(59)
a. ‘The study of the development of cognitive structures ([…]) poses problems to be solved, but not, it seems, impenetrable mysteries.
b. The study of the capacity to use these structures and the exercise of this capacity, however, still seems to elude our understanding.’ [Chomsky (1975a: 77)]
The claim in (59) is that there is a distinction in solvability between the language acquisition problem, referred to in (59a), and the problem of the creative use of language, as formulated in (59b). 13 In fact, (59) is the conclusion of a rather long, technical argument why language acquisition should be taken as more important than use of language in constraining the choice of a grammar in the context represented in Figure 2.3. The discussion by Chomsky (1975a) does not mention the perception problem in the same detail. However, this appears to be a better candidate than the production problem for selecting a grammar. After all, we generally understand each other when we use language and we do not need a long time to interpret a sentence. The question of whether the perception problem could be used, instead of language acquisition, as a criterion for choosing the correct grammar for an I-language is addressed by Chomsky (1997). Some excerpts from this discussion are given in (60). (60)
a. ‘Why then does parsing seem so easy and quick, giving rise to the conventional false belief?
b. The reason is that when I say something you ordinarily understand it without effort. […] But from that fact we cannot conclude that language is designed for quick and easy parsing. It shows only that there is a part of language that we parse easily, and that is the part we tend to use. […]
c. large parts of the language – even short and simple expressions – are unusable […]
d. The language is simply not well adapted to parsing.’ [Chomsky (1997: 15)]
In (60a), ‘parsing’ refers to the process of assigning a structure to a sentence. This process is the core of the solution to the perception problem. If we want to take the perception problem as the problem to which the choice of a grammar should be tuned, instead of the language acquisition problem, we have to argue that parsing is at least as effortless and automatic as acquisition. In (60d) this is explicitly denied. The basis for this conclusion is on the one hand an explanation of why parsing is perceived as easy, (60b), on the other a reason why this perception is false, (60c). Central embedding as illustrated in (7) is one of the constructions supporting the claim in (60c). It produces grammatical sentences that are very difficult to parse. In line with (60b), they are avoided in practical use.
In conclusion, in Chomskyan linguistics the question of language acquisition is adopted as the main restriction on the choice of a grammar to describe competence. Whatever grammar is adopted, it must be learnable by a child in order to be a valid candidate for a proper description of the competence as actually realised in the mind/brain of the speaker. This means that it must be compatible with a view in which reduced input is sufficient and the language ‘grows’ in the child. The question of language use is not identical to the question of linguistic communication because language is used for expressing thoughts and communication is only a special case of this use. The general question can be divided into a production problem and a perception problem. The production problem involves the creative use of language. It is doubtful whether an explanatory model of this aspect of language is possible in principle. The perception problem involves parsing. Parsing seems easy and quick, not because of properties of the language but because we can only use the parts in which parsing is easy and quick. Therefore neither component of language use is a good substitute for language acquisition as a guide to finding the grammar that correctly describes a speaker’s competence.
2.4.2 Linguistic universals
The study of linguistic universals can be undertaken from different perspectives. There is a long tradition of looking for common properties of languages in order to come up with typological classifications. Comrie (1989) gives an overview of this approach. In this type of study the starting point is a large collection of different languages and universals emerge from the attempt to discover systematic differences. It is in the context of this work that the distinction between formal and substantive universals is made. The difference between formal and substantive universals is that the former concern rules and the latter entities referred to by the rules. As explained by Comrie (1989: 15f.), substantive universals are lists of categories, e.g. phonetic features or syntactic categories. They may be used in two ways. First, they can be taken as the set of possible items from which all languages can choose their own repertoire as a subset. An example of such an attempt is Ladefoged’s (2001) systematic enumeration of phonemes in the world’s languages. The only way such an enumeration can be falsified is by a new language contributing a new phoneme to the list. Secondly, they may be used as a background for the statement of generalisations, for instance, all languages have vowels. Such generalisations are falsified if we find a counterexample. Formal universals are statements about the form of rules. An example is that no language has a rule inverting the words of a sentence (e.g. to form a question) independently of the structure.
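The formal universal mentioned at the end of this passage, that no language forms questions by a rule that ignores structure, can be illustrated by contrasting a structure-independent operation with a structure-dependent one. The sketch is a toy under explicit assumptions: the English sentence is a standard textbook-style illustration rather than an example from this passage, its division into a subject phrase, an auxiliary, and a remainder is supplied by hand, and both function names are hypothetical.

```python
def reverse_words(words):
    """Structure-independent rule: invert the linear order of the words."""
    return list(reversed(words))


def front_auxiliary(subject, aux, rest):
    """Structure-dependent rule: front the auxiliary that follows the subject phrase.

    The rule needs to know where the subject phrase ends, i.e. it refers to
    structure, not to positions in the word string.
    """
    return [aux] + subject + rest


# Assumed analysis of "the man who is tall is happy" (hand-supplied bracketing).
subject = ["the", "man", "who", "is", "tall"]
aux = "is"
rest = ["happy"]

print(" ".join(reverse_words(subject + [aux] + rest)))
# prints "happy is tall is who man the"
print(" ".join(front_auxiliary(subject, aux, rest)))
# prints "is the man who is tall happy"
```

The first function is trivial to state over word strings, yet by the universal it is not a possible rule of any human language. The second cannot even be stated without reference to the phrase boundaries passed in as arguments.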
Although Chomsky (1965: 28f.) also makes the distinction between formal and substantive universals, it is not crucial to the way universals are used in Chomskyan linguistics. The position of universals is summarised in (61). (61)
‘The most intriguing of the studies of language structure are those that bear on linguistic universals, that is, principles that hold of language quite generally as a matter of biological (not logical) necessity.’ [Chomsky (1980a: 232)]
The focus of attention in (61) is on biologically necessary linguistic universals, i.e. universals that have to be part of what Chomsky (1980a: 28) calls the ‘human biological endowment’ in order to make language acquisition possible. This excludes logical approaches working by induction or deduction. Inductive approaches take the properties of all existing languages as a basis. As Newmeyer (1983: 40) emphasises, ‘it does not thereby follow that any feature common to all the languages of the world is a feature of universal grammar.’ If a mass extinction of languages took place and only English, Spanish, French, and German survived, a future generation of linguists might take it to be a universal that languages have definite and indefinite articles. We know this cannot be necessary for language acquisition, because a language like Russian does not have articles, but if all languages without articles disappear without a trace, this argument is not available to these hypothetical future linguists. Therefore, if we find that a particular property is shared by all languages we have access to, the property does not automatically qualify as a universal in the sense intended by (61). Before we can draw such a conclusion we have to establish that the property in question plays a role in language acquisition. It may be the case that there are possible human languages not having the property in question, but we do not know them because they happen to be extinct. Deductive approaches start from the concept of language and derive the necessary properties such a concept must have. As we have seen, language needs qualification to be a useful concept. Formal languages can play a role as E-language, but this concept is not particularly relevant in Chomskyan linguistics. Natural languages are I-languages, empirical objects, for which an empirical approach rather than a logical approach is appropriate. 
In (61) this is expressed by the reference to the biological rather than logical necessity. In Chomskyan linguistics, the existence of universals is a kind of side effect of the assumption that there is a genetic predisposition for the acquisition of language. It would be incorrect to state that universals explain acquisition or the reverse. Rather the genetic code explains both. 14
2.4.3 Extension of the model
The role assigned to language acquisition is perhaps the most distinctive feature of Chomskyan linguistics as compared to other approaches. As a consequence, it has been discussed, attacked, and defended from different points of view. In Chomsky’s writings, a number of different terms are introduced, of which it is not always straightforward to see how they are related to each other. In this section the most important ones will be explained in order to arrive at an extension of the model of Figure 2.3. A first perspective takes language acquisition as a task for the child, equipped with a specialised mental device as represented in Figure 2.4.
[Figure: Primary linguistic data → LAD → I-language]
Figure 2.4: The language acquisition device
Figure 2.4 is what is described by Chomsky in (62). (62)
a. ‘certain problems of linguistic theory have been formulated as questions about the construction of a hypothetical language-acquisition device. […]
b. Much information can be obtained about both the primary data that constitute the input and the grammar that is the “output” of such a device and the theorist has the problem of determining the intrinsic properties of a device capable of mediating this input-output relation.’ [Chomsky (1965: 47)]
The term language acquisition device (LAD) introduced in (62a) has since been established. The primary linguistic data (PLD) are the data available to the child in language acquisition. (62b) uses ‘grammar’ in the sense of I-language. It is striking that (62b) takes the perspective of the linguist while Figure 2.4 represents the internal processes of the child. In (63) Chomsky changes this perspective as well as the terminology. (63)
a. ‘The language faculty is a component of the mind/brain, part of the human biological endowment.
b. Presented with data, the child, or, more specifically, the child’s language faculty, forms a language,
c. a computational system of some kind that provides structured representations of linguistic expressions that determine their sound and meaning.’ [Chomsky (1988: 60)]
In (63a-b), the term language faculty is used instead of LAD. Whereas the term language acquisition device has a strong procedural flavour, suggesting a particular action as its only purpose, language faculty is more declarative in nature. The relatively informal context of (63) explains the use of ‘language’ in the sense of I-language. The interpretation is unambiguous with the explicit description in (63c). We find figures similar to Figure 2.4 in several places in Chomsky’s work, suggesting that he considers it a central part of his linguistic approach. It appears with a neutral designation of the LAD in Chomsky (1964: 26) where the function of the LAD is contrasted with that of a parsing device for the understanding of utterances. A figure much like Figure 2.4 also accompanies (63) and another variant is used by Chomsky (2004: 150). While Figure 2.4 describes the language faculty as a black box, determined by its input and output, in many cases a different type of description, focusing on the process of acquisition is more appropriate. This perspective is represented in Figure 2.5.
[Figure: Primary linguistic data; S0 → SS]
Figure 2.5: Language acquisition as a process
The process in Figure 2.5 is described in (64). (64)
‘there is a fixed, genetically determined initial state of the mind, common to the species with at most minor variation apart from pathology. The mind passes through a sequence of states under the boundary conditions set by experience, achieving finally a steady state at a relatively fixed age.’ [Chomsky (1980a: 187)]
In Figure 2.5, following Chomsky (1986a: 24), the ‘initial state’ mentioned in (64) is represented by S0 and the ‘steady state’ by SS. The relationship between Figure 2.4 and Figure 2.5 can be described as follows. SS corresponds more or less to the mature I-language. PLD occur in both figures. Therefore S0 together with the arrow leading to SS corresponds to the LAD. The focus on the development of the language faculty makes Figure 2.5 a more suitable representation of language acquisition as the growth of an I-language, as described in (56).
In order to link language acquisition to the model of the research programme in Figure 2.3, we have to include not only the mental component but also its theoretical description. The relationships of these entities are described by Chomsky in (65). (65)
‘In conventional terminology, adapted from earlier usage, the language organ is the faculty of language (FL); the theory of the initial state of FL, an expression of the genes, is universal grammar (UG); theories of states attained are particular grammars; the states themselves are internal languages, “languages” for short.’ [Chomsky (2002: 64)]
The replacement of language faculty by faculty of language seems to be motivated exclusively by the abbreviation. Whereas LF had been taken for Logical Form, a level of syntactic representation in Chomsky’s theory, from the 1970s, FL was still available. The reference to a language organ in (65) reflects a new point of view, in which the language faculty is considered an organ of the mind in the same way as the liver is an organ of the body. The relations between the concepts designated by the italicised expressions in (65) are represented in Figure 2.6.
[Figure: Universal Grammar describes the Language Faculty and explains the individual Grammars; each Grammar describes one Competence, and the Grammars test Universal Grammar. Competence and Grammar are each shown as a stack of several instances.]
Figure 2.6: The position of UG in the model of Chomskyan linguistics
An essential feature of Figure 2.6 is the plurality of I-languages (‘competence’) and grammars, each I-language corresponding to the competence of an individual. UG explains that there is a grammar for each instantiation of competence because it describes FL, which underlies each instance. Conversely, each grammar describing an I-language can be used to test UG. If the grammar is not compatible with UG then either UG or the grammar has to be revised.
Parallel to the ambiguous use of grammar, as seen in (39), universal grammar has also been used ambiguously. Chomsky admits that it was one of the ‘questionable terminological decisions’ affecting ‘the term UG, introduced later with the same systematic ambiguity, referring to S0 and the theory of S0’ (1986a: 28f.). We therefore have a cluster of terms for what is basically the same concept, as listed in (66). (66)
Terms referring to the universal mental component of language a. Language Acquisition Device (LAD) b. Language faculty, Faculty of Language (FL) c. Initial state (S0) d. Universal Grammar (UG) e. Language organ
The different terms in (66) each have their own justification, highlighting different aspects of the component in question. Apart from (66d), which introduces unnecessary ambiguity, they are all acceptable. Here (66b) will be used as the neutral term. The full model of the research programme of Chomskyan linguistics combines Figure 2.6 with Figure 2.3. It is given in Figure 2.7.
[Figure 2.7 diagram. Levels: Data, Individual, Species. Real-world entities (left): Observable facts, Competence, Language Faculty. Theoretical entities (right): Observations, Grammar, Universal Grammar. Link labels: describe(s), explain(s), test(s).]
Figure 2.7: Model of the research programme of Chomskyan linguistics15
On the left-hand side in Figure 2.7 we find entities of the real world. At the data level, the observable facts can be registered, for instance, as soundwaves or characters stored on a computer, or as measurements. At the level of the
individual speaker, the competence is realised in the speaker’s brain. At the species level, the language faculty is realised in the genetic code. The theoretical description at each level, represented on the right-hand side of Figure 2.7, is marked by a rather high degree of abstraction. Observations, for instance of utterances, are not recorded in terms of sound bites. A grammar is not formulated in terms of neurons. UG is not formulated in terms of DNA. While it may ultimately be possible to discover the neuronal and DNA equivalents of the grammar and UG, they would not be very informative without the more abstract description. The links between real-world entities in Figure 2.7 can both be described rather vaguely as ‘underlies’ in the sense that competence is an essential factor in the origin of the observable facts taken into account, as is the language faculty in the origin of competence. The question whether language faculty and competence are different entities or the same entity in different stages of development will be taken up in the discussion of first and second language acquisition in Sections 5.2 and 5.3. The links between the theoretical entities in Figure 2.7 have been discussed in the contexts of Figure 2.3 and Figure 2.6. When we combine them into one system, there is a certain tension between the two levels of testing and explaining. This tension is described in (67). (67)
a. ‘The theory of UG must meet two obvious conditions.
b. On the one hand, it must be compatible with the diversity of existing (indeed, possible) grammars.
c. At the same time, UG must be sufficiently constrained for the fact that each of these grammars develops in the mind on the basis of quite limited evidence.’ [Chomsky (1981a: 3)]
Hornstein and Lightfoot (1981: 13f.) illustrate the tension signalled in (67) by considering two extreme positions. An optimal solution to (67b) would be to impose no constraints at all in UG. This would make testing the grammars trivial, because every grammar would be accepted, but explaining their acquisition impossible. Given the poverty of the stimulus, language acquisition cannot be explained without sufficient genetically encoded backing in the language faculty. An optimal solution to (67c) would be to impose the entire grammar of an I-language on UG. This would make explaining language acquisition trivial, but would fail the test of many grammars. It cannot be that, for instance, Dutch is hard-wired into the language faculty, because not everyone speaks Dutch. Given the variety of languages, input (i.e. PLD) must play a role in determining the individual competence in SS. As the extreme positions are unrealistic, an optimal point in between has to be found. Condition (67b) requires a large enough degree of variation among
I-languages, thus weakening UG, whereas condition (67c) requires a set of constraints strong enough to explain the learnability of these I-languages, thus strengthening UG. As Hornstein and Lightfoot state, ‘this dual requirement restricts severely the principles one can ascribe to the genotype’ (1981: 14). An interesting consequence of (67) is noted by Chomsky in (68). (68)
a. ‘It is important to bear in mind that the study of one language may provide crucial evidence concerning the structure of some other language.’ [Chomsky (1986a: 37)]
b. ‘Because evidence from Japanese can evidently bear on the correctness of a theory of S0, it can have indirect – but very powerful – bearing on the choice of the grammar that attempts to characterize the I-language attained by a speaker of English.’ [Chomsky (1986a: 38). An obvious typo (‘corrrectness’) has been corrected in (68b).]
If we do not take into account considerations of language acquisition and UG, infinitely many grammars are compatible with the data we gather for English. This is the situation that the tension in (67) is supposed to solve. The solution it entails is indicated in (68a). Data from other languages should be compatible with the same UG that accounts for the learnability of English. By looking at more than one language, as proposed by (68b), a new dimension comes into play. Japanese must be learnable on the basis of the same UG as English. Although this first of all restricts the possibilities for the formulation of UG, indirectly it constrains the possibilities for the formulation of a grammar for English. In this sense, data from Japanese may influence the grammar for English. Figure 2.7 also provides a basis for the account of three levels of adequacy, introduced in (69). (69)
a. ‘The lowest level of success is achieved if the grammar presents the observed primary data correctly.*
b. A second and higher level of success is achieved when the grammar gives a correct account of the linguistic intuition of the native speaker […]
c. A third and still higher level of success is achieved when the associated linguistic theory provides a general basis for selecting a grammar that achieves the second level of success […]
d. let us refer to these roughly delimited levels of success as the levels of observational adequacy, descriptive adequacy, and explanatory adequacy, respectively.’ [Chomsky (1964: 28–29), footnote at * deleted]
Chomsky (1964: 29) relates the three levels of success to the three elements of a figure corresponding to Figure 2.4. Observational adequacy corresponds to the PLD, descriptive adequacy to the I-language, and explanatory adequacy to the LAD. One problem with this description is that the object for which a particular level of adequacy is reached remains ambiguous. If we keep to (69), observational and descriptive adequacy are predicated of a grammar in (69a-b), and explanatory adequacy of a ‘linguistic theory’ (equivalent to UG) in (69c). Chomsky (1965: 34), however, contrasts the descriptive and explanatory adequacy of a linguistic theory. Another problem is that a restriction of the data covered for observational adequacy to the PLD in Figure 2.4 is somewhat artificial. The linguist uses data that go beyond the data available to the child. 16 If we consider the levels of adequacy in relation to Figure 2.7, these problems disappear. In Figure 2.7, observational adequacy can be related to the level of the data, descriptive adequacy to the level of the individual, and explanatory adequacy to the level of the species. The levels of adequacy obtain for the entities available at these levels. Observational adequacy requires that a grammar covers the observations but does not require that it describes the competence. A grammar that is only observationally adequate is therefore not to be identified with the grammar at the level of the individual in Figure 2.7. It may be a simple characterisation of the data, equivalent to a listing. Descriptive adequacy imposes the constraint that the grammar describes the actual competence, but makes no claim as to UG. Explanatory adequacy requires that UG correctly describes the language faculty. The main advantage of linking the levels of adequacy to Figure 2.7 rather than to Figure 2.4 is that the claim in (70) becomes obvious. (70)
‘It is not necessary to achieve descriptive adequacy before raising questions of explanatory adequacy.’ [Chomsky (1965: 36)]
In some cases, (70) has been taken as a paradox. How is it possible to explain something before it is properly described? If descriptive adequacy is taken as achieved when competence is described and explanatory adequacy when it is explained, this seems indeed strange. If the two levels of adequacy are taken as technical notions in relation to Figure 2.7, the paradox is resolved. In fact, the tension expressed in (67) makes it impossible to achieve complete descriptive adequacy without raising questions of explanatory adequacy. As the discussion of (70) shows, the levels of adequacy should not be seen as stages on the road to success of a research programme. Instead, they characterise the goals set by a research programme. 17
2.4.4 Additional idealisations
The extension of the model from Figure 2.3 to Figure 2.7 adds a reference to language acquisition, but in a very specific perspective only. A number of further idealisations are used here in order to make abstraction from the more messy parts of actual language acquisition. Botha (1981: 42f.) draws a parallel between linguistic performance and language acquisition to clarify this. Competence is clearly a major factor in performance and we try to eliminate the other factors when we study performance. Similarly, the language faculty is a major factor in language acquisition and we try to eliminate the other factors when we study language acquisition. Chomsky expresses this in (71). (71)
a. ‘Obviously, to construct an actual theory of language learning it would be necessary to face several other serious questions, involving, for example […] the continual accretion of linguistic skill and knowledge […]
b. What I am describing is an idealization in which only the moment of acquisition of the correct grammar is considered.’ [Chomsky (1965: 202, fn. 19)]
In (71a) the development factor represented by the dashed arrow from S0 to SS in Figure 2.5 is evoked, but (71b) proposes to study language acquisition as if Figure 2.4 were the correct representation. Abstraction is made from the temporal factor in the perception of the PLD, from intermediate stages of knowledge between S0 and SS, and from the activity of the child’s mind in passing from one stage to the next. As (71b) suggests, the entire process is collapsed into one moment. Chomsky and Halle (1968: 331) formulate this as ‘We have been describing acquisition of language as if it were an instantaneous process.’ In the same way as for the idealisations in Section 2.3.2, instantaneous language acquisition is not meant as a factual claim. The purpose of the idealisation is to focus attention on the ‘logical problem of language acquisition’ as a basis for explanation, as the title of Hornstein and Lightfoot’s (1981) collection suggests. In their introduction they defend this view as in (72). (72)
a. ‘We are also under no obligation to pay special attention to child grammars […]
b. For the moment the most useful strategy seems to us to focus attention on claims about the initial and the mature states.
c. Idealizing to ‘instantaneous’ acquisition, ie ignoring data about developmental stages for the moment, does not seem to us to introduce significant distortions.’ [Hornstein and Lightfoot (1981: 30, fn. 8)]
The two motivations for instantaneous acquisition are stated in (72b), namely that it is useful, and in (72c), namely that it does not introduce significant distortions. (72a) draws
attention to the position of ‘child grammars’, the competence of a child at intermediate stages between S0 and SS. The use of language acquisition as a logical problem does not commit the research programme to have a theory on them. Meanwhile, it is generally desirable to have a theory on any aspect of language. Therefore, formulating theories accounting for child grammars, as discussed in Section 5.2, is not in contradiction to (72a). Apart from instantaneous acquisition, another obvious idealisation is the uniformity of the language faculty among humans, as expressed in (73). (73)
‘We are also assuming another idealization: That the property of mind characterized by UG is a species characteristic, common to all humans. We thus abstract from possible variation among humans in the language faculty.’ [Chomsky (1986a: 18)]
In the same way as human hands come in different sizes but (except for pathological cases) all have four fingers and an opposing thumb, the human language faculty will vary between different instances, but (73) takes these differences to be immaterial. It is a different matter to note that aphasia occurs. In terms of the comparison to the hand, aphasia is like lacking a finger, either as a result of an accident or from birth. By analogy to (72a) we are under no obligation to consider aphasia, but we may do so if the data it produces are useful (cf. also the discussion of (47) in Section 2.3.2). Another issue that arises in this context is the distinction between core language and periphery. The exact status of this distinction is somewhat less clear, however. In (74) it is presented as an idealisation. (74)
‘it is reasonable to suppose that UG determines a set of core grammars and that what is actually represented in the mind of an individual even under the idealization to a homogeneous speech community would be a core grammar with a periphery of marked elements and constructions.’ [Chomsky (1981a: 8)]
The idea underlying the distinction is that competence, as presented in Section 2.1, is too heterogeneous to be described by a single theory. As noted in (8), for instance, it consists of a ‘computational procedure’ and a lexicon. These two are different under the acquisition perspective. There is evidence for a critical period for the acquisition of the computational procedure in the sense that if the development of the first language is not triggered by a certain age, normal language acquisition is no longer possible. An example is Curtiss et al.’s (1974) description of a child deprived of normal linguistic input until she was 13. She managed to learn certain aspects of a language afterwards. Her acquisition of vocabulary, in particular content words, was faster than for normal children. She had serious difficulties with the grammar system, however. A more detailed discussion of the critical period hypothesis will be postponed to Section 5.2.3.
For the interpretation of (74) this means that the lexicon, at least as far as semantic and phonological properties of content words are concerned, belongs to the periphery. Another good candidate for this status is the choice in some languages between the equivalents of have and be as past tense auxiliaries. Languages such as Dutch, German, French, and Italian use these for a different selection of verbs and sometimes with slight variation between otherwise almost identical dialects. Following (74), assigning them to the periphery means that we take them out of the domain to be accounted for by UG. The question is whether the distinction between core and periphery is part of the research programme or of a particular theory in the research programme. Chomsky seems not to be entirely sure as to this decision. In (75), the distinction between core and periphery is presented as parallel to the homogeneous speech community. (75)
a. ‘We would expect the individually-represented artifact to depart from core grammar in two basic respects:
b. (1) because of the heterogeneous character of actual experience in real speech communities;
c. (2) because of the distinction between core and periphery.
d. The two respects are related but distinguishable.’ [Chomsky (1981a: 8)]
The distinction between the two factors in (75) is rather subtle. What is meant in (75b) is the influence of accidental factors in the PLD on the resulting competence. Whether words like beech or wren are in the child’s lexicon will be a matter of such accidental factors. Clearly they are part of the periphery. In (75c) a further range of elements of knowledge is excluded from the core. That a French-speaking child will grow up to know that the temporal auxiliary of être (‘be’) is avoir (‘have’), as in j’ai été (‘I have been’), whereas a Dutch-speaking child will know that Dutch has ik ben geweest (lit. ‘I am been’), is not due to accidental properties of the PLD. Nevertheless it still belongs to the periphery. The parallelism in the statement of (75b) and (75c) strongly suggests an interpretation as two idealisations. In (76), however, the latter distinction seems to be explicitly taken as part of the theory.
(76) a. ‘Suppose we distinguish core language from periphery,
b. where a core language is a system determined by fixing values for the parameters of UG, and the periphery is whatever is added on in the system actually represented in the mind/brain of a speaker-hearer.
c. This distinction is a theory-internal one; it depends crucially on a formulation of UG.’ [Chomsky (1986a: 147)]
Whether (76) contradicts our interpretation of (75) depends on the role of (76b). With its reference to parameters, 18 (76b) is clearly embedded in a particular theory of UG. If a distinction is theory-internal, it is part of a particular theory rather than of the research programme. The precise interpretation of (76) depends on the antecedent of ‘This distinction’ in (76c). If it refers to the distinction identified in (76a), it means that the core/periphery distinction is not part of the research programme. It only divides the theory of UG into different parts. If we take (76b) together with (76a) as the characterisation of ‘This distinction’ in (76c), we can maintain that it is not the core/periphery distinction as such but the particular elaboration that is theory-internal.
For the correct interpretation of (75) and (76) we have to keep in mind that, in principle, idealisations are meant to be temporary measures. We are happy if we can dispense with the idealisation of instantaneous acquisition because we have a more powerful theory that can deal with the acquisition process as well as with the logical problem of language acquisition. In this spirit, I tend to interpret the apparent discrepancy between (75) and (76) as the result of theoretical progress. Because of a theory hinted at in (76b) we are now able to account for the distinction between core and periphery in a way that was not available before. In the interview of Chomsky (2004 [1982]: 132) we find hints that it had been much less well-defined earlier and that there were allegations that its main purpose was to exclude unwelcome data from the domain of UG. Therefore, the distinction between core and periphery is an example of an idealisation that is no longer needed because a more advanced theory can incorporate it.
SUMMARY
• Language acquisition is selected in Chomskyan linguistics as the central criterion for the choice of a grammar.
• Language acquisition takes place despite serious shortcomings of the input (‘poverty of the stimulus’). This is possible because of a human genetic predisposition, the language faculty.
• Language use is less important in the research programme of Chomskyan linguistics. It is divided into the production problem and the perception problem.
• The production problem involves the creative use of language. This can probably not be explained because of principled limitations of human cognition.
• The perception problem involves parsing. There is no efficient parsing method for language, so that only part of language can be used in practice.
• Universals are interesting in Chomskyan linguistics only if they are necessary for language acquisition.
• Language acquisition is the development of a genetically determined initial state into a steady state under the influence of primary linguistic data. It is more like growth than like learning.
• Universal Grammar (UG) describes the language faculty. It explains individual grammars in the sense that they describe I-languages that can be learned by means of the language faculty.
• UG can be tested by individual grammars because for each I-language a grammar must be available that is compatible with UG.
• There is a tension between accounting for the learnability and for the variety of languages. The former tends to strengthen UG, the latter to weaken it.
• Observational adequacy means correct description of the data, descriptive adequacy correct description of the I-language, explanatory adequacy correct description of the language faculty.
• The idealisation of instantaneous language acquisition means that the intermediate stages between the initial state and the steady state may provisionally be disregarded.
• The distinction between core and periphery means that not all aspects of an I-language are acquired in the same way. Only the core directly reflects the language faculty as described by UG.
2.5 The unity of the research programme of Chomskyan linguistics from its emergence to the 1980s
In the preceding sections, the research programme of Chomskyan linguistics was illustrated by quotations from a range of sources covering the period from Syntactic Structures in 1957 to the interviews with Chomsky published in 2002 and 2004. There have been significant changes in the practice of Chomskyan linguistics in the course of this period. This raises the question to what extent these changes have affected the research programme in the sense in which this term is explained in Chapter 1. In this section and the next, the question of the unity of the research programme will be addressed. The classical dividing line is the one between
the so-called Standard Theory (ST) of Chomsky (1965) and the Principles and Parameters (P&P) model of Chomsky (1981a). Section 2.5.1 will be devoted to this transition. Another period for which it has been claimed that the research programme did not remain constant is the early period up to the ST. This period will be considered in Section 2.5.2. Whereas both developments can be discussed with a certain degree of comfortable historical distance, the emergence of the Minimalist Program (MP) is much more recent. For this reason it will be discussed in a separate Section, 2.6.
2.5.1 From Standard Theory to Principles and Parameters
In his well-documented history of linguistic ideas in twentieth century America, Matthews distinguishes ‘Chomsky’s classic period’ (1993: 205) from what he calls the ‘new Chomskyan school’ in (77). (77)
a. ‘The 1980s were marked externally by the emergence of a new Chomskyan school, largely different in both personnel and character from the old school that had disintegrated in the 1970s.
b. One major difference, as we remarked in §1.1, is that it is not so predominantly American. […]
c. The new Chomskyan school is also addressing a different problem.’ [Matthews (1993: 233)]
In (77a), two reasons are mentioned for considering the transition described here as a major shift. One concerns ‘personnel’ and is elaborated in (77b). As discussed in more detail in Chapter 1, the concept of research programme as defined in (7) in Section 1.2.2 differs from Kuhn’s concept of paradigm or disciplinary matrix in exactly this aspect. Considering the ‘personnel’ is the starting point for Kuhn but it is deliberately ignored in the discussion of research programmes. The second reason for distinguishing two schools, however, indicated as ‘character’ in (77a) and elaborated in (77c), cannot be ignored, in particular because various published remarks by Chomsky point in the same direction. An example is (78). (78)
‘The P&P framework was in significant respects a much more radical departure from the long and rich tradition of linguistic inquiry than earlier work in GG.’ [Chomsky (2004: ix)]
In (78), Chomsky uses ‘GG’ for Generative Grammar, a term he uses for what is here called Chomskyan linguistics. A good starting point for an evaluation of the nature of the transition is the comparison of analyses in Section 2.5.1.1. The implications of the differences are discussed in Section 2.5.1.2.
2.5.1.1 The treatment of a passive sentence in ST and P&P
In order to get an impression of the type of change involved, it is useful to consider an example analysis, using the sentences in (79) as a basis.
(79) a. Graziana asked a question.
b. A question was asked by Graziana.
A somewhat simplified analysis of (79b) in the style of Chomsky (1965) is given in Figure 2.8.
[Figure 2.8 tree diagrams. Deep structure (left): S, NP Graziana, PredP, Aux -ed, VP, V ask, NP a question. Surface structure (right): S, NP a question, PredP, Aux be + -ed, VP, V ask + -en, PP by Graziana.]
Figure 2.8: Simplified deep structure and surface structure of (79b)
The deep structure on the left in Figure 2.8 is the same for both sentences in (79). It is generated by rewrite rules of the type illustrated in (80). (80)
a. S → NP PredP
b. PredP → Aux VP
c. VP → V NP
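As a rough illustration of how rules of the type in (80) generate the deep structure in Figure 2.8, they can be treated as a small phrase-structure grammar and expanded top-down. The following Python sketch is a toy under stated assumptions, not part of the formalism of Chomsky (1965): the names RULES, expand, and bracket are hypothetical, the NPs are left unanalysed as in the figure, and the lexical material is supplied by hand rather than by rules of lexical insertion.

```python
# Toy rendering of the rewrite rules in (80); all names are illustrative.
RULES = {
    "S": ["NP", "PredP"],    # (80a)
    "PredP": ["Aux", "VP"],  # (80b)
    "VP": ["V", "NP"],       # (80c)
}


def expand(category, leaves):
    """Rewrite a category top-down; a category without a rule takes the next leaf."""
    if category in RULES:
        return [category] + [expand(child, leaves) for child in RULES[category]]
    return [category, next(leaves)]


def bracket(tree):
    """Render a tree as a labelled bracketing."""
    label, *rest = tree
    if len(rest) == 1 and isinstance(rest[0], str):
        return f"[{label} {rest[0]}]"
    return f"[{label} " + " ".join(bracket(child) for child in rest) + "]"


# Lexical material for (79), given by hand in left-to-right order (an assumption).
deep_structure = expand("S", iter(["Graziana", "-ed", "ask", "a question"]))
print(bracket(deep_structure))
```

Because the grammar consists of only these three rules, any structure that cannot be built from them is excluded simply by the absence of a rule, which is the property of ST grammars that the later discussion of X-bar theory contrasts with P&P.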
The three rules in (80) are responsible for the part of the structure that is shown in Figure 2.8. The internal structure of the NPs is not relevant to the discussion here. Therefore it is not elaborated, which is indicated by a triangle. The category Aux (for auxiliary) is used both for verbal inflection and for modal and temporal auxiliaries. Its internal structure is also ignored here for the sake of simplicity. While it diverges from the analysis by Chomsky (1965) in certain details, this tree represents the general spirit of work in the ST. 19 The fact that (79b) is a passive sentence is reflected in the surface structure tree on the right in Figure 2.8. The passive is brought about by a transformation. This is a rule of the type illustrated in (81).
(81) X   NP   Aux   V   NP   Y
     1   2    3     4   5    6   =>   1  5  be+3  4+en  by+2  6
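The structural description and structural change in (81) can be mimicked by matching a sequence of categories and rearranging the matched elements. The Python sketch below is only an approximation under stated assumptions: it works on a flat list of (category, form) pairs rather than on a phrase marker, the function name passive is hypothetical, and the output strings simply copy the notation be+3, 4+en, and by+2 from the rule.

```python
def passive(terminals):
    """Apply a flat version of (81) to a list of (category, form) pairs.

    Returns the transformed list, or None if the structural
    description X NP Aux V NP Y is not met.
    """
    pattern = ["NP", "Aux", "V", "NP"]
    categories = [category for category, _ in terminals]
    for i in range(len(categories) - len(pattern) + 1):
        if categories[i:i + len(pattern)] == pattern:
            x = terminals[:i]                  # term 1, unaffected context
            y = terminals[i + len(pattern):]   # term 6, unaffected context
            (_, subj), (_, aux), (_, verb), (_, obj) = terminals[i:i + len(pattern)]
            changed = [
                ("NP", obj),            # 5: object moves to subject position
                ("Aux", f"be+{aux}"),   # be+3: passive auxiliary added
                ("V", f"{verb}+en"),    # 4+en: participle marker added
                ("PP", f"by+{subj}"),   # by+2: subject turned into a PP
            ]
            return x + changed + y
    return None


deep = [("NP", "Graziana"), ("Aux", "-ed"), ("V", "ask"), ("NP", "a question")]
print(passive(deep))
```

In this application X and Y are both empty, so the four operations of the rule account for the whole output string.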
In (81), X and Y designate context not affected by the rule. Both are zero in its application to the deep structure in Figure 2.8. Four different operations are performed on the four constituents between X and Y. The object NP 5, matching a question here, is moved to the position of the subject NP; the subject NP 2, here Graziana, is moved to the end and turned into a PP by insertion of the preposition by; Aux is supplemented by the passive auxiliary be; and the verb is accompanied by the passive participle marker -en. The combination of these four operations constitutes the passive construction.
One of the reasons for distinguishing a deep structure and a surface structure is that they have different functions. The deep structure is the input for semantic interpretation, leading to a representation of the meaning. The two sentences in (79) have the same meaning and their deep structure is the same. The surface structure is the input for phonetic interpretation, leading to a pronounceable representation. This process will, for instance, produce asked as the passive participle corresponding to ask+-en. It will also check subject-verb agreement to produce was rather than were for be+-ed. Agreement can only be checked after passivisation, because it is the surface structure subject, i.e. the deep object of a passive, which determines the form of the inflected verb.
An equally simplified analysis of (79b) in the style of Chomsky (1981a) is given in Figure 2.9. In the 1980s, the names of deep and surface structure were replaced by D-structure and S-structure.
[Figure 2.9 tree diagrams. D-structure (left): IP, NP e, I’, Infl, VP, V asked, NP a question, PP by Graziana. S-structure (right): IP, NPi a question, I’, Infl, VP, V asked, NP ti, PP by Graziana.]
Figure 2.9: Simplified D-structure and S-structure of (79b)
If we consider the two structures in Figure 2.9, we notice a number of differences from the structures in Figure 2.8. The D-structure to the left in Figure 2.9 has the passive verb inserted. The ‘deep subject’ Graziana appears
in a PP immediately and the subject position is occupied by an empty NP e. The S-structure on the right contains an empty object position with a trace t, coindexed with the subject NP by means of the subscript i. In both structures, S has been replaced by IP, PredP by I’, and Aux by Infl. More important than these differences in structures are the differences in the mechanisms producing them. 20 Individual rewrite rules of the type illustrated by (80) have been replaced by X-bar-theory. Chomsky (1970: 210f.) proposed X-bar theory as a parallel structure for noun, verb, and adjective phrases. This was soon generalised to include prepositional phrases. Chomsky (1981a: 164) suggests that Infl may be seen as the head of S, and Chomsky (1986b: 2–4) sketches a version of X-bar theory in which Infl has a full projection, parallel to N, V, A, and P. The effect of X-bar theory is illustrated in Figure 2.9 with the three levels of IP, I’, and Infl. The consequence of adopting X-bar theory is that it is no longer possible to claim that a particular structure is excluded because there is no rule generating it. Instead, everything can be generated as long as it fulfills the conditions of X-bar theory. In a similar vein, individual transformation rules such as (81) are abolished. Instead Chomsky (1981a: 18) assumes a single transformation move α. The central idea is that in principle anything can move anywhere. If a particular movement is not possible, this must be explained. There can be no claim that a particular movement is not possible because there is no transformation that performs it. Instead, general principles are involved in blocking certain movements. It is assumed that movement leaves a trace t, coindexed with the moved element, and that certain constraints hold on the resulting structure. Two constraints involved in the structure in Figure 2.9 are the theta-criterion and the Case filter. 
According to the former, every NP must have exactly one thematic role and every thematic role must be assigned to exactly one NP. According to the latter, every NP must have abstract Case. 21 The Case filter and the theta-criterion are applied to chains. A chain is formed by a constituent and all its coindexed traces. In a structure with an active, transitive verb, e.g. (79a), V assigns a thematic role to the subject and one to the object. These roles are not marked with labels such as agent or patient because the difference is not relevant in syntax. Accusative Case is assigned by V to its object. Nominative Case is assigned to the subject by Infl. What happens in the passive is analysed by Chomsky (1981a: 124f.) as follows. A passive verb does not assign a thematic role to its subject and does not assign Case to its object. They are both ‘absorbed’ by the passive morpheme on the verb. Therefore in the D-structure on the left of Figure 2.9, the NP a question does not have Case. The only way for it to get Case is to move to the subject position. The resulting chain [NPi, ti] gets a thematic role in ti, assigned by V, and a Case in NPi, assigned by Infl.
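The way the two constraints jointly force movement in the passive can be made concrete with a small check on chains. The Python sketch below is a deliberately reduced model and rests on several assumptions: the class Position and the three functions are hypothetical names, a chain is represented as a list of positions, and only one half of the theta-criterion is modelled (each chain bears exactly one thematic role), not the requirement that every role be assigned to exactly one NP.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Position:
    """One member of a chain: the moved constituent or one of its traces."""
    name: str
    case: Optional[str] = None  # abstract Case assigned in this position, if any
    theta: bool = False         # True if a thematic role is assigned here


def satisfies_theta_criterion(chain):
    # Simplified: the chain as a whole must carry exactly one thematic role.
    return sum(position.theta for position in chain) == 1


def passes_case_filter(chain):
    # The chain must receive abstract Case in at least one of its positions.
    return any(position.case is not None for position in chain)


def licensed(chain):
    return satisfies_theta_criterion(chain) and passes_case_filter(chain)


# Object of an active transitive verb, as in (79a): role and accusative Case from V.
active_object = [Position("object of active V", case="accusative", theta=True)]

# 'a question' left in object position of the passive verb: role, but no Case.
in_situ = [Position("object of passive V", theta=True)]

# 'a question' moved to subject position: Case from Infl, role via the trace.
moved = [
    Position("NPi in subject position", case="nominative"),
    Position("ti in object position", theta=True),
]

print(licensed(active_object), licensed(in_situ), licensed(moved))
```

On this reduced model the unmoved object of the passive fails only the Case filter, which corresponds to the statement that movement to the subject position is the only way for it to get Case.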
The roles of D-structure and S-structure in Figure 2.9 do not match those of deep and surface structure in Figure 2.8. The introduction of traces implies that all information available at D-structure is also available at S-structure. S-structure is more abstract than surface structure and D-structure is only an intermediate stage in the derivation. Additional levels of representation are introduced as interface levels to phonetic and semantic interpretation, Phonetic Form (PF) and Logical Form (LF), respectively. Chomsky mentions ‘the role of LF as representing the contribution of the language to semantic interpretation’ (1986a: 179). Both PF and LF are derived from S-structure.
2.5.1.2 Implications of the differences
A direct comparison of the analysis of a sentence in ST and P&P as in Figure 2.8 and Figure 2.9 gives a good sense of the difference hinted at by Matthews (1993) in (77c) and Chomsky (2004) in (78), but it does not show the full extent to which linguists are actually ‘addressing a different problem’ in P&P. As described by Matthews (1993: 233–237), the domain of discussion has increased dramatically. The linguist working in ST tries to formulate a set of rewrite rules of the type in (80) and transformations of the type in (81) to describe a construction (e.g. passive) in a particular language (e.g. English). Other languages, e.g. French or German, each have their own set of rules. Other constructions in English, e.g. relative clauses, may share aspects of the deep structure, but they have their own transformations. If the linguist working, for instance, on the English passive, does not overlook any data about the construction in question, the result can count as a solution to the problem set at the outset of the investigation. A collection of such solutions constitutes the grammar of a language. Although sometimes contradictions may be discovered when the solutions are combined, the general approach is to accumulate solidly established knowledge encoded in grammatical descriptions of constructions. All of this no longer applies in P&P. X-bar theory, move α, and constraints such as the theta-criterion and the Case filter are not specific to individual constructions or to individual languages. Thus, although Jaeggli (1986) is devoted primarily to the analysis of the passive in English, (82) relativises the notion of passive on the first page. (82)
‘Passive constructions are simply the result of the interaction of certain morphological and syntactic operations. Only these operations have any theoretical validity.’ [Jaeggli (1986: 587)]
In ST, one could change a rule such as (81) a little bit to accommodate additional data. As (82) indicates, such an approach is impossible in P&P. If one adapts the formulation of the Case filter, this affects the entire grammar, because the
Case filter is not specific to the passive. The Case filter is also valid in other languages. Therefore, we find statements such as (83). (83)
‘The Kinyarwanda and Scandinavian facts (and many other cases illustrated in detail by Siewierska (1984)) call into question the assumption that passive participles are never capable of assigning structural Case.’ [Jaeggli (1986: 597)]
Kinyarwanda is a Bantu language spoken in Rwanda. ‘Scandinavian’ groups together Norwegian, Danish and Swedish. What (83) implies is that if you investigate the English passive you cannot fence off your domain from the passive in other languages. The Case filter applies equally to Kinyarwanda and Scandinavian, and if we modify it in view of the English data, this affects the grammar of these languages too. Of course, we have to make allowance for the fact that languages differ. The Case filter may vary, though with strong limitations, between languages. Chomsky expresses this in (84). (84)
‘What we expect to find, then, is a highly structured theory of UG based on a number of fundamental principles that sharply restrict the class of attainable grammars and narrowly constrain their form, but with parameters that have to be fixed by experience.’ [Chomsky (1981a: 3–4)]
(84) states the basis of the P&P model. The ‘fundamental principles’ are valid for all languages. They are genetically determined and independent of the child’s experience in language acquisition. Differences between languages exist at two levels. Certain obvious differences including the choice of vocabulary items belong to the periphery in the sense of Section 2.4.4. They have to be learnable on the basis of experience and take the genetic basis of language as a starting point. Other differences are more closely related to the principles. For those cases a limited number of pre-determined options is encoded in the principle. They are what (84) refers to as ‘the parameters that have to be fixed by experience’ in the language acquisition process. As noted in (76b), the core language is the result of fixing the parameters in the principles. An example of a contrast relevant to the passive is its applicability to intransitive verbs as illustrated by the English examples (85) and their Dutch translations (86). (85)
a. The city was attacked.
b. *In the city was battled.
c. *In the city was died.
(86)
a. De stad werd aangevallen.
b. In de stad werd gevochten.
c. *In de stad werd gestorven.
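As a rough picture of what (84) describes, the following Python sketch treats UG as a fixed menu of parameter values and a core grammar as one choice per parameter, made on the basis of experience. Everything specific in it is an assumption introduced for illustration: the parameter name `impersonal_passive`, its two values, the encoding of observations, and the idea that a single positive example switches the parameter on. The text does not commit to any particular parameter for the contrast between (85) and (86).

```python
# Hypothetical inventory: UG fixes which parameters exist and which values
# each may take. The name "impersonal_passive" is invented for illustration.
UG_PARAMETERS = {"impersonal_passive": ("off", "on")}


def fix_parameters(observations, default="off"):
    """Return a core grammar: one value per parameter, chosen by experience.

    `observations` is a list of dicts such as
    {"construction": "passive", "verb": "intransitive"}.
    Assumption: the parameter starts at `default` and is switched on by a
    single positive example.
    """
    grammar = {name: default for name in UG_PARAMETERS}
    for obs in observations:
        if obs == {"construction": "passive", "verb": "intransitive"}:
            grammar["impersonal_passive"] = "on"
    # Experience can only select among the options UG provides.
    assert all(grammar[p] in UG_PARAMETERS[p] for p in grammar)
    return grammar


# Input resembling (85a): only passives of transitive verbs are heard.
english_input = [{"construction": "passive", "verb": "transitive"}]

# Input resembling (86a) and (86b): a passive of an intransitive verb occurs.
dutch_input = [
    {"construction": "passive", "verb": "transitive"},
    {"construction": "passive", "verb": "intransitive"},
]

english = fix_parameters(english_input)
dutch = fix_parameters(dutch_input)
print(english, dutch)
```

The sketch deliberately says nothing about why (86c) is excluded in Dutch. A single binary switch is too coarse for that, which is one reason why the choice of parameter and principle is an open research question rather than a matter of bookkeeping.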
There are some intransitive verbs in Dutch that allow a passive, as in (86b), whereas others do not, cf. (86c). Their English counterparts (85b-c) are generally ungrammatical. It is reasonable, in the P&P framework, to assume that this contrast is the consequence of the different setting of a parameter. The question is then what type of parameter in which principle should account for it. In order to answer such a question, the linguist can invoke data from various constructions and various languages. Typical questions to be asked in this context are which other languages behave like English and like Dutch, what other properties they have in common, and in what way a small variation in the formulation of a principle might account for these correlations. Matthews (1993: 237) sees the development from the type of analysis exemplified in Figure 2.8 to the one in Figure 2.9 as problematic. He expresses his reservations in (87). (87)
a. ‘a theory of universal grammar was no longer directly vulnerable to conflicting evidence.
b. Suppose that, in a particular language, we discover facts that seem to run contrary to it.
c. It might indeed be that the theory is wrong.
d. But it might be that we have simply discovered a new parameter.
e. Therefore our best strategy is not to scrap the principle that has been proposed. Instead we take the variation on board, and look for other things that may be connected with it.’ [Matthews (1993: 237)]
It is interesting to compare (87) as a description of a P&P strategy with the strategies available in ST. The qualification ‘no longer’ in (87a) suggests that falsification of theories plays a more important role in ST. A theory describing the passive in English involving, say, the rules in (80) and (81) may be falsified by other data only if they are from English and if they are directly relevant to the passive. Our reaction to such falsification will be to modify one of the rules. If we choose to change (81), this has no implications beyond the passive in English. Rules in (80) may have some repercussions for other parts of the English grammar. There is then very little evidence to choose an alternative theory from. The risk of falsification by new data is restricted by the small scope of a theory. This means on the one hand that once a description has been achieved, it becomes an established part of the grammar. On the other hand, it means that many different descriptions of a particular construction are possible and the choice between them cannot be determined by the data. The chances of (87b) occurring in the P&P model are much better because more data are relevant to any specific part of the theory, both from other constructions and from other languages. The dichotomy proposed by (87c) and (87d) is in fact entirely artificial. It is normal for a theory to be surrounded by
problems, cf. Kuhn’s (9) in Section 1.3.1. The important difference between ST and P&P is that the latter has a strategy for dealing with them, indicated in (87e). In P&P counterexamples from other constructions and from other languages are made relevant to the choice of an analysis, whereas ST only deals with them in the sense of excluding them from the domain of discussion. It is no surprise, then, that P&P was rapidly recognised as a promising line of research. It was especially attractive to young linguists who did not have an entrenched mindset of the cumulative strategy of ST, but its appeal was certainly not limited to them. It is the strategy of (87e) which is alluded to in (68) when Chomsky states that Japanese evidence can be used to solve indeterminacies in English grammar. This development and the optimism it generated are illustrated by the difference in tone of (88) and (89). Chomsky describes the position of UG in (88) from the point of view of ST and in (89) from the point of view of P&P. (88)
‘As a long-range task for general linguistics, we might set the problem of developing an account of this innate linguistic theory that provides the basis for language learning.’ [Chomsky (1965: 25)]
(89)
‘What seems to me particularly exciting about the present period in linguistic research is that we can begin to see the glimmerings of what such a theory might be like.’ [Chomsky (1981a: 4)]
In ST, there was no coherent strategy for formulating sensible hypotheses for UG, so that this was only ‘a long-range task’ as (88) puts it. With the formulation of P&P, the strategy described in (87e) had become available, triggering the optimism of (89). While this analysis of the transition shows that it can be seen as progress, the question remains whether it is progress within the same research programme. In (78), Chomsky calls P&P a ‘radical departure’ from earlier work. In (90) he describes the main consequence of this radical departure. (90)
‘the traditional constructions – verb phrase, relative clause, passive, etc. – are taxonomic artifacts, their properties resulting from the interaction of far more general principles.’ [Chomsky (1995a: 17–18)]
Instead of construction-specific rules for verb phrases as in (80c) and passive as in (81), P&P has X-bar theory and move α with constraints that belong to UG. Therefore the ‘radical departure’ Chomsky refers to, e.g. in (78), only concerns the form of the theory. The model of the research programme as represented in Figure 2.7 is equally described by (37) and (62), taken from Chomsky (1965), as by (42) from Chomsky (1986a) and (63) from Chomsky (1988). Here (37) and (42) in Section 2.3 describe the function of grammars
and (62) and (63) in Section 2.4.3 the function of UG. We can safely conclude that ST and P&P are theories in the same research programme. Given this analysis of the relationship between the starting point and the endpoint of the transition, it is not surprising that the development of the theory from ST to P&P can be described as a gradual move to deeper generalisations rather than a radical break. One of the first of these was the X-bar theory of Chomsky (1970). An important breakthrough was the formulation of parameters for systematic differences between languages. Rizzi’s work on the contrast between English and Italian of the late 1970s, collected as Rizzi (1982), was probably the first attempt to do so. That Chomsky (1981a) is based on his Pisa lectures given in 1979 is not entirely accidental.
2.5.2
The early stages of Chomskyan linguistics
When, as we concluded in the previous section, ST and P&P belong to the same research programme, we must assume that the process of the emergence of Chomskyan linguistics was completed at the latest when Chomsky (1965) appeared. The question has sometimes been raised to what extent we can apply the same label to earlier work. We have to distinguish three perspectives here. First, there is the social perspective, the rise of Chomsky’s fame and of the influence of his theory. The publication of Chomsky (1957) was a landmark in this respect. Another landmark was Chomsky’s appearance as an invited speaker at the 1962 International Congress of Linguists. As Anderson et al. (1996: 66f.) describe, Chomsky was the only American among the five invited speakers at this large conference, which provided him with an opportunity to reach a large audience. As explained in Chapter 1, this perspective is not central to the way research programmes are understood here. A second perspective is the theoretical one. As can be expected in a new research programme, initially the rate of theoretical development was relatively fast. Chomsky (1966a: 30–47) gives an overview of the changes in the theoretical framework. The main changes are the introduction of recursion and phrase structure trees at deep structure and the introduction of the lexicon. In the earliest work, it was assumed that rewrite rules would only generate so-called kernel sentences. A kernel sentence is a sentence without embedded S nodes. 22 A special type of transformation would bring about their combination into a single tree structure. Therefore, deep structure was conceived of as an overview of the transformations applied to a set of kernel sentences to produce a surface structure. The replacement of this mechanism by recursive phrase structure grammars simplified the procedure for the generation of sentences.
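The difference between kernel sentences and recursively generated structures can be sketched in Python. The rule set below is a toy grammar assumed for illustration and is not the book's (80); lexical insertion is left out, and the function names are hypothetical. The point is only that once S may reappear on the right-hand side of a rule, a single set of rewrite rules generates embedded clauses directly, with no separate combining transformation.

```python
# Toy rewrite rules (assumed for illustration). The second VP rule is
# recursive: it reintroduces S inside the expansion of S.
RULES = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"], ["V", "S"]],
}


def derive(symbol, embeddings):
    """Expand `symbol` top-down into a (label, children) tree.

    The recursive VP rule is chosen as long as `embeddings` clauses remain
    to be embedded. Symbols without a rule are returned as bare strings.
    """
    if symbol not in RULES:
        return symbol
    if symbol == "VP":
        rhs = RULES["VP"][1] if embeddings > 0 else RULES["VP"][0]
    else:
        rhs = RULES[symbol][0]
    children = []
    for child in rhs:
        if child == "S":
            # One embedding is used up by this embedded clause.
            children.append(derive("S", embeddings - 1))
        else:
            children.append(derive(child, embeddings))
    return (symbol, children)


def count_s(tree):
    """Count the S nodes in a tree."""
    if isinstance(tree, str):
        return 0
    label, children = tree
    return int(label == "S") + sum(count_s(c) for c in children)


def is_kernel(tree):
    # A kernel sentence has no embedded S node, so its only S is the root.
    return count_s(tree) == 1


print(is_kernel(derive("S", 0)))
print(count_s(derive("S", 2)))
```

In this sketch a derivation with no embeddings yields a kernel-like structure, while each additional embedding adds one S node through the same rules.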
It was also assumed that lexical items were introduced by means of rewrite rules of the type N → house. This method was found to be inadequate for the treatment of subcategorisation, selectional restrictions, and agreement. 23 By introducing a lexicon, the information required for the treatment of these phenomena could be encoded as features associated with the lexical entries. In both cases, the role of the research programme was to provide a background orientation of research and basis for the argumentation of theoretical proposals. The third perspective on the emergence of Chomskyan linguistics concerns the research programme itself. Here the question is how far back we can trace the model summarised in Figure 2.7. It has often been remarked that Chomsky (1957) does not refer to the psychological side of language. Huck and Goldsmith (1995: 24) claim to have evidence for ‘a shift in his [i.e. Chomsky’s] expression of the goals of linguistic theory’ in the early 1960s. However, in Chomsky’s contribution to the proceedings of a conference that took place in 1960 we find the statements in (91). (91)
a. ‘What we seek, then, is a formalized grammar that specifies the correct structural descriptions […]. Such a grammar could properly be called an explanatory model, a theory of the linguistic intuition of the native speaker.’ [Chomsky (1962a [1960]: 533)]
b. ‘A general theory of linguistic structure of the sort just outlined would, in this way, provide an account of a hypothetical language-learning device and could thus be regarded as a theoretical model for the intellectual abilities that the child brings to language learning.’ [Chomsky (1962a: 535)]
In (91a) we find an explicit link between grammar and competence, called here ‘linguistic intuition’, thus establishing the level of the individual in Figure 2.7. In (91b), the ‘general theory of linguistic structure’ refers to UG and ‘the intellectual abilities that the child brings to language learning’ to the language faculty in Figure 2.7, thus establishing the species level of the model. Apart from the use of general descriptions instead of the terms used later, the statements in (91) describe the familiar model. There can be no doubt, therefore, that the research programme of Chomskyan linguistics was fully in place in 1960 at the latest. Evidence for the period before 1960 is less clear. In an interview, Chomsky (2004 [1982]: 88–90) declares that Chapter 1 of Chomsky (1965) had been written in 1958–1959 and explains that Chomsky (1957) does not contain references to the psychological side of the research programme because of the particular
way it came into being. This would antedate the model of Figure 2.7 by another one or two years and invalidate the main argument against the assumption of the psychological component of the research programme from its beginning. Although based on much later self-declaration, it is not implausible because it concords with some other pieces of evidence, quoted in (92) and (93). (92)
‘The empirical data that I want to explain are the native speaker’s intuitions.’ [Chomsky (1962b [1958]: 158)]
(93)
‘It is often argued that experience, rather than innate capacity to handle information in certain specific ways, must be the factor of overwhelming dominance in determining the specific character of language acquisition, since a child speaks the language of the group in which he lives. But this is a superficial argument.’ [Chomsky (1959: 44)]
The statement in (92) is from the discussion following a paper by Chomsky presented at a conference in 1958. By describing the data in this way, Chomsky unambiguously establishes the role of the native speaker’s competence. The statement in (93) is from the review of Skinner (1957). This review argues against behaviourist theories of language acquisition. The reference to the ‘innate capacity’ as an important contribution to language acquisition shows that Chomsky considered the issue of language acquisition in a way at least not incompatible with the model in Figure 2.7. In conclusion we can say that there is very little evidence for a shift of opinion by Chomsky in the early stages of the research programme. A shift in the role of language acquisition could only have taken place before 1960 and a shift in the role of competence as underlying the speaker’s intuitions before 1958. When the research programme started to attract followers, its psychological side was fully in place. If it is correct that Chomsky changed his expression of the goals in the early 1960s, as Huck and Goldsmith (1995) claim, the most plausible reason is that he felt other people did not pay enough attention to, or misunderstood, the way he intended the model of the research programme to work.
2.5.3
Conclusion
In deciding whether Chomskyan linguistics constitutes a single framework from its emergence in the late 1950s until the establishment of P&P in the early 1980s, we have to distinguish the level of theory from the level of research programme. It is obvious that important changes have taken place in the theoretical framework, in particular between the stage exemplified by Chomsky (1965) and the one exemplified by Chomsky (1981a). The model determining what a theory should describe, as represented in Figure 2.7, has remained largely the same throughout this period. This means that there was a commonly accepted basis for discussing and evaluating different versions of the theory. The changes can be explained as a result of increased knowledge and a gradual shift of emphasis from the individual level to the species level.
SUMMARY
• ST and P&P are different theories in the same research programme of Chomskyan linguistics. They can legitimately be compared and evaluated by means of the same criteria.
• In Chomsky’s (1965) ST, a grammar was formulated in terms of rewrite rules and transformations. This restricts the variety of data relevant to the formulation of individual rules. The explanation of competence in terms of language acquisition is not possible.
• In Chomsky’s (1981a) P&P model, UG consists of a set of principles with parameters. The possible values of parameters are predetermined by UG. Individual (core) grammars are determined by setting the parameters.
• P&P makes data from one language relevant to the analysis of another language. The strategy involved strengthens the potential of the theory to explain language acquisition.
• All elements of the research programme of Chomskyan linguistics were in place in 1960 at the latest. Earlier publications suggest at most a development of increasing specification, no radical change compared to the later explicit formulation.
2.6
The position of the Minimalist Program in Chomskyan linguistics
The history of Chomskyan linguistics did not, of course, stop with the publication of Chomsky (1981a). The version of the P&P model described there, often called Government & Binding (GB) theory, constituted the basis not only for a wide range of empirical work, applying the framework to new languages and new data, but also for further theoretical work, modifying and developing the framework on the basis of the empirical results. A typical example of the
latter is Chomsky (1986b), which proposes to reduce the principles adopted in Chomsky (1981a) to common primitives, including the barriers that gave their name to the book. Whereas barriers were generally recognised as an elaboration of GB-theory, the status of the Minimalist Program (MP) in relation to GB-theory is more controversial. A central concept of the MP is economy. This has a very general meaning and started occurring as a pervasive concept in Chomsky’s MIT lectures in the late 1980s. It is strongly related to simplicity and shares with this notion the dependence on a framework of assumptions. There does not exist a single dimension of simplicity, so that a set of criteria for evaluating the relative importance of the different dimensions is implicit in any claim that one theory is simpler than another. 24 The same can be said of economy. The full force of the pursuit of economy in syntactic theory came to be noticed with the publication of Chomsky (1993a), which is usually taken as the starting point of the MP. It generated a lot of work on the specification of theoretical variables (e.g. the precise definitions of theoretical concepts) and on the analysis of linguistic data, but also a substantial amount of controversy. For an understanding of the position of the MP with respect to Chomskyan linguistics in general and GB-theory in particular, we can take (94) as a starting point. (94)
‘In the view of many linguists working in syntax Chomsky’s (1995), (1998), and (1999) minimalist program (MP) constitutes a major paradigm change in the theory of grammar.’ [Lappin et al. (2000a: 665)]
The statement in (94) is taken from a critical assessment of the transition from GB-theory to the MP. Of the works referred to, Chomsky (1995b) is a collection of four articles marking this transition, in which Chomsky (1993a) appears as Chapter 3. The two later references are manuscript versions of Chomsky (2000c) and (2001), which further elaborate the approach. A ‘major paradigm change’ would in the terminology explained in Chapter 1 correspond to a change of research programme if it concerns the elements that make the empirical cycle work. In Section 2.6.1 we will address the question whether the research programme as established for the period up to and including GB-theory is also adhered to in the MP. In doing so, it will become clear that problems of indeterminacy had emerged not unlike the ones discussed in Section 2.3.3. Section 2.6.2 looks into the approach taken to solving these problems. As in the earlier case this gives rise to a discussion of questions of the type in (52). On the basis of these questions, Section 2.6.3 proposes a further extension of the model of Figure 2.7. With this model, Section 2.6.4 can deal with one of the most controversial points of the MP, its appeal to a notion of perfection.
2.6.1
Continuity and its problems
The discussion in Section 2.5 has shown that if we want to understand the relationship between two theories we have to distinguish the elements pertaining to the research programme from the purely theoretical components. In the case of the MP, it is not difficult to find evidence for its allegiance to the Chomskyan research programme as represented in Figure 2.7. The introduction to Chomsky (1995b) sketches this background as in (95). (95)
a. ‘To attain descriptive adequacy for a particular language L, the theory of L (its grammar) must characterize the state attained by the language faculty, or at least some of its aspects.
b. To attain explanatory adequacy, a theory of language must characterize the initial state of the language faculty and show how it maps experience to the state attained.’ [Chomsky (1995b: 3)]
(95) is a highly concise description of the elements in Figure 2.7 at the levels marked ‘individual’ and ‘species’. In (95a), ‘a particular language L’ refers to the competence, considered as a state of the language faculty. In (95b), ‘a theory of language’ corresponds to UG. Otherwise the terminology is the same as used before. The main differences between the MP and GB-theory are at the theoretical level. Lasnik (1999) characterises one of the most important differences in (96). (96)
‘If there is a leading technical idea in Minimalism, it is that movement is a last resort, taking place only when triggered by a driving force.’ [Lasnik (1999: 2)]
Given that GB-theory assumed move α as the only transformation, i.e. move anything anywhere, Last Resort as referred to in (96) means a complete reversal of the approach to movement. In GB-theory, principles were concerned with constraints on movement, whereas in the MP the discussion is on triggers for movement. Another major difference is that Chomsky (1995b [1993]: 186–199) abolishes D-structure and S-structure as levels of representation in the MP. Instead, a derivation starts by selecting the lexical entries and combining them. At some point, the derivational path splits into a path leading to LF and to conceptual interpretation (meaning) and one leading to PF and phonetic interpretation. 25 The fact that the nature of these changes is a matter of theory does not absolve us from explaining them in terms of the research programme. For most theoretical changes, it is not so difficult to interpret them as the result of previous theory, new data, and the research programme. This can be illustrated
by Rizzi’s (1980 [1977]) study of Italian wh-questions. Rizzi takes Chomsky’s (1973) Subjacency Condition as a theoretical background. The new data from Italian are in conflict with the condition as proposed for English. Rizzi’s solution is to propose a parameter. In this way the new data can be incorporated in the modified theory in a way compatible with the learnability condition imposed by the research programme. Compared to this episode, the transition from GB theory to the MP constitutes a much more radical change. It requires a large-scale reformulation of the existing theory. The article that (94) is taken from is an argument that the transition to the MP was not rationally motivated. It triggered reactions by a number of prominent linguists working in Chomskyan linguistics. The two statements in (97) and (98), drawn from these reactions, give an indication of how the motivation for the MP was perceived by researchers in Chomskyan linguistics. (97)
‘By the mid-nineties, the expressive power of GB-theory had increased to such an extent that virtually any fact could be accommodated. Instead of a theory, it had become just a descriptive apparatus.’ [Reuland (2000: 845)]
(98)
‘Many linguists who worked seriously within GB noticed that it allowed too much power.’ [Uriagereka (2000: 870)]
The situation described by (97) and (98) is not unlike the one we found in Section 2.3.3. As suggested by (49), a theory that has enough descriptive power to incorporate anything can explain nothing. What (97) and (98) show is that the tension expressed in (67) in Section 2.4.3 was no longer adequate to provide a basis for explanation. There were too many possibilities to account for the same set of data and not enough arguments to favour one of these possibilities over the others. It is against this background that Chomsky (1993a) starts to reconsider the entire basis of the theory, trying to reduce it to ‘the domain of virtual conceptual necessity’ (1995b [1993]: 169). Conceptual necessity is of course a matter of perspective. It is determined by the research programme. It is in this context that additional questions about language are considered.
2.6.2
Two additional questions about language
The discussion of the questions listed in (52) in Section 2.4 is basic to the research programme of Chomskyan linguistics. Chomsky lists them in several publications. Chomsky (1988) adds a fourth question.
(99)
a. ‘4. What are the physical mechanisms that serve as the material basis for this system of knowledge and for the use of this knowledge?’ [Chomsky (1988: 3)]
b. ‘The fourth question is a relatively new one, in fact one that is still on the horizon.’ [Chomsky (1988: 6)]
Before any published account of the MP, the question in (99a) suggests a new way to guide linguistic research. The novelty is emphasised in (99b). Chomsky (1988: 133) repeats the list of four questions, calling them ‘four central questions that arise in the study of language’ without suggesting that the list is complete. In fact, in a 1993 lecture, Chomsky extends the list as in (100).
(100)
a. ‘What exactly are these properties of things in the world?
b. How do they arise in the individual and the species?
c. How are they put to use in action and interpretation?
d. How can organized matter have these properties (the new version of the unification problem)?’ [Chomsky (1993b: 46)]
The context in (100) is a discussion of the mind/body problem as formulated by Descartes. As Chomsky explains, Descartes assumed a mechanistic world, in which every physical process should be explained in terms of contact phenomena. The mind is then a different kind of substance. This world view turned out to be untenable because it is incompatible with gravity, as proposed by Newton, which assumes attraction between bodies that are not in contact. The initial resistance Newton’s proposal generated was overcome by the admiration for his elegant explanation, unifying the account of physical processes such as apples falling from trees and of astronomical processes such as planets orbiting around the Sun. If there is no coherent theory of bodies, it does not make sense to single out mind as a different type of substance. In the eighteenth century, the idea emerged that mental processes should have a physical basis. It is this context that makes ‘these properties of things in the world’ in (100a) equivalent to ‘knowledge of language’ in (52). 26 Although (100) only includes four items, originally presented as running text, it actually contains five questions. The three questions in (52) correspond to (100a-c) and the additional question in (99a) to (100d), but (100b) also raises the issue of the origin of the language faculty in the species. Therefore we have two new questions compared to (52), the realisation of the language faculty in the brain and the evolutionary origin of the language faculty. Before considering each of these questions in more detail, it is worth relativising their novelty. In fact, in Aspects we already find (101).
(101) ‘there is surely no reason today for taking seriously a position that attributes a complex human achievement entirely to months (or at most years) of experience, rather than to millions of years of evolution or to principles of neural organization that may be even more deeply grounded in physical law.’ [Chomsky (1965: 59)]
The context of (101) is an argument against the eighteenth century belief that language is entirely learned. Chomsky identifies exactly the two issues of evolution and representation in the brain as yielding the most convincing evidence against such a view. Starting from a more general perspective, Chomsky (1980a [1976]: 227) lists five ‘questions we might ask about an organ of the body’, e.g. the eye: function, structure, physical basis, development in the individual, evolutionary development. He applies these to language as well, but draws a modest conclusion for the two new questions in (102).
(102) ‘I have said nothing so far about the questions (1c) and (1e) – namely, the physical realization of the abstract structures of language and their evolutionary history. In fact, little is known about these questions, though the first, at least, may be open to serious investigation.’ [Chomsky (1980a: 239)]
Therefore, the two new questions are new only in the sense that they start playing a more substantive role in discussions from the 1990s onwards. Let us now consider each question separately, concentrating on their possible contribution to the research programme.
2.6.2.1 Linguistics and brain science
The realisation of competence in the brain was a question present from an early stage. Chomsky (2000a: 24) mentions that it had been formulated ‘about forty years ago’ and we find a particularly clear statement in (103), embedded in Katz’s argument that mentalism of the type adopted by Chomsky is not incompatible with science.
(103) ‘a linguist can assert that his theory correctly represents the structure of the mechanism underlying the speaker’s ability to communicate with other speakers. This mechanism is, according to the mentalist linguist, a brain mechanism, a component of the neural system.’ [Katz (1964: 128)]
Chomsky (1980a: 5) makes the same point as (103) when he calls the difference between mind and brain a matter of the degree of abstraction. In (104), he goes a step further and draws a parallel between organs in the mind and in the body.
Chomskyan Linguistics and its Competitors
(104) ‘When we turn to the mind and its products, the situation is not qualitatively different from what we find in the case of the body. Here too we find structures of considerable intricacy, developing quite uniformly, far transcending the limited environmental factors that trigger and partially shape their growth.’ [Chomsky (1980a: 38)]
The ‘structures’ in (104) refer to organs like kidneys but also to mental organs. Given their complex and intricate nature, it may be considered as surprising that these organs are so similar among different individuals. In the case of kidneys, we are not really surprised because we assume that they are genetically determined. The limited environmental factors (e.g. amount and nature of food) need not contribute much to their shape, although they have a certain influence. In (104) Chomsky invites us to think of mental organs, including language, in similar terms. As examples of other mental organs Chomsky (1980a: 248f.) mentions the number faculty and the face recognition faculty. The idea of language as a mental organ is a recurrent theme, also seen in (65) above. Another recurrent theme related to brain science is the question of unification, encountered in (100d). Chomsky addresses this point in various places, but (105) is a good summary. (105) a.
‘We naturally want to solve the unification problem, that is, to relate studies of the brain undertaken at various levels. b. Sometimes unification will be reductive, as when much of biology was incorporated within known biochemistry; c. sometimes it may require radical modification of the more ‘fundamental’ discipline, as when physics was ‘expanded’ in the new quantum theory, enabling it to account for properties that had been discovered and explained by chemists. d. We cannot know in advance what course unification will take, if it succeeds at all.’ [Chomsky (1993b: 44)]
If we assume that the object of study in linguistics is a real object, as Chomskyan linguistics does, we expect that linguistic findings will at some point converge with the results of other sciences studying related real objects. Thus, (105a) is a direct consequence of the model in Figure 2.7. The evaluation of relations between different sciences is often difficult. As (106), taken from a recent history of chemistry, suggests, (105b-c) are not uncontroversial.
(106) a. ‘In contrast to the on-going debates about the possibility of reducing the description of the living to the laws of physics and chemistry,
b. or the description of “mental states” to brain states,
c. one point seems established: chemistry has been effectively reduced to physics.’ [Bensaude-Vincent and Stengers (1995: 309)27]
Chomsky’s assessment in (105b) is presented as controversial in (106a) and his conclusion in (105c) is emphatically denied in (106c). While Bensaude-Vincent and Stengers (1995: 304–308) describe the process of unification between physics and chemistry in very similar terms to, for instance, Chomsky (1997: 17f.), their conclusion is opposite. Another interesting feature of the comparison is that (106b) takes the very basis assumed by Chomsky as being controversial. The explanation of this statement can be very simple, because there is no general agreement that the Chomskyan view of language is the correct one. Even when important details are more controversial than Chomsky presents them in (105), however, the main point remains that there are in fact different ways two scientific fields can be unified. The conclusion in (105d) is not affected by the disagreement about the analysis of particular cases illustrated in (106).
There are two potential reasons for raising a question such as (99a) or (100d) on the relation of language and brain. On the one hand, it is a question about language and we would like a theory of linguistics to be at least compatible with the best answer. In this sense, the role of the question is similar to (52c), language use. On the other hand, the question may be given a role in constraining linguistic theory in the same way as (52b), language acquisition. Chomsky has quite outspoken views on this issue, based on his assessment in (107).
(107) ‘The current situation is that we have good and improving theories of some aspects of language and mind, but only rudimentary ideas about the relation of any of this to the brain.’ [Chomsky (1995a: 11)]
The explanatory power of theories developed in the brain sciences is called into question in (107). We may find interesting data, such as those summarised by Friederici (2002) and discussed in Section 2.2.3, but they remain very limited in their impact because they only indicate active areas in the brain. What we would need is a theory linking these findings to linguistic statements at a level of detail that is interesting to linguistics. Instead of only patterns indicating the processing of word-category errors as opposed to syntactic errors, we would need data about, for instance, the activation of particular principles, so that we can assess their scope. So far, no theory of electrical activity of the brain has emerged that can establish causal links of this kind.
The relation between linguistics and brain science does not assume a strong role in the model of the research programme of Chomskyan linguistics. The difference in success rates referred to in (107) indicates rather that brain science should take linguistic results as heuristics, not the other way around. At the moment linguistics still tells brain science what to look for in order to ‘calibrate’ its measuring tools.
2.6.2.2 The evolutionary origin of language
The question of the origin of language has for centuries been the subject of much speculation. However, most of this speculation concerns elements that are not directly relevant to Chomskyan linguistics. In the eighteenth century, philosophers wondered about the origin of words, asking, for instance, what was the oldest language. In the nineteenth century, the emergence of Darwin’s theory of evolution raised the issue of the origin of human speech organs. Biologists asked, for instance, whether particular early hominids could speak. The twentieth century saw a rise of interest in language as communicative behaviour, which led to questions such as how increased group size correlated with the emergence of language.
None of these particular issues is central to Chomskyan linguistics. Words in language do not exist without people knowing them. Moreover, the lexicon does not belong to the core of language. Speech organs, i.e. the vocal tract and the ear, are subordinate to competence. It is now generally assumed that sign languages as used by the deaf community are as fully fledged languages as sound languages. The basic argument to this effect is summarised, for instance, by Jackendoff (1993: 83–98) and by Sutton-Spence and Woll (1999: 1–20). The idea that communication is only one of the functions of language was discussed in Section 2.4.1. The Chomskyan position is exemplified in (58).
In the domain of language origin, the questions of most interest in Chomskyan linguistics concern the origin of the language faculty, FL. Hauser et al. (2002: 1570) identify ‘three theoretical issues’ that ‘cross-cut the debate on language evolution.’ They are listed in (108).
(108) a. Is the FL uniquely human or shared with other species?
b. Was the emergence of FL gradual or saltational?
c. Is FL based on the adaptation of an earlier system of communication or is it an exaptation?
The issue in (108a) has been hotly debated. Chomsky’s position has always been that there is a qualitative difference. Central in human language is its creativity, as expressed in (3). Chomsky contrasts this with animal communication systems in (109). (109) a.
‘Each known animal communication system either consists of a fixed number of signals, each associated with a specific range of eliciting conditions or internal states,
b. or a fixed number of ‘linguistic dimensions’ each associated with a nonlinguistic dimension in the sense that selection of a point along one indicates a corresponding point along the other.’ [Chomsky (1966b: 78, fn. 8)]
The systems alluded to in (109a) consist of a finite set of discrete signals. They are found among birds (e.g. parrots) and among nonhuman primates (e.g. chimpanzees). An example of a system alluded to in (109b) is bee dancing. Bees indicate the direction, distance, and quantity of food they have found by the orientation, duration, and intensity of their dance, along continuous scales. Anderson (2004) gives a useful overview discussion of animal communication systems and their difference from human language. The issue in (108b) is whether the emergence of FL was a series of small adaptations leading to gradually more sophisticated versions or a single major step separating animals without FL and human beings with FL. The one in (108c) concerns the question whether FL developed out of an earlier, less sophisticated system of communication or from something with a different purpose. The terminology is taken from Gould (1991). Adaptation is the classical process invoked in evolution, by which natural selection shapes a particular feature of a species for a particular use. The problem is that there are many features that adaptation cannot explain, because they must have arisen independently of their current use. Thus, wings could not develop as an adaptation for flying, because they have to reach a certain size before they can serve the purpose of flying. The classical solution is that they developed for another purpose, e.g. thermoregulation, and were then discovered as enabling flight. Gould proposes to use the term such that in this scenario, wings were an adaptation for thermoregulation and their use for flight was an exaptation. The position taken by Hauser et al. (2002) is that in order to address the questions in (108), FL should be analysed into its component parts as in (110). (110) a.
‘Faculty of language – broad sense (FLB). FLB includes an internal computational system (FLN, below) combined with at least two other organism-internal systems, which we call “sensory motor” and “conceptual-intentional”. […] b. Faculty of language – narrow sense (FLN). FLN is the abstract linguistic computational system alone, independent of the other systems with which it interacts and interfaces.’ [Hauser et al. (2002: 1570f.)]
It is important to keep in mind that FLB, as intended in (110a), does not include non-linguistic elements. This means that FLB excludes memory, respiration and other ‘organism-internal systems that are necessary but not sufficient for language’ (2002: 1571). Moreover, if we accept (110a) and its further explanation, FLB also excludes, for instance, the sensory-motor systems involved in producing and listening to music and the conceptual-intentional aspects of seeing and enjoying a beautiful painting. The central interest of Chomskyan linguistics is in FLN as defined in (110b) and elaborated in (111). (111) a. ‘a core property of FLN is recursion’ b. ‘This capacity of FLN yields discrete infinity.’ [Hauser et al. (2002: 1571)]
The concept of ‘discrete infinity’ as differentiating human language from animal communication systems is found also in Chomsky (2004 [1982]: 47) and elsewhere. Infinity of this type arises through recursive rule application. There is no longest sentence, because for every sentence S it is possible to embed it in a pattern such as ‘ said/thought/etc. that S’. The result of applying this pattern is a new, longer S, which can itself be input to the same pattern (recursion). The infinity achieved in this way is discrete rather than continuous, because the number of applications of the pattern is an integer. It can be applied once, six times, or forty times, but not 1.6 times. Continuous infinity exists in bee communication, where infinitely many values of parameters such as distance and quantity of food can be distinguished on a cline. Between any two points on each of these clines, a new value is available. A first consequence Hauser et al. (2002) draw from the analysis of FL is that questions such as (108) should not be answered for FL as a whole, but for each of the components separately. In particular, they formulate the hypothesis in (112). 28 (112) a.
‘FLB contains a wide variety of cognitive and perceptual mechanisms shared with other species, b. but only those mechanisms underlying FLN – particularly its capacity for discrete infinity – are uniquely human.’ [Hauser et al. (2002: 1573)]
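The recursive pattern behind discrete infinity, described above, can be made concrete in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names `embed` and `embed_n_times`, the subject Mary, and the verb thinks are hypothetical choices made for illustration and are not taken from Hauser et al. (2002) or from Chomsky. The sketch shows that the output of the embedding pattern can always be fed back in as input, so that there is no longest sentence, and that the number of applications must be a whole number.

```python
def embed(sentence: str, subject: str = "Mary", verb: str = "thinks") -> str:
    """Embed a sentence S in the pattern '<subject> <verb> that S'."""
    return f"{subject} {verb} that {sentence}"


def embed_n_times(sentence: str, n: int) -> str:
    """Apply the embedding pattern n times to the given sentence."""
    # Discreteness: the pattern applies once, six times, or forty times,
    # but never 1.6 times.
    if not isinstance(n, int) or n < 0:
        raise ValueError("the pattern applies only a whole number of times")
    result = sentence
    for _ in range(n):
        # The output of one application is the input to the next (recursion).
        result = embed(result)
    return result


if __name__ == "__main__":
    for n in range(3):
        print(embed_n_times("it rains", n))
```

Each additional application yields a strictly longer sentence, whereas a request such as `embed_n_times("it rains", 1.6)` is rejected. A continuous system of the bee-dance type would instead have to be modelled with real-valued parameters.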
Evidence for both parts of (112) is given by the many studies of primate and bird communication systems. Research on grey parrots reported by Pepperberg (1999), for instance, suggests that they have both the sensory-motor capacities
to distinguish and produce sounds of human language and the conceptual-intentional capacities to associate words with concepts. As an example of the level of sophistication of their sensory-motor capacities, Pepperberg (1999: 288) mentions that her parrot Alex is able to produce and recognise the difference between /p/ and /t/ in pea and tea. Most of her book is devoted to showing the high level of cognitive processing. Pepperberg (1999: 62–79) argues, for instance, that Alex managed to learn the concepts of same and different, and apply them to the shapes, colours, and materials of objects. Remarkable though these findings may be, they are fully compatible with (112a). In a more recent summary of her findings, Pepperberg states that ‘He uses a very limited form of syntax’ (2005: 66) and that ‘the behavior of the parrots described here is fully equivalent neither to human language nor to the vocal behaviour exhibited even by young humans’ (2005: 67). This is exactly what is to be expected in view of (112b).
Additional evidence for (112) comes from a non-linguistic source. As mentioned in the context of (104) above, Chomsky considers the number faculty as another ‘mental organ’. Integers also have the property of discrete infinity. Hauser et al. list the results of various experiments demonstrating that the manipulation of numbers by animals and humans displays a similar difference to the one found in language and summarise their findings in (113).29
(113) ‘A human child who has acquired the numbers 1, 2, and 3 (and sometimes 4) goes on to acquire all the others; he or she grasps the idea that the integer list is constructed on the basis of the successor function. For chimpanzees, in contrast, each number on the integer list required the same amount of time to learn.’ [Hauser et al. (2002: 1577)]
The conclusion Hauser et al. (2002) draw is that (112) is a reasonable hypothesis as an answer to (108a). Elsewhere, Chomsky and others have speculated about the consequences of this hypothesis for (108b) and (108c). Thus, Gould takes language as one of the human characteristics that exemplify exaptation, stating that this is accepted by Chomsky as ‘a proper translation of his views into the language of my field’ (1991: 61f.). Such speculations and insights mainly serve as a background for the hypothesis in (112). They do not have the same degree of independent influence on the way the research programme is formulated. It is to the consequences for the research programme that we will turn now.
2.6.3 Extension of the model
Both the realisation of the language faculty in the brain and its evolution as part of the genetic specification of the human species are important questions in the study of language. They are questions at a higher level of abstraction than the others we considered so far. The question about the nature of competence can be addressed on the basis of grammaticality judgements and various other types of data. Questions about the use and acquisition of competence are one step removed from these data. Questions about the realisation and evolution of the language faculty are one step further removed. I will argue here that from the perspective of the role they assume in the research programme, the distinction between the questions of the realisation and evolution of the language faculty is parallel to the one between the use and acquisition of language. In Section 2.4 we considered the roles of use and acquisition in the research programme. Language use gives interesting data about the nature of competence, in the form of performance, but it does not restrict the theory of competence beyond that. It tells us what to account for, but not how to account for it. Language acquisition not only gives data about the nature of competence, but also imposes constraints on the type of theory that can be hypothesised for competence. It creates the tension between variety and learnability of I-languages necessary to solve the indeterminacy arising in Figure 2.3. Let us now turn to the problem of indeterminacy associated with Figure 2.7. Instead of the indeterminacy of the grammar as a description of the competence in terms of rewrite rules, we are now dealing with the indeterminacy of UG as a description of the language faculty in terms of principles and parameters. Observations about the realisation of the language faculty in the brain give us data about the nature of the language faculty. 
They tell us what to account for in a theory of the language faculty, but do not restrict the ways we can account for it. This is a role similar to that of language use in a theory of competence. On the other hand, the evolution of the language faculty, while less prolific in directly providing data, constrains the way it can be properly described. In order to solve indeterminacy, we need a tension between forces drawing in opposite directions. As a background, (114) describes the assumption about what exactly is included in FLN. (114) ‘FLN comprises only the core computational mechanisms of recursion as they appear in narrow syntax and the mappings to the interfaces.’ [Hauser et al. (2002: 1573)]
Chomsky and Lasnik describe ‘narrow syntax’ as ‘the derivation relating D-structure, S-structure, and LF.’ and state that ‘D-structure, LF, and PF are interface levels’ (1995: 34). 30 This description is based on the syntactic model from before the MP. Chomsky (1995b: 187–199) argues that D-structure and S-structure should be dispensed with as levels of representation. This does not alter the scope of narrow syntax, however, because the derivation continues to cover the same domain. We still start with elements from the lexicon and end up with LF and PF representations. Hauser et al. (2002: 1573) use (114) as an argument why FLN is not necessarily an adaptation. It is so restricted in inherent properties, that it may have arisen ‘accidentally’, as an exaptation. On the basis of this idea, Chomsky arrives at the Strong Minimalist Thesis (SMT) in (115). (115) a.
‘The extralinguistic systems include sensorimotor and conceptual systems, which have their own properties independent of the language faculty. b. These systems establish what we might call “minimal design specifications” for the language faculty. To be usable at all, a language must be “legible” at the interface: the expressions it generates must consist of properties that can be interpreted by these external systems. c. One thesis, which seems to me much more plausible than anyone could have guessed a few years ago, is that these minimal design specifications are also maximal conditions in nontrivial respects.’ [Chomsky (2000a: 25)]
The ‘thesis’ indicated in (115c) is referred to as ‘This strong minimalist thesis, as it is sometimes called,’ further down on the same page. The idea of (115c) is that the only conditions determining FLN are recursion, as mentioned in (114), and compatibility with the interfaces to sound and meaning, as mentioned in (115b). The external conditions are then imposed by the two systems mentioned in (115a), in which we recognise the part of FLB not included in FLN in (110a). The SMT is not compatible with a rich system of articulated principles such as X-bar theory and Case theory, as described, for instance, by Chomsky (1982: 4–17). If such a system were assumed as the outcome of evolution, it could hardly be the result of a single exaptation but would have to involve a series of small adaptations. Therefore, the SMT implies a rather different conception of the nature of FLN. In view of the SMT, we may extend the model of the research programme of Chomskyan linguistics from Figure 2.7 to Figure 2.10.
Figure 2.10: Model of the research programme of Chomskyan linguistics as used in the Minimalist Program
[Diagram not reproduced. Levels and their paired elements: Data (Observations, Observable facts); Individual (Grammar, Competence); Species (Universal Grammar, Language Faculty); Life/matter (General theory, World). The arrows are labelled ‘describes’, ‘tests’, and ‘explains’.]
The levels of data, individual, and species in Figure 2.10 are the same as in Figure 2.7. What is added is the higher level marked ‘Life/matter’. This level represents the belief that, ultimately, physics, chemistry, biology, psychology, and linguistics describe different aspects of the same world. Therefore they have to be mutually compatible. A comprehensive theory of the world is of course far beyond what we can hope to formulate, but specialised aspects of it can be used as constraints on theories in other domains. The top level in Figure 2.10 is a kind of general interface between different specialised sciences, each of which has a model of the type of Figure 2.10 describing how they approach theory formation. Although the transition from GB-theory to the MP can thus be seen to affect the model of the research programme, it can hardly be said to lead to a new research programme. In this sense, the analysis in (94) is not correct.
The entire model adopted in GB-theory (i.e. Figure 2.7) is still valid in the MP. The transition is rather of a type similar to the one from ST to GB-theory. In ST, the focus of attention was the level of the individual. Linguists were aware that there was a higher level, but this did not have direct implications for linguistic work. In GB-theory, this higher level, marked ‘species’ in Figure 2.7 and Figure 2.10, received more attention. It was made operationally relevant by the P&P model which, as (68) states, makes data from one language relevant to the analysis of another one. Throughout, linguists were aware that the language faculty must have evolved in some way. Chomsky (1965) states this in (101). Also the discussion by Chomsky (1980a), from which (102) is taken, shows this awareness. At that point, however, there was no way to make this idea operational as a constraint on theorising. What the MP does is to indicate some tentative lines of argument to make the evolution of the language faculty relevant to the formulation of theories about the language faculty. In GB-theory, it was not necessary to have a full UG before applying it to individual grammars. Instead, a general form of such a theory was assumed (the P&P framework) and the specification of UG continued in parallel to the formulation of aspects of individual grammars. In the MP, a further level is added, above UG. The main conceptual difference is that this higher level is much more inclusive and does not itself belong to linguistics. As a consequence, we do not have a framework comparable to P&P, but only whatever different disciplines have contributed to what we know about the world. The SMT makes some of this knowledge relevant to linguistics. As Roberts states in (116), this means raising the level of permissible why-questions. 
(116) ‘ ‘Substantive’ minimalism attempts to provide UG principles with a more solid basis by addressing the ‘why’ question: why should UG have the properties it appears to have?’ [Roberts (2000: 853)]
In (116), ‘substantive minimalism’ is used in opposition to methodological minimalism. This opposition is explained, for instance, by Chomsky (2004: 154–158). Methodological minimalism is very similar to theoretical elegance, ‘You want the best theory of whatever object you’re looking at’ (2004: 156). In the case of language, it corresponds to explanatory adequacy in (69). Substantive minimalism, as stated in (116), is in a sense one level higher. Where descriptive and explanatory adequacy ask for ‘what’ and ‘why’ at the level of competence, methodological and substantive minimalism ask for ‘what’ and ‘why’ at the level of the language faculty. Ten Hacken (2006c) incorporates this insight into the levels of adequacy by introducing the concepts of relativised descriptive and explanatory adequacy.
2.6.4 Perfection
A controversial feature of the MP is that it is often thought of as claiming that language is in some sense ‘perfect’. This perception is formulated in (117).
(117) ‘Underlying the economy principles of the MP is the idea that grammar is a perfect computational system for mapping a selection of lexical items (a numeration) to a pair of interfaces with conceptual (semantic-pragmatic) and articulatory (phonetic) cognitive modules, respectively.’ [Lappin et al. (2000a: 665)]
On the basis of (117), Lappin et al. take Chomsky to task for not specifying the type of perfection. After considering a number of interpretations and arguing that they do not apply to the MP, they conclude that ‘the foundational assumption of the MP rests upon an obscure metaphor rather than a precise claim with empirical content’ (2000a: 666). In fact, (117) misrepresents the role of perfection in the MP. Chomsky introduces perfection in (118). (118) a.
‘This work is motivated by two related questions: (1) what are the general conditions that the human language faculty should be expected to satisfy? and (2) to what extent is the language faculty determined by these conditions, without special structure that lies beyond them? […] b. To the extent that the answer to question (2) is positive, language is something like a ‘perfect system,’ meeting external constraints as well as can be done.’ [Chomsky (1995b: 1)]
The specification of the goals of the MP in (118) should be read in conjunction with (115). The ‘general conditions’ in (118a) are the ‘minimal design specifications’ in (115b). They are the conditions imposed by the environment, in particular the sensorimotor and conceptual systems in (115a). The attribute perfect is applied in (118b) to qualify the degree of success in describing language in accordance with the ideal proposed in (118a). This ideal corresponds to the Strong Minimalist Thesis in (115). Therefore, perfection is not a claim, but rather a heuristic. This interpretation is supported by (119). (119) ‘we seek to determine just how far the evidence really carries us toward attributing specific structure to the language faculty, requiring that every departure from ‘perfection’ be closely analyzed and well motivated.’ [Chomsky (1995b: 9)]
The heuristic in (119) works such that, for each element of the theory, the question is asked whether it can be explained in terms of the external constraints placed on the language faculty. If no plausible explanation can be found, we should consider whether we can dispense with it. It is this methodology which also motivates the label minimalist in the MP.
2.6.5 Conclusion
In deciding whether the MP is a continuation of the P&P framework established in the 1980s, we have to distinguish the level of theory from the level of research programme. The issue is very similar to the one concerning the earlier stages of Chomskyan linguistics. Again we can observe an important change in theoretical framework. All results obtained in the earlier framework had to be reformulated, because GB-theory is based on formulating constraints on movement and the MP assumes movement as a last resort. The arguments for the change are based on many of the same considerations, however. Like in the earlier period, a certain shift of attention to a higher level of explanation can be observed. Although this leads to an extension of the model of the research programme in Figure 2.10, the change is not a revolutionary break. Indications of the added level can be found in publications from the mid 1960s onwards, although the remarks in question did not have any practical value at the time. Therefore, the research programme of Chomskyan linguistics provides a basis for discussing the relative merits of analyses in a GB framework and in the MP. Recent developments have been criticised in terms such as (120). (120) ‘the rapid and widespread move from GB to MP has been, in no small part, conditioned by a tendency to accept arguments from authority rather than by critical inquiry.’ [Lappin et al. (2000b: 888)]
While Lappin et al. no doubt intended the ‘authority’ in (120) to be Noam Chomsky, a better interpretation of (120) is arrived at when we take it to be the research programme of Chomskyan linguistics. Not every researcher has to individually assess and judge each development in the theory, but the research programme offers an authority that makes such an assessment possible whenever it is desired.
SUMMARY
• The GB-theory of Chomsky (1981a) and the MP of Chomsky (1995a) are different theoretical approaches within the same research programme of Chomskyan linguistics.
• In the context of the MP, two additional questions are raised about language, viz. the realisation of the language faculty in the brain and the evolution of the language faculty as part of the genetic specification of the human species.
• Neurophysiological studies of the realisation of language in the brain are as yet guided by linguistic theory rather than the reverse.
• The question of evolution triggers the analysis of the language faculty into different components. The broad FLB includes cognitive and perceptual mechanisms shared with other species and the narrow FLN, those unique to the human species.
• FLN covers narrow syntax and the mappings to the interfaces with the sensori-motor and conceptual-intentional modules. It includes recursion, which yields the capacity for discrete infinity.
• Ideally, FLN can be described as recursion and a number of properties to satisfy the conditions imposed by the environment (interface modules).
• To the extent that this ideal can be achieved, language is perfect, the language faculty can be explained and linguistics can be unified with (other) natural sciences. Substantive minimalism addresses the question to what extent this ideal is achieved.
• Perfection is a heuristic tool. Whenever a divergence from perfection is found, the approach is to attempt to either eliminate it or explain it as a necessary concession to the interfaces.
Notes
1
While generative grammar is the name used by Chomsky to refer to his own research programme, this term is often used with a broader meaning, including, for instance, the approaches discussed in Chapter 4.
2
Note that Chomsky’s (1980a) book is compiled from different sources. The chapter containing (13) is a lecture given in November 1976. The chapter containing (14) is one of a series of lectures given in November 1978. This explains the partial overlap and increased specification in (14).
3
Jespersen (1924) does not propose a Chomskyan model of competence, but in the section ‘Formulas and Free Expressions’ (1924: 18–24) he argues that language cannot be reduced to formulas and that a ‘notion of structure’ is necessary to account for the freely formulated expressions. In other respects, Jespersen’s views differ markedly from Chomsky’s. His account of language acquisition, for instance, diverges from the one in Chomskyan linguistics, presented in Section 2.4 below, in essential respects.
4
The examples are numbered (1), (2), (3) and (5) in Chomsky’s text. The last two do not have an initial capital in the original. Chomsky (1957) does not use the convention of stars to indicate ungrammaticality.
5
Positron emission tomography (PET) is an imaging technique which produces a three dimensional image or map of functional processes in the body by means of a scan of a radioactive tracer isotope. Functional magnetic resonance imaging (fMRI) is a technique in which the local consumption of oxygen in the brain is made visible. It uses the fact that hemoglobin, the blood substance that transports oxygen, has different magnetic resonance properties depending on whether it is oxygenated or not. Both techniques are generally taken to show neural activity. Blakemore and Frith (2005: 188–195) give an accessible overview of data collection techniques in neuroscience.
6
LAN stands for ‘left-anterior negativity’, ELAN for ‘a very early LAN’, and P600 is the name of ‘a late centro-parietal positivity’.
7
Matthews (1967: 121f.) also criticises the systematic ambiguity of grammar, but his perspective is rhetorical rather than terminological. The point made by Matthews is that Chomsky’s assumptions are not compelling. From the perspective outlined in Chapter 1, however, the presentation of a research programme is meant to be coherent, convincing, and suggestive of interesting solutions rather than logically compelling.
8
At the time (44) was written, the structure, represented as a tree, was supposed to be the result of a series of rule applications (rewrite rules and transformations). An example is given in Section 2.5.1.1.
9
Dynamic aphasia is a type of aphasia described by Luria (1966: 358–362) as marked by problems with understanding and producing complex verbal constructions.
10
Chomsky and Miller (1963) present the questions in the same order and also refer to ‘ability’. Their presentation is discussed in Section 4.1.4.
11
The details of the analysis of creoles and their origins are the subject of much debate. This sketch is broadly based on Bickerton (1999). Sebba (1997) gives an overview of issues involved and discusses arguments for and against Bickerton’s analysis. Another example where input is much less systematic than the I-language acquired by the child is home sign, as discussed by Goldin-Meadow and Mylander (1990).
12
The question is often framed in terms of determining the purpose of language. Chomsky (1975a: 55ff.) discusses a proposal by John Searle to save the claim that communication is the purpose of language by subsuming the expression of thoughts into the concept of communication. He argues that Searle’s proposal waters down the concept of communication to such an extent that the claim that the purpose of language is communication loses any interest.
13
The actual formulation in (59b) does not distinguish the production problem from the perception problem, but the preceding discussion does. Thus, Chomsky observes that ‘The rules that a person “accepts” do not tell him what to say’ (1975a: 77). Here accept is used in a specific technical sense. Accepting a rule R corresponds to ‘having a cognitive state that is described by R’.
14
It has sometimes been argued that a common origin of all languages is an alternative explanation of universals. Chomsky (1972a: 86) shows that it is inadequate because this explanation does not contribute to the explanation of language acquisition. Comrie (1989: 23f.) argues that it is also inadequate from a typological perspective because of the existence of implicational universals.
15
A similar figure is used in ten Hacken (2006a: 582). An earlier version appeared in ten Hacken (1997a: 289). That version was inspired by Botha (1981: 437).
16
Chomsky (1988: 60) solves this by adding another box to the right of the one labelled ‘I-language’ in Figure 2.4. This new box, labelled ‘structured expressions’, represents the data available to the linguist.
17
A more detailed analysis of the levels of adequacy, including their significance for the latest developments in Chomskyan linguistics, can be found in ten Hacken (2006c).
18
See Section 2.5.1 for a brief discussion of the concept of parameter and its role in the Government and Binding theory, the predominant theory in Chomskyan linguistics in the 1980s.
19
As indicated by the tree given by Chomsky (1965: 129), he proposed at that point to include at deep structure an explicit indication that the passive transformation should take place. Early textbooks make simplifications similar to Figure 2.8, e.g. Jacobs and Rosenbaum (1968: 25), Nique (1974: 120). Van Riemsdijk and Williams (1986: 34) give a passive transformation rule adapted to a deep structure of the type Chomsky (1965: 129) gives, which is slightly more complex than (81).
20
The description of the mechanisms here can do no more than indicate the spirit of the analysis. An accessible, well-founded presentation that introduces the three principles used in the analysis of the passive is contained in Chapters 1–3 of Haegeman (1994). As they are the start of her exposition of the theory, they can be read independently.
It is indicative of the increased degree of sophistication of the theory that Haegeman reaches a ‘preliminary discussion’ of the passive on p. 180ff., whereas Jacobs and Rosenbaum (1968) do so on p. 23ff.
21
As Chomsky (1981a: 16, fn. 1) states, he uses capitalisation to mark Case as a technical term.
22
Not only subordinate clauses were treated as embedded S nodes, but also, for instance, adjectival modification (e.g. red car) and nominal compounds (e.g. steamboat). Lees (1960) proposes to derive red car from The car is red and steamboat from Steam powers the boat.
23
Subcategorisation is concerned with syntactic constraints on the choice of lexical items, stating, for instance, that sleep cannot have an object. Selectional restrictions are semantic constraints on the choice of lexical items of the type that sleep requires an animate subject.
24
Two typical examples of dimensions are the number of rules and the number of theoretical entities. The research programme has to decide how much a decrease of one number is worth in terms of the other. In actual practice, the number of dimensions to be weighed against each other is much bigger and not even given in advance.
25
Note that, as stated in (44) in Section 2.3.1, the grammar is not a model of a speaker. Phonetic interpretation refers both to the encoding operations involved in speech production and the decoding operations in perception.
26
The account of Chomsky’s view of the mind/body problem is based mainly on Chomsky (2000b). The discussion by Lycan (2003) and Chomsky’s (2003) reply elaborate some of the points involved.
27
‘Or, en contraste avec les débats où se disputent dès cette époque, et aujourd’hui encore, la possibilité de réduire la description du vivant aux lois de la physicochimie, ou la description des « états mentaux » à ceux du cerveau, un point semble acquis : la chimie, quant à elle, a été effectivement réduite à la physique.’ My translation, emphasis in the original. Some contextual anaphors have been resolved in the translation to improve readability.
28
Elsewhere on the same page, Hauser et al. formulate the hypothesis ‘that most, if not all, of FLB is based on mechanisms shared with nonhuman animals’ (2002: 1573, emphasis added), repeated in the conclusion as ‘we have argued that most if not all of FLB is shared with other species, whereas FLN may be unique to humans’ (2002: 1578, emphasis added). The parts in italics are not compatible with the exposition of FLN and FLB in (110). Especially the latter is in direct contradiction.
29
Anderson (2004: 41–44) discusses the case of Clever Hans, a horse alleged to be able to do arithmetic calculations and to communicate the result by tapping his hoof. Elaborate research showed that this was neither a reflection of a number faculty in horses nor the result of secret signs consciously given by the trainer. Instead, the horse was found to be able to interpret the unconscious signs of tension on the part of the person asking the question when the count of hoof taps reached the right number.
30
The status of D-structure as an interface level is perhaps not straightforward. Chomsky explains it as ‘D-structure is the internal interface between the lexicon and the computational system’ (1995b: 187).
3
The Chomskyan revolution
From an early date, the emergence of Chomskyan linguistics has been considered a revolution. In his review of Chomsky (1957), Voegelin already raises the question whether it would trigger a ‘Copernican revolution’ (1958: 230). One of the earliest explicit positive answers to this question is (1). (1)
a. ‘It seems to me to be undisputable (though I know that there are very many who would dispute it)
b. that a revolution of the kind Kuhn describes has recently taken place in linguistics –
c. dating from the publication of Chomsky’s Syntactic Structures in 1957.’
[Thorne (1965: 74)]
It seems straightforward to use the term ‘Chomskyan revolution’ for what is described in (1b). This term is used, for instance, by Lyons (1970: 9). The discussion of (17) in Section 1.3.3 showed that it is often problematic to attribute revolutions to a particular event, whether this is an experiment or the publication of a book, as in (1c). As highlighted by (1a), the use of the term Chomskyan revolution has been controversial from the beginning. The objections to its use can be divided into three categories, listed in (2). (2)
a. There has been no Chomskyan revolution because Kuhn’s model of science does not apply to the field of linguistics.
b. There has been no Chomskyan revolution because Chomskyan linguistics is marked by a succession of internal revolutions.
c. There has been no Chomskyan revolution because (early) Chomskyan linguistics is a continuation of its predecessors.
Objections of the type in (2a) were discussed in Section 1.4. It was concluded that the arguments against applying Kuhn’s model to linguistics were not compelling, in particular if we concentrate on research programmes as defined in (7) in Section 1.2.2. Objections of the type in (2b) were discussed in Sections 2.5 and 2.6. It was concluded that the ‘internal revolutions’ were major changes
in the theoretical framework, which did not correspond to similarly radical changes in the research programme. What we observed instead is a gradual activation of elements of the research programme that had not contributed to the constraints of practical research options before. In this chapter we will be concerned with objections of the type in (2c). A question immediately raised by (2c) is what the relevant ‘predecessors’ of Chomskyan linguistics are. Mohrmann et al. (1961) offer a panoramic overview of the field of linguistics in preparation for the 1962 International Congress of Linguists. 1 There are twelve chapters, ranging from mathematical to anthropological linguistics and from linguistics in language teaching to Indo-European comparative linguistics. The relevant predecessors of Chomskyan linguistics are discussed in Fries’s (1961) chapter on ‘The Bloomfield ‘School’’. For reasons to be explained below, I will use the more precise term of Post-Bloomfieldians. In this chapter, we will first consider the research programme of this group (Section 3.1). Section 3.2 will then discuss the similarities to and differences from the research programme of Chomskyan linguistics as explained in Chapter 2 as well as their consequences. Section 3.3 evaluates the claim in (1) in the light of these findings.
3.1
The research programme of Post-Bloomfieldian linguistics
Before the emergence of Chomskyan linguistics, the segment of linguistics covering the study of phonology, morphology and syntax was dominated by the work of Leonard Bloomfield (1887–1949). This dominance is reflected in more than one way in (3), taken from Robins’s (1967) historical overview. (3)
a. ‘Every scholar is an individual and ‘schools’ and ‘periods’ are abstractions doing doubtful justice to the work and the workers actually comprised in them.
b. But in a survey as this, ‘Bloomfieldian linguistics’ can reasonably be treated as a unity;
c. and because during this period (1933–1957) […].’
[Robins (1967: 209)]
First, Bloomfield’s role in the field is reflected in the label Bloomfieldian linguistics introduced in (3b). Second, although the idea of indicating periods is preceded by the cautionary remark in (3a), the period indicated for Bloomfieldian linguistics is delimited by the publication of Bloomfield’s (1933) Language and Chomsky’s (1957) Syntactic Structures. As to the importance of the latter, Robins’s analysis in (3) is completely in line with Thorne’s statement in (1). It indicates Bloomfieldian linguistics as the immediate predecessor of Chomskyan linguistics.
However, there is no complete agreement among historiographers of linguistics in this respect. Fought (1995) expresses a different view in (4). (4)
a. ‘the main discontinuities in theory and practice within the American structuralist community appeared later,
b. between the group made up of Boas, Bloomfield, and Sapir together with certain of their early students and followers
c. and the younger group of distributionalist structuralists rather misleadingly known as ‘Bloomfieldians’.’
[Fought (1995: 295)]
The term American structuralism in (4a) has a wider scope than Bloomfieldian linguistics in (3). The hidden presupposition in (4) is that the main opposition would be the one between Bloomfield and Sapir. It is the group in (4c) that we are interested in here, because the period of their main influence, which Fought (1995: 295) places ‘in the late 1940s and 1950s’, immediately precedes and overlaps with the emergence of Chomskyan linguistics. 2 The Bloomfieldians in (4c) are thus not the same group as the adherents of Bloomfieldian linguistics in (3b). Following Newmeyer (1986a: 5) and Matthews (1993: 18) I will call the group referred to in (4c) Post-Bloomfieldians. The analysis of Bloomfieldian linguistics as a unity in (3b) clashes with the division made in (4). One of the objections to the analysis in (3) concerns the role of the Post-Bloomfieldian school. Bloomfield (1946: 2f.) explicitly opposed the idea of taking a common interest or opinion as a basis for a ‘school’. By contrast, an analysis of citation patterns by Matthews (1993: 18–20) shows that the Post-Bloomfieldians constituted ‘very much a school, and one that had become inward-looking’ (1993: 18). 3 In the same way as for Chomskyan linguistics, the question of the sources arises for the study of the research programme of Post-Bloomfieldian linguistics. The situation is more complex for the latter, because there is no single person whose role corresponds to Chomsky’s. Fought (1995: 303) mentions Bernard Bloch, Zellig Harris, Archibald Hill, Charles Hockett, Martin Joos, Henry Lee Smith, George Trager, and Rulon Wells as ‘prominent figures’ in Post-Bloomfieldian linguistics. Newmeyer (1986a: 5) gives a very similar list. Hymes and Fought (1981: 128f.) discuss various listings proposed in the 1950s and 1960s. The variation is minor and all lists indicate the prominent role of Bloch, Harris, and Hockett. 
We should not be surprised not to find a clear, concise and coherent presentation of the Post-Bloomfieldian research programme by any of these linguists. Coherent presentations of a research programme are rather unusual in science in general. For Chomskyan linguistics we are exceptionally lucky to have such works as Chomsky (1980a, 1986a), devoted primarily to explaining and defending the research programme. Post-Bloomfieldians did not generally feel the need to explain or defend their research programme in so much detail. They
continued to take (portions of) Bloomfield’s work as a basis and sometimes added methodological remarks to their research papers. This is the more typical situation as encountered in what Kuhn (1970a: 10) calls ‘normal science’ (cf. Section 1.3.2 above). Apart from the absence of an explicit, coherent presentation of the research programme, the study of Post-Bloomfieldian linguistics has to overcome the problem of historical distance. Post-Bloomfieldian linguistics is no longer an active research programme. In trying to reconstruct its model, we have to keep in mind the distinction between the intelligence of linguists and their knowledge. There is no reason to assume that linguists in the 1940s or 1950s were less intelligent than later linguists. Their collective knowledge, however, has increased since then. New insights have been gained in the course of the years and it is far less difficult to preserve them than it was to discover them. Therefore we have to be very careful in analysing the research programme of Post-Bloomfieldian linguistics. We can admit the absence of certain newer insights, but the overall result should be a coherent, rational research programme. Kuhn (1977: xi-xiii) describes a similar problem when he tried to understand Aristotle’s view of physics through the question of how much of Galilean and Newtonian mechanics was known to Aristotle. He could only conclude that Aristotle knew very little of it and stated many obvious falsehoods and confusing generalisations. In order to understand how an intelligent man and keen observer such as Aristotle came up with such a system, Kuhn discovered that he had to abandon his original question and approach Aristotle’s theory on his own terms. The vigorous debates between proponents of Post-Bloomfieldian and of Chomskyan linguistics will be addressed in Section 3.2 below. 
The fact that they took place should make us suspicious when coming across such representations of the Post-Bloomfieldian research programme as Botha (1981: 425f.). Although most of the elements of Botha’s description are based on elements of Post-Bloomfieldian writings, the general incoherence of the resulting model should convince us that it does not represent the way the Post-Bloomfieldians saw their own work. 4 For similar reasons, Smith’s (1999: 8) claim that before Chomsky, American structuralists had ‘a naïve methodology’ should not be taken at face value. Post-Bloomfieldian linguistics is not a failed attempt to arrive at Chomskyan linguistics, but an attempt to arrive at the scientific study of language that was convincing enough to attract a rather large following throughout the 1940s and 1950s. The strongest argument Post-Bloomfieldians had for their approach is that they offered a method of studying language that was as scientific as the method employed in a hard science such as astronomy. An example of good scientific work in astronomy is the collection of observations of positions of
a particular planet and their systematic presentation in terms of the planet’s orbit. An essential component of science in this context is that the interpretation process is strictly driven by nature, external to the scientist/observer. The observations are determined by nature and for their interpretation there are explicit, logically secure procedures which leave no room for subjective considerations by the scientist. To what extent this analysis of astronomy is correct is a question I will temporarily leave out of consideration. For now it is sufficient that it was (and still is) a view which is widely adhered to by scientists and non-scientists alike. The questions engendered by this perspective are in many respects similar to the ones we discussed in detail in Chapter 2. Thus, Section 3.1.1 addresses the question of the nature of individual languages, Section 3.1.2 that of the nature of the data, and Section 3.1.3 that of the status of grammars. Apart from these issues, the question of scientific procedures for the handling of data had a prominent role in Post-Bloomfieldian linguistics. It is discussed in Section 3.1.4 in terms of the approaches to non-uniqueness, a concept to be explained on our way there.
3.1.1
The nature and boundaries of a language
Despite the opposition between Bloomfield and the Post-Bloomfieldians, it is appropriate to start a discussion of the Post-Bloomfieldian notion of a language with an overview of Bloomfield’s ideas in this respect. On the one hand, these ideas continued to influence linguistic research throughout the Post-Bloomfieldian period, on the other they constitute the background for the Post-Bloomfieldian views as far as they diverged from Bloomfield’s.
3.1.1.1 Bloomfield’s conception of language
In order to know what an individual language is, we have to delimit it from other languages and from other domains than language. Bloomfield (1926) gives a list of definitions and assumptions (‘postulates’) intended to provide a formal, scientific basis for the study of language. The definitions and assumption concerning the nature of individual languages are listed in (5). (5)
a. ‘An act of speech is an utterance.’
b. ‘Within certain communities successive utterances are alike or partly alike.’ [Bloomfield (1926: 154)]
c. ‘Any such community is a speech-community.’
d. ‘The totality of utterances that can be made in a speech-community is the language of that speech-community.’ [Bloomfield (1926: 155)]
It is interesting to note that although Bloomfield carefully marks the difference between ‘definitions’, including (5a, c-d), and ‘assumptions’, including (5b), the definition of speech-community in (5c) depends on the selection of certain communities in (5b). In order to see the implications of (5), it is useful to compare it with the idea of a language as a set of well-formed sentences (cf. (18) in Section 2.1.3). There are two important differences, related to sentence and to well-formed, respectively. First, in (5d) Bloomfield refers to utterance rather than sentence. This difference is significant. Sentence is a much more theoretical notion than utterance. According to Bloomfield, ‘a sentence is a construction which, in the given utterance, is not part of any larger construction’ (1926: 158). The boundaries of a sentence are then determined by the analysis of constructions. The boundaries of an utterance, by contrast, are readily observable. Harris makes this contrast more explicit in (6). (6)
a. ‘An UTTERANCE is any stretch of talk, by one person, before and after which there is silence on the part of the person.
b. The utterance is, in general, not identical with the ‘sentence’ (as that word is commonly used), since a great many utterances, in English for example, consist of single words, phrases, ‘incomplete sentences’, etc.’
[Harris (1951: 14)]
We have to be careful if we combine ideas published by different authors 25 years apart, but there is no reason not to see (6a) as a valid expansion of (5a). 5 If (6a) is taken as the definition of utterance, (6b) follows immediately from it. The second difference between Bloomfield’s notion of language in (5) and the idea of a language as a set of well-formed sentences concerns the notion of well-formedness. In (5d), Bloomfield uses ‘that can be made in a speech-community’ instead of well-formed. He addresses the problems this formulation entails in (7). (7)
a. ‘We are obliged to predict; hence the words ‘can be made’.
b. We say that under certain stimuli a Frenchman (or Zulu, etc.) will say so-and-so and other Frenchmen (or Zulus, etc.) will react appropriately to his speech.
c. Where good informants are available, or for the investigator’s own language, the prediction is easy; elsewhere it constitutes the greatest difficulty of descriptive linguistics.’
[Bloomfield (1926: 155)]
It is important to observe that in (7) no explicit recourse is taken to judgements. In order to avoid any role of judgements, (7a) interprets the expression ‘can be
made’ as a prediction of fact rather than taking the modal in a deontic sense.6 Moreover, (7b) takes externally observable behaviour as the criterion for determining whether the speech community accepts the utterance. Nevertheless, (7c) identifies a potential major problem when no ‘good informants’ are available. This is not readily understandable unless we take (5d), as elaborated in (7), not as a proper definition but as an approximation of an externally determined concept. On the basis of Goddard’s (1987) analysis of Bloomfield’s work on Menominee, Fought (1995: 302) argues that for Bloomfield this externally determined concept is the community norm as constructed by the linguist. Although Bloomfield states that the linguist ‘must not select or distort the facts according to his views of what the speakers ought to be saying’ (1933: 38), Goddard concludes that ‘the uniform and consistent community norm was at least in part a construction arrived at by linguistic analysis and was not directly observable’ (1987: 196). Therefore, ‘can be made’ in (7a) has a deontic value for Bloomfield after all. This discrepancy is, according to Fought, ‘nearly always resolved by linguists and indeed by other social scientists just as Bloomfield did’ (1995: 302). In (5a), an utterance is defined as a speech act. Bloomfield analyses the action of making an utterance in his stimulus-response model in Figure 3.1. Figure 3.1 is a direct rendering of Bloomfield’s (1933: 26) illustration, with only the braces and their labels added for convenience of reference. These labels correspond to the ones Bloomfield uses, e.g. (1926: 154) and (1933: 23).
[A] S ➳ r   [B] r . . . . . . . . s   [C] s ➳ R
Figure 3.1: Bloomfield’s stimulus-response model
The underlying idea of Figure 3.1 is that an action is to be analysed as a reaction to a stimulus. Language is a way to relay this process. In Figure 3.1, S and s are stimuli and R and r reactions. Bloomfield uses the capitals S and R to stand for ‘practical stimulus’ and ‘practical reaction’, the lower case s and r for ‘linguistic substitute stimulus’ and ‘linguistic substitute reaction’ (1933: 25). The three stages represented in Figure 3.1 are then [A] the speaker produces an utterance r, [B] this utterance reaches the hearer, and [C] the hearer processes the utterance. Bloomfield describes their relevance for the study of language in (8).
(8)
a. ‘The happenings which in our diagram are represented by a dotted line, are fairly well understood.
b. The speaker’s vocal chords, tongue, lips, and so on, interfere with the stream of his outgoing breath, in such a way as to produce sound-waves; these waves are propagated through the air and strike the hearer’s eardrums, which then vibrate in unison.
c. The happenings, however, which we have represented by arrows, are very obscure.
d. We do not understand the mechanism which makes people say certain things in certain situations, or the mechanism which makes them respond appropriately when these speech-sounds strike their ear-drums. […]
e. In the division of scientific labor, the linguist deals only with the speech-signal (r. . . . . . . .s).’
[Bloomfield (1933: 31f.)]
There is an opposition between [A] and [C] in Figure 3.1 on the one hand and [B] on the other. Whereas the processes of the latter are, according to (8a), ‘fairly well understood’, the former are, as (8c) states, ‘very obscure’. The processes involved in [B] are, as (8b) explains, the physical processes involved in the production, transmission, and perception of sounds. The processes in [A] and [C] are, as (8d) makes clear, the psychological processes underlying the selection of utterances and the reaction to their interpretation. In (8e) Bloomfield draws the conclusion that [B] is what the linguist should be concerned with.
3.1.1.2 Post-Bloomfieldian conceptions of language
Post-Bloomfieldians generally considered their work as marking progress along the lines set out by Bloomfield. It is not surprising, then, that (5) remained highly influential. Nevertheless Bloch and Trager (1942) take a different approach to defining the concept of a language. It is summarised in (9). (9)
a. ‘A LANGUAGE is a system of arbitrary vocal symbols by means of which a social group cooperates.’
b. ‘A language is a SYSTEM.’ [Bloch and Trager (1942: 5)]
c. ‘A language is a system of SYMBOLS.’
d. ‘The symbols which constitute a language are VOCAL symbols.’ [Bloch and Trager (1942: 6)]
e. ‘Finally, the linguistic symbols are ARBITRARY.’ [Bloch and Trager (1942: 7)]
The initial definition in (9a) is analysed into the four components (9b-e). In the text, each of the statements in (9) is followed by an explanation. The residue of (9a) after taking away (9b-e), the functional aspect of language in society, is discussed before (9b) is stated. The central condition in (9) is (9b). It marks the contrast between the approach by Bloch and Trager and the one by Bloomfield in (5). Whereas (5) approaches a language from an extensional perspective, starting from the set of utterances, (9) takes an intensional point of view, stating properties of the system underlying these utterances. The two definitions can be taken as complementary, but they mark different perspectives. The three properties attributed to the system of a language in (9c-e) have two functions. On the one hand they are used to identify the type of system referred to in (9b). This is the case for (9d), which excludes, for instance, signal flags, traffic lights, and drum beats. 7 On the other hand, they give norms for the appropriate study of language. This is the case for (9c) and (9e). In (9c) Bloch and Trager establish the status of linguistic expressions as ‘not somehow mystically identical with the objects and events they symbolize’ (1942: 6). Their intention in (9e) is to exclude the approach in which linguists ‘elevate the grammar of a particular language – usually Latin – to the rank of abstract reason, and to regard all deviations from this pattern in other languages as illogical corruptions’ (1942: 7). The conditions in (9c) and (9e) thus indicate the contrast between Post-Bloomfieldian linguistics and traditional grammar. As Newmeyer (1986b) discusses in detail, this contrast was a continuous preoccupation of the Post-Bloomfieldians. It is interesting to consider to what extent the difference in orientation between the definitions in (5) and in (9) is to be interpreted as a difference between Bloomfield and the Post-Bloomfieldians. 
Bloch (1948) formulates an updated set of postulates intended to reflect the advance in linguistics since the publication of Bloomfield (1926). Reflecting this progress, his list is both longer and more restricted in scope. Much of (5) is taken over in Bloch’s (1948) postulates, but (5d) is replaced by (10). (10) ‘The totality of the conventional auditory signs by which the members of a speech-community interact is the language of the community.’ [Bloch (1948: 7)]
In (10), language is not defined in terms of utterances but in terms of ‘conventional auditory signs’. Here ‘conventional’ corresponds to ‘arbitrary’ in (9), ‘auditory’ to ‘vocal’, and ‘signs’ to ‘symbols’. Thus, (10) takes ‘totality’ and ‘speech-community’ from (5d), but the rest of the definition matches (9). The
difference between a speech community and what (9a) calls a ‘social group’ is probably not significant. The difference between totality and system is that the former is agnostic as to the internal structure of the set of items, whereas the latter assumes there is one. A further noteworthy property of (10) is that it leaves out the expression of probability in (5d). Instead of utterances that ‘can be made’, (10) refers to a system by which the community ‘interacts’. The link between the utterances and the language has become indirect. It is mediated by the notion of habit. This is a notion used by Bloomfield as well. Thus he states that ‘language is a matter of training and habit’ (1933: 34). This is of course not a definition, but a property attributed to language. The notion gets a more central role in Hockett’s (11). (11)
a. ‘A language is a complex system of habits.’ [Hockett (1958: 137)] b. ‘An act of speech, or utterance, is not a habit but a historical event, though it partly conforms to, reflects, and is controlled by the habits.’ [Hockett (1958: 141)]
Hockett (1958) is more a textbook than a formal characterisation of the field like Bloch (1948). Since (11a) is not presented explicitly as a definition, we should be careful in using it as such. Nevertheless, we cannot avoid observing that the relationship (11b) establishes between habits and utterances implies that (11a) is in contradiction to (5d). Instead of a ‘totality of utterances’, as in Bloomfield’s (5d), for Hockett a language is a system of the habits underlying these utterances. The definition of language in terms of a speech community does raise the question of how to accommodate individual variation. According to Fought (1995: 297), there was a largely tacit agreement among American Structuralists, dating from the early twentieth century, to ignore individual linguistic variation in studies concerned with linguistic theory and description. Nevertheless, Bloomfield (1933) states the problem quite clearly in (12). (12)
‘The difficulty or impossibility of determining in each case exactly what people belong to the same speech-community, is not accidental, but arises from the very nature of speech-communities. If we observed closely enough, we should find that no two persons – or rather, perhaps, no one person at different times – spoke exactly alike.’ [Bloomfield (1933: 45)]
As noted in the discussion of (7) above, Bloomfield had a tendency to solve the problem in (12) by imposing a community norm on his data. The Post-Bloomfieldians were more precise in this respect. Bloch (1948: 7) introduces the term idiolect and includes a definition of this term in his postulates. In order to cover the variation indicated in (12), an idiolect is taken to be specific
not only to one person, but also to a specific time. In practical research, Post-Bloomfieldians did not treat the situation of (12) as problematic. Harris (1951), for instance, describes it in (13). (13)
a.
‘Even though any dialect or language may vary slightly with time or with replacement of informants, it is in principle held constant throughout the investigation […] b. In most cases this presents no problem, since the whole speech of the person or community shows dialectal consistency; we can define the dialect simply as the speech of the community in question.’ [Harris (1951: 9)]
Central in (13) is the dialect or language as a subject of investigation. The investigation referred to in (13a) concerns the system it reflects rather than the boundaries. The only important point, as mentioned in (13b), is the consistency of the data gathered about this system. The approach to the problem of individual variation is reminiscent of the idealisation of a ‘completely homogeneous speech community’ in (46) of Section 2.3.2. In Post-Bloomfieldian linguistics, however, it is not explicitly expressed as such. 8
3.1.2
The nature of the data
It is generally agreed that the data of Post-Bloomfieldian linguistics are a corpus of utterances. It is important, however, both to see that such a statement needs interpretation and to understand the way it is interpreted in Post-Bloomfieldian linguistics. Therefore it is useful to start the discussion of the data in Post-Bloomfieldian linguistics by opposing Botha’s Chomskyan view in (14) to Harris’s (15). (14)
‘the object of study of a taxonomic grammar is a finite collection or corpus of concrete utterances of a particular language.’ [Botha (1981: 425)]
(15)
‘Investigation in descriptive linguistics consists of recording utterances in a single dialect and analyzing the recorded material. The stock of recorded utterances constitutes the corpus of data.’ [Harris (1951: 12)]
Post-Bloomfieldian linguistics is designated by ‘taxonomic grammar’ in (14) and by ‘descriptive linguistics’ in (15). A first striking difference is the status these two statements assign to the corpus. In (14) the corpus is alleged to be ‘the object of study’, whereas in (15) the corpus is said to be ‘the stock of recorded utterances’. The difference between these formulations lies in the intentionality. In (15), i.e. in the way Post-Bloomfieldians saw their own work, the corpus is nothing more than a side effect of ‘recording utterances’, one of the two steps
in an investigation of a dialect (or language). In (14), it is presented as ‘the object of study’ itself. A more explicit formulation of the Post-Bloomfieldian position in this respect is Hockett’s (16). (16)
a. ‘STEP 1. The utterances of a language are examined.*’ b. ‘Obviously not all of them, but a sampling which we hope will be statistically valid. By working with successively larger samplings, and by predicting on the basis of each what else will occur, we approach, at least asymptotically, a complete description.’ [Hockett (1947: 322)]
In (16a), the start of a step-by-step description of linguistic research is described. The asterisk marks a footnote, given in (16b). According to (16a), the object of study is in principle not the corpus but the utterances of a language. The restriction to a corpus is only imposed by practical constraints, as suggested in (16b). The reference to ‘statistically valid’ as a desirable property of the corpus indicates that the corpus should be representative. This point is emphasised by Harris in (17). (17)
‘the analysis of a particular corpus becomes of interest only if it is virtually identical with the analysis which would be obtained in like manner from any other sufficiently large corpus of material taken in the same dialect.’ [Harris (1951: 13)]
It is thus not the finiteness of the corpus, as stated in (14), that is perceived as a central property by the Post-Bloomfieldians, but its representativity of the language. Any particular, finite corpus is of no special interest without the latter property. The question is then whether or to what extent the selection of data in Post-Bloomfieldian linguistics is restricted in an artificial way, i.e. by constraints that do not follow from the analysis of the object of study. Here, we should consider Bloomfield’s characterisation in (18). (18)
‘In principle, the student of language is concerned only with the actual speech (B); the study of speakers’ situations and hearers’ responses (A and C) is equivalent to the sum of total human knowledge.’ [Bloomfield (1933: 74)]
The labels A, B, and C in (18) refer to what is represented in Figure 3.1. The limitation to actual speech is necessary because we do not know enough to explain how human beings arrive at a linguistic response. This includes the problem Chomsky (1988) refers to as the ‘production problem’ and of which he suggests in (59) of Section 2.4.1 that it might be a mystery in the technical sense, i.e. in principle beyond the reach of human understanding. There is good reason, then, to accept (18) and its consequence of restricting linguistic
research to (B) in Figure 3.1. 9 Given this restriction, Harris expresses a very liberal attitude to data collection in (19). (19)
‘These procedures are not a plan for obtaining data or for field work. In using them, it does not matter if the linguist obtains the data by taking texts, questioning an informant, or recording a conversation.’ [Harris (1951: 1)]
In the passage from which (19) is taken, Harris explains the scope of his book. This scope does not include Hockett’s ‘Step 1’ of (16a), simply because anything is allowed. In particular, what is allowed is extending the corpus in the course of investigation, as (20) states. (20)
‘The corpus does not, of course, have to be closed before analysis begins.’ [Harris (1951: 12)]
In (20), ‘of course’ indicates especially clearly that considering the finiteness as an essential property of the corpus, as (14) does, is a serious misinterpretation. It is only a trivial consequence of practical constraints, in the same way as the finiteness of any set of data. The significance of this non-constraint is explained in (21). (21)
a.
‘In much linguistic work we require for comparison various utterances which occur so infrequently that searching for them in an arbitrary corpus is prohibitively laborious. b. To get around this, we can use various techniques of eliciting, i.e., techniques which favor the appearance of utterances relevant to the feature we are investigating c. (without influencing the speaker in any manner that might bring out utterances which would not have sometimes occurred naturally).’ [Harris (1954: 48)]
In (21a) the motivation is given for why we should not restrict our corpus to a closed set. Elicitation as introduced in (21b) is the purpose-driven extension of the corpus. The restriction in (21c) means that elicitation should not distort the data in the corpus otherwise than quantitatively. It should only change the frequency of utterances, not their nature. Harris elaborates on the techniques of (21b) in (22). (22)
a.
‘Rather than constructing a form cx and asking the informant ‘Do you say cx?’ or the like, b. the linguist can in most cases ask questions which should lead the informant to use cx if the form occurs in the informant’s speech.’ [Harris (1951: 12, fn. 12)]
The expression cx in (22) stands for an expression for which the linguist has reason to assume that it exists, although it does not occur in the corpus. Harris gives the example of ax and bx found in the corpus, where a, b, and c are otherwise very similar. An instantiation of such a situation can be found in French defective verbs, which do not have all the expected forms. The verb frire (‘fry’), for instance, does not occur in the plural. There is no obvious semantic reason, as in the case of pleuvoir (‘rain’) why a form such as frions would not be the first person plural. The technique that (22a) excludes is very close to grammaticality judgements as discussed in Section 2.2.1, although no distinction is made between grammaticality and acceptability. The reason we should not ask ‘do you say frions’ is quite clearly that this would suggest a particular form, in violation of the constraint in (21c). What (22b) proposes instead is to ask a question such that the most natural response would involve this form. If the informant avoids the form, for instance by using a different verb or a construction with frire in the infinitive instead of inflected for person and number, the linguist concludes that frire is indeed a defective verb without first person plural forms. In sum, the data of Post-Bloomfieldian linguistics are a corpus of utterances. What is important is that the utterances are natural. Otherwise there are no constraints on the corpus. Elicitation can enlarge the corpus at any point in the research in order to find occurrences of rare phenomena. The corpus has the same instrumental role in linguistics as a telescope in astronomy.
3.1.3
The status of a grammar
The position of a grammar in Post-Bloomfieldian linguistics is described by Hockett in (23). (23)
a.
‘A grammatical description must be a guidebook for the analysis of material in the language – b. both material examined by the analyst before the description was formulated, and material observed after that.’ [Hockett (1954: 232)]
The intended meaning of (23a) is that a linguist can use the grammar to analyse any material in the language. As stressed again in (23b), the scope of the grammar is not restricted to a finite corpus. The relationship between the corpus and the grammar is expressed by Harris in (24). (24)
a.
‘For the linguist, analyzing a limited corpus consisting of just so many bits of talking which he has heard, the element X is thus associated with an extensionally defined class consisting of so many features in so many of the speech occurrences in his corpus.
b. However, when the linguist offers his results as a system representing the language as a whole, he is predicting that the elements set up for his corpus will satisfy all other bits of talking in that language. c. The element X then becomes associated with an intensionally defined class consisting of such features of any utterance as differ from other features, or relate to other features, in such and such a way.’ [Harris (1951: 17)]
Before a grammar is produced, each element is only extensionally defined, as (24a) states. This means that the only characterisation available is in terms of a list of occurrences of the element in the corpus. By producing a grammar, the linguist changes this. As stated in (24c), the characterisation in the grammar is intensional, i.e. in terms of properties. In this way, the characterisation is generalised from the particular corpus to the entire language, as stated in (24b). On this basis, we can present the grammar as a theory of the language, as in Figure 3.2.
[Figure 3.2: The model of Post-Bloomfieldian linguistics. Diagram relating Language, Grammar, Utterances, and Corpus through the relations describes, derives, and covers.]
The corpus in Figure 3.2 is a description of a selection of utterances of a language. It is used to derive a grammar. The grammar covers the corpus, but is intended as a description of the language the utterances are taken from. Language in Figure 3.2 is taken in the countable sense, as an individual language, not in the generic sense of human language. The derivation procedure relating the corpus to the grammar will be discussed in more detail in Section 3.1.4. A first approach to the meaning of covers is given in (23) and (24) above, but the exact meaning raises a number of issues that were debated within Post-Bloomfieldian linguistics. Two of them will be presented here.
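The contrast between the extensional characterisation in (24a) and the intensional one in (24c) can be illustrated with a minimal sketch. The toy corpus, the function names, and the property chosen for the intensional class (occurring immediately after a determiner) are all assumptions made for the illustration; they are not taken from Harris.

```python
# Invented toy corpus: each utterance is a list of word tokens.
corpus = [
    ["the", "dog", "barks"],
    ["the", "cat", "sleeps"],
    ["a", "dog", "sleeps"],
]

def extensional_class(utterances, members):
    """(24a): the element is only a list of its occurrences, given as
    (utterance index, position) pairs in the utterances examined."""
    return [(i, j)
            for i, utt in enumerate(utterances)
            for j, tok in enumerate(utt)
            if tok in members]

# Hypothetical defining property for the intensional class.
DETERMINERS = {"the", "a"}

def intensional_class(utterance):
    """(24c): the element is defined by a property, here 'occurs
    immediately after a determiner', applicable to any utterance."""
    return [j for j in range(1, len(utterance))
            if utterance[j - 1] in DETERMINERS]

# Occurrences recorded in the corpus.
noun_occurrences = extensional_class(corpus, {"dog", "cat"})

# An utterance that is not in the corpus.
new_utterance = ["a", "bird", "sings"]
listed = extensional_class([new_utterance], {"dog", "cat"})
predicted = intensional_class(new_utterance)
```

In this sketch the list of recorded members says nothing about the new utterance, whereas the property-based definition picks out a position in it. This is the sense in which, according to (24b), the linguist predicts that the elements set up for the corpus will hold for other material in the language.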
3.1.3.1 Classification and prediction
It is often thought that a grammar in Post-Bloomfieldian linguistics is simply a classification of the data. It is this idea that underlies the qualification of Post-Bloomfieldian linguistics as taxonomic by Chomskyan linguists, e.g. in (14). The first sentence of Hockett’s (1942) treatment of the methodology of phonetic analysis, given in (25), is often quoted to support this view. (25)
‘Linguistics is a classificatory science.’ [Hockett (1942: 3)]
Unambiguous though (25) may seem, there is still room for interpretation. The simplest type of classification is in terms of a catalogue of classes with their members. This idea is rejected by Harris in (26). (26)
‘The operations are not intended to classify elements merely for cataloguing convenience.’ [Harris (1951: 372, fn. 16)]
While (26) leaves open the possibility of a grammar as a classification of elements, the result is not a mere catalogue. The question is then, of course, what additional features there are. Harris answers this in (27). (27)
‘The work of analysis leads right up to the statements which enable anyone to synthesize or predict utterances in the language.’ [Harris (1951: 372)]
In (27), ‘the statements’ make up the grammar. These statements give the elements and their distribution. Elements are linguistic expressions of various sizes (morpheme, word, phrase). The distribution of an element is, as Harris states, ‘the total of all environments in which it occurs’ (1951: 16). The statements contained in the grammar can be used to synthesise an utterance from the elements described in the grammar. If, as (27) expresses, ‘anyone’, i.e. not only a speaker of the language described by the grammar, is able to do so, the relevant information must all be included in the grammar. 10 The use of ‘predict’ in (27) should not be taken in the probabilistic sense, i.e. that a particular utterance will be made, but in a way parallel to (7). Hockett clarifies this point in (28). (28)
a.
‘The description must also be prescriptive, not of course in the Fidditch sense, but in the sense that b. by following the statements one must be able to generate any number of utterances in the language, above and beyond those observed in advance by the analyst – c. new utterances most, if not all, of which will pass the test of casual acceptance by a native speaker.’ [Hockett (1954: 232)]
The intended sense of ‘prescriptive’ in (28a) corresponds to the type of explicitness indicated in (27). 11 To the modern reader, the verb ‘generate’ in (28b) is
of course familiar from Chomskyan linguistics. Finally, (28c) sets up a simple experiment with a native speaker as the authority for approving the utterances generated by the grammar. The test suggested by ‘casual acceptance’ is that a native speaker will react to the utterance in terms of its content, rather than objecting to the form. By setting up a test of this kind, we can create an empirical cycle for the theory and the data in Figure 3.2. Given the importance of prediction and tests, it may seem surprising to find Harris’s statement in (29). (29)
‘the linguistic elements do not describe speech or enable one to reproduce it. But they make it possible to organize a great many statements about speech, which can be made in terms of linguistic elements.’ [Harris (1951: 19, fn. 21)]
At first sight, (29) seems to be in contradiction to (27) and (28). However, there are two considerations that have to be made here. First, a grammar does not cover all aspects of speech. It is not a blue-print for a talking machine. Second, (29) only mentions the linguistic elements. In order to make predictions of the type referred to in (27) and (28), we also need the statements about their distribution. Therefore, (29) is not strictly in contradiction to the idea of grammars predicting utterances in the sense of testing the grammar.
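The interplay of elements, distribution, and prediction discussed in this section can be made concrete in a small sketch. It takes Harris's characterisation of distribution as ‘the total of all environments in which it occurs’ and derives new utterances from it, in the spirit of (27) and (28b). The toy corpus, the reduction of an environment to the immediately neighbouring elements, and the substitution rule are simplifying assumptions for the illustration, not a reconstruction of Harris's or Hockett's actual procedures.

```python
from collections import defaultdict

# Invented toy corpus: each utterance is a tuple of elements.
corpus = [
    ("the", "dog", "barks"),
    ("the", "cat", "sleeps"),
    ("a", "dog", "sleeps"),
]

def distribution(utterances):
    """Map each element to the set of environments in which it occurs.
    An environment is simplified to (left neighbour, right neighbour),
    with '#' marking an utterance boundary."""
    envs = defaultdict(set)
    for utt in utterances:
        padded = ("#",) + utt + ("#",)
        for i in range(1, len(padded) - 1):
            envs[padded[i]].add((padded[i - 1], padded[i + 1]))
    return envs

def generate(utterances):
    """Synthesise utterances not in the corpus by substituting one element
    for another whenever the two share at least one environment."""
    envs = distribution(utterances)
    generated = set()
    for utt in utterances:
        for i, tok in enumerate(utt):
            for other, other_envs in envs.items():
                if other != tok and envs[tok] & other_envs:
                    generated.add(utt[:i] + (other,) + utt[i + 1:])
    return generated - set(utterances)

# Candidate utterances beyond those observed in the corpus.
new_utterances = generate(corpus)
```

The elements alone, as (29) observes, do not yield any new utterance; it is the statements about their distribution that do. Each member of `new_utterances` would then be submitted to the test of ‘casual acceptance’ by a native speaker described in (28c).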
3.1.3.2 Reality of structure
Another issue discussed among Post-Bloomfieldians concerns the reality of the structure described in the grammar. Householder (1952) raises this issue in his review of Harris (1951), making the distinction between two positions as in (30). (30)
a.
‘On the metaphysics of linguistics there are two extreme positions, which may be termed (and have been) the ‘God’s truth’ position and the ‘hocus-pocus’ position. b. The theory of the ‘God’s truth’ linguists (and I regret to say I am one) is that language has a structure, and the job of the linguist is (a) to find out what that structure is, and (b) to describe it as clearly, economically, and elegantly as he can, without at any point obscuring the God’s truth structure of the language. c. The hocus pocus linguist believes (or professes to believe – words and behavior are not always in harmony) that a language (better, a corpus, since we describe only the corpus we know) is a mass of incoherent, formless data, and the job of the linguist is somehow to arrange and organize this mass, imposing on it some sort of structure.’ [Householder (1952: 260)]
A first observation about (30) concerns the nature of the opposition. In (30a), the two named positions as to the reality of structure are presented as ‘extreme positions’, which implies that there are intermediate positions. The formulation of (30b-c) does not elaborate on this possibility. On the contrary, the fact that Householder identifies with one of the extremes in (30b) suggests that he sees the distinction as a binary opposition. It is a common feature of human perception that one’s own opinion is considered moderate rather than extreme. Therefore one would expect that if Householder sees (30b) and (30c) as the extremes of a cline, he would describe his own position as somewhere in between the two. The distinction between the two positions is that in the God’s truth view of language, (30b), the structure is discovered by the linguist, whereas in the hocus-pocus view, (30c), it is imposed by the linguist. Householder’s sympathy for the former view is expressed not only explicitly in (30b), but also implicitly in the way language and corpus are dealt with in (30c). Householder highlights the role of the corpus in restricting the scope of theoretical statements to the corpus analysed in the actual investigation in (30c), but in Section 3.1.2 we saw that this restriction is only a trivially practical one, not one of a theoretical nature. Whenever necessary, the data base can be extended. Therefore, (30c) makes the hocus-pocus position less attractive than it could be. In the rest of his review, he is actually rather critical of the fact that ‘many, many parts of the book seem to me pure hocus-pocus’ (1952: 261). 12 Despite this bias in the description, we can use (30) to clarify the relationship between the structure described by the linguist and the model of Figure 3.2. If the linguist discovers the structure of the language, the structure is part of the real world, located in the language. 
If it is imposed by the linguist, the structure is only part of the theoretical account, located in the grammar. In the former case, we can evaluate the grammar in terms of how well the structure it describes corresponds to the structure of the language. In the latter case, such a criterion is not available. We can evaluate the structure of the grammar only in terms of external criteria, for instance, how elegant the structure is or how useful. If the opposition is described in these terms, Harris’s (31) gives some support for Householder’s allegation that Harris holds a hocus-pocus position as in (30c). (31)
a.
‘It therefore does not matter for basic descriptive method whether the system for a particular language is so devised as to have the least number of elements (e.g. phonemes), or the least number of statements about them, or the greatest over-all compactness, etc. b. These different formulations differ not linguistically but logically.
c.
They differ not in validity but in their usefulness for one purpose or another (e.g. for teaching the language, for describing its structure, for comparing it with genetically related languages).’ [Harris (1951: 9, fn. 8)]
There are different grammars compatible with the data collected for a language. In (31a), Harris considers different criteria of elegance or simplicity and plays down the importance of choosing between them. He relegates the choice to external factors, logic in (31b) and the application of the grammar in (31c). This suggests that he does not believe one of them is privileged as corresponding to reality. Let us now consider (32), which seems to be Harris’s reply to Householder, although it includes no direct mention of Householder’s terms, let alone a reference to the review. (32)
a.
‘Some question has been raised as to the reality of this structure. Does it really exist, or is it just a mathematical creation of the investigator’s? […] b. there are two quite different questions here. c. One: Does the structure really exist in the language? The answer is yes, as much as any scientific structure really obtains in the data which it describes – the scientific structure states a network of relations, and these relations really hold in the data investigated.* d. Two: Does the structure really exist in the speakers? Here we are faced with a question of fact which is not directly or fully investigated in the process of determining the distributional structure. Clearly, certain behaviors of the speakers indicate perception along the lines of the distributional structure.’ [Harris (1954: 36), footnote at * deleted]
The question in (32a) evokes the same two possibilities as described in (30b-c). Harris addresses this question by analysing it into two senses of exist, one with respect to the language and one with respect to the speaker. Harris’s approach to the first question in (32c) is reminiscent of Chomsky’s approach to the question of the psychological reality of grammars as discussed in Section 2.3.1. If the grammar is correct, the structure it describes must be there. This conclusion immediately highlights the difference between Harris’s formulation of the question in (32a) and Householder’s in (30). In a sense, (32c) sidetracks the distinction between God’s truth and hocus-pocus by implying on the one hand that if the grammar describes structure, there is no ‘mass of incoherent, formless data’ to start with but on the other hand that this is nothing special. If we transfer Harris’s reasoning to the history of astronomy as discussed in Section 1.3.2, Copernicus’s complex system of circles to describe the orbits of planets, taken over from antiquity, and Kepler’s elliptical orbits, still assumed today, are equally real. In terms of (32c), both systems describe the relationships between objects in the sky. The question in (32d) is not directly triggered by (30), which only discusses the existence of structure in the language. In (33), Hockett (1948) links these two perspectives. (33)
a.
‘The structure of a language may be regarded as the end-product of a game […] b. every speaker of a language plays just such a game, the end-product being a state of affairs in his nervous system […] c. The linguistic scientist must regard the structure of the language as consisting precisely of this state of affairs. […] d. For the scientist, then, ‘linguistic structure’ refers to something existing quite independently of the activities of the analyst.’ [Hockett (1948: 270f.)]
Hockett (1948) discusses two positions comparable to the ones in (30b) and (30c), calling them the scientific position and the game perspective, respectively. Thus, (33a) refers to the hocus-pocus position. In (33b) Hockett draws a parallel between the child and the (hocus-pocus) linguist, to be compared with the one Chomsky states in (50) of Section 2.3.3. According to Hockett, they play the same game, but with a different result, as stated in (34). (34)
‘The child in time comes to BEHAVE the language; the linguist must come to STATE it.’ [Hockett (1948: 270)]
If the speaker’s behaviour referred to in (34) is the result of ‘a state of affairs in his nervous system’ as claimed in (33b), it is reasonable to expect that, as (33c) states, this state of affairs provides the ultimate criterion for the evaluation of the grammar produced by a linguist. This leads to the God’s truth view in (33d). In (32d), Harris does not deny the claim in (33), but treats it as an unsolved empirical problem. The conclusion remains that (30) raises a perfectly legitimate question about the reality of structure, although perhaps formulated in a slightly tendentious way. Harris does not provide an answer in terms of this formulation but in (32) chooses to treat the issue of the reality of structure in entirely different terms, hardly compatible with (30).
3.1.4
Approaches to non-uniqueness
A well-known problem among Post-Bloomfieldians was the issue of the ‘non-uniqueness of grammars’. It was first described by Chao (1934) for phonological analysis. Harris explains it more generally in (35). (35)
‘It is possible for different linguists, working on the same material, to set up different phonemic and morphemic elements, to break phonemes into simultaneous components or not to do so, to equate two sequences of morphemes as being mutually substitutable or not to do so.’ [Harris (1951: 2)]
In terms of Figure 3.2, (35) states that different grammars can be proposed for a particular language on the basis of the same corpus of data. As explained in Section 1.2.1, this is an instantiation of a general problem of science. For any set of data, there are indefinitely many different compatible theories. There are several approaches to such a situation. Three possibilities are listed in (36). (36)
a.
Ignore the problem and proceed with research as usual. Perhaps new findings will provide a clue. b. Take a criterion not in the set of data for selecting one among the possible grammars. c. Provide arguments to the effect that the multiplicity of grammars is not a serious problem.
The approach in (36a) is typical of the attitude towards what Kuhn calls anomalies in normal science (cf. Section 1.3). It was probably the most common approach in Post-Bloomfieldian linguistics. For obvious reasons, however, it does not lead to publications discussing the problem. The solution in (36b) is adopted, for instance, in Chomskyan linguistics. In Section 2.3.3, the problem corresponding to (35) was called the indeterminacy of grammar. In Chomskyan linguistics the criterion adopted to solve this indeterminacy is learnability. In Post-Bloomfieldian linguistics, both (36b) and (36c) can be found. The latter is stated by Harris in (37). (37)
a.
‘It may turn out that several systems of statements are equally adequate, for example, several phonemic solutions for a particular language (or only, say, for the long vowels in a language). It may also be that different systems are simpler under different conditions. […]
b. In any case, there is no harm in all this non-uniqueness,* since each system can be mapped onto the others so long as any special conditions are explicit and measurable.’ [Harris (1954: 35), * footnote with reference to Chao (1934) deleted]
In the text between (37a) and (37b), Harris gives a number of examples of situations that may apply. They are comparable to what is stated more concisely in (31). 13 The conclusion in (37b) is an instance of the type of reasoning in (36c). The systems are said to be equivalent because they are all based on the same data. This solution to the problem of non-uniqueness interacts with the position taken on the issue of the reality of structure. It is compatible with the view that only low-level relations between elements are real and any higher-order organisation of these relations is a construct by the linguist. A somewhat different approach is found in Hockett’s work. He proposes a number of methodological constraints on grammar, which he calls the frame of reference. The two papers published in 1942 and in 1954 give different elaborations of this concept, but as (38) and (39) show, its function remains the same. (38)
‘neither the process of analysis, nor the presentation of the results of analysis, need resemble the general picture of phonology given in this paper. The system given here is a FRAME OF REFERENCE.’ [Hockett (1942: 21)]
(39)
‘By a “model of grammatical description” is meant a frame of reference within which an analyst approaches the grammatical phase of a language and states the results of his investigations.’ [Hockett (1954: 210)]
In one sense the two papers from which (38) and (39) are taken are complementary. Hockett distinguishes ‘two basic levels, PHONOLOGICAL and GRAMMATICAL’ (1942: 3) and as the formulations show, (38) is from a proposal for a phonological, (39) from a proposal for a grammatical methodology. An interesting difference is that (38) occurs in the very last paragraph, whereas (39) is the start of the paper. This suggests that between 1942 and 1954 the term frame of reference had been established. In each paper Hockett proposes a list of criteria. The older list consists of ‘six requirements which seem essential for a correct system’ (1942: 20), given in (40). (40)
a. ‘Range and criteria must be accurately and unambiguously defined. b. There must be no mentalism. c. The terminology must involve no logical contradictions; terms defined as variables, class names, and quality names must be consistently used in those values.
d. No material should be excluded which might prove to be of grammatical importance, and none should be included that cannot be of grammatical importance. e. There must be no circularity; phonological analysis is assumed for grammatical analysis, and so must not assume any part of the latter. The line of demarcation between the two must be sharp. f. The way should be left open for the introduction of any criteria whatsoever on the grammatical level, barring mentalism.’ [Hockett (1942: 20–21), originally numbered in running text]
The items in (40) are something of a mixed bag. On the one hand we find a number of constraints of a very general nature. Thus, (40c) applies to any technical or scientific undertaking. In the same way, (40d) and (40f) are essentially applications to the field of linguistics of such general points. The other three criteria are more specific and may be thought of as part of the research programme. Hockett explains the terms used in (40a) in (41). (41)
a.
‘Selection and preliminary ordering of data determine the RANGE of analysis; b. the choice of criteria fixes the LEVEL of analysis.’ [Hockett (1942: 3)]
Given (41a), (40a) obliges us to specify the boundaries of the language taken as the subject of investigation. Taking (41b) together with (40a), we have to state clearly whether we are doing phonology or grammar. This constraint interacts with the one in (40e), which prohibits ‘mixing levels’. Harris seems to be more liberal in handling this constraint in (42). (42)
a.
‘it is possible to determine the morphemes of a language without any previous determination of the phonemes.*’ b. *‘This is not done for a whole language, because of the complexity of the work.’ c. ‘Just as we can go from morphemes to phonemes, so can we go, but far more easily, from phonemes to morphemes.’ [Harris (1951: 23), footnote at * given as b]
In (42a) Harris considers the possibility of doing morphology before phonology. 14 After a paragraph on how this could be done, (42c) considers the more usual order as an alternative. The option in (42a), while contrary to the letter of (40e), may be thought of as a simple reversal, not violating the constraint against ‘mixing levels’. The text of the footnote given in (42b) suggests, however, that (42a) should be taken to allow that for some parts of the language, morphological analysis precedes phonological analysis, whereas for the rest the order is reversed. As it is not possible to exclude interaction between the
parts analysed in different orders, (42) implies that the prohibition of mixing levels is not strictly observed. The fact that there was discussion about this constraint is also illustrated by (43), taken from a review. (43)
‘Robins’ analysis thus involves a phonetic feature which is predictable only from morphological data – a procedure defended by Pike,* but condemned by many American linguists as involving an unwarrantable mixture of levels.’ [Bright (1959: 101), footnote at * deleted]
The book under review in Bright (1959) is a grammar of an American Indian language by a British linguist. This may explain the reference to ‘many American linguists’ in (43). While undoubtedly belonging to American linguistics, the work of Kenneth Pike is not part of the Post-Bloomfieldian mainstream. 15 Therefore, (43) supports the analysis of a broad acceptance of Hockett’s (40e), in spite of Harris’s consideration of alternatives. The orthodoxy is also stated by Trager and Smith in (44). (44)
‘microlinguistic analysis can and must deal with statements about the distributions of the elements rigidly observed on ascending levels of complexity of organization.’ [Trager and Smith (1951: 81)]
In (44), ‘microlinguistic analysis’ refers to phonology and morphology as opposed to communicative aspects of discourse. Trager and Smith’s ‘assumption’ in (44) does not allow for the possibility Harris mentions in (42a) because the ‘ascending levels’ means that phonology has to be completed before morphology. There is no comparable discussion among Post-Bloomfieldians about (40b), which precludes reference to the speaker’s mind in an account of language. It is a direct consequence of Bloomfield’s interpretation of Figure 3.1, as given in (8) and discussed as such in Section 3.1.1.1. Even when Harris (1954) mentions the question of the realisation of linguistic structure ‘in the speakers’ in (32d), there is no implication that the way the structure is implemented may influence the shape of the theory. Compared to (40), which reads a bit as a list of whatever came to the author’s mind in no particular order, Hockett (1954) gives the much more systematic list in (45). (45)
a.
‘A model must be GENERAL: it must be applicable to any language, not just to languages of certain types. b. A model must be SPECIFIC: when applied to a given language, the results must be determined wholly by the nature of the model and the nature of the language, not at all by the whim of the analyst. […]
c.
A model must be INCLUSIVE: when applied to a given language, the results must cover all the observed data and, by implication, at least a very high percentage of all the not-yet-observed data. […] d. A model must be PRODUCTIVE: when applied to a given language, the results must make possible the creation of an indefinite number of valid new utterances. This is the analog of the ‘prescriptive’ criterion for descriptions. e. A model must be EFFICIENT: its application to any given language should achieve the necessary results with a minimum of machinery.’ [Hockett (1954: 232f.), originally a list numbered (1) to (5)]
This list of five properties can be divided into two parts. The conditions in (45a-b) are conditions imposed directly on the model itself. Generality, as described in (45a) can be seen above all as a clause against any claim by a linguist that for their language the method is not applicable. Alternatively, it can be interpreted as a condition on the model, that for each language it should provide an analysis. In the latter case it is the natural counterpart to (45b), which excludes non-uniqueness. Hockett therefore takes an explicit stand of type (36b), as opposed to Harris’s (37). The remaining three conditions (45c-e) are conditions on the grammars produced by the model. (45c) and (45d) are absolute conditions that can be tested for any particular grammar. In (45c) we have to evaluate the grammar with respect to a corpus of data. (45d) resumes the prescriptiveness in the sense discussed for (28) above. It can be tested by a look at the grammar. In contrast to these conditions, (45e) gives an evaluation measure for the comparison of two grammars. Although some of the properties assigned to a model in (45) can be part of a research programme, it is not possible to identify the model with the research programme. A research programme gives heuristics for developing a theory, but not a procedure independent of the ‘whim of the analyst’, as (45b) requires. It would be reasonable to stipulate (45a) in a research programme, but finding a theory remains the responsibility of the linguist. Whereas (45c-e) could be part of a research programme, they would not be properties of the grammar produced by the model, but evaluation criteria for grammars, either absolute, as in (45c-d), or relative, as in (45e). If (45) is not considered part of the research programme, an alternative assumption would be that it is part of a higher level theory. The problem with such an assumption is that it remains unclear what this theory is supposed to be a theory of. 
Another point which remains unclear about (45) is to what extent the conditions it specifies can in principle be fulfilled. Hockett evaluates the state of Post-Bloomfieldian linguistics with respect to (45) as in (46).
(46)
a.
‘If we were confronted with two models, one of which fulfilled all the above requirements while the other did not, choice would be easy. b. If we were confronted with two, both of which fulfilled all the requirements, we would have to conclude that they differed only stylistically. c. Neither of these situations, of course, is at present the case.’ [Hockett (1954: 233)]
It is difficult to interpret (46) without taking into account that (45e) is probably a later addition to the list. Hockett states in this respect ‘I am hesitant about the fifth, but no set would be complete which did not include the first four’ (1954: 232). The relative nature of this criterion would produce a ranking of grammars not foreseen in (46). Even if we restrict the scope to the absolute conditions in (45a-d), there is still a problem with (46b). There is in fact no reason to assume that there would be exactly one general, specific, inclusive, and productive model, unless we also assume that there is exactly one correct analysis of a language. If we want to maintain (46b), the implied absolute standard for the evaluation of analyses and the evaluation process have to be stated as well. The situation in (46c) may thus be a consequence of setting the target unrealistically high. The opposition between Hockett and Harris about non-uniqueness was noted already. In (47) Harris takes a very different approach to the formulation of general conditions than Hockett. (47)
‘The only preliminary step that is essential to this science is the restriction to distribution as determining the relevance of inquiry. The particular methods described in this book are not essential.’ [Harris (1951: 6)]
In the introduction of what is no doubt the most elaborate and explicit exposition of Post-Bloomfieldian methodology, (47) is striking because of its modesty. Instead of any of the requirements of the type illustrated in (45), it takes an analysis of the distribution of elements as the only constraint on theory formation. Any further methods are interchangeable. In sum, there are two explicitly stated approaches to non-uniqueness in Post-Bloomfieldian linguistics. One approach assumes that non-uniqueness is a property of language we have to live with. The other interprets it as a deficiency in the present (i.e. 1950s) state of the theory. The former is represented by Harris, the latter by Hockett. The research programme of Post-Bloomfieldian linguistics is not strong enough to enforce a decision in favour of one or the other option.
SUMMARY
• Before the emergence of Chomskyan linguistics, American linguistics was dominated by the Post-Bloomfieldians. Post-Bloomfieldian linguistics was inspired by Leonard Bloomfield, but diverged from his views in a number of points.
• In the stimulus-response model of a communicative act, the speech signal is distinguished from what underlies it in the speaker and what it triggers in the hearer. The speech signal is defined as the object of investigation of linguistics.
• Bloomfield defined the language of a speech community as the totality of utterances that can be made in that speech community.
• Post-Bloomfieldians defined a language as a system of symbols or habits underlying the utterances made in a speech community.
• The corpus of utterances used for analysis should be representative of the language described. It can be extended by elicitation, but this should not result in utterances that would not be produced naturally.
• The grammar is derived from a corpus of utterances. It covers the corpus and describes the language from which the utterances are taken.
• The grammar not only classifies the elements of the language, but also predicts how they can be used in new utterances.
• According to the God’s truth hypothesis, the linguist discovers the structure of the language. According to the hocus-pocus hypothesis, the linguist imposes structure on the language. Both positions are compatible with Post-Bloomfieldian linguistics.
• There was no unanimous position on non-uniqueness. Harris considered it harmless. Hockett expected a yet to be discovered general and specific model of analysis to eliminate it.
• Linguistic analysis should avoid mentalism.
• According to widespread agreement, phonological analysis should be completed before the recognition of morphemes and their distribution.
3.2 A comparison of the two research programmes
If we take the research programme of Chomskyan linguistics to be adequately represented by the model in Figure 2.10 and the research programme of Post-Bloomfieldian linguistics by the one in Figure 3.2, it is immediately obvious that they are not the same. There have been analyses, however, that downplay the significance of these differences. In this section, three different angles will be taken to the comparison of Post-Bloomfieldian and Chomskyan linguistics in order to determine whether an analysis as different research programmes is justified. Section 3.2.1 will address the continuity across the emergence of Chomskyan linguistics. Section 3.2.2 will consider the nature of the differences between the two approaches. Finally, Section 3.2.3 analyses the consequences these differences have for discussions between representatives of the two approaches.
3.2.1 Continuities
The transition from Post-Bloomfieldian linguistics to Chomskyan linguistics as the dominant framework in a particular area of the study of language has often been the subject of ideologically based discussion. An analysis that is both thorough and free of excessive ideological bias is the one by Kaldewaij (1986). Its conclusion is (48). (48)
a.
‘although there exist certain discontinuities in the transition from (American) structuralism to TGG, there are also numerous continuities. b. Moreover it has been established that certain new aspects of TGG were not foregrounded immediately but only in the course of its development. c. The emergence of TGG can therefore not be characterized as a revolution in the sense of Kuhn (1970)’. [Kaldewaij (1986: 267)]16
The question addressed by Kaldewaij is whether the emergence of Chomskyan linguistics constitutes a revolution in the sense of Kuhn (1970a), i.e. whether such claims as Thorne’s (1) are correct. Kaldewaij approaches this question on the basis of an overview of various currents in twentieth century linguistics, including Saussure, the European structuralism of the Prague school, Sapir, and Bloomfield. Therefore ‘(American) structuralism’ in (48a) cannot be identified with Post-Bloomfieldian linguistics. Nevertheless it is this type of structuralism that yields the most significant continuities. ‘TGG’ in (48) stands for ‘Transformational Generative Grammar’, the label Kaldewaij uses for Chomskyan linguistics. The basis for the claim in (48a) is an analysis concentrating on the comparison of Post-Bloomfieldian and Chomskyan linguistics taking up the second half
of Kaldewaij’s (1986) book. One can find nine claims of continuity, presented here in three groups. The first group is (49). (49)
a. Language is an autonomous system. (1986: 171)17 b. Linguistic elements are described on the basis of their syntagmatic properties. (1986: 199–200) c. Formalisation of the description is an aim. (1986: 171–173)
The three properties in (49) are very general. They characterise the field of study rather than the research programme. Autonomy in (49a) means that language is studied independently of its function or the application of the results of the study. The claim in (49a) is denied in functionalist approaches and by-passed in applied linguistics. Syntagmatic properties in (49b) refer to the relationships between phonemes or morphemes in a linguistic expression. These properties determine the distribution of an element. They are opposed to paradigmatic properties as found in, for instance, inflectional paradigms. Formalisation in (49c) means the use of a formalism to make statements precise. The alternative is that statements remain informal and vague. One might see traditional grammar as an approach that would deny the three claims in (49). They are certainly not specific enough to argue for a continuity of research programme. A second group of claims from Kaldewaij (1986) is (50). (50)
a. Form plays a central, meaning a peripheral role. (1986: 212) b. Sentence and word have a status as domains for the application of certain rules. (1986: 230) c. Sentences have a phrase structure. There are levels of representation and transformations. (1986: 177)
The three points in (50) are of a theory-internal nature. (50a) is very general and can be related to the language as an autonomous system in (49a). (50b) contrasts with the Prague School approach that distinguishes words and sentences in a much more fundamental way. (50c) is perhaps the most obvious continuity between Harris’s and Chomsky’s work. Harris (1952: 18–23) introduces the concept of grammatical transformation. The more detailed elaboration by Harris (1957) must have struck many contemporaries as similar to Chomsky’s early work. However, while (50) lists some genuine and specific continuities in the transition from Post-Bloomfieldian to Chomskyan linguistics, they are theory-internal decisions which do not determine the research programme. An analogy with astronomy illustrates this point. When Copernicus replaced the geocentric model of the universe as formulated by Ptolemy by a heliocentric one, he maintained many of the Ptolemaic mechanisms used to calculate planetary orbits. The same calculations were used, however, in a different interpretation. Instead of the orbit of a planet around the Earth they
were now interpreted as the result of the combined effects of the orbit of the Earth around the Sun and that of the planet around the Sun. This new interpretation determined further developments, in particular Kepler’s proposal to replace the combination of circles by an ellipse. In a parallel way, we can observe that mechanisms such as transformations are used both by Harris and by Chomsky, but that they use them to describe different entities. Harris, adopting a model like Figure 3.2, uses transformations to describe aspects of a language. Chomsky, adopting a model like Figure 2.7, uses them to describe aspects of the speaker’s competence. Considerations of learnability subsequently led him to replace individual transformation rules by the generalised movement rule, move α. In the same way as elliptical planetary orbits would be inconceivable in the Ptolemaic system, move α would be hard to reconcile with Harris’s model. A third group of continuities mentioned by Kaldewaij, given in (51), is neither as general as (49) nor as theory-internal as (50). (51)
a. There are universals and their study is relevant. (1986: 252f.) b. The description of the system has a realist status. (1986: 169–171) c. The description should cover the productivity of language. (1986: 245)
The point about universals in (51a) is that the type and status of universals are quite different in Post-Bloomfieldian and in Chomskyan linguistics. The study of universals in Post-Bloomfieldian linguistics has always been restricted by Bloomfield’s famous statement in (52). (52)
‘The only useful generalizations about language are inductive generalizations. Features which we think ought to be universal may be absent from the very next language that becomes accessible.’ [Bloomfield (1933: 20)]
As Hockett (1963: 2) correctly observes, it is impossible to find universals without ‘extrapolation’, i.e. without violating the restriction to induction in (52). Kaldewaij (1986: 115f.) also mentions methodological universals as a type of universal concerned by (51a). They consist in the application of the same analysis method and the same concepts (e.g. phoneme, lexicon) to all languages. This approach to universals contrasts very strongly with the one taken in Chomskyan linguistics. The research programme of Chomskyan linguistics requires the existence of universals in the language faculty. As we saw in Section 2.4.2, Chomskyan linguistics is not interested in inductive universals of the type referred to in (52), but in the ones that are necessary for language acquisition. They do not constitute an optional extension but an indispensable component of a theory of language.
The question of realism, as raised by (51b), is a vexed one, because the label realist has been used in different senses. In Chomskyan linguistics, there is no room for questioning that the grammar is meant to be a description of a real-world entity, the speaker’s competence. Realism is required by the research programme. In Post-Bloomfieldian linguistics, the discussion of God’s truth and hocus-pocus positions described in (30) may be interpreted in the sense that realism is optional. However, in (32a-c) Harris avoids the hocus-pocus position by reinterpreting realism. The research programme does not make any demands in this respect because it does not specify how language is realised. A question such as (32d) on the relation between a language and the speaker’s knowledge can thus be asked open-mindedly in Post-Bloomfieldian linguistics. Therefore, depending on the precise interpretation of realism, it is either not the same in Chomskyan and Post-Bloomfieldian linguistics, or it is obligatory in the former and optional in the latter. The productivity or open-endedness of language as referred to in (51c) is acknowledged in both research programmes. In (3) of Chapter 2 it was presented as a starting point for Chomskyan linguistics. In this chapter, Hockett’s (23) and (28) and Harris’s (27) show a similar commitment in Post-Bloomfieldian linguistics. Two differences should be noted here, however. In (27) Harris wants a grammar to be explicit enough ‘for anyone to synthesise or predict utterances’. In the context of Chomskyan linguistics, this would leave the general human language capacity unexpressed, because anyone has it and does not need its specification to use it. In (28c) Hockett uses ‘casual acceptance by a native speaker’ as a test for the generated utterances. In the context of Chomskyan linguistics, this would be thought of as lacking in power and reliability compared to grammaticality judgements.
Therefore, as soon as productivity is elaborated the differences between Post-Bloomfieldian and Chomskyan linguistics emerge. To sum up, the ‘numerous continuities’ Kaldewaij refers to in (48a) contain a number of points that are not significant for the question of whether we are dealing with one or two research programmes if we consider Post-Bloomfieldian and Chomskyan linguistics. First we have to discard theory-internal properties as irrelevant to the question. This concerns (50). Then we have to take into account that some of the properties are so general that they can be implemented in many different ways. This concerns (49). The remaining issues include some, (51a-b), for which Post-Bloomfieldian linguistics leaves open several possibilities whereas Chomskyan linguistics determines a specific approach. The productivity of language, (51c), is the only significant point where the two approaches coincide, but in the formulation of this point we immediately identify characteristic differences between the two approaches.
Before moving on to the discussion of the differences, let us briefly consider the other parts of (48). In (48b) it is implicitly conceded that there was no major break in the history of Chomskyan linguistics. As we concluded in Sections 2.5 and 2.6, the most that can be said in this respect is that gradually more elements of the research programme could be operationalised. It is questionable whether (48b) would justify the conclusion in (48c), even if the continuities referred to in (48a) were more significant than we concluded above. In his analysis of the Copernican revolution, Kuhn (1957) argues that the revolution started with the proposal of a heliocentric universe by Copernicus and concluded with the explanation of planetary orbits by Newton’s law of gravitation. Ten Hacken (1997b) elaborates the parallel between this analysis of the Copernican revolution and the stepwise operationalisation of elements of the research programme of Chomskyan linguistics. Assuming this parallel, if (48b) justified (48c), the Copernican revolution would not be a revolution in Kuhn’s (1970a) sense. Since the Copernican revolution is one of Kuhn’s prime examples of a scientific revolution, this conclusion is not acceptable unless we reject Kuhn’s theory.
3.2.2 Differences
A comparison of the models of Post-Bloomfieldian linguistics in Figure 3.2 and of Chomskyan linguistics in Figure 2.3 reveals a number of differences. These differences can be reduced to two major underlying distinctions. First, although the boxes representing theoretical and real-world entities and the arrows linking them appear in the same configuration, their labelling is different. These differences evolve from a different attitude to the relationship between grammars, languages, and speakers (Section 3.2.2.1). Second, the model of Figure 2.3 is embedded in an architecture with additional levels in Figure 2.7 and Figure 2.10, whereas no such extension exists for the model in Figure 3.2. This reflects a different attitude towards the indeterminacy of grammars (Section 3.2.2.2).
3.2.2.1 Mentalism
The Post-Bloomfieldian attitude to mentalism is expressed concisely and forcibly in Hockett’s statement (40b). The general agreement on this point derives directly from Bloomfield’s analysis of communication as represented in Figure 3.1. Weiss (1925) states the same point from a psychological perspective in (53).
(53)
a. ‘No non-physical, non-biological forces need be postulated, b. and until it has been conclusively demonstrated that the biological structure of man and his complex language and social environment are unable to produce the social institutions which differentiate him from the animals, c. the assumption of a special mental force or a mind is gratuitous. d. As we learn more about language, there arises a tendency to shift the burden of proof as to the existence of a special mental force, upon those who hold this hypothesis.’ [Weiss (1925: 56)]
At first sight, the Chomskyan idea of language as a mental component of knowledge, as in (14) of Section 2.1.2, is an example of a ‘gratuitous’ assumption in the sense of (53c). The terminology used in (53), however, points to a different analysis. Weiss opposes ‘special mental force’ in (53c) to physical and biological forces in (53a). This is reminiscent of the mind-body dualism of Descartes, as mentioned in Section 2.6.2. As indicated there, the hypothesis alluded to in (53d) is not one held by Chomsky. Chomskyan mentalism assumes, as argued in Section 2.6.2.1, that mind is a name for the cognitive aspects of a physical entity, the brain and does not correspond to a non-physical force. Both Bloomfield and Chomsky distinguish problems and mysteries, where it is reasonable to expect a solution to a problem but not to a mystery. Bloomfield expresses this in (8), explaining why ‘the linguist deals only with the speech signal’ in Figure 3.1. The analysis of the speech signal (B) is ‘fairly well understood’, but the way information is processed by the speaker (A) and hearer (C) is ‘very obscure’. Chomsky mentions the contrast in (59) of Section 2.4.1. He assumes that the ‘development of cognitive structures’ is a well-formed problem, but the ‘capacity to use these structures’ is a mystery. The resulting difference is illustrated in Figure 3.3.
[Figure 3.3: Problems and mysteries for Bloomfield and Chomsky. Diagram labels: Speaker and Hearer, each with a Mind containing Th and Cp; S, r, s, R; [B].]
In Figure 3.3, the Bloomfieldian concept of the subject of linguistics is indicated by [B], as in Figure 3.1. The happenings indicated by [A] and [C] in Figure 3.1 have been analysed so that they are mediated by the mind, in which competence (Cp) and thought (Th) interact. The Chomskyan concept of the subject of linguistics in this diagram is Cp. It is striking that the two concepts of the proper domain of linguistics do not have any overlap. What for Bloomfield constitutes language is for Chomsky a physical side effect of the use of language. What for Chomsky constitutes language is for Bloomfield part of the mystery. The mystery for Chomsky is how the interaction of competence and other parts of the mind maps stimuli into responses. A consequence of the shift in the notion of language is that in Chomskyan linguistics Figure 3.3 loses its status as the only or privileged realisation of language. That it has this status in Post-Bloomfieldian linguistics is clearly indicated by Trager and Smith in (54). (54)
a.
‘It must be recalled in this connection that language is a societal phenomenon. b. The language of one speaker – an idiolect – is therefore necessarily and by definition incomplete, c. since at least two speakers (one of whom may be imaginary) are involved in every normal communicational situation.’ [Trager and Smith (1951: 9)]
If language can only be observed in [B], it is strange to consider the language of one speaker. The normal starting point, as indicated in Section 3.1.1, is the speech community. In Chomskyan linguistics, the ‘societal phenomenon’ in (54a) is derived from the use of individual competence. Instead of an idiolect, as in (54b), the language of one speaker is considered from the point of view of the speaker’s competence. This competence is by no means incomplete on its own. In (58) of Section 2.4.1, Chomsky indicates the expression of thought as the central use of language, of which communication is a special case. There is no need for an ‘imaginary’ speaker as mentioned in (54c) to save the generality of the communicative model. We can see the influence of Chomskyan mentalism on the research programme if we compare the labels in Figure 3.2 to those in Figure 2.3. In both figures, the grammar is related to the object it describes and to the data. A Post-Bloomfieldian grammar describes a language. A Chomskyan grammar describes a speaker’s competence. In Post-Bloomfieldian linguistics, the speaker’s knowledge of a language is dependent on the existence of the language. Speakers only come into the picture as members of a speech community in the definitions in (5d), (9a) and (10). 18 In Chomskyan linguistics, a language (in the sense of E-language) is derived in some way from what the speakers know (cf. Section 2.1.3).
The difference between a language or a speaker’s competence as the object of description has immediate consequences for the types of data that are acceptable. Both in Post-Bloomfieldian and in Chomskyan linguistics it is possible to use a corpus of utterances as a source of data, but the status of the corpus is not the same. In Post-Bloomfieldian linguistics, it is a sample of the object of description. Both the corpus and the language consist of utterances. In Chomskyan linguistics, it is a product of the object of description. The speaker’s knowledge contributes to the production of the corpus. This difference also determines the difference between the two techniques for extending the set of data provided by a corpus, elicitation techniques and grammaticality judgements. Elicitation techniques extend the corpus without changing the nature of the data. As (21) emphasises, they should not ‘bring out utterances which would not have sometimes occurred naturally’. For this reason, (22) explicitly excludes a technique that comes close to grammaticality judgements. This also means that it is impossible to get the equivalent of negative grammaticality judgements. In our French example of frions, the linguist can create a context favourable to the appearance of the form. If the informant uses it, the form is attested. If the informant seems to avoid it, the linguist can never be sure that the form does not exist in the language. By means of a grammaticality judgement the latter conclusion can be established quite straightforwardly. Another instantiation of the same difference is illustrated by (55). (55)
a. ‘Given that we have no record of anyone having ever said either
b. The blue radiator walked up the window.
c. or Here is man the.
d. we can devise a few situations in which the former will be said but can predict that the latter will be said far less frequently
e. (except for situations which can be stated for each culture, e.g. explicitly linguistic discussions).’ [Harris (1951: 22)]
The contrast between the examples (55b-c) can be compared to Chomsky’s examples in (23) of Section 2.2.1, which illustrate the contrast between grammaticality and acceptability. The concept of grammaticality is not available in a Post-Bloomfieldian framework because it appeals crucially to the speaker’s competence. Therefore the only way Harris can express the difference between the two examples is in terms of frequency, as he does in (55d). However, this sits uncomfortably with (55a). Both of (55b-c) have a frequency close to zero. As highlighted also by the problem in (55e), recourse to frequency is a poor substitute for the actual difference, which concerns grammaticality. Without grammaticality judgements, which require the concept of competence, this property is not available to the linguist.
3.2.2.2 Indeterminacy
One of the common features of the models in Figure 2.3 and Figure 3.2 is that there are many different solutions to the problem of finding a grammar that fits the data. This situation had been recognised by Chao (1934), so that both the Post-Bloomfieldians and Chomsky were aware of it. Their approaches are strikingly different, however. In Chomskyan linguistics, as discussed in Section 2.4, the indeterminacy or non-uniqueness is solved by adding a higher level of explanation in Figure 2.6. In Post-Bloomfieldian linguistics, as discussed in Section 3.1.4, the attitude is marked by a tolerance of different approaches and by increased attention to the method used to arrive at a grammar.
This contrast highlights another difference between grammars in the two models. If we compare the labels of the downward arrows in Figure 2.3 and Figure 3.2, we see that the grammar explains the data in Chomskyan linguistics, but only covers them in Post-Bloomfieldian linguistics. The reason for this difference is easy to detect. In Post-Bloomfieldian linguistics, data are interpreted in terms of Figure 3.1. An explanation of the data would have to take into account [A] and [C], which is excluded by the ban on mentalism. In Chomskyan linguistics, this problem is avoided by reinterpreting the status of the data. As far as natural communication provides data, they are interpreted in terms of Figure 3.3. Only to the extent to which the data give information about the competence (‘Cp’) are they explained. The possibility of such a position is a direct consequence of the distinction between competence and performance.
Let us now turn to the upward arrows in Figure 2.3 and Figure 3.2. In Chomskyan linguistics the grammar is tested by the data, whereas in Post-Bloomfieldian linguistics the grammar is derived from the data. These terms appeal to two different views of science.
In one view, science is governed by the empirical cycle in Figure 1.3, and a theory is a hypothesis to be tested. The other view appeals directly to Figure 1.1, where scientific knowledge is derived by a scientific process applied to the data. A theory in this perspective is scientific knowledge. In order to make the knowledge scientific, we should ensure that the data are correct and the procedure applied to the data is correct. This view of science is by no means obsolete. It is represented, for instance, in the title From Data to Knowledge of Gaul and Pfeifer’s (1995) proceedings of the annual conference of the Gesellschaft für Klassifikation, the German Classification Society. The role of classification in this perspective reminds one of the discussion in Section 3.1.3.1, in particular (25) and (26). In many respects the two perspectives of science can be seen as complementary. If we describe the history of a science, we tend to appeal to the
empirical cycle, showing how theories were gradually improved by adapting them when tests had shown their deficiencies. If we compare the merits of two theories, we tend to present all the evidence for a theory systematically, rather than historically. Although to some extent a representation can be translated from one perspective into the other, there is a sense in which the assumption of one of the perspectives as the basic one influences research. The cyclic perspective of Figure 1.3 requires evaluation criteria for theories. We want to gradually improve our theory (=hypothesis) by seeing how alternative theories fare in view of the data. The one-step perspective of Figure 1.1 requires evaluation criteria for the scientific process. We want to maintain the scientific nature of our theory (=knowledge) by ensuring that high-quality data are collected and the procedures applied to them are scientific.
In Chomskyan linguistics, the empirical cycle is chosen as a basis. Theories are formulated to explain observations. It is always possible to consider deeper explanations. Therefore, the indeterminacy of grammars can be approached by having a higher-level theory that constrains the grammar. As the discussion of the Minimalist Program in Section 2.6 shows, we do not have to actually formulate the higher-level theory in order to use the constraints it imposes on the lower level. We do not need a full theory of evolution, but only the constraints it imposes on the emergence of the language faculty in order to use them to restrict the range of theoretical options.
In Post-Bloomfieldian linguistics, the linear procedure is taken as the basic perspective. Although in practice Post-Bloomfieldian linguists did use hypotheses and tests, as indicated by Hockett’s (28c), the prototypical presentation of the results is in terms of a linear procedure. This is demonstrated by the fact that properties of the analysis procedure were heavily discussed among Post-Bloomfieldians.
Discussions of this type were at the origin of the constraint against mixing levels. Significantly, Hockett expresses this in (40e) as a prohibition of circularity. The approach to non-uniqueness is determined by this linear perspective. Hockett formulates constraints on the procedure, in particular (45). The specificity constraint of (45b) excludes non-uniqueness, but as (46) concludes, there is as yet no procedure that satisfies this constraint. Harris accommodates to non-uniqueness, stating in (47) that as long as distribution is the only basis for any operation, procedures are equivalent and in (32c) that all analyses obtained by these procedures are equally real. He considers non-uniqueness as inevitable but harmless, cf. (37b).
The opposition between these two perspectives and in particular their consequences for the development of theories was addressed by Chomsky (1957: 51) when he compared the functions of a discovery procedure, a decision procedure, and an evaluation procedure, cf. (53) in Section 2.4.1. A discovery procedure does what Harris wants ‘the work of analysis’ to do in (27): It ‘leads
right up to’ the grammar by processing the data. It is this type of procedure Hockett in (45a-b) requires to be general and specific. Decision or evaluation procedures test competing theories, where a decision procedure compares all possible theories and an evaluation procedure any number of actual theories. Chomsky describes his position in this respect as in (56). (56)
a. ‘The point of view adopted here is that it is unreasonable to demand of linguistic theory that it provide anything more than a practical evaluation procedure for grammars. That is, we adopt the weakest of the three positions described above.
b. As I interpret most of the more careful proposals for the development of linguistic theory,* they attempt to meet the strongest of these three requirements. That is, they attempt to state methods of analysis that an investigator might actually use, if he had the time, to construct a grammar of a language directly from the raw data.
c. I think that it is very questionable that this goal is attainable in any interesting way.’ [Chomsky (1957: 52), footnote at * deleted]
The three positions referred to in (56a) correspond to the three types of procedure. The footnote at * in (56b) gives references, including Bloch (1948), Harris (1951) and Hockett (1947), 19 so that it is clear that the focus here is Post-Bloomfieldian linguistics. The expression ‘if he had the time’ in (56b) refers to the fact that the formal procedures are more time consuming than the intuitive ones normally used. Harris also refers to ‘cumbersome but explicit procedures offered here in place of the simpler intuitive practice’ (1951: 3). (56c) is a clear negative response to the programme laid out by Hockett in (45).
The difference in the attitude to the indeterminacy of the choice between grammars or, in the formulation favoured by Post-Bloomfieldians, the non-uniqueness of grammars thus shows a fundamental difference between the two approaches to the study of language. Chomskyan linguistics is based on a view of science governed by the empirical cycle, whereas Post-Bloomfieldian linguistics is based on a linear view of science.
3.2.3
Incommensurability
One of the phenomena Kuhn (1970a) tries to explain by means of his paradigms is the fact that discussions between proponents of two paradigms are extremely difficult if not impossible. In Section 1.2.2, we called these phenomena incommensurability effects. They are typical also of clashes between research programmes. It is therefore interesting to consider discussions between representatives of Post-Bloomfieldian and Chomskyan linguistics in order to see whether such incommensurability effects can be detected.
As an example of a discussion let us consider Householder (1965) and the reaction it triggered. Householder (1965) attacks the system of phonology proposed by Halle (1962) and the general theory this system is embedded in, as outlined by Chomsky (1964), in an article in the newly launched Journal of Linguistics. Chomsky and Halle (1965) reply in the next issue with an article about twice the length of Householder’s. These two articles will be the main focus of our analysis. After them, in the same journal, Householder (1966) gives a brief reaction restating his main points, and Matthews (1968) attempts to analyse the discussion and reconcile the two positions. Elsewhere, Chomsky (1967) reacts to one of Householder’s (1966) points in a footnote and Householder (1971) includes some remarks about the discussion.
3.2.3.1 Expressions of puzzlement
A first remarkable feature of the discussion is the occurrence of explicit acknowledgements of incomplete understanding of the opposing point of view. A sample of such remarks from Householder (1965) is given in (57) and from Chomsky and Halle (1965) in (58).
(57)
a. ‘As these are expressed here, only number one ‘observational adequacy’ is intelligible (at least to me).’ [Householder (1965: 14)]
b. ‘What Halle means is not at all clear to me.’ [Householder (1965: 19)]
c. ‘The one remaining philosophical point is a real puzzler (for me, at least).’ [Householder (1965: 20)]
(58)
a. ‘Perhaps Householder has in mind something else, but once again we have no way of knowing.’ [Chomsky and Halle (1965: 115)]
b. ‘We have no idea what this comment means, and therefore make no attempt to discuss it.’ [Chomsky and Halle (1965: 118)]
c. ‘Householder also states that ‘the link Chomsky makes with biuniqueness is quite puzzling to me’. This comment is quite puzzling to us.’ [Chomsky and Halle (1965: 129, fn. 26)]
Even without explaining the anaphoric references in these quotations, we can see that Householder and Chomsky and Halle have to struggle to understand each other’s positions. We also note a certain difference in the way this misunderstanding is presented. Householder’s references to ‘me’ in (57) can readily be interpreted as a sign of modesty. Chomsky and Halle’s ‘we’ in (58a-b) may rather be seen as an implied ‘one’. If ‘we have no way of knowing’, this is a problem of the other side. Similarly, (58b) suggests that ‘this comment’ is not worth discussing. While the last sentence of (58c) is structurally similar to (57c), its context with the jocular repetition of ‘puzzling’ takes away any effect of modesty.
This difference in attitude is not an accidental property of the articles under discussion here. Chomsky is often charged with dogmatism, as Householder does in (59). (59)
a. ‘It is only fair to mention that many followers believe they have a practical test for explanatory adequacy: (1) if it’s not from MIT, it’s wrong; (2) of two from MIT, the one O.K.’d by Chomsky or Halle is correct.’ [Householder (1965: 16, fn. 3)]
b. ‘Though Chomsky vehemently rejects the charge of dogmatism, it remains a fact that he KNOWS certain things for sure that I do not know and appear incapable of learning; the knowledge enables him to say what linguists must do to remain linguists (or become linguists).’ [Householder (1965: 33, fn. 17)]
The irreverent tone of (59a) qualifies the modesty expressed in (57), or at least its sincerity. In (59b) one may well read a complaint that Chomsky does not sufficiently explain his point to be convincing. Chomsky and Halle, however, charge Householder with not making full use of the available literature, e.g. in (60). (60)
a. ‘The explanation for our conclusion that so puzzles Householder is given in a clear and straightforward fashion in Halle (1962 & 1964b).’ [Chomsky and Halle (1965: 119)]
b. ‘Householder not only overlooks the rich literature and serious linguistic studies that have been concerned with justifying and improving the distinctive feature system […].’ [Chomsky and Halle (1965: 127)]
c. ‘A more careful reading of Chomsky (1964a; 1964c) would have explained to Householder the source of his difficulty.’ [Chomsky and Halle (1965: 128, fn. 25)]
The explicit references in (60a) and (60c) are to the two works criticised by Householder and the versions reprinted in Fodor and Katz (1964). The remarks in (60) highlight that what Householder reads in these works is not the same as what Chomsky and Halle intended to write in them. Looking back on the discussion a couple of years later, Householder describes his perception of the different styles in (61). (61)
a. ‘I thought I asked my questions in an amiable and friendly manner,
b. but the response, which appeared in the next issue,* was frightening in its lack of courtesy or of any attempt at that effort to understand without which communication must always fail.
c. And (so far as I can tell) none of my questions was answered, and my puzzlement by certain arguments evoked repetition of the same arguments.’ [Householder (1971: vii), * footnote with reference deleted]
While ironic remarks such as (59a) do not lend support to (61a), 20 it is true that Chomsky and Halle presuppose their own research programme and fail to recognise, in remarks such as (60), that theirs is not the only possible one. Their failure to understand Householder’s position, as noted in (61b), makes them imply that Householder is not intelligent enough to understand their arguments and explanations. The lack of power, as indicated in (61c), and the impatience such a situation produces on either side of the discussion is typical of incommensurability.
3.2.3.2 Observational adequacy
Much of the content of the discussion is concerned with phonological theory. Although it is rich in material, full appreciation of the arguments of both sides requires more background knowledge about the theoretical assumptions they make than available space allows me to present here. Matthews (1968) discusses the arguments for the use of phonemes or distinctive features as the basic elements. Another question concerns the relationship between phonology and syntax. The relationship between the arguments raised in these discussions and the research programmes they assume is relatively complex. We are exceptionally lucky, however, that part of the discussion turns directly on an element of the research programme, viz. the levels of adequacy as proposed by Chomsky in (69) of Section 2.4.3. Let us therefore concentrate on this part of the discussion.
Householder starts his discussion of the levels of adequacy with the remark in (57a). However, he continues with a discussion of the ‘examples which make them somewhat clearer’ (1965: 15). On the basis of these examples, his view of observational adequacy is modified as in (62).
(62)
a. ‘it turns out that observational adequacy means that the grammar gives back the data AND NO MORE.
b. This is surely not an acceptable goal to any linguist,
c. even though Chomsky ascribes it to (all?) ‘post-Bloomfieldian’ Americans and to ‘the London School of Firth’.
d. It appears also to be impossible to attain in any other manner than by simply reproducing the corpus.’ [Householder (1965: 15)]
It is obvious from (62b) that there is a basic agreement between Householder and Chomsky that observational adequacy is not an acceptable goal. However, their disagreement is not restricted to (62c). It hinges on the precise content of ‘and no more’ in (62a) and of ‘simply reproducing the corpus’ in (62d). 21 Householder’s position becomes clearer when we consider (63).
(63)
a. ‘The examples for descriptive adequacy show that it means […]
b. for syntax, the inclusion of means for identifying all pairs of structures which have a transformational relationship to each other.
c. This last is fine, and I’m all for it, but I would say it is equally required by economy and observational adequacy (in a reasonable sense).’ [Householder (1965: 15)]
In (63), Householder discusses descriptive adequacy. Between (63a) and (63b) he describes the consequences for phonology. To the ‘unreasonable’ version of observational adequacy of (62), Householder opposes in (63b-c) a ‘reasonable’ one which includes, among other elements, an account of the relationships between, for instance, an active and a corresponding passive sentence. For Householder, who works in a model such as Figure 3.2, observational adequacy is adequacy at the level of a grammar describing a language. Transformational relationships as referred to in (63b) are incorporated in the language. Chomsky and Halle explain their viewpoint in (64). (64)
a. ‘statements of ‘patterns’, ‘regularities’, and ‘underlying principles’ go beyond the data.
b. They are based on some assumption about the nature of linguistic patterns or regularities. […]
c. such assumptions, which are the heart of linguistic theory, can be tested for adequacy in only one way, namely, by determining whether the descriptions to which they lead are in accord with tacit knowledge concerning the language.’ [Chomsky and Halle (1965: 103)]
The relationships mentioned in (63b) are examples of regularities as referred to in (64a). According to Chomsky and Halle, they cannot be included in observational adequacy, as Householder proposes in (63c). The minimal advance beyond a grammar which ‘gives back the data and no more’, (62a), is different for the two sides in the discussion. The only way to advance beyond this, according to (64c), is to ground the patterns in the ‘tacit knowledge concerning the language’, i.e. the competence. What Householder sees as going beyond (62a) in (63c) is not recognised as such by Chomsky and Halle. The reason is that they are not working in Figure 3.2 but in Figure 2.7. In their conception of linguistics, there is no genuine advance as long as no relation to the speaker’s competence is established. Householder reacts indirectly to Chomsky and Halle’s objection in (65). (65)
a. ‘I believe that our brains (unlike most computers) have no need for economizing with storage space.
b. A linguist who could not devise a better grammar than is present in any speaker’s brain ought to try another trade.’ [Householder (1966: 100)]
The comment in (65b) can only be explained by referring to the model in Figure 3.2. Householder assumes that there are other criteria for evaluating a grammar than its correspondence to competence. Such criteria, e.g. elegance, simplicity, etc., can apply independently of the implementation of competence in the brain. In the context of Figure 2.7, (65b) would be absurd. The competence, i.e. the grammar in the speaker’s brain, is the standard of evaluation for the linguist’s grammar. Chomsky reacts to (65) in (66). (66)
‘A similar example of apriorism can be found in a recent discussion of phonology by Householder (1966), who states that, in his view, information is organized in the brain in systems of highly redundant lists, the basic retrieval device being table look-up. To be sure, there is no neurological evidence for this idea (just as there is none against it), but he regards alternatives as unacceptable.’ [Chomsky (1967: 107, fn. 7)]
What (66) is reacting to is rather the system alluded to in (65a). Chomsky ridicules this system without distinguishing between the role that the nature of the actual system of language in the brain plays in the Post-Bloomfieldian model and in his own. Whereas in Figure 3.2, the relationship between the language and the knowledge of language as stored in the brain is an accessory question, in Figure 2.7 it determines the object to be described by a grammar. This distinction is a direct consequence of the difference in the attitude to mentalism.
3.2.3.3 Descriptive and explanatory adequacy
Elaborating on (57a), Householder explains the difficulties he has with Chomsky’s formulation of descriptive and explanatory adequacy along two lines. First, in (67) he claims that the statement of the levels of adequacy leads to a disrespect for the data.
(67)
a. ‘Chomsky feels that ‘observational adequacy’ is completely ‘uninteresting’
b. (and whatever Chomsky himself may intend by this term, his followers without exception interpret it as meaning ‘bad’, ‘scientifically unsound’, ‘to be avoided at all costs’, etc.).
c. This has the unfortunate effect that mere mistakes of fact, no matter how gross and glaring, tend to be looked upon as trivial, and no votary would admit publicly that he spends any time avoiding them.’ [Householder (1965: 14)]
The statement in (67a) is a good example of one that means different things to different people. Chomsky and Halle motivate it in (68).
(68)
‘innumerable ‘patterns’ and ‘regularities’ can be found in any data, all mutually conflicting, and most of them, for some reason, quite ridiculous.’ [Chomsky and Halle (1965: 103)]
(68) is an instantiation of the familiar problem of indeterminacy. What is ‘to be avoided at all costs’, as (67b) states, is not so much observational adequacy, but neglect of descriptive adequacy. For Householder, there is only one valid relationship between these two levels, expressed in (69). (69)
a. ‘until a given level of ‘observational adequacy’ (combined with a respectable level of economy, and based on a corpus which has been repeatedly enlarged at specific test points) is reached, it is sheer braggadocio to talk about descriptive adequacy,
b. even if one knew how to discover what a ‘correct account of the linguistic intuition of the native speaker’ is.
c. No doubt Chomsky means ‘descriptive adequacy’ to include ‘observational adequacy’ somehow.’ [Householder (1965: 15)]
Householder sees the levels of adequacy as successive stages of progress. First, as (69a) states, observational adequacy with respect to the corpus has to be reached. Only then, descriptive adequacy with respect to the native speaker’s intuition can be envisaged. Therefore, as (69c) states, in a rational interpretation, descriptive adequacy has to include observational adequacy. Only a ‘votary’, as (67c) mentions, not a rational researcher could think otherwise. For Chomsky, however, (68) means that descriptive adequacy only entails observational adequacy in a rather trivial, uninteresting way. The problem in (67c) is nothing else than the ‘counter-instances’ Kuhn refers to in (9) in Section 1.3.1. The difference between the two approaches is then that Householder can only see theoretical adequacy as based on the full set of data in a one-step operation, whereas Chomsky sees a theory as a hypothesis that can be discarded only in favour of a better one, not on the basis of counterexamples. In Householder’s view, a counterexample causes the operation of deriving a grammar to fail. In Chomsky’s view, counterexamples do not have immediate drastic effects. Counterexamples may trigger further research, but a theory will stand until a better theory is formulated. The second line of difficulties Householder has supports this analysis. He explains his problems with descriptive and explanatory adequacy in (70). (70)
a. ‘I find the word ‘correct’ here particularly puzzling,
b. and regard the ‘linguistic intuition of the native speaker’ as extremely valuable heuristically, but too shifty and variable (both from speaker to speaker and from moment to moment) to be of any criterial value.
c. In scientific discourse it seems to be a fact that all arguments based on bare intuition, whether the linguist’s or the native speaker’s, constitute a hindrance to communication, since there is no way of evaluating conflicting claims. […]
d. In the account of explanatory adequacy I am distrustful of the definite article ‘THE descriptively adequate grammar of each language’, which seems to imply that there can be only one.’ [Householder (1965: 15)]
There is a footnote following (70b), the start of which is given in (70c). In (70a) ‘here’ refers to the quoted part of (69b), characterising descriptive adequacy in Chomsky’s terms. Chomsky’s view in (64c) is not acceptable to Householder because of (70b-c). The emphasis on a scientific methodology in (70c) is once more typical of a one-step derivation rather than cyclic development of a theory. The problem with explanatory adequacy as formulated in (70d) restates non-uniqueness as a virtue rather than a problem. It is elaborated in (71). (71)
‘I am more inclined to the view that two inconsistent and irreconcilable descriptions of a language may each convey some important ‘intuition’ about the language which cannot be conveyed by the other, nor both by any third.’ [Householder (1965: 16)]
Chomsky and Halle (1965: 105) quote (71), emphasising the last five words, and single out this statement for particularly harsh criticism, leading to the conclusion in (72). (72)
a. ‘A linguist, who, like Householder, is willing to accept inconsistent accounts – in fact, claims that such inconsistency is ineliminable – has disavowed any concern for the topic of descriptive or explanatory adequacy.
b. He has simply given up the attempt to find out the facts about particular languages or about language in general.
c. His work is immune to criticism, of course, as an automatic consequence of his tolerance of inconsistency.’ [Chomsky and Halle (1965: 106)]
It is conceivable that Householder would not be opposed to (72a), because he has problems with the notions of descriptive and explanatory adequacy in the first place. The conclusion in (72b) depends on the conceptions of the ‘facts about particular languages’, which are crucially different for the two sides of the argument. As it stands, it is certainly not acceptable to Householder. The problem noted in (72c) may well be due to an overstatement of non-uniqueness by Householder in (71). If we compare (71) with Harris’s (37),
we notice that Householder has added ‘inconsistent’ to what originally was only ‘different’ and ‘equally adequate’. Chomsky and Halle describe their alternative approach in (73). (73)
‘Where two equally effective fragments can be constructed, the grammarian will attempt to choose between them by enriching the domain of relevant fact or deepening linguistic theory.’ [Chomsky and Halle (1965: 105)]
The crucial difference between the accepted range of relevant facts about languages implies that the solution to non-uniqueness in (73) is not available to the Post-Bloomfieldians. Therefore, in his reply, Householder can only restate his point as in (74). (74)
‘I still do not like the use of such strong God’s truth language, saying always ‘such-and-such IS X’ instead of ‘such-and-such may conveniently be described or looked at as X’.’ [Householder (1966: 99)]
It is remarkable that the same person who in (30b) calls himself a ‘God’s truth linguist’ objects to such a position in (74). It seems, then, that at least some Post-Bloomfieldian linguists were attached to the variety of possible positions allowed by the model in Figure 3.2 even though it was originally noted as a problem.
SUMMARY
•
Several continuities have been identified between Post-Bloomfieldian linguistics and Chomskyan linguistics, but they are either very general or theory-internal or elaborated in different ways in the two research programmes.
•
The main differences between Post-Bloomfieldian and Chomskyan linguistics concern the attitudes towards mentalism and non-uniqueness.
•
The variety of mentalism condemned by Bloomfield and Weiss in the 1920s is not the same as the one endorsed in Chomskyan linguistics. The mind is a non-physical, non-biological entity in the former, but not in the latter.
•
Only if mentalism is accepted can the grammar be a description of the speaker’s knowledge. Grammaticality judgements depend on the activation of this knowledge. Therefore, the attitude to mentalism determines the possibility of using grammaticality judgements.
•
Whereas Post-Bloomfieldian linguistics did not provide any solution for non-uniqueness, Chomskyan linguistics approaches it by adding a deeper level of analysis.
•
The discussion between representatives of Post-Bloomfieldian and Chomskyan linguistics is hampered by the differences between the research programmes. This is illustrated by the discussion of the levels of adequacy by Householder (1965) and Chomsky and Halle (1965).
3.3
Has there been a Chomskyan revolution?
In a sense, the question whether the emergence of Chomskyan linguistics should be analysed as a revolution, as claimed in (1), has been the leading question of this chapter. Before turning to the formulation of an answer, it is worth emphasising again what such an answer does and does not imply. First, the question is framed here in terms of research programmes. This means that the social aspects of the Kuhnian framework are disregarded. To the extent Thorne’s reference to Kuhn in (1) is taken to be crucial, I do not propose an answer to the question whether (1) is correct. Second, a revolution does not imply that the new research programme should command universal assent. It is worth mentioning this point again here, because even otherwise well-informed discussions sometimes go astray in this respect. Thus, Newmeyer states that Kuhn (1970a) ‘claims that a central criterion is the resultant uniformity of belief, within the scientific community, in the new ‘paradigm’ ’ (1986c: 6). As shown in Section 1.2.2, Kuhn does not make this claim. Third, a revolution does not imply that the old framework is unscientific. When the old research programme is abandoned, however, this is a sign that the new research programme constitutes progress, as argued in Section 1.3.2. Progress does not affect intelligence but only insight. Margolis (1993: 68–85) gives the emergence of the probability calculus as an example of the contrast involved. The joint efforts of some of the greatest seventeenth century mathematicians (Fermat, Pascal, Huygens) were necessary to discover the probability calculus, but once discovered it could be understood by any averagely intelligent student. Since gaining new insight of this type is so much harder than transferring it, progress is measured in terms of insight. 
As we concluded in the discussion of revolutions in Section 1.3.3, the question whether two theories belong to different research programmes is logically prior to the one whether there has been a revolution. Even a Kuhn-based discussion such as Newmeyer (1986c) concentrates largely on showing that Post-Bloomfieldian and Chomskyan linguistics belong to different paradigms. The comparison of the models of the research programmes in Figure 2.7 and Figure 3.2 shows that there are two crucial differences. In Section 3.2.2, they are labelled mentalism and non-uniqueness. In Post-Bloomfieldian linguistics, a grammar describes a language and whether there are different but equally valid grammars for the same language is a matter of debate. In Chomskyan linguistics, a grammar describes a speaker’s competence and learnability should solve non-uniqueness by selecting one grammar as the correct one. The discussion of the debate between Householder and Chomsky and Halle in Section 3.2.3 shows how these differences result in incommensurability. Arguments that seem decisive to one side are considered irrelevant by the other. This conclusion is also drawn by Matthews in (75), when he introduces the debate.
(75)
a. ‘The participants on both sides are scholars for whom any linguist must have some respect, and it is fair to assume, at the outset, that neither party was talking nonsense.
b. Nevertheless, it is clear that Chomsky and Halle saw little sense in Householder’s criticisms, and had the impression (confirmed to me by a letter from Halle) that their reply would in effect settle the matter.
c. Equally, it is clear from Householder’s brief rejoinder (Householder, 1966) that he in turn had considerable difficulty in relating Chomsky and Halle’s counter-arguments to the case which he had actually put forward.’ [Matthews (1968: 275)]
The nature of incommensurability effects is very well described in (75). As indicated by (75a), they are not caused by a lack of intelligence on either side. Each side of the argument may equally think their arguments are stronger and typically fails to see the relevance of some of the argumentation adduced by the other side. This is stated for each side separately in (75b-c). Incommensurability effects are a strong indication that the two sides are working within different research programmes. An interesting question at this point is whether Post-Bloomfieldian linguistics was in a state of crisis. Newmeyer denies this in (76).
(76)
‘far from being in a state of crisis, post-Bloomfieldian structuralism in 1957 was enjoying a period of unprecedented optimism, in which it was believed that the fundamental questions of linguistic analysis had all been solved.’ [Newmeyer (1986c: 7, fn. 6)]
The claim in (76) is not only problematic for the argument that the emergence of Chomskyan linguistics is a revolution. As seen in (14) of Section 1.3.3, Kuhn considers a crisis as essential for a revolution. The underlying assumption is that without a crisis there is no room for a new research programme of any strength. Scientists do not abandon a successful research programme unless there is a reason to doubt its future promise of continued success. Expanding the domain of an existing research programme is generally more rewarding than developing a new one, because in a new research programme much effort has to be devoted to incorporating results the old one has already achieved. An essential point here is to relativise the notion of crisis. Kuhn does this in (15) and (16) of Section 1.3.3. It is not necessary that the entire community feels a crisis, but at least some members must be aware of the importance of an unsolved problem and ready to consider radical solutions to it. Kuhn gives the example of Thomas Young’s experiments that showed a crucial weakness in the particle theory of light. In the case of Post-Bloomfieldian linguistics, there are good arguments for the view that non-uniqueness was responsible for a crisis. The crisis was not felt by everyone, so that Harris could state in (37) that it was harmless and continue research without being bothered by it. As Hockett’s (45a-b) shows, however, a general and specific method for deriving a grammar from the data remained a desideratum. Moreover, in (46) Hockett concedes the absence of such a method without any suggestion that success would be ‘around the corner’. It is this state of affairs which gave Chomskyan linguistics a chance. Chomskyan linguistics promises a solution to non-uniqueness whereas Post-Bloomfieldian linguistics could at most argue that it was harmless. The promise of solving the non-uniqueness problem may also explain the ambiguous attitude of Post-Bloomfieldian linguists. 
As Newmeyer (1986a: 37f.) shows, Chomsky’s theory had an initial appeal to some of them. This explains, for instance, his invitation to be a keynote speaker at the International Congress of Linguists in 1962 (cf. Section 2.5.2). Bernard Bloch, editor of Language 1940–1965, had a special role in this respect. As Noordegraaf (2000) documents, the review of Syntactic Structures by Lees (1957) was accepted by Bloch just weeks after the publication of the book. Lees was a student at MIT and Voegelin characterises the tone of the review quite well when he calls Lees ‘Chomsky’s explicator’ (1958: 229). The revolutionary character of Chomsky’s work was also recognised by Post-Bloomfieldian linguists at an early stage. Hockett (1965: 185), for instance, calls the publication of Chomsky (1957) one of only four major breakthroughs in linguistics since 1786. 22 If we accept that there was a Chomskyan revolution, this raises the question whether it marked progress and if so, of what kind. Householder notes a negative aspect in (77).
(77)
‘their claims and assertions, if all wholly true, would tend to make all phonological work impossible on any known lines.’ [Householder (1965: 13)]
With the last four words of (77), Householder supports the analysis of a Chomskyan revolution. What (77) highlights is the loss of certain possibilities in a revolution, as noted in Section 1.3.3. The way out of the non-uniqueness problem offered by Chomsky involves abandoning the linear approach to science, in which the foundation of the data and the procedures are central, in favour of the cyclical approach, based on testing hypotheses. It is interesting to see that Householder later recognises this in (78).
(78)
a. ‘Why was such emphasis placed upon discovery and the use of rigorous principles of discovery? Beyond doubt it depended on the basic notion of replicability. If one scientist cannot duplicate the work of another with the same results, doubt may be legitimately cast upon those results. In order for work to be replicable, it must be described in very explicit terms, and the rules to be followed must be agreed upon. All this is clearly based upon the rules for experiments in the natural sciences.
b. Is the analysis of a language, then, analogous to an experiment or series of experiments in physics or chemistry?
c. Or is it rather, as Chomsky seems to maintain, analogous to the formation of a hypothesis?
d. I think we must now agree, at this distance, that Chomsky was right. A linguistic description is a hypothesis, and like hypotheses in other sciences it is of no direct relevance to its truth how it has been arrived at. Once given, it can be tested, and either accepted or rejected on the basis of suitable tests. It is these tests which are analogous to experimentation in the physical sciences, not the procedures used in forming the hypothesis originally.’ [Householder (1971: 137)]
In (78b) and (78c) Householder contrasts two possible parallels between natural sciences and linguistics. The one in (78b) focuses on rigorous and explicit procedures, as elaborated in (78a). In natural sciences, rigour of this kind is not found in theory formation but only in carrying out experiments. It is this analogy that Post-Bloomfieldian linguistics uses to ensure that its method is scientific. At the start of this chapter we considered the analogy with deriving the orbit of a planet from observations of its position. If we approach theory formation in astronomy along the lines of (78a), a number of mathematically equivalent solutions emerge and there is no way to choose among them. The advantage of seeing a theory as a hypothesis as in (78c) is that the empirical cycle allows resorting to a deeper level of explanation. We can distinguish the elliptical orbit from mathematically equivalent solutions to the problem of accounting for the planet’s positions because by assuming an elliptical orbit we can reach a deeper explanation. Newton proposed gravity as such a deeper explanation, because it governs both terrestrial and planetary phenomena by the same laws. In (78d) Householder recognises Chomsky’s role in making linguistics more scientific. Chomsky proposed learnability as a deeper explanation, because it applies to all human languages in the same way. This solution was attractive, but demanded a different view of what the grammar describes. We should not forget, however, that the problem of non-uniqueness, which triggered this progress, was formulated and discussed for the first time in Bloomfieldian and Post-Bloomfieldian linguistics. Although Chomskyan linguistics was a new research programme, the crisis which paved the way for its emergence arose thanks to the diligent work of Post-Bloomfieldian linguists. The emergence of Chomskyan linguistics should then indeed be seen as a revolution, because it proposed a new research programme which gradually replaced the older one. As the debate between Householder and Chomsky demonstrates, the incommensurability between the two research programmes hampered communication, but ultimately it did not prevent the insight of (78).
SUMMARY
• The emergence of Chomskyan linguistics was a revolution because it is based on a different research programme from Post-Bloomfieldian linguistics and it gradually replaced the latter.
• The Chomskyan revolution can be seen as progress because it replaced the emphasis on procedures applied to the set of data by a productive interaction of hypotheses and tests in the empirical cycle.
Notes
1
As mentioned in Section 2.5.2, at this conference Chomsky was one of the keynote speakers, which gave him the opportunity to explain his ideas to a large international audience.
2
It should be noted here that Bloomfield never recovered from a stroke he suffered in 1946. Franz Boas (1858–1942) is considered the founder of American structuralism. Both Bloomfield and Edward Sapir (1884–1939) were his students.
3
In this context it is interesting to see that Fries (1961), who presents an idealised picture of Bloomfield and his approach to linguistics, refers to a ‘number of American linguists’ exemplified by Bloch (1948), Trager and Smith (1951) and Harris (1951), who ‘have been considerably influenced by Bloomfield’ as trying ‘to go beyond him’ in certain respects, noting that ‘No examples of descriptive analyses accomplished on this basis have appeared’ (1961: 216). There is no doubt that this criticism is addressed to the Post-Bloomfieldians, and it is unlikely that they would have agreed with Fries’s evaluation.
4
The most obvious error is Botha’s claim in (14), discussed in Section 3.1.2 below. I regret to say that my representation in ten Hacken (1997a: 291f.), while correcting this specific point, remains far too close to Botha’s representation in other respects, in particular by searching for what corresponds to elements of the research programme of Chomskyan linguistics and setting up the result as a deficient research programme.
5
It is not the only way utterance has been considered in Post-Bloomfieldian linguistics. Thus, according to Bloch ‘the length or inclusiveness of utterances can be ignored. It makes no difference here whether the term ‘utterance’ is taken to cover only the speech activity carried on between two respirations of a speaker, or the total speech activity carried on in the course of a day’ (1948: 7).
6
Modals such as can are ambiguous between an epistemic and a deontic reading. The former refers to a degree of certainty, the latter to a degree of permissibility.
7
It also excludes sign language (cf. Section 2.6.2.2), but there is no reason to interpret this exclusion as deliberate. The insight that sign languages are in fact fully-fledged languages did not emerge in linguistics until the 1960s.
8
Hockett (1958: 321–338) elaborates a system of terms to account for the variation between dialects and languages, but the distinctions this system makes are not part of what Post-Bloomfieldians found essential for the study of language. This system and its place in the historical development of the approach to dialects and individual variation are discussed by ten Hacken (2005).
9
Note that stating that there is a good reason for a choice is not the same as stating that the choice is necessary. It only shows that the choice is rational. An important difference between Post-Bloomfieldian and Chomskyan linguistics, to be discussed in Section 3.2.2.1, is that the latter rejects the analysis in Figure 3.1.
10
Although this is probably a correct rendering of Post-Bloomfieldian linguistics, it is not quite logically correct. What is necessary is that the grammar is sufficiently explicit so that human beings can perform this task. If human beings are considered a tabula rasa as far as language is concerned this is equivalent.
11
Fidditch in (30a) refers to Miss Fidditch, the mythical American school teacher representing the prescriptive tradition in grammar teaching. She was often evoked by the Post-Bloomfieldians as representing the opposite to their own view of language.
12
It is not obvious whether in 1952 Householder should be ranged among the Post-Bloomfieldians. None of the lists of Post-Bloomfieldian linguists given by Hymes and Fought (1981) and by Fought (1995) includes his name. Later, Householder (1965) is highly critical of Chomskyan linguistics (cf. Section 3.2.3), defending what seems to be a Post-Bloomfieldian alternative. According to Hockett ‘Any difference between Householder’s views and my own are subliminal compared to our joint disapproval of the assumptions, techniques, and the manner of Chomskyan ‘phonology’ ’ (1968: 4, fn. 3).
13
There seems to be an implied contrast between (37a) and (35). Whereas in (35) we only find a reference to different sets of elements, (37a) mentions different systems of statements about these elements. However, as the footnote quoted in (31) appears in a context in which Harris mentions ‘the defining of the elements and the stating of the relations among them’ (1951: 9), there is no reason to assume that Harris changed his mind in this respect between the two publications.
14
Harris (1951) uses the term morphology instead of grammar, but otherwise distinguishes the same two levels as Hockett.
15
The status of Pike’s idea in Post-Bloomfieldian linguistics is indicated, for instance, by Householder’s reference to ‘the old Pikean heresy, which merely concerned the advantages of knowing the language when choosing between alternative phonemic analyses’ (1965: 18).
16
‘er zijn weliswaar zekere discontinuïteiten aanwezig in de overgang van het (Amerikaans) structuralisme naar de TGG, maar ook tal van continuïteiten. Bovendien is geconstateerd dat bepaalde nieuwe aspecten van de TGG niet direct zo geprononceerd naar voren kwamen, maar pas in de ontwikkeling ervan. De opkomst van de TGG kan dus niet getypeerd worden als een revolutie in de zin van Kuhn (1970).’ My translation.
17
For the claims in (49)-(51) I give page references to Kaldewaij’s book indicating where he discusses the issue. The formulation and classification of the nine continuities is my own. Kaldewaij (1986: 260–262) classifies some of the points in a different way and his claims are in some cases more cautious than the formulation I discuss here, but in this context a clear exposition of the issues is more important than the exact wording.
18
A more psychological interpretation is suggested by Hockett’s qualification of the result of language acquisition as ‘a state of affairs in his [i.e. the speaker’s] nervous system’ in (33b), but, as (34) emphasises, the psychology involved is behaviourist and considers language as a set of habits, cf. (11).
19
Chomsky also refers to his own earlier work in this footnote. Interestingly, Chomsky (1953: 242) provides the clearest statement of the position criticised in (56). Thus he states that ‘it is first necessary to reconstruct carefully the set of procedures by which the linguist derives the statements of a linguistic grammar from the behaviour of language users’. In fact, his statement that ‘The present paper will be an attempt to formalize a certain part* of the linguist’s generalized syntax language’ (footnote at * deleted) shows that this was also Chomsky’s own goal in 1953.
20
Cf. also the use of votary for researchers working in Chomskyan linguistics in (67c) below, which evokes associations with worshipping rather than with rational inquiry.
21
Chomsky and Halle (1965: 135) construe Householder’s position as incoherent on the basis of what seems to be a conspicuous case of uncooperative reading. They compare Householder’s statements based on the definitions of the levels of adequacy with the ones in the paragraph where he discusses the further explanation and conclude that (57a) is inconsistent with (62b) and with statements on descriptive adequacy as an intelligible goal. However, (57a) is explicitly restricted to ‘As these are expressed here’ which by no means excludes a further elaboration leading to more understanding.
22
In 1786 William Jones delivered an address to a scientific society that is often taken as the start of historical-comparative linguistics. The other ‘breakthroughs’ Hockett mentions are Verner’s Law and Saussure’s Cours de Linguistique Générale.
4
Some modern competitors
Chomskyan linguistics has always been the subject of controversial discussion and criticism from various sides. Not all of the criticism can be thought of as the product of competitors in the field of linguistics. Thus, an important part of it is directed to the psychological and philosophical consequences of Chomsky’s proposals. Some of the chapters in Antony and Hornstein (2003) fall into this category. Critics of this orientation do not qualify as competitors in the sense adopted here, because their work does not result in an alternative theoretical framework for linguistics. If we concentrate on competitors among theoretical approaches to linguistics, different degrees of similarity to the Chomskyan approach can be observed. A very basic level of disagreement is the opposition discussed by Newmeyer (1998). He considers Chomskyan linguistics, or more precisely GB-theory, as an example of the ‘formalist (structuralist, generativist) orientation’ (1998: 7), which is opposed to the functionalist orientation. Newmeyer’s discussion does not concentrate on research programmes and this is not accidental. As his presentation of the variety of generativist and functionalist approaches suggests, the level of disagreement within these two broad types of approach is too big to suppose that there should be a single research programme on each side (1998: 11–18). Moreover, the disagreement between the two broad approaches is so profound that a detailed specification of the research programmes involved is not necessary to highlight it. While Newmeyer’s method is adequate for the opposition between formalist and functionalist linguistics, it tends to obscure the distinctions within these approaches. 
He analyses the state of generative linguistics as marked by ‘two trends’, one of which ‘associated with the work of Chomsky and his associates’, the other one consisting of ‘a dozen or more named theories’ (1998: 11) and concludes that ‘For most of the issues that concern us in this book the differences between P&P and its rivals are unimportant’ (1998: 12). As opposed to Newmeyer’s goal of elucidating the opposition between generative and functionalist approaches to linguistics, here the purpose is a more precise analysis of the distinctions within generative linguistics. This analysis concentrates on the differences in the underlying research programmes and the consequences of these differences. The method adopted here involves a relatively detailed analysis of a small number of approaches or theories. Their choice is determined in part by the importance of the theories in terms of the size of the research communities and their influence in the field, in part also by the issues they raise for the discussion of research programmes. Lexical-Functional Grammar (LFG) and Head-Driven Phrase Structure Grammar (HPSG) have a relatively long tradition and a community of researchers large enough to publish introductory textbooks and to organise regular specialised conferences. 1 They are discussed in Sections 4.1 and 4.3, respectively. The inclusion of Generalised Phrase Structure Grammar (GPSG), discussed in Section 4.2, is motivated on the one hand by its influence on HPSG, which is in some respects its successor, on the other hand by the issues of formalisation it raises. Finally, Ray Jackendoff’s work is discussed in Section 4.4. It is interesting because it is very close to Chomskyan linguistics but takes a different position on some issues that have recently gained prominence.
4.1
Lexical-Functional Grammar
The publication of Bresnan (1982) can be seen as the foundational event of Lexical-Functional Grammar (LFG). This 900-page volume presents analyses of a number of syntactic phenomena in English and other languages, including Russian, Icelandic and Malayalam, as well as the LFG-formalism and discussions of language processing and acquisition. With single-authored contributions accounting for a quarter of the page length of this book and jointly authored contributions for another quarter, Joan Bresnan clearly emerges as the central figure. Some of the key chapters, in particular the extensive introduction defending the general approach and the chapter on the LFG-formalism, are co-authored by Bresnan and Ronald M. Kaplan. According to Falk, ‘Bresnan and Kaplan are the key players in the LFG world’ (2001: 3). The origin of LFG was a sense of dissatisfaction with certain aspects of Chomskyan linguistics as it presented itself in the 1970s. Even twenty years later, it seems not quite settled how the distinction between LFG and Chomskyan linguistics should be assessed. In (1) and (2), different and apparently conflicting views on this issue are given from within LFG.
(1)
a. ‘Towards the end of the twentieth century, new formal ideas began to achieve prominence in linguistic theory, […]
b. These newer theories are compatible with different linguistic epistemologies drawing on structuralist and functional/typological ideas which have both predated and coexisted with generative grammar.
c. One such theory is lexical-functional grammar (LFG) (Kaplan and Bresnan 1982).’ [Bresnan (2001: 3–4)]
(2)
a. ‘LFG rejects the assumptions of transformational theory, not its goals.
b. The basic argument for the LFG approach to syntax is simply that certain transformationalist assumptions are incompatible with the search for a theory of Universal Grammar.
c. LFG is therefore a variety of generative grammar, an alternative to transformational theory.’ [Falk (2001: 2)]
The assessment in (1) is taken from the introduction to Bresnan’s (2001) overview of the state of the art in LFG. It follows a paragraph about Chomsky’s theory, to which the ‘new’ in (1a) forms a contrast. After (1a), a number of these ‘formal ideas’ are listed. The theories embracing these ideas, among which (1c) explicitly includes LFG, are associated with what (1b) calls ‘different linguistic epistemologies’. This suggests that LFG does not have the same research programme as Chomskyan linguistics. The assessment in (2) is taken from the introduction to Falk’s (2001) introductory textbook. It also follows a discussion of Chomsky’s model of syntax, but it emphasises the common ground with Chomskyan linguistics, referred to here as ‘transformational theory’. Whereas (1b) opposes the theories of a class in which (1c) includes LFG to ‘generative grammar’, (2c) states that LFG is ‘a variety of generative grammar’. In (2a) a distinction is made between ‘goals’ that are shared and ‘assumptions’ that are not shared with Chomsky’s theory. This suggests that LFG might have the same research programme as Chomskyan linguistics. In this perspective, however, (2b) is puzzling. It suggests that Chomskyan linguistics is not able to set up a proper research programme. As Chapter 2 demonstrated how the research programme of Chomskyan linguistics can be analysed coherently, this suggests that the goals of ‘the search for a theory of Universal Grammar’ may not be quite identical for the two. In order to investigate the nature of the opposition, we will first consider the origins of LFG. After an analysis of the sense of crisis that gave rise to LFG in Section 4.1.1, Section 4.1.2 explores the research programme adopted by LFG. This is followed in Section 4.1.3 by a discussion of the Chomskyan reaction to the crisis identified in Section 4.1.1. Then a comparison of the two models is given in Section 4.1.4. Finally, an analysis of some theoretical discussions between the two approaches in Section 4.1.5 will demonstrate the impact of these differences and similarities.
4.1.1
The crisis: psychological reality
Evidence for a crisis at the origin of the emergence of LFG is documented especially in Bresnan (1978). The starting point is the quotation from Chomsky in (3).
(3)
‘a reasonable model of language use will incorporate, as a basic component, the generative grammar that expresses the speaker-hearer’s knowledge of the language.’ [Chomsky (1965: 9)]
Before considering the original context of (3) and Chomsky’s most likely intention in formulating it (cf. Section 4.1.3 below), let us see how (3) has been interpreted and used by the group of researchers from which LFG originated. Bresnan (1978) sketches the status of (3) in (4).
(4)
a. ‘This assumption is fundamental in that it defines basic research objectives for transformational grammar:
b. to characterize the grammar that is to represent the language user’s knowledge of language, and
c. to specify the relation between the grammar and the model of language use into which the grammar is to be incorporated as a basic component.’ [Bresnan (1978: 1)]
In Bresnan (1978), (4) immediately follows a version of (3) with an initial capital. As (4a) states, she considers (3) not as a statement about the study of language use, but as a statement about grammars. The difference between these two perspectives can be formulated as follows. In the former case (3) means that if you do research on language use, you should refer to the grammar as formulated in your theory. In the latter case (3) means that in order to find the right grammar, you must ensure that it is the one actually used by the speaker in their language use. Taken as a statement about the study of language use (3) implies neither (4b) nor (4c) as immediate research objectives. Although there might be other reasons why they should be desirable or even essential research objectives, this status does not follow from (3) unless (3) is taken as a fundamental assumption about the study of grammar, as (4a) does. In their introduction to Bresnan (1982), Bresnan and Kaplan also quote (3) and call it ‘Chomsky’s Competence Hypothesis’, said to express a ‘longstanding hope of research in theoretical linguistics’ (1982: xvii).
The position of (4) in Bresnan (1978) is indicative of its perceived status. She uses it in her introduction as a background to her argument, a neutral rendering of common knowledge about the state of the field. In fact, from a very early date, psycholinguistic experiments had been carried out that take (4) as their basis. In their review of this experimental tradition, Fodor et al. (1974) refer to the underlying hypothesis as in (5).
(5)
‘A number of early psycholinguistic studies in generative grammar appear to have been motivated precisely by the hypothesis that the complexity of a sentence is measured by the number of grammatical rules employed in its derivation. We shall call this hypothesis the derivational theory of complexity (DTC).’ [Fodor et al. (1974: 320)]
The DTC as defined in (5) is a specific hypothesis on the relation between the grammar and the model of language use referred to in (4c). It was tested in particular by a group under the direction of George A. Miller at the Harvard Center for Cognitive Studies. An early report is contained in Miller’s presidential address to the Eastern Psychological Association (1962: 756–759). The framework adopted was the one of Chomsky (1957). In this framework, corresponding active and passive sentences and corresponding affirmative and negative sentences share the same deep structure. The DTC implies that processing is slowed down by additional rules such as passive and negative transformations. As Fodor et al. (1974: 321) describe, early results seemed to confirm this. However, problems for the DTC were soon noted, as in (6).
(6)
a. ‘Kernels were easiest and passive negative hardest, as one would expect from the transformational theory in question and from experiments discussed above.
b. But the passive, which is grammatically more complex than the negative, consistently required less time to evaluate than the negative.
c. Thus the semantic variable of affirmation-negation seemed to be more important than the grammatical variable of transformational complexity in this case.’ [Slobin (1966: 220)]
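The mismatch reported in (6) can be made concrete in a short sketch. The Python fragment below is only an illustration under stated assumptions: the numerical costs assigned to the passive and negative transformations are hypothetical, chosen solely to respect the statement in (6b) that the passive is grammatically more complex than the negative, and the reported order encodes no more than what (6a) and (6b) say. The names COST, dtc_predicted_order and inverted_pairs are introduced here for the purpose of the example and do not stem from the literature.

```python
# Hypothetical costs per optional transformation (assumption for illustration).
# The only constraint taken from (6b) is: passive is more complex than negative.
COST = {"passive": 2, "negative": 1}

# Optional transformations applied in deriving each sentence type.
SENTENCE_TYPES = {
    "kernel": [],
    "negative": ["negative"],
    "passive": ["passive"],
    "passive negative": ["passive", "negative"],
}


def dtc_complexity(transformations):
    """Derivational complexity: summed cost of the transformations applied."""
    return sum(COST[t] for t in transformations)


def dtc_predicted_order():
    """Sentence types from easiest to hardest, as the DTC would predict."""
    return sorted(SENTENCE_TYPES, key=lambda s: dtc_complexity(SENTENCE_TYPES[s]))


# Order of difficulty as stated in (6a) and (6b), easiest first.
REPORTED_ORDER = ["kernel", "passive", "negative", "passive negative"]


def inverted_pairs(predicted, reported):
    """Pairs (a, b) predicted as 'a easier than b' but reported the other way."""
    rank = {s: i for i, s in enumerate(reported)}
    return [
        (a, b)
        for i, a in enumerate(predicted)
        for b in predicted[i + 1:]
        if rank[a] > rank[b]
    ]


if __name__ == "__main__":
    predicted = dtc_predicted_order()
    print("DTC prediction:", predicted)
    print("Reported in (6):", REPORTED_ORDER)
    print("Inverted pairs:", inverted_pairs(predicted, REPORTED_ORDER))
```

Under these assumptions, the only pair on which prediction and report disagree is negative versus passive, which is the point made in (6b). On a pure rule count, with both costs set to 1, the DTC would predict no difference between passive and negative, which the result in (6b) does not support either.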
The ‘kernels’ in (6a) are the sentences whose surface structure is derived by only carrying out obligatory transformations, i.e. active affirmative sentences. If the measure for the DTC takes into account not only the number but also the complexity of transformations, which seems a reasonable assumption, (6b) is surprising. Therefore, Slobin considers an alternative explanation in (6c). In order to test it, he introduced an additional variable in the experiment, contrasting sentence pairs such as (7). 2
(7)
a. The dog is chasing the cat.
b. The girl is watering the flowers.
Sentences such as (7a) are reversible, in the sense that the roles of the two participants in the action can be reversed without loss of normality. Sentences such as (7b) are nonreversible. Reversibility is a semantic property that has no direct relation to grammatical rules. The experiments showed that reversibility plays an important role in sentence processing. This suggests that the influence of negation should also be attributed at least in part to its semantic rather than grammatical properties. In their overview, Fodor et al. (1974: 319–328) show that various types of experiments failed to confirm the DTC. Slobin’s (1966) results are typical in this respect. It should be noted also that in the early 1960s some assumptions about the grammar changed. Thus, kernel sentences were abolished (cf. Section 2.5.2). The resulting situation is described by Bresnan in (8).
(8)
a. ‘In their review of this literature, Fodor, Bever, and Garrett (1974) conclude
b. that the experimental evidence tends to support the psychological reality of grammatical structures, but
c. that the evidence does not consistently support the reality of grammatical transformations as analogues of mental operations in speech perception and production.’ [Bresnan (1978: 2)]
The conclusion in (8b) presumably refers to the click experiments discussed in Section 2.2.3. The one in (8c) summarises the DTC-based experiments. The term psychological reality in (8b) refers to the existence of a straightforward, detailed correlation between components of the grammar (e.g. individual rules) and properties of human language processing as evidenced by psycholinguistic experiments. It is a direct rendering of the research goal in (4c). The summary by Bresnan and Kaplan (1982) in (9) expresses the situation in an even more dramatic tone than (8). (9)
‘despite intensive efforts by psycholinguists, it remains true that generative-transformational grammars have not yet been successfully incorporated in psychologically realistic models of language use (Fodor, Bever, and Garrett 1974).’ [Bresnan and Kaplan (1982: xvii)]
The statement in (9) mentions the main elements typical of a crisis of the type Kuhn (1970a) describes (cf. Section 1.3.3). The theoretical basis for the crisis consists of two elements, ‘generative-transformational grammars’ and the competence hypothesis, which implies that a grammar should be ‘incorporated in psychologically realistic models of language use’. The factual basis for the
crisis consists of the ‘intensive efforts’ and the negative results they yielded. Assuming that the factual basis cannot readily be changed, there are two possible ways out of the crisis, listed in (10). (10)
a. Give up generative-transformational grammars. b. Give up the competence hypothesis.
For the researchers at the basis of LFG, (10b) was not an option. Therefore, when the options were narrowed down to the two in (10), they opted for (10a). Of course (10a) only makes sense in the presence of a new grammatical formalism that replaces generative-transformational grammars and is compatible with the competence hypothesis. This triggered an intensive search for an adequate alternative formalism. Bresnan (1978: 50–58) discusses the use of Augmented Transition Networks (ATNs), a formalism originally developed by Woods (1970) for the processing of natural language on a computer. Although there are various interesting features in this formalism, she concludes that it is not adequate. Kaplan and Bresnan (1982) then present the new formalism, LFG, as the endpoint of this search.
4.1.2 A new research programme
In presenting the background of LFG, Bresnan and Kaplan (1982) describe a research programme in which it is intended to work. This description sometimes refers to Chomsky’s work, but it can be read independently. It is therefore possible to set up a model of the LFG research programme without immediately comparing it to the research programme of Chomskyan linguistics. Although it is useful to compare individual concepts to the ones introduced in Chapter 2, it would be wrong to start by looking for the LFG equivalent of, for instance, each component of Figure 2.10. Bresnan and Kaplan recognise two levels of description, at both of which the terms grammar and theory are used. They describe these levels in (11). (11)
a.
‘On the lower level of description, we speak of the grammar of a particular language such as Navajo. At this level, a grammar is a set of rules within a formal system. The grammar generates the language it is a grammar of […] b. On the higher level of description, we speak of a theory of grammars. This is a set of primitives, axioms, and rules of inference (often unformalized) that characterizes the class of possible grammars of particular languages. A theory of grammar is sometimes referred to as a Universal Grammar.’ [Bresnan and Kaplan (1982: xvii)]
In interpreting (11), it should be kept in mind that it is meant to clarify the different levels in a very general way. The fact that we can ‘speak of the grammar of a particular language’ in (11a) does not mean that the grammar does nothing more than characterise, in the terminology of Chomsky (1986a) and Section 2.1.3, an E-language. We should rather say that, as a side effect of more interesting properties it has, a grammar of Navajo also describes Navajo as an E-language. Analogously, the theory of grammars also characterises the class of possible human languages. A further specification of the conditions on a proper individual grammar and on a proper theory of grammars is given in (12). (12)
a.
‘A grammar of Navajo, in that it provides specific rules for the construction of Navajo sentences, represents the kind of knowledge of that language that one must have to speak it. It is such grammars, grammars on the lower level, that we assume will represent the stored knowledge in competence-based models of linguistic performance. b. Grammar on the higher level, the Universal Grammar that is a theory of grammars, is not necessarily represented in such models in the same way. For example, principles of Universal Grammar might characterize aspects of the structure of the language-using device.’ [Bresnan and Kaplan (1982: xviii)]
In (12a), the object described by a grammar is characterised as ‘the stored knowledge’ of a particular language ‘that one must have to speak it’. Therefore, a grammar describes competence, (12a), as well as the derived E-language, (11a). In (12b) Bresnan and Kaplan propose that UG describes certain aspects of the language-using device. It is interesting to compare the wording of (12a) with (12b). Whereas (12a) makes a general claim, (12b) starts with a negative and then continues with an example. This suggests that Bresnan and Kaplan claim a more general validity for (12a) than for (12b). In (12a) they define competence-based approaches to language, whereas in (12b) they only sketch an approach to (11b) as an example. As a particular way of characterising the class of possible languages they propose the restriction to languages ‘the language-using device’ can deal with. The description of the relation between the grammar and the knowledge of a language in (12a) is rather vague; it is only stated that the grammar ‘represents’ the knowledge. Elaborating on their interpretation of (3) as the Competence Hypothesis, Bresnan and Kaplan propose a more restrictive relation between the two in (13). (13)
a.
‘we assume that there is a competence grammar that represents native speakers’ tacit knowledge of their language.
b. Next, suppose that we are given an information-processing model of language use that includes a processor and a component of stored linguistic knowledge K. […] We call the subpart of K that prescribes representational operations the representational basis of the processing model. (The representational basis is the “internal grammar” of the model.) […] c. we do require that every rule of the representational basis be interpreted in a model of some behavior; thus, the internal grammar cannot contain completely otiose rules. d. We can now say that a model satisfies the strong competence hypothesis if and only if its representational basis is isomorphic to the competence grammar.’ [Bresnan and Kaplan (1982: xxxi)]
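The conditions in (13c) and (13d) can be made concrete in a minimal Python sketch. It rests on strong simplifying assumptions: both the representational basis and the competence grammar are treated as plain sets of rule names, and the isomorphism is reduced to a one-to-one correspondence between them. The rule names, the `evidence` table and the function name `satisfies_strong_competence` are hypothetical illustrations and are not taken from Bresnan and Kaplan (1982).

```python
def satisfies_strong_competence(basis_rules, competence_rules, correspondence, evidence):
    """Check a toy version of (13c) and (13d).

    basis_rules      -- rule names in the representational basis
    competence_rules -- rule names in the competence grammar
    correspondence   -- dict mapping each basis rule to a competence rule
    evidence         -- dict mapping a basis rule to the behaviours in which
                        it is interpreted (hypothetical labels)
    """
    # (13d), simplified: the correspondence is defined for every basis rule,
    # is one-to-one, and covers every rule of the competence grammar.
    total = set(correspondence) == set(basis_rules)
    images = list(correspondence.values())
    one_to_one = len(images) == len(set(images))
    onto = set(images) == set(competence_rules)

    # (13c): no otiose rules, i.e. each basis rule is interpreted
    # in a model of at least one behaviour.
    no_otiose = all(evidence.get(rule) for rule in basis_rules)

    return total and one_to_one and onto and no_otiose


basis = {"R1", "R2"}
competence = {"R1'", "R2'"}
mapping = {"R1": "R1'", "R2": "R2'"}

# Both rules are backed by some behaviour: the toy model passes.
print(satisfies_strong_competence(
    basis, competence, mapping,
    {"R1": ["judgements"], "R2": ["reading times"]}))

# R2 has no behavioural interpretation, so it counts as otiose: the model fails.
print(satisfies_strong_competence(
    basis, competence, mapping, {"R1": ["judgements"]}))
```

In this sketch the second call fails only because of (13c): the correspondence is still one-to-one, but one rule has no observable effect.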
The Strong Competence Hypothesis is defined in (13d). It requires an isomorphism between the ‘representational basis’ and the ‘competence grammar’. These terms are explained in (13b) and (13a), respectively. They are coined to avoid the ambiguity of the term grammar between the description by the linguist and the object of description in the speaker’s mind. From (13a) we can conclude that competence grammar refers to the same idea as Chomsky’s term I-language. From (13b) we can deduce that the representational basis is the description of the competence grammar in the model developed by the linguist. The isomorphism condition in (13d) means that every element in the description (i.e. the linguist’s grammar) should correspond to an element in the object described (i.e. the speaker’s competence). This has the implication noted in (13c). If a rule R is proposed as part of the linguist’s grammar, R must correspond to a rule R’ in the competence grammar of the speaker and it must be possible to find empirical evidence for R’. Without (13c), one could propose any rule and claim that the corresponding rule exists but does not have any observable effects, which would make (13d) vacuous. The central problem to be solved by a grammar, both the competence grammar and its description by the linguist, is the syntactic mapping problem, defined in (14). (14)
a.
‘The syntactic mapping problem is the problem of computing, for any human language, the grammatical relations of any string of words of that language.’ [Bresnan and Kaplan (1982: xxxviii)] b. ‘Recall that the term grammatical relations is used here in a theory-neutral way to refer to the associations between the surface constituents and the semantic predicate argument structure of a sentence.’ [Bresnan and Kaplan (1982: lii, fn. 5)]
The definition in (14a) is followed by a footnote, the start of which is given in (14b). The problem in (14a) is considered as the core of human language processing. From experience it is obvious that there must be a solution, because we manage to formulate our thoughts and to understand linguistic expressions. Moreover, the solution meets a number of additional constraints. Three of them are introduced in (15). (15)
a.
‘The essential properties of all generative grammars reflect certain theoretical constraints on the set of possible processes that compute solutions to the syntactic mapping problem. b. The constraints are creativity (the domain and range of the mapping are theoretically infinite), c. finite capacity (there is only a finite capacity for the knowledge representations used in the mapping), d. and, though not all generative grammars have turned out to satisfy this constraint, reliability (the mapping provides an effectively computable characteristic function for each natural language).’ [Bresnan and Kaplan (1982: xxxviii-xxxix)]
The constraints in (15b-d) are identified in (15a) as essential for a grammar to be a generative grammar. Among these, creativity in (15b) is a property that corresponds to Hockett’s productivity constraint in (45d) of Section 3.1.4. 3 It also corresponds to Chomsky’s leading observation in (3) of Section 2.1. The consideration in (15c) only comes up in a mentalist view of language. Chomsky uses it to motivate the need for recursion. Bresnan and Kaplan derive the implication in (16). (16)
‘The finite capacity constraint implies that this computation must decompose into a finite grammar G and a recursive procedure mG for projecting G onto the infinitely many sentences of a language.’ [Bresnan and Kaplan (1982: xl)]
In (16), the property of recursion is not primarily assigned to the grammar, but to the procedure for the application of the grammar. The division of the descriptive mechanism into a grammar G and a procedure mG is an important assumption of the research programme of LFG. The issue of reliability in (15d) is more controversial than the other two constraints. Whereas Bresnan and Kaplan devote just over one page to the discussion of (15b-c), the discussion of (15d) takes up almost five pages. The ‘characteristic function’ in (15d) refers to the function that takes as input a string of words and yields 1 or 0 depending on whether the sentence does or does not belong to the language. A function is ‘effectively computable’ if it is possible to formulate an algorithmic procedure for it. The intuitive motivation for (15d) is given in (17).
(17)
‘It is plausible to suppose that the ideal speaker can decide grammaticality by evaluating whether a candidate string is assigned (well-formed) grammatical relations or not.’ [Bresnan and Kaplan (1982: xl)]
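As an illustration of the decomposition in (16) and of the reliability constraint in (15d), the following Python sketch pairs a finite grammar G with a recursive procedure and returns 1 or 0 for any string of words. Everything in it is an assumption made for illustration: the toy context-free rules, the vocabulary, and the function name `characteristic`. It is not the LFG mapping, which computes grammatical relations rather than mere membership; the sketch only shows how a finite rule set plus a recursive procedure can cover an unbounded set of sentences while remaining effectively computable.

```python
from functools import lru_cache

# Hypothetical toy grammar G: finitely many rules. Binary rules are pairs of
# category names; lexical rules are plain words. NP -> NP PP is recursive,
# so the set of sentences covered is unbounded.
G = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "VP": [("V", "NP")],
    "PP": [("P", "NP")],
    "Det": ["the"],
    "N": ["dog", "cat"],
    "V": ["chases"],
    "P": ["near"],
}


def characteristic(grammar, words, start="S"):
    """Return 1 if `words` can be derived from `start` in `grammar`, else 0."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # Does `symbol` derive exactly the words from position i up to j?
        for rhs in grammar.get(symbol, []):
            if isinstance(rhs, str):
                # Lexical rule: matches a single word.
                if j - i == 1 and words[i] == rhs:
                    return True
            else:
                # Binary rule: try every split into two non-empty parts.
                left, right = rhs
                if any(derives(left, i, k) and derives(right, k, j)
                       for k in range(i + 1, j)):
                    return True
        return False

    return 1 if words and derives(start, 0, len(words)) else 0


print(characteristic(G, "the dog chases the cat".split()))
print(characteristic(G, "the dog chases the cat near the dog".split()))
print(characteristic(G, "dog the chases".split()))
```

Because every recursive call works on a strictly shorter stretch of words, the procedure terminates on any input, which is the sense of 'effectively computable' at issue here. The grammar is passed in as a parameter, so the rules can be replaced without touching the procedure.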
The ‘ideal speaker’ in (17) is not affected by ‘the actual performance limitations of real language users’ (1982: xl), in a way similar to the Chomskyan idealisations discussed in Section 2.3.2. Apart from the three constraints in (15), which are ‘familiar from early work in generative grammar’ (1982: xliv), Bresnan and Kaplan impose two further constraints, formulated as in (18). (18)
a.
‘order-free composition, requiring that the grammatical relations that the mapping derives from an arbitrary segment of a sentence be directly included in the grammatical relations that the mapping derives from the entire sentence, independently of operations on prior or subsequent segments, and b. universality, requiring that the mapping incorporate a universal procedure for constructing representations of grammatical relations.’ [Bresnan and Kaplan (1982: xliv)]
The constraint in (18a) means that the internal analysis of a segment is not determined by the material preceding and following it. It is motivated by ‘the fact that complete representations of local grammatical relations are effortlessly, fluently, and reliably constructed for arbitrary segments of sentences’ (1982: xlv). There is an obvious problem with this constraint as illustrated by (19). (19)
a. … Holger wants to meet Ilse. b. The man who is speaking to Holger wants to meet Ilse. c. You have to know what Holger wants to meet Ilse.
In the sentence fragment (19a), Holger is naturally interpreted as the subject of want and the VP to meet Ilse as the object of want. In the sentence (19b), Holger is the object of speaking to. In (19c), the VP is a purpose clause not depending on want. Bresnan and Kaplan consider cases like these and provide (20) as their solution. (20)
a.
‘the order-free composition constraint asserts that sentential context may determine the choice of one of a set of locally computed grammatical relations for a segment, b. but the computation of grammatical relations for a segment may not involve the computation of the grammatical relations of the context.’ [Bresnan and Kaplan (1982: xlvii)]
According to (20a), if we change the perspective from a fragment to the full sentence, what happens is disambiguation within the set of possible interpretations. For a fragment such as (19a) this means that it may but need not be a constituent. Instead, it may be a sequence of constituents or fragments of constituents. In (19b), a major constituent boundary follows Holger, in (19c), such a boundary follows wants. In both cases, this means that the fragment in (19a) is not a single constituent. In (20b) it is allowed that we have to disambiguate constituent boundaries on the basis of the rest of the sentence, but once these boundaries are fixed, we can determine grammatical relations within a constituent without referring to outside material. The universality constraint in (18b) can be formally expressed as in (21). (21)
‘there is a universal mU such that for any G, mG = mU.’ [Bresnan and Kaplan (1982: xlvii)]
In the formulation of (21), mG is used in the sense introduced in (16) above, i.e. as the procedure for applying grammar G. What (21) states is that there is only one such procedure. The motivation for constraint (21) is derived from language acquisition. Bresnan and Kaplan assume that language acquisition proceeds by the evaluation of hypothesised grammars. They formulate the link between this assumption and (21) in (22). (22)
a.
‘To test a hypothesized grammar G*, there must be some universal effective procedure for constructing the mental representations r of lexical strings s given G*; call it mU. b. While it is conceivable that this universal procedure is different from the one that the language learner normally uses in comprehending language (mG), the simplest, strongest, and most plausible assumption is that the procedures are the same.’ [Bresnan and Kaplan (1982: xlvii)]
The idea of (22) is not to provide a logical proof for (21), but to make it plausible. The link between (22a) and (22b) is obvious. If we have a universal procedure, we might as well use it universally. The contentious claim is rather in (22a). If we accept the reliability constraint in (15d), there has to be an mG*, otherwise we would not be able to use, let alone evaluate, G*. The problem of finding such a procedure becomes trivial when we assume that there is a universal, genetically determined solution to it. Combining the information collected in the discussion of Bresnan and Kaplan (1982), we can construct the model of Figure 4.1 for the research programme of LFG.
Figure 4.1: The research programme of LFG [diagram not reproduced; box labels: interpretation procedure, competence grammar (together: competence); Universal Grammar, representational basis (together: grammar); observable facts; observations. Arrow labels: describes, isomorph, test, describe, explains.]
In accordance with (12) and (16), Figure 4.1 represents both the grammar and the competence it describes as divided into two sections, one universal and the other language-specific. The terms competence grammar and representational basis are taken from (13a-b). The label isomorph renders the Strong Competence Hypothesis in (13d). The suggestion in (12b) that the Universal Grammar describes the ‘language-using device’ is combined with the universality constraint in (18b) and (21) to yield the top part of Figure 4.1. Here interpretation procedure stands for mU, the universal, genetically determined procedure for applying competence grammars to sentences. The labels competence and grammar in Figure 4.1 apply to the entire boxes. They do not introduce new elements but refer to the combination of interpretation procedure and competence grammar and to the combination of UG and representational basis, respectively. The reason for including these boxes is that it is the entire competence and grammar rather than their components that interact with the data. About the nature of the data, Bresnan and Kaplan make the statement in (23). (23)
‘Since we cannot directly observe this “internal grammar”, we must infer its properties indirectly from the evidence available to us (such as linguistic judgments, performance of verbal tasks in controlled experimental conditions, observation of the linguistic development of children, and the like).’ [Bresnan and Kaplan (1982: xxiii-xxiv)]
In the same way as Chomsky’s (20) in Section 2.2, (23) implies the use of any relevant data.
4.1.3 The ‘Competence Hypothesis’ in Chomskyan linguistics
One of the questions raised by the preceding discussion is why the Competence Hypothesis in (3), though included in one of Chomsky’s foundational works, has not played a major role in the development of Chomskyan linguistics. In order to understand this, we have to consider (3) in a slightly wider context, given in (24). (24)
a.
‘When we say that a sentence has a certain derivation with respect to a particular generative grammar, we say nothing about how the speaker or hearer might proceed, in some practical or efficient way, to construct such a derivation. b. These questions belong to the theory of language use – the theory of performance. c. No doubt, a reasonable model of language use will incorporate, as a basic component, the generative grammar that expresses the speaker-hearer’s knowledge of the language; d. but this generative grammar does not, in itself, prescribe the character or functioning of a perceptual model or a model of speech production.’ [Chomsky (1965: 9)]
The quotation in (3), which Bresnan and Kaplan (1982: xvii) call ‘Chomsky’s Competence Hypothesis’, is included in (24c). There is an overlap between (24) and (44) in Section 2.3.1, in that (44c) = (24a). The first observation to be made about (24) is that there is no sense of a prominent hypothesis. Although (3) is included in (24c) it is by no means the centre of the argument. It is slightly misleading when Bresnan (1978: 1) quotes (3) as an independent sentence, with an initial capital and a final period. The second observation is that the context of (24) makes it clear that the fragment Bresnan singles out in (3) is not intended as a statement about grammars. In the discussion of (4), we noted a certain ambiguity as to whether (3) is a statement about the study of language use or about the nature of grammars. In the context of (24) only the former can be maintained. In (4), however, Bresnan takes it to be a statement about grammars. A bit further down in her argument, she gives (3) with some context, the start of (44a) in Section 2.3.1, ‘a generative grammar is not a model for a speaker or a hearer’, followed by three dots and (24c-d). Her comment is that ‘we find that it is couched within an admonition’ (1978: 1). It is in particular (24a-b), however, that exclude the interpretation of (24c) as a constraint on how to formulate a grammar. Instead of the interpretation given by Bresnan, (24) can be interpreted as follows, in line with the research programme of Chomskyan linguistics. By formulating a grammar which results in a particular derivation of a sentence,
‘we say nothing’ about the way this sentence is processed by the speaker, (24a), a question that belongs to a separate area of research, (24b), which will probably want to use the results obtained in the study of grammar, (24c), without being determined in any way by it, (24d). This explains why the crisis giving rise to LFG was not perceived as such in Chomskyan linguistics. All of this does not say that the Competence Hypothesis is wrong. What it states is that it is not Chomsky’s (1965) hypothesis. While it may be inspired by Chomsky’s work and is formulated in Chomsky’s words, it is Bresnan’s hypothesis.
4.1.4 Comparison of the models of LFG and Chomskyan linguistics
If we compare the research programme of LFG as represented in Figure 4.1 to the research programme of Chomskyan linguistics, it is first of all striking how similar the two are. A good starting point for the comparison is the model in Figure 2.3, which only represents grammar and observations. A central problem with the model in Figure 2.3 is that there are too many different grammars compatible with the observed data so that it is not possible to determine which grammar describes the actual competence. This is the problem of indeterminacy discussed in Section 2.3.3. Chomskyan linguistics and LFG propose different solutions to this problem. In the context of Chomskyan linguistics, the three questions in (53) of Section 2.4 are raised, concerning the nature, acquisition, and use of language. These questions are listed by Chomsky (1981b) and repeated by Chomsky (1986a) as the fundamental questions guiding the study of language. Earlier, Chomsky and Miller (1963) formulate them as in (25). (25)
a.
‘The fundamental fact that must be faced in any investigation of language and linguistic behavior is the following: a native speaker of a language has the ability to comprehend an immense number of sentences that he has never previously heard and to produce, on the appropriate occasion, novel utterances that are similarly understandable to other native speakers. b. The basic questions that must be asked are the following: 1 What is the precise nature of this ability? 2 How is it put to use? 3 How does it arise in the individual?’ [Chomsky and Miller (1963: 271), original numbering]
The ‘ability’ in (25b) is explained in (25a) in terms very similar to Chomsky’s (1964) (3) in Section 2.1. The three questions in (25b) correspond one-by-one to the three questions in (52) of Section 2.4. Still there are two remarkable
differences. First, in (52) we find ‘knowledge’ instead of ‘ability’ in (25). In the same way, ‘has the ability to’ in (25a) corresponds to ‘can’ in (3) of Section 2.1. Second, the last two questions in (25b) are in reverse order compared to (53). The distinction between knowledge and the ability to use it was discussed in Section 2.1.1. In that section, the explanation of competence and performance in (4), from Chomsky (1966a), associates competence with knowledge instead of ability and in (6) Chomsky (1980a) explicitly opposes competence to the ability to use it. Chomsky and Miller (1963) do not seem to be aware of this distinction when formulating (25). The relevance of the order of questions in (25b) depends on the extent to which the second question is given priority over the third. The issue was discussed in Section 2.4.1. In (57) of that section, Katz (1964) proposes the same order as in (25b). He argues that each question is ‘logically prior’ to the ones following it on the list. Chomsky and Miller make the observations in (26). (26)
a.
‘Our second question calls for an attempt to give a formal characterization of, or model for, the users of natural languages. […] b. Our third question is no less important than the first two, yet far less progress has been made in formulating it in such a way as to support any abstract investigation. c. What goes on as a child begins to talk is still beyond the scope of our mathematical models. We can only mention the genetic issue and regret its relative neglect in the following pages.’ [Chomsky and Miller (1963: 272)]
The interpretation of the second question in (26a) is compatible with Bresnan’s interpretation of (3) as a Competence Hypothesis. However, it does not mention the relation of the research this question would trigger to the description of competence. Therefore, (26a) is equally compatible with the intended interpretation of (3) as discussed in Section 4.1.3. The third question is said in (26b) to be ‘no less important than the first two’. This is a far weaker statement than the conclusion reached in Section 2.4.1. In (59) of that section, Chomsky (1975a) concludes that the question of language acquisition is a well-formed problem, whereas the question of language use is a mystery which eludes our understanding. The use of ‘still’ in (26c) is fully compatible with such a view, but it might also be seen as inviting an inferred contrast with the study of language use, which in (26b) is implied to have made more progress. It is not quite obvious what to make of the apparent opposition between (26) and (24) in the role of language use and language acquisition. As demonstrated
in Section 2.5.2, the importance of the question of language acquisition for the model of Chomskyan linguistics emerges earlier. In (91b) of that section, Chomsky (1962a) explicitly links the general linguistic theory to the language acquisition device. Chomsky and Miller (1963: 275–277) summarise the question of language acquisition in familiar terms, referring to Chomsky (1962a). Therefore it is hard to maintain that (26) represents an earlier stage of Chomsky’s thinking. On the other hand, as mentioned in Section 4.1.1, Miller played a pivotal role in the research that led to the crisis associated with psychological reality. In such a situation it is altogether conceivable that at least in the perception of some linguists associated with Chomskyan linguistics in the early 1960s, the questions of language use and language acquisition were each possible as leading questions to approach the problem of indeterminacy associated with the model in Figure 2.3. If we accept this view, Chomskyan linguistics and LFG can be seen to diverge on the choice between these two questions. Whereas Chomskyan linguistics takes language acquisition, LFG takes language use as the leading question. Both approaches have to make certain abstractions. As we saw in Section 2.4.4, Chomskyan linguistics interprets the question of language acquisition in terms of learnability. This entails an idealisation excluding aspects of the actual learning process. LFG interprets language use in the sense of language processing. This entails an idealisation excluding the creative aspects of language use, which constitute the primary reason for Chomsky to call the question of language use a mystery. The selection of language acquisition or language processing as the solution to the indeterminacy of grammars is not the only difference between the research programmes of Chomskyan linguistics and LFG. 
If we compare Figure 4.1 to Figure 2.7, we see that the universal and language-specific features are placed in different relations to each other. In Chomskyan linguistics, they are placed on different levels. UG describes the language faculty at the species level and a grammar describes the competence at the individual level. In LFG, they are different components of the grammar and the competence. Grammar and competence each consist of a language-specific and a universal part. A first observation to be made here is, of course, that this distinction is first of all a matter of the representation. It is possible to adapt Figure 2.7 so that it diverges from Figure 4.1 only in labels or, conversely, to adapt Figure 4.1 in an analogous way. 4 There are reasons, however, for not doing either. When we consider the relationship between the language-specific competence and the language faculty in Figure 2.7, we can say that the language faculty underlies the competence, because it is involved in its origin. The universal interpretation procedure, however, is part of the competence and cannot be said to underlie the individual competence. Therefore, for LFG the representation as a competence
divided into two parts, a universal and an individual one, is more appropriate. For this reason it would also be incorrect to claim that UG explains the representational basis in LFG. In Chomskyan linguistics, there is a tension between the power of UG and the power of individual grammars, which results in the possibility of explanation. In LFG, by contrast, the grammar is divided a priori into a representational basis and a UG describing the interpretation procedure. There is no tension, but an independently motivated distinction into two different parts.
4.1.5 Interaction of LFG and Chomskyan linguistics
In this section, three examples of the interaction of LFG and Chomskyan linguistics will be presented and related to the research programmes as analysed here and in Chapter 2.
4.1.5.1 The interpretation of psycholinguistic data
The use of data derived from psycholinguistic experiments is directly related to the question of psychological reality as discussed in Section 4.1.1. Here I would like to concentrate on the role psycholinguistic data have in the selection of a grammar. In (27), Botha (1968) formulates a view that was widely held in Chomskyan linguistics during the 1960s and led directly to the LFG position in this respect. (27)
‘The criteria for the complete testing of a “mentalistic” grammar, in order of application are thus the following: 1 All grammars not capable of accounting for the known relevant primary data and from which no correct predictions can be derived, are rejected. 2 The simplest, with ‘simplicity’ taken in the technical transformational sense, grammar of those satisfying criterion (1) is preferred.* 3 Only a grammar satisfying criteria (1) and (2) for which behavioural correlates have been established by means of psycholinguistic testing, can be viewed as making true existential claims, i.e. true references to reality.’ [Botha (1968: 100), original numbering, footnote at * deleted]
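The ordering of the three criteria in (27) can be pictured as a small selection routine. The Python sketch below is only a schematic rendering under loud assumptions: the class `CandidateGrammar`, its fields, the function `select_grammar`, and the use of an integer as a stand-in for the technical simplicity measure are all hypothetical and do not come from Botha (1968).

```python
from dataclasses import dataclass


@dataclass
class CandidateGrammar:
    name: str
    accounts_for_data: bool       # criterion 1, reduced to a yes/no flag
    complexity: int               # stand-in for the simplicity measure (lower = simpler)
    behavioural_correlates: bool  # criterion 3: established by psycholinguistic testing


def select_grammar(candidates):
    """Apply the criteria of (27) in order.

    Returns a pair (grammar, makes_existential_claims); grammar is None
    when no candidate survives criterion 1.
    """
    # Criterion 1: reject grammars that do not account for the data.
    adequate = [g for g in candidates if g.accounts_for_data]
    if not adequate:
        return None, False
    # Criterion 2: prefer the simplest of the remaining grammars.
    simplest = min(adequate, key=lambda g: g.complexity)
    # Criterion 3: only now is psycholinguistic evidence consulted.
    return simplest, simplest.behavioural_correlates


candidates = [
    CandidateGrammar("A", True, 5, False),
    CandidateGrammar("B", True, 3, True),
    CandidateGrammar("C", False, 1, True),
]
chosen, real = select_grammar(candidates)
print(chosen.name, real)
```

In this rendering the psycholinguistic flag plays no part in choosing among candidates; it only decides, after the choice has been made, whether the selected grammar is credited with reference to reality.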
A three-step procedure as in (27) first establishes a set of grammars that are adequate for the available primary data (i.e. data available to the child for language acquisition) and intuitive judgements (e.g. data derived from comparing a grammar’s predictions of grammaticality with actual grammaticality judgements). Then, a simplicity measure is applied. As mentioned in Section 2.3.3, a simplicity measure is one way of solving the problem of indeterminacy.
The third step, finally, takes this simplest grammar and tries to demonstrate its reality by psycholinguistic tests. It is in reaction to such views that Chomsky formulates (28). (28)
a. ‘If we accept – as I do – Lenneberg’s contention that the rules of grammar enter into the processing mechanisms, then evidence concerning production, recognition, recall, and language use in general can be expected (in principle) to have bearing on the investigation of rules of grammar, on what is sometimes called “grammatical competence” or “knowledge of language.”
b. But such evidence, where it is forthcoming, has no privileged character and does not bear on “psychological reality” in some unique way.
c. Evidence is not subdivided into two categories: evidence that bears on reality and evidence that just confirms or refutes theories (about mental computation and mental representation, in this case).’
[Chomsky (1980a: 200f.)]
In (28a), Chomsky refers to Lenneberg (1967) and states his acceptance of results from psycholinguistic experiments as evidence for the rules of a grammar describing the speaker’s competence. This does not mean, however, that this evidence is more important than other sources of data, (28b), or should be separated from these other data in a qualitative way, (28c). What (28b-c) rejects is exactly what the third step of (27) does. The alternative procedure Chomsky proposes is the one in (29). (29)
a. ‘We observe what people say and do, how they react and respond, often in situations contrived so that this behavior will provide some evidence (we hope) of the operative mechanisms.
b. We then try, as best as we can, to devise a theory of some depth and significance with regard to these mechanisms, testing our theory by its success in providing explanations for selected phenomena.
c. Challenged to show that the constructions postulated in that theory have “psychological reality,” we can do no more than repeat the evidence and the proposed explanations that involve these constructions.’
[Chomsky (1980a: 191)]
Data collection in (29a) is more general than step 1 in (27). In accordance with (28c), psycholinguistic data are brought in at this stage instead of in step 3. In (29b), ‘providing explanations’ is used as a criterion instead of simplicity in step 2 of (27). This means that psychological reality is just a label attached to the best theory, as implied by (28c). Bresnan and Kaplan (1982) object to this presentation of theoretical work in (30).
(30)
a. ‘The challenge to Chomsky’s theory is not the philosophical question that he addresses (whether theoretical constructs correspond to real mental entities and processes),
b. but the scientific question (whether these theoretical constructs can unify the results of linguistic and psycholinguistic research on mental representation and processing).
c. To the latter question, Chomsky’s response is plainly inadequate.’
[Bresnan and Kaplan (1982: xx-xxi)]
Bresnan and Kaplan quote (29c) after (30c) in order to illustrate their point. This suggests that they believe that Chomsky only addresses the philosophical question formulated in (30a). They consider linguistic and psycholinguistic data in (30b) as being in need of unification. This assumption presupposes that (28c) is wrong. In LFG, the processing mechanism is part of the competence to be described by a grammar, whereas in Chomskyan linguistics it is the subject of a bridge theory required to make data from psycholinguistic experiments relevant to the study of competence. The situation is typical of discussions between proponents of different research programmes. Having explained how each position is anchored in the different research programmes, there is no more we can do. While it is perfectly reasonable to assume the division of data into different categories in the research programme of Figure 4.1, it is equally reasonable to reject such a division in the research programme of Figure 2.7. Both positions are rational within their own research programme but make unwarranted assumptions from the point of view of the other research programme. There is no absolute rational basis to reject one in favour of the other.
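The contrast between the three-step procedure in (27) and the pooled procedure in (29) can be made concrete with a toy model. What follows is only a minimal sketch under strong simplifying assumptions: the class `Grammar`, the functions `three_step` and `pooled`, the observation labels, and the use of a numeric `size` as a stand-in for simplicity are all hypothetical and appear in neither Botha nor Chomsky. In particular, counting explained observations is a crude proxy for ‘success in providing explanations’ in (29b).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    name: str
    explains: frozenset   # labels of the observations the grammar accounts for
    size: int             # crude stand-in for the technical simplicity measure

def three_step(grammars, primary, psycholinguistic):
    """Procedure (27): filter, pick the simplest, then test for 'reality'."""
    adequate = [g for g in grammars if primary <= g.explains]      # step 1
    if not adequate:
        return None, False
    simplest = min(adequate, key=lambda g: g.size)                 # step 2
    has_correlates = psycholinguistic <= simplest.explains         # step 3
    return simplest, has_correlates

def pooled(grammars, primary, psycholinguistic):
    """Procedure (29): one pool of evidence, ranked by explanatory success."""
    evidence = primary | psycholinguistic
    return max(grammars, key=lambda g: len(g.explains & evidence))

# Hypothetical data: two judgements and one experimental result.
primary = frozenset({"judgement-1", "judgement-2"})
psycholinguistic = frozenset({"recall-experiment"})
g1 = Grammar("G1", primary, size=3)
g2 = Grammar("G2", primary | psycholinguistic, size=5)

chosen, real = three_step([g1, g2], primary, psycholinguistic)
print(chosen.name, real)                                   # G1 False
print(pooled([g1, g2], primary, psycholinguistic).name)    # G2
```

In this toy case the two procedures select different grammars. Under (27), the simpler G1 is chosen before the experimental result is consulted and then fails the separate test of ‘reality’. Under (29), the experimental result counts as ordinary evidence from the start, so G2 is preferred.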
4.1.5.2 Language acquisition
As presented in Section 2.4, language acquisition is used as the main criterion to select a grammar for a particular language in Chomskyan linguistics. As explained there, this does not mean that all aspects of language acquisition are used equally. LFG-based arguments against the treatment of language acquisition in Chomskyan linguistics often neglect or attack these restrictions. As indicated in (23), LFG takes language acquisition as a source of data, obtained by the ‘observation of the linguistic development of children’. Pinker (1982) presents a comparative discussion of the account of language acquisition in what he calls transformational grammars and LFG. The scope of the discussion is indicated in (31).
(31)
a. ‘In this chapter, I explore in greater depth the possibility that lexical grammars may succeed where standard transformational grammars have failed in serving as a foundation for a psychologically plausible theory of language acquisition –
b. a theory both adequate in principle to explain the fact of language learning and
c. capable of interpreting the developmental sequence that children pass through on their way to language mastery.’
[Pinker (1982: 656)]
As (31a) states, the comparison is not between two research programmes but between two theories of grammar. For ‘standard transformational grammars’ Pinker (1982: 655) refers to the framework of Chomsky (1965). On the basis of the models of the research programmes in Figure 2.7 and Figure 4.1, we can predict a difference in attitude to the two criteria in (31b-c). The criteria correspond to the logical problem of language acquisition, (31b), and its practical equivalent, (31c). In Chomskyan linguistics, (31b) is the central criterion for the selection of a grammar, whereas the developmental sequence in (31c) is only a potential source of data which may or may not be incorporated with priority. In LFG, the developmental sequence is used in the same way, but (31b) is approached only through (31c). If we can explain the process of language acquisition, we can by implication explain that language acquisition is possible. If it were not for such statements as Hornstein and Lightfoot’s (72) in Section 2.4.4, learnability would probably hardly be mentioned in an LFG context, if at all. A consequence of the emphasis on the developmental sequence in Pinker’s argument is the need for a bridge theory, as stated in (32). (32)
‘most arguments about language acquisition are unsound unless they refer to mechanisms of well-defined learning models capable in principle of acquiring a correct grammar given exposure to linguistic data.’ [Pinker (1982: 664)]
It should be kept in mind that learnability in the sense used in Chomskyan linguistics does not belong to the class of ‘most arguments’ referred to in (32). However, the main point in LFG discussions of language acquisition concerns the sequence of constructions observed in child language. Pinker explains the problem in (33).
(33)
a. ‘a transformation-by-transformation acquisition theory makes two predictions about the order of acquisition of grammatical constructions that child language data have not dealt kindly with.
b. First, constructions derived by the application of a particular transformation should be mastered later than their untransformed counterparts.
c. Second, constructions derived by the application of two transformations should be mastered only after each of the transformations is mastered in isolation.’
[Pinker (1982: 669)]
A more implicit statement of (33b) is found in Bresnan (1978: 44f.), who gives some examples of contradictions between observed sequences and transformational derivations. The argument is basically the same as the one underlying the DTC as stated in (5). An example of a pair of constructions mentioned both by Bresnan and by Pinker is (34a-b). (34)
a. Jason wants Kassandra to go.
b. Jason wants to go.
c. Jason wants PRO to go
The construction exemplified in (34b) is analysed as having an empty infinitival subject PRO in GB-theory, as indicated in (34c). An overview of the historical development of this analysis is given in van Riemsdijk and Williams (1986: 129–138). An earlier analysis proposed that a copy of the subject Jason would be in the position of PRO in (34c) at deep structure and a transformation would delete it. If this analysis is adopted and the DTC is assumed, (34b) is more complex than (34a), because (34b) involves a deletion transformation that is not necessary in (34a). Both Bresnan and Pinker point to studies showing that children acquire the construction in (34b) before the one in (34a). Pinker then argues that in LFG sentences such as (34a) and (34b) ‘are generated by different phrase structure rules’ and that the order of acquisition is ‘predictable from the relative complexity of their underlying f-structures’ (1982: 669f.). An f-structure is a representation of the structure of a sentence in LFG in which grammatical functions such as subject, object, and predicate are central. Pinker’s idea is then to show that LFG is better at explaining the order of acquisition of these constructions. While this discussion highlights the difference in attitude to language acquisition, it has not prompted any prominent reaction from Chomskyan linguistics. There are a number of reasons why this is not surprising. First, the analysis of (34b) as resulting from a transformation has been controversial from the beginning, as van Riemsdijk and Williams (1986) indicate. Without the extra transformation rule, Pinker’s argument loses some of its force. Second,
the idea of individual transformations was replaced by generalised movement (move α) with constraints in GB-theory, as discussed in Section 2.5.1. Therefore the ‘transformation-by-transformation acquisition theory’ mentioned in (33a) could no longer be stated in the theory dominating Chomskyan linguistics at the time Pinker’s article was published. Without this assumption, the argument against Chomskyan linguistics dissolves. The discussion of language acquisition by Pinker (1982) shows how this area of linguistics is considered in LFG. The opposition between LFG and transformational grammar it invokes is an opposition at the level of theory. Assuming the LFG research programme, Pinker compares transformational accounts such as Chomsky (1965) with the LFG theory in Kaplan and Bresnan (1982). Meanwhile, in Chomskyan linguistics, these transformational accounts had been replaced by another theoretical framework for other reasons.
4.1.5.3 Some theoretical notions
Despite the difference in research programme, any researcher vaguely familiar with LFG and GB-theory will be struck by the similarity of certain theoretical notions and ideas in both. As an example, consider the Binding Theory as formulated in (35) and (36). (35)
Binding Theory
(A) An anaphor is bound in its governing category
(B) A pronominal is free in its governing category
(C) An R-expression is free
[Chomsky (1981a: 188), his (12)]
(36)
a. Principle A: A nuclear (reflexive) pronoun must be bound in the minimal nucleus that contains it.
b. Principle B: A nonnuclear pronoun must be free in the minimal nucleus that contains it.
c. Principle C: (Other) nominals must be free.
[Bresnan (2001: 215), her (9); boldface replaced by italics]
Although the precise use of the Binding Theory is different in the two frameworks and some of the terms do not quite match, there is a remarkable point-by-point correspondence between Chomsky’s Binding Theory in (35) and Bresnan’s in (36). The central cases covered by them are exemplified in (37).
(37)
a. Lodewijkᵢ shaves himselfᵢ/*ⱼ.
b. Lodewijkᵢ shaves himⱼ/*ᵢ.
c. *Lodewijkᵢ thinks [Machteld loves himselfᵢ].
d. Lodewijkᵢ thinks [Machteld loves himᵢ].
e. Lodewijkᵢ thinks [Machteld loves Lodewijkⱼ/*ᵢ].
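As a rough illustration of how the three principles divide up the cases in (37), the judgements can be reproduced by a toy checker. This is a minimal sketch under heavy simplifying assumptions, not an implementation of either framework: the local domain (governing category or minimal nucleus) is reduced to a clause number, only subjects are allowed to act as binders, and the names `NP`, `c_commands`, `well_formed`, `simple`, and `embedded`, as well as the index k for Machteld, are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NP:
    word: str
    kind: str         # "anaphor", "pronominal" or "r-expression"
    index: str        # referential index, e.g. "i" or "j"
    clause: int       # 0 = matrix clause, 1 = bracketed embedded clause
    subject: bool = False

def c_commands(binder, target):
    # Toy assumption: only subjects bind. A subject c-commands the object of
    # its own clause and everything inside a more deeply embedded clause.
    if not binder.subject or binder is target:
        return False
    if binder.clause < target.clause:
        return True
    return binder.clause == target.clause and not target.subject

def well_formed(nps):
    for np in nps:
        binders = [b for b in nps if b.index == np.index and c_commands(b, np)]
        local = [b for b in binders if b.clause == np.clause]
        if np.kind == "anaphor" and not local:        # Principle A
            return False
        if np.kind == "pronominal" and local:         # Principle B
            return False
        if np.kind == "r-expression" and binders:     # Principle C
            return False
    return True

def simple(word, kind, index):
    # Lodewijk_i shaves OBJECT
    return [NP("Lodewijk", "r-expression", "i", 0, subject=True),
            NP(word, kind, index, 0)]

def embedded(word, kind, index):
    # Lodewijk_i thinks [Machteld_k loves OBJECT]
    return [NP("Lodewijk", "r-expression", "i", 0, subject=True),
            NP("Machteld", "r-expression", "k", 1, subject=True),
            NP(word, kind, index, 1)]

print(well_formed(simple("himself", "anaphor", "i")))            # (37a), index i
print(well_formed(embedded("himself", "anaphor", "i")))          # (37c)
print(well_formed(embedded("Lodewijk", "r-expression", "i")))    # (37e), index i
```

The same three checks apply whether the categories are read as anaphor, pronominal, and R-expression or as nuclear pronoun, nonnuclear pronoun, and other nominal, which is one way of seeing the correspondence between (35) and (36).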
The subscripts i and j in (37) indicate coreference and disjoint reference. In (37a), himself is an anaphor for (35a) and a nuclear pronoun for (36a), so that it must be coreferential with Lodewijk. Conversely, him in (37b) is a pronominal for (35b) and a nonnuclear pronoun for (36b) so that it cannot be coreferential with Lodewijk. The square brackets in (37c-d) indicate the governing category for (35) and the minimal nucleus for (36). In (37c-d), himself and him do not have a binder in this domain, so that (37c) is ungrammatical and (37d) is grammatical. In the case of a full noun, i.e. an R-expression for (35c) and an ‘(other) nominal’ for (36c), it is not sufficient that it is free in this minimal domain, as illustrated by the second occurrence of Lodewijk in (37e). Another example of similar components of a theory in both research programmes is X-bar theory. The emergence and impact of X-bar theory in Chomskyan linguistics were outlined in Section 2.5.1.1. X-bar theory is also used in LFG. However, a remarkable discrepancy exists in the presentation of the origins and development of the theory. Bresnan presents the origin of X-bar theory in (38). (38)
‘Because the category labels are simple, unanalyzed symbols, there is no necessary relation expressed between VP and V or NP and N, a problem originally pointed out by Lyons (1968: 234–5), which led to the development of X’ theory.’ [Bresnan (2001: 120)]
Bypassing Chomsky’s (1970) exposition of X-bar theory, Bresnan then moves on to Jackendoff (1977). In his preface, Jackendoff (1977: xi-xii) presents his book as an overview of the consequences of Chomsky’s Lexicalist Hypothesis, concentrating on phrase structure. He attributes the Lexicalist Hypothesis to a series of lectures by Chomsky at MIT in the fall of 1967, which formed the basis of Chomsky (1970). While Lyons (1968) does in fact point out the problem mentioned in (38), Jackendoff does not refer to him and neither does Chomsky (1970). It is also interesting to observe the conflicting accounts of the origin of the proposals to consider S as the maximal projection of Infl (IP) and S’ as the maximal projection of Comp (CP). Bresnan (2001) attributes both to LFG accounts in (39).
(39)
a. The category CP as a projection of C was first introduced into LFG by Fassi Fehri (1981: 141ff; 1982: 100ff) in his analysis of Arabic syntax. […]
b. The hypothesis that the sentence is a projection of a functional head (the IP hypothesis) is due to Falk (1984), using ‘M’ for ‘I’, in his analysis of auxiliaries within LFG.
[Bresnan (2001: 119)]
In Section 2.5.1.1 we saw that Chomsky (1986b: 2–4) describes a version of X-bar theory with IP and CP. At that time this idea was not new. Chomsky (1981a: 364) has an index entry for ‘Inflection (INFL) as head of S, S̄’ with six references. The first of these is (40).
(40)
‘One debated question is whether the S, S̄ system* should be regarded as a projection of V, with verbs taken to be heads of clauses, or whether this is a separate system, perhaps with INFL as head. I will assume here that the S, S̄ system is separate’ [Chomsky (1981a: 51), footnote at * deleted]
Subsequent passages where the issue is mentioned invariably give advantages of the hypothesis that INFL is indeed the head of IP. Chomsky (1981a: 163) also considers the possibility ‘we take S̄ to have COMP as its head’. In his PhD thesis supervised by Chomsky, Tim Stowell takes a more definite position. He concludes that S = IP (1981: 67f.) and devotes an entire chapter to the argument that S’ = CP (1981: 373–432).
As we saw in Section 1.4.1, Kuhn (1970a: 55) rejects the idea of attributing, for instance, the discovery of the theory of oxygen to a single researcher. It is rather more common that such an idea is ‘in the air’ and gets formulated by different researchers independently at around the same time. Nevertheless, the way Bresnan downplays Chomsky’s role in the origin of X-bar theory in (38) is difficult to defend. The evidence suggests that Chomsky had formulated the central idea of X-bar theory around 1967. Bresnan completed her PhD, supervised by Chomsky, in 1972. It is unlikely that she should not have been aware of this idea until the publication of Jackendoff (1977). The case for a simultaneous development of the idea of IP and CP is somewhat stronger. Bresnan was professor at MIT in 1979–1983, at the time when Chomsky published his LGB and supervised Stowell’s (1981) PhD thesis. Still, unless we take the scope of ‘into LFG’ in (39a) to extend to (39b), attributing the origin of the IP hypothesis to Falk (1984), a paper first received by the journal in March 1983, is difficult to sustain.
In general we can say that many components of theories occur both in Chomskyan linguistics and in LFG. Given the large degree of agreement on
the model as represented in Figure 2.3, which can be thought of as part of both research programmes, this should not surprise us. It is remarkable that the history of such theoretical ideas is sometimes perceived differently in the two research programmes. In the context of competing research programmes, conflicting priority claims are a well-known phenomenon.
4.1.6 Conclusion
We started this section on LFG with two contrasting assessments of its relationship to Chomskyan linguistics. At this point, we are in a position to draw a conclusion, explaining to what extent both are correct.
It was shown in various ways that LFG and Chomskyan linguistics do not have the same research programme. The central argument is the construction of the model in Figure 4.1 in Section 4.1.2, and its comparison to the model adopted in Chomskyan linguistics in Section 4.1.4. Then, the origins of LFG as outlined in Section 4.1.1 correspond to the typical pattern of a crisis followed by the emergence of a new research programme. That the crisis was only perceived as such in part of the field and did not affect the Chomskyan mainstream in the same way does not contradict this analysis. As explained in Section 4.1.3, it reflects different interpretations of the inadequacies of Chomskyan theory in the 1960s, which can be reduced ultimately to different assumptions about the goals of linguistics. Finally, Section 4.1.5 gives some examples of the type of discussion characteristic of exchanges between proponents of different research programmes. They are marked by incommensurability effects and by conflicts about the priority of ideas.
Although the research programmes of LFG and Chomskyan linguistics are not the same, they have much in common. As argued in Section 4.1.4, they can both be considered as elaborations of the model in Figure 2.3. It is this common ground that accounts for the possibility of sharing many theoretical notions, as indicated in Section 4.1.5.3. The main difference resides in the approach to the indeterminacy problem this model suffers from. Whereas Chomskyan linguistics approaches this problem by appealing to learnability, LFG bases its approach on a universal theory of language processing.
SUMMARY
• LFG emerged in response to a perceived crisis in Chomskyan linguistics concerning the psychological reality of grammars.
• Psychological reality is understood as an isomorphy between the grammar as formulated by the linguist and the structure of the speaker’s competence, to be demonstrated by psycholinguistic experiments.
• The procedure mapping between an expression of a language and its interpretation is supposed to be universal and innate.
• Although the Competence Hypothesis from which the isomorphy condition follows has been attributed to Chomsky, it is unlikely that he has ever adhered to it. This explains why the crisis giving rise to LFG was not noticed in the same way by Chomsky.
• LFG and Chomskyan linguistics share the idea that competence should be the focus of attention and that a grammar is a theory of a speaker’s competence.
• These similarities explain why theoretical insights and components can be taken over from one framework into the other.
• LFG and Chomskyan linguistics diverge on the role attributed to language processing and learnability in the evaluation of grammars. They have different views of the nature of Universal Grammar.
• These differences explain why incommensurability effects affect the quality of discussion when the topic turns to the interpretation of psycholinguistic data on processing and of language acquisition data.
4.2 Generalised Phrase Structure Grammar
In the early 1980s, Generalised Phrase Structure Grammar (GPSG) was often mentioned alongside LFG as one of the main alternatives to Chomsky’s Government and Binding Theory. Thus, Sells (1985) selects these three for his Lectures on Contemporary Syntactic Theories. Newmeyer (1986a) goes even further in (41). (41)
‘The most successful current alternative to GB is the framework known as “generalized phrase structure grammar” (GPSG). Indeed GPSG and GB stand together as the only models of syntactic description that have won more than a small handful of recruits beyond their leading members’ immediate circle.’ [Newmeyer (1986a: 209)]
The subsequent fate of GPSG and LFG makes (41) a remarkable statement. Whereas LFG developed into a framework attracting a large group of linguists who, for instance, organise annual conferences, GPSG is now mainly of historical importance. No doubt one of the reasons for this divergence in development is the attitude of the leading members to their framework.⁵ This does not detract from the significance of GPSG. Many of the issues it raised as well as the general approach it took to these issues are still important today.
The four main proponents of GPSG were Gerald Gazdar, Ewan Klein, Geoffrey Pullum, and Ivan Sag. Together they published a book, Gazdar et al. (1985), which constitutes the main authority on the nature of the theory. It is interesting to note the contrast between this book and Bresnan (1982). Gazdar et al. (1985) is a comparatively slim volume, less than 280 pages, and the individual chapters are not attributed to individual authors. All examples are from English and an overview of features and rules, a formal grammar of English, is given as an appendix.
4.2.1 The crisis: generative grammar
The emergence of a new research programme is generally a response to a crisis. The case of GPSG is special, because the crisis was not triggered primarily by the inability to account for certain data, but by a sense that the standards of theorising were insufficient. The central notion in the crisis is that of generative grammar. Chomsky (1965) introduces this notion as in (42). (42)
‘by a generative grammar I mean simply a system of rules that in some explicit and well-defined way assigns structural descriptions to sentences.’ [Chomsky (1965: 8)]
In very similar terms, Gazdar et al. (1985) open their book as in (43). (43)
‘This book is a contribution to the discipline known as generative grammar. This approach to linguistics is characterized by its goal of investigating natural language through the construction of fully explicit descriptions of particular languages and a formalized general framework for defining the space within which to locate such descriptions.’ [Gazdar et al. (1985: 1)]
Although there are subtle differences in formulation between (42) and (43), it is not difficult to see the latter as an elaboration of the former. Gazdar et al. (1985) use explicit formalisation as a standard for the classification of work in linguistics when stating (44). (44)
a. ‘It will be clear that our use of the term “generative grammar” covers GPSG, LFG, APG, Montague Grammar in all its varieties, the work presented in Syntactic Structures (Chomsky 1957), Stockwell et al. 1973, Lasnik and Kupin 1977, and other work,
b. but includes little of the research done under the rubric of the “Government Binding” framework,
c. since there are few signs of any commitment to the explicit specification of grammars or theoretical principles in this genre of linguistics.’
[Gazdar et al. (1985: 6)]
The list of approaches and individual works in (44a) has one striking similarity in orientation with the argument for LFG in Section 4.1.1.⁶ In both cases, the idea is to show the existence of a split between Chomsky’s earlier work and his later work, such that in certain respects the later work diverges from the promises made in earlier publications. The split here is between Chomsky (1957) and Chomsky (1981a), as alluded to in (44b). Within Chomskyan linguistics, formalisation was also considered as a virtue from an early stage. An example is (45). (45)
‘The other most important result of Chomsky’s theory of language is his very strict axiomatization of linguistic theory.’ [Lees (1957: 391)]
As indicated in Section 3.3, Lees (1957) is a very friendly review of Chomsky (1957), taking the form of an explanation of the theory rather than of a critical assessment. This illustrates that Chomsky’s students and collaborators took the formal nature of grammars as formulated in (42) very seriously. Another example of this attitude is Botha’s (1981) overview of the research methodology of Chomskyan linguistics, in which the requirements imposed on a grammar are described as in (46). (46)
a. ‘A GRAMMAR takes the form of a system of mutually related rules.
b. These rules must be finite in length and number.
c. The rules must be completely explicit.
d. The rules must enumerate all the grammatical sentences of the language and no ungrammatical sentences.
e. The rules must assign an appropriate structural description to each enumerated sentence.
f. In doing (d) and (e) the rules must express linguistically significant generalizations about the language.’
[Botha (1981: 134), originally (11) (a)-(f)]
If we compare (46) to Chomsky’s explanation of the term generative grammar in (42), we recognise (46a), (46c), and (46e) as more formal variants of conditions contained in (42). Both (45) and (46) have a rather more formalistic tone than (42). As such they are closer to (43). The sense of crisis that gave rise to the emergence of GPSG was triggered by the increasing emphasis on explanatory goals in Chomskyan linguistics. The replacement of individual rules by general principles, as described in Section 2.5.1, does not contribute to any of the goals in (46). If an interpretation of generative grammar as in (46) is used as a standard for the evaluation of theories, this development can only be judged negatively. As appears from Gazdar (2001), another problem that especially Gerald Gazdar had with Chomskyan linguistics concerns the treatment of semantics.
As described in Section 2.5.1, syntax has always been considered the central component of linguistics in Chomskyan linguistics. Chomsky (1965) derives the meaning of a sentence by interpretive rules operating on Deep Structure. Chomsky (1981a) moves the point of operation to LF. In both cases, the rules remain largely implicit. The lack of interest in semantics was not itself part of a crisis, but it influenced Gazdar’s approach to the crisis he perceived in generative grammar in the sense that he was attracted by the treatment of semantics in Montague Grammar. Montague describes the distribution of tasks between syntax and semantics as in (47). (47)
‘The basic aim of semantics is to characterize the notions of a true sentence (under a given interpretation) and of entailment, while that of syntax is to characterize the various syntactical categories, especially the set of declarative sentences. […] I fail to see any great interest in syntax except as a preliminary to semantics.’ [Montague (1970: 373f., fn. 2)]
4.2.2 A new research programme
Although a sense of crisis can be identified at the root of GPSG, it is not at all obvious that a new research programme would be the typical outcome. The most straightforward reaction was rather to put into practice the explicitness in the specification of rules that was felt to be lacking in much of the work in Chomskyan linguistics. As Gazdar states later, ‘I never really wanted to initiate a grammar framework’ (2001). The problem, however, is that if two groups of linguists use different criteria for the evaluation of their theories, they will come up with different decisions on which theory is better. As an example of this, consider Gazdar’s (1981a) proposal to eliminate transformations and treat the entire grammar of natural languages by means of context-free rewrite rules. His argument is based on a demonstration that unbounded dependencies and coordination can both be treated by a context-free grammar (CFG) if a slash feature is used. As an example, S/NP is an S from which an NP is missing. By GPSG standards, this is a strong argument. If we take this proposal in the context of Chomskyan linguistics, however, it is far less convincing. Although Gazdar also links the argument for a CFG account to language acquisition and parsing (1981a: 155), he does not address the indeterminacy problem discussed in Section 2.3.3. In the research programme of Chomskyan linguistics, a restriction to CFGs is not sufficient for learnability. Therefore, the same proposal gets entirely different evaluations in GPSG and Chomskyan linguistics. An elegant formalisation in a GPSG perspective, it does not contribute to the explanatory adequacy of the theory in the view of Chomskyan linguistics.
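The slash mechanism can be illustrated with a toy grammar. The following is a minimal sketch in the spirit of the proposal, not Gazdar’s own formalisation: the category names `Vt` and `Vs`, the start symbol `Q`, the four-word lexicon, and the functions `slash_rules` and `recognise` are all assumptions made for illustration. Each slash category, such as S/NP, is treated as an ordinary atomic symbol, so the grammar remains context-free, while a filler (who) can still be linked to a gap at any depth of embedding.

```python
from functools import lru_cache

# Hypothetical lexicon: Vt takes an NP object, Vs takes a sentential complement.
LEXICON = {"Lodewijk": "NP", "Machteld": "NP", "who": "WH",
           "loves": "Vt", "thinks": "Vs"}
BASE_RULES = {"S": [("NP", "VP")],
              "VP": [("Vt", "NP"), ("Vs", "S")]}
PHRASAL = {"S", "NP", "VP"}

def slash_rules(base, gap="NP"):
    """For every rule A -> ... B ..., add A/gap -> ... B/gap ... (B phrasal)."""
    derived = {f"{gap}/{gap}": [()]}          # the gap itself: an empty NP
    for lhs, expansions in base.items():
        for rhs in expansions:
            for i, cat in enumerate(rhs):
                if cat in PHRASAL:
                    new_rhs = rhs[:i] + (f"{cat}/{gap}",) + rhs[i + 1:]
                    derived.setdefault(f"{lhs}/{gap}", []).append(new_rhs)
    return derived

# The complete rule set is still a plain context-free grammar.
RULES = {**BASE_RULES, **slash_rules(BASE_RULES), "Q": [("WH", "S/NP")]}

def recognise(words, start="Q"):
    words = tuple(words)

    @lru_cache(maxsize=None)
    def derives(cat, i, j):
        if cat not in RULES:                  # lexical category: one word
            return j == i + 1 and LEXICON.get(words[i]) == cat
        return any(covers(rhs, i, j) for rhs in RULES[cat])

    def covers(rhs, i, j):
        if not rhs:                           # empty right-hand side
            return i == j
        return any(derives(rhs[0], i, k) and covers(rhs[1:], k, j)
                   for k in range(i, j + 1))

    return derives(start, 0, len(words))

print(recognise("who Lodewijk thinks Machteld loves".split()))   # True
print(recognise("who Lodewijk loves Machteld".split()))          # False
```

The first string is accepted because the missing object of loves is passed up through VP/NP and S/NP to the filler. The second is rejected because it contains a filler but no gap. No transformation is involved at any point, only additional context-free rules.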
In the light of this example, a central assumption distinguishing the framework assumed in GPSG from the one in Chomskyan linguistics is (48). (48)
a. ‘We make no claims, naturally enough, that our grammatical theory is eo ipso a psychological theory. […]
b. we feel it is possible, and arguably proper, for a linguist (qua linguist) to ignore matters of psychology.’
[Gazdar et al. (1985: 5)]
In (48) the link between grammar and competence is severed. Between (48a) and (48b) it is made explicit that this excludes the direct use of grammar to account for language acquisition, parsing, or for ‘the structure of an as-yet-unidentified mental organ’. In this way the primacy of formalisation over explanation is established. It should be pointed out that (48) is not inconsistent with Gazdar’s (1981a: 155) use of the argument that the restriction to CFGs facilitates language acquisition and parsing. While parsing (as opposed to language acquisition) does not play a direct role in the research programme of Chomskyan linguistics, it can still provide external evidence for a theory. In the same way, psychological considerations can play this role in GPSG. If a grammar does not describe a psychological concept, it has to assume a different role in GPSG. This role is described in (49). (49)
a. ‘The basic assumption made in generative grammar is that languages can be regarded as collections whose membership is definitely and precisely specifiable. The elements of such a collection are the expressions in the language. […]
b. Clearly the set of compound linguistic expressions in a natural language is not finite, so we cannot list them. An interpreted formal system defining membership of the collection of linguistic expressions, and assigning a structure and an interpretation to each member, is required. This is what we call a grammar.’
[Gazdar et al. (1985: 1)]
The concept of language evoked in (49a) is that of what Chomsky (1986a) calls an E-language (cf. Section 2.1.3). This is also the notion Botha refers to in (46d). Therefore, (46) is compatible with GPSG rather than with Chomskyan linguistics. As presented in Section 2.2.1, grammaticality for Chomsky is rather ‘a matter of degree’ (1965: 11), because for him competence rather than grammaticality judgements is the focus of theorising. In (49b), a grammar is said to describe a language in the sense of (49a), i.e. as an E-language. The relationship between the grammar and the language in (49) is extremely tight. If the grammar ‘defin[es] membership of the collection of linguistic expressions’, as (49b) states, and the language ‘can be regarded as’ such a collection, as (49a) states, the grammar is not a theoretical description but a
formal definition of the language. The two are not independent objects. The language is the extension corresponding to the intensional description in the grammar. Therefore the correctness of the grammar cannot be tested empirically by scrutinising its adequacy for the language. In this context it is interesting to consider the first of ‘three crucial methodological assumptions’ in GPSG, given in (50). (50) ‘A necessary precondition to ‘explaining’ some aspect of the organization of natural languages is a description of the relevant phenomena which is thorough enough and precise enough to make it plausible to suppose that the language under analysis really is organized in the postulated way.’ [Gazdar et al. (1985: 2)]
What is interesting about (50) in the present context is that it evokes a concept of language that is different in crucial respects from the concept of language in (49). In (50), language is ascribed an organisation that is to be discovered, whereas in (49) it is simply the extensional counterpart to a grammar. This suggests that the evaluation of a grammar can be based on the comparison of the language it generates, as in (49), and the ‘language under analysis’ as referred to in (50). What we have here is an ambiguity of language parallel to the ambiguity of grammar discussed in Section 2.3.1. A similar view is suggested by (51). (51)
‘This mathematical conception of semiotics does not imply that data are irrelevant to, for instance, the syntax of English. Just as mathematicians refer to intuitions about points and lines in establishing a geometrical theory, we may refer to intuitions about sentences, noun phrases, subordinate clauses, and the like in establishing a grammar of English.’ [Thomason (1974: 2)]
The statement in (51) is taken from the introduction to the collected papers of Richard Montague. As Thomason points out, ‘According to Montague the syntax, semantics, and pragmatics of natural languages are branches of mathematics’ (1974: 2). The ‘mathematical conception of semiotics’ corresponds to the view in which grammar and language are intensional and extensional definitions of the same entity, e.g. {English}. Intuition creates an independent entity, e.g. #English#. A comparison of #English# with different variants of {English} can then be used to decide which options to take in setting up {English}. When we consider the data used in GPSG, it is interesting to note that Gazdar includes ‘such methodological advances as the use of native speaker acceptability judgments’ among Chomsky’s ‘major contributions to linguistics’ (1976: 207).7 Gazdar (2001) explicitly repeats his commitment to this view, while adding that ‘That is not to say that I don’t think that corpus work can’t
be useful, even in theoretical syntax.’ This is entirely compatible with (51). The legitimacy of other types of data used in Chomskyan linguistics and in LFG is less obvious. In the same way as intuitive judgements, they are not excluded by the attitude to psychology in (48), but the lack of interest in the nature of competence implies that there is no urge to set up complex experiments whose only additional benefit is a different type of access to the mental implementation of language. Therefore, the data in GPSG are intuitive judgements and corpus data. An aspect of (50) that we have not considered so far is the relation it establishes between description and explanation. According to (50), the description of a phenomenon is a ‘necessary precondition’ for its explanation. Intuitively, this seems obvious. However, in the empirical cycle as discussed in Chapter 1 (cf. Figure 1.7), the relation between observations and theory is cyclic. There is not only no closed set of data before theorising starts, but generalisations about these data can be modified in the light of tests and new data. There is no sense in which the description of the phenomena can be completed before explanation starts. The attitude to description and explanation described in (50) fits in well with the study of language as a branch of mathematics, as suggested in (51). It can be illustrated by the parallel with the difference in method in physics and mathematics. In an empirical science such as physics, insight consists of knowing the causes of a phenomenon, i.e. explanation. We can observe that water expands when it freezes. In order to explain this, we have to invoke the molecular structure of water and ice. As this is not the most natural way to describe the phenomenon in the first place, its description is modified by the search for an explanation. The explanation gives deeper insight because it shows that the observed phenomenon is not arbitrary but has a clearly determined cause. 
In mathematics, the approach is different. We can observe that there are many prime numbers, and they tend to be more thinly spread with rising numbers. This effect is inherent in the definition of prime number and there is not much mathematical theory can do to explain it. Rather than explanations it provides proofs. The proof, for instance, that there is no highest prime number increases insight, even though it does not explain anything. The view of linguistics as a formal science parallel to mathematics also underlies the remaining two ‘crucial methodological assumptions’ of GPSG in (52). (52)
a. ‘A grammatical framework can and should be construed as a formal language for specifying grammars of a particular kind. The syntax and, more importantly, the semantics of that formal language constitute the substance of the theory or theories embodied in the framework.
b. The most interesting contribution that generative grammar can make to the search for universals of language is to specify formal systems that have putative universals as consequences, as opposed to merely providing a technical vocabulary in terms of which autonomously stipulated universals can be expressed.’ [Gazdar et al. (1985: 2), originally II and III, following (50)]
In (52a), the descriptive apparatus used in linguistics is considered as a formal language. This formal language determines which grammars can be formulated. The idea of (52b) is that this formal language should be defined such that empirical universals of human languages follow from it. These ideas can be illustrated with the universal in (53). (53)
‘Universal 6. All languages with dominant VSO order have SVO as an alternative or as the only alternative basic order.’ [Greenberg (1963: 79)]
First of all, if the universal in (53) is worth stating, the grammatical framework should be set up such that concepts such as dominant word order and the references to subject, object, and verb can be stated. If this condition is not fulfilled, (53) cannot be meaningfully stated. Furthermore, (52b) suggests that if (53) is valid and worth stating, the ideal grammatical framework implies (53). This means that a language violating (53) cannot be described by any grammar permitted by the framework. If such a language is found, it falsifies both (53) and the framework incorporating it. On the basis of this discussion, the model underlying GPSG can be represented as in Figure 4.2.
[Figure: diagram relating Language universals, Grammatical framework, Object language, Sentences, Grammar, and Intuitions, with arrows labelled ‘describes’, ‘generates’, ‘constrains’, and ‘tests’.]
Figure 4.2: The research programme of GPSG
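The idea in (52b) that a framework should have a universal such as (53) as a consequence can be sketched informally in Python. This is only an illustration and not part of GPSG: the class `WordOrder`, the function `framework_permits`, and the reduction of a grammar to a bare word-order specification are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordOrder:
    """Hypothetical, drastically simplified stand-in for a grammar."""
    dominant: str                 # dominant basic order, e.g. "VSO"
    alternatives: frozenset       # alternative basic orders, possibly empty


def framework_permits(spec: WordOrder) -> bool:
    """Well-formedness condition of the (hypothetical) framework.

    The condition is chosen so that Universal 6 in (53) follows from it:
    a VSO-dominant specification is only well-formed if SVO is among
    its alternative basic orders.
    """
    if spec.dominant == "VSO":
        return "SVO" in spec.alternatives
    return True


def falsifying_languages(attested):
    """Attested languages that no permitted grammar can describe.

    If this list is non-empty, both the universal and the framework
    incorporating it are falsified.
    """
    return [spec for spec in attested if not framework_permits(spec)]


vso_with_svo = WordOrder("VSO", frozenset({"SVO"}))
vso_only = WordOrder("VSO", frozenset())
sov = WordOrder("SOV", frozenset())

print(framework_permits(vso_with_svo))        # permitted
print(framework_permits(vso_only))            # excluded by the framework
print(falsifying_languages([vso_with_svo, sov]))
```

In this sketch the universal is not stated separately; it holds for every describable language because specifications violating it are not well-formed in the framework.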
In Figure 4.2, three different types of entity are represented by different shapes. The ovals on the left are abstract entities. The object language is used for the ‘language under analysis’ in (50). The object language and the sentences it contains are two sides of the same coin. The rounded rectangles on the right are theoretical entities. The grammar describes the object language by generating the sentences of that language. The grammatical framework constrains the search space for grammars. To the extent that it must still be possible to formulate a grammar for each object language, the grammar can be said to test the framework. The shaded entities in Figure 4.2 are less directly involved in the scientific process. Whereas (52b) encourages the encoding of universals into the framework, this does not correspond to any epistemological necessity. Universals do not correspond to entities in the real world, but are only inductive generalisations. Intuitions constitute a third type of entity. They can be used to test the grammar in the sense that the grammar has to generate all and only grammatical sentences. Determining the exact status of intuitions is less important, because they are not described or explained by the theory.
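The testing relation between intuitions and grammar can be illustrated with a minimal Python sketch. The toy grammar, the vocabulary, and the judgements below are invented for the illustration and make no claim about English or about any actual GPSG grammar; the function names `generates` and `check_grammar` are likewise hypothetical.

```python
NOUNS = {"cat", "dog"}
VERBS = {"slept", "danced"}


def parse_np(tokens, i):
    """Toy rule NP -> 'the' N | 'the' N 'and' NP.

    Returns the index just past an NP starting at position i, or None.
    """
    if tokens[i:i + 1] == ["the"] and tokens[i + 1:i + 2] and tokens[i + 1] in NOUNS:
        j = i + 2
        if tokens[j:j + 1] == ["and"]:
            return parse_np(tokens, j + 1)
        return j
    return None


def generates(sentence):
    """Toy rule S -> NP V: membership of the set defined by the grammar."""
    tokens = sentence.split()
    end = parse_np(tokens, 0)
    return end is not None and len(tokens) == end + 1 and tokens[end] in VERBS


# Hypothetical binary grammaticality judgements (True = grammatical).
JUDGEMENTS = {
    "the cat slept": True,
    "the cat and the dog danced": True,
    "the dog slept the cat": False,
    "cat the slept": False,
    "a cat slept": True,
}


def check_grammar(generates, judgements):
    """Compare the generated set with the intuitions: 'all and only'."""
    overgenerated = sorted(s for s, ok in judgements.items() if generates(s) and not ok)
    undergenerated = sorted(s for s, ok in judgements.items() if ok and not generates(s))
    return overgenerated, undergenerated


print(check_grammar(generates, JUDGEMENTS))
```

Here the grammar fails the test because it does not generate a sentence judged grammatical. Note that the judgements themselves are only consulted as yes/no values; nothing in the sketch describes or explains them, in line with the status of intuitions in Figure 4.2.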
4.2.3 Comparison of the models of GPSG and Chomskyan linguistics
The central difference between GPSG and Chomskyan linguistics is that in GPSG linguistics is seen as a formal science and in Chomskyan linguistics as an empirical science. As a consequence, the position of grammars in the research programme is different. In GPSG, a grammar is a formal account of an abstract entity, whereas in Chomskyan linguistics it is a theoretical description of an empirical entity. In GPSG, a grammar is a formal system which can be refuted by a counterexample. In Chomskyan linguistics, a grammar is a theory that can only be replaced by a better theory. Another noteworthy difference concerns the position of universals. In GPSG, universals are the consequence of generalisation about individual object languages. The extent to which they can be built into the grammatical framework is a measure of elegance. In Chomskyan linguistics universals are necessary for the explanation of language acquisition. Every property of a language is either learnable on the basis of limited evidence or universal. Therefore, meaningful hypotheses about universals can be formulated on the basis of the study of a single language. A final difference to note is the nature of the data. In GPSG, data are observations about the grammaticality of sentences or sentence fragments. Grammaticality is a binary feature determining whether the linguistic expression is part of the object language or not. Statements about the grammaticality of an expression are similar to axioms in mathematics. Although they are
based on intuition, the intuition is not studied. In Chomskyan linguistics, data are whatever can be observed about I-languages. There is no restriction to grammaticality judgements and grammaticality judgements need not be binary. Grammaticality is not itself the main object of study, but rather a tool to access the underlying competence. The differences between GPSG and Chomskyan linguistics are quite radical. Still they are both included under the label of generative grammar, e.g. by Newmeyer (1986a). The main similarity concerns the formalism for grammars, at least in the 1970s. This originates from an underlying concern to formulate and discuss formalised rules. However, as will become clear in Section 4.2.4, even in this respect opinions diverge as soon as we go beyond a thin veneer of form. Ten Hacken (2002) argues that the similarity between GPSG and Chomskyan linguistics is often overestimated. Just as the research programme of Chomskyan linguistics has given rise to a number of different theories, the research programme identified here as adopted in GPSG is not restricted to a single theory. Arguably, at least large parts of the research programme are shared by Montague Grammar, Johnson and Postal’s (1980) Arc-Pair Grammar, and other theories. 8 Although they differ in formalism, they share essential assumptions about what language is and how it should be studied.
4.2.4 Theoretical discussions and incommensurability
The interaction between proponents of GPSG and other theories within the same research programme on the one hand and Chomskyan linguistics on the other has been marked by a persistent criticism of the standard of formalisation from the former, with relatively few reactions from the latter. The nature of the crisis giving rise to the emergence of GPSG provides an indication of the reasons for this type of interaction. The cause of the crisis was the perception that standards of formalisation were dropping. In Chomskyan linguistics, the same development was perceived as a move towards deeper explanations. In this section, we will consider two theoretical discussions between proponents of the two research programmes. 9 Section 4.2.4.1 concerns the use of certain data concerning the contraction of want to to wanna in English. Section 4.2.4.2 moves on to the use of a theoretical notion, X-bar theory, relating this discussion to the conflict about Chomsky’s statement that the number of human languages is finite and about the use of formalisation in general.
4.2.4.1 Wanna-contraction
Wanna-contraction is the phonological reduction of the sequence want to to wanna. Lakoff (1970: 632) describes the contrast in (54). (54)
a. Teddy is the man I want to succeed.
b. Teddy is the man I wanna succeed.
c. I want Teddy to succeed.
d. I want to succeed Teddy.
The sentence in (54a) is ambiguous between the two readings (54c-d). By contrast, (54b), which is normally only spoken in this form, can only be interpreted as (54d), not as (54c). In the course of the 1970s, trace theory was developed in Chomskyan linguistics and Lightfoot (1976: 566–570) uses an analysis of the data in (54) to support it. 10 After several refinements, Chomsky (1981a: 180–182) presents the argument on the basis of the examples in (55). (55)
a. They want to visit Paris.
b. They want Bill to visit Paris.
c. Who do they want to visit Paris.
d. Who do they want to visit.
Instead of contrasting ambiguous and unambiguous sentences, (55) contrasts (55a) and (55d), in which want and to can be contracted, to (55b) and (55c), in which contraction is impossible. In (55b) the reason why contraction is excluded is obvious. The two elements want and to are separated by Bill. They are not adjacent. Chomsky argues that the same explanation applies to (55c). He assumes the partially specified S-structures in (56) for the sentences in (55). (56)
a. they want [PRO to visit Paris]
b. they want [Bill to visit Paris]
c. who do they want [ [NP e ] to visit Paris]
d. who do they want [PRO to visit [NP e ] ]
Two types of empty category occur in (56). [NP e ] is the trace of wh-movement. It indicates the position in the embedded clause in which who is interpreted, i.e. subject of visit in (56c) and object in (56d). PRO is the subject of the infinitive. One of the differences between these types of empty category is that [NP e ] has abstract Case in Chomsky’s (1981a) theory, whereas PRO does not. Another difference is that [NP e ] blocks wanna-contraction whereas PRO does not.
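The pattern in (55) and (56) can be made concrete with a small Python sketch. It assumes, as a simplification of the account discussed here, that want and to count as adjacent as long as nothing overt and nothing Case-bearing stands between them. The `Element` class, its fields, and the helper functions are hypothetical devices for the illustration, not part of Chomsky’s (1981a) formalism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    form: str
    overt: bool = True    # pronounced material
    case: bool = False    # bears abstract Case


def lexical_np(form):
    """Lexical NPs are overt and have to bear Case."""
    return Element(form, overt=True, case=True)


def word(form):
    """Any other overt word; Case is irrelevant for it in this sketch."""
    return Element(form)


PRO = Element("PRO", overt=False, case=False)   # empty, no Case
TRACE = Element("e", overt=False, case=True)    # wh-trace [NP e]: empty, but Case-bearing


def can_contract(structure):
    """True if 'want' and 'to' are adjacent in the relevant sense."""
    forms = [el.form for el in structure]
    if "want" not in forms:
        return False
    for el in structure[forms.index("want") + 1:]:
        if el.form == "to":
            return True
        if el.overt or el.case:
            return False      # intervening material breaks adjacency
    return False


# Simplified renderings of the S-structures in (56a-d).
S56A = [lexical_np("they"), word("want"), PRO, word("to"), word("visit"), lexical_np("Paris")]
S56B = [lexical_np("they"), word("want"), lexical_np("Bill"), word("to"), word("visit"), lexical_np("Paris")]
S56C = [lexical_np("who"), word("do"), lexical_np("they"), word("want"), TRACE, word("to"), word("visit"), lexical_np("Paris")]
S56D = [lexical_np("who"), word("do"), lexical_np("they"), word("want"), PRO, word("to"), word("visit"), TRACE]

for name, s in [("56a", S56A), ("56b", S56B), ("56c", S56C), ("56d", S56D)]:
    print(name, can_contract(s))
```

The sketch treats the trace in (56c) exactly like the lexical NP Bill in (56b), which is the point of the argument: both bear Case, whereas PRO does not.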
Chomsky then proposes that if a Case-bearing NP intervenes between want and to, the two are no longer adjacent and wanna-contraction is blocked. All lexical NPs have to bear Case. Therefore, wanna-contraction is blocked in (56c) for the same reason as in (56b). The structure of the argument of wanna-contraction is typical of the type of argumentation in Chomskyan linguistics. It is also a good example of the type of reasoning that proponents of the research programme of GPSG take issue with. The analysis of wanna-contraction was a prominent issue in the period when GPSG emerged, discussed on the pages of Linguistic Inquiry. Postal and Pullum (1978) attacked Lightfoot’s (1976) analysis. Chomsky and Lasnik (1978) reacted to this attack. Pullum and Postal (1979) argued that this reaction was not an adequate defence. Jaeggli (1980) summarises the point from a Chomskyan perspective. At that point, the analysis as presented by Chomsky (1981a) and outlined above had been reached. We join the discussion when Postal and Pullum (1982) attack this position. Postal and Pullum (1982: 123–126) argue that there are at least two types of shortcomings in an account such as the one by Chomsky (1981a). 11 First, they point to what they call ‘liberal dialects’ of speakers for whom contraction is possible in contexts such as (55c). They draw the conclusion in (57). (57)
a. ‘By insisting that the constraints blocking contraction derive from universal grammar, Jaeggli and Chomsky’s accounts falsely claim that the liberal dialects cannot exist.
b. This conclusion holds, of course, unless, as a referee has suggested, it is a language-particular fact that Case interferes with phonology, in which case nothing can be explained by the approach in question.’ [Postal and Pullum (1982: 123)]
The argument in (57) leaves two possibilities for the Chomskyan account of the data in (54): either it is false, as in (57a), or it is patched up and loses its explanatory force, as in (57b). A second shortcoming Postal and Pullum identify is that there are other contexts in which want and to are superficially adjacent but contraction is blocked. They include cases such as (58a), in which ‘uncontracted cases exhibit distracting unacceptabilities; but the failure of contraction is sufficiently glaring that this does not really matter’ (1982: 125), as well as coordination cases such as (58b). In (58a) some structure has been added to facilitate interpretation. (58)
a. I don’t want [[to flagellate oneself in public] to become standard practice in this monastery]
b. I want to dance and to sing.
They conclude this part of their argument with (59).
(59)
a. ‘The data cited in this section illustrate the following points:
b. There are at least six distinct contexts where, in TT terms, the verb want occurs adjacent to infinitival to but where contraction is entirely impossible.
c. No known independent principles suffice to block contraction in any of these contexts.’ [Postal and Pullum (1982: 126), b and c are their (12) a and b]
In (59b), TT stands for trace theory and the ‘six contexts’ are the ones of which (58) reproduces some typical examples. The conclusion in (59c) is apparently based on the observation that the argument for (55) does not carry over to these cases. No other attempts to explain them in terms of ‘known independent principle[s]’ are discussed. Instead, Postal and Pullum (1982: 129–134) formulate a new generalisation. This generalisation covers all contexts they discuss. It is formulated as a characterisation of the set of configurations in which infinitival to can contract with a preceding verb, and formalised in terms of Arc-Pair Grammar, one of the formalisms that shares basic assumptions about the research programme with GPSG. In their reaction, Aoun and Lightfoot (1984) set out to show that (59c) is not only not demonstrated by Postal and Pullum (1982) but also incorrect. Using Postal and Pullum’s generalisation and the requirement that to must be governed by want for contraction to be possible, they show for each context referred to in (59b) that the impossibility of wanna-contraction is actually the consequence of well-known principles in GB-theory. In the final part of their reaction, Aoun and Lightfoot (1984: 470–473) show that the claim Postal and Pullum (1982) attack is not one that was originally made in Chomskyan linguistics. The original point, as formulated by Lightfoot (1976: 569), is that NP movement leaves traces, identified as [NP e ] in (56), and that the difference between PRO and trace has observable consequences. To the extent that these facts cannot be learned, they have to be part of Universal Grammar. They argue for the view on the explanation of facts in (60). (60)
a. ‘it is unreasonable to say that one has no explanation until all facts are described. In order to have an explanation (of greater or lesser depth), one needs to describe the relevant facts.
b. It is important to note that there is no theory-independent way of establishing which facts are relevant.’ [Aoun and Lightfoot (1984: 472)]
The claim in (60a) should be compared to the one in (50). Although they are not in logical contradiction, (50) emphasises the precondition of the description and (60a) the restriction to the relevant facts. An important premiss is (60b), which is entirely in line with current views of the nature of empirical science
and the problems of the empirical cycle as discussed in Section 1.2 above. This approach is applied to Chomskyan linguistics in (61). (61)
‘Giving a full description of some pretheoretically defined range of phenomena is not the primary goal; one investigates phenomena for the light that they cast on properties of the language faculty in its initial and mature states.’ [Aoun and Lightfoot (1984: 471)]
On the basis of (61), Aoun and Lightfoot argue that there is nothing wrong with an argument of the type illustrated with (55) and (56). There is no need to take into account the data referred to in (59b) when making a point about wanna-contraction in (55). In reaction to this, Postal and Pullum (1986) make three points and Lightfoot (1986) answers them. The first point concerns a technical quibble about the definition of government. Postal and Pullum (1986: 105f.) claim that the definition used by Aoun and Lightfoot (1984) is equivalent to one that results in want never governing to. Lightfoot (1986: 111) claims that this equivalence does not hold. Another point concerns the choice of references. Postal and Pullum (1986: 108f.) complain that certain references are not used by Aoun and Lightfoot (1984). Lightfoot (1986: 112) answers that some of them were not necessary for the points they made and others had not been published in 1982, when the reply was written. The point most relevant to our concerns here is a reiteration of the point made in (57), highlighting the existence of ‘liberal dialects’ that allow contraction in contexts such as (55c). Postal and Pullum (1986: 106–108) complain that Aoun and Lightfoot (1984) do not reply to this point, which they reinforce by mentioning more cases of speakers of such a liberal dialect. Lightfoot’s reply is summarised in (62). (62)
a. ‘nothing can be concluded until we know what else might be going on in these “liberal dialects”. […]
b. an analysis is needed rather than a list of speakers.’ [Lightfoot (1986: 112)]
Here we see another instance of the basic opposition between formal and empirical science. In formal science, a counterexample destroys a proof. In empirical science, counterexamples cannot refute a theory. A theory can only be replaced by another one giving a better (deeper or more inclusive) explanation. A list of speakers of liberal dialects may improve the quality of the data, but as data cannot refute a theory, it hardly reinforces the case against the Chomskyan analysis of wanna-contraction. Only if an analysis of the data gives rise to a better theory can the original theory be
threatened. However, what makes a theory ‘better’ is inextricably tied up with the research programme. From the fragment of the discussion of wanna-contraction we analysed here, it is clear that from the perspective of Chomskyan linguistics, the criticism by Postal and Pullum is not pertinent. From the perspective of the formalist research programme of GPSG and Arc-Pair Grammar, however, the defence by Aoun and Lightfoot simply misses the point. The two sides in the debate use different criteria to judge arguments so that they cannot agree on the validity of some of each other’s statements.
4.2.4.2 X-bar theory
We have encountered X-bar theory in the context of Chomskyan linguistics in Section 2.5.1 and in the context of LFG in Section 4.1.5.3. GPSG also adopts X-bar theory, but not quite in the same way as Chomskyan linguistics. In fact, proponents of GPSG strongly object to the type of statements Chomskyan linguists make in relation to X-bar theory. Two examples of such statements are in (63). (63)
a. ‘The rules of the categorial component meet some variety of X-bar theory.’ [Chomsky (1981a: 5)]
b. ‘X-bar theory permits only a finite class of possible base systems.’ [Chomsky (1981a: 11)]
As discussed in Section 2.5.1, X-bar theory was introduced in Chomskyan linguistics as part of the transition from Standard Theory to Government and Binding Theory. Several versions of it were proposed and (63a) refers to these varieties as constraining the phrase structure rules which determine D-structure. This statement is part of the introduction to the main exposition of GB-theory and appears in a general overview of the structure of the theory of universal grammar. In the same introduction, (63b) is part of an argument that the theory of UG proposed in the book restricts the degree of choice sufficiently to explain that languages are learnable (cf. also (67) in Section 2.4.3 and (84) and (89) in Section 2.5.1.2). Pullum (1985) attacks the sloppiness of the formulations in (63), as well as the lack of formal argumentation to support the claims. He criticises (63a) in (64). (64)
a. ‘The assertions in (1) are phrased as if they state empirical assumptions,
b. but it is striking that the works from which they are taken do not proceed to give any hint as to which version of the alleged X-bar theory they assume.’ [Pullum (1985: 323)]
The ‘assertions’ referred to in (64a) include (63a) as (1d), following three similar statements from other publications by Chomsky. Working from the research programme in Figure 4.2, Pullum considers X-bar theory as a candidate for the grammatical framework. This means that ‘empirical assumptions’ as referred to in (64a) can only take the form of constraints on the formulation of grammars, constraints corresponding to universals holding for the set of possible languages. From this perspective, (63a) is indeed impossibly vague. The observation in (64b) suggests, however, that Chomsky (1981a) is not primarily interested in this type of interpretation of X-bar theory. Pullum’s attack on (63b) is even more acrimonious. Some of the main statements are given in (65). (65)
a. ‘The kindest thing that can be said about this flagrant falsehood is that it should not have been published. […]
b. even under the most restrictive version of X-bar theory that can be defined in terms of the defining conditions that have occasionally been sketched in the literature (reviewed in Section 1 below), no result limiting the class of X-bar grammars to finite cardinality can be proved. […]
c. Chomsky’s eccentric pronouncement could only have been accepted and repeated by linguists whose vision of X-bar theory owed more to faith than to formalization.’ [Pullum (1985: 323)]
What is immediately striking in (65) is the aggressive tone of the discussion, which is typical of much of the criticism of Chomskyan linguistics originating from GPSG. 12 The verdict in (65a) that (63b) is a ‘flagrant falsehood’ is understandable if it is taken as a theorem to be included in the grammatical framework in Figure 4.2. This interpretation is indeed adopted throughout Pullum (1985), which consists for a large part of the discussion announced in (65b). In this discussion (1985: 325–346), constraints on phrase structure rules proposed in the literature as evolving from X-bar theory are formalised mathematically and shown to have no significant impact on the set of languages that can be generated. What is implied in (65c) is that anyone adopting a research programme in which a statement such as (63b) makes sense does so for other than sound reasons. Of course, to the extent that the choice of a research programme cannot be justified by rational considerations only, this is correct. Pullum (1985: 346–350) proposes the version of X-bar theory adopted in GPSG as a more reasonable alternative. In order to understand (63b), it is important to consider its original context. In the introduction to a major theoretical work, Chomsky explains how the theoretical discussion of the subsequent chapters is embedded in the research programme of Figure 2.7. The focus of this research programme is learnability,
which requires the restriction of the number of possible grammars. The argument leads to (66). (66)
a. ‘If these assumptions are correct, then UG will make available only a finite class of possible core grammars, in principle.
b. That is, UG will provide a finite set of parameters, each with a finite number of values,
c. apart from the trivial matter of the morpheme or word list, which must surely be learned by direct exposure for the most part.’ [Chomsky (1981a: 11)]
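The arithmetic behind (66a-b) is simple and can be spelled out in a short Python sketch: with a finite set of parameters, each with finitely many values, the number of core grammars is the product of the numbers of values. The parameter names and values below are purely hypothetical placeholders, not a proposal about the content of UG, and the lexicon is left out of consideration, as in (66c).

```python
from itertools import product
from math import prod

# Hypothetical parameters, each with a finite set of values.
PARAMETERS = {
    "head_direction": ("head-initial", "head-final"),
    "null_subject": (True, False),
    "wh_movement": ("overt", "covert"),
}


def core_grammars(parameters):
    """Enumerate every combination of parameter settings."""
    names = list(parameters)
    for values in product(*(parameters[name] for name in names)):
        yield dict(zip(names, values))


# The size of the class is the product of the numbers of values.
count = prod(len(values) for values in PARAMETERS.values())
grammars = list(core_grammars(PARAMETERS))

print(count, len(grammars))
```

The enumeration terminates only because both the set of parameters and each set of values are finite; nothing in the sketch bears on whether a given formal definition of X-bar theory actually has this property.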
In (66a) ‘these assumptions’ refers to (63b) and a number of similar statements. They can be taken as programmatic in the sense that they sketch the form a full theory of UG should take. What (63b) intends to say is that the mental equivalent to X-bar theory in the language faculty must have the property of allowing only parameters as referred to in (66b), so that the theory of UG must aim to describe it in these terms. Within the context of Chomskyan linguistics this is a plausible statement. The ‘faith’ referred to in (65c) is actually nothing else than adopting a particular, coherent research programme in empirical science. The opposition between Chomskyan linguistics and GPSG on the issue of how to constrain phrase structure rules surfaced prominently at a discussion organised by the Royal Society of London and the British Academy on 11 and 12 March 1981. Chomsky (1981c) presented X-bar theory as a way to restrict the number of choices open to the child in language acquisition to a finite number, along the lines of the argument from which (66) is taken. Gazdar (1981b) presented the slash feature explained at the start of Section 4.2.2 as a way to reduce the generative capacity of the formalism to context-free grammars. In the discussion of the latter Chomsky criticises Gazdar’s proposal as in (67). (67)
a. ‘This is simply a needlessly complex variant of T2, the latter supplemented by a class of superfluous devices (derived categories, […]
b. Therefore, we can replace this theory by the far simpler variant T2, in which case it becomes (in a loose sense) a notational variant of t.g.
c. This of course assumes that Gazdar’s theory is supplemented with constraints to exclude the vast class of c.f.gs that cannot serve as possible grammars for human languages, presumably reducing it to finite generative capacity.’ [Chomsky in Gazdar (1981b: 280)]
In (67a-b), T2 refers to one of the versions of trace theory Chomsky considers, the one in which traces are base-generated and coindexed with their antecedents (as opposed to resulting from movement). The ‘derived categories’ in
(67a) are the categories formed with the slash feature. In (67b), ‘t.g.’ refers to Transformational Grammar. As shown in (67c), Chomsky rejects Gazdar’s claim that he has simplified grammar by introducing the slash feature. The reason is that for CFGs, the indeterminacy as discussed in Section 2.3.3 arises. Without further restrictions the phenomenon of language acquisition cannot be explained. Gazdar reacts to (67) in (68). (68)
‘When one realizes that the syntactic theory that Chomsky has been developing over the last ten years has embraced phrase structure rules, complex symbols, a level of S-structure, a level of D-structure, a level of ‘Logical Form’, filters, transformations, interpretive rules, stylistic rules, coindexing conventions, and abstract cases, among other things, it is a little surprising to hear him castigating as ‘needlessly complex’ an alternative syntactic theory that employs only phrase structure rules, complex symbols and a level of surface structure.’ [Gazdar (1981b: 281)]
In reacting to (67), Gazdar ignores (67c) and concentrates on the issue of simplicity. There is no disagreement about the fact that Chomsky has proposed and/or used all of the concepts Gazdar mentions in (68). However, Chomsky and Gazdar use different measures of simplicity. Gazdar concentrates on simplicity of the formalism used to describe languages. Chomsky uses ‘simple’ in relation to the learnability problem. To the extent that the devices in (68) are universal, they do not count as complex for Chomsky. On the other hand, whereas Gazdar considers a grammar as a descriptive device of the (E-)language, Chomsky requires it to be learnable as an I-language. Therefore Chomsky also counts the devices needed for (67c), but Gazdar does not. The issue of finiteness in (66) and (67c) is discussed separately by Pullum (1983). The design of the argument is explained in (69). (69)
a. ‘Let us say that if a given linguistic theory supports a proof that it defines only a finite class of grammars, that theory has a finiteness result.’ [Pullum (1983: 447)]
b. ‘Chomsky has hinted at or explicitly espoused the position that his current version of transformational-generative (TG) grammar has a finiteness result in a number of places […]
c. I shall show that the claims he makes there cannot be substantiated.’ [Pullum (1983: 448)]
The formulation of (69a) suggests that linguistics is taken as a formal science concerned with the proof of theorems. Therefore, (66), which is in the list of places following (69b), is interpreted as a claim in formal grammar rather than as the sketch of the orientation of GB-theory. As an example of (69c) consider the extracts of the discussion of the lexicon in (70).
(70)
a. ‘Here Chomsky admits that the lexicon falsifies the finiteness thesis. […]
b. claims which are false for a trivial reason count nonetheless as false when we are proving theorems.’ [Pullum (1983: 455)]
In (70a), ‘Here’ refers to a statement to the same effect and on the same page as (66c). The observation in (70a) gives rise to the comment in (70b). This comment further illustrates that Pullum is involved in an exercise of ‘proving theorems’, which is the correct thing to do in the research programme in Figure 4.2, whereas Chomsky treats linguistics as an empirical science, in which theorem proving is not the major concern. His purpose is to show that the approach in GB-theory is compatible with the learnability of I-languages. It is not surprising, then, that few if any proponents of Chomskyan linguistics felt the immediate need to react to Pullum (1983, 1985). Pullum observes that his (1983) argument ‘has gone unanswered (and, indeed, unmentioned) in the subsequent literature, there being apparently no defense anyone wants to make’ (1985: 324). According to Gazdar (2001), this was ‘one of the last straws’ which made him so disappointed in linguistics as a field that he turned his attention to computational linguistics instead. In what may be an indirect reaction to Pullum (1985), however, Chomsky (1986b: 2–4) gives an overview of X-bar theory in which he discusses the type of finiteness involved in the application of (66) to the X-bar theory in a footnote. 13 In (71) some extracts are given. (71)
a. ‘For a particular core language L, the X-bar system is determined by fixing the values of the parameters of X-bar theory (head-first etc.) in accordance with whatever dependencies among them are determined by UG; a particular set of choices constitutes the X-bar component of the grammar of L. Since there are no known reasons to suspect that there are more than a few such parameters, the number of base (X-bar) systems for core grammar appears to be not only finite but in fact small. […]
b. Of course, from the fact that the number of possible X-bar systems is finite, indeed small, it does not follow that the number of possible phrase structure systems (including lexicon and marked periphery) is finite, but these possibilities are of slight interest. See Chomsky 1981: 11.
c. There remain open questions of X-bar theory […]. In the absence of compelling evidence to resolve these empirical questions, there is no point in specifying one or another of the possible options in detail; in particular further formalization is pointless, since there are no theorems of any interest to be proved or hidden assumptions to be teased out in these systems. The interesting questions have to do with issues of fact.’ [Chomsky (1986b: 91, fn. 3)]
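The counting argument in (71a) can be illustrated with a small sketch. It assumes, purely for illustration, three binary parameters and one dependency among them; only ‘head-first’ is taken from the quotation, and all other names and the dependency itself are hypothetical. The point is merely that a fixed, small inventory of parameters yields a finite and small set of settings, which dependencies can only reduce.

```python
from itertools import product

# Hypothetical parameters and values, for illustration only; the source
# mentions 'head-first' but gives no inventory of X-bar parameters.
PARAMETERS = {
    "head": ("head-first", "head-last"),
    "specifier": ("spec-first", "spec-last"),
    "adjunct": ("adjunct-left", "adjunct-right"),
}


def xbar_systems(parameters, allowed=lambda setting: True):
    """Yield every parameter setting that satisfies the dependency `allowed`."""
    names = list(parameters)
    for values in product(*(parameters[name] for name in names)):
        setting = dict(zip(names, values))
        if allowed(setting):
            yield setting


def harmonic(setting):
    # Hypothetical dependency: head-last systems place adjuncts on the left.
    return setting["head"] == "head-first" or setting["adjunct"] == "adjunct-left"


unrestricted = list(xbar_systems(PARAMETERS))
restricted = list(xbar_systems(PARAMETERS, harmonic))
print(len(unrestricted), len(restricted))
```

The sketch covers only the parameter settings. It says nothing about the lexicon or the marked periphery, for which no such bound follows.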
Chomsky starts in (71a) by spelling out exactly what is meant by the statement that the number of options in the X-bar theory is finite. In (71b) he draws attention to the fact that what used to be the set of phrase structure rules is replaced not by X-bar theory alone, but by X-bar theory in interaction with other components. In the lexicon it is specified, for instance, which verbs have what types of complements. The contrast between (71a) and (71b) corresponds exactly to the difference in simplicity measure used by Chomsky in (67) and by Gazdar in (68). In (71c), the limited role of formalisation in the discussion of X-bar theory is explained. This explanation refers implicitly to the preface of Chomsky (1957), where we find (72). (72)
‘Precisely constructed models for linguistic structure can play an important role, both negative and positive, in the process of discovery itself. By pushing a precise but inadequate formulation to an unacceptable conclusion, we can often expose the exact source of this inadequacy and, consequently, gain a deeper understanding of the linguistic data. More positively, a formalized theory may automatically provide solutions for many problems other than those for which it was explicitly designed.’ [Chomsky (1957: 5)]
The cases in which formalisation can make a positive or negative contribution to theoretical progress are stated in (71c) to be absent or unlikely to emerge in the context of X-bar theory. Moreover, the price of formalisation is that certain aspects of the theory have to be specified even if there is no reason to specify them one way or another. Therefore, (71c) concludes that ‘there is no point’ in doing so. In fact, Pullum (1989) uses (72) in his argument that the ‘early Chomsky’ was in favour of formalisation but the ‘later Chomsky’ rejects it. He contrasts (72) to such statements as (63) and places the point where Chomsky started attacking formalisation at ‘about 1979’ (1989: 139). These attacks on formalisation are exemplified in (67) and (71), to which (73) gives Pullum’s comments. (73)
a. ‘Five years later, worse than ever, we find Chomsky going so far as to say (about X-bar systems) that
b. “there is no point in specifying one or another of the possible options in detail”; in particular further formalization is pointless, since there are no theorems of any interest to be proved or hidden assumptions to be teased out in these systems” […]
c. Chomsky now flatly rejects his 1957 position, in other words.’ [Pullum (1989: 140–141), original non-matching quotes in b]
In (73a), the time reference is to the period from the discussion between Chomsky and Gazdar at the Royal Society to the publication of Chomsky (1986b). Part of (71c) is quoted in (73b), but it almost seems as if Pullum is trying to make the quote as unreasonable as possible by leaving out the context explaining why formalisation is in this case not desirable according to Chomsky. The conclusion in (73c) is not logically compelling, however, because in (72) Chomsky only argues that formalisation can have positive effects, not that all positive effects have to derive from formalisation. In his reaction, Chomsky (1990) is concerned to show that his basic position towards formalisation has not changed since (72). He calls (73b) a ‘truism’ (1990: 146) and explains his position in (74). (74)
‘To formalize one or another version is a straightforward exercise, but apparently of no more value than Woodger’s, because it would require decisions that are arbitrary; not enough is understood to make them on principled grounds. The serious problem is to learn more, not to formalize what is known and make unmotivated moves into the unknown.’ [Chomsky (1990: 146–147)]
Joseph Henry Woodger (1894–1981) wrote a book entitled The Axiomatic Method in Biology. Published by Cambridge University Press in 1937, it is now mainly referred to for historical reasons, e.g. by Cain (2000). Chomsky refers to his work as an example of formalisation in biology, ‘forgotten, because empirical consequences were lacking’ (1990: 146). The compatibility of (74) with (72) should be obvious.
4.2.5 Conclusion
The opposition between GPSG and Chomskyan linguistics is much more fundamental than the one between LFG and Chomskyan linguistics. In the latter case, the discussion in Section 4.1 identified a common core of the research programmes, which can be identified roughly with Figure 2.3, extended in different ways. This corresponds to a difference in evaluation criteria for the explanation theories should give. In the case of GPSG, no common core with Chomskyan linguistics can be found. The conflict is not about the type of explanation that is to be prioritised, but about whether formal description or explanation is more important. By (48), Gazdar et al. (1985) abandon the link with language as a mental phenomenon. They do not identify any other way language might be realised as an object in the world. This means that the data of linguistics are no longer empirical data. GPSG linguistics is not an empirical science. This approach to
language is shared by a number of other grammatical frameworks, including Montague Grammar and Arc Pair Grammar.
Despite this glaring rift between GPSG and Chomskyan linguistics, they can still be seen as having a common origin. In Section 4.1.6 we saw that LFG and GB-theory can be seen as different reactions to the shortcomings of the theory proposed in Chomskyan linguistics in the 1960s. GPSG is yet another type of reaction. Whereas Chomsky proposes to look for a deeper explanation, leading to GB-theory, GPSG argues for a more rigorous formalisation.
As indicated in Section 4.2.4, the discussions between proponents of GPSG and of Chomskyan linguistics are marked by very strong incommensurability effects. The two arguments described are remarkable for their duration and the acrimonious character of the exchanges. Both sides have a mission, but the two missions are incompatible. Geoffrey Pullum in particular insists on the deficiency in descriptive adequacy of Chomskyan analyses that fail to take into account all data or lack formal precision. In his view of linguistics as a formal science, descriptive adequacy is a pre-condition for explanatory adequacy. In empirical science as pursued in Chomskyan linguistics, it is impossible to reach descriptive adequacy without explanatory adequacy. The only way to achieve adequacy is at both levels at the same time. This is a consequence of the tension between the two as discussed in Section 2.4.3.
SUMMARY
• GPSG emerged in response to a perceived crisis in Chomskyan linguistics concerning the standard of formalisation of grammars.
• The emphasis on formalisation leads to the complete abandoning of references to psychological notions.
• Language is seen as an abstract object. Linguistics becomes a formal science in which intuitions can only be used in the same sense as in mathematics to motivate axioms.
• The idea of linguistics as a formal science is shared with, for instance, Montague Grammar and Arc Pair Grammar.
• GPSG and Chomskyan linguistics share the idea of a grammar describing a language and containing rewrite rules.
• GPSG and Chomskyan linguistics diverge on the role of explanation and formalisation in linguistics, so that their evaluation of post-1960s theories often conflicts and gives rise to serious incommensurability effects.
4.3 Head-driven Phrase Structure Grammar
Whereas Sells (1985) treats GB-theory, LFG and GPSG as the main frameworks in linguistic theory, a similar selection in the mid or late 1990s would definitely include Head-driven Phrase Structure Grammar (HPSG) instead of GPSG. 14 Thus, along with LFG, HPSG has had regular annual conferences dedicated specifically to work in that framework since the mid 1990s. HPSG is often seen as the successor of GPSG. Apart from the similarity in name, there is also an overlap in the people involved, because the foundational works were written by Ivan Sag and Carl Pollard. It should be noted, however, that the other three main representatives of GPSG were not actively involved. 15 According to Gazdar (2001), ‘We went our separate ways’ after the publication of Gazdar et al. (1985) and ‘We thought that HPSG was the successor to GPSG and that Ivan was in charge of it’. The first major overview work of HPSG was Pollard and Sag (1987). This is a rather short book (218 pages of text), whose title mentioned that it was ‘Volume 1’. The introduction contains two brief sections discussing the nature of language and linguistic theory assumed (1987: 1–10). Instead of a companion volume, Pollard and Sag (1994) is a ‘logically self-contained’ volume in which the authors have ‘revised a number of technical assumptions’ of the earlier work (1994: ix). In this book, the discussion of fundamental notions that would appear in the research programme is limited largely to the introduction (1994: 1–14) and the conclusion of Chapter 1 (1994: 57–59). Some further remarks can be found in the Sag and Wasow (1999) textbook. The main purpose of this section is to determine the nature of the research programme of HPSG. In Section 4.3.1, the background of its emergence is sketched, with a discussion of the question whether there was a crisis likely to give rise to a new research programme. 
Section 4.3.2 then attempts to analyse and interpret the remarks found in the three works mentioned above in order to find out about the nature of the HPSG research programme. Section 4.3.3 compares the result to the research programmes discussed previously and Section 4.3.4 gives an overview of the interaction with these other research programmes.
4.3.1 The crisis: a ‘meta-crisis’?
In the description of the crises giving rise to the emergence of LFG and GPSG it was clear that the problems taken to be crucial by the people who founded the new research programmes were not considered as real problems from within Chomskyan linguistics. Although critics of Chomskyan linguistics
considered their arguments that Chomsky had been wrong on certain points valid and convincing and his reply insufficient, this did not have any real effect on the overall course of the field. Newmeyer (1986a: 223) estimates that ‘In the United States today, somewhat more than two-thirds of the generative syntactic analyses published adopt the GB framework’ and even more abroad. Since incommensurability between research programmes makes conclusive arguments impossible and the assumptions of a research programme are often passed on from one generation of scientists to the next, a number of linguists felt that the field was getting into a sort of ‘meta-crisis’. They were frustrated by the disregard in the dominant research programme of problems which, in their view, ought to cause a crisis. This feeling is expressed by Gazdar (2001) as ‘The field has been damaged by the presence of a charismatic leader who has led it badly’ and for him it was a reason to give up on linguistics and turn to computational linguistics instead. As noted in previous sections, the issues separating LFG and GPSG from Chomskyan linguistics are formalisation and processing. A context where these issues are unavoidable is computational linguistics. It is no surprise, then, that in the 1980s LFG and GPSG were more popular in computational linguistics than various theories developed in Chomskyan linguistics. As Sag (1993) explains in an interview, HPSG emerged in a natural language processing project at Hewlett Packard and ‘it has moved to a very different formal basis’ compared to GPSG. Perhaps the most striking difference is the extensive use of unification of feature structures based on information encoded in the lexicon, instead of the generation of tree structures by means of rewrite rules. Pollard and Sag (1987) highlight the advantages of this approach in (75). (75)
‘One of the most significant contributions of unification-based linguistics has been the development of a common, mathematically precise formalism within which a wide range of distinct theories (and differing hypotheses about a given phenomenon in the context of a fixed theory) can be explicitly constructed and meaningfully compared.’ [Pollard and Sag (1987: 10)]
In the interpretation of (75) it is important to understand the intended ‘range of distinct theories’ correctly. Pollard and Sag state that ‘very considerable portions of most current linguistic theories can be formalized’ by the tools they propose, including ‘numerous variant theories from research traditions as diverse as […] GPSG, LFG, […] and – most recently – GB’ (1987: 10). Given that these three theoretical frameworks belong to different research programmes, it is difficult to see how they can be ‘meaningfully compared’, as (75) claims, without imposing the evaluation criteria developed in one research programme on a theory proposed in another. Pollard and Sag indicate
the nature of the criteria by complaining about the ‘Inadequate formalization of some theories’ (1987: 10), a complaint reminiscent of (43) and (44) stated by Gazdar et al. (1985) in the context of GPSG. While (75) only indicates a continuation of the efforts of GPSG with a slightly different formalism, a more inclusive goal is indicated in (76). (76)
a. ‘A pleasant and important consequence of this highly favorable state of affairs is
b. the rapid obsolescence of a certain authoritarianism in the sociology of the field, which has dictated that one’s investigations are to be conducted in the “right” framework and that one’s fruitful collegial interactions are to be confined to devotees of the “right” research tradition. […]
c. With the development of an expressive and formally precise lingua franca, essentially the full range of current theories can be composed, decomposed, compared, recombined, and generally tinkered with, in a manner constrained only by the individual researcher’s aesthetic sense, philosophical predispositions, and responsibility to get the facts right.’ [Pollard and Sag (1987: 10f.)]
It is obvious in (76) that the crisis at this point has given way to optimism, which (76a) (over-)emphasises. The phenomenon described in (76b) refers to some of the typical social aspects of Kuhnian paradigms. As discussed in Section 1.4.2, there is a difference between paradigms in Kuhn’s sense and research programmes as intended here, because the former are based on actual processes observed in scientific communities whereas the latter are restricted to the necessary shared assumptions to make the empirical cycle work. No doubt the ‘authoritarianism’ in (76b) refers to the same problem as Gazdar’s (2001) quote at the start of this section. In an interview, Sag (1993) draws a parallel between Chomskyan linguistics and the Catholic Church. The elaboration in (76c), however, tends to exaggerate the freedom resulting from taking away this aspect. The incommensurability of different Kuhnian paradigms, discussed in Section 1.2.2, is not caused by their social aspects. It applies equally to research programmes, giving rise to the kind of effect we observed in Sections 3.2.3, 4.1.5 and 4.2.4. If we take the concept of incommensurability seriously, it is obvious that no ‘lingua franca’ in the sense of (76c) can exist. The question remains, then, how (76c) can be interpreted. I can think of four, not necessarily mutually exclusive considerations. First, the emphasis on ‘formally precise’ and ‘get the facts right’ points to a concern with formalisation shared with GPSG. Second, the freedom of combining elements from different theories shows a concern to overcome the divisions between research
programmes. This is of course laudable in principle, but the result should be a theory embedded in a well-founded research programme. The simplest way to achieve this is to start from such a research programme. Third, the same freedom is extended to the ‘philosophical predispositions’, which would have to include the research programme itself. If each researcher developed their own research programme, however, this would be highly counterproductive. Too much work would be spent on philosophical foundations and incommensurability would reach as yet unknown heights. Finally, we can see (76c) as an expression of general discontent with the fact that scientific research is necessarily embedded in a research programme with the implications this has for communication and social organisation in science. It is in particular this last factor in the interpretation of (76c) which indicates that the field was perceived as being in a state of crisis. The crisis as described here can be termed a ‘meta-crisis’ because it is not caused by data (as in the case of LFG) or standards (as in the case of GPSG), but by the way science is organised.
4.3.2 A new research programme?
The main source of information about the research programme of HPSG is Pollard and Sag’s (1994) introduction. Here they draw an analogy between the organisation of astronomy as an empirical science and the organisation of HPSG. The basic structure of an empirical science they assume is presented in Figure 4.3.
Figure 4.3: The HPSG model of science (diagram relating ‘phenomena’, ‘model’ and ‘theory’ through the links ‘modelling’, ‘model-theoretic interpretation’ and ‘prediction’)
Pollard and Sag explain the model of science in Figure 4.3 on the basis of astronomy before applying it to linguistics. In astronomy, an example of a phenomenon is the solar system. As a generalisation they propose ‘possible motions of n-body systems’ (1994: 7). Phenomena are formally described in a ‘model’. The theory in Pollard and Sag’s sense must be a mathematical characterisation of the structures used in the formal descriptions of the ‘model’
in Figure 4.3. In astronomy, the ‘theory’ includes particular equations as axioms as well as first-order predicate logic and set theory for drawing inferences. The ‘model’ is then a vector system derived from this theory, which characterises the different forces that are active in the system of star(s) and planet(s). In linguistics, Pollard and Sag (1994: 9) give the following analogues. The formal theory is a formal grammar and other mechanisms used to generate representations. These representations constitute the model. In HPSG they are sorted feature structures. In other theories they might be tree structures. The phenomena in linguistics are ‘types of possible linguistic entities’ (1994: 9). It is not straightforward to relate Figure 4.3 to the empirical cycle as discussed in Chapter 1. For ‘phenomena’ in Figure 4.3 the correspondence is ambiguous between ‘observations’ in Figure 1.3 and the real-world phenomena that are observed. In empirical science, these two have to be distinguished because observations are structured in a way the world is not. It is not possible to work with real-world entities without observing them, but in observing them we select what to pay attention to, what to group together as an entity, etc. The identification of ‘phenomena’ in Figure 4.3 with ‘types of possible linguistic entities’ highlights a further ambiguity. The phenomena are not just linguistic entities as they occur, but possible linguistic entities. This is what corresponds in the real world to the theory in Figure 1.3, i.e. the system underlying the facts that can be observed. For the ‘model’ in Figure 4.3 there is no equivalent in Figure 1.3. There is a good reason for this absence. The way model is used in Figure 4.3 means that it is implied by the theory. Therefore, there is no need to introduce this extra element, which plays no independent role. 
As discussed in Section 4.2.4.2, formalisation in terms of an axiomatic system is not a necessary requirement for a theory in empirical science. Thus, ‘theory’ in Figure 1.3 corresponds to the combination of ‘theory’ and ‘model’ in Figure 4.3. In order to see the effect of taking Figure 4.3 as a basis for the description of the way theories are organised, it is interesting to see how the representation of linguistics in Figure 2.3 would translate into this format. As we have seen in Section 4.1.4, Figure 2.3 represents the common part of the research programme of Chomskyan linguistics and LFG. Although a grammar in the sense of Figure 2.3 is a theory, ‘theory’ in Figure 4.3 covers only part of it. Grammar in Figure 2.3 corresponds to the combination of theory and model in Figure 4.3. By contrast, phenomena in Figure 4.3 corresponds to the entire rest of Figure 2.3, i.e. competence, facts, and observations. We can thus summarise the difference in approach as follows. In the way research programmes have been represented so far, we have concentrated on the empirical cycle and the way the theoretical entities correspond to the real world. This results in the analysis of phenomena in Figure 4.3 into (at
least) three entities. It is the interaction between these entities that makes explanation possible. The HPSG approach collapses these three entities into one, but analyses the theory so as to highlight the formalisation in terms of an axiomatic system. Pollard and Sag (1994) describe the descriptive goals of linguistics as in (77). (77)
a. ‘Indeed, we take it to be the central goal of linguistic theory to characterize what it is that every linguistically mature human being knows by virtue of being a linguistic creature, namely universal grammar.
b. And a theory of a particular language – a grammar – characterizes what linguistic knowledge (beyond universal grammar) is shared by the community of speakers of that language. […]
c. But what does language consist of? […] what is known in common, that makes communication possible, is the system of linguistic types.’ [Pollard and Sag (1994: 14)]
In (77), language is described as shared knowledge. Rather than the individual, it is the community that determines the language. The distinction between universal grammar in (77a) and particular grammars in (77b) is that the former is shared by a more inclusive community than the latter. In (77c) the nature of the knowledge is described in formalistic terms. Linguistic types are opposed to linguistic tokens. This way of describing language allows Pollard and Sag to maintain a neutral position as to mentalism, something they had proposed earlier in (78). (78)
a. ‘It seems pointless to prolong the debate about whether the language is the system of situation types that conform to the conventions, or the system of shared knowledge by virtue of which the conventions can be used and transmitted.’ [Pollard and Sag (1987: 5)]
b. ‘Fortunately it does not seem necessary to settle this question in order to have a workable linguistic theory. Much as a physicist can go about his or her business without being clear about the philosophical status of theoretical constructs in quantum mechanics, so linguists can theorize about signs without knowing for sure whether they are [in] the mind, out of the mind, or somewhere in between.’ [Pollard and Sag (1987: 6), preposition added]
In (78a) the ‘situation types’ refer to the theory of situation semantics as developed by Barwise and Perry (1983). In that system, the language is an object external to the mind, of which a speaker has (partial) knowledge. How to understand ‘shared knowledge’ in a mentalist way is not immediately
obvious. If knowledge is a mental entity it cannot be shared. If the object of knowledge is abstracted from similarities between different speakers, it is strictly speaking no longer a mental entity. Characteristic of the HPSG attitude is (78b), which invites linguists to develop and use theories rather than to think about their nature. Pollard and Sag (1994: 14) reiterate this point, expressing the mentalist and non-mentalist views in terms of the status of the ‘linguistic types’ in (77c). The way grammars describe language is formulated in (79). (79)
‘Modelling types of conceivable linguistic entities as rooted labelled graphs of a special kind – totally well-typed, sort-resolved feature structures – we formulate universal grammar and grammars of particular languages as a system of constraints on those feature structures.’ [Pollard and Sag (1994: 57f.)]
The HPSG representation makes use of so-called ‘complex’ feature structures. The basic unit of a feature structure is the attribute-value pair, e.g. [number = singular]. A value can itself be a feature structure. A Latin adjective like bonus (‘good’) agrees with the noun it modifies in number, gender, and case. This can be expressed by an attribute agreement, as in Figure 4.4. 16
[β AGR [α NUM   SING
          GEN   MASC
          CASE  NOM ]]
Figure 4.4: The agreement feature of Latin bonus
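As a rough illustration, the structure in Figure 4.4 can be written as nested attribute-value pairs, and agreement can be modelled as unification of two such structures. The sketch below is a deliberately simplified, untyped version: it has no sorts and no structure sharing, so it is not the HPSG formalism itself. The function name `unify` and the noun entries are hypothetical and serve only as examples.

```python
def unify(a, b):
    """Return the unification of two feature structures, or None on a clash.

    A feature structure is either an atomic value (a string) or a dict
    mapping attribute names to feature structures.
    """
    # The empty structure carries no information and unifies with anything.
    if a == {}:
        return b
    if b == {}:
        return a
    if isinstance(a, str) or isinstance(b, str):
        # Atomic values unify only with an identical atomic value.
        return a if a == b else None
    result = dict(a)
    for attr, value in b.items():
        if attr in result:
            merged = unify(result[attr], value)
            if merged is None:
                return None  # conflicting values for the same attribute
            result[attr] = merged
        else:
            result[attr] = value
    return result


# The structure of Figure 4.4: the value of AGR is itself a feature structure.
bonus = {"AGR": {"NUM": "SING", "GEN": "MASC", "CASE": "NOM"}}

# Hypothetical noun entries, for illustration only.
amicus = {"AGR": {"NUM": "SING", "GEN": "MASC", "CASE": "NOM"}}  # 'friend'
puellae = {"AGR": {"NUM": "PLUR", "GEN": "FEM", "CASE": "NOM"}}  # 'girls'

print(unify(bonus, amicus) is not None)   # the agreement values are compatible
print(unify(bonus, puellae) is not None)  # NUM and GEN clash
```

Unification succeeds when the two structures carry compatible information and fails as soon as one attribute has conflicting values, which is how a mismatch in number, gender, or case would be excluded.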
In Figure 4.4, α is a feature structure with three features. At the same time, α is the value of AGR. In HPSG, β is again the value of a higher attribute. The top level features are PHON and SYNSEM, representing the phonological and the syntactic/semantic properties of the word, phrase, or sentence described by the feature structure. The conditions imposed on feature structures in (79) are explained in (80). (80)
a. ‘A feature structure is totally well-typed in case it is well-typed, and, moreover, for each node, every feature that is appropriate for the sort assigned to that node is actually present. […]
b. A feature structure is sort-resolved provided every node is assigned a sort label that is maximal (i.e. most specific) in the sort ordering.’ [Pollard and Sag (1994: 18)]
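The two conditions in (80) can be made concrete with a small sketch. The sort hierarchy, the appropriateness table, and all names below are hypothetical and far smaller than anything a real HPSG grammar would use. The sketch shows only the difference between leaving a feature out, which violates total well-typedness, and giving it a non-maximal value, which violates sort-resolvedness.

```python
from dataclasses import dataclass, field

# Hypothetical sort hierarchy: each sort maps to its immediate subsorts.
SUBSORTS = {
    "agr": [],
    "number": ["sing", "plur"],
    "gender": ["masc", "fem"],
    "sing": [], "plur": [], "masc": [], "fem": [],
}

# Hypothetical appropriateness conditions: for each sort, the features it
# must carry and the most general sort allowed as the value of each feature.
APPROPRIATE = {"agr": {"NUM": "number", "GEN": "gender"}}


@dataclass
class Node:
    sort: str
    features: dict = field(default_factory=dict)


def nodes(fs):
    """Yield the root node and all nodes embedded in it."""
    yield fs
    for value in fs.features.values():
        yield from nodes(value)


def subsumes(general, specific):
    """True if `specific` equals `general` or is one of its (sub)subsorts."""
    return general == specific or any(
        subsumes(sub, specific) for sub in SUBSORTS[general]
    )


def is_well_typed(fs):
    """Every feature present is appropriate and has a value of a permitted sort."""
    for node in nodes(fs):
        allowed = APPROPRIATE.get(node.sort, {})
        for feat, value in node.features.items():
            if feat not in allowed or not subsumes(allowed[feat], value.sort):
                return False
    return True


def is_totally_well_typed(fs):
    """Well-typed, and every appropriate feature is actually present."""
    return is_well_typed(fs) and all(
        set(APPROPRIATE.get(n.sort, {})) <= set(n.features) for n in nodes(fs)
    )


def is_sort_resolved(fs):
    """Every node bears a maximal sort, i.e. one without subsorts."""
    return all(not SUBSORTS[n.sort] for n in nodes(fs))


full = Node("agr", {"NUM": Node("sing"), "GEN": Node("masc")})
no_gender = Node("agr", {"NUM": Node("plur")})
open_gender = Node("agr", {"NUM": Node("plur"), "GEN": Node("gender")})

for fs in (full, no_gender, open_gender):
    print(is_totally_well_typed(fs), is_sort_resolved(fs))
```

In this sketch only `full` satisfies both conditions. `no_gender` lacks an appropriate feature, and `open_gender` contains a node whose sort `gender` still has more specific subsorts.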
The sort referred to in (80b) is the label for a feature structure, α and β in Figure 4.4. Technically, atomic feature values, e.g. SING in Figure 4.4, are also sorts. While there is no space here to explain the formal notions in (80) in detail, a brief discussion of an example will show why their implications may cause problems for (79). (81)
a. Niels has a cousin who lives in Paris.
b. Niels has two cousins.
c. How many cousins does Niels have?
As a consequence of (79), if a noun can have gender, gender has to be specified for every noun. Although the role of gender in English is limited (cf. ten Hacken (1998)), it is necessary for the proper use of third person singular pronouns. In (81), cousin must be specified as masculine or feminine. If the feature is left out, the feature structure for cousin is not totally well-typed, (80a). If the value is left unspecified it is not sort-resolved, (80b). In (81a) this is not a serious problem. We can argue that the person will be either feminine or masculine and it is only a matter of finding out which of the two values applies. In (81b), however, the two cousins may be of different sex. Specifying gender will be complicated even more in (81c). The implication of the question is that gender must be unspecified. The problem is well-known from translation into a language such as German, as illustrated in (82). (82)
a. Niels and Odette are cousins.
b. Niels und Odette sind Cousin und Cousine.
In German there is no word that covers both the masculine and the feminine, so that in the German translation (82b) of (82a), the two words for masculine cousin and feminine cousin have to be coordinated to render English cousins. Therefore, (79) may be a desirable property for formal languages, but it is hard to reconcile with natural language. Another statement relating grammar and language in HPSG is (83). (83)
a. ‘The distinction between the system of constraints and the collection of linguistic entities that satisfies it can be viewed as corresponding both to Chomsky’s (1986a) distinction between I-language and E-language […]
b. Though only the latter is directly observable, only the former can be embodied as a mental computational system shared by members of a linguistic community.’ [Pollard and Sag (1994: 58)]
There is no problem in the assumption in (83a) that the collection of linguistic entities constitutes an E-language in the sense of Section 2.1.3. In terms
of Chomsky’s (15) in that section, they are a ‘collection of linguistic forms (words, sentences) paired with meanings’. However, E-language is not ‘directly observable’ as claimed in (83b). What is directly observable is performance, but performance is not a ‘collection of linguistic entities’ as described in (83a). Similarly, it is conceivable to see an I-language as a ‘system of constraints’ as in (83a), but the concept of I-language is explained by Chomsky in (16) of Section 2.1.3 as ‘some element of the mind of the person who knows the language’, which is incompatible with the idea of a ‘system shared by members of a linguistic community’ as in (83b), because no part of the mind is shared.
It is difficult to see HPSG as explained by Pollard and Sag (1994) as an empirical science in the sense of Chapter 1. The emphasis on formalism in Figure 4.3 suggests a closer relationship to the GPSG model as represented in Figure 4.2 than to the empirical cycle. However, whereas GPSG does not draw any parallel to other sciences and Montague Grammar relates linguistics to mathematics, cf. (51), which is a formal science, Pollard and Sag (1994: 6–9) draw a parallel to astronomy, which is undoubtedly an empirical science. Therefore, it is hard to argue that HPSG linguistics is a formal science and to see Figure 4.2 as an adequate research programme for HPSG.
A further indication of the nature of HPSG can be derived from the choice of issues on which a definite position is taken and the ones for which more than one option is left open. In (78), whether language is a mental state or an abstract phenomenon is left unspecified. Sag and Wasow (1999: 227–231) address the question of language acquisition from the position in (84).
(84) a. ‘There can be little doubt that biology is crucial to the human capacity for language; if it were not, family pets would acquire the same linguistic competence as the children they are raised with. It is far less clear, however, that the human capacity for language is as specialized as Chomsky says. A range of views on this issue are possible.’ [Sag and Wasow (1999: 228)]
b. ‘Our position is that the grammatical constructs we have been developing in this text are well suited to a theory of universal grammar, whether or not that theory turns out to be highly task-specific, and that the explicitness of our proposals can be helpful in resolving the task-specificity question.’ [Sag and Wasow (1999: 229)]
The issue is explained in (84a) as the degree of task-specificity of the human capacity for language. Chomsky’s position is described as being at one of the extremes, according to which the human language faculty is ‘a “mental organ” for language which is distinct in its organisation and functioning from other cognitive abilities’. Sag and Wasow claim that in developmental psycholinguistics ‘many scholars disagree’ with this position (1999: 228). Their own position is neutral, as described in (84b). By contrast, concerns of language processing are explicitly highlighted. Pollard and Sag (1994: 10–14) argue that a linguistic theory should be declarative, constraint-based, and monotonic, conditions familiar from LFG (cf. (15) and (18) in Section 4.1.2). Sag and Wasow (1999: 218–227) argue for what they call ‘constraint-based lexicalism’ as a ‘performance-plausible competence grammar’, reiterating basically the same point. 17
On the basis of these considerations, we can conclude that in HPSG, formalisation and compatibility with processing are essential criteria for the evaluation of a grammar. A running computer program for the analysis of sentences that exhibit the phenomena for which the grammar was written is then a good argument that the theory is correct. By contrast, HPSG does not commit itself to a firm position as to the mental realisation of competence, as stated in (78), or the existence of a dedicated language faculty involved in language acquisition, as stated in (84). Figure 4.5 incorporates these conclusions in a way compatible with the model of science in Figure 4.3.
Figure 4.5: The research programme of HPSG (diagram not reproduced; nodes: Competence, Language, Grammar, Judgements, Linguistic entities, Feature structures; arrow labels: contains, generates, describe)
In Figure 4.5 some of the same conventions are used as in Figure 4.2. The shaded oval for Competence and the dashed arrows mean that these do not play an essential role in the research programme. The language contains the linguistic entities, but how it relates exactly to the competence is an open question. Universals do not play any significant role in HPSG so they are not included in Figure 4.5. If we map Figure 4.5 onto Figure 4.3, Grammar corresponds to Theory, Feature structures to Model, and Linguistic entities to Phenomena. Establishing what are linguistic entities is a task for the linguist. Sag and Wasow (1999: 6, fn. 3) indicate that judgements are an important tool, although they have to be used carefully and supplemented with other types of
data, e.g. evidence of usage, in the more delicate cases. The language is simply the set of linguistic entities.
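As an illustration only, the relation between a system of constraints and the set of linguistic entities that satisfies it can be sketched in Python. The sketch is not taken from the HPSG literature: the flat attribute-value mappings, the two toy constraints, and the names `language`, `CANDIDATES`, and `GRAMMAR` are all assumptions made for this example, and the feature structures are simply allowed to stand in for the entities they describe.

```python
from typing import Callable, Dict, List

# Assumption: a feature structure is simplified to a flat attribute-value
# mapping and stands in for the linguistic entity it describes.
FeatureStructure = Dict[str, str]
Constraint = Callable[[FeatureStructure], bool]


def agreement(fs: FeatureStructure) -> bool:
    """Hypothetical toy constraint: subject and verb agree in number."""
    return fs.get("subj_num") == fs.get("verb_num")


def finiteness(fs: FeatureStructure) -> bool:
    """Hypothetical toy constraint: the verb form is finite."""
    return fs.get("verb_form") == "finite"


def language(candidates: List[FeatureStructure],
             grammar: List[Constraint]) -> List[FeatureStructure]:
    """Return the candidates that satisfy every constraint of the grammar."""
    return [fs for fs in candidates if all(c(fs) for c in grammar)]


# Invented candidate descriptions, used only to exercise the constraints.
CANDIDATES: List[FeatureStructure] = [
    {"form": "she walks", "subj_num": "sg", "verb_num": "sg", "verb_form": "finite"},
    {"form": "she walk", "subj_num": "sg", "verb_num": "pl", "verb_form": "finite"},
    {"form": "she walking", "subj_num": "sg", "verb_num": "sg", "verb_form": "participle"},
]

GRAMMAR: List[Constraint] = [agreement, finiteness]

if __name__ == "__main__":
    for fs in language(CANDIDATES, GRAMMAR):
        print(fs["form"])
```

In this toy setting the order in which the constraints are listed makes no difference to the result, and adding a constraint can only narrow the set of admitted structures. This is one simple way of reading the declarative character of the grammar mentioned above. Nothing in the sketch says how the admitted set relates to competence.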
4.3.3 Comparison of the model of HPSG with other models
It is interesting to compare the HPSG model of Figure 4.5 not only with that of Chomskyan linguistics, but also with those of LFG and GPSG. Perhaps the most striking point is the lack of feedback loops in Figure 4.5. This is in accordance with the model of science proposed by Pollard and Sag (1994) and represented in Figure 4.3. The essence of science is a formal description of the phenomena. Explanation, which drives the empirical cycle and the research programmes of Chomskyan linguistics and of LFG, is accessory in HPSG. Universals and competence do not play an essential role. In these respects, HPSG is similar to GPSG. That Figure 4.2 represents universals as well as upward arrows labeled as tests is more a consequence of the rhetoric involved than of a difference in the actual research programme. It would have been possible to add grammatical framework and universals to Figure 4.5 without major consequences. The tests in HPSG are not so much absent as implicit. The main difference between the research programmes of GPSG and HPSG is the inclusion of Competence and Judgements in Figure 4.5. This is motivated by Pollard and Sag’s (1994: 6–9) parallel treatment of linguistics and astronomy, which implies that it is an empirical science. This contrasts with the idea of linguistics as a formal science in GPSG. The main similarity of HPSG with LFG is the emphasis on language processing. However, whereas LFG uses it as a criterion for the selection of grammars, HPSG does not operationalise it directly. Instead, HPSG emphasises the formal, explicit derivation of feature structures. LFG wants to use human processing to select the most appropriate descriptive and explanatory account of human language, but HPSG wants to use formal, computational processing as a criterion for the explicitness of the grammar. 
In the introduction to their textbook, Sag and Wasow (1999: 12–21) give three reasons for studying syntax in subsections entitled ‘A window on the structure of the mind’, ‘A window on the mind’s activity’, and ‘Natural language technologies’. This attitude explains the popularity of HPSG in computational linguistics.
There is very little similarity between the research programmes of HPSG and Chomskyan linguistics. As far as entities in Figure 4.5 and Figure 2.7 correspond, they play a completely different role. Whereas competence is central to Chomskyan linguistics, it is in the periphery of HPSG. This is necessary to maintain the neutral position with respect to mentalism as formulated in (78). Whereas judgements are used in HPSG to establish linguistic entities, in
Chomskyan linguistics they are a subset of the observations to be explained. Whereas the main function of a grammar in HPSG is to generate feature structures through a formally explicit process, in Chomskyan linguistics it explains observations by describing competence. The conclusion that HPSG emphasises formalisation and processing while not taking a definite position on mentalism and language acquisition makes it into a mirror image of Chomskyan linguistics in these respects.
4.3.4 Interaction of HPSG with other frameworks
In its research practice, HPSG has always been characterised by a remarkable open-mindedness with regard to theories developed within other research programmes. This is the most likely reason why little emphasis has been placed on the presentation of the research programme or the discussion of assumptions made in other research programmes. The main interaction with other frameworks has been the adoption and adaptation of their analyses. This is also implied in (85).
(85) ‘As we proceed, it will become evident to readers familiar with a range of contemporary syntactic and semantic theories of language that many of the constructs and hypotheses of HPSG – perhaps most of them – are borrowed or adapted from elsewhere.’ [Pollard and Sag (1987: 11)]
The statement in (85) is from the section entitled ‘HPSG: a preview’ and it is followed by a presentation of some of the main similarities and differences to other theories, including GPSG, LFG, and GB.
It is interesting to observe the different attitudes to Chomskyan linguistics in the different books presenting HPSG. Pollard and Sag’s (1987) attitude is exemplified by (76). No special role in the field is assigned to Chomskyan linguistics. The fiction of a ‘level playing field’ is meant to be realised by a reduction of all theories to a unification-based formalism. Mentalist and antimentalist conceptions of language are presented as equally viable options. In the preview following (85), GB is referred to alongside GPSG, LFG, and other theories. Pollard and Sag (1994) take a different attitude, described in (86).
(86) a. ‘given the widespread acceptance of that framework as a standard in recent years, especially among an extensive community of syntacticians in the United States and much of continental western Europe,
b. it is incumbent on the proponents of a competing framework to explicate the sense in and extent to which the proposed alternative addresses the concerns of that community.’ [Pollard and Sag (1994: 1f.)]
In (86a), ‘that framework’ refers to ‘the research framework established by Noam Chomsky’, which they now admit to be the dominant one. The conclusion they draw in (86b) leads to a presentation for Chomskyan linguists, pointing out similarities and differences to GB. As a result of this strategy, Chomsky has more index entries than any other linguist, including the authors themselves. Sag and Wasow’s (1999) textbook is directed at undergraduate students of linguistics and graduate students and general readers in ‘disciplines related to linguistics, such as psychology, philosophy, mathematics, or computer science’ (1999: xi). As Kaplan puts it, it ‘provides a general introduction to syntactic theory as seen from the vantage point of HPSG’ (2003: 89). Thus, while Pollard and Sag (1987) try to play down the predominant role of Chomskyan linguistics, Pollard and Sag (1994) accept it and Sag and Wasow (1999) ignore it. The last two attitudes are compatible if we take into account the intended readership.
4.3.5 Conclusion
It is far more difficult to produce a coherent picture of the research programme for HPSG than it is for any of the other approaches discussed in this book. The presentation of the underlying assumptions in HPSG by Pollard and Sag (1994: 6–9) is coherent, but it is based on a model of science that is not aligned with the analysis of empirical science used here. In many respects, the information they give is not presented in a way to answer foundational questions such as (1) and (2) in Chapter 2 with a research programme defined as in (7) in Section 1.2.2. As HPSG is taken as a basis for linguistic research by a relatively large community of linguists, this raises doubt as to the generality of the notion of research programme developed here. Pollard and Sag (1994: 6–9) take astronomy as their prototype of empirical science. Astronomy is compatible both with a description in terms of a formal, logical system with axioms and inferences and with a description in terms of the empirical cycle. However, the former can hardly account for the practice of astronomical research. It can at most give a partial snapshot, derived from a particular state of the theory, while leaving out certain aspects of knowledge that are not compatible with this form of description. Moreover, there is no need for an empirical science to be describable as a formal, logical system. By contrast, the empirical cycle concentrates on the essential aspects that make an empirical science empirical and scientific. If HPSG linguistics is an empirical science, it must have a research programme relating it to the empirical cycle. It may also have a research programme of the type required by Figure 4.3, but this does not govern the actual research process.
On the basis of an interpretation of the remarks found in the main sources on HPSG, Figure 4.5 was the best representation I could come up with that is compatible with the model of science in Figure 4.3. However, if the analysis of empirical science in Chapter 1 is correct and HPSG linguistics is an empirical science, it cannot be a representation of the assumptions that guide actual research. The most likely solution to this problem is that researchers in HPSG have an implicit research programme that does not remain neutral with respect to some crucial questions. This may mean that some HPSG researchers assume a mentalist position and others not, or that all of them have a mentalist position. However, the actual assumptions remain tacit and any set of them contradicts some of the explicit statements found in the literature and discussed here.
SUMMARY
• The crisis at the basis of HPSG resulted from a sense of ‘unease’ with the fact that Chomskyan linguistics continued to attract many linguists despite arguments against it (e.g. from LFG and GPSG) that were considered convincing.
• As some of the causes of this crisis are inherent to the nature of empirical science, it can be called a ‘meta-crisis’.
• In the presentation of their methodological assumptions, Pollard and Sag (1994) take a view of empirical science that ignores the empirical cycle.
• The research programme of HPSG as presented by Pollard and Sag (1994) highlights formalisation of the grammar and the descriptions of linguistic entities rather than explanation of the type the empirical cycle makes possible.
• No position as to mentalism and language acquisition is taken. All options are left open.
• Because of the importance of formalisation and processing in its design, HPSG is highly popular in computational linguistics.
• HPSG does not engage in fundamental discussions with other frameworks, but has taken over and adapted many of the ideas proposed in GPSG, LFG, and Chomskyan linguistics.
4.4 Jackendoff’s linguistics
The preceding sections of this chapter each focused on a particular theory. Although the origins and the development of LFG, GPSG, and HPSG are each linked to a small number of researchers, it seemed more appropriate to start from the theory than from a particular researcher. In this section, however, I take the other perspective, identifying the topic on the basis of one linguist, Ray Jackendoff. The reason why, in this case, the focus on a person appears to offer a better perspective is that Jackendoff’s work has throughout most of his activity been at the same time influential and relatively isolated. From the early 1970s onwards, Jackendoff has published a large number of books and many articles, which appeared in Linguistic Inquiry and other journals widely read among generative linguists. As opposed to LFG, GPSG, HPSG, and indeed Chomskyan linguistics, however, there has never been a significant group of linguists working together in the framework set out by these writings. As noted in the introduction to Chapter 2, Botha (1989: 5–7) discusses the differences between, among other terms, Chomskyan linguistics and Chomsky’s linguistics. Chomskyan linguistics is used by Botha in a sense which makes it a reasonable choice as the name for a research programme, whereas Chomsky’s linguistics ‘represents the set of assumptions about linguistic structure held by himself at any particular moment’ (1989: 6). By analogy, this section is entitled Jackendoff’s linguistics, rather than Jackendovian linguistics. The central question to be addressed here is in fact whether Jackendoff’s linguistics belongs to Chomskyan linguistics or has a research programme of its own. Another aspect in which this section diverges from the preceding three is that there is much more material that directly addresses questions of the research programme. 
In particular, building on various of his earlier publications, Jackendoff (2002) gives a detailed overview of the basic assumptions he makes about the linguistic framework and the consequences of these assumptions for the description of different aspects of language as well as the study of language acquisition, language processing, and the evolutionary origin of language. Our discussion of Jackendoff’s linguistics starts with a historical overview of some aspects of Jackendoff’s work in Section 4.4.1. It will be demonstrated that a crisis of the type we encountered at the basis of LFG, GPSG, and HPSG can also be identified here. A crisis need not lead to a new research programme, but it opens up that possibility. In Section 4.4.2, we turn to one of the central innovations resulting from this crisis, the parallel architecture of Jackendoff
(1997). The question in this section is whether the argumentation Jackendoff gives for this architecture is compatible with the research programme of Chomskyan linguistics as described in Chapter 2. Section 4.4.3 shows how the opposition between Jackendoff’s linguistics and Chomskyan linguistics affects their debate on the evolution of language. Section 4.4.4 summarises the conclusions.
4.4.1 The crisis: integrating semantics
Jackendoff started his linguistic career when Chomskyan linguistics was involved in the so-called ‘Linguistic Wars’. 18 The central point of discussion in this conflict was the place of semantics. Generative Semantics adopted the position that the syntactic level of deep structure, as discussed in Section 2.5, represents meaning. Interpretive Semantics assumed instead that semantic interpretation rules were involved in the mapping from syntactic to semantic representations. Chomsky supported the latter position, which in the course of the 1970s gained the upper hand. The standard account of this episode is Newmeyer (1986a). A different analysis of the discussions of this period is presented by Huck and Goldsmith (1995). 19 Jackendoff belonged to the Interpretive Semantics camp. Indeed, he was one of its main proponents, as can be concluded from both these accounts. Thus, Huck and Goldsmith include interviews with four protagonists of the conflict and one of them is Jackendoff.
Three important publications from the 1970s should be placed in this context. Jackendoff (1972) formulates a proposal for the semantic interpretation rules that were central to Interpretive Semantics. Newmeyer calls it ‘The most important treatment of semantics in early EST’ (1986a: 144). 20 Two other publications elaborate ideas outlined first by Chomsky (1970), the written form of the lectures that Jackendoff calls ‘the first authoritative stand against the growing school of Generative Semantics’ (1977: xi). Jackendoff (1975) offers a proposal for the way morphology is encoded in the lexicon. Jackendoff (1977) elaborates X-bar theory. It is interesting to note, however, that Jackendoff’s version of the X-bar theory departs in a number of respects from the prevailing trend in Chomskyan linguistics. Thus, Jackendoff (1977) has three bar levels, S as a projection of V rather than Infl, and X-bar theory as a constraint on explicitly formulated phrase structure rules. 21 Despite these divergences, there is no doubt that Jackendoff’s linguistics in this period belongs to Chomskyan linguistics.
While mainstream Chomskyan linguistics entered a new phase of optimism with the publication of Chomsky’s (1981a) Lectures on Government and Binding, Jackendoff continued to work on semantics. In his (1983) Semantics
and Cognition, he broke away from the view of semantics as an interpretive module tagged onto a syntactic representation and laid the foundation of Conceptual Semantics. A central hypothesis in this respect is the Conceptual Structure Hypothesis in (87).
(87) ‘The Conceptual Structure Hypothesis
There is a single level of mental representation, conceptual structure, at which linguistic, sensory, and motor information are compatible.’ [Jackendoff (1983: 17)]
A consequence of (87) is that there is no distinction between a linguistic level of semantic representation and another level for the representation of nonlinguistic meaning. Jackendoff (1983) argues extensively for a single level at which linguistic meaning interacts with information from the visual domain. An example is the use of pragmatic anaphora in (88).
(88) a. Your coat is here [] and your hat is there [].
b. Can you do that []?
c. Can you do this []?
In the use and interpretation of pragmatic anaphora, visual pointers interact with linguistic elements. In (88a) they indicate places, in (88b-c) actions. The contrast between (88b) and (88c) expresses linguistically how the visual input is to be interpreted. In (88b) the speaker’s action points to the intended action, in (88c) it demonstrates the action. Jackendoff argues that there has to be a level of representation at which both linguistic and visual information is available in order to account for their interaction in the interpretation of (88). A further consequence of (87) is that the single level referred to cannot be derived from syntactic representation by a set of interpretation rules. If this were the case, it would not be possible for vision (or other sources of information) to contribute elements to conceptual structure that cannot be expressed in language. This is necessary, however, because not only is ‘a picture’ as the saying goes ‘worth a thousand words’, but for many elements of the picture, no linguistic expressions are available at all. After Jackendoff (1983) had laid the foundations of Conceptual Semantics, including a system for the notation of conceptual structures and their interpretation, Jackendoff (1990a) elaborated a theory covering a wide range of interesting semantic phenomena. Instead of rules of interpretation deriving semantic structures from syntactic representations, he assumes a system of correspondence rules linking conceptual and syntactic structures. Jackendoff (1990a) exemplifies and elaborates correspondence rules by means of analyses of a range of phenomena.
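The contrast between rules of interpretation, which derive a semantic structure from a syntactic one, and correspondence rules, which link the two, can be illustrated with a minimal Python sketch. It is not Jackendoff’s formalism: the bracketed strings, the labels such as `RUN(BILL)`, and the names `PAIRS`, `interpret`, `syntax_by_search`, and `Correspondence` are hypothetical and chosen only for this example.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical toy pairs of (syntactic structure, conceptual structure).
# The notation is invented for this example.
PAIRS: List[Tuple[str, str]] = [
    ("[S [NP Bill] [VP ran]]", "RUN(BILL)"),
    ("[S [NP Sue] [VP slept]]", "SLEEP(SUE)"),
]


def interpret(syntax: str) -> Optional[str]:
    """Directional rule: derives a semantic structure from a syntactic one."""
    return dict(PAIRS).get(syntax)


def syntax_by_search(semantics: str, hypotheses: Iterable[str]) -> Optional[str]:
    """With only `interpret`, syntax is recovered by hypothesising a
    syntactic structure and checking what it is interpreted as."""
    for candidate in hypotheses:
        if interpret(candidate) == semantics:
            return candidate
    return None


class Correspondence:
    """Non-directional rules: one pairing that can be queried from either side."""

    def __init__(self, pairs: List[Tuple[str, str]]) -> None:
        self._by_syntax: Dict[str, str] = {syn: con for syn, con in pairs}
        self._by_concept: Dict[str, str] = {con: syn for syn, con in pairs}

    def from_syntax(self, syntax: str) -> Optional[str]:
        return self._by_syntax.get(syntax)

    def from_concept(self, concept: str) -> Optional[str]:
        return self._by_concept.get(concept)
```

With `interpret` alone, the only route from a conceptual structure to a syntactic one is the search in `syntax_by_search`. A `Correspondence` object answers the same question directly, and the choice of starting point does not matter.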
The major difference between interpretation rules and correspondence rules is that the former are necessarily directional and the latter are not. In the model of Standard Theory, it is impossible to start from a semantic structure and find out information about the corresponding syntactic structure, except by hypothesising a syntactic structure and checking whether interpretation rules are able to derive the semantic structure one started with. In a model involving correspondence rules, however, it does not matter what the starting point is. One can even think of a partial syntactic structure and a partial conceptual structure mutually informing each other. Jackendoff expresses his own assessment of the state of his theory at this point in (89).
(89) a. ‘I consider the state of development of this theory to be comparable to the theory of generative syntax in the early 1960s […]
b. As in that period in syntax, the emphasis at the moment is on descriptive power – tackling a wide range of phenomena, being able to state alternative solutions with some precision, and finding criteria to decide among them.’ [Jackendoff (1990a: 3–4)]
In (89) Jackendoff compares the development of his theory of semantics to the development of generative syntax. As we saw in Section 2.5.2, the purpose of providing explanations and the type of explanations to be provided were available from the start of Chomskyan linguistics, although the choice between different grammars describing an I-language was in many cases difficult to motivate in the relevant terms. Chomsky (1957) emphasised the potential of his formalism to describe phenomena more precisely and much of the early work was indeed devoted to the type of task described in (89b). Still, there is an important difference that (89a) abstracts away from. By 1990, Chomskyan linguistics (as well as a number of competitors) had produced an elaborate theory of syntax. It would therefore not be unreasonable to expect Jackendoff to integrate his theory with the existing Chomskyan syntax (or, for that matter, one of its competitors). Jackendoff expresses a different view, however, in (90).
(90) a. ‘As stressed throughout, the theory of syntax-semantics correspondence worked out here has relied on a fairly rudimentary theory of syntax – basically little more than a mid-1970s version of phrase structure.’ [Jackendoff (1990a: 284)]
b. ‘the present results lead one to ask in various cases whether the phenomena that are taken as motivation for one syntactic framework over another are genuinely syntactic phenomena.’ [Jackendoff (1990a: 285)]
It is interesting to see that (90a) refers to the period in which Jackendoff (1977) was written. In the discussion between (90a) and (90b), Jackendoff refers to
certain properties of syntax as formalised in LFG and GB-theory. This indicates the intended meaning of ‘syntactic framework’ in (90b). The tone of (90) suggests that Jackendoff is considering more radical strategies of integration and does not see it as a problem that his theory does not combine readily with existing syntactic theories. Although this is certainly not a widely felt crisis, it is a sign that Jackendoff feels the need to reconsider the basic assumptions of the research programme. The book ends with the enigmatic ‘But it comes time to stop and admire the view before pushing on again’ (1990a: 287).
4.4.2 Architecture and research programme
Jackendoff’s approach to the crisis of how to integrate semantics with syntax is based on the way the rule components interact with each other, the ‘architecture’ of the linguistic theory. The question to be investigated here is whether this new architecture implies a new research programme.
4.4.2.1 Syntactocentrism versus parallel architecture
The two architectures in contrast are represented in Figure 4.6 and Figure 4.7. 22
Figure 4.6: Syntactocentric architecture (diagram not reproduced; box: Syntax; ovals: Sound, Meaning)
Figure 4.6 represents the constant factors of the architecture of all theoretical stages in mainstream Chomskyan linguistics. Jackendoff introduces the term syntactocentrism in (91) to refer to it.
(91) ‘Assumption 6 (Syntactocentrism)
The fundamental generative component of the computational system is the syntactic component; the phonological and semantic components are “interpretive”.’ [Jackendoff (1997: 15)]
In accordance with (91), in Figure 4.6 the representation of sound and meaning is derived from and of a different nature than syntax. Whereas the box for ‘Syntax’ in Figure 4.6 includes rules and representations, the ovals for ‘Sound’
and ‘Meaning’ do not include rules. ‘Sound’ stands for a phonetic representation, ‘Meaning’ for a semantic representation of the expression. Although the internal organisation of the ‘Syntax’ box developed quite significantly in the course of the history of Chomskyan linguistics, its relationship to sound and meaning did not change to the same extent.
Figure 4.7: Parallel architecture (diagram not reproduced; boxes: Phonology, Syntax, Concepts)
The alternative architecture favoured by Jackendoff is represented in Figure 4.7. As in Figure 4.6, the box for ‘Syntax’ represents both syntactic representations and the rules for generating them. As suggested by their shape in Figure 4.7, the two other boxes should be interpreted analogously. ‘Phonology’ includes generative rules and the representations they generate. ‘Concepts’ includes conceptual structures and the rules generating them. The arrows in Figure 4.7 are of a very different nature to the ones in Figure 4.6. Two boxes in Figure 4.7 are connected by correspondence rules that work in both directions. 23
The difference between the two architectures in Figure 4.6 and Figure 4.7 shows up very clearly if we consider the place of the lexicon. In Figure 4.6, the lexicon is a component that contributes information to syntax. Some of this information is then used by the interpretive components for sound and meaning. This means that all lexical information has to pass through syntax, whether it is used in syntax or by the interpretive rules. In Figure 4.7, however, lexical information can be used in any of the three components directly. Jackendoff expresses this in (92).
(92) a. ‘the function of lexical items is to serve as interface rules, and the lexicon as a whole is to be regarded as part of the interface components.
b. On this view, the formal role of lexical items is not that they are “inserted” into syntactic derivations, but rather that they establish the correspondence of certain syntactic constituents with phonological and conceptual structures.’ [Jackendoff (2002: 131), originally the whole of a in italics]
According to (92b), the difference between cat and dog is visible in phonology, but not in syntax. They are syntactically indistinguishable nouns. The difference between man and father, where only the latter has an argument, is
visible in conceptual structure, but not in syntax. Conceptually, one cannot be father without being father of someone. Syntactically this relationship may be expressed in different ways or be left implicit. As suggested by the last example and by the remark in (90), the parallel architecture leads to a redistribution of tasks compared to mainstream Chomskyan linguistics. In (93) this is formulated in the form of a hypothesis.
(93) ‘Simpler Syntax Hypothesis / SSH: Syntactic structure is only as complex as it needs to be to establish interpretation.’ [Culicover and Jackendoff (2006: 413)]
Culicover and Jackendoff (2005) take up the challenge of giving substance to the programmatic hypothesis in (93).
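The view of lexical items in (92), as links between phonological, syntactic, and conceptual structures, can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not Jackendoff’s notation: the class `LexicalItem`, its field names, the ASCII stand-ins for phonological forms, and the argument label `OF-WHOM` are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass(frozen=True)
class LexicalItem:
    """A lexical item seen as a link between three kinds of structure."""
    phonology: str                   # ASCII stand-in for a phonological form
    syntax: str                      # syntactic category
    concept: str                     # label of the conceptual structure
    arguments: Tuple[str, ...] = ()  # conceptual arguments, if any


def differing_components(a: LexicalItem, b: LexicalItem) -> List[str]:
    """Name the components in which two lexical items differ."""
    return [f.name for f in fields(LexicalItem)
            if getattr(a, f.name) != getattr(b, f.name)]


# Hypothetical entries for the four nouns discussed in the text.
CAT = LexicalItem("kat", "N", "CAT")
DOG = LexicalItem("dog", "N", "DOG")
MAN = LexicalItem("man", "N", "MAN")
FATHER = LexicalItem("father", "N", "FATHER", ("OF-WHOM",))

if __name__ == "__main__":
    print(differing_components(CAT, DOG))
    print(differing_components(MAN, FATHER))
```

In neither comparison does `syntax` appear among the differing components: all four items are simply nouns. The argument of father is recorded only on the conceptual side, which leaves open how, or whether, it is expressed syntactically.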
4.4.2.2 Jackendoff’s presentation of the research programme of Chomskyan linguistics
In several of his writings, Jackendoff presents or discusses elements of the research programme of Chomskyan linguistics analysed in Chapter 2. As (94) illustrates, Jackendoff agrees with basic notions of Chomskyan linguistics.
(94) a. ‘Accordingly, an important boundary condition on my enterprise is that it be in all respects compatible with the world view of generative linguistics.
b. In particular it is crucial to choose I-concepts rather than E-concepts as the focus for a compatible theory of knowledge.’ [Jackendoff (1990a: 8)]
The statement in (94) is the conclusion of Section 1.1 of the book. ‘Accordingly’ in (94a) refers to the general purpose of the theory to be developed in the book. In that section, Jackendoff presents Chomsky (1986a) as the authority on ‘generative linguistics’, so that (94a) places his theory of Conceptual Structure explicitly in the research programme of Chomskyan linguistics. In (94b), he refers to an analogy between two perspectives on language, I-language and E-language as discussed in Section 2.1.3, and two perspectives on concepts. The conclusion he draws from the analogy is exactly the one we would expect of someone adhering to Chomskyan linguistics. Apart from such remarks in introductions of theoretical works, we also find a full, systematic argument for some of the basic assumptions of the research programme of Chomskyan linguistics in Jackendoff (1993). In particular, he argues for what he calls ‘the two Fundamental Arguments’ in (95).
(95) a. ‘The Argument for Mental Grammar: The expressive variety of language use implies that a language user’s brain contains a set of grammatical principles.
b. The Argument for Innate Knowledge: The way children learn to talk implies that the human brain contains a genetically determined specialization for language.’ [Jackendoff (1993: 6), originally all bold]
It is not difficult to recognise (95a) as referring to competence or I-language and (95b) as referring to the language faculty in the sense of Figure 2.7. This means that the two arguments of (95) support the basic analysis of the nature of language adopted in Chomskyan linguistics. Jackendoff (1993) presents an abundance of evidence based on introspective judgements, sign language, language acquisition, aphasia, etc. to make (95) plausible. We can find a parallel for the theoretical side of Figure 2.7 in (96).
(96) ‘By taking Universal Grammar as my starting point, I intend to reaffirm that, whatever differences surface as we go on, the work presented here is down to its deepest core a part of the Chomskian tradition.’ [Jackendoff (1997: 2)]
In Jackendoff’s (1997: 2–11) discussion of universal grammar, (96) is the conclusion of the introductory paragraph. Despite the explicit adherence to ‘the Chomskian tradition’ expressed in (96), it is worth noting the issue of how Jackendoff’s notion of tradition relates to the notion of research programme adopted here. As we saw in the introduction to this chapter, Newmeyer considers Chomsky’s GB-theory, LFG, GPSG, and HPSG all as representatives of ‘generativist approaches’ that ‘trace their ancestry to the work pioneered by Noam Chomsky’ (1998: 11). Although they are clearly rooted in the Chomskyan tradition, the preceding sections demonstrated that LFG, GPSG, and HPSG do not share the research programme of Chomskyan linguistics. As a preliminary to the discussion of whether Jackendoff’s linguistics is a kind of Chomskyan linguistics, it is therefore useful to briefly compare the Fundamental Arguments in (95) with the corresponding positions taken by these three. Mentalism of the type stated in (95a) is adopted also in LFG, but it is rejected by GPSG, at least in the form in which the grammatical principles in the language user’s brain are the object of investigation in linguistics. HPSG is not committed to either (95a) or its denial. Given these positions on (95a), (95b) is not directly relevant to GPSG and HPSG. In LFG, it is language processing, not ‘the way children learn to talk’, that is taken to imply a genetic specialisation for language. 24 Therefore, in the interpretation of (96), we can safely assume that Jackendoff’s linguistics is closer to Chomskyan linguistics than the frameworks which were shown in previous sections to have a different research programme.
Some modern competitors
4.4.2.3 Theory versus research programme
In order to determine whether the question of the choice between the two architectures in Figure 4.6 and Figure 4.7 is a matter of different theories in the same research programme or different research programmes, we have to analyse the nature of the criteria Jackendoff uses to compare the two. From the outset it should be clear that such an analysis does not lead to an assessment of the quality of the argumentation. The only question is whether Jackendoff’s criteria are compatible with the research programme of Chomskyan linguistics. Jackendoff’s (2003) target article is particularly useful here, because it summarises the argument of Jackendoff (2002) from a comparative perspective. It starts with a concise overview of the external history of generative grammar, from which (97) is taken. (97)
a. ‘the overall program of generative grammar was correct, as was the way this program was intended to fit in with psychology and biology.
b. However, a basic technical mistake at the heart of the formal implementation, concerning the overall role of syntax in the grammar,
c. led to the theory being unable to make the proper connections both within linguistic theory and with neighboring fields.’ [Jackendoff (2003: 651)]
In (97) we find contradictory signals as to our main question. The approval of the ‘overall program of generative grammar’ in (97a) suggests that Jackendoff adheres to the research programme of Chomskyan linguistics. It is completely in line with (96) and the same caveat applies. In addition, (97b) calls the choice of the architecture in Figure 4.6 ‘a basic technical mistake’. Technical mistakes are typically connected to the level of theory rather than research programme. By contrast, (97c) evokes problems of a more general kind. Thus, it encompasses the main difference between the research programmes of Chomskyan linguistics and LFG, which can be characterised in terms of the different ways in which language acquisition and processing are used to select a grammar. In (98) the types of argument Jackendoff uses to argue for the parallel rather than syntactocentric architecture are summarised. (98)
a. ‘It might well be argued that the standard syntactocentric framework has served the field well for forty years. Why would anyone want to give it up? A reply might come in five parts.
b. First, no one has ever argued for the syntactocentric model. […]
c. Second, an advocate might argue that the syntactocentric model is a priori simpler. […] The reply would be that the choice among theories must be determined by empirical adequacy as well as a priori simplicity. […]
d. A third point concerns the relation of syntax and semantics. […]
e. A fourth point concerns the nature of Universal Grammar. In the parallel architecture, the issues of acquisition and innateness don’t go away, they are exactly the same […]
f. A final point concerns not linguistic structure itself but its connection to the rest of the theory of the brain/mind.’ [Jackendoff (2003: 659)]
In (98) Jackendoff gives five reasons for giving up the architecture of Figure 4.6. The two in (98b-c) are of a very general, logical nature. The ‘empirical adequacy’ in (98c) refers to the discussion of problems in the mappings between syntax and phonology and between syntax and semantics he had discussed before (2003: 655–658). 25 General logical arguments are neutral for the theory versus research programme analysis. The two reasons in (98d-e) concern the distribution of tasks between different components. In general, such arguments are a matter of selecting the best theory within a particular research programme. An example of such an argument that is uncontroversially internal to the research programme of Chomskyan linguistics is Chomsky’s (1972b) argument that Deep Structure is not the only source of meaning, as Chomsky (1965) had assumed. He invokes examples involving focus, presupposition, and negation and their interaction with quantifier scope and passive to show that some aspects of the meaning of sentences can only be determined on the basis of Surface Structure. Therefore, (98d) cannot be used to argue that Jackendoff promotes a new research programme. This analysis is reinforced by (98e), which reiterates the allegiance to one of the central assumptions of Chomskyan linguistics. If the parallel architecture is a sign of a new research programme, this must be because of the question of how to integrate linguistic structure with adjacent fields, as alluded to in (98f). It is necessary, therefore, to look into this issue in more detail. Jackendoff (2003: 661–665) outlines arguments from four areas in which integration is more easily achieved in a parallel architecture. The first is indicated in (99). (99)
a. ‘The parallel architecture claims that language is organized into a number of semi-independent combinatorial systems, each of which has its own organizing principles.’ [Jackendoff (2003: 661)]
b. ‘A syntactocentric architecture, by comparison, shows no resemblance to the rest of the mind/brain.’ [Jackendoff (2003: 662)]
The point in (99) is typical of the defence of research programmes in the sense that it is not an empirically or logically compelling argument, but an argument based on plausibility. The question (99) addresses can be assimilated to the discussion of the unification of linguistics and neurobiology in Section 2.6.2.1. There it was concluded that in Chomskyan linguistics neurobiology was not
considered far enough advanced to guide linguistics. Therefore, Chomskyan linguistics does not attach much importance to (99b), but Jackendoff underlines the contrast with (99a), which is more in line with what we know about the implementation of other functions in the brain. A second field where the question of integration is important is semantics. Adopting the Conceptual Structure Hypothesis in (87) and the study of I-concepts in (94), Jackendoff places semantics outside of language but inside the mind. For this reason, conceptual structure has to have its own set of generative rules which do not belong to language. This leads to the difference with a syntactocentric approach formulated in (100). (100) ‘if the combinatorial properties of semantics were completely attributable to the combinatorial properties of syntax, then it would be impossible for nonlinguistic organisms to have combinatorial thoughts.’ [Jackendoff (2003: 662)]
As a syntactocentric architecture takes meaning to be derived from syntax, any combination in conceptual structure has to be derived from a combination in syntax. In (100) the inference is made that combinatorial thoughts depend on the availability of syntax. This means that this architecture predicts that thinking by chimpanzees and other primates has to be very limited, at least if arguments such as Anderson’s (2004) that they do not have language are accepted. A third field that Jackendoff proposes to (re-)integrate with syntax is language processing. The central claim is formulated in (101). (101) a. ‘sound ← phonology ← syntax → meaning […] b. there is no way that the logical directionality in (13) can serve the purposes of both perception and production. […] c. The parallel architecture, by contrast, is inherently nondirectional.’ [Jackendoff (2003: 663)]
In (101b) ‘(13)’ refers to (101a). The objection in (101b) is an old one. We have come across it in the discussion of competence and performance in Section 2.1.1, the choice of language acquisition rather than language use as determining the language faculty in Section 2.4.1, and in the reaction to the alternative research programme of Lexical-Functional Grammar in Sections 4.1.3 and 4.1.5.1. The classical answer in Chomskyan linguistics involves the two elements in (102). (102) a.
Processing mechanisms make use of but are separate from grammatical competence.
b. Processing considerations are not important for the way competence is organized and grammars are formulated.
In (101b-c) Jackendoff does not argue against (102a). As such, he takes a different position to the one represented in Figure 4.1. The LFG research programme integrates competence with processing. Jackendoff only argues against (102b), most explicitly in (103). (103) ‘do the principles of the competence theory bear any resemblance to the principles that the language user actually employs in speaking and understanding language? If not, it is not entirely clear exactly what is claimed in attributing psychological reality to the competence grammar.’ [Jackendoff (2002: 198)]
The question of psychological reality was discussed in Section 2.3.1 in the context of the nature of a grammar and in Section 4.1.1 as the central issue at the root of the crisis leading to the emergence of LFG. In (28) and (29), Chomsky (1980a) argues that evidence based on processing cannot have a privileged position in selecting a theory. In (103) Jackendoff takes up the same issue as proponents of LFG, but within a research programme that shares with Chomskyan linguistics the central position of language acquisition. Jackendoff does not propose to use processing instead of acquisition as a way of choosing a grammar, but as a way to choose among different pairs of acquisition mechanism and grammar. Jackendoff (2007) elaborates this point, claiming that ‘A linguistic theory that disregards processing cuts itself off from valuable sources of evidence and from potential integration into cognitive science.’ The final area of integration adduced by Jackendoff in support of the parallel architecture is the evolutionary origin of language. He formulates the central issue in (104). (104) a.
‘How did the ability to systematically map combinations of concepts into sequences of speech sounds and back again develop in our species, and
b. how did the ability to learn such a systematic combinatorial mapping develop?’ [Jackendoff (2003: 664)]
The questions in (104) can be seen as an elaboration of the question that in Section 2.6 was shown to be at the origin of the transition from GB-theory to the Minimalist Program. However, the formulation of (104a) takes a processing perspective that is less easily compatible with a syntactocentric model of grammar, for reasons indicated in the discussion of (101). Moreover, Jackendoff imposes the boundary condition in (105). (105) ‘we would not like to have to explain language through miraculous emergence, given that (as argued by Pinker & Bloom 1990) it has the hallmarks of being shaped by natural selection.’ [Jackendoff (2003: 664)]
Pinker and Bloom (1990a: 712–720) list a number of points they suggest as indications of ‘design’, i.e. gradual adaptation under the influence of use in communication. 26 In (105) Jackendoff excludes the type of approach taken in Chomskyan linguistics by Hauser et al. (2002). That Jackendoff dismisses this approach can be deduced from (106). (106) a.
‘Chomsky himself has been notably evasive on the issue of the evolution of the language faculty, often seeming to cast aspersions on the theory of natural selection […]
b. The logic of the syntactocentric architecture suggests a reason why such evasion has been necessary.
c. The problem is in providing a route for incremental evolution, such that some primitive version of the faculty could still be useful to the organism.’ [Jackendoff (2003: 664)]
While both Hauser et al. (2002) and Jackendoff (2002, 2003) invoke the evolution of language as a criterion for the selection of a theory of the language faculty, they use it in different ways. Jackendoff demands in (106c) that a theory of the language faculty indicates a plausible path for its incremental development. Jackendoff (2002: 238–264) sketches a detailed map of steps such that each step has advantages for the species so that its survival can be explained on the basis of natural selection. This explains the negative judgement of (106a-b). As shown in Section 2.6.2.2, Hauser et al. (2002) argue that the essential part of language is recursion as the core property of the language faculty in the narrow sense (FLN). All other mechanisms had evolved earlier. Without the FLN, there is no language so that no selective advantage can be derived from linguistic abilities. Each of the other components of FLB must have evolved for other reasons (exaptation). Therefore, Jackendoff, like Chomskyan linguistics, uses the argument of the evolution to decide on the best theory of the language faculty, but he derives a different criterion from this argument, resulting in the choice of a different theory.

                     Chomskyan linguistics   Jackendoff’s linguistics
Mental organisation  Mental organ            Modular interaction
Semantics            Outside of language     Conceptual module
Processing           Outside of language     Constraint-based
Evolution            Exaptation              Incremental adaptation

Table 4.1: Integration of language in Chomskyan linguistics and Jackendoff’s linguistics
Table 4.1 summarises the distinctions invoked by Jackendoff (2003). In the interpretation of this table, it is important to keep in mind the research programme of Chomskyan linguistics as represented in Figure 2.10. The difference between the two approaches does not concern the part of Figure 2.10 that overlaps with Figure 2.7. Moreover, the arrangement of entities and their relationships is the same for both approaches. The difference resides in the consequences drawn from the ‘World’ at the top level to determine the nature of the ‘Language Faculty’ at the species level. Jackendoff’s linguistics derives additional criteria (semantics, processing) and uses the criterion of evolution differently. We can expect that the arguments Jackendoff uses for the parallel architecture will not be recognised as such in Chomskyan linguistics. Therefore, Jackendoff’s linguistics has a different research programme, even though the layout of Figure 2.10 is shared with Chomskyan linguistics.
4.4.3 The debate on the evolution of language
From the overview of differences between Jackendoff’s linguistics and Chomskyan linguistics in Table 4.1, it is clear that the evolution of language constitutes the most promising topic of debate between the two. To the extent that it is possible to make any specific claims about the mental organisation, the discussion of this area tends to merge with the one on evolution. For semantics and processing, any debate is bound to be rather sterile because only one side claims that a definite theory of them is essential for a theory of language. In the presentation of Chomskyan linguistics in Chapter 2, Hauser et al. (2002) was given as the source of the proposal setting out the position on evolution taken by Chomskyan linguistics. This article appeared in Science’s Compass, the section of Science devoted to review articles on particular branches of science, in this case Neuroscience. Before that, Jackendoff (1999) had already set out his position in an article in the Opinion section of Trends in Cognitive Sciences. This article presents a slightly less elaborate version of the account of the evolution in Jackendoff (2002). Hauser et al. (2002: 1572) refer to Jackendoff (1999), as well as to Pinker and Bloom (1990a), without reacting to it in detail. Jackendoff’s view on the position of Hauser et al. (2002) is summarised in (107). (107) a.
‘One of the longstanding difficulties in reconciling Darwin with Chomsky is that the Chomskyan treatment of syntax does not lend itself readily to plausible incremental evolutionary steps.
b. This is why Chomsky has advocated catastrophic emergence of language without the help of natural selection.’ [Jackendoff (2001: 569)]
In (107a) Jackendoff formulates the problem as making the architecture adopted throughout Chomskyan linguistics compatible with what is known about evolution. The term catastrophic in (107b) is used in the study of evolution for a feature that arises suddenly, without an explanation in terms of gradual adaptation. Thus, an account in which all components of FLB arise by exaptation and FLN is the last component to emerge can be called catastrophic in this technical sense. A series of articles in the journal Cognition features a direct confrontation between the two positions. Pinker and Jackendoff (2005) attack Hauser et al. (2002). In reply, Fitch et al. (2005) defend their views. Jackendoff and Pinker (2005) react to this defence. Three points are central in this confrontation. First, the role of recursion in relation to what makes human language unique. Second, the role of adaptation in the emergence of human language. And finally, the relationship between the way human language emerged and the theory of language adopted.
4.4.3.1 Recursion
As discussed in Section 2.6.2.2, Hauser et al. (2002) make a distinction between the language faculty in the narrow sense (FLN) and in the broad sense (FLB). The hypothesis Pinker and Jackendoff (2005) take issue with is formulated in the abstract as (108). (108) a. ‘We hypothesize that FLN only includes recursion and b. is the only uniquely human component of the faculty of language.’ [Hauser et al. (2002: 1569)]
In their reaction, Pinker and Jackendoff start by identifying three questions that characterise the nature of human language, listed in (109). (109) a.
‘The first is which aspects of the faculty are learned from environmental input and which aspects arise from the innate design of the brain […]
b. A second question is what parts of a person’s language ability (learned or built-in) are specific to language and what parts belong to more general abilities. […]
c. A third question is which aspects of the language capacity are uniquely human, and which are shared with other groups of animals.’ [Pinker and Jackendoff (2005: 202)]
There is broad agreement between the two sides as to the approach to (109a). Hauser et al. express their view on (109b) in (108a) and on (109c) in (108b). Pinker and Jackendoff (2005: 205–218) present a number of types of evidence to play down the unique role of recursion hypothesised in (108). They show
that there are other properties of human language that distinguish it and indicate that recursion may not be essential for human language. An example they give is phonology in (110). (110) a.
‘The set of phonological structures of a language forms a “discrete infinity,” a property which, in the case of syntax, HCF identify as one of the hallmarks of language. […] every language has an unlimited number of phonological structures, built from a finite repertoire of phonetic segments. […]
b. We note that the segmental and syllabic aspect of phonological structure, though discretely infinite and hierarchically structured, is not technically recursive.’ [Pinker and Jackendoff (2005: 210)]
In (110a), ‘HCF’ stands for Hauser et al. (2002). The importance Hauser et al. attribute to discrete infinity is indicated in (111). (111) ‘All approaches agree that a core property of FLN is recursion, attributed to narrow syntax in the conception just outlined. FLN takes a finite set of elements and yields a potentially infinite array of discrete expressions. This capacity of FLN yields discrete infinity.’ [Hauser et al. (2002: 1571)]
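The contrast at stake in (110b) and (111), between discrete infinity and recursion proper, can be made concrete in a small sketch. It is only an illustration under stated assumptions: the three-syllable inventory, the toy sentence frame and all function names are hypothetical and are not taken from Pinker and Jackendoff or from Hauser et al. The first function produces ever more forms by simply letting one syllable follow another; the second produces structures in which a sentence contains a sentence.

```python
from itertools import product

# Hypothetical three-syllable inventory, for illustration only.
SYLLABLES = ["ba", "di", "ku"]

def iterate_syllables(max_len):
    """Iteration: any syllable may be followed by another one.
    Yields flat strings of 1..max_len syllables; raising max_len
    always yields further new forms from the same finite inventory."""
    for n in range(1, max_len + 1):
        for combo in product(SYLLABLES, repeat=n):
            yield "".join(combo)

def embed_sentence(depth):
    """Recursion: a sentence may contain a subordinate sentence.
    Returns a nested list in which each S node dominates another S."""
    if depth == 0:
        return ["S", "it", "rains"]
    return ["S", "Kim", "thinks", embed_sentence(depth - 1)]

def nesting_depth(tree):
    """Number of S nodes stacked inside one another."""
    inner = [nesting_depth(part) for part in tree if isinstance(part, list)]
    return 1 + max(inner, default=0)

if __name__ == "__main__":
    print(list(iterate_syllables(2)))
    print(embed_sentence(2))
    print(nesting_depth(embed_sentence(2)))
```

Both procedures are unbounded in what they can produce from finite means, but only the second yields a constituent of a given type inside a constituent of the same type, which is the property at issue in the debate.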
In (110b), Pinker and Jackendoff note that the property of discrete infinity, which (111) links to FLN so directly, is shared by phonology, although phonology is not recursive. Phonological structure achieves discrete infinity by iteration. The difference is that in syntax, a sentence may contain a subordinate sentence, whereas in phonology any syllable may be followed by another one. Discrete infinity does not depend on embedding or subordination, but recursion does. In their reaction, Fitch et al. (2005) start by addressing the distinction between FLB and FLN again. They state that Pinker and Jackendoff’s criticism is in large part due to ‘their blurring the distinction’ between these two (2005: 180). They restate the relation of FLN to FLB in (112). (112) ‘FLN is composed of those components of the overall faculty of language (FLB) that are both unique to humans and unique to or clearly specialized for language. The contents of FLN are to be empirically determined.’ [Fitch et al. (2005: 182)]
The consequence of (112) is that (108b) is no longer an empirical hypothesis, but a definition. It is also not straightforward to combine (112) with Hauser et al.’s (110b) in Section 2.6.2.2, which introduces FLN as ‘the abstract linguistic computational system alone, independent of the other systems with which it interacts and interfaces’ (2002: 1571). On the basis of the definition in (112) they go through the evidence adduced by Pinker and Jackendoff in some detail
(2005: 190–205) and argue that it does not prove their point. Their general strategy is to argue that what Pinker and Jackendoff give as special properties of human language is either shared with other species or with other capacities so that it is not part of FLN according to (112). As an example, (113) is what they state about phonology. (113) ‘Given our present knowledge, much of phonology is likely part of FLB, not FLN, either because phonological mechanisms are shared with other cognitive domains (notably music and dance), or because the relevant phenomena appear in other species, particularly bird and whale “song”.’ [Fitch et al. (2005: 200)]
The reasoning in (113) gives a good flavour of the general strategy Fitch et al. (2005) use. Jackendoff and Pinker’s (2005) reaction to this is to demonstrate that recursion also plays a role in visual grouping. They use a figure similar to Figure 4.8 to illustrate this (2005: 218).
×× ×× ×× ××
×× ×× ×× ××
×× ×× ×× ××
×× ×× ×× ××
Figure 4.8: Recursion in visual grouping
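A grouping of the kind shown in Figure 4.8 can be generated by a procedure that calls itself. The following is a minimal sketch, not Jackendoff and Pinker's own formulation: the function name and the convention of marking higher-level groupings by wider gaps are assumptions made here for illustration.

```python
def group(level, unit="××"):
    """Build a level-n grouping from two level-(n-1) groupings.
    The gap between the two halves grows with the level, so that
    higher groupings are set off by wider spacing (an assumed
    convention, chosen only to make the hierarchy visible)."""
    if level == 0:
        return unit  # a single pair of crosses
    half = group(level - 1, unit)
    return half + " " * level + half

if __name__ == "__main__":
    for level in range(4):
        print(level, repr(group(level)))
```

Each level combines two constituents of the level below into a larger constituent of the same kind, so that the number of crosses doubles with every step while the procedure itself stays the same.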
Jackendoff and Pinker claim that a figure such as Figure 4.8 ‘is perceived as being built recursively out of discrete elements which combine to form larger discrete constituents’ (2005: 217). Pairs of crosses are arranged in clusters of two pairs, arrays of two such clusters, and so on. Lerdahl and Jackendoff (1983) show that in the perception of musical structure, the same principle is at work. 27 Their conclusion is (114). (114) a.
‘This shows that recursion per se is not part of FLN under FHC’s definition. […]
b. Their construal of the FLN/FLB distinction fails to shed light on why humans have language and other animals do not.’ [Jackendoff and Pinker (2005: 217–218)]
If (113) is an argument against using evidence from phonology to counter (108a), then, according to (114a), Figure 4.8 and musical grouping are arguments against including recursion in FLN, i.e. against (108a). However, the correct conclusion, according to (114b), is to reject (112). Of course anyone is free to define terms, but if they do not fulfil their function (‘fail to shed light on …’) we should use different terms instead.
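The logic of the exchange in (112) to (114) can be sketched as a filter over candidate components. This is a hedged illustration only: the class and function names are hypothetical, and the truth values merely encode the claims of the two sides as quoted above; they are not an empirical assessment. In particular, the value chosen for whether recursion is unique to humans follows (108b) and is an assumption; the argument in (114a) only depends on the second flag.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Component:
    name: str
    unique_to_humans: bool
    unique_to_language: bool

def fln(flb):
    """Definition (112): FLN contains the components of FLB that are
    both unique to humans and unique to (or specialised for) language."""
    return [c.name for c in flb
            if c.unique_to_humans and c.unique_to_language]

# Flags encode the claims reported in the text, not established facts.
flb = [
    # (113): shared with music and dance; found in bird and whale song.
    Component("phonology", unique_to_humans=False, unique_to_language=False),
    # (114a): recursion is also at work in visual grouping and music.
    # unique_to_humans=True is an assumption following (108b).
    Component("recursion", unique_to_humans=True, unique_to_language=False),
]

if __name__ == "__main__":
    print(fln(flb))
```

Under these inputs the filter returns nothing, which illustrates why Jackendoff and Pinker conclude in (114b) that the definition, rather than the evidence, is what has to be given up.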
A remarkable feature of the discussion of recursion summarised here is the degree of misunderstanding and uncertainty on how to interpret the other side’s statements. At a macro-level, this can be observed because the conclusion in (114) is only reached in the second contribution by Jackendoff and Pinker. They could have reached it in their first contribution, because the substance of the argument depends on Jackendoff’s work in the 1980s. Visual categorisation is discussed by Jackendoff (1983). Pinker and Jackendoff (2005) obviously did not anticipate arguments of the type illustrated in (113). At a micro-level, both sides complain about the lack of full mutual understanding. The two quotations in (115) illustrate this. (115) a.
‘PJ quote our own comments about the vastness of the lexicon as being contradictory to our hypothesis 3. Again, misunderstanding has resulted from a failure to carry through the FLN/FLB distinction.’ [Fitch et al. (2005: 202)]
b. ‘FHC repeatedly claim that we misunderstand their hypothesis about the content of FLN. Yet their statement of the hypothesis is extremely unclear.’ [Jackendoff and Pinker (2005: 216)]
In (115) ‘PJ’ is Pinker and Jackendoff (2005) and ‘FHC’ is Fitch et al. (2005). Statements such as the ones in (115) are generally an indication of incommensurability effects of the type that typically marks debates between proponents of different research programmes. Rather than concluding that one of the two sides is a clear winner, we can only conclude that each side has won the debate from the perspective of its own research programme.
4.4.3.2 Adaptation
At the heart of Jackendoff’s argument against the syntactocentric architecture adopted in Chomskyan linguistics is the problem in (107). While evolution generally works by adaptation, no such account is available with a syntactocentric architecture. Chomsky’s reluctance to refer to adaptation is matched by a reluctance to consider communication as a driving force. As we saw in (58) of Section 2.4.1, he considers communication as a derived goal of language, whose principal goal is to express thoughts. Pinker and Jackendoff (2005: 223–225) argue against Chomsky’s view on the relationship between language and communication because in their view inner language (the expression of thought without communication) depends ultimately on the use of language in communication. Thus, words can only be learned in communication. Moreover, without the need to express thoughts in sound, it would be incomprehensible why a phonological representation would have emerged. The linear nature of phonetic realisation imposes constraints on
the expression of conceptual structures that distort them. If these constraints had not existed, more direct, multi-dimensional ways of expression would have been possible. 28 The choices of the view on communication and the view on adaptation are not independent. If the evolution of language is driven by the advantage of enhanced communication there is an urge to see gradual improvements of the system in the form of adaptations to communication. This explains why Pinker and Jackendoff (2005: 225–229) also object to Chomsky’s argument for the perfection of language, presented in Section 2.6.4. Perfection of the type advocated by Chomsky is not compatible with evolution in a sequence of steps. If evolution in steps is not an option, catastrophic emergence as referred to in (107b) is the only alternative. Catastrophic emergence is only possible if FLN, the final cornerstone of language, happened to come about for other reasons and find a perfect environment for constituting FLB. Pinker and Jackendoff (2005: 229–231) argue that such a scenario is much less plausible than the gradual evolution proposed by Jackendoff (2002). Fitch et al. (2005: 184–189) react to this by reiterating some of the familiar arguments we saw in Section 2.4.1 concerning the difficulty of determining the function of language. They also note that this complicates the task of setting up an evolutionary account based on adaptation in general. They propose to distinguish the current utility and the functional origin. However, the current utility of a trait is not directly relevant to the question of how it emerged and the functional origin is often impossible to determine. Jackendoff and Pinker (2005: 212–214) object to the division of functions into current utility and functional origins. They give the example of the evolution of human legs. Legs are used, for instance, for kicking a soccer ball and operating the brakes of a car, but these functions are not relevant to their evolution. 
A plausible functional origin is the control of swimming in fish. Neither of these can explain the evolution of human legs. Instead, they propose a third possibility, current adaptation. This refers to the reason the trait was selected in a particular species. In the case of human legs, this is bipedal locomotion. Jackendoff and Pinker also object to the claim that evolutionary explanations in terms of adaptation are futile. They point to ‘the fact that Hauser and Fitch, in their experimental work, brilliantly test hypotheses about adaptive function’ (2005: 213). They also mention the possibility of ‘reverse-engineering a trait’, i.e. considering how useful it would be without one or other of its properties. They give the example of an eye with a retina and muscles moving the eyeball. The muscles are only useful when the retina is already there, so that the retina must have evolved first. Jackendoff (2002) uses this technique extensively in his account of the evolution of language.
The discussion about adaptation shares many properties with the discussion about recursion. Therefore, and because the technicalities involved are further removed from linguistics, I have summarised the argument more briefly. As in the previous case, one has the impression that the discussion could go on without getting systematically nearer to a conclusion accepted by both sides.
4.4.3.3 Evolution and architecture
The debate between Jackendoff and Pinker on one side and Chomsky, Hauser, and Fitch on the other is first of all about the question of whether the evolution of human language was a process consisting of a sequence of steps reflecting adaptation or the result of the accidental reusability in language of a single component that emerged for other reasons. At the same time, however, it addresses the question of which linguistic theory is better. The discussion of this question largely takes the form of critical remarks on the Minimalist Program (MP) and the reply to these remarks. Pinker and Jackendoff (2005: 218–222) argue for the two points summarised in (116). (116) a.
‘on both empirical and theoretical grounds, the Minimalist Program is a very long shot. […]
b. Behind HCF’s claim that the only aspect of language that is special is recursion lies a presumption that the Minimalist Program is ultimately going to be vindicated.’ [Pinker and Jackendoff (2005: 222)]
The claim in (116a) is supported by an argument that the MP, because of its emphasis on recursion in the merge operation, neglects a wide range of other phenomena. This leads them to the statement in (117).
(117) ‘In fact, most of the technical accomplishments of the preceding 25 years of research in the Chomskyan paradigm must be torn down, and proposals from long-abandoned 1950s-era formulations and from long-criticized 1970s-era rivals must be rehabilitated.’ [Pinker and Jackendoff (2005: 220–221)]
Section 2.6 referred to the criticism of the MP by Lappin et al. (2000a, b), which also incorporates claims such as (117). However, Pinker and Jackendoff do not argue for Lappin et al.’s claim (120) in Section 2.6.5, i.e. that the MP was adopted largely on the basis of authority, presumably of Chomsky’s authority. In the discussion of that statement we found that it is the authority of the research programme rather than Chomsky’s that is responsible for this transition. In (116b), Pinker and Jackendoff argue on a
similar presupposition, but suggest that the research programme was tuned to the theory. This is of course not an untypical perception of the relationship of a theory and a research programme by observers who do not share the research programme. Research programmes tend to remain implicit until the theories they generate are attacked from a perspective that is not compatible with the research programme. Fitch et al. (2005: 183) emphatically deny (116b) and refer to Chomsky et al. (2005) for a more detailed discussion of the arguments for (116a). This is an appendix that was not published in Cognition but made available on Marc Hauser’s website. It reiterates many of the points motivating the MP discussed in Section 2.6 and argues that Pinker and Jackendoff’s (2005) discussion of it is based on serious misunderstandings. The discussion of the MP has many signs of incommensurability effects. The quotations from Chomsky et al. (2005) in (118) illustrate this.
(118) a. ‘PJ’s discussion of the success of eliminating d- and s-structure appears to be based on a misunderstanding of the notion of “linguistic level” as it has been used in generative grammar.’
b. ‘PJ also misunderstand the notion of redundancy that has always been used in this context, namely redundancy of rule systems. They object that language use involves plenty of redundancy. That is indisputable. It is also irrelevant.’
c. ‘Their entire discussion here reveals a complete misunderstanding of the notions “looks like an imperfection” and “apparent imperfection” as used in the source they cite, and the MP literature generally.’ [Chomsky et al. (2005), no pagination]
As illustrated by (118a-b), Chomsky et al. feel that Pinker and Jackendoff misunderstand some basic technical notions. A straightforward explanation of this phenomenon is that technical notions referred to by the same name may have subtle but crucial differences in different research programmes. Another type of incommensurability effect is that an important argument in one research programme is irrelevant in another. The way (118b) is formulated suggests that this is at least partly responsible for the communication problem. (118c) refers to the heuristic use of perfection as discussed in Section 2.6.4. This is part of the research programme of Chomskyan linguistics, but not of the one assumed by Pinker and Jackendoff. In a direct reaction to (117), Chomsky et al. (2005) even suggest that ‘Possibly PJ’s consistent misinterpretations are based on rejection of the entire enterprise of the past half century’ because they object to ‘abandoning rich descriptive technology that has been argued to be superfluous and misguided.’
Jackendoff and Pinker (2005) do not react directly to Chomsky et al. (2005), but they argue that Fitch et al. (2005), while trying to prove (116b) wrong, only further illustrate it. The opposition is expressed in (119) and (120).
(119) ‘The distinction itself is intended as a terminological aid to interdisciplinary discussion and rapprochement, and obviously does not constitute a testable hypothesis.’ [Fitch et al. (2005: 181)]
(120) a. ‘Actually, this distinction is far from theoretically innocent. […]
b. Despite FHC’s assurances, this conception is not terminological: it is an empirically testable hypothesis.’ [Jackendoff and Pinker (2005: 219)]
The distinction referred to in (119) and (120a) is the one between FLN and FLB. It should be noted, first of all, that although (119) seems to be a direct negation of (120b), the subject of the hypothesis has shifted from the ‘distinction itself’ to ‘this conception’. The conception (120b) refers to is developed in the text between (120a) and (120b). It concerns the relationship and distribution of tasks between rules of grammar and lexicon entries. The characterisation of FLN as ‘an abstract core of computational operation’ by Fitch et al. (2005: 180) determines the choice of architecture as discussed in Section 4.4.2. Even if this makes the observation in the first part of (120b) correct, the label ‘empirically testable hypothesis’ is problematic. If Jackendoff and Pinker use different criteria for evaluating the hypothesis than Chomskyan linguistics, they may well end up with opposite conclusions drawn from the same empirical tests. This only further illustrates the incommensurability of the research programmes involved.
4.4.4 Conclusion
Large portions of Jackendoff (2002) can be read as entirely compatible with the research programme of Chomskyan linguistics. This is even more obviously the case for Jackendoff (1993). Indeed, Pinker and Jackendoff (2005: 222) call these books ‘passionate expositions’ of the overall programme of generative linguistics. Superficially, the difference between Jackendoff’s linguistics and Chomsky’s linguistics seems to be a matter of architecture and formalism. Whereas Jackendoff adopts a parallel architecture with unification of feature structures, Chomsky keeps to a syntactocentric architecture with movement operations on tree structures. A closer look at the argumentation, however, shows that Jackendoff motivates his choices with arguments that are not based on the research programme of Chomskyan linguistics. The differences are summarised in Table 4.1. These differences can be seen as different elaborations of the way the highest level in Figure 2.10 influences the selection of
the best theory for the language faculty. The research programme adopted by Jackendoff therefore shares all of Figure 2.7 with the research programme of Chomskyan linguistics, but has different interpretations for some of the arrows in Figure 2.10. Despite the similarity of the research programmes, they are distinct. This is confirmed by the incommensurability effects observed in the discussion about the evolution of human language.29
SUMMARY
• On the basis of the research programme of Chomskyan linguistics as represented in Figure 2.7, Jackendoff developed a theory of conceptual structure, which could not be integrated easily with Chomsky’s theory of syntax.
• The proposed solution for this crisis was a parallel rather than Chomsky’s syntactocentric architecture.
• In general, different architectures can be compatible with the same research programme, as long as the same criteria are used to choose between them.
• Jackendoff’s argumentation for the parallel architecture uses criteria that are not compatible with the research programme of Chomskyan linguistics.
• The criteria adduced include accounts of language processing, the interaction of syntax and semantics, and the gradual evolution of human language through adaptation.
• The discussion of the emergence of human language in evolution between Jackendoff and Pinker on one hand and proponents of Chomskyan linguistics on the other shows the incommensurability effects we expect to find in the discussion across research programme boundaries.
• Therefore Jackendoff’s linguistics is not a branch of Chomskyan linguistics, but belongs to a separate (though similar) research programme.
4.5 Conclusion
In this chapter we have considered four frameworks that can be seen as competitors to Chomskyan linguistics. In the introduction to this chapter we saw that Newmeyer (1998: 11) recognises ‘two trends’ in what he calls ‘the formalist
(structuralist, generativist) orientation’ (1998: 7). One of these we identified with Chomskyan linguistics. The other includes LFG, GPSG, and HPSG, as well as a number of other theories. Presumably, Jackendoff’s linguistics would be included here as well. The incommensurability effects that occur in discussions between the proponents of each of these approaches and Chomskyan linguistics show that they belong to different research programmes. This means that the level of research programme is more specific than Newmeyer’s concept of (formalist) orientation. The question is, then, whether it is also more specific than his concept of trend, i.e. whether some or all of the four frameworks discussed in this chapter belong to the same research programme. It is clear from the discussion above that all four frameworks are presented as alternatives to Chomskyan linguistics. We did not consider any controversial discussions between, for instance, LFG and GPSG from which incommensurability effects could be deduced. The type of divergence from Chomskyan linguistics suggests a division between on the one hand LFG and Jackendoff’s linguistics and on the other GPSG and HPSG. The former two accept basic parts of the research programme of Chomskyan linguistics. For LFG, this is the part represented in Figure 2.3, i.e. up to the level of grammar and competence. For Jackendoff’s linguistics it is the part represented in Figure 2.7, i.e. up to the level of UG and the language faculty. Chomskyan linguistics can also be seen in principle as having diverged with respect to the new research programme. This perspective is more convincing in the context of Jackendoff’s linguistics, because the move from GB-theory to the MP involved the extension of the research programme. Chomskyan linguistics and Jackendoff’s linguistics have grown apart rather than one of them diverging from the track set by the other. In the case of GPSG and HPSG, the conflict with Chomskyan linguistics goes deeper. 
They not only reject the research programme of Chomskyan linguistics, but also propose a different perspective on science. GPSG makes linguistics a formal science, so that its research programme ties in with the tradition of formal logic, including Montague Grammar. HPSG refuses to commit itself to a mentalist conception of language, without however committing itself to a non-mentalist conception instead. These differences imply that no two of the four approaches share a common research programme. Nevertheless, it is interesting to see that a number of recurrent themes are elaborated in most or all of the four competing frameworks. Human language processing is more important in all of them than in Chomskyan linguistics. The incorporation of semantics was a driving force in GPSG. It led to what can be seen as a parallel architecture in HPSG and Jackendoff’s linguistics. Also the use of feature structures and unification is shared by all four frameworks. Finally, the research programmes are
not too far apart for ideas to be taken over from one into another. Typically, however, they are adapted to the receiving research programme and their further development is subject to evaluation criteria not shared by the original research programme.
Notes
1 Textbook introductions include Falk (2001) for LFG and Sag and Wasow (1999) for HPSG. In particular the latter represent their framework as the approach to syntax. A second edition appeared in 2003. Annual international conferences on LFG have taken place since 1996. For HPSG, the start is somewhat less clear, but since the 3rd International Conference on HPSG in 1996 the event has been annual, cf. http://www.ling.ohio-state.edu/research/hpsg/Gatherings.html.
2 Slobin’s (1966) experiment consisted of a matching task. Subjects were asked to press one of two buttons in response to the question whether a particular sentence matched a picture. Both the sentence and the picture were presented on a computer screen. What was measured is the response time. In Slobin’s experiments, average response times for adult subjects varied from .69 to 1.75 seconds depending on the sentence.
3 Although Bresnan and Kaplan ascribe to ‘most American structuralist linguists’ a goal that ‘would be equivalent to taking the domain and range of the syntactic mapping to be finite’ (1982: xxxix), the discussion in Chapter 3 shows that this analysis is not correct.
4 In fact, in ten Hacken (1997a: 297) I proposed an analysis of the LFG research programme in which this distinction was neutralised. I now think this representation is the result of an incorrect method of research, searching too much for equivalents of individual elements of the research programme of Chomskyan linguistics.
5 Whereas Joan Bresnan certainly has a mission publicising LFG, Gerald Gazdar rather considers GPSG as mathematicians consider their theorems and proofs. Gazdar (2001) shows a certain disappointment with linguistics and with linguists because missionary zeal plays such an important role in the success of a framework.
6 Among the approaches listed in (44a), APG refers to Arc-Pair Grammar, proposed by Johnson and Postal (1980). According to their preface, ‘A notable feature of the present work […] is its degree of formality’ (1980: x). Montague Grammar applies formal logical concepts to the semantics of natural languages, cf. Dowty et al. (1981). Stockwell et al. (1973) present the results of the project ‘Integration of Transformational Theories on English Syntax’. Lasnik and Kupin (1977) present what they call in their abstract ‘A set theoretic formalization of a transformational theory in the spirit of Chomsky’s LSLT’. LSLT refers to a 1955 manuscript published as Chomsky (1975b). Given the contrast suggested by (44a-b) it is remarkable that Chomsky (1981a) refers to Lasnik and Kupin (1977) four times without a word of criticism.
7 Note, however, that Gazdar does not distinguish grammaticality and acceptability. This distinction is crucial in Chomskyan linguistics (cf. Section 2.2.1), but many linguists from other backgrounds consider it impossible to distinguish them in practice.
8 The similarity is reflected by some remarks in Johnson and Postal’s (1980) preface and introduction. Thus they call sentences ‘the basic elements of language’ and expect a linguistic theory to ‘have a basic conception of what kind of formal objects sentences are’ (1980: 3). Furthermore, ‘a linguistic theory must have a conception of what a possible sentence-specifying system or grammar is’ (ibid.).
9 At a higher level of abstraction, the discussion between Katz and Chomsky should be mentioned. Katz (1981) provides an account of his ‘radical change in my outlook on the nature of linguistics’ so that ‘I no longer accept the equation of scientific with empirical’ (1981: 3). In his overview of the philosophy of Chomskyan linguistics, Botha (1989) outlines some of Chomsky’s reactions to Katz’s challenge.
10 Van Riemsdijk and Williams (1986: 139–156) give a historical overview of the rise of trace theory, showing the role of the contraction data in more detail. Although Lightfoot states that ‘The trace theory of movement rules was first outlined in Chomsky (1973)’ (1976: 559), van Riemsdijk and Williams (1986: 154f.) assign Chomsky (1973) to ‘Before trace theory’ and give the PhD dissertations of Thomas Wasow and of Robert Fiengo as well as Chomsky (1976a) as the first sources.
11 Postal and Pullum (1982: 122) refer to Jaeggli (1980) and to Chomsky (1980a: 158–160) rather than to Chomsky (1981a), but the arguments are basically the same in all three cases.
12 Chomsky’s style in such debates, as observed also in Section 3.2.3, tends to impersonalise his views and identify his assumptions with what is rational, which may stimulate an aggressive tone. In his reaction to Pullum (1989), Chomsky also uses sarcasm, when he replies to two points of criticism concerning formalisation in Chomsky (1986b), that ‘In each case, what he cites is completely straightforward, though the monograph does presuppose some literacy in linguistics and logic’ (1990: 146).
13 In an undated ‘Draft: Not for circulation’ of what was later published as Chomsky (1986b), the footnote from which (71) is taken as well as the sentences it is a footnote to are missing. I acquired a photocopy of this draft in 1985. The chronology is consistent with the hypothesis that Chomsky added this part of the text after reading Pullum (1985).
14 Cf. Kaplan’s (2003) overview of syntax for computational linguists, which takes LFG and HPSG as the main theories that are successful both in theoretical and in computational linguistics.
15 In the extensive list of colleagues thanked in the preface of Pollard and Sag (1994), none is mentioned in the category ‘colleagues who gave us detailed comments on earlier drafts, in addition to providing valuable sustained interaction’ (1994: ix) and only Geoffrey Pullum is mentioned in the wider category, which also includes Joan Bresnan and Ronald Kaplan.
16 Pollard and Sag (1994: 60–99) propose a more sophisticated theory of agreement. For a brief general introduction to feature structures and the operations performed on them, cf. Shieber (1986).
17 It should be noted in this context that assuming a ‘competence grammar’ is hardly compatible with a non-mentalist view of language. The neutral position with respect to mentalism in (78) seems to have been abandoned here.
18 Newmeyer (1986a: 117) attributes this term to Paul Postal, who was one of the leading figures of generative semantics.
19 Whereas Newmeyer attributes the downfall of Generative Semantics mainly to internal factors, such as problems in formulating coherent theoretical accounts, Huck and Goldsmith highlight the importance of external factors, for instance problems in setting up a strong institutional organisation. Newmeyer (1996: 127–137) argues against Huck and Goldsmith’s analysis.
20 EST stands for Extended Standard Theory. It refers to the theory which we can now, with hindsight, see as part of the transition from Standard theory to Chomsky’s (1981a) Government and Binding Theory.
21 Chomsky (1981a) assumes two bar levels, i.e. X⁰, X’ and XP=X’’. The emergence of functional heads, e.g. Infl for S=IP and Comp for S’=CP, is discussed in Section 4.1.5.3. From the late 1980s, a large number of functional projections have been proposed. Thus, Pollock (1989) proposes to split Infl into separate projections for Tense, Agreement, and Negation, leading to TP, AgrP and NegP. Jackendoff has always rejected this trend. Perhaps the most important difference, however, is that, as discussed in Sections 2.5.1.1 and 4.2.4.2, Chomskyan linguistics takes X-bar theory to replace individual phrase structure rules, whereas Jackendoff (1977) uses it only as a constraint on the formulation of rules.
22 Both Figure 4.6 and Figure 4.7 are highly simplified representations. Newmeyer (1986a: 74, 140, 163) represents detailed architectures of different stages of Chomskyan architecture. Jackendoff (2002: 109–110) updates this to include the MP. Jackendoff (2002: 125) gives a more detailed view of the parallel architecture.
23 Jackendoff (2002: 125) also has a set of correspondence rules linking phonology and concepts directly. Their function includes relating the metrical information in phonology to the information structure represented in conceptual structure.
24 The relevant statements are (13) and (22) in Section 4.1.2 for LFG, (48) in Section 4.2.2 for GPSG, and (78) in Section 4.3.2 for HPSG.
25 In the case of phonology, Jackendoff points to the ‘organization of phonological structure into semi-independent tiers’ (2003: 657) as a problem for a purely interpretive set of mapping rules. Arguably, this is at most indirect empirical evidence, because it is the theoretical solution preferred in phonology that causes the problems in the mapping, not the data per se.
26 Pinker and Bloom’s target article is commented on by various proponents of Chomskyan linguistics who attack their position. Some of them attack Pinker and Bloom’s stance on adaptation, in particular Otero (1990), to whom Pinker and Bloom (1990b: 768–770) reply in detail. Jackendoff’s comment to the target article is also worth noting. He suggests ‘that syntax – the component of language for which evolutionary antecedents are hardest to imagine – has evolved as a refinement and elaboration of a preexisting informational link between phonological and conceptual structure’ (1990b: 738). This remark should be read in conjunction with the discussion of (90) in Section 4.4.1.
27 They propose four tiers for grouping structure, metrical structure, time-span reduction, and prolongational reduction. The first has the organisation of an unheaded tree structure, the second can be compared to stress patterns, and the last two are most similar to headed tree structures. All of them involve recursion.
28 In this context it is interesting to note Sutton-Spence and Woll’s (1998: 129–130) discussion of topographic space and syntactic space in sign language. The former is iconic and exploits three-dimensional space, e.g. in presenting the relative location of shops in a shopping centre. The latter applies space to represent items for the purpose of reference, e.g. in opposing concepts such as ‘honesty’ and ‘wealth’. Both uses of space are practical for rendering thoughts, but neither is available in spoken language.
29 It is interesting to note the similarity in outlook between Jackendoff’s linguistics and HPSG. Both assume a parallel architecture with unification of feature structures. Jackendoff (2002: 194–195) lists four main differences between the two and indicates which modifications would be necessary for HPSG to fit his theory. He concludes that these modifications ‘are more cosmetic than substantial’ and invites practitioners of HPSG ‘to work out the consequences’. It is not clear from the outset whether the modifications would affect the research programme.
5 Aspects of language development and use
One of the standard objections against generative linguistics is the claim that it concentrates only on grammar and neglects other aspects of the study of language. Nevertheless, in recent years, an increasing interest in such fields can be observed among generative linguists. In this chapter, some of the issues arising in this context will be presented. The discussion will be limited largely to Chomskyan linguistics, for which by far the most material is available. The areas to be discussed here are (first) language acquisition, second language acquisition, historical linguistics and language change, and linguistic communication. The first three of these can be thought of as development of language under different perspectives. In all of them, language use plays a crucial role. In the last one, only language use is in focus. In each case, the emphasis will be on the relationship between research in these areas and the research programme of Chomskyan linguistics as laid out in Chapter 2. Before addressing these four areas in Sections 5.2 to 5.5, I will discuss the position of the concept of named language, i.e. language in the sense in which English, Dutch, and French are languages (Section 5.1). An interpretation of this concept is essential for an understanding of what is acquired in first and second language acquisition, what changes in language change, and what is used in linguistic communication. The conclusion of this chapter (Section 5.6) gives a classification of areas according to their position in the research programme of Chomskyan linguistics and some considerations of their position in other research programmes discussed in Chapter 4.
5.1 The nature of named languages
One of the controversial claims of Chomskyan linguistics is the one formulated in (1).
(1) a. ‘English doesn’t really exist [40].’ [Uriagereka (1998: 27)]
b. ‘Provocative as this central Chomskyan claim may seem, it is fundamental to everything else this book is about and is still not understood in many circles.’ [Uriagereka (1998: 541, fn. 40)]
Uriagereka (1998) explains Chomskyan linguistics in the form of a series of dialogues. Therefore ‘everything else this book is about’ refers to the version of Chomskyan linguistics current in the late 1990s. In order to explore the meaning and significance of (1a), in this section we will first consider how it should be understood (Section 5.1.1). Then some arguments for its plausibility will be presented (Section 5.1.2). Finally, an analysis is offered of what people tend to observe when they observe English as a language (Section 5.1.3). Of course, what applies to English can in this respect apply equally to French, Vietnamese, or Wolof. I will use the term named language to refer to language in this perspective.
5.1.1 Why English cannot exist
One reason for the lack of understanding referred to in (1b) is the rather provocative and arguably misleading formulation of (1a). A typical response would be, ‘so what language is this book written in?’. Another objection is that Chomsky himself makes claims such as (2).
(2) ‘PRO is singular in Spanish, English and French, but plural in Italian.’ [Chomsky (1981a: 61)]
How can PRO, the empty category that functions as the subject of infinitives (cf. (56) in Section 4.2.4.1), have any properties in English if there is no such thing as English? The key to understanding (1a) is that it uses exist in a precise, terminological sense, as indicated by really. Thus, (1a) means that there is no entity in the real world corresponding to ‘the English language’. In the terminology introduced in Section 2.1.3, English is only an E-language. Its use in (2) is of the same kind as that of ‘a language as a set of sentences’ in Chomsky and Halle’s (1968) statement (18) in Section 2.1.3. Chomsky states this more explicitly in (3).
(3) a. ‘Chinese is the language of Beijing and Hong Kong, but not Melbourne. […]
b. The first is true, but “Chinese” surely has no real world denotatum, in the technical sense, nor need one believe that it does to assign truth value.’ [Chomsky (1995a: 25)]
Chomsky gives (3a) as an example statement and in (3b) refers to this statement with ‘The first’. In the parallel statement to (1a) for Chinese, ‘doesn’t really exist’ is paraphrased in (3b) by ‘has no real world denotatum’. As the start of an explanation we can consider the approach to the relationship between language and communication in (4).
(4) a. ‘I am using the term “language” to refer to an individual phenomenon, a system represented in the mind/brain of a particular individual.
b. If we could investigate in sufficient detail we would find that no two individuals share exactly the same language […]
c. Two individuals can communicate to the extent that their languages are sufficiently similar.’ [Chomsky (1988: 36)]
In (4a) Chomsky refers to I-language as an object of study in linguistics. In (4b) he reiterates Bloomfield’s (12) in Section 3.1.1.2. Although Bloomfield of course held a different, non-mentalist view of language, the observation as such is pertinent to both. Mutual intelligibility is the classical test to determine whether two people speak the same language. In (4c) Chomsky uses it as a measure for perceived similarity instead. What (4a) highlights is that the model of Chomskyan linguistics as presented in Chapter 2 does not leave room for the study of named languages such as English. As a mental object, realised in the speaker’s mind, an I-language is a real-world entity. There is no sense in which two people, for instance Paul McCartney and David Beckham, can have the same I-language. They would have to share (part of) their minds for this to happen. This impossibility explains the observation (4b). The only level of generalisation that is an object of study in Chomskyan linguistics is that of the language faculty. According to Chomskyan linguistics, there is no principled sense in which Paul McCartney and David Beckham share a language with each other but not with, for instance, Cecilia Bartoli or Steffi Graf. The concepts of English, Italian, and German do not have a status that enables them to characterise this difference in real terms. Chomsky expresses this in (5).
(5) a. ‘In the empirical study of language, it has long been taken for granted that there is nothing in the world selected by such terms as “Chinese”, or “German”, or even much narrower ones.
b. Speaking the same language is much like “living near” or “looking like”; there are no categories to be fixed.’ [Chomsky (1995a: 48f.)]
By ‘the empirical study of language’ in (5a), Chomsky refers to a type of linguistics that includes his own approach. The ‘categories’ in (5b) refer to classes such as ‘English’. Only similarity can be established, something which is subject to intuitive judgement along an indefinite number of vaguely determined dimensions. The qualification in (5a) that this position ‘has long been taken for granted’ may be controversial. On the one hand, Bloomfield (1933) makes an observation similar to (4b), as we saw in (12) of Section 3.1.1.2. On the other hand, Chomsky’s statement that language (i.e. E-language) is more abstract than grammar (i.e. I-language) quoted as (40) in Section 2.3.1, is preceded by ‘I do not know why I never realized that clearly before’ (2004 [1982]: 131). Here the reference point for ‘before’ is Chomsky (1980a), a series of lectures held 1976–1979. Therefore, it is not obvious that ‘long’ in (5a) extends back beyond the mid 1970s. However, the pertinence of the insight in (5b) does not depend on when it emerged. In conclusion, named languages such as English do not have any theoretical status in Chomskyan linguistics. English is treated as a label not referring to any real-world entity. This does not exclude the use of such labels in non-theoretical statements such as (3a) or in a pre-theoretical sense as in (2). In the case of (2), a theoretical entity, PRO, is attributed theoretical properties, being singular or plural, but it is not necessary to determine the precise boundaries of the named languages before (2) can be interpreted.
5.1.2 Why English is a problematic notion
Chomsky is concerned to show that the lack of any theoretical status of named languages in Chomskyan linguistics is not only a necessary consequence of the research programme, but also a desirable consequence. The most prominent set of problems he refers to is of a well-known type and concerns the identification of languages and dialects. As shown by ten Hacken (2005), basic identification problems were recognised by Bloomfield (1933), and Hockett (1958) devises a complex system to solve them from the point of view of Post-Bloomfieldian linguistics. Hockett develops formal definitions on the basis of mutual intelligibility and sets of expressions. Chomsky formulates and exemplifies the problem as in (6).
(6) a. ‘The term is hardly clear; “language” is no well-defined concept of linguistic science.
b. In colloquial usage we say that German is one language and Dutch another, but some dialects of German are more similar to Dutch dialects than to other, more remote dialects of German.
c. We say that Chinese is a language with many dialects and that French, Italian, and Spanish are different languages. But the diversity of the Chinese “dialects” is roughly comparable to that of the Romance languages.’ [Chomsky (1980a: 217)]
The phenomenon in (6b) is described by Barbour and Stevenson (1990: 85f.). It is typical of the problem of determining boundaries between languages in a dialect continuum. A major dialect boundary separates the Northeastern part of the Netherlands (roughly the provinces of Overijssel, Drente and Groningen) from much of the rest. In the Western part of the Netherlands, Low Franconian dialects are spoken (‘Niederfränkisch’), in the Northeastern part Low Saxon dialects (‘Nordniedersächsisch’). The latter group of dialects stretch Northeast to Bremen, Hamburg, Lübeck, and Kiel.1 The fact that the dialect of Groningen is said to be a dialect of Dutch and the dialect spoken in Bremen a dialect of German is the result of what were at least around 1900 political rather than linguistic considerations. Meanwhile, the increased influence of nationally oriented mass media and education has resulted in an alignment of the Groningen dialect with Dutch and of the Bremen dialect with German.2
The problem in (6c) concerns the question of how many dialects are grouped together into one language. The Chinese language situation is discussed by Sun (2006: 5-10). Traditionally, seven major dialect groups of Han Chinese are distinguished, of which Mandarin has the biggest number of speakers. According to Sun, ‘The seven major Chinese dialect groups are actually like many European languages that are members of the Indo-European language group but are mutually unintelligible.’ (2006: 7). Many of the Romance languages constitute a dialect continuum in which it is difficult to draw any boundaries. While probably no one would argue that they are all dialects of a single language, the number of Romance languages is not obvious. There are long-standing issues such as the status of Galician and Valencian. As Luraghi (2005) outlines, Galician was historically linked to Portuguese, politically to Spanish, and is now officially (i.e. politically) recognised as a separate language.
In the case of Valencian, the question is whether it is a dialect of Catalan or a language of its own. Often such issues have political implications and the choice of position is more politically than linguistically motivated.
The list of examples of these phenomena can be extended almost at will, although there are probably more examples in Europe with its traditional identification of language and nationhood. Chomsky (1993b: 20) mentions the example of Skåne. In 1658 this region went from Denmark to Sweden and overnight was deemed to change from speaking a Danish to speaking a Swedish dialect. 3 A recent example is the split of what used to be called the Serbo-Croatian language into three languages, Serbian, Croatian, and Bosnian (cf. Greenberg, 2005). Of course, a genuine change in language use may have occurred, in part due to political or even physical pressure. Another set of phenomena that makes the traditional concept of language hard to use in the study of grammar concerns the existence of incomplete acquisition. Chomsky (1995a: 39f., 2000b: 99f.) mentions a number of cases in which it is ‘meaningless’ to ask whether a person knows, understands, or speaks English, for instance young children learning English, adults learning English as a second language, people with different types of aphasia. The fact that English cannot be dealt with as a language in Chomskyan linguistics is not considered a serious problem. Here Chomsky and others often point to various parallels with other sciences. (7)
‘we do not expect the theory of vision to deal with Clinton’s vision of the international market, or expect the theory of language to deal with the fact that Chinese is the language of Beijing and Hongkong, though Romance is not the language of Bucharest and Rio de Janeiro.’ [Chomsky (1995a: 9f.)]
In (7), Chomsky implies that the ambiguity of language between I-language and named languages is comparable to the one of vision between ‘the faculty of sight’ and ‘opinion, view’. Smith (1999: 16) gives the example of a sunset, on which the physicist’s view differs markedly from the commonsense one. Lightfoot draws the parallel in (8). (8)
‘Just as it is unimportant for most work in molecular biology whether two creatures are members of the same species […], so too the notion of a language is not likely to have much importance within this biological perspective.’ [Lightfoot (2005: 59)]
The parallel between species and languages is interesting because in both cases, the commonsense view sees a clear distinction whereas closer inspection reveals complex borderline cases. In both cases there seems to be an intuitive criterion (mutual intelligibility, interbreeding), which raises complications when one tries to formalise it. 4 Moreover, recent developments in the research programme of Chomskyan linguistics, as explained in Section 2.6, use compatibility with an account of the emergence of language in evolution as a criterion to constrain linguistic theory. Evolution is of course the central concept in the study of species.
In conclusion, proponents of Chomskyan linguistics do not regret that named languages such as English cannot be theoretical concepts in their approach to the study of language, because as a concept they are problematic anyway and the type of problem they raise is also found in pre-theoretical, commonsense concepts that are not used in a scientific sense in other fields of science.
5.1.3 English as a phenomenon
Although it is not necessary to account for all folk notions in a particular field, a scientific theory must in some way accommodate the underlying phenomena. If named languages such as English are not the correct way to group phenomena, linguistic theory need not give a unified account of the group of phenomena labelled English. However, it must still accommodate the individual phenomena, either by explaining them or by assigning them to the domain of other fields. A central component of the explanation for the relatively unified impression that languages such as English give is that the I-languages of speakers of English are in fact very similar. The same can be said for German, Dutch, and other languages. There is no need in Chomskyan linguistics to make this similarity more absolute, for instance by indicating a boundary between Dutch and German. It is sufficient to state the mutual intelligibility in the form of (4c). Part of the similarity and difference can be explained by the opposition between core language and periphery, introduced in (74) in Section 2.4.4. The relevance can be illustrated by the English sentences in (9), their German equivalents in (10), and the Danish equivalents in (11). (9)
a. Petra sees Quincy.
b. Every day Petra sees Quincy.
c. It is good that Petra sees Quincy.
(10)
a. Petra sieht den Quincy.
b. Jeden Tag sieht Petra den Quincy.
c. Es ist gut, dass Petra den Quincy sieht.
(11)
a. Petra ser Quincy.
b. Hver dag ser Petra Quincy.
c. Det er godt at Petra ser Quincy.
In (9) we see that in English the subject always precedes the verb. In German, the rule is that the finite verb is in second position in main clauses (verb second, V2) and all other verbs are at the end of the clause. Therefore, when an adverbial precedes the verb in (10b), the subject follows it. In (10c) the
verb in the subordinate clause is in final position. Danish, as illustrated in (11), has verb second both in main clauses and in subordinate clauses. Contrasts of the type illustrated by (9–11) are typically constant for named languages. They are accounted for in Chomskyan linguistics by means of parameters. Parameters of this type define a limited number of (classes of) core languages (cf. Section 4.2.4.2). Each of (9–11) represents a class of core languages sharing certain parameter values. Thus, Swedish is like Danish in this respect and Dutch is like German. Whereas the syntax of (9–11) illustrates the opposition between core languages, the choice of sees, sieht, and ser is an example of the periphery. Here a much larger degree of variation is possible between individual speakers. In the case of core language contrasts of the type illustrated in (9–11), a classification of speakers is possible in terms of different core languages to the extent that the relevant parameters and their possible values are known. In the periphery, gradual transitions are to be expected. Thus, in the borderline area of Germany and Denmark, the word order contrast between (10c) and (11c) provides a better criterion to identify speakers of German and of Danish than the form of the verb. However, the oppositions that can be expressed in terms of core and periphery can only explain a small part of the phenomena grouped together in the commonsense notion of named languages. Linguistic boundaries such as the one between Dutch and German are determined to a large extent by political considerations (cf. the discussion of (6b) above). Another aspect not covered by purely linguistic considerations is the standardisation of such languages. Standardisation yields a criterion enabling one to evaluate a person’s I-language, something which is as devoid of sense in Chomskyan linguistics as evaluating a planet’s orbit would be in astronomy. Chomsky expresses this in (12). (12)
a. ‘Such languages often are “cultural artifacts” in a narrower sense: partially invented “standard languages” that few may speak and that may even violate the principles of language. […]
b. There is little interest in studying the behavior of the French Academy, for example.’ [Chomsky (1995a: 51)]
In (12a), ‘Such languages’ refers to named languages such as English or French. They are ‘partially invented’ in the sense that a standard has been established consciously. The best-known example of such a standard is the one the Académie Française sets for French. The statement in (12b) is valid from the point of view of Chomskyan linguistics, which is interested in language
as a natural property of the human mind. There are many other perspectives from which the deliberations and decisions of the Académie are in fact highly interesting. One can, for instance, consider its work as part of applied science, concerned with the establishment of a usable and appropriate standard, cf. ten Hacken (2006b). Another interesting question is how the existence of such an institution influences language use.
In the case of English, where no corresponding academy exists, one of the major areas of study is the emergence of different varieties. An extensive overview of this research area is presented by Kortmann and Schneider (2004). For a brief overview of some key issues see also Allerton et al. (2002). As far as the issues concern the attitude of speakers and non-speakers to these varieties, they are mainly sociological. As far as the grammatical properties of the varieties are studied, the questions guiding research can be formulated as questions about I-language. The study of languages and varieties as cultural products with a social dimension, as proposed in different ways by Millikan (2003) and Wiggins (1997), can be seen as complementary but contingent to the type of study Chomskyan linguistics requires.
SUMMARY
• Named languages such as English, Dutch, or Polish are not real-world entities.
• The boundaries between named languages are fuzzy and subject to political and other non-linguistic factors.
• Some of the similarities and differences between I-languages can be explained in terms of parameters.
5.2 Empirical aspects of language acquisition
As described in Section 2.4, language acquisition is at the centre of Chomskyan linguistics. The question of language acquisition is strongly connected to the question of the nature of I-language. The research programme specifies that I-language should be described in such a way that language acquisition can be explained. This produces the tension between description and explanation that makes descriptive and explanatory adequacy attainable in principle (cf. (67) in Section 2.4.3). As explained in Section 2.4.1, the central position of language acquisition in the research programme does not imply that the process of language
acquisition should be the primary or even a privileged area of empirical research. The perspective of language acquisition that is essential in the research programme is that of a logical problem, determined by the poverty of the stimulus. What is important for establishing the correct grammar of an I-language and the correct version of UG is that language acquisition takes place, not how. Given the idealisation of instantaneous acquisition as described in Section 2.4.4, Hornstein and Lightfoot explain in (72) of that section that ‘We are also under no obligation to pay special attention to child grammars’ (1981: 30, fn. 8). With the transition from Standard Theory to GB-theory described in Section 2.5.1, the possibility arose to give substance to the process of language acquisition within Chomskyan linguistics. Whereas in most studies directed to grammar, the process of language acquisition could still be taken as in Figure 2.4, in which only the logical problem is represented, some studies replaced this perspective by the one in Figure 2.5, which shows the stages S0 and SS as the starting point and the end point of a process. Rather than giving an overview of research resulting from this perspective, this section intends to describe the relationship of this research to the research programme and the main issues evolving from this relationship. First a general discussion of the nature of language acquisition in a Principles and Parameters (P&P) model is given in Section 5.2.1. After that a number of issues are sketched. Section 5.2.2 looks at a number of hypotheses on how data are processed by the child in language acquisition. The development of the language faculty and its role in language acquisition are discussed in two parts. The critical period for language acquisition, explained in Section 5.2.3, concerns the period of availability of the language faculty. 
In Section 5.2.4, maturation and continuity are described as two competing hypotheses on the relation between the language faculty and the developing I-language.
5.2.1 Language acquisition as parameter setting
The central idea of the P&P model is that the transition from S0 to SS in Figure 2.5 is a matter of setting parameters to specific values. This process is described by Atkinson (1992: 100) and illustrated in Figure 5.1.
Figure 5.1: Parameter setting in language acquisition
In Figure 5.1, the large grey circles represent different stages of the language faculty, including all principles, and the small white circles represent (a small sample of) parameters. In reality, there will be more parameters than would fit in this figure. Circles with ‘…’ represent parameters that have not been set. In the initial state S0 this applies to all parameters. In this state, only the nature of the principles and the range of possible values for the parameters have been determined. The middle part of the figure represents a single parameter setting step. Si is an intermediate stage in the language acquisition process. Some parameters have been set, others not. The value of a parameter that has been set is represented by a letter. In Si+1 one more parameter has been set, the one with value g. SS represents the final stage, the stable state in which all parameters have been set. Setting a parameter corresponds to ‘learning’ (an aspect of) how a principle of UG is applied in the language of the environment. It is in the context of Figure 5.1 that we should interpret Chomsky’s statements in (13). (13)
a. ‘Take a somewhat different case: my four-year-old granddaughter. Does she speak English?’ [Chomsky (1995a: 39)]
b. ‘The fact that ordinary language provides no way to refer to what my granddaughter is speaking is fine for ordinary life, but empirical inquiry requires a different concept.
c. In that inquiry, her language faculty is in a certain state, which determines (or perhaps is) her “language”.’ [Chomsky (1995a: 49)]
The original context of (13) is an argument to show that the commonsense notion of English (or other named languages) is problematic. The other case implied by (13a) is another example of such a problem. By ‘empirical inquiry’ in (13b) Chomsky refers to research in the research programme of Chomskyan linguistics. In (13c), ‘language’ is linked to a certain state of the language faculty, e.g. Si. An essential insight from the representation in Figure 5.1 is that each intermediate state can be described as an I-language. The only conceptual difference between a mature I-language and the I-language of Chomsky’s granddaughter in (13) is that the latter has unset parameters. A much discussed parameter is illustrated by the contrast in (14–15). (14)
a. she sleeps
b. *pro sleeps
(15)
a. ella dorme
b. pro dorme
The Italian examples in (15) are translations of English (14a). In (15b), pro is an empty category, the unpronounced subject pronoun. In English, she cannot be left out in (14a), because English does not have pro. Therefore (14b) is ungrammatical. In Italian, (15b) is the unmarked version, but (15a) can be used to emphasise the subject (e.g. ‘she as opposed to us’). It seems then that Italian is more liberal than English. However, this is not the case in (16–17). (16)
a. it rains
b. *pro rains
(17)
a. *esso piove
b. pro piove
In the case of weather verbs, English has a non-referential (‘expletive’) pronoun it, as in (16a). As pro is not available, (16b) is ungrammatical. Italian has pro, which explains the grammaticality of (17b). In cases such as this one, however, pro is the only possible pronoun. Any lexical pronoun, e.g. (17a), is ungrammatical. The conventional label for the contrast in (14–17) is that Italian is a ‘pro-drop’ language and English is not. 5 Simplifying considerably, we could express this contrast as the result of different settings of a parameter [±prodrop]. The informal statement that Italian has the value [+prodrop] and English [–prodrop] for this parameter means that a speaker of Italian has [+prodrop] specified in his or her SS and a speaker of English [–prodrop]. We could say, for instance, that this parameter is the one specified as g in the transition from Si to Si+1 in Figure 5.1.
Two issues concerning the nature of parameter setting can be illustrated on the basis of the prodrop example. The first is the nature of the value of the parameter before it is set. In Si, should the value be represented as [αprodrop], i.e. unspecified, or should it have one of the two values, the unmarked value, for instance [+prodrop]? In the former case, children learning Italian or English always have to set the parameter. In the latter case, a child learning Italian does not have to do anything, whereas a child learning English has to change [+prodrop] to [–prodrop]. Another issue is how and on the basis of what evidence the parameter is set. Obviously, (14–17) are simple sentences and they are sufficient to determine the right setting of [±prodrop]. However, as Wexler (1991) shows, we cannot assume that these data are available to the child. As already mentioned in Section 2.4.1, it is generally accepted that negative evidence does not play an essential role in language acquisition. Although some degree of correction may occur, language acquisition does not depend on it. This means that the English-learning child cannot be assumed to have the information that (14b) and (16b) are ungrammatical. Moreover, the input may actually contain errors, e.g. utterances such as (14b), without any marking that they are ungrammatical. This is the basis of the ‘argument of the poverty of the stimulus’. The learning process itself can also be modelled in more than one way. Atkinson (1992: 101) mentions on the one hand learning by testing hypotheses, on the other hand triggering. In the former case, the child would compare [+prodrop] and [–prodrop] settings with the actual input and choose the more compatible value. In the latter case, the value emerges as a side effect of the attempts to interpret the input. 
Triggering does not require comparative evaluation, so that it seems more compatible with the spirit of a P&P approach, as expressed in the view that language grows in the child (cf. (56) in Section 2.4.1). Valian (1999) discusses the issue in more depth.
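The two learning models mentioned by Atkinson, and the two options for the value of the parameter before it is set, can be made concrete in a small sketch. It is only a toy illustration: the utterance labels, the compatibility table read off (14–17), and the function names hypothesis_testing and triggering are assumptions introduced here, and neither function is meant as a model of what a child actually does. Both functions receive positive evidence only, in line with the assumption that negative evidence plays no essential role.

```python
# Toy illustration only. The utterance labels and this compatibility table are
# assumptions read off examples (14)-(17); they are not taken from Atkinson (1992).
COMPATIBLE = {
    "overt referential subject": {"+prodrop", "-prodrop"},  # (14a), (15a)
    "null referential subject": {"+prodrop"},               # (15b)
    "overt expletive subject": {"-prodrop"},                # (16a)
    "null expletive subject": {"+prodrop"},                 # (17b)
}


def hypothesis_testing(inputs):
    """Compare both values with the input and return the more compatible one."""
    scores = {"+prodrop": 0, "-prodrop": 0}
    for utterance in inputs:
        for value in COMPATIBLE[utterance]:
            scores[value] += 1
    if scores["+prodrop"] == scores["-prodrop"]:
        return None  # the input seen so far does not decide
    return max(scores, key=scores.get)


def triggering(inputs, initial=None):
    """Let an utterance that fits only one value fix that value directly.

    initial=None models an unspecified value; initial="+prodrop" models an
    unmarked default that English-like input would have to change.
    """
    value = initial
    for utterance in inputs:
        fits = COMPATIBLE[utterance]
        if len(fits) == 1:
            (value,) = fits
    return value


english_input = ["overt referential subject", "overt expletive subject"]
italian_input = ["null referential subject", "overt referential subject"]

print(hypothesis_testing(english_input))
print(triggering(italian_input, initial="+prodrop"))
```

In this sketch hypothesis testing needs a comparison of scores for both values, whereas triggering never evaluates the competing value, which is the difference the text singles out as making triggering closer to the spirit of a P&P approach.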
5.2.2 Learning strategies
Much of the research on first language acquisition in Chomskyan linguistics is devoted to the description of the individual stages such as Si in Figure 5.1 or to the comparison of sequences of such stages. Thus, Guasti’s (2002) textbook introduction is full of linguistic examples of child varieties of English and other languages and of proposals for their analysis. Another part of the research is devoted to the strategies applied, of course unconsciously, by the child in language acquisition. This perspective is less common. It is taken, for instance, by Atkinson’s (1992) textbook. Two issues that arise in this context are the organisation of the values for parameters and the complexity of the input data used by the child.
Atkinson (1992: 139) considers two possible organisations of the values of a parameter such as [±prodrop]. One is based on markedness. In this view, one of the possible values of the parameter is chosen as the default value. The other one is the marked value. More data are necessary to set a parameter to the marked value than to the unmarked one. The question of markedness has played a central role in the discussion of the analysis of subjectless sentences produced by young children learning languages such as English in which no prodrop is possible. Guasti (2002: 151–185) discusses this phenomenon and presents a number of analyses that have been proposed. An example is (18). (18)
e want banana
In (18), e is an unpronounced but understood subject. Hyams (1986) argues that e is pro, so that early English, Si in Figure 5.1, is a prodrop language. She notes that early English also lacks expletive it and there and non-emphatic be, resulting in examples such as No morning for ‘It’s not morning’ (1986: 63). Sentences such as (18) coexist with sentences with lexical subjects, including personal pronouns. All these observations coincide with the predictions of a [+prodrop] value of the parameter. Hyams (1986: 158) proposes that [+prodrop] is the unmarked value of the parameter and children learning English have to ‘reset’ the parameter to [–prodrop] in the course of their acquisition process. This resetting cannot be done on the basis of the ungrammaticality of examples like (14b), because no negative data are available, but it can be done on the basis of examples like (16a), because expletive pronouns of this type only occur in non-prodrop languages. Hyams (1996) abandons this analysis because empirical and conceptual problems had been discovered. It was found, for instance, that children learning Italian have pro more frequently and in more syntactic positions than the e exemplified in (18) occurs in early English. It is also conceptually unattractive to state that a parameter is set to a potentially definitive value, which might occur in SS, and subsequently changed. An alternative proposal for the organisation of parameter values is the Subset Principle, formulated in (19). (19)
‘The learning function maps the input data to that value of a parameter which generates a language: (a) compatible with the input data; and (b) smallest among the languages compatible with the input data.’ [Wexler and Manzini (1987: 61), their (19)]
The scope of this principle can be illustrated with the diagrams in Figure 5.2.
Figure 5.2: The Subset Principle
In Figure 5.2, two values of a parameter are visualised in terms of the data that are compatible with them. In constellation A, the possible evidence for value 1 and for value 2 overlaps. In constellation B, the data compatible with value 1 are a subset of the data compatible with value 2. Wexler and Manzini (1987) argue that in constellation B, the child has to start from the assumption that value 1 is correct. If the input contains enough data that are only compatible with value 2, the parameter value has to be changed to value 2. If the child takes value 2 as the starting point, then no data whatsoever can trigger a change to value 1. This is the reason for condition (b) in (19). Wexler and Manzini then propose the Subset Condition in (20). (20)
‘In order for the Subset Principle to determine a strictly ordered learning hierarchy – this is what we will mean when we say that the Subset Principle as we define it here applies – it is necessary that two values of a parameter in fact yield languages which are in a subset relation to each other (i.e., one is a subset of the other). This requirement we will call the “Subset Condition”.’ [Wexler and Manzini (1987: 45)]
In (20) Wexler and Manzini propose that all parameters must be organised as in B. They exclude the more general constellation in A. In the case of the [±prodrop] parameter, the data in (14–17) can be grouped into the classes of Table 5.1.
[+prodrop]      [±prodrop]      [–prodrop]
pro dorme       she sleeps      it rains
pro piove       ella dorme
Table 5.1: Classes of data for [±prodrop]
The negative evidence of (14b), (16b) and (17a) is not available to the child and is not included in the table. At first sight, Table 5.1 suggests a constellation as in Figure 5.2 A. There is an overlap of data compatible with either value, which occur both in English and in Italian, but each value has data that are incompatible with the other. This would mean that all the child has to do is to wait for this type of data to occur. In fact, Hyams (1986: 154–156) argues that the Subset Principle does not apply to the prodrop parameter. A different view is possible, however. Koster (1986) discusses prodrop in relation to a number of other syntactic phenomena for a broad range of Germanic languages. He notes that while none of them is [+prodrop] like Italian, they are not all [–prodrop] like English either. We have seen that Dutch has sentences without overt subjects in impersonal passives, as in (86b) in Section 2.5.1.2. Icelandic is like Dutch in this respect, but it behaves like Italian in sentences with weather-verbs as in (17). What this suggests is that [±prodrop] is too simplistic as a parameter and has to be split up into (at least) three parameters, one governing the contrast (14–15), another for the contrast (16–17), and a third for impersonal passives. If we call the first of these [±referential prodrop], we can represent the evidence for setting it in Table 5.2.
[+referential prodrop]      [±referential prodrop]
pro dorme                   she sleeps
                            ella dorme
Table 5.2: Classes of data for [±referential prodrop]
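The difference between the constellations suggested by Table 5.1 and Table 5.2 can be checked mechanically in a small sketch. The sets below simply copy the evidence classes from the two tables. The dictionary layout and the function names satisfies_subset_condition and subset_principle are assumptions made for illustration, not Wexler and Manzini's formalisation. The sketch also uses set size as a stand-in for set inclusion, which is only safe because the toy languages are nested.

```python
# Evidence classes copied from Tables 5.1 and 5.2. The dictionary layout and
# the function names are illustrative assumptions, not Wexler and Manzini's.
TABLE_5_1 = {
    "+prodrop": {"pro dorme", "pro piove", "she sleeps", "ella dorme"},
    "-prodrop": {"it rains", "she sleeps", "ella dorme"},
}
TABLE_5_2 = {
    "+referential prodrop": {"pro dorme", "she sleeps", "ella dorme"},
    "-referential prodrop": {"she sleeps", "ella dorme"},
}


def satisfies_subset_condition(languages):
    """True if one of the two languages is a proper subset of the other, as in (20)."""
    first, second = languages.values()
    return first < second or second < first


def subset_principle(input_data, languages):
    """Pick the value whose language is compatible with the input and smallest, as in (19)."""
    compatible = [v for v, lang in languages.items() if input_data <= lang]
    if not compatible:
        return None
    # Size stands in for set inclusion; this is only safe for nested languages.
    return min(compatible, key=lambda v: len(languages[v]))


print(satisfies_subset_condition(TABLE_5_1))
print(satisfies_subset_condition(TABLE_5_2))
print(subset_principle({"she sleeps"}, TABLE_5_2))
print(subset_principle({"she sleeps", "pro dorme"}, TABLE_5_2))
```

Under these assumptions, input consisting only of sentences with overt subjects leaves the learner at the value with the smaller language, and only a datum such as pro dorme moves it to the value with the larger one.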
In Table 5.2, the two values are in a clear subset relation to each other. [–referential prodrop] is compatible with a subset of the evidence compatible with [+referential prodrop]. Therefore, in terms of Figure 5.2 B, [–referential prodrop] is value 1 and [+referential prodrop] is value 2. Data for prodrop with weather verbs are not included because they belong to the evidence for a different parameter with a separate table. The way the prodrop parameter has been brought into line with the Subset Principle in this example is typical of a more general tendency. The Subset Condition tends to lead to the split of parameters having a range of correlated effects into a larger number of more specialised parameters. Apart from the way parameters are organised, the type of input data has also been the subject of discussion. As mentioned above, it is generally assumed that negative evidence is not necessary for language acquisition. The discussion of the complexity of the input data required for language acquisition started in the
context of formal learnability theory. The measure of complexity is expressed in terms of the number of embedded sentences. (21)
a. Who does Tatjana love?
b. Who does Sjoerd believe that Tatjana loves?
c. Who does Renata think that Sjoerd believes that Tatjana loves?
In (21a) there is no embedded sentence, so it is of degree-0. Added embeddings make (21b) of degree-1 and (21c) of degree-2. The work on degree-n learnability proofs started in the framework of Standard Theory. As summarised by Atkinson (1992: 45–53), Wexler and Culicover (1980) proved that systems of rewrite rules could be learned on the basis of degree-2 input. Morgan (1986) then proved that degree-1 input was sufficient provided the input was presented with additional structural information (unlabelled bracketing). In the context of a P&P model, such results are of limited significance. The very idea of P&P is that parameters are finite in number and in range of values. Instead of proofs, attention has moved to hypotheses as to the required complexity. Atkinson defines the Degree-n Condition as in (22). (22)
‘There is an n (n ≥ 0) such that for any parameter p and any values i, j of p, there is a primary linguistic datum d of degree less than or equal to n such that d does not belong to L(p(i)) ∩ L(p(j))’. [Atkinson (1992: 227)]
The ‘primary linguistic data’ referred to in (22) are the input to the child in the language acquisition process (cf. Figure 2.4). L(p(i)) is the language generated by a grammar with parameter p set to value i. This sense of language as a set of sentences can be seen as an E-language, but this is not necessary. One can also see L(p(i)) as the set of potential evidence for the child, without assuming that the child learns an E-language. 6 L(p(i)) ∩ L(p(j)) is the intersection of the sets of evidence compatible with the two values of p. As shown in Table 5.1, the [±prodrop] parameter can be decided either way on the basis of degree-0 data. The middle column represents the intersection, i.e. the data compatible with either value. Italian data in the left-hand column and the English example in the right-hand column, however, all of degree-0, are only compatible with one of the values. Therefore they can be used to set the parameter to the Italian and the English value, respectively. Lightfoot (1989, 1994) proposes degree-0 learnability as a general condition on parameter setting. This means that embedded clauses in sentences such as (21b-c) can be ignored by the child without affecting the resulting parameter settings. No proof of the same type as the ones for degree-2 and degree-1 learnability can be given for this hypothesis. Its significance lies in guiding the way parameters are formulated.
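The check that (22) requires for a single parameter can be sketched with the data of Table 5.1. The sketch rests on several assumptions: the degree of each datum is annotated by hand, the names DEGREE, L, distinguishing_data and degree_n_condition are introduced here for illustration, and the datum d is taken from the union of the two sets of evidence, so that lying outside the intersection amounts to lying in the symmetric difference. The condition in (22) quantifies over all parameters, whereas the sketch covers only the two values of [±prodrop].

```python
from itertools import combinations

# Degree = number of embedded clauses, annotated by hand (an assumption of
# this sketch). All items come from Table 5.1 and are of degree-0.
DEGREE = {
    "pro dorme": 0,
    "pro piove": 0,
    "she sleeps": 0,
    "ella dorme": 0,
    "it rains": 0,
}

# L[value] plays the role of L(p(i)): the evidence compatible with that value.
L = {
    "+prodrop": {"pro dorme", "pro piove", "she sleeps", "ella dorme"},
    "-prodrop": {"it rains", "she sleeps", "ella dorme"},
}


def distinguishing_data(lang_i, lang_j, n):
    """Data of degree <= n that lie outside the intersection of the two sets."""
    return {d for d in lang_i ^ lang_j if DEGREE[d] <= n}


def degree_n_condition(languages, n):
    """True if every pair of values is separated by some datum of degree <= n."""
    return all(
        distinguishing_data(languages[i], languages[j], n)
        for i, j in combinations(languages, 2)
    )


print(degree_n_condition(L, 0))
print(sorted(distinguishing_data(L["+prodrop"], L["-prodrop"], 0)))
```

The set returned by distinguishing_data corresponds to the left-hand and right-hand columns of Table 5.1, while the middle column is the intersection that the condition excludes.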
5.2.3 The critical period hypothesis
In neuroscience, a critical period is defined as ‘A strict time window during which experience provides information that is essential for normal development and permanently alters performance’ (Hensch, 2005: 877). After the critical period, the capacities concerned can no longer be acquired. A classic example is the work by Wiesel and Hubel (1963a, b) on the development of vision in cats. They experimented with distorting visual input by suturing together the lids of the kittens’ right eye before they opened their eyes. After 9–13 weeks, they observed that ‘cell bodies and nuclei were much smaller for layers receiving their input from the closed eye’ (1963a: 981). After opening the right eye again, the eye itself behaved normally, but the kittens regularly ‘bumped into large obstacles such as table legs’ (1963b: 1006) because the information was not processed correctly. The same experiments were carried out with a mature cat, but no similar effects were observed (1963a: 987). From experiments such as these, it was concluded that the neural circuits that are essential for the processing of visual information can only develop when visual input is available during the critical period of the first months after eye-opening. Colombo (1982) gives an overview of subsequent research. He identifies four defining elements of a critical period, listed in (23). (23)
a. An identifiable onset and terminus
b. An intrinsic component related to maturation of the organism
c. An extrinsic component related to environmental stimuli
d. A critical system affected by stimulation
A critical period for language acquisition was proposed by Lenneberg in (24).
(24)
a. ‘Language cannot begin to develop until a certain level of physical maturation and growth has been attained. Between the ages of two and three years language emerges by an interaction of maturation and self-programmed learning.
b. Between the ages of three and the early teens the possibility for primary language acquisition continues to be good; […]
c. After puberty, the ability for self-organization and adjustment to the physiological demands of verbal behavior quickly declines.’ [Lenneberg (1967: 158)]
If we compare (24) with the general criteria in (23), it is obvious that Lenneberg concentrates on (23a), specifying the boundaries of the period. What Lenneberg calls ‘the “Critical Period” for Language Acquisition’ (1967: 175) is bounded
in his Figure 4.4 (1967: 159) by ‘Physical immaturity’ up to two years of age, corresponding to (24a), and by the ‘Loss of flexibility for cerebral reorganization’ after age twelve, corresponding to ‘After puberty’ in (24c). The remaining items of (23) are indicated by the reference to ‘primary language acquisition’ in (24b). For (23b) we can understand the genetically determined input, i.e. the language faculty. As for (23c), Colombo (1982: 264) criticises the practice of specifying this component simply as ‘language’. However, the discussion of the nature of the primary linguistic data, for instance in terms of degree-n learnability, constitutes some further specification. As for (23d), Colombo criticises that ‘one finds no distinction made as to what aspects of the acquisitional process are most significantly shaped during this period’ (1982: 264). Given the discussion in Chapter 2, we expect the system to be the I-language, more specifically the core as opposed to the periphery. For a more precise specification of the elements in (23), controlled experiments of the type carried out by Hubel and Wiesel on cats would be optimal. However, they are impossible for obvious ethical reasons, because they could only be carried out on children. The only evidence can be gathered from what Colombo calls ‘naturally occurring abnormal or pathological conditions that vary across individuals on age of onset’ (1982: 266). Although they are sometimes called ‘nature’s experiments’, they lack the controlled nature of proper experiments, so that the quality and interpretation of their outcomes are less reliable. A first class of such cases is children who grow up without linguistic input because they are isolated. 
Lenneberg discusses these cases and concludes that the evidence cannot be used because ‘The children are invariably discovered by well-meaning but untrained observers, and the urgency for getting help is so overwhelming that the scientifically most important first months are the least well-documented’ (1967: 141). Against this background, the special nature of the evidence gathered by the study of a girl called Genie, described by Curtiss et al. (1974), should be appreciated. This case is special because the state of Genie’s linguistic competence when she was discovered and its development after that were monitored by a team of linguistic scholars so that it is well-documented. Genie was discovered in 1970 when she was 13. She had been isolated by her apparently psychotic parents who had treated her in such a way that she did not produce any sounds. From a variety of tests carried out in the first couple of months, Curtiss et al. concluded that ‘It appeared, therefore, that Genie was a child who did not have linguistic competence’ (1974: 530). They describe Genie’s rapid cognitive development, but note that it is uneven
in an important respect. Retarded development is usually expressed in terms of the ‘normal’ age at which a specific stage is reached. In the case of Genie, ‘her vocabulary is much larger than that of children at the same stage of syntactic development’ (1974: 540). In a later overview, Curtiss concludes that while Genie managed to acquire some syntactic knowledge, ‘much of the grammar remained unacquired’ (1988: 98). According to Eubank and Gregg (1999: 74), for instance, she was never able to produce subordinate clauses or grammatical wh-questions. Moreover, Colombo (1982: 269) mentions electroencephalographic research showing that she processes language with an atypical part of the brain. 7 Everything that we know about the linguistic development of Genie tends to support the critical period hypothesis. In terms of Chomskyan linguistics, the critical period applies to the core language, the system that develops into a stable state SS in Figure 5.1. This constitutes the critical system in terms of (23d). Vocabulary belongs to the periphery that can be learned throughout a person’s life. A related class of cases is deaf children of hearing parents. Lenneberg (1967: 320–324) discusses the problems deaf children encounter in language acquisition as this is approached in the specialised schools he visited. His central observation is (25). (25)
‘there can be no doubt that the deaf come in contact with language at an age when other children have fully mastered this skill and when, perhaps, the most important formative period for language establishment is already on the decline.’ [Lenneberg (1967: 321)]
The educational approach leading to (25) involves the prohibition against using gesture and the discouragement of reading and writing as a primary access to language, because these were believed to distract from the use of oral language. Lenneberg concludes that he is ‘inclined to believe that the failures in the proficiency are primarily due to shortcomings in instruction and training’ (1967: 322). It was not until the 1970s that sign languages of the deaf were widely recognised as ‘real’ languages as opposed to systems of gesture only suitable for limited communicative needs. 8 When this is recognised, (25) can be seen as an instance of deprivation of linguistic input. If the linguistic development of hearing children is taken as normal, only deaf children born to parents that are also deaf and use sign language naturally have a normal linguistic development. As Neidle et al. (2000: 9) point out, ‘fewer than 5–10%’ of deaf people grow up in such a family setting, because most forms of deafness are not hereditary. The consequences for the majority of deaf people are serious, as indicated in (26).
(26)
‘Whereas a deaf child from a Deaf, ASL-signing family will normally acquire native fluency in ASL, the deaf child of hearing, English-speaking parents has no guarantee of acquiring any native language at the usual age.’ [Neidle et al. (2000: 9)]
ASL in (26) stands for American Sign Language. Neidle et al. (2000) adopt the convention of systematically distinguishing between deaf, a physical condition, and Deaf, a cultural allegiance. The studies reported on by Mayberry (1994) support (26). In one study she investigated the skills in repeating a message in ASL of 55 deaf students who had acquired ASL starting at an age ranging from 0–18 years. Early learners were likely to make lexical substitutions, whereas late learners would make substitutions by phonologically similar but semantically different signs. 9 In conclusion, the hypothesis that there is a critical period for language acquisition cannot be tested in fully controlled experiments for ethical reasons, but the evidence that is available supports it. This evidence consists of at least one well-documented example of a hearing child growing up until after the critical period without linguistic input and a relatively large population of deaf people for which different levels of first language proficiency can be correlated to the age when they were first exposed to linguistic input in the visual mode accessible to them. The critical period applies to the acquisition of a core language, including syntax, but not to the acquisition of the periphery, including vocabulary.
5.2.4 Maturation versus continuity
In Chomskyan linguistics, language acquisition is modelled as parameter setting. While the image of parameter setting in Figure 5.1 is generally appealing, it leaves a number of issues unspecified. It is on one of these issues that the opposition between maturation and continuity turns. The two positions are presented by Wexler (1999) and Lust (1999), respectively. Lust (1999) formulates the Strong Continuity Hypothesis in (27). (27)
a. ‘Strong Continuity Hypothesis in UG
b. UG (where this term refers to the “principles and parameters” which provide the true content of UG) is a model of the Initial State;
c. it is thus available to the child from the beginning.
d. The “initial state” is taken to refer to the onset of first language acquisition, even “before experience” […]
e. UG remains continuously available throughout the time course of first language acquisition. UG does not itself change during this time course.’ [Lust (1999: 118)]
In understanding (27) it is important to see that ‘UG’ is used in two different senses in (27b) and in (27e). In (27b) it refers to the model constructed by the linguist, corresponding to ‘Universal Grammar’ in Figure 2.7. In (27e), this meaning is impossible because it is not the model that is available to the child, but the entity it is a model of, i.e. the ‘Language Faculty’ in Figure 2.7. In (27c) these two senses are conflated, because ‘it’ corefers with ‘UG’ in the sense of ‘model of the Initial State’, but is at the same time said to be ‘available to the child’. 10 Despite the fact that the whole of (27) goes under the title of (27a), the actual hypothesis is contained in (27e). (27b-c) is intended as a clarification of ‘UG’ and (27d) of ‘Initial State’. An implication of (27e) is that the Language Faculty (which is what ‘UG’ refers to here) is not the same as the entity which is in the state S0 in Figure 5.1. I will call this entity X. X changes in the course of language acquisition by the process modelled in Figure 5.1 and ends up in SS. We can think of the Language Faculty as producing X in S0 and governing its development to SS. At the end of this process, X has become a fully developed, adult I-language. In line with the discussion of (13), we can call X an I-language with unset parameters. This situation is represented in Figure 5.3.
Figure 5.3: The Strong Continuity Hypothesis [diagram not reproduced; labels: Language Faculty, S0, SS]
As shown in Figure 5.3, according to the SCH, the Language Faculty generates an entity with unspecified parameters which develops from S0 to a stable state SS. Meanwhile, the Language Faculty does not change. Wexler (1999) argues for an alternative view, addressing the question of what is needed for language acquisition in (28).
(28)
a. ‘Everybody can agree that some component of learning is needed, for example, to set language-specific parameters of lexical items. What else is needed?
b. Linguistic theory concludes that to explain how every child exposed to input from one language winds up with an essentially identical language, and one which goes way beyond what is heard in the input, it must be the case that the child has a genetically based program (that is, UG) that determines much of the form of the attained grammar.
c. Moreover, to explain how and why every child goes through many of the same steps in linguistic development, steps that are also not directed by the input, linguistic theory concludes that parts of UG grow in a way that is dictated by the genetic program.’ [Wexler (1999: 56)]
In (28), ‘UG’ is used consistently for what Figure 2.7 calls the ‘Language Faculty’. Two components of language acquisition are identified in (28), learning (28a) and a genetically based Language Faculty (28b). It is this second component which is subject to growth or maturation. We can interpret the relation of the Language Faculty to Figure 5.1 in two different ways. First, it is the Language Faculty which is in S0 at the start and develops into an I-language at SS. Growth means, then, that not all principles are active and/or not all parameters available at S0, but some of them only develop in the course of the transition to SS. Alternatively, the Language Faculty can be thought of as producing an entity X that is in S0. The subsequent development of X into an I-language at SS is influenced on the one hand by input data that determine the setting of parameters, on the other hand by the growing Language Faculty which continues to interfere with X after the initial stage. These two options are represented as A and B in Figure 5.4.
Figure 5.4: Two versions of the Maturation Hypothesis [diagram not reproduced; panels A and B, each with labels: Language Faculty, S0, SS]
In Figure 5.4 A, the Language Faculty is the entity in S0 that develops into SS. In Figure 5.4 B, the Language Faculty creates an entity X in S0 and both develop simultaneously. The development of X is observable in the succession of I-language states. The development of the Language Faculty is at least in principle observable by the influence it has on the development of the I-language. However, it is not straightforward to distinguish the influence of development of the language faculty and development of X in the data. In order to decide which of the hypotheses is the most attractive, conceptual and empirical arguments can be used. Lust refers to statements by Chomsky such as (29). (29)
a. ‘The theory of languages and the expressions they generate is Universal Grammar (UG);
b. UG is a theory of the initial state S0 of the relevant component of the language faculty.’ [Chomsky (1995b: 167)]
Lust (1999: 112) quotes (29b) alongside similar statements by Chomsky in older publications to characterise the relationship between UG and S0. She uses this as an argument against the Maturation Hypothesis (MH) in (30).
(30)
a. ‘On the MH, UG (defined independently by the science of theoretical linguistics on the basis of adult language) arises only gradually, culminating when language acquisition is completed.
b. Therefore, on the MH, the full theory of UG (i.e., the result of theory internal argumentation, which is the core of linguistic science today) characterizes the final state, not the initial state (prior to experience).’ [Lust (1999: 125)]
In (30), Lust uses ‘UG’ in the sense of the language faculty, not in the sense of a theory. The argument she advances is that there are in principle two ways to determine the nature of the language faculty. One is by abstraction from what (30a) calls ‘adult language’, i.e. the range of I-languages that can be observed. The other is by considering S0 directly. To the extent that the data collected from the two perspectives coincide it is possible to explain language acquisition. The observations of the language acquisition process can then be used as external evidence for UG in the sense of (29), which is formulated on the basis of SS I-languages. If we accept the Maturation Hypothesis, this type of argument breaks down. In the Maturation Hypothesis, UG is no longer the description of any particular state of the language faculty, but an idealisation based on abstraction from the range of I-languages. This is what must be intended by ‘characterizes the final state’ in (30b). Therefore language acquisition studies cannot provide external evidence for the evaluation of UG. The two types of evidence are no longer
independent. It should be noted, of course, that Lust’s argument in (30) is not that the Maturation Hypothesis is wrong, only that it blocks one way of using language acquisition data as evidence for the nature of the language faculty. Another possible conceptual objection to the Maturation Hypothesis is that in the intermediate stages, strange types of grammar may develop, not constrained by the fully developed language faculty. This applies especially to the version represented in Figure 5.4 B. In order to avoid this, Borer and Wexler (1992) propose that the development of the language faculty is constrained by UG-Constrained Maturation, defined in (31). (31)
‘UG-CONSTRAINED MATURATION: Given a sequence of acquisition stages S0…Sn…SS, for any Sn, Sn a maturational step, the set of representations generated at Sn is a subset of the set of representations allowed by UG.’ [Borer and Wexler (1992: 170–171)]
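The condition in (31) is at bottom a subset requirement, which the following sketch spells out. It is a simplification under stated assumptions: representations are modelled as opaque labels, every stage is treated as a maturational step, and the names `satisfies_ug_constrained_maturation`, `UG_ALLOWED`, `STAGES_OK` and `STAGES_BAD` are hypothetical.

```python
def satisfies_ug_constrained_maturation(stages, ug_allowed):
    """True if the representations of every stage are a subset of those UG allows."""
    allowed = set(ug_allowed)
    return all(set(stage) <= allowed for stage in stages)


# Hypothetical representation labels; they stand for no actual analysis.
UG_ALLOWED = {"r1", "r2", "r3", "r4"}

# Each stage stays within what UG allows, although no stage need match
# the target language of the community.
STAGES_OK = [{"r1"}, {"r1", "r2"}, {"r1", "r2", "r3"}]

# The second stage contains a representation that UG excludes.
STAGES_BAD = [{"r1"}, {"r1", "x9"}]
```

Here `STAGES_OK` passes the check and `STAGES_BAD` fails it. The check says nothing about which stage is closer to the adult I-language; it only excludes stages that fall outside the space UG defines.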
The hypothesis in (31) means that if a child grows up in an English-speaking community, at intermediate stages in the linguistic development the child may have an I-language that generates structures that are ungrammatical in English, but not structures that are ungrammatical in any human language. The conceptual objections to the Strong Continuity Hypothesis (SCH) are of a different type. Wexler argues along the lines illustrated in (32) that the SCH goes against the spirit of Chomskyan linguistics as it developed in the MP. (32)
a. ‘biology is often looked at as the study of systems that develop, and growth/maturation is in fact what drives change in all living species.
b. Thus it would be extremely odd, a real contradiction to the basic tenets of biology, to find a species that showed no development, no growth, no maturation. […]
c. the basic tenet of linguistic theory (generative grammar) is that language is a central part of human biology. […]
d. Thus, if we accept the foundations of linguistic theory, the claim that language has all these biological properties, then language must be seen to grow, to mature. There is no other logically coherent way of analyzing language as part of human biology.’ [Wexler (1999: 69)]
In (32c) Wexler refers to the unification of linguistics and biology, as integrated in the latest version of the research programme of Chomskyan linguistics, represented in Figure 2.10. He uses this as a way to impose general biological constraints on the nature of language. In (32a) growth/maturation is said to be such a constraint. The wording in (32b) seems to be somewhat lacking in precision, because human language is not a species and the human species could satisfy (32b) by growing in other respects. The central position (32c) assigns to language is meant to counter this. From (32b) and (32c) the conclusion in
(32d) is drawn. One might object, however, to the use of ‘logically’ in (32d). No logically compelling argument is given in (32), only an intuitively based argument pertaining to the plausibility of the two alternatives. Wexler notes another conceptual problem with the SCH in (33).
(33)
a. ‘On the view that there is a genetic program that constructs UG, we have to ask at exactly which time point a full UG arises.
b. At birth? Why at birth? What is special about birth that makes UG arise then?
c. Before birth? When?
d. After birth? When? (And that is the growth/maturation view anyway).’ [Wexler (1999: 62)]
In (33), ‘UG’ is again used to refer to the language faculty. The SCH makes abstraction from the question of when the language faculty arises, but (33a) refuses to do so. Although birth is usually seen as the beginning of the life of an individual, there is no special reason, as (33b) states, to consider it the beginning of linguistic development. Human development can be seen as a process starting with a fertilised egg and leading to an adult human being. There is no obvious point on this line where a fully-fledged language faculty would naturally arise. However, (33d) relaxes the claim of growth/maturation in a way that makes it possible to reconcile it with the model in Figure 5.3. What is essential in the continuity hypothesis is that the language faculty is distinct from the entity that becomes the adult I-language and that the language faculty does not interfere with that entity after S0. A possibility left open by (33d) is that the language faculty develops before S0 comes into existence. What is obvious from the discussion on conceptual arguments is that maturation and continuity emphasise different aspects of language development. The necessity for UG to describe S0, as in (30), is essential for the proponents of continuity, whereas the necessity to account for the origin of the language faculty in the individual is essential for the proponents of maturation. This leads to mutual exaggeration of the differences, but not to different research programmes. 11 In principle, it is possible to bring empirical evidence to bear on the questions under discussion. The problem here is that there exists a large gap between the observation of child language data such as (18) and their interpretation as evidence for one or the other hypothesis on the exact role of the language faculty in language acquisition. Without a large set of theoretical assumptions this gap cannot be bridged. 
These assumptions can be made in different ways, which is what proponents of the two conflicting hypotheses do. This does not mean that maturation and continuity are just idle speculation with no connection
to language acquisition data. They stimulate the collection of data in particular ways and thereby guide research in directions that may lead to more evidence for one or the other.
SUMMARY
• The study of first language acquisition in Chomskyan linguistics is the study of parameter setting in the language acquisition process.
• The Subset Principle states that if a parameter has values for which the potential evidence can be represented as proper subsets, the value with the smallest set of potential evidence should be assumed by the child, unless a sufficient amount of contrary evidence is observed.
• The Subset Condition states that the Subset Principle holds for all parameters. It leads to the division of parameters with a relatively large range of consequences into more parameters, each with a more restricted range of consequences.
• Degree-0 Learnability is the hypothesis that all I-languages can be learned on the basis of sentences without embeddings.
• The Critical Period Hypothesis for first language acquisition states that if language acquisition has not taken place by the age of about twelve, it cannot be recuperated. It is supported by evidence from linguistically deprived children.
• The Strong Continuity Hypothesis and the Maturation Hypothesis are different hypotheses about the relation between the language faculty and the emerging I-language in the language acquisition process.
• According to the Strong Continuity Hypothesis, the language faculty remains constant throughout the language acquisition process. The emerging I-language is separate from the language faculty.
• According to the Maturation Hypothesis, the language faculty develops as part of the language acquisition process. The I-language can be thought of either as the language faculty itself with progressively specified parameters or as an entity separate from the language faculty.
5.3 Second language acquisition
In the context of Chomskyan linguistics, the term second language acquisition (SLA) has been adopted as the general designation for the acquisition of any language apart from the first language. In the context of language teaching, a distinction is sometimes made between second language and foreign language. If an English person learns French in evening classes in London, French is a foreign language, whereas if the same person moves to Paris and learns French while living there, French is a second language.12 The designation foreign is problematic, however, because of its association with nationality. If an English person learns Welsh, for instance, it is hard to see this as a foreign language. An objection sometimes raised against the term second language, e.g. by Bley-Vroman (1989: 43), is that it does not apply comfortably to further languages. To the English person who has learned French and then starts German, German is the third language, but it is technically referred to as a second language. Thus, when Selinker mentions ‘other second languages’ of a single person interfering with the SLA process (1992: 164), there is no contradiction. The reason for this generalisation is that the cognitive processes involved in second and further languages are largely similar. The main difference is that between first and second language acquisition. 13
The prototypical case of SLA involves an adult learner. In this sense, SLA is opposed to bilingual language acquisition. The prototypical case of bilingual language acquisition is a child exposed to two languages at the same time. This is a common phenomenon in bilingual regions such as Wales, where children of mixed Welsh-speaking and non-Welsh-speaking couples will hear Welsh and English in different situations. As the discussion by Meisel (2004) shows, it is difficult to draw an exact boundary between bilingual and second language acquisition.
De Houwer (1995: 223) uses exposure before the age of two as a criterion, others take a somewhat later age. Implicit in the assumption of SLA as a field of study is that there are interesting differences between first and second language acquisition. Section 5.3.1 explores these differences in general terms. Section 5.3.2 considers the analogy between L1 and L2 acquisition with respect to the opposition between a logical and a practical problem and Section 5.3.3 does the same for the critical period. In the context of Chomskyan linguistics, it is natural to assume that these differences relate to the way the language faculty is involved in the language acquisition process. The main positions in this respect are outlined and compared in Section 5.3.4.
5.3.1 The difference between first and second language acquisition
One of the reasons for the special role of child language acquisition in Chomskyan linguistics is its difference from adult second language acquisition. Chomsky expresses this in (34). (34)
‘It is a common observation that a young child of immigrant parents may learn a second language in the streets, from other children, with amazing rapidity, and that his speech may be completely fluent and correct to the last allophone, while the subtleties that become second nature to the child may elude his parents despite high motivation and continued practice.’ [Chomsky (1959: 42)]
First language acquisition plays a central role in the research programme of Chomskyan linguistics because it is necessary for the existence of language but cannot be explained on the basis of input available to the child. (34) highlights that the adult has what seems to be a better starting point than the child, but still experiences more difficulties and has a poorer overall achievement. The two defining differences between adult SLA and child first language acquisition are formulated by Flynn (1996) as in (35). (35)
a. ‘adults are more cognitively advanced than children and thus have a wider set of problem-solving skills […]
b. adults already know at least one language.’ [Flynn (1996: 127)]
At first sight, the two properties in (35) are advantages for the learner. This makes the observation in (34) all the more surprising. Bley-Vroman (1989: 43–49) discusses nine differences under the headings in (36) that highlight the surprising nature of this observation.
(36)
a. Lack of success
b. General failure
c. Variation in success, course, and strategy
d. Variation in goals
e. Fossilization
f. Indeterminate intuitions
g. Importance of instruction
h. Negative evidence
i. Role of affective factors
The different headings in (36) cover overlapping ideas. The final result of SLA is characterised by (36a-b), part of (36c), and (36e). It should be noted
how all of these characterisations presuppose a norm. Without a norm, success or failure cannot be measured. 14 As shown in Section 5.1, such a norm does not have a natural place in Chomskyan linguistics. The acquisition process is characterised by part of (36c), as well as (36g-i). Among these, (36g-h) depend directly on the difference in cognitive development in (35a), and the others may also fall in this category. (36d) and (36i) highlight the fact that in SLA the language does not grow automatically, as Chomsky states for children in (56) of Section 2.4.1. We will come back to (36f) below. A different list of ‘major differences between the child L1 and the adult L2 cases’ is given by Schachter in (37). (37)
a. ‘The most obvious difference is that few (and possibly no) adults reach the level of tacit or unconscious knowledge of the L2 (L2 competence) that would place them on a par with native speakers (NSs) of the L2 […]
b. The second striking difference is that adult L2 speakers typically produce a kind of variation not obviously attributable to the register variation one finds in the production of NSs. […]
c. Nor is the adult L2 learner equipotential for language acquisition in the way the child L1 learner is. […] An adult speaker of English will require considerably less time and effort to achieve a given level of ability in German than in Japanese […]
d. And finally, the adult learner’s prior knowledge of one language has a strong effect, detectable in the adult’s production of the L2, variously labeled transfer or crosslinguistic effects.’ [Schachter (1996: 160f.)]
It is interesting to see that of the four differences in (37) only (37a) corresponds directly to any of the differences listed by Bley-Vroman (1989) in (36). The difference in (37b) is reminiscent of (36f), but whereas (36f) is based on competence, (37b) only refers to performance. The two differences in (37c-d) are based on the extra knowledge of an adult learner, as formulated in (35b). A child cannot use prior knowledge of a language acquired earlier in L1 acquisition, because by definition there is no such language in L1 acquisition. Therefore all languages are equally difficult and no transfer is possible. In the same way as (36a-b), the formulation of (37a) is hard to reconcile with Chomskyan linguistics. Although it mentions ‘L2 competence’, it describes this notion as ‘knowledge of the L2’ in a way that presupposes that the L2 exists as an E-language, independently of the knowledge. In (37a) we could interpret L2 competence as the I-language embodied in the native speakers of L2. However, while I-languages are real entities, a comparison of one or more of them with the learner’s competence yields a degree of similarity, not a ‘level of tacit or unconscious knowledge’.
Meanwhile, developments in Applied Linguistics had taken a different turn as evidenced by (38). 15
(38)
a. ‘his studies take a contrastive perspective of units applying at one and the same time to three linguistic systems – NL, TL and IL – whereas recall that classical CA confined itself to two – NL and TL.’ [Selinker (1992: 183)]
b. ‘Transfer takes place between two mental structures: NL and developing IL.’ [Selinker (1992: 168), originally all italics]
Selinker (1992) gives a historical overview of the field of Applied Linguistics, focusing on the concept of interlanguage, abbreviated IL in (38). Selinker is himself generally credited with coining this term and in (38a), ‘his studies’ refers to Selinker’s PhD research, completed in 1966, and subsequent publications. He contrasts his own approach with contrastive analysis (CA), an approach to Applied Linguistics that tried to identify problems by looking at the differences between the native language (NL) of the learner and the target language (TL). In the list of which (38b) is item 27, Selinker summarises the contribution to the field of Applied Linguistics by Pit Corder. We should compare the interlanguage approach in (38) to Schachter’s (37d). Whereas (38) emphasises the mental nature of interlanguage, which can therefore be seen as an I-language, (37d) only mentions ‘production of the L2’, i.e. performance, and compares this to an E-language norm. Schachter does not explicitly deny that there should be an underlying I-language, but she formulates (37b) and (37d) in terms of performance only. In this respect, Bley-Vroman’s reference to ‘intuitions’ in (36f) suggests a more promising approach. In the discussion of (36f), however, Bley-Vroman argues as in (39). (39)
a. ‘the knowledge underlying native-speaker performance may be incomplete (in the technical sense) and thus may be a different sort of formal object from the systems thought to underlie native speaker performance. […]
b. A system of rules generating all and only the sentences of a language may even be absent.’ [Bley-Vroman (1989: 47)]
The suggestion in (39) is that interlanguage is not an I-language. According to (39a), if interlanguage is the knowledge underlying performance, it is suggested that it is a different type of entity compared to I-language. In (39b) this suggestion is elaborated. However, ‘A system of rules generating all and only the sentences of a language’ is not the same as ‘what the speaker of a language knows implicitly’, which is what competence is according to Chomsky (1966a) in (4) of Section 2.1.1. What (39b) describes is a grammar of an E-language,
a type of object for which there is no reason to assume that any native or nonnative speaker has a mental equivalent. The fact that a certain type of knowledge leads to indeterminate judgements is not problematic. In fact, it is a well-known phenomenon for native speaker judgements of marginal constructions (cf. the discussion of parasitic gaps in Section 2.2.1). The indeterminacy of judgements is a matter of degree. Sorace (1996: 384–390) discusses some possible reasons for the difference between L1 and L2 competence in this respect. 16 In conclusion, the discussion of the differences between SLA and L1 acquisition suggests that the minimal statement in (35) may be sufficient. Additional differences suggested by Bley-Vroman in (36) and Schachter in (37) can in most cases be reduced to those in (35). Where differences in formulation are significant, they show an emphasis on performance or a misconception of the nature of competence. The idea that SLA is marked, in addition to (35), by a ‘lack of success’ depends on the comparison of an interlanguage to a norm. In Chomskyan linguistics, such a norm is taken to be a social rather than a linguistic phenomenon.
5.3.2 The logical problem of second language acquisition
In Section 2.4.1, we encountered the ‘logical problem of language acquisition’, a phrase introduced by Hornstein and Lightfoot (1981). The study of this version of the problem rather than the practical one was motivated by the need to concentrate on what in the research programme of Chomskyan linguistics constitutes the most important aspect of language acquisition, i.e. the fact that it takes place. The resulting competence cannot be explained on the basis of the input data alone. It must involve properties of S0 as well. Therefore, a genetically determined language faculty is necessary. The research problem is then to discover the properties of this language faculty on the basis of the variety of SSs that can be attained. When we transfer this approach to SLA, the first question is whether there is a logical problem of SLA parallel to that of L1 acquisition. As can be expected from the discussion of (36), Bley-Vroman (1989) is sceptical and emphasises the fact that L2 acquisition does not succeed. His approach to the logical problem of SLA is summarised in (40). (40)
a. ‘Let us tentatively assume, therefore, that the same language acquisition system which guides children is not available to adults. […]
b. The logical problem of foreign language acquisition then becomes that of explaining the quite high level of competence that is clearly possible in some cases, while permitting the wide range of variation that is also observed. […]
c. My specific proposal here is that the function of the innate domain-specific acquisition system is filled in adults (though indirectly and imperfectly) by this native language knowledge and by a general abstract problem-solving system. I shall call this proposal the Fundamental Difference Hypothesis.*’ [Bley-Vroman (1989: 49f.), footnote * deleted]
In (40a), ‘therefore’ refers to Bley-Vroman’s discussion of (36). The problem is then, as stated in (40b), to explain the unexpected (though partial) success. This problem is remarkably parallel to the logical problem of L1 acquisition. However, (40c) takes an opposite approach to the one expected in Chomskyan linguistics. It would be highly surprising if general problem-solving systems were able to solve the puzzle of (40b). Gregg (1996: 52–66) states the problem from a slightly different perspective, as indicated in (41). (41)
a. ‘it can be argued that truly nativelike competence in an L2 is never attained.
b. But even if that turns out to be the case, it does not follow that there is no logical problem;
c. if the learner’s L2 grammar – what is known as his interlanguage (IL) grammar – is underdetermined by the input data, the problem exists however “imperfect” the acquired grammar may be.’ [Gregg (1996: 52f.)]
While (41b) is not directly in opposition to (40b), it presupposes a different interpretation of logical problem. For Gregg, the existence of a logical problem implies that the input data cannot explain the resulting competence. Even if, as (41a) concedes, SLA may generally fail, the approach in (40c) is not sufficient. As in L1 acquisition, the competence ‘is underdetermined by the data’, as stated in (41c). This means that no general problem-solving device can figure out the system, because the input is not sufficient. Given the solution chosen for first language acquisition, it is attractive to attribute also the additional competence acquired in SLA to the language faculty. Against this background, White (2003: 22–56) discusses the logical problem of SLA in detail. From the perspective of Chomskyan linguistics, it is expected that the interlanguage competence cannot be explained completely on the basis of L1 competence and L2 input. To the extent that this expectation is correct, a further component has to contribute information and the language faculty involved in L1 acquisition is a good candidate. White (2003: 23–39) summarises various studies with different phenomena and language pairs,
arguing that the results are compatible with hypotheses that assume a different use of principles and parameters in UG. As an example of a different parameter setting White (2003: 23–26) discusses the use of overt pronouns in Spanish as L2 for L1 English speakers. Spanish is a pro-drop language like Italian so that the relevant (set of) parameter(s) has a different value than in English. Such experiments indicate that parameter settings not used in the L1 are available to the learner in L2 acquisition. As an example of the effect of a principle, White (2003: 35–39) discusses the effects of the Empty Category Principle (ECP) in Japanese as L2 for L1 English speakers. The ECP accounts for a range of subject-object asymmetries. In Japanese it governs the possibility of dropping certain overt case markers. As English does not have such overt case markers, no transfer from L1 is possible. This indicates that principles such as ECP can be applied to L2 elements of a type not available in L1. Therefore, principles and parameters in UG, used to describe the language faculty, can also be used to account for the way interlanguage comes into existence. She also considers alternative explanations of these results and experiments that seem to yield counterarguments. The alternative explanations have to argue that general-purpose problem solving can use the L1 competence and the L2 input to arrive at the interlanguage competence mimicking the use of principles and different parameter settings. Whereas for the principles it can be argued that L1 competence contains them, this is much harder to maintain for parameter settings that are different from the ones in L1. A counterargument to UG-based accounts of interlanguage competence is the existence of what are sometimes called wild grammars for an interlanguage. A wild grammar is a grammar not allowed by UG. After discussing some of the claims of a wild grammar (2003: 42–54), White reaches the conclusion in (42). (42)
‘We have seen that analyses adopted by L2 learners may in fact be true of natural language, even if they happen not to be appropriate for the L1 or L2 of the learners in question.’ [White (2003: 56)]
White’s argument is that linguists who claimed to have found a wild grammar only considered the properties of the L1 of the learner and the L2 they were learning before concluding that the interlanguage grammar was wild. In fact, the evidence referred to in (42) offers strong support for the hypothesis that the language faculty as described in UG is involved. If the interlanguage selects an option from UG not represented in L1 or L2, the role of the language faculty is actually demonstrated most convincingly.
5.3.3 The critical period in second language acquisition
General observations of the type of Chomsky’s (34) quoted above suggest that there is not only a difference in success between child L1 acquisition and adult L2 acquisition, but also one between L2 acquisition by children and by adults. 17 An explanation of this difference has often been framed in terms of a critical period. The experiment described by Johnson and Newport (1989) was set up to substantiate the type of pre-theoretical observation in (34). Their subjects were 46 native speakers of Chinese and Korean who learned English as a second language when they moved to the United States. They were asked to give grammaticality judgements for a set of 276 sentences of English. These were compared with the judgements of a control group of six- and seven-year-old native speakers. The sentences represent a range of different syntactic properties of English. The results were calculated for four groups defined by age of arrival in the United States: 3–7, 8–10, 11–15, and 17–39. The subjects had been chosen so that they had not been exposed to English before they arrived in the United States. Johnson and Newport’s (1989: 77–81) conclusions can be summarised as in (43). (43)
a. In comparing ‘the age 3–7 group and the native group’ they found that ‘the two groups were entirely overlapping in performance’. (1989: 78)
b. In the group first exposed to English at age 3–15, there is a ‘strong linear relationship between age of exposure to the language and ultimate performance’. (1989: 78)
c. For the age group 17–39, ‘there are large individual variations in ultimate ability in the language’, but ‘later age of acquisition [i.e. 17–39 as opposed to 3–15] determines that one will not become native or near-native in a language’. (1989: 81)
The results of experiments like the one carried out by Johnson and Newport (1989) can be seen as evidence for a critical period in language acquisition which is also active in SLA. However, the idea of a critical period in L2 acquisition is of a quite different nature compared to the critical period in L1 acquisition. In the general sense of critical period as discussed by Colombo (1982), the critical system has a certain degree of plasticity during the critical period. This means that the neural connections in the brain can grow in different ways. Once the growth is complete, the critical period can end and the neural connections will be stable. In the case of language, L1 acquisition is the actual process to which the critical period applies. As far as L2 acquisition uses the
same system, it is in a sense parasitic on the remaining plasticity. Johnson and Newport (1989) consider the two hypotheses in (44). (44)
a. ‘The exercise hypothesis. Early in life, humans have a superior capacity for acquiring languages. If the capacity is not exercised during this time, it will disappear or decline with maturation. If the capacity is exercised, however, further language learning abilities will remain intact throughout life. […]
b. The maturational state hypothesis. Early in life, humans have a superior capacity for acquiring languages. This capacity disappears or declines with maturation.’ [Johnson and Newport (1989: 64)]
In (44a), the language faculty is represented as comparable to a muscle, which needs training to remain strong. In (44b), the critical period is of a more absolute type. In their conclusion they suggest that ‘our results are most naturally accommodated by some type of maturational account’ (1989: 97). Eubank and Gregg (1999) consider two positions as to the critical period in SLA in (45). (45)
a. ‘One could point out, as has often been done in such discussions, that normal adults did not miss the critical period for their L1 and that they do not respond to L2 input as Genie or Chelsea did to L1 input, and one could conclude that hence for normal adults, the question is misguided.
b. Alternatively, noting that with few exceptions adult learners fail, often miserably, to become indistinguishable from members of the ambient L2 speech community, one could argue that there must be a CP that affects L2 competence.’ [Eubank and Gregg (1999: 77)]
In (45a), Chelsea is another case of a woman who as a child missed the critical period for language acquisition. She was misdiagnosed as retarded and only when she was 31 was it discovered that she was deaf. Her L1 competence remained considerably below that of Genie.18 In (45b), ‘CP’ stands for critical period. Eubank and Gregg (1999) argue that both positions in (45) are too simplistic. (45a) is a way of disposing of the question of a critical period in SLA without answering it. (45b) fails to account for phenomena such as UG-constrained interlanguage grammars that have properties occurring neither in L1 nor in the native speaker variant of L2, cf. (42). The real question, according to them, is to what extent the stable state can be modified after it has been attained. This is a question which has attracted a lot of attention in neuroscience in recent years. Hensch (2005) gives an overview of research on critical periods. Most of this research takes as its example the critical period in the development of vision in cats and rodents. Hensch (2003) describes experiments with methods to control the onset and termination of critical periods and in particular the possibility of regaining plasticity after the end of the critical period. Two recent experiments showing that plasticity can be regained after the critical period are reported by He et al. (2006) and Hofer et al. (2006). 19
While these experiments explore the concept of critical period in general, modern methods of measuring brain activity also enable us to observe the effects of the critical period in language acquisition on adult SLA without relying exclusively on performance. Weber-Fox and Neville (1999) studied a group of Chinese-English bilinguals selected in a way similar to Johnson and Newport (1989), but instead of evaluating their grammaticality judgements, they analysed their brain activity by means of ERP (cf. the final part of Section 2.2.3). They found that the different age of first exposure to English has a stronger effect on the type of brain activity than on the success of processing. Similar results had been reported by Kim et al. (1997) on the basis of fMRI. In their study, individuals with two native languages were compared with individuals who had acquired a second language in early adulthood. Such results are exactly what Colombo (1982: 268–270) predicts for critical periods in general. Organisms are clever at using whatever possibilities they have at their disposal to make up for the lost opportunity of a missed critical period. This suggests that the critical period is more important for SLA than L2 performance indicates. Nevertheless, the language faculty cannot be dispensed with in the account of the observation in (42) on ‘wild grammars’.
When properties of the interlanguage correspond to a possibility predicted by UG but realised neither in L1 nor in the competence of the speakers producing the L2 input, we have to assume that the language faculty is not entirely unavailable after the critical period. This is in accordance with the Exercise Hypothesis of (44a), because the availability for SLA depends on previous L1 acquisition.
5.3.4 The initial state of second language acquisition
In SLA, there are three potential sources of information. The learner has L2 input, L1 competence and, depending on the extent to which it is available, the language faculty as described by UG. Discussions about the initial state of SLA turn on the mix of these components. The two main issues are the degree and type of availability of the language faculty and the way L1 competence is used. Gregg (1996) divides the positions as to the direct availability of the language faculty into the two classes in (46).
(46)
a. ‘UG is not involved, or not directly involved in L2 acquisition; […]
b. UG is a causal factor in L2 acquisition, just as, or more or less as, it is in L1 acquisition.’ [Gregg (1996: 60)]
In the now familiar way, Gregg uses ‘UG’ to refer to what UG describes. He draws a parallel to eighteenth century theological discussions, with (46a) as the ‘deist’ and (46b) as the ‘theist’ position. The most radical version of (46a) is Bley-Vroman’s (1989) Fundamental Difference Hypothesis in (40c). At the opposite extreme, Flynn (1996) presents the Full Access Hypothesis, of which (47) presents some of the key assumptions. (47)
a. ‘the L2 learner solves certain aspects of the L2 acquisition problem in a manner comparable to that of a child L1 learner.
b. That is, we would expect that the L2 acquisition process might be constrained by a set of language principles similar to those found in L1 acquisition.’ [Flynn (1996: 128)]
c. ‘UG is fully available to the L2 learner.’ [Flynn (1996: 129)]
Flynn introduces her proposal in a rather hedged way and (47a-b) are taken from this part. They indicate the substance of the proposal described in brief in (47c). Again, ‘UG’ refers to the language faculty. According to Flynn, principles and parameters of the language faculty are available in SLA in much the same way as in first language acquisition. The positions defended by Bley-Vroman and Flynn are both based on the observations discussed in the preceding sections, but they differ in the interpretation and relative weighting of these observations. A third position is the Incompleteness Hypothesis presented by Schachter (1996). Its central assumption is (48). (48)
‘UG as the initial state will not be available as a knowledge source for the adult acquisition of an L2. Only a language-specific instantiation of it will be.’ [Schachter (1996: 171)]
Flynn (1996: 130) calls (48) the Partial Access Hypothesis. It claims that the principles of UG that are applied non-vacuously in the L1 are available also in SLA, but not the principles that play no role in L1. Schachter’s main example is subjacency. This is a principle constraining long-distance movement as illustrated in (49). (49)
a. Who_i does [S Ulrike think [S’ that [S Volker loves t_i ]]]
b. *Who_i does [S Ulrike believe [NP the claim [S’ that [S Volker loves t_i ]]]]
In (49a) the wh-phrase who is extracted out of an S that is embedded in another S. In (49b) it is extracted out of an S that is embedded in an NP. Subjacency
says that only one S or NP boundary can be overcome in a single movement step. In sentences with a succession of embedded clauses, as in (21) above, movement proceeds in different steps because each S’ has an intermediate ‘landing site’. In (49), these landing sites are before that. NPs, however, do not have such landing sites. Therefore (49b) is ungrammatical. 20 The choice of so-called bounding nodes, S and NP in English, is language-specific. Thus, Rizzi (1980) argues that in Italian the bounding nodes are S’ and NP. In order to illustrate (48), Schachter (1996: 174–177) did a test involving grammaticality judgements about English sentences with and without subjacency violations, comparing learners whose L1 was Dutch, Indonesian, Chinese, and Korean. Dutch has the same bounding nodes as English, Indonesian and Chinese are more liberal on wh-movement, and Korean has no subjacency effects at all. 21 As expected, Dutch learners performed best. The essential result for Schachter, however, is that whereas Chinese and Indonesian learners had some definite problems with the English data, Korean learners performed randomly. She concludes that Korean learners failed to develop subjacency in their L1 acquisition and can therefore not learn it in SLA. An alternative explanation of the test results is proposed by White (1996: 95f.). She suggests that the interlanguage of the Korean learners is UG-constrained, but that their analysis of sentences such as (49) involves a different type of empty category in t. Flynn (1996) discusses the evidence provided by Schachter in depth and comes to the conclusion in (50). (50)
‘those principles and parameters of UG carefully investigated thus far indicate that those not instantiated or applying vacuously in the L1 but operative in the L2, are in fact acquirable by the L2 learner.’ [Flynn (1996: 151)]
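As a brief aside, the counting of bounding nodes behind the subjacency contrast in (49) can be made concrete in a small sketch. It is only an illustration under stated assumptions: the encoding of a movement path as a list of node labels, the function name `violates_subjacency` and the constant names are hypothetical and do not come from the literature discussed here. The sketch takes S and NP as the English bounding nodes and treats S’ as the only node offering an intermediate landing site.

```python
# Illustrative sketch only: a toy check of the subjacency contrast in (49).
# The path encoding and all names below are assumptions made for this example.

ENGLISH_BOUNDING_NODES = frozenset({"S", "NP"})  # bounding nodes assumed for English
LANDING_SITE = "S'"  # S' offers an intermediate landing site; NP does not


def violates_subjacency(path, bounding_nodes=ENGLISH_BOUNDING_NODES):
    """Return True if some movement step crosses more than one bounding node.

    `path` lists the labels of the nodes the moved phrase leaves,
    from the trace outwards to its final position.
    """
    crossed = 0  # bounding nodes crossed since the last landing site
    for label in path:
        if label == LANDING_SITE:
            crossed = 0  # the phrase stops here, so a new movement step begins
        elif label in bounding_nodes:
            crossed += 1
            if crossed > 1:
                return True
    return False


# (49a): who leaves the embedded S, lands at S', then leaves the matrix S.
path_49a = ["S", "S'", "S"]
# (49b): as (49a), but an NP ('the claim ...') intervenes before the matrix S.
path_49b = ["S", "S'", "NP", "S"]

print(violates_subjacency(path_49a))  # False
print(violates_subjacency(path_49b))  # True
```

The sketch deliberately covers only the English case in (49). It does not attempt to model languages in which S’ is itself a bounding node, nor the alternative analyses of the Korean learner data.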
We see then that different interpretations of the data for the Korean learners in Schachter’s experiment are possible. Schachter observes a clear distinction between on the one hand Chinese and Indonesian learners, who struggle to acquire English subjacency, and on the other Korean learners, who do not manage at all. White and Flynn do not see such a clear-cut distinction. There is no problem with the interpretation of the Dutch learner data in this experiment. Dutch learners can simply use the relevant aspects of their L1 competence and transfer them to L2. As Selinker’s (1992) historical overview shows, transfer, the use of L1 competence in L2, was a focus of study in pre-generative approaches to SLA. Attention was paid in particular to transfer as a source of learners’ errors. In the context of Chomskyan linguistics, L1 competence is considered first of all as a resource for the learner. The way this resource can be used depends on the relationship of the language faculty and L1 competence in the process of L1 acquisition. Three different models of this relationship were presented in Figure 5.3 and Figure 5.4 A and B.
The model of Figure 5.3 corresponds to the Strong Continuity Hypothesis. The language faculty is not itself affected by the process of L1 acquisition. This means that the principles and parameters are there, alongside the L1 competence in SS. This model is compatible both with the Fundamental Difference Hypothesis and with the Full Access Hypothesis. In the former case, the language faculty is assumed to have faded after the critical period so that only the SS of L1 is available. In the latter case, the language faculty produces a new entity X, identical to the one that developed into the SS of the L1. This situation is represented in Figure 5.5.
[Diagram not reproduced. Labels: Language Faculty; L1: S0, SS; L2: S0, SS]
Figure 5.5: The Full Access Hypothesis for L2 acquisition combined with the Strong Continuity Hypothesis for L1 acquisition
As shown in Figure 5.5, the Full Access Hypothesis assumes that the language faculty projects a new S0 as a starting point for L2 acquisition. The Fundamental Difference Hypothesis claims that this step is impossible after the critical period. The question of why SLA is dependent on the age of acquisition, in the way experiments such as Johnson and Newport’s (1989) show, is addressed in (51). (51)
‘the source of the difficulty does not appear to be a lack of access to UG; rather it seems to involve problems with integration of the language faculty with other domains of cognition and problems with learning that are beyond the scope of UG.’ [Flynn (1996: 152)]
Flynn suggests in (51) that it is not the part of language acquisition represented in Figure 5.5 which distinguishes L1 acquisition and L2 acquisition. Instead,
she assumes that the way the core language SS is linked up with other cognitive capacities determines this distinction. A well-known factor is, for instance, that it is more difficult for adults to memorise new vocabulary. Whereas both the Fundamental Difference Hypothesis and the Full Access Hypothesis are compatible with the Strong Continuity Hypothesis, it is hard to reconcile the Strong Continuity Hypothesis of L1 acquisition with the Incompleteness Hypothesis of L2 acquisition. Partial access to the language faculty requires a distinction between accessible and non-accessible parts dependent on their realisation in L1. There is no position in Figure 5.5 where such a distinction could be realised because the language faculty is not affected by the L1 acquisition process. The model of Figure 5.4 A represents the Maturation Hypothesis with parameter setting directly encoded into the language faculty. This model seems to be most easily compatible with Bley-Vroman’s (1989) proposal because it excludes the reuse of the language faculty after L1 acquisition. There is only one object at the end of the process. An obvious problem, noted for instance by White (2003: 60), is that this model is not compatible with the simultaneous acquisition of two I-languages. As shown by Bhatia and Ritchie (1999: 598–613), there is abundant evidence for the more or less immediate distinction of two language systems in the course of bilingual acquisition. This means that the language faculty can give rise to two systems with different parameter settings. Therefore, Figure 5.4 A is an unlikely model of language acquisition. The model of Figure 5.4 B is an alternative interpretation of the Maturation Hypothesis. The language faculty is separate from the I-language system and both develop simultaneously. At the end of the language acquisition process, there are two entities, both of which have grown and developed in the course of L1 acquisition. 
This is compatible with the Full Access Hypothesis in (47) because (47) does not claim that the processes of L1 and L2 acquisition are identical. Thus (47a) calls them ‘comparable’ and (47b) ‘similar’. Whereas (47c) states that the language faculty is ‘fully available’, no claim is made that it is used in exactly the same way. The Incompleteness Hypothesis is compatible with the model in Figure 5.4 B if the two entities develop in mutual dependence. Not only does the developing language faculty influence the emerging I-language, but also the I-language determines in part which components of the language faculty will be more or less developed. This means that at the end of the L1 acquisition process, the language faculty is still available, but it is no longer neutral. This situation is represented in Figure 5.6.
Figure 5.6: The Incompleteness Hypothesis in L2 acquisition combined with a Maturation Hypothesis for L1 acquisition
The language faculty in Figure 5.6 is skewed towards L1 in its maturation process so that the S0 projected at a later stage for L2 acquisition is less general than the S0 projected for L1 acquisition. This would explain Schachter’s observation in (37c) that some L2s are easier than others, depending on the L1. Evidence against this view can be provided by the analysis of ‘wild grammars’. White (2003: 42–54) argues that what seems to be a wild grammar is in fact an interlanguage grammar constrained by UG but diverging from both L1 and L2 grammars. Depending on the details of the analysis, this may imply that the parts of the language faculty that are neither triggered by the development of L1 nor by the L2 input data are also available to the learner. In the context of the models in Figure 5.5 and Figure 5.6, we can now consider transfer as the adoption of parts of the SS of L1 in the S0 of L2. An example of transfer can arise in word order as exemplified in (9–11) above. English has SVO word order, German SOV with V2, and Danish SVO with V2. These word order effects are most probably the result of the interaction of a number of parameters. The question is whether the S0 is different for, for instance, a child learning English as her L1 or an adult speaker of German learning English as his L2. A well-known type of error in the latter case is (52), reflecting German word order, instead of (9b). (52)
Every day sees Petra Quincy.
There are different ways to account for the production of (52). First, (52) may be a performance error. It is ungrammatical in the learner’s interlanguage, but occurs for instance because of stress factors in the speech production context. Second, (52) is grammatical in the interlanguage, because the learner has transferred their German parameter settings into the S0 for English. Third, the interlanguage has not specified what to do with sentences like these and the learner uses knowledge of the German L1 as a reasonable hypothesis to supplement the interlanguage. The main difference between the last two cases is that in the former the interlanguage parameters have been set as in German and in the latter they have not yet been set. White (2003: 61–96) discusses a number of proposals that assume different types and degrees of transfer from the SS of L1 to the S0 of L2. Her conclusion is (53). (53)
a. ‘It can be seen that there is considerable overlap in the initial-state proposals that we have considered. In consequence, there is also an overlap in their predictions, sometimes making it hard to find suitable evidence to distinguish between them. […]
b. all hypotheses assume that interlanguage grammars will be UG-constrained in the course of development.’ [White (2003: 95)]
The difficulty noted in (53a) of distinguishing between the predictions of different hypotheses stems from similarities such as (53b). If we compare the discussion in the four chapters of Ritchie and Bhatia (1996) by Gregg, White, Flynn, and Schachter with the presentation of the state of the field in White (2003), a remarkable development can be observed. In 1996, the main discussion seems to be whether and how the language faculty plays a role in SLA. The type of discussion is illustrated by the fact that Schachter’s (1996) experiment with subjacency data is analysed in detail by White (1996) and Flynn (1996), each of whom reinterprets the results in line with her own theory. White (2003) presents an overview of the discussion in which the focus has shifted to the amount and type of transfer. All of the approaches she presents assume Full Access to the language faculty. Thus, while the designation Full Access is sufficient to characterise in 1996 the difference of Flynn’s position from the ones of her adversaries, in 2003, White has to extend the designation of Flynn’s position to ‘Full Access without Transfer’ (2003: 89). The question is whether this is the result of White’s biased selection of theories or of a genuine development in the field. A further alternative may be that the term full access is not always used by everyone in exactly the same sense. In this respect (54) is interesting.
(54)
a. ‘But having argued for a CP in L1 acquisition (and given the problem of general failure to acquire an L2), we clearly cannot accept a “full access” position either.
b. Current UG theory, however, may give us a way to accept UG as operating in the acquisition and use of an L2, while explaining the limitations on that acquisition.’ [Eubank and Gregg (1999: 84f.)]
In (54a), Eubank and Gregg reject the Full Access Hypothesis because of the existence of a critical period (‘CP’) and the lack of success in SLA. Instead, in (54b), they propose a more limited role of the language faculty. If we take the Full Access Hypothesis in the sense of (53b), however, it only implies that every stage of the interlanguage is ‘UG-constrained’. This version of Full Access is fully compatible with (54b) and is not subject to the objections in (54a). Also in Flynn’s version of it in (47), the Full Access Hypothesis does not have to mean that there is no critical period and the result of SLA is equal to that of L1 acquisition. Therefore, a certain convergence can be observed in this respect.
SUMMARY
•
Adult second language acquisition (SLA) differs from first language acquisition because the learner is more cognitively advanced and already knows (at least) one language.
•
Competence in a second language is called interlanguage. In Chomskyan linguistics, interlanguages are considered as I-languages, not as deficient versions of a norm in the speech community.
•
The logical problem of SLA is that the interlanguage competence cannot be explained completely as a result of L2 input and L1 competence. It can be solved by assuming some role of the language faculty.
•
There have been several experiments showing the correlation between the age of (the start of) SLA and the degree to which the result corresponds to native speaker competence. They can be based on the evaluation of performance and on the analysis of brain processes.
•
A general trend is that until the end of the critical period, the degree of success in SLA is correlated with age. After that it is generally lower with greater individual differences.
Aspects of language development and use
• In the early 1990s, there were three main hypotheses concerning the degree of availability of the language faculty in SLA: the Fundamental Difference Hypothesis, the Full Access Hypothesis and the Incompleteness Hypothesis.
• According to the Fundamental Difference Hypothesis, SLA cannot use the language faculty. This hypothesis is hardly compatible with Chomskyan linguistics.
• According to the Full Access Hypothesis, the language faculty is available in SLA. Difficulties not found in first language acquisition have to be explained in other ways than by the unavailability of the language faculty.
• According to the Incompleteness Hypothesis, only the parts of the language faculty triggered in L1 acquisition are available in SLA. This hypothesis is not compatible with the Strong Continuity Hypothesis for first language acquisition.
• At present, Full Access to the language faculty seems to be the prevailing hypothesis. The discussion concentrates on how the access in SLA differs from the access in first language acquisition.
5.4 Language change
Throughout the nineteenth century, linguistics was dominated by the study of languages in their historical dimension. In a major overview work of this period, Hermann Paul expresses this very clearly in (55).
(55) ‘I have to justify briefly that I have chosen as a title Principles of Language History. It has been objected that there is another scientific perspective on language than the historical one.* I have to deny this.’ [Paul (1886: 19), footnote at * deleted] 22
As the subsequent discussion shows, the alternative suggested by Franz Misteli was that the comparison of dialects and languages is not necessarily historically oriented. Paul argues that such a comparison has to involve historical research: ‘Then the task of science is to go beyond a mere description of corresponding elements in the different languages or dialects, but also to reconstruct the unattested original forms and meanings from the attested forms as far as possible’ (1886: 20). 23 The only questions considered for scientific study were then the history of a single language or the comparison of two or more of them.
This perspective is not compatible with the central position of I-language in the study of language as assumed in Chomskyan linguistics. A language in traditional historical linguistics is an E-language in terms of Chomsky’s distinction discussed in Section 2.1.3. This does not mean that the general impression that language changes or the idea that languages are related to each other cannot be dealt with from a Chomskyan perspective. Lightfoot (1979) is the first systematic attempt to develop a Chomskyan diachronic linguistics. Lightfoot (1999) develops and updates this approach. More concise presentations are to be found in Lightfoot (1981) for the earlier version and Lightfoot (2006) for the latest version. This section will briefly outline Lightfoot’s assumptions (Section 5.4.1) and illustrate the type of analysis resulting from his approach (Section 5.4.2).
5.4.1 A history of I-languages
In Chomskyan linguistics, the perceived change of E-languages can only be explained as a change in I-languages observed under certain perspectives of generalisation. As opposed to the situation in the study of first and second language acquisition, in the study of language change there is widespread agreement on the generalisations to be made. Thus although Roberts and Roussou conclude that ‘the approach sketched here is distinct from that developed by Lightfoot’ (1999: 1036), they nevertheless follow ‘a standard paradigm for work on language change in generative grammar starting with Lightfoot (1979)’ (1999: 1020). Also Fischer et al. (2000) adopt the same framework. 24 Therefore, the central assumptions will be presented here in the manner of a textbook rather than as a discussion of quotations. These assumptions can be formulated as in (56).
(56)
a. An I-language represents a stable state. It does not change in the individual’s lifetime.
b. The way an I-language is used is subject to change, depending on context, situation, fashion, etc.
c. The Primary Linguistic Data (PLD) used in language acquisition are performance data.
d. Language change occurs when the PLD cause parameters to be set differently from the I-languages that produced the PLD.
As stated in (56a), an SS is indeed a stable state. I-language change can only mean change across generations, not change in an existing I-language. What can change in an individual’s knowledge is the periphery, e.g. by the addition of new vocabulary items, but not the core. What can also change, as (56b) states, is the way I-language is used. A construction may become rare in a speaker’s
performance, for instance, because it belongs to a style that is not appropriate to the speaker’s usual situation. This does not mean that it becomes ungrammatical, but only that it is less often observable. The Primary Linguistic Data (PLD) mentioned in (56c) are used by a child to set parameters in language acquisition. They consist of observable performance data. As such, they are only indirectly related to the I-languages of the speakers in the child’s environment. Parts of the I-language that are not used cannot be observed. In this way, as (56d) suggests, it is possible that the PLD do not contain the information necessary to produce the same I-language as the adult speakers in the child’s environment. The transfer of I-languages between generations is represented in Figure 5.7.
[Figure 5.7: Parameter setting as a source of change. Labels recoverable from the diagram: Si, Si+1, Primary Linguistic Data, and parameter values a, w, k, g, v.]
On the left-hand side in Figure 5.7, the same parameter setting step is represented as in Figure 5.1. In performing this process, the child’s growing brain reaches out for relevant cues in the PLD. The PLD are produced by the speakers in the environment and, for the sake of simplicity, Figure 5.7 represents three adult speakers whose parameter settings are identical. Thus, Figure 5.7 represents a stage in language acquisition in a homogeneous speech community. Even in a homogeneous speech community, in which all relevant speakers have the same parameter settings in their I-language, there is no guarantee that the child will end up with the same parameter settings. If the adult settings are taken as the norm, successful acquisition depends on whether the child is able to find the right cues in the PLD. Given the poverty of the stimulus argument, the threshold cannot be very high. For each child, the PLD will be different, but this will only affect the parameter settings in relatively few cases. Nevertheless, there must be a threshold and the amount of relevant cues has to be higher than zero, because the very idea of a parameter is that PLD are used.
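The threshold idea in this paragraph can be illustrated with a small simulation. The sketch below is not Lightfoot's model. It assumes, purely for illustration, that a child sets a parameter to value g when the share of cue sentences in its PLD reaches a fixed threshold, and to an alternative value h otherwise. The function names, the threshold of 0.25, and the sample sizes are all hypothetical choices and are not taken from the literature.

```python
import random


def set_parameter(pld, threshold=0.25):
    """Return 'g' if the share of cue sentences in the PLD reaches the
    threshold, and 'h' otherwise. `pld` is a list of booleans in which
    True marks a sentence that expresses the cue. The threshold value
    is an assumption made for this illustration."""
    if not pld:
        return "h"  # without data, no cue can be found
    cue_share = sum(pld) / len(pld)
    return "g" if cue_share >= threshold else "h"


def sample_pld(cue_rate, n_sentences, rng):
    """Draw one child's PLD from adults who produce cue sentences at
    `cue_rate`. Every child receives a different random sample."""
    return [rng.random() < cue_rate for _ in range(n_sentences)]


def share_acquiring_g(cue_rate, n_children=1000, n_sentences=500,
                      threshold=0.25, seed=0):
    """Fraction of simulated children who end up with setting 'g'."""
    rng = random.Random(seed)
    outcomes = [
        set_parameter(sample_pld(cue_rate, n_sentences, rng), threshold)
        for _ in range(n_children)
    ]
    return outcomes.count("g") / n_children


if __name__ == "__main__":
    # Compare communities whose adults produce the cue at different rates.
    for rate in (0.10, 0.20, 0.25, 0.30, 0.40):
        share = share_acquiring_g(rate)
        print(f"cue rate {rate:.2f}: share of children with g = {share:.3f}")
```

Under these assumed settings, cue rates well above or well below the threshold should give nearly uniform outcomes, so that the differences between individual children's PLD matter only when the cue rate lies close to the threshold. Each individual outcome is nevertheless categorical: a simulated child ends up with g or with h, never with a mixture of the two.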
5.4.2 An example: change of word order
A well-known and well-documented example of linguistic change is the change in basic word order between Middle English and Early Modern English. This change concerns the loss of verb-second. The contrast in (9), (10) and (11), repeated here for convenience, illustrates this phenomenon.
(9)
a. Petra sees Quincy.
b. Every day Petra sees Quincy.
c. It is good that Petra sees Quincy.
(10)
a. Petra sieht den Quincy.
b. Jeden Tag sieht Petra den Quincy.
c. Es ist gut, dass Petra den Quincy sieht.
(11)
a. Petra ser Quincy.
b. Hver dag ser Petra Quincy.
c. Det er godt at Petra ser Quincy.
Modern English has SVO word order as illustrated in (9), whereas German has SOV as its basic word order but in main clauses the inflected verb moves to second position. Danish in (11) combines SVO with verb-second. In Old English, we find OV word order. However, as Fischer et al. (2000: 49–52) describe, word order is not as rigidly SOV with verb second as in present-day German. Fischer et al. (2000: 138–179) discuss the transition to a system of VO basic word order with verb second in Middle English. In the transition from Middle English to Early Modern English, verb second was gradually lost. This process is described by Fischer et al. (2000: 104–137). Lightfoot (1999) also uses this development as one of the main examples of his approach, because it is a relatively well-documented instance of a change that must involve different parameter settings. Let us now consider which data the child needs to set the parameters correctly for these languages. Of course, (9a), (10a) and (11a) cannot be used to determine parameter values, because they are the same in all cases. Lightfoot (1999) assumes furthermore that the PLD used by children in the context of Figure 5.7 are restricted to main clauses. This is Lightfoot’s (1989) Degree-0 learnability hypothesis mentioned in Section 5.2.2. This means that (9c), (10c) and (11c) cannot be used to set the parameters governing word order. As this is the only relevant difference between (10) and (11), the child cannot learn the difference between German and Danish from these data. However, we also have evidence such as (57).
(57)
a. Petra has seen Quincy. (English)
b. Petra hat den Quincy gesehen. (German)
c. Petra har set Quincy. (Danish)
In verb-second, only the inflected verb is moved. In periphrastic constructions such as the present perfect, the participle remains in the original position. Therefore, the object follows the participle in English and Danish, but precedes it in German. Data such as (57b-c) are the essential cues for SVO vs. SOV word order. The cue used to determine the value of the parameter governing verb-second is the type of sentence illustrated in (9b), (10b) and (11b). If a non-subject appears as the first constituent, the position of the subject shows whether verb-second is operative or not. Lightfoot (1999: 151–158) elaborates the verb-second contrast with Dutch instead of German. On the basis of Dutch corpus data, he estimates that approximately 30% of Dutch main clauses have a non-subject in first position (1999: 153f.). Apparently, this figure is high enough to make Dutch PLD a robust source of verb-second grammars. In Old and Middle English, the situation is different. If we combine all the material we have from this period, the conclusion seems to be that verb-second is optional and disappears in most contexts in Early Modern English. However, such an approach is not relevant to the discovery of I-languages. In Chomskyan linguistics, language is not seen as a corpus of performance in which texts are classified as belonging to a particular point in time. Instead, texts are classified in principle as the result of a particular I-language, of course with the familiar factors influencing the mapping from competence to performance (cf. Section 2.1.1). On this basis, Lightfoot argues that in the relevant period there were actually two systems in competition. In one system, verb-second worked more or less as in Danish, illustrated in (11). This variant was especially widespread in areas with strong Scandinavian influence. The other variant, more common in the South, had Modern English word order as in (9). 
Lightfoot (1999: 156) proposes on the basis of a thirteenth century manuscript that the percentage of relevant cue sentences supporting verb-second would have been around 17%. This was apparently below the threshold, because verb-second disappeared. A number of more general observations can be made about this model. First, changes are often classified as gradual or catastrophic. The general perception of a change is the gradual one represented in Figure 5.8.
[Figure 5.8: Gradual change. Vertical axis: frequency from 0% to 100%; horizontal axis: time from t0 to tn.]
In Figure 5.8, the frequency of a particular phenomenon increases from 0% at time t0 to 100% at time tn. In Lightfoot’s model, the impression of gradual change can only be seen as the result of a number of factors interacting with catastrophic change. The gradual change in Figure 5.8 corresponds to usage in a population. The explanation of this usage is formulated in terms of parameter settings of individual people’s I-languages. Changes in parameter settings are necessarily catastrophic in nature. In Figure 5.7, the child does not distinguish between 30% and 35% of non-subject initial verb-second sentences in the PLD. Unless the threshold is in this range, no distinction in competence will result. Whether the parameters are set to allow verb-second or not, however, is not a gradual matter. The child acquires either value g from the PLD, as in Figure 5.7, or another value, say h, but not an in-between value, say 70% g and 30% h. Second, it is crucial for Lightfoot’s account that verb-second cannot be optional. Apparent optionality must be explained as the co-existence of two varieties. Lightfoot (1999: 155) proposes that there was a large degree of diglossia, a situation where two varieties of language are used in one speech community and even by the same speaker in different situations. A classical example of widespread diglossia is the situation in German-speaking Switzerland, cf. Barbour and Stevenson (1990: 212–217). Most children in German-speaking Switzerland grow up learning the local dialect of their area and learn the standard variety of German at primary school. The phenomenon of diglossia contributes to the perception of a change as in Figure 5.8. In late medieval England, diglossia certainly assumed a different form from present-day Switzerland, but one can imagine that the population in some areas used one variety only and in other areas only the other variety.
People who were in contact with both speech communities, either because they lived in between both zones or because they travelled between them, would know both varieties and use the one most appropriate to the situation. A child growing up in a situation of diglossia would have PLD produced by speakers with g and h values of the relevant parameter or by speakers who
have two I-languages, one with g and one with h. In principle, the child in this situation can develop two I-languages or a single I-language with either g or h setting. Lightfoot (1999: 255–257) argues that there are several reasons why in this concrete situation only the non-verb-second variety would survive. Apart from the low level of cues supporting verb-second, as mentioned above, there are also interactions with other properties of the grammar that favour the word order in (9). A third observation is that, if parameter resetting is indeed catastrophic change and if diglossia accounts for the apparent optionality, the gradual geographic spread of the non-verb-second variety of English is what can be expected. If conditions of diglossia are such that in borderline areas children will tend to acquire a non-verb-second I-language, the area where the verb-second variety is used will contract. This coincides with what can actually be observed in the history of English. It also explains the form of the curve in Figure 5.8. It should be noted that this development is not what is expected if language change is only approached as a social or cultural phenomenon. While language contact may be invoked to explain that more expressive possibilities become available, as in the case of loanwords, there is no way the catastrophic disappearance of expressive possibilities (e.g. the possibility of verb-second) can be explained on this basis. By contrast, on Lightfoot’s assumptions in (56) such events can be explained in terms of language acquisition and parameter setting.
SUMMARY
• In Chomskyan linguistics, an I-language does not change after an SS has been reached, but use of the I-language (performance) can still change.
• The problem of language change is considered as the problem of how a child’s I-language can be different from the I-language of the people who produce the primary linguistic data for the child.
• The solution to the problem of language change is that the performance data that constitute the primary linguistic data for the child do not provide enough crucial evidence to set parameters in the same way as in the I-languages of the people who produced these data.
• Variability in use can be explained in terms of diglossia, not in terms of a single, variable I-language.
• Gradual change from A to B can be the result of language use or of an increase in the proportion of people in the population whose I-language has B. It cannot be the result of the change of individual I-languages.
5.5 Language and communication
Throughout this chapter the approach has been to consider questions that do not immediately bear on the central focus of Chomskyan linguistics but have a clear linguistic component. In the case of first and second language acquisition, we found that there is a ‘logical problem’ in the sense that there is a gap between the state of knowledge attained and what can be explained on the basis of the input (in first language acquisition) or the L1 competence and the L2 input (in SLA). This gap can be bridged by appealing to the language faculty. Roberts and Roussou also identify a ‘logical problem of language change’ (2003: 12). This problem pertains to the possibility of setting a parameter to a different value than the one the input data are based on. The solution, as sketched in Section 5.4, is to assume that the input data are not rich enough in cues to obtain the same parameter setting. In each case, then, the existence of the language faculty and the process of setting parameters provided by the language faculty are a crucial component of the solution. When we now turn to the way language is used in communication, we have to deal with a problem of a completely different nature. The existence of an I-language is presupposed when we study how it is used in communication. No more acquisition is involved and no direct reference needs to be made to the language faculty. The question in this case is how the I-language interacts with other components of the mind involved in communication. A first element of the answer to this question arises from the distinction made between grammatical competence and pragmatic competence, as discussed in Section 2.1.2. Whereas Hymes (1971) proposes to embed grammatical competence in what he calls ‘communicative competence’ in (12) of that section, Chomsky (1980a) proposes in (13) to treat grammatical and pragmatic competence as two interacting modules of the mind. 
This distinction serves as a basis for the analysis of communication data so that grammatical competence can be studied independently of other sources of these data. Another component of the answer was introduced in the discussion of language use in Section 2.4.1. In (58) of that section, Chomsky (1986a) identifies the use of language with the expression of thoughts and the understanding of specimens of language, whereas communication is called a ‘derivative’ and ‘special’ use of language. Extending the meaning of communication to
include any use of language results in a highly artificial concept, as argued by Chomsky (1975a: 55ff.). This distinction serves as a basis for the study of competence and communication as autonomous phenomena in the sense that neither depends on the other. Given these two components of the answer to the question of how I-language is involved in communication, we should not be too surprised at the assessment by Smith (1999) in (58).
(58)
a. ‘That language makes an important contribution to communication has never been denied, least of all by Chomsky,
b. but little specific about the structure of language follows from this. […]
c. Chomsky has had little to say about the role of pragmatics or about theories of communication more generally. This is unsurprising as they are tangential to his ideas about language.’ [Smith (1999: 152)]
Kasher calls the idea in (58c) a ‘negative heuristic [which] tells us to shun according communication linguistic prominence’ (1991: 128). The motivation is not the denial of (58a), which has sometimes been ascribed to Chomsky, but (58b). As a consequence of (58b), Chomskyan linguistics is compatible with a range of different theories of communication. The only constraint imposed on such a theory is that it accepts an autonomous I-language, as formulated by Fukui and Zushi (2004) in (59).
(59) ‘The use of language is a result of the interactions of the language faculty with other cognitive systems that govern thinking, intention, articulation, sense and so on.’ [Fukui and Zushi (2004: 8f.)]
An example of such a theory is Relevance Theory as proposed by Sperber and Wilson (1995) and Wilson and Sperber (2004). They characterise its central idea in (60).
(60) ‘Relevance theory may be seen as an attempt to work out in detail one of Grice’s central claims: that an essential feature of most human communication is the expression and recognition of intentions.’ [Wilson and Sperber (2004: 607)]
Since the first edition of their book appeared in 1986, the analysis and elaboration of the idea in (60) have triggered a large body of research as well as criticism. An example of the latter is Levinson’s (1989) review. In the 1995 edition, Sperber and Wilson address this criticism in a 25-page postface and additional footnotes to their original text. One of the central ideas of Relevance Theory is that the link between language and communication is not a necessary one for the nature of either language or communication. They express this in (61).
(61)
a. ‘as long as there is some way of recognising the communicator’s intentions, then communication is possible.’ [Sperber and Wilson (1995: 25)]
b. ‘Languages are essential not for communication but for information processing; this is their essential function.’ [Sperber and Wilson (1995: 172)]
In (61a), the existence of communication that is not mediated by a code is predicted. Sperber and Wilson illustrate this by a number of examples of ‘ostensive-inferential communication’ (1995: 50–54). What these examples have in common is that they do not depend on a shared language serving as a code for the message. Unencoded contextualised gestures can show both the intention to communicate and enough information to infer the relevant message. 25 This means that communication does not depend on language, even if language is taken in a very broad sense. Conversely, (61b) states that language need not be used for communication. What is essential is a system for the internal representation of information. Although Sperber and Wilson (1995: 174) call such a system ‘language’, they use a much wider sense of the term than I-language in the sense of Chomskyan linguistics. As long as the difference between linguistic code, which could be used as a gloss for ‘language’ in (61b), and I-language is recognised, there is no problem in combining results of Chomskyan linguistics and Relevance Theory. Having seen that Chomskyan linguistics is compatible with Relevance Theory, we may wonder what a theory that is not compatible with Chomskyan linguistics might look like. The lack of mutual impact of competence and pragmatics in Chomskyan linguistics means that any approach to pragmatics that does not deny an autonomous component of I-language is in principle compatible with Chomskyan linguistics. An example of an incompatible approach is the holistic approach to language of some proponents of generative semantics. As Newmeyer (1986a: 118–123) describes, it was especially George Lakoff who in the early 1970s proposed that the scope of grammar should be extended to include pragmatic phenomena that influence the distribution of grammatical constituents. In an interview about this period, Lakoff expresses this in terms of the commitment in (62).
(62) ‘the generalization/full range commitment, which is the commitment to seek maximal generalizations over the full range of linguistic data, both within and across all domains of language – syntax, semantics, pragmatics, discourse, phonetics, phonology, morphology and so on.’ [Lakoff (1995: 109)]
According to Lakoff (1995: 110), the commitment in (62) was shared by J. R. Ross and Paul Postal. It corresponds to Hymes’s (1971) proposal to replace grammatical competence by a more inclusive ‘communicative competence’ as a focus of inquiry. As discussed in Section 2.1.2, this approach is incompatible with Chomskyan linguistics. The contrast between (59) and (62) is significant. Whereas Chomskyan linguistics aims to account for the full range of data in terms of the interaction of various modules, one of which is the I-language, Lakoff proposes to achieve such an account by means of ‘maximal generalizations’ crossing the boundaries of any such modules.
SUMMARY
• Pragmatics is seen in Chomskyan linguistics as the study of how competence is used in communication.
• In principle, any theory of pragmatics that accepts an autonomous grammatical competence is compatible with Chomskyan linguistics. An example is Relevance Theory.
• Holistic approaches of the type involving Hymes’s communicative competence and Lakoff’s generalisation/full range commitment are not compatible with Chomskyan linguistics.
• The loose relationship between competence and communication means that pragmatics does not have a significant impact on the theory of competence.
5.6 Conclusion
The four areas of linguistics discussed in this chapter can be divided into two categories depending on the impact they have on work in Chomskyan linguistics. First and second language acquisition and language change have a strong impact, whereas linguistic communication has a weak impact. Strong impact areas are not only studied from within Chomskyan linguistics, but also contribute data to the central concern of the research programme, the description of the language faculty and the explanation of linguistic competence. Weak impact areas may be studied in a way compatible with the research programme of Chomskyan linguistics, but there is much less of a reason for Chomskyan linguists to get involved in them. The division between strong and weak impact areas coincides with the question whether the development of an I-language is essentially involved in
the process. In the case of first and second language acquisition, the connection to the development of an I-language is obvious. In the case of linguistic change, the model used analyses change as a side effect of such a development process. This is the reason why interesting data can be generated about the language faculty. As an example of a weak impact area, linguistic communication, understood as pragmatics, was discussed in Section 5.5. The language faculty is not directly involved in pragmatics, because no new I-language emerges. Instead, it is competence that plays a central role. As a consequence, the main consideration is autonomy. Any theory of pragmatics that is compatible with an autonomous grammatical competence can be used in combination with Chomskyan linguistics. A similar analysis applies to language processing. In other research programmes discussed in Chapter 4, less research has been carried out on the areas of strong impact for Chomskyan linguistics presented here. From the different decisions in their research programmes, however, we can predict to a certain extent how various areas of linguistics can be incorporated as strong or weak impact areas. In LFG, for instance, language processing is clearly a strong impact area. The isomorphism condition determines that processing experiments are essential in the choice of a theory. Language acquisition, by contrast, is a weak impact area. It can be accommodated but is unlikely to contribute essential data. The situation is somewhat different for GPSG and HPSG. The GPSG research programme treats linguistics as a formal science. Therefore, no acquisition or processing data can bring about a choice between different grammars. The choice is determined by compatibility with grammaticality judgements and simplicity and elegance of the grammar. In HPSG, accommodating processing as well as pragmatics has clearly been a concern. 
However, as a consequence of the unwillingness on the part of Pollard and Sag (1987, 1994) to commit themselves to a mentalist or non-mentalist position, it is not obvious in which way this can lead to an explanatory account. The HPSG formalism is highly flexible and can easily represent pragmatic and semantic information alongside syntax and phonology so that processing can be modelled. In the case of Jackendoff’s linguistics, the position of the language faculty in the research programme is very similar to that in Chomskyan linguistics. The internal organisation, however, is quite different. Whereas in Chomskyan linguistics the language faculty is modelled in terms of principles and parameters, in Jackendoff’s linguistics, the lexicon plays a decisive role. As much of the research in first and second language acquisition is based on parameters, it will be interesting to see to what extent corresponding results can be obtained in Jackendoff’s linguistics. Another difference is the position of language processing, which in Jackendoff’s linguistics is closer to the one it has in
LFG than in Chomskyan linguistics. Here the question will be to what extent processing results can be operationalised.
Notes 1
I used the map of Niebaum and Macha (1999: 193), which gives the situation of around 1900 as represented in the Deutsche Sprachatlas. They argue that no more recent, reliable source for the distinction of major dialect boundaries exists.
2
In his presentation of the Groningen dialect, Reker (2002:52-62) discusses word order phenomena and notes that whereas the traditional dialect has the same word order as in German, people in Groningen increasingly adopt standard Dutch word order.
3
In fact, Sweden exerted large pressure on the population to become Swedish in allegiance, whereas Denmark fomented popular revolts in the seventeenth century. Clearly that has more to do with political than with linguistic differences. As Sven-Göran Malmgren pointed out to me, the standard account of the transition process is Ohlsson (1978–1979).
4
According to Raven et al.’s textbook ‘Species are groups of organisms that are distinct from other, co-occurring species and that are interconnected geographically. The ability to exchange genes appears to be a hallmark of such species’ (2005 7: 473), emphasis added. According to Thain and Hickmann’s Dictionary of Biology, ‘Inability to find a unified species concept is no disgrace’ (2004 11: 660–661).
5
Hyams (1986: 26–62) gives a good overview of the contrastive data in English and Italian and the way they are explained by the pro-drop parameter. She introduces the name ‘AG/PRO parameter’ (1986: 32).
6
In the context of the Subset Principle, Wexler (1991: 262) also makes the argument that working with sets in the context of language acquisition input does not commit one to a theory based on E-language.
7
He refers to a paper by J. T. Shurley and K. Natani presented at the 1972 meeting of the American Psychological Association in Honolulu.
8
Jackendoff (1993: 82–98) gives a brief summary of the argument and its relevance for a mentalist view of language. Miles (1988) gives more details of the history, of the current situation, as well as of the internal organisation of British Sign Language (BSL). Similar introductions to other sign languages are Moody (1998) for Langue des Signes Française (LSF) and Boyes Braem (1990) for Swiss-German Sign Language.
9
It is customary to speak of phonology in the context of sign languages as referring to the shape of the sign. See Uyechi (1996) for a discussion of the phonology
of ASL, analysing it into features in a way parallel to the common analysis for acoustic phonology.
10
Taken literally, this makes (27b–c) claim that every child is a generative linguist. Of course this is not the intended meaning. Unfortunately, the conflation of these two senses of UG is very common in the literature, in particular in the context of language acquisition. Although it may seem pedantic to distinguish Language Faculty and Universal Grammar rigorously, it is necessary to avoid confusion of the type illustrated here.
11
A further argument against the Maturation Hypothesis in the version of Figure 5.4 A is discussed in Section 5.3.4, as it depends on the acquisition of more than one language.
12
The distinction is useful in the context of language teaching. Thus, Muncie (2002) discusses different teaching methods to be used in teaching English as a Foreign Language (EFL) as opposed to English as a Second Language (ESL).
13
There is a field of Third Language Acquisition, which focuses on the differences between L2 and L3 acquisition, e.g. Cenoz et al. (2001). Interestingly, the L3 Homepage (www.spz.tu-darmstadt.de/L3) indicates by the formula ‘L3 = L2 + n (n≥1)’ that the term third language does not exclude further languages. Thus, speakers of Dutch who learn English, French, and German at school have two third languages. Third Language Acquisition can be seen as a specialised subfield of SLA.
14
The term fossilization in (36e) refers to the lack of improvement after reaching a stage in the acquisition process short of success. Without a norm for success it is hard to distinguish fossilisation from reaching a SS as in first language acquisition.
15
I use this term here in the narrow sense of the study of language teaching. As Newmeyer (1983: 132) notes, this is common in Britain, whereas in the United States a much broader field is designated by the term, including also, for instance, language planning for endangered languages and the contrastive analysis of conversational styles of different groups of people.
16
White (2003: 54f.) also suggests that in group studies, the variability may at least in part be a consequence of individuals having internally consistent but systematically different judgements.
17
In statements such as these, success is used in a general intuitive, non-technical sense. It is comparable in this respect to the use of English in a pre-theoretical sense, as discussed in Section 5.1. In Eubank and Gregg’s (45b), below, the underlying idea is described in more detail.
18
This description is based on the brief descriptions by Curtiss (1988: 99f.) and Eubank and Gregg (1999: 74f.). The latter refer to an unpublished manuscript (1989) by Susan Curtiss.
19
I am grateful to Jeroen Smeets for pointing me to the references in this paragraph.
20
The ‘landing site’ is in the specifier of S’ (CP), whose head is the complementiser that. In (49b) movement to the position before that is unproblematic. However, there is no intermediate landing site between this position and the position at the start of the sentence where who is realised, so that NP and S have to be crossed in one step. For a more extensive discussion of subjacency, see Haegeman (1994: 371–429). She discusses an example comparable to (49) on p. 402–404.
21
Indonesian, Chinese, and Korean do not have overt wh-movement. Huang (1982) argues, however, that wh-elements in Chinese have to move in the same way as their English counterparts. The only difference is that the movement takes place between S-structure and LF instead of between D-structure and S-structure. He assumes that LF movement is subject to subjacency and uses subjacency effects in Chinese (1982: 380–386) to support his analysis.
22
Original: ‘Ich habe es noch kurz zu rechtfertigen, dass ich den titel Principien der sprachgeschichte gewählt habe. Es ist eingewendet, dass es noch eine andere wissenschaftliche betrachtung der sprache gäbe, als die geschichtliche. Ich muss das in abrede stellen.’ My translation. The footnote refers to Misteli’s (1882) review of the first edition in Zeitschrift für Völkerpsychologie.
23
Original: ‘Dann ist es aufgabe der wissenschaft, nicht bloss zu constatieren, was sich in den verschiedenen sprachen oder mundarten gegenseitig entspricht, sondern aus den überlieferten die nicht überlieferten grundformen und grundbedeutungen nach möglichkeit zu reconstruieren.’ My translation.
24
Discussions concerning assumptions such as the ones in (56) as the basis for historical linguistics take place between proponents of Chomskyan linguistics and proponents of work based on the comparative method that continues the tradition of nineteenth century historical linguistics. An overview of modern applications of the comparative method is Beekes (1995). An example of a discussion is that between Lightfoot (2002a, b) and Campbell and Harris (2002) on the status of reconstructions.
25
A brief example is ‘Peter asks Mary, ‘How are you feeling today?’ Mary responds by pulling a bottle of aspirins out of her bag and showing it to him.’ [Sperber and Wilson (1995: 25), originally the question is a numbered item]. Unencoded communication depends on many components of the context and although they are not complex or long-winded, the more convincing examples they give take typically about a page to describe, cf. Sperber and Wilson (1995: 51). Such examples show that language is not necessary for communication, but that reporting linguistic communication can be much more concise.
References
References are listed chronologically for each (first) author, so that the position of Anderson et al. (1996) does not depend on the names in ‘et al.’ and Chomsky and Miller (1963) follows Chomsky (1962b) rather than Chomsky (2004). Dutch names with particles are ordered under the main constituent, e.g. van Riemsdijk after Rieber.
Allerton, D. J., Skandera, Paul and Tschichold, Cornelia (eds) (2002) Perspectives on English as a World Language, Basel: Schwabe. Anderson, Stephen R. and Kiparsky, Paul (eds) (1973) A Festschrift for Morris Halle, New York: Holt, Rinehart and Winston. Anderson, Stephen R., Chung, Sandra, McCloskey, James and Newmeyer, Frederick J. (1996) ‘Chomsky’s 1962 Programme for Linguistics: A Retrospective’, repr. in Newmeyer (1996) p. 66–79. Orig. in Crochetière, André, Boulanger, Jean-Claude and Ouellon, Conrad (eds) Proceedings of the XVth International Congress of Linguists, Québec: Les Presses de L’Université de Laval, p. 367–382. Anderson, Stephen R. (2004) Doctor Dolittle’s Delusion: Animals and the Uniqueness of Human Language, New Haven: Yale University Press. Andronis, Mary, Ball, Christopher, Elston, Heidi and Neuvel, Sylvain (eds) (2002) CLS 37: The Panels. Papers from the 37th Meeting of the Chicago Linguistic Society. Vol. 2, Chicago: Chicago Linguistic Society. Antony, Louise M. and Hornstein, Norbert (eds) (2003) Chomsky and his Critics, Oxford: Blackwell. Aoun, Joseph and Lightfoot, David W. (1984) ‘Government and Contraction’, Linguistic Inquiry 15: 465–473. Atkinson, Martin (1992) Children’s Syntax: An Introduction to Principles and Parameters Theory, Oxford: Blackwell. Ayer, Alfred Jules (1946) Language, Truth and Logic, New York: Dover (First edition 1935, second edition with new introduction 1946, Dover reprint 1952).
Barbour, Stephen and Stevenson, Patrick (1990) Variation in German: A Critical Approach to German Sociolinguistics, Cambridge: Cambridge University Press. Barwise, Jon and Perry, John (1983) Situations and Attitudes, Cambridge (Mass.): MIT Press. Beekes, Robert S. P. (1995) Comparative Indo-European Linguistics: An Introduction, Amsterdam: Benjamins. Bensaude-Vincent, Bernadette and Stengers, Isabelle (1993) Histoire de la chimie, Paris: La Découverte. Bever, Thomas G., Lackner, J. R. and Kirk, R. (1969) ‘The Underlying Structures of Sentences are the Primary Units of Immediate Speech Processing’, Perception and Psychophysics 5: 225–234. Bhatia, Tej K. and Ritchie, William C. (1999) ‘The Bilingual Child: Some Issues and Perspectives’, in Ritchie and Bhatia (eds) p. 569–643. Bhatia, Tej K. and Ritchie, William C. (eds) (2004) The Handbook of Bilingualism, Oxford: Blackwell. Bickerton, Derek (1999) ‘Creole Languages, the Language Bioprogram Hypothesis, and Language Acquisition’, in Ritchie and Bhatia (eds) p. 195–220. Birdsong, David (ed.) (1999) Second Language Acquisition and the Critical Period Hypothesis, Mahwah NJ: Lawrence Erlbaum. Blakemore, Sarah-Jane and Frith, Uta (2005) The Learning Brain: Lessons for Education, Malden (Mass.): Blackwell. Bley-Vroman, Robert (1989) ‘What is the Logical Problem of Foreign Language Learning?’, in Gass and Schachter (eds) p. 41–68. Bloch, Bernard and Trager, George L. (1942) Outline of Linguistic Analysis, Baltimore: Linguistic Society of America. Bloch, Bernard (1948) ‘A Set of Postulates for Phonemic Analysis’, Language 24: 3–46. Bloomfield, Leonard (1926) ‘A Set of Postulates for the Science of Language’, Language 2: 153–164. Bloomfield, Leonard (1933) Language, London: Allen and Unwin. Bloomfield, Leonard (1946) ‘Twenty-One Years of the Linguistic Society’, Language 22: 1–3.
Borer, Hagit and Wexler, Ken (1992) ‘Bi-unique Relations and the Maturation of Grammatical Principles’, Natural Language and Linguistic Theory 10: 147–189. Botha, Rudolf P. (1968) The Function of the Lexicon in Transformational Generative Grammar, Den Haag: Mouton. Botha, Rudolf P. (1981) The Conduct of Linguistic Inquiry: A Systematic Introduction to the Methodology of Generative Grammar, Den Haag: Mouton. Botha, Rudolf P. (1989) Challenging Chomsky: The Generative Garden Game, Oxford: Blackwell.
Boyes Braem, Penny (1990) Eine Einführung in die Gebärdensprache und ihre Forschung, Hamburg: Signum, 1953³. Bresnan, Joan (1978) ‘A Realistic Transformational Grammar’, in Halle et al. (eds) p. 1–59. Bresnan, Joan and Kaplan, Ronald M. (1982) ‘Introduction: Grammars as Mental Representations of Language’, in Bresnan (ed.) p. xvi–lii. Bresnan, Joan (ed.) (1982) The Mental Representation of Grammatical Relations, Cambridge (Mass.): MIT Press. Bresnan, Joan (2001) Lexical-Functional Syntax, Oxford: Blackwell. Bright, William (1959) Review of Robins, R. H. (1958) The Yurok Language: Grammar, Texts, Lexicon, Berkeley: University of California Press, Language 35: 100–104. Broekhuis, Hans and Vogel, Ralf (eds) (2006) Optimality Theory and Minimalism: A Possible Convergence?, [Linguistik in Potsdam 25] Potsdam: Universität Potsdam, Institut für Linguistik. Brown, Keith (ed.) (2006) Encyclopedia of Language and Linguistics, 2nd Edition (14 vol.) Oxford: Elsevier. Buck, Roger C. and Cohen, Robert S. (eds) (1971) Boston Studies in the Philosophy of Science VIII, Dordrecht: Reidel. Cain, Joe (2000) ‘Woodger, Positivism, and the Evolutionary Synthesis’, Biology and Philosophy 15: 535–551. Campbell, Lyle and Harris, Alice C. (2002) ‘Syntactic Reconstruction and Demythologizing “Myths and the Prehistory of Grammars”’, Journal of Linguistics 38: 599–618. Cenoz, Jasone, Hufeisen, Britta and Jessner, Ulrike (eds) (2001) Cross-linguistic Influence in Third Language Acquisition: Psycholinguistic Perspectives, Clevedon: Multilingual Matters. Chao, Yuen-Ren (1934) ‘The Non-Uniqueness of Phonemic Solutions of Phonetic Systems’, Bulletin of the Institute of History and Philology 4: 363–397, repr. in Joos (ed.) (1957) p. 38–54. Chen, Xiang (1997) ‘Thomas Kuhn’s Latest Notion of Incommensurability’, Journal for General Philosophy of Science 28: 257–273. Chomsky, Noam (1953) ‘Systems of Syntactic Analysis’, The Journal of Symbolic Logic 18: 242–256.
Chomsky, Noam (1957) Syntactic Structures, Den Haag: Mouton. Chomsky, Noam (1959) Review of Skinner, B. F. (1957) Verbal Behavior, Englewood Cliffs NJ: Prentice-Hall, Language 35: 26–58. Chomsky, Noam (1961) ‘Some Methodological Remarks on Generative Grammar’, Word 17: 219–239. Chomsky, Noam (1962a) ‘Explanatory Models in Linguistics’, in Nagel et al. (eds) p. 528–550. Chomsky, Noam (1962b) ‘A Transformational Approach to Syntax’, in Hill (ed.) p. 124–186.
Chomsky, Noam and Miller, George A. (1963) ‘Introduction to the Formal Analysis of Natural Languages’, in Luce et al. (eds) Vol. 2, p. 269–321. Chomsky, Noam (1964) Current Issues in Linguistic Theory, Den Haag: Mouton. Chomsky, Noam (1965) Aspects of the Theory of Syntax, Cambridge (Mass.): MIT Press. Chomsky, Noam and Halle, Morris (1965) ‘Some Controversial Questions in Phonological Theory’, Journal of Linguistics 1: 97–138. Chomsky, Noam (1966a) ‘Topics in the Theory of Generative Grammar’, in Sebeok (ed.) Vol. III, p. 1–60. Chomsky, Noam (1966b) Cartesian Linguistics: A Chapter in the History of Rationalist Thought, New York: Harper and Row. Chomsky, Noam (1967) ‘Some General Properties of Phonological Rules’, Language 43: 102–128. Chomsky, Noam and Halle, Morris (1968) The Sound Pattern of English, New York: Harper and Row. Chomsky, Noam (1970) ‘Remarks on Nominalization’, in Jacobs and Rosenbaum (eds) p. 184–221. Chomsky, Noam (1972a) Language and Mind, Enlarged Edition, New York: Harcourt Brace Jovanovich. Chomsky, Noam (1972b) Studies in Semantics in Generative Grammar, Den Haag: Mouton. Chomsky, Noam (1973) ‘Conditions on Transformations’, in Anderson and Kiparsky (eds) p. 232–286. Chomsky, Noam (1975a) Reflections on Language, Fontana. Chomsky, Noam (1975b) The Logical Structure of Linguistic Theory, New York and London: Plenum. Chomsky, Noam (1976a) ‘Conditions on Rules of Grammar’, Linguistic Analysis 2: 303–351. Chomsky, Noam (1976b) ‘On the Biological Basis of Language Capacities’, in Rieber (ed.) p. 1–24, repr. in Chomsky (1980a), p. 185–216. Chomsky, Noam and Lasnik, Howard (1978) ‘A Note on Contraction’, Linguistic Inquiry 9: 268–274. Chomsky, Noam (1980a) Rules and Representations, New York: Columbia University Press. Chomsky, Noam (1980b) ‘The New Organology’, The Behavioral and Brain Sciences 3: 42–58. Chomsky, Noam (1981a) Lectures on Government and Binding, Dordrecht: Foris.
Chomsky, Noam (1981b) ‘Principles and Parameters in Syntactic Theory’, in Hornstein and Lightfoot (eds) p. 32–75. Chomsky, Noam (1981c) ‘Knowledge of Language: Its Elements and Origins’, Philosophical Transactions of the Royal Society of London B 295: 223–234.
Chomsky, Noam (1982) Some Concepts and Consequences of the Theory of Government and Binding, Cambridge (Mass.): MIT Press. Chomsky, Noam (1986a) Knowledge of Language: Its Nature, Origin, and Use, Westport (Conn.): Praeger. Chomsky, Noam (1986b) Barriers, Cambridge (Mass.): MIT Press. Chomsky, Noam (1988) Language and Problems of Knowledge, Cambridge (Mass.): MIT Press. Chomsky, Noam (1990) ‘On Formalization and Formal Linguistics’, Natural Language and Linguistic Theory 8: 143–147. Chomsky, Noam (1993a) ‘A Minimalist Program for Linguistic Theory’, in Hale and Keyser (eds) p. 1–52, repr. in Chomsky (1995b), p. 167–217. Chomsky, Noam (1993b) Language and Thought, Wakefield RI: Moyer Bell. Chomsky, Noam (1995a) ‘Language and Nature’, Mind 104: 1–61, repr. in Chomsky (2000b), p. 106–163. Chomsky, Noam (1995b) The Minimalist Program, Cambridge (Mass.): MIT Press. Chomsky, Noam and Lasnik, Howard (1995) ‘The Theory of Principles and Parameters’, in Chomsky (1995b) p. 13–127. Orig. in Jacobs, Joachim, von Stechow, Arnim, Sternefeld, Wolfgang and Vennemann, Theo (eds) (1993) Syntax: An International Handbook of Contemporary Research, Berlin: Mouton de Gruyter. Chomsky, Noam (1997) Perspectives on Power: Reflections on Human Nature and the Social Order, Montréal: Black Rose. Chomsky, Noam (2000a) ‘Linguistics and Brain Science’, in Marantz et al. (eds) p. 13–28. Chomsky, Noam (2000b) New Horizons in the Study of Language and Mind, Cambridge: Cambridge University Press. Chomsky, Noam (2000c) ‘Minimalist Inquiries: The Framework’, in Martin et al. (eds) p. 89–155. Chomsky, Noam (2001) ‘Derivation by Phase’, in Kenstowicz (ed.) p. 1–52. Chomsky, Noam (2002) On Nature and Language, Cambridge: Cambridge University Press. Chomsky, Noam (2003) ‘Reply to Lycan’, in Antony and Hornstein (eds) p. 255–263. Chomsky, Noam (2004) The Generative Enterprise Revisited, Berlin: Mouton de Gruyter. Chomsky, Noam, Hauser, Marc D. and Fitch, W. Tecumseh (2005) ‘Appendix.
The Minimalist Program’, ms. available from www.wjh.harvard.edu/~mnkylab. Clahsen, Harald (ed.) (1996) Generative Perspectives on Language Acquisition, Amsterdam: Benjamins. Colombo, John (1982) ‘The Critical Period Concept: Research, Methodology, and Theoretical Issues’, Psychological Bulletin 91: 260–275.
Comrie, Bernard (1989) Language Universals and Linguistic Typology, 2nd edition, Chicago: University of Chicago Press. Couvalis, George (1997) The Philosophy of Science: Science and Objectivity, London: Sage. Culicover, Peter W. and Jackendoff, Ray (2005) Simpler Syntax, Oxford: Oxford University Press. Culicover, Peter W. and Jackendoff, Ray (2006) ‘The Simpler Syntax Hypothesis’, Trends in Cognitive Sciences 10: 413–418. Curtiss, Susan, Fromkin, Victoria, Krashen, Stephen, Rigler, David and Rigler, Marilyn (1974) ‘The Linguistic Development of Genie’, Language 50: 528–554. Curtiss, Susan (1988) ‘Abnormal Language Acquisition and the Modularity of Language’, in Newmeyer (ed.) p. 96–116. De Houwer, Annick (1995) ‘Bilingual Language Acquisition’, in Fletcher and MacWhinney (eds) p. 219–250. Dowty, David R., Wall, Robert E. and Peters, Stanley (1981) Introduction to Montague Semantics, Dordrecht: Reidel. Dubois, Michel (2001) La nouvelle sociologie des sciences, Paris: Presses Universitaires de France. Eilfort, William H., Kroeber, Paul D. and Peterson, Karen L. (eds) (1985) Papers from the General Session at the Twenty-First Regional Meeting, Chicago: Chicago Linguistic Society. Eubank, Lynn and Gregg, Kevin R. (1999) ‘Critical Periods and (Second) Language Acquisition: Divide et Impera’, in Birdsong (ed.) p. 65–99. Falk, Yehuda N. (1984) ‘The English Auxiliary System’, Language 60: 483–509. Falk, Yehuda N. (2001) Lexical-Functional Grammar: An Introduction to Parallel Constraint-Based Syntax, Stanford (Calif.): CSLI. Fischer, Olga, Koopman, Willem, van Kemenade, Ans and van der Wurff, Wim (2001) The Syntax of Early English, Cambridge: Cambridge University Press. Fitch, W. Tecumseh, Hauser, Marc D. and Chomsky, Noam (2005) ‘The Evolution of the Language Faculty: Clarifications and Implications’, Cognition 97: 179–210. Fletcher, Paul and MacWhinney, Brian (eds) (1995) The Handbook of Child Language, Oxford: Blackwell. 
Flynn, Suzanne (1996) ‘A Parameter-Setting Approach to Second Language Acquisition’, in Ritchie and Bhatia (eds) p. 121–158. Fodor, Jerry A. and Katz, Jerrold J. (eds) (1964) The Structure of Language: Readings in the Philosophy of Language, Englewood Cliffs NJ: Prentice-Hall. Fodor, Jerry A. and Bever, Thomas G. (1965) ‘The Psychological Reality of Linguistic Segments’, Journal of Verbal Learning and Verbal Behavior 4: 414–420.
Fodor, Jerry A., Bever, Thomas G. and Garrett, Merrill F. (1974) The Psychology of Language: An Introduction to Psycholinguistics and Generative Grammar, New York: McGraw-Hill. Fought, John G. (1995) ‘American Structuralism’, in Koerner and Asher (eds) p. 295–306. Friederici, Angela D. (2002) ‘Towards a Neural Basis of Auditory Sentence Processing’, Trends in Cognitive Sciences 6(2): 78–84. Fries, Charles C. (1961) ‘The Bloomfield ‘School’’, in Mohrmann et al. (eds) p. 196–224. Fukui, Naoki and Zushi, Mihoko (2004) ‘Introduction’, in Chomsky (2004) p. 1–25 (translated by Mana Kobuchi-Philip). Gass, Susan M. and Schachter, Jacquelyn (eds) (1989) Linguistic Perspectives on Second Language Acquisition, Cambridge: Cambridge University Press. Gaul, Wolfgang and Pfeifer, Dietmar (eds) (1995) From Data to Knowledge: Theoretical and Practical Aspects of Classification, Data Analysis and Knowledge Organization, Berlin: Springer. Gazdar, Gerald (1976) Review of Sampson, Geoffrey (1975) The Form of Language, London: Weidenfeld and Nicolson, Journal of Linguistics 12: 206–208. Gazdar, Gerald (1981a) ‘Unbounded Dependencies and Coordinate Structure’, Linguistic Inquiry 12: 155–184. Gazdar, Gerald (1981b) ‘On Syntactic Categories’, Philosophical Transactions of the Royal Society of London B 295: 267–283. Gazdar, Gerald, Klein, Ewan, Pullum, Geoffrey and Sag, Ivan (1985) Generalized Phrase Structure Grammar, Oxford: Blackwell. Gazdar, Gerald (2001) Ted Briscoe interviews Gerald Gazdar, http://www.informatics.susx.ac.uk/research/nlp/gazdar/briscoe, 23/12/2005. Goddard, Ives (1987) ‘Leonard Bloomfield’s Descriptive and Comparative Studies of Algonquian’, in Hall (ed.) p. 179–217. Goldin-Meadow, Susan and Mylander, Carolyn (1990) ‘Beyond the Input Given: The Child’s Role in the Acquisition of Language’, Language 66: 323–355. Goodman, Judith C. and Nusbaum, Howard C.
(eds) (1994) The Development of Speech Perception: The Transition from Speech Sounds to Spoken Words, Cambridge (Mass.): MIT Press. Gopnik, Myrna and Crago, Martha B. (1991) ‘Familial Aggregation of a Developmental Language Disorder’, Cognition 39: 1–50. Gould, Stephen Jay (1991) ‘Exaptation: A Crucial Tool for an Evolutionary Psychology’, Journal of Social Issues 47/3: 43–65. Greenberg, Joseph H. (1963) ‘Some Universals of Grammar with Particular Reference to the Order of Meaningful Elements’, in Greenberg (ed.) p. 73–113. Greenberg, Joseph H. (ed.) (1963) Universals of Language, Cambridge (Mass.): MIT Press.
Greenberg, Marc L. (2005) ‘Serbo-Croatian and South Slavic Languages’, in Strazny (ed.) p. 956–958. Gregg, Kevin R. (1996) ‘The Logical and Developmental Problems of Second Language Acquisition’, in Ritchie and Bhatia (eds) p. 49–81. Guasti, Maria Teresa (2002) Language Acquisition: The Growth of Grammar, Cambridge (Mass.): MIT Press. ten Hacken, Pius (1997a) ‘Progress and Incommensurability in Linguistics’, Beiträge zur Geschichte der Sprachwissenschaft 7: 287–310. ten Hacken, Pius (1997b) ‘Some Parallels and Divergences between the Copernican Revolution and the Chomskyan Revolution’, On-line Conference The 40th Anniversary of Generativism, www.kcn.ru/tat_en/science/fccl/generate.htm. ten Hacken, Pius (1998) ‘The English Gender System in a Cross-Linguistic Perspective’, RANAM: Recherches anglaises et nord-américaines 31: 19–31. ten Hacken, Pius (2002) ‘Chomskyan versus Formalist Linguistics’, in Andronis et al. (eds) p. 155–168. ten Hacken, Pius (2005) ‘The Disappearance of the Geographical Dimension of Language in American Linguistics’, in Spurr and Tschichold (eds) p. 249–264. ten Hacken, Pius (2006a) ‘Formalism/Formalist Linguistics’, in Brown (ed.) vol. 4, p. 558–564. ten Hacken, Pius (2006b) ‘Introduction’, in ten Hacken (ed.) p. 7–15. ten Hacken, Pius (2006c) ‘The Nature, Use and Origin of Explanatory Adequacy’, in Broekhuis and Vogel (eds) p. 9–32. ten Hacken, Pius (ed.) (2006) Terminology, Computing and Translation, Tübingen: Narr. Haegeman, Liliane (1994) Introduction to Government and Binding Theory, Second Edition, Oxford: Blackwell (orig. 1991). Hale, Kenneth and Keyser, Samuel J. (eds) (1993) The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, Cambridge (Mass.): MIT Press. Hall, Robert A. (ed.) (1987) Leonard Bloomfield: Essays on His Life and Work, Amsterdam: Benjamins. Halle, Morris (1962) ‘Phonology in Generative Grammar’, Word 18: 54–72. Halle, Morris, Bresnan, Joan and Miller, George A.
(eds) (1978) Linguistic Theory and Psychological Reality, Cambridge (Mass.): MIT Press. Harman, Gilbert (1980) ‘Two Quibbles about Analyticity and Psychological Reality’, The Behavioral and Brain Sciences 3: 21–22. Harris, Zellig S. (1951) Methods in Structural Linguistics, Chicago: University of Chicago Press (repr. as Structural Linguistics, 1960). Harris, Zellig S. (1952) ‘Discourse Analysis’, Language 28: 1–30, repr. in Fodor and Katz (eds) (1964) p. 355–383.
Harris, Zellig S. (1954) ‘Distributional Structure’, Word 10: 146–162, repr. in Fodor and Katz (eds) (1964) p. 33–49. Harris, Zellig S. (1957) ‘Co-occurrence and Transformation in Linguistic Structure’, Language 33: 283–340, repr. in Fodor and Katz (eds) (1964) p. 155–210. Hauser, Marc D., Chomsky, Noam and Fitch, W. Tecumseh (2002) ‘The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?’, Science 298: 1569–1579. He, Hai-Yan, Hodos, William and Quinlan, Elizabeth M. (2006) ‘Visual Deprivation Reactivates Rapid Ocular Dominance Plasticity in Adult Visual Cortex’, Journal of Neuroscience 26: 2951–2955. Hensch, Takao K. (2003) ‘Controlling the Critical Period’, Neuroscience Research 47: 17–22. Hensch, Takao K. (2004) ‘Critical Period Regulation’, Annual Review of Neuroscience 27: 549–579. Hensch, Takao K. (2005) ‘Critical Period Plasticity in Local Cortical Circuits’, Nature Reviews Neuroscience 6: 877–888. Hill, Archibald A. (ed.) (1962) Third Texas Conference on Problems of Linguistic Analysis in English, May 9–12, 1958, Austin (Texas): The University of Texas. Hockett, Charles F. (1942) ‘A System of Descriptive Phonology’, Language 18: 3–21. Hockett, Charles F. (1947) ‘Problems of Morphemic Analysis’, Language 23: 321–343. Hockett, Charles F. (1948) ‘A Note on “Structure”’, International Journal of American Linguistics 14: 269–271, repr. in Joos (ed.) (1957) p. 279–280. Hockett, Charles F. (1954) ‘Two Models of Grammatical Description’, Word 10: 210–231. Hockett, Charles F. (1958) A Course in Modern Linguistics, New York: MacMillan. Hockett, Charles F. (1963) ‘The Problem of Universals in Language’, in Greenberg (ed.) p. 1–29. Hockett, Charles F. (1965) ‘Sound Change’, Language 41: 185–204. Hockett, Charles F. (1968) The State of the Art, Den Haag: Mouton. Hoekstra, Teun (1984) Transitivity: Grammatical Relations in Government-Binding Theory, Dordrecht: Foris.
Hofer, Sonja B., Mrsic-Flogel, Thomas D., Bonhoeffer, Tobias and Hübener, Mark (2006) ‘Prior Experience Enhances Plasticity in Adult Visual Cortex’, Nature Neuroscience 9: 127–132. ’t Hooft, Gerard (1996) De bouwstenen van de schepping: Een zoektocht naar het allerkleinste, Amsterdam: Prometheus (4th ed., orig. 1992). Horn, Laurence R. and Ward, Gregory (eds) (2004) The Handbook of Pragmatics, Malden (Mass.): Blackwell.
Hornstein, Norbert and Lightfoot, David (1981) ‘Introduction’, in Hornstein and Lightfoot (eds) p. 9–31. Hornstein, Norbert and Lightfoot, David (eds) (1981) Explanation in Linguistics: The Logical Problem of Language Acquisition, London and New York: Longman. Householder, Fred W. (1952) Review of Harris, Zellig S. (1951) Methods in Structural Linguistics, Chicago: University of Chicago Press, International Journal of American Linguistics 28: 260–268. Householder, Fred W. (1965) ‘On some Recent Claims in Phonological Theory’, Journal of Linguistics 1: 13–34. Householder, Fred W. (1966) ‘Phonological Theory: A Brief Comment’, Journal of Linguistics 2: 99–100. Householder, Fred W. (1971) Linguistic Speculations, Cambridge: Cambridge University Press. Hoyningen-Huene, Paul (1989) Die Wissenschaftsphilosophie Thomas S. Kuhns, Braunschweig: Vieweg. Huang, C. T. James (1982) ‘Move WH in a Language without WH Movement’, The Linguistic Review 1: 369–416. Huck, Geoffrey J. and Goldsmith, John A. (1995) Ideology and Linguistic Theory: Noam Chomsky and the Deep Structure Debates, London: Routledge. Huxley, Renira and Ingram, Elisabeth (eds) (1971) Language Acquisition: Models and Methods, London: Academic Press. Hyams, Nina (1986) Language Acquisition and the Theory of Parameters, Dordrecht: Reidel. Hyams, Nina (1996) ‘The Underspecification of Functional Categories in Early Grammar’, in Clahsen (ed.) p. 91–127. Hymes, Dell (1971) ‘Competence and Performance in Linguistic Theory’, in Huxley and Ingram (eds) London: Academic Press, p. 3–24. Hymes, Dell and Fought, John (1981) American Structuralism, The Hague: Mouton. Jackendoff, Ray (1972) Semantic Interpretation in Generative Grammar, Cambridge (Mass.): MIT Press. Jackendoff, Ray (1975) ‘Morphological and Semantic Regularities in the Lexicon’, Language 51: 639–671. Jackendoff, Ray (1977) X̄ Syntax: A Study of Phrase Structure, Cambridge (Mass.): MIT Press.
Jackendoff, Ray (1983) Semantics and Cognition, Cambridge (Mass.): MIT Press. Jackendoff, Ray (1990a) Semantic Structures, Cambridge (Mass.): MIT Press. Jackendoff, Ray (1990b) ‘What would a Theory of Language Evolution have to look like?’, Behavioral and Brain Sciences 13: 737–738. Jackendoff, Ray (1993) Patterns in the Mind: Language and Human Nature, New York: Harvester/Wheatsheaf.
Jackendoff, Ray (1997) The Architecture of the Language Faculty, Cambridge (Mass.): MIT Press. Jackendoff, Ray (1999) ‘Possible Stages in the Evolution of the Language Capacity’, Trends in Cognitive Sciences 3: 272–279. Jackendoff, Ray (2001) Review of Calvin, William H. and Bickerton, Derek (2000) Lingua ex Machina: Reconciling Darwin and Chomsky with the Human Brain, Cambridge (Mass.): MIT Press, Language 77: 569–573. Jackendoff, Ray (2002) Foundations of Language: Brain, Meaning, Grammar, Evolution, Oxford: Oxford University Press. Jackendoff, Ray (2003) ‘Précis of Foundations of Language: Brain, Meaning, Grammar, Evolution’, Behavioral and Brain Sciences 26: 651–665. Jackendoff, Ray and Pinker, Steven (2005) ‘The Nature of the Language Faculty and its Implications for Evolution of Language (Reply to Fitch, Hauser, and Chomsky)’, Cognition 97: 211–225. Jackendoff, Ray (2007) ‘A Parallel Architecture Perspective on Language Processing’, Brain Research 1146: 2–22. Jacobs, Roderick A. and Rosenbaum, Peter S. (1968) English Transformational Grammar, London: Ginn (British edition 1970). Jacobs, Roderick A. and Rosenbaum, Peter S. (eds) (1970) Readings in English Transformational Grammar, Waltham (Mass.): Ginn. Jaeggli, Osvaldo A. (1980) ‘Remarks on To Contraction’, Linguistic Inquiry 11: 239–246. Jaeggli, Osvaldo A. (1986) ‘Passive’, Linguistic Inquiry 17: 587–622. Jespersen, Otto (1924) The Philosophy of Grammar, London: George Allen and Unwin. Johnson, David E. and Postal, Paul M. (1980) Arc Pair Grammar, Princeton NJ: Princeton University Press. Johnson, Jacqueline S. and Newport, Elissa L. (1989) ‘Critical Period Effects in Second Language Learning: The Influence of Maturational State on the Acquisition of English as a Second Language’, Cognitive Psychology 21: 60–99. Joos, Martin (ed.) (1957) Readings in Linguistics: The Development of Descriptive Linguistics in America 1925–1956, Chicago: University of Chicago Press, 1966.
Kaldewaij, Jelle (1986) Structuralisme en transformationeel-generatieve grammatica, Dordrecht: Foris. Kaplan, Ronald M. and Bresnan, Joan (1982) ‘Lexical-Functional Grammar: A Formal System for Grammatical Representation’, in Bresnan (ed.) p. 173–281. Kaplan, Ronald M. (2003) ‘Syntax’, in Mitkov (ed.) p. 70–90. Kasher, Asa (1991) ‘Pragmatics and Chomsky’s Research Program’, in Kasher (ed.) p. 122–149. Kasher, Asa (ed.) (1991) The Chomskyan Turn, Oxford: Blackwell. Katz, Jerrold J. (1964) ‘Mentalism in Linguistics’, Language 40: 124–137.
Katz, Jerrold J. (1981) Language and other Abstract Objects, Oxford: Blackwell. Kayne, Richard S. (1975) French Syntax: The Transformational Cycle, Cambridge (Mass.): MIT Press. van Kemenade, Ans and Los, Bettelou (eds) (2006) The Handbook of the History of English, Malden (Mass.): Blackwell. Kenstowicz, Michael (ed.) (2001) Ken Hale: A Life in Language, Cambridge (Mass.): MIT Press. Keuth, Herbert (ed.) (2004) Karl Popper: Logik der Forschung [Klassiker Auslegen Band 12], 2. durchgesehene Auflage, Berlin: Akademie Verlag. Kim, Karl H. S., Relkin, Norman R., Lee, Kyoung-Min and Hirsch, Joy (1997) ‘Distinct Cortical Areas Associated with Native and Second Languages’, Nature 388: 171–174. Knecht, Jean (1982) ‘France’, in Britannica Book of the Year 1982: 392–395. Koerner, E.F.K. (1995a) ‘History of Linguistics: The Field’, in Koerner and Asher (eds) p. 3–7. Koerner, E.F.K. (1995b) ‘Historiography of Linguistics’, in Koerner and Asher (eds) p. 7–16. Koerner, E.F.K. Konrad and Asher, R. E. (eds) (1995) Concise History of the Language Sciences: From the Sumerians to the Cognitivists, Oxford: Pergamon / Elsevier Science. Koningsveld, Herman (1976) Het verschijnsel wetenschap: Een inleiding tot de wetenschapsfilosofie, Meppel: Boom, 19826. Kortmann, Bernd and Schneider, Edgar W. (eds) (2004) A Handbook of Varieties of English: A Multimedia Reference Tool (2 vol. + CD-ROM) Berlin: Mouton de Gruyter. Koster, Jan (1986) ‘The Relation between Pro-drop, Scrambling, and Verb Movements’, Groningen Papers in Theoretical and Applied Linguistics 1. Kuhn, Thomas S. (1957) The Copernican Revolution: Planetary Astronomy in the Development of Western Thought, Cambridge (Mass.): Harvard University Press (repr. 1985). Kuhn, Thomas S. (1959) ‘The Essential Tension: Tradition and Innovation in Scientific Research’, in Taylor (ed.) p. 162–174, repr. in Kuhn (1977) p. 225–239. Kuhn, Thomas S. 
(1970a) The Structure of Scientific Revolutions, Second Edition, Enlarged, Chicago: University of Chicago Press (orig. 1962). Kuhn, Thomas S. (1970b) ‘Reflections on my Critics’, in Lakatos and Musgrave (eds) p. 231–278. Kuhn, Thomas S. (1977) The Essential Tension: Selected Studies in Scientific Tradition and Change, Chicago: University of Chicago Press. Ladefoged, Peter and Broadbent, Donald E. (1960) ‘Perception of Sequence in Auditory Events’, Quarterly Journal of Experimental Psychology 12: 162–170.
Ladefoged, Peter (2001) Vowels and Consonants: An Introduction to the Sounds of Language, Oxford: Blackwell. Ladyman, James (2002) Understanding Philosophy of Science, London: Routledge. Lakatos, Imre (1970) ‘Falsification and the Methodology of Scientific Research Programmes’, in Lakatos and Musgrave (eds) p. 91–196. Lakatos, Imre and Musgrave, Alan (eds) (1970) Criticism and the Growth of Knowledge, Cambridge: Cambridge University Press. Lakatos, Imre (1971) ‘History of Science and its Rational Reconstructions’, in Buck and Cohen (eds) p. 91–136. Lakoff, George (1970) ‘Global Rules’, Language 46: 627–639. Lakoff, George (1995) ‘George Lakoff in Conversation with John Goldsmith’, in Huck and Goldsmith, p. 107–119. Lappin, Shalom, Levine, Robert D. and Johnson, David E. (2000a) ‘The Structure of Unscientific Revolutions’, Natural Language and Linguistic Theory 18: 665–671. Lappin, Shalom, Levine, Robert D. and Johnson, David E. (2000b) ‘The Revolution Confused: A Response to our Critics’, Natural Language and Linguistic Theory 18: 873–890. Larvor, Brendan (1998) Lakatos: An Introduction, London: Routledge. Lasnik, Howard and Kupin, Joseph J. (1977) ‘A Restrictive Theory of Transformational Grammar’, Theoretical Linguistics 4: 173–196. Lasnik, Howard (1999) Minimalist Analysis, Oxford: Blackwell. Laudan, Larry (1977) Progress and its Problems: Towards a Theory of Scientific Growth, Berkeley: University of California Press. Lees, Robert B. (1957) Review of Chomsky, Noam (1957) Syntactic Structures, Den Haag: Mouton, Language 33: 375–408. Lees, Robert B. (1960) The Grammar of English Nominalizations, Bloomington: Indiana University Press and Den Haag: Mouton. Lerdahl, Fred and Jackendoff, Ray (1983) A Generative Theory of Tonal Music, Cambridge, MA: MIT Press. Lenneberg, Eric H. (1967) Biological Foundations of Language, New York: Wiley. Levinson, Stephen C. (1989) ‘A Review of Relevance’, Journal of Linguistics 25: 455–472. 
Lightfoot, David (1976) ‘Trace Theory and Twice-Moved NPs’, Linguistic Inquiry 7: 559–582. Lightfoot, David (1979) Principles of Diachronic Syntax, Cambridge: Cambridge University Press. Lightfoot, David (1981) ‘Explaining Syntactic Change’, in Hornstein and Lightfoot (eds) p. 209–240. Lightfoot, David (1986) ‘A Brief Response’, Linguistic Inquiry 17: 111–113. Lightfoot, David (1989) ‘The Child’s Trigger Experience: Degree-0 Learnability’, Behavioral and Brain Sciences 12: 321–375.
Lightfoot, David (1994) ‘Degree-0 Learnability’, in Lust et al. (eds) p. 453–471. Lightfoot, David (1999) The Development of Language: Acquisition, Change, and Evolution, Oxford: Blackwell. Lightfoot, David (2002a) ‘Myths and the Prehistory of Grammars’, Journal of Linguistics 38: 113–136. Lightfoot, David (2002b) ‘More Myths’, Journal of Linguistics 38: 619–628. Lightfoot, David (2005) ‘Plato’s Problem, UG, and the Language Organ’, in McGilvray (ed.) p. 42–59. Lightfoot, David (2006) ‘Cuing a New Grammar’, in van Kemenade and Los (eds) p. 24–44. Louessard, Laurent (1990) La révolution de juillet 1830, Spartacus. Luce, R. Duncan, Bush, Robert R. and Galanter, Eugene (eds) (1963–1965) Handbook of Mathematical Psychology (3 Vol.) New York: Wiley. Luraghi, Silvia (2005) ‘Spanish and Iberoromance Languages’, in Strazny (ed.) p. 1007–1010. Luria, A. R. (1966) Human Brain and Psychological Processes New York: Harper and Row (translated by Basil Haigh). Lust, Barbara, Hermon, Gabriella and Kornfilt, Jaklin (eds) (1994) Syntactic Theory and First Language Acquisition: Cross-Linguistic Perspectives – Volume 2: Binding, Dependencies, and Learnability, Hillsdale NJ: Lawrence Erlbaum. Lust, Barbara (1999) ‘Universal Grammar: The Strong Continuity Hypothesis in First Language Acquisition’, in Ritchie and Bhatia (eds) p. 111–155. Lycan, William G. (2003) ‘Chomsky on the Mind–Body Problem’, Antony and Hornstein (eds) p. 11–28. Lyons, John (1968) Introduction to Theoretical Linguistics, Cambridge: Cambridge University Press. Lyons, John (1970) Chomsky, London: Fontana/Collins. McGilvray, James (ed.) (2005) The Cambridge Companion to Chomsky, Cambridge: Cambridge University Press. Marantz, Alec, Miyashita, Yasushi and O’Neil, Wayne (eds) (2000) Image, Language, Brain: Papers from the First Mind Articulation Project Symposium, Cambridge (Mass.): MIT Press. 
Margolis, Howard (1993) Paradigms and Barriers: How Habits of Mind Govern Scientific Beliefs, Chicago: University of Chicago Press. Martin, Roger, Michaels, David and Uriagereka, Juan (eds) (2000) Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, Cambridge (Mass.): MIT Press. Masterman, Margaret (1970) ‘The Nature of a Paradigm’, in Lakatos and Musgrave (eds) p. 59–89. Matthews, Peter H. (1967) Review of Chomsky, Noam (1965) Aspects of the Theory of Syntax, Cambridge (Mass.): MIT Press, Journal of Linguistics 3: 119–152.
Matthews, Peter H. (1968) ‘Some Remarks on the Householder-Halle Controversy’, Journal of Linguistics 4: 275–283. Matthews, Peter H. (1974) Morphology: An Introduction to the Theory of Word Structure, Cambridge: Cambridge University Press. Matthews, Peter H. (1993) Grammatical Theory in the United States from Bloomfield to Chomsky, Cambridge: Cambridge University Press. Mayberry, Rachel I. (1994) ‘The Importance of Childhood to Language Acquisition: Evidence from American Sign Language’, in Goodman and Nusbaum (eds) p. 57–90. Meisel, Jürgen M. (2004) ‘The Bilingual Child’, in Bhatia and Ritchie (eds) p. 91–113. Miles, Dorothy (1988) British Sign Language: A Beginner’s Guide, BBC Books. Miller, George A. (1962) ‘Some Psychological Studies of Grammar’, American Psychologist 17: 748–762. Millikan, Ruth G. (2003) ‘In Defense of Public Language’, in Antony and Hornstein (eds) p. 215–237. Misteli, Franz (1882) ‘Beurteilung: Principien der Sprachgeschichte von Hermann Paul’, Zeitschrift für Völkerpsychologie und Sprachwissenschaft 13:376-409. Mitkov, Ruslan (ed.) (2003) The Oxford Handbook of Computational Linguistics, Oxford: Oxford University Press. Mohrmann, Christine, Sommerfelt, Alf and Whatmough, Joshua (eds) (1961) Trends in European and American Linguistics 1930–1960, Utrecht/ Antwerpen: Spectrum. Montague, Richard (1970) ‘Universal Grammar’, Theoria 36: 373–398. Moody, Bill (1998) La langue des signes – Tome 1: Histoire et grammaire, Paris: International Visual Theatre. Moore, Patrick (2002) Venus, London: Cassell. Morgan, James L. (1986) From Simple Input to Complex Grammar, Cambridge (Mass.): MIT Press. Muncie, James (2002) ‘Finding a Place for Grammar in EFL Composition Classes’, ELT Journal 56: 180–186. Nagel, Ernest (1961) The Structure of Science: Problems in the logic of Scientific Explanation, Indianapolis: Hackett, 19792. 
Nagel, Ernest, Suppes, Patrick and Tarski, Alfred (eds) (1962) Logic, Methodology and Philosophy of Science: Proceedings of the 1960 International Congress, Stanford (Calif.): Stanford University Press. Neidle, Carol, Kegl, Judy, MacLaughlin, Dawn, Bahan, Benjamin and Lee, Robert G. (2000) The Syntax of American Sign Language: Functional Categories and Hierarchical Structure, Cambridge (Mass.): MIT Press. Newmeyer, Frederick J. (1983) Grammatical Theory: Its Limits and Its Possibilities, Chicago: University of Chicago Press. Newmeyer, Frederick J. (1986a) Linguistic Theory in America, second edition, New York: Academic Press.
Newmeyer, Frederick J. (1986b) The Politics of Linguistics, Chicago: University of Chicago Press. repr. 1988. Newmeyer, Frederick J. (1986c) ‘Has there been a ‘Chomskyan Revolution’ in Linguistics ?’, Language 62: 1–18, repr. in Newmeyer (1996), p. 23-38. Newmeyer, Frederick J. (ed.) (1988) Linguistics: The Cambridge Survey – Volume II: Linguistic Theory: Extensions and Implications, Cambridge: Cambridge University Press. Newmeyer, Frederick J. (1996) Generative Linguistics: A historical perspective, London: Routledge. Newmeyer, Frederick J. (1998) Language Form and Language Function, Cambridge (Mass.): MIT Press. Niebaum, Hermann and Macha, Jürgen (1999) Einführung in die Dialektologie des Deutschen, Tübingen: Niemeyer. Nique, Christian (1974) Initiation méthodique à la grammaire générative, Paris: Colin. Noordegraaf, Jan (2000) ‘Noam Chomsky en zijn Nederlandse uitgevers. Twee retouches’, Neder-L 4.11 (http: //www.neder-l.nl/bulletin/2000/04/000411. html, 9/1/2006). Oesterreicher, Wulf (1977) ‘Paradigma und Paradigmawechsel – Thomas S. Kuhn und die Linguistik’, Osnabrücker Beiträge zur Sprachtheorie 3: 241–284. Ohlsson, Stig-Örjan (1978–79) Skånes språkliga försvenskning (2 vol.) Lund: Lund University Press. Ooi, Vincent B. Y. (1998) Computer Corpus Lexicography, Edinburgh: Edinburgh University Press. Otero, Carlos P. (1990) ‘The Emergence of Homo Loquens and the Laws of Physics’, Behavioral and Brain Sciences 13: 747–750. Paul, Hermann (1886) Principien der Sprachgeschichte, zweite auflage, Halle: Niemeyer. Pepperberg, Irene M. (1999) The Alex Studies: Cognitive and Communicative Abilities of Grey Parrots, Cambridge (Mass.): Harvard University Press. Pepperberg, Irene M. (2005) ‘Animals and Human Language 3: Parrots’, in Strazny (ed.) p. 66–67. Percival, W. Keith (1976) ‘The Applicability of Kuhn’s Paradigms to the History of Linguistics’, Language 52: 285–294. Pinker, Steven (1982) ‘A Theory of the Acquisition of Lexical Interpretive Grammars’, in Bresnan (ed.) 
p. 655–726. Pinker, Steven and Bloom, Paul (1990a) ‘Natural Language and Natural Selection’, Behavioral and Brain Sciences 13: 707–726. Pinker, Steven and Bloom, Paul (1990b) ‘Issues in the Evolution of the Human Language Faculty’, Behavioral and Brain Sciences 13: 765–778. Pinker, Steven and Jackendoff, Ray (2005) ‘The Faculty of Language: What’s Special about it?’, Cognition 95: 201–236.
Pollard, Carl and Sag, Ivan A. (1987) Information-Based Syntax and Semantics; Volume 1: Fundamentals, Stanford (Calif.): Center for the Study of Language and Information. Pollard, Carl and Sag, Ivan A. (1994) Head-Driven Phrase Structure Grammar, Chicago: University of Chicago Press and Stanford (Calif.): Center for the Study of Language and Information. Popper, Karl R. (1959) The Logic of Scientific Discovery, London: Routledge, updated, authorized translation of Logik der Forschung, published in 1934 (Wien: Springer). Popper, Karl R. (1963) Conjectures and Refutations: The Growth of Scientific Knowledge, London: Routledge and Kegan Paul. Popper, Karl. R. (1970) ‘Normal Science and its Dangers’, in Lakatos & Musgrave (eds) p. 51–58. Postal, Paul M. and Pullum, Geoffrey K. (1978) ‘Traces and the Description of English Complementizer Contraction’, Linguistic Inquiry 9: 1–29. Postal, Paul M. and Pullum, Geoffrey K. (1982) ‘The Contraction Debate’, Linguistic Inquiry 13: 122–138. Postal, Paul M. and Pullum, Geoffrey K. (1986) ‘Misgovernment’, Linguistic Inquiry 17: 104–110. Pullum, Geoffrey K. and Postal, Paul M. (1979) ‘On an Inadequate Defense of “Trace Theory”’, Linguistic Inquiry 10: 689–706. Pullum, Geoffrey K. (1983) ‘How Many Possible Human Languages Are There?’, Linguistic Inquiry 14: 447–467. Pullum, Geoffrey K. (1985) ‘Assuming Some Version of X-Bar Theory’, in Eilfort et al. (eds) p. 323–353. Pullum, Geoffrey K. (1989) ‘Formal Linguistics Meets the Boojum’, Natural Language and Linguistic Theory 7: 137–143. Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey and Svartvik, Jan (1985) A Comprehensive Grammar of the English Language, London: Longman. Raven, Peter H., Johnson, George B., Losos, Jonathan B. and Singer, Susan R. (2005) Biology, 7th edition, New York: McGraw Hill. Reker, Siemon (2002) Gronings [Taal in stad en land 3], Den Haag: Sdu. 
Reuland, Eric (2000) ‘Revolution, Discovery, and an Elementary Principle of Logic’, Natural Language and Linguistic Theory 18: 843–848. Rieber, Robert W. (ed.) (1976) The Neuropsychology of Language: Essays in Honor of Eric Lenneberg, New York: Plenum. van Riemsdijk, Henk and Williams, Edwin (1986) Introduction to the Theory of Grammar, Cambridge (Mass.): MIT Press. Ritchie, William C. and Bhatia, Tej K. (eds) (1996) Handbook of Second Language Acquisition, San Diego: Academic Press. Ritchie, William C. and Bhatia, Tej K. (eds) (1999) Handbook of Child Language Acquisition, San Diego: Academic Press.
Rizzi, Luigi (1980) ‘Violations of the Wh-Island Constraint in Italian and the Subjacency Condition’, Journal of Italian Linguistics 5: 157–192, repr. in Rizzi (1982), p. 49–76. Rizzi, Luigi (1982) Issues in Italian Syntax, Dordrecht: Foris. Roberts, Ian and Roussou, Anna (1999) ‘A Formal Approach to “Grammaticalization’’’, Linguistics 37: 1011–1041. Roberts, Ian (2000) ‘Caricaturing Dissent’, Natural Language and Linguistic Theory 18: 849–857. Roberts, Ian and Roussou, Anna (2003) Syntactic Change: A Minimalist Approach to Grammaticalization, Cambridge: Cambridge University Press. Robins, Robert H. (1967) A Short History of Linguistics, London: Longman, 1979 2. Roeper, Thomas and Williams, Edwin (eds) (1987) Parameter Setting, Dordrecht: Reidel. Rossi, Paolo (1997) La nascita della scienza moderna in Europa, Roma / Bari: Laterza. Sag, Ivan (1993) ‘I Still Think of Myself as a Chomskyan’, Interview with Anne-Marie Mineur and Gerrit Rentier, Ta! 2.2 (http: //www.let. uu.nl/~Anne-Marie.Mineur/personal/Ta/Sag.html, 26/12/2005) Sag, Ivan A. and Wasow, Thomas (1999) Syntactic Theory: A Formal Introduction, Stanford: CSLI. Schachter, Jacquelyn (1996) ‘Maturation and the Issue of Universal Grammar in Second Language Acquisition’, in Ritchie and Bhatia (eds) p. 159–193. Sebba, Mark (1997) Contact Languages: Pidgins and Creoles, Basingstoke: Palgrave. Sebeok, Thomas A. (ed.) (1966) Current Trends in Linguistics Volume III: Theoretical Foundations, Den Haag: Mouton. Selinker, Larry (1992) Rediscovering Interlanguage, Harlow: Routledge. Sells, Peter (1985) Lectures on Contemporary Syntactic Theories; An Introduction to Government-Binding Theory, Generalized Phrase Structure Grammar, and Lexical-Functional Grammar, Stanford: CSLI. Shapere, Dudley (1964) ‘The Structure of Scientific Revolutions’, Philosophical Review 73: 383–394. Shieber, Stuart M. (1986) An Introduction to Unification-Based Approaches to Grammar, Stanford: CSLI. Skinner, B. F. 
(1957) Verbal Behavior, Englewood Cliffs NJ: Prentice-Hall. Slobin, Dan I. (1966) ‘Grammatical Transformations and Sentence Comprehension in Childhood and Adulthood’, Journal of Verbal Learning and Verbal Behavior 5: 219–227. Smith, Neil (1999) Chomsky: Ideas and Ideals, Cambridge: Cambridge University Press. Sorace, Antonella (1996) ‘The Use of Acceptability Judgments in Second Language Acquisition Research’, in Ritchie and Bhatia (eds) p. 375–409.
Sperber, Dan and Wilson, Deirdre (1995) Relevance: Communication and Cognition, second edition, Oxford: Blackwell (orig. 1986). Spurr, David and Tschichold, Cornelia (eds) (2005) The Space of English, Tübingen: Narr. Stockwell, Robert P., Schachter, Paul and Partee, Barbara H. (1973) The Major Syntactic Constructions of English, New York: Holt, Rinehart and Winston. Stowell, Timothy A. (1981) Origins of Phrase Structure, PhD Dissertation, Massachusetts Institute of Technology. Strazny, Philipp (ed.) (2005) Encyclopedia of Linguistics (2 vol.) New York: Fitzroy Dearborn. Sun, Chaofen (2006) Chinese: A Linguistic Introduction, Cambridge: Cambridge University Press. Sutton-Spence, Rachel and Woll, Bencie (1999) The Linguistics of British Sign Language: An Introduction, Cambridge: Cambridge University Press. Taylor, C. W. (ed.) (1959) The Third (1959) University of Utah Research Conference on the Identification of Scientific Talent, Salt Lake City: University of Utah Press. Taylor, Stuart Ross (1998) Destiny or Chance: Our Solar System and its Place in the Cosmos, Cambridge: Cambridge University Press. Thagard, Paul (1988) Computational Philosophy of Science, Cambridge (Mass.): MIT Press. Thain, M. and Hickman, M. (2004) The Penguin Dictionary of Biology, 11th edition, London: Penguin. Thomason, Richmond H. (ed.) (1974) Formal Philosophy: Selected Papers of Richard Montague, New Haven: Yale University Press. Thorne, James Peter (1965) Review of Postal, Paul (1964) Constituent Structure: A Study of Contemporary Models of Syntactic Description, The Hague: Mouton, Journal of Linguistics 1: 73–76. Trager, George L. and Smith, Henry Lee (1951) An Outline of English Structure, Washington: American Council of Learned Societies. Uriagereka, Juan (1998) Rhyme and Reason: An Introduction to Minimalist Syntax, Cambridge (Mass.): MIT Press. Uriagereka, Juan (2000) ‘On the Emptiness of “Design” Polemics’, Natural Language and Linguistic Theory 18: 863–871. 
Uyechi, Linda (1996) The Geometry of Visual Phonology, Stanford (Calif.): CSLI. Valian, Virginia (1999) ‘Input and Language Acquisition’, in Ritchie and Bhatia (eds) p. 497–530. Verein Ernst Mach (1929) Wissenschaftliche Weltauffassung: Der Wiener Kreis, Wien: Wolf. Voegelin, Charles F. (1958) Review of Chomsky, Noam (1957) Syntactic Structures, ’s-Gravenhage: Mouton, International Journal of American Linguistics 24: 229–231.
Weber-Fox, Christine M. and Neville, Helen J. (1999) ‘Functional Neural Subsystems are Differentially Affected by Delays in Second Language Immersion: ERP and Behavioral Evidence in Bilinguals’, in Birdsong (ed.) p. 23–38. Weiss, Albert P. (1925) ‘Linguistics and Philosophy’, Language 1: 52–57. Wexler, Kenneth and Culicover, Peter W. (1980) Formal Principles of Language Acquisition, Cambridge (Mass.): MIT Press. Wexler, Kenneth and Manzini, M. Rita (1987) ‘Parameters and Learnability in Binding Theory’, in Roeper and Williams (eds) p. 41–76. Wexler, Kenneth (1991) ‘On the Argument from the Poverty of the Stimulus’, in Kasher (ed.) p. 252–270. Wexler, Kenneth (1999) ‘Maturation and Growth of Grammar’, in Ritchie and Bhatia (eds) p. 55–109. White, Lydia (1996) ‘Universal Grammar and Second Language Acquisition: Current Trends and New Directions’, in Ritchie and Bhatia (eds) p. 85–120. White, Lydia (2003) Second Language Acquisition and Universal Grammar, Cambridge: Cambridge University Press. Wiesel, Torsten N. and Hubel, David H. (1963a) ‘Effects of Visual Deprivation on Morphology and Physiology of Cells in the Cat’s Lateral Geniculate Body’, Journal of Neurophysiology 26: 978–993. Wiesel, Torsten N. and Hubel, David H. (1963b) ‘Single-Cell Responses in Striate Cortex of Kittens Deprived of Vision in One Eye’, Journal of Neurophysiology 26: 1003–1017. Wiggins, David (1997) ‘Languages as Social Objects’, Philosophy 72: 499–524. Wilson, Deirdre and Sperber, Dan (2004) ‘Relevance Theory’, in Horn and Ward (eds) p. 607–632. Woods, William A. (1970) ‘Transition Network Grammars for Natural Language Analysis’, Communications of the ACM 13: 591–606. Woolgar, Steve (1988) Science: The Very Idea, Chichester: Horwood and London: Tavistock. Yamada, Jeni E. (1990) Laura: A Case for the Modularity of Language, Cambridge (Mass.): MIT Press. Zahar, Elie G. (2004) ‘Falsifiability’, in Keuth (ed.) p. 103–123. 
Zeilinger, Anton (2003) Einsteins Schleier: Die neue Welt der Quantenphysik, München: Beck.
Author index Page numbers are listed with the first author only. Co-authors are listed with a reference to the first author. Page numbers in italics indicate edited volumes referred to in the reference section. Page numbers in bold italics refer to summaries. Allerton, D.J., 281 Anderson, Stephen R., 103, 115, 127 fn. 29, 255, 335 Andronis, Mary, 339 Antony, Louise M., 183, 336, 345, 346 Aoun, Joseph, 221–3 Asher, R.E. → Koerner Atkinson, Martin, 282, 285–6, 289 Ayer, Alfred Jules, 20 Bahan, Benjamin → Neidle Ball, Christopher → Andronis Barbour, Stephen, 277, 322 Barwise, Jon, 236 Beekes, Robert S.P., 331 fn. 24 Bensaude-Vincent, Bernadette, 112–3 Bever, Thomas G., 62; see also Fodor Bhatia, Tej K., 313, 346; see also Ritchie Bickerton, Derek, 125 fn. 11 Birdsong, David, 337, 351 Blakemore, Sarah-Jane, 125 fn. 5 Bley-Vroman, Robert, 300–5, 310, 313 Bloch, Bernard, & Trager (1942), 136–7 (1948), 137–8, 166, 180 fn. 3, 5 Bloom, Paul → Pinker & Bloom (1990a, b) Bloomfield, Leonard, 31, 49, 130, 155, 156, 174, 179–80 fn. 2–3 (1926), 133–5, 137–8 (1933), 135–6, 138, 140, 152, 158, 160–2, 275–6 (1946), 131 Bonhoeffer, Tobias → Hofer Borer, Hagit, 297 Botha, Rudolf P. (1968), 200 (1981), 62, 71, 90, 126 fn. 15, 132, 139, 180 fn. 4, 211, 213 (1989), 39–40, 245, 270 fn. 9
Boyes Braem, Penny, 329 fn. 8 Bresnan, Joan, (1978), 186–9, 196–8, 204 & Kaplan (1982), 186, 188–96, 201–2, 205, 269 fn. 3 (ed.) (1982), 184, 210, 342, 347 (2001), 185, 205–7; see also Kaplan & Bresnan (1982), Halle Bright, William, 152 Broadbent, Donald E. → Ladefoged Broekhuis, Hans, 339 Brown, Keith, 339 Buck, Roger C., 344 Bush, Robert R. → Luce Cain, Joe, 229 Campbell, Lyle, 331 fn 24 Cenoz, Jasone, 330 fn. 13 Chao, Yuen-Ren, 149–50, 164 Chen, Xiang, 27 Chomsky, Noam (1953), 181–2 fn. 19 (1957), 55, 70, 76, 103–4, 124 fn. 4, 129–30, 163, 165–6, 177, 187, 210–1, 228–9, 248 (1959), 105, 301 (1961), 43, 60 (1962a), 66, 73, 76, 104, 199 (1962b), 105 & Miller (1963), 125 fn. 10, 197–9 (1964), 41, 59, 84, 88–9, 167–9, 171, 192, 197 (1965), 40, 42, 45, 55, 58, 65–6, 69–70, 73, 82–3, 89–90, 95–6, 102–4, 106, 106, 111, 121, 126 fn. 19, 148, 186, 196–7, 203, 205, 210–1, 212–3, 254 & Halle (1965), 167–74, 175, 176, 182 fn. 21 (1966a), 42, 47, 73, 103, 198, 303 (1966b), 114–5 (1967), 167, 171 & Halle (1968), 45, 47, 50, 67, 90, 274 (1970), 98, 103, 206, 246 (1972a), 126 fn. 14 (1972b), 254 (1973), 109, 270 fn. 10 (1975a), 80, 125–6 fn. 12–3, 161, 198, 325
(1975b), 269–70 fn. 6 (1976a), 270 fn. 10 (1976b), 30 & Lasnik (1978), 220 (1980a), 42–3, 45, 48, 50, 54, 59, 66, 68, 71–2, 77–8, 82, 84, 111–2, 121, 124 fn. 2, 131, 198, 201, 256, 270 fn. 11, 276–7, 302, 324 (1980b), 68 (1981a), 40, 87, 92, 95, 97–8, 100, 102–3, 106–7, 106, 123, 126 fn. 21, 205, 207, 211–2, 219–20, 223–7, 246, 270 fn. 6, 11, 271 fn. 20–1, 274 (1981b), 75, 197 (1981c), 225, 229 (1982), 119 (1986a), 49–51, 53–4, 63, 67–8, 74–7, 79, 84, 86, 88, 91–2, 99, 102, 131, 162, 190, 195, 197, 213, 238–9, 251, 324 (1986b), 57, 98, 107, 207, 227–9, 270 fn. 12–3 (1988), 44, 69, 79, 83, 102, 109–10, 126 fn. 16, 140, 275 (1990), 229, 270 fn. 12 (1993a), 107, 109 (1993b), 110, 112–3, 278 (1995a), 44–5, 102, 113, 274–6, 278, 280, 283–4 (1995b), 40, 107–8, 119, 122, 123, 127 fn. 30, 296 & Lasnik (1995), 119 (1997), 57, 63, 80, 113 (2000a), 111, 119 (2000b), 127 fn. 26, 278 (2000c), 107 (2001), 107 (2002), 59, 85 (2003), 127 fn. 26 (2004), 67, 84, 93, 95, 99, 102, 104, 116, 121, 338 et al. (2005), 265–6 see also Fitch, Hauser Chung, Sandra → Anderson Clahsen, Harald, 341 Cohen, Robert S. → Buck Colombo, John, 290–2, 307, 309 Comrie, Bernard, 81, 126 fn. 14 Couvalis, George, 5 Crago, Martha B. → Gopnik Culicover, Peter W., 251; see also Wexler & Culicover (1980)
Curtiss, Susan, 91, 291–2, 330 fn. 18 De Houwer, Annick, 300 Dowty, David R., 269 fn. 6 Dubois, Michel, 14 Eilfort, William H., 348 Elston, Heidi → Andronis Eubank, Lynn, 292, 308, 316, 330 fn. 17–8 Falk, Yehuda N., 184–5, 207, 269 fn. 1 Fischer, Olga, 318, 320 Fitch, W. Tecumseh, 259–66; see also Chomsky et al. (2005), Hauser Fletcher, Paul, 337 Flynn, Suzanne, 301, 310–2, 315–6 Fodor, Jerry A., 61–2, 168 187–8, 339–40 Fought, John G., 131, 135, 138, 181 fn. 12; see also Hymes & Fought (1981) Friederici, Angela D., 63, 113 Fries, Charles C., 130, 179–80 fn. 3 Frith, Uta → Blakemore Fromkin, Victoria → Curtiss Fukui, Naoki, 325 Galanter, Eugene → Luce Garrett, Merrill F. → Fodor Gass, Susan M., 333 Gaul, Wolfgang, 164 Gazdar, Gerald (1976), 214 (1981a), 212–3 (1981b), 225–6, 228–9 et al. (1985), 210, 213–4, 216, 229, 231, 233 (2001), 211–2, 214, 227, 231–3, 269 fn. 5 Goddard, Ives, 135 Goldin-Meadow, Susan, 125 fn. 11 Goldsmith, John A. → Huck Goodman, Judith C., 346 Gopnik, Myrna, 48 Gould, Stephen Jay, 115, 117 Greenbaum, Sidney → Quirk Greenberg, Joseph H., 216, 338, 340 Greenberg, Marc L., 278 Gregg, Kevin R., 305, 309–10, 315; see also Eubank & Gregg (1999) Guasti, Maria Teresa, 285–6
ten Hacken, Pius (1997a), 126 fn. 15, 180 fn. 4, 269 fn. 4 (1997b), 160 (1998), 238 (2002), 218 (2005), 180 fn. 8, 276 (2006a), 126 fn. 15 (2006b), 281 (2006c), 121, 126 fn. 17 Haegeman, Liliane, 126 fn. 20, 331 fn. 20 Hale, Kenneth, 336 Hall, Robert A., 338 Halle, Morris, 167–8, 334; see also Chomsky & Halle (1965), (1968) Harman, Gilbert, 68 Harris, Alice C. → Campbell Harris, Zellig S. 49, 131, 155, 158 (1951), 134, 139–47, 149, 151–2, 154, 159, 163, 165–6, 179–80 fn. 3, 181 fn. 13–4 (1952), 157 (1954), 141, 147–50, 152–3, 159, 173–4 (1957), 157 Hauser, Marc D., 114–9, 127 fn. 28, 257–60, 263–5; see also Chomsky et al. (2005), Fitch He, Hai-Yan, 309 Hensch, Takao K., 290, 308–9 Hermon, Gabriella → Lust Hickman, M. → Thain Hill, Archibald A., 131, 334 Hirsch, Joy → Kim Hockett, Charles F. (1942), 144, 150–2, 160, 165 (1947), 140–1, 166 (1948), 148, 181 fn. 18 (1954), 142, 144, 150, 152–4, 159, 165–6, 177, 192 (1958), 138, 180 fn. 8, 276 (1963), 158 (1965), 177, 182 fn. 22 (1968), 181 fn. 12 Hodos, William → He Hoekstra, Teun, 60 Hofer, Sonja B., 309 ’t Hooft, Gerard, 29 Horn, Laurence R., 351
Hornstein, Norbert, 67, 77, 87–8, 90, 203, 282, 304, 335, 344; see also Antony Householder, Fred W., 181 fn. 12 (1952), 145–7 (1965), 167–74, 175, 176–8, 181 fn. 15, 182 fn. 21 (1966), 167, 170–1, 174, 176 (1971), 167–8, 178–9 Hoyningen-Huene, Paul, 15–6, 26, 28 Huang, C.T. James, 331 fn. 21 Hubel, David H. → Wiesel Hübener, Mark → Hofer Huck, Geoffrey J., 104–5, 246, 271 fn. 19, 344 Hufeisen, Britta → Cenoz Huxley, Renira, 341 Hyams, Nina, 286, 288, 329 fn. 5 Hymes, Dell (1971), 47–8, 324, 327, 327 & Fought (1981), 131, 181 fn. 12 Ingram, Elisabeth → Huxley Jackendoff, Ray (1972), 246 (1975), 246 (1977), 206–7, 246, 248, 271 fn. 21 (1983), 12, 246–7, 262 (1990a), 247–9, 251 (1990b), 272 fn. 26 (1993), 43, 77, 114, 251–2, 266, 329 fn. 8 (1997), 245–6, 249, 252 (1999), 258 (2001), 258 (2002), 245, 250, 253, 256–8, 263, 266, 271–2 fn. 22–3, 272 fn. 29 (2003), 253–8, 272 fn. 25 & Pinker (2005), 259, 261–4, 266 (2007), 256 see also Pinker & Jackendoff (2005), Culicover, Lerdahl Jacobs, Roderick A., 126 fn. 19–20, 335 Jaeggli, Osvaldo A., 99–100, 220, 270 fn. 11 Jespersen, Otto, 49, 124 fn. 3 Jessner, Ulrike → Cenoz Johnson, David E., 218, 269 fn. 6, 270 fn. 8; see also Lappin Johnson, George B. → Raven
Johnson, Jacqueline S., 307–9, 312 Joos, Martin, 131, 334, 340 Kaldewaij, Jelle, 156–9, 181 fn. 17 Kaplan, Ronald M. & Bresnan (1982), 185, 189, 205 (2003), 243, 271 fn. 14 see also Bresnan & Kaplan (1982) Kasher, Asa, 19, 325, 351 Katz, Jerrold J., 78–9, 111, 164, 198, 270 fn. 9; see also Fodor Kayne, Richard S., 72 Kegl, Judy → Neidle van Kemenade, Ans, 345; see also Fischer Kenstowicz, Michael, 336 Keuth, Herbert, 351 Keyser, Samuel J. → Hale Kim, Karl H.S., 309 Kiparsky, Paul → Anderson Kirk, R. → Bever Klein, Ewan → Gazdar et al. (1985) Knecht, Jean, 25 Koerner, E.F.K., 34, 338 Koningsveld, Herman, 37 fn. 3 Koopman, Willem → Fischer Kornfilt, Jaklin → Lust Kortmann, Bernd, 281 Koster, Jan, 288 Krashen, Stephen → Curtiss Kroeber, Paul D. → Eilfort Kuhn, Thomas S. (1957), 37 fn. 2, 160 (1959), 30 (1970a), 15–9, 21–4, 26–36, 36–7, 57, 95, 102, 129, 132, 149, 156, 160, 166, 172, 175, 177, 188, 207, 233 (1970b), 22, 32 (1977), 30, 132 Kupin, Joseph J. → Lasnik Lackner, J.R. → Bever Ladefoged, Peter, 61, 81 Ladyman, James, 5 Lakatos, Imre, 18–9, 22, 343, 345 Lakoff, George, 219, 326–7, 327 Lappin, Shalom, 107, 122–3, 264
Larvor, Brendan, 22 Lasnik, Howard, 108, 210, 269–70 fn. 6; see also Chomsky & Lasnik (1978), (1995) Laudan, Larry, 18, 22 Lee, Kyoung-Min → Kim Lee, Robert G. → Neidle Leech, Geoffrey → Quirk Lees, Robert B., 40, 126 fn. 22, 177, 211 Lenneberg, Eric H., 30, 201, 290–2 Lerdahl, Fred, 261 Levine, Robert D. → Lappin Levinson, Stephen C., 325 Lightfoot, David, 219–22, 270 fn. 10, 278, 289, 318, 320–3, 331 fn. 24; see also Aoun, Hornstein Los, Bettelou → van Kemenade Losos, Jonathan B. → Raven Louessard, Laurent, 24 Luce, R. Duncan, 335 Luraghi, Silvia, 277 Luria, A.R., 125 fn. 9 Lust, Barbara, 51, 293, 296–7, 345 Lycan, William G., 127 fn. 26 Lyons, John, 40, 54–6, 58, 129, 206 Macha, Jürgen → Niebaum MacLaughlin, Dawn → Neidle MacWhinney, Brian → Fletcher Manzini, M. Rita → Wexler & Manzini (1987) Marantz, Alec, 336 Margolis, Howard, 35, 175 Martin, Roger, 336 Masterman, Margaret, 15 Matthews, Peter H. (1967), 125 fn. 7 (1968), 167, 169, 176 (1974), 38 fn. 10 (1993), 31, 51, 95, 99, 101, 131 Mayberry, Rachel I., 293 McCloskey, James → Anderson McGilvray, James (ed.), 345 Meisel, Jürgen M., 300 Michaels, David → Martin Miles, Dorothy, 329 fn. 8 Miller, George A., 187, 199; see also Chomsky & Miller (1963), Halle Millikan, Ruth G., 281
Misteli, Franz, 317, 331 fn. 22 Mitkov, Ruslan (ed.), 342 Miyashita, Yasushi → Marantz Mohrmann, Christine, 130, 338 Montague, Richard, 212, 214, 239 Moody, Bill, 329 fn. 8 Moore, Patrick, 10 Morgan, James L., 289 Mrsic-Flogel, Thomas D. → Hofer Muncie, James, 330 fn. 12 Musgrave, Alan → Lakatos Mylander, Carolyn → Goldin-Meadow Nagel, Ernest, 6, 8, 53, 334 Neidle, Carol, 292–3 Neuvel, Sylvain → Andronis Neville, Helen J. → Weber-Fox Newmeyer, Frederick J. (1983), 40, 47–8, 57, 70, 82, 330 fn. 15 (1986a), 131, 177, 209, 218, 232, 246, 271 fn. 18 & 22, 326 (1986b), 137 (1986c), 175–6 (ed.) (1988), 337 (1996), 271 fn. 19, 332 (1998), 183, 252, 267–8 see also Anderson Newport, Elissa L. → Johnson, Jacqueline S. Niebaum, Hermann, 329 fn. 1 Nique, Christian, 126 fn. 19 Noordegraaf, Jan, 177 Nusbaum, Howard C. → Goodman Oesterreicher, Wulf, 30–1 Ohlsson, Stig-Örjan, 329 fn. 3 O’Neil, Wayne → Marantz Ooi, Vincent B.Y., 51 Otero, Carlos P., 272 fn. 26 Partee, Barbara H. → Stockwell Paul, Hermann, 317 Pepperberg, Irene M., 116–7 Percival, W. Keith, 32–4 Perry, John → Barwise Peters, Stanley → Dowty Peterson, Karen L. → Eilfort
Pfeifer, Dietmar → Gaul Pinker, Steven (1982), 202–5 & Bloom (1990a), 256–8 & Bloom (1990b), 272 fn. 26 & Jackendoff (2005), 259–66 see also Jackendoff & Pinker (2005) Pollard, Carl, 231–43, 244, 271 fn. 15–6, 328 Popper, Karl R., 18, 21–3, 25–6, 28 Postal, Paul M., 220–3, 270 fn. 11, 271 fn. 18, 327; see also Pullum & Postal (1979), Johnson, David E. Pullum, Geoffrey K., 210, 271 fn. 15 & Postal (1979), 220 (1983), 226–7 (1985), 223–4, 227, 270 fn. 13 (1989), 228–30, 270 fn. 12 see also Gazdar et al. (1985), Postal Quinlan, Elizabeth M. → He Quirk, Randolph, 66 Raven, Peter H., 329 fn. 4 Reker, Siemon, 329 fn. 2 Relkin, Norman R. → Kim Reuland, Eric, 109 Rieber, Robert W., 335 van Riemsdijk, Henk, 126 fn. 19, 204, 270 fn. 10 Rigler, David → Curtiss Rigler, Marilyn → Curtiss Ritchie, William C., 315, 333, 337, 339, 345, 349–51; see also Bhatia Rizzi, Luigi, 103, 109, 311 Roberts, Ian, 121, 318, 324 Robins, Robert H., 130, 152 Roeper, Thomas, 351 Rosenbaum, Peter S. → Jacobs Rossi, Paolo, 12, 53 Roussou, Anna → Roberts Sag, Ivan, 210, 231–3, 239–41, 243, 269 fn. 1; see also Gazdar et al. (1985), Pollard Schachter, Jacquelyn, 302–4, 310–1, 314–5; see also Gass Schachter, Paul → Stockwell Schneider, Edgar W. → Kortmann
Sebba, Mark, 125 fn. 11 Sebeok, Thomas A., 335 Selinker, Larry, 300, 303, 311 Sells, Peter, 209, 231 Shapere, Dudley, 22 Shieber, Stuart M., 271 fn. 16 Singer, Susan R. → Raven Skandera, Paul → Allerton Skinner, B. F., 105 Slobin, Dan I., 187–8, 269 fn. 2 Smith, Henry Lee → Trager Smith, Neil, 49, 132, 278, 325 Sommerfelt, Alf → Mohrmann Sorace, Antonella, 304 Sperber, Dan, 325–6, 331 fn. 25; see also Wilson Spurr, David, 339 Stengers, Isabelle → Bensaude-Vincent Stevenson, Patrick → Barbour Stockwell, Robert P., 210, 269 fn. 6 Stowell, Timothy A., 207 Strazny, Philipp, 339, 345, 347 Sun, Chaofen, 277 Suppes, Patrick → Nagel Sutton-Spence, Rachel, 114, 272 fn. 28 Svartvik, Jan → Quirk Tarski, Alfred → Nagel Taylor, C.W., 343 Taylor, Stuart Ross, 9 Thagard, Paul, 9, 14, 38 fn. 6 Thain, M., 329 fn. 4 Thomason, Richmond H., 214 Thorne, James Peter, 29, 129–30, 156, 175 Trager, George L., 131, 152, 162, 179–80 fn. 3; see also Bloch & Trager (1942) Tschichold, Cornelia → Allerton, Spurr Uriagereka, Juan, 40, 109, 274; see also Martin Uyechi, Linda, 329–30 fn. 9 Valian, Virginia, 285 Verein Ernst Mach, 20 Voegelin, Charles F., 129, 177 Vogel, Ralf → Broekhuis Wall, Robert E. → Dowty
Ward, Gregory → Horn Wasow, Thomas → Sag Weber-Fox, Christine M., 309 Weiss, Albert P., 160–1, 174 Wexler, Kenneth & Culicover (1980), 289 & Manzini (1987), 286–7 (1991), 285, 329 fn. 6 (1999), 293–5, 297–8 see also Borer Whatmough, Joshua → Mohrmann White, Lydia, 305–6, 311, 313–5, 330 fn. 16 Wiesel, Torsten N., 290–1 Wiggins, David, 281 Williams, Edwin → van Riemsdijk, Roeper Wilson, Deirdre, 325; see also Sperber Woll, Bencie → Sutton-Spence Woods, William A., 189 Woolgar, Steve, 14, 21, 24 van der Wurff, Wim → Fischer Yamada, Jeni E., 48 Zahar, Elie G., 22 Zeilinger, Anton, 38 fn. 8 Zushi, Mihoko → Fukui
Subject index References to definitions or central explanations are in bold. References to summary sections in bold italics. Abbreviations appear in their alphabetical position with a reference to the full term. ability (language as an ~) 43–5, 78, 111, 125 fn. 10, 197–8, 256, 259, 302, 307 absolutism 21–2, 154, 202 Académie Française 280 acceptability 55, 64, 142, 144, 163, 214, 270 fn. 7 accident (history of science) 9 adaptation 114–5, 119, 257, 259, 262–4, 267, 272 fn. 26 AG/PRO parameter 329 fn. 5 agreement 97, 104, 237, 271 fn. 16 & 21 ambiguity of grammar 67–8, 74, 86, 125 fn. 7, 191, 214 American Sign Language (ASL) 293, 329–30 fn. 9 American structuralism 131–2, 138, 156, 179 fn. 2, 269 fn. 3 analysis procedure 133, 165 Anaximandros 17 animal communication 114–6 anomaly 21, 23, 26–7, 149, 171–2 APG → Arc-Pair Grammar aphasia 48, 71, 91, 125 fn. 9, 252, 278 Applied Linguistics 157, 303, 330 fn. 15 arbitrariness of language 136–7 architecture 245–6, 249–51, 254–7, 259, 262, 264, 266, 267, 268, 271 fn. 22, 272 fn. 29 Arc-Pair Grammar 210, 218, 221, 223, 230, 269 fn. 6 Aristotle 132 ASL → American Sign Language astronomy 6–7, 12, 16, 26, 110, 132–3, 142, 148, 157, 178, 234–5, 243, 280 Augmented Transition Network (ATN) 189 authority (criterion) 51, 60, 145 authority (person) 24, 123, 233, 264 autonomy 157, 325–6, 327, 328 auxiliaries 92, 96–7, 207 auxiliary hypothesis 38 fn. 6 barrier 107 bee dancing 115–6 behaviourism 105, 181 fn. 18
bilingual language acquisition 300, 313 Binding Theory 205 biology 112, 120, 229, 239, 253–4, 278, 297 bird communication 115–6, 261 biuniqueness 167 Bloch, Bernhard 131, 177 Bloomfield School 130 Bloomfieldian linguistics 130–1 Boas, Franz 131, 179 fn. 2 bounding node 311 brain 56, 67–8, 118, 123, 170–1, 259 brain injury 44–5 brain research 62–4, 125 fn. 5 brain science 111–3, 292, 307, 309 Bresnan, Joan 184, 269 fn. 5, 271 fn. 15 bridge theory 62, 202–3 calibration 60, 62–3, 113 Case 98, 119, 126 fn. 21, 219–20 Case filter 98–100 catastrophic change 321–3 catastrophic emergence 258–9, 263 central embedding 43–4, 46, 55, 80 chain 98 characteristic function 192 Chelsea 308 chemistry 112–3, 120, 178 child grammar 90–1, 282 chimpanzees 115, 117, 255 Chinese 275–8, 307, 309, 311, 331 fn. 21 Chomskyan linguistics 39–127, 40, 156–175, 196–208, 217–229, 258–266 Chomskyan revolution 129–30, 175–179 circularity 151–2, 155, 164 classification 81, 144, 164 Clever Hans 127 fn. 29 click experiment 60–3, 64, 188 cognition 12, 93, 312 coindexation 98, 225–6 communication 78–9, 81, 111, 114, 125 fn. 12, 127 fn. 29, 152, 155, 160–2, 236, 257, 262–3, 275, 324–7 communication system 114–6 communicative competence 47, 327
community norm 135, 138 Comp/CP 206–7, 271 fn. 21 comparative method 331 fn. 24 competence 42, 53, 55, 65–6, 81, 162, 170, 174, 176, 186, 190, 195, 201, 208, 213, 240, 252, 256, 302–3 competence grammar 190–1, 195, 240, 256, 271 fn. 17 Competence Hypothesis 186, 188–91, 196–8, 209 competition between paradigms 17, 29–30 computational linguistics 232, 241, 244, 271 fn. 14 computational procedure 44, 91 Conceptual Structure 247–8, 251, 255, 267 conceptual-intentional system 115–6, 124 constraints (grammatical) 55, 88, 98–9, 102, 108, 123, 205, 224–5, 237–40, 257 construction 99, 102 context-free grammar 212, 225 contraction 219–23 contrastive analysis 303 coordination 212, 220 Copernican revolution 129, 160 Copernicus, Nicolaus 16, 148, 157, 160 core grammar/language 91–2, 94, 100, 106, 225, 227, 279–80, 292–3, 313 corpus 57–60, 64, 76, 139–43, 153, 155, 163, 169, 172, 214, 321 correspondence rules 247–8, 250, 271 fn. 23 counterexample 23, 56, 81, 102, 172, 217, 222 creationism 38 fn. 6 creative aspect of language 41, 43, 80–1, 93, 199 creativity 79, 114, 159, 192 creoles 79, 125 fn. 11 crisis 26–8, 32–3, 36, 176–7, 188–9, 197, 208–9, 210–2, 218, 230, 231–4, 244, 246, 249, 267 critical period 91, 290, 292–3, 299, 307–9, 312, 316, 316 crucial experiment 20, 26, 28 cues 319, 321, 323 current adaptation 263 current utility 115, 263 Danish 100, 278–80, 314, 320–1 Darwin, Charles 114, 258 data 6, 53, 63, 64, 73, 88, 139, 163, 171–3, 195, 200, 215, 217, 298 deaf 114, 292–3, 308
decision procedure 76, 165–6 declarative 84, 240 deep structure 96–7, 99, 103, 126 fn. 19, 187, 204, 212, 246, 254 default 286 defective verbs 142 Degree-0 learnability 289, 299, 320 degree-n complexity 289, 291 deism 310 demarcation 20 deontic 135, 180 fn. 6 derivation 69, 99, 108, 119, 187, 196, 204, 250 Derivational Theory of Complexity (DTC) 187–8, 204 derived categories 225 Descartes, René 110, 161 descriptive adequacy 66, 88–9, 94, 108, 170, 172–3, 230 descriptive linguistics 49, 134, 139 diachronic linguistics 318 dialect 92, 139–40, 180 fn. 8, 220, 276–8, 317, 322 diglossia 322, 323 disciplinary matrix 15–6, 18, 34, 95 discovery procedure 76, 165 discovery process 9, 14, 32, 61, 178, 207, 228 discrete infinity 116–7, 124, 260 distribution 144–5, 152, 154, 157, 165, 326 dogmatism 168 D-structure 97–9, 108, 119, 127 fn. 30, 223 DTC → Derivational Theory of Complexity Dutch 92, 100–1, 277, 279–80, 288, 311, 321 dynamic aphasia 71, 125 fn. 9 Early Modern English 320–1 economy 107, 127 fn. 24, 170, 172 economy principles 122 effectively computable 192 E-language 42, 49–52, 53, 55, 58, 68, 72, 82, 162, 190, 213, 238–9, 251, 274, 289, 302–3, 318 electroencephalic research 292 elegance 121, 147, 171, 217, 328 elicitation 141–2, 155, 163 empirical adequacy 171, 253–4 empirical cycle 6–11, 14–6, 18–20, 23, 25, 34–5, 36–7, 65, 145, 164–6, 178, 179, 215, 222, 235, 243, 244 empirical law 8–11, 16, 35, 65
empirical science 215–30, 234–5, 239, 243, 244 empty category 219, 274, 286, 311 Empty Category Principle (ECP) 306 English 56, 58, 99–101, 103, 109, 238, 274–6, 278–81, 284–6, 288–9, 306–7, 314–5, 320–3 epistemic 180 fn. 6 ERP → Event-Related brain Potential error 46, 61, 63, 70, 113, 285, 311, 314–5 ether 20–1 ethnographic approach to science 14 evaluation 7, 153 evaluation criteria 17, 35–6, 148, 165, 171, 200, 232, 266 evaluation of grammars 75–6, 148, 153–4, 171, 209, 212, 214, 240 evaluation procedure 25, 76–7, 166 Event-Related brain Potential (ERP) 63, 309 evidence 63, 201, 298 evolution 111, 114–5, 118–9, 123–4, 257–66, 267 exaptation 114–5, 117, 119, 257, 259 exemplar 15 exercise hypothesis 308–9 experiment 12, 20, 54, 60, 62, 64, 145, 178, 187, 269 fn. 2, 291, 307, 311 experimental law 8 explanation 6, 35, 74, 87, 164, 211, 214–5, 218, 221, 229, 230, 241, 244 explanation (depth of) 13–4, 23, 37, 75 explanatory adequacy 88–9, 94, 108, 121, 171–3, 230 expletive pronoun 284, 286 Extended Standard Theory (EST) 246, 271 fn. 20 extension vs. intension 137, 142–3, 214 external evidence 10–1, 213, 296 face recognition 112 faculty of language (FL) 85–6, 114, 259–60 faculty of language, broad/narrow sense (FLB/FLN) 115–6, 118–9, 124, 257, 259–63, 266 falsifiability 20–2, 33, 101 fashion 24, 26, 32, 318 feature structure 232, 235, 237–8, 240–2, 266, 268, 271 fn. 16, 272 fn. 29 Fermat, Pierre de 175 Fidditch 144, 180 fn. 11 Fiengo, Robert 270 fn. 10 final state (of language faculty) 77, 296
finite capacity 192 finiteness 44, 140–1, 224–7 first language acquisition 281–99, 299, 301, 305, 310, 317 FL → faculty of language, language faculty FLB/FLN → faculty of language, broad/narrow sense fMRI → functional magnetic resonance imaging foreign language 300, 304, 330 fn. 12 formal science 215, 217, 222, 226, 230, 239, 241, 268, 328 formal universals 81 formalism 157, 183–4, 189, 218, 225–6, 232–3, 328 formalisation 157, 210–3, 218, 224, 227–9, 230, 232–3, 236, 240, 242, 244 fossilisation 301, 330 fn. 14 frame of reference 150 France 24–5 French 72, 92, 142, 163, 274, 280 frequency 141, 163, 322 f-structure 204 Full Access Hypothesis 310, 312–3, 315–6, 317 function of language 114, 263, 326 functional heads 207, 271 fn. 21 functional magnetic resonance imaging (fMRI) 62, 125 fn. 5, 309 functional origin 263 functionalism 157, 183, 185 Fundamental Difference Hypothesis 305, 310, 312–3, 317 Galileo 132 Gazdar, Gerald 269 fn. 5, 270 fn. 7 GB → Government & Binding general problem-solving 301, 305–6 generalization 8–11, 13, 35, 36, 65, 73, 132, 158, 211, 234, 326 Generalized Phrase Structure Grammar (GPSG) 209–230, 230, 231, 241–2, 244, 252, 268 generate 69, 96 generative grammar 40, 60, 69, 95, 124 fn. 1, 185, 210–2, 218, 232, 253 generative linguistics 183–5, 251 Generative Semantics 246, 271 fn. 18–9, 326 genetic code 78, 82, 87 Genie 91, 291–2, 308 geography 68–9
German 92, 238, 277, 279–80, 314, 320–2 God’s truth hypothesis 145–8, 155, 159, 174 government 222 Government & Binding (GB) theory 106–9, 120–1, 123, 126 fn. 18, 183, 204–5, 209–10, 221, 223, 226–7, 232, 242–3, 271 fn. 20 GPSG → Generalized Phrase Structure Grammar gradual change (of a language) 321–3, 324 grammar 51, 65–6, 73, 74, 76, 81, 142–3, 153, 155, 176, 179, 186, 189, 191–2, 195, 200, 209, 211, 213–4, 217, 226, 230, 235–8, 240, 242, 244, 270 fn. 8 grammatical competence 42, 46–49, 52, 53, 69, 201, 324, 327, 328 grammatical framework 215–7, 230, 241 grammatical function 204 grammatical relations 191, 193–4 grammaticality 55, 58–9, 64, 142, 193, 211, 213, 217, 270 fn. 7 grammaticality judgement 54–7, 62, 64, 77, 134, 159, 163, 174, 200, 218, 240, 304 gravity 13, 23, 110, 160, 179 Grice, H.P. 325 habit (linguistic) 138, 155, 181 fn. 18 habits of mind 35 Halley’s comet 8–10 Harris, Zellig S. 49, 131, 155, 158 Head-Driven Phrase Structure Grammar (HPSG) 231–243, 244, 252, 268, 269 fn. 1, 271 fn. 14, 272 fn. 29, 328 Herakleitos 17 heuristic model 16 heuristics 23, 35, 113, 122, 153, 265, 325 historical linguistics 130, 182 fn. 22, 317–8, 331 fn. 24 historical precedence 11, 206–7 Hockett, Charles F., 131, 155 hocus-pocus hypothesis 145–8, 155, 159 home sign 125 fn. 11 homogeneous speech community 70–2, 75, 91, 139, 319 HPSG → Head-Driven Phrase Structure Grammar human legs 263 Huygens, Christiaan 16, 175 hypothesis 7, 20–1, 178, 179
Icelandic 288 ideal speaker-listener 70 idealization 65–6, 70–2, 75, 90–3, 94, 193, 199, 282, 296 idiolect 138, 162 I-language 42, 44, 49–52, 53, 57, 67–8, 74, 82–5, 94, 191, 238–9, 251–2, 275, 281, 284, 299, 302–3, 316, 318–9, 323 implicational universal 126 fn. 14 incommensurability 17–8, 33, 166–74, 175, 176, 202, 205, 208, 209, 218, 223, 230, 230, 232–4, 262–6, 267 Incompleteness Hypothesis 310, 313–4, 317 indeterminacy of grammars 73–4, 75, 75–6, 101, 107, 109, 118, 149, 164–6, 172, 197, 199–200, 208, 212, 226 individual language 41, 133, 143 infinity → creativity, discrete infinity, finiteness Infl/IP 97–8, 206–7, 246, 271 fn. 21 informant 54, 59–60, 72, 134–5, 139, 141–2, 163 initial state (S0) 84–6, 94, 108, 283, 293–4, 296, 309–10, 315 innateness 102, 105, 209, 252, 259, 305 instantaneous language acquisition 90–1, 94, 282 interface rules/components 250 interfaces 99, 118–20, 122, 124 interlanguage 303–6, 308–9, 311, 314–5, 316 International Congress of Linguists (ICL) 103, 130, 177 interpretation procedure 195, 199–200 interpretive rules 212, 249–50, 272 fn. 25 Interpretive Semantics 246–7 introspection 54, 60 intuition 56–7, 59–60, 64, 66, 74, 88, 104–5, 172–3, 214–8, 230, 301, 303 isomorphy 191, 195, 208–9 Italian 92, 103, 109, 274, 277, 284–9, 311 iteration 260 Jackendoff, Ray 245–267, 267, 268, 328 Jones, William 182 fn. 22 judgement fatigue 60 Kaplan, Ronald M., 184, 271 fn. 15 Kepler, Johannes 13, 16, 22–3, 148, 158 kernel sentence 103, 187–8
knowledge 6–7, 9, 11–2, 41–8, 45, 53 Korean 307, 311 Kuhn, Thomas S., 14 Kuhn loss 27 LAD → Language Acquisition Device landing site 311, 331 fn. 20 language 40, 42, 50, 67, 137, 143, 155, 162, 213, 230, 236, 238–40, 270 fn. 8, 318 language acquisition 73, 75–92, 93, 100, 105, 106, 124 fn. 3, 148, 181 fn. 18, 194, 197, 199, 202–5, 212, 225, 239, 281–324, 328 Language Acquisition Device (LAD) 83–4, 86, 104 language change 317–22, 323–4, 327 language faculty 83–91, 93, 118, 158, 239, 252, 291, 294, 299, 305 language learning 78 language organ 85–6, 112 language processing 45, 70, 74, 136, 187–8, 191–2, 197, 199, 201, 209, 232, 240–1, 244, 255, 267, 268, 328 language system 136–7 language use 78–9, 81, 93, 186–8, 196–9, 324 language, boundaries of 133–4, 151, 276–7, 281, 329 fn. 1 last resort 108, 123 learnability 81, 88, 94, 109, 118, 176, 179, 199, 203, 209, 223–4, 226–7 level of analysis 151–2 levels of adequacy 88–9, 121, 126 fn. 17, 167, 169, 182 fn. 21 Lexical-Functional Grammar (LFG) 184–208, 208–9, 241, 256, 269 fn. 1 & 5, 271 fn. 14, 328 Lexicalist Hypothesis 206, 246 lexicon 44, 91–2, 100, 103–4, 114, 226–8, 250, 328 LF → Logical Form LFG → Lexical-Functional Grammar light (wave vs. particle) 16–7, 20–1, 27–8, 177 lingua franca 233 linguistic types 236–7 Linguistic Wars 246, 271 fn. 18 loanwords 323 Logical Form (LF) 99, 108, 119, 122, 331 fn. 21 logical positivism 20, 33 logical problem of language acquisition 90–1, 203, 282, 304–5, 316, 324
long-distance dependencies 212, 310 markedness 286 Mars 7–8, 13, 37 fn. 2, 42 mathematics 175, 214–5, 217, 230 maturation 290, 295, 297–8 Maturation Hypothesis (MH) 295–7, 299, 313–4 maturational state hypothesis 308 mature I-language 84, 284 mature state 90, 222 mechanics 132 memory 43, 45–6, 116 mental organ 85, 111–2, 117, 213, 257 mentalism 111, 150–2, 155, 160–2, 171, 173, 174, 176, 229, 230, 236–8, 242, 244, 252, 268, 271 fn. 17 merge 264 metaphysical paradigms 15–6, 35 methodological minimalism 121 Michelson-Morley experiment 20–1, 26 microlinguistic analysis 152 Middle English 320–1 mind/body problem 110, 161 mind/brain 49, 57, 83, 111, 161, 254 Minimalist Program (MP) 106–123, 123–4, 165, 256, 264–5, 297 mixing levels 151–2, 155, 165 model 35, 120, 152, 155, 234 model of description 150 molecules 16, 35, 215 monotonic 240 Montague Grammar 210, 212, 214, 218, 230, 269 fn. 6 morphology 151–2, 181 fn. 14, 246 move α 98–9, 102, 108, 158, 205 MP → Minimalist Program music 116, 261, 272 fn. 27 mutual intelligibility 275–9 mysteries (vs. problems) 80, 140, 161–2, 198–9 named language 72, 99, 143, 274–80, 281, 284 narrow syntax 118–9, 124, 260 Natani, K. 329 fn. 8 native speaker 65–6, 70, 104–5, 144–5, 172–3, 190, 197, 214, 293, 302–5, 316 Natural Language Processing 232
natural selection 115, 256–8 naturalistic data 53–4, 60, 64 negative evidence 77, 285, 288, 301 neural system 111, 290, 307 neuroanatomical experiments 62–3, 64 neuroscience 110–1, 123, 125 fn. 5, 254, 290, 292, 308 Newton, Isaac 13, 15–6, 23, 27, 110, 132, 160, 179 non-uniqueness 147, 149–50, 152–4, 155, 164–5, 173–4, 175, 176–9 norm 135, 302–4, 330 fn. 14 normal science 24, 27–8, 36, 132, 149 number faculty 112, 117, 127 fn. 29 numeration 122 object language 216–7 objectivity 11–2, 56, 133 observable facts 65–6, 87 observation 6–12, 20–3, 36, 65–6, 235 observational adequacy 88–9, 94, 167, 169–72 Occam’s razor 24 open-endedness 159 optics 16, 28–9 optionality 321–3 order-free composition 193 origin of language → evolution ostensive-inferential communication 326 oxygen, discovery of 32, 207 P&P → Principles and Parameters paradigm 15–9, 36, 95, 175, 233 paradigm change 32, 107 parallel architecture 250, 254–6, 266, 267, 272 fn. 29 parameter setting 282–9, 293, 299, 323 parameters 92, 100–1, 103, 106, 109, 225, 227, 280, 281, 283 parasitic gaps 57, 59–60, 77, 304 parrots 115–7 parsing 80–1, 84, 93, 212–3 Partial Access Hypothesis 310, 313 particle physics 29 particular language → named language Pascal, Blaise 175 passive 96–102, 170, 187, 254, 288 perception 12, 261 perception problem 79–81, 93, 125–6 fn. 13
perfection 122, 124, 263, 265 performance 42, 45–6, 50, 53, 55, 58, 190, 196, 302, 303, 318, 323 performance error 315 performance model 45 periphery 91–2, 94, 100, 279–80, 291–3, 318 PET → Positron Emission Tomography PF → Phonetic Form phenomena 234 (see also observation) Phonetic Form (PF) 99, 108, 119, 122 phonology 169–71, 178, 219–20, 250, 260–2, 272 fn. 25, 329 fn. 9 phrase structure 103, 157, 204, 223–8, 246, 248, 271 fn. 21 physics 31, 60, 68, 110–3, 132, 178, 215, 236, 278 pidgins 79 Pike, Kenneth L. 152, 181 fn. 15 Pisa lectures 103 Plato’s problem 77–8 PLD → Primary Linguistic Data positive evidence 77 Positron Emission Tomography (PET) 62, 125 fn. 5 possible human language 73, 82, 190, 224 possible linguistic entities 235 Post-Bloomfieldian linguistics 130–54, 155, 156–173, 174–5, 177–9, 269 fn. 3 postulates 133, 137–8 poverty of the stimulus 77–8, 87, 93, 282, 285, 319 pragmatic anaphora 247 pragmatic competence 42, 46–9, 52, 53, 324 pragmatics 325–6, 327, 328 Prague School 156–7 prediction 6–9, 13, 20, 36, 61, 134–5, 144–5, 200, 204, 234 pre-paradigmatic period 17 prescriptive 144, 153, 180 fn. 11 presentational there 54, 65–6 primary linguistic data (PLD) 73, 83–4, 92, 94, 200, 289, 318–22, 323 primate communication 116 primates 115, 255 prime numbers 215 principles 40, 78, 88, 98, 100–2, 106, 119, 122, 190, 210–1, 221, 283 Principles and Parameters (P&P) 95–103, 106, 123, 183, 282–5
priority claims 208 PRO 204, 219, 221, 274, 276 pro 284, 286–8 probability calculus 175 processing → language processing pro-drop 284–9, 306, 329 fn. 5 production problem 79–81, 93, 125–6 fn. 13, 140 productivity 153–4, 158–9, 192 progress 5, 22–7, 23, 37, 74, 102, 175, 179 progress through revolutions 27, 32, 175 proof 14, 21–2, 215, 226, 289 psycholinguistic experiment 60–3, 64, 187–8, 200–2, 208–9 psychological reality 68–9, 145, 181 fn. 18, 186, 188, 199, 208, 256 psychology 49, 104–5, 120, 160, 213, 230, 253 Ptolemaic paradigm 16, 157–8 puzzle-solving 21, 305 quantum theory 112, 236 range of analysis 151 rationality 3, 11, 17, 26–8, 76, 109, 202, 224 realism 68, 146–8, 152, 158–9, 229 reconstruction 317, 331 fn. 24 recursion 44, 103, 116, 118–9, 124, 192, 257, 259–62, 264, 272 fn. 27 redundancy 265 relativism 14, 21–2, 24, 32 relativised descriptive/explanatory adequacy 121 Relevance Theory 325–6, 327 reliability 192–4 representation 69–70, 83, 97, 119 representational basis 191, 195, 200 research programme 1–2, 5–19, 18, 29–36, 37, 39, 65, 75, 85–6, 95, 106, 120, 129, 153, 156, 175, 233, 243, 264, 268 research tradition 18, 232–3 respiration 116 reverse engineering 263 reversibility 188 revolution 24–30, 32–3, 37, 95, 107, 123, 129, 156, 160, 175–8, 179 rewrite rules 96, 98–9, 106, 212, 230, 289 Romance 277–8 Ross, John R. 327
Royal Society of London 225 saltational emergence 114–5 Sapir, Edward 131, 156, 179 fn. 2 Saussure, Ferdinand de 156, 182 fn. 22 S0 → initial state SCH → Strong Continuity Hypothesis school 17, 105, 130–1 scientific community 18, 26–8, 34, 36 scientific process 6–9, 25, 164–5 Searle, John 125 fn. 12 second language acquisition (SLA) 278, 300–15, 316–7, 327–8 selectional restrictions 104, 127 fn. 23 semantic interpretation 97, 99, 246–7 semantics 191, 211–2, 246–9, 254–5, 268 semiotics 214 sensory-motor system 115–7, 247 sentence 134, 270 fn. 8 Serbo-Croatian 278 shared knowledge 236–9 Shurley, J.T. 329 fn. 7 sign language 114, 125 fn. 11, 180 fn. 7, 272 fn. 28, 292–3, 329 fn. 8–9 Simpler Syntax Hypothesis 251 simplicity 24, 38 fn. 6, 74, 107, 127 fn. 24, 147, 149, 171, 200–1, 226, 228, 253, 328 situation semantics 236 Skåne 278 skill (linguistic ~) 44, 90, 292–3 SLA → Second Language Acquisition slash feature 212, 225–6 Smeets, Jeroen 331 fn. 19 Smith, Henry Lee 131 SMT → Strong Minimalist Thesis social science 30–1 sociolinguistics 45–6 software design 70 Sokal affair 14 sort 237–8 sorted feature structure 235 sort-resolved 237–8 species (terminological problem) 278, 329 fn. 4 speech (actual ~) 140 (see also speech signal) speech community 72, 133–9, 155, 162, 236, 238, 308, 316, 322
speech organs 114 speech perception 61, 127 fn. 25, 136, 147, 188, 255 speech production 127 fn. 25, 136, 188, 196 speech signal 136, 155, 161 Split Infl 271 fn. 21 SS → stable state, steady state S-structure 97–9, 108, 119, 219, 265, 331 fn. 21 ST → Standard Theory stable state (SS) 283, 292, 294, 296, 308, 318, 323 standard language 280 (see also named language) Standard Theory (ST) 95–6, 99, 101–3, 106, 223, 248, 289 statistics 56, 140 steady state (SS) 84, 94 stimulus-response model 135, 155 Strong Continuity Hypothesis (SCH) 293–4, 297–8, 299, 312–3, 317 Strong Minimalist Thesis (SMT) 119, 121–2 Strong Programme 24, 32 structural description 69, 104 structured expressions 126 fn. 16 subcategorization 104, 127 fn. 23 subjacency 109, 310–1, 315, 331 fn. 20–1 subject-object asymmetries 306 Subset Condition 287, 299 Subset Principle 286–8, 299, 329 fn. 6 substantive minimalism 121, 124 substantive universals 81–2 sunset 278 Surface Structure 96–7, 254 SVO word order 216, 314, 320–1 Swedish 100, 278, 280, 329 fn. 3 symbol (linguistic ~) 136–7, 155 symbolic generalisations 15–7, 35–6 syntactic mapping problem 191–2 syntactic space (in signing) 272 fn. 28 syntactocentric architecture 249, 253–7, 267 syntagmatic relations 157 taxonomic grammar 139, 144 tension between description and explanation 87–9, 94, 109, 118, 200, 230 terminology 67, 150, 261, 274, 277 test 6, 60, 65, 74, 85–7, 144–5, 164, 179, 200 textbook 30–1, 184, 243, 269 fn. 1 TGG → Transformational Generative Grammar
Thales 17 theism 310 thematic roles 63, 98 theorem 224, 226–8 theory 6–11, 36, 40, 65, 68, 73, 149, 165, 172, 235–6 theory of grammar 73, 189–90 thermodynamics 35–6 theta criterion 98–9 Third Language Acquisition 330 fn. 13 tiers (phonological ~) 272 fn. 25 topographic space (in signing) 272 fn. 28 trace theory (TT) 69, 98–9, 219, 221, 225, 270 fn. 10 traditional grammar 66, 74, 137, 157 transfer 302–3, 306, 311, 314–5 transformation 96–9, 106, 157–8, 170, 185, 187–8, 204–5, 212 Transformational Generative Grammar (TGG) 156, 226 transformational grammar 185, 202, 226, 269 fn. 6 trigger (movement) 108 trigger (language acquisition) 91, 112, 285, 287, 314, 317 truth 19–23, 32, 37, 68 typology 81, 126 fn. 14, 185 UG → universal grammar UG-constrained maturation 297 unbounded dependency → long-distance dependencies unification (of feature structures) 232, 242, 266, 268, 272 fn. 29 unification (of sciences) 110, 112–3, 124, 202, 254, 297 uniformity of UG 91, 112 universal assent 17, 175 universal grammar (UG) 77–8, 82, 85–6, 94, 185, 189–90, 195, 209, 220–1, 236–7, 239, 252, 254, 294, 330 fn. 10 universal interpretation procedure 199 universality constraint 193–5 universals 81–2, 94, 126 fn. 14, 158, 216–7, 240–1 utterance 51, 57–8, 133–45, 153, 155, 163, 180 fn. 5 V2 → verb-second values (as a component of the paradigm) 15–6, 25, 35–6
variability of judgements 172, 330 fn. 16 variability of use 322, 323 varieties of English 281, 285, 322–3 Venus 10 verb second (V2) 279–80, 314, 320–3 Verner’s Law 182 fn. 22 vision 111, 247, 261–3, 278, 290, 309 wanna-contraction 219–22 Wasow, Thomas 270 fn. 10 weather verbs 284, 288 well-formedness 51, 134, 193 Wells, Rulon 131 well-typed 237–8 Welsh 300 wh-movement 219, 310–1, 331 fn. 21 wh-questions 109, 292 Wiener Kreis (Vienna Circle) 20–1 wild grammar 306, 314 wings (evolution of) 115 Woodger, Joseph Henry 229 word order 216, 280, 314, 320–1, 329 fn. 2 writing 78–9, 292 X-bar theory 98, 102–3, 119, 206–7, 223–8, 246, 271 fn. 21 X-rays 26 Young, Thomas 27, 177