Representation of Language: Philosophical Issues in a Chomskyan Linguistics
ISBN: 019885563X, 9780198855637

This book is a defense of a Chomskyan conception of language against philosophical objections that have been raised against it.


English, 480 pages, 2020







“Rey’s volume is both an excellent introduction to the foundations of generative linguistics and perhaps the most sustained contemporary engagement with the many philosophical nuances of Chomsky’s own position. Where Rey agrees with Chomsky, he offers novel considerations in support, and where he disagrees, his points are unfailingly insightful and well-informed. It is an utterly admirable volume, from which the student and the expert have much to learn.” - Professor John Collins, Professor of Philosophy, University of East Anglia

“This long-awaited book distills years of deep engagement with the most important foundational and philosophical questions raised by linguistics and the cognitive sciences. Overflowing with sharp observations and original ideas, it’s a must read for philosophers and scientists alike interested in innateness, intentionality, computational explanation, and methodology in the mind-brain sciences.” - Professor Steven Gross, Professor of Philosophy, Johns Hopkins University

“In this book, Georges Rey offers a solid and thought-provoking discussion of a range of foundational issues in Chomskyan linguistics. The book will influence and engage linguists and philosophers alike, hopefully enriching cross-disciplinary conversations in the future.” - Terje Lohndal, Dept of Language and Literature, Norwegian University of Science and Technology & The Arctic University of Norway

“In this rollicking ride through the philosophy of Chomskian linguistics, Georges Rey argues for the centrality of the notion of representation, and makes a startling proposal for what it is that the claims of linguistic theories represent, and what the wider implications are for psychology more generally. Original, intellectually fun, and, for a linguist, enjoyably contentious, this is a great read.” - David Adger, Professor of Linguistics, Queen Mary University of London

“This book is vital reading for anyone interested in Chomsky’s revolution in linguistics and what it has meant for our understanding of the human mind. It is both an invaluable guide to the impact of Chomsky’s work on philosophy and a very considerable original contribution to the field. It is rigorous (Rey has done his homework on the linguistics thoroughly), fair-minded (to Chomsky and to his critics), and contentious. Rey writes vividly and has a gift for going to the essence of an argument. Particular highlights include Rey’s masterful compilation of evidence for an underlying system of grammatical competence that is not driven by communicative concerns or social conventions, and his excellent chapter on poverty of the stimulus arguments. This advances the state of the art considerably by clearly distinguishing between empirical problems and indeterminacy arguments, and highlighting a deep problem for empiricist accounts: how could children infer (un)grammaticality, a modal attribute, merely from statistical observations of what people say?” - Nicholas Allott, Department of Literature, Area Studies and European Languages, University of Oslo & University College London


Representation of Language
Philosophical Issues in a Chomskyan Linguistics

Georges Rey



Great Clarendon Street, Oxford, OX2 6DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries.

© Georges Rey 2020

The moral rights of the author have been asserted

First Edition published in 2020
Impression: 1

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above. You must not circulate this work in any other form and you must impose this same condition on any acquirer.

Published in the United States of America by Oxford University Press, 198 Madison Avenue, New York, NY 10016, United States of America

British Library Cataloguing in Publication Data
Data available

Library of Congress Control Number: 2020933786

ISBN 978–0–19–885563–7
DOI: 10.1093/oso/9780198855637.001.0001

Printed and bound in Great Britain by Clays Ltd, Elcograf S.p.A.

Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.


In memory of Jerry Fodor (1938–2017)
intrepid intentionalist, mentor, friend, and much missed wit.


Acknowledgments

I first of all need to thank the US-Norway Fulbright Foundation for Educational Exchange for a research grant that supported this work for a year at the Center for the Study of Mind and Nature (CSMN) at the University of Oslo, and the Center, especially, for its generous hospitality before, during, and after that year. Much of my time there was spent in extended discussions with Nick Allott about the third edition of Neil Smith’s Chomsky: Ideas and Ideals, which he was co-authoring (Smith and Allott 2016). Those discussions were some of the most fruitful I’ve ever had. Nick served as both an insightful expositor of Chomsky’s views and a wonderfully cooperative and imaginative interlocutor in response to the various problems I raised. The many points at which I thank him for specific advice are a tiny fraction of all I have gained from him. Others at CSMN, particularly Carsten Hansen and Jan Terje Faarlund, also commented frequently and helpfully on drafts, as did Terje Lohndal, at the Norwegian University of Science and Technology (NTNU) in Trondheim.

Another invaluable interlocutor has been John Collins of the University of East Anglia (UEA). John originally wrote a sharply critical review of my 2003a, which occasioned an extended exchange in which he forcefully defended what he calls Chomsky’s “meta-scientific eliminativism” about intentionality against my objections. We also became fast friends, and, although, as will be evident, we still don’t quite agree about this and other issues (although see Collins and Rey, forthcoming!), our discussions have seemed to me a model of always deeply amicable and respectful contention. I should also thank him for suggesting the final title of the book, with which I had struggled mightily for many weeks.

There’s also Michael Devitt. Despite our nearly five decades of friendship and shared commitment to a general “naturalized epistemology” and “scientific realism” that we both inherited from our teachers Putnam and Quine, he and I have been at loggerheads about Chomsky’s work since the early 1990s. We’ve had innumerable exchanges at conferences, on buses and trains, on email and at his splendid Hudson home, from all of which I’ve benefited greatly.


Various portions of this material have been presented at the CSMN, the NTNU, the UEA, the University of Maryland, and at a number of conferences in America and Europe, several of the latter hosted by Djuna Timorović and Nenad Miščević at the Inter-University Centre in Dubrovnik. In addition, each summer from 2016 through 2019 Nick, Terje, Carsten, and I ran a summer institute on language and mind that was handsomely funded by CSMN, the Norwegian National Graduate School in Linguistics and Philology, the University of Maryland, and the Department of Philosophy, Classics and History of Art and Ideas (IFIKK) at the University of Oslo, where the institute was held. It involved many prominent linguists and philosophers, and about 40 graduate students each year from around the world. Needless to say, I gained immensely from the many participants on all these occasions, far too numerous to name.

I’m also indebted to my colleagues particularly in linguistics here at Maryland, who have provided me endlessly patient tutelage and advice, especially Howard Lasnik, whose yearly course on the history of generative grammar is legend. Bill Idsardi, Jeff Lidz, Omer Preminger, and Alexander Williams generously participated in a seminar on my ms. in the Spring of 2018, correcting me on a number of points and deepening my understanding of the Chomskyan project. That seminar ran simultaneously with a similar one on the ms. run by Mark Greenberg in philosophy and law at UCLA, which provided further, valuable perspectives. My deep thanks to all of them, as well as to Steven Gross and to my student Andrew Knoll for pressing challenging objections, Dan Blair for correcting me on many technical details, and to Paul Pietroski for immensely useful discussions of his and Chomsky’s views on semantics.

Special acknowledgments are due Peter Momtchiloff at Oxford University Press for shepherding the ms. from the start, and to Kim Allen, who guided me through the final editing with great care, sensitivity, and superhuman patience, as well as to my students Casey Enos, Will Fenton, Chris Masciari, and Michael McCourt for conscientious help with the proofreading.

Thanks are also due to the waitrons of two terrific cafes, Le Pain Quotidien in DuPont Circle in DC, and (the original!) Peet’s Coffee in Berkeley. Their espresso, friendliness, and tolerance of my working long hours over their many shifts were indispensable.

Lastly, of course, I want to express heaps of love and gratitude to my wife, Cynthia Haggard, who has graciously endured my absorption in this material for the last let’s not say how many years.


Preface

For the past seventy years, Noam Chomsky and many other linguists and psychologists have pursued a research program to show that natural human languages are based upon an internal computational system constrained by principles of an innate, universal grammar (“UG”).1 It is a specific “generative” system that makes available to speakers a potential infinity of linguistic structures that determine much of the character of our perception, understanding, and production of speech. The program has raised a large number of foundational issues regarding linguistics and psychology, for example, Rationalism and Empiricism, innateness, computational explanations of mental processes, and the role of representation and intentionality in them and in cognitive science generally.

This program has met with quite volatile reactions. A number of discussions have been either from advocates who sometimes seem to defend virtually everything Chomsky says, or from skeptics who attack almost all of it—some even calling him a fraud!2 A significant number of writers have claimed that Chomsky’s whole approach has been a complete failure. For a recent example, the psychologist Nick Chater (2018) claims that

    the Chomskyan programme has foundered: it turns out that even the structural patterns observed in language . . . are a jumble of inconsistent regularities, sub-regularities and outright exceptions. (Chater 2018: 32)

1  For brevity, throughout this book I will refer to the relevant views as “Chomskyan,” fully aware that the views are not his alone, and evolved as an immense collaborative effort among many linguists, between whom there are many important differences. I think those differences will be largely irrelevant to the foundational issues I will be addressing. Chomsky himself often rightly stresses the importance of not tying a research program to any one individual, but sometimes the individual’s contributions are so rich, distinctive, and synoptic that the proper name is really the only available means of identifying it (cp. “Galilean,” “Newtonian,” “Marxist,” “Freudian”). In any case, it is that distinctive and synoptic view that is the topic of this book, although it is of course by no means Chomsky’s alone. Note that I will not be discussing any of his political views (see “Guide to the Reader”).

2  For examples of the abuse, see, e.g., Baker and Hacker (1984), Botha (1989) and Levine and Postal (2004). Berlinski (1988: 140) reports the logician Richard Montague claiming Chomsky to be one of the “two great frauds in the history of twentieth-century science” (the other being Albert Einstein!).


And in a popular book that received a surprising amount of attention, the linguist Vyvyan Evans (2014: 69) claims that recent research has shown Chomsky’s program to be “completely wrong”.3 He calls attention to the emergence in the last few decades of a large number of cognitive scientists who present themselves as offering an alternative “new paradigm” of language, viz., “functional/construction” grammars, that tie language to its uses and social functions. For example, a leading primatologist and psychologist, Michael Tomasello (2009), has defended this latter approach, and claims by contrast that “Universal grammar is, and has been for some time, a completely empty concept” (2009: 470).

Philosophers have not been shy of entering the fray. In a salient article and exchange, John Searle (2002a,b,c) claimed that Chomsky has abandoned his original project; Patricia Churchland (2013), that

    linguistic universals, long the darlings of theorists, took a drubbing as one by one they fell to the disconfirming data of field linguists. (Churchland 2013: xiii)

And, in an otherwise balanced treatment of the nativism issue in her Stanford Encyclopedia of Philosophy entry, Fiona Cowie (2008: §3.6.3) nevertheless concluded that “our understanding of language acquisition, when it comes, is likely . . . to resemble the theory of the Chomskyan nativist in few or no respects”. Only slightly more moderately, in his recent philosophical study, David Pereplyotchik (2017) claims that:

    it is safe to say that there is, at present, no consensus in the cognitive science community—and certainly none among philosophers—concerning the viability and conceptual coherence of Chomsky’s program. (Pereplyotchik 2017: xvii–xviii; emphasis mine)

Although Michael Devitt (2006a) does not dispute either the nativism or any of the specific grammatical proposals, he does dismiss Chomsky’s central tenet that his theory is about the mind:

3  See the highly rhetorical exchanges in the blogosphere (https://facultyoflanguage.blogspot.com/2015/04/does-lsa-and-its-flagship-journal.html), the favorable reviews in the Times Higher Education and New Scientist, and the recent multiple-author peer commentary in Language, the journal of the Linguistic Society of America. A peculiar conjunction of charges Evans (2014) raises against UG is that it is both “wrong” (p. 69) and “unfalsifiable” (p. 78), the latter being a worry also raised by Hayes (2018: 196). Nick Allott and I addressed this and other issues in our (2017).


    Chomskian views of the place of language in the mind are seriously mistaken and . . . we should be exploring other options. (Devitt 2006a: viii)

I should say from the start that I’m partial to the advocates. I find Chomsky’s fundamental “Galilean” idea about an innate, domain-specific computational system that structures a great deal of our perception and understanding of speech to be one of the best ideas anyone has ever had, not only about language, but about how to think about language, the mind, and psychology generally. This is not because I am in any position to pass judgment on the details of the linguistics.4 I am not remotely a linguist, and am concerned only with general methodological issues that largely do not depend upon the details that standardly occupy linguists. I am particularly concerned to counter the above general dismissals of Chomsky’s work, which seem to me due to serious failures to appreciate not so much its details as its core claims, the substantial evidence for them, and, most importantly, the power of the explanatory strategy he has proposed. Clarifying this situation in sometimes novel ways will be the task of the first two parts of the present study (I summarize the novelties in the “Guide to the Reader”).

There are of course more restricted, technical objections and alternatives to Chomsky’s specific proposals. Some of these alternative research programs, such as “Lexical Functional Grammar,” “Head-driven Phrase Structure Grammar,” and “Combinatory Categorial Grammar,” are within a broadly generative perspective, and the differences between them and Chomsky’s proposals will not be an issue in this book. Others, for example statistical approaches in computational linguistics, are standardly concerned with “artificial intelligence” and machine learning from data from books and newspapers, quite unlike the data any child is likely to encounter. Such approaches do not remotely share Chomsky’s psychological explanatory aims, and so, I will argue, fail to address the crucial data that are distinctive of his project. Similar divergencies of interest underlie the differences between a Chomskyan approach and the approaches of “Cognitive-Functional” and “Constructivist” grammars and many other supposed opponents of Chomsky, whose concern is consistently more “usage based,” that is, concerned with the actual use of language—more than with the idiosyncratic underlying structures that inform and constrain that use.

4  Some idea of the present state of the details will shortly be available in the forthcoming Blackwell Companion to Chomsky, edited by Nick Allott, Terje Lohndal, and myself.


In addition to these differences that could be regarded as ones largely within the discipline of linguistics, there are issues that intersect with other disciplines, specifically, psychology, philosophy, and what has come to be called cognitive science, and, as one moves away from the core to some of the more purely foundational and philosophical issues, I begin to have more sympathy with some of the skeptics. Although Chomsky frequently expresses views about philosophical issues, they have never actually been his central concern, and he sometimes recruits what he takes to be traditional views for his own linguistic purposes, often without noting how those purposes sometimes diverge from their proponents’ original aims. For example, I will argue that his assimilation of his core theory to the views of the Rationalists is somewhat misleading. Although he rightly stresses nativist as opposed to traditional empiricist accounts of language acquisition, this is not because he thinks “knowledge” of language involves a process of a priori reasoning in the way that, for example, Plato and Descartes thought of knowledge of mathematics and geometry. To the contrary, he frequently likens it to non-rational, more purely biological processes of growth and the development of bodily organs. Moreover, far from endorsing any theory of “innate ideas,” Chomsky (2000) turns out to be as sceptical of any serious talk of “ideas” as were the most severe empiricists, like Goodman and Quine, that he ostensibly opposes.

The last issue involves the problem of intentionality, which will be the focus of much of the last part of my discussion. This is the familiar, but curious property whereby an idea or other mental representation is “about” something, or “has meaning” or “content,” in the way that, for example, representations of trees, triangles, ghosts and Zeus clearly have “intentional contents” that “refer to” or “pick out” those things, either real, or (as in the case of ghosts and Zeus) unreal.5 The nineteenth-century German philosopher Franz Brentano (1874/1995) plausibly claimed that intentionality is what is distinctive about the mental, and that it presented quite puzzling logical problems that made it, in his view, “irreducible” to any physical phenomena. These and other problems have led many philosophers and scientists to exclude it from serious science, beginning with the behaviorists a century ago, but continuing to the present day in the work of some connectionists, neuroscientists, and of what I call “normativists,” such as Quine (1960), Davidson (1970), and Dennett (1987), who claim that intentional ascription involves some kinds of evaluative considerations that are not part of serious natural science.

5  The (unfortunate) term “intentional” is three ways homophonous: in addition to the philosophical use that will be important in this book, there is of course the ordinary term “intentional,” meaning, roughly, deliberate; and then the related but different term “intensional” (with an “s” rather than a “t”), which is contrasted with “extensional” (see §3.2 and §8.1 below for discussion).


Chomsky has often rightly deplored this latter “methodological dualism” that treats mind differently from the rest of nature, and, for many of us encountering Chomsky’s work in the 1960s, it seemed indeed to give the lie to it, presenting the first serious prospect of integrating a substantive mentalism and intentionality into the natural sciences. Consequently, we were taken aback by his later claims that

    intentional attribution . . . may turn out to be an interesting pursuit (as literature is), but is not likely to provide explanatory theory or to be integrated into the natural sciences. (Chomsky 2000: 22–3; see also 15, 132, 146)

Such claims seem indistinguishable from the very claims of the “methodological dualists” that he also claims to be opposing! I was sufficiently perplexed by this development that I wrote a long essay (Rey 2003a), to which Chomsky (2003) replied in ways I found even more puzzling, and so I wrote a response (Rey 2003b). Part III of the present book is a substantial expansion of parts of that latter response (omitting some tangential issues raised in the original exchange).

One concern that seems to lead Chomsky to his denial of intentionality is his view that his theory is not about any actual external words or sentences that people take themselves to be hearing in the acoustic stream or reading on the page. Rather, as a chapter of psychology, the theory is about the internal mental “representations” of such things over which the posited explanatory computations are defined, but are not actually “of” anything real in the external world. But even if this is so, and the relevant internal representations are not about actual things, it does not follow that they are not “of” or “about” something. They can very well have meaning, or what philosophers call “intentional content,” as is patently the case with representations “of” Zeus, ghosts, phlogiston, colors, and Euclidean triangles (whose existence is at least disputed). But what then are such representations exactly “about”? Here I will—cautiously—resurrect a term of Brentano’s: they are of, or about, “intentional inexistents,” “objects” of thought, perception, and representation which simply are not real. I will argue that we have substantial reason to think that words, sentences, phrases, phonemes, and/or their properties—what I will loosely call “standard linguistic entities” (or “SLEs”6)—are by and large just further intentional, and in some cases what I call “perceptual,” inexistents.


Talk of intentional inexistents is often thought to invite dubious metaphysical claims incompatible with the naturalistic attitudes of science. I argue that that fear is unfounded. Talk of “intentional inexistents” is just a convenient (if admittedly logically peculiar) way of characterizing the intentional content of psychological states, and involves speaking “as if” the content is veridical, an implicit, what I call “representational pretense,” that it is. It involves no special ontology in addition to those psychological states any more than pretending there are ghosts does. Things that don’t exist, don’t exist in any way, even though we may take ourselves to see or hear them, and to think and talk about “them” as if they did.

But, still, insofar as we talk about such things, we do need to talk about intentional content, and one might wonder what exactly that is in any psychological theory. This is a hoary problem, and I don’t begin to undertake a general solution to it, much less to the problem of how to “reduce” or identify it in non-intentional terms. But nor, lacking one, will I despair as many have of its explanatory utility. What I will defend is a strategy for understanding how at least some postulations of intentional content are required in order to satisfy Chomsky’s requirement that a linguistic theory be “explanatorily adequate,” that is, that it show how it is at least possible for a child to acquire a natural language. I will argue that children could not do this unless they were perceptually sensitive to grammatical phenomena, and the only explanation on offer of this fact requires that their perceptual systems engage in some form of probabilistic reasoning, and, at least in this case, this cannot be understood without the states over which such reasoning is defined possessing intentional content.

I hope that my discussion may lead Chomskyans and their opponents to be clearer about some of the commitments of the Chomskyan program, but that it might also lead them and other readers to appreciate how philosophical issues regarding the mind, intentionality, and “inexistence” bear upon serious scientific issues not only in linguistics, but in psychology generally. Let me stress that I do not think there are principled distinctions between philosophy and these other disciplines.

6  I use this term throughout only for brevity, with no commitment to what should ultimately be subsumed (or not) under it. Until otherwise qualified, it will include the kinds of (purported) entities speakers take themselves to hear, parse, and produce—e.g., words, phrases, sentences—as well as whatever entities, e.g., tree structures, (instances of) phonological and syntactic “properties” or “features,” and unpronounced elements, such as “traces,” “copies,” and “PRO,” which a (psycho-)linguist might posit as involved in a parse.


To my mind, there are simply issues of less or greater generality and abstractness, and these are often pursued by people, like myself, employed in so-called “philosophy” departments. Some of these latter people do think of themselves as doing some kind of quite distinctive work, often decidedly “independent” of any empirical results. I am not one of them, and in order to make clear my distance from them, and, I hope, not to be ignored by empirically minded researchers who might dismiss that sort of work, I prefer to think of the issues that I am addressing as merely “foundational.” But many issues mostly discussed by philosophers will sometimes surface, and I was advised that “philosophical” would be apter than “foundational” in the title.

In any case, I hope this book will be read by philosophers, linguists, psychologists, and others interested in the nature of the mind. I will try to explain issues in one area in ways accessible to readers in others. Consequently, I may labor points that are familiar to readers in specific areas, who may therefore want to skip over such passages.


Brief Contents

Guide to the Reader
Introduction and Synopsis

PART I. THE CORE LINGUISTIC THEORY
1. The Core Galilean Idea and Some Crucial Data
2. The Basics of Generative Grammars
3. Competence/Performance: Determinate I- vs. E-languages
4. Knowledge and the Explanatory Project

PART II. THE CORE PHILOSOPHICAL VIEWS
5. Grades of Nativism: From Projectible Predicates to Quasi-Brute Processes
6. Resistance of Even Mental Realists and the Need for Representational Pretense
7. Linguistic Intuitions and the Voice of Competence

PART III. INTENTIONALITY
8. Chomsky and Intentionality
9. Linguistic Ontology
10. Linguo-Semantics
11. Psycho-Semantics of Perceptual Content

References to Works of Chomsky
General References
Glossary of idiosyncratic terms and abbreviations
Name Index
General Index


Detailed Contents

Guide to the Reader
Introduction and Synopsis

PART I. THE CORE LINGUISTIC THEORY

1. The Core Galilean Idea and Some Crucial Data
   1.1 An Internalist “Galilean” Idealization
       1.1.1 Internalist vs. Social Conceptions
       1.1.2 A Galilean Theory and Crucial Data
   1.2 Competence vs. Performance
   1.3 Typical WhyNots
       1.3.1 Purely Syntactic Cases
             (i) Island Constraints
             (ii) Constraints on Contraction
             (iii) Ellipses
             (iv) Parasitic Gaps
       1.3.2 Possibly Mixed Syntax, Semantics, or Pragmatic Cases
             (i) Binding Phenomena
             (ii) (Negative/Positive) Polarity Items (NPIs/PPIs)
             (iii) Structural Constraints on Meaning
   1.4 Performance Issues
       1.4.1 Grammatical but Unacceptable
       1.4.2 Acceptable but not Grammatical
   1.5 Further Evidence
       1.5.1 Productivity
       1.5.2 Creativity
       1.5.3 Relations Between Forms
       1.5.4 Constrained Homophony (or “Ambiguity”)
       1.5.5 Stability of Acquisition
       1.5.6 Speed of Stable Acquisition
       1.5.7 Poverty and Corruption of Stimulus Data
       1.5.8 No Negative Data
       1.5.9 Independence of General Intelligence
       1.5.10 A “Critical Period”
       1.5.11 Universality of Grammatical Principles and Parameters
       1.5.12 Spontaneous Languages of the Deaf
       1.5.13 Absence of Logically Simple Languages
       1.5.14 “The Linguists’ Paradox”

2. The Basics of Generative Grammars
   2.1 Philosophical Beginnings
   2.2 Stages of the Core Theory
       2.2.1 Logical Constructivism: LSLT and Syntactic Structures
       2.2.2 Psychology and Explanatory Adequacy: the “Standard” “Aspects” Model
       2.2.3 From Phonemes to Features
       2.2.4 Constraining the Rules: The Extended Standard Model
       2.2.5 Resisting Teleo-tyranny: Semantics and “the Autonomy of Syntax”
             (i) Teleo-tyranny
             (ii) Surprising Consequences for Linguistics
       2.2.6 Generative vs. Interpretive Semantics
       2.2.7 GB/P&P: Addressing Plato’s Problem and Linguistic Diversity
       2.2.8 Crucial Move from Hypothesized Rules to Mechanical Principles
       2.2.9 The Minimalist Program
       2.2.10 The “Third Factor”: Darwinian and Neural Speculations
   2.3 Some Simple Proposed Explanations
       2.3.1 C-command
       2.3.2 (Negative) Polarity Items
       2.3.3 Binding Phenomena
   2.4 Conclusion

3. Competence/Performance: Determinate I- vs. E-languages
   3.1 Conventional vs. Non-Conventional Grammars
   3.2 I- vs. E-language
   3.3 Behaviorism and Quine’s Problems
       3.3.1 The Motivations for Behaviorism
       3.3.2 The Poverty of Behaviorism
       3.3.3 Extensionally Equivalent Grammars
       3.3.4 Explicit vs. Implemented (“Implicit”) Rules and Structures
   3.4 Other Superficialist Objections
       3.4.1 “Nothing Hidden” (Wittgenstein, Ryle, Baker and Hacker, and Chater)
       3.4.2 Homunculi (Harman and V. Evans)
       3.4.3 “Kripkenstein”: Errors and Ceteris Paribus Clauses

4. Knowledge and the Explanatory Project
   4.1 Explanatory Adequacy
   4.2 “Knowledge”
   4.3 Non-Conceptual Content
   4.4 An Explanatory vs. a Working Epistemology
   4.5 Computational-Representational Theories (“CRT”s)

PART II. THE CORE PHILOSOPHICAL VIEWS

5. Grades of Nativism: From Projectible Predicates to Quasi-Brute Processes
   5.1 Innate and Learned!
   5.2 The Poverty of the Stimulus
   5.3 Plato’s Problem?
   5.4 General Statistical (“GenStat”) Approaches
       5.4.1 Simple Induction
       5.4.2 Goodman’s Problem of Projectibility
       5.4.3 Bayesian Strategies
       5.4.4 Empirical Difficulties
       5.4.5 Leibniz’s Problem of Modality
       5.4.6 Quine’s Problem of Behavioral Indeterminacy
       5.4.7 Quasi-Brute Process Nativism
   5.5 Usage Based Strategies
   5.6 Conclusion

6. Resistance of Even Mental Realists and the Need for Representational Pretense
   6.1 Initial Red Herrings
       6.1.1 Prescriptivism
       6.1.2 Consciousness
   6.2 Platonism
       6.2.1 The “Type” Argument
       6.2.2 The “Necessity” Argument
       6.2.3 The “Veil of Ignorance” Argument
       6.2.4 Platonism’s Need of Psychology
   6.3 Devitt’s “Linguistic Reality” (LR)
       6.3.1 Fundamentally Different Concerns
       6.3.2 Bee Dances and Natural Language
       6.3.3 Competence and Its Outputs
       6.3.4 Structural vs. Processing Rules
       6.3.5 Conventions
       6.3.6 Devitt’s Ontological Argument
   6.4 Literal Form of Principles and Parameters
       6.4.1 Representational Pretense
       6.4.2 A Reassuring Postscript about Pretense
   6.5 A Somewhat Ecumenical Conclusion: Internalist and Externalist Interests

7. Linguistic Intuitions and the Voice of Competence
   7.1 A “Voice of Competence”?
       7.1.1 Devitt’s Skepticism about (Non-)Standard Models
       7.1.2 I- vs. E-languages
       7.1.3 Devitt’s Alternative Proposal
   7.2 Parsing and Perception
       7.2.1 Linguistic Perception as Parsing
       7.2.2 Non-Conceptual Content Again: NCSDs
       7.2.3 Having vs. Representing Properties
       7.2.4 How Would NCSDs Help?
   7.3 The Evidence
       7.3.1 Involuntary Phonology
       7.3.2 “Meaningless” Syntax
       7.3.3 Syntax Trumping “the Message”: Garden Paths, Structural Priming, and “Slips of the Ear”
   7.4 Conclusion

PART III. INTENTIONALITY

8. Chomsky and Intentionality
   8.1 Intentionality
   8.2 Chomsky as Intentionalist
   8.3 The Controversy
   8.4 Chomsky as Anti-Intentionalist
   8.5 Collins’ Abstract-Algebraic Reading
       8.5.1 As an Interpretation of Early Chomsky
       8.5.2 Collins’ Positive Proposal
   8.6 Chomsky’s De Re Reading of “Representation of”
   8.7 Empty Intentional Representations and “Intentional Inexistents”

9. Linguistic Ontology
   9.1 Background
       9.1.1 Intentionality and Ontology
       9.1.2 Resisting General Anti-Realism
             (i) General Anti-Realism
             (ii) Deciding Ontology
             (iii) Stable Cases
             (iv) Unstable Cases: Secondary Properties
             (v) Non-Preservative Cases
   9.2 The Problems with SLEs
       9.2.1 “Beads on a String”
       9.2.2 Efficiency and Noise
             (i) Communicative Efficiency
             (ii) Noisy Transduction
   9.3 Dispositional Strategies
   9.4 Articulatory Idealizations
   9.5 Speaker Intentions?
   9.6 SLEs as Abstract Objects
   9.7 “Psychological Reality”
   9.8 SLEs as Neural Items
       9.8.1 Use/Mention Confusions
       9.8.2 SLEs as Neural Items—Deliberate Use/Mention Neural Collapse
       9.8.3 Intentional Object/Representation Confusions
   9.9 Folieism: SLEs as Perceptual Inexistents

10. Linguo-Semantics
   10.1 Linguo- vs. Psycho-semantics
   10.2 Meaning and the Analytic
       10.2.1 The Analytic Data
       10.2.2 Quinean Challenges
             (i) Revisability
             (ii) Confirmation Holism
             (iii) Reductionism
             (iv) Explanatory Role
   10.3 The Disjunction Problem and BasicAsymmetries
   10.4 A Promising Application to Language
       10.4.1 Meaning without Truth
       10.4.2 A Cautionary Aside: Resisting More Anti-realism

11. Psycho-Semantics of Perceptual Content
   11.1 Sensitivities to Abstruse Phenomena
   11.2 Probabilities, Disjunctions, and BasicAsymmetries
       11.2.1 Sensitivities as BasicAsymmetries
       11.2.2 Perception as Probabilistic Confirmation
   11.3 Concluding Remarks on Meth(odological)-Dualism
       11.3.1 Motivations for Meth-Dualism
       11.3.2 Mind/Body Problems

References to Works of Chomsky
General References
Glossary of idiosyncratic terms and abbreviations
Name Index
General Index


Guide to the Reader

The present work is not intended to serve as any sort of textbook on Chomskyan linguistics (which in any case I wouldn’t be qualified to provide). I try to keep the technical detail to the minimum needed for discussion of the foundational issues. For excellent introductions to the general Chomskyan programme, see Pinker (1994/2007), Collins (2008b) and Smith and Allott (2016: ch. 2). For standard textbook expositions of the basic technical material, see, e.g., Radford (1997, 2004), Adger (2003), Hornstein (2005) and Lasnik et al. (2005). Explanations of abbreviations and technical terms can be found by consulting the glossary, or bold-face page numbers for them in the index.

Many readers might hope that a book on Chomsky and philosophy would bring together his theories of language and his views about politics. He has often been pressed about the connection, and sometimes has briefly responded, usually claiming the connection is tenuous (see, e.g., Chomsky 1988a: 144). (My favorite response I recall is: “Can’t a guy have two interests?”) Smith and Allott (2016: 267–76) address the connection in probably the most detail that is possible, and provide excellent discussions of both topics. Other useful discussions relating his political and general views of the mind can be found in Cohen and Rogers (1991/forthcoming) and Rai (1995).

Nor shall I be providing a comprehensive discussion of even most of the philosophical work on Chomsky’s linguistics, which itself is far too vast for any one book. I shall try to touch on what I take to be characteristic discussions that have seemed to me to be important or influential, and sincerely apologize in advance to authors who might perfectly reasonably feel that discussion of their own work should have been included.

A word about how my book differs from others of the past few decades, e.g., Collins (2008b) and Smith and Allott (2016), which I have already mentioned, as well as Ludlow (2011) and McGilvray (2014). Although these books have many merits, the differences between them and mine lie largely in my focus almost entirely upon the foundational, “philosophical” issues. This leads me to stress more than those other volumes do what I regard as the crucial data for the core theory, particularly what I call the “WhyNots,” as well as the philosophically relevant history of Chomskyan theories and the various strengths of nativism (as well as the compatibility of being innate and learned!).


I also undertake within a Chomskyan framework a naturalistic defense of the reliance on linguistic intuitions and of its psychological conception of linguistics against a variety of critics, e.g., Nagel, Searle, Quine, and Michael Devitt. The discussions of Devitt are extensive and complex, partly because his own discussions of Chomsky are themselves extended and multifaceted, but partly because we’ll see that they lead to some somewhat surprising restatements of Chomskyan grammatical principles in terms of representations, as distinct from the SLEs that they represent, and which I argue are unlikely actually to exist. This in turn leads to a somewhat novel discussion of the explanatory role of what I call “representational pretense,” as well as of intentionality and ontological issues in linguistics and in much psychological theory generally. I also address at the end of Chapter 11 some of Chomsky’s further philosophical views that are peripheral to his linguistics, which these other books also treat fairly lightly, with the exception of Collins and McGilvray, with whom it will become clear I have some strong disagreements.

A further volume that came my way late in my writing of the present book is David Pereplyotchik’s (2017). It was too late—and my own book was already too long—for me to devote as much attention to it as it deserves. But I will note several places of divergence between his views and mine.

The chapters do not heavily depend on their sequencing here, and readers, guided by the table of contents, the Introduction and Synopsis, and the Index, could easily skip around and even disregard some chapters entirely. To facilitate this, I provide a considerable number of cross-references to different sections, which can be accessed via the table of contents. Chapter 2 is a fairly brief summary of the complex history of Chomskyan proposals about a generative grammar and likely provides more technical detail than many readers will be eager to learn. I want to stress that there is no reason to master it for purposes of this book. Readers new to the material are advised merely to skim that chapter on at least their initial read, to note the main topics and terminology, and then to refer back (via the cross references, index, and glossary) to specific issues as they become relevant to the particular foundational issues that interest them.


Introduction and Synopsis

The book is divided into three parts.1 Part I provides a somewhat novel exposition and defense of what I regard as the core ideas of a Chomskyan linguistic theory; Part II, a discussion of some of the core philosophical claims that surround it; and Part III, a more contentious discussion of the ultimate problem that concerns me: whether and how the core theory is committed to a philosophically troublesome notion of intentionality that is associated with the near ubiquitous term “representation.” The last chapter will conclude with a brief critical discussion of a few further philosophical views that Chomsky has expressed regarding the mind–body problem, which many might—mistakenly—take to be essential to his theory.

In Chapter 1 of Part I, I briefly discuss one of Chomsky’s most important contributions to linguistics and psychology: his “Galilean” model of language as an internal, mental computational system as opposed to a view of language as essentially a social product. The model abstracts from a great deal of surface data to isolate the relevant mental systems, for which only certain, highly selective “crucial” data will be relevant (§1.1), data that will be revealing of an underlying competence, as opposed to what Chomsky regards as the more motley issues of performance (§1.2). Some striking data of this sort are provided by what I call “WhyNots.”2 These are pairs of very similar sentence-like strings of words (such as Who did John and Bill kiss? vs. *Who did John and kiss Mary?), one of which speakers regard as perfectly acceptable, and the other clearly not (as indicated by the “*”)—but where it would be perfectly clear to many speakers what the unacceptable string could be used to mean! The question is, what makes the unacceptable string unacceptable (§1.3)? If readers remember nothing else of the Chomskyan program, they should remember at least several of these examples, since a moment’s reflection upon what could possibly explain them readily invites much of the rest of the core view.

1  This is an introduction and synopsis of the book, not of the many topics themselves. Consequently, it may use terms that will only be explained in the chapters and/or the glossary.

2  Chomsky (2013: 41, 2015) calls them, less mnemonically, “fine thoughts”—short for “perfectly fine thoughts.” His concern with them goes back to his earliest articles, e.g., Chomsky (1962: 531). Traditional grammars had noticed some of them, but without explanation. I will tend to restrict attention to cases that do not have an obvious explanation, in the way that, say, *He goed does (see the Glossary for the use of “*” to indicate unacceptability).


These data, crucial for the grammar, are to be distinguished from unacceptability responses that may be due to other systems, say, of memory or rapid parsing, which may cause the grammatical sometimes to be judged unacceptable, while the acceptable may sometimes be ungrammatical (§1.4). I will also briefly review some other important data, for example, the speed, stability, and virtual universality of the acquisition of a grammatical competence that even young children display, and the significant independence of this competence from the rest of their cognitive life (§1.5).

Chapter 2 provides as brief an overview of Chomskyan generative theories as needed for the rest of the discussion. In the seventy years of the program, there has of course been quite a bit of replacement of one explicitly provisional approach by another, a not surprising development in a relatively young science; but, as I try to make plain, it is important to appreciate how the evolution has been by no means haphazard or lacking in continual, quite impressive progress. We will trace its philosophical beginnings (§2.1), the various stages of its development (§2.2), and some of Chomsky’s recent speculations about what he calls “Darwin’s problem,” and the role of “third” physiobiological factors in his theory (§2.2.10). A theme I will particularly stress is how every stage in this evolution was intended to provide a generative characterization of a fairly “autonomous” syntactic system, not subject to the (what I call) “teleo-tyrannical” demand that its structure be understood as serving non-syntactic, semantic, or communicative ends (§2.2.5). Both to provide a minimal feel for the theory, and to introduce terminology that is essential to understanding later discussions, I will end the chapter with sketches of a few representative (but fairly simple) explanations (§2.3). Again, those interested in a serious introduction to the theories should not rely on this discussion, but consult the references provided in the “Guide to the Reader.”

In Chapter 3, I set out the important psychological conception Chomsky assumes for his theory: the non-conventional character of much of grammar (§3.1), the crucial distinctions between competence and performance, and between “I(nternal)-” and “E(xternal)-”languages (§3.2). Without these distinctions, it is hard to see how there could even be a topic of grammatical rules apart from mere statistical generalizations about language use. I go on to argue that these distinctions in turn presuppose a more general distinction between what I call “superficialist” vs. deeply structural approaches to psychology, the former of which requires all real psychological distinctions to be based upon either ordinary behavior or (sometimes) introspection (§§3.3–3.4). It is a requirement that is explicit in behaviorists such as Skinner and Quine (§3.3.1), raising a problem of extensionally equivalent grammars (§3.3.3), which in reply requires a distinction between explicit vs. implemented/“Implicit” rules and structures (§3.3.4).


But other versions of Superficialism also inform the work of Wittgenstein (1953/2016), Ryle (1949), and Chomsky’s most vociferous philosophical critics, Baker and Hacker (1984) (§3.4). It seems also presupposed in the argument that Saul Kripke (1982) claims to have found in Wittgenstein, according to which there could be no factual basis for Chomskyan “rule following” accounts of an inner linguistic competence (§3.4.3).3 Chomsky rejects all such views as presuming a “methodological dualism” that treats the mental as somehow unamenable to the methods of the natural sciences.

Chomsky resists methodological dualism because of the essentially realistic, psychological conception he has of his entire enterprise. Chapter 4 addresses what seems to many to be this overly mentalistic aim, particularly his crucial demand for “explanatory,” versus merely “descriptive,” adequacy of a linguistic theory, whereby his theory would explain how a grammar could possibly be acquired (§4.1). I consider in this connection a number of issues raised by Chomsky’s use of “knowledge of language” to describe what an adequate theory would explain (§4.2). The term “knowledge” is, of course, philosophically fraught and I think ultimately misleading, and it could well be deleted without loss from the core theory. For one thing, it has too often been taken to imply that a child has some kind of conceptual knowledge comparable to that of a professional linguist, which would, of course, be absurd. The absurdity is avoided by presuming the knowledge involves “non-conceptual” representations along the lines many philosophers have already noted are needed for representational states in perception and other mental processes not fully integrated into general cognition (§4.3).

Related misunderstandings can be avoided by noting that the kind of epistemological project that concerns Chomskyans is not the kind that has occupied traditional philosophers. The latter have been largely interested in what I call a “working” epistemology, involving, for example, characterizations of “knowledge” that allow one to reply to skeptics, rather than a serious “explanatory” epistemology concerned with explaining the cognitive capacities of humans and many animals (§4.4), an interest that may or may not coincide with that of the working epistemologist. Indeed, it could be that the best working epistemology for the nonce may be a Quinean holistic empiricism, even if it turns out not to be explanatorily correct.

3  Superficialism can also be seen to be at work less obviously in post-behaviorist approaches such as those of Michael Devitt and of what I call “General Statistical” approaches that have been popular in more purely computational linguistics and Artificial Intelligence (“AI”), which we’ll discuss in subsequent Chapters 5 and 6.


In the concluding section §4.5, I briefly set out the kind of “computational-representational” explanatory strategy on which Chomskyans are relying, and how it might afford a principled basis for those ascriptions of knowledge that are worth preserving.

In Part II I begin to address some of the deeper philosophical issues that surround the core Chomskyan view. In order to meet the demand of explanatory adequacy, Chomsky proposes the best known and, for many, the most implausible feature of his core view, the claim that linguistic competence is explained by the innate constraints imposed by a “Universal Grammar” (“UG”),4 a claim that I will discuss along somewhat novel lines in Chapter 5. I will review the famous (for some, notorious) “poverty of stimulus” argument on its behalf (§5.2) and Chomsky’s linking of it to views of Plato and traditional Rationalists (§5.3), a link that I argue is actually less close than Chomsky’s current more purely biological theories would allow it to be. Along the way, I also argue that, contrary to many received views (but as Leibniz noted!), something’s being innate does not actually exclude its also being learned, as some aspects of grammar may well be (§5.1).

The claim that UG is innate has been vigorously opposed by many theorists who defend in its stead various “general statistical learning” (what I call “GenStat”) proposals. These face an increasingly severe set of problems raised by various philosophers over the years, not only by Plato, but also by Goodman, Leibniz, and, in a somewhat surprising way, Quine (§5.4). I will argue that the considerations raised by these last two figures actually invite a nativism not merely of predicates nor even of explicit hypotheses, but of specific computational processes, what I call “quasi-brute computational process nativism,” whereby the innate principles are not represented or confirmed by children’s experience, even though the grammatical properties and perhaps the parameters are.

I will conclude the chapter by briefly discussing “Cognitive Functional/Constructivist” strategies (§5.5). These seem to me too much under the spell of the teleo-tyranny discussed in §2.2.5, being concerned largely with issues about the use of language, rather than with the underlying system that informs and constrains that use, whether or not it serves any performance function. On the other hand, there is no reason to suppose that a concern with underlying competence excludes a concern with performance, which, pace Chomsky’s occasional dismissive remarks, may also yield explanatory insights (§5.6).

4  Many might reasonably regard Chomsky’s nativist claims as essential to the core, and I wouldn’t want to dispute this. I include it in Part II only because it involves a number of complexities that have historically been discussed largely by philosophers. But certainly the final view I discuss, what I call “process” nativism, could well be regarded as essential to the core.

4  Many might reasonably regard Chomsky’s nativist claims as essential to the core, and I wouldn’t want to dispute this. I include it in Part II only because it involves a number of complexities that have historically been discussed largely by philosophers. But certainly the final view I discuss, what I call “process” nativism, could well be regarded as essential to the core.

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

Introduction and Synopsis  5 competence excludes a concern with performance, which, pace Chomsky’s occasional dismissive remarks, may also yield explanatory insights (§5.6). In Chapter  6 I turn to objections to this psychological conception that have  been raised not only by the Superficialists or Methodological Dualists we discussed in Chapter 3, but by otherwise serious mental realists. Some are easily answered red herrings raised by David Wiggins (1997), John Searle (1984), Galen Strawson (1994), and Thomas Nagel (1974/82) (§6.1); others involve simply the often orthogonal research interests of Platonists, such as Jerrold Katz (1981, 1985c) and Scott Soames (1984) (§6.2). But the bulk of the chapter will be devoted to more substantial criticisms of it that have been raised over several decades by Michael Devitt (2003, 2006a, 2008a), who argues that, although there are of course internal mental processes responsible for language, Chomsky’s psychological conception of his linguistics should be replaced by a conception of a (largely) conceptually independent “linguistic reality” (§6.2). Although, Devitt’s arguments seriously fail, they do point to a surprising omission in the standard statements of Chomskyan rules and principles: despite appearances, the theory is not actually about the words, sentences, morphemes, NPs, and IPs—what I call “Standard Linguistic Entities” (SLEs; see Preface)—that seem to be produced in the acoustic stream or inscribed on pages, and to which everyone, including Chomskyans, routinely seem to refer. Rather, they concern internal mental representations of these things, precisely as any internal computational theory of the sort Chomskyans embrace in fact standardly claims (§6.3). The apparent reference to words and sentences uttered or inscribed in books should be understood as what I call a “representational pretense”: whether or not linguists are ultimately realists about the external reality of SLEs, in standard expositions of their theory virtually every­one—speakers, psychologists, and linguists (Chomskyan or otherwise)— go along with the common appearances and speak as if SLEs are externally real, in the air or on pages. This allows linguists to focus on the rele­vant issues of grammar that are usually at issue, and efficiently express the contents of the relevant internal representations without having to tediously insert “representation” continuously throughout their discussions. N.b., it is not a deliberately deceptive pretense; just a virtually indispensible expository one, although one by which it is easy to be misled.5 Although this pretense will not matter to most discussions, if it is not noted it can lead to serious confusions about exactly what the theory is describing. These include not only the confusions that beset Devitt’s discussion, but more 5  For those who worry that this use of “pretense” suggests linguistics is a sham, I offer a reassuring postscript about the term in §6.4.2.


serious ones about just what SLEs should positively be considered to be, confusions that we will consider in more detail in Chapter 9. On the other hand, once this pretense is appreciated, then, apart from the surface antagonisms, there is less need for any real tension between Chomsky's and Devitt's views: they are simply concerned with quite different issues, Chomsky with the internal system of representation responsible for grammatical competence, Devitt with the external uses of the output of the language faculty for communication and the expression of truth-valuable thought (§6.4).

An issue that has bothered many of Chomsky's critics is the appeal to people's "unacceptability responses" and other linguistic "intuitions." I turn to this issue in Chapter 7. The worry here is motivated by the sometimes uncritical ways philosophers have relied on Platonic reflection or "Cartesian" introspections to try to acquire what empirically-minded critics regard as a non-naturalistic "knowledge on the cheap." Devitt (2006a: §7) dismisses (what he calls) such a "Voice of Competence" (VoC) conception, arguing along the lines laid out in §4.4, that Quine's holistic empiricism is really the only way of knowing anything: intuitive verdicts regarding either syntax or semantics are no more than empirical observations by speakers who have had experience speaking their public language (§7.1.1–7.1.3). Of course, such a view is perhaps not surprising, given Devitt's externalism about "linguistic reality" discussed in §6.3: if language is entirely an external phenomenon, then speakers' intuitions about it would (perhaps) seem on a par with their intuitions about any other external phenomena. But even to the extent that Chomsky's psychological conception might be correct, and grammar might be largely an internal affair, linguistic intuitions for Devitt would enjoy no special status (§7.1.2).

But Devitt's view, I will argue, is badly mistaken. It certainly does not accord with the practice of Chomskyan linguists, who are explicitly concerned not with the verdicts of just any adult who might have observed and learned a language, but with the intuitive verdicts of native speakers of a language, which are more likely to reflect the speaker's competence that concerns Chomskyans. And, along the lines of theories of perception in general, there is no reason that a computational-representational theory could not accord a special status to such intuitions in a perfectly naturalistic way (§7.2). Moreover, corroboration for their special status is provided by the abundant evidence for a manifestly special perceptual basis for parsing (§7.3).

As I mentioned in the preface, my main motivation for this book is to address the fraught issue of the role of representation and intentionality in a Chomskyan theory. This is the topic of the last and perhaps most contentious


Part III of the book. In Chapter 8, I begin by briefly setting out some of the distinctive features of intentionality, as well as indicating how those features figure increasingly across various areas of serious psychology (§8.1). Indeed, intentionality seems to figure in a wide range of Chomsky's own characterizations of his linguistic project (§8.2).

Intentionality has, however, been thought to be deeply problematic, ever since Brentano claimed it was "irreducible" to any physical phenomenon, and physicalistically inclined philosophers have struggled, thus far unsuccessfully, to prove him wrong. Intentionality has consequently come to seem to many of the scientifically minded as somehow hopelessly obscure (§8.3). Although Chomsky does not discuss Brentano's thesis, he did begin in the 1990s to echo the claims of such intentionalist skeptics, and to claim that his use in his theories of the crucial term "representation" should not be understood along intentionalistic lines (§8.4). Aside from some passing remarks, he, himself, seldom really indicates how his uses are positively to be understood, directing readers to Frankie Egan's (1992) non-intentionalist reading of David Marr's (1982) work on vision. John Collins (2007b, 2009, 2014), however, has ingeniously defended a non-intentionalist, what I call an "abstract-algebraic," reading of a surprising number of Chomsky's claims, and I will consider in detail the interest and problems of his proposals (§8.5).

One issue that seems to be driving both Chomsky and Collins to their (to my mind) surprising claims is one of the distinctive features of intentionality, its lack of existential commitment: one can have a representation of something, for example, a rotating cube, even though there is no such cube in the environment. Partly in view of the fact that we noted in §6.4, that actual external SLEs are not required for linguistic theory, Chomsky claims that there are in fact no such things, and this, curiously enough, leads him to claim that there is therefore no need for the intentionality suggested by the "representation of" idiom. I think this is a serious error, but an understandable one, given the difficulties of using that idiom in cases of non-existence, difficulties I address in §§8.6 and 8.7. There I defend the importance—and innocuousness—of the often ridiculed category of what Brentano called "intentional inexistents."6

Fully disentangling the issues here turns out to require becoming clearer than many people are about the nature and ontology of the SLEs. This turns out to be surprisingly difficult, and, in Chapter 9, after setting out some

6  I will not be adopting any of Brentano's (or his student, Meinong's) other distinctive views concerning how to think about the issues associated with this term.


ground rules for "realism" about a domain (§9.1.1), and resisting a too facile general anti-realism with which Chomsky unnecessarily flirts (§9.1.2), I consider several unsatisfactory options that have been proposed: acoustic, dispositional, articulatory, idealization, and abstracta accounts (§9.2–§9.6). To my mind, the most plausible proposal is that of Chomsky and Halle (1968), following Sapir (1933/49), of SLEs being "perceptually real" in a way that does not correspond to any external reality (§9.7). Now, one would think that being "perceptually real" is a species of being "psychologically real," but unfortunately Chomsky (1980a: 107) independently uses this latter expression merely for the explanatory, causal reality of psychological entities a theory might posit, along the lines of neural states posited to explain behavior. This and other conveniences and confusions have led Chomsky and many of his followers to regard SLEs as merely neural entities (§9.8). I argue that such a proposal essentially amounts to a peculiar use/mention confusion (§9.8.1) and/or deliberate use/mention collapse (§9.8.2) that have become rampant in the field, along with a related confusion of a representation with its intentional object (§9.8.3).

What is the solution to all of these difficulties? Along the lines of the "representational pretense" discussed earlier in §6.4, I propose that SLEs be recognized as simply not existing at all (§9.9), a view I call "Folieism," as in "folie à deux," or "à n," for the n of us who share the vivid perceptual illusion that people utter SLEs, despite our seldom, if ever, actually doing so.7 As noted in §6.4, all that a Chomskyan theory is really committed to are representations of SLEs that figure in the I-language computations; the SLEs themselves perform no explanatory role whatsoever. Given the difficulties of identifying them with anything independently real, "they" are best regarded as (in the phrase of §8.7) "intentional inexistents," more specifically "perceptual inexistents": entities that are the "intentional objects" of mental states, i.e., the "things" mental states are "of" or "about," but do not actually exist, no way, no where, no how ("existence in the mind" doesn't count!). They are no more real than the colors and triangles routinely discussed by vision theorists, and whose existence is problematic, but the object of representational pretense for precisely the same sorts of reasons.8

Many may find such a proposal so unattractive as to prefer to abandon talk of intentionalist representation altogether, a preference they may have also on

7  Why not mere “fictionalism,” as in the case of other anti-realist philosophical proposals? Because I want to stress the vivid perceptual illusion I think we’re all under that SLEs are spoken and heard. 8  Although it is by no means essential to my discussion, I will assume what I take many (though, to be sure, not all) psychologists and philosophers to hold, that neither colors nor the standard Euclidean figures represented in, e.g., vision actually exist in the external world. But I will discuss the reasons for thinking this about them and about SLEs in §§9.1–9.3.


other grounds. In Chapter 10 I will begin to address the various objections that have been raised against intentionality, confining attention for the remainder of that chapter to the issues as they arise for a "linguo-semantics" (or semantics of natural language), postponing to Chapter 11 the issues as they pertain to a "psycho-semantics" (or semantics of mental states). I will begin in §10.2 by considering Quine's well-known challenges to any notion of "meaning," distinguishing four of the most important: revisability, holism, reduction, and explanation. I will argue that only the last has serious force. I will then turn in §10.3 to a different challenge raised by Fodor (and independently by Kripke's (1982) reading of Wittgenstein), what he called the "disjunction problem," or how to distinguish, semantically, truth from error. Fodor went on to provide a proposal that he thought met both that challenge and those of Quine. I will argue that his proposal as he stated it is far too strong, and that all the challenges can be met not by attempting to provide a notoriously risky "reduction" that he undertook of the intentional to the physical, but merely by indicating the explanatory role that intentionality plays in at least the areas of linguistics and psychology that are of concern to Chomskyans. I propose a modest version of Fodor's proposal that I combine with an equally modest version of an independent proposal of Paul Horwich—a proposal I call "BasicAsymmetries"—which I argue in §10.4 will work well enough in conjunction with a Chomskyan conception of I-semantics (along lines developed by Paul Pietroski) to meet Quine's most serious challenge of showing how "meaning" can earn its explanatory keep (and perhaps even save a modest notion of the analytic) (§10.4.1). I conclude the chapter by pointing out that, for all its "internalism," an I-semantics does not itself entail any sort of antirealism that is often associated with it (§10.4.2).

I turn finally in Chapter 11 to address the ultimate concern of my discussion, the role of intentionality in a psycho-semantics, specifically one that I will argue is needed for linguistic theory. In §11.1, I raise what seems to me a quite general problem faced by any psychological theory, of accounting for how a physical organism, interacting with the world in only local ways, can be sensitive to non-local, non-physical, and often non-instantiated—what I call "abstruse"—properties, such as being a dinosaur, the moon, a genuine triangle, or an SLE, such as a noun or a verb. I note that such sensitivities could conceivably be brought about by sensitivities to local, transducible, "surrogate" properties, but that these are unlikely to be actually available in general, particularly not for SLEs. This is where I argue intentionality plays a promising role: intentionality is needed where mere transduction gives out. It provides a "common coin" needed at the interface between grammar and, inter alia, the


perceptual system. By combining BasicAsymmetries with probabilistic computations, we are afforded a way of realizing a Chomskyan model of how SLEs might be perceived, and how his theory might thereby achieve its aim of explanatory adequacy (§11.2). I conclude the chapter with a discussion of the general sources of resistance to including intentionality in science (§11.3.1), as well as with some comments (§11.3.2) on the "mind–body" problems that, while completely orthogonal to linguistics, provide some of the background to those motivations. Although Chomsky has repeatedly insisted that the problems cannot even be stated outside of a Cartesian physics, I argue that the problems are quite clear enough apart from Descartes, and may require the kind of careful thought and research that has been the business of much traditional philosophy and, increasingly, cognitive theories such as Chomsky's, to which I hope this book is a small contribution.


PART I

THE CORE LINGUISTIC THEORY



1
The Core Galilean Idea and Some Crucial Data

1.1  An Internalist "Galilean" Idealization

1.1.1  Internalist vs. Social Conceptions

In a highly influential work, the American linguist Leonard Bloomfield (1933) expressed a quite common social, externalist, and empiricist conception of language:

Language has been developed in the interchange of messages, and every individual who has learned to use language has learned it through such interchange. The individual's language, consequently, is not his creation, but consists of habits adopted in his expressive intercourse with other members of the community.  (Bloomfield, 1933: 17)

Indeed:

The facts of language are facts of social, not of individual psychology. (Bloomfield, 1933: 198; see also 252, 259)

A similar view was expressed by the English philosopher, Gilbert Ryle (1961):

A Language, such as the French language, is a stock, fund or deposit of words, constructions, intonations, cliché phrases and so on. . . . A stock of language-pieces is not a lot of activities, but the fairly lasting wherewithal to conduct them; somewhat as a stock of coins is not a momentary transaction or set of momentary transactions of buying, lending, investing, etc., but is the lasting wherewithal to conduct such transactions. Roughly, as Capital stands to Trade, so Language stands to Speech. . . . A Language is something to be known, and we get to know it by learning it. . . . A language is a corpus of teachable things.  (Ryle, 1961: 223–4)


Thus, language is essentially an external social phenomenon, much as are monetary systems or styles of dress. This seems also to be the conception people have when they take seriously various "prescriptivist" rules that are taught in "grammar school," in books of style, and often insisted upon by "language mavens" in the popular press. The view is an instance of a broadly "empiricist" conception of language acquisition whereby it consists largely of ideas and generalizations derived from experience, along the lines of the Classical empiricist philosophers, Locke and Hume.

Such a conception can seem so obvious that many might wonder how it could ever be seriously questioned.1 After all, it is certainly true that language is the main way we communicate with one another, and we often try to do so by using the same vocabulary and grammatical structures. Don't dictionaries and grammar books essentially record the conventions that have evolved from those efforts, just as do the rule books for games like Chess or Go?

But the conception overlooks an important possibility that scientists and philosophers have called attention to at least since Galileo: some phenomena that we take to be external to ourselves may be the result of facts internal to our minds that we mistake as facts about the external world. Galileo and others cited color and other so-called "secondary" properties, like warmth and fragrance: the best account of such properties is not so much in the external objects as in the nature of our sense organs (we will touch on these issues in Chapter 9). Quite apart from issues of language and mind, there are other uncontroversial cases where at least such an internalist explanatory approach is obviously appropriate. Although the way people walk may be very much influenced by the way people around them walk and facts about the terrain, surely the basic facts about walking have to do not with terrain, social norms, or conventions, but with specific features of our anatomy. Similarly for how we sit, chew, sing, and dance.

One of Chomsky's most important insights is that talking is more like walking than like playing games.2 His work consists in a systematic development

1  The first quote from Bloomfield appears in a section of his book entitled "The Social Character of Language." The view has the authority of Aristotle (Politics, I.2, 1253a8–18), and was also expressed by the early linguist Ferdinand de Saussure (1914/77: 9), who described language as a "social product." It is also explicit in the work of philosophers such as W. V. O. Quine (1960/2013), David Lewis (1969), and Michael Devitt (2006a, 2008a, and in prep); Devitt's views in this regard will be discussed in detail in Chapters 6 and 7.

2  Chomsky often analogizes language to the growth of the heart and other bodily organs (see, e.g., his 1980a: 33, 41, 134–40), capturing what he rightly regards as its biological foundation. Activities like walking and dancing strike me as better analogies since, unlike growth, talking is standardly a deliberate activity that is affected by social mores in a way that purely biological processes typically are not.


of a theoretically and empirically rich "internalist" alternative to the social view, showing how the possible ways we can talk are as much constrained by facts about our brains as the possible ways we can walk are constrained by facts about our legs. What is more, these facts about our brains structure the very way we perceive and understand the noises of speech, along lines that Rationalist philosophers like Descartes, Leibniz, and Kant argued we did in the case of much of the other stimulation the world provides. This was in sharp contrast with the proposals of empiricists, such as Locke and Hume, according to which our cognitive life was based merely on sensory impressions, or, in influential terms of their twentieth-century behaviorist heirs, on associations among stimuli and responses. Chomsky's significance for philosophy consists in part in his providing serious scientific support for at least some (though by no means all) of the Rationalist speculations.

I will not be concerned in this book with the full details of Chomsky's proposals, since, mercifully, not all of the enormously complex and varying details are relevant to the foundational issues that are my focus. But some of the details will be important. In the first two parts of this book I will sketch what I take to be the relevant technical material, and some of the different forms it has taken over the last seventy years, so that even a reader new to the material might come to have a pretty good idea of the Chomskyan program. As a prelude to that, I will first briefly discuss in the next sub-section one of Chomsky's leading ideas, his general, what he calls "Galilean" view of science. This is a view, familiar from physics, according to which our deepest theories are ones that involve considerable abstraction from most ordinary observable data and look instead to "crucial data" that would be expected and explained by one such theory and not by its rivals. In the case of linguistics, he proposes an abstraction to what he calls "linguistic competence" (§1.2), and among the data that he considers are what I call "WhyNots," or strings of words that native speakers of a language find unacceptable, but nevertheless comprehensible, some standard examples of which I will provide in §1.3–§1.4.3 (If one remembers nothing else about Chomsky's views, one should remember at least a few examples of such data: a moment's reflection on them should readily invite a great deal of the rest of his conception.) I will then turn in §1.5 to further evidence that can be marshaled for his views.

3  Chomsky (2013: 41) calls them “fine thoughts”; see the Introduction, fn 2 above.



1.1.2  A Galilean Theory and Crucial Data

Almost in passing, Chomsky (1980a) mentions a methodological principle on which much of his work turns:

Substantial coverage of data is not a particularly significant result, it can be attained in many ways, and the result is not very informative as to the correctness of the principles employed. It will be more significant if we show that certain far-reaching principles interact to provide an explanation for crucial facts—the crucial nature of these facts deriving from their relation to proposed explanatory theories.  (Chomsky, 1980a: 2; emphasis mine; cf. 1978a: 23)

This could initially sound a little odd, since many people might reasonably think that substantial coverage of data is what science is all about. Surely theories of language that covered more data would, other things being equal, be better than theories that covered less, no? Many linguists prior to Chomsky decidedly thought so, and that "covering data" amounted to explaining regularities in the actual use of language in thought and communication. In a famous passage, Bloomfield stipulated:

The only useful generalizations about language are inductive generalizations. Features which we think ought to be universal may be absent from the very next language that becomes accessible. Some features, such as, for instance, the distinction of verb-like and noun-like words as separate parts of speech, are common to many languages, but lacking in others. The fact that some features are, at any rate, widespread, is worthy of notice and calls for explanation . . . but this study, when it comes, will not be speculative but inductive.  (Bloomfield, 1933: 20)

Indeed, much of the linguistic research that followed Bloomfield consisted in "discovery procedures," or explicit rules for moving from a collection of phonetically characterized utterances to phonemic, morphemic, and finally syntactic analyses (see Sampson, 1980: 76ff). In the eyes of many traditional empiricists and Logical Positivists, this seemed the only way that a theory might be entitled to make claims about the real world, and not some merely imagined one. Such approaches to science might be called "Baconian" or "Humean."4 By way of contrast to them, one of the most revolutionary ideas that Chomsky

4  After Francis Bacon (1561–1626) and David Hume (1711–1776), both of whom emphasized the centrality to science of generalizing from and so purportedly thereby explaining perceptual experience.


brought to linguistics and psychology is that of a "Galilean method" that he believes underlies the deepest explanatory sciences. The Galilean method takes the main aim of theory not to be the summary of observed data, but rather the description and explanation of the structures and principles of reality that underlie the data. And this might well not include all the data, some of which may be due to a complex interaction of causes, but just data that are revelatory of that reality.5

The point cannot be overly stressed. Mere summaries of surface data may not be particularly useful, since their relation to underlying structure is typically quite indirect: most of the data we ordinarily observe—of the motion of people, animals, cars, clouds, or leaves in the wind—are the result of massive interaction effects with an immense variety of underlying systems of, for example, gases, frictional planes, biological creatures, and perceptual processes, each of which any serious science must study largely in highly idealized isolation (hence the "different disciplines" and the enormous effort and expense of "controlled experiments"). Galileo and Newton did not do physics by taking a careful inventory of all the variety of motions objects exhibit, attempting to explain the complicated trajectories of people running up hills or leaves swirling in a storm. Rather, they turned to what they had a hunch were specific data that were simple and free from interaction effects, for example, the trajectories of uniform masses in free fall, or, as in the case of the planets, in the splendid near vacuum of extra-terrestrial space. These often supplied crucial data that, relatively free of interfering factors, began to reveal the underlying principles of motion that were not predicted by rival theories, but which applied only in immensely complex ways to the trajectories of people, horses, clouds, and leaves in a storm. As the eminent philosopher of science

5  Chomsky was not alone in pressing what one might call a "metaphysical" view of explanation, as opposed to the essentially "epistemological" one of the traditional empiricists and Positivists that dominated Anglo-American philosophy up until the 1970s. It was pressed quite forcefully for science generally in a slew of important papers in the 1960s and 1970s, by Chomsky's friend and colleague, the philosopher Hilary Putnam (1975a and b). Indeed, one wonders about the degree of their mutual influence (given they were one year apart in the same high-school! and became good friends throughout their professional careers until Putnam's death in 2016). But little so far seems to have been written on the topic. Chomsky (pc) himself denies they influenced each other in this regard, but it's hard to believe that many of their ideas weren't very much "in the air" in their shared worlds. Chomsky often cites the work of Steven Weinberg (e.g., 1976) for "the Galilean method." Note that the term is sometimes used to describe the view that mathematics is "the language of nature" which scientists often assume to be mathematically elegant (see Koyré, 1943: 336ff). Although Chomsky (1983/2004: 154ff) also takes this view seriously (see Boeckx, 2006: chap 4 and §2.2.10, fn59 below), it is independent of the point about idealization, which could of course involve idealization to mathematically inelegant systems.


James Woodward (2011) recently put it (expanding on his and Bogen's classic 1988):

What matters is not that we be able to infer "downward" to features of the data from theory and other assumptions; rather, what matters is whether we are able to infer "upward" from the data (and other assumptions) to the phenomenon of interest. Data are scientifically useful and interesting insofar as they provide information about features of phenomena.  (Woodward, 2011: 168)

For example, the astrophysicist Arthur Eddington famously photographed starlight during an eclipse and

infer[red] from features of his photographs (and various background assumptions) to a value for the deflection of starlight rather than trying to infer or derive characteristics of the photographs from other assumptions. (Woodward, 2011: 168)

That is, Eddington wasn't concerned to explain all the (as it turned out, fairly mixed) data provided by his photographs: rather, he was concerned with the crucial data that were predicted by Einstein's but not Newton's theory. As the philosopher Jerry Fodor (1991) put it with his usual pith:

What goes on in science is not that we try to have theories that accommodate our experiences; it's closer that we try to have experiences that adjudicate among our theories.  (Fodor, 1991: 202–3)

Thus, Chomsky is not concerned to explain all—or even a significant portion—of language use; rather, just crucial data that are predictable from his but not rival, social theories.6 Moreover, as we shall shortly see, many of the crucial data for a Chomskyan theory are negative, or data about what utterances people seldom if ever would produce, since (to anticipate Chomsky's explanation of such facts) they are

6  Actually, Galileo himself evidently didn't free himself from misleading experimental data quite enough! As the historian of science Alexander Koyré (1943) noted:

[I]t is only [Galileo's] reluctance to draw, or to admit, the ultimate consequences or implications of his own conception of movement, his reluctance to discard completely and radically the experiential data for the theoretical postulate he worked so hard to establish, that prevented him from making the last step on the road which leads from the finite Cosmos of the Greeks to the infinite Universe of the Moderns.  (Koyré, 1943: 334, emphasis mine)

Chomsky (2009a) mentions Koyré in discussing his conception of science.


not ones that are permitted by their language faculty. What is striking about much of Chomsky's discussion is his relying on what were largely non-existent data that he noted really only as a result of his theory. Focusing on actual data that have been pre-theoretically collected would miss such crucial bits.

1.2  Competence vs. Performance

This focus on crucial "negative" data is connected to the specific way Chomsky brings the Galilean method to bear upon linguistic research with his (1965: 4) famous distinction between "competence" and "performance." In contrast to the Bloomfieldian concern with the actual use, or performance of language, Chomsky is interested in the underlying system of competence, or the facts that are responsible for the human ability, ceteris paribus, to understand a potential infinity of expressions of their natural language. Like any deeply underlying system, it may or may not be manifested in actual performances, which, given the multitude of diverse factors responsible for it, Chomsky doubts would afford a fruitful domain for general theoretical insight:7

[A] formalized grammar, regarded as a predictive theory, is an idealization in at least two respects: first, in that it considers formal structure independently of use; and second, in that the items that it generates will not be the utterances of which actual discourse is composed. Actual discourse consists of interrupted fragments, false starts, slips, slurring, and other phenomena that can only be understood as distortions of an underlying idealized pattern.  (Chomsky, 1962: 531)
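To make the idea of a grammar that "generates" expressions independently of their use a bit more concrete for computationally minded readers, here is a minimal sketch in Python. The rules and vocabulary are invented purely for illustration—nothing here is a serious fragment of English, let alone a grammar any Chomskyan would propose—but the recursion in the rules shows how finite means can characterize a potential infinity of expressions, whatever use, if any, speakers ever make of them:

```python
import itertools

# A toy phrase-structure grammar (invented rules, for illustration only).
# The recursive NP and VP rules are what yield unboundedly many sentences.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Mary"], ["Bill"], ["friends", "of", "NP"]],
    "VP": [["left"], ["thinks", "S"]],
}

def generate(symbol, depth=0, max_depth=3):
    """Yield every word-string derivable from `symbol`, up to a depth bound."""
    if symbol not in GRAMMAR:              # a terminal word
        yield [symbol]
        return
    if depth > max_depth:                  # bound the recursion for the demo
        return
    for expansion in GRAMMAR[symbol]:
        parts = [list(generate(s, depth + 1, max_depth)) for s in expansion]
        for combo in itertools.product(*parts):
            yield [word for part in combo for word in part]

for words in itertools.islice(generate("S"), 10):
    print(" ".join(words))
# "Mary left", "Mary thinks Mary left", "friends of Bill left", ...
# Raising max_depth yields ever more sentences: the grammar characterizes
# the structures themselves, not any finite corpus of actual performances.
```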

The actual use of language, like the trajectories of leaves, looks very much to be a massive interaction effect, involving not only some sort of appreciation of the grammar of one’s language and “the meanings of one’s words,” but also the social conventions of the relevant speech exchange (is it an argument in a

7  Ferdinand de Saussure (1914/77) made a similar distinction between "langue" and "parole" (cf. Chomsky, 1968/2006: 17–18; 1986: 32ff; but see Newmeyer, 1986: 71, for differences between Saussure's and Chomsky's notions). We'll look at what seem to be somewhat systematic issues raised by performance systems in §1.4. Chomsky's original use of the word "competence" did invite an unfortunate confusion between mere "behavioral dispositions" and the underlying system responsible for those dispositions, what he eventually calls "I-language," which we'll discuss in §3.2, and which comes to replace the use of "competence" (see §6.3.3 for continuing confusion on the issue). Since the term has figured in so many discussions, I'll stay with it with the understanding of its eventual replacement.


pub, a newspaper headline, or a course in linguistics?), as well as an immense amount of knowledge about what one's audience knows and cares about, and for whom one can and will abbreviate what one says in light of relevance and efficiency. For Chomskyans, all of that involves far too much arbitrary complexity to be included within a single theory. As with the above cases of natural sciences, one breaks things down into manageable bits, hoping, as Plato famously put it, "to carve nature at its joints."

It is important to note that neither of Chomsky's two senses of "idealization" is intended quite in the manner of, say, the frictionless planes or perfectly elastic collisions of physics, or the "ideal rational agents" of economics and game theory, all of which are acknowledged to be patently impossible. Chomsky need not be idealizing to some impossible being, but simply abstracting away from various "performance" systems in the brain that interact with a presumably quite real computational system implemented in the brain, distinct from the many other diverse systems responsible for the actual utterances people make.8 In any case, it cannot be stressed enough that Chomsky's use of "competence" is not merely the ordinary sense of an "ability," in the sense of a disposition to actually behave in a certain way.

This distinction between competence and performance is of the profoundest significance not only for a Chomskyan theory, but for psychology generally. Thus, in researching our underlying conceptions of, for example, physical objects, causation, number, and morality, psychologists inspired by Chomsky's work have studied what seem to be infants' innate conceptions of objects (e.g., Spelke, 2003, 2017), causation (Michotte, 1946/63), basic natural kinds (Keil, 1989), and, in a particularly interesting analogy with grammar, morality (Mikhail, 2011). It seems to be precisely what is needed in any area of psychology where failures of performance can too quickly be taken to be failures of competence (cf. Scott and Baillargeon, 2017, regarding the "false-belief tasks"). Chomsky (1980c) refers to the specific application of the method to various psychological capacities as "the new organology," that is, that we should regard the study of language as like the study of any other organ or system in the body, such as the kidneys or the metabolic system (see Smith and Allott, 2016: 224–5, for discussion).

8  The appeal to "an ideal speaker-listener" in chapter 1 of Aspects, 1965: 3, does read like an unrealistic appeal to an ideal agent; but, as we will note in §2.2.3, in these early passages he hadn't yet fully abandoned an externalist, social view of language (and he likely presumed a behaviorally oriented audience), and so such an idealization may have seemed more appropriate.


In any event, it is with Chomsky's Galilean conception of linguistic competence in mind that I want to stress from the start what I think are some of the crucial data for his core theory.

1.3  Typical WhyNots

So what are some crucial data for Chomsky? What is a little odd about the data that fill most Chomskyan texts is that, in contrast to the usual carefully controlled data obtained by experimental scientists, they are data that are immediately available to almost any competent native speaker of a natural language. They are just data that were, by and large, surprisingly little noticed until Chomsky called attention to them. They generally consist of pairs of very similar sentence-like strings of words, one of which is perfectly acceptable, and the other not. Thus, to take an obvious example, most English speakers would find (1) acceptable, but not (1a):

(1)  John and Mary danced a polka.
(1a)  *Polka a danced and Mary John.

This, of course, is not a particularly interesting example, since one might presume that some familiar rules about nouns, verbs, articles, and conjunctions in English might suffice to rule (1a) out. But other cases are not nearly so obvious:

(1b)  Stories about Caesar terrified Mary.
(1c)  Who did stories about Caesar terrify?

are fine, but not:

(1d)  *Who did stories about terrify Mary?

The reactions that virtually every native speaker has to the *-ed cases are what are called peculiar "Unacceptability" reactions, for example, a specific puzzlement and report that "one can't say it." (The issue, incidentally, is not about Who vs. Whom, which is of vanishing significance to most speakers, linguists, and any issues in this book.)

Cases like (1d) are particularly interesting, since if someone uttered it, many hearers could likely guess what was meant, for example, by thinking of


the related, mere "echo" question (where the query—or "wh" word—remains in situ at the queried spot: Stories about who terrified Mary?), but nevertheless most hearers would think there is something wrong, perhaps replying, "Huh? You can't ask the question that way. You need to ask something like What were the stories about that terrified Mary?"

Note that there is no reason in principle that the Unacceptability reactions need be in the least self-consciously meta-linguistic, involving explicit linguistic judgments such as "That's not grammatical," or "That term doesn't mean the same, or co-refer to this other one." These terms are the provenance of either "grammar school" teachers, "language mavens" that opine in Sunday newspapers, or professional linguists or philosophers, all of whom likely use the terms in special ways that may not be available to ordinary speakers. It would be enough for Chomskyan purposes that speakers simply produce some idiosyncratic reactions to *-ed strings—perhaps just a peculiar hesitation, perplexity, pupillary dilation, or maybe distinctive states of their brain. Insofar as such responses are correlated with the *-ed strings, they would just as readily call out for explanation as would more standard meta-linguistic reactions. Explicit intuitive verdicts are simply the preferred sort of evidence, since they are both extremely easy to obtain—one just imagines hearing the sentence!—and are often more informative than the other reactions.

One might worry that such "intuitive," "armchair" data are just the sort of irresponsible surmises that have characterized some of the worst philosophy of the last century. After all, what sort of serious science could be based so largely upon ordinary speakers'—often just the linguists' own—verdicts about strings of words? Shouldn't science be relying on carefully controlled experiments? And of course it should—should there be any question of confounding factors. But what is striking about all these WhyNots is that it is virtually impossible to think of non-linguistic factors—factors having to do with anything other than grammar—that could explain the spontaneous verdicts. Why else other than grammar should one reject *Who did stories about terrify Mary?9

In any case, it is important to bear in mind that the Unacceptability Reactions are not being taken to afford any special insights into some special realm. Rather, they are just data: readily observable facts about human reactions that need to be explained along with any other data, just ones for which the best explanation seems to lie in certain fundamental rules or principles

9  Chomsky (1968/2006: 21–2) early on commented on the "difficulty in the psychological sciences . . . [of] the familiarity of the phenomena with which they deal," citing Wittgenstein's (1953/2016: §129) observation that "the aspects of things that are most important for us are hidden. (One is unable to notice something because it is always before one's eyes.)"


governing a specific computational system somehow realized in our brains. The only confound that can occasionally occur is a linguist's commitment to his or her theory, or a hearer's acquiescence to a linguist's claims—which is one reason why linguists routinely ask for the (repeated) verdicts of "naive" native speakers of a language. Of course, all such data could be subjected to controlled experiment, asking native speakers to fill out questionnaires. This has in fact occasionally been done, and recent studies have demonstrated high reliability between linguists and large sets of experimentally controlled subjects, particularly ones, such as beginning cognitive science students, who have some conception of what is being asked but are not yet swept up by any particular theory (see, e.g., Maynes and Gross, 2013; Sprouse et al., 2013; Sprouse and Almeida, 2017).10 Given this high reliability, and the difficulty of coming up with any alternative explanation of the range of Unacceptability reactions, it would be frivolous to insist upon controlled experiments to verify each of them (but we'll return to this issue and some other relevant experimental data in §7.3).

Of course, in thinking "one can't say that," this obviously does not mean that one cannot actually utter the string, but only that native speakers find something somehow "wrong" with the string: one "can't say it in my language." And the question to be raised is why not? Chomskyans look at an immense range of examples, but, for many of them, a person might just shrug and dismiss them as due to some or other rules that have emerged as the result of implicit social conventions of speech. WhyNots are cases like (1d) where that response would pretty obviously not suffice. Thinking about the case for a moment, it is surprising that no such convention ever arose against its use, since it would be so obvious what it would mean: as Chomsky (1962: 531) observed, such strings would seem to express "perfectly fine thoughts." And so they suggest that language is somehow peculiarly constrained in a way that thought itself does not seem to be. The hypothesis of an innate internal generative grammar would seem to explain them better than appeals to training, imitation, or the communicative purposes of language.11

10  There are, of course, difficulties in eliciting intuitions from pre-literate people, for whom Unacceptability Reactions are often confounded with judgments of pragmatic and moral appropriateness (cf. Schütze, 1996/2016: 122). But as to the reliability of linguists' personal "intuitions," Sprouse et al. (2013) and Maynes and Gross (2013) report a correspondence of roughly 95% between them and those of non-linguists. This is not to say that worries can't be raised about specific data, as well as about the interesting effects of "satiation" by frequent repetition of examples (see Hofmeister et al. 2013). As in any science, (analyses of) data are as much up for careful discussion as is a theory itself.

11  It is a striking fact that few if any defenders of a social/conventional account of language actually address WhyNots (see Chapter 5 for discussion). As crucial data deciding between theirs and an internalist theory, WhyNots should be among the first data to be addressed, which is why I urge readers to remember them.
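The reported convergence is just a matter of simple percent agreement. A back-of-the-envelope sketch in Python—with invented judgment vectors; the actual studies compared published textbook judgments with large formal experiments—illustrates the calculation:

```python
# Invented data: 1 = "acceptable", 0 = "unacceptable" verdicts on ten strings.
linguists = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]   # textbook judgments
speakers  = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]   # majority verdicts of naive speakers

# Percent agreement: the proportion of strings on which the verdicts coincide.
agreement = sum(a == b for a, b in zip(linguists, speakers)) / len(linguists)
print(f"{agreement:.0%} agreement")   # 90% on this toy data; ~95% in the studies
```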


Do all speakers respond alike? No. There are regional, dialectal, and idiolectal differences. Some of these may involve different settings of what Chomskyans regard as certain free parameters (see §2.2.7); some may just be local speech habits unreflective of the underlying grammar (see §3.2). But notice that, even if the verdicts on WhyNots diverged more in the population than linguists presume, still there would be the theoretically interesting question of explaining why so many people nevertheless tend to concur about so many of them. Presumably it is not some (very) odd conspiracy! In any case, data for some languages are sometimes supplied by the one surviving speaker of it, as in many Native American languages.

Simply for expository convenience, we will largely discuss here examples from (what is roughly called) "Standard American English."12 Modulo parametric variation, most of them have close analogues in virtually all other languages. But even confining oneself to English, I think we will see that it is virtually impossible to imagine how even its constraints could have been acquired merely from experience. It is this that leads Chomskyans to think that they can be taken to be reflections of an innate "universal grammar."13

Of course, there could be many explanations of why someone has an Unacceptability Reaction to some sentence other than their linguistic competence. After all, for some people, it is unacceptable to swear; for Orthodox Jews to utter God's name, and for some stylistic sticklers, to split an infinitive or end a sentence with a preposition.14 In these cases there are perfectly obvious

12  We'll see in §3.3 that this initial appeal to what are colloquially called "languages," such as "English" or "Mandarin" is only provisional, and that the real concern is with the internal computational system underlying linguistic competence, what Chomsky calls an "I-language." The colloquial designations are simply a way of picking large sets of individuals whose I-languages are quite similar.

13  One often hears a surprisingly simple-minded objection to Chomsky's claims of a "universal" grammar that points to the obvious diversity of human languages, as though Chomskyans were unaware of it (see, e.g., Evans, 2014: 15, 64; Hayes, 2018: 178–81). As with any scientists, Chomskyans simply believe there are underlying structures shared across this diversity, structures that, like chemical structures, may not be evident on surface inspection, but are only revealed by careful analysis (see Baker, 2001, for proposed analyses of languages that exhibit the commonalities of languages, such as English and Algonquin, that are strikingly dissimilar on the surface).

14  Neither of which are anything but stylistic opinions. Note that everyone who attends "grammar school" already speaks perfectly well the spoken grammar of their "first language(s)" or dialects. What is taught in grammar school is the "prestige" dialect ("proper English"), as well as conventions of writing, which is a quite separate issue from speaking: most of the world's speaking population is illiterate, and, unlike speaking, reading and writing are not acquired "automatically" and must be laboriously taught and learned. It is for these reasons that written languages are not strictly in the purview of a Chomskyan linguistics.
Indeed, what most interests Chomskyans is a speaker's first, spoken "native" language(s). Some of the data I cite in this book may not be vivid to non-native speakers of English; if so, they should try translating them, if possible, into their own native tongue(s).


explanations of the reaction, usually based on explicit conventions. What is interesting about the WhyNots and will concern us is that no such simple explanation seems remotely available. Despite the fact that we can make "perfect sense" of them, WhyNots are not violations of sensible behavior or any explicit "grammar school" conventions, such as not splitting an infinitive. Indeed, what is distinctive about the WhyNots we will consider is that, unless one has studied linguistics, one would never have heard or been remotely tempted to utter them, much less been able to say, if they happened to be uttered, what rule may have been violated. Virtually no ordinary, non-technical grammar or style book would even begin to note them. And when one does notice them, it usually seems peculiarly difficult to imagine adopting a convention that allowed one to utter them: unlike, say, the conventions of saying "Hello" and "Goodbye" or calling cats "cats" or "chats," it would seem almost impossible to follow a convention of uttering many of the WhyNots without considerable effort.

This is not to say that all of the WhyNots involve, strictly speaking, grammatical errors. Some errors could be semantic (a matter of meaning) and/or pragmatic (a matter of use), and their Unacceptability could be a matter of degree (some *-ed expressions may not be "as bad" as others). But even if there are semantic and pragmatic influences at work, what we will see is crucial about all of them is that they manifest a sensitivity to surprisingly abstract and idiosyncratic features and, especially, grammatical structures that our minds seem to impose upon speech, and which, as we'll see in Chapter 2, Chomskyans take it to be their task to describe and explain.

I want to set off particularly the syntactic WhyNots separately in this section in order to make clear how most of them can be appreciated entirely independently of Chomsky's core theory, and thereby provide crucial evidence for it.15 They should be thought of as closely analogous to the various optical phenomena that vision scientists routinely study, as revealing facts about the presumably largely innate structure of the visual system: illusions, ambiguities, and impossibilities such as the Penrose Triangle (Figure 1.1), which the visual system cannot seem to satisfactorily "parse," even though, as in the case of unparsable strings of words, we can make some "sense" of parts of the figure.

15  A common objection of especially Wittgensteinian and Quinean opponents of Chomsky (e.g., Burton Dreben in his 1970s seminars at Harvard) was that the data he provided were question-begging (see the Glossary for this use of the term). But theoretically neutral versions of most of the crucial data could be obtained simply by presenting subjects with "standardized tests" (like the SAT), with physically described stimuli and responses; see Rey (1997: chap 3) for discussion.



Figure 1.1  Penrose triangle

Indeed, notice that there could well be ungrammatical sentences that, for one reason or another, we might ordinarily find sufficiently acceptable, since their intended meaning might be sufficiently clear (we'll look at a number of such cases in §1.4.1), just as there might be optical illusions that we don't notice (e.g., "bent" sticks in water). Again, what are crucial are not our superficial ordinary judgments per se. What are crucial are judgments that are revealing about the underlying system of grammatical (or visual) competence that seems to structure our perception and understanding of the relevant parts of the world (cf. Chapter 7 below). Grammar can be seen as affording an even more striking instance than vision of the Kantian insight about the ways in which our minds are the source of much of the structure we take ourselves to be perceiving in the world.16

Turning to the "WhyNot" data, I will mention several kinds of cases. The un-starred sentences should sound fine, but a native speaker should find the starred ones unacceptable. The question is: why can't you say the second, given you can say the first—that is, why not the second? Indeed: (a) Why do we all tend overwhelmingly to agree about the Unacceptable cases? Why do we in fact virtually never even utter them? (b) How do children manage to acquire the competence to accord with the judgments of (a)?—especially when such weird grammatical errors in children are seldom, if ever, made, much less corrected. The WhyNots exhibit a robustness, again, like that of visual illusions and ambiguities, which it seems extremely unlikely was acquired on the basis of the statistics of experience.

16  Which is not to endorse Kant's idealism; cf. §9.1.2, fn7 below.
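To see why mere "statistics of experience" look unpromising here, consider what a simple distributional learner actually delivers. The following Python sketch—with an invented four-sentence "corpus" and a crude bigram score; nothing here is anyone's serious model of acquisition—illustrates the problem: to such a learner, an unattested but perfectly grammatical sentence and an unattested WhyNot are both just novel strings. The modal verdict that one of them cannot be said does not fall out of the counts:

```python
from collections import Counter

# An invented mini-corpus standing in for a child's finite experience.
CORPUS = [
    "who did mary see",
    "mary saw bill",
    "stories about caesar terrified mary",
    "who did the stories terrify",
]

def bigrams(sentence):
    words = ["<s>"] + sentence.split() + ["</s>"]
    return list(zip(words, words[1:]))

counts = Counter(bg for s in CORPUS for bg in bigrams(s))

def score(sentence):
    """Fraction of a sentence's bigrams that were ever attested."""
    bgs = bigrams(sentence)
    return sum(1 for bg in bgs if counts[bg]) / len(bgs)

# An unattested but fine sentence, and an unattested WhyNot (island violation):
print(score("who did bill see"))                    # novel, grammatical
print(score("who did stories about terrify mary"))  # novel, *ungrammatical*
# Both come out as similarly "novel" strings; mere attestation statistics
# assign no special status to the starred one. Something more than counting
# seems needed to project the can't-say-it verdict.
```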


For later reference, I will use the standard labels linguists employ for the different categories of example, and shortly indicate the underlying issue they reveal. I will first consider cases that seem purely syntactic (§§1.3.1–1.3.4), and then cases that seem to demonstrate a mix of syntactic, semantic, and perhaps pragmatic issues (§§1.3.5–1.3.7).

1.3.1  Purely Syntactic Cases

(i)  Island Constraints

In English and many other languages, certain words and phrases seem to be "moved" from a place in the sentence in which they would seem to "originally" occur to a different location. The most obvious examples of this involve the "movement" of "wh-" words (who, what, which, how) to the left front of a sentence from a place "in situ" where they might indicate certain missing information. Thus, from the sentence

(3)  Jim went to the movies with Bill last week.

One could form the echo question:

(3a)  Jim went to the movies with who last week?

And then one can "move" the who to the front:

(3b)  Who did Jim go to the movies with last week?17

Now, one might initially think this could be done for any such wh-word wherever it might originally occur in a sentence. But John Ross (1967/86) noted a surprisingly wide assortment of different sorts of phrases from which such movement was oddly barred: for example, from conjunctive phrases, from clauses within the subject phrase of a sentence, or from within many interrogative clauses in direct object position. He called them "islands" (underlining indicates the location of the queried gap):

17  Of course, such movements involve more than just moving the who, e.g., insertion of the ­auxiliary do with a corresponding change in the form of the verb.


(4)  Mary went with Bill and John to the movies last week.
OK: Mary went with Bill and who to the movies last week?
But not: *Who did Mary go with Bill and __ to the movies last week?

(5)  You think friends of Susan amused Bill.
OK: You think friends of who amused Bill?
but not: *Who do you think friends of __ amused Bill?

(6)  Susan asked why Sam was waiting for Fred.
OK: Susan asked why Sam was waiting for who?
but not: *Who did Susan ask why Sam was waiting for __?

Note that it is not mere distance between the wh- and the queried gap:

(7)  Bob thinks that Mary hopes that Fred will marry Ann.
OK: Bob thinks that Mary hopes that Fred will marry who?
but also: Who does Bob think that Mary hopes that Fred will marry __?

The problems with islands, as with all the WhyNots, seem to have something to do with the specific structure of a sentence, not with mere linear distance of items within it.

(ii)  Constraints on Contraction

Contractions of adjoining words, e.g., s/he is to s/he's, want to to wanna, can occur in certain places, but not others:

(8)  She is as sad as he is.
can become She's as sad as he is, but not: *She's as sad as he's.
(Note that some sentences can end in contractions: He won't.)

(9)  You have my book in the car
can become You've my book in the car

But

(10)  I wouldn't let you have my book in the car.
cannot become: *I wouldn't let you've my book in the car.


In American English, some nice colloquial examples are provided by the verbs want to, supposed to, and have to, which can sometimes, but not always, contract to wanna, sposta, and hafta in certain interrogative contexts (some of these examples may be difficult for non-natives, and even some natives, to hear):

(11)  You want to read Moby Dick.

can be queried by

(12)  You want to read what?

and the "wh" can be fronted:

(13)  What do you want to read?

and the want to then contracted to:

(14)  What do you wanna read?

But if you ask about the object of want and subject of the verb "to read":

(15)  Who do you want to read? (cf., I want Daddy to read; you want who to read?)

the result cannot be contracted to:

(16)  *Who do you wanna read?

The constraint is perhaps clearest with intransitive verbs, which lack a direct object, leaving wh- questions always about their subjects:

(17)  Who do you want to sing/die/sneeze/whistle?
cannot become *Who do you wanna sing/die/sneeze/whistle?

Note that

(18)  Who do you want to succeed?


is ambiguous, since one could reply to it with either:

(18a)  I want Obama to succeed (e.g., as president)
or
(18b)  I want to succeed Obama (i.e., come after him as president)

but

(19)  Who do you wanna succeed?

can only be understood as asking about (18b), not about (18a), which cannot serve as an answer.

(iii)  Ellipses

These involve deleting repetitious material, the stars indicating what cannot be omitted:

(20)  I know I should go home, but I don't want *(to)
(i.e., . . . but I don't want to is fine)

(21)  Abby knows five people who have dogs, but cats, she doesn't *(know five people who have).

The rule seems to be subtle, allowing (22a) but curiously disallowing (22b):

(22a)  The man stole the car after midnight, but not the diamonds.
(22b)  *They caught the man who'd stolen the car after searching for him, but not the diamonds.

See Merchant (2012) for discussion.

(iv)  Parasitic Gaps

These involve one or more unpronounced elements, one of which may be OK but only so long as the other one is there as well. In the following, the gaps (indicated by underlines) are where the direct objects of the verbs should appear, and one such direct object can be omitted, but only so long as the other is:

(23)  Which articles did you file ___ without reading ___?
OK: I filed Bill's articles without reading them.
But not: *I filed Bill's articles without reading ___.


That is, the second gap can't occur without the first.

(24)  Here's the woman that John married __ without meeting __.
OK: Here's the woman that John married __ without meeting her.
But not: *Here's the woman that John married her without meeting __.

Parasitic gaps are particularly interesting examples, since analyses of corpora of exchanges between adults and children suggest that they rarely, if ever, are presented in those exchanges (see Pearl and Sprouse 2013), so it appears no one learns such an odd constraint from examples.

1.3.2  Possibly Mixed Syntax, Semantics, or Pragmatic Cases

A number of cases are worth noticing for their apparently borderline status between pure syntax, semantics (or issues related to truth and reference), or pragmatics (issues related to the use of language in a context). Some of them tend not to be at the center of Chomskyan theories, partly because they do not all seem to be as "universal" across languages as purely syntactic cases, but mostly because, as we will see in §10.2, just what counts as "semantic" is still immensely controversial (is "Cats are animals" a truth about the meaning of the words or a substantive claim about biology?). Some semantic-seeming cases, however, appear sufficiently clear and much closer to many of the above syntactic phenomena than they do to worldly belief—indeed, they often display "syntactic reflexes," that is, effects on what seems to be syntactic structure—and so deserve to be included among these WhyNots: the same issue arises with respect to why the *-ed cases are unacceptable, despite agreement about all the relevant facts concerning the extra-linguistic world.

(i)  Binding Phenomena

These phenomena are analogous to the ways in which quantifiers in formal logic "bind" occurrences of variables within their scope (as when in logic "(∀x)" binds the second but not the third "x" in "(∀x)Fx & Gx"). In natural language, pronouns often function as variables and can be bound by nouns. Subscripts indicate that the occurrence of that pronoun is bound by the noun with that subscript. Intuitively, co-subscripted noun phrases ("NP"s) are standardly used with the intention to refer to the same thing (slashed indices, "i/j", indicate that the co-reference is left open):


(25)  John_i's mother likes him_i/j. (i.e., where him can = John)
(26)  *John_i likes him_i. (i.e., where him = John; him must be intended to be someone else)18
(27)  John_i believes he_i/j is intelligent. (he can be John or someone else)
(28)  *John_i believes him_i to be intelligent. (where him = John; him must be intended to be someone else)

One might think the rule here for pronouns is simply not to introduce a pronoun for a noun you introduce later: but this won't do, since we can say:

(29)  Although he_i/j lost, John_i hoped to win. (with he = John)
(30)  When he_i/j yelled, John_i finally hit the target. (with he = John)

Reflexive pronouns work differently:19

(31)  John_i likes himself_i. (i.e., where himself = John)
(32)  *John_i thinks Paul likes himself_i. (i.e., where himself = John; himself must be Paul)

One might think that the issue had merely to do with conversational focus. But consider:

(33)  *John_i was always concerned with himself_i. He always talked about himself_i, would constantly praise himself_i in public, and earnestly hoped the boss liked himself_i.

Despite the clear semantic and/or pragmatic focus on self-centered John, it is almost impossible to hear the last himself as referring to him. Some cases are subtle (note it is the index, i, of herself in (33b) that gets *-ed, unlike the one in (33a)).

18  Note that the issue throughout these examples is not one of actual (co-)reference, since, unbeknownst to the speaker, two terms could co-refer. So John likes him would be OK when the speaker mistakenly thought "him" was different from John (she didn't know John was the man she met last night). Hence the statement here in terms of "co-indices."

19  Chomskyan linguistics departs from ordinary usage in calling all and only reflexive pronouns, such as himself or myself, "anaphors."


(33a)  Mary_i wondered which picture of herself_i/j Jane_j saw. (i.e., where herself can be either Mary or Jane)
(33b)  Mary_i wondered how proud of herself_*i/j Jane_j was. (i.e., where herself can't be Mary)

We will return to this example in considering children's exposure to language in §5.2.

(ii)  (Negative/Positive) Polarity Items (NPIs/PPIs)

It is not easy to characterize polarity items. The negative ones, the NPIs, are the most salient and widely discussed. The rough rule that has emerged in many discussions is that they must be "licensed" by expressions that are (intuitively) "negative," "downward entailing" (as when "doubt anyone drinks" entails "doubt Bill drinks"), non-"factive" (Y(p) doesn't entail p), or non-"existentially committing" (Y(x) doesn't entail there exists an x, Y(x)):20

Examples of NPIs are: any, ever, at all, yet, in weeks, in ages, in the longest time, last long. Examples of licensors are: deny, doubt, is surprised that, interrogatives and imperatives. In the following, underlined words are NPIs, double-underlined are licensors:

(34)  I doubt he has ever spoken French vs. *I know he has ever spoken French.
(35)  I haven't seen him in weeks vs. *I have seen him in weeks.
(36)  Do you have any wool? vs. *I have any wool?

By contrast, Positive Polarity Items, PPIs, are excluded from "negative" contexts:

20  Licensing will be defined in §2.3.2. A constituent, C, is existentially committing iff "C[Fa]" entails "(∃x)Fx", and is factive iff "C[p]" entails "p". Thus, "John knows that Bill is sad" entails both "(∃x) x = Bill" and "Bill is sad" (some linguists use "veridical" instead of the philosophers' "factive," whereas philosophers use "veridical" just for "true"; I'm keeping to the philosophers' use, which, unlike the linguists', is unambiguous). Interrogatives are neither: "Does some/any angel like soup?" doesn't entail either that someone likes soup or that there are angels (or soup). Various other criteria have been proposed; so far, none is completely satisfactory. There is considerable discussion about whether these phenomena form a single class, and whether they are syntactic, semantic, or pragmatic (see §2.2.5, fn37, and M. Israel, 2011, for discussion; some of my examples are taken from the latter).
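Since "downward entailing" carries much of the weight in this rough rule, it may help to see the notion made fully explicit. The following sketch is merely illustrative (mine, not anything from the linguistics literature's formal apparatus): it treats predicate extensions as finite sets and checks, by brute force over a toy domain, that no is downward entailing in its scope position while some is not:

```python
from itertools import chain, combinations

# A toy extensional model: predicate denotations are subsets of this domain.
domain = {"Bill", "Sue", "Tom"}

def subsets(s):
    s = list(s)
    return [set(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

def no(a, b):
    return a.isdisjoint(b)        # "No A are B"

def some(a, b):
    return not a.isdisjoint(b)    # "Some A are B"

def downward_entailing(q):
    """True iff, in every model over the toy domain, whenever b2 is a
    subset of b1, q(a, b1) entails q(a, b2)."""
    return all((not q(a, b1)) or q(a, b2)
               for a in subsets(domain)
               for b1 in subsets(domain)
               for b2 in subsets(domain)
               if b2 <= b1)

print(downward_entailing(no))    # True:  "No one drinks" entails "No one drinks wine"
print(downward_entailing(some))  # False: "Someone drinks" needn't mean "Someone drinks wine"
```

The check mirrors the pattern in the text: a downward-entailing environment licenses inferences from a set to its subsets (e.g., from drinking anything at all to drinking wine), which is just what the "doubt anyone drinks" / "doubt Bill drinks" entailment exploits.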


(37)  *She hasn't already won the race vs. She already won the race.
(38a)  *The committee wasn't sort of/rather hard on poor Gladys vs.
(38b)  The committee was sort of/rather hard on poor Gladys.21

(iii)  Structural Constraints on Meaning

A given string of words such as

(39)  John is easy to please.
(40)  John is eager to please.

can be superficially extremely similar (differing here by only one word). But (39) can be paraphrased as

(41)  It is easy to please John

but (40) cannot be paraphrased as

(42)  *It is eager to please John.

Some kind of structural facts about how we understand easy vs. eager prevent this. For starters, we understand John as the "subject" of eager to please, but as the "object" of easy to please. Similarly,

(43a)  The millionaire called the senator from Texas

can mean

(43b)  The millionaire called the senator, and the senator is from Texas

or

(43c)  The millionaire called the senator, and the call was from Texas,

but not

21  Note that this, like many PPI examples, has a possible meta-linguistic negation reading: The committee wasn't "sort of hard" on Gladys; it was bloody draconian!, where the PPI is (partly) mentioned, not used (hence the quotes).


(43d)  $The millionaire called the senator, and the millionaire is from Texas.22

A prepositional phrase cannot attach to just any phrase in a sentence.

1.4  Performance Issues

One way to appreciate the special significance of the above WhyNots, and of the importance of the competence/performance distinction, is to consider, by contrast, cases where, arguably, the data do not reflect facts about underlying competence, but rather facts about other systems responsible for performance. These include not only simple slips of the tongue, but some more systematic discrepancies between an underlying grammar and ordinary speech, where what is grammatical may sometimes be unacceptable, and what is acceptable may sometimes be ungrammatical:

1.4.1  Grammatical but Unacceptable

Some perfectly grammatical strings will virtually never be uttered, for example, a million conjunctions of Snow is white, or patent absurdities, for example, Twelve times five is a trillion and four, which many speakers might find "unacceptable," but not because they are ungrammatical. More interesting examples are afforded by "center embeddings" of relative clauses. Most people would probably off-hand reject:

(CE)  The man the girl the cat scratched loved died.

But this turns out, with a little reflection, to be in fact grammatical. Start with "The man died." Ask "Which man?" "The man the girl loved." Ask "Which girl?" "The girl the cat scratched." Put them together, inserting commas and "that"s, and we get: The man that the girl, that the cat scratched, loved died. Deleting the commas and "that"s: The man the girl the cat scratched loved died.

22  The example is from Pietroski (2005). I use "$" to indicate non-paraphrase where he uses "#," since this latter is used by many linguists to indicate semantic anomaly.


Some of you will still have trouble with this (I do!), but generativists reasonably argue that excluding it from the grammar would introduce a wholly arbitrary constraint into the grammatical rules, excluding embeddings of relative clauses that are otherwise fine, and permitted in the case of indefinite "right" and "left" embeddings of them:

(RE)  The cat that scratched the girl that loved the man . . . suddenly died.
(LE)  This is the cat that scratched the girl that loved the man . . . who suddenly died.

where indefinitely more nested clauses can be inserted in the ellipses, as most children will gleefully observe. Note that, rather than wrapping their mind around (CE), many speakers would feel no trouble with its genuinely ungrammatical cousin:

(XCE)  *The man the girl the cat scratched died.

(where either The man or the girl is one noun phrase too many).23 This last example brings us to the converse category.

1.4.2  Acceptable but not Grammatical

(a) People quite often speak elliptically, omitting grammatically essential elements, or make simple slips of the tongue, which can go unnoticed since the intended message is easily inferred, for example:24

*Want some coffee?
*She finish her thesis?
*Raining yet?
*And what he said?
*Turns out he's a thief.

23  Example from J. D. Fodor et al. (2017), who have studied the ways in which the right prosody can facilitate (mis-)understanding of center embeddings.

24  The last examples are from Fromkin (1980), who discusses many more, as does Roberge (1990). Some proposals, e.g., Chomsky (1965), treat the occasional deletions as part of the grammar, but this is controversial. Note that there are many non-sentential phenomena to be considered: the grammar may generate mere bare phrases like On the top shelf, John's dad, that can be used in directives or answers to questions, and need involve no deletions; and there are special forms of speech where deletions are routinely allowed, as in diaries (went to see Dad; then shopping) and headlines (Storm Gives Jolt to Lumber Market! One-man show also gives a nod to late dramatist), where the deletions are plausibly made to sentences after they have been generated by a grammar. These various sorts of deletions will be important to remember in considering the difficulties of accounting for acquisition that we'll discuss in Chapter 5. (Thanks to Nick Allott for a tutorial on the many issues here.)


*I know where is a top for it.
*Why do you be an oaf sometimes?

(b) A quite common source of speech errors involves subject–verb agreement, especially if some local noun has a different number property than the main subject of the sentence, as in *The bridge to the islands close at seven (see Vigliocco et al. 1996).

(c) Grammatical illusions: Unlike in the case of vision, genuine grammatical illusions (which persist even after you know better) are rare. In addition to the above fairly obvious example of (XCE), there is the more peculiar example of the "Comparative Illusion":

(IL1)  *More people have been to Europe than I have.

The purported sentence can seem to be an ellipsis—but try to complete it:

(IL2)  *More people have been to Europe than I have been to Europe.

What some people seem to do is simply interpret the sentence as the quite different, grammatical (but still unusual):

(IL3)  More people than I (alone) have been to Europe.

Whatever the re-interpretation, it does seem that people are sometimes content with a shallow, but ungrammatical parse.

(d) Lastly, although the phenomena are not strictly ungrammatical, there are curious semantic oddities in the sentences English speakers will accept. For example:

(IL4)  John married his widow's sister.

But if John has a widow, then he is dead and unable to marry. Similarly:

(IL5)  Bill's grandfather died childless.

(But then how could Bill exist?) And people seem to have systematic problems with nested negations:

(IL6)  No head injury is too trivial to be ignored


is frequently taken to mean

(IL7)  No head injury is too trivial to be treated.

Presumably, the opposite is intended by the following headline from the Los Angeles Times:

(IL8)

South Korea’s obsession with speed-skating isn’t hard to miss.25

All the above examples point to a double dissociation between competence and performance. Again, the deep moral is the Galilean one: systematic theories are likely to be found only in considerable abstraction from the plethora of interacting systems that are in play in ordinary speech.

1.5  Further Evidence

In addition to the WhyNots, there is other evidence to which generative linguists appeal. This evidence is not quite as obvious and universally accepted as the WhyNots, but it is worth taking particularly seriously in conjunction with them.

1.5.1  Productivity

There is no upper bound on the length of a natural language sentence: for any purported longest sentence, σ, there exists a longer one, for example, "Mary thinks that σ," etc.
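The point is trivially mechanized; a minimal sketch (the embedding frame is the text's, the code itself mine):

```python
def longer(sentence: str) -> str:
    # For any purported longest sentence s, "Mary thinks that s" is longer.
    return "Mary thinks that " + sentence

s = "John laughs"
for _ in range(3):
    print(s)
    s = longer(s)
# John laughs
# Mary thinks that John laughs
# Mary thinks that Mary thinks that John laughs
```

Since each iteration strictly lengthens the sentence, no finite sentence can be the longest.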

1.5.2  Creativity

Apart from obvious formulaic or deliberately repeated speech (e.g., greetings, reading aloud), the vast majority of the sentences we read or utter are novel: we have never produced or read them before—else why bother gossiping, or reading books and newspapers? When you think about it, this is quite a striking fact, not obviously observable in the behavior of non-human animals. Chomsky also includes under this rubric the facts that the standard use of


language "is unbounded, stimulus free, appropriate to situations, coherent, and evoke[s] appropriate thoughts in its audience" (1986: 234).26

25  This and other negation errors can be found at http://languagelog.ldc.upenn.edu/nll/?cat=273. Woodward et al. (2018) discuss some of them and the Comparative Illusion at length.

1.5.3  Relations between Forms

Speakers not only appreciate correct forms, but some of the felt relations between them, as in the case of interrogatives and their corresponding declaratives, for example, Did John wash the dishes? and John washed the dishes; Will John wash the dishes? and John will wash the dishes.27 And they also learn that superficially very similar sentences may in fact have radically different structure, as in the pair we already mentioned:

(39)  John is easy to please.
(40)  John is eager to please.

and an elegant pair from Higginbotham (1983b):

(44a)  I heard Pat sang. (cf. I heard that Pat sang.)
vs.
(44b)  I heard Pat sing. (cf. *I heard that Pat sing.)

1.5.4  Constrained Homophony (or "Ambiguity")

Some expressions are n ways, but not m>n ways, ambiguous.28 Berwick et al. (2011) point out that (45a) can be used to express (45b), but not, homophonously, (45c):

26  He adds to this obvious characterization an entirely non-obvious, controversial philosophical claim:

Or if your behavior reflects understanding and the exercise of will and choice distinct from mechanical response, then I attribute to your possession of mind, a power that exceeds the bounds of Cartesian contact mechanics.  (Chomsky, 1986: 234)

He discusses this issue so frequently, it would be easy to think it's an essential part of his theory, or even what he takes the data for it to be. As we'll see, however, other than the fact that we utter constantly novel utterances, issues of how we use language, and whether it involves "(free) will" and choice, which may or may not be compatible with "contact mechanics," are, in fact, entirely inessential to his core theory and to the crucial data for it. We'll return to this topic very briefly at the very end of the book (§11.3.2).

27  See Berwick et al. (2011) and Lasnik and Lohndal (2013: 31) for recent discussion.

28  Chomskyans tend to prefer "homonymy," since they tend not to regard words or sentences as having more than one meaning, but rather only their phonology, or (roughly speaking) their pronunciation. Thus, the two meanings associated with the English phonological form /pen/—writing implement, enclosure—are just attached to different lexical items.


(45a)  Can eagles that fly eat?
(45b)  [Can [[eagles that __ fly] eat]] (i.e., questioning Eagles that can fly eat)
(45c)  *[Can [[eagles that fly] [__ eat]]] (i.e., questioning Eagles that fly can eat)

Why not?

1.5.5  Stability of Acquisition

Substantially the same specific grammars are acquired by virtually all speakers in their ambient community, despite considerable variation in the evidence to which they are exposed and a difference in the amount and character of their exchanges with adults (see Hoff, 2009). Without an innate UG, this would appear to be a monumental coincidence.

1.5.6  Speed of Stable Acquisition

Children's utterances respect the rules of UG almost as soon as they begin uttering sentences of the relevant type, producing few if any WhyNots (see Pearl and Sprouse, 2013; Pearl, forthcoming). Where they make syntactic constructions that differ from the ambient language, they are almost always constructions allowed in other UG-permitted languages;29 and, in any case, all children converge on what is very nearly the ambient grammar within three–four years. They do make frequent morpho-phonological errors,30 getting the various forms of the "irregular" verbs wrong, for example, *goed vs. went, an issue about which adults can also be confused: learned vs. learnt, forsaked vs. forsook (see Pinker, 1999: 275–9 for discussion).

29  Some American children go through a stage at which they utter things like *What do you think what the pigs eat?, a construction (called "medial wh-insertion") that they've presumably never heard locally, but is allowed in German and Romani (see Crain, 2012: 65).

30  See "Linguistic Categories" in the Glossary for the differences between phonetics, phonology, morphology, syntax, and semantics. Morpho-phonology is the study of how combining morphemes affects their pronunciation.



1.5.7  Poverty and Corruption of Stimulus Data

In the short period of first-language acquisition, children acquire the ambient grammar despite hearing only a small subset of the potential infinitude of possible sentences. Moreover, the utterances are often "corrupt," as in cases of false starts, re-starts, verbal slips, and deletions of conventionally understood elements, e.g., *Love you! *Finished? *Turned out she never showed up, all of which lack subjects that are grammatically required in English. (We'll discuss the poverty of the stimulus at length in §5.2.)

1.5.8  No Negative Data

Adult corrections of children are rarely of grammar. Brown and Hanlon (1970) showed that very young children tend to be "reinforced" not for their grammar but rather for the informational content of what they say. Thus, a typical exchange of a mother and a child might proceed:

Child: Momma isn't boy, he a girl.
Mother: That's right.
Child: And Walt Disney comes on Tuesday.
Mother: No, he does not.  (quoted in Pinker, 1994: 285)

And, of course, children are routinely corrected for merely socially inappropriate speech, for example, swearing, or engaging in baby-babble as they mature (we will discuss such data further in §5.2). In any case, it would be astonishing if all children across a language group were subject to even approximately the same amount of grammatical correction; yet they all seem to acquire the same ambient grammars. Such data present a serious challenge not only to claims that grammar is learned along the lines of, for example, conventional table manners, but to the claim that it is conventional at all (see Chapter 5 for further examples and discussion).

1.5.9  Independence of General Intelligence

A grasp of basic syntax seems not to depend on general intelligence in problem solving: children with high scores on IQ tests don't seem to learn basic syntax more quickly than children with average scores (Valian, 2014: 86). Indeed,


even children with severe cognitive limitations seem to acquire basic competence. Studying "Laura," a young woman with a testable IQ of 40, Yamada (1990) found that she had a large vocabulary that she used in appropriate ways, though apparently without much understanding. More dramatically, Neil Smith and Ian Tsimpli (1995) studied at length a "savant" with an IQ between 60 and 70 who nevertheless acquired competence in fifteen languages!31

1.5.10  A "Critical Period"

Interestingly, there are also cases of the converse dissociation: intelligence without grammatical competence. These are cases that also seem to show that there is apparently a "critical period" for language acquisition.32 There is the quite tragic case of "Genie," who was treated horrifically as a child, locked in a closet without exposure to language until she was about fourteen. After social workers discovered her, she was studied by linguists for more than a decade, and even though she rapidly came to display a good understanding of vocabulary items, "even after eight years of exposure and attempts at acquisition, her speech remained largely agrammatic" (Curtiss, 1988: 97). Less horrific is the case of "Chelsea," a woman whose deafness was not discovered until she was in her twenties. Despite average health and intelligence, and even after several years of studying language, she was unable to master grammar and spoke in a way that was similar to someone with very severe aphasia, saying things like "The small a the hat," "The woman is bus the going," and "The girl is cone the ice cream shopping buying the man" (Curtiss, 1988: 99).

1.5.11  Universality of Grammatical Principles and Parameters

This, of course, is one of the bold conjectures of the theory; one that is being continuously investigated over a wider and wider selection of languages, but has been denied by many. However, it is one of the more exasperating errors of too many of Chomsky's critics (e.g., N. Evans and Levinson, 2009, and V. Evans, 2014)

31  Pinker (1994: 41–3) and others have also called attention to what seems to be an even more extreme, analogous phenomenon of Williams syndrome children, who, though cognitively limited (average IQ of 55), sometimes seem to display an exceptional grasp of language, producing some fairly elaborate discourse. Further research has raised skepticism about such abilities (see, e.g., Brock, 2007).

32  This hypothesis was first advanced by Lenneberg (1967). Cowie (2008) correctly notes that the critical period is not sharp in normal humans. But all that is important here is that there is some significant difference in language acquisition that is not part of any general difference in cognitive processing at the time.


that they neglect to look beyond surface appearances in assessing claims of universality. As in any science, commonalities between phenomena are established by the appropriate analysis of the surface phenomena. Ice and steam seem very different; their commonality is only established by chemical analysis. Pace Chater (2018), quoted in the preface, one cannot simply rush to conclusions about "Chomsky's programme of generative grammar [having] foundered" without analyzing more deeply what may seem only on the surface like "inconsistent regularities, sub-regularities and outright exceptions" (2018: 32).

Comparative syntactic analysis between languages has been at the heart of the generative approach, particularly since Richard Kayne's (1975) and Rizzi's (1982) work on French and Italian (for overviews, see Roberts, 1997, Baker, 2001, and Cinque and Kayne, 2005). Chomskyan generativists have extensively studied not only many of the European Germanic, Romance, Slavic, and Celtic languages, but also many native languages of Africa, the Americas, and Australia; Asian languages from several families; and Uralic and Austronesian languages.33 This work led directly to what we will see (§2.2.7) was one of the key developments in generative grammar, the "Principles and Parameters" approach, which distinguishes the constant vs. variable properties of different languages.

There was a widespread view in the popular press a few years ago that Everett's (2012) work on the Pirahã language of a Brazilian tribe had presented a counterexample to Chomskyan claims of the universality of UG principles, since the language appeared to be non-recursive (see §2.2.1, fn15). Subsequent analysis has cast doubt on the claim (see Nevins et al., 2009, and Smith and Allott, 2016: 191ff for discussion). But, in any case, note that a single spoken communication system does not actually count against a Chomskyan hypothesis of universality, since it is compatible with the hypothesis that, for some reason or another, a particular tribe did not utilize their innate UG in communicating. After all, there are tribes that don't eat pork despite their innate ability to digest it. As Chomsky quipped in reply to Everett's claims, "Can they learn Portuguese?" (cf., can Moslem neonates digest pork?). We will return to this point in discussing Chomsky's distinction between E- and I-languages in §3.2. Of course, UG might become a less interesting hypothesis if it turned out that the principles applied to only a minority of (what we pre-theoretically call) languages. But the evidence so far is that this is simply not the case.34

33  For overviews, see Roberts (1997), Baker (2001), and Cinque and Kayne (2005). Note also Chomsky's (1951/79); this—his own first generativist work—was his master's thesis. It was not on the English that dominated his own later work, but on Hebrew.

34  Conceivably—say, if all but a few of us died off—grammar could turn out to be as rare as the prime factorization abilities of idiot savants, which are still a challenge for psychology to explain, even if, in both cases, the rarity might make the challenge less pressing.


In any event, it is certainly rash to make a cavalier claim merely on the basis of Everett's work that "Linguistic universals, long the darlings of theorists, took a drubbing as one by one they fell to the disconfirming data of field linguists" (P.S. Churchland, 2013: xiii).

1.5.12  Spontaneous Languages of the Deaf

A perhaps surprising example of the extent of the universality of UG is afforded by the sign languages of the deaf. During the civil war in Nicaragua in the 1980s, a number of children born deaf were not exposed to any spoken or even standardly signed language. Fending for themselves, they began to create their own sign language, and it appears to abide by many of the properties of UG (see Neidle et al., 1999 and, more generally on universals in sign languages, Sandler, 2010).

1.5.13  Absence of Logically Simple Languages

One example of a simple linguistic universal is that no natural language constraint involves counting words, even though it is perfectly easy for adults to figure out an artificial code in which one forms, for example, a negative sentence merely by putting "-na" after every third word. Smith and Tsimpli (1995) demonstrated that adolescents, at least, were unable to learn this rule if it was presented as a linguistic rule of a natural language, but had no trouble with it as the rule of some arbitrary code; and Musso et al. (2003) found that such a non-structural, linear rule would not activate the areas of the brain activated by natural language (cited in Chomsky, 2016a: 11).
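To underscore how computationally trivial the excluded rule is, here is the "negation by -na after every third word" code stated as a program (the rule itself is from the studies cited above; the implementation details are, of course, mine):

```python
def negate_linear(sentence: str) -> str:
    """A structure-blind, purely linear 'negation' rule: suffix '-na'
    to every third word, counting from the start of the sentence."""
    return " ".join(word + "-na" if i % 3 == 2 else word
                    for i, word in enumerate(sentence.split()))

print(negate_linear("the cat that scratched the girl suddenly died"))
# the cat that-na scratched the girl-na suddenly died
```

Nothing could be easier to state or to compute; what is striking is that rules of this counting sort are never found in natural languages, whose constraints are instead structure-dependent.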

1.5.14  "The Linguists' Paradox"

Ray Jackendoff (1993) wondered how it is that, after more than sixty years of concerted, meticulous study, correction, and comparisons between languages, even the most brilliant linguists have yet to come up with the complete rule system effortlessly mastered by every child in 3–4 years! So, again, the question to ask is: what could possibly begin to explain all this data?


2  The Basics of Generative Grammars

2.1  Philosophical Beginnings

Philosophy was there from the start.1 Chomsky had originally enrolled as an undergraduate at the University of Pennsylvania, where he studied with the mathematical linguist, Zellig Harris. At Harris' suggestion, he also began attending a seminar by the philosopher, Nelson Goodman, on the manuscript that would become The Structure of Appearance (Goodman, 1951).2 Goodman was employing the then still fairly novel techniques of formal definition and "logical construction" that had been developed by Frege, Russell, and Carnap—indeed, his book was essentially a re-working of the latter's Logical Construction of the World (the "Aufbau")—in order to provide a rational reconstruction of the structure of phenomenal experience from a spare primitive, sensory base. His language consisted of a small number of basic predicates for a small set of properties of experience, and a set of logical operations for combining them. Although Chomsky was likely already inclined to such an approach, the elegance and formal precision of both Harris' and Goodman's work became and has remained an ideal in his work, right up through (indeed, especially in) his most recent "Minimalist Program" (see Chomsky and McGilvray, 2012: 87 and §2.2.9 below).3

Chomsky was particularly concerned to capture what he took to be the rich structure of natural language, for which he found the descriptive methods of the then prevailing traditions of structuralism and behaviorism in linguistics entirely inadequate. The logical techniques he learned from Harris and Goodman seemed to him far more promising, particularly the formal

1  I am indebted to the many linguists I cite in the acknowledgments for invaluable corrections and advice on this chapter, but especially to Dan Blair, John Collins, and Howard Lasnik.

2  As a freshman in 1945, at Harris' suggestion, Chomsky began sitting in on Goodman's seminar and was apparently immediately so adept that Goodman allowed him to enroll (see Chomsky and McGilvray, 2012: 86). They stayed quite close friends for the next dozen years. Chomsky, 1957: 5, acknowledges Goodman's influence, and cites his book (p. 14). I will discuss in due course (§3.3.1) their falling out.

3  Which is not to say that it is a sine qua non. Indeed, he does "not see any point in formalizing for the sake of formalizing. You can always do that" (Chomsky, 1983/2004: 101); see also Smith and Allott (2016: 232, 364, fn110).


techniques for setting out the "syntax" of a language (or the system of combining symbols in specific ways), independently of its "semantics" (or relation of the symbols to ideas or to phenomena in the world).4

It will be essential for many portions of later discussions to have at least a general feel for the character of Chomskyan theories, some grasp of the often opaque technical terms, and some understanding of some simple, representative explanations. What follows is a sketch of some of the main, well-known, and representative claims that are made, paring down to a minimum what the non-linguist reader will need to know for the rest of the book. Even so, readers may want to merely skim some portions until they later encounter terms or technicalities that need explaining, which could then be accessed via the index.

Indeed, a suggestion for reading this chapter: For discursive purposes, the developments of the theory are set out historically; but it may be easier to grasp those developments by going first through the sample explanations in §2.3, the general structures of which should be fairly easy to grasp, and then going back and learning some of the history. The examples will provide an intuitive basis for understanding many of the ideas that emerged in that history: for example, phrase structures, x-bar theory, and the role of structural relations like c-command and subjacency (but, again, one could also return to this material as needed for later discussions). More importantly, appreciating a sample of Chomskyan explanations provides the benefit of making vivid what strikes me as one of the most significant contributions of the core theory, viz., the degree to which it vindicates a Kantian theme, showing how our perception and understanding, at least of language, is determined by fairly abstract structures that are not themselves the immediate objects of sensory experience, in the way a sound or a smell might be.5

2.2  Stages of the Core Theory

There are, roughly, five main stages in the development of Chomsky's technical syntactic proposals:6

4  Of course, this distinction had proved especially important for the spectacular work in the 1930s in the logic and meta-mathematics of Gödel, Tarski, Turing, and Emil Post (on whose work Chomsky specifically relied). As Chomsky (1959/64) also rightly pointed out, some of his own criticisms about behaviorism's difficulties with the hierarchical structures generated by language were anticipated by the psychologist, Karl Lashley (1948/51).

5  Or, as I have put it elsewhere (Rey 2013), there is more to phenomenology than phenomenality.

6  Here I roughly follow Collins (2008b: 5). See Newmeyer (1986) for a book-length history up to GB/P&P theory of the early 1980s, and then Freidin (2012) for the early Minimalism of the 1990s and early 2000s. Smith and Allott (2016: chap 2) is an immensely readable history up to 2015. Freidin and Vergnaud (2001), and Lasnik and Lohndal (2013) provide rich, article-length treatments.


(I)  Logical Construction: LSLT and Syntactic Structures (ca. 1950–57)
(II)  Explanatory Adequacy: the Standard Aspects Model (1958–64)
(III)  Constraining the Rules: the Extended Standard Theory (EST) (ca. 1965–80)
(IV)  Government and Binding (GB)/Principles and Parameters (P&P) (ca. 1980–95)
(V)  The Minimalist Program and Phases (MP) (ca. 1995–present)

Some may think that such rapid changes in theory indicate a disheveled science that hasn't yet found its footing, or that Chomskyans have abandoned their initial project, as Searle (2002a,b,c) has insisted they "quietly" have done. In reply to Searle, Chomsky (2002b) rightly noted:

Since the 1950s, proposals have been revised and improved as more has been learned, though not "quietly"; rather as explicitly and loudly as possible. The reason is that these successive steps (still continuing) have been regarded as progress toward the original goal: to show that phenomena that appeared to require rules of great intricacy and diversity in fact followed from the interaction of far simpler rules that might be true invariants, holding for many constructions in typologically distinct languages. The long-term goal has been, and remains, to show that contrary to appearances, human languages are basically cast to the same mold, that they are instantiations of the same fixed biological endowment, and that they "grow in the mind" much like other biological systems, triggered and shaped by experience, but only in restricted ways.  (Chomsky, 2002b: on-line)

That is, as I hope will emerge in this chapter, the later developments do not so much replace but largely include earlier ones: the sequence of proposals strives for a greater depth, comprehensiveness, and simplification of the technical apparatus, earlier proposals not being so much mistaken as simply insufficiently explanatory (which is not to deny that mistakes have been made).7 Indeed, as the titles of this chapter's sub-sections should make plain, each of

7  There is an unfortunate way of speaking among many scientists of saying "There are no Xs" to mean merely that the category of Xs is not a theoretically interesting one. As I will argue in §9.1.2, this confuses the "ontology" of a theory (what must exist for it to be true) with its "ideology" (or the store of its predicates). There may no longer be distinctive rules about phrase structures that are part of more recent theories, but those very structures may nonetheless be real for all that (cf. Lasnik et al., 2005: 34).


the proposals are directed at specific explanatory questions: stage (I), the constructivism of LSLT, was aimed at merely setting out the logical structure of a linguistic theory; stage (II) aimed to provide a "psychologically real," "descriptively adequate," and ultimately "explanatorily adequate" set of rules that generated the structural descriptions of a natural language in a fashion that made it possible to explain how children could acquire them;8 stage (III) aimed to constrain an initial over-proliferation of rules; stage (IV) sought to provide a more abstract systematization of the rules by an appeal to (crucially: unrepresented) principles that were both more economical and more plausibly at work in a child; and stage (V) aimed to consolidate the entire theory in a way that is conceptually elegant, and computationally and evolutionarily plausible. At worst, later theories simply treat the claims of earlier theories as "epiphenomenal," generalizations that are approximately true, but not in the end genuinely explanatory.

Whether or not one accepts the approach at the end of the day, it is hard to deny the explanatory richness of these theoretical stages, the continual stream of subtle data to which they have drawn attention, the striking peculiarities of language, and the economical ways in which they might be captured. Alternative approaches face the serious challenge of beginning to explain the often surprising data in a way that is close to being as theoretically satisfactory.

2.2.1  Logical Constructivism: LSLT and Syntactic Structures

Chomsky began to apply some of the ideas he had developed under the influence of Harris and Goodman in his senior thesis on The Morphophonemics of Modern Hebrew, which he submitted for his BA in 1949, and which he turned into his MA thesis in 1951. Goodman then recommended him for a junior fellowship at Harvard's Society of Fellows, where he worked intensively on a massive (573pp) ms., The Logical Structure of Linguistic Theory (LSLT). This was clearly influenced by Goodman's formal work, although it relied specifically on the work of the mathematician Paul Rosenbloom (1950), and the concatenation algebra of Quine (1940/81). Indeed, he had gone to Harvard partly out of interest in the work of Quine, who was working at that time on papers in the foundations of linguistics (see his 1953/61b and c).9 Despite Chomsky's

8  I put "psychologically real" in scare quotes since, as we shall see in §9.7, the expression suffers from a crucial ambiguity both in Chomsky's and other writings, between, roughly speaking, a causal and a mere phenomenological reading. For now and until Chapter 9, I will assume that what is usually intended is the causal reading, i.e., that the phenomena being so described are in some way part of actual causal processing in the mind/brain (albeit at what may be a very abstract level, cf. §3.3.4).

9  This relationship with Quine seems, however, to have come to naught. It is (to my mind) one of the most perplexing and perhaps consequential facts of the period that, though Quine and Chomsky must have encountered one another quite often (if only at the Society of Fellows' weekly dinners!), according to Chomsky (pc), Quine displayed no interest in the work on LSLT on which Chomsky was then engaged, despite that work being modeled on work Quine admired of his friends, Carnap and Goodman, and employing his very own concatenative algebra! What is particularly puzzling is that Quine's lack of interest could not have been due to LSLT contradicting either his behaviorist scruples or his scepticism about semantics: unlike Chomsky's later work, LSLT makes no claim to any kind of psychological reality, and Chomsky confines his attention only to syntax and morphophonology, explicitly deferring to Quine's skepticism about semantics (see Chomsky 1955/75: 85–97) and even acknowledging his influence in Chomsky (1957: 6). So it is a puzzle why Quine largely ignored his work. One cannot but wonder how the history of Anglo-American philosophy in the later twentieth century would have been different had Quine shown some interest.


many later, deep differences with Goodman and Quine, these connections with them are quite significant, not only for Chomsky's interest in formal methods of construction, but also as one source of his wariness about intentionality, a topic to which we'll return in Part III.

Chomsky was unable to get LSLT published at the time,10 and, aside from a few short articles, his best-known initial contribution to linguistics consisted of some lecture notes he prepared for a course he was teaching at MIT in 1956, which were published as Syntactic Structures (Chomsky, 1957). In them he discusses material he had elsewhere (1956) set out and which subsequently came to be called "the Chomsky hierarchy" of different logical strengths of grammars, and the computational power that would be needed to generate them.11 The text offers by no means a serious theory of a natural language, but is more an examination of what kind of theory would be required for one. The only real piece of natural language linguistics in it is what was then, and has remained, a celebrated proposal about the analysis of the intricate system of auxiliary verbs in English. But he did supply a number of important ideas that figured crucially in his and others' subsequent work. Indeed, if one looks at the work in linguistics, psychology, and philosophy in the 1940s and 1950s, in both America and Europe, one cannot but be awed by the revolutionary character of Chomsky's proposals.

10  It and various revisions were circulated by duplication in the late 1950s, and it wasn't published until 1975, and then mostly out of historical interest, since its formalism had been superseded by Chomsky's subsequent proposals (see his (1975a) introduction for the complex history). Anyone seriously interested in understanding the scope and logical impetus of Chomsky's approach should at least have a look at this really extraordinary work, particularly pp. 85–97 for its sensitivity to philosophical issues.
11  Since they are sometimes relevant philosophically: the Chomsky(–Schützenberger) hierarchy consists of decreasingly restrictive grammars, with type-3 grammars generating the regular languages (with rules involving a single non-terminal symbol on the left-hand side and a single terminal symbol on the right); type-2 grammars generating the context-free languages (with rules permitting inserting strings within strings independent of surrounding symbols); type-1 grammars generating the context-sensitive languages, where application of re-write rules depends upon surrounding symbols; and type-0 being any formalized grammar at all. Although there is wide agreement that natural languages need to be more restrictive than type-0, and less restrictive than type-3, there has been considerable controversy about whether they are type-1 or -2; see Matthews (1979), Pullum and Gazdar (1982), Bresnan (1983), and Shieber (1986) for discussion.
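The difference in expressive power can be made vivid with the standard textbook case from formal language theory (the example is that tradition's, not the book's): the language consisting of n as followed by exactly n bs is generable by the type-2 (context-free) rules S → a S b and S → ab, but by no type-3 (regular) grammar, since matching the bs to the as requires unbounded memory. A minimal sketch:

```python
def anbn(n: int) -> str:
    """Generate a^n b^n using the context-free rules S -> a S b | ab."""
    if n <= 1:
        return "ab"                     # S -> ab
    return "a" + anbn(n - 1) + "b"      # S -> a S b

print([anbn(n) for n in range(1, 4)])   # ['ab', 'aabb', 'aaabbb']
```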


The basic idea in LSLT and in Syntactic Structures, on which later work was partly based, was that a given syntactic structure should be thought of as a series of "phrase structures," for example, NPs (noun phrases), VPs (verb phrases) that were the "constituents" of sentences,12 whose content was provided in part by Phrase Structure (PS) "re-write" rules, by means of which the sentences of a natural language could be generated. Just to provide the reader with the minimal sense of the proposal, consider phrase structure rules for the following, tiny fragment of English involving just one name and an intransitive verb, Vi:13

(R1): the following rules:
S → NP ViP
NP → N
ViP → Vi
N → John
Vi → laughs

Using these phrase structure rules we could create the following structures, ending with the sentence (6):

(1)  S
(2)  NP ViP
(3)  N ViP
(4)  N Vi
(5)  John Vi
(6)  John laughs.

An important feature of these rules is that they involve not only "terminal" symbols—the actual lexical items such as John and laughs (which, for succinctness, I minimize)—but the "non-terminal" symbols, N, NP, VP, etc. that are the vehicles for generalization in the theory: the grammar can remain the same even if the terminal symbols vary, as, of course, they do as a speaker's vocabulary grows or diminishes, particularly adding "open class" items, such as nouns, verbs, and adjectives.14

12  He also employs the term "representation," without, however, any assumption of the (meta-)representations being of the phrases of a natural language, an issue to which we will return in detail in Chapter 8 below. Collins (2004) claims this use is due to Rosenbloom (1950), on which Chomsky relied in writing LSLT.

13  I adapt an example from Lasnik et al. (2000: 19), avoiding the over-generation that he notes and prevents later in other ways. The example also requires treating (in)transitivity as a kind of verb phrase, instead of as a feature, as it would customarily be.

14  As opposed to "closed class" morphemes, such as prepositions and inflectional particles, such as -ing and -ed. Unlike nouns and verbs, few closed class items are added or eliminated in a speaker's lifetime. Whilst is a rare case that has disappeared from American (but not UK) English.
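Since (R1) is just a rewriting system, the derivation (1)–(6) can be carried out mechanically. A minimal sketch (mine; rewriting the leftmost non-terminal first is an arbitrary choice, and this tiny fragment yields John laughs whatever the order):

```python
# The (R1) fragment, with each non-terminal mapped to its single expansion.
R1 = {
    "S":   ["NP", "ViP"],
    "NP":  ["N"],
    "ViP": ["Vi"],
    "N":   ["John"],
    "Vi":  ["laughs"],
}

def derive(symbols):
    """Print each line of the derivation, rewriting the leftmost
    non-terminal until only terminal (lexical) items remain."""
    while True:
        print(" ".join(symbols))
        remaining = [s for s in symbols if s in R1]
        if not remaining:
            return symbols
        i = symbols.index(remaining[0])
        symbols = symbols[:i] + R1[symbols[i]] + symbols[i + 1:]

derive(["S"])
# S
# NP ViP
# N ViP
# John ViP
# John Vi
# John laughs
```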


A particularly important kind of rule is the recursive rule,15 of the sort that had become standard fare in logic and mathematics in the 1920s and 1930s. Recursive rules allow an expression to be re-written in terms of an expression ultimately of its own kind. To take advantage of such rules, I will introduce a transitive verb, Vt, thinks, and two VtP rules to go with it:

Vt → thinks
(R2)  VtP → Vt S
(R3)  S → NP VtP

These rules now give rise to recursion, since rules (R1) and (R3) allow S to be re-written in two ways, one of which, in combination with (R2), lets S be rewritten in terms of S itself. (R3) and (R2) can in this way seem initially circular, but given that the earlier (R1) rules allow Ss to "bottom out" with sentence (6), composed of purely "terminal" elements, we can use them to start the application of (R2), and then, with those VtPs in hand, go on to produce further sentences using (R3), and so on, round and round. Together, the rules (R1) through (R3) make the apparent circle into, so to say, a recursive "spiral," allowing us to produce the following sentences:

(7)  S
(8)  John VtP
(9)  John Vt S
(10)  John thinks S
(11)  John thinks John laughs
(12)  John thinks John thinks John laughs.
. . . and so on.
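The recursive "spiral" is equally mechanical; a sketch (mine), with a depth parameter standing in for how many times (R3)/(R2) re-enter S before the (R1) rules bottom out:

```python
def sentence(depth: int) -> str:
    """Expand S: at depth 0, use the (R1) rules (S -> ... -> John laughs);
    otherwise use (R3) and (R2) (S -> NP VtP -> John thinks S)."""
    if depth == 0:
        return "John laughs"
    return "John thinks " + sentence(depth - 1)

for d in range(3):
    print(sentence(d))
# John laughs
# John thinks John laughs
# John thinks John thinks John laughs
```

Since depth can be any natural number, the five rules generate infinitely many sentences.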

15  The term "recursive" has come to be used in a variety of ways, not all of which are actually relevant to a generative grammar. In logic, "recursive" (strictly speaking, "recursively enumerable") sets are those that are specifiable by a mechanical "recursive procedure" whose output serves as further input. Thus, in arithmetic, the set of natural numbers is recursively enumerable, since, for any number n, if n is a number so is the result of adding 1 to it. Specifying precisely the kind of recursive enumeration that is essential to natural languages is a topic of continuing discussion; see Fitch (2010), van der Hulst (2010), Roeper and Speas (2014), and Watumull et al. (2014). Certainly, the recursive embedding of a phrase within a phrase of the same kind, which is stored in a read/write memory (as might be the case of "This is the mouse that . . . that lived in the house that Jack built"), is a striking condition, whose effortless mastery by young children cries out for explanation along recursive lines. See Smith and Allott (2016: 192ff) for further discussion. A related use of "recursive" that is not relevant to a grammar of natural language is the use to mean "decidable," where a function is decidable iff, for a given value, it can be determined in a finite number of mechanical steps whether or not that value is in the set defined by the function. It is a consequence of Gödel's celebrated 1931 incompleteness theorem that there are arithmetic functions which are not decidable, even though they define recursively enumerable sets. Many generativists likely hope that at least a certain core set of grammatical sentences of English is recursively enumerable (but see Collins, 2010a, for reasons to think even this demand is needless). But there is no particular reason to hope or expect that being a member of that set is decidable—there may well be strings of English words that for one reason or another could not be determined to be so in a finite number of mechanical steps.


A "generative" grammar is one that allows such recursion, thus "generating" a potential infinity of different sentences. As Chomsky (1996: 8), echoing von Humboldt (1836/2000), likes to stress, recursion allows us "the infinite use of finite means," a feature that seems distinctive of human language and thought (a point to which we will often return). It is, after all, a deep, distinctive fact of natural languages that there is no longest sentence: one can always make a longer one by using rules like (R2)–(R3).16

The structures produced by these rules can be represented by what have become the familiar "tree-structures" that now fill Chomskyan linguistic texts. Thus, (12) could be represented as:

S
├── NP
│   └── N
│       └── John
└── VP
    ├── V
    │   └── thinks
    └── S
        ├── NP
        │   └── N
        │       └── John
        └── VP
            ├── V
            │   └── thinks
            └── S
                ├── NP
                │   └── N
                │       └── John
                └── VP
                    └── V
                        └── laughs

These provide a perspicuous way of both exhibiting the structure of a phrase, and of defining permissible transformations of it.

16  Or, going beyond this tiny fragment, using logical connectives (and, not) and embedded clauses (the girl who kissed the boy who . . . who . . . . . .).
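Such trees are themselves naturally coded as recursive data structures. A sketch (mine, not anything in the book) that rebuilds the phrase marker for (12) and prints the equivalent labeled bracketing:

```python
def phrase_marker(depth: int):
    """The tree for 'John thinks ... John laughs' as nested
    (label, children) tuples; leaves have no children."""
    np = ("NP", [("N", [("John", [])])])
    if depth == 0:
        vp = ("VP", [("V", [("laughs", [])])])
    else:
        vp = ("VP", [("V", [("thinks", [])]), phrase_marker(depth - 1)])
    return ("S", [np, vp])

def bracket(node) -> str:
    label, children = node
    if not children:
        return label
    return "[" + label + " " + " ".join(bracket(c) for c in children) + "]"

print(bracket(phrase_marker(2)))
# [S [NP [N John]] [VP [V thinks] [S [NP [N John]] [VP [V thinks]
#   [S [NP [N John]] [VP [V laughs]]]]]]
```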


In addition to these rules, Syntactic Structures also included a set of operations or "transformations" upon these phrase markers that created more complex structures by recursively altering simpler structures in certain ways: "singulary transformations" applied to one phrase marker, and allowed, for example, for question formation (transforming "The cat is on the mat" to "Is the cat on the mat?" and/or "What is on the mat?"); "generalized transformations" applied to more than one phrase marker, and permitted the kind of embedding of VPs within VPs permitted by (R2)–(R3). Some transformations were optional (e.g., the formation of passive sentences from their active voice counterparts), while some were obligatory (e.g., "do support," or the insertion of a "do" auxiliary in forming an interrogative: Did John sleep?).

Chomsky introduced different "levels of representation," each distinguished from the others both by the kinds of information they contain (e.g., predicate argument structure, word order) and, especially, by the existence of rules applying to them. It is this latter issue that is crucially responsible for the proposal and "elimination" of various "representational levels" in subsequent theories (but, n.b., fn7 above).

For various reasons, in LSLT and Syntactic Structures Chomsky confined recursion only to transformations, not permitting recursion-inducing PS rules such as (R2)–(R3). This was soon revised in later work, and recursive phrase structure rules replaced generalized transformations. Indeed, a generative grammar might have no transformational rules17 (thus, "transformational grammars" are a sub-species of generative ones, which is why it is the latter, and not the former, term that characterizes Chomskyan approaches generally). But, of course, the underlying idea persists, that the surface sentences of a language may involve some or other kind of "change" in an underlying structure, as when we "displace" an expression by, say, topicalization (John, I trust) or, in forming a question, by moving a wh-word, such as what or who, to the beginning of the sentence (forming What do you want? from You want what?). Such "displacement" phenomena are characteristic of natural languages, giving rise to the felt relationships between sentences that generativists hope their theory will also capture, and which, as we will note in Chapter 5, not all rival theories can always do. Of course, as Chomsky knew very well, all of this resonates with claims of Frege, Russell, and Wittgenstein, that language often "disguises" thought, concealing its underlying logical structure.

17  Harman (1963/87) argued that, contrary to Chomsky’s early claims, transformations were inessential to grammars; phrase-structure rules could suffice. By the 1980s, there was only one transformation, “Move α,” subject to a variety of constraints. In the present “Minimalist Program” (see §2.2.9 below), “internal merge” can be regarded as a generalized transformation.
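As a toy illustration (mine, not Chomsky’s own formalism), a “singulary” transformation can be thought of as a structure-dependent operation on a single phrase marker. The sketch below performs subject-auxiliary inversion on a simple bracketed structure for The cat is on the mat; the tuple encoding is an arbitrary convenience:

def invert(phrase_marker):
    """Map [S [NP ...] [VP aux ...]] to [S aux [NP ...] [VP ...]]:
    a structure-dependent operation, not mere string shuffling."""
    label, np, vp = phrase_marker
    aux, *rest = vp[1:]               # peel the auxiliary off the VP
    return (label, aux, np, (vp[0], *rest))

declarative = ("S",
               ("NP", "the", "cat"),
               ("VP", "is", ("PP", "on", "the", "mat")))
print(invert(declarative))
# ('S', 'is', ('NP', 'the', 'cat'), ('VP', ('PP', 'on', 'the', 'mat')))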


It is important to notice that Chomsky’s proposals aspire to be formal in two ways: the rules are supposed to operate on the form of the representations that are employed, and the representations concern only the syntactic objects of a natural language, such as English, not its semantics (by contrast, there are, of course, “formal” theories of semantics). Although he was not as averse to issues of “meaning” as was Quine (1953/61b), he did not think meaning would provide a fruitful basis for linguistic explanation:

It is important to distinguish sharply between the appeal to meaning and the study of meaning. The latter is an essential task for linguistics. . . . The important thing to remember in constructing a theory of linguistic form is that no matter what difficulties beset this endeavor, these difficulties in themselves in no way indicate that the theory should be based on meaning, or on any given notion.  (Chomsky, 1955/75: 97; see 86ff for discussion of Quine)

It is a striking feature of both LSLT and Syntactic Structures (in contrast to later work) that there is no discussion either of the relation between natural language words and reality, or even of the relation of the internal representations of words (or NPs and VPs) to the words themselves (whatever they may be, cf. Chapter 9 below). Indeed, in this early work there is little or no discussion of the relation of these formal proposals to human psychology, of how language is related to the minds and brains of speakers, or of what purposes it might serve in thought or communication.18 (We will return to issues of natural language semantics in Chapter 10.)

2.2.2  Psychology and Explanatory Adequacy: the “Standard” “Aspects” Model

None of this is meant to suggest that Chomsky did not have some psychological conceptions in his mind at the time. He just thought it would have been “too audacious” in the mid-1950s to press them (1975: 35). But that did not last long. There is a whisper of them in a footnote to a technical paper of 1956:

18 At most there are passing allusions to psychology in, e.g., the opening summary of LSLT (1955/75: 61–2), and in the introduction to the 1975 publication, where he says he added remarks about the psychological interpretation he intended as “the framework for the entire study” (1975: 35–6). However, the rest of LSLT is presented without any discussion of psychology.


Note that a grammar must reflect and explain the ability of a speaker to produce and understand new sentences which may be longer than any he has previously heard.  (Chomsky, 1956: 124, fn5)

And, of course, in 1959, he also published his famous review of Skinner’s (1957) then recently published behaviorist account of language, Verbal Behavior. There, Chomsky took on the psychological issue quite explicitly, not only in his critique of Skinner’s psychological theory, but in the review’s concluding section, in which he sketches the alternative he envisages:

The speaker’s task is to select a particular compatible set of optional rules. . . . The listener (or reader) must determine from an exhibited utterance what optional rules were chosen in the construction of the utterance. . . . The child who learns a language has in some sense constructed the grammar for himself on the basis of observation of sentences and non-sentences. (Chomsky, 1959/64: 577)

Chomsky spells out this conception in detail in his (1965) Aspects of the Theory of Syntax. He sees it as addressing what he calls “Plato’s Problem,” or the problem of how, on the basis of such exiguous evidence provided by experience, the child comes to have such a rich understanding of a domain. Indeed, he stipulates that any “explanatorily adequate” theory should address this problem, and not merely provide a “descriptively adequate” account of the output of an idealized competence, viz., the set of structural descriptions that the grammar would produce (Chomsky 1965: 24–5) (we will return to these issues in §4.1). Aspects is the first real presentation of both Chomsky’s syntactic and psychological proposals to the world at large.19 Recently, a leading psycholinguist, Janet Fodor (2009), described the first fifty pages of Aspects as “one of the most important fifty pages of all of the important fifty pages that Noam has written, and . . . still very germane today” (J.D. Fodor, 2009: 256). She calls attention especially to the specific psychological proposal he makes there:

Let us consider with somewhat greater care just what is involved in the construction of an “acquisition model” for language. A child who is capable of learning a language must have

19  The psychological conception is expressed to a philosophical audience in an earlier published article, Chomsky (1962), and the first chapter of Aspects was evidently written several years before the appearance of the full book in 1965 (Newmeyer, 1986: 39).


(i) a technique for representing input signals;
(ii) a way of representing the structural information about these signals;
(iii) some initial delimitation of a class of possible hypotheses about language structure;
(iv) a method for determining what each such hypothesis implies with respect to each sentence;
(v) a method for selecting one of the (presumably, infinitely many) hypotheses that are allowed by (iii) and are compatible with the given primary linguistic data.

Conditions of at least this strength are entailed by the decision to aim for explanatory adequacy.  (Chomsky, 1965: 30)

(We will often return to this formulation.) What seems to be suggested by these passages is that the process of language acquisition is essentially one of hypothesis confirmation, of the sort that is routine in science. The child is presumed to be deriving “what each such hypothesis implies,” much as a scientist might derive predictions about observables from her own theoretical proposals. Indeed, a feature of the proposal that has not been sufficiently noted is how very similar it is to the famous “deductive-nomological” (“D-N”) model of scientific explanation that the philosopher Carl Hempel (1962) was advancing, almost surely within earshot of Chomsky during the latter’s intellectually formative years. Chomsky (1968/2006, 1980a: 136) himself assimilates the proposal to Peirce’s notion of “abduction,” which we will discuss further in Chapters 5 and 8.20 In any case, as the last sentence (and adjoining passages) of the quoted passage makes clear, the conception he is spelling out here is intended to meet the condition of “explanatory adequacy.”21

20  See Harman (1965). Chomsky discusses the analogy with Peircean abduction at length in chapter 3 of his (1968/2006) Language and Mind. In his (1980a, 1986), he seems to reject it, but, as we will see in Chapters 5 and 8, the issues get delicate. Until we discuss Collins’ (2007b) alternative proposal in §8.5, I will presume an intentionalist reading of it. Throughout these and later discussions, Chomsky understandably intertwines the issue of explanatory adequacy with his arguments for nativism, which I shall want to treat separately (an explanatorily adequate theory need not per se be especially nativist).

21  Aspects also made a number of technical proposals about the evaluation metric and specific rules of the grammar that need not concern us, but which gave rise to what was called the “Extended Standard Theory” (Chomsky, 1970). In various formulations of this view, there are different, e.g., “D(eep)” and “S(urface)” levels of representation that, we will see, figured in discussions in the 1970s, and about which many readers may have heard. However, readers are cautioned to note that these levels of representation no longer play serious roles in current theories.
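The hypothesis-confirmation picture lends itself to a schematic rendering. In the sketch below (my own gloss, with toy word-order grammars standing in for the child’s hypothesis space), condition (iii) appears as a delimited class of candidates, (iv) as a function deriving what each candidate implies about a sentence, and (v) as a selection of the candidates compatible with the primary linguistic data:

def svo(s, v, o): return [s, v, o]
def sov(s, v, o): return [s, o, v]

CANDIDATES = {"SVO": svo, "SOV": sov}        # (iii) a delimited class

def implies(grammar, datum):
    """(iv) what the hypothesis implies about a given sentence"""
    return grammar(*datum["args"]) == datum["string"]

def select(pld):
    """(v) keep hypotheses compatible with the primary linguistic data"""
    return [name for name, g in CANDIDATES.items()
            if all(implies(g, d) for d in pld)]

pld = [{"args": ("John", "ate", "apples"),
        "string": ["John", "ate", "apples"]}]
print(select(pld))                           # ['SVO']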



2.2.3  From Phonemes to Features

An important shift should be noted in the way that Chomsky, at about this time, began to think of “sentences” and their constituents. In LSLT (1955/75), Syntactic Structures (1957), and at least in the first “methodological” chapter of his (1965) Aspects, he seems to regard them pretty much as Bloomfield and Quine regarded them, as sequences of morphemes, phonemes, and phones, which were ultimately sequences of sounds, like, as it was often put, “beads on a string” (we will discuss this view further in §9.2.1). Thus, in LSLT, Chomsky writes:

We can “spell” a certain utterance in terms of phonemes or in terms of morphemes.  (Chomsky, 1955/75: 106)

This conception is carried over in Syntactic Structures, where

the grammar will project the finite and somewhat accidental corpus of observed utterances to a set (presumably infinite) of grammatical utterances. (Chomsky, 1957: 15; emphasis mine; see also pp. 13, 18)

Very much the same conception persists in chapter 1 of Aspects:

[L]et us use the term “acceptable” to refer to utterances that are perfectly natural and immediately comprehensible. . . .  (Chomsky, 1965: 10, emphasis mine)

I shall use the term “sentence” to refer to strings of formatives rather than to strings of phones. It will be recalled that a string of formatives specifies a string of phones uniquely (up to free variation), but not conversely. (Chomsky, 1965: 16, emphasis mine)

In the early 1960s, however, Chomsky began to replace this idea with a more abstract conception. In a passage in chapter 2, seventy pages after the previous extract, he introduces an interestingly different conception (oddly, without noting the difference):22

22  Howard Lasnik (pc) doesn’t think the difference in conceptions is as great as I suggest. And perhaps it wasn’t in Chomsky’s mind. But it seems to me a sufficiently significant move away from the “nominalist” and acoustic conceptions of Quine that it deserves emphasis, especially in view of later discussions, such as Devitt’s; see §6.3.6 below. Moreover, in a recent volume marking the fiftieth anniversary of Aspects, Smith and Cormack (2015) remark on the shift to features:

One major contribution of Aspects (Chomsky 1965) was to initiate the development of a theory of syntactic features. There is no use of features, either syntactic or morphophonemic, in Chomsky’s earliest work The Morphophonemics of Modern Hebrew (1951/79); they do not appear in his monumental The Logical Structure of Linguistic Theory (1955/75); nor in Syntactic Structures (1957) (except for category labels); nor, except for phonological distinctive features, in Current Issues in Linguistic Theory (1964). In Aspects features play a crucial role. Chomsky suggests (1965: 214) that “We might, then, take a lexical entry to be simply a set of features, some syntactic, some phonological, some semantic”.  (Smith and Cormack, 2015: 233)

But, interestingly, they go on to note:

But even here the use of features was somewhat haphazard, apparently unconstrained, and a terminological mess.  (Smith and Cormack, 2015: 233)

As we’ll see in §9.8, part of the mess is a persistent use/mention confusion—or perhaps deliberate collapse?—whereby symbols in the lexicon are identified with the syntactic, phonological, and semantic features that the symbols may represent.


General structure of the base component. We now modify the description of the base sub-component that was presented earlier . . . in the following way. . . . [T]here are rewriting rules . . . that apply to symbols for lexical categories and that introduce or operate on complex symbols (sets of specified syntactic features). . . . [T]he base of the grammar will contain a lexicon . . . a set of lexical entries, each lexical entry being a pair (D, C), where D is a phonological distinctive feature matrix “spelling” a certain lexical formative and C is a collection of specified syntactic features (a complex symbol). (Chomsky, 1965: 84, emphasis mine)

And in a footnote to the passage, he importantly expands the idea to allow “a lexical entry to be simply a set of features, some syntactic, some phonological, some semantic” (1965: 215, fn15).23 Features, or lexical items that are “bundles of them,” become the standard terminal nodes of tree structures, and the items over which phonological, syntactic, and semantic operations are defined.

Why is this so important? While there were a number of serious theory-internal considerations for the shift (see fn 23), it is especially significant for philosophical purposes, since it is the first step in a move that Chomsky only fully made twenty years later (in his 1986) in dissociating what concerns him as “language” from the external physical phenomena that concerned Bloomfield and Quine, and that seem to be the focus of our ordinary thought and talk. At any rate, we do ordinarily think and talk of words as sounds people make, or marks on paper, not as relatively abstract “bundles of syntactic, semantic, and phonological features.”

23  There were many purely grammatical motivations: syntactic and phonological features seemed to provide a much more suitable basis for rules and generalizations of the theories than did the coarse categories of “N” and “V.” The change as it concerns phonology is addressed at length in Chomsky (1964: chap 4) and Chomsky and Halle (1968: 11, fn9), and is more generally taken as the point of departure in his influential (1970: 184–5).
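A rough sketch of the resulting conception of the lexicon (the particular feature names here are illustrative inventions, not Chomsky’s): each entry is a pair (D, C) of a phonological feature matrix and a bundle of syntactic and semantic features, and it is the features, not the sounds, that the rules consult:

LEXICON = {
    "barn": (
        {"phones": ["b", "ar", "n"]},                  # D: phonological
        {"category": "N", "count": True,               # C: syntactic...
         "semantic": ["building"]},                    # ...and semantic
    ),
    "paint": (
        {"phones": ["p", "ey", "n", "t"]},
        {"category": "V", "transitive": True,
         "semantic": ["apply-liquid"]},
    ),
}

def category(word):
    _d, c = LEXICON[word]      # rules consult the feature bundle C,
    return c["category"]       # not the sound sequence D

print(category("paint"))       # 'V'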


After this change to features, the sounds and marks tend to be regarded as merely external manifestations of these features, caused by representations of them over which internal computations are defined (the significance of this difference will become vivid in Chapters 6 and 9 below).

2.2.4  Constraining the Rules: The Extended Standard Model

Mark Baker (forthcoming) has recently written an exquisite summary of the development of Chomsky’s views up until roughly the mid-1980s, which may serve as a useful guide to the next several sections:

An idealized history of the line of inquiry that led to principles of UG goes like this. First, there was “immediate constituent analysis”, the notion that sentences and phrases are built up out of smaller phrases—a pre-Chomskian idea. Then added to this was the idea that some of these phrases can move to new positions in various ways—the early Chomskian notion of a transformational grammar (built on work by Zellig Harris). Next was added the idea that human language puts general constraints on what these transformations can do, which restrict the application of any transformation; Ross’s (1967) famous island conditions were a paradigm-forming case in point. As more became known about this, two exciting consequences came into view. One was that maybe one doesn’t have to say nearly as much about the transformations as before; if the constraints are rich enough, then the transformations can reduce to “Do anything” (Chomsky’s proto-transformation “Move-alpha”) as long as the constraints are not violated. The second exciting result is that the constraints appear to be far more universal than the transformations and the phrase structures that they seem to depend on. (Baker, forthcoming)

Spelling some of this out in a little more detail: the development of the Generative Model in the 1950s and 1960s gave rise to a number of puzzles. Given only what had been stated by that time, there seemed to be no limit to the number or complexity of different kinds of possible phrase markers and transformations. As research continued, and made impressive gains over more traditional grammars in accounting for data, one thing that became increasingly striking about natural language was not only its recursiveness and productivity, but how oddly constrained it was, a fact made particularly vivid by John Ross’s (1967/86) discovery of the wide array of “island” effects


(a few of which we noted in §1.3.1(i)). Moreover, the plethora of rules led to many generalizations being missed by the theory, for example, that NPs always contained a noun, VPs a verb, an IP an inflection feature, and so forth: it seemed that XPs (for any X) were always “headed” by an X. And then, on the purely formal side, Peters and Ritchie (1973) showed that the logical power of the class of transformational grammars set forth in Chomsky (1965) was equivalent to that of a Universal Turing Machine. Such computational power was obviously far too great to capture the constraints and idiosyncrasies of natural language: it made an infinity of “unnatural” grammars available to the child, and thereby decreased the prospect of an explanatorily adequate theory. But, so far as the work of the mid-1960s indicated, there seemed to be no principled way for the models to place sufficient, non-arbitrary constraints upon rules and transformations.

In order to simplify, compress, and constrain the rules, Chomsky (1970) introduced the major innovation of “X-bar theory,” with variable X (which ranged over all kinds of phrase: NPs, VPs, PPs, etc.). This theory put a limit on the kinds of phrase-markers, XP, that could be introduced in the theory. All phrases have to admit of a certain form: nodes of trees are confined to “projections” of “terminal” elements of a fundamental lexical or functional category, for example, a noun (N), verb (V), determiner (D, such as the or a), or inflectional morpheme (I, such as past, or [+pst]). Unlike living trees, X-bar trees “grow,” so to speak, upside down or “bottom-up”: they grow from their “leaves,” or “terminal nodes,” consisting of lexical or functional items at the bottom, to a single “root” of the tree at the top. This top root could be any sort of phrase, but the phrases of ultimate interest are ones that correspond to what we ordinarily think of as “sentences,” and these are usually “IPs,” or “inflectional phrases,” which grow from an inflectional terminal leaf item, such as [+past] or [+3S] (third-person singular).24

24  A general form and the terminology used for this sort of example could be schematized:

         XP
        /  \
     Spec    X′
            /  \
           X    Complement
           |
         head

where the “head” is the lexical category (e.g., a noun or a verb), the X′ an “intermediate projection,” and the XP a “maximal projection” (it was called “X-bar” theory because the superscript stroke after the “X” was originally a line over it). Complements are the sisters of initial Xs (e.g., an NP or CP in the case of a verb). Spec is a “specifier,” which is the daughter of an XP and the sister of an X′, and is often, semantically, filled by something that “specifies” or modifies the head: e.g., subjects of sentences are Specs of VPs, and determiners, e.g. a, the, are specifiers of NPs. (Note that sometimes the spec and complement positions can be empty: Cows moo has an empty specifier for Cows and an empty complement for moo.) Since the grammar is recursive, the trees could be projected upward with indefinite numbers of “intermediate” X-bars, X′, X″, X‴, . . . , between the initial X and the maximal projection XP. Exactly which lexical entry serves as the “head” is the subject of substantial theoretical discussion (for various reasons, determiners and inflectional morphemes instead of nouns and verbs began to be treated as heads, giving rise to “determiner phrases”—DPs—and “inflectional phrases”—IPs—where one might expect an NP or a VP). What determines whether an intermediate level, X′, is an actual constituent of a phrase is whether it admits of generalizations and/or substitutions (e.g., repair the house for paint the barn; by contrast, paint the is not any sort of serious constituent, or an X′).


Thus, in the analysis below of John painted the barn (adapted from Haegeman, 1994: 114), a verb paint combines with an NP to form a VP, which then combines with [+pst] to form an I′, which then combines with an NP to form an IP, which, of course, is what corresponds to a sentence (although “sentence” is no longer the technical term for the category):

              IP
             /  \
           NP    I′
           |    /  \
           N   I    VP
           |   |   /  \
         John [+pst] V    DP
                     |   /  \
                  paint D    N
                        |    |
                       the  barn

The “X-bar” requirement simplified the theoretical categories by insisting that they all should be “endocentric,” ruling out “exocentric” nodes that would introduce new categories, as in the case of the “S” that was used in the earlier phrase structure grammars, with re-write rules such as “S → NP VP.” Besides reducing the primitives of the theory, this also exhibited what otherwise seemed an arbitrary relation between a noun and an NP, a verb and a VP. Since the morpheme for [+pst] in English is an ‑ed affixed to the verb, the maximal projection, IP, corresponds to the sentence that the phonological/pronunciation system would produce as John painted the barn (see Lyons, 1966: 226–7).
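The endocentricity requirement can be given a simple schematic rendering (my own; the tuple encoding is arbitrary): every phrase XP is projected from a single head of category X through an intermediate X′, so no exocentric node like the old “S” can be built:

def project(x, head, complement=None, specifier=None):
    """XP -> Spec X'; X' -> X Complement: every phrase is endocentric,
    projected from a single head of category x."""
    x_bar = (x + "'", (x, head), complement)
    return (x + "P", specifier, x_bar)

dp = project("D", "the", complement=("N", "barn"))
vp = project("V", "paint", complement=dp)
ip = project("I", "[+pst]", complement=vp, specifier=("N", "John"))
print(ip[0])   # 'IP': the maximal projection that corresponds to a sentence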


With these more constrained trees in hand, there was also a search for deeper generalizations about grammatical rules and permissible transformations, or “constraints on movement.” Reducing the number of transformations and sets of rules set the agenda for generative grammar throughout the 1970s and 1980s, and led to the development of the “Principles and Parameters” model, discussed more fully below.

Two other issues that concerned Chomsky in the late 1960s were phonology and semantics. He discussed morphophonemics in both his BA/MA thesis (1951/79) and his (1962: 541ff). And he and Morris Halle (1968) published The Sound Pattern of English, which proposed a system of phonological rules analogous to those of generative syntax that proved influential for a few decades, but (aside from the issues about features that we discussed in §2.2.3) is largely independent of his work in syntax.

2.2.5  Resisting Teleo-tyranny: Semantics and “the Autonomy of Syntax”

(i) Teleo-tyranny

Semantics presented special problems. As we noted, in his (1955/75: 86–97) Chomsky had explicitly rejected any reliance within his theory on a notion of “meaning.” In Syntactic Structures he emphasized that the rules that govern the syntactic structure of a language were not constrained by whether or not the sentences generated by the language actually had a coherent semantic interpretation. His parade example was “Colorless green ideas sleep furiously,” which, while grammatically impeccable, is (literally understood) nonsense. Syntactic rules should be provided in purely formal, that is, non-semantic terms, a thesis that came to be known in the field by the (somewhat fraught) phrase “autonomy of syntax.”25 Chomsky wanted to insist that neither phrase structure rules nor the set of transformations is to be constrained or formulated in anything other than syntactic terms; both should be specified independently of the semantics. This allows for a fruitful direction of ultimate explanation: from the syntax, to the semantics, to the actual use of language to express thoughts (I will qualify this shortly).

25  Chomsky himself has often renounced any such claim, see, e.g., his (1975: 92ff; 1976: 166; 1980: 126; 2000: 204, fn9), but many other passages seem to invite it, most famously the passages in Syntactic Structures which, after a survey of problems of semantics, conclude “I think that we are forced to conclude that grammar is autonomous and independent of meaning” (1957: 17; see also §§8–9). However, I would be surprised if he were not sympathetic to the very weak version of it I shall shortly defend. Note that the principle is cousin to, but not the same as, the principle governing the specification of a Turing machine in terms of the formal properties of the symbols over which transitions are defined, not in terms of what they “mean” or “refer” to.


Such an “autonomy” thesis raises a crucial philosophical issue about psychological explanation that is at the bottom of many of the disputes between Chomskyans and their opponents. It is important to notice from the start that WhyNots invite a kind of answer that people can sometimes find difficult to resist, especially in the case of what seems a phenomenon as patently intelligent as language. It can seem as though any acceptable explanation of the WhyNots and any other linguistic phenomena must ultimately be expressed in some kind of rational, or at least “teleological,” terms, there being some end (or “telos”) or “function” that the phenomena can be shown to rationally serve: perhaps some social function, or perhaps an evolutionarily selected one.26 After all, much of language seems well suited to a wide variety of social purposes: conveying information about how things are related, who did what to whom, when, and how, and indicating degrees of (un)certainty, as well, of course, as requesting and imagining such information. It can seem like a perverse obsession with mathematical formalisms to suppose otherwise. This is a main motivation of the resistance to a Chomskyan program by what has come to be called “Usage Based Linguistics.”27 But the WhyNots should at least give us pause about always expecting any such teleological accounts to suffice. Indeed, I think it apt to call the excessive demand people sometimes make in this connection “teleo-tyranny,” and to notice how much it can dominate proffered explanations of not only language, but a wide range of mental and biological phenomena.28 It can seem to many people that if some kind of functional, purposive explanation is not provided of a psychological phenomenon, then no explanation has been provided at all. (No one, of course, makes such a demand of the explanation of a volcano!) What may seem an initially startling alternative suggestion that Chomsky invites us to entertain is that the competence for grammar may not have a rational, teleological, or selectionist explanation at all. The human sensitivity to grammar may just be due to an odd, intricate machine in our brains that some mutation happened to produce in a certain primate, as a result of the underlying physics and chemistry of cells.

26  Some terminological headaches: (i) The term “teleological” was originally often thought of either in terms of some future end state actually causing an earlier state, or in terms of some “designer” intentions. The word has recently come to imply neither of these, but now includes “blind” Darwinian selectionist explanations, as in Larry Wright (1973) and Karen Neander (2017). Indeed, (ii) the fact that the I-language may be understood non-teleologically does not entail that it can be understood non-intentionalistically, as I will argue in Chapter 11 that it shouldn’t be. The issues have become completely independent. Finally, (iii), note that the word “function(al)” gets appropriated by nearly every discipline for marking any number of distinctions (often in the same discipline! See Newmeyer, forthcoming, for its varied uses in linguistics). In this section, I shall be restricting it to this recent teleological sense, inclusive of selection.

27  See Tomasello, 2014, for representative work and Newmeyer, forthcoming, for extensive discussion.

28  It is perhaps an instance of a more general fallacy to which many are susceptible, of thinking that what is interesting from a personal point of view is interesting theoretically (cf. Smith and Allott, 2016: 2). Personality traits may be personally compelling but of little theoretic interest.


Some primate and her or his descendants might well have learnt in time to exploit this mutation for any number of the above and other purposes, making use of its oddities for whatever advantages might be gained, and simply ignoring ways in which it might be useless or even an obstacle to thought and communication. Chomsky (2000) put the point well:

Human languages are in part unusable, but none the worse for that; people use the parts that are usable.  (Chomsky 2000: 161; see also p. 124)

Although, of course, the purposes for which we use parts of language will obviously be important to explaining why we do use them as we do, there is no a priori reason to think that those purposes are in the least important to understanding the structure of the system in general. An animal may find that certain properties of twigs allow it to poke into holes to extract some food; but those properties may be quite unrelated to the properties that explain the nature and growth of twigs themselves. None of this should be taken to deny that certain traits can and do play a useful role in an animal’s life, only that their existence and structure might not depend upon that role. The fact that people use twigs as tools, legs to dance, and language to tell jokes does not show that the best theory of twigs, dance, or language is a selectional one. Selection is explanatory only when the trait was one that persisted—that is, was “selected”—from an array of options that actually occurred historically, as Darwin was at pains to show about the beaks of finches. Is there any reason to think that there was an array of options for the structure of language at some early time, from which UG was selected for its comparative advantages? The historical record is scant: unlike gross physical traits, what may be long-dormant mental capacities, such as those for language and higher mathematics, usually do not leave discernible physical traces (see Lewontin, 1990). What evidence there is suggests that:

Something emerged in an evolutionary process. . . . It emerged once, as far as we know, very recently. There is no real evidence for use of language prior to maybe 50,000 years ago or so. But the neuroanatomy seems to have been in place before that, maybe 150,000 years ago. . . . The emergence seems to be fairly sudden, in evolutionary terms, in an organism with a very large brain, which was developed for whatever reason, and conceivably . . . led to something that works close to optimally, like a virus shell.  (Chomsky, 2002a: 149–50)

That is: perhaps our relevant ancestor just happened to begin to use a complex system that had physically developed much earlier, much as animals came to use the twigs they happened to find.


The explanation of the internal structures responsible for language may well turn out to be more like the explanation of the Fibonacci patterns one finds in plants and shells, what D’Arcy Thompson (1917/2002) called “the Laws of Form,” than like the explanation of the selection of the beaks of birds.29

Having labored all that, let me stress that there is no reason to regard the “autonomy of syntax” as an essential condition on any linguistic theory. Better to think of it as a serious aspiration, in keeping with the Galilean methodology that we discussed in §1.1: insight in science requires considerable abstraction and idealization, and so we should try to abstract syntax from semantics and pragmatics as far as good theory will allow. Of course, the world may be more complex than we hope. As we observed in §1.3, some phenomena seem to be a mix of semantic/pragmatic and syntactic issues (although a syntactic component of them may still play an autonomous role), but, more importantly, in recent developments of the Minimalist Program (§2.2.9 below), considerations of the need for temporal linearization, ease of processing, and interpretability have come to play an explanatory role (see Smith and Allott, 2016: 120–1). Functional-teleological issues may well play a role in many, especially biological, investigations. And, of course, it would be absurd to deny that the various personal and social purposes to which language is put determine a great deal about how languages are used, just as it would be absurd to deny that the purposes to which automobiles are put determine a great deal about their observable behavior. But just as a serious understanding of automobiles requires an understanding of the underlying physics that determines what they can and cannot do, so does a serious understanding of language require an understanding of the underlying grammar that informs and constrains how people use it.

(ii)  Surprising Consequences for Linguistics

The possibly non-teleological autonomy of syntax has a perhaps slightly surprising substantive consequence. Many people are likely inclined to think that the various syntactic markers of real world phenomena, for example, number, tense, person, and animacy, are simply the result of importing into the grammar semantically important information. For example, it can feel hard to resist the temptation to try to find some unifying theme to negative polarity items and their licensors;

29  See Miller and Chomsky (1963), Chomsky (1979: 85–7; 1980a: 230–31; 2002a: 139–42), Chomsky and Lasnik (1977: 436), and §2.2.10 below for Chomsky’s own occasional functional speculations, and Newmeyer (forthcoming: §4.5) for discussion. Uriagereka (1998) discusses the analogy with the Fibonacci series.


and the various marks people have noted—“affective,” “non-factivity”—have certainly been suggestive. But one should not be surprised if it turns out that no semantic or pragmatic condition quite works: the syntactic machine may just make its own odd distinctions that we can try to exploit, with varying degrees of success, for semantic and pragmatic purposes, and mark lexical items to be so treated, but those purposes will not, in the end, provide an explanation of the distinctions.

This autonomy of syntactic categories is obvious in the case of gender in most European languages: witness the neuter das Mädchen in German, and the feminine la virilité in French. Biological gender is an issue only for the conceptual system that may (or may not) make use of the linguistic distinction (cf. Corbett, 1991). Similarly with Case,30 where it turns out there is only a rough, erratic correlation between case and thematic/semantic role: nominatives often indicate agents, and accusatives, themes, but consider the different Cases of her and she in the otherwise synonymous We expected her to win and We expected she would win; or the arbitrary Case markings on German prepositions: mit (with) requires dative, ohne (without) accusative, and anstatt (instead of) genitive.

Even marking for so-called “thematic roles” such as “agent” and “patient”—what is called “theta marking” (see §2.2.7 below)—may be largely independent of any serious worldly or semantic considerations. Consider, for example:

(13)  The wolf blew the house down.
(14)  The wind blew the house down.

Both sentences are fine, though the first is far less probable than the second; but, of course, wolves, but not the wind, are plausibly regarded as serious agents. Or consider ergative verbs (whose objects can also serve as subjects):

(15)  John spilt the milk/The milk spilt.
(16)  John daubed paint on the wall/*The paint daubed on the wall.

John is naturally the agent of both spill and daub, but why can milk be the “agent” of spilt but paint not of daubed?31 It appears that the grammar has “thematic” distinctions of its own.

30  “Case” with a capital “C” refers to “abstract case,” which seems to be present on nouns and pronouns in all languages, even in those, like modern English, in which overt case marking does not show up in pronunciation (except, in English, in a few pronouns, e.g., he, him, his).

31  I owe the examples to Collins (2009: 285–6), who raises them against Fodor’s (1998: 59ff) insistence that the theta-role agent needs to be understood semantically. As Collins notes, the grammatical category of “theta-roles cross-classify our common-notions” (2009: 285). Pinker (1994: 41–3) and others have also called attention to what seems to be an even more extreme, analogous phenomenon of Williams syndrome children, who, though severely retarded (average IQ of 55), sometimes seem to display an exceptional grasp of language, producing some fairly elaborate, syntactic discourse. Further research has raised skepticism about such abilities (see, e.g., Brock, 2007).


What is explanatorily interesting about all this is that there seems to be something in us that so much as bothers to notice and abide by these arbitrary and other grammatical distinctions, even when they are not the least communicatively important. As Omer Preminger (pc) put it to me, in acquiring syntax speakers are desperate to do something to manage the various distinctions the syntax makes available, and so attach a meaning that approximates some distinction they take to be semantically important. Thus, the marking of specific lexical items as NPIs or their licensors might well be due to some conceptual/semantic issue outside the purview of syntax (cf. Michael Israel, 2011), but, once so marked, the syntactic system might still process them according to its own autonomous principles, producing idiosyncratic syntactic reflexes.

“But,” it might be protested, “surely a child couldn’t acquire grammatical competence if she had no conception at all of the uses to which language is put!” Chomsky (1962) addressed this point early on:

It may well be that a child given only the inputs of [a sentence] as nonsense elements would not come to learn the principles of sentence formation. This is not necessarily a relevant observation, however, even if true. It may only indicate that meaningfulness and semantic function provide the motivation for language learning, while playing no necessary part in its mechanism, which is what concerns us here.  (Chomsky, 1962: 531, fn5, emphasis mine)

Again, for Chomsky, the fully semantic conceptual system makes use of the structures that the syntax makes available, without determining those structures, and without those uses being essentially implicated in their specification.32

32  Which is not to deny the existence of possibly complex cases, e.g., of lexical decomposition with syntactic reflexes, as seems to be the case with many causative verbs (Bob killed Bill → Bill died; Jim broke the case → the case broke); see Pietroski (2003) and Harley (2012) in reply to J.A. Fodor (1972).

2.2.6  Generative vs. Interpretive Semantics

Functionalist theories of language go back to the work of, for example, Trubetzkoy (1939/69) and Jakobson (1960), associated with the “Prague School of Structural Linguistics,” and influenced many linguists who began in the late 1960s to challenge Chomsky’s views of syntactic autonomy.


The so-called “linguistics wars” (R. Harris, 1993) evolved from work within Chomsky’s general research programme that began insisting on functionalist explanations specifically against Chomsky’s resistance to them. The issue became particularly fraught when “Generative Semanticists” like James McCawley (1968) and George Lakoff (1971) tried to extend Chomsky’s syntactic theory to include semantics. Their idea was that there was a finite set of semantically primitive elements from which, using a set of rules, one might generate the meanings of the expressions in a language. Thus, they argued that the primitive elements within D(eep)-structures (see fn 21 above), for example, many morphemes, have a derivational origin. For example, causative/ergative verbs seemed to involve systematic relations between an agent causing an effect on an object by verb-ing it and the object itself being verb-ed: John melted the butter entails The butter melted; John burned the house down entails The house burned down. A case that engendered considerable discussion was whether kill could be decomposed into cause to die. There seemed to be grammatical reasons to think not (e.g., Mary caused John to die and it surprised me that he did so is fine, whereas *Mary killed John and it surprised me that he did so is not), an argument that, however, is no longer seen as decisive (see below).

Familiar philosophical questions arose around what to include in “semantics,” many of which could be summed up in Quine’s (1953/61b and 1954/76) challenge to distinguish claims of meaning from simple ingrained belief, which, as we saw, Chomsky took seriously in his (1955/75: 94–6). Is it a logical contradiction for someone who believes in an after-life to claim that Mary killed John but that John didn’t really die? Is the truth of “Cats are animals” a matter of the meaning of the words or simply a commonplace of biology? We will discuss this challenge further in §10.2. For now, it is enough to note that it did not daunt Lakoff (1971), who was glad to include many of a speaker’s beliefs about the world in the purview of syntax. Lakoff went so far as to claim that we should distinguish the grammaticality of (17) (underlining here indicates phonological stress):

(17)  John told Mary she was ugly and then she insulted him.

from what he claimed was the ungrammaticality of:

(18) (??) John told Mary she was beautiful and then she insulted him,

the ungrammaticality of which he took to be due to cultural presuppositions about good looks (Lakoff, 1971: 333–7). Indeed:


Given a sentence, S, and a set of presuppositions, PR, we will say, in such instances, that S is well-formed only relative to PR. That is, I will claim that the notion of relative well-formedness is needed to replace Chomsky’s (1957) notion of strict grammaticality (or degrees thereof), which was applied to a sentence in isolation.  (Lakoff, 1971: 329, emphasis mine)

He even goes so far as to allow “anything” into the grammar:

One thing that one might ask is whether there is anything that does NOT enter into the rules of grammar. For example, there are certain social concepts from the study of social interaction that are part of grammar, e.g., relative social status, politeness, formality, etc. Even such an abstract notion as FREE GOODS enters into rules of grammar. (Lakoff, 1974: 161, quoted in Newmeyer, 1986: 125)

(Newmeyer, 1986: chap 5, provides useful discussion of this and many kindred claims.) Charles Fillmore et al. (1988: 504) criticized the generative paradigm because “it doesn’t allow the grammarian to account for everything in its terms” (quoted at Smith and Allott, 2016: 9), and Culicover and Jackendoff (1999) complained of the Chomskyan conception being “overly narrow”:

We do recognize that truly general syntactic phenomena have a special status in virtue of their generality and possible universality. However, in our view an empirically adequate syntactic theory should be able to account for the full range of phenomena that actually exist in natural language. . . . [T]here must be a basis in the language faculty not only for the most general and universal properties of language, but also for the most idiosyncratic and specific.  (Culicover and Jackendoff, 1999: 544, emphasis mine)

But that would be like insisting that a theory of motor-control explain why people dance polkas.33

33  Cf. the moral of a lovely passage pointed out to me some years ago by Renford Bambrough from Lewis Carroll’s (1893: 169) Sylvie and Bruno Concluded: Impatient with a map that was scaled only six inches to a mile, Professor Mein Herr boasts of having produced a map scaled “a mile to the mile,” resorting eventually to using the country itself as its own map! It might be thought that the point here of the importance of abstraction is so obvious that the claims of linguists I have quoted should be discounted as simply foolishly extreme. The reader should be assured that they are all repeated claims of major, highly influential figures. None of this is meant to imply that there are no non-foolish, substantive issues about just which abstractions are theoretically useful, or to beg the question of whether Chomsky’s unquestionably are. But then something more than the mere intuitive importance of a consideration has to be demonstrated: the resulting general theory has to be shown to be genuinely more systematic.


As the philosopher H.P. Grice (1989) famously stressed with respect to semantics, theoretical advance in this area surely requires cordoning off certain features of language as having to do with the use (what has come to be called the pragmatics) of language, apart from its semantics, syntax, and phonology. It is certainly scientifically foolhardy to try to theorize about the world in its full and rich entirety; as we emphasized in discussing the Galilean method in §1.1, abstraction is the breath of explanatory life.

In contrast to generative semantics, Chomsky (1970) and Jackendoff (1972) (at least then) advocated what they called “interpretive semantics,” which treated semantic interpretation as an independent process that applied to autonomous syntactic structures only after they were fully generated.

The history of “Generative Semantics” is complex. As a movement under that banner, it subsided in the early 1970s, although a great many of its innovations—logical form (“LF”), indices, traces, filters, and even some forms of lexical decomposition—were assimilated into Interpretive Semantics and eventually into the Minimalist Program. Its broader motivations, of regarding syntactic structure as a non-autonomous projection of semantics and even parts of pragmatics, survive in the aforementioned “Usage Based Linguistics,” specifically in what have come to be called “Cognitive,” “Construction,” and “Functional” grammars. For example, Ronald Langacker’s Foundations of Cognitive Grammar (1987) builds up linguistic structure from what he calls “symbolic structure,” and the idea of autonomous syntax is left behind. And Simon Dik’s (1997) The Theory of Functional Grammar incorporates semantics, pragmatics, phonology, and (morpho-)syntax into one essentially teleological theory.

An important notion that Chomsky (1981) did adopt from generative semantics is the replacement of the level of “D(eep)-structure” by the notion of a level of “logical form,” or “LF.” This is, of course, the phrase used by early analytic philosophers like Frege (1879) and Russell (1905, 1919), who were concerned to capture the underlying logical structure of a thought or proposition, something with truth-evaluable content essential to deductively valid inferences, and this seems to be the use that many generative semanticists had in mind. But Chomsky’s use is significantly different. “LF,” with “PF” (or “phonological form”), are only the syntactic structures that the I-grammar makes available to the “conceptual-intentional” (“CI”) and speech production systems, which might exploit them, for example, for reference, truth, entailment, and utterances, as they see fit (see, e.g., Chomsky, 1996, 2000, and §10.4 below).34

34  See Randy Harris (1993: chaps 5–9) and Newmeyer (1996: chaps 8–10) for discussion of the complicated history and legacy of generative semantics.


Difficult though it may be to draw the distinction between the syntactic and semantic aspects of language, it is hard not to believe that at least some form of syntactic autonomy turns out ultimately to be right. Again, the WhyNots provide striking evidence that the human brain is endowed with some fairly autonomous machinery that automatically maps certain phonological to certain syntactic forms in a way that is highly idiosyncratic and independent of semantic, pragmatic, and communicative functions. Such a machine could still be an extremely interesting and important component of human linguistic competence, even if it did not provide an account of the whole of the phenomena of language.

2.2.7  GB/P&P: Addressing Plato’s Problem and Linguistic Diversity

While a number of theoreticians were constructing alternatives to the generative model of linguistic structure throughout the 1970s, Chomsky and others refined and ultimately significantly revised the earlier Aspects confirmation model of language acquisition, synthesizing work from the study of an increasingly wide variety of languages. The “Principles and Parameters” (P&P) approach emerged in the 1970s, but its main published appearance was in Chomsky’s (1981) Lectures on Government and Binding (GB), which was the first of the generative theories to incorporate the lessons learned from detailed studies of a variety of languages.

One innovation of GB/P&P was to organize the grammar by “modules,” each module treating a different, fairly autonomous aspect of syntax.35 Here are a few of the main ones (from Newmeyer, 1986: 198–9, cf. Chomsky, 1981: 5):

Bounding theory, which sets limits on the domain of movement rules;

Binding theory, which links grammatical elements such as pronouns, anaphors, names, and variables with their antecedents;

Case theory, which deals with the assignment of abstract Case (e.g., “nominative,” “accusative,” “dative”) and its morphological realization;

Control theory, which determines the potential reference of the abstract nominal element PRO;36

Theta theory, which determines where “theta roles” (or thematic roles such as “patient,” “agent,” and “theme”) are assigned to the “arguments” (or NPs) associated with a verb (cf. §2.2.5 (ii) above).

35  The notion of “module” here is a purely analytic one, having none of the commitments to details of processing involved in either Fodor’s (1983) or Carruthers’ (2006) conceptions; see Chomsky (2017) and other essays in Almeida and Gleitman (2017), as well as McCourt (ms.) for discussion.

36  PRO (“big PRO”) serves as a covert pronoun in embedded infinitival clauses, as in John hopes [PRO] to win, where PRO is the unpronounced subject of to win that is co-referential with the subject of the main (“matrix”) clause, and sometimes co-referential with its object (as in Mary told John to leave). This “big PRO” is to be distinguished from “little pro,” which serves as the covert subject of a finite clause, as in Shakespearean Wilt come? (= Will you come?), and, as we will see shortly, is, parametrically, unpronounced in “pro-drop” languages like Italian, but not in languages like English, which insists upon a syntactic subject even when it is vacuous (or, as linguists call it, an “expletive,” as in It’s raining, or There have been many floods).


There were “principles” associated with each module. For example, Bounding theory postulated a “Subjacency Condition,” which essentially limited “cyclic” movement of elements from lower to higher contiguous phrases, and put limits on the number of certain sorts of “bounding” nodes (“barriers”) they could cross, which seemed to account for many of the island constraints. Binding theory postulated constraints on the co-reference of pronouns, as when himself cannot take John as its antecedent in *John thought Bill liked himself (see §1.3.2 (i) above and §2.3.3 below for some details). Case theory postulated the “Case Filter,” according to which all NPs in a sentence must be assigned Case, and Theta theory the “Theta Criterion,” whereby every argument of a verb needs to be assigned a unique theta role, and every theta role a unique argument.

Languages obviously differ from one another in large and small ways, for example in how they are pronounced, and, most importantly, in the variety of different grammatical constructions permitted within them. This variety seemed to challenge a central thesis of generative grammar, namely that there is a universal grammar, genetically determined, that underlies human linguistic competence. A main hope behind the development of the “Principles and Parameters” approach was that the amount of variation that one finds with respect to the grammatical profiles of different languages is not arbitrary, tends to cluster, and is limited to just a few properties that can be “parameterized” in a limited number of ways. Some standard examples are:

(i) the “Null-subject” parameter: Some languages (e.g., English, French) require clauses to have overt (pronounced) subjects; others (e.g., Italian, Hebrew, Japanese) do not.

(ii) the “Head Directionality” parameter, which determines whether the “head” (or main identifying element of a phrase, e.g., a verb in a VP) appears before or after the complement (see fn. 24 above).



A much discussed example of parametric variation between languages is whether, in a VP, the verb precedes the object, or the object the verb—hence “SVO” vs. “SOV” languages. English is regarded as fundamentally a “Head Initial,” SVO language, as in, e.g., John ate an apple; whereas Japanese is a “Head Final,” SOV language, as in John ga ringo o tabeta (John-subj apple-obj ate).

(iii) the Binding Domain parameter, according to which the domain of a reflexive pronoun may be the smallest clause containing it (as in English) or the full clause containing it (as in Japanese) (we will discuss this and the crucial notion of “c-command” more in §2.3.3).

(iv) the Polysynthesis parameter: “All/any/no participants of an event must be expressed on the verb” (Baker, 2001: 150–1). “Polysynthetic languages” are, roughly, those in which words are composed of many morphemes that are not able to stand alone, giving rise to long “sentence-words” such as the following example from Yup'ik Inuit:37

tuntussuqatarniksaitengqiggtuq
tuntu-ssur-qatar-ni-ksaite-ngqiggte-uq
reindeer-hunt-future-say-neg-again-3sg:ind
‘He had not yet said again that he was going to hunt reindeer.’

Thus, in polysynthetic languages such as Yup'ik Inuit or Mohawk, all verbs must include some expression of each of the main participants in the event described by the verb (the subject, object, and indirect object). Baker (2001: 91, 111) argues that this is the crucial parameter that distinguishes a non-polysynthetic language, like English, from Mohawk.38

The advantage of a P&P model over the earlier, increasingly baroque proposals of sets of rules and transformations is that, at least initially, it appeared to be more explanatorily adequate: it seemed possible that children might be able to settle upon a grammar not by confirming a whole raft of rules, but simply by setting the values of binary parameters, à la “Twenty Questions,” as Janet Fodor (2001: 378) nicely put it. Thinking in terms of parameters has been a fruitful avenue of research—even if Chomskyan conceptions of them have evolved and been revised in intricate ways (see Newmeyer, 2017, and Baker, forthcoming, for discussions of recent proposals).

37  Example from Eliza Orr, cited by Payne (1997: 27–8).

38  The actual statement of the parameter is both technical and much disputed in ways that need not concern us here. I include the parameter only because of the interest, made vivid in Baker’s (2001) popular book, of how (pace N. Evans and Levinson, 2009 and V. Evans, 2014) a generative grammar has been applied to languages whose surface appearance is very far from English. Baker’s analysis leads him to what many might find a surprising claim: “In fact, it turns out that I-Mohawk differs from I-English in one relatively small way [viz., being polysynthetic], but the difference is strategically placed to have a huge effect” (Baker, 2001: 87).
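A toy model (my own, grossly simplified, ignoring particles and much else) of how just two binary parameter settings, head directionality and null subject, already yield gross cross-linguistic differences in what gets pronounced:

def linearize(subject, verb, obj, params):
    """Order a clause according to two parameter settings."""
    vp = ([verb] + ([obj] if obj else [])) if params["head_initial"] \
         else (([obj] if obj else []) + [verb])
    if subject is None and not params["null_subject"]:
        subject = "it"                  # English-style expletive subject
    return ([subject] if subject else []) + vp

english  = {"head_initial": True,  "null_subject": False}
japanese = {"head_initial": False, "null_subject": True}
italian  = {"head_initial": True,  "null_subject": True}

print(linearize("John", "ate", "an apple", english))   # SVO
print(linearize("John", "tabeta", "ringo", japanese))  # SOV
print(linearize(None, "piove", None, italian))         # null subject: ['piove']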



2.2.8  Crucial Move from Hypothesized Rules to Mechanical Principles The P&P model, particularly as envisaged by Chomsky, invited what we will see was a theoretically crucial change from the earlier Aspects (1965: 30) model, which was not immediately stressed at the time. Whereas, as we saw in §2.2.2, on the Aspects model the principles are hypotheses that the child represents and confirms or disconfirms on the basis of sensory input, on the P&P model all that the child has to (dis)confirm are the settings of the parameters. The principles are just principles that govern the operation of the machine without their being represented or (dis)confirmed, much as the principles that govern metabolism may be satisfied by the metabolic system without being represented or (dis)confirmed by that system itself (cf. Chomsky 1980a: 134, 2000: 65).39 As we will note in subsequent chapters (esp. §11.1), the comparison can get overstated. but it does have the advantage that, as Smith and Allott (2016) point out: children can ignore a whole host of alternatives which are logically possible but linguistically excluded by universal principle. The generality of the principle suggests that it is part of the mental architecture the child brings to bear on the language-learning task and is not something that has to be learned. (Smith and Allott 2016: 72, emphasis mine)

Indeed, initially, Chomsky (1981) regarded even the setting of parameters as a simple mechanical, "triggering" effect: mere exposure to an ambient Null-subject language could directly cause a parameter to be set to null-subject, without there being any intermediate inductive or abductive processes (we will return to problems with this proposal in Chapter 11). This led him in his (1980a) to distance himself from his previous (1965/2006) endorsement of an abductive model, wondering

whether the steady state grammar arises in another way—for example, by virtue of a hierarchy of accessibility . . . and a process of selection of the most accessible grammar compatible with given data.  (Chomsky, 1980a: 136)

39  It is perhaps this change that Searle (2002a, b, c) construed as Chomsky's abandonment of his original project:

The project to get such rules failed and Chomsky has now given up on that project. As I said in my review, something else may yet succeed, but the original project of a transformational generative grammar has been abandoned.  (Searle, 2002b: internet)

Although, as already noted (fn 17 above), the technical proposal of "transformations" has come and gone and then sometimes come again, the change from rules to unrepresented principles is a deeper one, although, along the lines of development we have been tracing in this chapter, hardly one that should be regarded as an abandonment of the original project.


Since Chomsky (1981), there has been intense discussion of principles and parameters. Extremely detailed psycholinguistic research has been devoted to testing ways in which relevant parameters could be set in the short time of language acquisition, particularly in a way that satisfied the "subset principle," according to which a child must not overgeneralize to a richer grammar from which she would have to retreat (which would require negative evidence of a sort children do not regularly receive, see §1.5.8 above and §5.2 below). Unfortunately for the "triggering" suggestion, it emerged that the setting of one parameter depended upon the setting of another, so that whole batteries of parameters—in the case of Yang (2002: 26ff) an entire grammar!—had to be set by the input, a process that could hardly any longer be regarded as a piece of simple, brute causal triggering, but seemed to require some kind of probabilistic weighing of one grammar against another, more like hypothesis testing after all.40 So the subset problem remained. Another problem was simple proliferation. At one point, it was tempting to think that there were only about twenty parameters, which meant that there could be—only?—2²⁰ (i.e., 1,048,576) possible grammars! Over the years, however, different parameters were proposed by different theorists, and, especially with the proposal of "micro-parameters," their number seemed to become completely unmanageable, not only by the theoretical linguist, but by the child—that is, the theory risked ceasing to be explanatorily adequate (even if it might still serve a purely descriptive function). In any case, the various principles themselves seemed like a set of stipulations that it would be theoretically desirable to reduce and perhaps derive from something less arbitrary.

2.2.9  The Minimalist Program

A persistent complaint of Chomsky's critics was that it was difficult to imagine how such an elaborate system of even Principles and Parameters could have evolved, an issue that Chomsky came to call "Darwin's Problem." In order to begin to address it and to cut through all the arbitrariness and proliferation that came to characterize both the Aspects and GB/P&P models, Chomsky (1995a, 2004a) proposed a fourth major approach, what he called the "Minimalist Program" ("MP") (alluded to in the quote in §2.2.5 above). It has undergone considerable development in various ways in the last two decades, which is why it is called more of a "program," rather than a "theory." No one pretends it provides a single, nearly adequate account of grammar at this time, nor even explains data not explained by earlier approaches. The effort is

40  See Janet Fodor (2001) for critical discussion, and her (2009: 256) consequent interest in returning to the Aspects model that we noted earlier.


merely to try to simplify and unify earlier approaches in a way that would suggest an answer to the Darwin problem. It does not aim so much to deny the claims of previous approaches, for example those of P&P, but to explain them (or their appearance) more economically. The central idea of MP is that grammatical constructions should be the result of an extremely simple operation, "merge," which is essentially the construction of a set from two elements:41

merge (x,y) = {x, y}

For example, love[V] and Mary[N] could be merged to form

{love[V], Mary[N]}

which in turn would be merged with will[Infl] to form:

{will[Infl], {love[V], Mary[N]}}

and then with John[N] to form

{John[N], {will[Infl], {love[V], Mary[N]}}}

This result could be labeled and represented as a familiar tree, rendered here as a labeled bracketing:42

[IP [NP [N John[N]]] [I′ [I will[Infl]] [VP [V love[V]] [NP [N Mary[N]]]]]]
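Though nothing in the Minimalist literature is put in programming terms, the bare mechanics of merge are simple enough to render in a few lines of code. The following is a minimal sketch (in Python; the pair-encoding of lexical items and the helper names are my own expository assumptions, not anything proposed by Chomsky), covering both external merge and the "internal" variant discussed just below:

```python
# A toy model of merge as bare set construction. Lexical items are
# (label, category-feature) pairs; complex syntactic objects are
# frozensets of syntactic objects.

john = ("John", "N")
will = ("will", "Infl")
love = ("love", "V")
mary = ("Mary", "N")

def merge(x, y):
    """External merge: build the two-membered set {x, y}."""
    return frozenset([x, y])

def constituents(so):
    """All syntactic objects contained in so, including so itself."""
    found = {so}
    if isinstance(so, frozenset):
        for part in so:
            found |= constituents(part)
    return found

def internal_merge(so, target):
    """Internal merge: re-merge a constituent of so with so itself;
    the original occurrence stays in place as the lower 'copy'."""
    assert target in constituents(so), "target must already occur in so"
    return merge(target, so)

vp = merge(love, mary)     # {love, Mary}
i_bar = merge(will, vp)    # {will, {love, Mary}}
ip = merge(john, i_bar)    # {John, {will, {love, Mary}}}

# Internal merge of 'will' yields the structure behind the question
# "Will John (will) love Mary?", the lower copy being left unpronounced:
question = internal_merge(ip, will)
```

The point of the sketch is only that "movement" reduces to re-merging something already present; no further displacement operation is needed.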

41  merge wasn’t proposed until Chomsky (2004a). I am abstracting from the issue of whether full set theory is the right apparatus for expressing the view; see Collins (2011: ch 6) for discussion. 42  For simplicity, I provide labels of the sort we have already seen and omit further complications. Determining exactly what labels should actually be attached—if any!—to the results of each merge would involve more technical discussion than we need—but note that the system can still respect endocentricity (see 2.2.4 above), the category of each merge being determined by its head, e.g., being an IP generated by an I. For discussion, see Adger (2003) and Chomsky (2009b: 52–3).


So far, these are "external" merges, involving combining new items. There also are "internal" merges, which merge an element already in a set with the set itself; for example, {John[N], {will[Infl], {love[V], Mary[N]}}} could be merged with the inflectional verb will[Infl] that has been previously merged to form

{will[Infl], {John[N], {will[Infl], {love[V], Mary[N]}}}}

which, merging with an interrogative element, could become the question Will John will love Mary? When the result is eventually processed by the phonological system, this would be pronounced with the repetitive will deleted. Thus, merge and deletion of copies accomplish what previous views accomplished with "move," capturing the phenomenon of apparent displacement that we noted earlier to be characteristic of natural languages. Obviously, such an absurdly simple operation as recursive merge by itself could not possibly account for the idiosyncratic structures of grammar. It needs to be constrained in a number of ways. One source of constraints are the features, such as [N], [V], [Infl], that, as indicated, are attached to functional and lexical items that serve as the terminal nodes. Recall the important fact from §2.2.3 that lexical items—"words"—are not mere phonological sequences, but are to a first approximation treated as bundles of "features," representations of which constrain how the item is to be processed by the merge system. The system "checks" that features needed by one item (e.g., a transitive verb needs an NP complement) are supplied by another (viz., an NP), and deletes features that are "uninterpretable,"43 such as number features on a verb, or gender or case features that (as mentioned in §2.2.5) are not reliably interpretable semantically. If features are not checked and an uninterpretable feature remains, the merge "crashes." If it does not crash, then it is a candidate for further merge, which can occur recursively along the same lines, driven and constrained in various ways by the need for each item to have its features "checked" by the features of the others, a process rather like chemical combination or "Lego" constructions. The whole procedure is further constrained by general principles of computational economy, such as

43  "Interpretable" does not mean in this usage "able in principle to be informative at the interface," but merely to be in fact always used for interpretation at the interface. So, even though Case might sometimes provide information about thematic roles, it is unreliable (cf. §2.2.4). And the number feature on verbs, while perhaps informative, is redundant because the same information is on the noun with which it agrees. Perhaps a better term would have been "unnecessary at the interface."


"Greed" (constituents move—i.e., internally merge—only in order to satisfy the requirements of their syntactic features) and "Last Resort" (operations do not apply unless they are the only way to satisfy some requirement). They, and the operations of checking and merge, are expected to have the effect of rendering the whole process so elegant that it approaches "conceptual necessity."44 Still, many have wondered why, if the system operates by "near conceptual necessity," it is not just an application of some general human abilities to reason about almost any domain (see, for example, Pinker and Jackendoff, 2005). Here it is important to bear in mind that, as always, a Chomskyan theory is focused upon the core syntactic system considered by itself, independently of other mental systems, what Hauser et al (2002) call the "Faculty of Language Narrow" (FLN), that is supposed to display such elegance. According to Minimalism, the idiosyncracies arise from the complex ways in which this FLN must interact with the interfaces to meet, for example, independent demands of linearization by the sensori-motor system, creating the "Faculty of Language, broad" (FLB). After all, for example, tree structures are not linear, but merely hierarchical, like Alexander Calder mobiles twirling in the air, with one node suspended from another and capable of being to one side or the other relative to its siblings. But we produce and hear SLEs in linearly structured time. This linearization of a tree structure is, strictly speaking, post-syntactic, and part of the morpho-phonological system: as Chomsky (2000: 13) speculates, "the displacement property is . . . forced by legibility conditions" (see also Berwick and Chomsky, 2011: 31–2, 37–8).45

44  That is, Chomsky (2000: 10; 2005: 9–11) hopes the system would be so simple and "optimal" as to obtain by "virtual conceptual necessity," rather in the way many physicists hope for the ultimate laws of physics, and some biologists, for ultimate "laws of form" à la D'Arcy Thompson (1917/2014); see Boeckx, (2006: chap 4) for rich and useful discussion. Note that this hope of realizing a certain conceptual, perhaps mathematical ideal is not entailed merely by the "Galilean method" of idealization that we discussed in §1.1.2 (cf. fn 4); nor is it intended to involve any sort of a priori justification of its truth. Those pressing for teleo-functional explanations (§2.2.5) may find some satisfaction were such an ideal realized; but note again the ideal is not required.

45  This is where the parameters of the P&P model may also be explained. In some cases, like many other properties that drive the merge system, they appear to attach not to grammars, but to lexical items, specifically the functional, "closed class" items (see fn 14 above). This is a version of a proposal of Manzini and Wexler (1987: 424), who defend what they call the "Lexical Parameterization Hypothesis," an earlier version of which they credit to Borer (1984).
More recently, Berwick and Chomsky (2011: 37) argue that many of them, for example, Null-subject, Head Directionality, and Polysynthesis may not concern the syntactic system itself, but rather the morpho-phonological system.


material that has been "spelt out" as input to the phonological or conceptual systems (see Uriagereka, 1999, and Sprouse and Hornstein, 2013a, b for rich discussion).46 The hope in all this is that

The minimalist program seeks to show that everything that has been accounted for in terms of [levels of deep and surface structure] has been mis-described, and is better understood in terms of legibility conditions at the interface . . . that means the projection principle, binding theory, Case theory, the chain condition, and so on.  (Chomsky, 2000: 10)

Berwick and Chomsky (2011) express the strongest version of the aim of the Minimalist program:

A very strong thesis, called the "Strong Minimalist Thesis," is that the generative process is optimal: the principles of language are determined by efficient computation and language keeps to the simplest possible recursive operation, merge, designed to satisfy interface conditions in accord with independent principles of computational efficiency. Language is something like a snowflake, assuming its particular form by virtue of the laws of nature—in this case principles of computational efficiency—once the basic mode of construction is available, and satisfying whatever conditions are imposed at the interfaces. The basic thesis is expressed in the title of a recent collection of technical essays: "Interfaces+Recursion=Language?" (Sauerland and Gärtner 2007).  (Berwick and Chomsky, 2011: 30)

Note that the interest of this theoretical move may not lie in its predicting any novel phenomena that were not predicted on earlier models. Its interest is purely the theoretical one of unification. On the other hand, it is not so unifying that the language system merely becomes a particular application of a general, simple cognitive strategy. Although recursive merge could in principle be available to the general cognitive system, these interfaces in the FLB are presumably specific architectural features of the brain to which the general cognitive system would not have any sort of direct access.47

46  This is sometimes expressed by saying that "there are no longer LF or PF levels of representation." But it may be less confusing to say simply that there are no further syntactic rules defined over sentences qua "sentences" at these final syntactic stages (recall what it was to be a "level of representation," §2.2.1 above). But of course the precise character of the output of the language faculty will matter greatly to its interaction with the phonological and conceptual systems; the present convention seems to be to refer to that output as being at the "SM" (sensory-motor) and "CI" (conceptual-intentional) interfaces.

47  Just what the relation there could be between merge as it might operate in the I-language and how it might operate in general cognition is, of course, an interesting topic of speculation. Hauser et al (2002) propose that it may have developed initially only with respect to language (which, after all, provides a notation in which to record the results of recursion, which might not otherwise be available to cognition), which is later recruited by general cognition, enormously expanding its capacities. But this is only speculation at this point. The important point is that claims of the "near conceptual necessity" of recursive merge are not incompatible with its being confined at least initially to the I-language.


That the whole of syntax could be captured by such comparatively simple and elegant ideas is, of course, an immensely bold conjecture and one which up to now has not been nearly shown to be true—that is why it is still a "research program," not a full theory. But it is one that seems to have met with some measure of success, particularly in deriving many of the phenomena that were captured by earlier approaches (see Hornstein, 2005).

2.2.10  The "Third Factor": Darwinian and Neural Speculations

In an effort to specifically address "Darwin's Problem," Hauser et al. (2002) have argued that the conception that emerges from these considerations suggests an in-principle solution to it. As we have noted, according to Minimalism, an I-language consists of a simple, presumably easily realizable core of recursive merge, the FLN, which could easily have evolved and which is deformed by relations to the interfaces to form the FLB. This latter was presumably used to express existing concepts of earlier mental systems, but then, exploiting this recursive resource, was able to create indefinitely many, complex new ones. As Chomsky put the point to McGilvray:

It seems that the language system developed quite suddenly. If so, a long process of historical accident is ruled out, and we can begin to look for an explanation elsewhere—perhaps, as Turing thought, in chemistry or physics. (Chomsky and McGilvray, 2012: 23; see also Chomsky 2002a: 149–5, quoted in §2.2.5 above)

Chomsky proposes the following not completely implausible evolutionary history (I summarize speculations set out at Chomsky and McGilvray, 2012: 14, 78):
(i) Some individual undergoes a mutation that permits recursive merge, which permits him or her to have a capacity for an indefinite range of thought—but not yet with any means of externalizing it through sign or speech (which would have no point when the mutation was in only one individual);
(ii) This capacity permits planning about indefinitely remote places and times, which has an obvious selectional advantage;


(iii) The capacity now gets transmitted to the individual's children;
(iv) Given the advantages, this small breeding group becomes gradually dominant;
(v) The now shared capacity comes to be externalized amongst the group, the recursive capacity becoming integrated with pre-existing motor and perceptual capacities, e.g., speech, sign, or song, all of which require temporal linearization, which produces the idiosyncracies of natural language that we have noted.

Note that Chomsky allows a role here for selectional, "functional" explanation, of a sort that we cautioned earlier (§2.2.5) he simply does not think linguistic theory requires. Chomsky (2005) describes the physio-biological constraints of the various systems as a "Third Factor" in the determination of grammar, in addition to the genetic and experiential factors:

The biolinguistic perspective regards the language faculty as an "organ of the body," along with other cognitive systems. Adopting it, we expect to find three factors that interact to determine (I-)languages attained: genetic endowment (the topic of Universal Grammar), experience, and principles that are language- or even organism-independent.  (Chomsky, 2005: 1)

This third factor may be the crucial determinant of the specific format a grammar assumes: It is then not impossible . . . that the format of grammar actually does involve, to a high degree, principles of computational efficiency, and so on—which may be not only extra-linguistic, but extra-organic—and the acquisition problem is then shunted aside. It’s a matter of fixing the parameters. (Chomsky and McGilvray, 2012: 83; see also pp 24, 45–6)

Language, that is, is to be assimilated to other examples of the way in which purely physical constraints may determine biological traits, as in the cases of, e.g., bilateral symmetry, binary branching, mitosis, and the use of polyhedra as construction materials (Chomsky and McGilvray, 2012: 23).48 Although selection pressures undoubtedly enter into the full explanation, they alone do

48  Chomsky implicitly alludes to this third factor early on, e.g., in his (1965) Aspects, speculating that human language may well be due to "principles of neural organization that may be more deeply grounded in physical law" (1965: 59; quoted at Chomsky, 1968/2006: xi). He (2005: 6) cites in this connection theorists of biology, notably Thompson (1917/2014) and Turing (1952), but also the interesting work of Cherniak (2005), who proposes as an example of "non-genomic nativism" the apparent fact that the structure of the brain is optimized by minimizing the amount of neural wiring, which Cherniak argued is a direct consequence of physical laws.


not suffice: one needs the character of the underlying physiology of the evolved organism.

2.3  Some Simple Proposed Explanations

I will provide here a few fairly clear examples of the kinds of explanations a generative grammar might provide for some of the WhyNots considered in §1.1. I emphasize again that I am not pretending to present the latest and best explanations of these or any other phenomena—the field is simply in much too rapid flux to aspire to that—or ones that provide anything like the detail that present analyses standardly display, which would be far too complex to present here.49 I simply want to exhibit the explanatory strategy that generativists employ and which I want to defend in this book, as well as some indication of the striking peculiarities that any theory will need to address.

2.3.1  C-command

An important and (from the point of view of mere linear input) non-obvious formal relation within linguistic structures is "c-command."50 It has figured in generative explanations since at least the mid-1960s, and is one of the few structural relations it is worth really memorizing for later discussion (it will play a role in the explanations below of binding and negative polarity phenomena, and is especially interesting for the nativism discussion in Chapter 5). To define it, we first need to define (the fairly obvious relation of) "dominance" in a tree: A node X dominates all and only the nodes "beneath it" in a standard tree. This can be seen, for example, in the case of the relation of ancestors to their descendants in a family tree. We then define:

49  I will certainly not try to present Minimalist explanations, since, by virtue of the effort to rely on very simple operations of merge and checking, the actual derivations are perforce quite lengthy (cf., a derivation of "2+2=4" in Russell and Whitehead's Principia!). Again, the point of Minimalism is not to deny the earlier explanations, but to subsume them into structures generated by the simplest, "minimal" operations.

50  The term itself, "c-command," was introduced by Tanya Reinhart in her 1976 dissertation, and is a shortened form of "constituent command." Reinhart thanks Nick Clements for suggesting both the term and its abbreviation. The concept Reinhart was developing was, however, not actually new to syntax. Similar configurational notions had been circulating for more than a decade (see Klima, 1964).


Node α c-commands node β if and only if:
(a)  α does not dominate β
(b)  the first branching node that dominates α also dominates β.

An intuitive way of grasping the relation is by the relation A bears to her sister and all her sister's descendants. Thus, in the following tree, rendered here as a labeled bracketing:

[X [A [C G H] [D I J]] [B [E K L] [F M N]]]

X dominates all the nodes; A dominates all the nodes beneath it but none beneath B (and vice versa); C dominates G and H, D dominates I and J, and so forth. Node A also c-commands all and only the nodes B, E, F, K, L, M, N (her sister and her sister's descendants). C-command has some nice properties: it is formally definable (per above); it has played a major role in explaining a variety of phenomena; and, happily for the Minimalist Program, it can be derived from merge.51 We will consider here its role in beginning to explain two of our WhyNots: Negative Polarity Items and Binding Phenomena.
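Since dominance and c-command are purely formal notions, they can be checked mechanically. Here is a minimal sketch (Python, my own illustration; the dictionary encoding of the tree is an expository assumption) that recovers exactly the nodes just listed as c-commanded by A:

```python
# The toy tree from the text, encoded as a map from node to daughters.
TREE = {
    "X": ["A", "B"],
    "A": ["C", "D"], "B": ["E", "F"],
    "C": ["G", "H"], "D": ["I", "J"],
    "E": ["K", "L"], "F": ["M", "N"],
}
NODES = set(TREE) | {d for kids in TREE.values() for d in kids}

def dominates(a, b):
    """True iff a (properly) dominates b, i.e., b is 'beneath' a."""
    return any(d == b or dominates(d, b) for d in TREE.get(a, []))

def parent(n):
    return next((p for p, kids in TREE.items() if n in kids), None)

def c_commands(a, b):
    """Clause (a): a neither is nor dominates b; clause (b): the first
    branching node dominating a also dominates b. Every parent in this
    toy tree branches, so a's parent is the relevant branching node."""
    if a == b or dominates(a, b):
        return False
    p = parent(a)
    return p is not None and dominates(p, b)

print(sorted(n for n in NODES if c_commands("A", n)))
# -> ['B', 'E', 'F', 'K', 'L', 'M', 'N']
```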

2.3.2  (Negative) Polarity Items

Negative Polarity Items comprise a motley of words and idioms—a partial list follows—that, despite the robustness of the Unacceptability response,52 turns out to be surprisingly difficult to define. Roughly, what they have in common is that they must be c-commanded by a "negative," "downward entailing," and

51  See Epstein et al (1998). But note that, for all its nice features, it remains a cognitively curious property that would be by no means obvious to a learner concerned only with mere linear strings of words, a fact to which we will return in Chapter 5. It affords a vivid example of how a speaker is sensitive to structures, not mere sequences. 52  My local morphologist, Omer Preminger, tells me he has never heard of a language in which the phenomenon does not occur.


non-factive constituent (see §1.3.2 (ii) for these terms). These can include questions, negations, and "affective" or negatively tinged words, such as doubt, forget, deny. Some NPI examples are: any, ever, give a damn, at all, yet, in weeks, in ages, in the longest time, last long, take long, bother to, can seem to. (Similar constraints apply to "positive polarity items," such as could well have and is sort of—see §1.3.2 (ii)—which will not be discussed here.) Examples (19) and (20) offer tree representations—rendered here as labeled bracketings—of simple examples from our set that show how c-command by a licensor is crucial (note in each the position of the licensor and of the NPI):

(19)  John doubts/*knows Sue will buy any wool.53

[IP [NP [N John]] [I′ [I pres] [VP [V doubts/*knows] [CP [C that] [IP [NP [N Sue]] [I′ [I will] [VP [V buy] [DP [Det any] [N wool]]]]]]]]]

53  Note, again, that “any” has a non-NPI, “free choice” usage as a universal quantifier, allowing John knows Sue will buy any wool (at all). The NPI “any” in (20) is being used as an existential quantifier.


Both doubt and know c-command their sister, the complement phrase, CP (and so all of her descendants). But only doubt licenses the NPI, any: unlike know, doubt is "negative," downward-entailing, and non-factive. Some might think that mere precedence of the licensor in the linear ordering of the sentence would be enough to license an NPI. But consider:

(20)  Although John doubts that she likes sheep, Sue will buy some/*any wool.

whose structure could be represented as:

[TP [CP [C Although] [TP [NP [N John]] [T′ [T pres] [VP [V doubts] [CP [C that] [TP [NP [N she]] [T′ [T pres] [VP [V likes] [NP [N sheep]]]]]]]]]] [T″ [NP [N Sue]] [T′ [T will] [VP [V buy] [DP [D some/*any] [NP [N wool]]]]]]]


Here, although the licensor "doubts" precedes the NPI, "any," it does not c-command it. Doubts c-commands the CP, that she likes sheep, but is neither a "sister" nor an "aunt" of any in the IP, she will buy *any wool, and so the NPI is not licensed.
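The licensing condition just illustrated is mechanical enough to be checked by machine. Here is a minimal sketch (Python; the miniature parses and the treatment of licensors as single words are my own expository simplifications), which correctly licenses the NPI in (19) and rejects it in (20):

```python
# Miniature parses for (19) (with 'doubts') and (20); each node is
# [label, daughter, ...]; leaves are words.

SENT_19 = ["IP", ["NP", "John"],
           ["I'", ["I", "pres"],
            ["VP", ["V", "doubts"],
             ["CP", ["C", "that"],
              ["IP", ["NP", "Sue"],
               ["I'", ["I", "will"],
                ["VP", ["V", "buy"],
                 ["DP", ["Det", "any"], ["N", "wool"]]]]]]]]]

SENT_20 = ["TP",
           ["CP", ["C", "Although"],
            ["TP", ["NP", "John"],
             ["T'", ["T", "pres"],
              ["VP", ["V", "doubts"],
               ["CP", ["C", "that"],
                ["TP", ["NP", "she"],
                 ["T'", ["T", "pres"],
                  ["VP", ["V", "likes"], ["NP", "sheep"]]]]]]]]],
           ["T''", ["NP", "Sue"],
            ["T'", ["T", "will"],
             ["VP", ["V", "buy"],
              ["DP", ["Det", "any"], ["N", "wool"]]]]]]

def contains(node, word):
    if isinstance(node, str):
        return node == word
    return any(contains(d, word) for d in node[1:])

def subtrees(node, ancestors=()):
    yield node, ancestors
    if not isinstance(node, str):
        for d in node[1:]:
            yield from subtrees(d, ancestors + (node,))

def licensed(tree, npi, licensors):
    """An NPI is licensed iff some licensor word c-commands it: the
    first branching node above the licensor must contain the NPI.
    (A leaf licensor cannot itself dominate the NPI, so no further
    check is needed.)"""
    for node, ancestors in subtrees(tree):
        if isinstance(node, str) and node in licensors:
            branching = next((a for a in reversed(ancestors)
                              if len(a) > 2), None)
            if branching is not None and contains(branching, npi):
                return True
    return False

print(licensed(SENT_19, "any", {"doubts"}))  # True: doubts c-commands any
print(licensed(SENT_20, "any", {"doubts"}))  # False: doubts is buried in the CP
```

The crucial step is that only the first branching node above the licensor is consulted: linear precedence plays no role, which is exactly why (20) fails.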

2.3.3  Binding Phenomena

As we noted, binding theory was one of the modules of GB/P&P theory.54 Three principles of binding have been highly influential as approximately invariant principles or derived generalizations, and, although they are no longer thought to be strictly true, they have proven descriptively useful and are frequently discussed in approaches that do not include them (e.g. Lasnik et al, 2005). Whatever their ultimate status, they call attention fairly dramatically to the role of subtle structural facts such as locality and c-command. Binding phenomena are intuitively thought of as involving the "co-reference" of pronouns and their antecedents, as in

(21)  John likes himself.

But, technically, instead of using the semantic term "refer," we could resort to a purely syntactic term, "co-index" (see §1.3.2fn18):

One expression, α, binds another, β, iff α c-commands β and is co-indexed with it.  (Roberts, 1997: 126)

Thus, to stay with (21), John would bind himself iff the two expressions were co-indexed: Johni and himselfi, the common "i" subscripted "index" indicating that they are intended to refer to the same person. (Note that c-command here plays the role that parentheses play in predicate logic, indicating the scope of a quantifier and the variables within that scope that it binds.) Here is one standard statement of the three principles, which are known by their alphabetical designation. Note that "R-expressions" are NPs (such as names or descriptions) that are neither pronouns nor (reflexive) anaphors. Note also that in generative grammar "anaphors" are restricted to only reflexive (e.g., himself, themselves) and reciprocal (e.g., each other) pronouns, and do not include the ordinary pronouns (such as he, she, it, they) and that, to a

54  These examples are slightly more intricate than the others, but are so frequently discussed and interesting to philosophers that they deserve to be included here. The reader could skip to §2.4 without loss of continuity.


first approximation, "local" means "within the smallest finite clause that contains it":55

Principle A: An anaphor must be bound locally.
Principle B: A pronoun must not be bound locally.
Principle C: An R-expression must not be bound by any antecedent.

Thus, according to Principle A:

(24)  Johni likes himselfi

is fine, since John c-commands and is sufficiently local with the co-indexed himself:

[IP [NP Johni] [VP [V likes] [NP himselfi]]]

However,

(25)  *Johni thinks Paulj likes himselfi

is unacceptable: although both John and Paul c-command himself, himself is "local" only to Paul, and so cannot be co-indexed with John; but it can be co-indexed with Paul:

(26)  Johni thinks Paulj likes himselfj

Principle B rules out:

55  See Lasnik et al. (2005: 47, 231–6), Roberts (1997: 130–42) for technical discussions of locality (we will consider a subtle example that children appreciate in §5.2). Note that (i) these principles are often stated in terms of "free," meaning "not bound" (and, n.b., not meaning "free to be either bound or not bound"; so being free excludes co-indexing with a non-local element); and (ii) that "antecedents" can be maximal NP projections, such as The man of dubious character who loved Mary, which will thereby c-command whatever subsequent VP, even if The man by itself does not (where a "maximal NP projection" is one that is not dominated by an NP (see fn. 24 above)).


(27)  *Johni likes himi

since him is c-commanded and "local" to John, and so must not be (intended to be) co-referential with John:

[IP [NP Johni] [I′ [Inf pres] [VP [V likes] [NP himi]]]]

However, Principle B allows:

(28)  The son of Johni likes himi

In (28), him, while still "local" to Johni, is not c-commanded by it, since John occurs in a PP occurring in a complement to a DP, and so is not a sister or aunt of himi:

[IP [DP [Det The] [NP [N son] [PP [P of] [N Johni]]]] [I′ [I pres] [VP [V likes] [NP himi]]]]


Note, by way of contrast, that (i) by the same principle, himi cannot be The soni of John, which does c-command it, but that (ii) if instead of himi we had himselfi, co-reference with The soni of John would be OK, since in that case, by Principle A, the c-command and locality are just what are needed (but that, in that case, himselfi could not be Johni, for lack of the c-command). Principle C rules out the reverse of (28):

(29)  *Hei likes Johni

since Hei c-commands Johni, but does not rule out:

(30)  Although hei was poor, Johni was honest.

Principle C does not rule out (30), even though hei precedes Johni (see the tree for sentence (20) in the discussion of NPIs in §2.3.2 above). The reader will have noticed that binding was actually mentioned above as a parameter. And indeed there appears to be parametric variation in Principle A: whereas reflexives in English need to be co-indexed locally, in Japanese, reflexives can be co-indexed with any c-commanding antecedent. Thus, in (31), the Japanese reflexive pronoun, zibun (translated here as "REFLEC," since it is gender and person neutral) can be co-indexed with either John or Bill (example from Nick Allott, pc):

(31)  John-waj [Bill-gai zibun-oi/j nikunde iru] to omotte iru.
      Johnj [Billi REFLECi/j hates] that think

In colloquial English this would be translated as the ungrammatical

(32a)  *Johnj thinks that Billi hates himselfi/j

since this would allow

(32b)  *Johnj thinks that Billi hates himself*j (i.e., asterisk on the second "j")

where, of course, English only allows

(32c)  Johnj thinks that Billi hates himselfi
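The contrast between the two settings of the domain can be put schematically. A minimal sketch (Python, purely illustrative; the encoding of nested clauses as a list of their subjects is my own assumption):

```python
# "John thinks (that) Bill hates REFLEC": subjects, matrix clause first.
SUBJECTS = ["John", "Bill"]

def possible_binders(subjects, domain):
    """Antecedents available to a reflexive in the innermost clause:
    'local' (English himself) sees only the smallest clause's subject;
    'any' (Japanese zibun) sees every c-commanding subject."""
    return {subjects[-1]} if domain == "local" else set(subjects)

print(possible_binders(SUBJECTS, "local"))  # {'Bill'}: only (32c)
print(possible_binders(SUBJECTS, "any"))    # {'John', 'Bill'}: as in (31)
```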


So Principle A is parameterized as to how wide the co-indexing domain may be, English, but not Japanese, setting the domain parameter quite narrowly (see Manzini and Wexler, 1987, for discussion). A number of theorists have pointed out that binding phenomena can be highly sensitive to semantic and pragmatic influences, giving rise to apparent counterexamples to each of the principles. Principle (C) has been the most vulnerable. Thus, logic teachers explaining the law of Universal Instantiation are all too familiar with the frequent need to point out to students that, if everyone loves Oscar, then it follows that

(33)  Oscari loves Oscari.

Thus, one can easily imagine saying:

(34)  Everyone loves Oscari. Even Oscari loves Oscari!

Indeed, it would be hard to express identity theory without violating Principle C:

(36)  If Marki is Sami, then Sami is Marki.
(37)  If Marki is Sami, then whatever is true of Marki is true of Sami.

Identity statements can similarly produce counterexamples to Principle B:

(35)  He must be Bob; he's driving Bob's car (Higginbotham, 1983a)

All that such examples show, however, is that pragmatic interests may override syntactic principles, not that the Binding Principles are not operating at all.56 As with all of the syntactic phenomena exhibited by WhyNots, the fact that syntactic rules may be overridden by other systems is not an argument that there are no rules there, any more than the upward rise of helium balloons refutes the theory of gravity. The important point is that usage is sensitive to issues of c-command and locality, not that it is completely determined by them. This is a point at which skeptics may complain that the theory is "unfalsifiable." Quite aside from the difficulty of spelling out a general condition of

56  There is an extensive literature on the Binding Principles and purported counterexamples to them; see, e.g., Levinson (1991) and Crain (2012). As mentioned earlier, Reinhart (1983) suggested that they are subject to semantic and pragmatic influences, and so might be regarded as a result of interactions between syntax and these other factors.


falsifiability that would be relevant to Galilean theories (which are clearly assessed as theoretical wholes, not by individual data alone), there are at least two obvious ways in which Binding Theory could fail: either the data cited on its behalf might be shown to be entirely spurious, or alternative theories could be shown to explain the data more simply. Given the robustness of so many examples, the first seems wildly improbable; but, given the constant drive of the theory towards greater abstractness and integration with independent systems, the second seems not improbable at all. Indeed, Binding Theory has undergone significant transformations over the three and a half decades or so that it has been studied. But given at least the aim of binding theory to achieve explanatory adequacy for a great range of cases, it is hard to see how any successive theory would not have to accommodate most of the facts about locality and c-command that the Binding Theory seems to capture at least descriptively. With these sketches and samples of the technical explanations of the core theory in hand, we turn in the next chapters of Part I to some of the less technical, philosophically more controversial claims that provide the framework for the technical details, and so should also be regarded as part of the core theory. We will then consider in Parts II and III other philosophical claims that are less essential.

2.4  Conclusion

Again, the sketches I have provided in this chapter of the generative program are perforce extremely superficial. Interested readers should consult the various texts referenced for the serious details and controversies that still surround most of the proposals. As I have also stressed, these approaches, especially the "Minimalist Program," are still very much in flux, but this flux should not for a moment be seen as a defect of them. Successive proposals are advanced in the interest of greater generality and simplicity, and particularly for Chomsky, greater explanatory adequacy. Even if, for example, specific proposals, such as the postulation of parameters, turn out to be flawed, this should not for a moment be taken as an indictment of the whole program. Science is hard, but, still, on the whole progressive, and it is hard to see how any further theories could avoid taking account of the extraordinary data, generalizations, and explanatory strategies that Chomskyan theories have provided. Strictly generative approaches to linguistic structure are not the only theories of grammatical structure to have emerged. There are purely "representationalist"


theories that do not posit the kind of derivational operation of merge to generate syntactic structures (e.g. Brody, 2002; see Chomsky, 1995a: 204–5 for his view of the issue).57 Other approaches use different kinds of information than that used by generative theories, for example information about what are traditionally termed "grammatical functions" such as subject and predicate, as in so-called "declarative" approaches such as Lexical Functional Grammar (e.g. Bresnan, 2001) and Head-driven Phrase Structure Grammar (HPSG, e.g., Pollard and Sag, 1994). These theories, however, share most of the features of generative theories that have been of philosophical concern: for instance, they aim to at least characterize linguistic competence by some sort of highly idealized recursively productive principles, even if the characterizations differ. The structures that are posited by these theories are every bit as abstract as those found within Minimalist analyses, and subject to the constraints imposed by the phonological and conceptual interfaces (and so should not be confused with the Usage Based proposals discussed in §2.2.6). As many have stressed, "inference to the best explanation" is by itself not a sufficient reason to take an explanation seriously. There can be odd noises in an attic for which no one has a ready natural explanation, and the "best" on offer may be an extravagant postulation of ghosts. There is obviously some kind of threshold on acceptable explanations, and it is a perfectly reasonable question whether Chomskyan proposals are above that threshold. Judgments may vary, and we will look at some objections and alternative approaches in Chapter 5. But, especially with the evidence of the WhyNots, recent proposals of how merge could be easily implemented in a brain, and how a simple recursive operation leaving readable records could have evolved, it is hard to see how it could be dismissed out of hand. But here, as in science generally, both god and the devil are in the details, which readers are encouraged to pursue.

57  This use of “representational,” as opposed to “derivational” is entirely orthogonal to the use that will concern us in Chapter 8 on “(intentional) representational” as opposed to “(algebraic) representational.”


3  Competence/Performance: Determinate I- vs. E-languages

I was once told what certainly sounded like a typical joke about Chomsky, that after a talk he gave to some linguists in the mid 1990s, a member of the audience said: "Noam, I like your new work, but you can say the following in Welsh," and he produced a Welsh sentence that would apparently have been excluded on Chomsky's view. To which, without missing a beat, Chomsky is said to have replied, "Well, that just shows you Welsh is not a natural language." And then, after missing only half a beat, he added, "In fact, come to think of it, it's commonly presumed that people speak natural languages. There's not a shred of reason to think so."1 When I told this story to a prominent philosopher, he did not believe me: "Chomsky wouldn't have said something so preposterous," he admonished me. And, to be sure, it sounds like precisely the kind of hyperbole of which Chomsky is sometimes (not always uncharitably) accused. This is not, however, how I regard this particular remark. To the contrary, I want to argue that the punch line of this "joke" is not only entirely plausible, but is precisely what Chomsky's core theory would lead one to expect. To appreciate the joke, it will help to consider how the WhyNots raise two surprising issues: they point to the existence of non-conventional grammatical constraints (§3.1), and they invite the postulation of an internal "I-language," distinct from and theoretically deeper than the external "E-languages" that people standardly take themselves to be speaking (§3.2). I will then defend this conception against various forms of what I call "superficialism," or the view that all genuine psychological distinctions can be made on the basis of ordinary behavior or introspection. The view, restricted to behavior, was most saliently espoused by scientific behaviorists (§3.3), but surprisingly it survives in weaker, quite influential forms that allowed for at least some introspective

1  Joseph Levine (pc) reported hearing the exchange at the University of North Carolina. I should say, though, that nothing in what follows here is intended to adjudicate the issue that the questioner raised about Welsh.


knowledge, and did not require all respectable mental terms to be definable behaviorally (§3.4).

3.1  Conventional vs. Non-Conventional Grammars

We discussed in Chapter 1 the external, social conception of language that dominated conceptions of linguistics in the first half of the twentieth century, and is still influential. We saw, however, that reflection on the WhyNots calls attention to just how odd natural language looks to be from an external point of view. Of course, some aspects of our speech may well be conventional: some details of style, accent, and, of course, the fact that certain phonemes and not others are used to indicate single word meaning and grammatical structure. Many such facts may be due to people coordinating their speech with those of others. But precisely why I have stressed the WhyNots is that they seem patently not to be open to such an account. At any rate, it is extremely difficult to imagine how the constraints exhibited by starred strings could have emerged conventionally or are in any other way dependent on the environment, in the way that, say, dropping "r"s and reciting prayers became a norm in parts of Boston. This seems to be true for a number of reasons: first, the WhyNots are virtually never noticed, much less uttered by anyone except linguists. Secondly, even were such a string to slip out, the speaker would likely not be corrected. In a normal conversation, the intended meaning would probably be so easily understood that the anomaly would likely pass unnoticed, or, if noticed, quickly re-worded ("You mean . . .?"). Thirdly, even if it were noticed, no one, probably not even most linguists, could readily explain the rules; and it would certainly be odd for people to be insisting upon conventions they could not even ordinarily conceive. Lastly, even if there were some way in which convention might enter into the acceptance of the rule, there would still be a major task explaining why everyone chose these particularly odd and complicated rules governing WhyNots as conventions, and not some simpler alternatives, such as one that allowed wh-extraction from any conjunctions (e.g., *Who did Sam and kiss Mary?). It would of course be possible in principle to deliberately adopt a convention where one did use WhyNot sentences to communicate (linguists might do this as a joke—although it would probably turn out to be difficult to sustain as a general practice). But Chomskyans will reasonably insist on an empirical point that the evidence so far seems to have supported: (a) that children could


not spontaneously learn to use WhyNots in the way that they spontaneously produce natural language; and (b) modulo parameter settings (cf. §2.2.7), many of the WhyNots are not allowed in any natural language. Unlike virtually all "conventions," uttering WhyNots does not seem to be a naturally viable alternative, in the way that variations (within limits) in vocabulary and pronunciation obviously are. But if crucial portions of grammatical structure are not conventional, what could be their source? It is hard to see how they could issue from anywhere but from "within us," something that was not instilled conventionally, but must have been determined somehow by our mental/physical make-up. There seems to be some "system" in us that is determining our reactions by respecting certain rules that are surprisingly hard to articulate, and do not allow sentences that we otherwise might find perfectly reasonable to utter. We just happen to be a species with this odd, idiosyncratic system. Indeed, one of Chomsky's deepest and most important discoveries is simply that there so much as exists a system of "grammar" that is not merely the result of implicit social conventions. It is hard to see how there could possibly be unconscious constitutive rules of grammar if there were nothing more to language than mere regularities in usage. As he, himself, once put it, when asked what he takes to be his most important contribution:

Chomsky frequently compares I-language to the visual system, its acquisition to the development of puberty, the immune system, and the growth of teeth. Hence the centrality of issues of acquisition to Chomskyan theory (cf. §4.1 below). There are important insights to be had in this conception. It captures (i) the non-conventional character of the WhyNot constraints by locating them in facts about our biology; (ii) why, therefore, the constraints seem peculiarly ineluctable; and (iii) the fact that a good deal of our linguistic competence consists in processes that comprise a largely innate, relatively automatic computational system (we will return to this issue when discussing innateness in §5.4.7 below). For all that, though, the comparison of language with brute processes like maturation can be overdrawn. Chomsky rather overstates this case, making it seem almost idiotic that anyone would disagree:


It's quite striking how sensible people, [even those] embedded in the sciences, just take a different approach toward human mental faculties and mental aspects of the world than they do to the rest. There's no issue of innatism regarding the development of the visual system in the individual; why is there an issue regarding cognitive growth and language's role in it? That's just irrational.  (Chomsky and McGilvray, 2012: 123)

But a nativist suggestion is plainly not equally true of every human mental capacity. There is a spectrum between capacities whose explanation is largely grounded internally in human beings and those whose explanation is grounded more in their external history. At one extreme are sensitivities to color and pain and basic emotional reactions (e.g. fear, disgust, joy), which seem to be due to largely innate properties, subject only to fine tuning by cultural experience. At the other end are abilities to play chess, ride bicycles, drive cars, engage in various social and political institutions such as a market or a democracy. Other than requiring language and general human intelligence, these seem largely grounded in people's specific interactions with each other in external environments. In between are phenomena, such as mathematical, musical, artistic, and literary abilities that are likely grounded in both. With its (all hands agree) largely arbitrary selection of phonological forms as morphemes, and (for Chomskyans) its settings of biologically determined parameters, language is somewhere in this complex area in between. It is surely not "irrational" to think that language may in many significant respects be less innately constrained than vision. If so many features of languages are learned from experience, why should not all of them be? Of course, all that Chomsky needs to be taken as proposing is that language involves a systematic core that is far more internally grounded than many earlier linguists and philosophers have supposed. It is hard to think of many abstract cognitive systems that are like this. Ethics comes to mind. Indeed, to Kant's famous conjunction:

Two things fill the mind with ever new and increasing admiration and awe . . . the starry heavens above me and the moral law within me. (Kant, 1788/1956: 166)

—perhaps we should add, thirdly: "and grammar"! It seems to have much of the character that Kant (1785/1948: Ak4: 411) observed about young children, who seem sensitive to moral principles not derived from experiential examples, and whose abstract character neither they nor most adults are able


readily to articulate. One might even say that both morals and grammar exhibit what Kant (1785/1948: Ak: 431) called the "autonomy" of human beings who find themselves subject to laws of which they themselves are the author—even if they do not always abide by them.2 But, as with morals, full understanding of linguistic phenomena will almost surely be a complex mix of the innate and the learned. The important question is one of explanatory priority, not parceling the phenomena entirely into one or the other.

2  The analogy between grammar and at least a sense of justice was pressed by Rawls (1971: §9) and has been pursued in some detail by John Mikhail (2011). But see §5.4, esp fn28, below for limits of the analogy.

3.2  I- vs. E-language

A perennial issue raised by Chomsky's views is how to understand what he is referring to with his uses of the word "language." The distinction between competence and performance invites a distinction between the "internal" language that is the object of a theory of grammatical competence and the "external" languages, such as "German" and "Dutch" that are ordinarily taken to be what people speak. To mark the distinction, Chomsky (1986: 20–2) introduced the notion of an "I-language" for the internal, "biological" system. He takes the "I" to stand for three properties he thinks the system displays: it is intensional, internal, and individualistic, all of which are in sharp contrast to what he calls an "E-language," which is extensional, external, and social, that is, the kind of phenomena that we ordinarily take ourselves to be referring to when we speak of "English," "German," "Japanese," or "Swahili."3 Extensional approaches to a phenomenon look to the extension, or actual entities that are instances of the phenomenon. Thus, all and only the numbers that are perfect squares are numbers that are the product of a number and itself; that is, the set {4, 9, 16, 25, . . .}. This set can also be specified as the set of sums of consecutive odd numbers: {1 + 3, 1 + 3 + 5, 1 + 3 + 5 + 7, . . .}. So far as an extensionalist is concerned, all that matters are the actual objects in a set, not one or another way that they might be specified. Similarly, an extensionalist

3  Chomsky (1986: 20–2) also introduced "I-language" to distinguish the "grammar" of a natural language from the linguist's theory of the I-language, a distinction that had been blurred in earlier discussions, as well as, as we noted in §1.2, fn4, to replace his earlier use of "competence," which suggested to too many a mere surface ability or behavioral disposition (as in the work of Devitt, see §6.3.2 below). Note that Chomsky did not intend the "I-"s to include intentionality (with a "t"), which we will discuss in Chapters 9–11 (readers should be careful not to confuse the (maddeningly homophonous) "intensional," contrasted with extensional, and "intentional," having to do with the "aboutness" of many mental states; see Preface, fn5, and §8.1 for discussion).


conception of a (binary) mathematical function is as a set of ordered pairs of the domain and co-domain: so the squaring function is simply the set of pairs {⟨1, 1⟩, ⟨2, 4⟩, ⟨3, 9⟩, . . .}, which is identical to the set that can also be generated by pairing numbers with sums of consecutive odd numbers.
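The contrast is familiar in computational terms: two procedures can differ as intensions while determining one and the same extension. A minimal sketch (Python, purely illustrative):

```python
# Two procedures (intensions) that determine the same set (extension):

def square(n):
    return n * n

def sum_of_first_odds(n):
    # 1 + 3 + 5 + ... + (2n - 1)
    return sum(2 * k - 1 for k in range(1, n + 1))

# Extensionally identical wherever we check...
assert all(square(n) == sum_of_first_odds(n) for n in range(1, 1000))
# ...yet intensionally distinct: different ways of specifying that set.
```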


a “language,” just as, I suppose, one could call just any bodily problem a “disease,” so that losing a limb would be included. But, as Putnam (1962a/75a: §3, 1975c) vividly showed, much of our ordinary use of “kind” terms involves designating something like the “nearest natural kind” (if there is one) that includes most of the things that the term has been used explanatorily to pick out: thus, “polio” is standardly used to pick out the activation of a certain virus, “water” often to pick out H₂O (whether solid or liquid, and give or take local isotopes), and “growth” for some specific sort of natural biological process. At least from the point of view of science, the referent of the term is what much of its usage can be thought to be distinctively “getting at.” If Chomskyans are right, then the term “language” arguably can be used to pick out a nearest explanatory natural kind, and this they argue is the internal system responsible for the distinctive competence underlying human speech.5 A possibility that one might think argues against an I-language conception of language is of some alien beings, say from Mars, with whom, amazingly enough, human beings were able to communicate “fully in English,” as we might spontaneously describe the interaction. But suppose they do so on a completely different basis than we do, following what might be an “extensionally equivalent” grammar. It would seem overly scientistic to insist that what they spoke was not really English because of this difference in its etiology. But there is no reason for a Chomskyan to so insist. Perhaps they speak the same, socially individuated E-language. The issue for a Chomskyan is not such a social question, but what he takes to be the serious scientific question of whether they do use the same I-language.6 The social/individualistic distinction concerns how languages are to be individuated: in terms of speakers’ relations with each other, or, instead, in terms of their internal mental structure. Here one might think that Putnam’s insight, to which we just appealed in defending Chomsky’s internalist conception, invites the very opposite conclusion. It is that insight, after all, that leads, for example, Putnam and Burge (1979) to claim that contentful language and mental states depend for their individuation on social and worldly facts outside the skulls of speakers. That is, it turns out to be a fact about the external world we inhabit that uses of “polio” pick out the activation of a certain virus. Although Chomsky acknowledges the insights of these claims about uses of natural kind terms for scientific purposes (see Chomsky 1975b: 18),

5  See §10.4.1 below for more on Putnam’s view, and Pietroski (2017, 2018: chs 1–2) for an application of it to the case of “language.”
6  Similarly, perhaps computer “language translation” will someday become as reliable as a human translator. This in itself would not be relevant to a Chomskyan linguistics unless it were done by way of the same I-language.


he regards them as just that: as issues essentially about the use of language, not about the common underlying competence that makes that use possible. This he insists must begin with the kind of internalist conceptions he pursues (see §10.4.1 below for further discussion).7 I think it is now possible to understand our opening “joke” that no one speaks natural languages. For Chomsky, the nearest scientific kind the ordinary use of the word “language” is getting at is I-language. E-languages are too ill-defined and unexplanatory: even if they were reasonably delineated, their character would be determined not only by the underlying I-language, but by a motley of accidental historical, social, political, and pragmatic factors that Chomsky is skeptical that any seriously general theory could capture.8 The usual criteria for individuating them, for example mutual intelligibility, would appear to be far too vague and multi-dimensional to be scientifically useful: mere phonological idiolects vary without variation in grammar, and many people can become quite good at guessing and effectively communicating in a language they do not fully understand. This is not to say that the ordinary folk notion of a “natural language spoken by a certain group” is not a useful point of departure for the study of I-language, just as ordinary talk of “fire” and “forces” are points of departure for chemistry and physics. Talk of “English,” “Mandarin,” and the like is freely used by generative linguists as a convenient way of talking about what they take to be the underlying reality: I-languages, with their specific syntactic parameter settings, phonology, and lexical properties that are roughly shared by individuals in a certain population. These can be widened or narrowed to the group that shares whatever feature(s) they are interested in: for example “early modern English,” “Northern English English” (which in work on phonology might include Birmingham English but exclude the accent of Berwick). I-languages are theoretical posits, on a par with phenomena posited by chemistry and physics, for example atoms, molecular structures, fundamental forces, that are not “directly observable,” but are proposed by those sciences to explain “observable” phenomena.9 Closer analogies would be the kinds of systems posited by contemporary psychology: the autonomic nervous system;

7  Putnam’s (1975c) “Meaning of ‘Meaning’ ” can be taken in two ways: either as an essentially negative argument against many traditional philosophical conceptions of meaning, or as a positive proposal for a theory of linguistic meaning. A Chomskyan can endorse the former while rejecting the latter (cf. Pietroski, 2017, 2018: ch 2).
8  Not all Chomskyans agree with him on this point. It’s important to note that this denigration of the study of E-languages is quite inessential to the core theory; cf. Sperber and Wilson (1995), Allott and Textor (2017), and §6.5 below.
9  I put “(directly) observable” in scare quotes to stress that there is substantial skepticism concerning whether it is a serious, stable category of any psychology or epistemology, as opposed to a


internal and external perceptual systems; systems of semantic, episodic, and working memory; and so forth. Note that, as is usual in developing sciences, there need be no a priori or even precise definition of “I-language” (although see Chomsky, 1986: 22–3 for a stab at one). As the theory is developed along the lines we sketched in Chapter 2, however, it becomes clear what features I-languages are claimed to have and what phenomena, in conjunction with other psychological hypotheses (e.g. about perception, memory, and cognition), they would explain.10 I shall follow Chomskyans in using the “I-” prefix for things associated with the I-language, for example, “I-syntax” and “I-semantics,” as a need for these distinctions might arise. The extensionality of E-languages also adds a further consideration that is very likely not shared by many of the folk. Applied to a language, an extensionalist conception treats it as a set of expressions, as opposed to the intensional rules for generating them.11 Although this was certainly fruitful in the definition of formal languages, and was pursued with great industry by logicians, its application to natural languages by Quine (1953/61c, 1960/2013), David Lewis (1970), Richard Montague (1974), and Donald Davidson (1984) remains controversial. As Chomsky (2000) notes:

The class of expressions generated by the (I-)language should not be confused with a category of well-formed sentences, a notion that has no known sense

pragmatic distinction that depends upon what facts and technology (radio telescopes, electron microscopes) are taken for granted at a particular time. 10  Pereplyotchik (2017) thinks reference to E-languages is indispensable to linguistics: It is very difficult to see how such findings might be formally recast as claims about the relation between the child’s usage and some specific I-language. Against whose I-language would a child’s usage be quantitatively compared? One might reply that the acquisition theorist is comparing the child’s grammar to an idealized I-language. But it is not at all clear what import the appeal to idealization has in this context. If we press on the notion of an idealized I-language, we find that it amounts to no more than a consistent setting of parameters. . . . But there are many such settings, most of which fail to match the language of the child’s linguistic community. Thus, reference to the grammar of a public language seems unavoidable in singling out the language that the theorist identifies as the child’s “target grammar.”  (Pereplyotchik 2017: xxiii) But the fact that I-languages have not yet been adequately characterized is no argument that they are not the proper object of Chomskyan inquiry (cf., again, Putnam 1962a/75a: §3, and the frequent situation with diseases, whose proper characterization only emerges with the science of them). 11  This, at any rate, is a strict definition of “E-language.” But with the decline of interest in extensionality per se, the term has come to be used more loosely as short for merely the usual “external” languages, such as “English” and “Swahili,” which might be understood not as a set of external sentences, but as a mixed system of I-language and the kind of social rules and conventions, alluded to in the passage we quoted in (§3.1) from Ryle (1949), which is how I will use it.


in the theory of language, though informal exposition has sometimes obscured the point, leading to much confusion and wasted effort.  (Chomsky, 2000: 78)

Perhaps other approaches could satisfactorily define well-formedness of external sentences. The point is that this is not an interest or commitment of a Chomskyan theory.
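The extensional/intensional contrast at work throughout this section can be made vivid computationally. The following is a minimal sketch of my own in Python (an illustration, not anything to be found in the texts under discussion), defining two procedures, two “intensions,” that determine exactly the same extension, the perfect squares:

```python
# Two procedures ("intensions") that determine one and the same
# extension: the perfect squares discussed above.

def squares(n):
    """The first n perfect squares, computed by multiplying: k * k."""
    return [k * k for k in range(1, n + 1)]

def sums_of_consecutive_odds(n):
    """The same numbers, computed by adding odds: 1 + 3 + ... + (2k - 1)."""
    return [sum(range(1, 2 * k, 2)) for k in range(1, n + 1)]

# Extensionally, the two are indistinguishable:
assert squares(10) == sums_of_consecutive_odds(10)   # [1, 4, 9, 16, 25, ...]

# Intensionally, they differ: one multiplies, the other iterates and adds,
# which is why a proof is needed to show they pick out the same set.
```

If I-languages are, as Chomsky holds, procedures of roughly this sort (though pairing sounds with meanings rather than computing numbers), then two speakers could in principle determine the same pairings by quite different procedures, which is just the situation imagined above for the Martians.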

3.3  Behaviorism and Quine’s Problems

A focus upon extensionalism and E-languages in much of the twentieth century was encouraged not only by the success of extensional approaches to logic and mathematics, but by a widespread skepticism among the scientifically minded about notions of meaning and mind with which the intentional and intensional seemed inextricably entwined. It is crucial to a proper understanding of both the Chomskyan project and various forms of resistance to it to appreciate the historical reasons for this wariness.

3.3.1  The Motivations for Behaviorism

In a philosophical tradition extending at least back to Descartes, mental states were often regarded as “logically private”: it seemed impossible in principle to know about at least the conscious states of others in the way that one knows about one’s own. This privacy has seemed to many to invite the positing of non-physical objects and properties that seemed to render the domain of the mind inaccessible to any “objective” science (we will return to this topic in §11.3.2). But if mental states were not publicly available, then it seemed they could not serve as evidence for a serious “objective” science. In any event, psychologists such as John Watson (1913) and B.F. Skinner (1938) argued that a serious science of psychology should be based only upon publicly observable regularities in an organism’s physical behavior, and the environmental variables upon which it depends. With regard to language, Bloomfield (1936/85) writes:

Non-linguists (unless they happen to be physicalists) constantly forget that a speaker is making noise, and credit him, instead, with the possession of impalpable “ideas,” and that the noise is sufficient—for all the speaker’s


words act with a trigger-like effect upon the nervous systems of his speech-fellows . . .  (Bloomfield, 1936/85: 23)

In the hands of Skinner the approach came to be called “Scientific” (or “Radical”) Behaviorism, an approach that resonated with the prevailing Positivistic skepticism about unobservable phenomena in general.12 For a salient example, Chomsky’s erstwhile mentor, Nelson Goodman (1969), came to deplore Chomsky’s proposals of what he disparaged as “the emperor’s new ideas.” In view especially of some of our later discussions, it is important to distinguish the specific skepticism about appeals to mental states from a general scepticism about theoretical entities that pervaded much of the general understanding of science in the same period, particularly among the Logical Positivists. It was one thing to posit atoms and molecules that could not be “directly observed,” but might be, with the right instruments (e.g. electron microscopes); quite another to posit mental phenomena in some “private” non-physical realm that seemed (at least in the case of other minds) publicly un-observable in principle.13 Although behaviorists shared a wariness about any theoretical entities, it was this latter impossibility that seemed to make mental states even more dubious. The overarching theoretical principle of Radical Behaviorism was Thorndike’s “Law of Effect,” which claimed (roughly) that behavior that was followed by positive consequences for an organism in the past was likely to be repeated in similar circumstances in the future. Thus, a pigeon’s tendency to peck at a lever whenever a light was flashed could be explained in terms of the patterns of “reinforcements” the pigeon had received for pecking behavior in the presence of the flashing light in the past. In his (1953), Skinner tried to make the approach plausible for human behavior in general, and then in his (1957) Verbal Behavior, for linguistic behavior in particular.

12 Scientific Behaviorism should be distinguished from “Analytical Behaviorism,” which was a completely independent semantic program to “analyze” mental vocabulary in behavioral terms (see, e.g. Ryle, 1949), and “methodological behaviorism,” which insisted that the evidence for mental states be confined to behavior, versions of which I will include under the rubric of “Superficialism” below (§3.4). Both of these latter views were of course motivated by the verificationism about especially (but not only) mental concepts that dominated philosophy from the 1920s until the 1960s, whereby it was thought that mental concepts could only be acquired from observing ordinary public behavior. 13  Of course, nowadays one might think that a computational theory of mind might help in the latter case. But such a theory did not emerge until the work of Turing in the late 1930s, and it did not begin to become incorporated into psychology until the late 1940s (see §4.5 below), much less understood as a rival to traditional dualism until the work of Putnam (1960) and Fodor (1968, 1975)—and even then, as we shall see, this development has been fraught with difficulties.


This latter speculation was subject to a devastating critique by Chomsky in his famous (1959/64) review of that book. Chomsky argued that Skinner’s speculations really came to little more than a pseudo-scientific re-description of ordinary mentalistic talk, and didn’t stand a chance of explaining the crucial data we discussed in Chapter 1. There is no question that Skinner’s proposals were sloppy and ill thought-out. However, the theorist who gave a more careful voice to the behaviorist proposal regarding language was Skinner’s friend and colleague, Quine (1934/76, 1954/76, 1960/2013),14 and, although in the end his version does not really fare much better than Skinner’s, it is a merit of Quine’s discussion that he carefully draws out some of the surprising consequences of the view. Quine proposed that people’s cognitive psychology (if that is how he would have regarded it) consists of a bundle of dispositions to assent and dissent to sentences when queried, a bundle that is constantly modified as a result of sensory stimulation. Acquisition of a language such as English or Swahili depends upon children observing publicly produced noises in publicly observable circumstances that provide evidence of those dispositions. No issue of “linguistic competence” arises apart from simply social agreement. Indeed, Quine provided as clear a statement as one might find of the external-social conception of language we mentioned at the start (§1.1):

Language is socially inculcated and controlled; the inculcation and control turn strictly on the keying of sentences to shared stimulation. Internal factors may vary ad libitum without prejudice to communication as long as the keying of language to external stimuli is undisturbed.  (Quine, 1969b: 81)

And he provided a famous simile that guided his and later conceptions:

Different persons growing up in the same language are like different bushes trimmed and trained to take the shape of identical elephants. The anatomical details of twigs and branches will fulfill the elephantine form differently from bush to bush, but the overall outward results are alike.  (Quine, 1960/2013: 8)

14  Quine (1960/2013) cites Skinner as the basis of his views, but is much more assiduous in trying to spell them out in non-mentalistic terms. John Collins (2018: 160, fn6) reports Chomsky saying that he really wrote the review of Skinner more for Quine. But Quine’s acknowledgment of the review consists in a single phrase in his (1960/2013: 75): “Skinner is not without his critics [fn:] see Chomsky”. As I mentioned earlier (§2.2.1, fn9), it is a puzzle why (with the exception of the article we will shortly discuss) Quine almost totally ignored Chomsky’s work—indeed, Dagfinn Føllesdal (2013), in his preface to the second (2013) edition of Word and Object, stunningly makes no mention of the industry of generative grammar of the intervening fifty-three years (although he curiously includes Chomsky in his acknowledgments, at p. xxviii)!


The task of linguistics is then merely one of categorizing the external shapes, or sounds that people make, along the extensionalist lines of an E-language, not one of specifying intensional procedures by which people might internally compute those extensions. Quine’s extensionalist story leads to extraordinary consequences, many of which he (1960/2013: ch 2, 1970a) stressed and which subsequently became hallmarks of his views, most famously, his thesis of the “indeterminacy of translation.” The consequences surface in so many other conceptions that they are worth dignifying as “Quine’s Problem.” As we shall see in §5.4.6, they give rise to problems beyond merely those Quine identifies.
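Quine’s picture can be made concrete with a deliberately crude sketch (my own toy illustration in Python, not anything from Quine’s texts): a “speaker” is exhausted by a keying of queried sentences to responses, so that two systems with the same keying count as linguistically identical, however different their innards:

```python
# A toy "Quinean speaker": nothing but a keying of queried sentences
# to assent/dissent responses.

class ListSpeaker:
    """Stores its dispositions as a bare lookup table."""
    def __init__(self, table):
        self.table = table

    def query(self, sentence):
        return self.table.get(sentence, "bizarreness reaction")

class RuleSpeaker:
    """Produces the very same responses by an internally different route."""
    def query(self, sentence):
        if sentence.endswith("is white"):
            return "assent"
        if sentence.endswith("is green"):
            return "dissent"
        return "bizarreness reaction"

a = ListSpeaker({"Snow is white": "assent", "Snow is green": "dissent"})
b = RuleSpeaker()

# Identical keying of responses to queries, so, for the extensionalist,
# nothing linguistic distinguishes a from b: the "anatomical details of
# twigs and branches" differ, but the elephants are exactly alike.
for s in ["Snow is white", "Snow is green", "Colorless green ideas"]:
    assert a.query(s) == b.query(s)
```

The sections that follow argue that it is precisely this indifference to the inner route that deprives the view of the resources to distinguish the syntactic, the lexical, and the phonological from mere noise.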

3.3.2  The Poverty of Behaviorism

Quine’s general conception of “mind” as nothing more than a bundle of dispositions whose inner causes are irrelevant to language leads him to some famous conclusions that expose just how narrow and radical his superficialism was, and how nihilistic it is with regard to our ordinary—and, nowadays, increasingly scientific—understanding of people. One of his central and distinctive insights was to notice how this conception had the consequence of undermining any principled distinction between language and socially shared, tenacious beliefs (an issue to which we will return in §10.2.2). In a penetrating attack on the Positivists’ doctrine of “truth by convention,” Quine concluded:

The lore of our fathers is a fabric of sentences [which] develops and changes, through more or less arbitrary and deliberate revisions and additions of our own, more or less directly occasioned by the continuing stimulation of our sense organs. It is a pale grey lore, black with fact and white with convention. But I have found no substantial reasons for concluding that there are any quite black threads in it, or any white ones.  (Quine, 1954/76: 132)

The point in the passage was the one Quine had already pressed in his famous “Two Dogmas of Empiricism” (1953/61b), that there seemed to be no principled way to distinguish the “analytic” (or “truths by convention” of language) from the “synthetic” (or truths by virtue of the way the world is). Not surprisingly, on this view, there is no real distinction between knowing a language and knowing a theory of the world: both “language” and belief acquisition are merely a matter of altering one’s “speech” dispositions, coordinating some of


them with those of others, partly by social convention and partly by evolving theories of the world, but never by one or the other in isolation. Leave aside whether such behavioral dispositions would actually constitute “a mind”15: at least at the present time it does seem at least conceptually possible that human psychology should consist merely of such malleable bundles of dispositions. What our Unacceptability reactions to the WhyNots seem to show, however, is that a Superficialist view would fail to capture something in us that so much as “cares” about grammatical structure in a way that goes beyond mere coordination with others. What has, I think, been insufficiently appreciated (perhaps even by Quine) is that the point he is making applies as much to phonology and morphology as it does to semantics. Quine should have recognized that that same “pale grey” character suffuses not only the sentences that are true by convention vs. fact, but even the strings of sounds that are meaningful or “significant” in the language. After all, for Quine all there are in regard to cognition are merely dispositions to assent and dissent, some of which may or may not be accompanied by “bizarreness reactions.” But this drastic reduction of distinctions applies not only to the purportedly factual vs. the analytic, but equally to the syntactic, lexical, and phonological! For Quine, as I submit for any theorist who refuses to “look inside” the mind for the basis of language, there is no principled difference between what is part of language (whether phonology, syntax, or semantics) and simply some noise made in regular response to stimulation.16 For what could be the basis for these distinctions in behavioral dispositions alone? People can assent, dissent, or manifest bizarreness reactions for any number of reasons, but if there are no internal mental facts, then it is hard to see how we can appeal to those different reasons or causes as a basis for any further distinctions. By contrast, Chomsky’s treatment of the entities at all these levels involves the positing of specific levels of internal representation in the minds

15  Given his “thesis of the indeterminacy of translation” and a general sympathy with (mental) eliminativism (see his 1960/2013: 27ff and 264), Quine himself would have been quite happy not to regard his physically specified behavioral dispositions as part of anything called “the mind.” As he reminisced at Skinner’s retirement dinner in 1974:

Fred [Skinner] and I met on common ground in our scorn of mental entities. Mind shmind; on that proposition we were agreed. The things of the mind were strictly for the birds. To say nothing of freedom and dignity. . .  (Føllesdal and D. Quine 2008: 291)

It is interesting to compare Quine’s straightforwardness here with Chomsky’s more puzzling, anti-intentional, algebraic views that we will discuss in §8.4 below.
16  Cf. Bloomfield’s (1936/85: 23) reference to “noise,” quoted in §3.3.1 above. Of course, even to get as far as “assent,” “dissent,” and “bizarreness reactions” would require some internal speculations about a speaker’s psychology. Moreover, it is not clear what basis there could be for analysis of constituent syntactic structure (but see Quine 1953/61c for discussion). But the fundamentally important point here does not require that we pursue these further problems for Quine.


of speakers (see Chomsky 1955/75: 159, and Lasnik, 2005: 66ff). The very existence and intelligibility of the linguistic categories of, for example, phonetics, phonology, syntax, semantics, and pragmatics requires the Galilean abstraction to idealized systems of the mind presumed by Chomsky and eschewed by behaviorists like Quine, an issue to which we will return in defending a specific form of nativism in §5.4.

3.3.3  Extensionally Equivalent Grammars

It is one thing, however, to reject Quine’s behaviorist, extensionalist conceptions of language, quite another to meet his challenge of saying just wherein the reality of a grammar internally consists. An influential challenge to Chomsky’s psychological conception of grammar was Quine’s (1970a) skepticism about any determinacy of grammar beyond extensionality. Quine distinguished between behavior that “fits” rules and behavior that is “guided by them,” as he presumes clearly occurs when people’s behavior fits certain rules as a result of someone consciously following those rules. Indeed, he claims “the behavior is not guided by the rule unless the behaver knows the rule and can state it” (1970a: 386, emphasis mine). He rightly sees Chomsky as proposing a third possibility—of some kind of “implicit” guidance by rules—but sees no factual basis for it, since it raises

the problem of evidence whereby to decide, or conjecture, which of two extensionally equivalent systems of rules has been implicitly guiding the native’s verbal behavior.  (Quine, 1970a: 388)

Harman (1974/82) raises the same worry:

If theories A and B account equally well for the facts, does it make any sense to suppose that the language learner internalizes the rules of grammar A rather than those of grammar B? . . . Even if we open up the brain and discover a representation there of the rules of grammar, we cannot expect the representation to be in English; it will require interpretation, and will receive different interpretations from proponents of different theories. Philosophers who feel this sort of multiplicity of theories is inevitable are naturally sceptical about talk of “the” rules which the learner allegedly internalizes, “the” innate schematism he allegedly has.  (Harman, 1974/82: 216)


Of course, maybe analytic philosophers as a breed naturally feel this way—but then they should feel the same about any scientific theory, which is always “under-determined” by the data alone. What is so special about the case of rules of grammar? In any case, we have already seen that the notion of an extensional grammar is quite beside the Chomskyan point. In the first place, again, the extension itself is subject to Quinean indeterminacies, although, as we have just seen in §3.3.2, perhaps more than he appreciated: the only tractable way to delineate a set of what would count as roughly the grammatical sentences of, say, “Boston English” would be to abstract away from such confounds as memory and processing difficulties, speech errors, odd accent, playful nonsense, solecisms, and the like. But the only plausible way to do that is surely to speculate about psychological etiology, sorting out which utterances fairly directly reflect the I-language and which partly reflect other mental systems. Secondly, as we noted in §1.5.3 and John Collins (2008b) has stressed, Chomskyans are not only interested in the strings that speakers might accept or reject, but also in how they understand them and their relations to each other, sorting between different hypotheses about what is syntactically, semantically, or pragmatically (un)acceptable:

Both the preponderate data for linguistic theories and the phenomena the theories seek to explain are psychological, rather than E-linguistic (having to do with external “symbols”). . . . We are concerned with how speakers interpret the sentences offered; the sentences themselves, understood as symbol strings, are not the data. . . . That informants might find a string ambiguous or unacceptable is a datum, but it is a theoretical matter to explain it. It might be a semantic phenomenon, or a syntactic one, or a pragmatic one, or simply stylistic. It is not obvious, prior to any theory, what the answer will be, and the ultimate answer might depend upon the integration of linguistics into the other cognitive sciences.  (Collins, 2008b: 145–6)

It is simply not true that the data to be explained by a grammar could be determined entirely externally, without substantive internalist theories. Pace Quine, we do not determine the relevant intension from a predetermined extension, but, rather, the extension from a theory of the intension and how the system that embodies it interacts with other mental states! As we noted in discussing “the autonomy of syntax” (§2.2.5), Chomskyans do not presume with Quine or other behaviorists that the grammar is concerned


with the mapping of actually perceived sounds to the I-language description. This is the task of the parser, which, although it is likely constrained by the I-language, may or may not employ the very same rules.17 Likewise, it is not committed to specifying rules of production, which would take a speaker from an intention to express a certain meaning to the motor commands that might produce an utterance of one, an issue likely involving I-language, but much else besides (memory retrieval, style preferences, etiquette, etc.).18 The I-language is a system that makes certain structures available to the conceptual and perceptual systems to make use of as they might, whether or not those structures ultimately get used in speech comprehension and performance (cf. §10.4 below). The advantage of the Unacceptability responses is that by and large no other psychological processes are plausibly responsible for them (we will return to this issue in discussion of “processing rules” in §6.3.4, and of linguistic intuitions in Chapter 7). But suppose we could come upon an extensionally adequate set of rules for a relevant extension. What determines that one recursive specification of that set is the causally real one? Chomsky has not always been as clear about this as he might have been. In chapter 1 of his 1965 Aspects, he initially entwined the issue with the competence/performance distinction. One passage in particular has been so influential and often cited that it needs to be quoted in full:

To avoid what has been a continuing misunderstanding, it is perhaps worth while to reiterate that a generative grammar is not a model for a speaker or a hearer. It attempts to characterize in the most neutral possible terms the knowledge of the language that provides the basis for actual use of language by a speaker-hearer. When we speak of a grammar as generating a sentence with a certain structural description, we mean simply that the grammar assigns this structural description to the sentence. When we say that a

17  Phillips and Wagers (2007: 742) point out that the competence/performance distinction does not require that the grammar and the parser be distinct. Indeed, on the basis of a variety of experimental results, Momma and Phillips (2018) defend the view that:

fundamental properties of parsing and generation mechanisms are the same and that differing behaviors reduce to differences in the available information in either task.  (Momma and Phillips, 2018: 748)

There is no need to defend this view here. I am only concerned to show ways in which Chomskyan linguists are concerned at various levels of abstraction with real processes in the brain.
18  And also speculations about what would be computationally tractable in the brain. Imagine a not so very “minimalist” grammar that after merging two items proceeded arbitrarily to un-merge and then re-merge the items a million times: it would be extensionally equivalent to a system that didn’t do so. But it would be an empirical question relevant to Chomskyans which system turns out to be the one the brain actually uses.


sentence has a certain derivation with respect to a particular generative grammar, we say nothing about how the speaker or hearer might proceed, in some practical or efficient way, to construct such a derivation. . . . The term “generate” is familiar in the sense intended here in logic, particularly in Post’s theory of combinatorial systems. . . .  (Chomsky, 1965: 9)19

And in various places he dismisses psychological experiments that try (he thinks, prematurely) to pin down generativist proposals in real time. This sometimes gave the quite misleading impression he does not care about psychology at all. Now, if one were to use “generate” in this purely logical sense, then there would be a puzzle as to why one axiomatization of the domain was psychologically more real than another. Considered merely as a task in logic, there are an infinite number of ways to axiomatize any extensionally specified domain. But attention to Chomsky’s specific target in this passage and the larger ambitions of his theory should make it clear that he spoke slightly carelessly here. Twenty pages later (1965: 30), he proposes a specific process for language acquisition, an issue which, as we noted, is as intimately bound up with the character of grammar as tooth growth is with the nature of teeth. What concerns him in this earlier passage is simply to broadly distinguish competence from performance, not begin to specify the character of that competence, even abstractly. Specifying just how the structures of the grammar are produced within the brain is not something that can be easily ascertained. Simple proposals are certainly attractive, and the psychologist, George Miller, joined Chomsky in their (1963) in proposing what would be a convenient sufficient condition for beginning to map what was then the going version of a transformational grammar on to brain processes:

The psychological plausibility of a transformational model of the language user would be strengthened, of course, if it could be shown that our performance on tasks requiring an appreciation of the structure of transformed

19  Harman (1983) reiterates the view: Chomsky has observed in many places that the term “generative” in the phrase “generative grammar” has no such implication. The relevant sort of generation is mathematical, not psychological. Chomsky’s claim, then, is that grammar is internally represented whether or not speakers generate sentences in accordance with its rule.  (Harman, 1983: 408) as do Boden (1988: 4) and Bock (1995: 207), all of whom are cited, and the view is discussed at length, in Devitt (2006a: 68ff), whose own take on the passage will be discussed in due course (§7.2).


sentences is some function of the nature, number and complexity of the grammatical transformations involved.  (Miller and Chomsky, 1963: 481)

This suggestion led to the “derivational theory of complexity” that was hotly discussed in the 1960s, was thought to have been experimentally discredited (see Fodor et al., 1974), but has been cautiously revived by Chomskyans such as Colin Phillips (1996) and Phillips and Wagers (2007) (although see Townsend and Bever, 2001, for continuing doubts). There is no need to enter the details of the debate here: Miller and Chomsky were proposing an attractive sufficient condition by which psychological plausibility could be “strengthened,” not a necessary condition for its obtaining at all. The proposed grammars are characterizing the processes at a quite abstract level, largely independent of the details of processing. As John Collins (2009) nicely put it:

[T]he relevant states are nothing over and above neuronal organization viewed from the abstract perspective of the functional integration that produces and consumes this thing we call language.  (Collins, 2009: 262)20
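The idea in footnote 18, that extensionally equivalent systems can differ in the derivational work they do, can be made concrete in a few lines of code. The toy below is my own invention (not a grammar anyone has proposed): two procedures that generate exactly the same strings, though one pointlessly un-merges and re-merges at every step, just the sort of difference to which processing measures in the spirit of the derivational theory of complexity might, in principle, be sensitive:

```python
# Two rule systems that generate exactly the same strings
# ("ab", "aabb", "aaabbb", ...) while doing very different amounts of work.

def derive_direct(n):
    """Build a^n b^n in one pass: one structure-building step per pair."""
    steps = 0
    s = ""
    for _ in range(n):
        s = "a" + s + "b"
        steps += 1
    return s, steps

def derive_wasteful(n, churn=1000):
    """Build the same string, but un-merge and re-merge along the way."""
    steps = 0
    s = ""
    for _ in range(n):
        for _ in range(churn):   # merge and immediately un-merge, pointlessly
            s = "a" + s + "b"
            s = s[1:-1]
            steps += 2
        s = "a" + s + "b"        # the one merge that actually advances things
        steps += 1
    return s, steps

# Extensionally equivalent: the same string for every n ...
for n in range(1, 6):
    assert derive_direct(n)[0] == derive_wasteful(n)[0]

# ... but intensionally very different: the second does roughly 2000x the work.
print(derive_direct(5)[1], derive_wasteful(5)[1])   # 5 vs. 10005
```

Nothing about the strings themselves favors one procedure over the other; only evidence about the internal etiology, of the kind discussed above, could.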

It is important to emphasize that, in regarding an I-language as “computational,” a Chomskyan theory is not committed to citing any specific algorithms used in actual language perception and production. At this point, the best that can be expected is a relatively abstract characterization of a computation that maps phonological types to some sort of meaning specifications, and which may be realized by a variety of specific algorithms that are implemented in the brain, for example, perhaps partly in serial, partly in parallel, depending upon, inter alia, the character of its specific codings and short-term memory resources. A particularly striking example of the psychological conception at work concerns the specification of the rules for island constraints. There is an interesting debate discussed by Phillips (2013) and others in Sprouse and Hornstein (2013a), as to whether these constraints are internal to the grammar, so that the WhyNots are simply not generated by it; or, alternatively, whether they are perfectly grammatical, but, like center embeddings (The man the boy the dog licked liked died), they are too difficult for our short-term memory to process. Whatever the correct account, the crucial point is that the

20  I, myself, abstract this passage from Collins’ use of it to defend a non-intentionalist reading of the core theory (see §8.5 below), as well as from what seems like Chomsky’s (1993: 85) own passing rejection of this “abstract level” view (see §11.3.2, fn 29), neither of which I endorse.


specification of the grammar requires sorting out at some level of abstraction claims about the character of certain processes and of the psychological architecture in which they take place. Notice that this last point very much complicates Marr’s (1982) famous description of three distinct levels of cognitive explanation, the computational, the algorithmic, and the implementational, in terms of which a Chomskyan theory is often understood.21 But one can characterize algorithms in increasingly specific ways, depending on how far one abstracts from issues of notation, memory resources, and processing procedures. As Momma and Phillips (2018) observe:

It is common to appeal to the three levels proposed by Marr (1982) for analyses of visual systems, but Marr’s levels do not correspond closely to standard practice in linguistics and psycholinguistics. It is probably more accurate to view levels of analysis on a continuum, with as many different levels as there are details that can be abstracted away from or included.  (Momma and Phillips, 2018: 234–5)

Precisely how abstract the level at which one ought to define a specific grammar is a question that cannot be settled in advance of determining the scope of the relevant laws and generalizations, nor what is due to the grammar and what to the demands of memory and the pragmatics of parsing and production. All that is important for our purposes is that the issue of how a grammar is realized is of potentially serious relevance to the theory, and that any good evidence of the specific ways that the brain might compute grammatical structure would be welcome.22 It is just that it is extremely hard to produce such evidence, and so present theories are often regarded as simply abstract characterizations of the intensional procedures that are ultimately being

21  See, e.g., Egan (1992), an interpretation I think Chomsky (2000: 162) sometimes seems to endorse too rashly. I discuss further problems with Egan’s interpretation of Marr in Rey (2003a: §V(ii)).
22  Note also that the hope of the more recent Minimalist Program’s appeal to computational “economy” would, of course, be relative to the physical structures—presumably in the brain—in which the grammar is realized. If certain of Chomsky’s recent “third factor” speculations were to turn out to be true (see §2.2.10), its physical/biological properties could themselves be the source of that economy, although there is so far no serious evidence for this. Of course, just how abstractly or concretely an I-language should be specified will depend upon the level of generality a theorist is concerned to capture—all human beings, or just ones with a specific physical makeup?


sought. But such essentially epistemic challenges should not be confused with the metaphysical indifference of an extensionalist like Quine.23 In later writings, Harman (1974/82) does supply a correct reply to the philosophical sceptic, noting quite rightly that

It is not enough that [the sceptics] point out that there may be alternatives to [Chomsky’s] view. What they need to do is to develop an alternative account that makes language look less surprising.  (Harman, 1974/82: 217, emphasis original)

And it should be obvious that this is by no means easy to do. As we noted in setting out the crucial data at the start, what is surprising about natural language is not merely its, for example, productivity, but more strikingly, all the comprehensible strings of words we find “Unacceptable”; that is, the WhyNots, as well as the speed with which children effortlessly seem implicitly to come to appreciate them. It is such phenomena that a Chomskyan grammar begins to explain in a way that is difficult to imagine serious rival theories beginning to do. But then how could a competence grammar, remote from processes of ordinary parsing and performance in the ways that Chomskyans insist they may be, possibly explain people’s sensitivity to such data without the ever so abstract grammar being in some way or other causally realized? (We will return to this issue in §11.1.)

And it should be obvious that this is by no means easy to do. As we noted in setting out the crucial data at the start, what is surprising about natural language is not merely its, for example, productivity, but more strikingly, all the comprehensible strings of words we find “Unacceptable”; that is, the WhyNots, as well as the speed with which children effortlessly seem implicitly to come to appreciate them. It is such phenomena that a Chomskyan grammar begins to explain in a way that is difficult to imagine serious rival theories beginning to do. But then how could a competence grammar, remote from processes of ordinary parsing and performance in the ways that Chomskyans insist they may be, possibly explain people’s sensitivity to such data without the ever so abstract grammar being in some way or other causally realized? (We will return to this issue in §11.1)

3.3.4  Explicit vs. Implemented (“Implicit”) Rules and Structures Obviously, a Chomskyan has to move beyond the (1965: 9) Aspects claim of mere logical generation, and, not surprisingly, as I mentioned, a mere twentyone pages later in the same book, Chomsky formulates his view of acquisition in terms of the hypothesis-testing model we discussed in §2.2.2, which certainly would seem to involve a presumption that the rules of grammar were explicitly represented, say, by some kind of data structures in the brain. 23  Pereplyotchik (2017) also seems to conflate the metaphysical and epistemic issues here: the formal syntactician is not best seen as engaging in a distinctively psychological inquiry. . . . A great deal of support from psycholinguistic research must be marshaled before we can make substantive claims about the psychological role of a successful grammar.  (Pereplyotchik, 2017: xviii–xix) To be sure, a lot more research is needed, but that does not show that it is still not distinctively psychological research.

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 05/09/20, SPi

114  Representation of Language However, as Martin Davies (1989) made plain,24 explicit representation of the rules was not even needed there. Rules can be realized in a brain or other machine without being explicitly represented; they might simply be mirrored by processes in it, and the effect of ambient stimulation might be merely to activate one set of processes rather than another: [T]acit knowledge of a systematic theory is constituted by a causal-ex­plana­tory structure in the speaker which mirrors the derivational structure in the theory. Where there is, in the theory, a common factor—for example, a common axiom—used in the derivations of several theorems, there should be, in the speaker, a casual common factor implicated in the explanations of the corresponding pieces of knowledge about whole sentences. (Davies, 1989: 132)

Thus, the logical rule, modus ponens, could be represented explicitly as an axiom, or it could simply be mirrored by operations that are not explicitly represented: the machine might be so constructed that whenever a sentence “p” is in one address and “p → q” in another, then it enters “q” in a third (mutatis mutandis for a phrase structure rule such as “S → NP VP”).25 In this latter case, some like to speak of the rule being “implicitly represented.” Given the importance that will attach to the notion of “representation” in this book, I want to resist this, what seems to me casual and unnecessary use of the word. Thus, where the rules merely govern the operations in a system, I think one should more cautiously say that they are realized or implemented in it.26

24  Davies draw here on his earlier (1986) and (1987), which draws on Gareth Evans (1981/85). Edward Stabler (1983) is also often rightly credited with independently making analogous points. But Stabler’s, Davies’, and Evans’ earlier work involved further distinctions and controversies (many involving semantic theories) that seem to me not relevant to the relatively simple points needed merely in reply to Quine’s challenge about choosing between extensionally equivalent grammars. 25  Indeed, we know from Lewis Carroll’s (1895) famous parable that, on pain of infinite regress, if a machine or person is actually to perform an inference, then at some point an inference rule needs to function without being explicitly represented. For simplicity, I have used ordinary quotes rather than the technically required “corner quotes,” which allows the “p”s and “q”s to be not mentioned, but used as variables within a quotation context. 26  Stabler (1983) uses “embodied.” Note that talk of such mirroring and “implicit representation” invites an unfortunate terminological confusion. In mathematics, “representation” is often used to mean merely a homomorphism, a mere mapping between two structured phenomena, It seems to be the usage Harman (1983: 408) has in mind when he remarks in passing on the “trivial observation that any mechanism that computes a function is naturally treated as a representation of that function”, and when Chomsky (2003: 276) claims “the computational system involved in insect navigation or bird song is said to be “internally represented” (see also Gallistel and King 2009). My main reason for resisting this usage is that I see no reason to suppose it involves the intentionality that I will argue in due course (Chapters 8–11) attaches to explicit representation or adds anything else that could not be captured by “implementation” (cf. §4.5 below).

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 05/09/20, SPi

Competence/Performance  115 But Chomskyans rightly, at this point, do not want to be committed even to such literal causal implementation of their rules. As we noted above, they regard their proposals as very abstract characterizations of the intensional procedures whose details await much more detailed knowledge of the casual implementation of the specific algorithms the brain employs. However, even if the proposed rules of a grammar are not explicitly represented, the states over which whatever actual processes are defined would have to represent something close to the categories to which the system is presumably causally sensitive.27 Davies (1989: 132) goes on to fill out the condition of causal sensitivity, referring us to Christopher Peacocke’s (1986) suggestion that a (causally) psychologically real grammar for a given speaker must specify the information drawn upon by the relevant mechanisms or algorithms in that subject. . . . So for VP → V NP to be psychologically real is for the subject’s syntactic mechanisms or algorithms to draw on the information that a V followed by an NP is a VP. For the phonological rule “Aspirate a segment which is [–voiced, –continuant]” to be psychologically real is for the relevant mechanisms or algorithms to draw on the information that segments which are [–voiced, –continuant] are aspirated; and so forth.

Peacocke called this “the Informational Criterion” for (causal) psychological reality.28 Thus, for instance, consider a simple phrase structure grammar, and the linguistic fact that in John hit the boy, the boy has the phrase marker [NP [DPthe] [Nboy]]. If the simple phrase structure grammar is psychologically real according to the Informational Criterion, the phrase is assigned this structure because some mechanism or algorithm draws upon these several pieces of information: that a determiner followed by a noun phrase is a noun phrase; that the is a determiner; and that boy is a noun. (Peacocke, 1986: 114–15)29 27  I say “something close” to allow that sensitivity to a category could be the result of a complex process, as, for example, c-command might be in the case of Minimalism, whereby it might be the result of a specific pattern of merges (see §2.3.1 above). Note that we can also allow at this point what Chomsky and some of his followers claim, that even “explicit representations” of categories need not be intentional, an issue that I postpone until Part III (esp §11.1). 28  I qualify “psychologically real” in light of the crucial ambiguity of the phrase, between a causal reality and it simply marking, e.g., a mere object of perception that need not be causally real, an issue we will discuss in §9.7. 29  Peacocke ( 1986: 115) provides a more general, formal characterization. He is proposing a level “1.5” in between Marr’s (1982) well-known “computational level 1,”—which abstractly describes a problem a system has to solve—and the “algorithmic level 2,”—which describes the precise steps by which the system solves it.

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 05/09/20, SPi

116  Representation of Language After all, abstracting from issues of parsing and production does not imply abstracting from all issues of processing. There are plenty of psychological processes—for starters: mulling, imagining, remembering, reasoning—that might not be at all tied in any direct way to ordinary perception, action, or speech. A theory of competence is, precisely as Chomsky also says in this passage, a theory of “the knowledge of the language that provides the basis for actual use of language by a speaker-hearer” (emphasis mine). Such a theory needs to be more than merely an axiomatic characterization of that know­ledge basis, but a theory of actual states and processes of the mind/ brain that provide that basis and so, directly or indirectly, causally inform the use of language as it figures in those various processes, internal or externalizd. Consequently, at least some of the categories of the grammar may need to be explicitly represented, even if the rules need not be. But this is hardly a fact confined to the case of grammars; it is part and parcel of the problem of “intentionality” that in one way or another is ubiquitous (even if seldom discussed) in psychology and which we will address in Chapters 8–11. But the examples should make plain at least one strategy that could be deployed in meeting Quine’s challenge of choosing among extensionally equivalent grammars: to insist that the representations of relevant information be causally efficacious. It is worth stressing the complexity of the relevant information, which goes far beyond the simple examples Peacocke cites. An explanation of, for ex­ample, speakers’ responses to WhyNots, must account for speakers’ brains not only needing states causally sensitive to information about the constituent structure of SLEs, but, recalling some of the examples of §2.3, also needing states which are causally sensitive to crucial information that determines c-command, co-indexing, binding domains, negative polarity items and their licensors, and/or copies of “moved” elements. As we will discuss in §11.1, to begin to explain how a brain could possibly exhibit such sensitivities is no mean feat. It is hard to imagine anything that could do this short of some kind of representations of the relevant linguistic categories as well as at least some kind of real­iza­tion of grammatical rules defined over them. In any event, such an appeal to sensitivities to grammatical information affords the beginnings of a reply to Quine’s challenge regarding the (causal) psychological reality of grammars, pending issues about the nature of the specific sort of information to which a speaker is sensitive, an issue that we will address in Chapter 11.

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 05/09/20, SPi

Competence/Performance  117

3.4  Other Superficialist Objections As early as (1965) Chomsky expressed his views in terms of speakers, including children, following rules, a strategy that had begun to emerge in early computational models of other mental processes. And this evoked a great deal of skepticism that was not only raised by the Scientific Behaviorism we have discussed, but also by weaker superficialist views defended by Wittgenstein, Ryle, Dennett, and many of their followers.

3.4.1  “Nothing Hidden” (Wittgenstein, Ryle, Baker and Hacker, and Chater) What I call “superficialism” was provided its most straightforward articulation by Wittgenstein (1953/2016) when he famously wrote: An inner process stands in need of an outward criterion. (Wittgenstein, 1953/2016: §580)

There can be no inner mental process for which there could not be ordinary external behavioral evidence (which is clearly what he meant by “outward’).30 Indeed, he claimed it is a “misleading parallel” to suppose that psychology treats of processes in the psychical sphere, as does physics in the physical. Seeing, hearing, thinking, willing, are not the subject of psych­ology in the same sense as that in which the movements of bodies, the phenomena of electricity etc., are the subject of physics. You can see this from the fact that the physicist sees, hears, thinks about, and informs us of these phenomena, and the psychologist observes the external reactions (the behavior) of the subject.  (Wittgenstein, 1953/2016): §571)

And he persistently rejected internalist explanations of visual constancies (1981: §614), color perception (Wittgenstein, 1977: 37–40), simulation accounts 30  I include the possibility of some introspective evidence, since it’s not clear that Wittgenstein didn’t sometimes include it, but also to allow the more liberal, “Urbane” verificationism” of Dennett (1991: 461), who aims to rule out only processes that “swim our of reach of both ‘outside’ and ‘inside’ observers” (p133). I discuss his view in my (1995), to which he (1995) replied, happily endorsing the “superficialist” moniker!

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 05/09/20, SPi

118  Representation of Language of our knowledge of other minds (Wittgenstein, 1980: §220), and the psych­ology of animals (1953/2016: §25), e.g. fish: The question “Do fishes think?” does not exist among our applications of language, it is not raised.  (Wittgenstein, 1980: §192)

Interestingly, Chomskyan concerns with the processes underlying language are ruled out from the start. At the very beginning of his (1953/2016) Philosophical Investigations, he replies to the question of how a shopkeeper knows what to do with the word “red”: Well, I assume that he acts as I have described. Explanations come to an end somewhere.  (Wittgenstein, 1953/2016: §1)

He later adds: If it asked, “How do sentences manage to represent?”—the answer might be: “Don’t you know? You certainly see it, when you use them.” . . . How do sentences do it? —Don’t you know, for nothing is hidden. (Wittgenstein, 1953/2016: §435)

For Wittgenstein, the mistake of philosophers and many psychologists was to appeal to hidden “processes and states” where we leave their nature undecided. Sometime perhaps we shall know more about them—we think. But that is just what commits us to a particular way of looking at the matter. For we have a definite concept of what it means to learn to know a process better. (The decisive movement in the conjuring trick has been made, and it was the very one that we thought quite innocent.) (Wittgenstein 1953/2016: §308)

He hopes to avoid such errors by sticking to the readily available surface ways (or “language games”) in which we ordinarily use psychological expressions, which no more require reference to internal states than does our talk of clubs, rainbows, or “the sky” require finding “objects” in the world corresponding to them.31 31  We will actually return to a defense of a modest version of this latter suggestion—but not the “nothing hidden” one—in §10.4. Note, charitably, that attempts to specify the nature of mental processes in terms of computer models of the mind were not seriously on the table when Wittgenstein


In a similar vein, Ryle (1949/2009) wrote:

[W]hen we are in a less impressionable frame of mind, we find something implausible in the promise of hidden discoveries yet to be made of the hidden causes of our actions and reactions.  (Ryle, 1949/2009: 297)

And Wittgenstein’s student, Norman Malcolm (1977), rejected “the myth of cognitive processes and structures,” writing, for example, with regard to a cognitive explanation of such a phenomenon as a man or child recognizing a dog:

we could, just as rationally, have said that the man or child just knows without using any model, pattern or idea at all, that the thing he sees is a dog. We could have said that it is just a normal human capacity.  (Malcolm, 1977: 168)

But this last claim—like many of those other passages from Wittgenstein—surely flies in the face of any serious science. One could have said that it is just a normal human capacity for people to breathe and digest, but presumably this observation would not discourage theories of physiology. Perhaps Malcolm thought that just citing brute brain processes, without any talk of “models” or “ideas,” could explain how a child could recognize a dog. If so, then he should have provided it. It is by no means obvious that any such explanation is to be had.32

32  For some sympathy with these odd, seemingly anti-scientific claims of Malcolm, it might help to think of Rosalind Hursthouse’s (1991) interesting category of what she calls “arational action,” or actions, like shouting in joy or tousling a loved one’s hair, which seem not to be driven by the usual patterns of belief-desire decision making (or, if they are, they seem “insincere”). Superficialism is perhaps an overreaction to what does seem to be a sometimes excessive intellectualization of behavior, of the sort one often finds in Freud (1901/60), as well as Davidson (1963), to which Hursthouse was responding. Precisely where brute causal explanations give out will be a topic of §11.2.

One might have thought that this sort of line had died out with the rise of increasingly interesting cognitive explanations. But in a highly polemical book, G.P. Baker and P.M.S. Hacker (1984) claim that Chomsky’s whole approach is a non-starter:

It is astonishing that elementary language-learning should even seem a mystery, as opposed to wonderful. But it is even more astonishing to suppose


that it’s rendered less mysterious by the supposition that the child tacitly knows (‘cognizes’) a universal grammar which is part of its genetic endowment, and which it uses to map experience on to a ‘steady state grammar,’ which it then employs in speaking and understanding (‘decoding’) its native tongue. That would indeed be to explain the obvious by the miraculous.  (Baker and Hacker, 1984: 291)

They too think nothing is hidden:

What the child learns is manifestly not a theoretical grammar, but how to speak (grammatically, correctly); how to ask for a drink, tell mommy about a bird in the tree, call Daddy, object to dinner . . .  (Baker and Hacker, 1984: 291)

In keeping with the behaviorism of the 1940s and 1950s, the standard explanation of language acquisition offered by this tradition was “training” (cf. Wittgenstein, 1953/2016: e.g., §§5, 6, 158, 189, 441; Ryle, 1949: 30–1, 46; and Quine, 1960/2013: 8 quoted above), and it is to this idea that, remarkably, Baker and Hacker (1984) continue to appeal:

[The child’s] initial uses of language are responses to training. The normativity of its early forays in the use of language lies in parental reactions and corrections.  (Baker and Hacker, 1984: 291)

What is remarkable is that Baker and Hacker are happy to advance such a vague, unsubstantiated hypothesis immediately after quoting, without rebuttal, Chomsky’s (1959/64) detailed refutation of Skinner’s (1957) effort to spell it out. They conclude that Chomsky’s theory

is a folie de grandeur. The pretensions are as great as those of astrology, and the achievements altogether comparable. After decades of this sound and fury, can one honestly say that the tale signifies anything?  (Baker and Hacker, 1984: 315)

Suffice it to say that the burden on any such view is to provide an inkling of a plausible explanation of the crucial data we reviewed in Chapter 1. Is there a serious notion of “training” that would explain why people do not violate island constraints, why they respect c-command in cases of NPIs and binding, and how children manage to acquire such competence so quickly? The burden is on Baker and Hacker, as it is on other defenders of Wittgenstein, to provide such an explanation.


The rejection of internalist explanations was perhaps understandable in the 1940s, before the development of reasonably clear computational models of mental processes, which, since the “cognitive revolution” of the 1960s, has resulted in widely diverse and substantial research in the now various cognitive sciences. Perhaps Baker and Hacker, writing in 1984, were unimpressed by the results up to that date. But it is baffling that this rejection can be found persisting even to the present day. In a disturbingly recent, popular book, Nick Chater (2018) insists that postulations of mental depths are a confabulation—a fiction created at the moment by our own brain.

There are no pre-formed beliefs, desires, preferences, attitudes, even memories, hidden in the deep recesses in which anything can hide. The mind is flat; the surface is all there is. Our brains are . . . relentless and compelling improvisors, creating the mind moment by moment. . . . We are not driven by hidden, inexorable forces from a dark and subterranean world. Instead, our thoughts and actions are transformations of past thoughts and action.  (Chater, 2018: 220)

Chater does not actually explain how there could be, specifically, “confabulations,” “improvisations,” and “transformations of past thoughts and action” without there being, well, “past thoughts,” or without there being hidden mental mechanisms underlying the specific sorts of (persisting, coincidentally recurring?) processes he invokes, a problem facing superficialism in general.33

33  It may seem a bit unseemly to quote an essentially popular, quite ill-thought-out screed in the context of (for all their faults) the entirely serious work of Wittgenstein, Ryle, and Quine. I do so only to attest to the astonishing persistence of superficialism. Chater and Christiansen (2008) do present more careful arguments regarding Chomskyan linguistics and evolutionary theory, which there is no need to discuss here: they do not remotely have any superficialist premises or conclusions.

3.4.2  Homunculi (Harman and V. Evans)

One of the standard reasons why Wittgenstein (1953/2016: §32), Ryle (1949: 30), Skinner (1963/84: 615) and many of their followers regarded “going inside” for a psychological explanation as somehow vacuous was that they took such moves to be committed to a “homunculus,” or internal creature with a mind every bit as intelligent as the mind whose capacities it was being invoked to explain. In an early paper, Harman (1967) deploys this “homunculus” argument against Chomsky’s (1965) “theory confirmation” model of Aspects (1965: 30, quoted above in §1.2):


Taken literally, Chomsky would be proposing that, before he learned any language, Smith had made a presumption about certain data; he had set himself a task; he possessed a theory; he made assumptions; he had techniques of representation. This would be possible only on the absurd assumption that before he learned his first natural language Smith already knew another language. So we could not accept Chomsky’s proposal if it were to be taken literally.  (Harman, 1967: 77)

The simple reply is that the language “already known,” or, anyway, already available to a neonate, would be a computational one, of the sort needed for computational theories of hosts of other mental processes. Turing machines, after all, provide a way to compute any computable function by transitions that require the machine to be sensitive only to local, physical phenomena, such as being a “0” or “1” on a tape. Insofar as one thought of this latter sensitivity in terms of homunculi “reading” these figures, they are homunculi obviously so stupid that they can be replaced by purely mechanical steps, not requiring any of the intelligence they may be invoked to explain (see Fodor, 1968; Dennett, 1975).34

34  This fact about computational explanations seems to me of profound significance, both for the plausibility of such accounts of various mental phenomena, and for why such accounts, replete with intentionality, are needed in any general theory of the sensitivity of animals to non-local, non-physical, and/or non-instantiated (what I call “abstruse”) phenomena, to be discussed in §11.1 below.

The argument has surfaced most recently in Vyvyan Evans’s (2014: 171) attack on Chomsky. He considers this reply in terms of mechanically simple homunculi that are “so stupid that all he can do is say ‘yes’ or ‘no’.” But he replies:

This argument is disingenuous. Even a “yes”/“no” response involves an interpretation of a representation. And a further homunculus would still be required to interpret that representation. [This] attempt to defend the computational view of mind from the homunculus argument fails.  (V. Evans, 2014: 171–2)
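Before turning to the reply, it may help to have a concrete instance of the kind of “mechanically stupid” step at issue. The following is a minimal illustrative sketch, purely my own toy machine rather than a model drawn from Fodor, Dennett, or Evans: a Turing-style device that increments a binary numeral. Every transition is a bare table lookup keyed to the current state and the local symbol; there is no interpreting agent anywhere in the device.

```python
# A toy Turing-style machine that increments a binary numeral.
# (An illustrative sketch only; the machine and its transition table
# are my own assumptions, not anyone's published model.)
# Each step is a bare table lookup on (state, symbol): nothing here
# "interprets" the 0s and 1s; being a "0" or a "1" is just a local,
# physical difference the mechanism is built to respond to.

TABLE = {
    # (state, symbol): (next state, symbol written, head movement)
    ("scan",  "0"): ("scan",  "0", 1),    # sweep right over the digits
    ("scan",  "1"): ("scan",  "1", 1),
    ("scan",  "_"): ("carry", "_", -1),   # blank reached: add 1, moving left
    ("carry", "1"): ("carry", "0", -1),   # 1 plus carry: write 0, keep carrying
    ("carry", "0"): ("halt",  "1", 0),    # 0 plus carry: write 1, done
    ("carry", "_"): ("halt",  "1", 0),    # carried past the topmost digit
}

def run(tape):
    state, head = "scan", 0
    while state != "halt":
        symbol = tape[head] if 0 <= head < len(tape) else "_"
        state, write, move = TABLE[(state, symbol)]   # pure lookup, no exegesis
        if head == len(tape):
            tape.append(write)
        elif head < 0:
            tape.insert(0, write)
            head = 0
        else:
            tape[head] = write
        head += move
    return "".join(tape).strip("_")

print(run(list("1011")))   # -> "1100", i.e. 11 + 1 = 12 in binary
```

Nothing in the loop “understands” binary notation: the arithmetic emerges from the sheer sequence of stupid steps.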

To be sure, classifying, say, different amounts of electric current as “yes” or “no” responses requires interpretation of those currents. But, at least in the case of natural, non-artifactual systems, this interpretation is provided by the role those currents play in the computational account that best explains the machine’s operation. Of course, one might wonder how an explanation that assigned a state the interpretation of representing, say, a grandmother, could


possibly work. The advantage of assigning simply “yes” or “no” is that, unlike representing a grandmother, representing “yes” and “no” can be transparently implemented by those different electric currents (an issue to which we will return in §11.2). Perhaps not Evans, but many philosophers have worried that any explanations that turn on such interpretations are inevitably interest-dependent, in the way that many explanations of artifactual “computers” pretty clearly are. This raises a general worry about the interpretation of rule following that undermines any claim that there is an objective fact about which rule any seemingly intelligent system could be following, be it an artifactual machine or a naturally occurring mind or brain. The worry is so influential, it needs to be addressed separately.

3.4.3  “Kripkenstein”: Errors and Ceteris Paribus Clauses

Developing what he took to be an insight of Wittgenstein (1953/2016: §§201ff), Saul Kripke (1982) raised what (to my mind) a surprising number of philosophers have taken to be a fundamental problem for theories that appeal to “rules.”35 Kripke imagines someone who replies “5” when queried “What is 68+57?,” and raises the question: what makes it true that she is adding and making an error, as opposed to computing an alternative function, “quaddition,” which is exactly like addition except that quadding 68 to 57 yields 5? He considers some tempting replies in terms of behavioral dispositions, but rightly concludes that these will not solve the problem, since all of a person’s surface behaviors could be compatible with either hypothesis. The problem is to specify what makes a particular response an “error” as opposed to a correct application of quaddition. Kripkenstein and many others argue that the notion of “error” is irreducibly “normative”: it is not grounded in any genuine facts about the world, and so introduces a value element into any such intentional ascriptions, including Chomsky’s positing of grammatical rules and principles.36
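Since the rival hypotheses are arithmetical, they can be stated as code. Here is a minimal sketch (the cutoff of 57 follows Kripke’s own 68+57 example, though any finite bound would make the point; the function is standardly written “quus”) of how any finite record of past responses fits both hypotheses equally well:

```python
# Kripke's addition vs. "quaddition," stated as code (a sketch only).

def plus(x, y):
    return x + y

def quus(x, y):
    # agrees with addition whenever both arguments are "small" ...
    if x < 57 and y < 57:
        return x + y
    return 5          # ... and yields 5 everywhere else

# Any finite record of past answers is compatible with both hypotheses:
past_queries = [(2, 3), (10, 24), (40, 16)]     # all below the cutoff
assert all(plus(x, y) == quus(x, y) for x, y in past_queries)

print(plus(68, 57), quus(68, 57))   # 125 5: the two "dispositions"
# diverge only on queries the speaker has never actually faced
```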

35  Kripke (1982) himself does not claim to wholly endorse the view, and many have disputed his attribution of it to Wittgenstein (see, e.g., Anscombe, 1985). Consequently, it is often referred to, as I will, as the view of “Kripkenstein.”

36  “Normativity” has slipped into a surprising number of philosophical views about psychology (as Fodor, 2010: 203, noted, “Cows go ‘moo’; philosophers go ‘norms’ ”). We will return to this issue in discussing a version of the Kripkenstein problem, “the disjunction problem,” in §10.3, and in discussing “methodological dualism” in §11.3. See Rey (2002a, 2007) for general discussion.


However, for any post-behaviorist who is not embarrassed to “look inside” to ground psychological facts, a fairly obvious reply to such claims of unnatural normativity would be to appeal to Galilean idealizations of an internal system, of the sort Chomskyans advocate in abstracting from behavioral performance to systems of internal competence. If one insisted on there being some connection between such a competence and behavior, then one could simply appeal to familiar ceteris paribus (“CP” or “other things being equal”) clauses which serve to set aside all manner of interference that may conceal the underlying principles of a system. In the case of arithmetic, presumably normal human beings, when asked for a novel sum such as 68+57, would reply “125”—but, of course, sometimes other things are not equal: for example, they do not want to be truthful, or they are distracted, inebriated, or die before being able to reply. Kripke considers this option, suggesting that someone might try to flesh it out in terms of a conditional:

[CPK:] If my brain had been stuffed with sufficient extra matter to grasp large enough numbers, and if it were given enough capacity to perform such a large addition, and if my life (in a healthy state) were prolonged enough, then given an addition problem involving two large numbers, m and n, I would respond with their sum, and not with the result according to some quus-like rule.  (Kripke, 1982: 27)

However, Kripke goes on to claim that we have no idea whether such a counterfactual conditional is true:

How in the world can I tell what would happen if my brain were stuffed with extra brain matter, or if my life were prolonged by some magic elixir? Surely such speculation should be left to science fiction writers and futurologists. We have no idea what the results of such experiments would be.  (Kripke, 1982: 27)

Now, Kripke is absolutely right to despair of straightforwardly spelling out such speculations. But he is wrong to despair for that reason of what may, for all that, turn out to be perfectly good CP idealizations. If such speculations are just “science fiction,” then by the same reasoning most—likely all—genuine science would be science fiction! Scientists regularly idealize to, for example, frictionless planes, point masses, elliptical orbits, infinite populations of animals, and even away from the Big Bang, the heat death of the universe,


and the likely purely statistical character of fundamental physical laws.37 Kripke offers no reason for thinking that abstractions from mortality, brain size, or memory limitations are any less legitimate than idealizations about any other region of the world. “CP (an English speaker asked to add two numbers presents us with their sum)” seems on the face of it as promising a psychological law as any. If someone has failed to produce the sum of a pair of numbers when asked, because she was demonstrably distracted by the passing parade, forgot what she was doing, or had a slip of the pen, we reasonably say that she “made a mistake.” That is, she did not exhibit the underlying competence that we ordinarily presume to be one of the crucial determinants of her behavior, a fact that would surely be as ascertainable as any other such fact about a complex system. Kripke (1982) does raise a familiar worry about computational theories of artifactual machines depending for their semantics on the intentions of their artifactors, and claims that if a machine “so to speak ‘fell from the sky,’ there can be no fact of the matter as to which program it ‘really’ instantiates” (1982: 39). In reply, Chomsky (1986) rightly points out that we might learn a lot about, say, the computational organization of an IBM PC fallen from the sky by investigating how facts about its input and output are affected by the keyboard and what could be changed by manipulating its different internal parts:

We could develop a theory of the machine, distinguishing hardware, memory, operating system, program and perhaps more. It’s hard to see how this would be crucially different in the respects relevant here from a theory of other physical systems, say . . . the organization of neurobehavioral units . . . that explain how a cockroach walks.  (Chomsky, 1986: 239)38

Of course, Chomsky’s reply here is in keeping with his general “Galilean” approach to theorizing about the computational system underlying human 37  Note, for example, that the heat death of the universe surely does not give the lie to the obvious truth that there is no longest English sentence! A striking example of the kind of radical counterfactual abstraction that even physics may require is afforded by the finding of Penzias and Wilson (1965) with regard to the source of the interference with their radio signals to orbiting satellites: it turns out it was the residual radiation from the Big Bang. So whatever laws they were using to signal the satellites had to be qualified by a CP clause that allowed them to abstract from such effects of the Big Bang. But now consider just how science fictional it would be to consider our universe without the Big Bang! Speculations about having larger brains, living forever, etc., pale by comparison (see Pietroski and Rey, 1995, for discussion). Kripke does follow his worry about “science fiction” with a more familiar worry about whether the relevant CP laws could be spelt out without using further intentional terms (1982: 28). This is a quite different issue that we will discuss in §10.2.2. 38  A point on which, as we will discuss in due course (§8.4), Chomsky (2000: 105) oddly seems to renege.


linguistic competence. Curiously, Kripke (1982: 30–1, 97) discusses Chomsky’s (by the 1980s, well-known) distinction very briefly in two footnotes, claiming that competence is a “normative, not a descriptive” notion, and that, if the view he is ascribing to Wittgenstein is correct, “the notion of competence will be seen in a radically different light” (1982: 31, fn22). Unfortunately, he does not consider the interpretation of CP clauses offered here (indeed, he appears to misunderstand what Chomsky regards as competence, presuming that it is supposed to “explain . . . all my specific utterances” (1982: 99, fn77), as opposed to the manifestations of a specific inner mechanism). He does say in both footnotes that the whole “matter deserves an extended discussion elsewhere,” but, to my knowledge, has never provided it.39 Now, it is perfectly possible that the agent will still produce quad-like results, even after we control for various possible sources of interference. But then we would have excellent reason for thinking that the function she in fact intended was not addition, but perhaps quaddition or some other function (or that we ourselves are mistaken about the relevant sum). By analogy: it is perfectly possible that it is a law that the planets move in “schmelliptical” orbits—which are just like the complex orbits that emerge from the various interactions of planets on elliptical orbits. It is just that there is, of course, absolutely no reason to believe in such a law of planetary motion. Just so, as a matter of fact, most people are surely adding, not quadding (of course, maybe some odd individual could genuinely think “add” meant “quad”; fine, then, ceteris paribus, he would give the answer “5” when asked to “add 68 and 57”). It is widely feared that laws containing CP clauses are vacuous unless those clauses can be replaced by an explicit statement of what the “cetera” are, and precisely when they are “paria.” But such replacements are hard to come by, and, when provided, would often be indefinitely complex counterfactual monsters: if all the qualifications needed for even so familiar an example as Boyle’s Law were made explicit, the resulting conditional would be enormously complex; and the corresponding “initial conditions” would be wildly remote from conditions in the actual world (notice that in (CPK) above, Kripke does not really begin to consider all the ways things could go wrong). Suspicion on this issue has led Cartwright (1983) to suppose that “the laws of physics lie,” and others—for example, Davidson (1970) and Schiffer (1987)—to claim that macro-sciences like psychology cannot traffic in laws.

39  Crispin Wright (1989) does try to press the Kripkenstein worry on Chomskyan theories, on the basis of what I argue elsewhere are exaggerated claims about the nature of self-knowledge of one’s meanings and intentions; see Rey 2013, 2020c and §7.2.1, fn19 below.


Competence/Performance  127 In an effort to quell this panic, Paul Pietroski and I (1995) argued that it is pretty clear that this requirement could never be met by some of our best laws outside of psychology. However, there is no need of it. It is enough to existentially quantify over independent interferences. Specifically, we argue that CP clauses should be understood as claims that apparent exceptions to the law that follows them are to be explained as being due to interferences that could in principle be independently identified. The clauses should be treated not as explicitly specifiable conditionals, but as “cheques” written on the banks of independent theories, their substance and warrant deriving from the substance and warrant of those theories, which determine whether the cheque can be cashed. The question comes down to whether the candidate CP law, in conjunction with the rest of the theory of which it is part, and along with all the theories of the independent interferences, all manage to “cut nature at its joints.” This is perhaps not an easy fact to determine. But scientists are surely entitled to speculate, experiment, and settle such issues, which they sometimes do with immense plausibility. So, the ultimate force of this response to Kripkenstein depends, of course, on the plausibility of the theorist’s speculations. In the case of addition vs. quaddition, it would of course be amazing if our ordinary speculations in this regard turned out to be false. In the case of Chomsky’s idealization to a gram­ mat­ical competence, that is part and parcel of an assessment of his theoretical approach as a whole, as with any scientific theory. Recall, for example, that Chomskyans routinely speculate that some unacceptability responses may be due to features of memory and not to the grammar itself. Such speculative checks need of course to be cashed, and other specific challenges met. The point is that these are not matters to be decided on a priori, much less by an a priori superficialism of the sort Kripkenstein seems to presuppose.40 A similar reply can be made to John Searle’s (2002c) and V. Evans’ (2014) insistence that “norms” of “interpretation” vitiate linguistic appeals to ex­plana­ tory computational rules. Searle (2002c) writes: several of the fundamental notions [Chomsky] discusses, such as sentence, grammaticality, and, above all, computation are observer-relative, or normative, or both, in ways that have no echo in natural science. It is often tempting in the human sciences to aspire to being a natural science; and there is

40  In §11.2 I will propose a more specific explanatory reply to the Kripkenstein worry, as it was raised by Fodor (1987) in his “disjunction problem,” in the psychological context of meeting Chomsky’s demand of explanatory adequacy, to be discussed in Chapter 4.


indeed a natural science, about which we know very little, of the foundations of language in the neurobiology of the human brain.  (Searle, 2002c: internet)41

And, as we observed above, Evans (2014: 171–2) claims that even interpreting binary states of a machine as “yes”s or “no”s requires interpretation. Of course it does. But, as anywhere in science, what determines the right interpretation is the correct explanation of the relevant systems. Searle and Evans would perhaps insist that mere physics or neurobiology already provide that explanation without any need of interpretation. And so they presumably do, of every particular, “token” physical state of the system. But science is not in the business of merely explaining particular token states or events. Far more importantly, science aims to explain regularities among those states or events, which may not even be statable in the specific language of physics (which does not and might not in principle include the terms of linguistics or psychology). The crucial data of Chapter 1 are just such regularities. Objections to Chomsky’s mentalistic conception of linguistics have not been confined to superficialists. Objections to it have also been raised in a quite different way from within a broadly factual, mentalistic approach to psychology. We will consider these in Chapter 6. But, first, we need a clearer understanding of Chomsky’s specific explanatory ambitions.

41  By his appeal to “neurobiology” for linguistic explanation, Searle here is of course not a superficialist. But he does share the superficialist’s claim about normativity, going on immediately after this quote to wonder

how Chomsky’s “invariant principles” relate to the actual performance of speech acts? Perhaps he discusses it in another work, but there has to be some account of this central question in linguistics: How does the speaker’s mastery (internalization, knowledge) of the rules of the language relate to his speech behavior? And any such account has to deal with the normativity of the phenomena.  (Searle, 2002c: internet)


4
Knowledge and the Explanatory Project

From early on, a quite distinctive feature of Chomsky’s approach to linguistics was a constraint that had not previously occurred to anyone to place upon any linguistic theory: that it not merely generate the right “psychologically real” structural descriptions of natural language expressions,1 but that it should do so in a way that made clear how grammatical competence could be acquired by a child. Specifically, a grammar should specify how a neonate could attain in only a few years the stable state of a speaker who “has mastered and internalized a generative grammar that expresses his knowledge of his language” (Chomsky, 1965: 8). With this specific task in mind, Chomsky introduced a pair of distinctions between “weak” vs. “strong” generative capacity (1965: 60), and between the “descriptive” vs. “explanatory” adequacy of a theory (1965: 24–7). I will briefly discuss these distinctions in §4.1, turning in §4.2 to issues surrounding Chomsky’s use of the word “knowledge” to characterize the states of the child. One worry here has to do with the apparent suggestion that children are “little linguists,” their knowledge being somehow comparable to the knowledge of a professional scientist, a comparison that Chomsky (1975a: 11, 1959/64: 577) seemed sometimes to embrace. Part of this worry is easily dispatched by invoking the now standard distinction many philosophers have drawn between “conceptual” and “non-conceptual content,” only the latter being of interest in describing a child’s competence in grammar (§4.3). But a more important philosophical worry can be addressed by drawing a distinction between the kind of explanatory epistemology that interests Chomskyans and what I call a working epistemology that is the standard interest of philosophers: whereas the former may be what is at issue in accounting for the acquisition of language and other cognitive capacities, the latter concerns the justifications that people might explicitly provide for many of their folk and scientific beliefs, particularly in light of the threats posed to them by various sorts of skeptics (§4.4), and the two may well not coincide. Whether it does or

1  We will discuss what is meant by “psychologically real” in §9.7. For now, let it just imply some sort of important connection to the mind.


does not afford a reply to a skeptic, an explanatory interest reasonably invites a “computational/representational” approach to mental processes (§4.5), which is at least presently not generally available in our ordinary, working justificatory practices.

4.1  Explanatory Adequacy

For Chomsky (1965) a “weakly generative” grammar is one that simply generates the strings of phonemes that a speaker would produce. It is this conception of the task of grammar that Quine (1953/61c) had in mind when he understood such strings as external, physical phenomena:

the grammarian’s business, with respect to a language L, can be stated as the business of finding what sequences of phonemes of L are significant for L.  (Quine, 1953/61c: 51)
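The contrast with the “strongly generative” conception developed in the next paragraph can be made vivid with a toy case. The grammar below is purely my own illustration (nothing like it appears in Quine or Chomsky, and the function names are invented for the sketch): two analyses that are weakly equivalent, generating the same strings, while assigning those strings different structural descriptions.

```python
# A toy contrast between weak and strong generation (a sketch under my
# own assumptions, not an example from the text). Weakly, a grammar
# just determines a set of strings; strongly, it assigns each string a
# structural description. Two grammars can be weakly equivalent --
# same strings -- while disagreeing about structure.

def strings(n):
    """The string set both toy grammars weakly generate: a^k b, k <= n."""
    return {"a " * k + "b" for k in range(1, n + 1)}

def right_branching_sd(k):
    """SD per a grammar with rules S -> a S | b: [a [a ... [a b]]]."""
    tree = "b"
    for _ in range(k):
        tree = ("a", tree)
    return tree

def flat_sd(k):
    """SD per a grammar with rule S -> a+ b: one flat constituent."""
    return tuple(["a"] * k + ["b"])

print(strings(2))             # {'a b', 'a a b'}
print(right_branching_sd(2))  # ('a', ('a', 'b'))
print(flat_sd(2))             # ('a', 'a', 'b')
# Same string "a a b" on either analysis; only strong generative
# capacity registers the difference between the two grammars.
```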

(We will return to Quine’s proposal in more detail in §9.2.) Chomsky, by contrast, seeks in the first instance a strongly generative grammar that would generate a set of structural descriptions (“SD”s) of the linguistic items a speaker hears ambient speech to be realizing, for example, the descriptions that encode the tree-structures generated by a speaker’s I-language. It is a deep and important fact about language that speakers do not hear speech in their native language as mere noise—indeed, it is virtually impossible for them to do so. Although they may not consciously think of the speech they hear as possessing anything like the elaborate structures posited by a Chomskyan theory, as we will see in §7.3, there is abundant experimental evidence that they are perceptually sensitive to much of it. At any rate, it is these perceptual structures that Chomskyans reasonably think a linguistic theory should generate. But even generating structural descriptions is not enough. One reason Chomsky dissociated himself from the formal conception of the field pursued by his teacher, Zellig Harris, is that he thought a serious linguistic theory ought to aim not merely for such descriptive adequacy, characterizing the possible SDs, but for what he calls “explanatory adequacy”: the account should fit into an explanation of how a grammar could be acquired by children on the basis of the relatively little evidence to which they are exposed (the “primary linguistic data”) in the stunningly short time in which they do:2


To the extent that a linguistic theory succeeds in selecting a descriptively adequate grammar on the basis of primary linguistic data, we can say that it meets the conditions of explanatory adequacy. That is, to this extent, it offers an explanation for the intuition of the native speaker on the basis of an empirical hypothesis concerning the innate disposition of the child to develop a certain kind of theory to deal with the evidence presented to him.  (Chomsky, 1965: 25–6)

2  Pereplyotchik (2017) mistakenly believes the phrase is “best construed as a successful fit with the maximally general, simple, and unified theoretical coverage of all human languages” (2017: xxiii; see also 74–9). This is obviously a substantially weaker demand, such coverage being unlikely to be available to a child.

One might wonder why this additional aim should be important. After all, a theory of arithmetic is standardly pursued without any obligation to explain how children could become competent at counting; why should linguistics be any different? The reason has to do with what seems to be the immensely plausible suggestion that, unlike arithmetic, linguistics is essentially about us: facts about language simply do not seem to obtain independently of us in the way that facts about numbers famously seem to do (see §6.2 below). Whether language is due to convention or biology, it still all depends upon us in a way the truths of arithmetic plainly do not. But, given what we observed in §3.2, that language does seem importantly rooted in our biology, a theory of it ought to be constrained by that biology, and so aim to explain how the biological system develops in response to experience, much as (to return to our original analogy in §1.1.1) a theory of walking would be constrained by facts about how our limbs develop. Smith and Allott (2016) put the point well:

Each language seems to be an extremely rich and complex system, but Chomsky and others have argued that all languages are cut from the same cloth, and the simple reason for this is that they rely on dedicated and innate mental structure. That is, human children—all human children—have a language acquisition device which has built-in invariant principles, supplemented (perhaps) by “parameters,” switches that can be set one way or another in childhood. . . . For the first time, linguists have a good grasp of what it is they are trying to describe and explain.  (Smith and Allott, 2016: 56)

Explaining what the observed diversity of adult languages have in common turns out to depend upon questions about how children came to acquire them, rather as a characterization of species seems to involve an explanation of how they evolved.3 Perhaps if species had not evolved, biology would have 3  Of course, it might also depend upon how the faculty of language evolved across history, an issue Chomskyans also address, however tentatively (cf. §2.2.10).


had a very different character; similarly, if languages each sprang up merely as a result of arbitrary social conventions, linguistics might not have to be so interested in Chomsky’s condition of explanatory adequacy. However, if, as the crucial data we noted in Chapter 1 clearly suggest, languages do not spring up that way, then we need to understand how they did emerge, and this seems to be from the interaction of UG with primary linguistic data. Consequently, the problem of acquisition becomes essential to an explanatorily adequate linguistics. Note that explanatory adequacy comes in many degrees, there being an impressive number of different aspects of language acquisition that cry out even for a sketch of a plausible explanation. Chomsky, throughout his work, focuses primarily on how a child could possibly acquire the system of rules or principles of a generative grammar. But there is surprisingly little discussion of how merely postulating an innate formal structure of grammar is not by itself nearly enough to meet the demand of explanatory adequacy. Children face a monumental problem in sifting through all manner of what Lidz and Gagliardi (2015) call the sensory “input” to determine what should be serious “intake” to the grammar (cf. §5.4.4 below).4 Although just how the child gets from the input to the intake is a difficult question, it is hard to believe that their grammars are automatically updated by just anything that happens to be acoustically similar to speech, for example, merely by squeaky doors (pace Chomsky, 1987: 37). A significant problem is posed by genuine speech that is nevertheless taken to be in some sense “irrelevant” to such updating, for example, errors, false starts, and corrections; playful nonsense and “baby talk”; and archaic speech, as in prayers and nursery rhymes. Otherwise, contrary to the central idea of the theory, children would be completely at the mercy of the likely weird statistics of the impinging stimuli, which will probably vary widely over different children who nevertheless fairly rapidly attain the same grammar (cf. §11.2.1).

4  As Howard Lasnik (2000: 3) remarks: “The big step is going from ‘noise’ to ‘word’.” As we will see in Chapter 9, the latter is far from identifiable with a kind of the former. For now it is enough to note that words have hosts of linguistic properties that mere noises lack. But even if words were a subset of noises, there would be the question of how the child would determine that subset.

4.2  “Knowledge”

A great deal of controversy has attached to Chomsky’s use of the expression “knowledge of language/grammar” to refer to what children achieve in


acquiring linguistic competence in the first several years of their life. Thus, Chomsky writes:

A theory of linguistic structure that aims for explanatory adequacy incorporates an account of linguistic universals, and it attributes tacit knowledge of these universals to the child. It proposes, then, that the child approaches the data with the presumption that they are drawn from a language of a certain antecedently well-defined type, his problem being to determine which of the (humanly) possible languages is that of the community in which he is placed.  (Chomsky, 1965: 27, emphasis mine)

There are quite a number of different reasons that philosophers especially have balked at this use of the term “knowledge,” some of them merely verbal, involving facts about the diverse uses of the word in ordinary English.5 I must confess, as they are usually stated, these seem to me to be theoretically uninteresting debates, since, so far as I can tell, nothing of genuine explanatory significance actually turns on the term: a Chomskyan theory is not about the English word “know,” but about the relation of children to the principles that govern their I-language, which they (and/or their I-languages) respect in the way that their visual system respects, for example, principles of good form or Ullman’s “Structure from Motion theorem” (see Palmer, 1999a: 398ff, 490ff), without the children in any way “knowing” these principles. But, given that Chomskyans do persistently use the term in this connection, there are some important distinctions that are worth stressing to avoid pointless misunderstandings. For starters, there was an unfortunate comparison that Chomsky (1959/64: 577, 1975a: 11) sometimes made of a child to a “little linguist,”6 which suggested that “knowledge” of language is supposed to be comparable to the knowledge of a professional linguist. As against this conception, Michael Devitt (2014) rightly objected:

(A) Ordinary hearers understanding [an English sentence] have no conscious awareness of its SD or of any inference from the SD to a translation of [it].

5 See Devitt and Sterelny (1987/1999: §8.4) for discussion of the diversity of relevant uses of “know.” 6  Bill Idsardi (pc) says this characterization evidently goes back to Bloomfield (1933). The analogy was intended only to capture the problem of a “discovery procedure” that besets both the linguist and the child determining a grammar from limited data, even though their ways of dealing with that problem might be fundamentally different.


(B) Given that it takes a few classes in syntax even to understand an SD, it is hard to see how ordinary hearers could use it as a premise even if they had access to it.  (Devitt, 2014: 285)

We will see, however, in the next chapter (§5.4) that thinking of the child as a “little linguist” was actually a needless and quite misleading characterization employed only for purposes of exposition. There is not the slightest presumption that the relevant sort of “knowledge” that typical three-year-olds or even most adults attain is the least bit conscious or explicit in the way it is for linguists. To be sure, most of the evidence for Chomskyan theories is typically provided by conscious adult “intuitive” responses, along lines we will discuss further in Chapter 7, but this is inessential. As we mentioned in introducing the WhyNots in Chapter 1, evidence of “Unacceptability” could in principle be provided by other manifestations of mental/neural structure, for example, hesitation, double-takes, confusion, or distinctive neural patterns. To clear away misunderstandings in this regard, Chomsky (1980a) eventually introduced for his purposes what he regards as “a technical term,” cognize:

To avoid terminological confusion, let me introduce a technical term devised for the purpose, namely “cognize,” with the following properties. The particular things we know, we cognize. In the case of English, presented with the examples, “the candidates wanted each other to win” and “the candidates wanted me to vote for each other,” we know that the former means that each wanted the other to win, and that the latter is not well-formed with the meaning that each wanted me to vote for the other. We therefore cognize these facts. Furthermore, we cognize the system of mentally-represented rules from which the facts follow. That is, we cognize the grammar that constitutes the current state of our language faculty and the rules of this system as well as the principles that govern their operation. And finally we cognize the innate schematism, along with its rules, principles and conditions. In fact I don’t think “cognize” is very far from “know” . . . but this seems to me a relatively minor issue, similar to the question whether the terms “force” and “mass” in physics depart from their conventional sense (as they obviously do).  (Chomsky, 1980a: 69–70; see also 1975b: 164–5, 1986: 32–3, 265)

Although such a suggestion is, I think, on the right track, on the face of it this is no introduction of a “technical” term. Unlike the terms “force”


and “mass” that are introduced in physics in quite precise ways, “cognize” is patently just another word for “know,” just less constrained so as to include unconscious states. Indeed, given that a few pages later Chomsky thinks that “it is not at all clear that the ordinary concept of knowledge is even coherent” (1980a: 92), one might wonder why he thinks “cognize” is any better off, or, more importantly, needed at all.7 The skeptic here wants a reason for thinking that they are mental, intentional states,8 even when they are not (accessibly) conscious. Otherwise they seem, as Galen Strawson (1994: 166) compares them, to be “sounds” on an unplayed CD. One common strategy for dealing with Chomsky’s use of “knowledge” is to appeal to a distinction Ryle (1949) famously drew between “knowing how” and “knowing that,” which he thought contrasted the case of a skill, such as, for example, knowing how to ride a bicycle, with that of knowing that a certain theory of bicycling is true, and many philosophers have proposed assimilating knowledge of language to such “know how.”9 Devitt (2006a: 210–20) prefers to compare theories of language with theories of skills, such as catching a ball or playing a piano, concluding that

the literature of these latter theories should make us doubt that language use involves representing syntactic and semantic properties; it makes such a view of language use seem too intellectualistic.  (Devitt, 2006a: 221)

7  Chomsky does add the following on behalf of “cognize”:

If the person who cognized the grammar and its rules could miraculously become conscious of them, we would not hesitate to say that he knows the grammar and its rules, and that this conscious knowledge is what constitutes his knowledge of the language. Thus “cognizing” is tacit or implicit knowledge, a concept that seems to me unobjectionable . . .  (Chomsky, 1980a: 70)

But this adds nothing. Of course, should someone become miraculously conscious of the principles of digestion, we might well say she knew them—then. But of course that does not imply she knew—or cognized—them all beforehand!

8  I postpone discussion to Part III (Chapters 8–10) of Chomsky’s (2000) odd denials that his theory involves any intentionality. On the face of it, “cognize” would seem to be every bit as intentional as “know.” Some philosophers might wonder whether Chomsky intends “cognize” to operate at a “personal” vs. “sub-personal” level of description. Dennett (1969) insists that it is typically not “persons” who are deploying the relevant rules, in the way they might, say, traffic laws, but only “sub-personal” systems; and some would go on to confine intentional ascription to just the personal level. I am afraid I have never found the distinction well-enough drawn to settle whether or not it is persons that might have, e.g., the unconscious attitudes attributed to them by Freud or the introspective confabulations attributed by Nisbett and Wilson (1977); and, if those are personal, why not the rules by which a person parses visual or linguistic input?

9  Stanley and Williamson (2001) argue that knowledge-how is simply a species of knowledge-that, based on subtle discussion about the uses of the two idioms in English. The present issue is not, however, a lexical, but rather an explanatory one: is any use of “know” needed for linguistic theory?


This suggestion often brought with it a behaviorist conception of a behavioral skill that made no commitments to any internal “cognitive” achievements of a “know that” sort.10 Chomsky explicitly rejects the appeal to “know how” as an account of knowledge of language:

[T]here would be little point to a concept of “cognizing” that did not distinguish “cognizing the rules of grammar” from the bicycle rider’s knowing that he should push the pedals or lean into a curve. . . . [W]e take bicycle riding to be a skill, whereas knowledge of language and its concomitant . . . is not a skill at all. The skill in question might, perhaps be based on certain reflex systems, in which case it would be incorrect to attribute a cognitive structure in the sense of this discussion . . .  (Chomsky, 1980a: 102)

But the distinctions here are orthogonal: “skills” (e.g. to do mental arithmetic) and “reflexes” (e.g. parsing, as in J.A. Fodor 1983: 72) might well involve “know that”, and plenty of what are presumably non-“reflexive” non-skills (e.g. the immune system) might not involve any “know how” at all.11 Chomsky’s claim here seems to amount to no more than a casual reflection on our ordinary talk and is actually at odds with his stated lack of interest in it (as at, e.g. Chomsky, 1980a: 92ff, 1986: 27, 266–9). He is not concerned with any sort of “analysis” of the terms “knowledge” or “language,” and readily acknowledges his usages of them might diverge from ordinary thought and talk.12 His usages should, I think, be understood precisely along the Putnamian lines we mentioned in §3.2 about the usage of scientific “natural kind” terms generally as a means of “getting at” an underlying natural kind that we hope that science will eventually characterize correctly. Chomsky can be taken to be using “knowledge” to refer to kinds of phenomena that he suspects a correct (psycho-)linguistic theory will ultimately define (cf. §4.5 below).

10  Of course, Devitt (2006a) is no behaviorist; but he does seem to take seriously some of their recent connectionist descendants (see 2006a: 220–43). In an earlier discussion, Michael Dummett (1981) also argues that Chomsky is only entitled to claim that knowledge of language is “knowledge how.” See Hursthouse (1991) (§3.4.1, fn32 above) for reasonable worries about over-intellectualization generally.

11  Actually, Chomsky’s view of bicycle riding is not entirely stable, even in the same volume: see (1980a: 3, 53), as well as his (2003: 280) where he directs the reader to his (1975b). The issue seems to me a subtle empirical one, about which I doubt either Chomsky or anyone else is in a position to provide a definitive account.

12  In his (1968/2006: 169) Chomsky dismisses the “know-how/-that” distinction as not capturing what he regards as his own third alternative, “knowledge of” language, which Collins (2007a) regards as simply an adoption of yet another informal idiom. But it is not clear that this idiom is of any explanatory importance either.


This distinction between ordinary and scientific uses of a term invites further distinctions that should properly insulate Chomsky’s concerns from any of the usual philosophical ones that have been raised against him, between conceptual and non-conceptual content (§4.3) and between a working and an explanatory epistemology (§4.4).

4.3  Non-Conceptual Content

Devitt’s and others’ worries about over-intellectualism could be easily reduced if we apply a distinction many have raised with regard both to Chomsky’s discussion, and other discussions of what might be called “early” cognitive processing, such as occurs in many processes that do not seem to be part of a person’s general cognition. Certainly in the original (1965: 30) Aspects model, the child seemed to bear a cognitive relation to the contents of, say, the principles and parameters of a grammar without being able to reason about it consciously in the rich and explicit ways that professional linguists might. This is a problem that arises with respect to many systems of “early” perceptual processing, and has been addressed with respect to vision by appeals to “non-conceptual content,” or content of states that explain perceptual abilities without being generally available to cognition in the way that concepts typically are.13 For example, appeals to “non-conceptual content” permit us to claim that a person might well represent, say, the property square, but without having a representation that expresses the corresponding concept, {square}.14 A person’s visual system may use a simple predicate, triggered by certain stimulus arrays, that might have the content {{symmetry}}—or perhaps {{symmetrical-looking}}—but not one that expresses that concept and so allows her to reason about symmetry in the usual ways.

13  There has been much as-yet-unsettled dispute about how precisely to draw the distinction between conceptual and non-conceptual content—and whether it is actually a distinction between kinds of content, or between kinds of states (e.g. perceptual vs. rationally integrated ones) that involve a uniform kind of content (see Heck, 2000, 2007, and Bermudez and Cahen, 2015, for useful discussions). I do not want to take a stand on these difficult issues, but will simply assume that there is an important distinction here to be drawn, and will use what has become the standard term “(non‑)conceptual content” to draw it, without, I hope, prejudice to those issues.

14  See the Glossary for my (provisional) conventions for distinguishing properties (small caps), conceptual (curly brackets), and non-conceptual (double curly brackets) contents.



Figure 4.1  The Mach Square/Diamond

An elegant example was provided by the physicist Ernst Mach (1914), who noted that an experience of a square can involve precisely the same objective stimuli as an experience of a diamond (Figure 4.1). However, the experiences will still be different, depending upon how the subject “sees” or “thinks” of it (I put “think” in scare-quotes, since some might confine it to genuine, integrated conceptual reasoning). As Peacocke (1992: 75–7) stresses in discussing this example, a child (or perhaps an animal) could distinguish these two experiences without having the concepts {square} or {diamond}: at any rate, they may not be able to reason about squares being equal-sided and equal-angled, and so might be regarded as lacking those concepts. Indeed, as Peacocke goes on to note:

Intuitively, the difference between perceiving something as a square and perceiving it as a (regular) diamond is in part a difference in the way the symmetries are perceived. When something is perceived as a diamond, the perceived symmetry is about the bisection of its angles; when . . . as a square . . . about the bisection of its sides.  (Peacocke, 1992: 76)

But, as Peacocke again emphasizes, someone could have these different experiences without having the concept {symmetrical}. It seems pretty clear that, insofar, of course, as they are attributing content at all (cf. Part III), Chomskyans are attributing non-conceptual content to the states of the I-language and associated systems, such as parsing. There is no reason for anyone to suppose that many—or perhaps even any!—meta-linguistic concepts, least of all conscious ones, are available among speakers generally. The relevant SDs issuing from the language faculty to a central processor are likely to be the results of a modularized processing system involving non-conceptual ones—let us call them “NCSDs”—and it is therefore not surprising that there is no immediately conscious awareness of them with the conceptual contents as ordinarily conceived by the linguist. Some of these latter are, of course, what one learns to deploy either in “grammar school” or by taking linguistics classes. All that may be available to the naive hearer is


awareness of some or other distinctions that the linguistic NCSDs are marking, just as all that is available to a non-geometer looking at the Mach figures are distinctions marked by non-conceptual visual ones. Note also that the NCSDs determine how things look/sound; they themselves are not the “things” we seem to see or hear—although, of course, they are likely to be the basis on which we consciously report our intuitions when we do deploy meta-linguistic concepts (again, there is more to our phenomenology than mere sensation). And, again, the importance for the Chomskyan of such reports is not necessarily their truth, but simply the evidence they provide, via the NCSDs, of the underlying grammar. Thus, I shall assume throughout the remainder of our discussion that the grammatical content ascribed by linguists is largely non-conceptual. Some of it may, of course, become conceptual as the child learns meta-linguistic expressions, such as “word” or “sentence,” and the like. The issue will become particularly important in considering the role of intuitions as evidence in §7.2.2.

4.4  An Explanatory vs. a Working Epistemology

An important distinction that I think goes to the heart of the disagreement between Chomsky and many philosophers is one that has not to my knowledge been sufficiently noted in standard epistemological discussions. It is between what I will call a “working,” largely consciously explicit epistemology, and an “explanatory” one, or a general scientific account of how animals and people succeed in having and manifesting their remarkable cognitive abilities: the rationality, intelligence, and frequent success of many of their efforts. As Chomsky (1986: 18, 1988a: 3–4) frequently quotes Bertrand Russell (1948):

How comes it that human beings, whose contacts with the world are brief and personal and limited, are nevertheless able to know as much as they do.  (Russell, 1948: xiv)

By contrast with Chomsky and most cognitive scientists, Plato, Descartes, and the traditional Rationalists and Empiricists seem to me mostly concerned with a working epistemology, specifically with what people are justified or entitled to believe in consciously settling disputes that explicitly arise in science and ordinary life, or in answering familiar philosophical skeptics. Historically, of course, this sort of working epistemology has often been accompanied by explanatory speculations about the “origins of ideas,” of the


sort one finds persistently in the philosophical tradition, from Plato to the Rationalists, who emphasized a special faculty of purely reflective “a priori” reason, and from empiricists up to and including Quine and Devitt, who insist that all knowledge is based on experience, even if sometimes, as in the case of logic and mathematics, quite indirectly. But one point that is striking about all these speculations is how extraordinarily few of them are informed by any serious, empirically controlled research into just how people do acquire knowledge. I do not want here to reiterate familiar complaints about this (what seems in retrospect) often irresponsible speculation, but merely want to emphasize how one should be careful about inferring the character of this explanatory epistemology from features of the working one. After all, there is no real reason the two should coincide: maybe what we fairly self-consciously do in reflection and explicit argument is quite different from what, as it were, our minds/brains may do often fallibly and inexplicitly, but quite efficiently, in reasoning and learning about the world.15 One might think that this distinction between a working and an explanatory epistemology is simply the familiar distinction between a “normative” and a “descriptive” one. And perhaps it is; but I think it is crucial to notice how normative considerations may enter into an explanatory psychology in ways that might differ from the role they play in a working one, at least until we have a sufficiently rich psychology that is able to unify the two. After all, one task of a “descriptive” explanatory psychology is surely to explain just how we and other animals come to understand things and succeed in so many of our efforts as well as we appear to do. It is not unlikely that at least some of this success is due to our using strategies that, given our innate endowment in our normal environmental niche, are immensely reliable, and sometimes “rational” (Why did Fischer win so many chess games? He was no dope!).16 Although normativity may not be intrinsic to psychological explanation (cf. fn 15), there is no reason to exclude it from an explanatory project. Still, however, the strategies we might employ in working reflection may not be the ones that explain an animal’s success, especially given its specific 15  Cf. Kahneman (2011), although I do not mean to be drawing a principled distinction between “slow” and “fast” thought, much less “personal”/“sub-personal” levels of mental ascription, but between the obviously different purposes of a working vs. an explanatory epistemology (hence the “as it were”). Nor do I mean to be suggesting for a moment that the explanatory ascription of attitudes is in any way intrinsically “normative,” as many have insisted (see the exchange between myself and Ralph Wedgwood in McLaughlin and Cohen, 2007). 16  Here, of course, I am sympathetic to so-called “reliabilist” alternatives to classical definitions of “knowledge,” but since I am not concerned with an “analysis” of the term, I don’t want to commit myself strictly to those alternatives here. It’s enough that reliability serves as a strategy for explaining and justifying beliefs.


endowment in its specific niche. Some of those successful strategies may turn out to be highly specialized and specific to particular domains—say, of language, or the folk theories of biology or mind—and there may be no general explanatory epistemology to be had. Indeed, a genuinely explanatory epistemology looks like it will be immensely more difficult to secure than traditional philosophers have supposed, and our working practice simply cannot wait on its results. The contrast between a working and an explanatory epistemology is particularly important in understanding much of Chomsky’s core project. Given our ignorance of what is genuinely responsible for our cognitive abilities, it may well be that the best working epistemology for the foreseeable future is Quine’s (1960/2013) pragmatic “Neurathianism”: in explicitly justifying one’s claims, one starts at different places at different times, depending upon what serious doubts have been raised about some issue, much as, in Quine’s familiar figure from Otto Neurath (1932/83: 92), one repairs a boat while remaining afloat in it, piece by piece, standing on one plank to repair a second, on the second to repair a third, and so on, only perhaps to stand ultimately on the last to repair the first. No single plank is immune to repair or revision.17 In support of the Neurathian view, attention is standardly drawn to the many revisions in thought that have been occasioned particularly by the increasing sophistication of science, as in the cases of the development of set theory, non-Euclidean geometries, General Relativity, quantum mechanics, and (perhaps) even arithmetic and Classical logic. These developments, at any rate, have led many philosophers to be at least more careful than many have traditionally been about insisting that certain claims are absolutely indubitable, particularly on the basis of reflection or “intuition.” At best, one can rely on one’s intuitive verdicts just insofar as they have been shown to be reliable (as, of course, by far most cases of mathematicians’ reasoning about arithmetic appear to be). In working science, the concern is with what scientists claim, not with the internal psychology responsible for their claims. Chomsky is likely quite sympathetic to such a stance. For all his and his colleagues’ reliance on intuitions as evidence for properties of the language faculty, and the hope that the principles may turn out to display “virtual conceptual necessity” (§2.2.9, fn 44), he never appeals to them as somehow indubitable or “self-evident” (cf. §7.1.2 below).

17  We will discuss some problems with the view when we consider Quine’s efforts to marshal it on behalf of his attacks on meaning in §10.2.2.


This Neurathian figure can seem to invite Quine’s other familiar figure of “confirmation holism” whereby:18

our statements about the external world face the tribunal of sense experience not individually, but only as a corporate body.  (Quine, 1953/61b: 38)

Quine was expanding on an observation of Pierre Duhem (1906/54), who drew attention to how empirical scientists, testing an empirical hypothesis and confronted with a failed prediction, do not automatically reject the hypothesis. They will usually check out any number of auxiliary hypotheses regarding the conditions of the experiment, the reliability of the experimental apparatus, and the plausibility of various background beliefs.19 Quine’s holism has been tremendously influential in philosophy in the last seventy years. Michael Devitt (1996) has been a particularly staunch defender of it, understanding it as a kind of ultra-empiricist claim:

there is only one way of knowing, the empirical way that is the basis of science (whatever way that may be). (Devitt, 1996: 2; see also his 1998/2010b: 257; 2011: 284)

He thinks this view discredits what he, with Quine, regards as unjustifiable appeals to a priori knowledge, and scorns Platonic and Rationalist (what he calls “Cartesian”) appeals to a special faculty of a priori reason (as well as to linguistic intuitions, as we will discuss in §7.1.1 below). Now, as a general working methodology in science, it is likely Chomsky would accept this as a good rule of thumb as well. As a scientist, he certainly would be the first to regard his theory as requiring the kind of empirical justification he takes himself to be providing. But Quine (1960/2013) seems to treat his holism equally as a claim about an explanatory epistemology, as well as about a working one (he does not distinguish the two). As we discussed in §3.3, given his behavioristic framework, Quine viewed people’s “cognitions” (again, if that is what they can be called) as essentially bundles of dispositions


to assent and dissent that change under the pressure of stimulation, with no particular disposition sacrosanct. And, in a way, perhaps that is what is centrally relevant to working science, the concern being with the sentences scientists are prepared to defend. Of course, since the demise of behaviorism it has been clear that such a view will not begin to suffice for explanatory purposes, where, precisely along Chomskyan lines, the interest has been increasingly in innate, mostly unconscious cognitive capacities regarding not only language, but physical objects, numbers, minds, and morals.20 It would seem that the gulf between a working and an explanatory epistemology could not be greater. The crucial point for present purposes is not whether confirmation holism and Devitt’s claims about the “empirical way” being the “only way” of knowing are true, but only to allow that, while a working epistemology for the foreseeable future might well be more or less Quinean, without sacrosanct “foundations,” an explanatory epistemology could turn out to be quite otherwise, based on peculiar, innate, maybe even a priori principles, and the inputs from our various perceptual modules.21 In any case, with this distinction in mind, I think it is safe to say that Chomskyans are themselves plausibly working within a roughly Quinean epistemology (Chomsky, 2000: 67, seems to endorse it), even though they are unlikely to endorse it remotely as an explanatory one. Without abandoning a Quinean working epistemology, Chomskyans are free to explore the possibility that some sort of knowledge of I-language may be an important explanatory ingredient of our cognition, whether or not it plays a role in people’s everyday practices of working justification and philosophers’ conceptions of it, and whether or not full, conscious conceptual justifications that might mollify a skeptic play any role in the acquisition of language.

4.5  Computational-Representational Theories (“CRT”s)

One of the most consequential ideas of the twentieth century was, of course, Turing’s theory of computation, in particular his proposal of what has come to be called a “Turing Machine.” Not only did it give rise to the 20  See, e.g., Spelke (2003, 2017), Carey (2009), Apperly (2010) and Mikhail (2011). Questions could also be raised about how sensory and proprioceptive and other forms of self-knowledge fit into a holistic explanatory framework: it surely does not seem as if people arrive at or justify their claims about what things look or sound like by considering their entire theory of the world! We will return to this issue in §7.1.3. 21  In Rey (1998, 2001, and 2020b) I defend the possibility of a naturalistic a priori in at least an explanatory epistemology.


production of the all too familiar artifactual computers that have come to dominate our lives, but, quite independently, it has inspired a wide variety of computational-representational theories (“CRT”s) of mental processes. As the name indicates, there are two parts to a CRT, a computational part and a representational part. As I will explain shortly, they are quite independent: one could have one of them without the other—indeed, as we will see in due course (§§8.4–8.5), Chomsky and some of his followers might be regarded as thinking that there is really no substantive representational part over and above the computations. And someone could think the mind is representational but not explained by its being computational: some dualists, connectionists—and, ironically enough, Jerry Fodor (2000)!—might think this. The computational part of a CRT is the central subject matter of most cognitive psychology and of almost all Chomskyan linguistics, and there is no need to discuss it in any detail here.22 There is one point, however, that is seldom sufficiently stressed in those and other discussions, what I call the physical locality condition. When Alan Turing provided his famous characterization of computation in terms of “Turing Machines,” he quite sensibly did not specify the transitions between states in terms of just any old thing one might put on the tape. In particular, he did not specify that the transitions occur if the cursor had to detect on the tape something that was loved by the king, had been drawn by Picasso, referred to Notre Dame, or was cubical, an NP, or an English sentence. If he had, then one might well wonder how such a machine could possibly work: how could the cursor possibly detect such things? Such a machine would have to be able to figure out, for example, what the king loved, who drew it, what it referred to, and whether something was cubical or a sentence, and all these abilities would seem to require a “homunculus” with the very “intelligence” that a computational theory was being invoked to explain, precisely as foes of computational theories have often claimed (cf. §3.4.2 above). This is, of course, precisely why Turing specified the symbols purely formally, that is, in terms of some or other local and physically detectable features, for example, “0”s and “1”s that could be individuated by their entirely

22  See Newell and Simon (1976), Fodor (1975, 1987, and 2010) for classic discussions of it, as well as my (Rey 1997: chps 6–9) for an introduction.  Fodor (1983, 2000) has serious qualms about it, arguing that, although a CRT is a necessary condition for cognition, it is not sufficient for a theory of what he regards as “central” mental processes. After all, if theory confirmation is holistic in ways that Quine insisted, then it is not computable along standard lines. Fodor is prepared to claim that that is just so much the worse for a computational theory. Alternatively, one might think it is so much the worse for Quinean holism, especially given its vagueness (see Quine, 1986: 493), and the complete lack of any positive arguments for it (see my 2020b and Quine’s, 1991/2008, own serious reservations about it, discussed in Verhaegh, 2018).


local shape properties—or, in a modern computer, by electro-magnetic properties (we will return to this important issue in §11.2).23 It is for this reason that it is crucial that the computational part be specifiable independently of the representational part, even if at the end of the day they form a natural explanatory unit.24 The syntactic specification is an important component that, when combined with a representational theory, affords a serious explanation of some intelligent computation. Thus, Turing Machines are standardly presented with an intended interpretation of the symbols on the tape: sequences of “0”s and “1”s represent, for example, the numbers in the domain of some computable arithmetic function, or the balances in someone’s bank account. In the case of artifactual computers, the intended interpretation is whatever the artifactor or user of the machine just stipulates—which is why there is seldom any discussion of the relation between the symbols and what they symbolize: the intentionality is entirely for free! If, however, we are to apply a computational model to a naturally occurring computer, for example, the mind, we cannot rely on such stipulations, and so need to provide a “naturalistic” account of intentional content. But, again, this is an issue entirely and necessarily independent of the specification of the computations themselves. Both because of these facts, and for various further, unfortunate historical reasons (see §11.3.3), the representational part is almost never addressed by psychologists or linguists, but only by philosophers. In chapter 9 of Rey (1997) I reviewed the main proposals that philosophers have made, but the issue is too large to deal with here. It will, I think, be enough to consider a modest version of a few of them that, I will argue, is all that will be needed for purposes of explanatory adequacy in linguistics. But I postpone that complex discussion to Chapters 10–11.

23  Strictly speaking, of course, Turing Machines are abstract mathematical objects that may be “implemented” in any number of ways. Indeed, operations specified at the fairly abstract level of standard linguistics discussions may be implemented by any number of different (“real time”) algorithms at lower levels, which may in turn be run in real time on actual machines in any of a vast multitude of ways (cf. §3.3.4). My point is that, however computations are to be implemented in real time, it is crucial to Turing’s proposal that these abstract devices be specified in a way that permits local physical, “mechanical” implementations and transitions between their constituent states. (I set aside the complexities of “quantum computing,” although I expect an analogous condition would need to be fulfilled.) 24  There is sometimes confusion on this point. Searle (1984), for example, claims that computational symbols have no meaning; they have no semantic content; they are not about anything. They have to be specified purely in terms of their formal or syntactical structure. (Searle 1984: 31) While it is true that computations have to be specified without reference to meaning, this does not entail that they do not have meaning properties, which may play an important role in a computational explanation. Bachelors can be specified without reference to their loneliness, even though their loneliness may explain a lot of their behavior.
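To make the physical locality condition vivid, here is a minimal sketch in Python (the machine, its states, and its symbols are my own illustrative inventions, not anything from Turing or Chomsky) of a Turing-style device whose transition table is keyed solely on its current state and the single, locally scanned symbol:

```python
# A toy Turing-style machine. The transition table mentions ONLY local,
# physically detectable features: the current state and the symbol under
# the cursor. Nothing in it adverts to what the symbols represent.
#
# Table: (state, scanned symbol) -> (next state, symbol to write, head move)
TABLE = {
    ("scan", "0"): ("scan", "1", +1),
    ("scan", "1"): ("scan", "0", +1),
    ("scan", "_"): ("halt", "_", 0),   # "_" is the blank symbol
}

def run(tape):
    cells = list(tape) + ["_"]         # a finite tape ending in a blank
    state, pos = "scan", 0
    while state != "halt":
        state, cells[pos], move = TABLE[(state, cells[pos])]
        pos += move
    return "".join(cells).rstrip("_")

print(run("0110"))   # -> "1001"
```

Under an intended interpretation (reading the tape as a binary numeral) the machine computes the ones’ complement of its input; but, as the comments stress, that interpretation figures nowhere in the transitions themselves, which is exactly the independence of the computational from the representational part noted above.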


What is worth noting here is how a CRT could provide a clearer account of the kind of “knowledge” and “cognizing” that Chomsky seems to have in mind. Grammars are attained by children via processes that are described abstractly in terms of the rules, principles, and constraints of a generative grammar, but which could ultimately be spelt out in terms of the kinds of mechanical procedures of a CRT, as presumably Chomsky conceived of his (1965: 30) Aspects model. The structural representations produced by those processes are then available for further (one hopes!) computationally characterized processes of comprehension and production. Searle (2002c) claims that any such appeal to computational procedures cannot in general be explanatory since (in an argument that, we will see in §8.4, Chomsky, 2000: 105, surprisingly seems to endorse):

computation exists only relative to an observer. Chomsky would like linguistics to become a natural science, but “computation” does not name a phenomenon of natural science like photosynthesis or neuron firing, rather it names an abstract mathematical process that we have found ways to implement in commercial hardware and which we can often interpret as occurring in nature. Indeed, just about any law-like process that can be described precisely can be described computationally. Thus some scientists describe the stomach in computational terms (“the gut brain”) but we all know there is no mental life in the stomach and no zeroes and ones. Computation is not discovered in nature, rather it is assigned to it. (Searle, 2002c)

But such a dismissal disregards the essential explanatory role of a computational model in a Chomskyan linguistics. To be sure, from the fact that almost any phenomenon could be described or interpreted computationally, it does not follow that that description should be taken literally. What argues for taking any description of anything literally is that the description is part of the best explanation of the regularities exhibited by the phenomena. Whether the stomach or a thermostat engages in computations depends upon whether a computational model is essential to capturing regularities in its processes. If not, then perhaps the model is simply an expository metaphor. But this does not seem to be true in the case of language. There simply does not seem to be a non-computational model of language acquisition, comprehension, or


standard use that would explain the crucial data of Chapter 1. Searle, at any rate, points to none.25 It should be noted, however, that there is absolutely no need for a Chomskyan linguist to endorse CRT as an approach to the whole of mentality. Indeed, Chomsky (1993: 154; 2000: 105) has often expressed serious doubts about CRT as a general theory of thought. But here is one place where some might well put the word “know” to good use. Although a rule that is explicitly represented in a system (the computations being defined over those representations) might be “known” by it, this seems much less clear if a rule is merely implemented (§3.3.4). As Demopoulos and Matthews (1983) observed:

[If] all that we are committed to is that the mind computes the functions which correspond to rules of the grammar, [then] the sense of the claim that the rules are represented and used is exhausted by the (much weaker) assertion that our linguistic behavior accords with the grammar . . . not unlike the “knowledge” that the earth-moon system has of its Hamiltonian, and this seems clearly unacceptable.  (Demopoulos and Matthews, 1983: 406)

Such a use of “know”—or “cognize”—would actually fit Chomsky’s own reply to the question I raised earlier of how to distinguish cognizing the rules of language from cognizing the laws of physics, both of which a speaker may obey. In reply to the question as I raised it in Rey (2003a), Chomsky (2003) wrote that we should attribute Newton’s laws to some internal system of Jones, or of a rock, if there were reason to believe that they do access this system when they (decide to) fall from a height. But that course is plainly wrong; it suffices to attribute to Jones and the rock the property of having mass. That property, however, will not suffice . . . for insect navigation, or for Jones’s interpretations of [linguistic phenomena]. Accordingly, we should attribute to them other properties: . . . for insects, cognizing a system of path integration and calculation of solar ephemeris; for Jones, cognizing [his language]. (Chomsky, 2003: 278, emphasis mine)

25  Note that even if a computational model is essential as an explanation of digestion, this wouldn’t entail that digestion is a mental process. CRT is the claim that core mental processes are computational, not the obviously absurd converse, that all computational processes are mental.


But what is it for insects to “access a system” when they navigate? One clear answer would surely be that the insects access explicit representations of their paths and the solar ephemeris—precisely as Gallistel (1990) argues that most insects do. And so the same would hold of Jones: he cognizes those aspects of his language that he explicitly represents, accessing those representations when he decides to speak. But, as in all such discussions of “know,” intuitions about its use vary, and until it is assigned a serious explanatory role there is no reason to adjudicate between them.
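For concreteness, here is a minimal sketch in Python of the kind of explicit representation that path integration is standardly taken to involve (the class and its details are my own illustrative simplification, not Gallistel’s actual model): the agent continuously updates a stored homing vector from locally sensed heading and distance, and later accesses that very representation to set a course home.

```python
import math

# A toy path integrator: the agent maintains an explicit representation
# (dx, dy) of its net displacement from the nest, updated on each leg of
# its outward journey from locally sensed heading and distance.
class PathIntegrator:
    def __init__(self):
        self.dx, self.dy = 0.0, 0.0          # the stored homing representation

    def move(self, heading_deg, distance):
        self.dx += distance * math.cos(math.radians(heading_deg))
        self.dy += distance * math.sin(math.radians(heading_deg))

    def homeward_course(self):
        # Accessing the stored representation to compute the direct
        # bearing and distance back to the nest.
        bearing = math.degrees(math.atan2(-self.dy, -self.dx)) % 360
        return bearing, math.hypot(self.dx, self.dy)

ant = PathIntegrator()
ant.move(0, 10)      # 10 units east
ant.move(90, 10)     # 10 units north
print(ant.homeward_course())   # ~ (225.0, 14.14): head southwest
```

The contrast with the rock is then easy to state: nothing in the rock’s falling is explained by its consulting a stored state in this way, whereas the integrator’s homeward turn is explained precisely by its access to the (dx, dy) representation.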


PART II

THE CORE PHILOSOPHICAL VIEWS


5
Grades of Nativism
From Projectible Predicates to Quasi-Brute Processes

Chomsky’s theory of language is probably best known for what seems to many people to be its audacious hypothesis that a specific UG is “innate”: aside from the setting of variable syntactic parameters and the mappings of basic phonological forms to morphemes, UG is available to the child essentially from birth.1 A generative grammar by itself does not entail this strong nativist thesis. As we will see in this chapter, it is at least a serious conceptual possibility that the essentials of UG could be acquired by domain-general methods of statistical inference, what I will call “GenStat” strategies, by which children acquire knowledge of, say, table manners or styles of dress. Such strategies have become one of the most energetically pursued rivals to a nativist UG. Although few would doubt that statistical computations have some role to play in acquisition, the view of Chomsky’s that I will defend in this chapter is that, construed as rivals to an innate UG, such proposals are as fundamentally mistaken as would be analogous proposals about the structure of the visual or endocrine systems. The principles governing such structures are determined by our biology; all that experience does is to determine the specific character those structures assume within the bounds of those principles. This important Chomskyan proposal can be obscured, however, by a variety of arguments that have been or could be raised for nativist hypotheses of weaker sorts. Each of these can be seen to have emerged in response to a problem that can be associated with a specific philosopher who is well-known for having pressed it. In this chapter I want to lay out these different problems and indicate the increasingly strong grades of nativist hypotheses they invite. 1  A great many excellent discussions have already been written on Chomsky’s nativism (see especially Laurence and Margolis, 2001, and Berwick et al., 2011). I mean only to supplement those discussions with some further philosophical points that I don’t think have been widely appreciated, which is why I include this topic under “core philosophy” rather than “core linguistic theory,” of which it is in fact every bit as much a part.



I will conclude that what I call Chomsky’s “quasi-brute (causal)”2 process nativism is interestingly stronger than any of the other sorts, but, for all that, also the most plausible. As many have observed, the term “innate” is not a happy one, being notoriously difficult to characterize across biological domains.3 For the purposes of the present discussion it will suffice to rely on a notion relatively specific to cognitive abilities: a species-wide “initial state” of a system that develops into fairly stable states as a result of input to the system.4 Chomsky’s core claims are that, in the case of grammar, that initial state is UG, a biological endowment that begins to become available sometime around birth, and serves as the basis for responding to the input in such a way as to bring about a stable state of competence with very nearly the same I-language as others in the ambient community. The task of an explanatorily adequate theory that we discussed in §4.1 is to characterize that initial and final state, and the intensional function that takes the child from one to the other under the influence of experience. One problem that bedevils much of the discussion of nativism is the contrast that it is supposed to bear to learned. In §5.1 I will argue that, as Leibniz observed and is actually quite evident in Chomsky’s (1965) initial proposals, the innate and the learned often go hand in glove. “Innate” should be contrasted not with learned from experience, but rather with acquired by it. The issue is whether the source of the cognitive material is in us, or is somehow “derived” or “constructed” from experience, for example, by association or other statistical processes. Whether material is learned or not has to do not with that question, but rather with the issue of how it is activated: is it by a rational process, or by a brute causal one? We will see that Chomsky’s views on this latter issue have evolved over the decades, and so the question should at this point at least be left open. Chomsky has often provided what I think are a number of misleading arguments for an innate UG, and it is important to set them aside so the best ones can shine out. The most bewildering is his characterizing the denial of UG as somehow committed to there being no difference between “my granddaughter, a rock and a rabbit” (2000: 50). Obviously, no one holds such a silly view: rocks have no cognitive life at all, and GenStatists could distinguish 2  By “brute-(causal)” I will mean pure physical causation, unmediated by cognitive (representational/intentional) processes or properties (I will delete the “causal” where it can be presumed). By “quasi-brute” I mean processes that are only partially mediated by cognitive processes. 3  See Ariew (2007) and Gross and Rey (2012) for reviews and discussion. 4  One can leave open just how many stable states there are—obviously more than one with multilingual children, but perhaps more than one even in “mono-lingual” cases (see Chomsky, 1978a: 7; 2000: 8, 27; and Yang, 2002).


human from rabbit minds in any of a variety of ways (cf. §5.4.3 below). Chomsky provides better arguments than this, but it turns out that many of these also tend to establish much weaker nativisms than the one that I will argue is by far his crucial one: a largely quasi-brute “causal process” as opposed to a mere “predicate” nativism. The best-known argument Chomsky provides for an innate UG is that from the “Poverty of the Stimulus,” which I will lay out in §5.2, turning in §5.3 to what I think is his somewhat misleading assimilation of it to the Rationalist’s concern with geometry and mathematics. Chomsky calls this “Plato’s Problem,” but we will see that the issues raised specifically by Plato need not present a problem for sophisticated GenStatists, who do not need to be limited to simple inductive proposals (§5.4.1). As is well known, at least in philosophy, simple inductions face the problem of “non-projectible” predicates raised by Chomsky’s mentor, Nelson Goodman (§5.4.2), and GenStatists can and have responded to this challenge by appealing to more sophisticated, “Bayesian” forms of statistical inference (§5.4.3). However, even these more sophisticated approaches are beset by widely noted empirical problems (§5.4.4), as well as by two more philosophical ones: an insufficiently appreciated problem of “modality” raised by Leibniz (§5.4.5); and a little noticed problem of “Superficialism,” raised in effect by Quine (though not quite characterized in this way by him) (§5.4.6). I will argue that especially these last two problems invite the quasi-brute process nativism that I take to be Chomsky’s settled view (§5.4.7). I will conclude this chapter (§5.5) on a somewhat ecumenical note. GenStatism can be regarded as an instance of a general “Usage Based Linguistics”5 which seeks to base language acquisition in social communication and coordination. Such approaches can be prey to what I earlier characterized as “teleo-tyranny” (§2.2.5), and are standardly presented antagonistically as alternatives to a Chomskyan approach. But they need not be. All hands agree that language is used sometimes for communication, which is likely to be an issue of performance, involving human purposes and general intelligence. Although Chomsky sometimes seems to suggest that the only serious linguistic theories are of underlying grammatical competence, there is absolutely no reason his core theory need exclude further interesting theories of such processes, in particular of speech pragmatics and social and historical patterns in language use, any more than theories of physics exclude theories of engineering or meteorology. 5  This term was introduced by Ronald Langacker (1987), an early proponent of such views. For a recent review of the range of views under this umbrella, see Newmeyer (forthcoming).



5.1  Innate and Learned!

Chomsky (1996) claims that language acquisition is

Now, as we noted above (§3.1), the analogy of language with bodily organs is certainly useful in drawing attention to the ways in which the language system may automatically develop. And certainly no one is tempted to think that puberty is learned. But language does in fact seem a complex case. In fact, the above denial of language learning does not actually fit well with many of the other proposals Chomsky has made. As we saw (§2.2.2), he (1965: 30) initially proposed that acquisition of a grammar was the result of a comparison between the output of each of a small finite set of innate hypotheses and information about the ambient language, and a selection of one of them by, for example, a simplicity metric. We noted that such a model is a version of what might can be regarded as a classic theory of learning, the familiar “hypothetical-deductive” (“H-D”) model of scientific hypothesis con­firm­ ation, according to which learning is the result of treating information from the world as evidence for a hypothesis. Indeed, Chomsky (1968/2006, 2010: 61) made a famous comparison of language acquisition with Peircean abduction in science, according to which our significant theories of any domain involve what has come to be called “an inference to the best explanation,” the range of acceptable explanations being highly constrained from birth. It is doubtless such a model that led Jim Higginbotham (1991) to insist that language acquisition is “a rational achievement.” Yet Higginbotham, like Chomsky, nonetheless regards at least Universal Grammar as innate! So the manifested grammar on this view so far is both innate and learned: it is possessed innately as among the candidate grammars, but then one of the candidates is selected, that is, learned on the basis of experience. You can learn what you already possess. What is ruled out by nativism is not learning a grammar from experience, but rather constructing it from sense experience in the way that empiricists had typically claimed for all our ideas. However, our grammars, like our


ideas of most things, are very likely not mere logical constructions from sense impressions—but they may be learned from them nonetheless. One might usefully think of the set of grammars that are innately possessed as being just that set that can come to be acquired by learning on the basis of exposure to certain sorts of input. Rather than being opposites, at least linguistic innateness and learning are dependent upon one another, part and parcel of a creature’s cognitive system. The point here is worth emphasizing, since the vociferous nativist Jerry Fodor (1998: ch 6) painted himself—and, he claimed, a Chomskyan linguistics—into a corner by missing it. Fodor (1975, 1981b) advanced a famous argument against the possibility of learning any new concept:6

If the mechanism of concept learning is the projection and confirmation of hypotheses (and what else could it be?), then there is a sense in which one never learns a new concept. For, if the hypothesis-testing account is true, then the hypothesis whose acceptance is necessary and sufficient for learning [an arbitrary concept] C is that C is that concept  (Fodor, 1975: 95)

Fodor goes on to argue that, to a first approximation, all concepts expressed by mono-morphemes (and many poly-morphemes) in natural languages are innate, their activation due to “brute-causal triggering” by stimulation from the environment (see Fodor, 1981b: 273). I am not concerned here to defend or refute Fodor’s (as it has come to be called) “Mad Dog Nativism” (although see Rey 2014b for a provisional defense), but only to call attention to a needless difficulty he was led into by his statement of these claims. In his (1998), Fodor worries that if the above claims are true, then there is a puzzle about why there should appear to be rational relations between a concept and the worldly stimuli that activate it. He takes as an example the concept doorknob (which he takes to be primitive7) and wonders why, if it is innate and therefore, by his account, not constructed from experience, it is typically activated by the sight of doorknobs, or things that look or seem to function like them, and not arbitrary stimulations by, say, whipped cream or giraffes (Fodor calls this the “doorknob/doorknob

6  A view with which Chomsky, himself, has expressed agreement (see Piattelli-Palmarini, 1980: 260–1). Fodor’s main argument for the claim is based upon what he takes to be the fact that concepts have no analyses; see Fodor 1998. 7  Readers have wondered why he picked a polymorpheme that seems to invite analysis. Perhaps because it really does not. doorknob is not compositional: is it, e.g., a knob for a door, a knob in the shape of a door, a knob used as a door, or a knob engraved with an image of a door?


problem,” 1998: 127–8). That concepts are activated by learning, specifically hypothesis testing, would seem to be the only option that would capture the apparent rationality of the relation; but he argued that that option presupposes the very conceptual competence it is supposed to explain: “primitive concepts can’t be induced; to suppose that they are is circular” (1998: 132). In a footnote (1998: 128, fn8), he further argued that the same problem arises for a Chomskyan linguistics, where, if parameter settings are merely brute-causally triggered, as Chomsky (1981) proposed, then it would be a bizarre coincidence that a child sets the parameters in accordance with the parameters of her community.8 I will not pursue here Fodor’s own proposals for his problem (but see §11.2 for discussion of one of them), since the supposed problem seems to me entirely artificial. Along the lines I have sketched, concepts and languages can be both innate and learned. People are born with a concept, say, doorknob or triangle, and proceed to learn by confirmation from experience which things do or do not (approximately) satisfy these concepts. This seems to be precisely what the classical Rationalist, Leibniz, had regarded as the right way to think about his own nativist suggestion:9

And, as I have said, this seems to be precisely how Chomsky (1968/2006, 1980a: 136) understood his (1965: 30) Aspects proposal for language acquisition. In any event, there need not be the often presumed opposition between

8  It might be thought this could be a brute causal correspondence, in the way, say, that the pattern of marks on a piece of paper corresponds to the pattern of marks on the plate from which they were printed. But the problem is that lacking a subject is not a brute causal property of speech in the way that marks of ink are. More on this crucial issue shortly in §5.4.7; see also §11.1. 9  I am indebted to Steven Gross for drawing my attention to this quote. Interestingly, Chomsky (1965: 50) quotes this passage, but reads it as Leibniz simply refusing to “draw a sharp distinction between innate and learned.” But it’s not that the distinction is blurry; it’s the wrong one to draw!


statistical and nativist models of language acquisition, at least not at the abstract level we have considered so far. As Lidz and Gagliardi (2015) observe:

The presence of UG does not remove the need of a learning theory that explains the relation between the input and the acquired grammar. (Lidz and Gagliardi, 2015: 12–13; see also Yang, 2003: 451)

The tension, however, between the two kinds of models is not confined to such an abstract level. In the rest of this chapter, I want to argue that what is at issue in the nativism debate is not nativism vs. statistical learning tout court, but whether language is constrained in a highly domain-specific way that renders it crucially different from the acquisition of the knowledge or competencies presumably at work in learning arbitrary codes or low-level empirical facts. We will see that there is a series of increasingly severe constraints that point to a domain-specific process that is largely independent of the rest of the cognitive system.

5.2  The Poverty of the Stimulus

The main consideration advanced for the innateness claim is what has come to be called the “Poverty of Stimulus” (PoS) Argument. There are various forms this argument has assumed over the years, and here I will present only what strike me as the most important ones.10 The core idea is the patently obvious fact that virtually all children manage to effortlessly attain competence in their ambient language(s) in a stunningly short period of time on the basis of what, for the task, is massively impoverished and often corrupt data. By the age of three or so, they have seldom if ever actually heard more than a tiny fraction of the things they can say and understand. Moreover, they often hear fragmented, interrupted, and technically ungrammatical speech (e.g., archaic prayers, whimsical songs, and poems); and, of course, they have never been instructed about the relevant rules or constraints that are manifest in the WhyNots. It is unlikely in the extreme not only that a child would ever utter 10  Chomsky (1968/2006: xi) sometimes claims he intended no such argument—it was simply an observation (see also Collins 2008b: 102–3), which it was a “tactical mistake” to mention, since it suggested that the growth of language might be based on experience, unlike the growth of all other organs (see Chomsky and McGilvray 2012: 40). However, given that the acquisition of at least some features of language, e.g., vocabulary, and the setting of parameters, is surely in part experiential, as well as the wording of the debates of the last fifty years or so over issues that go by the name, I will stay with it.


or hear *Who did you and play ball today?, but, if she did, that a father would reply, “No, no. You can say Who did you play ball with today?, but you mustn’t say *Who did you and play ball today?”—but, if he did, what on earth would the child make of it? That Who did you and play ball? is merely naughty? This last point is sometimes expressed as the claim that there is virtually no explicit negative data with regard to syntax—or, at any rate, not remotely enough of it to enable children to become competent with the ambient grammar in the three to four years that it generally takes them to do so. Aside from a few linguists, virtually no one is consciously aware of the kinds of rules that explain the WhyNots, or even that they exist; and, given their abstractness and complexity, certainly no one unacquainted with linguistics could possibly state such rules if it occurred to them to try. This lack of negative data raises what has been called the “subset problem”: given two possible languages, one of which permits only a “constrained” proper subset of the constructions of another (say, two ways of forming questions rather than twelve), how do children manage to acquire the more constrained subset if they have not been corrected for over-generalizing (e.g., to the twelve)?11 Of course, some might say that language is acquired by “habit” and/or “imitation,” along the lines suggested by behaviorists such as Quine (1970b: 4): “A language is mastered through social emulation and social feedback.” But such vague and empirically unsupported proposals are belied by the deep fact of “creativity” we noted in §1.5.2: other than obviously routinized speech, most sentences we encounter and produce we have never encountered or produced before. Indeed, one might wonder where children so much as get the idea that English clauses can be indefinitely nested in such sentences as This is the cat that chased the mouse that . . . lived in the house that Jack built. No other animal seems to do anything like it. Apart perhaps from aspects of phonetics and obvious routines, language seems less like “habit” or “imitation” than virtually any other human or animal activity.12 So the question is: how do kids do it? Well, maybe they are clever at grasping patterns, using GenStat procedures. But why would they never utter the WhyNots, or try UG-impossible rules, for example, ones that depend on linear distance, or counting the number of words in a string (cf. Chomsky, 2013: 63)? Well, perhaps, although they are not provided explicit negative data about what they cannot say, there is subtler, more “implicit” negative evidence 11  See the detailed example from Lasnik and Uriagereka, 2002, in §5.4.2 below. The “Subset Principle” is the claim that children do acquire “the smallest possible language compatible with the input at each stage of the learning procedure” (Clark and Roberts 1993: 304–5). See Pearl and Sprouse (2013) for discussion and Pearl (forthcoming) for a rich recent review. 12  With the possible exception of song-birds; see Bolhuis and Everaert (2013).


that influences them (cf. Cowie, 2008: §2.3.1(a)). After all, absence of evidence can sometimes be evidence of absence (as in the case of there being no hippopotamus in the room). But in the case of language that alone cannot be sufficient, since there is a potential infinitude of things that could be said that are perfectly grammatical, for example, a million conjunctions of “Snow is white,” or some of the ungainly sentences from Faulkner or Henry James that are mercifully set down only in print. So—even if they were interested in doing so, and it is by no means obvious that they are—children cannot rule out a sentence or phrase as ungrammatical just because they have not heard it before. Of course, children’s speech may often be implicitly corrected by adults repeating what they took a child to be erroneously saying by uttering it correctly, or by failing to utter a construction that a child might have expected. Examining several of the CHILDES corpora, Chouinard and Clark (2003) cite evidence that 50–67 per cent of the child’s errors are reformulated by parents into congruent contrast utterances in which the error is replaced by the correct form. But everything depends on how the error and correction are understood. Corrections or rephrasings are more likely to be for matters of style, manners, or fact (“Say Please can I go out,” “Don’t say damn,” “No, daddy’s not a cow.”). Indeed, as we noted in §1.5.8 in the example of when Walt Disney is on, many corrections ignore ungrammaticality and only focus on truth. What is striking about the examples of WhyNots is that they almost never seem to occur in standard adult–child exchanges. A fairly subtle WhyNot is discussed by Leddon and Lidz (2006), involving Principle A of the Binding theory (see §2.3.3; note the “*i/j” index on herself in (b) but not (a), permitting co-reference only to Jane in (b)):

(a)  Mary_i wondered which picture of herself_i/j Jane_j saw. (i.e., where herself can be either Mary or Jane)
(b)  Mary_i wondered how proud of herself_*i/j Jane_j was. (i.e., where herself cannot be Mary)

If the object of the main (or “matrix”) verb wondered is not the noun phrase which picture . . ., but instead a finite, complement clause (how proud . . .), then herself cannot co-refer with the non-local Mary_i. Leddon and Lidz report that in a study of the first 10,000 wh-questions uttered by parents to children, there was not a single one with a wh-phrase complement that contained an anaphor, pronoun, or name! Yet children respect the constraint nonetheless (see Lidz, 2018, for lucid discussion of the full complexities of the example).


One might suppose, though, that sometimes issues of grammar come up. However, Virginia Valian (1999) has pointed out that, while people might suppose that an adult might say to a child:

(c)  Don’t say Want banana; say I want a banana.

in fact:

All studies that have investigated parental reactions show parents of 2-year-olds do not overtly correct their children’s speech [cites studies]. Explicit correction is rare at any age and tends not to occur at all for children younger than 4 years old. Nor do adults produce ungrammatical strings for children and label them as ungrammatical.  (Valian 1999: 501)

Indeed, it is extremely doubtful that either (non-linguist) adults or children ever produce any of the more egregious WhyNots that we considered in §1.3, for example, violations of constraints on islands, contractions, or parasitic gaps. Most amusingly, when children are corrected about grammar, they often ignore it! Martin Braine (1971) reports trying to correct one of his children’s use of “one” as a noun modifier. Using different nouns on different occasions, his efforts went somewhat as follows:

Child: Want other one spoon, Daddy.
MB: You mean, you want THE OTHER SPOON.
Child: Yes, I want other one spoon, please, Daddy.
MB: Can you say “the other spoon”?
Child: Other one spoon.
MB: Say “other.”
Child: Other.
MB: Spoon.
Child: Spoon.
MB: Other spoon.
Child: Other spoon. Now give me other one spoon.

He notes:

Further tuition [was] ruled out by her protest, vigorously supported by my wife. Examples indicating a similar difficulty in using negative information


will probably be available to any reader who has tried to correct the grammar of a two- or three-year-old child. (Braine, 1971: 160–1; quoted in Pinker, 1994: 114)

In any case, even if there are some occasions of implicit correction of grammar that children might notice and care about, it is unlikely such evidence would be sufficiently available and consistent across cultures to begin to account for the stable, convergent acquisition children display.
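Before leaving the subset problem, it may help to make its logic concrete. The following minimal sketch in Python (the toy “languages” and all names are my own illustrative inventions) shows why positive-only data can never, by itself, dislodge a learner from an over-general grammar:

```python
# Toy illustration of the subset problem: with positive-only data, every
# observed sentence is compatible with BOTH candidate languages, so the
# data alone never refute the larger, over-generating one.

constrained = {"who did you play ball with", "did you play ball"}
superset = constrained | {"who did you and play ball"}   # over-generates

positive_data = ["who did you play ball with", "did you play ball"]

compatible = [lang for lang in (constrained, superset)
              if all(s in lang for s in positive_data)]
print(len(compatible))        # -> 2: the superset survives all the data

# A Subset-Principle-style learner breaks the tie by antecedent bias:
# choose the smallest language compatible with the input.
choice = min(compatible, key=len)
print(sorted(choice))         # the constrained language wins
```

The moral, on the Chomskyan view, is that the tie-breaking bias has to come from somewhere other than the data; the dispute is over whether it comes from a domain-specific UG or from some more general inductive principle.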

In any case, even if there are some occasions of implicit correction of grammar that children might notice and care about, it is unlikely such evidence would be sufficiently available and consistent across cultures to begin to account for the stable, convergent acquisition children display.

5.3  Plato’s Problem?

Chomsky (1986: xv) calls the difficulty here “Plato’s Problem”: how can we know so much given “the poverty of the stimulus” available in the environment? The problem was famously raised by Plato in his “Meno,” where he has Socrates engage in a dialogue with an untutored slave boy. The boy (with a little prodding) comes to appreciate a non-obvious theorem of geometry, apparently relying only on his own resources. As we will see, this is actually an unfortunate comparison that ill-serves the “quasi-brute process” nativism to which I think Chomsky is ultimately committed. The problem of language acquisition is a much more peculiar, specific problem than that raised by Plato’s slave-boy, since, for all the (actually rather meager) evidence that Plato provides in the “Meno,” it could easily be the case that the boy “unconsciously knows” geometry simply by applying general learning strategies to spatial concepts—general strategies that, in fact, one might well expect to include math and geometry—and this is why he can fairly readily be led to consciously recognize the general truths that Socrates proposes. At any rate, neither Plato nor Chomsky has shown there to be something peculiar about the geometric reasonings that would require domain-specific processes, as opposed to merely possessing the relevant concepts. The point could, of course, be argued, depending on one’s view of geometry. But certainly Plato and other traditional Rationalists thought such knowledge of necessary truths was due to the light of pure, a priori reason, not to contingent domain-specific structures of the sort we seem to find in the case of language.13 No one thinks that the principles of grammar are “truths 13  It is important to distinguish the a priori from the innate: a claim can be known a priori if it is “justifiable independently of experience”; it is innate if it is somehow inborn. The two may overlap, but a belief, say, that space is Euclidean, might be innate but not knowable a priori.


of reason,” knowable a priori, in the way that the Rationalists regarded mathematics. These, people need to learn not by mere reflection, but by engaging in the kind of laborious, empirical inquiry pursued by professional linguists. Chomsky’s comparisons to Plato are only a slight improvement on his distracting query about what distinguishes a human infant from a rock. Similarly, Chomsky’s stress on the productivity and creativity of grammar, while a fine point against, say, Behaviorists, is not quite so convincing against GenStatists, who might well claim that such properties are true of most any reasonable GenStat system. This is why the WhyNots are particularly crucial evidence for UG, since they are not entailed by mere recursion, creativity, or mastery of general notions of logic, mathematics, or geometry. So what further constraints does a Chomskyan theory impose? One way to appreciate them is to consider the claims of GenStatism and a series of increasingly difficult problems that can be raised against them.14

5.4  General Statistical (“GenStat”) Approaches

5.4.1  Simple Induction

The statistical approach that might first come to mind is simple induction, or generalizing about an entire class on the basis of some sample of it. This would certainly be one way to understand a GenStatism that proceeds to account for language acquisition by the construction of “learning algorithms.” Lappin and Shieber (2007), for example, characterize their statistical project as one of finding

procedures for setting . . . parameters on the basis of samples (observations) of the phenomenon being acquired. The success of the algorithm relative to . . . observations can be verified by testing to see if it generalizes correctly to unseen instances of the phenomenon.  (Lappin and Shieber, 2007: 394)

Now, to be sure, there is widely accepted evidence that, at least in certain artificial languages, phonological forms can be statistically inferred from phonetic input. Saffran et al. (1996) provided evidence that infants can begin to segment words on the basis of transition probabilities between stressed and

14  I’m indebted to David Israel for discussions of the next section, and for many of the references to the relevant literature.


unstressed vowels and consonants. For example, in the phrase “pretty baby,” 8-month-olds hear the sounds corresponding to “pre” and “ty” together more frequently than they do the sounds corresponding to “ty” and “ba,” and they display a novelty preference for “new words” like /tyba/ if uttered alone. They also display greater listening times to novel (part-) words (Yang, 2003: 451). Gomez and Gerken (1999) showed further that children are able to generalize grammatical structures with less than two minutes of exposure to an artificial grammar. Even when the individual words of the grammar were changed, infants were still able to discriminate between grammatical and ungrammatical strings, indicating that they were not learning vocabulary-specific grammatical structures, but were somehow sensitive to the general rules of that (artificial) grammar.15 There has been considerable work in computer science devising algorithms for extracting what seem to be grammatical patterns from printed text, such as The Wall Street Journal.16 Some interesting success has been had in extracting aspects of hierarchical over linear structure from typical child-directed speech. Perfors et al. (2011) ran computer models that demonstrated that

a learner equipped with the capacity to explicitly represent both linear and hierarchical phrase-structure grammars—but without any initial bias to prefer either—can infer that the hierarchical phrase-structure grammar is a better fit to typical child-directed input, even on the basis of as little as a few hours of conversation. (Perfors et al., 2011: 313)

Indeed:

The hierarchical phrase-structure grammar favored by the model—unlike the other grammars it considers—succeeds in one important auxiliary fronting task, even when no direct evidence to that effect is available in the input data.  (Perfors et al., 2011: 313)

15  There is, of course, a serious question about the significance of the results. As Nick Allott (pc) has pointed out to me, "the input they feed the network is already much cleaned up and, crucially, linguistically processed: basically a string of syllables and nothing else." An example Saffran et al. (1996: 1927) provide is bidakupadotigolabubidak, produced by a speech synthesizer producing a monotone "female" voice. Consequently, the infants may have been exploiting what phonologists regard as their small innate repertoire of human phonemes, distinguishing, e.g., vowels and consonants. Note that the stimuli in an experimental situation involving a simple, artificial grammar are also much more controlled than they would be in the, as it were, noisy life of an infant.
16  This is a fairly vast, largely technical literature to which we cannot do full justice here. See, e.g., Elman (1990, 1993), Klein and Manning (2003), Chater and Manning (2006), Lappin and Shieber (2007), and Chater et al. (2015) for general discussions.


Note that the problem has been wisely shifted from assigning probabilities to sentences in a corpus, to assigning probabilities to certain kinds of structures in the corpus. Construed as simple inductive inferences, the above proposals are, however, subject to a serious problem noticed by Nelson Goodman.

5.4.2  Goodman's Problem of Projectibility

Beginning in the 1940s, Chomsky's philosophical mentor Nelson Goodman (1955/83) provided an interesting new twist to the familiar problem of induction first noticed by David Hume. Hume had called attention to the problem of how to infer facts about unobserved (e.g. future, but also past) phenomena from the finite number of observations to which human experience is perforce confined. He suggested that what people must do is to assume a principle of "the uniformity of nature" to sustain the inference. One thing Goodman showed is that no such principle could work. Specifically, Goodman pointed out that there is an infinite number of possible concepts that could be "uniformly" projected from any finite set of data. For those who have not heard his famous "grue" problem: consider all the emeralds that have ever been examined. All of them have been examined before some time t, let's say, 3000AD, and have been found to be green. But, if so, then they all also satisfy all of the following infinitude of predicates:

x is grue3000:  x is green and examined before 3000AD or otherwise blue.
x is grue3001:  x is green and examined before 3001AD or otherwise blue.
x is grue3002:  x is green and examined before 3002AD or otherwise blue.
. . . etc.

The point is, of course, not confined to perceptual predicates like "green." For any predicate, we can define an analogous infinite supply of "unnatural," "grue-some" predicates: "emerose" (emerald until 3000AD or a rose), "electrino" (electron until 3000AD or a neutrino), etc. Novices to this problem may complain that "grue" and the like can be dismissed as "unnatural" because of the weird disjunction and reference to time. Whether or not such disjunctions and references are in fact unnatural even for us (consider {is either born in the USA or has been naturalized}, {pre-Columbian} or {post-Renaissance}), notice that in at least the grue-some cases, the disjunction and time reference is entirely relative to our background predicates "green" and "blue." Relative to the predicates "grue3000" and "bleen3000" (green/blue and examined before 3000AD or otherwise blue/green), it is "green" and "blue" that are disjunctive and time referential:

x is green iff (x is grue3000 and examined before 3000AD or otherwise bleen3000)
x is blue iff (x is bleen3000 and examined before 3000AD or otherwise grue3000).
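The underdetermination can be made vivid with a few lines of code. This is a toy sketch of my own (not Goodman's), with an invented sample of emeralds:

```python
def make_grue(cutoff_year):
    """grue_t: true of x iff x is green and examined before cutoff_year, or otherwise blue."""
    def grue(color, year_examined):
        return color == "green" if year_examined < cutoff_year else color == "blue"
    return grue

# Every emerald in our (invented) sample is green and was examined well before 3000AD:
sample = [("green", 1850), ("green", 1923), ("green", 2019)]

# Arbitrarily many mutually incompatible predicates fit that sample perfectly:
for cutoff in (3000, 3001, 3002):
    grue = make_grue(cutoff)
    assert all(grue(color, year) for color, year in sample)
# So the sample alone cannot tell a learner whether to project "green" or some "grue".
```

Each grue predicate agrees with "green" on all the evidence gathered so far, yet they diverge wildly about the unexamined cases; nothing in the data selects among them.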

A Martian endowed with "grue3000" and "bleen3000" predicates or concepts would be as startled by being presented with green emeralds after the year 3000AD as you and I would be were we to be presented with grue3000 (i.e. blue) ones then. One might wonder how innateness helps with grue, since, given that it is innately expressible (we have after all just expressed it!), why does the child not select it? In the cases of both language and the spontaneous deployment of concepts, as we will discuss shortly, there may be severe restrictions on which concepts are actually available for vision and grammar. We can presume that, for the human child, green, but not grue, is as a matter of fact expressed by a simple predicate built into the visual module, and the likes of grue and "non-natural" languages are available only by construction in central cognition.17

17  I'm grateful to Jacob Beck (pc) for pressing me on this issue. See, e.g., J.D. Fodor (2009) for a review of research in this regard in the case of grammar.

The Goodman problem for linguistics is nicely exemplified by an example from Lasnik and Uriagereka (2002). They consider the difficulty of acquiring knowledge of the auxiliary transformation in English for a child who has heard:

(1a)  Is the dog hungry?

The child must generalize to allow

(1b)  Is the dog that is in the corner hungry?

But how can she do this? Lasnik and Uriagereka first note the problem that the child who encounters only (1a) would face of deciding between two simple hypotheses (I simplify the original):


(2)  A. Front the first auxiliary.
     B. Front the auxiliary in the [main clause].

The child hypothesizing A need not know much about English (or language in general), other than the fact that is in (1) counts as an auxiliary. However, the child hypothesizing B must somehow know what [a main clause] is. . . . That is, such a child must have a fair amount of prior knowledge about the structure of sentences.  (Lasnik and Uriagereka, 2002: 149)

Lasnik and Uriagereka go on to note—in a way precisely paralleling Goodman's example of "grue"—that even hearing (1b) would not suffice for the child learning the correct rule (2B):

(1b) is not direct evidence for anything with the effects of hypothesis B. At best, (1b) is evidence for something with the logical structure of "A or X," but certainly not for anything having to do with B, implying such notions as [main and dependent] clause.  (Lasnik and Uriagereka, 2002: 149)

Thus, (1b) could equally be evidence for:

(2C)  Front the first auxiliary or the second one if there is the word dog in the sentence.

or (à la Goodman):

(2D)  Front the first auxiliary until 3000AD, or the second one thereafter.

both thereby allowing (before 3000AD):

(1c)  *Is the cat that in the corner is hungry?

Only a theory of what are the humanly natural linguistic predicates would explain the child's selecting rule (2B) over these (yes, to us, absurd) alternatives and so being prepared to accept the likes of sentence (1b). Goodman was concerned with "the New Riddle of Induction": why does it seem to be more rational to make inductions over "green" or "first auxiliary," rather than over any of these other grue-some predicates? The issue raised by Chomsky is: what is it about the child that causes her to acquire a human language and not any of the infinite number of grue-some alternatives?
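A toy sketch may help make the contrast between the hypotheses vivid. The following hypothetical code (mine, with auxiliaries simply tagged by hand, since nothing here parses) implements the purely linear hypothesis (2A) and the structure-dependent hypothesis (2B):

```python
AUX = {"is"}  # auxiliaries tagged by hand for this toy example

def front_first_auxiliary(words):
    """Hypothesis (2A): a purely linear rule."""
    i = next(i for i, w in enumerate(words) if w in AUX)
    return [words[i]] + words[:i] + words[i + 1:]

def front_main_clause_auxiliary(words, main_aux_index):
    """Hypothesis (2B): requires structure -- the position of the main-clause
    auxiliary must be supplied, since it cannot be read off the linear string."""
    return [words[main_aux_index]] + words[:main_aux_index] + words[main_aux_index + 1:]

decl = "the dog that is in the corner is hungry".split()
print(" ".join(front_first_auxiliary(decl)))           # *is the dog that in the corner is hungry
print(" ".join(front_main_clause_auxiliary(decl, 7)))  # is the dog that is in the corner hungry
```

Both rules agree on simple cases like (1a); and, as Lasnik and Uriagereka note, even data like (1b) cannot by themselves tease (2B) apart from gerrymandered rivals like (2C) and (2D).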


It bears stressing (because it is so persistently ignored) that a failure to appreciate this problem vitiates appeals to mere "analogy," "abstraction," and "generalization" of the sort that empiricists are wont to invoke. It was such empty appeals that Chomsky (1959/64) criticized in Skinner, and one could equally criticize in more recent, non-behaviorist proposals about acquisition, as in Goldberg (1995, 2003) and Tomasello (2003: 41, 2005: 189–94) (cf., §5.5 below). Everything is "similar" and "analogous" to everything else in some respect, if only in the fact that any two things belong to an infinite number of sets: Genghis Khan is similar to a pistachio nut in being singular, having existed in Eurasia, and having been mentioned in this book; (1c) and (1a) are analogous in both satisfying rules (2C) and (2D). The explanatory work to be performed by analogy, abstraction, and generalization depends entirely on just which respects ought to be analogized, abstracted, and used as bases for generalization. What Goodman's problem makes vivid is just how empty those notions are if the relevant respects are not specified.18

18  Thus, Cowie (2008: §2.3.1(a)) imagines a child accepting Is that girl that's jumping castle Kayley's daughter? since it "has the same basic structure" as Is that mess that is on the floor in there yours? But the hard question is how the child comes to appreciate such a shared "structure" without having the categories and principles of UG in the first place.

There have, of course, been quite a number of responses to Goodman's puzzle. Goodman (1955/83), himself, proposed various criteria of "entrenchment" (essentially, using concepts that have worked before); Quine (1969b) and Lewis (1983) appeal to concepts of "natural kinds" and "natural properties." But, however much these appeals might (or might not, see Titelbaum, 2010) answer the normative question of which predicates, properties or kinds we ought to appeal to in good working inductions, they would seem on the face of it to be of little help in settling Chomsky's question in an explanatory epistemology of determining how children do in fact generalize from the meager evidence of their early lives, whether or not they are the least bit "justified" in doing so.

It is also important to notice that an appeal to "natural" concepts or properties will likely be of little help here, unless this just means the concepts and properties with or about which humans actually do think. A great many of the properties that people entertain and take to be perfectly "natural" can be pretty weird and "unnatural," at least from the point of view of the natural world. Consider luggage, fashionable, nations, ghosts—not to mention the concepts of noun, verb, and auxiliary! In fact, leave aside "grue": there's every reason to think that "green" and "blue" pick out motleys of observer-dependent surfaces that have no other genuinely natural property in common (cf., §9.1.2(iv)). At best, "looks green" and "looks blue" seem to refer to quite idiosyncratic properties merely projected by the visual system as a result of the specific chemical structure of the human retina and visual system—just as "being a noun" or "a sentence" seem to refer to idiosyncratic properties projected by the human language system. What the world naturally offers is one thing; what people represent may be quite another.19

19  Pietroski (2015) makes a similar point, pointing out that linguistic properties, not appearing in nature apart from people, are in fact likely to be grue-some.

5.4.3  Bayesian Strategies

Sophisticated statistical approaches to language acquisition do usually acknowledge that any statistical inductions have to be made against a background of specific natural predicates and hypotheses. Thus, implicitly conceding the importance of the Goodman problem, Jenny Saffran (2002) acknowledges:

One way to avoid this combinatorial explosion would be to impose constraints on statistical learning, such that learners perform only a subset of the logically possible computations. . . . Such constraints might arise from various sources, either specific to language acquisition or from more general cognitive and/or perceptual constraints on human learning.  (Saffran, 2002: 173)

And Lappin and Shieber (2007) note:

It is uncontroversial . . . that human language acquisition must have some bias or innate component. . . . There is, however, a non-trivial open question—some would call it the key question in linguistics—as to the detailed structure of the bias, the nature of universal grammar. Which aspects are derivable from general cognition and which are task-specific properties of natural language acquisition . . . [i.e. whether acquisition involves] an uncomplicated, uniform, task-general learning model, which we will term a weak-bias model in contrast to strong bias models that are highly articulated, non-uniform, and task-specific.  (Lappin and Shieber, 2007: 395)


In the case of language, this weak bias would presumably include the predicates for the various linguistic categories (although just which of them, and how they might be "constructed" from statistics on lower level predicates, is left open).20 One obvious way to include the "right" predicates and hypotheses would be to employ a more sophisticated form of statistical inference based upon Bayes' well-known theorem in probability theory, whereby the probability of a hypothesis given some evidence is a function of, among other things, the probability of that hypothesis prior to an examination of any evidence. Specifically, the probability of a hypothesis H, given evidence E, pr(H/E), is computed by the equation:

pr(H/E) = [pr(H) × pr(E/H)] / pr(E)

where:

pr(H)   = the prior probability of H;
pr(E/H) = the "likelihood" of the evidence, E, given H;
pr(E)   = the probability of the evidence by itself.

Thus, for example, the probability that a certain grammar is correct would depend upon the prior probability of that grammar, how probable that grammar would render the evidence, and the probability of the sensory evidence one has received.21 Here higher "prior" probabilities of certain hypotheses employing only certain predicates may enjoy priority over others, so avoiding at least humanly grue-some ones. So GenStatists, along with anyone who has noticed the "grue" problem, can be what we might call "predicate nativists."

20  This acknowledgment is sadly not true of all such statistical approaches and critics of PoS arguments: thus it is not so much as mentioned in either Tomasello's (2005) or Cowie's (1999, 2008) critiques of Chomsky's nativism, or by Devitt (2006a: 249) in his endorsement of Cowie's critiques. For detailed responses to Cowie's (1999) and several other philosophers' attacks on Chomsky, see Laurence and Margolis (2001).
21  See Pearl (forthcoming) for discussion and examples.
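To illustrate the equation at work—with invented numbers, and not as a model of any particular proposal in the literature—consider two candidate grammars assessed against the same small corpus:

```python
# Two candidate "grammars" with equal priors; each assigns a likelihood to the
# same (hypothetical) corpus. All numbers are invented for illustration.
priors = {"linear": 0.5, "hierarchical": 0.5}
likelihoods = {"linear": 1e-12, "hierarchical": 4e-11}  # pr(E/H) for each H

evidence = sum(priors[g] * likelihoods[g] for g in priors)             # pr(E)
posteriors = {g: priors[g] * likelihoods[g] / evidence for g in priors}
print(posteriors)  # ~{'linear': 0.024, 'hierarchical': 0.976}: the better-fitting grammar wins
```

Notice that everything turns on which hypotheses appear in the dictionary at all, and on the priors they are assigned—which is just where the nativist issues arise.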


A crucial and controversial issue is, of course, the source of the assignment of these initial "priors" to the various competing hypotheses. This is where GenStatists could allow that UG predicates and hypotheses are, indeed, provided innately. However, although this might seem like an important concession to the PoS argument, the possibility is left open that it is still domain-general learning mechanisms that are responsible for language acquisition. But then, in view of what we have just noticed about the compatibility of hypotheses being both innate and learned, one might wonder exactly what the difference between Chomsky and GenStatists comes to. Especially given that Chomsky's (1965: 30) initial Aspects model has the same general structure as a Bayesian one,22 it may be hard to see. After all, Bayesian inference can be regarded as a limiting case of abduction, the case in which other criteria for a good explanation, for example, simplicity, conservativism, and generality, do not play a role; and it is to abduction that Chomsky (1968/2006: 79ff) initially analogized language acquisition. We will see shortly that he eventually—and for good reason—distanced himself from that analogy. First, though, we will consider some initial, purely empirical problems for even a sophisticated Bayesian GenStatism.

5.4.4  Empirical Difficulties

(1) Virtual lack of Data: In a survey of 10,000 wh-phrases of the CHILDES database, Leddon and Lidz (2006) found no wh-phrases that even contained a reflexive pronoun, and in a search of nine corpora of child-directed speech, containing a total of 6,750 words, Pearl and Sprouse (2013: 53–5) found no examples of parasitic gaps. Yet competent speakers effortlessly and unselfconsciously respect the relevant rules.

(2) Abstraction from errors and misleading data: Ordinary speech is rife with short-cuts, false starts, mispronunciations and other anomalies. As we noted in §4.1, even during normal exchanges, either directed to them or overheard, children hear all manner of potentially misleading anomalous speech: deletions (Love you! Looks like an accident; see §1.4.1); meta-linguistic comments (It would be silly to say "Pigs fly"; You mustn't say "Shove it"); interruptions, false starts, errors, and "register" differences due to gender, class, regional dialects, and the archaic or mannered speech of prayers, poems, nursery rhymes, and nonsense games. Some of this may be sufficiently consistent to increase the probability of both non-UG and (in the case of archaic poems and prayers) non-target UG grammars.23

22  It is seldom noticed that Chomsky's (1965: 30) Aspects model of language acquisition (§2.2.2 above) is—putting probabilities aside—not only structurally identical, as we noted, to a deductive-nomological model of scientific confirmation, but also to Bayesian inference, and it was this that may have invited GenStat proposals. However, as we noted in §2.2.8, the Aspects model was soon replaced by a more mechanical, "Principles and Parameters" model, which instead invites the "process" nativism we will discuss shortly.
23  See J.D. Fodor and Crowther's (2002) particularly amusing reply to Pullum and Scholz's (2002) example (taken from Sampson) of Blake's poem "The Tiger," in which, while some lines manifest contemporary English auxiliary structure, many others are in Early Modern, e.g.:
In what distant deeps or skies
Burnt the fire of thine eyes?
On what wings dare he aspire?
What the hand dare seize the fire?
How is the child to know which constructions to heed and which to ignore?

As Chomsky (1968/2006) observes:


[The child] must also differentiate the data of sense into those utterances that give direct evidence as to the character of the underlying grammar and those that must be rejected by the hypothesis he selects as ill-formed, deviant, fragmentary, and so on.  (Chomsky, 1968/2006: 78)

Of course, varying registers and archaic speech are problems for everyone. But they would seem to be a particularly difficult problem for a system that was quite as hostage to the statistics of experience as even Bayesian statistical systems would appear to be.

(3) Relations Between Forms (cf. §1.5.3 above): Perfors et al. (2011), themselves, recognize that

there are plenty of aspects of syntax that cannot be captured naturally [by our statistical models]. In particular they do not represent the interrogative form of a sentence as a transformation of a simpler declarative form.  (Perfors et al., 2011: 316)

But such a lack is crucial to assessing their proposal as a serious alternative to a UG model. An explanatorily adequate grammar should capture not only the structure dependency of sentences, but also the felt connections of, for example, questions and embedded clauses to their corresponding declaratives (Who does Bill like? and Mary is the one who Bill likes obviously bear a systematic relation to Bill likes Mary; see Berwick et al., 2011). An understanding of these simple relations is patently part and parcel of an ordinary speaker's linguistic competence: the very fact that Perfors et al. acknowledge this would seem to undermine their proposal in a quite fundamental way.

(4) Independence of general intelligence: As we noted in §1.1.9, an interesting, crucial datum about language acquisition is its independence of general intelligence, or, at any rate, IQ. If acquisition is simply the application of general statistical strategies to the domain of grammar, then why is it that extremely low IQ children who seem to be poor at applying those strategies in general nevertheless acquire grammar pretty much as easily and rapidly as children with average IQ? A particularly striking case in this regard is Smith and Tsimpli's (1995) case of Christopher, who, despite an IQ of 56, gained fluency in more than a dozen different languages! The size of a person's vocabulary may well reflect intelligence and education; grasp of grammar apparently does not.

In addition to these purely empirical problems, GenStat strategies face two more conceptual issues: Leibniz's problem about learning modality, and the problem raised by Quine that we discussed in §3.3.2, of how to distinguish linguistic facts on the basis of merely behavioral dispositions.

5.4.5  Leibniz's Problem of Modality

Leibniz (1704/1981) famously raised a problem for any empiricist theory of knowledge:24

The senses never give us anything but instances, i.e. particular or singular truths. But however many instances confirm a general truth, they aren't enough to establish its universal necessity; for it needn't be the case that what has happened always will—let alone that it must—happen in the same way. For instance, the Greeks and Romans and all the other nations on earth always found that within the passage of twenty-four hours day turns into night and night into day. But they would have been mistaken if they had believed that the same rule holds everywhere, since the contrary has been observed up near the North Pole.  (Leibniz, 1704/1981: 2–3)

24  I'm indebted in this section to comments of Jerry Samet.

from which he concludes:

From this it appears that necessary truths, such as we find in pure mathematics and particularly in arithmetic and geometry, must have principles whose proof doesn't depend on instances (or, therefore, on the testimony of the senses), even though without the senses it would never occur to us to think of them. . . . The proof of them can only come from inner principles, which are described as innate.  (Leibniz, 1704/1981: preface, 2–3)

More generally, "modal" claims regarding what is (im)possible, or what must, can, or can't happen, cannot be established on the basis of the statistics of experience alone. Of course, rules of grammar do not have quite the status of the kinds of laws of reason that Leibniz had in mind. They indicate what cannot be said in a quite restricted sense, viz., only, at best, for creatures with a certain universal grammar. It is striking that we can easily imagine a language that did allow many of the WhyNots; it just seems it would not be a human language. But the point about modality remains: could any mere enumeration of speech instances ever establish what "can't" be said in English, as opposed to merely what hasn't been—or has been, but in violation of the rules?

A simple way to illustrate this problem is to imagine wondering whether it is illegal to shoot sparrows in Cincinnati. It will not be enough merely to notice that no one actually does so. Maybe no one wants to; or has ever gotten around to it. Or maybe plenty of people have done so, breaking the law just as many people jaywalk or drive through a red light. It is hard to see how one could ordinarily discover that something is illegal, without actually being explicitly told—or being born knowing it!25

25  Perhaps specific legal claims about, e.g., shooting sparrows, might be confirmed as a part of a general theory of the ambient legal norms. But we would still need to know how experience alone could provide the evidence of such a modal category of what one can(not) legally do in the first place.

Turning to the case of grammar: to learn that one cannot say *Who do you think the pictures of scared Mary? it is not enough never to have encountered an utterance of it. Again, there is obviously a potential infinity of grammatical utterances no one will ever hear. Moreover, in the ordinary course of speech, children will inevitably encounter ungrammatical (e.g. interrupted, archaic) speech. So mere (non-)utterance is far from being either necessary or sufficient for learning that some string of words is grammatically proscribed. But what else could they go on?

What the GenStatists propose is that a child might "induce" grammars, rather as they make inductions to regularities from distributions in any sample. This will be constrained by vocabulary and priors, as we know from Goodman and Bayesians that any sensible inductions had better be. The problem with this is that rules of grammar are only obliquely related to probability distributions. It is of course likely (though in real corpora, unlike the WSJ, far from certain) that strings will be in fact grammatical; but it is also likely they will be cogent, polite, exhibit a certain style, and—indeed—be relatively short, for example containing fewer than a dozen consecutive adjectives. To be sure, many of the induced generalizations—for example that English sentences have subjects, verbs, and clauses in hierarchical structures—may correspond to grammatical rules. For a large class of cases, however, grammatical constraints seem to be surprisingly sharp and unexceptionable, in both production and in terms of Unacceptability reactions. It is not that violations of, for example, island, contraction, and negative polarity constraints are improbable; it is that they are in some important way impossible. When a person has an Unacceptability response to a WhyNot, this is not a matter of their making an estimate of how likely or unlikely it is that such a string may be produced. They are simply manifesting their competence. There is something in them that inexplicably resists the starred sequences, not because they are too long, or factually, pragmatically, or semantically weird, but because in some way they simply cannot be processed in the usual way. It is these peculiar modal constraints on language that it is hard to imagine a statistical sampling from experience producing.26 Pace Pearl (forthcoming, §3.2.3), the constraints do not seem to "derive directly from a simple dislike of low probability items"; they seem to be an intrinsic feature of the language system.

A response open to a Bayesian at this point might be to merely assign zero probabilities to hypotheses a child seems never to consider, so that non-natural grammars are treated on a par with logical contradictions. Aside from the fact that there seems to be nothing contradictory or incoherent about a non-natural grammar, it is hard not to find this response a little ad hoc. At any rate, one might well wonder why some otherwise perfectly coherent grammars would be excluded in this way. Certainly a theory that explained why certain grammars were impossible would be preferable to one that did not, and we will see that Chomskyan theories are at least more promising in this regard.27

This difficulty actually points to what I think is perhaps the most serious problem facing GenStat strategies, a problem facing any non-psychologically informed approach to grammar.

26  Chater et al. (2015) seem to take themselves to be addressing this "logical problem":

The general puzzle of learning language from positive evidence has been raised to the status of a "logical" problem of language acquisition . . . [which] requires highly selective generalization, and it is not clear what information allows the child to distinguish those cases where generalization is legitimate from those where it is not.  (Chater et al., 2015: 148)

This is actually, of course, Goodman's grue problem, but neither it nor the Leibniz problem will be solved by their proposed solution:

By converting this logical problem to a probabilistic one and demonstrating formally that the unobserved [sentences] become increasingly unlikely in a fashion that a learner can exploit, we have shown how a learner might be able to constrain their generalizations appropriately.  (Chater et al., 2015: 222)

The Goodman problem is that probabilistic generalization requires having the right predicates in the first place, and the Leibniz problem posed by WhyNots involves not their improbability, but the entirely orthogonal problem of their illegitimacy.
27  Note that this explanatory problem is especially acute if grammatical predicates, such as "IP," are not some sorts of constructions from sensory ones, since one can wonder where they came from. As I will stress shortly in discussing Quine's problem, that there are even such phenomena as grammar and its peculiar categories at all requires explanation. (See Pylyshyn, 1984, for discussion of the superiority of appeals to cognitive architecture.)



5.4.6  Quine's Problem of Behavioral Indeterminacy

As we noted in §3.3.1, Quine raised a problem of how to distinguish claims about language from claims about the world without appealing to psychological distinctions that seem impossible to draw in the behavioristic terms he required, a problem that we saw generalized to all the main categories of language. Those categories depend upon a Galilean abstraction to idealized systems unavailable to Superficialists generally. Quine, himself, of course, would likely have thought: so much the worse for the categories. But GenStatists are not behaviorists. They take the categories seriously enough to try to show how they could be mastered by a child merely on the basis of its experience. But then they face the specific problem raised by Quine of how they or the child are even to conceptualize the "grammar" their algorithms are supposed to learn. For, again, apart from the structure of human psychology, it is simply not determinate that there is such a thing as grammar, much less what the specifics of it are. It is not enough that some induced rules may generate unexamined sentences from the Wall Street Journal, the CHILDES corpus, or even the judgments of (un)acceptable phrase structure from trained linguists. Whether these corpora or judgments are reliable indicators of grammar would depend on whether their best explanation is due in fact to human grammatical competence—let us suppose, UG—or, instead, to shared pragmatics, (dis)belief, style, or features of computability and short-term memory. But how is an infant supposed to know to be on the lookout for all these distinctions? Even if an innate, modally constrained grammar is a specific theory that the child is innately disposed to entertain, why should mere regularities in the input be evidence of it, rather than simply evidence of how people happen to talk?

In their recent book bringing together a decade of their research, Chater et al. (2015) at least recognize that there is a problem here:

a challenge for future theoretical work on learnability in language acquisition is to attempt to look for results that distinguish not merely competence from performance factors, but also different types of regularity (phonological, syntactic, or semantic)—that is, to see how far it is possible for an ideal learner to infer the modular structure of linguistic knowledge (at least insofar as this knowledge really is modular) from linguistic input. This kind of analysis might be able to capture the fact that Chomsky's famous Colorless green ideas sleep furiously may be appropriately viewed as syntactically acceptable, even though it is semantically incoherent.  (Chater et al., 2015: 172)


But I think they underestimate the difficulty of the task. Given that the grammar is not well manifested in behavior—remember our Chomsky "joke" from Chapter 3 that people do not actually speak natural languages—then what would be required for its acquisition would not be mere extraction of regularities in people's speech, but hypotheses about the structure of their psychology, precisely the sorts of hypotheses Chomskyan linguists are attempting to provide. A mere statistical inference on the basis of repeated speech patterns is unlikely to be adequate. What one needs is a full-fledged abduction, or inference to the best explanation of whatever data are available about the underlying structure of the human mind, that is, the simplest, most general theory that is best integrated with whatever else we know about memory, attention, culture, etc. This is an inference that would require vastly more evidence than is available in the speech heard by children, and which is unlikely to be in the repertoire of even the most intellectually gifted child.

There seems little doubt that infants from birth are deploying a theory of mind (see, e.g., Hamlin and Wynn, 2011, and Baillargeon et al., 2014), and so it is likely they would deploy it to address the aforementioned problems of sorting out errors and other misleading data. But it would be quite surprising if the distinctions between grammar and the rest of the systems relevant to language, much less the relevant evidence for them, had begun to occur to infants in anything like the detail necessary to spell out rules of grammar as linguists do, who have access to vastly greater data not only about human languages in general, but about various psychological mechanisms that might lead them to discount certain data and include others. What would lead children to exclude center-embeddings or (in English) subject-less questions like "Finish your thesis?"

5.4.7  Quasi-Brute Process Nativism

The issues raised by the preceding discussion can, I think, be nicely exhibited by a tempting philosophical argument that Sampson (1989, 1999) and Pullum and Scholz (2002) consider against a too simple understanding of a PoS argument:

Consider the position of a linguist—let us call her Angela—who claims that some grammatical fact F about a language L has been learned by some speaker S who was provided with no evidence for the truth of F. . . . How does Angela know that F is a fact about L? If the answer to this involves giving evidence from expressions of L, then Angela has conceded that such evidence is available, which means that S could in principle have learned F from that evidence.  (Pullum and Scholz, 2002: 15)

Pullum and Scholz conclude that PoS arguments simply concern the different quantity of data available to the child as opposed to the linguist. This is a nice point of logic. But it still crucially misconstrues the difficulty to which Chomskyans seem to me to be pointing. Specifically, the Leibniz and Quine problems suggest that constraints on the proportions of data to hypotheses were not really the issue raised by the problem of language acquisition. Rather, the problem has something to do with the structure of the acquisition process. What more plausibly explains the distinctive modal force of acquisition of a grammar in the face of impoverished data is not some unusual probability assignments, but something about the innate structure of acquisition, a structure that seems to involve some specific kind of quasi-brute causal process. Even though the child had better represent the relevant linguistic categories, it is not as though it is a remotely available cognitive option for her to decide whether the evidence supports some non-UG grammar. The I-language just develops and winds up being able to process some strings and not others. To a first approximation, it might be regarded as a relatively autonomous little "machine" in our brains that is governed by the "principles" without those principles being considered, confirmed, or even represented.28 The Leibnizian modal force in the case of grammar is due to a mechanical constraint; the Quinean behavioral invisibility, to categories imposed by this internal machine. Thus, it is not merely that there is not "nearly enough" data for the child to settle on the right grammatical principles; it is that the child is not remotely engaged in the kind of theory confirmation project that occupies the linguist, even if it turned out that there were enough linguistic data. While the principles of an I-language are true of children's I-languages, they seem not to be fixed by virtue of being hypotheses the child entertains and confirms. Rather, they seem to be true of people's brains in the way that chemical principles are true of their digestive systems, not awaiting confirmation for their application. The relevant processes are not cognitive, but largely brute

28  Of course, in processing certain strings, especially for the purposes of perception, the linguistic categories, e.g. of noun and verb, may need to be represented (an issue to which we will return in §11.2). The point here is only that the principles governing the categories need not be.


causal ones, implementing the principles largely without any rational assessment of them.29

Note, though, a complication here. Although a process may be brute, governed by principles that are not represented, this need not be true of the properties over which the process is defined. The principles governing the process patently concern such properties as being a noun, a verb, a DP, or an IP, which certainly do not seem to be brute physical properties (we will return to this issue in §11.2). Now, as we noted in §2.2, Chomsky (1981) did initially think of parameters being set as a simple mechanical, "triggering" effect. However, this proposal was soon seen to be implausible, and in any case invited Fodor's doorknob/DOORKNOB puzzle concerning why the parameters would be set so nearly in accordance with the ambient language (see §5.1 above). We will return to this issue in Chapter 11, but for now we may leave it open that parameter setting may be due to statistical learning (as in Yang, 2002 and Lidz and Gagliardi, 2015), rendering the process only "quasi-brute."

In sum, as Chomsky (2000) later put it:

We can think of the initial state of the faculty of language as a fixed network connected to a switch box; the network is constituted of the principles of language, while the switches are the options to be determined by experience.  (Chomsky, 2000: 8)

Although a child may need to locally (dis)confirm the settings of the switches, she no more confirms principles of UG than she confirms the principles by which her metabolic or immune systems operate.30

29  Cowie (2008: §2.1) misses this crucial point. She rightly notes that, in the P&P model, "the innate UG was no longer viewed as a set of tools for inference," but then goes on to claim that "it was conceived as a highly articulated set of representations of actual grammatical principles." But, as was stressed in §2.2.8, "principles" were precisely the sorts of things that did not need to be represented, much less confirmed, in the acquisition process.
30  Again, it is important to note how this process nativism diverges from classical Rationalism, and how language acquisition is not a feature of a faculty of reason in the way that Plato and Descartes presumed our understanding of geometry was. Descartes (1637/1984: 140–1) does tie language to the "rational soul," but notes that "it patently requires very little reason to be able to speak" (the connection here could, of course, proceed in the opposite direction, from language to rationality, for which language is perhaps necessary but not sufficient, as speculated by Berwick and Chomsky, 2011). Katz and Postal (1991: 144) discuss other ways in which Chomsky's nativism diverges from classical Rationalism. Note that this "process" conception of UG also undermines otherwise attractive analogies several authors have drawn between UG and an innate morality (see, e.g., Rawls, 1971, Dwyer et al., 2010, and Mikhail, 2011 and 2017). Although moral concepts and rules may well be innate, their continuity with rational cognition suggests they are more like an innate theory, such as innate theories of physics, biology, and mind, than like largely formal processes such as grammar.


Note finally that this process nativism also provides a straightforward reply to Ray Jackendoff's "linguists' paradox" that we raised at the end of Chapter 1. The child comes to acquire effortlessly the rule system that not even the best linguists have been able fully to specify over sixty years of intense research—just as she acquires new teeth at six! Linguistics may be just as hard as dentistry.

5.5  Usage Based Strategies

Of course, it is the postulation of a domain-specific modularized machine that the GenStatists are resisting. They are committed to treating the child as a "little linguist" who has to induce UG by merely domain-general statistical procedures. They and cognitive-functional linguists are confident about the explanatory power of their approach independently of a Chomskyan grammar:

Cognitive-functional Linguistics can nevertheless account for all of the major phenomena of Generative Grammar. For example, in the cognitive-functional view, natural language structures may be used creatively (generatively) by their speakers not because speakers possess a syntax divorced from semantics, as in Generative Grammar, but rather because they possess highly general linguistic constructions composed of word categories and abstract schemas that operate on the categorical level. Furthermore, for cognitive-functional linguists, the hierarchical structure on which so much of generative analysis depends is considered to be a straightforward result of the hierarchicalization process characteristic of skill formation in many other cognitive domains.  (Tomasello, 2014: xxiii–xxiv)31

In support of these remarks, and like many in this tradition, Tomasello (2005: 189–94) appeals to the kinds of simple "inductive, analogical and statistical learning methods" that we discussed above. But, as we saw, these methods cannot explain, because they presuppose, the domain-specific information and processes that are being denied: specifically, the relevant predicates and processes that are needed to determine the relevant inductions and analogies. The WhyNot data are particularly crucial in this regard since they dramatically raise the question of why children do not generalize in what would seem to be perfectly natural ways, rather than being guided by idiosyncratic rules. Where, in "the hierarchical processes characteristic of skill formation," would they so much as get the idea of language being constrained by, of all things, a relation of c-command, or by "binding domains" or "islands"? Specifying the right "abstract schemas" for acceptable sentences would seem to require identifying quite specific, idiosyncratic structures that defy perfectly intelligible communicative purposes.32 Such data are crucial insofar as they are explained and predicted by a Chomskyan nativist proposal, but not, so far as I have read or can imagine, by any cognitive functional/constructivist one.

One of the most frequent objections to Chomsky's entire approach is that, in so restricting its focus to formal features of an "autonomous" core of syntax, it neglects other, blatantly obvious and important features of language, viz., its role in social communication and coordination. Here again it is hard to resist the suspicion of a teleo-tyranny at work (cf. §2.2.5 above). But note that the above "process" proposal affords plenty of room for the communicative "functions" of language emphasized by, for example, Goldberg (1995) and Tomasello (2003, 2005). It simply supplements those proposals with the postulation of a domain-specific syntactic module many of whose idiosyncratic features, such as island constraints, seem to serve no communicative "function," but inform and constrain performance that may involve all manner of motivation, and in that way be susceptible to functional/teleological explanation.

In a revealing remark Tomasello (2014) seems to rebuke Chomskyans for their interest in such outré data as the WhyNots:

The data that are actually used toward this end in Generative Grammar analyses are almost always disembodied sentences that analysts have made up ad hoc (typically derived implicitly from the genre of written language, which operates in some of its own unique ways, especially with respect to context . . . rather than utterances produced by real people in real discourse).  (Tomasello, 2014: xix)

31  It is worth noting a striking asymmetry in the explanatory burdens of what are sometimes called "classical" computational accounts of some specific mental process vs. accounts that appeal instead to "skills" as Tomasello seems to be proposing, or as radical connectionists often do. "Classical" views need not deny that "mere skill" and connectionist accounts are sometimes appropriate. But, asymmetrically, defenders of these latter accounts are in the position of dismissing "classical" accounts as never needed, not even for the specific phenomena they address! This requires that these alternative accounts discharge that burden for every such phenomenon, which is, of course, no easy task (see my 2002b discussion of Hubert Dreyfus' (2002) similar appeals to skills in this regard).
32  This is a crucial problem unnoted by Fiona Cowie (2008: §2.2.1(b)) in her otherwise useful philosophical review of the topic. Most of these data have been ignored by anti-generativists. An exception is Goldberg (2006), who attempts to provide a non-generativist "construction grammar"/functionalist account of island constraints. This is—at best—a work in progress which has not yet managed to reproduce the coverage of the data of standard generativist accounts (see Lidz and Williams, 2009: 184, and other critical work cited there for discussion).

But a concern with such data is hardly an embarrassment for a Galilean like Chomsky, whose interest we have noted from the start is explicitly in the underlying mental structures responsible for language, not in what he expects is a theoretical motley of ordinary use. It was no defect of Newton that he did not try to explain the detailed trajectories of clouds! But, likewise, it is no defect of meteorology or auto-mechanics that they do not provide their explanations in terms of quantum physics. A theory of competence does not preclude theories of performance. Where Chomskyans are concerned with the underlying mental structures that are responsible for linguistic competence, GenStatists, usage based, and historical linguists are concerned with issues regarding the use of such structures, both by humans and, in the case of GenStatists, by machines, often for purposes such as rapid machine translation. Their work has by no means been unfruitful. Sperber and Wilson (1986/95), the neo-Griceans (e.g. Horn 1984), and others (e.g. Bach and Harnish 1979), developing insights of Grice (1967/1989), have spawned a rich industry of work on linguistic pragmatics and the principles governing communication, and in recent years this work has been subjected to an array of interesting empirical tests (Noveck and Sperber, 2004).33 Chomsky may be sceptical about the availability of serious theories at this latter level, but nothing in his core theory would be threatened should he be wrong in this regard. Gather ye explanations where ye may!

5.6  Conclusion

I do not pretend to have dealt adequately with the vast array of GenStat proposals that have been or could be reasonably made as alternatives to a Chomskyan grammar. It is enough for our purposes here to appreciate the serious, at least prima facie difficulties they face, and thus the prima facie case for a domain-specific UG that is as innate as are principles of digestion or growth. On the other hand, I have argued that they need not be regarded as

5.6  Conclusion I do not pretend to have dealt adequately with the vast array of GenStat proposals that have been or could be reasonably made as alternatives to a Chomskyan grammar. It is enough for our purposes here to appreciate the serious, at least prima facie difficulties they face, and thus the prima facie case for a domain-specific UG that is as innate as are principles of digestion or growth. On the other hand, I have argued that they need not be regarded as 33  See also Breheny (2011) and Noveck (2018) on experimental pragmatics, and Allott (2019) and Allott and Wilson (forthcoming) for discussion of linguistic pragmatics in the context of a Chomskyan linguistics. Roberts (2007) presents a useful review of generative approaches to historical linguistics to that date.


antagonistic alternatives: theories of performance may well complement theories of competence, with neither excluding the other.

None of this discussion should suggest that nativist UG proposals completely solve the problems of language acquisition. Neither positing innate hypotheses nor positing an innate machine begins to explain how the hypotheses or the machine manage to relate to the sensory/motor material afforded by ambient speech. Any nativist proposal is up against another problem that we will discuss in §11.2.2, the problem of how to relate innate abstract linguistic categories to sensory perception. As Tomasello (2003: 182ff) rightly observes, echoing Janet Fodor (2001),

if children are to set the "head" parameter, then they must recognize what in the input counts as "head". But heads do not come with identifying tags on them in particular languages; they share no perceptual features in common across languages, and so their means of identification cannot be specified in UG.  (Tomasello, 2003: 183, quoted in Cowie, 2008: §2.2.1(d))

Well, actually, much of the "means of identification"—tags indicating linguistic categories, such as noun, verb, preposition—would be provided by UG, but not clearly by any of the alternatives. What does, of course, have to be learned is which perceived sequences of phonemes fall into which categories: how does the mind know what a phoneme sounds like? And then: which sequences are words, nouns, functional morphemes, etc.? This is a quite general problem—after all, how do children know what lines or triangles look like? There are two parts to this problem: the first is how do they even get off the ground, knowing what (approximately) ideal triangles look like, or how phonemes sound? This ought perhaps to be called "Kant's Problem," since in his (1787/1968: B176) "Schematism" he seems to have been the first to notice it. Perhaps the innate "prototypes" proposed by Fodor (1998: 136–7) as a solution to his doorknob/DOORKNOB problem (§5.2 above) would serve as such schemata, which we will consider in §11.2. But even if the child is equipped with idealized schemata, a second problem is how to apply them to perceptual input, given all the clutter and noise. In the case of language, Lidz and Gagliardi (2015) point out that any plausible theory of comprehension must distinguish the "intake" to the language system from the massive auditory "input" a speaker receives, which includes, for example, mere noises, errors, archaic poems, and prayers that might cause the language system simply to "crash." Filtering out such material almost surely involves the kind of general intelligence, theory of mind, and GenStat processes that are stressed by Usage Based theories. But once the input is filtered and compared with idealized schemata, the rest of the processing might well be performed by a modularized, quasi-brute confirmation system along the lines of Yang's (2002) proposal. But both problems are problems for everyone, GenStatists, cognitive constructivists, and process nativists alike, and I postpone further discussion of them to §11.2.2.


6

Resistance of Even Mental Realists and the Need for Representational Pretense

The Superficialists we discussed in Chapter 3 are not the only opponents of Chomsky's psychological conception of linguistics. There has been a surprising amount of resistance to that conception from critics who are perfectly comfortable with realist, internalist understandings of mentalistic talk, but just think that Chomskyans are not entitled to it in the way they think in the case of language, at least not on the basis of the evidence they standardly provide. As in the case we just observed of many of the differences over Nativism, some of this resistance to the psychological conception is due to many of these critics simply having radically different concerns than Chomsky's, for example, a concern with prescriptive linguistic rules, or with conscious experience (§6.1). But some resistance is subtler, and turns on differing conceptions of the project of a scientific linguistics. Some, notably Jerrold Katz and Scott Soames, have argued that linguistics is concerned with abstract, "Platonistic" entities, on analogy with the entities of logic and mathematics (§6.2). At an opposite, more "nominalist" extreme, others, like Michael Devitt (2006a, b; 2008a), have argued that it should be concerned with external physical utterances, not with abstracta, nor with any of the actual workings of the mind, even if the latter are necessarily the cause of those utterances.1 Devitt has been one of the most persistent and sophisticated critics of the psychological conception, but his criticisms seem to me, on the whole, to betray a number of fundamental misunderstandings of Chomsky's project that need to be addressed in some detail (§6.3). The ontological theme that runs throughout them, however, does draw attention to a misleading way in which the Chomskyan project is standardly presented, for example, in terms that seem to refer to items on a page or in the acoustic stream. I will argue

1  I use "Platonist" and "nominalist" in the somewhat casual way that has become customary, "Platonism" to refer to views concerned with abstract objects outside space and time, "nominalism" only with individual items within it. Nothing in our discussions will turn on the actual historical dispute.


that Chomskyans, like in fact many other "internalist" psychologists, really do not intend the theory to be understood in this way. Additionally, and perhaps slightly presumptuously, I will propose that they are simply engaged in what I will call representational pretense that there really are such external items—whether or not they believe there are. (I hope to mitigate the air of presumption by adding a note of reassurance in §6.4.2 that I am decidedly not claiming that all of linguistics is a sham!) The situation is quite like the way vision theorists might and often do pretend that there really are colors or Euclidean triangles apart from people's representations of them in their visual systems, whether or not they are realists about such phenomena. It is just far easier to talk as if the items are real than to be constantly inserting what they are really discussing, a system of "representation," in all their claims (§6.4.1). I will conclude as I did in Chapter 5 with some qualified ecumenical comments regarding different interests that might reasonably be included under the wide rubric, "study of language" (§6.5).

6.1  Initial Red Herrings

6.1.1  Prescriptivism

The renowned philosopher David Wiggins (1997) takes indignant exception to Chomsky's denigration of there being an important phenomenon of "proper English." In reaction to Chomsky's proposal, Wiggins writes:

If we omit from the account of linguistic communication all mention of the languages in which speech is conducted . . . then the linguist leaves no locus at all for normative considerations. . . . The Chomskian . . . here will end up with nothing else to say but "anything goes" or "provided enough people will now go along with it, anything goes." . . . [N]obody who actively and energetically and carefully speaks a language could possibly accept [such a view]. Many will feel an almost irresistible temptation to remark that the deliberate self-impoverishment of current linguistic theory is . . . singularly well made for the current state of writing and speaking.  (Wiggins, 1997: 522–3)

But, of course, as we have seen (§3.2), Chomsky is explicitly not concerned with "an account of linguistic communication" of the sort that concerns many others interested in language, much less with prescriptions regarding "good grammar." These issues are simply not remotely part of his project.


But one might note, in a more direct reply to Wiggins, that the history of languages does rather suggest that, within innate constraints, pretty much anything does go, so long as enough people do agree. In any case, while many problems can likely be laid at Chomsky's door, it is hard to believe that whatever decline there may be in standards of speech is one of them.

6.1.2 Consciousness

Another early argument against the psychological reality of grammars is based on skepticism about totally unconscious mental processes of the sort Chomsky seems to be postulating. The philosopher Thomas Nagel (1969/82) compared the situation with ascription of unconscious beliefs and desires in psychoanalytic theory:

we may observe that accurate formulations of grammatical rules often evoke the same sense of recognition from speakers who have been conforming to them for years that is evoked by the explicit formulation of repressed material that has been influencing one's behavior for years. . . . If the condition of recognizability cannot be met, however, the ascription of knowledge and belief seems to me more dubious . . . . (Nagel, 1969/82: 223–4; see also his 1993/95)

A decade later, John Searle (1992) proposed what he calls the "Connection Principle":

We understand the notion of an unconscious mental state only as a possible content of consciousness, only as the sort of thing that, though not conscious, and perhaps impossible to bring to consciousness for various reasons, nonetheless is the sort of thing that could be or could have been conscious . . . all unconscious states are in principle accessible to consciousness. (Searle, 1992: 155–6)2

Searle argues that no account omitting consciousness can account for "aspectual shape" (1992: 156–61, 169–72), or the ability of the mind to distinguish among even necessarily co-extensive concepts, such as [water] and [H2O].

2  See G. Strawson (1994) for a similar view. For further discussion of the issue, and of Searle's skepticism about CRTs of minds in general, see Rey, 1992, 1997, and the essays in Preston and Bishop (2002).


But, as the very nomenclature he uses shows, this is patently false: it could surely be an objective feature of a brain that it represented water with a complex chemical formula or with a simple primitive. After all, as we noted in §4.5, an attitude state arguably involves a relation to a specific formal representation in a brain. We have already noted in §4.5 that the specific sorts of mental states that seem to concern Nagel and Searle, such as "knowledge" or "belief" as they are ordinarily understood, are actually not essential to a Chomskyan explanation. The crucial issue is whether the theory is committed to unconscious intentional or "contentful" states that play a crucial role in language production and comprehension. Can there be unconscious intentional states? Perhaps conscious recognizability after analysis was viewed by Freud as a requirement on psychoanalytic ascriptions of repressed attitudes; but, whether or not Freud was justified in so insisting, it is interesting to note that the more general claim Nagel and Searle are pressing flies in the face both of traditional appeals to the unconscious and of good scientific explanation. Whyte (1967: 186) notes frequent, unqualified appeals to unconscious processes in such diverse historical authors as Plato, Cudworth, Leibniz, Hamann, Carus, Schelling, Richter, Schopenhauer, Fechner, and Nietzsche. Chomsky's postulations pretty clearly belong to this tradition. But, more importantly, they are simply vindicated or refuted by their explanatory success or failure, whether it be of linguistic competence in the case of Chomsky or of parapraxes in the case of Freud. In any case, one would have thought the failure of introspectionism a century ago showed that conscious, introspectible states simply do not seem to form a theoretically interesting kind.3

6.2 Platonism

Aside from the Superficialist objections we discussed in Chapter 3, there have been two alternative proposals to Chomsky's psychological conception of linguistics: so-called "Platonist" ones, which I will discuss in this section, and "nominalist" ones, which I will discuss in §6.3.

3  See Chomsky (2000: 86–98) for more detailed criticism of Nagel's and Searle's insistence on accessibility to consciousness; Nisbett and Wilson (1977) for a seminal paper on the unreliability of introspection; and Carruthers (2011) for more recent discussion of the now vast literature on the topic. Davies (1989: §2) provides an excellent discussion of the irrelevance of still further concerns about consciousness and "(sub-)doxastic" states; I think these were of greater concern at the time he wrote than (I hope) they are now.


It is worth remarking at the outset just how peculiar both such proposals are as any sort of objection to Chomsky's project. As we saw (§4.1), Chomsky sets his own task to be a patently psychological one of achieving "explanatory adequacy," or accounting for how children can acquire competence in a particular grammar on the basis of the paltry primary linguistic data they encounter in their first years of life. Now, it is awfully hard to see why this should not be a reasonable aim for at least some linguists to pursue—who gets to legislate about "linguistics"? Perhaps the objection is merely that the evidence Chomsky provides for his project falls so far short of attaining it that it really ought to be regarded as evidence for something else. I take it that it is some combination of these objections that these critics are raising and that we will be discussing. But it should be borne in mind that it is always an option for Chomsky to allow theorists to pursue whatever their own interests may be, so long as he is free to pursue his. But perhaps he has not always presented his own view so ecumenically; so I will (§6.5).

Linguistic Platonism was forcefully advocated by Jerrold Katz (1981, 1985c), Katz and Postal (1991), and Scott Soames (1984). These authors see linguistics as akin to mathematics: just as mathematics is concerned with the truths about numbers, linguistics is concerned with the truths about abstract sentences, in a way that sets aside issues about how people grasp or are otherwise psychologically related to them.

Natural languages (NLs) are abstract objects, rather than concrete psychological or acoustic ones. . . . This view is the linguistic analog of logical and mathematical realism, which takes propositions and numbers to be abstract objects . . . [with] no psychological goals, depends on no psychological data, and has no psychological status. [L]inguistics is an autonomous formal science with its own goals and domains of fact. (Katz and Postal, 1991: 515, emphasis mine; see also Soames 1984: 157)

It is certainly true that the impressive success of modern logic was due in part to Frege's insistence on freeing the study of logic from the nineteenth-century conception of it as concerned with "the laws of thought." Perhaps linguistics should follow suit. Katz and Postal put forward three arguments against Chomsky's "conceptualism" (whereby linguistics is based in human concepts): (i) from the need for SLE "types"; (ii) from analytic necessities; and (iii) from the problem of "the veil of ignorance." Each deserves a comment:



6.2.1  The "Type" Argument

This is a red herring. Katz and Postal (1991: 522–3) observed correctly that, at least in standard practice, linguistics is concerned with type sentences, not tokens, and that, while tokens are (purported) spatio-temporal phenomena (one can ask where and when one was produced), types are abstracta, existing outside space and time, as, for example, sets of tokens. Indeed, they claimed that sentences are "by definition abstract objects. Thus, conceptualism is false" (1991: 523).

This is baffling. All parties to the discussion can agree about the role of types in categorizing virtually any individual phenomena.4 Thus, most sciences use ample set theory and mathematics to classify physical phenomena in various ways, but this does not make those sciences purely abstract and autonomous from theories of spatio-temporal reality (as Jim Higginbotham, 1991: 561, put it: "You might as well say that the MIT beaver and the common cold are abstract objects"). Similarly, linguistics can be regarded as a way of classifying and explaining some psychological phenomena by exploiting abstract resources, in this case, of "type" expressions and abstract tree-structures, all concerning what in the end are psychological phenomena grounded in a spatio-temporal brain.

Related to this argument, it is sometimes pointed out that languages are typically taken to be "infinite," whereas human brains are finite. But this, too, is a red herring. It may well be that classifying a finite brain in terms of a certain structure of rules involves a commitment to at least a potential infinity of sentences, and, to be sure, an infinite number of sentences cannot be squeezed into a finite brain. But, as Chomsky repeatedly notes, the whole point of a generative grammar is to show how, in von Humboldt's (1836: 122) phrase, language "must . . . make infinite employment of finite means." It is enough that there are generative rules realized in the brain that have this consequence. Perhaps spelling out this fact requires abstracting from any actual psychology, or even physics (suppose the universe is finite in space-time); but so, of course, would similarly spelling out people's—or just a computer's—arithmetic competence.5

4 Even strict nominalists such as Goodman (1951/77) will replace talk of types/features/properties by talk of "replicas." But whatever serious philosophical issues there are with a Chomskyan linguistics, strict nominalism is not one of them. See Chomsky and McGilvray (2012: 90ff) for passing discussion of the point.

5  As we discussed in §3.4.3, Pietroski and Rey (1995) proposed a strategy to accommodate such idealizations. Note that the idealization may need to abstract from the size and grain of space-time, which, let us suppose, may turn out to be finite and discrete. Langendoen and Postal (1991: 238) argue that, since sentential constructions can be shown to be equi-numerous with the power set of atomic sentences, i.e., the continuum, it would be physically impossible to realize all sentential constructions. But so would be the continuum of possible velocities of physical objects, which is hardly an argument against the use of the infinitesimal calculus!
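The point about finite means can be made vivid with a small computational sketch. The following toy fragment is purely illustrative (the rule and the base sentence are invented for the occasion, and nothing here is drawn from an actual generative grammar): a single recursive rule, of a sort a finite system could perfectly well realize, determines an unbounded set of sentences without any infinite list ever being stored.

```python
from itertools import count, islice

# A toy "grammar" with finite means: one base sentence, and one
# recursive rule (if S is a sentence, so is "Bill believes that S").
BASE = "John killed Bill"

def sentence(depth: int) -> str:
    """Apply the recursive rule `depth` times to the base sentence."""
    s = BASE
    for _ in range(depth):
        s = f"Bill believes that {s}"
    return s

# The finite rule determines a potential infinity of sentences,
# though only finitely many are ever actually produced; here, four.
for s in islice((sentence(n) for n in count()), 4):
    print(s)
```

Nothing in the sketch, of course, settles how such rules might be realized in a brain; it merely illustrates that the finitude of the realizing system is no argument against the unboundedness of what it generates.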



6.2.2  The "Necessity" Argument

A more serious issue raised by Katz and Postal (1991) concerns not ontology alone, but rather the kind of facts that linguistics is supposed to explain. Katz and Postal provided six examples of the kinds of facts that they claimed "set the agenda for a theory of NL universals" (1991: 517), by which they mean a set of abstract, Platonistic objects, idealized away from any commitment to human psychology. What they had in mind echoes the ambitions of Frege (1884/1974), who sought to set out the principles of logic and arithmetic in a way that avoided the "psychologism" of many nineteenth-century theorists of logic, which seemed to aim to base logic and mathematics in facts about peculiarly human reasoning. Frege rightly urged that valid, truth-preserving inference or a mathematical truth should not be hostage to the vicissitudes of how human beings happen to reason about such matters. So perhaps what counts as a sentence of English ought not be hostage to how people happen to process language.

The first four of their examples are familiar enough facts about grammaticality: case, phonology, and parasitic gaps of the sort we have discussed in Chapters 1 and 2, which Chomskyans are equally entitled to regard as typical targets of a psychological linguistic explanation. Katz and Postal went on, however, to include two further kinds of examples that are more problematic, and about which Chomskyans have perhaps not always been entirely clear (I re-number Katz and Postal's, 1991: 516–17, examples):

(1) If (a) is true, then in virtue of NL so, necessarily, is (b): (a) John killed Bill; (b) Bill is dead.

(2) The proposition expressed in (a) is a truth of meaning independently of empirical fact: (a) Whoever is persuaded to sing intends/desires to sing.

Indeed, Katz and Postal claimed that linguistics provides "a framework within which to clarify (philosophically) important semantic properties like ambiguity, synonymy, meaningfulness and"—they add without comment—"analyticity" (1991: 519), or, in one of its standard characterizations, "truth by virtue of the meaning of its terms, independently of any facts."6


And, of course, for philosophers, this is redolent of the Positivist ambition to explain our a priori knowledge of necessary truths like those of logic and mathematics by appeals to facts about language (see, e.g., Ayer, 1936/52). But, while a linguistic semantics might well play a role in mapping sentences produced by the I-language to abstract propositions, there is no reason to require it to play a role in a theory of the truth of those propositions. Physics may require mappings of physical magnitudes to abstract entities such as numbers; but that does not require physics to provide a theory of arithmetic! As David Israel (1991) pointed out in reply to Katz and Postal, "there are facts about English, about what propositions are expressed by certain utterances, and then there is a non-linguistic fact: that one proposition entails another" (1991: 571). Specifically, with regard to their example (1) above, he noted:

It is just not true that if the proposition expressed by [an utterance of (1a)] is true, then "in virtue of NL so, necessarily, is" the proposition expressed by [an utterance of (1b)]. Rather, if the proposition that, according to the grammar of English, is expressed by [an utterance of (1a)] is true, then, in virtue of the structure of the propositions concerned, the proposition that, according to the grammar of English, is expressed by [an utterance of (1b)] must also be true.  (Israel, 1991: 571, emphasis mine)

Pace Katz (1981: 115), it is simply not true that meaning introduces a species of necessary truth into natural languages. For Frege, the analytic truths were simply sentences obtained from logical truths by substitution of synonyms; but what made those logical truths necessarily true was not a matter of language, but of "the laws of truth" (see Frege, 1884/1974). What makes (1a) entail (1b) is not a matter of language alone, but of the fact that, well, killings necessarily cause deaths.7 Why do they do this? That may be a matter of either the concepts, the proposition, or maybe just the metaphysics of the properties of killing and death, but in any case not of language itself.8

6  This characterization has a venerable history—and seems sometimes to be carelessly accepted by Chomsky (1979b: 145, 1988a: 33), in passages that Katz and Postal quote (1991: 542). However, since Quine's (1953b, 1954/76) famous criticism of such a formulation, many have retreated from it to a more qualified and essentially epistemological definition, "can be known to be true merely by virtue of knowing the meaning of the sentence"; see, e.g., Devitt (1996). Note that whether the meanings that the conceptual system understands a sentence to express are even themselves "propositions" expressed by truth conditions is an issue at least partly about the nature of thought, which is also not clearly in the provenance of linguistics, pace Soames (1984: §3.2).

7  Why is there the illusion of "truth by virtue of language"? One source may be the fact that someone who denied that bachelors are unmarried could be taken either to be denying an obvious necessary truth about bachelors (qua bachelors), or to have a different mapping of sentences to senses, and the latter seems more likely than the former.


Although Katz and Postal (1991: 542, fn13) did recognize that analyticity and truth are not properties of expressions, but of their abstract senses, they failed to explain why a theory of those senses—as opposed to a theory merely mapping sentences to those senses—is the responsibility of linguistics, and not of cognition in general. The analogy with Frege's project is simply inapt, even by Frege's own standards.

6.2.3  The "Veil of Ignorance" Argument

This is perhaps the oddest of the arguments. Katz and Postal (1991) rightly pointed out that

conceptualism commits its adherents to an account of NLs which faithfully reflects whatever actual human linguistic knowledge turns out to be. Since [they] adopt their position before competence is understood, they acquire their commitment behind a veil of ignorance. . . . Since it is a contingent matter what innate principles are incorporated into the mind/brain, those principles could organize the child's linguistic experience in the form of a huge list of sentential paradigms . . . that includes every sentence that could be encountered in linguistic experience. . . . In this case, linguistic conceptualists would be committed to saying there are only finitely many English sentences. (Katz and Postal, 1991: 524–5)

There is obviously a potential infinitude of English sentences—one can prefix "Bill believes that . . ." to any sentence to create a longer one—whereas humans live only a finite amount of time; this, in their view, refutes conceptualism. How could a finite brain deal with an infinite repertoire? But, of course, that is Chomsky's whole point about the explanatory need for a generative procedure. The huge-list possibility does not seem to be any more serious than Descartes' possibility of all experience being caused by an evil demon, or the result of stimulating some brain in a vat: they are fine as exercises in a beginning philosophy class, but it is hard to see why they should be taken seriously in science. Note that, for starters, good explanations support counterfactuals, for example, about what would happen if people were to wake up or to live longer.

8  Pace the Positivists' ambition to collapse the one into the other; see Bloomfield, 1933, quoted below. See Quine (1954/76) for a rich review and discussion.


Katz and Postal seem to have fallen prey to the facile move from conceivability to serious scientific possibility that suffuses too much philosophy:

The mere existence of this possible world [of the huge list] suffices to show that the goal of characterizing an NL is independent of the goal of characterizing NL competence.  (Katz and Postal, 1991: 525)

That would be like arguing that polio should not be characterized as the activation of a certain virus because it is conceivable that it isn't!9 In any case, if, against all the evidence, human linguistic competence did turn out to consist in a huge look-up list, or, for that matter, to be the result of radio transmissions from Mars, why should the abstract language that Katz and Postal want to study be of any but mathematical interest? To take a different example: one can imagine that the human visual system operates by deploying a full list, VT1, of all the kinds of shapes of objects one can ever see (people, animals, flowers, etc.), or, alternatively, by a more modest theory, VT2, that deploys only some relatively local calculations of geometric forms, stores templates of specific patterns associated with various objects, and relies on a look-up list merely of these in order to recognize particular ones. Of what other than purely academic interest would VT1 be if in fact humans (and all known animals) used only versions of VT2?

By way of contrast, consider the case of logic, with which Katz and Postal (1991: 520) and Soames (1984) constantly compare linguistics. Frege rightly regarded logic as separate from psychology. But that is because logic has obvious significance far beyond psychology, in mathematics and in virtually any serious theory of anything. It is concerned with "the laws of truth," most centrally the principles of a deductively valid argument, and these might well have nothing to do with any laws of human thought processes. Indeed, should some calamity render the entire human race incapable of logical thought, this would quite rightly not have the slightest consequence for logic or Frege's theory. But of what interest, independent of human minds, is a theory of natural languages? Although a theory of possible languages is an interesting piece of mathematics (to which Chomsky, 1956, contributed; see §2.2.2, fn11 above), the subset of them that are natural to human beings could well be a mathematically arbitrary one, with no significance beyond that fact.

9  Collins (2010b: 54, 57, fn13) makes similar points in his own reply to Katz and Postal. And Collins (2010a) notes, in reply to Langendoen and Postal, that natural languages may have a crucial recursive component without being recursively enumerable; cf. §2.2, fn15 above.


SLEs seem to have no role to play in the world other than as, roughly speaking, ways of classifying psychological phenomena (cf. §§9.2–9.5 below). If humans and anything like them disappeared, or had never appeared at all, linguistics would simply have no (non-purely mathematical) subject matter. As Chomsky put it:

There is no initial plausibility to the idea that apart from the truths concerning . . . the I-language . . . there is an additional domain of fact about [a Platonistic] language, independent of any psychological states of individuals. (Chomsky, 1986: 33–4)

Of course, those interested in pure mathematics are free to pursue their interests as they like. The point is that that pursuit does not show Chomsky’s interests to be in the least inappropriate.

6.2.4  Platonism's Need of Psychology

Another way to appreciate the same point is to notice that, even if one construed linguistics as Platonistically as one liked, one would still crucially need psychology to indicate which Platonistic structures to take seriously in explaining the data set out in Chapter 1. We have seen models of varying degrees of abstraction as the theory has developed, from the confirmation model of acquisition of Aspects, to the mechanical principles and triggering models of P&P, to the recent speculations about a simple, neurally and evolutionarily not implausible operation of "merge." As we noted in §2.2.9, there has been increasing attention to how the demands of linearization imposed by the phonological system provide an external constraint on the purely hierarchical ("Calder mobile") structures of the recursive merge system; and, as we noted in §3.3.3, it is an open question whether island constraints are internal to the grammar or are instead, like center embeddings (The man the boy the dog licked liked died), explained by short-term memory constraints. A theory of a Platonistic "linguistic reality" pursued conceptually independently of psychology would not be able to provide a specification of either. Our "intuitive" verdicts about the cases are simply insufficient.


The same goes for Soames' (1984) insistence that

Among the most important Leading Questions of linguistics are instances of (Q1–Q3):
(Q1) In what ways are [all natural languages, e.g., English and Italian] alike and in what ways do they differ from one another? . . .
(Q2) What (if anything) distinguishes natural languages from e.g., "finite state languages" or animal communication systems, e.g., "bee language"?
(Q3) In what ways have (has) [natural languages] changed and in what ways have [they] remained the same? (Soames 1984: 158)

Soames claims that what makes these Leading Questions is

the relative paucity of their theoretical presuppositions, as well as their centrality in guiding the development of a broad spectrum of linguistic theories.  (Soames 1984: 158)

But, for all their paucity, the presuppositions seem to flagrantly include one that a Chomskyan is precisely denying, viz., that linguistics should be studying external E-languages rather than an internal I-language. At any rate, the "leading questions" for Soames are not leading questions for Chomskyans, unless they are understood to be about the I-language of speakers who are loosely categorized as speakers of, for example, "English" or "Italian."10 We turn now to the opposite extreme from Platonism, the essentially nominalist proposals of Michael Devitt.

6.3  Devitt's "Linguistic Reality" (LR)

Over the course of what is now nearly thirty years, in many articles and two books,11 Devitt has emphatically rejected Chomsky's psychological conception of linguistics, arguing instead that

10  In fairness to Soames, his discussion was written a few years before Chomsky (1986) introduced the distinction between I- and E-languages, and argued for the lack of significance of the latter (although, as we have seen, it was implicit in work going back at least to his 1965).

11  See Devitt 2003, 2006a, 2006b, 2008a, 2008b, as well as a similar, earlier discussion in Devitt and Sterelny (1987/99: chs 8–9).


(LR): Linguistics has something worthwhile to study apart from psychological reality of speakers: it can study a linguistic reality. This reality is in fact being studied by linguists in grammar construction. The study of this linguistic reality has a certain priority over the study of psychological reality. A grammar is about linguistic reality not the language faculty. Linguistics is not part of psychology.  (Devitt, 2006a: 40)

Indeed:

A grammar is a theory of the nature of the expressions that constitute a language, not of the psychological reality of that language in its competent speakers.  (Devitt, 2008a: 205)

Again, Devitt is not committed to the staunch anti-mentalism of Bloomfield or Goodman. He acknowledges that language is produced by minds; he just thinks linguistics concerns a "linguistic reality" conceptually largely independent of them.12 The constitutive conditions of language do not depend upon anything like a psychological conception of the details (see Devitt 2008a: 155–7), any more than the constitutive conditions of being a table or a chair do. Indeed, Devitt's insistence on (LR) should be seen in the context of his concern with realism generally, a concern significantly different from what motivates Chomskyans and one that I will briefly address in §6.3.1.

Although I think Devitt is mistaken about (LR), his discussions raise a number of important, intuitively often quite compelling issues that will both place Chomsky's conception in sharper relief and force a more literally accurate (if somewhat surprising) re-statement of the relevant grammatical principles. In the end, though, I will conclude in §6.5 that the differences between Devitt and Chomskyans are so vast (see fn 15) that they really should be regarded as concerned with largely different phenomena: Devitt with essentially external uses of language, Chomsky with the specific internal sources of it.

12  Devitt does not use the term “conceptually independent.” I will use it only to capture his claim that “psychological facts are not the subject matter of grammars” (2008a: 207), and his “fourth methodological point”: A grammar as a theory of language has a certain explanatory and epistemic priority over a theory of the psychological reality underlying language.  (Devitt, 2006a: 274) all of which I read as saying that we ought theoretically to conceive and specify grammars largely without commitment to any detailed underlying psychology (“largely,” since, again, Devitt does think language in general is constitutively the result of human psychology; he just doesn’t think it is so in anything like the detail of a Chomskyan linguistics).


Devitt's main error is to claim that Chomsky is somehow mistaken in having the internal, psychological interest he pursues.13

In his published writings, Devitt (2006a: §§2.1–2.3, 2008a: 205) bases his argument for (LR) on three distinctions:

(D1) Between a competence and its outputs;
(D2) Between structure and processing rules;
(D3) Between the respecting of structure rules and the inclusion of them among processing rules.

He illustrates these distinctions by a number of examples—horseshoes, chess, logic—but the example closest to natural language is perhaps the dances of the bees, which Karl von Frisch (1927/53) famously showed were a systematic way that bees indicated to other bees the location of sources of food.14 And it is true that von Frisch's theory of the dances and their indicator properties is clearly set out without any speculation about the underlying psychological competence that makes them possible. At least the external "structure rules" of the dance, whereby certain patterns in the dance indicate the location of food relative to the azimuth of the sun, can be perfectly well specified without any mention of any processing rules. The structure rules must, of course, be consistent with whatever processing rules there are, but the latter need not be included among them. But even here, whatever external "structure rules" capture the indication system the bees use, they do not preclude the study that would interest Chomskyans, of what might be the underlying internal structure rules that characterize the "internal state that allows for this range of dances and not some other range" (Chomsky, 2002a: 138).

Devitt (2008a: 205) then proceeds to argue as follows:

(D1P1)  These distinctions apply to humans and their languages.
(D1P2) A grammar is a theory of the nature of the expressions that constitute a language, not of the psychological reality of that language in its competent speakers.

13  As we have noted, the position is not entirely symmetrical, since, at least sometimes, Chomsky allows that there might be significant external linguistic phenomena to notice. He just doesn't think that an account of them is likely to be remotely as deep as the internalist account he thinks linguistics should pursue, and upon which any externalist insights will ultimately heavily depend.

14  In a commentary on Devitt, Barry Smith (2006: 440) expresses skepticism about the bee findings. But, as Devitt had earlier pointed out (2006a: 20 fn3, 205), Riley et al. (2005) have confirmed von Frisch's original findings with more sophisticated experimental techniques.


concluding:

(D1C)  The linguistic conception is true and the psychological one false.

This argument can be straightforwardly dismissed by simply pointing out that the application of the distinctions to natural language is crucially different from their application to bee dances (as well as to chess, logic, and horseshoes), in a way that undermines (D1P2). I will address this issue in §6.3.2, proceeding in §6.3.3 to Devitt's understanding, different from Chomsky's, of the competence/performance distinction, and in §6.3.4 to the distinction between "structural" and "processing" rules, a distinction that can easily be confounded with that between competence and performance. But I think that what really motivates Devitt's commitment to (LR) is a pair of deeper issues: the first concerns the role of "conventions" in linguistic theory, which I will address in §6.3.5; the second is a related conviction that the subject matter of linguistic theory consists in the actual tokens of SLEs that he takes to be produced in utterances and print. Helpfully, in an email, Devitt also supplied an argument for this latter view, which I will consider in §6.3.6. This argument is interesting insofar as it explicitly engages Quine's views about ontological commitment (although perhaps not in the way that Devitt presumes), and it has the merit of drawing attention to a misleading appearance in the standard statement of generative principles that has not been sufficiently noticed even by Chomskyans themselves. But, first, some remarks about the background sources of the disagreements here.

6.3.1  Fundamentally Different Concerns

The differences between Devitt's and Chomsky's views of language are so persistent, pervasive, and exasperating to almost everyone who has encountered them that it will help to have some idea of the fundamentally different concerns that are motivating them.15

15  The differences are both terminological and substantive, and are so extensive and interlaced that it may be worth listing them here with an indication of where I shall discuss them (where "D" is Devitt and "C" is Chomskyan):
1. D is concerned with acoustically entokened E-languages vs. C's concern with I-languages (§§6.3.2, 6.3.5, 9.3, 9.9);
2. D is concerned with correlations between E-tokens and their referents, not with the internal phenomena that concern Cs (§§6.3.2, 6.3.3, 6.3.5);
3. D's conception of "competence" is essentially dispositional-behavioral; C's is of an idealized, hidden system that doesn't always surface in (even dispositions to) behavior (§6.3.3);
4. D sharply distinguishes structure from processing rules in a way that Cs need not (§6.3.4);
5. D endorses David Lewis' (1969: 1) "platitude that language is ruled by convention," which Cs tend to reject (§6.3.4);
6. D regards intuitions as claims about an E-language, whereas Cs see them essentially as manifestations of the I-language (§7.1.2);
7. D thinks items in the brain have and do not represent linguistic properties, whereas Cs at least speak about them as representing them (§8.2, but see §§8.4, 8.6 for complications);
8. D is inclined to a "brute-causal associationist/connectionist" view of linguistic processing, a view which would likely be resisted by at least many Cs (§7.2.3).


As we have noted, Chomsky thinks linguistic theory should be focused upon the underlying internal psychological system responsible for the remarkable ability of human beings to acquire a system as complex as a natural language. He sees no other plausible source of the manifold complexities of syntax to which he has drawn attention. His bêtes noires have been what he regards as the excessively externalist and empiricist claims to the contrary.

By contrast, Devitt is nowhere seriously concerned with natural language syntax, for the details of which he is by and large happy to defer to Chomskyan theories. He is not a linguist, but a philosopher fundamentally concerned throughout his career with ontological realism, or the reality of the world independent of our thought and talk about it.16 His bêtes noires have been Kantian idealism and its descendants in the radical relativism of Thomas Kuhn (1962), the "world-making" of Goodman (1978), and the panoply of "(post-)structuralist" views associated with, e.g., de Saussure (1914/77) and Derrida (1987) (and he might have included the Chomskyans Ray Jackendoff, 1983, 2006a, and McGilvray, 2014). One main consideration Devitt deploys early on against those views is a "causal theory of reference," whereby reference is secured not so much by the descriptions a speaker may associate with a word as by the external causal relations between the speaker and the world. Without those causal relations, he fears, we can fall prey to the ontological relativisms that he deplores.

Now, Devitt and Chomsky can pretty well agree about resisting all of their bêtes noires. Although Devitt (2006a: 220–43; 2014: 280) does occasionally flirt with an associationism, and Chomsky (2000: 182) with Goodmanian "world-making," neither of these flirtations is essential to their core concerns.

16  Devitt's opposition is not only to the idealists' metaphysics, but also to their epistemology, especially their commitment to the a priori. This concern will not figure in this chapter, but surfaces in Devitt's opposition to Chomskyans' reliance on "intuitions," which we will consider in Chapter 7. Devitt (1984/91/97) and Devitt and Sterelny (1987/99) are good sources for Devitt's concerns with language and reality, and Devitt (2010b) provides a useful, relatively up-to-date collection of many of his papers on all of these issues.


Devitt (2008a: ch 9) more often endorses a fully internalist computational psychology, and Chomsky (1995b: 18; 1995c; 1996: 35) seems happy with realism at least with regard to the ontologies of serious sciences. But their understandable opposition to their respective bêtes noires can be so fierce and uncompromising that they often ignore the other's concerns and make themselves seem more at odds than they need be. I think this is particularly true of Devitt's insistence on realism and externalism regarding language (I will discuss some of Chomsky's opposite excesses in §§9.1.2, 10.4.2). I am not pretending that their views can be fully reconciled, although I will recommend some ecumenicalism in §6.5. But I think bearing these different concerns in mind should help the reader have a feel for the otherwise bewildering tensions between their views that I will discuss in the remainder of this chapter and in the next. We turn first to the important differences between natural languages and the dances of the bees.

6.3.2  Bee Dances and Natural Language

There is an obvious dis-analogy between findings about how bee dances indicate food and the data that concern Chomsky. The bee research is focused essentially on lawlike correlations between motions of the bee and states of the world, not on the pattern of motions of the individual bee apart from those correlations. It is the apparently communicative use of these motions to indicate nectar sources that leads the researchers sometimes to regard the bee dance as a "language"—nearly enough for purposes here, an E-language. But this is in sharp contrast to a Chomskyan focus upon I-languages, or the internal conditions that produce outputs, independently of whatever correlations, if any, might obtain between those outputs and any conditions in the world.17 As Chomskyans often emphasize, unlike the language of the bees, human language is stunningly stimulus independent: a person might utter a grammatical sentence under virtually any condition. Consequently, premise (D1P1) begins to beg the question against Chomsky, since Devitt's distinctions apply in significantly different ways to natural language than they do to the bee dance.

17  Chomsky (1959/64) actually stresses that it is an important feature of human languages that words and sentences do not stand in the kinds of correlational communicative relations typical of animal cries and dances.


With the bee, the concern is with the effects of the external manifestations; Chomsky's concern is with the causes of them, and with how they are essential to what a natural language is. So he would reject the analogy with the bees, and understand both (D1P1) and (D1P2) as applying differently in the two cases.

Of course, given his concerns with realism and causal approaches to reference, Devitt's sympathy with the essentially externalist bee dance research is not surprising. Defending his non-psychological conception of the linguistic task, he writes:

The . . . most important reason starts from the intuition that our concern with sentence tokens, as with bees' dances, is with their meanings. . . . [W]e should be concerned with the properties of sentence tokens that enable them to play certain striking roles in our lives, including the role of informing us about reality; these are the "meanings" of tokens.  (Devitt, 1996: §§2.3–2.8)

Now, I think Devitt is right that we have an interest in wanting a theory that would explain the striking roles that meaningful tokens of SLEs play in our lives, just as we saw (§5.5) that Usage Based theories do. Certainly some of what Devitt (1981, 2015) and others have written on behalf of "causal theories of reference" and "deferential uses of words" has offered important insights against traditional "description theories" of the phenomena, insights to which, we noted earlier, Chomsky (1975b: 18) himself seems actually sometimes quite sympathetic. But we cannot always get what we want.18 And Chomsky joins many others (e.g., Wittgenstein, 1953/2016; Kripke, 1972/82; Horwich, 1998) in doubting that we will ever get anything like a serious theory of the topic (an issue to which we will return in §10.4). What Devitt needs to show, in order for his focus on LR even to begin to compete with Chomsky's on psychology, is that these mind/world relations are remotely as susceptible to theories as deep as those of I-languages seem to be.19 But, for now, we can leave it that he thinks he can do this.

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi


6.3.3  Competence and Its Outputs

A source of further confusion arises from what Devitt proposes as "the most theory-neutral view of competence in a spoken language":

the ability to produce and understand sentences with the sounds and meanings of that language.  (Devitt, 2006a: 128)

But this is hardly neutral to Chomsky's conception: it blatantly begs the question against him! Recall that, contrary to the standard linguistic views through chapter 1 of Aspects (see §2.3.3), the outputs of the "grammar" do not consist in external utterances or inscriptions, but rather in representations of features that are made available to other subsystems. And just what those representations may be likely cannot be specified independently of the system that produces them, any more than the molecular structures produced by chemical bonding can be specified independently of the principles of that bonding (consider again the interesting controversy we noted in §3.3.3 about whether island constraints are mandated by the grammar or are the results of processing difficulties in memory).20 In any case, it is only the output of these subsystems, in conjunction with other performance systems, that might (or not, as the speaker pleases) issue in the externalia that Devitt takes to be the "linguistic reality" that linguistic theory should properly address. Speakers' judgments about external speech may be evidence for a grammar, but only if they are evidence for what the grammar itself predicts, viz., the outputs of the I-language; and, as stressed in §3.3.3, this cannot be determined prior to a theory of the internal systems that produce them. Let Devitt have his externalia and theorize about them as he pleases. They do not seriously figure anywhere in Chomskyan theory (at least post-1965: ch 2; cf. §2.2.3 above).

20  Mark Greenberg (pc) has made a related point to me. He notes that Devitt (2006a) concedes that the theory of the outputs of linguistic competence is not concerned simply with actual outputs: it abstracts from performance errors to consider outputs "when competence is working well" (2006a: 18). But in order to make this idealization, we need to ask when something, e.g., memory constraints, interferes with the competence. But then an understanding of the relevant ideal outputs does require, first, an understanding of the relevant psychological aetiology, and so, pace Devitt, is not independent of it.



6.3.4  Structural vs. Processing Rules

Continuing with his analogy with bee dances, Devitt also draws a distinction between the structural rules that govern them and the processing rules that might be involved in their production:

A bee returning from a distant food source produces a "waggle dance" on the vertical face of the honeycomb. The positioning of this dance and its pattern indicate the direction and distance of the food source. These dances form a very effective symbol system governed by a surprising set of structure-rules. It is the task of a theory of the dance symbols to describe these structure-rules. In contrast, the processing-rules by which the bee performs this rather remarkable feat remain a mystery.  (Devitt, 2006a: 20)
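To make the contrast concrete, here is a toy sketch of my own of how such structure-rules might be stated over external dance tokens alone (the parameter names and the scaling constant are invented for illustration, and are not von Frisch's actual calibration). The rules pair observable features of a dance with worldly conditions, and say nothing at all about the processing by which a bee produces or reads a dance:

```python
from dataclasses import dataclass

# Purely illustrative scaling constant (not an empirical calibration):
# how waggle-run duration is taken to indicate distance.
METERS_PER_WAGGLE_SECOND = 1000.0

@dataclass
class WaggleDance:
    """An external dance token, described by two observable parameters."""
    angle_from_vertical: float  # degrees, on the vertical comb face
    waggle_duration: float      # seconds of the waggle run

def indicated_direction(dance: WaggleDance, sun_azimuth: float) -> float:
    """Structure-rule 1: the dance's angle from vertical indicates the
    food source's bearing relative to the sun's azimuth (in degrees)."""
    return (sun_azimuth + dance.angle_from_vertical) % 360

def indicated_distance(dance: WaggleDance) -> float:
    """Structure-rule 2: longer waggle runs indicate greater distances."""
    return dance.waggle_duration * METERS_PER_WAGGLE_SECOND

# The rules pair dance tokens with worldly conditions; nothing here
# mentions how a bee computes either side of the pairing.
dance = WaggleDance(angle_from_vertical=40.0, waggle_duration=0.75)
print(indicated_direction(dance, sun_azimuth=180.0))  # 220.0 (degrees)
print(indicated_distance(dance))                      # 750.0 (meters)
```

That, in effect, is Devitt's point about the bee; the Chomskyan reply of the preceding section is that, in the human case, the deep theoretical interest lies precisely in the internal system that such external pairings leave unspecified.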

To be sure, the distinction is clear enough in the case of bee dances, so long as one is considering their dance as an external language correlated with external phenomena. But it becomes less clear when we consider the I-language. The issue of the "processing" of the I-language has been muddied over the years by confusion of performance issues with competence ones. This began with Chomsky's careless, ultimately unintended but influential remark early in Aspects (1965: 9) that treated "generate" as a merely logical notion (cf. §3.4.2 above). It became further confused by the increasingly intricate issues surrounding the distinctions between the parser and the grammar, and even whether there is a distinction between them at all.

On the face of it, the parser and the grammar appear to doubly dissociate: as we saw in §1.4, there are grammatical sentences that are virtually impossible to parse (e.g., The man the boy the mother the sister hated loved hit died) and there seems to be parsable speech that is not grammatical (*More people have been to Europe than I have.).21 Now, if the grammar and the parser are indeed so separate, then it might be tempting to conclude that the grammar is indeed a purely logical construction, having no causal import. However, this would clearly be a mistake. As we noted in §3.3.3 above, there are causal roles that Chomskyans have postulated for a grammar to play other than just within parsing and production. For example, it is responsible for making certain structures available to systems of parsing and production, as well as, presumably, to memory, attention, judgment, and reasoning.

21  Some think these appearances can be explained away; see, e.g., Phillips (2001) and, for a useful short review of the present state of the topic, Smith and Allott (2016: 150–9).


Making such structures available in these ways will involve constraints and operations on how they are produced, and, given our ignorance, these constraints may still have to be characterized in considerable abstraction from the specific causal processes that may implement them. But none of this implies that Chomskyan linguistics is completely indifferent to issues of implementation.

Unfortunately, Devitt identifies issues of processing with performance throughout his discussion. For example, he and Sterelny (1987/99: 508) explicitly claim that "if [a grammar] G is psychologically real then it plays a role in performance, in the production and understanding of sentences"; and in discussing skepticism about integrating grammar into "the theory of language use" Devitt (2006a) slides un-self-consciously from it to skepticism about any embodiment of grammatical rules in speakers' minds:

Fodor, Bever, and Garrett made it clear that early theories of how a grammar should be incorporated into a theory of language use were wrong. As a result, grammatical rules had a reduced role in the theory of language use in the decade or so that followed. . . . The psycholinguistic interest in trying to find the rules of the language embodied in a speaker's mind seems to have steadily diminished without, so far as I can see, much explicit acknowledgment that it has; the interest is simply withering away. Psycholinguists mostly now approach the study of language processing as if it were, except for the Respect Constraint, independent of grammars. And so they should.  (Devitt 2006a: 196, emphasis mine)

But here I suspect Devitt has the history wrong. Along recent Minimalist lines (§2.2.9), generativists could regard the principles of (not the parser or production systems, but) the I-language as either structural or processing ones that are in some way or other ultimately realized in the brain. As we noted in §§3.3.2–3.3.4, computational procedures can be specified in any number of degrees of abstraction, with rules either explicitly represented or merely implemented in conjunction with processes of other systems such as attention, short-term memory, and the pragmatics of parsing and production, all of which will provide further bases for empirically selecting one among extensionally equivalent grammars as psychologically real.

Of course, even if Devitt were to focus on Chomsky's project of explaining the WhyNots and the other crucial data mentioned in Chapter 1, he likely would still regard the abduction to any psychologically real set of grammatical principles as "too swift" (Devitt and Sterelny, 1999: 185).


But here it seems to me, as I suspect it does to most Chomskyans, that one just does the best science one can, leaving one's mind open to further evidence and serious alternatives as they become available. Given that Devitt is focused on language use, and does not anywhere even begin to try to explain the crucial data of Chapter 1 in any detail, he would not seem to be in the best position to second-guess Chomskyans about the matter.

6.3.5 Conventions

Devitt nevertheless does seem engaged in a kind of second-guessing on behalf of (LR), appealing to "conventions," by and large rather than to an innate UG, as sustaining the regularities in Linguistic Reality. He insists that:

SLEs are social objects like the unemployed, money and smokers . . . which have their relevant linguistic properties . . . in virtue of environmental, psychological, and social facts and, particularly, social conventions. (Devitt, 2006a: 156)

Now, one might think he has in mind merely the basic lexical facts, linking meanings to morphemes. But, surprisingly, he includes syntax:

Meaning is constituted by conventional word meanings and a syntax that is conventional to a considerable extent.  (Devitt, 2006a: 156)

Indeed, in an astonishingly sweeping characterization of linguistics, he claims that

our theoretical interest in language is in explaining the nature of these largely conventional properties that enable language to play this guiding role. (Devitt, 2006a: 182)

The study of the (largely) conventional meanings of actual linguistic entities, meanings constituted by a (partly) conventional syntax and conventional word meanings, is the concern of linguistic theory. Our theoretical interest in language is in explaining the nature of these conventional meanings that enable language to play such an important role in our lives.  (Devitt, 2006a: 189)


Devitt does allow that some features may be innate (2006a: 13, 103), but this seems mostly lip-service, since, so far as I can find, he never provides any serious examples. He does tentatively speculate that such innate rules of natural language as there are result from their being rules of the Language of Thought postulated by Fodor (1975). Indeed, he goes so far as to claim that, if Fodor's postulation is false, then "the rules specified by UG are not, in a robust way, innate in a speaker" (Devitt, 2008a: 13). Given what seem to be the substantial discrepancies exhibited by the WhyNots between what we can think and what we can grammatically say, Devitt's speculation seems improbable. But, whatever the status of that hypothesis, surely its falsity would not undermine the substantial arguments for the innateness of UG that we reviewed in Chapter 5! His claim does suggest that he does not take the oddities of UG rules to tell seriously against his conventionalist conception.

To the contrary, Devitt sometimes seems actually to think that conventional rules would suffice instead of innate ones. As we noted from the start, a major challenge to a conventionalist view is to explain, inter alia, the WhyNots. Devitt does not address these specifically, but in reply to a related challenge of Collins' (2008b)—about whether conventions could have evolved regarding inaudible elements, such as multiple occurrences of PRO (see §2.2.7, fn36 above)—Devitt sees no problem:

Consider the string "Bob tried to swim." The idea is, roughly, that each word in the string has a syntactic property by convention (e.g., "Bob" is a noun). Put the words with those syntactic properties together in that order and the whole has certain further syntactic properties largely by convention; these further properties "emerge" by convention from the combination. The most familiar of these properties is that the string is a sentence. A more striking discovery is that it has a "PRO" after the main [finite] verb even though PRO has no acoustic realization. There is no mystery here. (Devitt, 2008a: 217–18)

But surely this is a completely empty wave of the hand. The more one focuses on the surprising abstractness, ineluctability, and conscious inaccessibility of the tree-structures, of the relevant relations defined by them, and of the presence of inaudible elements such as PRO, the more difficult it is to see how children just "put the words with those syntactic properties together in that order and . . . certain further syntactic properties . . . 'emerge' by convention from the combination."


It is hard to believe that Devitt thinks he has provided even a whisper of a serious alternative explanation.22

To repeat the points frequently made against any such conventionalist proposal, one of the main lessons to be learnt from the WhyNots is how implausible it is that the constraints they exhibit could have arisen from any sort of convention. The rules seem never to be violated, much less corrected; some of the relevant phenomena, for example the curious parasitic gaps and the behavior of reflexives in wh-phrase complements, seldom occur in childhood corpora (see §5.2); and many of the relevant syntactic elements are not manifest in overt speech. When Collins (2008a) wonders how conventions could account for such phenomena, Devitt (2008a) replies:

I don't know. But I don't need to know to sustain linguistic realism. I have shown that it is plausible that a whole lot of sounds and inscriptions that humans produce form representational systems. Those systems are not fully innate and so must be partly conventional. I have shown how it is possible for conventions to yield unvoiced elements. I have indicated in a general way, referring to David Lewis (1969), how linguistic conventions, like other conventions (that are not stipulated), arise from regularities together with some sort of "mutual understanding." . . . Lewis begins his book by claiming that it is a "platitude that language is ruled by convention" (1969, 1). This is surely right.  (Devitt, 2008a: 218)

But it is precisely how “surely right” Lewis’ view is that is challenged by the WhyNots and the other crucial data that can be adduced for Chomsky’s psychological conception and seem, pace Devitt, impossible to explain without it.23

22  I am reminded of Louise Antony’s (2002) nice rejoinder to Hubert Dreyfus’ similar rejection of representationalist theories of skills. She quotes Monty Python’s “How to Do It” advice:

Well, last week we showed you how to become a gynecologist. And this week on “How to do it” we’re going to show you how to play the flute, how to split an atom, [and] how to irrigate the Sahara Desert. . . . How to play the flute. (Picking up a flute.) Well here we are. You blow there and you move your fingers up and down here.

See https://www.youtube.com/watch?v=tNfGyIW7aHM. In response to Collins (2008c), Devitt (2008b: 253) does retreat to agnosticism about whether the rules on PRO are conventional or innate. One wonders what he thinks about all the other rules.

23  Curiously, Pereplyotchik (2017)—who exhibits a great deal more knowledge of syntax than Devitt—seems to share Devitt’s faith in the Lewisian program: “working out the applications of Lewis’s general picture to the specific case of syntactic conventions is a research project whose eventual success we have been given no reason to doubt” (Pereplyotchik, 2017: 41). Like Devitt, however, he nowhere addresses the relevant data that provides more than ample reason for doubt.


6.3.6  Devitt’s Ontological Argument

In a recent email that he was happy for me to quote, Devitt put his argument for (LR) this way (I’ve re-labelled the claims):

(D2P1) Any theory is a theory of x’s iff it quantifies over x’s and the singular terms in applications of the theory refer to x’s.
(D2P2) A grammar quantifies over nouns, verbs, pronouns, prepositions, anaphors, and the like, and the singular terms in applications of a grammar refer to such items.
(D2P3) Nouns, verbs, pronouns, prepositions, anaphors, and the like are linguistic expressions/symbols (which, given what has preceded, are understood as entities produced by minds but external to minds).
(D2C) So, a grammar is a theory of linguistic expressions/symbols. (Devitt email to author, 10 December 2015)

Premise (D2P1) is a version of Quine’s (1953/61a) “criterion of ontological commitment,” which we will accept for the sake of the present discussion. It is (D2P2) that is under contention, since I will argue that it is false (at least for Chomskyan grammars). There will be no need to discuss (D2P3) here, although we will return to it in Chapter 9.24

Devitt (2003, 2006a) claims that “a great deal of the work that linguists do, day by day, in syntax and phonology is studying a language in the nominalistic sense” (2006a: 31), that is, external expressions in an E-language.25 Indeed, he “takes all the objects that linguistics is about to be concrete tokens” (2006a: 30). And he is certainly right that

Work on phrase structure, case theory, anaphora, and so on, talk of “nouns,” “verb phrases,” “c-command,” and so on all appear to be concerned, quite straightforwardly, with the properties of symbols of a language, symbols that are the outputs of a competence. This work and talk seems to be concerned with the properties of items like the very words on this page. (Devitt, 2006a: 31)26

24  (D2P2) and (D2P3) are claimed in a related discussion (Devitt, 2008a: 211). So the present argument can be regarded as a backup argument for the discussion there.

25  Devitt (2006a: 31) cites what sound like Chomsky’s E-language formulations of his task in his (1957) Syntactic Structures, and at (1980a: 222), as evidence of the linguist’s interest in a non-psychological linguistic reality. But, as we mentioned in §1.4, fn 33 above, the 1957 formulation was only for the limited purpose of what were in fact mere lecture notes for a course at MIT, and the 1980 formulation is simply distinguishing “generation of sentences by the grammar” from “production and interpretation of sentences by the speaker” (1980a: 222). (As we saw in the (1965: 9) Aspects quote, Chomsky tends to over-simplify his characterization of his competence project when he is anxious merely to distinguish it from a performance one. He is more careful in others of the characterizations I provide.)

Devitt takes such talk to involve a commitment to the reality of nouns and verb phrases, per (D2P3), as the external output of competence. Now, let us allow what Quine (1953/61a) reasonably claimed, that a theory is committed to the entities it quantifies over, the values of its singular terms. Opening any standard elementary linguistics textbook, it would certainly appear that the authors are referring to and quantifying over types of SLEs, tokens of which appear on its pages as objects of discussion. But Devitt seems to have carelessly neglected a crucial caveat in Quine’s discussions of these issues, viz., that theorists cannot always be taken at their workaday word. There is, for example, Quine’s (1953/61a) well-known recommendation of replacing the common talk of “properties” in favour of predicates; and his (1960: ch 7) extended discussion of “ontic decision” in general, where he discusses various “entia non grata,” such as “sense data,” “infinitesimals,” “ideal objects,” “miles,” “minutes,” and “degrees Fahrenheit,” over which scientists seem routinely to quantify and to which they might seem committed, but can be shown on reflection not to be. The philosopher’s task, as he construes it, consists of

making explicit what had been tacit, and precise what had been vague; of exposing and resolving paradoxes, smoothing kinks, lopping off vestigial growths, clearing ontological slums.  (Quine, 1960/2013: 275)

Thus, clearly for a Quinean, in considering the commitments of a Chomskyan theory, we should not go merely by what is said in elementary textbooks, but instead be guided by what sort of entities are performing the genuine explanatory work the theory is being invoked to perform. Devitt has not always missed this point. In earlier work, he and Sterelny (1987/99) write:

The best reason that we can expect to find for thinking that linguistics is about x rather than y is that the considerations and evidence that have guided the construction of linguistic theory justify our thinking that the theory is true about x but not y.  (Devitt and Sterelny, 1987/99: 498–9)

26  Devitt also discusses here the claims of some phonologists, Burton-Roberts et al. (2000), about their object of study. This raises complex issues that are best postponed to §9.2.

So what are the considerations and evidence that guide Chomskyan research? Well, as we have observed, Chomskyan linguists seldom appeal to actual corpus data about the use of external speech. For all they care, some people may take vows of silence and never speak at all; some people may constantly be interrupting themselves, seldom producing complete sentences; and others might not bother to use resources of recursion and complex embeddings. To repeat again what Chomskyans always stress, the interest is in competence, not performance, which may or may not manifest an underlying competence. As we have also stressed, crucial data for that competence is provided by, for example, WhyNots, for which it is virtually impossible to imagine a non-psychological explanation. Remarkably, in his most recent writing on the topic, Devitt (2020b) actually agrees:

[B]ecause the phenomena [of WhyNots] are psychological! But the issue is whether grammars directly explain them. I would argue, along the lines of [learning about an external LR] that grammars as they stand do not directly explain them, although their accounts of the syntactic structure of linguistic entities contribute to the explanation.  (Devitt, 2020b: 379)

Indeed, he even grants the Chomskyan view that the language has

some of its syntactic properties, including perhaps the WhyNots, as a result not of convention but of innate constraints on the sorts of language that humans can learn “naturally”.  (Devitt, 2020b: 379, fn11)

This is baffling. Chomskyans are not merely calling attention to “some” syntactic properties that are innate. They offer a substantial theory that the innate properties are the main fundamental properties of grammar, and, if they are right in this (which Devitt has provided no reason to doubt), then those properties arguably constitute what a grammar is! This is precisely why they regard grammar as part of psychology. If Devitt believes there is some theoretically serious social alternative, he should cite it. Chomskyan linguistics is perforce about psychology because that is simply where the ultimately explanatory law-like regularities lie. Pace Devitt (2008b), it is phenomena such as the WhyNots and the host of other peculiarities of grammar that “warrant the robust psychological assumption that there are mental states with the properties the grammar ascribes to VPs” (2008b: 251).

Given that Chomskyans regard the explanatory point of their work to be psychological, it is not surprising therefore that they don’t take themselves to be discussing what they regard as an ill-defined set of external SLEs themselves (indeed, recall from §2.2.3 that between chapters 1 and 2 of Aspects, sentences as sequences of acoustic phonemes were explicitly replaced by talk of much more abstract “features”). Rather, they are concerned with an internal computational explanation defined over mental representations of them—just as vision theorists are concerned not with real colors, but with computations in the visual system over representations of colors, whether or not there really are colors answering to them.27 As we noted in §3.3.3, the relevant intension is not determined from a predetermined extension, but, rather, the extension from a theory of the intension and how the system that embodies it interacts with other mental processes. Thus, any actual external SLEs, and any “linguistic reality” independent of that computational psychology, are largely irrelevant to the theory. It is representations of SLEs, not the SLEs themselves, that are, after all, the sorts of things upon which the relevant computations reasonably operate, just as computations of bank balances operate not upon dollars and cents themselves, but over representations of them.28

27  Given the hegemony of extensionalist presumptions for the last century in philosophy (see §8.6 below), I cannot stress enough that, as I use “representation,” there can be a representation of an x even though there is no x, as in the case of Zeus and the highest prime. I submit that this usage accords with much ordinary usage, and, more importantly, is explanatorily essential—see §§8.6–8.7 below for extended discussion.

28  As we saw (§4.5) and will discuss further (§11.1), the point is sometimes missed and so bears stressing: what was essential to Turing’s revolutionary conception of a computer was its proposal of transitions between states that would be based not on arbitrary properties of what might be represented, but entirely on local physical properties of representations of those things, e.g., the position of a switch, or the orientation of a magnetic field.

One wonders how Devitt could have failed to notice this central explanatory interest in representations, since, although the term doesn’t keep cropping up often in textbook discussions of the nuts-and-bolts rules (a fact to which I will return shortly), it suffuses practically all of the theoretical ones. In addition to the influential “hypothesis testing” passage at Aspects (Chomsky, 1965: 30) that we have discussed a number of times, there is an abundance of other passages. For example, in his initial seminal work, LSLT, Chomsky (1955/75) writes (I italicize the relevant phrases):

In considering the nature of linguistic theory, we have been led to regard the theory of linguistic structure as being, essentially, the abstract structure of “levels of representation”  (Chomsky, 1955/75: 105)

and he elaborates on the point in his (1975a) introduction to its eventual publication:

This program [of LSLT] was one of the strands that led to the development of the cognitive sciences in the contemporary sense, sharing with other approaches the belief that certain aspects of the mind/brain can be usefully construed on the model of computational systems of rules that form and modify representations, and that are put to use in interpretation and action. (Chomsky, 1975a: 5)

This is by way of contrast with other approaches:

It has been common in various fields concerned with language to describe linguistic behavior as “the use of words.” . . . The theory of finite-state Markov sources, as developed by Shannon and others, might reasonably be taken as a precise characterization (at the syntactic level) of the vague proposal that linguistic behavior is a matter of “the use of words.” In LSLT this topic is not considered at all.  (Chomsky, 1975a: 6–7, emphasis mine)29

There is no need to decide here whether Chomsky is right in so focusing his discussion. Maybe his approach in the end will prove fruitless. I have not quoted all these passages at such length to somehow establish their truth; only to make perfectly vivid the explanatory intent of the theory, what Chomsky takes his project to be at least trying to discover, and maybe, so far, it has simply failed. The only question that needs to be pressed here is why Devitt insists upon reading linguists as concerned with the SLEs that textbooks only superficially appear to be discussing, in the face of this overwhelming evidence that the theory is concerned instead only with computations on representations of them. Surely second-guessing a scientist’s emphatic, explicit intentions ought to give one pause—as Devitt would surely agree, it should have given Positivists more pause than it did to claim that Newton was concerned to explain his sensory experience, and not with the laws of motion!

29  See also similar passages in Chomsky (1965: 4–5, 1968/2006: 25, 1980a: 129–30).

6.4  Literal Form of Principles and Parameters

6.4.1  Representational Pretense

Some sympathy might, however, be due to Devitt in his failure to notice the proper ontological point. As we will see in more detail in Chapters 9–11, the practice of Chomskyans themselves is seldom clear with regard to what they regard SLEs as actually being, or what exactly they mean by “representation,” a term that has a vexed and confused history, some of it related to some of the most difficult problems in philosophy.30 Crucially linked to these problems is that of specifying exactly what sorts of entities Chomskyans do take SLEs to be. Occasionally there is a reference to their being “psychologically real,” but we will see that this is no help until this also very vexed phrase is explained (see §9.7). As we shall also see (§9.8), the use/mention confusions (or confusions of representations with what they represent) suffuse most discussions, and these and the general issues they raise are seldom noted or explicitly spelled out (one reason Devitt might have been misled is that he tries generously to ignore them).

But, most problematically of all, the rules, principles, parameters, and operations of the theory are (as far as I have read) never stated in terms of representations, but entirely in what is sometimes called “the material mode,” of talking about SLEs themselves. To take one of our simple examples regarding negative polarity items (NPIs) (§2.3.2), the constraint on them would be typically expressed as:

(NPI-C) An NPI must be c-commanded by a licensor.

An NPI would certainly appear to be a word like any, tokens of which are uttered or appear in print on a page, as in Bill doesn’t have any wool. The usual rule excludes Bill does have *any wool. This is standardly made clear by the presentation of a familiar tree structure:

30  I have tried to disentangle these problems in earlier exchanges with Chomsky (§9.8 below). Devitt (2006a: 7, fn9) notes these exchanges, but “hasn’t the heart” to enter into them. In view of how crucial they might be to clarifying the ontological commitments at issue, perhaps he should have mustered the courage.


                TP
               /  \
             NP    T′
             |    /  \
             N   T    NegP
             |   |    /  \
           Bill pres  Neg  VP
                 |    |   /  \
               does  n’t V    NP
                         |   /  \
                       have det  N
                              |   |
                             any wool
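To fix ideas, (NPI-C) can be given a toy computational rendering over a tree of this shape. The following sketch is purely my own illustration (in Python, with invented names throughout); it renders the logic of the constraint, and is of course no claim about how, or even whether, the constraint is actually implemented in speakers:

```python
# A toy check of (NPI-C) over the tree for "Bill doesn't have any wool."
# All names are invented for illustration only.

class Node:
    def __init__(self, label, children=(), licensor=False, npi=False):
        self.label = label
        self.children = list(children)
        self.licensor = licensor      # e.g., the negation "n't"
        self.npi = npi                # e.g., "any"

    def dominates(self, other):
        return any(c is other or c.dominates(other) for c in self.children)

def find_parent(node, root):
    for c in root.children:
        if c is node:
            return root
    for c in root.children:
        p = find_parent(node, c)
        if p is not None:
            return p
    return None

def c_commands(a, b, root):
    # a c-commands b iff some sister of a is, or dominates, b
    parent = find_parent(a, root)
    if parent is None:
        return False
    return any(s is b or s.dominates(b)
               for s in parent.children if s is not a)

def all_nodes(root):
    yield root
    for c in root.children:
        yield from all_nodes(c)

def npi_licensed(npi_node, root):
    return any(n.licensor and c_commands(n, npi_node, root)
               for n in all_nodes(root))

# The tree above: "n't" is marked as a licensor, "any" as an NPI.
any_ = Node("det: any", npi=True)
neg  = Node("Neg: n't", licensor=True)
vp   = Node("VP", [Node("V: have"), Node("NP", [any_, Node("N: wool")])])
tp   = Node("TP", [Node("NP", [Node("N: Bill")]),
                   Node("T'", [Node("T: pres (does)"),
                               Node("NegP", [neg, vp])])])

print(npi_licensed(any_, tp))   # True: "n't" c-commands "any"
neg.licensor = False            # as in *"Bill does have any wool"
print(npi_licensed(any_, tp))   # False: the NPI is now unlicensed
```

Run as written, the first check succeeds and the second fails, mirroring the contrast between Bill doesn’t have any wool and *Bill does have any wool.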

The element n’t c-commands its sister, the VP, and so all of her descendants, thus licensing any.

But where exactly are these tree structures with SLEs attached at their terminal nodes? As we noted in §2.2.3, Chomsky initially took for granted what was then the received view of theorists such as Bloomfield and Quine that identified SLEs as sequences of acoustic phenomena, such as phones or phonemes. It is of course unclear where one might find the relevant trees there. In Chomsky (1965: ch 2), however, he began to retreat “inward” from this “beads on a string” view. Just “where” SLEs can be located, and, indeed, whether they exist at all, is a complex, essentially philosophical issue we will address squarely in Chapter 9. But there is no need to address that issue just yet. The only point needed now is the purely linguistic one of whether it makes any difference to linguistic theory what SLEs are, or whether they exist at all. As Chomsky (2000) remarks:

Suppose we postulate that corresponding to an element “a” of phonetic form there is an external object “*a” that “a” selects as its phonetic value; thus, the element [ba] in Jones’s I-language picks out some entity *[ba], “shared” with Smith if there is a counterpart in his I-language. Communication could then be described in terms of such (partially) shared entities, which are easy enough to construct: take “*a” to be the singleton set {a}, or {3, a}; or, if one wants a more realistic feel, some construct based on motions of molecules. (Chomsky, 2000: 129)

Such postulation would be irrelevant to the project of an explanatorily adequate generative grammar. That explanatory project is only about the computational system that underlies linguistic competence, not about whatever external phenomena the computed representations may happen to represent. What is needed for that project are representations of the tree structures with their various SLE nodes, not actually any tree structures or SLEs themselves.31

The point here is actually quite general to any linguistic or psychological theory that aspires to be purely “internalistic,” as Chomskyan and many other theories do. Insofar as linguists or psychologists are interested in how a creature thinks, imagines or, more generally, represents the world, they are not interested in what the represented things actually are, or even whether they exist at all, or in whether the principles that seem to govern them are actually true of anything externally real. It is enough that the system computes such representations of such things. Thus, vision theorists are not interested in real colors and shapes, if such there be: those phenomena are issues for physics and geometry. What interests vision theorists are the internal representations of colors and shapes over which the computations of the visual system are defined.

Most linguists and psychologists, of course, believe that mental phenomena in some way or other depend (or “supervene”) upon the brain, and so take their talk of SLEs or colors to be “really about” the brain and/or the computational processes there that are responsible for the SLEs that we (seem to) “hear,” or the colors we (seem to) “see.”32 And this is fine—as I have stressed, linguists are studying representations which are, indeed, presumably entokened somehow in the brain. However, whether these representations can be identified with the things they represent is, to say the least, non-obvious. I doubt many vision theorists would be happy to say that a representation in the visual system of the color green is itself green, or one of a rotating cube that it itself is a cube, rotating in the brain. Whether perhaps SLEs should be identified with their representations in the brain is another fraught issue that we will turn to in §9.8.

Of course, there might be reasons other than the demands of an internalist psychological theory to believe in an ontology of colors, Euclidean shapes, or SLEs. Philosophers have long disputed these issues, and we will consider some of their reasons also in Chapter 9. Insofar as they care about these issues at all, psychologists and linguists could be realists, anti-realists, or entirely agnostic about them. But, to get on with their expository work, they might simply go along with the widely shared realistic appearances.

So what is going on in the ordinary ways that Devitt rightly notices that linguists talk? I submit that linguists’ talk of tokens of SLEs in speech or on a page is in fact (what I will call) a representational pretense, exactly like the pretense of many vision theorists when they talk about “the colors” and Euclidean shapes (circles, cones) typically represented in vision. They simply pretend that the things that people (including themselves) seem to hear or see are real, ignoring the question of whether they actually are as irrelevant to the expository purposes at hand (see the postscript below (§6.4.2) for reassurances about the completely non-invidious intent of this point).

This representational pretense does have a perhaps surprising consequence for the statements of the various rules, principles, and parameters of the theory. Given that the theory is not about SLEs, such as NPIs, but concerns representations of them, the above rule about them, for example, should more accurately be expressed as:

(NPI-C Rep): A representation of an NPI must represent it as at a node in a tree in which it is c-commanded by something represented as a licensor.

And similarly for principles and parameters of binding, movement, case, head-directionality, and so forth (see §2.2.7).

31  Chomsky, himself, does sometimes lose track of this point, worrying in Chomsky and McGilvray (2012: 91) about talk of trees created by merge being committed to “sets in the head.” But, of course, on anyone’s view, what’s in the head are at most representations of sets, again, not the sets themselves! Of course, it could well turn out that the computationally most efficient representation of a tree structure may itself be a tree structure, and so “tree structure” might be used ambiguously to refer both to such representations and the “things” they represent. This would explain the ease of the use/mention confusions/collapses that we will discuss at length in §9.8.

32  This is probably an appropriate place to ask whether a Chomskyan linguistics is “internalist” in a way that is in tension with the kind of “externalist” considerations raised by, e.g., Putnam (1975c) and Burge (1984). Aside from the red herring that Chomsky (2000: 160) raises about mental images of cubes being caused other than by real cubes (cf. §8.6 below), the issue about the degree to which linguistic explanation takes for granted the general environment in which speakers live and speak seems, however, simply not to arise. Unlike philosophers, linguists simply do not think about whether their theories would be true of, e.g., brains in vats or in other radically dissimilar environments, although presumably at some level of abstraction such a brain might engage in “the same computations.” Chomsky (2000: 27) does note that the computational system characterized as the I-language might be “embedded in performance systems that use it for locomotion,” but does not discuss what difference that would make to how the theory might be expressed. But I suspect an externalist like Burge might find my proposal of representational pretense and Chomsky’s (2000: 129) denial of the external reality of phonetic elements quoted above uncongenial to his views. Perhaps he will address this issue in the future.
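The difference between the two formulations can itself be made concrete. In the following sketch (again mine, in Python, with invented names such as “Rep” and “represented_as” standing in for whatever the actual internal data structures might be), (NPI-C Rep) is a condition purely on representations; no sound, inscription, or other external SLE figures in it anywhere:

```python
# (NPI-C Rep) as a condition on representations, not on SLEs: the check
# below inspects only properties of Rep objects (what each node is
# "represented as"), never any sound or inscription. Names are invented.

from dataclasses import dataclass, field

@dataclass
class Rep:
    represented_as: str            # e.g., "NPI", "licensor", "VP", ...
    parent: "Rep" = None
    kids: list = field(default_factory=list)

    def add(self, child):
        child.parent = self
        self.kids.append(child)
        return child

def dominates(a, b):
    while b.parent is not None:
        b = b.parent
        if b is a:
            return True
    return False

def c_commanded_by(b, a):
    # b is represented as c-commanded by a iff a sister of a is,
    # or dominates, b
    return a.parent is not None and any(
        s is b or dominates(s, b) for s in a.parent.kids if s is not a)

def npi_c_rep(reps):
    """Every node represented as an NPI is represented as c-commanded
    by some node represented as a licensor."""
    return all(any(r.represented_as == "licensor" and c_commanded_by(n, r)
                   for r in reps)
               for n in reps if n.represented_as == "NPI")

root = Rep("TP"); negp = root.add(Rep("NegP"))
lic = negp.add(Rep("licensor")); vp = negp.add(Rep("VP"))
npi = vp.add(Rep("NPI"))
print(npi_c_rep([root, negp, lic, vp, npi]))   # True
```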


But why then are the principles, etc., not all expressed in this way? For the obvious reason that it would be preposterous at this point in the development of theory to be cluttering up linguistic texts with such grotesquely cumbersome prose. Linguistics is quite hard enough already. What is perfectly sensible instead is simply to conveniently pretend there are the SLEs and the tree-structures in which they appear, and then state any rules, principles, constraints, parameters as true under the pretense.33 Mutatis mutandis for the colors and shapes discussed by vision theorists.

Of course, even such paraphrases of the various principles, etc., might not be strictly correct. As Chomsky (2002a: 94–5) frequently puts it, they might be mere “epi-phenomena”: for example, the Binding Principles may only appear to be true as a result of the literal truth of some minimalist computation involving only merge and checking. But it is important to see how “being epiphenomenal” takes on a special significance for the proposed representational paraphrases. Without the paraphrases, one might perhaps regard earlier rules and generalizations of the theory as simply approximately true, say, in the way that the gas laws may be stated with more or less generality and precision, with the truth ultimately to be captured by statistical laws of swarming molecules. However, a representational theory presents a further curl to such an account: for if, say, c-command relations, binding domains, or indices are not literally represented, then the proposed paraphrases are not even approximately true in the way that Boyle’s laws are. They are simply false, so to say, at any scale. All that would be true is that the system would appear to represent binding domains and the like, and they might serve in that way as a very “abstract” way of characterizing the actual computations on representations in a manner that would not be even approximately true. To compare them to a nice example of Dan Dennett’s (1987: 107) about a chess-playing computer, they would have the status of a rule such as “The machine tries to get its queen out early,” even though no such rule is actually represented or implemented in the machine: the machine simply operates as though it is.34

33  One might think that such talk of pretense could be literally understood simply as conditionals: if the representations of the I-language were true, then the following principles would hold. But it is not clear to me that pretenses can be so understood. One can pretend that all manner of stories might be true without accepting that they might be genuinely possible, much less integratable into one’s conception of the actual world, and so serve as the antecedents of serious conditionals. I can pretend that Mickey Mouse, Euclidean triangles, or colors exist without having any idea at all what the real world would be like if they actually did. The topic deserves much more discussion; I just don’t want to presume any verdict here.

34  And thus the issues generativists often raise about “levels of representation” (see §2.2.1 above) can be regarded as literal claims about precisely what material is explicitly represented in proposed computations, as opposed to being merely epiphenomenal in the present sense. The analogy with chess of course breaks down if, unlike the usual chess pieces, SLEs do not exist: a machine might regularly move real queen pieces, making the generalization true, whereas a speaker may never perform operations on any actual SLEs.
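A mundane programming analogue (mine, not Dennett’s) may make the point concrete: on its first pass, bubble sort always carries the largest element of a list to the end. That generalization is true of the program’s behavior, yet no such rule is stated or stored anywhere in it; the code only ever compares and swaps neighbors:

```python
# An epiphenomenal "rule": the first pass of bubble sort always moves
# the largest element to the end, though no line of code says so.

def bubble_pass(xs):
    xs = list(xs)
    for i in range(len(xs) - 1):
        if xs[i] > xs[i + 1]:      # purely local comparison and swap
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return xs

print(bubble_pass([3, 7, 2, 9, 4, 1]))   # [3, 2, 7, 4, 1, 9]: 9 is last
```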


6.4.2  A Reassuring Postscript about Pretense

Some linguists have taken exception to my use of the term “pretense,” worrying that it suggests their theory is a sham. I intend no such suggestion, which I hope my qualification, “representational,” will help block. “Pretense” has been the subject of a great deal of attention in philosophy in the last forty years, as is shown in the many quite positive entries that discuss it in the Stanford Encyclopedia of Philosophy, for example, “imagination,” “possible objects,” “dialetheism,” and “relative identity.” My own use has been influenced by interesting proposals by Gareth Evans (1982) regarding negative existentials, and Mark Crimmins (1998) regarding identity statements, the latter using it in what he calls a “shallow sense” to mean merely “act as if.” Note that pretending that p does not entail that p is false, or even that one believes it is: one may just not want to enter into an argument about it (“I know you doubt the existence of numbers and I don’t, but let’s pretend they exist”). Some philosophers will also note that pretense so understood is similar to Husserl’s (1913) suggestion of a “phenomenological epoché” (or “bracketing” or “suspension of judgment”) being essential to psychology. For all the risk of invidious connotations, however, I suspect my use of “representational pretense” here would be preferable to an exposition of the complexities of Husserl.

Regarding this “as if” character to my proposal (and a similar one of Collins, 2020b), Devitt (2020b) writes that it

has a sadly nostalgic air to it. The idea that widely accepted claims about the nature of the external world should be rewritten as claims that it is as if that world has that nature . . . has a long grim history that includes the disasters of metaphysical idealism and scientific instrumentalism. In my view (1984, 1991), the idea has little to be said for it. (Devitt, 2020b: 380)

Devitt claims that it would “generalize to social entities in general,” for example, to money and votes (2020b: 380). But there is in fact no need to so generalize in the least here. In questions of ontology, I think one should proceed cautiously, one day, one theoretical domain at a time: SLEs may be one thing, votes and money quite another. In §9.2, I will propose some criteria for distinguishing among the wide and diverse range of cases about which irrealist claims are made.

6.5  A Somewhat Ecumenical Conclusion: Internalist and Externalist Interests

Having defended at length the point of the initial “joke” of Chapter 3, and a related point about representational pretense, I want now to take it back a little, and at least allow room for the perfectly reasonable ordinary claim that people do indeed “speak natural languages.” Chomsky’s particular dismissal of the reality of E-languages makes an important theoretical point, as would the denial of an objective reality to “weeds.” However, just as the lack of theoretical interest in the ordinary category of weeds does not itself imperil the reality of the particular plants called “weeds,” so would the lack of theoretical interest in E-languages not imperil talk of what is ordinarily referred to as, for example, “French,” “English,” “Mandarin,” or “Swahili.” As we observed at the end of Chapter 5, nothing that matters to Chomsky’s core theory of linguistic competence would be seriously threatened by the existence of interesting theories of performance. E-languages may well exist as social kinds, and be the “languages” people speak, just as weeds may be the unwanted plants they pluck out of their gardens.35

35  Given his general views on semantics, which we will discuss in §10.4, Chomsky should be perfectly hospitable to this “polysemous” way of allowing different usages of a word to pick out different items. To anticipate: one might say in a Chomskyan scientific context that people do not speak external languages, although in most ordinary contexts one perfectly well says that they do.

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

220  Representation of Language there are actually any stable laws and generalizations about E-languages themselves.36 Chomsky certainly has no particular reason to deny that the specific phono­ logic­ al forms that a speaker attaches to meanings and syntactic ­structures may sometimes be largely conventional. This may well be what gives rise to “Saussarian arbitrariness,” or the brute “association of concepts with phono­logic­al matrices” (Chomsky, 1995a: 169), although, of course, such arbitrariness could also arise just by happenstance. As we noted in §3.2, these conventions are what people seem ordinarily to have in mind in thinking of “grammars” and “languages,” such as what ordinarily gets called “English,” “Finnish,” “Mandarin,” etc., which involve not only phonology, but also other informal rules and constraints on speech: they “register” differences in the way one talks to different sorts of people (to children, people of a different class; not swearing in church, cue-ing irony and sarcasm, etc.). Chomsky ­simply thinks, however, that these conventions are neither deep nor robust enough to form a natural kind of phenomenon amenable to scientific, or, at any rate, Galilean scientific investigation (which, in any case, is surely a matter of degree).37 Especially given his occasional later scepticism to the contrary, it is worth noting that Chomsky (1975b) himself recognized that the study of language and UG, conducted within the framework of individual psychology, allows for the possibility that the state of knowledge attained may itself include some kind of reference to the social nature of language. Consider, for example, what Putnam (1975c) has called “the division of linguistic labor.” In the language of a given individual, many words are semantically indeterminate in a special sense: the person will defer to “experts” to sharpen or fix their reference.  (Chomsky, 1975b: 18)

36  For work that suggests that there might be, see Ringe and Eska (2013). Note that I am for the nonce ignoring further arguments against the reality of E-languages that could be based not on the mere fact that they are social objects, but rather on the possibility that the requisite SLEs that are taken to constitute them may not exist at all. We will consider such arguments in Chapter 9. If they are correct, then a defender of the explanatory utility of generalizations about E-languages would likely need to resort to the same representational pretense in which I argued above the theorist of I-languages (and colors and Euclidean figures) engages (see §9.8 for further discussion).

37  Chomsky is given to a certain hyperbole on the issue: “If by ‘conventions’ we mean something like ‘regularities in usage’, then we can put the matter aside; these are few and scattered” and do not have “any interesting bearing on the theory of meaning or knowledge of language” (1996: 47–8, quoted in Devitt, 2006a: 178); see also Chomsky (1980a: 81–3). Regularities in usage are few and scattered? What do dictionaries record? It is hard to see what entitles him to such sweeping remarks; certainly nothing internal to his core theory.


Although he, himself, thinks this deference could be understood individualistically,

this does not deny the possibility of other kinds of study of language that incorporate social structure and interaction. Contrary to what is sometimes thought, no conflict of principle or practice arises in this connection. (Chomsky, 1975b: 18)

Given the oddities of especially the WhyNots, it seems doubtful that such laws would cover the same phenomena that concern Chomsky. Were E-languages a fit subject for science, still, of course, many of their properties might be crucially explained by the properties of the underlying I-languages. Consequently, concern with E-languages need not be seen as any sort of rival to Chomsky’s core theory, which would remain precisely as interesting as it is, with or without them.38

So a Chomskyan can cheerfully leave Devitt free to pursue his interest in LR,39 Wiggins his interest in cultural prescription, Nagel, Searle, and Strawson in the deliverances of introspection, Katz and Soames in Platonistic entities, and Cognitive-Functionalists (§5.5) their interest in social functions. If any of them can demonstrate real theoretical power and depth in their pursuits, fine and good. Obviously, Chomsky is skeptical (I suspect he regards them as akin to the claims of animal husbandry and veterinary science in contrast to substantive evolutionary theory). But nothing in his core theories would be affected in the slightest if this turned out not to be true, and there is no real need for him to dismiss them. So long as the interest of each does not preclude the interest of the others, it is hard to see the need for any irreconcilable disagreement between them, as we noted Chomsky (1975b: 18, 1986: 33–4) himself allowed.

38  All of this discussion abstracts from quite different, but serious, sociological issues about how academic resources should be allocated, which research projects deserve support, publication, appointments in departments, and the like. Such issues likely played as large a role as any purely theoretical ones in determining the character of many of the disputes between Chomsky and his opponents, in ways that would require an intellectual historian of the period to trace. See Newmeyer (1986) and Harris (1993) for efforts of this sort.

39  Putting to one side Devitt’s empirically dubious speculations about the conventionality of syntax (cf. §6.3.5 above).


7  Linguistic Intuitions and the Voice of Competence

7.1  A “Voice of Competence”?

Considerable scepticism has been raised in recent philosophy with regard to the reliability of the “intuitions” to which philosophers traditionally have appealed about a variety of topics in the subject, notably possibility, necessity, the a priori, meaning, and concepts.1 We will discuss some of this scepticism in Chapter 10, with special attention to the challenge Quine (1953/61b, 1954/76) raised regarding how to distinguish intuitions about semantic and/or purely “conceptual” issues from deeply held beliefs about a domain: is “Cats are animals” analytic, as Katz (1974) insisted it was, or is it merely a biological hypothesis, disconfirmed if they turn out to be robots, as Putnam (1962a/75a) claimed they could do?

What I want to address in this chapter is what can seem to be a similar issue regarding the role of speakers’ intuitions about syntax and phonology, on which many Chomskyan linguists rely in testing their theories of those domains. Skepticism has been raised with respect to these intuitions as well,2 but I will argue that it is less warranted than in the case of semantics and conceptual analysis. Syntax and phonology have the advantage of not interacting so closely with a speaker’s worldly beliefs, and so the intuitions about, say, whether a given string is well formed or pronounced a certain way are not likely to be confounded with them. Indeed, the case of syntax seems to me to provide an ideal case of the successful appeal to the kinds of special intuitions on which philosophers might hope to rely for claims about meaning, even if these latter are a lot harder to ascertain than has been traditionally supposed.

Specifically, I want to defend at least a version of what Michael Devitt (2006a,c; 2008a) has called the “Voice of Competence” (VoC) view, according to which the spontaneous intuitions of native speakers provide special evidence of their internal representation of the phonology and syntax of their I-language, a view Devitt rejects (§7.1.1–7.1.3). I will defend a VoC by drawing analogies with the spontaneous reactions of subjects in vision experiments, which provide special evidence of the representations in their visual systems (§7.2). This defense requires a number of distinctions that I think Devitt does not sufficiently discuss, specifically between a grammar and a parser (§7.2.1), and between conceptual and non-conceptual content (§7.2.2). Moreover, Devitt’s alternative model of speakers’ reactions in terms of sentences merely “having” rather than representing properties fails to explain how those properties might be integrated into a speaker’s psychology so that they ineluctably “hear” their language as language, with many of the constraints that language imposes. What’s crucial to intuitions serving as special evidence is their having a special representational etiology: intuitions of linguists investigating languages that are not native for them may have a different etiology, and so fail to provide any such special evidence.3

I make no claim that the model is true: I am not a psycholinguist and so not remotely in a position to establish any such truths, which in any case would require much more evidence and clarity about the structure of the mind than I take to be anywhere available. I am only concerned to reply to Devitt’s largely philosophical objections to a VoC. First and foremost I want to defend its status (pace his 2006a: 118, 2014: 288) as a seriously “respectable model.” However, I will also consider evidence that seems to me to make it an empirically plausible one, more plausible than the alternative one that he proposes (§7.3).

In addition to serving as a defense of Chomskyan reliance on special intuitions, my discussion will have a further point. We will see in Chapters 8 and 11 that part of the very anti-representationalism that Devitt advocates as an alternative proposal to that of a VoC seems, ironically enough, to be endorsed by Chomsky (2000, 2003), himself, and by some of his followers, and that my objections to Devitt apply equally well to them (see esp. §11.1).

1  This chapter is a considerable expansion of earlier articles (e.g., Rey, 2014a), and was also the basis for a talk at the October 2017 conference on intuitions at the University of Aarhus, and of an earlier, shorter version of it that appears in the proceedings of that conference as my Rey (2020a). Differences between that version and this chapter, due largely to responses of Devitt (2020) to that paper, will be noted in due course.

2  For further discussion of such skepticism, see also, e.g., Schütze (1996, 2003), Bach (2002), Wasow and Arnold (2005), Gibson and Fedorenko (2010). There is not space to address all of these discussions. I believe the view I will defend here will suffice as a general reply.

3  I will not defend various stronger views of intuitions that sometimes seem to have been claimed, e.g., that “linguistic intuition is the ultimate standard that determines the accuracy of any proposed grammar, linguistic theory, or operational test” (Chomsky, 1965: 21, emphasis mine); I would be surprised if any current Chomskyans would defend such views. Nor will I be defending the view held by some (e.g., Katz, 1981: 195–6) that VoC intuitions are in some way a priori, based on some non-natural grasp of “Platonic” entities (cf. §6.2.1 above and §9.4 below).

7.1.1  Devitt’s Skepticism about (Non-)Standard Models

As we discussed in §4.4, Devitt opposes special “Cartesian” knowledge everywhere in philosophy and science, claiming that

there is only one way of knowing, the empirical way that is the basis of science (whatever way that may be).  (Devitt, 1996: 2)

In terms of Quine’s (1953/61b) well-known metaphor, all of our beliefs about the world form a seamless web and are confirmed or disconfirmed only on the basis of properties of the entire web: there is no “knowledge on the cheap,” no beliefs for which there is some special confirmation independent of the empirical methods of science, not even in logic or mathematics. In his (2006a, 2006b, and 2014) Devitt argues that this claim should also include our knowledge of language.4 That Chomskyans think otherwise is, he claims, attested by a number of passages Devitt (2006a: 96, 2006c) cites that seem to articulate versions of a VoC view, according to which intuitions are the result of a “deduction from tacitly known principles” (Graves et al., 1973: 325; cf. Chomsky, 1986: 27; both quoted at Devitt, 2006a: 96). Jerry Fodor (1981d) did not go quite so far as to say the process is deductive, but claimed only that

“intuitions . . . confirm grammars because grammars are internally represented and actually contribute to the etiology of the speaker/hearer’s intuitive judgments”  (Fodor, 1981a: 200–1; quoted at Devitt, 2006a: 96)

See also Pateman (1987: 100) and Dwyer and Pietroski (1996: 338) for similar claims.

Devitt (2006a) objects to these VoC views. Although a number of his objections do tell against some of those overly strong proposals, I think that in the end they do not succeed against a reasonable version of them, which I will be attempting to provide.

An important development that many of these earlier suggestions overlooked is the emergence of what I called in §5.4.7 a “process” conception of competence and its attendant nativism, involving governing “principles” instead of empirically confirmed “rules”; they also failed to incorporate Stabler’s (1983) important distinction between embodied (what I have called “implemented,” §3.3.4) and represented rules/principles.5 This development has occasioned a fairly widespread retreat from the view that the VoC involves anything like deduction from explicitly represented rules. Consequently, all that will be relevant for our purposes here is not a defense of the full-fledged “Standard Cartesianism” regarding the representation of rules that serves as Devitt’s main target, but only what he calls a “non-standard Cartesian explanation”: some intuitions are the relatively direct causal result of the representation of linguistic entities and properties that are governed by principles merely “embodied” somehow or other in the brain (see Devitt, 2006a: 117). The principles (and/or parameters) might, of course, turn out also to be represented; but this will not be a concern here. Devitt rejects not only a standard Cartesian view, but any such non-standard view as well:

Any non-standard Cartesian explanation . . . must give the embodied but unrepresented rules a role in linguistic intuitions other than simply producing [presumably external speech] data for central-processor reflection. And it must do so in a way that explains the Cartesian view that speakers have privileged access to linguistic facts. It is hard to see what shape such an explanation would take.  (Devitt, 2006a: 118)

4  It is important to stress that Devitt is not challenging in this connection the truth of Chomsky’s postulation of largely innate principles and parameters as a theory of “linguistic reality,” which he thinks people’s competence respects (see 2006a: §2.3, and §6.3.2 above). As we saw (§6.3), he is challenging merely the psychological reality of the theory, and, here, that intuitions provide any special evidence of that reality.

Indeed, Devitt goes so far as to say:

We do not have the beginnings of a positive answer to these questions and it seems unlikely that the future will bring answers.  (Devitt, 2006a: 118)

It is this last claim in particular that seems to me unwarranted, and I will describe in this chapter at least “the beginnings” of a perfectly naturalistic VoC, as well as provide some serious prima facie evidence for it.

5  Although I think it is important to distinguish in this way between “rules” and “principles” (cf. §2.2.8), the terms tend often to be used interchangeably in the present discussion, as I, like others, will likely sometimes fall into doing.


7.1.2  I- vs. E-languages

Devitt’s opposition to a VoC is, not surprisingly, part and parcel of his non-Chomskyan conception of linguistics as concerned with a largely non-psychological “linguistic reality,” which we discussed at length in §6.3. And perhaps if the intuitions on which Chomskyans rely were only about an external “E-language,” then certainly some wariness would be in order: after all, why should speakers have any privileged access to a social conventional system largely external to them?6 A lot would depend in this case on the details of the relation between the E-language and a speaker’s psychology. However, even if linguistics were concerned with an external “linguistic reality,” as in socio- and historical linguistics, the question would still arise how anyone comes to understand it—particularly a child! Where do the linguistic sensitivities to even E-linguistic categories come from, to say nothing of their respect for the non-obvious principles that exclude the WhyNots?7 It is one thing to notice that typists use their thumbs only for the space bar, quite another to notice the oddities of island constraints. If, as Devitt (2006a: 180; 2008b: 252; 2020b: 379, fn11) allows, some significant part of the E-language may be innate, comprising an internal I-language, then one might imagine that speakers could have privileged access to at least that part of their competence. It is pretty clear, however, that Devitt (2014) would object to a VoC even for that. Intuitions about such internal properties are, for him, just theoretical speculations about people’s speech abilities, like those about other abilities people might have, for example, to swim, to type, to think about mathematics, all of which, for Devitt, are judgments based upon external empirical evidence rather than on any deliverances of an internal VoC.

6  Note, again, that Chomskyan casual claims about the usual external languages such as “English” are simply short for “the I-language shared more or less by those who are called ‘English’ speakers.”

7  Devitt (2020b) fails to appreciate the explanatory burden here. Discussing a binding example, Fred’s brother loves himself, he writes:

What explains the fact that everyone in the English-speaking community would interpret the mark himself in Fred’s brother loves himself as a reflexive c-commanded by the mark Fred’s brother? Why this remarkable coincidence of interpretation? The core of the answer is simple: because in English himself is a reflexive c-commanded by that DP; and the competence of any English speaker enables her to detect this linguistic fact; that’s part of what it is to be competent.  (Devitt, 2020b: 380)

Perhaps competence in “English” requires sensitivity to the binding constraints, but we can still ask how people can possibly be sensitive to such subtle and (until Chomskyans) unarticulated principles. It is hard to see a plausible answer other than that at least the tree structures and the categories of c-command and locality are innately shared by them as part of the I-language that explains that competence. We will consider this fairly deep problem further in §11.1.


There is one point on which there is a crucial difference between Devitt’s and a Chomskyan conception of “intuitions”: on Devitt’s view, intuitive verdicts about a string are understood as meta-linguistic claims about the strings, rather in the way that arithmetic intuitions about numbers might be, to be assessed merely in terms of their truth about the strings or the numbers. For Chomskyans, the verdicts might indeed be true—a certain string might indeed be rightly thought by the speaker to be ungrammatical in her I-language—but what is crucial is not their truth, but their role as evidence of the structure of the inner I-language. This is one of several ways in which linguists’ reliance on linguistic intuitions is different from traditional philosophers’ reliance on intuitions as a source of “self-evident,” a priori knowledge.8 The interest is in facts about the source of the intuition in the speaker’s psychology, not in whether it is correct about the domain that it concerns.

In any case, as we stressed at the outset (§1.1), “Unacceptability reactions” need not be in the least self-conscious or meta-linguistic in the way they typically are for linguists and their reflective students. It would be enough that speakers simply produce some idiosyncratic reactions to various strings: hesitation, perplexity, or just pupillary dilation would suffice as well.9 Linguistic intuitions are taken seriously only if they are specific sorts of causal manifestations of the I-language, and not, as they could be for Devitt, simply intelligent surmises.

For the purposes of this chapter, however, I will largely ignore this difference between truth and mere evidence, simply addressing the claims, challenged by Devitt, that the verdicts could ever be the result of a special route to knowledge, a “Voice of Competence.” That is, I shall be addressing merely the claim: if, as Chomskyans think, intuitive verdicts of native speakers were true and manifestations of an I-language, then that would arguably render them epistemically privileged. Such a claim would be philosophically interesting apart from the issues of a Chomskyan linguistics.

8  This also seems to be a difference between a Chomskyan understanding of the enterprise and Crispin Wright’s (2007: 15, fn18) more traditional philosophical one. Collins (2014: 59) rightly claims that, as evidence of the I-language, the issue of truth need not really arise for intuitions, and in his (2020a) calls attention to the important fact that “many of the relevant intuitions are not of the ‘yes-acceptable/no-unacceptable’ form, as if an informant is a Roman emperor in the Colosseum. The intuitions, rather, concern the possible pairings of interpretations and sentences” (Collins, 2020a: 103), rather as vision theorists might ask subjects about the different ways they can see a figure. But of course someone could nevertheless be mistaken about what is (un)grammatical or interpretable in their own I-language (consider speakers’ mistaken rejections of center-embeddings; cf. §1.4).

9  Thus, Schütze’s (1996/2016: 122) discussion of the complexities of eliciting meta-grammatical judgments from pre-literate peoples, and Devitt’s (2006a: 109, fn20) of Bob Matthews’ difficulties with illiterate Catalan peasants, are of no particular consequence for linguistic theory, although of course explicit and reliable judgments likely make the data easier to collect and understand (but one does wonder in those cases just how hard the informants were pressed, and precisely the full range of features informants were really ignorant of—did they not even recognize ambiguity, irony, or rhyme?). See also Collins (2008b). Curiously, Devitt (2020: 59) claims “hesitation and the like provide evidence from usage not from intuitive judgment”. I am imagining hesitation, perplexity, and pupillary dilation in reaction to a heard or imagined string, which surely reflects on what is intuitively perceived and judged to have been heard (this issue surfaces again in his understanding of “garden path” phenomena; see fn 35 below).
7.1.3  Devitt's Alternative Proposal

Instead of a VoC, Devitt proposes a model like that with which someone might understand the competence of a typist:

    the competent speaker has a ready access to a great deal of linguistic data just as the competent typist has to a great deal of typing data. . . . So she is in a position to have well-based opinions about language by reflecting on these tokens. . . . As a result she is likely to be able to judge in a fairly immediate and unreflective way that a token is grammatical, is ambiguous, does have to co-refer with a certain noun phrase, and so on.

Devitt proposes that:

    Such intuitive opinions are empirical central-processor responses to linguistic phenomena. They have no special authority: although the speaker's competence gives her ready access to data it does not give her Cartesian access to truths about the data.  (Devitt, 2006a: 109)

This, of course, is simply an instance of his Quinean epistemology that we discussed in §4.4. Although he does not distinguish between a plausible "working" and a far less plausible "explanatory" version of that view, it is clear in this passage that he regards typists' knowledge of typing as explained along holistic Quinean lines: their opinions about the processes that go into it have no special authority, no "Cartesian access to truths about the data."10 As Steven Gross (pc) has nicely put it, for Devitt there is no real distinction for typists or speakers between "intuitions" and mere "hunches."

10  A philosopher close to Chomsky's views, Peter Ludlow (2011: 69), defends a similar view, although not in as much detail as Devitt.


However, as we also noted in §4.4, fn 20, there are many privileged avenues of proprietary knowledge that do not seem to be acquired or justified by the methods of empirical science, for example, in the case of knowledge of our own standard sensory states, personal memories, emotions, and many of our propositional attitudes. It is by no means obvious that typists' knowledge of typing does not in fact involve some such special knowledge of afferent–efferent connections in their bodies, not available to a mere external observer. But this is not the place to speculate about typists. Let us allow, for the sake of argument, that knowledge of typing and other skills, such as swimming and bicycle riding, are "Quine-empirical" in the way Devitt suggests, obtained and confirmed only as part of the entirety of our beliefs about the world. What should we say about our knowledge of language, and specifically about our understanding of the "data" of language that we encounter in speech and in queries by linguists?

Devitt describes "the data of language" in a somewhat confusing variety of ways. Early on in his (2006a) he writes:

    What a speaker computes are functions turning sounds into messages in comprehension, and messages into sounds in production.  (Devitt, 2006a: 67, emphasis mine)

But "sounds" are usually understood generically as noises, and, as Devitt (2014: 287) well knows, native speakers do not hear speech in their language as mere noise (indeed, it is virtually impossible for them to do so). So in providing data to linguists, what do they hear them as? Devitt does not say immediately, but a little later he does take speakers' intuitive verdicts to be about "expressions":

    She asks herself whether this expression is something she would say and what she would make of it if someone else said it. Her answer is the datum. Clearly her linguistic competence plays a central role in causing this datum about her behavior. That is its contribution to the judgment . . .  (Devitt, 2006a: 109)

Devitt says nothing explicit about what he takes these "expressions" to be that the speaker asks herself about, but on the next page he alludes to minimal-pair experiments in which "ordinary speakers are asked to say which of two word strings is 'worse'" (2006a: 110). So it would seem that he might regard "expressions" as strings of words. But then what is a "word"? Linguists understand it as a composite of phonological, syntactic, and semantic properties that structure perception, a view that, so far as I can see, Devitt has no reason to contest. What he does contest is that it is representations of these properties in the I-language that are causally responsible for speaker intuitions. What a speaker hears are noises that have those linguistic properties, but whatever representations inform the speaker's syntactic intuitions are

    theory-laden perceptual judgments, reflecting past [central processing] conceptualizations, just as are the intuitions of the [experienced, perceptually acute] paleontologist [and] art expert. . . . The immediate causally relevant background for these linguistic perceptual judgments is thought about the language, not competence in the language. . . . (although, of course, the competence will have provided data for those thoughts).  (Devitt, 2020: 58)

On this view, the speaker hears noises that have properties produced by the underlying competence system—recall, he thinks they are properties of items in an E-language—but she does not hear them by virtue of processing representations of those noises assigned by that system—just as the world provides bones that have properties that people do not automatically perceive, but which they come to consciously classify as having whatever properties they have been taught such things have (we'll consider shortly, in §7.2.3, whether something merely "having" certain properties is enough for a listener to hear it as having them).11

Actually, minimal-pair experiments are also performed with regard to phonemes—for example, bare/pare—and one wonders whether Devitt thinks one hears rhymes and other phonological phenomena. But Devitt sets aside the question "whether language use involves representation of the phonetic and phonological properties of the sounds" (2006a: 222). He assumes that

    this must involve, at least, representations of the phonetic properties, the "physical" properties of the sound, as a result of transduction. Beyond that I shall (wisely) have nothing to say on the vexed question of what else it involves. I suppose, although I shall not argue, that my reasons for doubting that syntactic and semantic properties are represented would carry over to phonological properties.  (Devitt, 2006a: 222–3)

11  Devitt identifies still other phenomena as "the data": "On the received linguistic view, the competence supplies information about those properties. On the modest view I am urging, it supplies behavioral data for a central-processor judgment about those properties" (Devitt, 2006a: 110; emphasis mine). Then in his (2020a: 53) he writes: "'data' is to be understood on the model of 'primary linguistic data': the data are linguistic expressions (and the experiences of using them)" (referencing his 2010a: 254, and 2010b: 835, fn4). But a few lines later, he narrows them down: "The datum is the experience of trying to understand the string" (2020a: 54, emphasis mine). This is all pretty confusing, especially since "primary linguistic data" is Chomsky's (1965) term for the (relatively limited) speech that leads a child to acquire a grammar, which I do not think he regards as either "answers to questions," "behavioral data," or as "experiences of trying" to do anything.

However, although he rightly notes deep disagreements in the foundations of phonology (2006a: 222, fn25), Devitt mentions none that cast doubt on whether speech is heard as the utterance of a sentence, a fact that he (2014a: 286) claims he is "not resisting."

In the above passage, Devitt does seem to allow that "phonetic" properties are represented, but "phonological" ones are not. To be sure, there is a difference between them. To a first approximation, phonology has to do with the character of what speakers "hear as their language" and what is represented and processed by the phonological component of their I-language. Phonetics concerns the sounds and/or actual articulatory gestures that produce them. As we will discuss further in §9.2, the relation between the two is complex, the phonology being an abstract, discrete, categorical, often highly idealized version of the usually quite messy phonetics of continuously varying gestures and sound waves. Although Devitt is not explicit on the issue, the above passage suggests that what he takes speakers to hear are the "phonetic properties," which he regards as the "'physical properties' of the sound," and which I will characterize as "phonetic noise."

After all of these sundry, somewhat casual remarks, Devitt does add a more explicit statement of what he regards as the data of language, in order to distinguish his claim from the perceptual analogy of the VoC:

    the task of comprehension . . . is to deliver information to the central processor of what is said, information that is the immediate and main basis for judging what is said, for judging "the message." So, intuitions about what the message is are analogous to intuitions about what is seen. But the former intuitions are not the ones that concern us: for, they are not intuitions about the syntactic and semantic properties of expressions.  (Devitt, 2006a: 112, emphasis mine)


In light of this last and the other claims above, it would seem that Devitt's view is best characterized as:12

    (Dev)  Spontaneous linguistic intuitions, like intuitive judgments in general, are empirical theory-laden central processor responses to the phonetic noises that have linguistic properties and convey messages of speech.

And this claim certainly contrasts with the Voice of Competence view:

    (VoC)  Spontaneous linguistic intuitions sometimes at least have a special status due to their being caused in a relatively direct way by representations of phonological and syntactic properties of the grammar (via the parser).

A number of qualifications must be mentioned:

(i) Devitt is, of course, well aware of the complexities of distinguishing "semantic" from "pragmatic" elements of linguistic "messages," as in the case of (for starters) indexicals and lexical polysemy and ambiguity, some of which he discusses in his (2006a: §11.8). I shall assume for the sake of argument that what Devitt intends here is some sort of truth-valuable message, not the more anemic, what might be called "proto-semantic" material which some Chomskyans (e.g., Pietroski, 2003) have in mind when they discuss the "meaning" properties relevant to linguistics, and to which Devitt makes no reference (we will discuss this view further in due course; see §10.4).

(ii) The "relatively direct way" in (VoC) is, of course, a serious hedge, needed both to allow for various intermediate mechanisms of attention, memory, and sensitivity to relevance needed for any judgments, and to rule out a worry that arises for any such causal proposal, the problem of (as they are called) "deviant causal chains." Thus, the fact that a person's I-language generates a certain string may be part of the causal explanation of why a linguist believes it does, which leads him to tell me it does, and so results in this—but let us assume no other—way in my claiming that the string is acceptable—with the result that there would be no distinction between (VoC) and (Dev)! I presume there will be a way to rule out deviant chains in an ultimately detailed theory, and so I will take "a relatively direct way" as read.

(iii) One should note also what Anna Drożdżowicz (2015: 105) has emphasized, that what appears in perception are quite likely features of the grammar to the extent that they have been processed by a parser and other performance systems, whose outputs may not correspond to the mappings of sound and meaning effected by the grammar. This leads her quite rightly to suggest that the apter phrase would be "Voice of Performance." For simplicity, I shall stick to Devitt's phrase, acknowledging, however, the mediation of these other systems between UG and comprehension.

(VoC) is not committed to denying that all manner of information might well enter into intuitive reports (even the theoretical biases of professional linguists, cf. Schütze, 1996!), only that, amongst all that information, there are typically structural descriptions ("SDs") that have their source relatively directly in the grammar. When they do, they are especially interesting to the linguist insofar as they and other responses may relatively directly reflect the constraints the grammar imposes on the parser.

12  I have re-labeled what he labeled "(ME)". In light of Devitt's (2020a: 61, fn15) complaints of my having in earlier articles misunderstood parts of his discussion, this statement of (Dev) differs slightly from the statement in Rey (2020a), but not, I believe, in ways that affect the ensuing points. I use "noise" instead of the (to my ear) more ambiguous "sound" in order to stress the contrast. I will refine both it and (VoC) further in §7.3.

7.2  Parsing and Perception

What I and others have defended elsewhere, and I will defend in more detail here, is the claim that many linguistic intuitions have the same status as standard reports of perceptual experience in vision experiments.13 Insofar as they are to be taken as evidence of linguistic and perceptual processing, both linguistic intuitions and perceptual reports are presumed to be fairly directly caused by auditory or visual processes, operating under their specific constraints. In the case of language, these processes produce SDs of at least some phonological and syntactic properties of various SLEs (e.g., words and phrases, such as NPs, VPs, PPs), and the intuitions are reliable evidence insofar as those descriptions do in fact play a distinctive role in their production.14

I presume that such perceptual systems are at least modestly modular: the processes are fast, mandatory, and at least highly resistant to revision by central information. I prescind from the debate whether they are fully "informationally encapsulated" along the lines of Fodor (1983). It is enough for our purposes that many of the representations of the module are what I shall call "informationally resistant," as, for example, in the case of most visual illusions and at least many of the WhyNots.15 Recalling examples (re-numbered) from §1.3, no matter how much contextual information you have, it is virtually impossible not to balk at

(1)  *Who did Mary go with Bill and __ to the movies last week? (cf. Mary went with Bill and who to the movies last week?)

(2)  *Who did Susan ask why Sam was waiting for __? (cf. Susan asked why Sam was waiting for who?)

or to hear the last himself as referring to the self-centered John:

(3)  *John1 was always concerned with himself1. He always talked about himself1, would constantly praise himself1 in public, and earnestly hoped the boss liked himself1.

In addition, of course, one can't hear speech in one's native language as noise, no matter how much one realizes that is what is being produced by it.

13  See Rey (2008). Against my protestations, Devitt (2014: 282) claims this view originates with me. I still want to insist that it was to my mind first suggested by Jerry Fodor in his seminar at MIT in 1974 in which he presented the ms. of his (1975), and certainly in his (1983). Nick Allott has drawn my attention to a nice statement of it in Neil Smith (1999/2004): "There is no difference in principle between our linguistic judgment about our sentences of our native language and our visual judgment about illusions such as the Müller-Lyer arrows" (N. Smith, 1999: 29). And Smith and Allott (2016: 139) explicitly deploy the analogy against Devitt, as do Barry Smith (2006) and Mark Textor (2009). Textor's plausible proposals of "linguistic seemings," though, seem to me to need the distinctive etiology in addition to the phenomenology, since non-native speakers might come to enjoy the phenomenology without it being the kind of reflection of the I-language that Chomskyans are after (and note there is no reason for linguists to rely on speakers' phenomenology: mere processing difficulties would suffice; cf. §7.3.3 below).

14  There is no need, of course, to claim that all phonological and syntactic properties are represented in perception. There may well be representations of features internal to syntactic and phonological computation that do not surface in the output of the respective computational system. It is enough that there seem to be representations of many phrasal features, e.g., VPs, NPs, and the tree-structures in which they appear.

15  Devitt (2014: 4) claims there is cognitive penetration in the case of language, since "the intuitions of linguists could reflect their theoretical bias," and he cites an example to this effect from Schütze (1996). However, a few biased intuitions do not show penetration of a perceptual module, since it goes without saying that intuitions may sometimes have causes other than a nevertheless encapsulated module, with language, as with vision. The question is whether the vast majority of the robust reactions to, e.g., WhyNots, on which linguists rely are likely to be so caused. As we noted at the start in Chapter 1 (§1.1, fn10), there is now quite substantial evidence of a correspondence of roughly 95% between the linguists' and non-linguists' responses, the majority of which—such as the examples we will shortly consider—do seem highly resistant to cognitive penetration. See also the experimental examples of syntax trumping semantics and contextual knowledge in §7.3 below.


In discussing linguistic perception per se, a number of issues need to be sorted out.

7.2.1  Linguistic Perception as Parsing

A major issue in linguistic perception is the relation between a grammar and a parser. How exactly the distinction should be drawn is controversial, but I shall be adopting the weak view that they are at least conceptually different entities.16 Thus, according to one recent textbook:

    [T]he hearer uses knowledge of language and information in the acoustic signal to reconstruct a phonological representation that is then used to retrieve a set of lexical items from the internalized lexicon. Identifying the syntactic relations between the perceived set of words is the essential next step, which eventually leads to recovering the basic meaning the speaker intended. Reconstructing the structure of a sentence . . . is a job undertaken by the structural processor, or parser. . . . We think that hearers (and readers) systematically compute syntactic structure while processing sentences.  (Fernández and Cairns, 2011: 204–5, emphasis original)

Of course, the grammar and the parser are not totally unrelated:

    The grammar constrains the parser's structural analyses (for example, the parser will not build a parse that produces an ungrammatical sentence). However, the grammar does not have preferences about structural ambiguities, nor does it contain information about the resources necessary to process particular sentences. The grammar is part of the hearer's linguistic competence, while the parser is a component of linguistic performance.  (Fernández and Cairns, 2011: 214)17

Again, as Drożdżowicz (2015: 105) has emphasized, what appears in perception are quite likely to be features of the grammar to the extent that they have been processed by a parser and other performance processes, whose outputs may not always correspond to the mappings of sound and meaning effected by the grammar. In any case, I shall assume as a tentative default position that phonological and syntactic parsing are involved in linguistic perception. As Chomsky and Halle (1968) propose:

    what the hearer 'hears' is what is internally generated by the rules. That is, he will "hear" the phonetic shape determined by the postulated syntactic structure and the internalized rules.  (Chomsky and Halle, 1968: 24)

16  See Momma and Phillips (2018), who argue that they are in fact one system, simply accessed in different ways under different time constraints (see §3.3.3, fn 17 above).

17  There seem to be some exceptions to the parser not allowing ungrammatical strings, e.g., *The man the girl the cat scratched died (cf. §1.4 above). But it is clear that these are few and far between, and that the grammar provides the structures and categories in which the parser generally traffics.

Thus, speakers bring to bear phonological and syntactic knowledge in disambiguating a sound that they can hear as "a name" or "an aim." In which case, here is one way a VoC story might go: assume for the present discussion that there are automatic, at least informationally resistant perceptual modules, and specifically one for language which "deliver[s] representations which specify, for example, morphemic constituency, syntactic structure and logical form" (Fodor, 1983: 93).18 No matter how much contextual information is supplied, normal English speakers will balk at the "movement" of Who from its "echo" location in our examples (1)–(2) above.

Thus, just as visual SDs are the output of a visual module, linguistic SDs provide the output from a language module that is input to (let us suppose) a central processing system, which then processes them in combination with other representations (e.g., of the context of the utterance) to produce more or less spontaneous verdicts on what has been said (or, in the case of vision, seen).19 Attention can then be drawn to different aspects of the SD, say, by their being "highlighted," that is, computationally enhanced, or sent to special addresses for further processing.20 And the naive subject simply responds, in both the language and vision cases, with a report caused by these highlighted SDs—for example, how the stimulus sounds to them, what were the words and phrases used, what "co-refers" with what, what was "the message" expressed—just as in a vision experiment she might report on how a stimulus looks and/or what worldly things she took herself to see. These "intuitive" reports are then evidence for the principles obeyed by the respective linguistic or visual faculty insofar as those principles afford the best explanation of, inter alia, the respective SDs.

18  I leave aside the role of input from visual signing and reading of orthographic representations, since this adds a layer of complication additional to the usual auditory input that is the chief concern of the linguist (see §1.1 above).

19  C. Wright (1989) and Higginbotham (1991) raise some curious qualms about this sort of representational story. Higginbotham (1991: 557) seems to endorse a claim from Wright (1989: 258) that we mentioned earlier (§3.5, fn 39), writing that "the very existence of first-person authority seems to undermine the conception of language as involving the unconscious manipulation of representations, unless languages are seen as something like Lockean secondary properties" (1991: 557). Now, languages may well be instances of secondary properties (see §§9.1–9.3 below), but not because they involve internal manipulations of representations, which are, plausibly, the only way first-person authority about them could be explained (I might well know that by "rabbit" I mean rabbit, merely by disquotation; cf. Burge, 1988). Higginbotham does ask, rhetorically, about such a mechanistic picture, "How should I know what my language machine is doing?" (1991: 564), to which the answer is: knowledge of parses does not require theoretical knowledge about parsers; it is enough simply that the parser is causally responsible for the representations of the parses themselves.

20  By "attention" here I mean "top down" deliberate attention, not automatic bottom-up attention that likely also occurs in young children. I shall also assume what Devitt (2014: 284) regards as an "on-line" version of all of this processing, although just which features are salient on-line may be a matter of the focus of attention. It will obviate these and other complications Devitt raises simply to assume that "the outputs of a module" are all also inputs to the central processor.


For a fuller example, consider an utterance of

(4)  John hopes Bill will help himself.

This could be represented by the auditory parser by an SD that would include something like the following as a specification of its syntax:

    IP
    ├── NP: John1
    └── I′
        ├── I: -s
        └── VP
            ├── V: hope
            └── CP
                ├── C: that
                └── IP
                    ├── NP: Bill2
                    └── I′
                        ├── I: will
                        └── VP
                            ├── V: help
                            └── NP: himself2


There might be similar specifications of its phonology; for example, an utterance of (4) might be represented phonologically as:

(4a)  /dʒɑn hoʊps ðət bɪl wɪl help hɪmself/

and there might also be semantic and pragmatic ones.21 As we noted earlier, the indices on John, Bill, and himself capture the fact that a speaker would spontaneously have the "intuition" that "himself" must co-refer with "Bill," not "John." Hearers would respond with whatever overt vocabulary they have available, for example, "No, himself can't be John, but has to be Bill."

Sometimes, of course, an utterance might be ambiguous, either phonologically (this guy/the sky), syntactically (Flying planes can be dangerous), or proto-semantically (John hopes he will win), just as the visual system seems to produce alternative SDs of a standard Necker cube diagram (see Pylyshyn, 2006). Both systems issue in what seem to be perceptual intuitions of ambiguity, structural relations (what's "part of" and/or "sub-ordinated" to what) and even anomaly (ungrammatical strings and "impossible," Escheresque figures; see §1.3, figure 1.1, and the more extensive discussion of evidence in §7.3 below).

Insofar as they are to be taken as evidence of linguistic and perceptual processing, both linguistic intuitions and perceptual reports are presumed to be fairly directly caused by representations that are the output of, respectively, a language faculty and a visual module. In the case of vision, the faculty produces SDs of visual properties (e.g., shape and color; part/whole relations); in the case of language, the faculty produces SDs of phonological, syntactic, and proto-semantic properties of various linguistic objects (words, phrases, sentences), and these SDs are in turn the basis for subjects' intuitive reactions and verdicts.

Although Devitt (2006a: 114) rejects the analogy with vision,22 he recently allows that "syntactic intuitions are examples of . . . perceptual intuitions" (Devitt, 2020: 58). As we noted earlier (§7.1.3), he claims that they are:

21  As I said above, I presume these last figure in what Devitt means by "the message." Devitt's own views on parsing are actually hard to make out; see §7.1.3 above.

22  Devitt rightly claims that not all representations inside the modules are accessible to the CP. However, on the proposed model, it is only the outputs of both visual and parsing modules that the CP can access, which may be a subset of the SDs the module employs. According to VoC, it is on those outputs that "perceptual intuitions" are based. In Rey (2008) I discuss his misunderstandings of the literature in this regard.


    theory-laden perceptual judgments, reflecting past CP conceptualizations, just as are the intuitions of the [experienced, perceptually acute] paleontologist [and] art expert . . .  (Devitt, 2020: 58)

However:

    the immediate causally relevant background for these linguistic perceptual judgments is thought about the language, not competence in the language.  (Devitt, 2020: 58)

And so he concludes that there is "nothing to change my negative view of the [vision] analogy" (Devitt, 2020a: 58). But it is precisely the character of the data provided by competence that is the issue between (VoC) and (Dev): is it merely the phonetic noise and the "message," or does it include competence-specified information about phonology and syntax?

The comparison with the paleontologist identifying a bone, and an art expert a statue (Devitt, 2020a: 55), is inapt, since the information that something is a bone or a statue does need to be assiduously acquired by the CP from experience. But the information relevant to (VoC) is emphatically not the sort acquired by assiduous study. The central point of Fodor's (1983) classic monograph on the "modularity of mind" was that speech perception was modular insofar as it was, inter alia:

    mandatory, domain-specific, innately specified, hardwired, autonomous, not assembled, and with shallow inputs, limited central access and characteristic breakdowns . . . fast . . . [and] informationally encapsulated.  (Fodor, 1983: 37, 61, 64)

Most of these features are decidedly unlike the perceptions of a paleontologist or art historian, who can freely perceive bones or art without learned categories. Indeed, it is precisely because they suspect intuitions of native speakers are not like the learned intuitions of paleontologists that linguists prize them, as opposed to the intuitions of non-native, sophisticated linguists who may in fact explicitly know far more about the language. (Similarly, a vision theorist does not want judgments of "how things look" that are based upon a fellow vision theorist's theoretical ideas about how things should look!) Of course, the consciously labeled conceptual categories of school grammar need to be acquired by study. But one of the most central claims of a Chomskyan linguistics is that the child has ready access to many of the categories without any such study or conscious labeling. How is this possible?

7.2.2  Non-Conceptual Content Again: NCSDs

What I suspect is at the heart of Devitt's rejection of (VoC) are what he regards as four further problems he (2014: 285) raises for a perceptual version of it (example re-numbered as (4) above):

(A) Ordinary hearers understanding (4) have no conscious awareness of its SD or of any inference from the SD to a translation of (4).

(B) Given that it takes a few classes in syntax even to understand an SD, it is hard to see how ordinary hearers could use it as a premise even if they had access to it.

(C) The speed of language processing is, as Fodor emphasized, "mind-boggling": "the recovery of semantic content from a spoken sentence" is "very close to achieving the theoretical limit" (1983: 61). Given, as Fodor also noted, "the relative slowness of paradigmatic central processes" (p. 63), it is unlikely that such a significant part of understanding as moving from an SD to translation is a central process.

(D) Suppose that an SD of (4) was made available to the central processor and the hearer based her intuitive linguistic judgments on the rich information it contained. How come then she does not have the intuition that, say, in (4), "John" c-commands "himself"? If her competence speaks to her in this way, "how come it says so little?" (2006a: 101) . . . What are we to make of these apparently arbitrary restrictions?  (Devitt, 2014: 285)

Devitt (2006a: 210–20) compares theories of language with theories of skills, such as catching a ball or playing a piano, concluding that the literature of these latter theories "should make us doubt that language use involves representing syntactic and semantic properties; it makes such a view of language use seem too intellectualistic" (2006a: 221).

The first two of these objections we addressed in §4.3, where we pointed out that all that is required in a Chomskyan account are non-conceptual structural descriptions of linguistic items. The appearance of over-intellectualism is due only to presuming that the SDs available to perception are conceptual, and, as we noted in §4.3, it seems pretty clear that Chomskyans—insofar as they are attributing content at all; cf. §8.4—are attributing non-conceptual content to the states of the I-language and associated systems, such as parsing. The relevant SDs issuing from the language faculty to a central processor are likely to be the results of a modularized processing involving non-conceptual ones—what I called "NCSDs"—and it is therefore not surprising that there is no immediately conscious awareness of them as NCSDs with the kind of conceptual contents self-consciously entertained by the linguist. Of course, one can learn that certain conceptual/lexical categories apply to one's experience, and in this way perhaps come to notice the distinctions one learns in "grammar school" or by taking linguistics classes.23 All that may be available to the naive hearer is awareness of some or other distinctions that the linguistic NCSDs are marking, just as all that is available to a non-geometer looking at the Mach figures are distinctions marked by non-conceptual visual ones. The NCSDs structure how things look and sound; they themselves are not the "things" we seem to see or hear—although, of course, they likely are the basis on which we consciously report our intuitions when we do deploy meta-linguistic concepts (as I think can't be stressed enough, there is more to phenomenology than phenomenal qualities).24 And, again, the importance for the Chomskyan of such reports is not necessarily their truth, but simply the evidence they provide, via the NCSDs, of the underlying grammar.

As for (C), the issue of speed: as I have indicated, much has yet to be understood about the non-conceptual character of the NCSDs of the (proto-)semantic content of I-language expressions; but it is hard to see why there should in principle be any difficulty presented by the speed of the mapping from those SDs to concepts. If, for example, Pietroski's (2010) proposals turn out to be correct, and the (proto-)semantic content of that output consists of "instructions to fetch concepts," perhaps as a result of language acquisition in general, those concepts might be fetched and assembled into a full "message" fairly automatically.
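Purely by way of illustration of how computationally cheap such "fetching" might be, here is a minimal toy sketch in Python. It is my own invention, not Pietroski's formalism or any actual psycholinguistic model: the concept store, the lexical items, and the flat assembly are all made up for the purpose.

    # A toy model of "instructions to fetch concepts" (my illustration,
    # not Pietroski's own formalism): each lexical item in an SD is an
    # instruction to retrieve a stored concept.
    CONCEPT_STORE = {
        "John": "JOHN", "hope": "HOPE", "Bill": "BILL",
        "help": "HELP", "himself": "SELF",
    }

    def fetch(lexical_item: str) -> str:
        # A constant-time dictionary lookup: "fetching" is fast.
        return CONCEPT_STORE[lexical_item]

    def assemble(sd_terminals: list) -> list:
        # Trivially flat assembly; real composition would of course be
        # sensitive to the SD's tree structure, not just its terminals.
        return [fetch(item) for item in sd_terminals]

    print(assemble(["John", "hope", "Bill", "help", "himself"]))
    # -> ['JOHN', 'HOPE', 'BILL', 'HELP', 'SELF']

Nothing here is a serious model; the point is only that a lookup-and-assemble architecture is the kind of thing that could run as fast as objection (C) requires.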

With regard to (D): it is a matter of subtle empirical fact just what information NCSDs supply, but it would not be surprising that they might supply NCSDs of syntactic categories occurring in a tree, but not of all relations in the tree between categories. On the other hand, the fact that speakers seem automatically to respect the principles regarding anaphora and negative polarity items does suggest that relations such as c-command at least structure their perception, even if it is not an introspectively available "phenomenal item." Thus, Devitt really should not assume that c-command is quite as—non-conceptually—unavailable to ordinary speakers as he supposes.

23  Devitt's preoccupation with such conceptual meta-linguistic judgments surfaces in many passages, e.g., his (2006a: 91), (2014: 2), and throughout his (2020a). If this were his only target, it would of course be a strawman.

24  Devitt (2020: 63) claims "speakers are not aware of [NCSDs] and would not understand them if they were" and wonders "if SDs are not conceptual how could they provide the content of intuitions that are conceptual" (2020: 63). Of course, speakers are likely not aware of NCSDs as NCSDs. But the point is that the NCSDs inform and structure their experiencing of speech just as they do of the Mach diamond. Conceptual judgments can then be made on the basis of the NCSDs involved in that experience.

7.2.3  Having vs. Representing Properties

Devitt (2006a: 235; see also 32, fn25) initially seems to regard it as "uncontroversial" that "parsing assigns a syntactic analysis to the sentence." To be sure, this does not imply, as in an early model proposed by Frazier and J.D. Fodor (1978), that the principles or conditions of well-formedness by which it does so are represented (cf. §7.1.1 above). However, he continues:

    Even if [the conditions of well-formedness] need not be represented, we may wonder whether, on [the Frazier–Fodor] theory, the syntactic properties of the input have to be.  (Devitt, 2006a: 239)

In response, Devitt advances his "fourth tentative proposal" that

    speedy automatic language processes . . . do not operate on metalinguistic representations of the syntactic and semantic properties of linguistic expressions.  (Devitt, 2006a: 276)

which suggests that either speedy language perception does not involve parsing, or that parsing does not involve meta-linguistic representation (representations of SLEs). I think what Devitt has in mind here would be regarded by many (psycho)linguists as a quite surprising possibility. A common conception of parsing that many might take to be virtually definitional is that it is a process that delivers a description of various syntactic properties of an SLE.25 Devitt wants to resist this conception. He does so at two levels.

25  For examples of roughly the kind of output a mental parser is thought to provide, see the Stanford Parser, nlp.standford.edu:8080/parser/ (where you can type in your own examples!). Aside from issues about the specific focus of Chomskyan research, there is no reason not to include in "parsing" representations of phonological and semantic features in addition to purely syntactic ones.
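To give a concrete sense of the "common conception" just mentioned, a constituency parser of the sort cited in fn 25 typically returns a labeled bracketing in roughly the Penn Treebank style. The sample below is illustrative only (I have not run it through the Stanford Parser itself, and labels and detail vary across parsers and versions), showing what such a description of (4) might look like:

    (ROOT
      (S (NP (NNP John))
         (VP (VBZ hopes)
             (SBAR (S (NP (NNP Bill))
                      (VP (MD will)
                          (VP (VB help)
                              (NP (PRP himself)))))))))

It is precisely this sort of explicit description of syntactic properties that Devitt's proposal must do without.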


He proposes as a "tentative" radical hypothesis that "speedy automatic language processes are fairly brute-causal associationist ones that do not operate on SDs" at all (2014: 280; see also 2006a: 220–43). However, although he claims this view is "better supported than the widespread view" to the contrary, he (wisely) wants to "rest nothing on that tentative view" in his criticism of the VoC (2014: 280). For the sake of the argument, he will allow that both the language and the vision processors operate on SDs, but that these SDs are not what is sent to the central processor that is responsible for linguistic intuitions:

    My discussion does not confine the level of representations deployed by those systems at all. Indeed, I clearly accept that the operations of those systems may involve all sorts of representations (2006a: 114). My claim is simply about what representations those systems deliver to the central processor . . . what information does the [language] system as a whole ultimately pass on to the central part of the mind that makes intuitive judgments?  (Devitt, 2014: 281–2, emphasis original)

Devitt goes on to claim that what is delivered to the central processor is something that has linguistic properties but does not specify or represent them. Referring to an utterance of sentence (4)

(4)  John hopes Bill will help himself.

and its SDs that we discussed in §7.2.1, he claims:

    As a result of what the system . . . delivers to the central processor, the mental representation involved in the hearer's final understanding of the utterance will have something like those syntactic and semantic properties, and be a rough translation of [(4)]. But this is not to say that it has those properties because the system delivers an SD that describes those properties.  (Devitt, 2014: 284; example renumbered; see also p. 287)

That is, for Devitt, it seems the natural language sentences themselves are actually entokened at the interface between the language module and the central processor.

There are quite a number of problems with this proposal. First, from the fact that some item has a certain property, it of course does not follow that anything treats it as having it. In general, neural states have a multitude of physio-chemical properties—not to mention a potentially infinite number of relational ones (e.g., being n miles from Paris)—but these properties are not thereby incorporated into a mind, and the states are not treated as having those properties by any system in the mind, central or otherwise. They are certainly not perceived as having those properties. For that to happen, one would think the properties had better be represented, that is, made available to presumably computational processes of, for example, recognition, comparison, and memory, at least some of them centrally.26

Indeed, as we will discuss in §11.1 against a similar proposal of David Adger (2019), there is a general explanatory problem in psychology to account for how any creature in a universe of "local" physical interactions can in general be sensitive to non-local, relational, or non-physical (what I call "abstruse") properties. A computational-representational theory provides a strategy. It is not logically necessary; other strategies can sometimes work. There could be "surrogate" local properties, the way a person may be identified by a fingerprint, or the Big Bang by the spectra of local white noise. But there is a burden on the defender of such a claim to at least suggest where such a surrogate is plausibly to be found, and how exactly it would work—and not be representational. A quite natural computational strategy suggested by the work of many writers, both in visual and linguistic perception, is that it is some sort of probabilistic (e.g., Bayesian) one, something's being detected as a certain item as a function of the prior probabilities and likelihoods of its being so (as in Chomsky and Halle, 1968: 24, quoted above; see also §11.2.2 below). All that is important for now is to note that, if that is the strategy, then it perforce deals in representations, particularly the locally specifiable representations of the categories to which probabilities are attached.

26  It's at this point that Devitt raises the four problems about the role of SDs in perception that we discussed in §7.2.2. As emphasized there, it is entirely open to suppose that SDs are non-conceptual NCSDs. Note that the point doesn't depend upon settling the further issue of whether the representations are "classical" data-structures, or the "distributed representations" of neural nets. The issue is representation at all, not its specific character or the computations upon them.
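To make the point concrete, here is a minimal Bayesian sketch of my own; the categories, priors, and Gaussian likelihoods are invented for illustration, and no particular psycholinguistic model is being reported. A detector deciding whether an acoustic token is /b/ or /p/ from its voice-onset time (VOT) cannot even be written down without explicit stand-ins for the phonological categories to which the priors and likelihoods attach:

    import math

    # Hypothetical phonological categories, explicitly represented as data:
    # each carries a prior and a Gaussian likelihood over voice-onset time (ms).
    CATEGORIES = {
        "/b/": {"prior": 0.5, "mean_vot": 5.0,  "sd": 12.0},
        "/p/": {"prior": 0.5, "mean_vot": 50.0, "sd": 12.0},
    }

    def gaussian(x, mean, sd):
        # Density of x under a normal distribution.
        return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

    def detect(vot_ms):
        # Posterior probability of each category given an observed VOT.
        # The priors and likelihoods attach to *representations* of the
        # categories ("/b/", "/p/"), not to the raw signal itself.
        unnormalized = {
            cat: p["prior"] * gaussian(vot_ms, p["mean_vot"], p["sd"])
            for cat, p in CATEGORIES.items()
        }
        total = sum(unnormalized.values())
        return {cat: v / total for cat, v in unnormalized.items()}

    print(detect(10.0))  # heavily favors "/b/"
    print(detect(45.0))  # heavily favors "/p/"

However the real story goes, any detector of this shape traffics in locally specifiable representations of the categories it detects, which is just the point in the text.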


In his recent reply to this issue, Devitt (2020) agrees with all of this, but thinks it is beside his point:

    For, my proposal is that the properties are incorporated in the mind: they are incorporated in the sub-central processing system that delivers to the CP the representation that has the properties. And I'm going along with the view that these properties are indeed represented by SDs which are available to computational processes in that system. . . . [However], we have no reason to believe that, in thus hearing the sentence, the CP thereby has access to the SDs and hence to the informational basis for intuitive judgments about its syntax. "Hearing an utterance in a certain way is one thing, judging that it has certain properties, another."  (Devitt, 2020a: 65)

But, again, how does a representation merely having a certain property explain a speaker's hearing it as having that property, especially since it has an indefinite number of properties that a speaker does not consciously hear? To be sure, hearings by themselves are not the judgings based upon them. But the issue is whether the relevant hearings can occur without representings.

Perhaps what Devitt has in mind at this point is another of his (quite) tentative hypotheses, that there is a "language of thought" ("Mentalese"), and that there is at least "a great deal of similarity between the syntactic rules of Mentalese and the speaker's public language" (Devitt, 2006a: 149). However, even if Devitt could make clear how an often highly abbreviated, largely communicative vehicle like English could serve the strictly formal, fully explicit demands of an internal code, there would still be the problem of how some of its syntactic—and phonological?—properties became objects of perception. How could the central processor recognize a word, a noun, a verb, an NP, VP, IP, or PRO?

Devitt (2006a: 225) acknowledges that, in view of the relational natures of linguistic properties, "recognizing a word as having [a syntactic] property may not always be an easy matter," curiously adding, however, that "it mostly is," referring us to his treatment of the issue some pages earlier in his §10.6. But there he merely observes that it is quite easy to tell many English adverbs by the fact that they end in "ly" (2006a: 185), adding:

    It can also be easy to tell that an object has a certain relational property if learning to identify the object involves learning to identify it as having that property . . . identification comes with word recognition. One way or another, it is quite easy to tell the explicit structural properties of utterances, although sometimes hard to tell the implicit ones.  (Devitt, 2006a: 185–6, emphasis mine)

"Hard to tell" is a staggering understatement.27 Most linguistic properties can decidedly not be locally identified. They are enormously complex relational ones involving abstract issues about their role in the grammar as a whole.

27  Even his example of adverbs does not work: the "-ly" suffix is a fairly good indicator of "how" adverbs, but not of, e.g., "when" (yesterday) or "where" (upstairs) ones; and plenty of words ending in -ly are not adverbs, e.g., fly, belly, sly, ply.
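A toy sketch makes fn 27's point vivid (my illustration, nothing Devitt offers): a purely local, surface test for adverbhood both over- and under-generates.

    # A toy, purely "local" adverb detector: adverb iff the word ends in "-ly".
    def looks_like_adverb(word: str) -> bool:
        return word.lower().endswith("ly")

    # Under-generation: adverbs with no "-ly" suffix.
    assert not any(looks_like_adverb(w) for w in ["yesterday", "upstairs", "well"])

    # Over-generation: "-ly" words that are not adverbs.
    assert all(looks_like_adverb(w) for w in ["fly", "belly", "sly", "ply"])

And whether a given token of "well" is an adverb is exactly the kind of relational fact, turning on its role in the whole utterance, that no such local test can settle.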


Whether certain material is the main or a nested verb, or whether an NPI stands in the correct c-command relation to a licensor, or just whether "well" is an adverb, adjective, interjection, noun, or verb—these are not facts that are easily read off from anything but fairly complex computations on—what? It is hard to imagine any other plausible candidate than representations of the syntactic properties of an utterance, to which probabilities might be assigned by statistical computations, and which are made available to the central processor in fully "hearing" language.28

Lastly: at this point we can no longer set aside Devitt's (2006a: 222–3) odd disregard of phonological and phonetic properties. Surely these—[+sonorant], [+nasal]—are not entokened anywhere in the brain, module, central processor, or otherwise! (I set aside use/mention confusions/collapses for examination in §9.8.) But then how are they perceived or comprehended by the central processor without being represented? And if they need to be represented, why should other linguistic properties not be as well, their representations attached to the phonological ones? Indeed, how could the whole speech episode be integrated into a single perceptual object if they were not? Speaking at least phenomenologically, one seems to hear a single utterance as having at least some phonological, morphological, syntactic, and (proto-)semantic properties all together.

In sum: even if Devitt restricts his "brute-causal" proposal to only the items relayed from the language system to the central processor, he has yet to show how the needed linguistic processes and perception could be accomplished by neural items merely "having," as opposed to representing, linguistic properties—or that any serious theorist thinks it can.29

28  Curiously, Devitt (2006a) cites a number of authors who he seems to think are explaining how linguistic properties could play a role in the mind without being represented. Thus, he claims that in Vigliocco and Vinson's (2003) appeal to "procedural knowledge" in the sentence, The ferocious dog bites the boy, "there seems to be nothing that requires the representation of syntactic or semantic properties" (2006a: 231). What is curious is that their proposal in fact—in the very passages Devitt quotes!—explicitly relies on "representations":

    Assuming that the speaker wants to express the event 'The ferocious dog bites the boy', the stored meaning-based representations for 'ferocious', 'dog', 'to bite', and 'boy' are retrieved. These representations would further specify that 'dog' and 'boy' are nouns . . .  (Vigliocco and Vinson, 2003: 183, emphasis mine)

29  As I mentioned at the beginning of this chapter, ironically enough, something very like Devitt's view can seem to coincide with a position of Chomsky (2000, 2003), which is, I'll argue (§9.8), due to careless, but systematic use/mention confusions (see §9.6). I hesitate to attribute these to Devitt, though the move from representing to having properties, of course, can invite them.


7.2.4  How Would NCSDs Help?

In his final "serious" objection to (VoC), Devitt (2014) argues that, even if NCSDs were available to the central processor, they still would not sustain a VoC:

    Suppose that the language system did deliver a partial SD to the central processor, how would the SD's information "fairly directly cause" the intuitions that are the concern of VoC?  (Devitt, 2014: 287)

He considers the two possibilities of dealing with an ungrammatical string: (i) that the system provides to the central processor an SD of such a string, or (ii) that it does not—it gags or "crashes." He continues:

    If (i), then that SD would not directly cause intuitions of ungrammaticality. For, that SD does not come with a sign saying "ungrammatical." To judge that the SD is of an ungrammatical string, the subject would have to apply her theoretical knowledge of the language to the SD. That's [(Dev)], not (VoC). If (ii), then information provided by SDs would have nothing to do with a subject's grammaticality intuitions. Rather, the presence or absence of the SD would be the data for the central processor's response. So, not (VoC) again.  (Devitt, 2014: 287, substituting "(Dev)" for his "(ME)"; see fn 12 above)

But all this misconstrues the VoC proposal. Of course, it is no more likely that the parser adjoins "ungrammatical" to an NCSD of an ungrammatical string than that the visual system adjoins "impossible" to NCSDs of impossible visual figures (such as the Penrose triangle; see §1.3 above). It is enough that the perceiver in one way or another detects a difficulty in dealing with the material, which, of course, it can perceive phonologically, and in some phrasal parts, even if not as a whole sentence. After all, Chomskyans do not rely merely on up-and-down judgments of acceptability, but on any number of further things a hearer might say, do, or just manifest that serve as evidence of the operations and constraints of the I-language.

In any case, if the NCSDs were delivered to the central system by the parser, then there is a simple answer to his question about how they would "fairly directly cause" VoC intuitions: they would do so by serving as the representations on which intuitive judgments are causally, computationally, and evidentially based, much as NCSDs of, say, occlusion relations and axes of symmetry provide the basis for reports of how things look; or, to take another sort of example, how efferent copies of motor commands provide special intuitive knowledge of one's voluntary movements.30 Our various perceptual systems are barraging us with detailed, often non-conceptual information that either directly causes or constitutes premises in inferences about how, for example, things look, feel, smell, and sound, often in idiosyncratic non-conceptual terms. Why should they not provide us with information about linguistic properties in similar ways? In any case, if linguistic perception involves parsing, and parsing is heavily constrained by the I-grammar, then, pace Devitt (2006a: 118, 2014: 288), we seem to have a perfectly respectable model of a VoC.31

But, of course, merely that an explanation is intelligible is perhaps of no great moment if there is no evidence that would motivate taking it seriously. Devitt goes on to demand evidence for the (VoC) view:

    Even if we could . . . come up with the required explanation, we would still need a persuasive reason to prefer that explanation to my modest one [i.e., (Dev)] if the abduction [to such an explanation] is to be good.  (Devitt, 2006a: 118)

So we now turn to providing that evidence and evaluating that abduction.

30  Devitt (2020: 64) claims that all of this

    is not an answer; it is just a pronouncement that (VoC) can answer. We need some idea of how this story is possible . . .

Of course, one would like a computational theory of mind to be spelled out in detail, but surely this sketch is clear enough: the output of a module consists of NCSDs that then causally occasion various reactions, typically, representing the stimulus as being segmented in certain ways and not others, and then, for the conceptually competent, bringing to bear conceptual representations that serve as the basis for meta-linguistic perceptual reports. This is what vision theorists have in mind for reports of visual experiences of, e.g., the Mach diamond (see §4.3); if syntactic intuitions are perceptual then why should precisely the same story not hold for them?

31  Note also that this perceptual model comports well with Gross and Culbertson's (2011) finding that, actually, the most reliable intuitions for linguistics seem to come from native speakers who have been exposed to some cognitive science: they understand enough to know what to attend to in their perceptual experience. As I like to think of perceptual education in general, one learns to become sensitive to one's own sensitivities. Note that, pace Devitt (2020: 56), Wasow and Arnold's (2005) critique of many linguists' uncritical reliance on intuitions says nothing against (VoC). As they stress in an opening paragraph:

    We have no quarrel in principle with the use of primary intuitions as evidence of theoretical claims . . . [but only] with the way they are collected, and the over-reliance on this one type of evidence.  (Wasow and Arnold, 2005: 1482)

For many of the reasons they stress, the VoC may simply often be harder to hear than linguists have assumed. Human perception in general is pretty noisy, affected by many sources other than a VoC, and prudence suggests—at least for the many cases Wasow and Arnold discuss—precisely the care they recommend.


7.3  The Evidence Incorporating some of the distinctions we have discussed, the two rival ex­plan­ations might be summarized as follows (where “=>” can be read as “eventuates in”; I place in bold the crucial differences): (VoC): audition => represented parsing of input => NCSDs, at least some of which => input to Central Processing => intuitive reports (Dev): audition => instantiated parsing of input => phonetic noises having linguistic properties & “the message” => input to Central Processing => intuitive reports32 Note, again, that on both views, central processing may involve any further information available to the speaker from whatever source. (VoC) differs from (Dev) only in allowing that some of that information available to the Central Processor consists in NCSDs. I want to stress from the outset that I do not take any of the following evi­ dence as apodeictic evidence for (VoC). Evidence for perceptual processing in general is difficult to obtain (e.g., controlling for all manner of central guessing), and psycholinguistics shares in these difficulties.33 But I take what follows to provide at least a strong prima facie case for it. Perhaps no single bit of it is conclusive, but it is hard to resist its cumulative effect, which Devitt would have to explain away in order to sustain his (Dev). By way of understanding what I take to be the significance of these experi­ ments, I shall be assuming that if some conscious (i.e., introspectible) percep­ tual task is sensitive to certain phenomena, that is a prima facie reason to suppose that the phenomena are in some way (e.g. at least non-conceptually) available for intuitive verdicts. Perhaps this is a mistaken assumption.34 But if 32  Devitt (2020: 61fn15) claims (Dev) does not capture what he called (ME), since “the central pro­ cessor has direct access only to the message, not to any intermediate information involved in arriving at it”. But that is just what (Dev) states! But, despite what he says here, I have added “phonetic noises” in light of the sundry other remarks he makes about “the data of language” in his (2006a) that I dis­ cussed in §7.1.3 above. If he intends something still different, he will need to supply a diagram of his own. 33  For notorious example, it remains a vexing methodological problem with regard to both vision and language how to disconfound central from module-internal top-down processing, as in the case of understanding the phoneme restoration effect (J.A. Fodor 1983: 76ff), and in trying to characterize the output of the visual system generally (Pylyshyn 2006: 73–6). 34  Devitt (2020: 64) claims this assumption is falsified by, e.g., “an outfielder catching a fly ball [being] sensitive to the acceleration of the tangent of the angle of the elevation of his gaze,” a phenom­ enon that “is surely not available to his intuitions.” But if, per the assumption, the outfielder were


But if it is, it seems to me the burden is on someone who thinks otherwise to provide a model of the experimental results. Merely claiming, as Devitt does, that the phenomena "have" various syntactic properties will not be enough, since, again, there is obviously an indefinite (likely infinite) number of properties a state could have without it affecting the content of perception in the least.
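To fix ideas, the single point of disagreement can be rendered as a toy sketch in Python. The sketch is entirely my own illustration (nothing like it appears in Devitt's work or in the psycholinguistics literature), and every name in it, from parse to the dictionary keys, is a placeholder: the point is just that on (VoC) the module's output includes represented structural descriptions, while on (Dev) only noises plus "the message" reach the Central Processor.

```python
# Toy sketch of the one difference between (VoC) and (Dev): what the
# language module makes available to the Central Processor. Everything
# here (parse, NCSDs, the dict keys) is an illustrative placeholder.

def parse(signal):
    """Stand-in for the parser; on BOTH views parsing of some sort occurs."""
    return {"phonology": "<phonological SD>",
            "syntax": "<syntactic SD>",
            "message": "<the message>"}

def module_output_voc(signal):
    sd = parse(signal)
    # (VoC): at least some structural descriptions are themselves passed
    # along, as non-conceptual content (NCSDs), together with the message.
    return {"NCSDs": {k: sd[k] for k in ("phonology", "syntax")},
            "message": sd["message"]}

def module_output_dev(signal):
    sd = parse(signal)
    # (Dev): parsing is merely *instantiated* inside the module; only
    # noises-plus-message reach the Central Processor.
    return {"phonetic noises": signal, "message": sd["message"]}

def central_processor(available):
    # Intuitive reports can draw only on what is in `available` (plus any
    # other central information): on (VoC) that includes NCSDs; on (Dev)
    # it does not.
    return "report drawing on: " + ", ".join(sorted(available))

print(central_processor(module_output_voc("who do you wanna sing")))
print(central_processor(module_output_dev("who do you wanna sing")))
```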

7.3.1  Involuntary Phonology

There is an important point that we passed over earlier in allowing Devitt latitude in discussing noises or phonology: namely, that we can't help but hear speech in our native tongues as language (cf. J.A. Fodor, 1983). Enlarging on their claim that we quoted earlier, Chomsky and Halle (1968) write:

We do not doubt that the stress contours and other phonetic facts . . . constitute some sort of perceptual reality for those who know the language in question. In fact we are suggesting a principled explanation for this conclusion. A person who knows the language should "hear" the predicted phonetic shapes . . .  (Chomsky and Halle, 1968: 25, italics mine)

Indeed, at least proto-phonological perception appears to be a specific innate auditory capacity. Neonates are able to discriminate adult phonemic distinctions categorically (discrimination of the same acoustic difference is reliably better across adult phonemic boundaries than within them), but lose this ability for distinctions in languages other than the one they are acquiring (Eimas et al., 1971; Werker and Lalonde, 1988). Thus, neonates will distinguish [p] and [ph], which Hindi children will continue to do, but adult English speakers will not, just as adult native Japanese will confuse English [r] and [l].

One can easily imagine a machine of the sort that Devitt sometimes seems to have in mind, whose modular output to the central processor involved no meta-linguistic descriptions, and linked mere noises to "messages," but all the evidence, both introspective and experimental, suggests that we are not remotely such machines. We will turn in due course to consider whether we also hear other linguistic properties, such as syntactic and semantic ones.
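The categorical pattern just described can be made concrete with a toy numerical model. What follows is my own illustration, not anything from Eimas et al. (1971) or Werker and Lalonde (1988); it simply assumes, for illustration, that a listener's category response is a logistic function of voice-onset time with a boundary at 25 ms, so that acoustically equal steps are far more discriminable when they straddle the boundary than when they fall within a category.

```python
import math

# Toy model of categorical perception (my own illustration, not from the
# cited studies). A listener's "voiceless" response is modeled as a
# logistic function of voice-onset time (VOT, in ms), with an assumed
# boundary at 25 ms; discriminability of a pair is proxied by the
# difference in category response. Two acoustically equal 15 ms steps
# then differ sharply, depending on whether they straddle the boundary.

def p_voiceless(vot_ms, boundary=25.0, slope=0.5):
    """Probability of categorizing a token as voiceless (e.g., [ph])."""
    return 1.0 / (1.0 + math.exp(-slope * (vot_ms - boundary)))

def discriminability(vot1, vot2):
    """Crude proxy: difference in categorization across the pair."""
    return abs(p_voiceless(vot1) - p_voiceless(vot2))

print(discriminability(5, 20))    # within-category step:  ~0.08
print(discriminability(20, 35))   # cross-boundary step:   ~0.92
```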


Devitt (2014) acknowledges these facts about involuntary parsing but, again, thinks his distinction between having and representing linguistic properties will suffice. Continuing to discuss our (4):

(4)  John seems to Bill to want to help himself

he writes:

In understanding [(4)], we hear it as having those linguistic features and not others in that, as a result of all the processing in the language system, we come up with a representation that has those features, and not others . . . (Devitt, 2014: 287; example re-numbered)

But, to repeat the point of §7.2.3 above, to be part of our experience it is not enough that the SD-output have linguistic properties; there has to be some sort of further incorporation of it into our mental life, and, given the abstrusity of the properties (cf. §11.1 below), it is hard to think of what would work besides representation. Again, [+nasal] is not a neural property!

No one yet has a completely adequate theory of perceptual experience, but it is hard to believe that it will not have a crucially intentional, representational component. Much perception exhibits the distinctive properties of intentionality that we will discuss shortly (§8.1): items are perceived as displaying one "aspect" or another; and one can have perceptual experiences of impossible things—as in the case of any perceptual experience of a genuine triangle at all. What else besides representations could select from among the infinitude of properties a state possesses just the ones the hearer "hears" the representation "as" possessing? It is these representations that allow the hearer to recognize what she has heard, compare it with other things she hears or remembers, and use it as a basis for further behavior in ways that are familiar from all other parts of a computational-representational psychology. At any rate, anyone as sympathetic to a "language of thought" hypothesis as Devitt (2006a: §9.2) should certainly be hospitable to such a claim. But then it is hard to see how he could doubt that the auditory experiences of native speakers involve NCSDs of at least phonological properties of speech (and probably a lot more).

But if NCSDs structure the way we hear an utterance, then they must be causally available to whatever part of the central system is responsible for our awareness of what we hear. And how we judge what is said is surely often determined by how we hear it. So in some fairly direct way the NCSDs are


also available to play a role in forming our conscious linguistic intuitions. It is hard to believe that the intuitive judgments about sentences that interest linguists do not often reflect (i.e., are not caused by) some aspects of how we hear and so understand them (cf. Smith 2006: 949–55).

What about the apparent independence of meta-linguistic intuitions and linguistic competence, stressed by Schütze (1996), that Devitt (2014: 279) thinks "provides persuasive empirical evidence against VoC"? Well, surely even the least meta-linguistically articulate presumably know when someone has said something, and hear utterances as containing words and phrases, even if they don't attend to these and other features, much less talk about them: someone can surely enjoy puns and rhymes without reflecting on them as such, just as Molière's M. Jourdain can speak prose without knowing it. True, it might be harder to obtain as much comparatively direct evidence of speaker competence as we can with the VoC of the more articulate; but there is likely plenty of indirect evidence, and, of course, reasonable generalizations that theorists can make from the attentive and articulate.

7.3.2  "Meaningless" Syntax

One of the most famous examples Chomsky (1957) produced for both the interest and the relative autonomy of syntax is his

(5)  Colorless green ideas sleep furiously

which any native speaker can parse, even though it has no readily intelligible, literal "message" (of course, it may have any number of poetic or metaphorical ones). But one doesn't have to make up such examples. There is plenty of technical language in any specialized area that speakers can parse and "hear as" English, without any clear idea of what messages are being conveyed. As Devitt (1984/91/97) knows only too well, philosophy has provided more than its share of such prose. To take a recent example:

There is a secret of denial and a denial of the secret. The secret as such, as secret, separates and already institutes a negativity; it is a negation that denies itself. It de-negates itself.  (Derrida, 1987: 25)


Perhaps the best example of purely syntactic parsing is Lewis Carroll's "Jabberwocky":

’Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.

Any English speaker—including lots of children—could readily "parse" the sentence without having a clue as to its "message," if only by exploiting the same syntactic structures of perfectly meaningful sentences, for example, "’Twas windy, and the happy doves . . .."

Going beyond such anecdotal evidence, people remember lists of words better if they are syntactically well-structured, even if semantically anomalous (Miller and Isard, 1963; Newmeyer, 1986: 7; Fernández and Cairns, 2011: 206), and patently do so in the case of much syntactic but nonsense verse. Of course, when sentences are meaningful, people tend to remember "the message" better than they remember its syntactic form (Newmeyer, 1986: 9). This latter fact is perhaps what moves Devitt to his view that hearers perceive only the message; but, what I would have thought would be obvious to any philosopher, is that people can also recall plenty of nonsense.

In between nonsense and messages, there are, of course, the WhyNots (§1.1), which provide additional evidence of the character of speech perception. Again, these are strings whose messages we can easily understand, even if at the same time we are immediately aware that there is something wrong with the way they were said. To reiterate some of the examples I surveyed in §1.1:

*Who do you wanna sing?
*John's taller than she's.
*Who do you want to visit Tom and next week?

You might well understand what I was asking about if, for some reason, I sloppily asked *What do you want the ketchup and on your burger? or *Who do you wanna sneeze, and yet you still, if you noticed, might find my phraseology oddly "wrong." It is hard to see how you could be doing this without keeping track of more than merely the message—indeed, without trying to parse the *‑ed sequences.


254  Representation of Language

7.3.3  Syntax Trumping "the Message": Garden Paths, Structural Priming, and "Slips of the Ear"

Further evidence of the perceptual reality of at least some syntactic categories is supplied by the "garden-path" phenomena mentioned earlier. Thus, naive subjects hearing or reading "The horse raced past the barn fell" are initially confused, presumably because the parser too quickly represents the main verb as "raced," and has to recompute after it encounters "fell" at the end. This initial, unsuccessful parse determines how the hearer hears the words, and why she then has trouble understanding the message (see Fernández and Cairns, 2011: 211ff for discussion).35

Garden-path phenomena by themselves, however, do not quite show that syntax is perceptually represented. It is open to a defender of (Dev) to claim, as Devitt (2006a: 148ff, 276) tentatively does, that the syntactic structure of natural language is (roughly) identical to that of people's "language of thought" (a claim sometimes suggested by Chomsky; see Hauser et al., 2002). If so, then garden-path examples could be explained equally as features of how the perceiver of a sentence (at least initially) thinks its message: she begins thinking that (some) horse raced (somewhere), and has to re-think this message when she perceives "fell."36 We will see that this possibility, however, seems to be belied by dissociations of syntax from semantics.

I suspect Devitt takes the (Dev) view of linguistic perception to be supported by at least some psycholinguistic research. Thus, he quotes Crain and Steedman's (1985) claim that

the primary responsibility for the resolution of local syntactic ambiguities in natural language processing rests not with [syntactic] structural mechanisms, but rather with immediate, almost word-by-word interaction with semantics and reference to the context. (Crain and Steedman, 1985: 321; quoted at Devitt, 2006a: 237)

It is certainly true that their principles of "A Priori Plausibility," "Referential Success," and "Parsimony" that Devitt (2006a: 237) cites would seem to favor the priority of the worldly message over syntax in determining

35  Regarding garden-paths, Devitt, himself, has a curious response: "these phenomena are examples of language usage; they are not intuitions about the linguistic properties of the expressions that result from that usage" (Devitt, 2006a: 113). Again, it is difficult to see why the Chomskyan is not perfectly entitled to infer from such data facts about the parsing and representation of syntactic properties.

36  Of course, pace Devitt, the structure of the sentence would presumably have to be represented somewhere in order to even guess at its meaning; just not in perception.


perception—but, note, only "for the resolution of syntactic ambiguities," which the parser would need to have supplied in the first place! But it is worth noting, for the record, cases where the syntax and the message can come apart.

A number of experiments have shown that syntax can sometimes take precedence over the message. Ferreira et al. (2001) found that when the garden-path parsing conflicts with the plausibility of a sentence's message, many hearers will still take the garden path. Thus, given

(6)  While Mary bathed the baby played in the crib,

and asked, Did Mary bathe the baby?, subjects will say she did. Apparently, they still treat the baby as the direct object of bathed, even after they have recognized the correct parse that excludes this reading.

Similar results have been obtained with regard to co-reference (or, alternatively, co-indexing). Using a familiar form of syntactic ambiguity,

(7)  Charming babies are boring.
(7a)  Charming babies is boring.

Cowart and Cairns (1987) showed that it took longer for subjects to respond in (8) to a cue of is than to are:

(8)  Whenever they lecture during the procedure, charming babies…

As in the case of "first resort" for filling gaps, the parser evidently chooses the first structurally available noun phrase, charming babies, as a referent for they—despite the implausibility of the message that babies were delivering a lecture (see Fernández and Cairns, 2011: 224).

Trumping of the message by syntax is exhibited by "structural priming," or the tendency for hearers and speakers to repeat aspects of sentence structure that they have recently encountered. This can also occur independently of message meaning. In a highly influential experiment, Bock (1986) found that one could affect the syntax of subjects' descriptions of a picture by priming them with semantically unrelated sentences with the same syntax. For example, subjects were presented with a picture that could be described either by the active sentence:

(10)  Lightning is striking the church


or its passive equivalent:

(10a)  The church is being struck by lightning

They were more likely to offer the active (10) if they were primed with a semantically unrelated active,

(11)  One of the fans punched the referee

than if they were primed with

(11a)  The referee was punched by one of the fans

and vice versa. Similarly, Bock and Loebell (1990) found that this last sentence, with its agentive by-phrase (by one of the fans) was primed by a semantically unrelated sentence with a locative by-phrase (by the blinking light):

(12)  The foreigner was loitering by the blinking light.

In a lengthy review of these and related results, Pickering and Ferreira (2008: 431) conclude that "priming appears to cut across meaning distinctions," indeed, that the results provide compelling evidence for the view of "autonomous syntax," which regards "syntactic knowledge as independent of other forms of knowledge, such as the specific features of meaning or perceptual properties of utterances."37

A still further source of evidence of syntax trumping semantics is in "slips of the ear," where a listener often makes "radical changes in phonology and syntax, completely lacking in semantic appropriateness" (Bond, 2008: 307). For example:

(13)  I'm going to go back to bed until the news

has been heard as

(13a)  I'm going to go back to bed and crush the noodles.

37  They go on to stress that these other features can also have priming effects: sometimes the semantics does, indeed, trump the syntax; but it is of course enough for a claim about the autonomousness of syntactic perception that sometimes it does not.


and

(14)  I seem to be thirsty

has been heard as

(14a)  I sing through my green Thursday.

As Bond (2008: 308) concludes: "Listeners are open to extremely implausible utterances, not at all constrained by semantic or pragmatic appropriateness." Contrary to (Dev), sometimes people actually (mis-)hear words and syntax having no relation to an obvious intended "message."

7.4  Conclusion

It certainly appears that the evidence seriously favors (VoC) over (Dev), at least for some standard phonological and syntactic properties: speakers seem intensely sensitive to them independently of the "message" that might be conveyed by perceived speech—or might not be in the case of syntactic nonsense! It is hard to see how this apparently perceptual sensitivity could be explained other than by presuming that many standard phonological and syntactic properties are perceptually represented, at least in the form of NCSDs. (Dev), by contrast, is committed to speakers somehow moving from noises to—perhaps38—merely unrepresented phonology they have heard and produced, on the analogy of generalizations they have made about their own and others' swimming, bicycling, and touch-typing.

It is worth stressing that these latter, mostly motor domains, seem on the face of it spectacularly different from language, both in their computational complexity, and in their relation to cognition. They are likely highly routinized in contrast to the constant novelty ("creativity") of language, and they do not seem to be essentially cognitive in the way that linguistic competence seems to be: someone is not linguistically competent who cannot understand a language—how people hear and judge noise is essential to that understanding. By contrast, perfectly competent swimmers, bicyclists, and even typists need not be able to understand, hear, or judge very much at all about the


activities. So it is not surprising that linguistic intuitions that manifest that hearing and understanding might be essentially implicated in linguistic competence in a way that they are not in these other cases. To be sure, hearing is one thing and judging quite another: but for the linguistically competent, their best judgments are largely based on what they take themselves to have heard, usually on what they have consciously heard, which then provides the basis for further conscious reasoning about "the message" in the central processor. In any case, again, if the NCSDs are needed to characterize the conscious perceptions on which parsing of "the message" is based, they should be equally available to explain meta-linguistic intuitions as well.

But, once again, I want to stress that there is no need to advance my specific version of (VoC) as one that has been conclusively confirmed. It is enough to show that it is scientifically reasonable, and indeed could account for some of the known phenomena at least as well as, if not a lot better than, (Dev) and other accounts that try not to appeal to a privileged epistemology. Perhaps many of the phenomena by themselves might be susceptible to a non-(VoC) explanation, but it is hard not to be impressed by the trend of the whole of it.39

39  Note that I don't mean to endorse intuitions tout court, and certainly share some of Devitt's general wariness about the recourse to "intuitions" in semantics and philosophy (see §10.1 sec (iv) below). There, as everywhere, I think what's important is the provision of good explanations of them. The problem is that, unlike in the case of syntax and phonology, there are just far too many quite diverse explanations still available.


PART III

INTENTIONALITY



8
Chomsky and Intentionality

We now come to what I consider to be the most significant issue in this book: the role of "intentionality" in a Chomskyan linguistics.1 For the nonce—I will refine the discussion in due course—we can just think about this property as being the "aboutness" that seems to be distinctively displayed by many of the "mental" terms that, following many of Chomsky's texts, I have been freely using throughout the discussion. Thus, "intuitions," "perception," "knowledge," "cognizing," "abduction," "innate ideas," and, most importantly, the nearly ubiquitous term "representation" are all standardly understood as being "about" or "of" certain objects, real or unreal: for example, chairs, cats, nations, numbers, ghosts, goddesses, colors, triangles, and SLEs, all of which can figure in perceptions and perceptual representations in vision—which might be about (of, or directed upon) objects, surfaces, colors, and shapes—and intuitions and representations in the case of language, which are about SLEs—sentences, words, phrases, and the like—again, real or unreal.

In what follows, I will set out some basic features of the notion of intentionality, with an eye to its role not so much in ordinary folk ascriptions, but in serious psychological explanations (§8.1), especially in many of Chomsky's own presentations of his theory (§8.2). I will then briefly mention some of the substantial skepticism about intentionality that curiously dominated twentieth-century philosophy (§8.3), and which seems, surprisingly, to be shared by Chomsky, leading him to deny what would appear to be the explicit intentionalisms we quote in §8.2 and on which he himself seems to rely (§8.4). John Collins (2007b, 2009, 2014) endorses Chomsky's denial and attempts to resolve the apparent tension by proposing what I call an abstract-algebraic reading of the theory that abstracts from any intentionality (§8.5).

One might, however, wonder what is driving Chomsky to his denial and Collins to his (to my mind, extreme) proposal. Has Chomsky reneged on such a crucial issue, sending us back to the Quinean doldrums? Or has there been

1  See the Preface, fn5, and §3.2, to distinguish three homophonous expressions that can be easily confused in this regard.



some misunderstanding of his views all along? Becoming clear about these issues is not merely a matter of tracing the history of Chomsky's personal views, but it is crucial to a proper understanding of the character of the Chomskyan core program, and of the role of intentionality in psychology and philosophy generally.

We will see that one of Chomsky's concerns is with the idiom, "representation of x," which he seems to think must always involve an external x. This is, to be sure, a view about intentional representation that has gained ascendancy—almost hegemony!—in philosophy since the 1960s. It invites Chomsky's rejection of the phenomenon, since he believes there are no external SLEs that could be so represented; and if there are no external SLEs, but intentional representations require that there be, then intentional representations have no place in linguistics (§8.6). But many other philosophers have noted that there are alternatives to such strong externalist views. In particular, one can allow that even very basic representations can be "empty," in that they can be "of" things that do not exist. Although many have feared (as Brentano claimed) that this presents a problem for any naturalistic view, I think it can, at least provisionally, be presented in a more innocuous and naturalistic way than has often been supposed (§8.7).

8.1  Intentionality

The issue of intentionality in the sense that we will be discussing was raised in the late nineteenth century by the German philosopher whom we've mentioned in passing, Franz Brentano (1874/1995), who was resurrecting a notion that had been largely ignored since the medieval scholastics. He famously claimed that this property of "aboutness," or being (as it is often put) "directed upon an object," real or unreal, was what was distinctive of all mental phenomena. Whether or not he was right about that, it certainly seems to be true of many of the states that figure in ordinary folk psychology as well as cognitive science. These states centrally include what philosophers call "(propositional) attitudes," or states that are described as essentially involving expressions that take a "that," "to," or "of" complement, frequently with an "as" adjunct.2 For some standard examples:3

2  Note that, despite the standard term, as is evident in the examples, the "object" of an attitude need not be propositional: one can have a perception of a frog (as a rock) without necessarily thinking a proposition.

3  This list is by no means inclusive or exhaustive, nor should it be taken to imply that "the intentional" will turn out to be uniform across all these terms, or across all the uses of them in different sciences (cf. Collins, 2014: 33–4). At this point, we are simply getting into the general ballpark, not observing what might be the varying constraints on different explanatory projects that might be pursued.


x perceives/learns/knows/believes/notices/represents that p; remembers/prefers that p/to A;
e.g., x perceives/learns/knows/believes/notices/represents/remembers/prefers that sound s is twice as loud as sound s´;
x represents the 60/100 deaths as a loss, the saving of 40/100 lives as a gain;
x sees the Necker cube as facing down to the left, but then up to the right.

x perceives/imagines/represents/worships y (as F)
e.g., x imagines/represents/worships Zeus (as a god); the Kanizsa figure as a triangle.

x is a perception/memory/expectation/representation of y (as F)
e.g., x is a perception/memory/expectation of a red light followed by shock.

Brentano was surely right in noticing that at least these sorts of mental states are intentional in his sense. One of the reasons intentionality deserves attention is that it exhibits a number of striking features. Enlarging on our initial characterization along the same lines as Brentano and others, there are the following (neither exclusive nor exhaustive) six features that are thought to be characteristic of this property and of many of the above attitudes:

(i) "aboutness" or "intentional content": e.g., attitudes/representations can be about/of space, objects, people, words, syllables, numbers, ghosts—anything one can think of, whether real or not.

(ii) "aspect": Seeing, hearing, representing something standardly involves doing so with respect to one aspect of it rather than another. One might see a speck of light as merely a speck of light, or as a comet, a planet, as a morning vs. an evening star; or think of Zeus as a god or as a rogue; or hear a noise as a noise, or as a word, and/or as the main verb of a sentence;

(iii) hyper-intensionality: The aspects that can be thought or perceived are characteristically hyper-intensional, i.e., distinct even when they are necessarily co-extensional: thus, one can think 2+2=4, without thinking that 2+2 = +√16, despite it being necessary that 4 = +√16; one can see a symmetrical diamond as a diamond or as a (necessarily identical) square;


(iv) answerability: attitudes/representations can be assessed in terms of their truth or falsity, or accuracy or inaccuracy, and thus are "answerable" to whether the world is as they represent it as being;4

(v) a lack of existential commitment: As we noted in (i), attitudes and representations can be about things that do not exist, and may even be impossible, as when one entertains beliefs, fears, or desires about illusory triangles, largest primes, ghosts, or the posits of false scientific theories, as in the case of phlogiston and the ether. Brentano called the "things" that states are about in cases of non-existence, "intentional inexistents" (I will also sometimes refer to "perceptual inexistents" in perceptual cases).

There is some very confusing terminology involved here: the last three features are sometimes expressed meta-linguistically as "logical" features. Normal, non-intentional (with a "t") verbs create "extensional" (with an "s") contexts, which are "referentially transparent": one can substitute co-extensional expressions "salva veritate" (i.e., saving truth values). Thus, if Bill kicked Mark Twain, and "Mark Twain" and "Sam Clemens" are co-extensional, that is, refer to the same man, then Bill in fact kicked Sam Clemens. However, intentional (with a "t") verbs fail this condition, and therefore create "intensional" (with an "s") contexts, which are "referentially opaque": Bill can think Mark is a great writer, but that Sam is not, despite the co-extension (or co-reference) of "Mark" and "Sam" (and in hyper-intensionality: Bill's thinking 2+2=4 does not imply that he thinks 2+2 = +√16, despite the necessary co-extension of the constituent expressions).5 I prefer the non-meta-linguistic "material" mode in which I expressed the features above, which I will use throughout.

A last feature that many attach to intentional states is:

(vi) (ir)rationality: all of the above are arguably necessary for the role of intentionality in characterizing reasoning, whether deductive, inductive/statistical, abductive (explanatory, confirmatory) or practical, and whether sound or irrational. Many think that rationality is also necessary for intentionality, but this is not entirely obvious.6 Although, of

4  Cf. Collins (2014: 31). Fodor (1987) advanced a related property of "robustness": it is possible for an intentional representation to be false.

5  See Mates (1952), Quine (1956), and Burge (1979) for classic discussions, and Nelson (2019) for a recent review of the vast and intricate literature on the topic.

6  Dennett (1971, 1987) claims that ideal rationality is presupposed by intentional ascriptions; Fodor (1981c) and Rey (2002a, 2007) argue otherwise.


course, much of human and animal thought is certainly rational, Kahneman (2011, reviewing his work with Tversky) demonstrated its susceptibility to surprisingly persistent strains of irrationality—which are nonetheless fully intentional: for example, the gambler's fallacy seems to have to do with how people fallaciously think about infinite sequences.

This is not the place to provide a full account of all of these features.7 Many accounts have been proposed, none of them yet entirely satisfactory. The main strategies have been to identify the intentionality of a state or a symbol either in terms of its relations to phenomena in the external world (as in Dretske, 1981, and Fodor, 1987, 1991), or in terms of its ("conceptual") role in thought (as in Peacocke, 1992), or as some combination of both (as in Block, 1986). As will emerge in §10.3, there will be no need for us to decide between these strategies here.

There is a widespread presumption in much of the philosophy literature—likely a holdover of its "ordinary language" days—that ascription of attitudes is somehow confined to, or essentially based in, ordinary ("folk") talk at a "personal" level. Such usage can lead to an excessive focus upon singular causal explanations that, for example, explain why Bill goes shopping in terms of his wanting a beer.8 It is crucial to our discussion to note that attitude ascription figures in a wide range of serious scientific laws and generalizations regarding mental processes, as in the following salient examples:

psychophysics, e.g., Fechner-Stevens laws about what "sounds louder/softer," looks brighter/dimmer;

perceptual illusions: the industry of research on how things look, what animals see things as; e.g., one line longer than the other; the Necker cube facing down or facing up; see Palmer (1999a) and Pylyshyn (2006).

7  I have deliberately omitted conditions others include, e.g., conscious accessibility (Searle, 1990, and G. Strawson, 1994), which we dismissed for standard reasons in §6.1.2. Davidson (1984a) and Collins (2014: 32) add a further condition of "meta-representation," or the ability of an intentional system to reflect on and thereby correct its own states, but this seems empirically implausible. Ants may well represent the location of their burrow, and even correct their foraging paths, despite there being no evidence of their having any representations of their representations, or perhaps of any mental states whatsoever (cf. Gallistel, 1990, and Burge, 2010).

8  See, e.g., Nagel (1986), Hornsby (1997), and Ramsey (2007: 25). As Chomsky has often quite rightly stressed, it is hard to see why ordinary talk would have any more special primacy in the case of the mental than in other cases in which science has begun with ordinary talk, but often departed quite far from it, as in the case of "mass," "energy," "space," and "time."


animal navigation: the abilities of animals to track prey, landmarks, and home in ants, birds, and bees, as in Gallistel (1990) and Burge (2010);

(fallacious) patterns of reasoning: "framing effects" and statistical fallacies in the work of Kahneman and Tversky (summarized in Kahneman, 2011);

innate "knowledge" of objects, causation, theory of mind, number, as in Spelke (2003, 2017), Carey (2009), and Baillargeon et al. (2014);

memory, episodic (about individual episodes) vs. semantic (general information), as in Tulving (1985);

attention, change blindness, introspective confabulation: Nisbett and Wilson (1977), Grimes (1996), and Carruthers (2011) call attention to myriad ways in which people can fail to see things and changes right in front of them, and can take themselves to be introspecting what are often merely rational narratives about themselves, as about anyone else;

emotions: joy, grief, and disgust reactions standardly have intentional objects (the believed success of one's enterprise, the death of a friend); see Seligman (1975/92) and Rozin et al. (1993);

"Mind Reading": children seem to display the ability to understand the beliefs and intentions of others at an early age, and, of course, throughout their life (see Apperly, 2010);

morals: children and adults seem to be intensely sensitive to issues of fairness and altruism; see Mikhail (2011) and Baillargeon et al. (2014).

Additionally, of course, as we discussed in the last chapter:

psycholinguistics: language processing and acquisition, as, for example, in the case of parsing and linguistic perception and production.

There are fairly robust experimental results in all these areas. To be sure, there are occasional efforts to avoid intentionalist explanations of them (see P.M. Churchland, 1981; Ramsey, 2007; and Chater, 2018), but it is an uphill battle, one that has not nearly been won for the majority of the insightful explanations these areas increasingly provide.

Indeed, I submit it is virtually impossible to imagine even how to express the findings of the above research without an intentional idiom, that is, an idiom having at least some of the properties (i)–(vi). But I hasten to add that that does not mean that a psychologist is committed to any of the usual folk attitude terms, such as "belief" or "desire": it is perfectly easy to imagine a


psychological explanation that employs more specific, disciplined terms, such as (perhaps) Chomsky's "cognize" (cf. §4.5 above), or, in decision theory, "credences," "on-line" (or "occurrent") noticings, etc. What the ultimately explanatorily appropriate attitude idioms might be is likely an issue not to be settled soon. All that is at issue in the present discussion is whether it will exhibit at least some of the properties that seem peculiar to intentionality.

8.2  Chomsky as Intentionalist

Chomsky does not explicitly address issues of intentionality one way or another in the purely technical writings of the core theory;9 and he seldom addresses them in any detail in his more general, meta-theoretic work. But throughout his writings, he often unself-consciously characterizes his proposals in what at least seems similar to the intentionalist vocabulary just outlined. He was perhaps most explicit in his introduction to the 1975 publication of The Logical Structure of Linguistic Theory (1955/1975):

The construction of a grammar . . . by a linguist is in some respects analogous to the acquisition of language by a child. . . . The child constructs a mental representation of the grammar of the language. . . . The child must also select among the grammars compatible with the data.  (Chomsky, 1975a: 11)

And, as we saw in §2.3.1, Chomsky's (1965: 30) proposal for language acquisition seemed to be simply an application of Hempel's D-N model of scientific explanation, which it is worth repeating here for the vividness of the apparent intentionalisms of many of its clauses (which I place in bold):

A child who is capable of language learning must have:
(i) a technique for representing input signals;
(ii) a way of representing the structural information about these signals;

9  Chomsky does discuss different "levels of representation," each distinguished from the others by the kinds of information they contain (see §2.2.1). He also discusses "representationalist" theories that do not posit "derivational" operations in the grammar (see §2.4 fn57). But, although these seem patently intentional idioms, the issue of intentionality never explicitly figures in these discussions (which is one reason linguists working merely within the core theory totally ignore it).


(iii) some initial delimitation of a class of possible hypotheses about language structure;
(iv) a method for determining what each such hypothesis implies with respect to each sentence;
(v) a method for selecting one of the (presumably, infinitely many) hypotheses that are allowed by (iii) and are compatible with the given primary linguistic data. (Chomsky, 1965: 30, emphasis mine)

These were no passing remarks.10 In Chomsky and Halle (1968), they wrote, regarding "phonetic transcription," that it should be understood:

as a representation of what the speaker takes to be the phonetic properties of an utterance given his hypothesis as to its surface structure and his knowledge of the rules of the phonological component. (Chomsky and Halle, 1968: 294, emphasis mine)

Indeed:

Speech perception is an active process, a process in which the physical stimulus that strikes the hearer's ear is used to form hypotheses about the deep structure of the sentence. (Chomsky and Halle, 1968: 294, emphasis mine)

Two decades later, in his (1986), Chomsky compared his enterprise to:

what Gunther Stent has called "cerebral hermeneutics," referring to the abstract investigation of the ways in which the visual system constructs and interprets visual experience.  (Chomsky, 1986: 40, emphasis mine)

And in the 2004 preface to the third edition of Language and Mind (1968/2004), he wrote:

the object of inquiry is to be, not behavior and its products, but the internal cognitive systems that enter into action and interpretation. . . . For language, "the principles on which knowledge rests" are those of the internalized

10  As I mentioned in §2.2.2, Janet Fodor (2009) thinks of the first fifty pages of Aspects, and in particular this confirmation model, as "one of the most important fifty pages . . . that Noam has written, and . . . still very germane today" (J.D. Fodor, 2009: 256).


language (I-language) that the person has acquired. Having acquired these principles, Jones has a wide range of knowledge, for example that "glink" but not "glnik" is a possible lexical item of English; . . . that "when John climbed the mountain he went up although he can climb down the mountain"; that books are in some sense simultaneously abstract and concrete as in John memorized and then burned the book.  (Chomsky, 2004c, vii–viii, bold emphasis mine)

Moreover, he famously announced, early and late, that he means to be defending "mentalism" against, particularly, Behaviorism, and reviving the suggestion of "Rationalism" regarding "innate ideas." Specifically, he was proposing a "computational/representational" theory (CRT) of grammatical competence that he sees as continuous with similar proposals advanced in other parts of psychology, particularly those that have been developing as part of the "cognitive revolution" which was partly inspired by his work. And, as will be discussed in §11.3, he has been adamant in rejecting what he regards as the "methodological dualism" rampant in many schools of philosophy that treat mind differently from the rest of nature.

After introducing his neologism, “cognize” (see §4.2), he makes this commitment quite vivid by claiming that we cognize the grammar that constitutes the current state of our language faculty and the rules of this system as well as the principles that govern their operation, where, cognizing has the structure and character of knowledge but may be and in interesting cases is inaccessible to consciousness . . . (Chomsky, 1980a: 69–70; see also 1986: 265)

What is particularly striking is the contrast he then proceeds to draw between a “non-cognizing missile that acts in accord with laws governing the orbit of


the moon" with a "cognizing missile that computes its trajectory by reference to a representation of these laws," the latter (abstracting from its specific task-orientation) serving as a model for a normal speaker's understanding of language (1980a: 102–3, 275, n61). Specifically, the cognizing missile:

incorporates an explicit theory of the motions of the heavenly bodies and information about its position and velocity and that carries out measurements and computations using its internalized theory to adjust its course as it proceeds. . . . In [this] case . . . inquiry might lead us to attribute to the missile something like a "mental state" . . . postulating a system which involves the cognizing of certain principles and computation involving those principles and certain representations.  (Chomsky, 1980a: 102, emphasis mine)
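The contrast can be made vivid with a minimal sketch, mine rather than Chomsky's, in which the "physics" is reduced to a single constant and all the names are placeholders. The non-cognizing device merely acts in accord with the law, which is wired into its dynamics and represented nowhere; the cognizing device stores an explicit, consultable representation of the law and computes its corrections by reference to it.

```python
# Cartoon contrast between the two missiles (my own sketch, not
# Chomsky's; the "physics" is one constant, and all names are
# placeholders).

G_TRUE = 9.8   # how the world in fact is

def noncognizing_step(velocity, dt):
    # Merely *acts in accord with* the law: gravity is wired into the
    # dynamics and represented nowhere in the system.
    return velocity - G_TRUE * dt

class CognizingMissile:
    def __init__(self, theory_g):
        # An explicit, consultable representation of the law -- one that
        # can be *false* (theory_g != G_TRUE), in which case the missile's
        # computed "corrections" systematically misfire.
        self.theory_g = theory_g

    def planned_correction(self, dt):
        # Computes its trajectory *by reference to* its represented theory.
        return self.theory_g * dt

missile = CognizingMissile(theory_g=3.7)   # a theory true on Mars, not here
print(noncognizing_step(100.0, 1.0))       # 90.2
print(missile.planned_correction(1.0))     # 3.7: a mis-correction
```

The design point, anticipating the next paragraph, is that only the second device contains a state that can misrepresent: give it a false theory and its behavior is explained by what its representation says rather than by how the world in fact is.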

One way to appreciate just how intentionalist such a missile would have to be is to imagine that the "internalized theory" and information about its position are actually false, systematically leading the machine astray. In such a case, we would seem to be forced to ascribe to it a (mis-)representation of that theory and data, an ascription that would explain further computations it might deploy to "correct" itself.

Of course, someone might observe that a standard sort of missile is a human artifact, and so the attributions to it of anything like "mental states" might depend essentially on the purposes and intentions of the artifactor. And, although this appears to be what Chomsky (2000: 105) later says about such cases, it is worth stressing that this is not how he seems to understand the issue up until the mid-1980s. There is, for example, the following passage in his 1986 reply to Kripke with regard to a machine that "fell from the sky" that we quoted in §3.4.3:

We could develop a theory of the machine, distinguishing hardware, memory, operating system, program and perhaps more. It's hard to see how this would be crucially different in the respects relevant here from a theory of other physical systems, say . . . the organization of neurobehavioral units . . . that explain how a cockroach walks.  (Chomsky, 1986: 239)

So Chomsky would seem to be quite clearly assimilating his core theory to an explicitly intentionalist CRT.11 It is therefore surprising to read his subsequent denials that he is doing so, which will be discussed in §8.4.

11  It is worth noting that Chomsky (1968) himself regards Quine's main argument for his denigration of intentionality—his "indeterminacy of translation" (Quine, 1960/2013: ch. 2)—as empirically refutable, a point to which we shall return.


But before we consider them, it is worth setting up something of the context in which these issues in general have arisen.

8.3  The Controversy

Intentionality has been of intense concern to philosophers for at least the last century and a half,12 partly as a result of two striking, controversial theses about the mind that Brentano advanced:

(i)  that intentionality is the distinctive mark of all mental phenomena, and
(ii)  that intentionality is irreducible to physical sciences.

Our concern here will not be with the first thesis, except to note, as we discuss in due course, that at least some distinctive features of intentionality do seem to be associated with all the "mental" idioms mentioned above that appear in Chomsky's discussions.13 It is the second thesis that has standardly come to be known as "Brentano's Thesis" and has been the object of an immense amount of effort to refute, none of it, so far, meeting with great success. This failure and the other difficulties in dealing with the peculiar features we mentioned above have led many "naturalistically" minded philosophers to try to dispense with the notion and endorse what has (infelicitously) come to be called "eliminativism" about it.14 In keeping with his general rejection of "things of the mind" (see §3.3.2, fn15 above), Quine (1960/2013) famously wrote:

One may accept the Brentano thesis as either showing the indispensability of intentional idioms and the importance of an independent science of intention,

12  An interesting question for an intellectual historian is why it is virtually only philosophers who worry about the issue of intentionality, given, as indicated, the near ubiquity of the notion of "representation" in psychology and cognitive science. Only a few psychologists have expressed concern. Palmer (1978: 259), for example, notes that "we, as cognitive psychologists, do not really understand our concepts of representation" (quoted in Ramsey, 2007: 7).

13  It does seem to be a property of all attitude states. What is unclear is whether it is also true of, e.g., sensations and moods, none of which will concern us here.

14  See, e.g., Quine (1960/2013), Davidson (1970), P.M. Churchland (1981), Dennett (1987), and Egan (1992, 2014). There are many forms of this view (see Rey 1997: ch. 3 for discussion). The only form that will be relevant here is what Collins (2007b) calls "meta-scientific eliminativism," which does not deny the existence of intentional states, but only that they figure in serious scientific explanations. I discuss his view in §8.5.


or showing the baselessness of the intentional idiom and the emptiness of a science of intention. My attitude, unlike Brentano's, is the second. . . .  (Quine, 1960/2013: 221)

He does accord intentional talk an instrumental status:

Not that I would forswear daily use of the intentional idiom, or maintain that they are practically dispensable. But they call, I think, for bifurcation in canonical notation. . . . If we are limning the true and ultimate structure of reality, the canonical scheme for us is the austere scheme that knows no quotation but direct quotation and no propositional attitudes but only the physical constitution and behavior of organisms.  (Quine, 1960/2013: 221)

Such a view was, of course, linked to his presumption of a purely "behaviorist" psychology, but it is not confined to it. In various forms it continues to be endorsed to this day in some "radical connectionist," "neuro-philosophical," and "dynamical system" views,15 as well as by many of the methodological dualists we discussed in §4.7—from all of which many of us read Chomsky's work as providing salvation. In Chapters 10–11 we will consider whether such a Draconian dilemma between Brentanian dualism and these various forms of eliminativism really exhausts the reasonable possibilities.

8.4  Chomsky as Anti-Intentionalist

For all his use of intentionalist vocabulary, however, Chomsky has, at least since the mid-1980s, occasionally come to share in this skepticism about intentionality in serious science, and to explicitly renounce any intentionalist understanding of his proposals. Most surprisingly, he claims that intentionality has no role to play in science:

If "cognitive science" is taken to be concerned with intentional attribution, it may turn out to be an interesting pursuit (as literature is), but is not likely to provide explanatory theory or to be integrated into the natural sciences. (Chomsky, 2000: 23)

15  See, e.g., P.M. Churchland (1981, 1989) and P.S. Churchland (1986) regarding both connectionism and neurophilosophy, and Brooks (1991) and van Gelder (1995) regarding Dynamical Systems Theories. Ramsey (2007) provides a useful discussion of some of these views.


Indeed, Chomsky appears to embrace the very exclusion of intentional attribution from natural science that was a distinctive claim of the Hermeneutic tradition, from Dilthey to Quine, Kripke, Davidson, and Nagel (see §11.3.1 below). In passages that read eerily like the famous passage of Quine (1960/2013: 221) just quoted, Chomsky writes:16

[I]ntentional phenomena relate to people and what they do as viewed from the standpoint of human interests and unreflective thought, and thus will not (so viewed) fall within naturalistic theory, which seeks to set such factors aside.  (Chomsky, 2000: 22–3; see also p. 132)17

Indeed:

We can be reasonably confident that "mentalistic talk" will find no place in attempts to describe and explain the world. . . . The notion "common store of thoughts" has no empirical status, and is unlikely to gain one even if the science of the future discovers a reason, unknown today, to postulate entities that resemble what we think (believe, hope, expect, want, etc.). (Chomsky, 1996: 74–7)

One would have thought that the examples he mentioned in the passage from his (2004c) preface to his Language and Mind (1968/2004) that we quoted in §8.2 would be paradigm examples of items from a “common store of thoughts” (at least broadly understood as objects of attitudes) shared among English speakers—that glink but not glnik is an English word; that climbing a mountain entails going up, even though one can also “climb down” it—as would the contents of other “cognizings” by a speaker (if the principles and parameters are “known” or “cognized” innately by human beings, do they not also serve 16  Serious afficionados of Chomsky’s views should take special note of the following quotes, since I have several times been taken aback by how many people who have read the material in which they occur have been surprised by them when they are quoted. I suspect they found them so inconsistent with the earlier passages I have quoted in §8.2 that they simply failed either to notice or to remember them. One referee for the present volume, for example, wrote: It is of course legendary that Chomsky was an early champion of internal representation and computation. . . . It comes as something of a surprise to me that he has any particular views about intentionality in general, much less Quine-influenced “wariness about” intentionality. (pc) And Tyler Burge (2003) opens his “reply” to Chomsky assuming that they both “agree that eliminativism about mental/psychological kinds is not a serious possibility even for science, much less commonsense” (2003: 451). 17  This particular claim occurs in a passage that might be construed as merely explicating, not endorsing a view of Putnam (1988). Given, however, the numerous other claims like it elsewhere in the same volume, it is hard not to take it as an endorsement.

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

as shared contents of thought?). In any case, even if linguistics itself were to provide no instances, it is hard to see that the other parts of psychology would not, as in the examples set out in §8.1. Without some common store of intentional contents, generalizations in any sort of cognitive psychology would be impossible.

Chomsky even goes so far as to reject computer models of psychological processes:

Computer models are often invoked to show that we have robust, hardheaded instances of the kind: psychology then studies software problems. That is a dubious move. Artifacts pose all kinds of questions that do not arise in the case of natural objects. Whether some object is a key or a table or a computer depends upon designer's intent, standard use, mode of interpretation, and so on. The same considerations arise when we ask whether the device is malfunctioning, following a rule, etc. There is no natural kind of normal case. . . . Such questions do not arise in the study of organic molecules, the wings of chickens, the language faculty, or other natural objects. (Chomsky, 2000: 105)18

Put aside the fact that this passage seems to explictly contradict his own (1986: 238–9) reply to Kripke (§3.4.3 above). What is more to our point is that these passages seem to go against the passages quoted in §8.2 above in which he compared linguistics structures to “a program that we believe to be somehow represented in a computer” or a “cognizing missile” that “in­corp­or­ates an explicit theory of the motions of the heavenly bodies (Chomsky, 1980a: 102–3, 275fn61, 188). That is, he seemed to think of both  the “language faculty” and the visual system as precisely such ­“computational” devices “following rules.” To be sure, the semantics of an artifact typically has whatever the artifactor chooses. But the semantics of a ­naturally occurring machine is surely determined by the explanation of its states and behavior. One source of Chomsky’s antagonism to “computer” models per se may be their early association with an excessively behavioral conception of their role, for example, to get something to pass the “Turing Test,” or to perform tasks intelligent humans routinely perform, or, as we discussed in §5.4, simply

18  Ironically, the claim here is virtually identical to the claims of Searle and Kripke we quoted earl­ ier (§3.4.3). It is some measure of the rampant confusions on this topic that, to my knowledge, neither Searle nor Chomsky ever note the similarity.


computing statistical analyses for the purposes of text comprehension or mechanical translation. Given that actual work in "artificial intelligence" is concerned almost entirely with such "performances" and not the competence issues that concern Chomsky, it is not surprising that he distances himself from it. But one can sympathize with Chomsky's (1975b: 39–40) antipathy to these projects without prejudice to the word "computer," or to the possibility of a CRT characterizing the structure and competencies of various sub-systems of the mind.19

Chomsky does raise some quite separate issues about the boundaries of intentional explanation, which should be mentioned only to be set aside:

Let us consider [the] problem . . . of determining when we should attribute belief, or rising and turning and aiming towards—when we are justified in doing so? To quote one recent formulation, we ask what are "the philosophically necessary condition[s] of being a true believer". . . . No one seeks to clarify the philosophically necessary conditions for a comet to be truly aiming at the Earth—failing to hit it, if we are lucky, another intentional attribution. Similarly, we are invited to explore the criteria for determining where to draw the line between comets aiming at the Earth and Jones walking towards his desk; on which side should we place barnacles attaching to shells and bugs flying towards the light? Such questions do not belong . . . to naturalistic inquiry in other parts of the sciences.  (Chomsky, 2000: 147)

Now, of course, Chomsky is right that no one discusses intentional attribution to a boring case like comets, but this is not because the questions of where to draw some sort of distinction between comets, barnacles, bugs, birds, and Jones walking towards his desk are not perfectly good ones in naturalistic inquiry. They are explicitly the concern of empirically rich work of, e.g., Gallistel (1990) on animal navigation, Clayton et al. (2007) on the surprising cognitive capacities of corvids, and Burge (2010) on ants and spiders. Here, as everywhere else in nature, drawing sharp lines is of course a fool's errand, but that does not keep the questions of which considerations are relevant from being perfectly serious scientific ones.
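To make vivid the sort of naturalistic work at issue: Gallistel's (1990) foraging ants are standardly described as computing a representation of their displacement from the nest by "path integration" (dead reckoning). The following is only a toy sketch of that idea, with invented values and names of my own; it is not Gallistel's model:

```python
import math

def path_integrate(steps):
    """Estimate (x, y) displacement from the nest.

    Each step is a (heading_in_degrees, distance) pair, as might be
    supplied by a sun compass and an odometer. The running sum is a
    representation of the ant's position: it can be wrong (e.g., if an
    experimenter displaces the ant mid-route), which is precisely what
    makes the intentional idiom apt here.
    """
    x = y = 0.0
    for heading, distance in steps:
        x += distance * math.cos(math.radians(heading))
        y += distance * math.sin(math.radians(heading))
    return x, y

# An outbound foraging path; the homeward vector is the negation
# of the accumulated estimate.
outbound = [(0, 10.0), (90, 4.0), (180, 3.0)]
dx, dy = path_integrate(outbound)
print((-dx, -dy))  # the "home vector" the ant is said to compute
```

The point of the sketch is only that such ascriptions are precise, testable, and can misrepresent, which is what makes the line-drawing questions empirically serious.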

19  Another issue that Chomsky (1996: 12, 2000: 105) seems to think vitiates a computer model of the mind is the distinction between “hardware” and “software” that often arises for artifactual machines, but not clearly for the brain. But, putting marketing interests to one side, surely physical and algorithmic descriptions of the brain correspond closely enough to the artifactual distinction (cf. Marr, 1982).


In any case, we need to ask how Chomsky regards all the talk of "computation" and "representation" that suffuses virtually all the statements of his theory, if this talk is to be understood non-intentionalistically.

8.5  Collins' Abstract-Algebraic Reading

In a series of articles, John Collins (2007b, 2009, 2014) lays out an interesting "meta-scientific eliminativist"—what I will call an "abstract-algebraic"—reading of Chomsky's core theory that seems to make sense of both the apparent intentionalist descriptions and of Chomsky's rejection of them. Although I don't think in the end this reading does justice to the Chomskyan project, it does point to a number of issues worth addressing. The view Collins (2009) wants to defend is that

[T]he states of the [language] faculty are not . . . contentful; for in no sense are the states of the faculty answerable beyond them, and in no sense does a competent speaker/hearer have to take them to be so in order to be competent . . .  (Collins, 2009: 262)

Collins (2007b) sets out the interpretation most fully in his (to my mind) somewhat startling (re‑)reading of Chomsky's early work, particularly the (1959) review of Skinner and the presentation of the (1965) Aspects hypothesis confirmation model that we have discussed. I will first argue in §8.5.1 a purely historical point, that this is not a plausible interpretation of Chomsky's early (pre-1980s) work, before proceeding in §8.5.2 to consider Collins' own ingenious proposal in its own right.

8.5.1  As an Interpretation of Early Chomsky

Collins concedes that the Aspects model appears to be intentionalistic:

If we take Chomsky at his apparent word, then it is difficult not to see him as being committed to an intentional framework. Theory construction is an intentional activity par excellence. The language-acquiring child must represent a hypothesis, compare it to the available data, evaluate its fit with the data, assess


other possible hypotheses, and arrive at a judgement as to the best hypothesis. In sum, acquiring a language on this model is a rational achievement.  (Collins, 2007b: 645)

He thinks, however, that this appearance is deceptive:

I think we can see Chomsky as employing "theory" as a trope or metaphor. What was crucial about the "theory" talk was not the appeal to hypothesis testing or the other explicitly intentional notions, in general, anything that would make acquiring a language a rational achievement. What the theory talk did, rather, was capture the formal and structural features of competence and its development in such a way as to place it far beyond the reaches of any empiricist/behaviourist account. Talking of theories did this well enough, but it is a quite separate issue whether the linguistic theories developed required any answers to the philosophical problems that arise once one employs intentional vocabulary.  (Collins, 2007b: 645; emphasis mine)

One does wonder why, if Chomsky intended the intentionalist talk that we quoted above as a mere trope, and the theory as merely a syntactic skeleton, he didn't say so.20 Collins suggests that he didn't bother, since the significant dialectic for Chomsky was not with intentionalists, but with behaviorists who thought that grammar could be acquired simply by experience, and talk of "theory" was just a way of stressing the importance of specific internal structure that the Behaviorists in particular ignored on principle:

20  Collins adds in support of his interpretation that, in the passage quoted, "hypothesis formulating" occurs in "scare quotes," and that, moreover, other passages in Chomsky's writings suggest a formal (or algebraic) reading. For example, Chomsky (1959) writes:

It is reasonable to regard the grammar of a language L ideally as a mechanism that provides an enumeration of the sentences of L in something like the way in which a deductive theory gives an enumeration of a set of theorems [Collins' emphasis].  (Chomsky, 1959/64: 576)

and in (1965):

It seems plain that language acquisition is based on the child's discovery of what from a formal point of view is a deep and abstract theory—a generative grammar for his language—many of the concepts and principles of which are only remotely related to experience by long and intricate chains of unconscious quasi-inferential steps.  (Chomsky, 1965: 58)

Now, of course, one can treat the theorems of a system purely formally; but "quasi-inferential" steps are not clearly part of a deductive theory; and, absent a formal theory of the kind of non-deductive inferences the child seems to be making, it appears we would have to understand those steps semantically, i.e., intentionalistically.


It would seem that Chomsky is using "theory" and the attendant terminology in a specific sense . . . , suggesting that aspects of the capacities generative theory attributes to the child might be fruitfully conceived as being theory-like. . . . [T]heory-talk might well be properly seen as a motivating gloss on the generative theories of transformational structure that is radically under-determined by data available to the child.  (Collins, 2007b: 645–6)

Talk of “under-determination,” however, continues to suggest that the child is making precisely the kind of non-demonstrable inference that is characteristic of the hypothesis testing that Chomsky seemed to be attributing to the child (one doesn’t talk of the “under-determination” of kidney development by imbibed liquids!), and about which, as we have quoted, Chomsky and Halle (1968:294; see also pp. 24–5) are quite explicit. One might wonder why, if Collins is right, Chomsky ever thought he was reviving a “mentalistic” psychology, restoring the Rationalist postulation of “innate ideas”? Ideas and representations are on the face of it about or “of ” things, real and unreal, under one aspect or another: if Chomsky wanted to appeal to them without intentionality, then why did he not supply a positive account of how he intends to do this, and of exactly how we are to understand an idea or representation without it being “of ” something of some sort? Moreover, there are his (1968) arguments against Quine’s (1960/2013: ch 2, 221) own explicitly anti-intentionalist “thesis of the indeterminacy of translation” that we saw enunciates the very same claims as the Chomsky (2000) passages we quoted. And why does Chomsky not express at least ­partial agreement with Quine, or with Goodman’s (1969) “The Emperor’s New Ideas,” renouncing the apparent commitment to “ideas” that Goodman critically ascribes to him?21 Although Collins’ reading is imaginatively ­char­it­able, it seems to me the more likely hypothesis is that Chomsky simply changed his mind.

21  In any case, it is difficult even to imagine a coherent story of something that has "ideas," "representations," and "mental" states but no intentionality—no propositional attitudes, no thoughts, preferences, memories, expectations, no representations "of" anything at all (what sense attaches to the word "representation" without an implicit "of" clause?). Consider a claim Jerry Fodor (pc) once reported to me that a Chomskyan made to him: "The Creationists are essentially right; they're just wrong about the God part" (the point being that Creationists are right in being skeptical of the explanatory power of natural selection, cf. §2.2.5 above and Chomsky, 2002a: 139–51). Even Fodor, so fond of perverse jokes, pleaded that the comment not be made public. But now suppose it was, and Chomskyans went around claiming they were Creationists! People would be right to be baffled. Claiming to be a mentalist, but rejecting intentionality, seems no less paradoxical.



8.5.2  Collins' Positive Proposal

It seems that, pace Collins, Chomsky began to reject his earlier intentionalist hypothesis-testing model in the wake of the adoption of the P&P model in the late 1970s. Thus, in Chomsky (1980a) he writes:

Learning seems pretty much like what Peirce called "abduction", a process by which the mind forms hypotheses according to some rule and selects among them with reference to evidence and, presumably, other factors. It is convenient sometimes to think of language acquisition in these terms. . . . I have occasionally used this metaphor, but I don't think it should be taken seriously. If we take it partially seriously, then . . . the question whether language is learned or grows will depend upon whether the mind equipped with universal grammar presents a set of grammars as hypotheses to be selected on the basis of data and an evaluation metric, or whether the steady state grammar arises in another way—for example, by virtue of a hierarchy of accessibility . . . and a process of selection of the most accessible grammar compatible with given data.  (Chomsky, 1980a: 136)22

But it is important to note that, even here, the rejection is not of intentionality tout court, but only of a specific kind of hypothesis testing model we have seen he had proposed in Chomsky (1965: 30). It remains to be seen—and argued—that "a process of selection of the most accessible grammar compatible with given data" can be understood non-intentionalistically. After all, being "compatible with" is a relation that standardly obtains only between

22  Note that, prior to this passage, Chomsky never actually expressed the abduction proposal as a "metaphor." There is also (to my mind) a surprising exchange in this connection between Chomsky (1980b) and Robert Matthews (1980: 26), who takes Chomsky's distancing himself from abductive models to raise the question whether Chomsky's remarks "should not be construed as an abandonment of the intentional idiom itself." Oddly, Chomsky (1980b) seems explicitly to resist this suggestion:

We agree that, at some level, much of what is called "learning" . . . should be characterized in a "non-intentional, presumably physiological vocabulary . . ." But I do not see that this amounts to abandoning a "rationalist" account of language acquisition in which "the various processes . . . are defined over . . . contents [of a state]," and innate structure "is characterized intentionally in terms of both the content of a state and the learner's relation to that content" (say, cognizing). . . .  (Chomsky, 1980b: 47, emphasis original; bracketed interpolations are Chomsky's)

For the record, when I asked Chomsky about this passage, he claimed (email 21 May 2001) that he understands this (1980b) reply to Matthews to be "perfectly explicit in rejecting the intentional interpretation," and wondered that I didn't recognize what he claimed then to have been a "reductio ad absurdum." It is some measure of the difficulty of understanding Chomsky's position on intentionality that none of the many colleagues I have asked have been able to read it in this way (but see Collins, 2007b: 647, fn31, for a characteristically generous reading).


items with intentional content—it is meaningful sentences that bear logical relations of entailment and compatibility with each other—and it is therefore by no means obvious that the process of selection of a grammar could be characterized without reference to that content. To be sure, Chomsky (1981) imagined parameters might be set by a purely physical "triggering" process. But, as noted in §2.2.8, that was soon shown to be hopeless, given that it emerged that the setting of one parameter depended upon the setting of another, as in Yang (2002: 26ff). This is a process that could hardly any longer be regarded as a piece of simple, brute causal triggering, but seemed to require some kind of probabilistic weighing of one grammar against another, more like hypothesis testing after all (as Yang seems to suggest; cf. Janet Fodor's (2009: 256) interest in returning to the Chomsky (1965: 30) Aspects confirmation model).

One could, of course, abstract the purely "computational" portions of the theory from such information and the functions they compute. One can, after all, abstract from (i.e., disregard) phenomena in any way one chooses. Sometimes this is theoretically useful, as when physics abstracts from matters of taste, and sometimes not, as when, say, someone might try to explain the motion of a horse abstracting from its mass. A number of philosophers (e.g. Stich, 1983) have suggested that cognitive psychology in general be confined to the characterization of internal formal computations, abstracted from their semantic properties. Whether or not this is plausible for psychology as a whole,23 it is patently not the way in which a Chomskyan theory is in fact presented even these days; and, I submit, it is doubtful the theory could be presented in this way.

In the first place, as I argued in §6.4, the theory is standardly presented under what I called a "representational pretense" that what is being discussed are externally uttered or inscribed SLEs, when what it is in fact about is internal representations of them, over which computations are defined. But, fine, perhaps the literal theory is only about those internal representations, formally, but not intentionalistically described. But, secondly, a purely syntactic proposal risks falling afoul of the point we noted in §3.3.3, that the grammars that concern Chomskyans are still fairly abstract relative to any specific algorithms or neural implementations (e.g., whether they are serial or parallel), and certainly to the character of the

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

specific symbolic system over which such computations would have to be defined (e.g. whether it is binary, decimal, alphanumeric, or—most likely—some other notation). Even relaxing the externalist pretense, the theory as it is presented is not about the notation the brain might use, but about the categories and relationships any such notation would be required to capture. The categories of the theory are, for example, nouns, verbs, NPs, VPs, not expressions like "noun," "verb," "NP," or "VP"! Even the meticulous formalism of Chomsky's (1955/75) LSLT only indicates the formalism schematically, the letters Chomsky uses (e.g., "P1" . . . "Pn") standing for whatever representations play the specific role he indicates (in the case of "P1" . . . "Pn", the phonemic primitives of the system). Although any particular symbols could, of course, be individuated without regard to their interpretation, that some symbol is a phonemic primitive, or an NP-, VP-, or IP-symbol, requires that we be able to identify the role the symbols play, not only internally in a system, and not only for a single human at a time, but for all humans across time—otherwise, how could an explanatorily adequate grammar be regarded as "universal"? But, arguably, to specify those roles just is to provide a theory of the content of the symbols that play them (we will return to this issue in §11.2).

Collins (2009, 2014) offers a further suggestion that might appear to meet this worry. True, we need the linguistic categories in order to properly categorize the syntactic states. They need not, however, be regarded as being represented by the internal neural states; they are merely a system of categories:

We might simply think of the abstract properties as the means of describing or individuating kinds of brain states that are subsumable under certain generalizations couched in terms of the type-individuating abstracta. Under such a construal, the brain states do not represent the abstracta . . . rather, the abstracta simply individuate the would-be vehicles of content for the explanatory endeavor at hand.  (Collins, 2014: 38)

Collins prefaces this suggestion with the acknowledgment that:

To be sure, we should not confuse a verb phrase or a Euclidean figure with patterns of synaptic firing, but it just doesn't follow that such patterns represent (are the vehicles for) verb phrases or Euclidean figures.  (Collins, 2014: 37)

But, on the face of it, his suggestion would seem to invite this very consequence, for, after all, subsuming something under generalizations about Fs


requires categorizing it as an F, and that would seem to be sufficient for it being an F: if a molecule is subsumed under generalizations by being categorized as a lattice, then that would seem sufficient for it being a (token of a) lattice. So if certain brain states are subsumable under linguistic generalizations by being categorized as SLEs, then they are (tokens of) SLEs. But then we would be left with the use/mention confusion that Collins thinks he can avoid.24

But Collins (2009, 2014) embellishes the idea with a further ingenious suggestion. The collapse of use/mention can be avoided by noticing that apparent appeals to SLEs could be understood along lines once suggested by Nelson Goodman (1949, 1976), whereby a representation, particularly of something unreal as in the case of a "representation of a unicorn," can be understood as an "x-representation" (e.g., a "unicorn-representation"), which need not be committed to the existence of x's (e.g., unicorns).25 On this model, then,

the contents of the representations (the 'x' of 'representation of x') are now understood as ways of categorically typing states as kinds of representations or structures, where such categories—the x's—are the posits of our theories. So, when we attribute a representation of PRO to a speaker/hearer, we are saying that a state of the subject is to be typed as, PRO, where such a state stands in an explanatory relation to whatever phenomena a theory that posits PRO explains.  (Collins, 2014: 50)

Collins stresses that he is not advocating this "method of hyphenation" in general as a way of categorizing representations, since, as has often been observed, the hyphens prevent any generalizations over the "x's" so hyphenated (one could not infer from something's being a "P-&-Q"-representation

Collins stresses that he is not advocating this “method of hyphenation” in general as a way of categorizing representations, since, as has often been observed, the hyphens prevent any generalizations over the “x’s” so hyphenated (one could not infer from something’s being a “P-&-Q”-representation 24  Not everyone might be as anxious to avoid it: we will return in §9.6 to the use/mention c­ onfusions and/or collapses that are rampant in linguistics. Note that Smith and Allott (2016) seem to understand Collins’ view in precisely these ways. Thus, expanding on Collins’ (2014) suggestion, they write: When linguists say that someone has a representation of a sentence or an NP in his head, what is meant is that there is some brain state that is an instantiation of an entity which is described at an abstract level (as a sentence or NP) in the linguist’s theory. (Smith and Allott, 2016: 201) But if the brain state is an “instantiation” of a sentence or NP, then it is a sentence or NP, just as if a molecule is an instantiation of a lattice then it is a (token of) a lattice. 25  The effect of the hyphen is to fuse the prefixed with the suffixed word, so that they are no more logically distinct than is the word “dog” in “dogmatic.” In particular, they are not open to quantification, so one cannot infer “there is a unicorn” from “There is a unicorn-representation.” Collins (2014: 49) points out that such a proposal has also been advocated by McGilvray (1999: 127) as a way of understanding Chomsky’s position, and that a similar proposal can be found independently in Dennett (1969). Given that Chomsky was close to Goodman when the latter was formulating these suggestions, it is not an implausible suggestion.


that it was a "P"-representation). But he thinks it affords the best way of construing "representations of SLEs" as they figure in linguistics:

Notwithstanding the absurdity of the hyphenated account as an analysis or account of logical form, I do think it offers genuine insight into sub-personal representation of linguistic properties. What is right about the account is that it depicts the representations posited by linguistic theory as monadic (non-relational) states of a system that are typed according to the categories of the theory. So, on this account, to talk of a system representing x is not to relate the system to x in any way at all, but to type a state of the system as an x-type (x-representation), where 'x' is a bit of our theoretical technology.  (Collins, 2014: 53)
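The contrast Collins is drawing can be made vivid with a toy sketch (the sketch and its names are mine, not anything Collins provides). On the relational construal, a state is paired with some thing it represents; on the abstract-algebraic construal, the theory's category terms serve merely as monadic type-labels under which states are subsumed for the purposes of generalization:

```python
from dataclasses import dataclass

# Relational construal: the state stands in a relation to some object
# represented (which raises the question of what, if anything, that
# object is).
@dataclass
class RelationalState:
    neural_id: int
    represents: object  # an actual NP, PRO, etc.?

# Abstract-algebraic construal: "NP-representation" is a fused, monadic
# label (like the "dog" in "dogmatic"), not a relation to any NP.
@dataclass
class TypedState:
    neural_id: int
    type_label: str  # e.g., "NP-representation", "PRO-representation"

states = [TypedState(1, "NP-representation"),
          TypedState(2, "PRO-representation"),
          TypedState(3, "NP-representation")]

# Generalizations subsume states by their labels alone, with no
# appeal to any represented object:
np_states = [s for s in states if s.type_label == "NP-representation"]
print(len(np_states))  # 2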

Of course, the theorist has to drop the hyphens when expositing the relations among the abstract categories of the grammar. But this can be done abstractly in isolation from the actual states and processes of the speaker (this is the "abstract" aspect). But when we then recruit this abstract system of categories to explain the states and processes of the speaker, we can re-introduce the hyphens, and treat "x-representations" as a way of merely categorizing those states and processes, without, he claims, loss of explanatory power (this is the "algebraic" aspect).26 After all:

The representations are states of a system that are typed in terms of the representata, which are the theoretical notions employed by the relevant theory. The linguistic mind doesn't represent verb phrases, or PRO, or anything else. We use 'verb phrase', 'PRO', et al. to categorise states in patterns that our theories track. This doesn't make the mind/brain states unreal, no more than using vector algebra to track the trajectory of a cannon ball makes the trajectory unreal.  (Collins, 2014: 60–1)

Collins does recognize that:

One problem with the story so far is that the speaker/hearer must apparently make inferences and arrive at generalisations, but we have effectively said that this cannot happen unless the speaker/hearer possesses representations of x, not mere x-representations.  (Collins, 2014: 54)

26  In conversation, Collins analogized the role of the SLE abstracta to the role of point masses and ellipses in a physical explanation of, e.g., planetary motion, which may figure only in the abstract geometry that is used to categorize that motion, not in the world itself. I leave what seem to me the non-obvious issues about the reality of point masses to another day.


He goes on to claim, however:

The simple answer to this problem is that speaker/hearers do not make inferences or draw generalisations about the kind of states that our theories attribute to them, no more than a boat on a river does vector algebra. The theory may well, say, model a particular process or state of understanding as a derivation of structures, but it does not follow that the theory is true only if the speaker/hearer makes the derivation (whatever precisely that means). It suffices that the state-transitions the theory models are reflected in the system itself without the system necessarily doing inferences.  (Collins, 2014: 54)
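Whether the relevant state-transitions really could be "reflected in the system" without anything like probabilistic inference is just what the interactive setting of parameters seems to put under pressure. Here is a toy sketch in the spirit of the reward-penalty ("variational") learner Yang (2002) describes; the parsing test, the corpus, and the learning rate are illustrative assumptions of mine, not Yang's actual model:

```python
import random

GAMMA = 0.02  # learning rate (an illustrative value)

def parses(null_subject: bool, sentence: dict) -> bool:
    # Toy test: a grammar without null subjects fails on subjectless
    # sentences; a null-subject grammar parses both kinds.
    return null_subject or sentence["has_subject"]

def learn(corpus, p=0.5):
    """p is the current probability of selecting the [+null-subject]
    grammar; each sentence rewards or penalizes the grammar tried."""
    for sentence in corpus:
        use_null = random.random() < p
        success = parses(use_null, sentence)
        if use_null:
            p = p + GAMMA * (1 - p) if success else p * (1 - GAMMA)
        else:
            p = p * (1 - GAMMA) if success else p + GAMMA * (1 - p)
    return p

# An Italian-like corpus: about a third of sentences drop the subject.
corpus = [{"has_subject": random.random() > 1/3} for _ in range(5000)]
print(learn(corpus))  # drifts toward 1: the [+null-subject] grammar wins
```

The point of the sketch is only that the candidate grammars are here being weighed against one another in the light of analyzed input, which already looks like hypothesis comparison rather than brute triggering.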

Now, as we discussed in §5.4.7, it is certainly true that, unlike earlier proposals, the P&P model requires a great deal less for a speaker in the way of representations and derivations: the I-language functions according to principles that are neither hypothesized nor confirmed by experience. This is not at all clear, however, with regard to the settings of the parameters, which, although Chomsky initially expected they would just be brutely triggered by input, look like they may well require at least probabilistic derivations (see Yang, 2002: 26ff, Lidz and Gagliardi, 2015: 12–13). Whether or not this is ultimately true should not be decided merely by a wish to avoid intentionality.

Quite apart from that issue, however, it is important to notice that the aim of explanatory adequacy (§4.1) could not be reasonably satisfied without the results of the I-language computations being integrated into systems of perception and parsing that are manifestly needed in order for a child to acquire a language. The grammar will have to both inform and constrain the parser, and in the course of acquisition, be modified (e.g. having parameters set) by the parses. But how can this happen unless the categories of the grammar are the very ones being represented and probabilistically computed by the parser, which provides data that are supposed to be "compatible" or not with a provisional grammar? Collins (2014: 54, fn15) aborts his discussion at this very point as being "a topic . . . too broad to be further discussed here." But, although the grammar and the parser may be theoretically distinct, for explanatory adequacy they must be able at least to communicate with each other, sharing, as it were, a common coin.27 Children must hear ambient

27  Of course, the grammar and the parser would perforce do this if they are simply aspects of the same system, as in Momma and Phillips, 2018: 748; cf. §3.3.3, fn17 above. As Howard Lasnik (pc) pointed out to me, Chomsky (1995a) himself recognizes this issue:


noises, exactly as they seem to do, as SLEs of the ambient language, e.g., as a sentence having dropped a subject, if they are to set the null-subject parameter (see §2.2.7 above). As both Chomsky and Halle (1968: 24–5) and Lidz and Gagliardi (2015) stress, at least some probabilistic inferences to at least some SLE categories seem to be required to get from the input to the grammar (see §5.3 above), and it is hard to see how such processing could possibly occur without the computations being defined on intentionalistic representations of the linguistic categories (we will return to this crucial issue in §11.2.2). Pace Collins' (2009: 262) claim we quoted at the start of §8.5, surely competent speaker/hearers do have to take themselves to be hearing and producing token SLEs in order to be competent!

Collins (2014) does briefly consider parsing:

It might be that a syntactic representation is still operative in parsing as a check on the parse. Alternatively, it could be that the full set of syntactic relations (as recorded in a tree) are never represented explicitly, but only constrain the parse step by step and reveal themselves in parse failures.  (Collins, 2014: 54, fn15)

But by way of illustration, he curiously considers a case of a normal English hearer who might not have disambiguated a sentence, for example Mad dogs and Englishmen go out in the midday sun, to which his grammar could however assign different parses (indicating the different scopes of mad; a toy rendering of the two parses is sketched at the end of this section). In such a case, of course, parsing may be incomplete, and the point of the representations, therefore, is to account for the parameters of freedom a speaker-hearer has in his consumption and production of language, at least in a modally robust, idealised way. We do not need to imagine

Another source of specificity of language lies in the conditions imposed "from the outside" at the interface, what we may call bare output conditions. These conditions are imposed by the systems that make use of the information provided by [the I-language], but we have no idea in advance how specific to language their properties might be—quite specific, so current understanding suggests. . . . The information provided by [the I-language] has to be accommodated to the human sensory and motor apparatus. Hence, UG must provide for a phonological component that converts the objects generated by the [I-language] to a form these "external" systems can use.  (Chomsky, 1995a: 221, see also p. 3, his 2004c and 2016a: 8)

These output conditions are legibility conditions, in the sense that other systems must be able to "read" the expressions of the language and use them as "instructions" for thought and action (Chomsky, 2000: 9).


that every speaker represents a sentence to themselves in some particular way prior to or coeval with consumption or production of it.  (Collins, 2014: 58)

Of course, this would be true of the hearer, he imagines, who does not parse sufficiently to notice the ambiguity. But what about the hearer who does? That hearer does hear it as having one parse or the other, and it is hard to see how to capture this fact other than by presuming the hearer represents the one parse rather than the other.28

So why do Chomsky and Collins not cheerfully adopt precisely the intentionalist attitude that Chomsky and Halle (1968: 24–5) seemed to embrace and that seems on the face of it so plausible? Collins (2014), himself, simply seems to have standard doubts about the scientific utility of intentionality generally,29 and in some of the passages we have quoted above in §8.4, Chomsky seems to share such doubts. But Chomsky also seems worried by a further issue that does not bother Collins, viz., the expression "representation of" and its seeming commitment to external phenomena being represented. This is, to be sure, a treacherous expression, and needs discussion.
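Here, as promised, is a toy rendering of the two constituent structures at issue (the bracketings and labels are mine, purely for illustration):

```python
# Nested tuples as toy phrase markers for the ambiguous string
# "mad dogs and Englishmen": does "mad" modify just "dogs" or the
# whole coordination?
narrow_scope = ("NP",
                ("NP", ("A", "mad"), ("N", "dogs")),
                ("Conj", "and"),
                ("NP", ("N", "Englishmen")))

wide_scope = ("NP",
              ("A", "mad"),
              ("NP",
               ("N", "dogs"),
               ("Conj", "and"),
               ("N", "Englishmen")))

def terminal_string(tree):
    """The words a tree yields; identical for both parses."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in terminal_string(child)]

assert terminal_string(narrow_scope) == terminal_string(wide_scope)
print(terminal_string(narrow_scope))  # ['mad', 'dogs', 'and', 'Englishmen']
```

The terminal string itself does not distinguish the two readings; only the structure assigned does, which is why the hearer who disambiguates seems to have to represent one parse rather than the other.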

8.6  Chomsky's De Re Reading of "Representation of"

In a curious, but revealing passage, Chomsky (2000) writes:

The internalist study of language . . . speaks of "representations" of various kinds, including phonetic and semantic representations at the "interface" with other systems. But . . . we need not ponder what is represented, seeking some objective construction from sounds to things. The representations are postulated entities, to be understood in the manner of a mental image of a rotating cube, whether it be the result of tachistoscopic presentations of a

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

Chomsky and Intentionality  287 real rotating cube or stimulation of the retina in some other way; or im­agined, for that matter.  (Chomsky 2000: 160; emphasis mine)

There are a number of issues here that need to be disentangled.



(i) There is a use of "representation" and other intentional idioms that is seriously relational and existentially committed to an "x" as its right side relatum, "of x." This usage would be an instance of what is regarded in general as a "de re" (or "of the thing") reading of an intentional phrase. Thus, we might speak of Oedipus as believing of Jocasta, his mother, that she is a suitable wife, as opposed to "de dicto" readings, which are "referentially opaque," co-referential expressions not being substitutable without risking a change in truth-value, such as when we might say that Oedipus thought Jocasta was suitable, but that his mother was not.30 This second, de dicto, use, however, does not carry existential commitment, as when we might say that Oedipus represented Zeus as a powerful god: here there is obviously no "objective construction" of any Olympian god corresponding to Oedipus' thoughts.31

(ii) Chomsky would also be largely right to presume that de re readings of at least the term "represent" have come to play a salient role in recent philosophical discussions, especially since the influential work of Kripke (1972/1980), Putnam (1962b/75b, 1975c), and Burge (1979, 1984), which inspired a movement committed to "direct reference" theories of meaning that tried to free semantics from "description theories" associated with Fregean "senses."32 The term "representation" is not often discussed, but, when it is, it seems to be univocally understood relationally, that is, as de re (see, e.g., Lycan, 1996; Millikan, 1998). This does seem startling to those of us who have been interested in Botticelli's representations of Aphrodite and Newton's of absolute space.

30  See the references in fn5 above. Quine (1956) expressed the distinction in a way close to what I will suggest here, between “relational” and mere “notional” attitude ascriptions. 31  There is a further issue of whether one can have singular thoughts about non-actual things. Pace Gareth Evans (1982) and others, I will assume that one can, as when one has a thought about each of two rainbows, the thought’s singularity being likely an internal feature of the thought’s representation. See also the Kanizsa triangle(s) in §8.7, figure 8.1. 32  The approach was originally intended as addressing reference, but, for a variety of reasons, soon spread to issues of “meaning” and “content.”


(iii) Indeed, I think Chomsky is overly impressed by this apparent externalist hegemony.33 As we noted in §8.1, there is a venerable tradition going back at least to Brentano that regards the possibility of "empty" thoughts as one of the marks of the mental. We will return to this issue shortly (§8.7), but few externalists need seriously deny the innocuous truth that one can represent a cube without being caused to do so by any genuine particular cube, that, indeed, people stimulated by a tachistoscope might experience illusions.34 What externalists often believe is that a representation has the external content that did or would cause it to be entokened under certain conditions, or that some complex representations ("the largest prime") might be necessarily empty. Of course, just which mental representations might be complex and necessarily empty is an empirical, deeply non-obvious question.

(iv) Lastly, and most importantly for our purposes: Chomsky's claim in the passage is actually quite puzzling. "An image of a rotating cube," even when there is no real cube causing it, cannot be quite so unambiguously said to be a representation of "an x without an x." Even though an image of a cube may fail to be of a particular real cube, it is still a representation of a cube, and not, say, a sphere, and obviously a crucial bit of the psychology would be lost if we did not make that distinction. And so, even in terms of Chomsky's own example, we do need to "ponder what is represented," perhaps "seeking some objective construction from sounds to things," even if not always ones that are present at the time that the representation is activated—or perhaps, as we will consider in the next section, some "relation" to "something" that does not and maybe could not exist at all!

What is especially puzzling about Chomsky's denial of the need to ponder what is represented is that the "things" being represented in linguistics are the very things the theory is about, viz. SLEs! And it does strike me as something of a scandal that linguists have not sufficiently pondered what they are. That is: the question that concerns me is equally a question about linguistic

33  As are others; see, e.g., Lasnik (2005: 63). Notice that Chomsky's (2000: 23) dismissal of intentional attribution is immediately preceded by the correct observation that a theory of jealousy is not likely to "distinguish between states involving real and imagined objects." Quite so: a theory of jealousy is likely to be de dicto, not de re.

34  Thus, it would hardly be news to Burge to learn that mental images of a cube might not be caused by a cube, as Chomsky (2000: 160) bizarrely seems to suggest.


ontology as it is about intentionality. As I hope will emerge in Chapter 9, the questions are surprisingly intertwined. But first we need to make room for the curious category which I think we will ultimately need in order to classify SLEs: Brentano's category of "intentional inexistents."

8.7  Empty Intentional Representations and "Intentional Inexistents"

So, we should ponder further someone experiencing a rotating cube though none is causing it. Of course, if there is no cube in front of the subject then the subject cannot really be said to see a rotating cube: you can't see what isn't there.35 All you can do is to seem to see such things. But the experience might be sufficiently vivid and prolonged that one might be able to describe "it" in considerable detail, so much so that it would be hard for the subject and the experimenter not to talk and think about "it," comparing it perhaps to other rotating cubes the subject also seemed to see. As Chomsky (2000) recognizes:

Study of the visual and motor systems has uncovered mechanisms by which the brain interprets scattered stimuli as a cube.  (Chomsky, 2000: 17)

Chomsky similarly talks about "inner speech" (2000: 174) and how we can mentally rehearse SLEs without speaking them out loud, noticing, for example, when they rhyme. But how are we to understand such claims about non-existent cubes or "inner speech" that no one but the subjects of such experiences can "see" or "hear"? Or does Chomsky really want to say that neural states themselves can also be heard by others to "rhyme"?!

The first thing to notice is that, at least so far as characterizing the experiences of imagined cubes or inner speech goes, it is quite enough simply to allow that they involve what might be called "quasi-perceptions": so-called "mental images" of rotating cubes or imagined speech are cases of subjects being in states very like the state of actually seeing a cube or hearing speech, but without the usual causes of those states. Whether such states involve the existence of an inner mental image of a cube or of the speech is a further empirical

35  I don't want to insist on this relational reading of the English "see" (although many do; see, e.g., Dretske, 1969). The main concern here is simply to notice how one needs to be careful about the issue.


question that there is no need to decide here. It is enough that it is certainly theoretically possible that there isn't any such further "thing": it is just a state that presents things more or less "as if" there were.36

On the face of it, quasi-perception would seem to afford a natural way to take Chomsky's claims about "a mental image of a rotating cube" that is not produced by an actual cube: regard such an image as simply an empty representation, a representation that fails to be a representation of any actual cube. But there is, however, a famous difficulty with our talk of such representations: are they representations of nothing? Well, yes, in a sense; but obviously not in the sense in which some nonsense (say, "a slithy tove") might represent nothing. What needs to be observed is that intentional expressions ("see," "hear," "belief," "thought," "representation") are "about," or "of," two very different kinds of "things," depending on the existential assumptions of their user, as in "representation of Trump" vs. "representation of Zeus." Without trying to provide a serious theory about this difference in use, I will call the use of ┌representation of x┐ or related nouns and verbs where we can easily deny the existence of x, "a purely intentional use" of "representation" and other intentional idioms; the others, "existential uses."37

Of course, sometimes we may want to leave it open whether or not the x exists ("Let 'John Doe' represent the guy we're looking for"), although if we learned that x actually does exist, it might then be a (delicate) open question whether we would still be using "represent" purely intentionally, or would switch to the existential (real existence has, as it were, a way of grabbing our intentions). For our purposes here, I will use "intentional object" to refer to whatever a representation is "of" or "about," and assume "represent" can be used even when the object turns out to exist after all, even though my provisional definition of (intentional) content will be in terms of cases when it does not, as follows:

36  See J.J.C. Smart's (1959: 149–50) nice discussion of "topic-neutral" descriptions—descriptions that can simply be neutral as to whether there really is some inner cube or speech one is perceiving in such cases. Whether such a strategy could account for all the empirical data associated with mental imagery is the topic of extended discussions between, e.g., Kosslyn (1986) and Pylyshyn (2006). I discuss the strategy further in Rey (1981). And, of course, the strategy comports well with the strategy I have attributed to linguists generally, of adopting a "representational pretense" with regard to the existence of external SLEs (see §6.4 above).

37  I would have phrased the issue in a more natural way (e.g., "where we deny the existence of the thing represented"), except for (i) the maddening ambiguity of that way of speaking, and (ii) Chomsky's persistent complaint (see, e.g., 2000: 41) that philosophers crucially presuppose some "mystical" relation of "reference" that they nowhere explain. To avoid contaminating the present discussion with any such presuppositions that Chomsky might find unacceptable, I draw the distinction in a way that is close to his own way of describing the case of the "mental image of a rotating cube" that is only "imagined."


A pure intentional content is the content of a representation, α, when we use the idiom ┌α represents/is a representation of x, but there is no x┐.38

If it is important to distinguish an intentional content from the purported object whose existence in a purely intentional use is being denied, I will place the words for that content in curly brackets. For example, the pure intentional content of "a rotating cube" in "representations of a rotating cube, though there is no rotating cube" is {a rotating cube}. Note that a rotating cube is one thing, the content {rotating cube} quite another. For starters, the latter can exist when the former does not, and, of course, vice versa.

A reason for being careful in distinguishing the purely intentional from the existential uses of intentional terms is the almost universal appeal of (what has come to be called) the "McX" response to the ancient "riddle of non-being" ("How can you deny that Pegasus exists, since your denial talks about him, and so he'd better exist to be talked about!" See Quine 1953/61a: 1–2). As any teacher of that riddle will attest, a standard response is (in Quine's fable) "McX's": "Pegasus is an idea in one's mind" (1953/61a: 2). It is to avoid suggestions such as that there really are winged horses in one's mind that I think one needs to resort to the purely intentional usage that will concern me here.39

But, for all the absurdity of the McX response, it does seem to be on to something. And there is, of course, a long tradition, extending back to the Middle Ages and resuscitated by Brentano, that claims that existing "in thought" is a way, or "mode," in which an otherwise actual, spatio-temporal object may exist, or perhaps "subsist"; or, anyway, that such "intentional inexistents" as Pegasus need in some way or other to be acknowledged for purposes of psychology (see Kemmerling, 2004, for discussion). Unlike many who have addressed this problem, for example, Meinong (1899/1978) and Parsons (1980), I am loath to rely here on any special metaphysics. I agree with Richard Cartwright's (1960: 639) nice quip: "unreality is just that: it is not another reality."40

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

Especially vivid cases of intentional inexistents are what might be called "perceptual inexistents," such as the "Kanizsa triangle" (figure 8.1), which is the apparent black triangle in the foreground which seems superimposed on the white lined triangle and three disks. Note that its blackness appears more intense than the black of the background.

Figure 8.1  Kanizsa Triangle(s)

The illusion is so vivid that people who haven't seen it don't believe it is an illusion until they place something over the pac-man figures and white lines and observe that the "illusory contours" disappear and the blackness now appears uniform. Note also that (pace Gareth Evans, 1982, see fn 30 above) one can have singular thoughts and representations of the two non-existent triangles. It is difficult to resist saying that we can refer to these illusory "things." Certainly a vision scientist would engage in the representational pretense of speaking to experimental subjects as if they could. At least provisionally, I propose we understand talk of such intentional inexistents along the following, metaphysically deflated lines:

(DEF) y is an intentional inexistent for a representational system S iff_df there is a representation in S that has the pure intentional content {y} and y does not exist,41

41  One might worry that the English predicate so defined, with variable "y," could only be satisfied by real things. While that might be true in the standard treatment of predicates in first-order logic, it seems pretty clear it won't work for natural languages, where we seem all the time to be "referring" to

where a "system" can be a community-shared work of fiction, a theological tradition, or—as in the cases that concern me here—a system in a human or animal psychology, for example the visual system, or the I-language where, insofar as it interfaces with a perceptual system, we might talk of perceptual inexistents. As this definition makes clear, the ontology remains purely an ontology of representations and their intentional contents, which are needed in any case for any adequate psychology, as in the case of the immense variety of illusory figures to which we are stably responsive, in cartoons, flashing signs, and digitalized light bulb displays.42 And, of course, it is in view of at least the serious possibility that ordinary linguistic perception is illusory, and it making no difference to linguistic theory if it is, that I claimed that linguists are simply engaged in "representational pretense" when they speak of SLEs being spoken or heard.

Note that the above proposal may go some way towards assuaging the worry that "intentional inexistents" are some sort of weird entities. It would also seem to be a reasonable way to capture some of what Collins (2014: 50) was after in claiming that the neural states responsible for language are "monadic states [which] are the truth makers, if you will, of . . . dyadically specified states," but insisting that the states are still intentional, and thereby allowing for the apparent tokens of SLEs that speakers take themselves to perceive.

The proposal does rely on the notion of "(pure) intentional content" about which an account might reasonably be thought to be needed. And, although much anxiety and ink has been spilt trying to provide such a full account, for reasons I will discuss in §10.2.2, I think it is far too early to try to provide an account of such a theoretically rich idea. It is enough to point to the obvious explanatory work it seems to perform in at least the serious scientific projects mentioned in §8.1 and to expect to be able ultimately to define the notion(s) in terms of that specific work (see §11.2 for an example of how this might go).
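For readers who want (DEF) and the notion of pure intentional content spelled out more explicitly, here is one possible regimentation. The regimentation is mine: in particular, the free-logic existence predicate E! and the substitutional reading of the variables are merely one way of honoring the worries of fns 38 and 41:

```latex
% "Pure intentional content": 'x' ranges substitutionally over singular
% terms, so no real object is required as a relatum.
\alpha \text{ has the pure intentional content } \{x\}
  \iff_{df} \text{we use the idiom }
  \ulcorner \alpha \text{ represents } x \text{, but there is no } x \urcorner

% (DEF), with "y does not exist" rendered via a free-logic existence
% predicate E!:
y \text{ is an intentional inexistent for a representational system } S
  \iff_{df} \exists \rho\, \bigl(\rho \in S \wedge
    \rho \text{ has the pure intentional content } \{y\}\bigr)
    \wedge \neg E!\,y
```

On this rendering the apparent quantification over y carries no commitment to any real y, which is just the deflationary point of the proposal.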

non-existent objects and can in some sense "quantify" over them, as when we say without a trace of paradox, "There are many things that don't exist: Zeus, Santa Claus, phlogiston, . . .". Note, however, that I am not trying to provide a semantics for English, but only a strategy for dealing with the theoretical claims of linguistics.

42  Chomsky (2000) interestingly proposes in passing that ordinary expressions in natural language "pick out" not "things in the world," but "things in some kind of mental model, discourse representation, and the like," in which case the study is "a form of syntax." Now certainly many expressions in natural language, such as "the present king of France" or "the rainbow in the sky" do not pick out things in the external world; but it is not clear what it means to say that they pick out "things in a mental model," unless that just means they pick out intentional inexistents (i.e., they don't "pick out" anything real at all). They certainly do not pick out discourse representations or other "syntactic" entities: "the sky" does not pick out the words "the sky"! We will return to this confusion shortly (§9.8.1).


Perhaps as a consequence of Chomsky thinking there is no need to ponder what is represented in either the case of a non-existent cube or in the representations of language with which his theory deals, it is hard to find passages in his or others' writings about what exactly SLEs are. What, for example, are the "phonemes," "phonological features," "words," "sentences," "phrases," "NPs," "IPs," that linguistic theories seem to be about? I want to argue that they are, indeed, intentional inexistents; but my argument requires an excursion into linguistic ontology, which turns out to be a surprisingly complicated topic, one, I'm afraid, deserving a full chapter of its own.


9  Linguistic Ontology

9.1  Background

9.1.1  Intentionality and Ontology

As we just saw in Chapter 8, one of Chomsky's main objections to intentionality consists in what he regards as its externalist commitment to there being actual external objects for intentional representations to be "of." Both because of what I have argued was Chomsky's misunderstanding of this issue, but also to underscore the earlier discussion of "representational pretense" (§6.4), we should look at the specific ontological problems that are bothering him, and whether there might not be less drastic solutions to them than denying a scientific role for intentionality altogether.

It will help to begin with an interesting claim of Chomsky's that motivates a great deal of what he says in this regard. It occurs in a passage that we quoted in (§6.4.1) about the reality of phonetic forms which serves as a model for much of his thought about the relation of mind to reality:

Suppose we postulate that corresponding to an element "a" of phonetic form there is an external object "*a" that "a" selects as its phonetic value; thus, the element [ba] in Jones's I-language picks out some entity *[ba], "shared" with Smith if there is a counterpart in his I-language. Communication could then be described in terms of such (partially) shared entities, which are easy enough to construct: take "*a" to be the singleton set {a}, or {3, a}; or, if one wants a more realistic feel, some construct based on motions of molecules. With sufficient heroism, one could defend such a view, though no one does, because it's clear we are just spinning wheels.  (Chomsky, 2000: 129; see also 1996: 48, 2016a: 43–4)

As the “no one does” indicates, Chomsky takes himself to be expressing a view widely shared in contemporary linguistics. In the present chapter we will see that there is certainly something right about such skepticism concerning SLEs in general. But such skepticism also raises an issue that is surprisingly Representation of Language: Philosophical Issues in a Chomskyan Linguistics. Georges Rey, Oxford University Press (2020). © Georges Rey. DOI: 10.1093/oso/9780198855637.003.0009


little addressed by Chomskyans: precisely what SLEs positively are.1 After surveying a number of alternatives, I will propose what seems to me the most reasonable view, what I call "folieism": along the lines of the proposal we have just discussed at the end of Chapter 8, SLEs may and likely do not exist at all! As we have noted (§6.4), actual entities are not needed for any linguistic or other explanatory purpose. Nor are they needed for communication: normal verbal communication is a kind of folie à deux (or à n, for the n speakers and hearers involved) in which speakers and hearers enjoy a stable and innocuous illusion of taking themselves to hear SLEs that are seldom if ever actually produced.

This may seem to some readers a startling, perhaps outrageous claim, completely undermining the hard-won truths of standard linguistic theories. But recall the point stressed in §6.4, that at least a Chomskyan linguistics is explicitly concerned with the computational-representational system underlying linguistic competence, not with any externalia that the representations of that system represent. In any case, I hope the claim will come to seem less preposterous once the alternatives to it are made clear.

9.1.2  Resisting General Anti-Realism

(i)  General Anti-Realism

Ontology, or the discussion of what exists, has historically been a fraught topic not only in philosophy, but in many of the sciences, and especially in reconciling the verdicts of many sciences with ordinary commonsense.2 Before we consider the complex case of SLEs, it will be useful to consider how such issues might be handled generally. This is particularly important, since Chomsky and/or some of his followers have sometimes casually expressed quite general anti-realist views that are problematic and can derail ontological discussions from the start.3 For example, Chomsky (1996) writes:

There need be no objects in the world that correspond to what we talk about, even in the simplest cases, nor does anyone believe there are.  (Chomsky, 1996: 22)

1  Not that there need be a single answer for all the variety of such entities (cf. Preface, fn6).
2  See, for example, the turbulent history of controversy surrounding the reality of molecules and other “theoretical entities” in physics and other sciences (see, e.g., Duhem, 1906/54; Putnam, 1975a; van Fraassen, 1980; Devitt, 1984/91/97), as well as disputes about secondary properties and social entities that we will discuss below.
3  Devitt and Sterelny (1987/99: chp 13) point out that linguistics has been a source of a general anti-realism since the influential work of Saussure (1914/77).


And the prominent linguist long associated with Chomsky, Ray Jackendoff (2006), proposes

abandoning the unexamined notion of “objects in the world,” and, for purposes of the theory of reference, pushing “the world” into the mind of the language user too, right along with language.  (Jackendoff, 2006: 226)

Well, I, for one, do believe that there are at least some objects in the mind-independent world that correspond to things we sometimes talk about, but in this section I neither need nor want to defend a general realism. I only wish to show how nothing in Chomsky’s core theory is incompatible with it.

There are two independent considerations that lead Chomsky to his occasional anti-realist claims and which are almost inextricably intertwined in his discussions: the first is a quite common observation about the interest relativity of the concepts by which people categorize the world; the other is a more recent observation about the interest relativity of our uses of words. Note their independent subject matter: concepts vs. word usage. We will see it is difficult to determine in the passages I quote just which he is addressing. I will discuss the first in the remainder of this section, the second in §10.4.2 after we’ve discussed a linguo-semantics in §10.4.1.

Ironically, Chomsky’s conceptually based anti-realism seems to surface in passages not far from those in which he makes the anti-intentionalistic claims that we discussed in §8.4—ironically, since the claim of anti-realism is that there is no reality to things apart from people’s ideas about them! Thus, twenty pages after rejecting intentionality in his (2000: 159), Chomsky endorses the “constructivist” views of his mentor, Nelson Goodman:

We can think of naming as a kind of “world-making” in something like Nelson Goodman’s (1978) sense, but the worlds we make are rich and intricate and substantially shared thanks to a complex shared nature.  (Chomsky, 2000: 181)

Chomsky proceeds to cite with evident approval Hobbes’ (1655/1839: 16ff) view that “Names are not signs of things, but of our cogitations,” as well as the view John Yolton (1984: 213) ascribed to Descartes and Reid, that “The world as known is the world of ideas . . .” (emphasis original). Berwick and Chomsky (2011) continue the thought:

The symbols of human language and thought . . . do not pick out mind-independent objects or events in the world. . . . What we understand to be a


river, a person, a tree, and water and so on, consistently turns out to be a creation of what the seventeenth century investigators called the “human cognoscitive powers,” which provide us with a rich means to refer to the outside world from intricate perspectives.  (Berwick and Chomsky, 2011: 39)

This is a view that Chomsky reiterates in a conversation with McGilvray:

the ship is not a thing in the world to start with; it’s a mental construction that, of course, relates to things in the world to start with.  (Chomsky and McGilvray, 2012: 125)

In his (2016a), Chomsky reasonably notes that

[Aristotle] concluded that we can “define a house as stones, bricks and timbers,” in terms of material constitution, but also as a “receptacle to shelter chattels and living beings,” in terms of function and design, and we should combine both parts of the definition, integrating matter and form, since the “essence of a house” involves the “purpose and end” of the material constitution,

but oddly concludes from these obvious truths:

Hence a house is not a mind-independent object.  (Chomsky, 2016a: 44, emphasis mine)

On the face of it, this is an explicit anti-realism about houses, and presumably anything else picked out by concepts involving human interests. Chomsky (1995b: 1, 24; 2000: 153) does seem prepared, however, to be completely realist about concepts not involving our interests, notably the concepts of “naturalistic theory, which seeks to set such factors [as human interests and unreflective thought] to one side” (2000: 22). Indeed:

We construct explanatory theories as best we can, taking as real whatever is postulated in the best theories we can devise (because there is no other notion of “real”)  (Chomsky, 1995a: 35; see also 1995c)

I am keeping . . . to the quest for theoretical understanding, the specific kind of inquiry that seeks to account for some aspects of the world on the basis of usually hidden structures and explanatory principles.  (Chomsky, 2000: 134)


Thus we need to distinguish Chomsky’s interest-based anti-realism from the more extreme form of what might be called “pan-conceptual” anti-realism advocated by Jackendoff in the quote above, and also by a prominent champion of Chomsky’s views, James McGilvray (1999, 2014; see also Chomsky and McGilvray, 2012). In his (1999) study of Chomsky’s work, he wrote:

Chomsky is a constructivist. This stems from his internalism and nativism and amounts to the idea that the things and the “world” of common sense understanding, and, in a different way, of science are in large measure products of our minds. As Chomsky says in an interview that appears in Language and Politics, “You could say that the structure of our experience and our understanding of experience is a reflection of the nature of our minds, and that we can’t get to what the world really is. . . . ” Common sense understanding is anthropocentric and serves our interests. The sciences try to be objective, but they and, in a different way, the phenomena they deal with are human constructs or artifacts, made by us in order to understand. In this respect, their worlds are still products of our minds.  (McGilvray, 1999: 5–6; emphasis mine)

Jackendoff (2006) does worry that his own similar proposal

smacks of a certain solipsism or even deconstructionism, as though language users get to make up the world any way they want, as though one is referring to one’s mental representations rather than to the things represented.  (Jackendoff, 2006: 226)

But he goes on to claim that “there seems little choice”:

the perceptual world is reality for us. Apart from sensory inputs, percepts are entirely “trapped in the brain”; they are nothing but formal structures instantiated in neurons . . .  (Jackendoff, 2006: 228)

And he thinks this is not so bad, since

We are ultimately concerned with reality for us, the world in which we live our lives. Isn’t that enough? . . . If you want to go beyond that and demand “a more ultimate reality,” independent of human cognition, well, you are


welcome to, but that doesn’t exactly render my (linguistic-cognitive) enterprise pointless.  (Jackendoff, 2006: 229, emphasis original)4

Now, of course, as I stressed in §6.4, it is perfectly reasonable for linguists and psychologists to confine themselves to computations on formal representations realized in our brains. They can remain entirely agnostic about whether there actually are real things represented by those representations: their theory itself entails neither that there are nor that there are not things in the external world corresponding to the representations they discuss. But of course there had better be something that does the explanatory work of linguistic posits if their theory is to be taken seriously—for example, the neural structures themselves. Or does their existence also depend upon someone thinking about them?

Although Chomsky, unlike Jackendoff and McGilvray, does seem to confine his anti-realism to interest-involving concepts, there nevertheless seems to me to be a common fallacy running through all of these anti-realist arguments, whatever the scope of their application. It may well be that our concepts of things originate in us and that the range of things that are picked out by them may not be characterizable without reference to minds, but it does not follow from these claims that the individual objects themselves within that range do not exist entirely independently, and are, one by one, identifiable physically, precisely in the way that individual tables, chairs, cats, trees, rivers, and neural structures seem to be.5

It helps here to invoke a nice logical distinction introduced by Quine (1953/61d), between the ontology and the ideology of a theory. The ontology of a theory is the set of things that have to exist for the theory to be true, the things that the theory is “quantifying over.” The ideology consists of the theory’s predicates, or how the theory sorts that ontology into (sub-)sets of things. Two theories may share an ontology, but differ, irreducibly, in their ideology. There are technical examples, but an intuitive one is afforded by Fodor’s (1975) nice example of Gresham’s Law: “Bad money drives out good.” This is perhaps a law of economics, but unlikely to be one of physics. But that does not mean the respective ontologies of those theories need differ. Every piece of money is arguably some or other physical thing—the ontology of economics is a subset

4  I will return to the distinction Jackendoff is drawing between “reality for us, the world in which we live our lives” and “a more ultimate reality” in discussing the fraught term “psychologically real” in §9.7 below.
5  Cf. Fodor (1998: 147ff). As I noted in §6.3.1, these ontological issues have been a central concern of Michael Devitt. His (1984/91/97: III) provides an excellent discussion of Goodman’s “world-making” and many other fallacious arguments for anti-realism that have been rampant in many quarters, from Kuhn (1962) to Rorty (1979) and Putnam (1987).


of the ontology of physics—but the predicate “x is money” is not a predicate of physics, and it is doubtful it could be defined in it. The property of being money is not a physical concept/property, even though it is presumably satisfied only by physical things.6

Thus it simply does not follow from the fact that certain concepts and categories of things in the world are (let us suppose) defined relative to the human mind, that the “objects or events in the world” picked out by the categories themselves do not exist perfectly independently of minds, or that speakers cannot sometimes use words to refer to them. Thoughts, desires, experiences of after-images and shooting pains: these are all clearly “mind-dependent” in that they exist (if they do) only so long as the right people are in the right mental states. But houses are not remotely mind-dependent in this way. Although houses may of course be standardly built to serve a certain purpose by creatures with minds (I don’t think we need the authority of Aristotle to know that), they may endure long after their builders—and maybe all human beings—are long gone. Or, setting aside the confound with deliberately constructed artifacts: perhaps which portion of reality counts as the Mississippi River depends upon our conceptions and interests (cf. Chomsky, 2000a: 181–3). However, so counted (and up to usual problems of vague boundaries), rivers—as well as trees, chemical compounds, continents, and stars—have their pasts and futures, and may well satisfy our concepts quite independently of whether any of us think they do, or whether there ever were or will be human beings around to use their concepts to pick them out.7

6  I don’t want to go to the wall about money, which is likely a quite complex economic category, including odd sub-categories such as “bank balances” and “bitcoin.” Quine (1953/61d: 131) provides a simple formal example. Tarski proved a certain theory, R, of the real numbers to be complete (every truth expressible in it is provable). Now, every natural number is one or another real number. But the standard theory, N, of natural numbers is, of course, incomplete. It follows that the predicate (and/or its extension) “x is a natural number” is not expressible in R, on pain of contradiction. The ontology of N is a subset of the ontology of R, but the two theories have incommensurate ideologies.
7  This is probably a good time to mention a related view associated with Kant’s (1787/1968) famous distinction between what he regarded as the “phenomenal” world, or the one that is in some way constituted by human experience, and the “noumena,” or “things-in-themselves” that he claimed we cannot really even think about (despite all the things he seems to say about “it/them”). That view seems to rest on an empiricism about concepts that it is doubtful Chomsky would endorse:

All concepts . . . relate to empirical intuitions, that is, to the data for a possible experience. Apart from this relation they have no objective validity, and in respect of their representations are a mere play of imagination or of understanding.  (Kant, 1787/1968: B298)

But why should we agree that our concepts, occasioned by experience, may not apply well beyond it? In any case, in terms of the old “cookie cutter” analogy often ascribed to Kant: even if our concepts are cutters of our own invention, still the cookies they cut out may well exist before and after we (conceptually) cut them!
Sometimes we may even be fortunate enough in our cutter concepts to “carve nature at its joints,” joints that were there all along, never needing the services of the cutter!


Of course, as we discussed in §8.7, it might turn out that some of our concepts may also be revealed to be empty and apply to nothing whatsoever, as in the case of [ghosts], [phlogiston], [round square], and [the largest prime]. But, as we noted, in such cases people are not referring to their ideas of these things: if ghosts don’t exist, they don’t exist as ideas either (much less as neural states!). The ideas of them refer to nothing or, perhaps (using “refer” here without existential import), to “intentional inexistents” (cf. §8.7).

(ii)  Deciding Ontology

There are two issues relevant to our discussion that can interest us in ontology: (i) what things/phenomena have to exist in order for a theory to be true (in Quine’s (1953/61a) phrase, a theory’s “ontological commitments”; cf. §6.3.6 above); and (ii) with what, if any, phenomena needed for a theory could or should one identify the things/phenomena—such as tables, chairs, nations, rainbows, words, and sentences—that we ordinarily think about and discuss?

Regarding (i), one turns to each theory and determines what phenomena that it discusses perform actual explanatory work (or, perhaps, along the slightly more formal lines famously proposed by Quine in the same article, what entities a theory “quantifies over”—we need not sort out that difference here). Regarding (ii), I, myself, do not expect that there is a deeply principled way of reconciling scientific and ordinary talk, or, to use the terms of Sellars (1962), the “manifest” and the “scientific” images of the world. Along the lines of a modest Chomskyan semantic proposal that we will discuss in §10.4.1, it seems to me a largely pragmatic affair, having to do with the actual use of words in the various ordinary “language games” we play, and how to reconcile them with the very special language game of science, where we seek to describe the world, as much as we can, independently of our interests and cognitions. But this does not mean that anything goes, or that there are not more and less reasonable ways to reconcile science and commonsense. In particular, there seem to me at least two obvious constraints relevant to the discussion of SLEs:

Extensional Stability: ceteris paribus, there should be wide agreement about uses of a term regarding what is in its extension, and what is or is not a borderline case;


and

Property Preservation: the central properties associated with a term should be properties of a scientifically identifiable phenomenon (object/state/event).

The first condition guards against extensional chaos, not permitting just anything to satisfy some predicate with enough relativization to different times, people, and contexts.8 The second is simply a version of Leibniz’s Law (if x = y, then, for any property, F, x has F if and only if y does).9 That is: whatever is true of something is true of it no matter what you call it (“a rose by any other name would smell as sweet”).

(iii)  Stable Cases

Much of our ordinary talk seems to me to satisfy these conditions. Thus, talk of houses, tables, chairs, trees and rivers, cats and dogs, properties of fluidity and elasticity, all seems to refer to phenomena that can be pretty stably identified with physical phenomena across speakers and contexts. A specific house can be identified with a specific physical structure that is used for habitation as it was intended to be. And there can be wide variation in how such things appear: a house’s facade can be (approximately) rectangular, even though it can appear trapezoidal from up the street, and still the extension remains constant. There is, of course, the ubiquitous vagueness in the application of virtually any predicate to things in space/time, e.g., how many hairs someone has to lose to be bald, how many molecules something has to lose to cease being a table or a chair, what to make of a new-fangled thing that is maybe half-table and half-chair. But there is by and large sufficient agreement even about what counts as such “difficult” cases: people’s judgments largely match, as Quine (1960/2013: 37) nicely put it, “umbra for umbra, penumbra for penumbra.”

8  Thus, I presume it would be silly to insist that there is a real property in objects of being amusing, which would have to be extravagantly relativized to people, times, moods, and contexts in order to avoid contradictory ascriptions of it. The ceteris paribus of course does a lot of work here: the point is that a term is extensionally stable iff disagreements about the extension are explicable as due to independent interference. Thus, your use of “chair” can be co-extensional with mine if our disagreement is due to my not having my glasses on, cf. §3.4.3 above, but not so for many of our disagreements about what is amusing. One could regard Extensional Stability as entailed by Property Preservation, since, at least in the case when speakers intend to refer to real phenomena, they presumably believe the phenomena are stable across many speakers’ usages of a word (a color, for example, is ordinarily regarded as an enduring property of a surface). But it will be simpler to treat the two conditions separately.
9  After Gottfried Leibniz (1646–1716). There are actually two principles known by his name, the quoted, uncontroversial one, sometimes known as “The Indiscernibility of Identicals,” and its converse, “The Identity of Indiscernibles,” which need not concern us here. I finesse whether one should express these principles in terms of properties (and relations) or in terms of n-place predicates (or pieces of language). Property talk is simpler and will be easier for our purposes—but more on this at §9.6 below. For those who may think the quoted law might be controversial, it is important to realize that, without the variables, what it says is essentially a logical truth: if something has a property then, by golly, it has that property!
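For readers who want the principle fully explicit, the direction of Leibniz’s Law invoked here—the Indiscernibility of Identicals—has a standard second-order formulation (the symbolization adds nothing beyond the prose statement above):

∀x ∀y ((x = y) → ∀F (Fx ↔ Fy))

That is, for any x and y, if x just is y, then any property F holds of x if and only if it holds of y. It is this direction, and not its converse, that drives the “non-preservative” arguments below.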


(iv)  Unstable Cases: Secondary Properties

Much more problematic are cases presented by “secondary” or response-dependent phenomena, such as being green, warm, or sweet, and what are sometimes called “tertiary” properties, such as being funny, ugly, or euphonious (all involving some degree of normativity). These all seem to involve a particular perceptual or emotional response: for example, the thing should standardly look red or green, feel warm or taste sweet, or should evoke laughter, disapproval, or pleasure. The trouble here is that these responses are not sufficiently stable across people, or even across different times in a single person’s life, for science to specify a single external property corresponding to each of the responses. Consequently the extensions of such terms vary widely across people and even short periods of time, and one is hard put to see any way to stabilize them (being morally right being an especially fraught case in point, which is hopefully more stable than being amusing, although perhaps not as stable as red).

The cases of secondary properties have been widely disputed since at least Galileo and Locke (who introduced the category): in their case, there seems to be far less stability than people ordinarily suppose. Color, for example, turns out to be surprisingly unstable. C. L. Hardin (1988/93, 2008) sums up the case against any “normal observer standard condition” proposal about the identification of colors:

Among other things it supposes that there is an observer—perhaps a statistically constructible one—whose visual system can reasonably serve as the basis for making the required classification. In particular, it presupposes that all normal observers will locate their unique hues at approximately the same place in the spectrum, and, given a set of standard color chips under the same conditions, will agree on approximately the same chips as exemplifying those unique hues. This is by no means the case. In fact, the differences are large enough to be shocking . . .  (Hardin, 2008: 148)

Hardin goes on to describe careful experiments by Kuehni (2001), on 40 color chips arranged in a circle, which showed that:


If the results for the four unique hue ranges are taken together, there fails to be consensus on 26 out of the 40 chips composing the hue circle. That is to say, 65% of the hue circle is in dispute!  (Hardin, 2008: 149)

This invites the observation Bertrand Russell (1912) made in the opening pages of his Problems of Philosophy, where, after similarly considering the varying conditions other than “normal” ones under which inconsistent color attributions are made, he concludes:

But the other colours which appear under other conditions have just as good a right to be considered real; and therefore, to avoid favouritism, we are compelled to deny that, in itself, the table has any one particular colour.  (Russell, 1912/59: 10)

There simply seems to be no real, explanatorily serious basis for designating certain conditions as “normal” as opposed to others.10 It is interesting to notice that Russell (1912: 10–11) goes on to rashly overgeneralize his conclusion to primary properties such as the shape of something—which, he correctly notices, also appears to vary with the visual acuity and angle of the observer. But what is striking about shape and other primary properties is how, despite the variability of appearances, there are stable conditions for their reality, provided by their role in physics (e.g. positions in space-time, what actually aligns with what, etc.).

Of course, in ordinary life our systems of color vision are sufficiently similar that there is seldom much serious disagreement: sufficient numbers of people agree about red vs. green traffic lights well enough not to regularly collide with each other. But, of course, things get stickier at paint stores, particularly between men and women who, statistically, seem to have cones with slightly different sensitivity to wavelengths in the regions associated with “blue” and “green.”

I think a useful, fairly principled way to consider (anti-)realism about a phenomenon (e.g. an object, a property, a kind) is in terms of the explanation of the (in)variance in people’s judgments: there is reason to be a realist insofar as the invariance is due largely to the world; an irrealist insofar as what invariances

10  Of course, someone totally profligate about properties could insist there really are color properties; they are just highly relational: to a person, a context, a specific viewing angle, immediately prior visual experience, and probably what one had for breakfast. But, at least in science, entities have to earn their explanatory keep, and presumably no explanatory purpose would be served by such a preposterously relativized property—one that, moreover, would not uphold an explanatory requirement of supporting counterfactuals (“If our cones were composed instead of . . . , then . . .”; “If the earth’s atmosphere were denser, then . . .”); cf. fn 8.


there are are due largely to the properties of observers. Atoms and molecules, cats and dogs, and tables and chairs should be regarded as real insofar as the explanation of the agreements between people about their extension is due to the invariant existence of those things, their disagreements due to their variable epistemic access, for example, variable theories and evidence. But when it comes to what is red, green, funny, sexy, disgusting, or embarrassing, the variability often seems to be due to irreconcilable differences in the properties of the observers: differences in sensitivities, in just what sorts of things amuse, arouse, disgust, or embarrass them. What stabilities there are are largely due to stabilities in those observers. As independent properties of the external world onto which they are projected, they enter into no serious laws, and are consequently issues about which further epistemic argument would ultimately be futile.11

(v)  Non-Preservative Cases

Even if we could stabilize the ascriptions of some of these properties/items, problems with the second consideration, Leibniz’s Law, can still abound. For example, even if the extension of the ordinary use of “triangle” is quite stable across people and contexts, still, on reflection, when one remembers that such a figure is supposed to be composed of three perfectly straight lines, no actual figure in space-time is ever really triangular: if you look closely enough, there are all manner of irregularities in at least any lines you perceive. At best, a figure may be “triangular enough” in a context, so long as there is no reason to look more closely. Thus, by Leibniz’s Law, there actually are no triangles, or indeed any of the standard Euclidean figures with perfectly straight lines with no thickness.12

Again, though, I want to stress that the issue about at least commonsense realism may be largely pragmatic, independent of the more principled issue of a theory’s ontological commitments. If one wants to insist on the unreality of tables and chairs, or on the reality of secondary properties, identified in some way highly relativized to perceivers and contexts, there seems to me no

11  What laws there appear to be are psychological laws about, e.g., x looking red in context y to agent z. Realism about color is, however, sometimes still defended even in the face of all the above variation, e.g., by Byrne and Hilbert (2003), who insist (without ascertainable argument) that there is some nonarbitrary fact, as yet unknown, that grounds the reality. Collins (2020b) stresses the locus of invariance as being a fundamental motivation of a Chomskyan linguistics, cf. fn 23 below.
12  Is all our ordinary talk about colors and triangles therefore false? We will return to this issue in §10.4.1.


principled obstacle to doing so (recall Chomsky’s qualification in the quote at the start of this chapter, about how such ontologizing could be brought off “with enough heroism”). The only issue worth stressing here is the options and costs of one physical identification over another. The costs for identifying houses are negligible; the costs for colors and secondary properties, given their instabilities, much higher; the cost for rainbows and Kanizsa figures, higher still. And the cost for the SLEs, I will now argue, is worse than all of these.

9.2  The Problems with SLEs

In §6.4 I argued that Chomskyan theories, by their own lights, seem to be committed merely to computations over representations of the SLEs that speakers take themselves to be hearing and producing. So there seems to be no explanatory reason to posit SLEs in addition to the mental representations of them, despite the theory being standardly presented in terms of SLEs. The theorists merely conveniently pretend to be discussing them, remaining at most agnostic about their actual existence. But perhaps SLEs, though not, qua SLEs, needed for linguistic theory, can, like houses and chairs, be identified with physically specifiable phenomena, for example, disturbances in the air.13

By our conditions for settling such ontological questions, we need to ask whether our talk of SLEs is sufficiently stable and whether their physical identification would preserve their theoretically important properties. Examination of actual speech strongly suggests that neither obtains. It will be clearer to take the second ontological criterion, Property Preservation, first. Consider what specific worldly phenomenon could be identified with the standard tree structure linguists provide as analyses of natural language sentences. What actual thing in the world possesses this structure? Of course, the tree talk should not be taken literally, except for saying there is a causally efficacious hierarchy of ordered n-tuples, whose terminal members are SLEs. But do the acoustic events we produce when we take ourselves to utter a sentence possess even this n-tuple structure? Reflection on the case should at least give us pause.

13  I am enormously indebted to Bill Idsardi for tutorials in which he helped me find my way through some of the complexities of the material in this section, and to Andrew Knoll for tenaciously pressing various arguments that made their way into his (2015) dissertation.


Consider by way of comparison the causally efficacious structure of an automobile.14 Cars, like (purported) linguistic entities, are artifacts, arguably tokens of types, produced by human beings with certain specific intentions. But I submit it is absolutely crucial to the explanation of why a car is so reliable that it in fact has (or realizes) a certain causal structure: the pistons fit snugly into the cylinders, so that, when the gas is ignited by the sparks from the plugs, they are pushed down with sufficient force to turn the crankshaft, and so forth. Most importantly, the standard properties of a car are the properties of this physical object with this complex causal structure. Now, does anything in the air have the structures ascribed to sentences in the way that the motors of cars have the structure of an internal combustion engine? It is certainly not obvious that they do. We need to look closely.

9.2.1  “Beads on a String”

A natural suggestion for linguistic ontology is that utterances should be understood as acoustic events in the air having phonological, morphological, and syntactic properties in precisely the way that they are perceptually represented as having them. And the structures might well be compositional along familiar physical lines: syntactic structures might be regarded as equivalence classes of morphemes, which in turn are composed of phonemes, in turn composed of phones—or specific allophones (or variant acoustic manifestations of a phone)—which could then be identified as acoustic wave-forms (“sounds” on one reading of the word). And many have supposed that this latter identification posed no problem. Bloomfield (1933), for example, characterized the phoneme as:

a minimum unit of distinctive sound-feature. . . . The speaker has been trained to make sound-producing movements in such a way that the phoneme features will be present in the sound waves, and he has been trained to respond only to these features.  (Bloomfield, 1933: 79; see also Z. Harris, 1951: 63–75)

14  I use the example of cars, since, like SLEs, they are deliberate complex productions of human beings. In Rey (2008) I instead contrasted the case of natural language with that of a “language of thought” (LOT): on the supposition that there is one, and that it was, say, first-order logic, then a brain or machine’s performing a deduction from “(Fa v Ga)” and “-Fa” to “Ga” might be explained by, inter alia, the causal efficacy of the syntactic structure of “(Fa v Ga)” and an application of the rule of Disjunctive Syllogism, tokens of that structure being identifiable with physical structures in the brain. It is doubtful that there are any such genuine causal roles for the SLEs of natural language systematically to play. In the present context, however, the analogy might be thought to beg relevant questions about the relation of thought to natural language that would be too difficult to address here.


Meaningful language is constructed by combining phonemes:

Once we have defined phonemes as the smallest units which make a difference in meaning, we can usually define the individual phoneme according to the part it plays in the structural pattern of the speech forms. . . . The phonemes so defined are the units of signaling; the meaningful forms of a language can be described as arrangements of primary and secondary phonemes.  (Bloomfield, 1933: 136)

In terms of a simile the phonetician R.H. Stetson (1945/51: 6) used to characterize the view: sentences were composed of phonemes like “beads on a string.”15 What might the “beads” be? There are two basic candidates, phonetic and phonemic phenomena (i.e. entities, or “feature” instances of them, see fn28 below). As we noted in passing in §7.1.3, phonetic phenomena are the actual sounds of speech and/or articulatory gestures that produce them; phonological phenomena are what speakers “hear those sounds as” and/or are what are processed by the phonological component of their I-language. The relation between the two is complex. The phonological phenomena are abstract, discrete, and often highly idealized versions of the usually quite messy phonetics, with its continuously varying gestures and acoustic waves. The reasons for this are, when you think about it, fairly obvious.

9.2.2  Efficiency and Noise

The problem is that speech does not seem to be at all segmented, even probabilistically along the lines of statistical norms, in the way that speakers and hearers standardly take it to be. As the phonetician John Laver put it:

The stream of speech within a single utterance is a continuum. There are only a few points in this stream which constitute natural breaks, or which show an articulatory, auditorily or acoustically steady state being momentarily preserved, and which could therefore serve as the basis for analytical segmentation of the continuum into “real” phonetic units.  (Laver, 1994: 101)

15  Recall from our discussion in §2.2.3 above that Chomsky, himself, seemed to maintain versions of the “beads on a string” conception of Bloomfield and Quine up through chapter 1 of his (1965) Aspects (but not thereafter).


What segmentations there are—and, to be sure, many phonemes often have some distinctive indicators—are not to be relied upon generally. There are simple, straightforward reasons for this unreliable segmentation: speakers are efficient, and the articulatory system involves a digital-to-analog transduction that, unsurprisingly, is susceptible to statistical noise that varies between speakers and between contexts. As a result, the acoustic flow of normal speech is produced without anything like the “boundaries” between linguistic elements that we take ourselves to “hear” and capture very roughly in orthography.16 Each of these issues deserves comment.

(i)  Communicative Efficiency

Speakers usually try to convey what sentences they intend to utter as quickly and effortlessly as their hearers will permit, and they can depend upon hearers being (partly innately, partly by culture) so well attuned to them as not to need any fully explicit externalia. Indeed, normal speech exploits such phenomena as:

(a) Filling in: efficient “top-down” processes “fill in” phones that are demonstrably absent, as when mere silence between the “s” and the “l” in “slit” is heard as “split”;
(b) Assimilation: nearby phonemes affect each other: in tenth, the often alveolar /n/ is pronounced as a dento-alveolar when a dental, /θ/, follows it;
(c) Anticipation: speakers form their mouths differently in anticipation of later segments: the mouth is differently formed to pronounce the “s” in “spoon” vs. the “s” in “spleen”;
(d) Displacement: phonemes are heard differently depending on material at other locations: the difference between “rider” and “writer” is heard as a difference between /d/ and /t/, but in fact the /d/ and /t/ are often pronounced identically (as flaps), the difference in how they are heard being due to the shorter sound of the vowel preceding the flap.

That is: normal speech is much more like the casual “Whachyadoin?,” or a doctor’s rapid handwriting, than like the standardized fonts on a computer.17

16  Notice that the underlying point could be argued virtually a priori to be a feature of all our interactions with the world. Given that macro-creatures like ourselves live in a complex, noisy world, it would seem inevitable that an intelligent creature should use simple, relatively easily computable categories to try to classify stimuli and motor movements in abstraction from the noise.
17  Someone might claim that tokens of these standardized fonts supply tokens of words, phrases, and sentences. They certainly suffice for ordinary purposes. But the issue here is a theoretical one in linguistics, not typesetting, and, as we have noted from the start, for a variety of reasons orthography is not regarded as part of theoretical linguistics. At best, it supplies a coarse and often clumsy transcription of the approximate phonology, and so is no more part of language proper than is the particular notation of the IPA. There may be tokens of the sequence of Times New Roman, 8 font, alphabetical letters in “whachyadoin?,” and that sequence may indicate to a standard reader the complex of various SLEs that the linguist studies, but, needless to say, that sequence is not itself a serious SLE.


As the renowned speech scientist Alvin Liberman (1996) pointed out, a system allowing for such shortcuts solves the problem of communicating long strings of discrete, categorical units quickly and in a fashion that can be manageably decoded by a hearer. In fact, without them:

The problem is that, at normal rates, speech produces from eight to ten segments per second, and, for short stretches, at least double that number. But if each of those were a unit sound, then rates that high would strain the temporal resolving power of the ear, and . . . also exceed its ability to perceive the order in which the segments had been laid down . . .  (Liberman, 1996: 33)

Consequently, the phonological order a hearer recovers from a stretch of sound is determined by the overall “shape of the acoustic signal, not by the way the pieces of sound are sequenced in it” (Liberman, 1996: 34).18

(ii)  Noisy Transduction

Although speakers usually intend their utterances to be understood in a segmented, digitalized fashion, their articulatory apparatus is an essentially “analog” system, with continuous variation along most parameters, and susceptible to all manner of statistically noisy interference. The linguists Mark Hale and Charles Reiss (2008: 109–18) distinguish the computational processes involved in speech production, whereby phonemic representations are algorithmically transformed into phonetic ones, from transduction and other non-cognitive physical processes whereby phonetic representations are transformed into articulatory acts.19

18  This last may be a bit hyperbolic. Obviously, the order of at least some acoustic features associated with specific phonemes provides fairly consistent indications of the identity of the relevant SLE. Moreover, the relevant contrastive features are likely exaggerated for infants and foreigners learning a language, or the hard of hearing. The point is that there can be considerable variation in contraction, and that often it may be just the overall shape that is critical. But there is plenty of room for many intermediate cases. A famous set of examples are the many pronunciations of the word “extraordinary,” which can vary from six to only two syllables: for most British English speakers from the hyper-careful [‘ek’strəʔɔ:ɪnɘdrɪ] through the fairly careful [ɪk’strɔ:dnrɪ] to the very colloquial [‘strɔnrɪ]. (Fudge, 1990: 40; quoted in Wetzel, 2006)
19  I simplify their account, omitting their (undeveloped) postulation of a “gestural” score that is an intermediary between phonetics and articulations, and the “auditory” score, between acoustics and phonetics (see Hale and Reiss, 2008: 116–18 and Volenec and Reiss, 2017, for the proposal of a “cognitive phonetic” level for computations between the phonological and articulatory systems). Transduction in general is a process by which one physical process, e.g. acoustics, is transformed into another, e.g. electrical signals in a microphone or a brain, a process we will discuss further in §11.1.

Consequently, the phonological order a hearer recovers from a stretch of sound is determined by the overall “shape of the acoustic signal, not by the way the pieces of sound are sequenced in it” (Liberman, 1996: 34).18 (ii)  Noisy Transduction Although speakers usually intend their utterances to be understood in a segmented, digitalized fashion, their articulatory apparatus is an essentially “analog” system, with continuous variation along most parameters, and susceptible to all manner of statistically noisy interference. The linguists Mark Hale and Charles Reiss (2008: 109–18) distinguish the computational processes involved in speech production, whereby phonemic representations are algorithmically transformed into phonetic ones, from transduction and other non-cognitive physical processes whereby phonetic representations are transformed into articulatory acts.19 These latter will include patently not regarded as part of theoretical linguistics. At best, it supplies a coarse and often clumsy transcription of the approximate phonology, and so is no more part of language proper than is the particular notation of the IPA. There may be tokens of the sequence of Times New Roman, 8 font, alphabetical ­letters in “whachyadoin?,” and that sequence may indicate to a standard reader the complex of various SLEs that the linguist studies, but, needless to say, that sequence is not itself a serious SLE. 18  This last may be a bit hyperbolic. Obviously, the order of at least some acoustic features associated with specific phonemes provides fairly consistent indications of the identity of the relevant SLE. Moreover, the relevant contrastive features are likely exaggerated for infants and foreigners learning a language, or the hard of hearing. The point is that there can be considerable variation in contraction, and that often it may be just the overall shape that is critical. But there is plenty of room for many intermediate cases. A famous set of examples are the many pronunciations of the word “extra­or­din­ ary,” which can vary from six to only two syllables: for most British English speakers from the hyper-careful [‘ek’strəʔɔ:ɪnɘdrɪ] through the fairly careful [ɪk’strɔ:dnrɪ] to the very colloquial [‘strɔnrɪ]. (Fudge, 1990: 40; quoted in Wetzel, 2006) 19  I simplify their account, omitting their (undeveloped) postulation of a “gestural” score that is an intermediary between phonetics and articulations, and the “auditory” score, between acoustics and

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 07/09/20, SPi

312  Representation of Language ­ on-linguistic processes such as action planning and attention, the effects of n emotions, alcohol consumption, burp suppression, and the presence of crackers or peanut butter in the vocal tract (to cite some of their examples, Hale and Reiss, 2008: 110). Such transduction can involve subtler issues than crackers and peanut butter. Jerry Fodor et al. (1974) observed: Values of [represented] phonetic features may change instantaneously [e.g. from +voiced to -voiced], but states of the vocal apparatus cannot. Dampening of resonators, movement of the lips and tongue, opening and closing of the velum, etc. all take time, so that it is plausible to expect the configuration of the vocal apparatus that is actually produced by any set of instructions will be determined, at least in part, by the character of the immediately preceding configuration. In fact, we know that it is also partly determined by the character of the intended succeeding configurations . . . (Fodor et al., 1974: 302)

This has the consequence that Liberman and Mattingly (1985) pointed out: phonemic features cannot be mapped one-to-one with actual articulatory movements. The movement to which any given [representation] of [a] phon­et­ic feature gives rise depends upon the current state of the vocal tract at the time the feature is tokened. For example, “lip rounding” may involve not just the lips, but the jaw as well.  (Liberman and Mattingly, 1985: 22)

Not only may a feature representation map to different movements, but the same movement can be mapped to different representations: a single articulator may participate in the execution of two different gestures at the same time; thus, the lips may be simultaneously rounding and closing in the production of a labial stop followed by a rounded vowel, for example, [bu]. (Liberman and Mattingly, 1985: 22)

They go on to explain how the same problem arises even for more abstract motoric properties, for example, ones individuated in terms of the changes they might induce in the vocal tract, but conclude: phonetics (see Hale and Reiss,  2008: 116–18 and Volenec and Reiss, 2017, for the proposal of a ­“cognitive phonetic” level for computations between the phonological and articulatory systems). Transduction in general is a process by which one physical process, e.g. acoustics, is transformed into another, e.g. electrical signals in a microphone or a brain, a process we will discuss further in §11.1.

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 07/09/20, SPi

Linguistic Ontology  313 We would argue, then, that the gestures do have characteristic invariant properties, as the motor theory requires, though these must be seen, not as peripheral movements, but as the more remote structures that control the movements. These structures correspond to the speaker’s intentions. (Liberman and Mattingly, 1985: 23)

We will return to the role of speaker intentions in §9.5.20 For now, it appears what hearers have to recover from the acoustic forms that reach their ears are the speakers’ intended representations. At best, acoustic phenomena provide evidence for linguistic competencies, but are not constitutive of them. Indeed, in a useful analogy, Fodor et al. (1974) compare the acoustic effects of speech to the clues left by a criminal: the hearer is in the position of a detective, inferring the identity of the criminal from the clues—but not by identifying the criminal with the clues: [T]he detective may be able uniquely to determine the identity of the crim­ inal from the clues, but it does not follow that there is a characteristic clue for every criminal. . . . The acoustic representative of a phone turns out to be quite unlike a fingerprint. It is more like the array of disparate data from which Sherlock Holmes deduces the identity of the criminal. (Fodor et al., 1974: 301; see also Laver, 1994: 106)

What “clues” and “signals” a speaker provides a hearer will vary according to the speaker’s estimation of what the hearer in a particular context will need: articulating slowly and distinctly for children, foreigners, and noisy telephones; speeding up and employing copious contractions with familiar friends and colleagues; and proceeding at a breakneck pace in the highly stylized speech of an auctioneer. This tends to have the result nicely captured in a famous analogy made by the earlier linguist Charles Hockett (1955): Imagine a row of Easter eggs carried along a moving belt; the eggs are of various sizes, and variously colored, but not boiled. At a certain point, the belt carries the eggs between the two rollers of a wringer, which quite ef­fect­ ive­ly smash them and rub them more or less into each other. The flow of eggs before the wringer represents the series from the phoneme source; the mess that emerges from the wringer represents the output of the speech 20 The relevant intentions are presumably the ones immediately involved in producing an ­ tterance—perhaps ones inside a special speech production module—not whatever speech or other u acts someone might also intend to perform. I will take this as read in the remainder of this chapter.



And, of course, clues and evidence would seem to be the order of the day when we consider the languages of the deaf (e.g., ASL), and, even more spectacularly, the tactile “Tadoma” language developed for the deaf and blind (whereby intended SLEs are detected, to an amazing degree of accuracy, by touching the face and neck of a speaker! See Reed et al., 1985). But, if the acoustic material merely provides clues to the intended SLEs, then they need not be identical to them; indeed, their work would be done so long as they merely clued hearers to the SLEs the speaker intended—even if, moreover, they were never actually produced.

All of these phenomena obviously undermine the “beads on a string” view, which I take to be a virtually defining property of at least phonemic strings. The early linguist Otto Jespersen (1924/63) was skeptical of any acoustic definitions of “words”:

Words are linguistic units, but they are not phonetic units: no merely phonetic analysis of a string of spoken sounds can reveal to us the number of words it is made up of, or the division between word and word. This has long been recognized by phoneticians and is indisputable: a maze sounds exactly like amaze, in sight like incite, a sister like assist her. . . . As, consequently, neither sound nor meaning in itself shows us what is one word and what is more than one word, we must look out for grammatical (syntactic) criteria to decide the question.  (Jespersen, 1924/63: 93)

This is a view echoed recently by, for example, the phonologists Holt and Lotto (2010):

The end result of all these sources of variability is that there appear to be few or no invariant acoustic cues to phoneme identity.  (Holt and Lotto, 2010: 1218; cf. Liberman and Mattingly, 1985: 12)

Nonetheless, phonemes are heard and understood by and large as discrete entities that crucially follow each other in certain sequences. Since the discrete sequencing is clearly essential to the explanatory role of SLEs, evidence that there are no such acoustic beads on a string is, by Leibniz’s Law, evidence that SLEs cannot be plausibly identified with them.21


We will now also see that SLEs don’t do any better than color does with respect to the issue of the stability of what speakers take their extensions to be.

9.3  Dispositional Strategies

Consider our first ontological criterion, Extensional Stability. Ever since Locke distinguished primary and secondary qualities, many philosophers have been inclined to take up his further suggestion to define the secondary ones in terms of dispositions to respond in a certain way to patterns of primary ones. Thus, for something to be red would be for it to look red if it were to be viewed under normal conditions. As we noted above, Russell argued that, for lack of a principled way to specify some but not other conditions as “normal,” this is an unpromising strategy, at least for color. The strategy looks even less promising for SLEs, which we have just seen exhibit the same sort of variability not only in how they sound, but also in how they are produced, and so one would need to specify a “normal” condition for both. Obviously, one would want to abstract from clearly extraneous influences, such as eating, inattentiveness, inebriation, agitation, or song. But what of other less avoidable factors, such as anatomy, social status, speed, mumbling, or “speech-impediments”? Should linguistics be grounded by BBC newscasters, l’Académie Française, or how sufficiently well-educated lawyers intone in court? Consider simply the wide variation in pronunciations of “extraordinary” that we quoted from Fudge (1990: 41; see fn18 above); or the surprising variation observed in a famous study of Peterson and Barney (1952), where they presented 70 listeners with 10 sets of different vowels, each presented 152 times, and found a surprising level of disagreement about the identification of certain vowels:

21  Of course, a standard way to deal with noisy systems like speech is to apply statistical analyses. This is precisely how computer speech recognition devices work, exploiting “Hidden Markov Model” computations, which provide ingeniously elaborate statistical analyses of input (see, e.g., Rabiner, 1989). But, of course, they no more attest to the reality of the SLEs they recognize than statistical demographic studies attest to the reality of “the average American family” as an actual family.
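To make footnote 21’s point concrete before returning to the Peterson and Barney data, here is a minimal sketch of the sort of Hidden Markov Model computation the footnote mentions. Everything in it—the two-phoneme state inventory, the acoustic “cues,” and all the probabilities—is invented purely for illustration (a serious recognizer of the kind surveyed in Rabiner, 1989, is vastly more elaborate). The point to notice is that the discrete phoneme labels figure only as hidden states of the model: an ambiguous cue, like the flap shared by “rider” and “writer” in §9.2.2, gets classified by its statistical context, not by any discrete “bead” present in the signal itself.

# A toy Hidden Markov Model decoder for a two-phoneme inventory.
# All states, cues, and probabilities are invented for illustration only.

states = ["/d/", "/t/"]

start_p = {"/d/": 0.5, "/t/": 0.5}

# P(next phoneme | current phoneme)
trans_p = {
    "/d/": {"/d/": 0.7, "/t/": 0.3},
    "/t/": {"/d/": 0.4, "/t/": 0.6},
}

# P(acoustic cue | phoneme); the "flap" cue is equally compatible with both
emit_p = {
    "/d/": {"voiced": 0.6, "flap": 0.3, "voiceless": 0.1},
    "/t/": {"voiced": 0.1, "flap": 0.3, "voiceless": 0.6},
}

def viterbi(observations):
    """Return the most probable hidden phoneme sequence for a list of cues."""
    # best[t][s]: probability of the best path ending in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda r: best[t - 1][r] * trans_p[r][s])
            best[t][s] = best[t - 1][prev] * trans_p[prev][s] * emit_p[s][observations[t]]
            back[t][s] = prev
    # Trace the best path backwards from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.insert(0, back[t][path[0]])
    return path

# The same ambiguous "flap" is resolved differently by its context:
print(viterbi(["voiced", "flap", "voiced"]))        # ['/d/', '/d/', '/d/']
print(viterbi(["voiceless", "flap", "voiceless"]))  # ['/t/', '/t/', '/t/']

As the footnote stresses, nothing in such a computation attests to /d/s or /t/s existing as discrete segments in the acoustic stream: the segmentation lives entirely in the model’s hidden states.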


Of the 152 sounds . . . 143 were unanimously classified by all observers as /ɪ/. Of the 152 sounds which the speakers intended for /ɑ/, on the other hand, only 9 [6%] were unanimously classified as /ɑ/ by the whole jury.  (Peterson and Barney, 1952: 177)

Moreover, the ability of listeners to “correct” for all this variation itself obviously varies widely, some of us being worse with accents and impediments than others. One would be very hard put to specify a “normal” speaker, listener, or context of delivery that would suffice for all speakers of “the same language,” dialect, or even a single person’s idiolect over time.22

9.4  Articulatory Idealizations

Chomskyans, of course, are not daunted by mere superficial variabilities. As they are the first to insist, there are systems of syntactic, semantic, and phonological competence underlying the diversity of languages, dialects, and pronunciations. Why not one for phonetics as well, which might relate the phonology to articulators and/or acoustic forms? In general, Chomskyans tend to be skeptical that there are systematic theories relating the deliverances of syntax, phonology, and even semantics (cf. §10.4) to phenomena external to them, such as the motor system and the circumstances of speech. As we noted earlier, Hale and Reiss (2008: 109–18) distinguish processes of computation that produce phonetic representations from the transductions that relate those representations to actual motor activity, subject to arbitrary influences that there is no reason to think will be amenable to general theory. Nevertheless, Galilean idealization might serve here as well. One might try to idealize the phonetics to the effects a specific phonological instruction

22  Peterson and Barney did not try to control for dialectal differences among hearers, except to the extent that they were all presumably mutually intelligible “English” speakers; but doing so would require individuating dialects in some principled way, which, as we saw (§4.3), Chomskyans have reasons to be skeptical can be done. Devitt (2006a: 156) would of course appeal to social conventions: SLEs are

social objects like the unemployed, money and smokers . . . which have their relevant linguistic properties “in virtue of environmental, psychological, and social facts” (Devitt, 2006b: 583) and, particularly, social conventions.  (Devitt, 2006a: 156)

See also Devitt’s (2006a: 186, 2006b: 583, 598ff, 2008a: 221) and Barber’s (2006: 420ff) appeals to conventions in a population that associates sounds and meanings. The burden would be on them to show that there was sufficient stability in the role of such conventionally defined SLEs in an external linguistic reality to sustain stable identifications that could play a serious explanatory role. Note that this external stability is to some extent supplied in the case of money by a legally imposed mint.


would have in a highly idealized mouth in splendid isolation from other factors, especially from instructions for other features, that is, ceteris paribus (cf. §3.4.3). Thus, the instruction [+nasal] might be to lower the soft palate in such a way as to produce a certain idealized sound in abstraction from the anticipations, assimilations, and displacements we discussed above. Note that this seems to be the way the IPA (International Phonetic Alphabet) is conceived. Here, phones are identified in terms of articulatory gestures, accompanied by a schematized drawing of a typical vocal tract, one for the consonants (Figure 9.1) and another for the vowels in an even more idealized schema of the oral cavity (Figure 9.2). Of course, it is unlikely that any actual speaker’s vocal tract ever completely realizes such idealized schemata, or that features are very often, if ever, produced in such isolation. But the idealization could nevertheless provide the right basis on which to organize all the variation. Everyone may have the same underlying I-phonological idealizations, but differ widely in their articulatory performance systems in varying contexts. One way to spell this out might be by specifying those conditions as ideal that are “explanatorily basic,” the ones involving the production of appearances on which other

Velum

Alveolar Ridge

Oral Cavity Lips Teeth

Voice Box

Tongue (Tongue tip & Tongue Blade)

Figure 9.1  Places for Articulation of Consonants

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 07/09/20, SPi

318  Representation of Language VOWELS Close

Front

Central

i•y

i•u

Back ɯ•u ʊ

I Y Close-mid

ɘ•ɵ

e•Ø

ɤ•o

ɘ Open-mid

ɜ•ɞ

ɛ•œ

ɐ

æ Open

ʌ•ɔ

a•Œ

ɑ•ɒ

Where symbols appear in pairs, the one to the right represents a rounded vowel.

Figure 9.2  Locations for Articulation of Vowels

productions of the appearances asymmetrically and explanatorily depend.23 I will develop this suggestion in §10.3 and §11.2. I raise it here only as a the­or­ et­ic­al possibility that might provide a Galilean reply to Chomskyan ­skepticism about phonetics. An interesting question would be how precisely to characterize such highly idealized phonemic features: should it be in acoustic or articulatory terms? Some argue that the IPA is an historical artifact developed some fifty years before spectroscopy permitted precise acoustic signatures of phones and phonemic phenomena to be identified, and that these should now be in­corp­or­ated with perhaps their phonotactic, alternative, and contrastive relations to each other (cf., Goldsmith, 1995). There is no need to settle these complex issues here. Perhaps the ultimate characterization should be in terms of some set of pairings of highly idealized gestures and phono­logic­al representations (perhaps established in early babbling, cf. Hoff, 2009), so that what the hearer has to do is to decide which pairing best fits the im­agined endogenous motor input instructions and the auditory input (a  model that seems to me suggested by Chomsky and Halle, 1968: 294, which we will discuss further in §11.2). It is enough for our purposes here that, whatever the ultimate candidates, they will be ones that involve substantial idealization from actual speech. 23  I have in mind a version of Jerry Fodor’s (1987, 1991) “asymmetric dependency” theory of content that I will discuss in §10.3. Fodor (pc) once mentioned to me that it was phonological phenomena that he initially had in mind for that theory. I am indebted to Sophia Sanborn for pressing me to include this approach here.


In appealing to constraints imposed by idealized articulators, such a proposal might seem to render the case of phonemic phenomena in some ways better off than that of colors. However, it also makes the situation in other respects worse. In the case of colors, if by virtue of certain idealized dispositions, some range of, say, surface reflectances were to be stably identified with red, then it would turn out that there would be many red surfaces in the world, since, were that very surface to be observed in perfect isolation under ideal conditions, then it would appear red in those conditions. But even if one were to find an appropriate idealization under which certain gestures and/or acoustic patterns in isolation were stably identified as, say, a /d/, that pattern might seldom, if ever, occur in ordinary speech. Unlike a surface judged to be colored, which likely would retain its relevant properties across different illumination conditions, instructions to utter SLEs are standardly executed in groups, and so each of them invariably gets compressed or otherwise distorted in all the ways we have mentioned, not retaining its local integrity as a surface typically would. So, unlike colors, even ideally defined tokens of phonemes might be rarely, if ever, actually produced; and certainly words and phrases, composed of so many distorted phonemes, would be even rarer.

9.5  Speaker Intentions?

One plausible suggestion that would seem to accord with much ordinary judgment might be that a given acoustic phenomenon counts as a token of an SLE if it were to be produced by a speaker with the (immediate, cf. fn 20) intention to produce that SLE, whether or not the phenomenon displays the essential properties of segmentation of the SLE that the speaker intended to produce (cf. the above quote from Liberman and Mattingly, 1985: 23; Kaplan, 1990: 101–4). To be sure, this corresponds to a loose criterion that we seem willing to allow in the case of many artifacts: something can count as a piece of art, a diagram, an expression of joy, a home, if its creator intended it as such. Moreover, speech perception of an acoustic phenomenon will often be regarded as veridical if the hearer does in fact recover the intention with which the phenomenon was produced. But there are clearly limits, determined pragmatically and contextually. No matter how sincere a person might be, their mere intention cannot create a tuna sandwich by producing a xylophone. Similarly, if a person's mumblings are understood only by his spouse, or inferred from unusual cues other than the acoustics themselves, there would be no interesting theoretical reason to include them as tokens of the intended linguistic type.24

Perhaps there is some ingenious combination of all the above approaches that could circumvent these difficulties and provide satisfactory externalist analyses that would provide a basis for saving ordinary entokenings of SLEs.25 But one might wonder why it is so important to provide one. If one looks at the theoretical work SLEs are supposed to perform, their definition in terms of even dispositions to respond to any but highly idealized, rarely instantiated acoustic phenomena is entirely needless and extraneous to linguistic theory. In any case, Chomskyan linguists do not await such externalist analyses of ordinary SLEs any more than vision theorists wait upon externalist analyses of ordinary colors. To vary Max Weinreich's famous quip:26 an externalist analysis of ordinary speech would seem at best an analysis of an idiolect with a gun-boat—and an obsessive metaphysician at the helm!

24  As Hawthorne and LePore (2011: 478) rightly observe, "Grunts and groans are not [utterances] of words no matter how much we intend them to be so." They offer as a criterion for utterance identity simply that "one passes the standards of the relevant community" (p. 464). They admit this invites a denial of the existence of words, but, averse to abandoning the posits of commonsense, they opt instead for a "sloppy realism" about them (p. 482); that is to say, a strict anti-realism. An interesting question can be asked regarding the "McGurk effect," where a hearer is provided an auditory /b/ but sees the speaker simultaneously producing the lip formation for [g]—and, as a result, hears a /d/ (see McGurk and MacDonald, 1978). Now, is this an "illusion" or simply an unusual entokening of /d/? Since for an anti-realist about SLEs almost all "hearings" of them are illusions, this further distinction among them would seem to be entirely arbitrary and pragmatic.

25  Thus, in response to my discussion, Nick Allott has replied: But so what if there were all this variation? It's well known that realisations of phonemes are affected by their phonetic context, rather like letter shapes in cursive handwriting. But it doesn't follow from this that the contextual variants (allophones) are not variants of the phonemes (just as different cursive letter forms are still forms of letters). One can even make a cursive typeface, because the effects from neighboring letters are rather predictable. See e.g. https://designmodo.com/cursive-webfonts. Allophones of phonemes are largely similar in being contextually conditioned in mostly very predictable ways. And, of course, to try to stabilize communication, one can stylize and formalize both orthography and speech (although good luck with doctor's prescriptions!) Again, there's Liberman and Mattingly's (1985: 23) claim that the common structures "correspond to the speaker's intentions". In any case, the question is whether such stylization is anything more than a transient happy accident, playing no serious explanatory role in linguistics.

26  In a 1945 speech, the linguist and Yiddish scholar Max Weinreich said, "a shprakh iz a dialekt mit an armey un flot" ("a language is a dialect with an army and navy"), although, according to a Wikipedia article on the remark, he attributed it to a Bronx high school teacher in the audience.

9.6  SLEs as Abstract Objects

By way of completeness of our survey of ontological options, we should mention again the view of Jerrold Katz (1981, 1985c), Katz and Postal (1991), and Scott Soames (1984), which we discussed at length in §6.2.1, that SLEs might be identified as abstract objects. In addition to the replies to their arguments there, one can add here the ecumenical reply we gave to Devitt's nominalism: gather ye science where ye may! But just as Chomsky's interest in I-language does not preclude an interest in abstracta, an interest in abstracta does not preclude his interests in psychology. Indeed, as we noted in discussing John Collins' (2009, 2014) appeal to SLEs as a system of abstracta, if we are to satisfy Chomsky's aim of explanatory adequacy (§4.1), there will ultimately need to be some explanation of the possibility of speakers' perception and production of what they take to be tokens of SLEs. After all, it is hard to see how a child could acquire a language without being able to perceive and/or parse at least some initial ambient events as token SLEs. In providing such an account, a Platonist will need to engage in real world psychology after all.27

Thinking about abstracta does raise the vexing problem of how to regard "properties" or "features" in general. They are standardly conceived as abstracta, but there is the problem that they can also be invoked to explain causal relations. It is, after all, the property of having a certain mass at a certain position in space-time that explains the moon's effect on the tides. More fundamentally, there are the physicist's fundamental properties of, for example, mass, charge, and spin that also seem to be at least partly localized. It would appear, therefore, that properties cannot only be abstract, but somehow need to be "entokened" or "instanced": it is the instancing, or what is sometimes called the "trope," of the property of having a certain mass that is responsible for the tides on a specific occasion.28 And this will seem to be true in the linguistic case: people take themselves to perceive and parse tokens of SLEs, often representing them as instancing properties of being a noun or being a VP (one person may take themselves to have heard "love" as instancing being a verb, another as instancing it being a noun). Of course, if the acoustic stream cannot be segmented into particular phonological elements, then it cannot be segmented into the corresponding tropes either, even if the "abstract" feature itself might still exist in Plato's heaven and be represented by people's various "concepts" of them. I suspect that the thorny issues of sorting all this out will not impact our discussion, and so I will continue to use "property" for either the abstracta or for tropes.

27  The problem is made perhaps more vivid when someone appeals to abstract entities to classify neural states. Collins (2014) proposes that

we might think of the abstract properties as the means of describing or individuating kinds of brain states that are subsumable under certain generalizations couched in terms of the type-individuating abstracta. Under such a construal, the brain states do not represent the abstracta . . . rather, the abstracta simply individuate the would-be vehicles of content for the explanatory endeavor at hand.  (Collins, 2014: 37, emphasis mine)

But, however much it is plausible to suppose one could use abstracta to individuate neural states without their being represented or (per Quine, 1953/61a) being "values of variables," it is hard to see how one could account for the deployment of states so individuated in the perception of acoustic events as uttered tokens, a point to which we will return in Chapter 11.

28  I use "instancing," since it's the objects that "have" the properties or tropes that are said to "instantiate" them. See Campbell (1990) for discussion.

9.7  “Psychological Reality” Chomsky does briefly address the issue of linguistic ontology in his (1968) book with Morris Halle, The Sound Pattern of English. In a section entitled, “On the Reality of Phonetic Representations,” they address the question, “What exactly is a phonetic representation?,” in its own separate section. It is answered thus: A phonetic representation has the form of a two-dimensional matrix in which rows stand for particular phonetic features; the columns stand for the consecutive segments of the utterance generated; and the entries in the matrix determine the status of each segment with respect to the features. In a full phonetic representation, an entry might represent the degree of intensity with which a given feature is present in a particular segment. (Chomsky and Halle, 1968: 5)

Lest these matrices be regarded as merely there for the linguist, Chomsky and Halle "propose further that such representations are mentally constructed by the speaker and the hearer and underlie their actual performance in speaking and 'understanding'" (1968: 14). Indeed, a subsequent section is devoted to the question of psychological reality, in which they conclude:

A person who knows the language should "hear" the predicted phonetic shapes. . . . We take for granted, then, that phonetic representations describe a perceptual reality.  (Chomsky and Halle, 1968: 25, emphasis mine)

But what does that mean? What is a "perceptual reality"? They go on to stress that

there is nothing to suggest that these phonetic representations also describe a physical or acoustic reality in any detail.  (Chomsky and Halle, 1968: 25)

So one might think that “perceptual reality” would be a kind of “psychological reality.”


But this latter phrase suffers from a crucial ambiguity. When asked about the "psychological reality" of his grammars, Chomsky (1980a: 106ff, 191) replied that there is no reason to be less of a realist about linguistics and psychology than about physics:

What is commonly said is that theories of grammar . . . have not been shown to have a mysterious property called "psychological reality." What is this property? Presumably, it is to be understood on the model of "physical reality." But in the natural sciences, one is not accustomed to ask whether the best theory we can devise in some idealized domain has the property of "physical reality," apart from the context of metaphysics and epistemology, which I have put aside here. . . . The question is: what is "psychological reality," as distinct from "truth in a certain domain"?  (Chomsky, 1980a: 106–7)29

Chomsky goes on to note (1980a: 107–8) that it was Edward Sapir (1933/49) who introduced the phrase "psychologically real," specifically with reference to phonemes, and is mystified why critics of Sapir thought that he should have simply claimed they were convenient "fictions":

it is clear from the ensuing debate up until the present that no matter how powerful the "linguistic evidence" might have been, it would not have sufficed to establish "psychological reality." . . . In short, the evidence available in principle falls into two epistemological categories: some is labelled "evidence for psychological reality," and some merely counts as evidence for a good theory. Surely this position makes absolutely no sense. . . .  (Chomsky, 1980a: 108)30

29  I presume the "metaphysical and epistemological issues" that Chomsky is setting aside are the usual ones about theoretical entities in science generally, which have been rife throughout the history of twentieth-century science, but were not raised in any special way by phonology. He does qualify his claim here a little in his (1980b) reply to Harman, but only to (rightly) allow that the ontology of a theory cannot always be read directly off the working statements of the theory, without addressing the specific explanatory work of the purported entities.

30  See also Chomsky (2003: 283) and Chomsky and McGilvray (2012: 73).

But, as we quoted above, Chomsky and Halle (1968: 25) themselves claim that "phonetic representations describe a perceptual reality" which is not "a physical or acoustic reality in any detail," so it was perfectly reasonable for Sapir's critics to wonder whether Sapir's phonemes were convenient fictions. Especially since, as we noted in §6.4, Chomskyans are centrally concerned with an internal computational-representational theory, they may at least remain agnostic about whether what the representations represent are real, and so engage in a convenient pretense, what I called a "representational pretense," in supposing that they—not the representations, but what the representations represent!—are real.

Thus, realism about theoretical entities is not what is at issue. That would simply amount here to realism about mental representations and the computations over them. But what reason would there be to also insist on realism about the SLEs that are represented? We have reviewed above all the reasons for thinking that they are not acoustic or articulatory phenomena. Given that phonemes are not in the acoustic stream, does anyone really want to insist that they have a "theoretical" reality comparable to, say, electrons? If they do not exist in the acoustic stream, where are they? Do they have causal powers? How?

I do think there is a widespread temptation to hold that many of the things that we think about "exist" in a special way in a kind of personal "world" peculiar to the thinker, importantly distinct from the usual spatio-temporal world, that, who knows how, causally interacts with it. Thus, as we saw, Jackendoff (2006: 229, quoted above) contrasted "reality for us, the world in which we live our lives," from "a more ultimate reality." And Thomas Kuhn (1962: 134) notoriously claimed that "after a [scientific] revolution scientists work in a different world." Of course, if (as any right-minded scientist likely thinks?) the mind just is the brain, then it follows SLEs are in the brain, no? So SLEs would have the causal powers of brain states. Silly philosophical problem dissolved.

But not quite so fast. The inference is obviously invalid. Ghosts, angels, the fountain of youth, colors, rotating cubes, ideal geometric shapes: lots of "things we think about" are said to be "psychologically real" and "in the mind," in that thoughts about them play a real role in people's lives. But it would be lunacy to say that the things thought about, themselves, are in the brain! We do sometimes say, "Pegasus is just an idea in the mind," but this of course cannot be literally true.31 What we mean is merely that Pegasus is imaginary, and is represented in the mind, just as Chomskyans talk of "representations" of SLEs as part of a familiar CRT (see §4.4).

31  See the discussion of "McX" in §8.7 above. It is hard to resist suspecting Chomsky (2000) is prey to such a confusion when, as we noted in §8.7, fn47, he claims that natural language expressions refer not to "external objects," but "things in some kind of mental model, discourse representation, and the like . . . a form of syntax."


There is a simple, but ever so profound a truth stressed (only?) by philosophers that needs to be borne in mind in all such discussions:

(U/M)  A representation of a thing is in general not the same as the thing itself.

For example, the word "dog" is not a dog. To suppose otherwise is to fall prey to the "use/mention" confusion, confusing the use of the word to talk about the animals, and the mention of the word to talk about the word itself (which is standardly indicated by placing quotation marks around the word). So the representation "VP" is not itself a VP.32 But then, given (U/M), we need still to ask: what happened to, e.g., the VP? How are we to understand the expression "representation of x" where, n.b., the x is the sort of SLE, for example a VP, standardly discussed by linguists?

There are other possible candidates that some linguists have considered for tokenings of SLEs. There are, after all, token states of the brain: why not regard them as token SLEs? The view that SLEs are token brain states I will call "neuralism." Chomskyans often seem to endorse such a view, but I am afraid this is partly a result of pervasive, surprisingly un-self-conscious use/mention confusions in their writings. Only occasionally is it endorsed as a deliberate use/mention collapse, deliberately identifying a representation with what it represents. I will consider first the confusions (§9.8.1), then the collapse (§9.8.2), and then a related confusion between representations and their intentional objects (§9.8.3).

9.8  SLEs as Neural items

9.8.1  Use/Mention Confusions

Before embarking on this topic, it should be stressed that no one, least of all philosophers, should be particularly sanctimonious about keeping use and mention clear. The confusion is widespread, but innocuous in most ordinary speech—"Look at the rabbit in the picture and the number on the door!"—as it often is in most formal logic and mathematics.33 But innocuous though the confusion may often be, sometimes push comes to shove: it is no solution to the problem of how a representation represents something to simply identify the two. Whatever intentionality is, it is not the identity relation!

Fairly egregious examples of use/mention confusions occur in crucial passages of Chomsky (2000), where he writes (italic emphasis is original; I place in bold what seem to me the critical portions):

The computational procedure maps array of lexical choices into a pair of symbolic objects, phonetic form and LF. . . . The elements of these symbolic objects can be called "phonetic" and "semantic" features, but we should bear in mind that all of this is pure syntax and completely internalist. It is the study of mental representations and computations, much like the inquiry into how the image of a cube rotating in space is determined from retinal stimulations, or imagined. We may take the semantic features S of an expression E to be its meaning and the phonetic features P to be its sound; E means S in something like the sense of the corresponding English word, and E sounds P in a similar sense, S and P providing the relevant information for the performance systems.  (Chomsky, 2000: 125)

An expression E of [language] L is a pair PHON, SEM, where PHON(E) is the information relevant to the sound of E and SEM(E) to its meaning. PHON and SEM are constructed by computational operations on lexical items. . . . PHON(E) and SEM(E) are elements at the "phonetic level" and "semantic levels" respectively; they are phonetic and semantic "representations." The terms have their technical sense; there is nothing "represented" in the sense of representative theories of ideas, for example.  (Chomsky, 2000: 173)

Earlier, Chomsky spoke of:

the generative procedure that forms structural descriptions (SDs), each a complex of phonetic, semantic and structural properties.  (Chomsky, 2000: 26)

32  Ah, but, confusingly, the representation "an NP" is in fact an NP! Sure, and sometimes a picture of a red tomato may itself be red; but that is no reason to confuse the picture with the tomato itself! Of course, for convenience we might talk about "the tomato in the picture," but few (I hope) would be confused by this.

33  N.b., a number is an abstract object, not in space or time; a numeral is a name of a number and might be entokened on a door. But, to be sure, the distinction is ordinarily not of the slightest importance. Higginbotham (1991a: 556) acknowledges that linguists often conflate use and mention, but claims there that the distinction is "only pedantically insisted upon." He does go on to note that "what it is for something to be a sentence for a person is for it to be a grammatical structure that is apprehended and applied to certain perceptible objects" (1991a: 556), but he does not seem bothered by the fact that, in applying the structure to perceptible objects, the objects of our perception are importantly different from states of each other's brains that represent those objects. However, in another piece written at roughly the same time, he is less sanguine:

Naturally, it is a good thing to know both what is represented and how it is represented; and it may seem that one can combine these aims by thinking of the elements of what is known as themselves mental representations. I think this move breeds confusion. In the case of grammar in particular, I believe that a great deal of confusion has been caused by thinking of the typical notions of grammar, such as noun and sentence, as notions of mental representations. . . . There is no more reason to say that nouns or sentences are representations than there is to say that numbers, or even chairs or human beings are mental representations.  (Higginbotham, 1991b: 125)

"Phonetic features" here are understood both as (properties of) sounds and as "pure syntax": "symbolic objects," or "mental representations." "PHON(E)" and "SEM(E)" are being used here both for "information," and for its "representation," the "expressions" that express that information, over which the computations are defined.34

Lest one suppose use/mention conflation is confined to Chomsky, here are samples from other recent leading texts. In the "glossary" of a textbook in which he is setting out the meanings of specific technical terms, Andrew Radford (1997) writes (critical wording in bold):

Feature: A device used to describe a particular linguistic property (e.g. we might use a feature such as [Nom] to denote the nominative case-feature carried by pronouns such as he). By convention, features are normally enclosed in square brackets. . . . The head-features of an item describe its intrinsic grammatical properties. The feature specification of a word is (a representation of) the set of features which characterize the idiosyncratic properties of the word.  (Radford, 1997: 507)

And Adger and Svenonius (2011) uncritically reproduce in their article similar use/mention confusions of Pollard and Sag (1987), in a passage that is explicitly supposed to offer "some ontological clarifications":

Some ontological clarifications: An important issue that needs to be clarified when discussing features is their theoretical status. In certain unification-based frameworks, such as HPSG, features are used as part of a description language for grammatical theory: "Intuitively, a feature structure is just an information-bearing object that describes or represents another thing by specifying values for various attributes of the described thing; we think of the feature structure as providing partial information about the thing described."  (Adger and Svenonius, 2011: 2, quoting Pollard and Sag; bold emphasis mine)

34  See also Smith and Allott (2016: 217). On behalf of such talk, Nick Allott (pc) claims that

the way linguists use this kind of terminology tends to reconstruct the representation/represented divide. '[+voice]' and similar are names of mental features, while actual acoustic events may be voiced or unvoiced. The mental feature '[+voice]' is not itself voicing or voiced, but rather (very roughly) (part of) an instruction to produce it/what is produced when it is perceived.

Perhaps. But then the question arises about the relation between phonological "instructions" and the intended actions, which on the face of it would seem to involve the very intentionality that Chomsky and others dismiss. So this charitable reading would obscure the issue at the heart of the present chapter.


Now, with a little work (or maybe just a blurring of thought), one can of course make a good guess at what is intended, and I suppose this is what most readers cheerfully do. But these passages are well-nigh unintelligible as they stand. Surely features do not ordinarily describe anything—and if they do, what thing is that? Another feature—or a property? And where is it instanced? Nor are features enclosed in square brackets, as Radford claims: it is names or other representations of features that are formed that way. The expression "[Nom]" is formed by enclosing the letters "N," "o," and "m" in square brackets and thereby forming a name of the feature [Nom]. And it is the representations of head-features of an item that describe its grammatical properties. Only in Radford's last sentence, with the optional (?) parenthetical "a representation of," does the passage finally begin to make sense.
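In programming terms, the distinction at issue is simply that between an object and a string that names it—a distinction every programming language is forced to keep straight. The following toy Python fragment is only my own analogy (nothing like it appears in the texts under discussion, and the class and variable names are invented), but it makes vivid why being "enclosed in square brackets" can hold only of the name:

```python
# Use vs. mention, in programming terms (purely illustrative; no
# linguistic framework actually models features this way).

class CaseFeature:
    """A toy stand-in for the nominative case-feature itself."""
    def __init__(self, label: str):
        self.label = label

nominative = CaseFeature("Nom")   # the feature: used to classify pronouns
feature_name = "[Nom]"            # a *name* of the feature: mentioned text

# It is the name, a string, that is "enclosed in square brackets":
assert feature_name.startswith("[") and feature_name.endswith("]")

# The feature itself is not a string and has no brackets to enclose it:
assert not isinstance(nominative, str)
```

Collapsing the two in code would produce an immediate type error; the complaint of this section is that the same collapse in linguistic prose passes unnoticed.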

9.8.2  SLEs as Neural items—Deliberate Use/Mention Neural Collapse

One person's confusion can be another's deliberate identification. And certainly use/mention collapse is, as we have mentioned, often convenient: as one works through a day, it is certainly a lot less garrulous to be saying one is computing numbers, or phrases produced by an I-language, rather than always saying more precisely one is computing numerals, or representations of those phrases (cf. fn33 above). This might easily lead one simply to ignore the distinction and explicitly endorse use/mention collapse. Thus, early on in his (1955/75) LSLT, Chomsky adopted a convenient convention of collapsing use and mention, in the way that logicians frequently do, when it doesn't matter. Within a page of setting out the primes, Pn, as phonetic symbols that will be mapped to "physical descriptions of phones," he adds:

We will henceforth apply the term "phones" to symbols of Pn, as well as to utterance tokens represented by them.  (Chomsky, 1955/75: 159)

This is a relaxing of use/mention conventions that was in fact already in force, for example, at 1955/75: 106. I suspect that, historically, what happened is that the policy was adopted not only by him, but by his followers—and was then promptly forgotten as a policy, and simply persisted in his writings as an un-noted convention, as in the passages quoted above. As Collins (2007b: 647, fn31) notes, "There is no representation/represented dichotomy."


To be sure—and perhaps this is the basis for the strongest motivation for neural collapse—it may well turn out that the representations of highly structured SLEs—say, of a phrase with multiple nestings—will themselves be highly structured in a way that is homomorphic to the structured SLE, in the same way as the tree structures or nested bracketings that linguists standardly provide.35 If the representations are homomorphic with the content they represent, and that homomorphism explains the data, then perhaps their content is explanatorily otiose. So why not take [+voice] and "NP" as themselves identical to phenomena in the mind/brain, just as the proponent of use/mention collapse proposes, a view I call "neuralism."

There seem to me, however, a number of difficulties with neuralism. The most obvious one is that [+nasal] is patently not a feature of a neural state. It is supposed to be the result of lowering the velum (soft palate)—check the IPA or any basic textbook in phonology—and is certainly represented that way in the brains of standard speakers. To a first approximation, phonemes are standardly represented as possible sounds produced by certain arrangements of the oral articulatory apparatus: voicing is (intended to be) produced by vibrating your vocal cords; consonants by audible constrictions of the vocal tract, or a flap or a tap of the tongue, etc. Indeed, Chomsky and Halle (1968) provide an explicit theory of features categorized in terms of "place," "manner," and "voicing" of articulation: where else but in the mouth could a speaker aim to realize such a feature? Surely not in the brain! In what he takes to be a refutation of my (Rey, 2003a) intentionalist recommendations, Chomsky (2003: 276) cites with approval Halle's (1983) view of phonological expressions—what Chomsky calls "PHON(E)"—as "instructions for articulatory gestures." But such instructions had better represent and normally cause the relevant parts of the articulatory apparatus to respond. If this is so, then a consonantal cannot be identical to the instruction to tap one's tongue, else one could have actually produced it without any response of the apparatus at all! At least in phonology, use/mention collapse would make nonsense of its claims. In any case, Chomsky's apparent radical re-positioning of phonetic/phonological features from the oral cavity to some part of the brain should not be undertaken lightly. If serious, it passes by surprisingly un-noted, not to mention un-argued, in the discussions.

This "instruction" view calls attention to an even more troublesome problem for neuralism, the implications of the view for linguistic tokens, or the particular SLEs that people at least take themselves to utter, hear, read, and refer to on a page. For if "NP" simply refers to a category of representations in the brain, thereby serving as a way of classifying the representation, then it follows that a token of an NP is just a brain state, just as "Homo sapiens" as a name of a species has as its tokens individual people. So far as I can find, Chomsky does not discuss tokens.36 This is perhaps not surprising. The generalizations of syntax are not intended to be generalizations about tokens, but rather proposals about general features of computations or constraints upon them. However, though tokens may not be needed for the statement of syntactic principles, it would certainly seem on the face of it that representations of tokens of SLEs are nevertheless needed for an explanatorily adequate linguistic theory. For it to be possible for children to acquire language, they presumably have to be able at least to perceptually represent what they take to be tokens of SLEs produced by themselves and others in the ambient environment (which, as we will discuss in §11.1, is no small achievement). And, whatever they are doing, they are not perceiving tokens of brain states!

35  Louise Antony (pc) pressed this possibility on me.

36  See Bromberger and Halle (1992) and Wetzel (2006) for discussion of types and tokens, albeit along Chomskyan, but more realist, lines than mine. Perhaps it is surprising that someone, like myself, who thinks SLEs don't exist at all, should be concerned about tokens. However, as in the case of SLEs generally, the existence of representations of tokens is entirely compatible with the non-existence of the tokens themselves (note that, pace Gareth Evans, 1982, I allow for empty singular terms; see §8.6, fn31).

9.8.3  Intentional Object/Representation Confusions

The passage we quoted in §9.8.1 from Chomsky (2000: 173), regarding PHON(E) and SEM(E), raises a further complication, something of an interaction between these use/mention equivocations and the exaggerations of Externalism. One might, after all, wonder what happens to the use/mention confusion in the case of an empty expression, for example "a rotating cube" when there is no cube. The non-existence of the represented phenomenon might lead one to suppose that there is no longer a possibility of confusing use and mention, since there is no real object whose properties could be confused with those of the respective representation; so conflating use and mention in the ways that Chomsky does is of no consequence.

But this would obviously be an error. Consider Pegasus: he is supposed to have wings, but from the fact that there is no Pegasus, it does not follow that the word "Pegasus" itself has wings; nor even that the "idea of Pegasus" does. Ideas themselves may be many things—good, bad, silly, bold—but they simply do not have wings. Similarly, the expression "a rotating cube" is not cubical. But what, then, in such cases would be being confused with what? Identifying even non-worldly features with representations themselves would seem to be a brother to the use/mention confusion, what might be called the "intentional object/representation" confusion: we confuse properties of the intentional object with properties of its representation; or, more generally, properties specified in the pure intentional content of a representation with properties of the representation itself. This confusion seems especially inviting in psychological cases: thus, from the fact that many people have experiences of, say, rotating images of cubes, psychologists have sometimes supposed that there must be an image actually rotating in the head. But that doesn't follow. All that follows is that there is a representation in the head with the intentional content, for example {a rotating cube}.37 The confusion can be even more enticing when the intentional content concerns linguistic items, that is, representations themselves, since, after all, unlike rotating cubes, some kinds of linguistic items could be in the head—indeed, according to a "language of thought" hypothesis that standardly accompanies a CRT, there are. But, again, even if a person's Language of Thought were identical to their natural language, it is still not tokenings of such sentences that speakers are hearing or trying to produce when they speak.

37  Of course, there could be other, better reasons for thinking there are imagistic representations in the brain, e.g. response time phenomena (as in Kosslyn, 1986), as well as topographic maps in the brain for sensory systems. However, the bad reasons are seldom culled from the better ones, and one needs to be careful about generalizing from what may be special cases (see Pylyshyn, 2006).

9.9  Folieism: SLEs as Perceptual Inexistents

A number of themes of our discussion now converge. In this chapter, we have considered several proposals about the ontology of SLEs and found them problematic: SLEs cannot be identified with the usual acoustic, articulatory, or neural phenomena produced in ordinary speech, and reliance on their being merely abstract objects for classifying neural states fails to account for their role in psychology as serving as the types of which speakers take themselves to hear and produce tokens. The most promising strategy for identifying them with anything even possibly real seems to be one that would couple idealized sounds with highly idealized, totally isolated articulatory gestures.


But this leaves SLEs seldom if ever entokened in the actual world. However, as we observed in §6.4, Chomskyans are not really concerned with any actual SLEs that people think they hear and produce; they simply conveniently pretend they are entokened in the air or on a page. And, as we observed in §8.6, contrary to presumptions of externalist philosophers of the last fifty years—and oddly shared by Chomsky—representations can be "of" or "about" things that do not exist at all. Given this convergence, but perhaps paradoxically to ordinary thought, it would seem that the natural conclusion to draw about the ontology of at least the SLEs speakers claim to hear and produce is that they simply don't exist at all, neither in the air, on the page, nor in the brains of speakers or hearers; or, anyway, there is no need of them in any explanatory project. Chomskyans, of course, seldom say anything that sounds quite so outrageous, but John Collins has brought to my attention the following passage from an interview with Chomsky:

I do not know why I never realized before, but it seems obvious, when you think about it, that the notion of language is a much more abstract notion than the notion of a grammar. The reason is that grammars have to have a real existence, that is, there is something in your brain that corresponds to the grammar. . . . But there is nothing in the real world corresponding to language.  (Chomsky, 1983/2004: 107)

At best, what existence some phonological phenomena might be thought to enjoy is under highly idealized conditions that seldom, if ever, obtain. For the most part, "they" are what, following Brentano, I call "intentional" and/or "perceptual inexistents": "things" that we take ourselves to perceive and produce in space and time, and which we think about and discuss, but which simply are not there—not "in the mind," or in some mysterious "mental" or "psychological reality" or in some Meinongian realm of "being" without "existing." Again, as Cartwright (1960: 639) stressed, "unreality is . . . not another reality."

But how, one might wonder, is communication possible without actual tokens of SLEs? I propose the following view:38

Folieism: Communication is a kind of folie à deux (or à n, for the n speakers/hearers of a "common language"): SLEs generally do not exist. Speakers and hearers have such a stable, shared illusion of their production that the hearer is usually able to determine precisely which ones the speaker intended to utter. Their actual existence, even should it occasionally or ideally occur, makes no difference to the success of communication or linguistic theory.

38  I call this view "folieism," rather than the more common philosophical "fictionalism," since this latter view is standardly associated with a view about theoretical posits, such as (at one time) molecules or numbers, and I want to stress here instead the powerful illusions we are under in normal perception and production of speech. Indeed, I want to insist that, pace casual and (I think) confused talk, linguists are not seriously positing SLEs as explanatory entities in the way that scientists may posit molecules or numbers, even though the illusion is surely as strong for them as it is for normal speakers.

How do speakers and hearers bring this off? They standardly intend to utter certain SLEs, and this causes various sub-systems of their brain to produce various representations of them, which in turn causes contractions in their articulatory system that produce certain wave forms that traverse the air, impinge upon the auditory systems of hearers and, if everything goes well, produce representations there of the very same SLEs the speaker intended to utter. Many of these latter representations seem to be produced in an at least partially encapsulated module, like those of vision, audition, and other sensory modalities, and so give rise to the impression of phonological items being heard. However, this latter is an illusion: examination of the acoustic events reveals no reliable entities with the structures linguists have reasonably argued that phonological items and other SLEs possess. The situation is actually quite like the situation of the Kanizsa and similar figures we discussed in §8.7. Indeed, entire alphabets can be "composed" of such figures, and are regularly used in public print (see Ninio, 1998/2001: 90) (Figure 9.3). Consider, for example:

Figure 9.3  Illusory lettering

(see also the lettering used in the title on the cover of this book). Here it is even more obvious that there isn't really an inscription of the orthographic word; yet communication runs smoothly nonetheless. I see no reason not to regard such Kanizsa phenomena as the model for SLE perception generally. In any event, the serious possibility of such a model is enough to establish that the existence of actual SLEs is as inessential to communication and linguistics as the existence of illusory figures (not to mention colors) is to theories of vision.
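The division of labor described above—a speaker tokening a representation of an SLE, a distorted waveform crossing the air, and a hearer recovering a representation by deciding which candidate best fits the input (cf. the "best-fitting pairing" idea of §9.4)—can be caricatured in a few lines of code. The following toy Python sketch is entirely my own illustration: the "lexicon," the noise model, and the best-fit rule are all invented assumptions. Its only point is that communication can succeed although no token of the SLE itself ever occurs between speaker and hearer:

```python
import random

def articulate(intended_sle: str) -> list:
    """Map a represented SLE to a 'waveform': numbers standing in for
    acoustic energy, distorted by coarticulation-style noise. The SLE
    itself never appears in the channel—only these numbers do."""
    ideal = [float(ord(c)) for c in intended_sle]
    return [v + random.uniform(-5.0, 5.0) for v in ideal]

def perceive(waveform: list, lexicon: list) -> str:
    """Recover a representation by choosing the lexicon entry whose
    idealized signature best fits the noisy input."""
    def misfit(sle: str) -> float:
        ideal = [float(ord(c)) for c in sle]
        if len(ideal) != len(waveform):
            return float("inf")
        return sum(abs(a - b) for a, b in zip(ideal, waveform))
    return min(lexicon, key=misfit)

lexicon = ["dog", "dig", "fog"]
print(perceive(articulate("dog"), lexicon))  # usually -> "dog"
```

The folie à deux here consists in the fact that speaker and hearer end up tokening matching representations; nothing in the signal needs to be, or is, a /d/.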


In a reply to my proposal, Devitt (2006b) disagrees:

It would be miraculous if [a convention between sound and meaning] were established where the only thing that is regularly associated with that meaning is an inexistent sound. We have no name for this intentional inexistent via which we could establish the convention and there is no other way to establish it.  (Devitt, 2006b: 606)

But Devitt's imagination fails him. Surely all that is required for conventional cooperation between people are shared beliefs and experiences, not ones that happen also to be veridical. Think of the immense variety of illusory figures to which we are stably responsive, from Kanizsa figures to colors and rainbows and figures in animated cartoons. It often takes considerable knowledge and attention to sort out just which of them are not real. One of the points of talking about perceptual inexistents is to bring out just how natural it is to "see" and "talk about" such "things" (to someone who doubts "there are" intentional inexistents to which we can "refer": just "point to" a Kanizsa triangle, or "a rotating cube" on a tachistoscope!). At any rate, it is easy to imagine conventions systematically arising from such stabilities and attaching to these apparent objects.

I am not sure what Devitt means by saying "we have no name for [the phonemic] intentional inexistents": after all, we regularly take ourselves to refer to SLEs by putting "them" in "mention" quotes (or, in linguistics, in italics). And it is unclear why he thinks it matters: surely he does not think all conventions need to be verbally explicit! It is enough that the experience of an inexistent can be reliably evoked among perceivers, either by producing acoustic blasts along the above lines, or by learning to associate an evoked image, say, of Santa Claus, with a certain further meaning. These experiences could well turn out to be sufficiently regular to support the generalizations offered by socio- and historical-linguists, as well as many of the insights about reference in natural language that Devitt's work elsewhere provides (cf. §6.5 above). Of course, what would ground these generalizations would be those experiences which, like many of the experiences of color, may be shared by groups of individuals.

On the present account, it would be natural and efficient for those linguists concerned with E-languages to engage in the same representational pretense in which I argued theorists of I-languages regularly engage (see §6.5). In any case, linguists need not care a whit whether it turns out that SLEs are illusory or real. Nor need we.39

Recourse to intentional inexistents and representational pretense in the relatively innocuous ways I have described may provide a possible coherent account of the ontology of linguistics. But, still, linguists might think their explanations would be entirely satisfactory without them, perhaps simply by insisting on the widespread use/mention collapse that they already practice, or Collins' abstract-algebraic strategy of simply characterizing neural states in terms of abstract linguistic types. I have criticized these proposals in various ways, but the question reasonably arises whether an intentionalist theory could fare any better. It is that question we will consider in the remaining chapters.

39  Curiously, Collins (2014: 43–4) also rejects the folieist suggestion that speakers and hearers are under a constant illusion of producing and hearing SLEs, for two reasons: the first is a claim about a "crucial characteristic of illusions," that "we can be debriefed or get to be in a position to see the visual set-up that wrought the effect." The second is a claim about delusions, which "presuppose that the capacities are functioning in some abnormal way." Both claims fly in the face of a venerable tradition of philosophers and scientists who have adduced myriad explanatory reasons for thinking that human beings are quite normally susceptible to what may well be ineluctable illusions regarding, e.g., the reality of secondary properties, causation, free will, personal identity, and the prospects of eternal life, which many of them cannot be brought to appreciate. I do not see why the illusion of SLEs should be any different.


10  Linguo-Semantics

We come now to the most contentious issue of this book, an account of the intentionality that I'll argue linguistics needs. I shall not attempt to deal with the issue of intentionality generally, much less try to provide the kind of "reductive" theory of it that many philosophers have demanded. As I will argue, I think that demand has been oddly inappropriate, flying in the face of what many have accepted as perfectly reasonable scientific practice elsewhere, and in any case, given the still immature character of the relevant psychological theories, it is wildly premature. What I want to do in this chapter and the next is merely to indicate how I think the notion of intentionality is needed in the specific explanatory projects Chomskyans have in mind, arguing that an intentionalist proposal plays a crucial role in those projects in ways that the non-intentionalist proposals I considered in the previous chapters have failed to fulfill.

Some preparatory work, however, is needed. We need first to distinguish (§10.1) the issue of intentionality as it has come to figure in two different domains: the first with respect to a Chomskyan theory of the I-language, what has come to be called a "linguo-semantics," which will be the topic of the present chapter; the second with respect to the content of thought, a "psycho-semantics," which will be the topic of the next. Secondly, we need to confront the challenges that Quine famously raised against the possibility of any semantic theory, linguo- or psycho-. I will argue that only one of these challenges, an explanatory one, is as serious as has been widely thought (§10.2). And, thirdly, we need to consider a further "disjunction" problem for semantics raised by Jerry Fodor. Although (pace Burge, 2010: 322–3) I think the problem is real, I find Fodor's proposed solution to it problematic in a number of ways. But I will distill what I think is a sufficiently modest idea from both it and from a similar (and similarly problematic) proposal of Paul Horwich (1998, 2004). I will argue that this distillation at least provides a serious strategy for replying not only to the disjunction problem, but also to Quine's explanatory challenge, particularly if both ideas are freed from the premature demand for "reduction" with which their own proposals are unnecessarily burdened. Combined with a Chomskyan conception of the appropriate semantics of the I-language (§10.4), I think it also provides a promising basis for a linguo-semantics and, perhaps, a modest form of analyticity (§10.4.1). I will conclude with a caution against what can appear to be a further anti-realism that Chomsky sometimes associates with his conception (§10.4.2).

10.1  Linguo- vs. Psycho-semantics

Questions about intentionality actually arise in two quite different ways: with regard to the meaning or content of expressions in a natural language—a linguo-semantics—or with regard to the meaning or content of thoughts or mental representations—a psycho-semantics (cf. Chomsky, 2000: 165). This latter is often (but not always) thought to be an issue about the semantics of a special internal code of the brain, a "language of thought" (an LOT; see Fodor, 1975).1 As if there were not controversy enough about how natural language expressions have their meaning, there is even more controversy about how psychological states have whatever "content" they have. In any case, although Chomsky and others may speculate that the two are substantially the same (see Hauser et al., 2002; Chomsky, 2007), it will be safest for the nonce to make no presumption either way: they could well be different, and, indeed, are likely to be so to the extent that intelligent infra-humans without a natural language may need a (perhaps quite limited) LOT nonetheless (see Gallistel and King, 2009).

Natural language expressions may, moreover, be subject to a variety of constraints that are different from those of mental representations generally (cf. Chomsky, 1993: 34–5). One way in which the two may differ is that, with regard to natural language, Chomsky (2000: 36–52) claims it is not words that refer, or sentences that are true or false, but rather people's uses of them (a view which we will discuss in §10.4). This is less plausible in the case of an LOT, since its expressions certainly do not seem to be deliberately "used" by people in the way that they use natural language sentences to express the thoughts that are directly expressed by the sentences in an LOT. In what follows I will therefore treat the two domains separately, discussing a linguo-semantics in this chapter, and the psycho-semantics needed for the representations postulated by linguistic theory in Chapter 11.

1  Note that a language of thought needs to be a highly explicit, formal language over which internal mental computations can be defined, and there are no obvious reasons that a natural language would serve as an LOT. But even were it to turn out that one would, I shall assume for the sake of discussions here that it would not.

10.2  Meaning and the Analytic

There are some issues that arise equally for both a linguo- and a psycho-semantics, even though the solution to them will be slightly different. One fairly major issue that was the original focus of Quine's attacks on the notion of meaning was the use of it by the Logical Positivists in particular, who appealed to it to underwrite the "analytic/synthetic" ("a/s") distinction. They hoped this distinction would serve to distinguish the apparently a priori knowledge of the necessary truths of logic, arithmetic, and "philosophical analysis" from the a posteriori knowledge of contingent truths afforded by the empirical sciences.2 As I aim to show, one can actually agree with quite a few of Quine's reservations about the Positivistic project, without acceding to his rejection of synonymy, meaning, and intentionality.

10.2.1  The Analytic Data

It is important to appreciate the prima facie plausible data that can be adduced for the a/s distinction. Consider the following two sets of sentences:

I.
(1) All doctors that specialize on eyes are rich.
(2) All ophthalmologists are rich.
(3) All bachelors are ophthalmologists.
(4) If Holmes killed Sikes, then Watson is dead.
(5) Anyone who is famous is rich.

II.
(6) All doctors that specialize on eyes are doctors.
(7) All ophthalmologists are doctors.
(8) All bachelors are unmarried.
(9) If Holmes killed Sikes, then Sikes is dead.
(10) Anyone who is famous is well-known.

2  Quine (1954/76) offers a superb discussion of the history of some complexities of the issue. In my (2017) encyclopedia entry (Rey, 2003/17), I provide a longer discussion, in addition pursuing more recent proposals, some of which appear in the present chapter.


Most competent English speakers who know the meanings of all the constituent words would find an obvious difference between the two sets: whereas they might wonder about the truth or falsity of those of set I, they would find themselves pretty quickly incapable of doubting those of set II, and would be capable of recognizing indefinite numbers of cases like them. They are (nearly enough) what Kant (1787/1968: A6–7)3 called "analytic" sentences: unlike those of set I, they seem to be known automatically, "just by virtue of knowing what the words mean (and how they are put together)," as many might spontaneously put it. Indeed, a denial of any of them would seem to be in some important way unintelligible, very like a contradiction in terms.4

Now, on the face of it, these data would seem as interesting and worthy of explanation as any of the data for a linguistic theory that we appealed to in Chapter 1. Indeed, violations of apparent analyticities seem akin to the mixed semantic/syntactic phenomena we considered in §1.3, such as negative polarity items and binding constraints. It is just that there do not always seem to be some purely syntactic reflexes also at work, as there appear to be in those cases. The lack of a syntactic reflex invites an important explanatory challenge raised by Quine in his famous (1953/61b) essay "Two Dogmas of Empiricism," regarding how to distinguish purported analyticities from merely commonly held, tenacious beliefs (and their denials from such disbeliefs). If Quine's challenges are successful—and many think they are—then there would seem to be no basis in reality for any special theoretical treatment of the above intuitions, nor for any theory of any sort of meaning at all!

10.2.2  Quinean Challenges

Quine's (1953/61b, 1954/76) challenges were directed at specific conceptions of meaning that had been proposed by the Logical Positivists, but then also (in his 1953/61c, 1960/2013) at the linguistic and psychological notion more generally.

3  These well-known passages of Kant should actually not be taken to constitute his serious, positive account of the "analytic," since his views of "definition" were actually far more complex (and more congenial to many recent views!) than these perfunctory remarks at the beginning of the Critique. See especially his 1787/1968: A727–732 for his skepticism about definitions outside of mathematics. But see Katz (1972) and Pietroski (2018) for a defense and development of the earlier view, particularly as the view that the sense of an analytic sentence is fully contained in the sense of (any) one of its terms.

4  A crucial point to notice is that a sentence such as "John is a bachelor who is married" is not a formal contradiction, of the form "John is a X who is not X." It can only be converted into one by substitution of the supposedly synonymous "not married male" for "bachelor," yielding "John is a not married male who is married."
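The point in fn 4 can be put schematically (the formalization is mine, offered purely for exposition). Where "Bx" abbreviates "x is a bachelor," "Mx" "x is married," "Lx" "x is male," and "j" John, the sentence "John is a bachelor who is married" has the form:

Bj ∧ Mj

which is not of the contradictory form "P ∧ ¬P." Only by substituting the supposedly synonymous "unmarried male" for "bachelor" do we obtain:

(¬Mj ∧ Lj) ∧ Mj

which does explicitly contain "Mj ∧ ¬Mj." Whether that substitution is legitimate is, of course, precisely what Quine will go on to challenge.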


I say "challenges," since there are at least four different strands running through Quine's discussion (the first two of which we have already encountered in §4.4):

(i) Revisability: Quine claims that every sentence believed to be true is revisable in the light of experience;
(ii) Confirmation Holism: The philosopher Pierre Duhem (1906/54) famously claimed that sentences expressing individual beliefs are confirmed only as parts of larger sets of beliefs, a claim that Quine expands to include the claims of logic, mathematics, and supposed analyticities;
(iii) Reductionism: an acceptable theory of meaning and the analytic needs to be reducible to terms of an otherwise empirically acceptable theory, ultimately, physics;5 and
(iv) Explanatory Role: Quine challenges the semanticist to provide a distinctive explanatory role for claims of meaning and analyticity, specifically one that distinguishes them from merely socially shared, tenacious beliefs that are taken to be obviously true.

These strands are surprisingly independent of one another, and so deserve brief, separate discussions to make clear just which of them pose a challenge to semantic theory.

(i) Revisability

As we discussed in §4.4, Quine (1953/61b) famously claimed that all beliefs about the world, including those regarding logic, mathematics, and supposed analytic truths, are revisable in the light of experience. Quine's understanding of the doctrine is actually problematic in a number of ways. Surely he intends something more than what might be called mere "banal fallibilism": that people can make mistakes and be corrected by experience about anything. Not only standard empirical theories, but also many calculations, purported proofs, or claims of synonymy are regularly corrected in the light of experience of some sort (having one's sums corrected; being reminded of the proper use of a word, or of an overlooked possibility).

5  My use of "reductionism" here is broader than the (somewhat unusual) use Quine employs in "Two Dogmas," where it is used for a strong verificationist thesis, that the meaning of a sentence is its distinctive method of verification, ultimately in terms of sensory experience.


Perhaps this would be news to those who still want to insist on some absolutely indubitable, immediately "self-evident" "claims of reason," and, for them, the Neurathian figure of moving around one's boat, every belief being open to revision, may be a salutary working epistemology (see §4.4 above). But it is not clear it should be taken seriously beyond that.

Clearly, what Quine had in mind is rational revision. But then we are owed some account of the standards of reason, how and whether they might intelligibly be revised as well. Most importantly for our purposes, how are such standards to be applied to mere sentences that do not have specific meanings?6 Rational revision of a disposition to assent to a claim is only interesting if the revision preserves its meaning. As Fodor and LePore (1992) nicely put it:

It's no news that you could hold on to "Burning is the liberation of phlogiston" in the face of [Lavoisier's] results if it means that Greycat has whiskers.  (Fodor and LePore, 1992: 47)

Far from undermining a notion of sentence meaning, considerations of rational revisability would seem to presuppose it!7

(ii)  Confirmation Holism

As we also noted in §4.4, the Neurathian metaphor can seem to invite (although it does not entail) the "confirmation holism" that Quine (1953/61b) adopted from Pierre Duhem (1906/54), whereby beliefs are not confirmed individually, but only as an entire body, a claim that Quine extended far beyond Duhem's observation, to include logic, arithmetic, and, most importantly for the present discussion, claims of synonymy and analyticity. Quine was concerned with a strong version of the "Verificationist Theory of Meaning" that had been pressed by the Logical Positivists, whereby the meaning of a sentence consisted in the conditions under which it was confirmed.

6  Given Quine's respected career in logic, it is astonishing that he and his many followers say little about these issues, except for sometimes curiously insisting on the priority of the first-order predicate calculus, and then appealing to informal virtues of hypotheses such as "simplicity" and "generality" (cf. §4.4 above). But see Verhaegh (2018) for evidence of Quine's own serious qualms about "Two Dogmas" in this regard.

7  In fairness to Quine (1960/2013: ch 2), he would, as a behaviorist, likely have insisted in reply to Fodor and LePore that all that is being revised are dispositions to assent and dissent to strings of noises; and such a "merely verbal" revision would simply occasion a more "charitable" manual of translation, which, in any case, does not capture any determinate fact. But for reasons we have discussed (§3.3), one can understand Fodor and LePore ignoring this now quaint view.


Given the Duhem observation, it would seem that there were no specific conditions that could be attached to any particular sentence in isolation from at least a great many others. You should not tie any claim to any specific verification, since who knows what tomorrow's researches will bring (cf. Putnam, 1965/75)?

Does this holism present a challenge to the possibility of semantics? Well, perhaps to very simplistic theories of it, whereby a competent use of a sentence has to be able to specify its verification conditions. But suppose, along Chomskyan lines, that the meanings of speakers' sentences might be as hidden and theoretical as the principles of grammar, and so unavailable to ordinary thought. Even verificationist definitions might still be in the cards (perhaps "x scratches y" is the best definition of "x is harder than y"); it is just that people might not always be required or able to appeal to them in an ordinary, working epistemology.

Fodor and LePore (1992: 52) treat confirmation holism as equivalent to the claim that it is an a posteriori (or empirical) matter of what is in fact causally connected to what in the world, not a matter to be settled by a priori reason, least of all by reflection on the meanings of one's words or concepts. But this is fallacious, for a reason made clear by Davidson (1963). It may well be a posteriori that the event of firing a gun caused a death, but well nigh a priori analytic that the cause of the death caused the death. Causal connections relate events de re; a priori analytic ones, de dicto (see §8.6).8 It could be that a death was caused by and so confirms a killing, even if it is analytic that killing causes dying.

In any case, it is hard to see how confirmation holism really should have anything to do with ruling out the analytic. For surely it is plausible only insofar as confirmation preserves meaning. Fodor and LePore's (1992: 47) above point applies again: observing a thousand flying sparrows had better not be your evidence that pigs can fly simply because you have decided to let "pigs" mean sparrows! But, of course, verificationism is not the only semantic theory one might hold. Appealing to experience as the basis of meaning does offer at least a principled basis for selecting which predicates to count as semantic primitives, viz., sensory-motor ones (empiricism does have some explanatory attractions!). But a Chomskyan would of course be skeptical of such empiricism. On the other hand, determining another principled basis may be quite non-trivial.

8  In Rey (2018) I discuss the surprisingly central role that Quine’s confirmation holism played in a wide range of Fodor’s views.


(iii) Reductionism

Quine (1953/61b) spent a surprising number of pages considering ways of defining analyticity (surprising, since the lack of an "analysis" of "analytic" would seem an odd complaint to lodge against it). He explored plausible explanations in terms of "synonymy," "definition," "intension," "possibility," and "contradiction," plausibly pointing out that each of these notions seemed to stand in precisely as much need of explanation as "analyticity" itself.9 They form what he called a small "closed curve in space" (1953/61b: 30). This worry gave rise to more than a generation of philosophers trying to break out of "the intentional circle" by attempting to provide "naturalistic" definitions of intentional notions in non-intentional terms. As Fodor (1987: 97) slightly facetiously put it, "if intentionality is real, then it really has to be something else."

That a notion may only be defined with other notions of its ilk is in itself, however, no reason to abandon it. Quine's own emphasis on holism ought to have led him to notice that terms in an explanatory theory come, as Shoemaker (2000) nicely phrased it, as "package deals." In at least the case of theoretically interesting terms, if definitions are wanted, one likely needs to wait for the final formulation of a well-confirmed theory and define all its theoretical terms all at once, via, for example, Ramsey sentences.10

It is actually another of the puzzling historical questions in this domain why exiting the intentional circle continued to be thought a serious problem from Quine through much of Fodor.11 With Putnam's (1960) and Fodor's (1968) advocacy of "functionalist" theories of mental states, and particularly with Lewis' formal rendition of them using Ramsey sentences, one would have thought most philosophers of mind would have no longer worried about exiting any theoretical circles, per se, at all!

9  As we noted in fn 4 above, the sentence "John is a bachelor who is married" is not a formal contradiction, of the form "P & not P." It can only be converted into one by substitution of the supposedly synonymous "not married male" for "bachelor." But Quine (1953/61b) argues we have no basis for claiming that two such formally distinct expressions are synonymous other than by appeal to the very notion of analyticity we are trying to analyze: e.g., "possibility" does not look like it will help either, since one would need a criterion to rule out the possibility of married bachelors, and Quine argues that that would also have to involve synonymy and/or analyticity.

10  Very briefly, "Ramsification" consists in defining each term in a theory by setting out its relations to all the other terms introduced by the theory, and to terms antecedently understood. Thus, what it is to be a carburetor might be defined in terms of the role it plays with, e.g., pistons, spark plugs, ignition, etc., and old terms concerning, e.g., motion. In the case of the mind, what it is to be, e.g., a thought, a belief or desire would each be defined by the relations between them and to, e.g., stimuli, behavior, and perhaps other physical features of people's bodies and the surrounding world. Such definitions can be captured formally by the following schema: for each of n number of "new" terms, t1, …, tn, introduced by what one distills as the essential claims of some theory, T(t1, …, tn, o1, …, om), containing an already understood m number of "old" terms o1, …, om, define each term, ti (where "(ιx)" means the x, and "(∃!x)," there exists exactly one x), by:

ti = (ιxi)(∃!x1) … (∃!x(i−1))(∃!x(i+1)) … (∃!xn) T(x1, …, x(i−1), xi, x(i+1), …, xn, o1, …, om)

For lucid discussion, see Lewis (1970b, 1972/80), and for its versatility with regard to mental terms, Rey (1997: ch 7). I will be invoking Ramsification in the treatment of intentionality in §11.2.
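A toy instance may make the schema in fn 10 vivid (the example is mine, purely for exposition, and is not meant to improve on Lewis's formulations). Suppose a mini-theory T introduces just two new terms, "belief" (t1) and "desire" (t2), against the old terms "stimulus" (o1) and "behavior" (o2). The schema then yields:

belief = (ιx1)(∃!x2) T(x1, x2, stimulus, behavior)
desire = (ιx2)(∃!x1) T(x1, x2, stimulus, behavior)

That is, a belief is whatever uniquely plays the first role in T when exactly one thing plays the second, and mutatis mutandis for desire: the two "new" terms are defined all at once by their joint roles, with no need to define either in isolation.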


What is reasonably to be avoided are vicious circles, where the only entrée we have on one term is via another whose only entrée is the first. And, to be sure, the intentional circle, especially in Quine's hands, can superficially appear to be of that sort. But what Ramsey-functionalist definitions of terms allow are various points of entrée of problematic terms by their joint roles (along with "old" terms) in the psychological theory in which they appear, and which is judged by its overall explanatory power.

Functionalist approaches in this way permit us to distinguish two different kinds of characterization one might provide of intentional notions, a "horizontal" characterization at the level of psychological explanations—say, in theories of perception, language, and decision making—that largely traffic in intentional terms, such as perceive, believe, prefer, represent, and meaning; and a "vertical" concern, that aims to "reduce" them to non-psychological, non-intentional terms. Quine really should have been the first to agree that everything would depend on the relative explanatory success of an intentionalist psychology as a whole at a horizontal level, whether or not the individual terms could be vertically "reduced" to non-intentional ones. Surely for a "naturalized epistemologist" good horizontal scientific explanation trumps philosophical qualms! In any case, as will emerge in discussions in this chapter, I see no reason to burden semantics, linguistics, or psychology with the task of "reduction." Those fields have quite enough problems of their own already.12

11  The idea that the lack of a reduction forces one to be purely instrumentalist about intentional ascription dominated much Anglo-American philosophy of mind and language, from Quine (1960) and Dennett (1987) through, most recently with regard to Chomskyan linguistics, Frankie Egan (1992, 2014). Without a demand for reduction, it is hard to see why one should be an instrumentalist about intentionality any more than about any other theoretical property.

12  Tyler Burge (1993, 2010: 296–8) takes a stronger line, claiming that

naturalization projects have more in common with political or religious ideology than with a philosophy that maintains perspective on the difference between what is known and what is speculated. Materialism is not established, or even deeply supported, by science.  (Burge, 1993: 117)

Although I am sympathetic to Burge's insistence on intentional notions earning their legitimacy from their explanatory role in psychology whether or not they have been reduced to non-intentional ones, his dismissal of "naturalistic" efforts seems to me to ignore the fruitfulness of the "materialist" drive for unification in biology, chemistry, and physics in the last century (see Rey, 2002a, for discussion). My claim is simply that, although such unification is an important ultimate aim, it isn't nearly as urgent as philosophers in the wake of Quine's challenges have assumed. But it can sometimes lead to useful insights, as it will shortly be evident that I think it has in the case of Fodor's "asymmetric dependency" proposals (pace Burge, 2010: 307 fn37). Note, by the way, that Fodor (1990: 127–31) himself seemed to give up on the full reductionist project, though one wonders why he continued to press the need to avoid intentional notions in the account he does provide.


Consequently, the challenge that Quine should really be regarded as having raised in his attacks on intentional notions is whether they can earn their explanatory keep even at a horizontal level of psychological explanation. Now, of course, Quine was a behaviorist and so thought there was a better theory at that level than an intentionalistic one. We addressed the inadequacies of this view in §3.3. However, his challenges to the explanatory merits of intentional notions go well beyond behaviorism, and are quite independent of it, as well as of the other challenges we have just discussed, of whether there is a satisfactory reduction of the intentional, and whether all beliefs are revisable and confirmed holistically.

(iv)  Explanatory Role

The most important issue that worried Quine about the a/s distinction was the explanatory work it was marshalled to perform, which, to his mind, was not actually psychological, but more peculiarly philosophical, specifically, to explain the necessity and a prioricity of logic and mathematics. Quine reasonably doubted that the necessity and a priori status of logic and mathematics could be explained along the lines the Positivists hoped. For our purposes here that issue can be set aside, and we may wonder merely how Quine proposed to explain what seems to be at least the fairly robust appearance of the analytic, what I called the "analytic data," consisting of sentences of the sort in set II of §10.2.1.

Of course, taking for granted the behaviorism of his time, Quine likely thought no mentalistic explanation should be taken seriously. But that view does not actually occur explicitly in "Two Dogmas," but rather in his next essay, "The Problem of Meaning in Linguistics" (1953/61c). He thinks the notion, or anyway "the idea idea," is simply theoretically bankrupt.13 In a passage that is perhaps startling to read in the present days of cognitive science, he confidently wrote:

Now there is considerable agreement among modern linguists that the idea of an idea, the idea of the mental counterpart of a linguistic form, is worse than worthless for linguistic science. I think the behaviorists are right in holding that talk of ideas is bad business even for psychology. The evil of the idea idea is that its use, like the appeal in Molière to a virtus dormitiva, engenders an illusion of having explained something. And the illusion is increased by the fact that things wind up in a vague enough state to insure a certain stability, or freedom from further progress.  (Quine, 1953/61c: 48)

13  As did Goodman (1969) in his one very critical essay on Chomsky, "The Emperor's New Ideas" (which is, of course, ironic, given Chomsky's later dismissals of intentionality that we discussed in Chapter 8).



Interestingly, however, in that essay Quine doesn't actually try to explain the analytic data. And he does elsewhere only in a number of passing, fairly perfunctory remarks. The most familiar are his appeals in the earlier essay (1953/61b), "Two Dogmas," to "centrality": sentences are more or less revisable, depending upon how "peripheral" or "central" their position is in the web of one's beliefs. The appearance of sentences being "analytic" is simply due to their being, like the laws of logic and mathematics, comparatively central, and so they are given up, if ever, only under extreme pressure from the peripheral forces of experience. But no sentence is absolutely immune from revision; all sentences are thereby empirical, and none is actually analytic.

Of course, Quine's explanation will not do. In the first place, putting aside our earlier objection from Fodor and LePore (1992: 47) about the need for stability of meaning, centrality and the appearance of analyticity don't seem to be so closely related. "The earth has existed for more than five minutes," "Some people think," "Everyone is mortal," are central—they are certainly not going to be seriously revised—but they do not seem to be in the least analytic; and many standard examples of what appear to be analytic are not remotely central to our thought: "Bachelors are unmarried" and "Aunts are females" are of no central importance, and could easily be revised if someone really cared, and didn't mind "changing their meaning." Far from unrevisability explaining analyticity, it would seem to be analyticity that explains unrevisability: the only reason one balks at denying bachelors are unmarried is that that's just what "bachelor" means!

Some decades later, Quine (1975) speculated that what people take to be analytic may be simply those claims by which they learned a word: presumably, people learned "bachelor" via "unmarried male," and perhaps "vixen" from "female fox." But surely Quine's arm-chair psychologizing places him in no position for any serious speculations about lexical acquisition. In any case, as Kripke (1972/80) and Putnam (1975c) famously pointed out, people likely learned proper names and natural kind terms from accidental information, for example, "Columbus" from "discoverer of America," "gold" from "what's in Mom's wedding band," and "water" from "what comes from the tap and fills the seas"—maybe this is all some children know about them—but none of these, of course, appear to be remotely analytic. Clearly, a more serious explanation is needed. Perhaps the ones most directed at meeting Quine's challenge were Fodor and Katz's (1963) and, particularly, Katz's (1974, 1997) proposals.


The main idea was to appeal to what they regarded as the hidden structure of lexical items, specifically, the "s(election)-restrictions" they posit on lexical items, on analogy with hidden restrictions on phonology and syntax. Unlike the perfunctory Quinean proposals, this suggestion enjoys some of the independent plausibility of a Chomskyan approach to language. The question is whether the analogy to syntax and phonology can really be sustained.

A problem less stressed by Quine but more so by Fodor (1981b, 1998) is with set II of the "analytic" data. A significant problem with all the examples is that, unlike the intuitive responses that serve as evidence for underlying systems of syntax and phonology, the intuitive evidence for an underlying system of semantics seems widely confounded by simply, deeply entrenched beliefs. Consider a sample of s-restrictions proposed by Laurel Brinton (2000: 154) in their textbook on the structure of English:

Trot—requires [+QUADRUPED] subject
{The horse, *the money, *the spider} trotted home
Fly—requires [+WINGED] subject
{The airplane, the bird, *the goat} flew north
Admire—requires [+HUMAN] subject
{Judy, *the goldfish} admires Mozart
Melt—requires [+SOLID] subject
{The candy, *the smoke, *the water} melted

Now, of course, it would certainly be surprising to encounter trotting spiders, flying goats, and goldfish admiring Mozart: but it is hard to see that Brinton provides any evidence that the restrictions are due to language and not merely to our commonplace theories of the world (it is surely not analytic that goats can't fly! Moreover, since ice is a form of water, doesn't water melt when ice does?). Perhaps there are meaning constraints in the vicinity of these examples ("trots" requires legs, "admire" a mental agent), but the challenge is to provide a serious basis for insisting there are.

Katz (1972: 172) provides examples that are perhaps more plausible: of analytic sentences:

(a) John buys from those who sell to him.
(b) John sells books to those who buy them from him.
(c) John remembers things he does not forget.
(d) John wants to have the things he desires.  (Katz, 1972: 172)


and of contradictory ones:

(e) John sells things only to people who don't buy from him.
(f) John does not like whomever he fancies.
(g) John remembers things he always forgets.

(Katz, 1972: 178)

But even here one can wonder about the effects of pressure from background beliefs or presumptions.14

A famous example is "cats are animals," which Putnam (1962a) argued could be given up in light of discovering that the little things are really cleverly disguised robots controlled from Mars—to which Katz (1981: 118ff, 1990: 216ff) replied that that would simply be a case in which it turned out there were no cats at all. Many were left wondering whether Quine wasn't right after all about there being no fact of the matter either way. As Chomsky (1975a) acknowledges:

much of what is often regarded as central to the study of meaning cannot be dissociated from systems of belief in any natural way  (Chomsky, 1975a: 23, quoted in Katz 1981: 116)15

Katz (1981: 118ff) does reasonably point out that dwelling on these supposed counterexamples to analytic claims betrays a disregard of the competence/performance distinction that seems so plausible in the case of syntax and phonology. He ingeniously compares Putnam's robot/cat example to "referential" uses of a term, such as "witch" to apply to ordinary Salem citizens, despite their lacking supernatural powers (1981: 145).16 Implicitly acknowledging that not everyone will be convinced, Katz (1981) does claim, "Even if there is only one case of analytic entailment there is a scientifically acceptable domain of meaning. The size of the domain is not at issue" (1981: 144). But if the domain were in fact only one or a dozen or so examples, then it is not clear that it would be an apt domain for a theoretical generalization.

14  Not to be tiresome, but imagine John transferring ownership of something to someone ignorant of the sale; or forever being infatuated with people who disgust him; or readily remembering things he absent-mindedly forgets. I do not want to claim that such "counterexamples" are definitive; only that they should lead one to be very cautious about "semantic" intuitions, and the difficulty of distinguishing them from casual claims of folk psychology that may be overturned by a more careful one.

15  He sometimes expresses the point even more strongly, capitulating to Quine's challenge:

I believe that one cannot separate semantic representations from knowledge of the world. . . . In the semantic interpretation, everything interacts  (Chomsky, 1977a: 147–8, quoted in Katz 1981: 116, trans by V. Valian, emphasis original)

See also the exchange between him and Harman in Chomsky (1980b).

16  "Referential" uses of terms were discussed at length by Keith Donnellan (1966) with his famous example of a speaker using "the man in the corner drinking champagne" to refer to someone in fact drinking water.


Here, as elsewhere, everything depends not on however few or many intuitions are available, but upon their explanation. As Jerry Fodor (1998) put it:

No doubt, intuitions deserve respect, [but] it is always up for grabs what intuitions are intuitions of. . . . And, as far as I know, there is nothing in philosophy aside from these raw intuitions that seriously suggests that content constituting conceptual connections exist.  (Fodor, 1998: 86–7)

I think, however, that Fodor underestimated the character of the relevant intuitions. Like Quine, nowhere did he try to explain the apparent inconceivability of the denial of analytic claims. Of course, some inconceivabilities may be due to a failure of imagination, as, for example, in the case of people's difficulties imagining four-dimensional curved space-time. But some, like the denial of the examples in set II in §10.2.1, seem due not to failure of theoretical imagination, nor to any beliefs about the world, but to some specific knowledge people have of the meanings of the relevant words. Explanation is called for, but one that addresses the relevant data.

I think we will appreciate what a better explanation might be by considering Fodor's response to a different problem he raises for a theory of meaning, the so-called "disjunction" problem.

10.3  The Disjunction Problem and BasicAsymmetries

In addition to Quine's explanatory challenge, another problem for a theory of meaning was raised by Fodor (1987), which (oddly unnoted by him) was essentially a version of the "Kripkenstein" problem raised by Kripke (1982) that we discussed earlier (§3.4.3). It is worth stating it in a way that is abstracted from the specific contexts in which Fodor and Kripke raised it.17

Fodor (1980/90, 1987, 1990) advocated various versions of what he called an "information" semantics, which, inspired by the work of Dretske (1981), sought to locate meaning in some or other co-variant relation of tokenings of a symbol to actual or possible phenomena in the world (along the lines of classical "information theory").18 In keeping with claims we have already noted in Brentano (see §8.1 above), he reasonably stipulated that it is essential to any meaningful term that it can in principle be applied in the absence of the phenomenon that constituted its meaning.

17  As we saw, Kripke raised his problem in a largely superficialist context; Fodor raised his in the context of an externalist, "informational" semantics. As we will shortly see, there is no need to restrict the present proposal in either way.

18  Such theories can be regarded as a development of what might be called an etiological use of "represent" with regard to portraits and photographs, which cannot be of, e.g., some person unless that person figured in a specific causal way in its production (cf. Chomsky, 2003: 76); this is a specific usage that will not be at issue here.


Thus, "That's a horse" could be caused by a poorly glimpsed cow on a dark night, or by words in a book, or just by recollections of a farm. He calls all such tokenings caused by phenomena other than the referent of the term "wild," whether erroneous or otherwise.19 The question is what determines that "horse" means, say, horse, and the application to a cow on a dark night was wild, or, alternatively, that it means horse or cow on a dark night?

It is important to notice the complete generality of the problem: if a term T can be entokened under conditions C1, C2, …, Cn, then it might be assigned any one of the indefinitely "complex" contents: {C1}, {C1 or C2}, …, {C1 or C2 or … or C(i−1) or …} (leaving at least one Ci false so that there is at least one possibility of error). What determines which meaning/content is the right one, and which entokenings are errors, or, more generally, "wild"?20

In response to this problem, Fodor (1987, 1990) proposed and defended at length an "asymmetric dependency" theory of content, according to which wildly caused tokenings of a term asymmetrically depend on the meaning-constitutive ones, for example, "Horse" means [horse] if horses cause "horse" tokenings, and non-horse caused tokenings depend upon that fact.21

19  Leave aside logically special uses, e.g., a term "F" that applies to something iff it's raining or not; or perhaps some terms applied in the first-person (e.g., a term that applies iff the user is incorrigibly aware of the state to which it applies). Nothing in the present discussion will turn on such cases.

20  Note that this generality of the problem, depending only on the existence of wild tokenings, tells against the odd complaint of Burge (2010: 322–3) that the problem is somehow restricted to outré philosophical cases. The problem clearly arises if a system is capable of error, as both Brentano and Fodor argued any intentional system must be.

21  More specifically, his view might be expressed as follows (where "x" is a demonstrative, and "yT("Hx")" means agent y tokens "Hx"):

"Hx" means {that's a horse} for agent y IF:
(i) there exist some conditions C, such that, in C, it's a law that, ceteris paribus, yT("Hx") iff y is presented with a horse; and
(ii) non-horse caused yT("Hx")s depend upon (i), but not vice versa.

So, if the law in (i) were not true, then, ceteris paribus, non-horses would no longer cause yT("Hx"); but non-horses might no longer cause yT("Hx") while (i) remained true. Note that the main connective is "if," not "if and only if," the condition being merely a sufficient one, not necessarily a necessary one.
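The asymmetry in clause (ii) of fn 21 can be displayed as a pair of counterfactuals (the gloss is mine, purely for exposition; "□→" is the counterfactual conditional, "L" the law in (i), and "W" the occurrence of wild, non-horse-caused tokenings of "Hx"):

(¬L □→ ¬W)  but not  (¬W □→ ¬L)

That is, breaking the horse–"horse" connection would, ceteris paribus, extinguish the wild tokenings; but extinguishing the wild tokenings would not, by itself, break the connection.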


There are other bells and whistles he adds, but for us all that is important is a certain modest idea that lies at the heart of his proposal.

Fodor himself actually presented the proposal as a modest one. It was only supposed to answer Brentano's challenge by providing merely a naturalistically sufficient condition for intentionality: if the condition was sufficient in an important set of cases, that would be enough for intentional explanations in them to be naturalized, even if there were other cases that required other conditions. In fact, he ultimately intended his proposal even more modestly: it was a "sufficient condition" for meeting disjunction problems in a wide range of cases (see Fodor 1990: 127–31).22

I do not want unequivocally to defend Fodor's proposal, certainly not as he stated it.23 Quite a number of trenchant objections have been raised against his formulation, and, moreover, along the lines of my skepticism about reductive programs, it seems to me that he should not and did not need to express the idea in that spirit in order for it to be theoretically fruitful.

Ironically enough, this point is perfectly clear if we consider his straightforward, intuitive motivation for the view. In reply to a worry that anyone should take his view seriously, Fodor wrote:

We have, I suppose, a variety of practices with respect of the linguistic expressions we use. And I suppose it's plausible that these practices aren't all on a level; some of them presuppose others in the sense that some work only because others are in place. . . . Notice [for example] that you have to invoke the practice of naming to specify the practice of paging. So the practice of paging is parasitic on the practice of naming; you couldn't have the former but that you have the latter. But not, I suppose, vice versa? . . . I take it to be plausible that you could, so I take it to be plausible that paging is asymmetrically dependent on naming.  (Fodor, 1990: 96–100)

He generalized the case:

These kinds of considerations show one of the ways that asymmetric dependence gets a foothold in semantic analysis: Some of our linguistic practices presuppose some of our others, and it's plausible that practices of applying terms (names to their bearers, predicates to things in their extensions) are at the bottom of the pile.  (Fodor, 1990: 96–100)

22  Readers will be forgiven for having missed this further modesty, since it is hard to find in the earlier statements of the proposal. In this later passage he surprisingly allows that perhaps a system also has to be conscious. But it is perhaps best to see these hedges as intended simply to prevent losing sight of what he thinks is the important idea of the asymmetric dependency, a focus I will try to maintain in what follows.

23  For standard criticisms, see Loewer (1997) and Adams and Aizawa (2017).



Notice that the claim involves not merely an asymmetric dependence, but a claim about what specific practice is "at the bottom of the pile." Now, one might wonder why Fodor did not leave it at that. Well, he did go on to point out that the above story patently involves intentionality and so would fall afoul of an adequate response to Brentano's or Quine's worries about reduction. So he boldly conjectured that "we can kick away the ladder," do away with the policies and rely on "the patterns of causal dependencies that the pursuit of the policies give rise to" alone (1990: 90–6). We need not tarry on the counterexamples that have been raised against such a proposal. As we remarked above (§10.2.2(ii)) in relation to the Quinean demand for "reduction," this further move is needlessly premature. There is no more need to find a way out of "the intentional circle" than there is to find a way of defining any theoretic term independently of its relations to other terms in a theory.

Fodor (1991) emphatically stresses that he intends his semantic theory to be "atomistic": the meaning (or content) of any one symbol is independent of the meaning of any other one. Indeed, he allows there to be intentional content completely independently of any psychology whatever, since, if it occurred in the case of some phenomena outside a mind, that would simply be a case of a semantic property not being exploited by a mind. But suppose, however, that it occurred in the case of a neural state that did function in a person's mind, but that the asymmetrically dependent content that was determined thereby played no role in that person's cognitive life? We would seem to have burdened a person's mind with a content it would not otherwise possess.24

The obvious way to avoid such cases would be to require any state with intentional content to play a role in a person's cognitive life. But, of course, that would involve including other intentional states in the characterization of intentional content in violation of the demand for immediate reduction. If we relinquish that demand, as I am suggesting we do, then such cases would no longer pose a problem.

This brings us back to the basic epistemic criterion Fodor suggested above, only this time inside the agent: some ways of telling what's what asymmetrically depend on some privileged way (a parallel that oddly enough Fodor himself does not note).

24  Suppose, for example, someone's neural state N that ordinarily entokened a thought about cats could also be regularly caused by aspirin, and could also be caused by tylenol only if it was also caused by aspirin, but not vice versa. There's no reason to suppose that the content [aspirin] remotely entered into the person's cognitive life.


Paul Horwich (1998, 2004) independently expressed a similar idea. He proposed what I will call the "Basicality View":

(BAS) Meaning is the property of the use of a word that "is explanatorily basic: the one that best explains all the other use properties of the term."  (Horwich, 1998: 41)

For brevity, I shall refer to an explanatorily basic property as a "Basic Property."25 Horwich provided a number of examples (1998: 45, 129). The basic property for "and" is a tendency for x to accept "p and q" iff x accepts both "p" and "q"; for "red": a disposition to apply "red" to an observed red surface; for "Vulcan": the holding true of "Vulcan is the planet responsible for such-and-such perturbations in the orbit of Mercury"; for "one": holding true Peano's axioms; for "Aristotle": holding true "This is Aristotle," pointing to Aristotle.26

Notice that, on this view, some Basic Properties (e.g. those regarding “Aristotle”) may be partly external, others (e.g. those regarding number), pace Fodor, mostly internal. Horwich also emphasizes that some of these Basic Properties may be transmitted socially, so that, for example, Aristotle’s parents or friends holding true (the Greek for) “This is Aristotle” while pointing to him provides the Basic condition that determines the meaning of “Aristotle.” Of course, in its explicit efforts to defend a “use” theory of meaning, Horwich’s proposal would appear initially about as far from Fodor’s as one might get.27 But there seems to me a way of combining some of Horwich’s view with Fodor’s that might avoid problems each has on its own.

25  Something like this view is also suggested by Michael Devitt (1996: 92). Indeed, in his (2002) commentary on Horwich, he sees himself as agreeing with this aspect of Horwich's proposal, adding, "a mental word's meaning is the property that plays a role in explaining not only why sentences containing the word are caused but also why they cause behaviors in general" (2002: 112). However, there is not quite the same emphasis on what I regard as the crucial issue of its explanatory basicality and asymmetry (after all, non-basic, non-meaning properties can play plenty of explanatory roles in causing sentences and behavior).

26  I will not be endorsing any of these or other specific examples that have been provided without serious linguistic theory, which, for the reasons Quine pressed, I think are at least much harder to determine than Horwich and many philosophers have seemed to suppose. I am only concerned with the possibility of such meaning claims.

27  In her review of Horwich, Borg (2001: 102) welcomes Horwich's view as an alternative to "talk of asymmetric dependencies." Horwich (1998: 5, 112), himself, explicitly opposes his view to Fodor's, but apparently only for the externalist commitments that I also want to resist.


For all its reliance on the very facts about "use" that are anathema to Fodor, the similarity of Horwich's idea to Fodor's asymmetric dependency ought to be obvious. Explanatory basicality displays very much the same asymmetric dependency relation. On the face of it, what is explanatorily basic to the tokening of a term is the condition on which that tokening asymmetrically depends. If saying or thinking "Bob's a bachelor" is explained by a connection with "unmarried male," then, if that connection were broken, ceteris paribus, one would no longer think, "Bob is a bachelor"; but not vice versa: one's giving up "Bob's a bachelor" would not cause one to give up "Bachelors are unmarried."

Like Fodor, Horwich is concerned to "naturalize" or "reduce" intentionality to the non-intentional, and this leads him to attempt to characterize the "use" of language non-intentionally, relying essentially on Quinean, Davidsonian (quasi-)behavioristic, and "deflationist" strategies that there is no reason for us to take seriously here.28 For, again, the insistence on such a reduction seems to me wildly premature. But it is nonetheless worth distilling the common good idea on which both Fodor and Horwich are relying. Incorporating some of the suggestions I have been making, here is a first pass at that useful common idea:

(BasicAsymmetry) The content of a symbol is to be determined by the property of meaningful tokenings of a term that is explanatorily basic, the one on which in the context of a theoretical psychology all other tokens with that content asymmetrically/explanatorily depend by virtue of that property.

The key idea, again, is the one stressed by Fodor: it is the asymmetric structure of the causal/explanatory relation itself that may be enough to identify the intentional content at least within psychological theories at a horizontal level. (BasicAsymmetry) potentially has both Horwich's and Fodor's proposals as special cases—cases in which the Basic Properties are ones about actual language use, or are the external causes on which wild tokenings depend. (BasicAsymmetry) is simply not limited to such cases.29

28  "Deflationism" is an approach of several contemporary philosophers to try to show that notions of truth, reference, and intentionality are not substantive, but may be understood along the lines of the "disquotational" theory of truth: (roughly) a sentence "p" is true iff p. Although I think there is a quite useful deflationary notion of truth, I am skeptical that it would suffice for all the explanatory purposes of psychology. But there is no need to enter into that issue here.

29  BasicAsymmetry is therefore neutral to the debates that have been ravaging much of philosophy for the last fifty years about whether meaning is "internal" in the way Chomsky (2000) seems to suggest or "external" along lines defended by, e.g., Putnam (1975c) and Burge (1984).


I am not supposing that either Fodor or Horwich would accept this adaptation; nor do I intend it as any sort of analysis of "explanatory basicality." Horwich himself paraphrases what he has in mind in various ways:

A word means what it does . . . in virtue of its basic use; a word's use is responsible for its meaning. Thus, not only does a meaning property supervene on a basic acceptance property, but possession of the former is immediately explained by possession of the latter.  (Horwich, 2005: 32)

I, myself, am also inclined to think of Basic Properties as the ones that are the ultimately "controlling" ones of a term. However, I expect none of these formulations is quite adequate: mere "explanatory" notions risk conflating meaning with etymology. Dubbings of Aristotle might explain and so fix the meaning of "Aristotle," but, while eyes might be the original source of "eye" used for the hole in a needle, they do not determine its meaning ("eye" might cease to apply to animal eyes, yet still apply to the holes in needles). In any case, "explanatory" invites epistemic commitments that seem inappropriate, and "responsible," "supervene," "control," and "asymmetric dependence" perhaps too many metaphysical ones—there are doubtless deviant causal chains to avoid, and ceteris paribus clauses to add. In addition, as Fodor (1990) noted, it is not clear how to spell out counterfactuals in terms of "possible worlds," if that is even the way to go. But I think it would be a mistake to give up on the core idea in the face of such problems.

It would also be a mistake to give up on the core idea in the face of a familiar difficulty facing the meaning of theoretical terms such as "muon"—that they often are definable only in terms of one another as an interlocking theory. Such cases would just require that the Basic condition for a term's content be complex, in principle definable only by Ramsification of the essential claims of an entire theory (see fn 10 above). Of course, such a strategy would approach vacuity if the "theory" of terms involved the entirety of a person's cognitive life, as we saw above that Quine (1953/61b) originally proposed. But even Quine (1991) himself came to limit his "holism" merely to large tracts of beliefs. Even if uses of some terms Basically depend upon the uses of others in a specific theory, there is no reason to think they so depend upon the uses of all the terms in one's cognitive life.

There is no need, however, to require that BasicAsymmetry serve as an account of all intentional content. It is enough that it provides a basis for an illuminating explanatory strategy, calling attention to the important fact that whether people share a concept does not depend on whether they would agree in their surface thoughts or behavior, but upon whether their responses are under the "control" of the same Basic Properties, an issue that may not be obvious on the surface.


10.4  A Promising Application to Language

10.4.1  Meaning without Truth

BasicAsymmetry seems to me to provide an explanatorily attractive way to begin to reply within a Chomskyan linguistics to the Quinean challenge to a natural language semantics, along the following lines. Chomsky (1996, 2000) made an important proposal for how to think about the traditional semantic issues that have concerned philosophers:

We cannot assume that statements (let alone sentences) have truth conditions. At most they have something more complex: "truth indications" in some sense. There is good evidence that words have intrinsic properties of sound, form, and meaning; but also open texture, which allows their meanings to be extended and sharpened in certain ways.  (Chomsky, 1996: 52)

10.4  A Promising Application to Language 10.4.1  Meaning without Truth BasicAsymmetry seems to me to provide an explanatorily attractive way to begin to reply within a Chomskyan linguistics to the Quinean challenge to a natural language semantics, along the following lines. Chomsky (1996, 2000) made an important proposal for how to think about the traditional semantic issues that have concerned philosophers: We cannot assume that statements (let alone sentences) have truth conditions. At most they have something more complex: “truth indications” in some sense. There is good evidence that words have intrinsic properties of sound, form, and meaning; but also open texture, which allows their meanings to be extended and sharpened in certain ways.  (Chomsky, 1996: 52)

As Chomsky points out (2000: 36–52, 188), this proposal is a way of fleshing out Peter Strawson’s (1950) claim that it is not words but people that refer by using words. The items that are true or false are not sentences by themselves, but statements made on specific occasions in specific contexts. These days we can add to Strawson’s concerns, noting the wide assortment of contextual sen­ sitivities of our utterances to which philosophers and linguists have called attention: polysemy, open texture, vagueness, and precisifications—or just the odd, quasi-systematic uses of language, or “language games,” some striking examples of which we will consider shortly. Chomsky argues for this position by calling attention to how a type lexical item provides us with a certain range of perspectives for viewing what we take to be the things in the world, or what we conceive in other ways; these items are like filters or lenses, providing ways of looking at things and thinking about the products of our minds. The terms themselves do not refer, at least if the term refer is used in its natural language sense; but people can use them to refer to things, viewing them from particular points of view.  (Chomsky, 2000: 36)

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

Linguo-Semantics  357 For example, in the case of the name “London”: Referring to London, we can be talking about a location or area, people who sometimes live there, the air above it (but not too high), buildings, institutions, etc., in various com­bin­ ations (as in London is so unhappy, ugly, and polluted that it should be destroyed and rebuilt 100 miles away, still being the same city). Such terms as London are used to talk about the actual world, but there neither are nor are believed to be things-in-the-world with the properties of the intricate modes of reference that a city name encapsulates.  (Chomsky, 2000: 37)

If a theory attempted to assign a stable “referent” to the type word “London” (even within its standard “univocal meaning” where it is used to refer to the English city and not, for example, the Canadian one), that referent would seem to have peculiar, likely incompatible properties, for example, both a geographical location, and a social/political entity that might be moved. Similarly “book” would have to be assigned both an abstract type and con­ crete token (“The novel’s a bit light-weight, but weighs 2 lbs.”), and “water” any of a wide variety of different substances containing widely varying pro­ portions of H2O. In some useful articles enlarging on Chomsky’s suggestion, Paul Pietroski (2005, 2010) distinguishes the “I(nternal)-expressions” of the I-language from the uses of them in external speech: I-expressions do not have satisfaction conditions in the same way that con­ cepts do: the semantic instruction issued by “cow” is satisfied by fetching a concept that has a Tarskian satisfaction condition; and a polysemous expres­ sion like “book” may not determine a Tarskian satisfaction condition. (Pietroski, 2010: 266)

Pietroski proposes that we think of the semantics of I-language in terms of what that computational system makes available to our cognitive system to use in expressing thoughts, either internally or by external performances, and it is only these latter uses to which the properties of truth and reference that concern philosophers are attached. For brevity, and to distinguish this sort of “contextualism” from many other versions, I will call it “I-contextualism,” or a contextualism whose semantic basis is provided by a Chomskyan theory of the I-language.30 30  There is a wide variety of other current “contextualist” proposals about meaning that are not particularly tied to a Chomskyan linguistics, and so will not be of concern here; see. e.g., Searle (1979,  1980), Travis (1985), Sperber and Wilson (1986/95), Bach (1994), Bezuidenhout (2002), Recanati (2004), Borg (2004, 2012) and Harris (forthcoming). Carston (2002 and especially 2019) also

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

358  Representation of Language One major attraction of such a proposal is that it affords an elegant way of dealing with the kinds of contextualism about meaning that have been a theme of much philosophy since Wittgenstein’s (1953/2016) emphasis on how meaning often depends upon the use of an expression in a particular context. Perhaps the most well-known source of examples of this contextualism is in the work of Charles Travis (1985). Since the late 1970s, he has been arguing for what he claims is a gulf between the linguistic “meaning” of a sentence and its truth-conditions by imaginatively calling attention to how the truth conditions of utterances of quite innocuous sentences can vary with context and speakers’ intentions. He provides an impressive range of examples (hence the term “Travis cases”): for example, someone saying “The kettle is black” to their spouse when (a) buying one in a store, and comparing it to the red one; (b) having bought the red one, complaining that “it’s black” after having been left on the stove all night. Or different utterances of the sentence “John weighs 150 lbs,” which might be true after a big meal, or when he is in his normal clothes, but false otherwise. It is pretty clear how one could proceed to produce indefinite numbers of such cases (see Carston, 2019, for a particularly rich range of examples). The point is that, in evaluating the truth or falsity of an utterance, speakers are sensitive to the often entirely implicit intentions by which it is produced in a specific context, which usually involves innumerable shared, implicit beliefs. The effortlessness with which this happens can lead philosophers to read those intentions back into the meaning of the sentence type produced by the I-language considered in isolation, but the richness of the shared beliefs, and their variability, should at least give such philosophers pause even with regard to whether there is some sort of “core” truth-conditional meaning. There has been extended discussion regarding how to treat these cases, but there is no need to enter into that here (see references in fn30). People have appealed to various devices, for example, impliciture, hidden indices, poly­ semy, precisifications, open texture, all in the hope of preserving some vestige of truth conditions for a type sentence. Pietroski argues that the Chomsky proposal has the attraction of greater generality than any of these, all of which express views within a Chomskyan I-language framework, and I would equally have relied on her views as on Pietroski’s had I encountered them earlier. I will not include in “I-contextualism” some further views Pietroski (2010, 2018) has proposed about the specific monadic and conjunctive charac­ ter of I-semantics, or of the conceptual system that puts its deliverances to use, much of which may be congenial to a Chomskyan linguistics, but goes well beyond any of Chomsky’s suggestions. Note that an I-language need not be an informationally encapsulated processing module in the sense of Fodor (1983), in a way that Harris (forthcoming) needlessly burdens such proposals (cf. McCourt, ms.).


it can regard as special instances. If it is correct, then, in conjunction with BasicAsymmetry, it could well supply the Basic conditions that provide the linguistic meaning of expressions in natural language, even if they did not also provide any general truth-conditions.31

Would such an account of meaning save the "analytic"? In the same passage quoted above in which he suggested replacing truth-conditions by "truth-indicators," Chomsky (1996) does say in passing that

among the intrinsic semantic relations that seem well-established on empirical grounds are analytic connections among expressions. . . . The fixed and rich intrinsic structure of expressions, specifically their semantic properties, must be shared among people and languages to a large extent because they are known without evidence . . .  (Chomsky, 1996: 52)

But he does not point to a "reduction" of the notion of the sort sought by Quine, since it obviously presupposes the intentionalistic resources of the conceptual system. But, following our earlier discussion, this is not a worry if the resulting package has explanatory power. And it could certainly afford a possible way of explaining the analytic data. If I-meanings are merely instructions to our conceptual system about material relevant to constructing truth-evaluable statements in various ways in various contexts, then it is a serious option that speakers could be sensitive to such instructions without being slave to them (much as they are to the constraints of their I-languages generally; cf. §3.2). That is, the constraints of a shared I-language are certainly not to be ignored; but nor do they need to be exceptionless. They serve in the way that moral and other prescriptions seem to do, as defaults on usage that can nevertheless be overridden for one contextually important reason or another—the burden would be on the overrider to provide a justification for the override. Killing may be prima facie wrong, but someone sensitive to that fact can nonetheless kill in self-defense. Similarly, "cats are animals" may be analytic,


but discovering that the things we have been so naming have been robots all along could (depending upon the details: how much like real animals are they?) provide a reason for thinking it was false—that is, a sentence could be analytic, since it was somehow "baked into" the semantics of our I-language, but nevertheless false!32 Consequently, one could engage in something like the kind of traditional "analyses of the meanings" of ordinary terms without being bound to claiming that these analyses are true in all contexts of the use of the terms, or even in precise scientific contexts, where very special constraints might be in force.

In any case, there is no reason to suppose that the deliverances of an I-semantics would be consciously available to ordinary speakers, any more than are the rules of syntax or phonology. And so they would be of little use in a working epistemology for someone insisting on a claim being true here and now on their basis. There may be analytic claims to be had, but, as Putnam (1965/75: 36) observed, they would "wash no philosophical windows," at least not in the way that many philosophers have hoped (cf. Chomsky, 1996: 52).

10.4.2  A Cautionary Aside: Resisting More Anti-realism

Relativization of truth and reference to contextual use can seem to invite another form of the anti-realism we discussed in §9.1.2, relativizing meaning and reference completely to the varying interests and perspectives of human "conversations" that Rorty (1979) seemed to be recommending. The form we discussed in that section was based on the familiar fact that many categories that humans discuss are fixed relative to their interests and particular perspectives. The form invited by the above conception is based on the polysemous character of ordinary meaning and reference, and how these attach not to word types, but to uses of them that are also sensitive to the interests and perspectives of the user. It's important to see that the latter doesn't entail the former—and neither, separately or together, entails anti-realism.

Chomsky's "polysemy" claim (see §10.4.1) is often stated in the same passages as the conceptual one. Indeed, the passages citing Goodman and Hobbes that we quoted in §9.1.2 began by discussing "names." But in those passages the concern is not with usages vs. type names, but with how our interests

32  I develop this suggestion in a bit more detail in Rey (2016) and (2020b). I should mention that, not unreasonably, Pietroski (pc) does not want to commit his proposals to it.


suffuse many of our concepts. The present denial of reference to type words independent of usage also leads Chomsky to reject what he takes to be

a simple representationalist doctrine that is illustrated in the titles of such basic texts of contemporary philosophy and psychology as W. V. Quine's Word and Object and Roger Brown's Words and Things. The doctrine holds that the terms of language and thought refer directly to extra-mental things, entities that a physicist could in principle identify without exploring the mind—that is, without exploring Hume's imagination or Aristotle's form, the latter reformulated in cognitive rather than metaphysical terms. The doctrine is simply false for human language and thought.  (Chomsky, 2018: 38–40)

Actually, here Chomsky is burdening at least Quine with a (quite crude) version of a view that Quine (1960, 1969a) himself explicitly rejects (see especially his (1969a) defense of his thesis of "the Inscrutability of Reference")! Something like what Chomsky might have in mind did emerge later in response to the work of, for example, Kripke (1972/80), Devitt (1984/91/97), Fodor (1987, 1990), and Recanati (2012), according to which many speakers' uses of proper names refer to their referents "directly," unmediated by the common descriptions they associate with them33—but even here these writers typically allow some mental machinery (see, e.g., Devitt and Sterelny's (1987/99: 80–1, 96–101) "hybrid" view). But, in any case, note that we all—physicists and folk alike—could manage to identify an object that is a table without considering anyone's mental life: for example, "The hunk of wood/carbon molecules in the corner has a mass of 2 kilos." Indeed, it is open to someone who embraces the I-contextualist suggestion of meaning without truth to allow that certain, for example, scientific uses of words—certain specific sorts of "conversations" or "language games"—are seeking a description of the world, as best we can, independently of peculiarly human concepts, interests and other uses of language, precisely of the sort

33  I leave aside the issues of "empty" terms we discussed in §8.7, which are widely recognized to be a challenge to "Direct Reference" views. In general, Chomsky is perfectly right, as was Wittgenstein (1953/2016) before him, in thinking that philosophers have often been overly prone to assume there is a referent for more (usages of) nouns than seems reasonable. Carsten Hansen and I (Hansen and Rey, 2016) brought precisely the kinds of problematic "objects" that Chomsky cites to bear against what we argued were François Recanati's (2012) far too uncritical claims about singular thought, which did not include thoughts about, e.g., Kanizsa figures, rainbows, clubs, "the sky," ceremonies, contracts, symphonies, properties, or numbers.


that we observed in §9.1.2 Chomsky (2000: 22, 134), unlike Jackendoff (2006), believes is possible.

In the end, it is actually odd that Chomsky takes a stand on any of these claims, about either reference or metaphysics. After all, as we have seen above, since he does not think the I-semantics involves issues of truth or reference, his linguistic theory should not settle one way or the other what people are referring to in their use of words, much less whether the things they are referring to in various of their uses do or do not exist independently of them. These issues would be a matter for a sociolinguistics (for the usage) and physics or the like (for the things themselves), and he does not claim to be undertaking either of those tasks.34

34  A number of Chomskyans have claimed that the passages I have quoted are only intended to be illustrative of the point of the previous section (§10.4.1), that it is not types, but only usages of words that refer; that if a referent was sought for a type word apart from usage, it would have incompatible properties, e.g., in the case of “book” being a type and a token. But this latter—now standard—point about “polysemy” does not by itself entail any “mind dependence” of the entities picked out by contextually variable usages. Someone might well use “London” on a particular occasion to refer to a perfectly mind-independent geographical location, e.g., “the Germans bombed London,” even if that specific location might be picked out by a political concept related to human interests.


11 Psycho-Semantics of Perceptual Content

I now turn to a quite different use of BasicAsymmetries to meet the challenges raised by Quine and Fodor for internal mental states, specifically, for the kinds of states that seem to figure in Chomskyan linguistic explanations. I will argue that linguistics needs intentionality not only in its account of a linguo-semantics, but also in order to satisfy its aim of explanatory adequacy for a theory of linguistic competence generally (cf. §4.1). As I indicated in Chapter 10, I will not be attempting to deal with all of the issues of intentionality that have concerned people, much less try to provide the kind of "reductive" theory of intentionality that many have been (I think, prematurely) seeking. I shall only be trying to indicate what I think is one essential explanatory role of intentionality in linguistic theory, arguing that an intentionalist proposal is preferable to the non-intentionalist proposals that we considered in §§8.4–8.5.

I will begin by calling attention to a quite general problem faced by psychology that is particularly vivid in the case of linguistics, that of accounting for an organism's sensitivities to non-local, non-physical, and/or non-instantiated—what I call "abstruse"—properties, such as, for example, being a dinosaur, being a grandmother, being a cube, or being a noun or a sentence (§11.1).1 I argue (§11.2) that only an intentionalistically understood CRT, what I will call an "II-CRT,"2 though not mandated by abstruse sensitivities, offers the best prospect of explaining them, which it does by a mutually supporting combination of the BasicAsymmetry proposal (§11.2.1) and of the kind of probability theories routinely presupposed by theories of visual and linguistic perception (§11.2.2). I will then briefly discuss various sources of resistance to incorporating intentional explanation into standard science and how they motivate the methodological dualism (hereafter, "Meth-dualism") that Chomsky deplores, despite his sharing many of those motivations and the dualism himself (§11.3.1). I will conclude the book with some comments on the "mind/body"

1  This section can be regarded as a reply to Chomsky's (2003: 278) charge that "it would be pointless, or worse, to pursue the terminological course [of appealing to intentional content] that Rey recommends."
2  I include the two "I"s to capture the fact that CRT is to be understood as both intensional (a computational procedure) and intentional (involving the properties of intentionality I listed in §8.1).


problems that provide some of the background to those motivations. I will argue that, contrary to many of Chomsky's writings of the last three decades, these problems are perfectly expressible outside of Cartesian mechanics, and are as alive and troubling as ever. Indeed, at least some of them are problems to which Chomsky's core linguistics—especially understood intentionalistically—offers welcome insights (§11.3.2).

11.1  Sensitivities to Abstruse Phenomena

People are sensitive to an astounding range of phenomena, real and unreal: shapes, colors, kinds of animals, objects and artifacts, works of art and music, social and legal entities, even non-existent phenomena, such as unicorns and triangles. By "sensitive to" I will tentatively mean that we seem to make discriminations expressed in terms of these phenomena, responding differently in their (maybe counterfactual) presence in ways that seem explanatorily illuminating, for example, someone might be startled because they thought they just saw a unicorn/a dinosaur/a triangle, or heard a forbidden word.3 Most importantly for present purposes, many of these phenomena are what I will call "abstruse" phenomena: they are "distal" or non-local, as in the case of dinosaurs, distant galaxies, or the fall of Rome; or they are non-physical in that they are not described in the standard terms of physical theories, as in the case of symphonies, nations, and SLEs;4 and some of them may be non-instantiated—indeed, may be impossible—as in the case of ghosts or Euclidean figures.

These sensitivities to the abstruse raise a crucial explanatory issue: how could a physical system like our brain possibly possess such sensitivities? After all, physical systems are systems whose interactions, at least at the macro-level of brains, are importantly local and ultimately characterizable physically.5 It is somehow by virtue of the "proximal," physical properties of, for example, electromagnetic radiation and the properties of molecules that

3  I actually think that BasicAsymmetries capture what it is to be a specific sensitivity, but I don't want to press this until §11.2.1.
4  By "non-physical" here I do not mean that the phenomena described by these terms—the symphonies and nations themselves—are not in some way constituted by physical phenomena; only that the terms "symphony" and "nation" are patently not terms in theoretical physics (in terms of Quine's 1953/61d, the "ontology" of symphonies and other artifacts is a subset of the ontology of physics, but its "ideology" may be irreducibly different; cf. §9.1.2 above).
5  Along lines I will discuss in §11.3.2, I will be assuming simply present macro-physics, setting aside dualistic views that we are not physical systems. I abstract as well from issues arising at the quantum level and evidently exploited in quantum computation about whether all (or even any) physical interactions are genuinely "local," since I see no reason to think the issues are sufficiently settled or relevant to the explanatory problem.


impinge locally upon our surfaces, causing electrochemical processes transmitted locally from cell to cell, neuron to neuron in our brains and nervous systems, that we seem to have the mental life that we have and behave as we do. But, if that is so, how can we be sensitive to phenomena that are not local, physical, or instantiated? Indeed, it would seem that the further the phenomena diverge from actual, local, physical ones, the more challenging the explanation will be. What is needed is some kind of systematic connection between the physical and the abstruse.6

Some abstruse sensitivities could, of course, arise from happy correlations between abstruse and proximal phenomena, so that, at least across a wide variety of conditions, a system could (seem to) exhibit sensitivity to a specific abstruse property simply by being sensitive to the correlated proximal one. Thus, being sensitive to certain scents can lead dogs to (seem to be) sensitive to the presence of cadavers, even though, without the scents, the dogs may not know a cadaver from someone asleep. Evidence suggests that an ant navigates merely by computing a vector algebra defined over, for example, the steps it has taken and the polarization of incoming light, which changes as a result of changes in its direction of travel. The system computationally transforms these correlates of distal distance and direction, and generates the vector sum of the distance–direction vectors that describe its outward walk. The ant then follows the inverse of that vector back to its nest—all without there being any explanatory need for it to represent the distal distances themselves.7 But is it plausible that all cases of abstruse sensitivities are due to such fortuitous correlations?

One way to appreciate the issue is to consider a transducer, or any device that transforms a physical magnitude of one form systematically into a physical magnitude of another by virtue of physical laws. Microphones, for example, are transducers that transform sound waves into electromagnetic impulses that other transducers, for example, amplifier-speakers, can transform back into sound.

6  This is a development of a problem I raised for empiricism in Rey (1997: §4.3). It was inspired by Fodor and Pylyshyn's (1981) criticism of Gibson, which in turn echoed Chomsky's (1959, 1965: 58) complaints about Skinner. Fodor (1986) went on to claim that the challenge is presented by sensitivities to "non-projectible properties," such as grue (see §5.4.2 above), arguing that only a creature with a mind could detect them. But that is not quite right: simply connect a timer to some detectors of certain frequencies of light and you have arguably got yourself a mind-less grue-detector. I submit that he meant to be getting at abstruse properties.
7  See Burge (2010: 501–2) and Knoll and Rey (2017). Gallistel (pc) thinks this is not actually true of the ants, but all that is important here is the serious possibility, which does not turn on just which creatures do what. Other examples are the magnetosomes by which some creatures are able (in certain conditions) to detect magnetic north (see Dretske, 1988). A more perfect example of such "surrogacy" is afforded by truth-tables, which would allow a system to be sensitive to truth-functional validity without representing it. It is such fortuitous local surrogates that give the lie to Fodor's (1975) hyperbolic proclamation, "No computation without representation," though I will argue that this does become plausible as one considers sensitivities to abstruse phenomena generally.
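To fix ideas, here is a minimal sketch (my illustration, not anything drawn from the ant literature) of such path integration: the system only ever handles local, proximal quantities, namely step-derived displacement vectors, yet its output happens to co-vary with the distal distance and direction home.

```python
import numpy as np

# Minimal sketch of path integration: each leg of the outward walk is a
# displacement vector derived from proximal cues (step count, polarized
# light). The homing vector is simply the negated vector sum; no distal
# distances or landmarks are represented anywhere in the computation.

def homing_vector(outward_legs):
    """Return the vector from the current position back to the nest."""
    return -np.sum(np.array(outward_legs), axis=0)

# Example: three legs of an outward walk, in arbitrary units
print(homing_vector([(2.0, 0.0), (0.0, 3.0), (-1.0, 1.0)]))  # [-1. -4.]
```

The point of the sketch is just that the "success" of the computation consists in a fortuitous correlation between the locally available vectors and the distal facts.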


Animal sense organs consist, at least in part, of just such transducers, transforming, for example, sound and light waves into electric impulses along visual and auditory nerves. Now, so long as a theory confined itself to transducible properties, it could afford an illuminating explanation of the sensitivity of an animal to them. For example, an animal is sensitive to ambient heat because it has thermo-receptors, a set of transducers that convert local molecular kinetic energy to electrical impulses along A-delta and C-fibers to the brain. But abstruse properties, themselves, not being physical or local, are patently not transducible. Even if some input/output law, say, the behaviorist's Law of Effect (see §3.3.1 above), did provide evidence of a system's sensitivity to grandmothers, it would, by itself, explain rather little: we should reasonably want to know how that sensitivity could obtain (do grandmothers all have a special look or scent?). The law would not be seriously explanatory of such a sensitivity if there were no account of how possibly a physical system could exhibit it.

This is the explanatory challenge raised by people's sensitivity to SLEs. It is a challenge, I shall argue, that is plausibly met only by an intentionalist theory. Consider, for example, David Adger's (2019) recent proposal that all of the relations between grammar, phonology, and a conceptual system be regarded as so many transductions:8

Syntactic "representations" are not representations with intentionality (construed as an aboutness relation to the external world, cf. Chomsky (1995)). They are simply structures/configurations, aspects of which are "transduced" to/from other cognitive systems (cf. Pylyshyn, 1984).9  (Adger, 2019: 7)

After all, he claims:

a syntactic representation, say, a tree, labelled bracketing, set-theoretic structure, set of lines in a derivation etc. is just a configuration, a structure, which is a shape consisting of basic units bearing particular relations to each other, like a gesture of the hand. Such a structure is legitimated (that is, has the shape it has rather than some other kind of shape) by a generative procedure (the generative procedure determines what structures are available).  (Adger, 2019: 8)

8  I am grateful to David for his preparing a short paper from the slides he presented at a conference in which we participated in Trondheim, 2015.
9  Note the presumption that is attributed to Chomsky, that aboutness involves a relation to "the external world," the de re presumption that I argued in §§8.6–8.7 it was important to resist.


It is transduction to and from these structures that he claims allows them to be integrated into the rest of a speaker's psychology:

The acoustic signal involves energy transfer of particular physical magnitudes to some brain activity, providing differential categories for the brain to work with. This is the transducer function. These differentiated categories are then accessible to computation, in this case a computation that manipulates phonetic, phonological, and morphophonological symbols. The computation is sensitive to properties of the symbols, and those properties are, ultimately, grounded in acoustic properties via transduction. . . . For articulation, as opposed to perception, we can think of the phonetic/phonological computation as terminating in instructions to the articulators.  (Adger, 2019: 12–14, emphasis mine)

Indeed, transduction will not only serve to mediate between acoustic input and motor output, but also between the language system and the conceptual system:

This same view can be applied to the other side of the computation, the mapping to the systems of the mind that are concerned with planning, thoughts, concepts, etc. Virtually nothing is known about this, but I suggest that the phonetic side of the equation provides a good model. The 'semantic' computation maps to what we might rather clumsily call 'thoughticulators'.  (Adger, 2019: 14)

In sum:

Syntactic structures from this point of view connect to syntax-external aspects of the world through further computation that transforms them into structures that can be mapped to acoustic and motor systems on the one hand, and to conceptual and planning systems internal to the brain on the other. Intentionality doesn't figure into the explanation of how syntactic structures are causally efficacious in comprehension or language production.  (Adger, 2019: 14–15)

Were an account of mental integration so easy! The question to be raised throughout this proposal is whether “transduction” makes any sense with respect to the relation between (representations of) syntactic structures, and between them and (representations of) cognitive or


phonological ones, making the computation "sensitive" to the "differentiated categories" and relevant linguistic "properties of the symbols," which are able to produce "instructions to the articulators" and conceptual systems. We implicitly noted this problem in considering Fodor's doorknob/doorknob problem, which he raised for triggering models of parameter settings (§5.2, esp. fn8). Why should the parameter of pro-drop not be triggered by ambient pro-drop speech in the way that the pattern of marks on a piece of paper corresponds to the pattern of marks on the plate from which they were printed? The problem is that being pro-drop or lacking a subject are not local, physically specifiable properties in the way that patterns of marks are.

To be sure, a speaker's knowledge of language is realized in some local physical properties in the brain. But, recalling the distinction from Quine (1953/61d) that we noted in §9.1.2, although linguistics and neurophysiology may share an ontology—all mental states are, indeed, realized by one or another brain state—they need not and likely do not share their ideologies. In particular, just because either a syntactic representation, or even a syntactic structure itself, may be nothing ontologically more than a structure in a brain generated by a procedure, it does not follow that the predicates of linguistics that describe that structure and procedure are physical predicates—or, to put the point in terms of properties—that the relevant linguistic properties are physical properties. After all, from a physical point of view, linguistic properties such as being pro-drop are as abstruse as any, so it is a substantial challenge to explain how children can be sensitive to them.

Adger appeals to Zenon Pylyshyn's (1984) classic account of transduction on behalf of his proposal. Far from supporting Adger's claims, however, it explicitly undermines them. In setting out the conditions for transduction, Pylyshyn amplifies precisely what we have already noted. Considering transducers that convert physical input into symbols in a computational system, he writes:

The behavior of a transducer is to be described as a function from physical events onto symbols:
(a) The domain of the function must be couched in the language of physics;
(b) The range of the function must be computationally available atomic symbols; and


(c) A transducer can occur at any energy level locus where, in order to capture the regularities of the system, it is necessary to change descriptions from physical to symbolic. The only requirement is that the transformation from the input, described physically, to the ensuing token computational event, also described physically, follow from physical principles.  (Pylyshyn, 1984: 154, emphasis mine)

Thus, the input to a transducer might be physical stimulation of a retina or a tympanum and the output physically characterized electrical signals that can serve as tokens of symbol types that are the medium of a system's computations.10

Now, again, every symbol token is some or other physical token, for example, an electrical one. But the property of being a symbol/sign is not a physical property: a symbol, qua symbol, is individuated by its role in a computational system and can be realized in any number of physical media, electrical or mechanical. "Is electrical" is part of the ideology of physics; "is a symbol"—not to mention "is an NP"—is part of the ideology of a computational linguistics or psychology. So, again, we need some account that will mediate the relation between the two, and Pylyshyn wryly notes "the relative scarcity of proposals for transducers that meet the physical condition, compared with the range of purely symbolic systems, for example, those of phonology, syntax, and much of both behavioral and cognitive psychology". The crucial point is that:

if we do not subscribe to this [physical] criterion, we stand little chance of systematically relating behavior to properties of the physical world.  (Pylyshyn, 1984: 168)
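By way of illustration only (the names and numbers are mine, not Pylyshyn's), conditions (a) and (b) can be pictured as a function whose domain is physically described and whose range consists of atomic symbols; the point of the surrounding argument is that no such function can be written whose domain is couched in terms like being pro-drop or being an NP, since those are not physical descriptions.

```python
# Toy transducer per Pylyshyn's conditions (a) and (b): the domain is a
# physically described magnitude (a voltage in millivolts); the range is a
# set of computationally available atomic symbols. The symbols need not
# (yet) symbolize anything; they are just tokens the computation can read.

def transduce(voltage_mv: float) -> str:
    """Map a physically described event onto an atomic symbol."""
    return "1" if voltage_mv >= 50.0 else "0"

print([transduce(v) for v in [12.0, 75.3, 80.1, 5.2]])  # ['0', '1', '1', '0']
```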

The challenge for psychology is to explain what seems to be a sensitivity, that is, a systematic co-variation of internal states to various abstruse phenomena, for example, linguistic structures as a consequence of auditory stimulation. This requires there being some kind of systematic law-like relation between those linguistic structures and such physical stimulation, and, as Pylyshyn has pointed out, transduction cannot alone supply it, since it traffics only in physical properties, and linguistic structures are not, as such, purely physical properties.11

10  At this point in the discussion, consonant with Adger's avoidance of intentionality, we need not assume that the symbols symbolize anything. They could be just the "0"s and "1"s that the computational system might read and write (and Adger does acknowledge one needs a computational system).
11  Of course, for a physicalist, every causally explanatory ("macro-") property must be in some way "realized" in some or other physical properties. But the potentially indefinite diversity of realizers of a macro-property does not provide the basis for a law, since, as Pylyshyn rightly notes with respect to trying to capture "reinforcers" by their physical realizers, "it would be such a garrulous disjunction that it would fail to capture the relevant generalizations" (1984: 166; see Fodor, 1975: ch 1, who argues that this is in general why "special sciences," such as psychology or linguistics, are necessary in addition to basic physics). The point is especially vivid if one considers the indefinite number of possible counterfactual realizations of a macro-property.


There simply are no transduction laws relating syntactic structures, qua syntactic structures, to physical phenomena, or linguistic properties to neural ones.12 Although Pylyshyn thinks transducers are necessary for perception, he does not for a moment think they are sufficient, or somehow obviate intentional representations of, for example, SLEs or idealized objects in a 3D space.

The same error of ignoring the problem of abstruseness arises for a non-Chomskyan such as Michael Devitt, who, we saw in §7.2.3, claims that linguistic perception can be explained merely by SLEs "having" various linguistic properties rather than representing them. But how could the mere having of a linguistic property explain why someone perceives it as such? Most things—even most things in one's brain—have plenty of properties that are not perceived as such; moreover, linguistic properties are abstruse, so an explanation is needed for how particularly they could give rise to linguistic perceptions of them.

Someone—perhaps John Collins along the lines of §8.5—might claim that accounting for such sensitivities is a "performance" issue not in the purview of a theory of the I-language. Perhaps. But, as I stressed in §8.5, an explanatorily adequate theory of the I-language has to incorporate representations that can satisfy the interface conditions imposed by, inter alia, the sensorimotor system, by sharing a common coin. What could a machine be like to be sensitive to such phenomena as words, phrases, phonological, syntactic, and semantic features—that is, SLEs—that are not local, physical, or likely even instantiated (cf. §9.2) properties of the stimuli?

As we noted in §5.2, Fodor (1998: ch 6) pointed to a specific way that the problem arises for Chomskyans, but also to what seems a promising solution. He was concerned there in general with the relation between perceptual stimuli and the concepts that are occasioned by them, and worried that his own nativist views rule out the relation being one of hypothesis confirmation, adding in a "linguistics footnote" (1998: 128, fn8) how the problem seemed to arise for a Chomskyan linguist as well. The linguist is actually presented with two problems: (i) to explain how a child can even be sensitive to phenomena as abstruse as SLEs—mere "triggering" is no explanation unless there is some account about how this triggering could be effected in an essentially local, physical system—and (ii) why there seems

12  Of course, there is some kind of "information" transfer between syntactic states. But the point is that not all information transfer is transduction, since much of it involves relating abstruse properties.


to be a non-arbitrary relation between the grammar the child acquires and the grammar of the ambient community. Fodor (1998: 128, fn8) then went on to note that "it may be that . . . the polemical resources of the hypothesis-testing model have been less than fully appreciated." So let us return to hypothesis-testing models that may have been prematurely abandoned with the advent of P&P models (§2.2.7). Intentionality may serve where transduction gives out.

11.2  Probabilities, Disjunctions, and BasicAsymmetries

11.2.1  Sensitivities as BasicAsymmetries

The hypothesis testing model proposed in Chomsky (1965: 30) was recruited a few years later by Chomsky and Halle (1968) for the specific task of perceiving SLEs:

We might suppose, on the basis of what has been suggested so far, that a correct description of perceptual processes would be something like this. The hearer makes use of certain cues and certain expectations to determine the syntactic structure and semantic content of an utterance. Given a hypothesis as to its syntactic structure . . . he uses the phonological principles that he controls to determine a phonetic shape. The hypothesis will then be accepted if it is not too radically at variance with the acoustic material. . . . Given acceptance of such a hypothesis, what the hearer "hears" is what is internally generated by the rules. That is, he will "hear" the phonetic shape determined by the postulated syntactic structure and the internalized rules.  (Chomsky and Halle, 1968: 24–5)13

A natural way to understand this proposal, of course, is that the hearer is sensitive to the abstruse properties of SLEs essentially by computing the probability of hypotheses about them and/or the degree of fit between postulated and encountered patterns. As they later claim in the same volume (expanding a passage we quoted earlier in §8.2):

Speech perception is an active process, a process in which the physical stimulus that strikes the hearer's ear is used to form hypotheses about the deep structure of the sentence. Given the deep structure and the rules of the

13  Or, of course, not, or only partly in the case of difficult or ungrammatical speech. There are presumably thresholds. I'll take this as read throughout.

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

grammar, all other representations of the sentence can be derived, including in particular the phonetic transcription. . . . These derived representations are used by the speaker to check his hypothesis against the external stimulus, which provides the data that stand in the most direct . . . relationship with the phonetic transcription. . . . Even crude agreement between the external stimulus and the internally generated hypothesis suffices to confirm the latter.  (Chomsky and Halle, 1968: 294, emphasis mine)

More recently, Lidz and Gagliardi (2015) advance a similar proposal:

Importantly, UG must also be embedded in an architecture that allows learners to extract information from the input. This information is used to identify which of the resources defined by UG is applicable to any given sentence or sentence type. . . . The input feeds into a perceptual encoding mechanism, which builds an intake representation. This perceptual intake is informed by the child's current grammar, along with the linguistic and extralinguistic information-processing mechanisms through which a representation from that grammar can be constructed (Omaki and Lidz, 2014). To the extent that learners are sensitive to statistical–distributional features of the input . . . that sensitivity will be reflected in the perceptual intake representations.  (Lidz and Gagliardi, 2015: §2.2)

Now, of course, “hypotheses” are standardly intentionalistic sorts of things: they are “about” the phenomena whose probability is being estimated: the “probability,” or degree to which they are “not too radically at variance with the acoustic material,” being a measure of whether the hypothesis being confirmed is true. And, as we noted in §10.3, such talk of truth and error is subject to the “disjunction problem”: given that some acoustic material will not fit a hypothesis, H, perfectly, but only probabilistically, there is the possibility that it will be misjudged, a high threshold probability being assigned to H under conditions C when H is not likely true in C. But then what determines that H is not in fact true in C? Whether arbitrary stimuli S1, S2, S3, . . . , satisfy a hypothesis H depends upon the content of H: if the content is {S1}, then all the other Si (i > 1) are errors; if the content is {S1 or S2}, then neither S1 nor S2 are errors, only the other Si (i > 2) are; and so forth. Probabilities inherit the problem, since assigning probabilities to a hypothesis depends upon what the hypothesis is, specifically upon its truth conditions. The probability of something being F will usually be different from the probability of its being F or G, or F or G or H, or . . . etc.: the probability that a sound is a /k/ is not in general the same as the probability that it is a /k/, a hiccup or a cough.
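A toy calculation (with numbers stipulated purely for illustration) makes the point vivid: what a probability estimate is an estimate of depends on the content of the hypothesis, so widening the content to a disjunction changes the very value being computed.

```python
# Stipulated joint probabilities of some acoustic evidence e arising from
# each of three mutually exclusive causes. The probability that e was a /k/
# is not the probability that it was a /k/, a hiccup, or a cough: which
# quantity is computed depends on which content the hypothesis has.

p_cause_and_e = {"/k/": 0.12, "hiccup": 0.03, "cough": 0.05}

p_k = p_cause_and_e["/k/"]                         # content: {/k/}
p_disjunction = sum(p_cause_and_e.values())        # content: {/k/ or hiccup or cough}

print(p_k, round(p_disjunction, 2))  # 0.12 0.2
```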


So what determines the content of the hypotheses Chomsky and Halle claim the hearer forms about the structure and phonetics of the sentence, or of the "information" Lidz and Gagliardi claim UG allows the hearer to extract? Note that solving this problem will be essential to the success of any serious theory of the I-language, be it concerned with phonology, syntax, or semantics. Explanatory adequacy requires that all these areas have a basis for the categories that will figure in their laws, rules, or principles to which a child becomes increasingly sensitive. Given that the respective states in each of these cases can be caused by any number of imperfectly realized speech forms, for example, rapidly (mis-)pronounced, co-articulated, assimilated SLEs, or by completely non-linguistic noises such as coughs, hiccups, or squeaky wheels, it is essential to the laws that they not be beholden to them. Laws about the non-voicing of plural /s/ in English after a non-voiced consonant should not be falsified by instances of buzz saw sounds preceded by the tick of a clock! The laws are true of genuine morphemes/phonemes/features that may only be encountered in perceptual experience with varying probabilities; but the probabilities are about those morphemes/phonemes/features, not about anything that happens, for whatever reason, to sound like them, and would be part of a disjunctive cause.

As we also observed in §10.3, Jerry Fodor (1987, 1991) made a proposal that he thought solved such disjunction problems. Incorporating suggestions of Horwich (1998), I defended a modest version of the proposal:

(BasicAsymmetry) The content of a symbol is determined by the property of meaningful tokenings of a term that is explanatorily basic, the one on which in the context of a theoretical psychology all other tokens with that content asymmetrically/explanatorily depend by virtue of that property.

And I went on to argue that this seemed to me to offer a strategy to capture semantic claims in a linguo-semantics. I think it will also serve as part of an explanatory strategy for implementing Chomsky and Halle's proposal about linguistic and other early perceptual processes that involve sensitivities to abstruse phenomena.

11.2.2  Perception as Probabilistic Confirmation

It will help to start with a slightly less difficult example than an SLE, indeed, one of the very sort that Chomsky mentions a number of times by way of explaining what he means by "representation":


a rotating cube, whether it be the result of tachistoscopic presentations of a real rotating cube or stimulation of the retina in some other way; or imagined, for that matter.  (Chomsky, 2000: 158; see also 1980a: 105–6)

Now people do seem manifestly sensitive to such a phenomenon. Indeed, one can suppose there are certain ideal conditions under which it is a law that, under those conditions, they can distinguish (at least approximately) rotating cubes from, for example, rotating pyramids. But perception is seldom under ideal conditions, and sometimes people will react to something as if it is a rotating cube when it is not—when, for example, it is a poorly glimpsed pyramid, or, as Chomsky (2000: 160) emphasized, as a result of real or tachistoscopic presentations.14

A standard mentalistic explanation of such sensitivity would posit an internal state with the (non-conceptual) content {{rotating cube}} and some system of confirmation whereby it comes to be applied in the light of perceptual experience.15 As we mentioned at the end of §5.6, Kant perceptively noted that this will need to involve a "schematism" that will mediate between the abstract content of the representation and perceptual input, producing specific representations that are commensurate with sensory input from, for example, vision or touch. As a more contemporary theorist, Stephen Palmer (1999a) put it in the case of visual categorization of objects:

There must be some way in which the object representation is matched or compared against possible category representations. . . . For this to happen, the object and category representations must be of the same type. . . . The object representation might be a template, a feature list, a structural description, or some as-yet-unknown alternative, but whatever it is, the category representation must be of the same type so they can be matched. Trying to

14  If the visual system were good enough, of course, the hypothesis that there is a cube might never actually be assigned a probability of 1, i.e., certainty, it being impossible for any input to be a genuine Euclidean cube. It is enough that the assignment of 1 is a theoretical possibility—i.e., one defined in the theory—even if it is not a (metaphysically) real one. The condition under which the hypothesis would receive a probability of 1 is one that the theorist can have every reason to believe is a "limit," approached asymptotically, as plausibly in the case of geometric figures (cf. Quine, 1960/2013: §52, on ideal gases and frictionless planes). Alternatively, of course, the BasicAsymmetry content might simply be something like {{cube-ish}} or {{looks cubical}}, which might be more apt for a visual system that would have a limit to its resolution. The issues here bear more discussion, but I trust the general point is clear enough.
15  At any rate, this seems to be the default assumption of many vision scientists (see, e.g., Knill and Richards, 2012), as well as many linguists such as Chomsky and Halle (1968) and Lidz and Gagliardi (2015), who (in conversation) have been surprised that I labor it. I press it only in light of the skepticism about intentionalistic representation that we have encountered in Chomsky, Collins, Adger, and Devitt.


compare a template and a feature list, for instance, is futile. The only way it could be accomplished is by processes that essentially convert the template into a feature list that can be compared with the other feature list, convert the feature list into a template that can be then compared with the other template, or convert both representations into a third possibility that can then be compared with each other.  (Palmer, 1999a: 413–14)

Note that, as we mentioned in §5.6 in reply to Tomasello (2003), Fodor's (1998: 136–7) innate "prototypes" that he proposes as triggers for concepts may be just what serve Kant's and Palmer's purposes.

All of this might not seem to be an improvement on simply positing the sensitivity in the first place, until, however, we recruit a CRT of the familiar sort we sketched in §4.5, and, in particular, an II-CRT, that is understood as both intensional and intentional (see fn 2 above). It could provide a computational account of confirmation by exploiting, for example, a Bayesian algorithm that determines the probability of a certain hypothesis on certain evidence as a function of the probability of the evidence, the prior probability of the hypothesis, and the likelihood of it generating that evidence.16 Specifically, we may suppose that:

(i) There is a basic law linking that representation in an internally represented system of geometry with a specific basic explanatory condition of being a three-dimensional regular solid of six equal square sides (which may, again, be only a theoretical "limit" possibility).
(ii) Either innately, or as a result of experiential updating, priors are assigned to various perceptual geometric hypotheses—e.g., "That's a cube/sphere/pyramid"—as well as likelihoods of different stimulus patterns given each of the hypotheses.
(iii) The visual system computes the relative probability of one of the hypotheses given the actual stimulation received.

16  Cf. §5.4.3 above. I am using Bayesianism only as an illustrative example, with no commitments to the specific procedures of "updating" by conditionalization that many Bayesians standardly propose, or to whether the probabilities (in addition to the relevant categories) are explicitly represented in the content of the representation or are merely implemented as a property of the relevant attitude state in the operation of the system (cf. Gross, forthcoming).
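A minimal sketch of steps (i)–(iii), with priors and likelihoods stipulated purely for illustration, shows how the posterior can be computed over locally realized quantities; what it is a posterior about is fixed by the Basic condition in (i), not by anything in the arithmetic itself.

```python
# Illustrative Bayesian computation for (i)-(iii); all numbers stipulated.
# Bayes' rule: P(H | e) = P(e | H) * P(H) / P(e), where P(e) is obtained by
# summing the joint probabilities over the hypothesis space.

prior = {"cube": 0.3, "sphere": 0.3, "pyramid": 0.4}          # step (ii): priors
likelihood = {"cube": 0.70, "sphere": 0.05, "pyramid": 0.10}  # step (ii): P(e | H)

evidence = sum(likelihood[h] * prior[h] for h in prior)       # P(e)
posterior = {h: likelihood[h] * prior[h] / evidence for h in prior}

# Step (iii): the most probable construal of the actual stimulation
print(max(posterior, key=posterior.get))  # 'cube'
```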


Of course, given (let us agree with Plato and Descartes) that no genuine cubes are ever presented in experience, and that the (non-conceptual) content is really {{cube}}, the perceptual system will always be (to various degrees) mistaken in determining that a cube was being presented. But, given the structure of the system, these errors will be explained and asymmetrically depend upon the Basic explanatory condition provided by (i): this is what determines what the probabilities in (ii) and (iii) are probabilities of. But that Basic condition does not depend upon the errors.17

In this way, the BasicAsymmetry actually provides an explication of what we were initially only intuitively describing as a "sensitivity." What makes a response a manifestation of an animal's sensitivity to one property rather than another (e.g. disjunctive) one is that the response to that property is the basic response on which all other responses asymmetrically and explanatorily depend. The two components of the explanation, the BasicAsymmetry and the probabilistic computations, support each other: the BasicAsymmetry identifies the sensitivity that the probabilistic computations explain.18

Returning to the case of language, and streamlining the account: along lines of the above and the Chomskyan evolutionary speculations we sketched in §2.2.10, suppose that a creature has evolved an internal mental organ, a recursive I-language, with its categories of SLEs, and that this organ were then recruited by pre-existing cognitive and perceptual systems to classify (and produce) perceptual stimuli. A natural way for it to do this would be to also recruit perhaps pre-existing probabilistic reasoning strategies, and so determine which linguistic categories provided the most probable construal of the proximal stimuli it has encountered, along the lines of the hypothesis testing model proposed by Chomsky and Halle (1968) that we quoted in §11.2.1. By so categorizing the stimuli, it would in this way display sensitivities to the abstruse phenomena of SLEs.19

17  Of course, there are cases in which it might. Louise Antony (pc) tells me that research suggests that being susceptible to certain illusions (such as the "bent stick" in water) may be necessary for seeing most things right. Perhaps (although one should be careful about considering precisely what (non-conceptual) content to assign to vision, {{straight}} or {{looks straight}}, cf. fn 14). In any case, the account I am proposing is intended to provide only sufficient conditions for a successful explanation of abstruse sensitivities, not a necessary one. I suspect there will be other strategies that might also work, particularly in the context of a fuller psychology than is being considered here.
18  This mutual support answers the difficulty, raised by Bill Ramsey (2007: 125), that a purely external covariant (what he calls a "receptor") notion of content "does not provide us with any sense of how a state or structure plays a representational role". On the present account, it plays the role of determining what a probabilistic account of perception is measuring probabilities of by way of explaining abstruse sensitivities. Note that the present account accords with Nicholas Shea's (2018: §2.6) pluralistic proposal of a "varitel" semantics characterizing intentional states in terms of the variety of different explanatory roles they may play in explanations of different psychological capacities, differing only in not relying on Shea's largely teleo-semantic and externalist explanatory framework.
19  To take up the suggestion of §9.4, BasicAsymmetries may in fact also afford a way of actually defining what an SLE is: e.g., a phonological feature might just be that (perhaps counterfactual) phenomenon that would be the Basic cause of certain tokenings of a mental symbol, on which all other tokenings of it asymmetrically and explanatorily depend. This would, of course, make phonological features "response dependent" along the lines of colors and other secondary properties (but need not include being a doorknob, as Fodor, 1998: 127ff, unnecessarily generalized; see §5.3 above).
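The same machinery, again with stipulated numbers, can be run over SLE hypotheses, with a threshold implementing Chomsky and Halle's observation that "even crude agreement . . . suffices to confirm" the internally generated hypothesis (and footnote 13's point that there are presumably thresholds).

```python
# Companion sketch for the linguistic case; all numbers stipulated.

prior = {"/k/": 0.5, "/g/": 0.4, "non-speech": 0.1}
likelihood = {"/k/": 0.6, "/g/": 0.2, "non-speech": 0.3}  # P(acoustic input | H)

evidence = sum(likelihood[h] * prior[h] for h in prior)
posterior = {h: likelihood[h] * prior[h] / evidence for h in prior}

ACCEPT = 0.5  # hypothetical threshold: "even crude agreement suffices"
best = max(posterior, key=posterior.get)
if posterior[best] > ACCEPT:
    print("heard as", best)  # what the hearer 'hears' is the accepted hypothesis
```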


Note that the representational states in both the visual and linguistic case exhibit the properties of intentionality we mentioned in §8.1:

"aboutness" and "aspect": the same stimulus can be differently represented (as in the case of ambiguous figures or sentences);
hyper-intensionality: the same stimulus can be differently represented by necessarily co-extensive representations (as in perceiving a symmetrical diamond as a diamond or as a square; a sentence as a mere sequence of words or as a binary tree-structure);
answerability: the accuracy of the representation is answerable to facts about the stimulus, and may only be satisfied by the stimulus approximately (as in the case of cubes) or rarely (as in the case of SLEs);
rationality: the computations might be more or less reasonable as probabilistic inferences.

Most importantly for our discussion, there is the lack of existential commitment that impressed Chomsky: there need not be an actual cube or an SLE causing proximal stimuli, and so there need not be any de re "representation of." Again, as Brentano stressed, not all representation is representation of real phenomena.

Why is a specifically II-CRT crucial to this process? Well, it is a happy fact that at least some probabilistic reasoning can be implemented as locally specifiable computations that bring about representations of the mapped geometric states on the basis of sensitivity to only local physical input. Thus, an II-CRT supplemented with BasicAsymmetry in the above way would begin to meet the challenge of explaining how such a system could be sensitive to non-local and likely non-instantiated cubes, spheres, and SLEs. To be sure, this may turn out not to be the way in which humans do in fact do it, much less must—perhaps our sensitivities in this regard are as fortuitous as the ants' sensitivity may be to actual distances. But it is a plausible way in which we could do it—indeed, it seems to be a way that many (e.g. vision) psychologists actually seem to take for granted, and, analogously, the way that Liberman and Mattingly (1985) regard the invariant properties of articulatory gestures, which they claimed:


must be seen, not as peripheral movements, but as the more remote structures that control the movements. These structures correspond to the speaker's intentions.  (Liberman and Mattingly, 1985: 23; quoted in §9.2.2(ii) above)

Intentions are surely intentional if anything is!

I stress yet again that the aim of the above proposal is neither to provide a general psychosemantics nor one that reduces intentional content to some non-intentional pattern in ways that Fodor, Horwich, and others have hoped to do. As I said, such ambitions strike me as excessively motivated by reductionist concerns that are not urgent so early in the development of an empirical psychology. I am concerned only with providing a role for what seems a reasonable notion of intentionality at a "horizontal" level of current psychological explanations, as opposed to "vertical" reductions, and then only for cases of relatively modularized systems, such as vision and grammar, where the full panoply of conceptual connections afforded by general ("central") reasoning need not be addressed. The point is to explain how ascriptions of such content can meet the challenge of explaining the sensitivities of a local physical device to abstruse phenomena such as cubes and SLEs. The interest of the proposal is in its explanatory power, not in its putting to rest all philosophical anxieties about intentionality. Again, as Shoemaker (2000) characterized "functionalist" accounts in general: intentional explanation comes as "a package deal."20 It seems to me some intentionalist package deal very like it will be indispensable to providing the common coin an "explanatorily adequate" theory will require.

11.3  Concluding Remarks on Meth(odological)-Dualism

I do not pretend to have offered an apodeictic, much less an a priori argument for the role of intentionality in a Chomskyan linguistics. The above proposals are intended in the same explanatory spirit in which Chomskyans advance their theory of grammar in the first place. Perhaps more constraints are needed, and perhaps better, non-intentionalist explanations are to be had. But this is just a standard issue in any empirical, explanatory project.

It is striking, however, that resistance to a scientific intentionality has not always proceeded in this spirit. For all the confidence with which Chomsky (2000, 2003) dismisses intentionality in science, he never actually provides a

20  I'm grateful to Carsten Hansen for reminding me of this nice phrase.


serious argument for the dismissal, much less examines the relevant explanatory issues.21 At most, he offers a perfunctory, passing observation about the problems with a relational, de re reading of "representation of" that we saw (§8.6) was obviously confused, both with regard to the examples (e.g. of a rotating cube) that he provided, and certainly with regard to intentional idioms generally; and then some casual remarks about how artifactual computers are not an apt model for psychology, a claim few defenders of computational models of mind would dispute. In what follows, I want to consider more serious sources of resistance to incorporating intentional explanation into standard science and how they motivate the Meth-dualism that Chomsky curiously finds irrational and mystical, despite his sharing many of those motivations himself (§11.3.1). I will conclude the book with some comments on the "mind-body" problems he dismisses that provide some of the background to those motivations (§11.3.2).

11.3.1  Motivations for Meth-Dualism

As I have mentioned a number of times, Chomsky's dismissal of a role for intentionality in science is ironic, since (although, surprisingly, he nowhere seems to recognize this fact) it is just such a dismissal that is the basis for many philosophers' endorsement of the very Meth-dualism that he otherwise deplores. In this concluding section I want to sort out some of the surprisingly diverse issues that are entangled here with an eye to defending at least the modest intentionality I have proposed for linguistics from both Chomsky's and the background Meth-dualist's reservations about it.

Chomsky persistently characterizes Meth-dualism in a bizarrely uncharitable way, as:

The view that we must abandon scientific rationality when we study human beings "above the neck" (metaphorically speaking), becoming mystics in this unique domain, imposing restrictions and a priori demands of a sort

21  Certainly not in the kind of detail in which he considers arguments for specific syntactic analyses. The exceptions that, so to say, prove the rule are in his more purely scientific writings, for example in his impressively careful consideration of the role of semantic considerations in linguistics in his (1955/75: 85–97), where he deals carefully with Quinean worries about meaning, and in his and Halle's discussion of "psychological reality" in their (1968), that we've cited several times. But, as we've noted, these are all passages that seem to presuppose intentionality!


that would never be contemplated in the sciences, or in other ways departing from normal canons of inquiry.  (Chomsky, 2000: 76; see also 112, 135 and 2016a: 30)

Chomsky claims to find this view implicit in the work of philosophers as eminent and diverse as Quine, Davidson, Kripke, Dummett, Putnam, and Nagel (see, e.g., Chomsky 1975b, ch 4; 1986, ch 4; 2000, ch 2–6). To be sure, as we have noted in previous chapters, there are problems in many of the sometimes shallow (or what I have called "superficialist") objections some of these figures have raised against Chomsky's program. But it seems pretty wild to suggest that all of these quite serious philosophers are even implicitly advocating an abandonment of scientific rationality for some sort of mysticism regarding the mind. For any genuine understanding of the issues here, it is important to have some appreciation of the historical contexts from which both those discussions and Chomsky's own have emerged.

Some sort of Meth-dualism has always dominated speculation about the mind insofar as one finds it relying heavily on introspective, a priori, and/or "armchair" speculation, as opposed to systematic experiment. This is conspicuous in the very Rationalist and occasional Empiricist proposals that Chomsky endorses.22 During the time in the nineteenth century that more empirical approaches began to emerge, Brentano, as we mentioned, threw down the gauntlet claiming that intentional phenomena were "irreducible" to anything physical. This did indeed give rise to a Meth-dualism that was pursued in the "phenomenological" tradition of, for example, Husserl (1913/82) and Heidegger (1927/82). Here, psychology was seen as based largely on a kind of reflective introspection which, while often conceptually quite rich, was not particularly responsible to the controlled experimental methods of natural science. Further deepening the rift, others, for example, Windelband (1894/1915) and Dilthey (1927/76), proposed a "hermeneutic" approach, according to which understanding human reason and action involved a fundamentally different, "empathic" ("verstehen") understanding, more akin

22  Chomsky (1996: 1, 2016b) often belittles the significance of the "cognitive revolution" since the 1960s as largely just a re-discovery of similar (and to his mind often better) ideas of, e.g., Descartes and Cudworth in the seventeenth century. But he overlooks at least two crucial differences between then and now: (i) serious computational models of anything really were not possible before Turing's (1936) revolutionary work (it is no accident that there were no computers in the seventeenth century!), and (ii) for all the brilliance of many pre-twentieth-century theorists of the mind, it is a striking and peculiar fact that the idea of carefully controlled psychological experiments seldom seems to have occurred to them. Chomsky (2000: 80; 2010: 3–4; 2018: 35) likes to point out how Hume often compared his theories with those of Newton, but does not note that his speculations were not remotely based on the kind of detailed empirical findings that formed the basis for Newton's ideas.

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

Psycho-Semantics of Perceptual Content  381 to literature, in contrast to the “explanatory” (“erklären”) methods of the natural sciences.23 Perhaps somewhat surprisingly, this rift, drawn essentially along the same lines, continues down to the present day in a great deal of analytic philosophy. In particular, Quine (1960/2013) and his followers argued that intentional ascription amounted to little more than a “charitable” way of interpreting our neighbor, a way “that finds him consistent, a believer of truths, and a lover of the good,” as Davidson (1970: 253) touchingly (if pretty implausibly) put it.24 And, in his otherwise admirable rendering of functional approaches to the mind in terms of Ramsification (see §10.2.2, fn17 above), David Lewis (1972:212) proposed defining mental concepts in terms of “platitudes which are common knowledge among us—everyone knows them, every­one knows that everyone else knows them, and so on” (1972: 212), oddly not imagining applying the technique to material that is more explanatorily serious.25 As I indicated in the Preface, many of us encountering Chomsky’s work in the 1960s saw in it salvation from all this. This is why it was so disconcerting to read such anti-intentionalist passages from Chomsky that I quoted in §8.4: If “cognitive science” is taken to be concerned with intentional attribution, it may turn out to be an interesting pursuit (as literature is), but is not likely to provide explanatory theory or to be integrated into the natural sciences. (Chomsky, 2000: 23)

and to find him to be invoking the hermeneutic contrast of “theoretical” vs. literary and historical understanding: we learn much more of human interest about how people think, feel and act by reading novels or studying history or the activities of normal human life 23  Interestingly, as we quoted in §8.2, Chomsky (1986: 40) himself at one point described his own project as “cerebral hermeneutics”! 24  Curiously, so far as I have read, neither Quine nor Davidson acknowledge the connection to these earlier traditions (but see Mantzavinos,  2016, for an excellent discussion). Peter Winch’s (1958/90) may have been an im­port­ant discussion that linked them. This “charity” view has also sometimes been expressed by saying that intentional ascription is “normative,” as we saw in our discussion of Kripkenstein in §3.4.3, an idea that has been energetically pursued by Dennett (1978): deciding on the basis of available evidence that something is (or may be treated as) an intentional system permits predictions having a normative or logical basis rather than an empirical one.  (Dennett, 1978: 13) See also McDowell (1985: 389), Nagel (1986: 115–16) and Wedgwood (2007) for various versions of the view, and Rey (2002a) and (2007) for criticisms of them. 25  Lewis (1994) later gives a passing nod to theory instead, but does not elaborate. See Lycan (1987) and Rey (1997: ch 6) for discussion of applying functionalism and Ramsification to scientific psychological theories, an approach Block (1978/80) called “psychofunctionalism.”

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

382  Representation of Language than from all of naturalistic psychology, and perhaps always will; similarly, the arts may offer appreciation of the heavens to which astrophysics does not aspire. We are speaking here of theoretical understanding, a particular mode of comprehension. Any departure from this carries a burden of ­justification.  (Chomsky, 2000: 77)

Of course, there is more to the Verstehen tradition than art and novels. Explaining people's states and behavior in terms of their thoughts, motives, representations, understandings, patterns of reasoning, attitudes towards risk, and other mental states, as the Verstehen tradition insisted upon doing, is precisely the project of the various research programs we mentioned in §8.1—including linguistics with, as we have seen, its talk of parsing, perception, internal representations, and aim of explanatory adequacy. If the computational accounts they propose are on the right track, then, pace the Hermeneuticists, the Meth-Dualists, and Chomsky, they do not seem to present any serious departure from the usual sorts of theoretical understanding to which scientists aspire.26

26  For the record, Thomas Nagel (1986) expresses an astonishingly sweeping skepticism about such programs:

Eventually, I believe, current attempts to understand the mind by analogy with man-made computers that can perform superbly some of the same external tasks as conscious beings will be recognized as a gigantic waste of time. The true principles underlying the mind will be discovered, if at all, only by a more direct approach.  (Nagel, 1986: 16)

But it is difficult to find in his work any serious argument for this view, much less any detailed discussions of the specific programs and their shortcomings, other than that they may not be sufficient for us to be able to empathize with bats.

11.3.2  Mind/Body Problems

As I have indicated, many of these issues about the proper role of intentionality grew out of the traditional "mind/body problem" that can be traced back at least to Descartes. Surprisingly, Chomsky (1996: 5–6, 2004b: 173–4) in a number of lectures and publications over the last three decades has been attempting to dismiss these supposed problems, so much so that I think something needs to be said about them, if only to make clear their errors, but also their complete lack of relevance to our preceding discussion and the rest of his theory.27

27  As Joseph Levine (forthcoming) notes, Chomsky's full views on the mind/body problems are actually difficult to make out, since, alongside passages in which he dismisses the problems, he also seems to endorse standard materialist approaches to them. In a particularly puzzling passage, he seems to endorse some remarks of the eighteenth-century chemist, Joseph Priestley:

With the Newtonian discoveries, matter "ought to rise in our esteem, as making a nearer approach to the nature of spiritual and immaterial beings", the "odium [of] solidity, inertness, or sluggishness" having been removed. Matter is no more "incompatible with sensation and thought" than with attraction and repulsion. "The powers of sensation or perception and thought" are properties of "a certain organized system of matter . . . a property of the nervous system, or rather of the brain".  (Chomsky, 1995b: 8)

and elsewhere seems to recognize within such a proposal precisely one of the standard mind/body problems, since

there appears to be nothing to say about how organized matter can have such properties as the creative aspect of language use; in that respect we're exactly as much in the dark as the Cartesians were.  (Chomsky, 2016b)

I shall concern myself here only with what seem to me the more salient, and certainly more idiosyncratic, dismissals, and direct the reader to Levine's subtle attempts to reconcile them with these other passages.

Chomsky's dismissal of "the mind/body" problem is based almost entirely on his claim that it makes sense only in terms of the notion of body peculiar to Descartes' contact mechanics, whereby all movements need to be caused by local impacts between ultimate "corpuscles" of matter. He argues that this view was discredited by Newton's theory of gravitation, and that there has not been a viable alternative to the Cartesian notion since.28 Turning Ryle's famous phrase, "the ghost in the machine," on its head, Chomsky claims:

Newton exorcised the machine, leaving the ghost intact. The mind-body problem in its scientific form did indeed vanish as unformulable, because one of its terms, body, does not exist in any intelligible form.  (Chomsky, 2014; see also 1993: 38; 1996: 6, 41ff; 2002a: 53, 84; 2016a: 56)

Indeed, for example:

Eliminative materialism . . . is total gibberish until someone tells us what matter is . . . and nobody can tell you what matter is.  (Chomsky, 1993: 84)29

28  Actually, the history of contact mechanics is a great deal more complex than Chomsky indicates, many versions of the doctrine being seriously pursued to this day (see, e.g., Johnson's, 1987, apparently classic text, which traces the history of the view not to Descartes—whom he never mentions—but to the work of Helmholtz in the 1880s). Worries about locality persist in Einstein's concerns about "spooky action at a distance" in quantum mechanics. Although locality remains a reasonable working (but surely modifiable) assumption in psychology (cf. §11.1, fn 4 above), it is hard to see why any theorist about the mind should care whether physics traffics in "bodies" or "fields." For the record, note that Descartes' view involved both the idea of locality and that it transpires between "corpuscles," and that field theories can allow the former while denying the latter.

29  In this passage he allows that eliminative materialism may alternatively be the view that scientists should "just look at neurophysiology" (Chomsky, 1993: 85), a view that there is no need here to defend or discuss. Curiously, he also disparages here a view very like one of his own that we mentioned in §3.3.3 (where a grammar is an abstract characterization of a procedure implemented somehow in the brain):

When people say the mental is the neurophysiological at a higher level, they're being radically unscientific. . . . The belief that the neurophysiological is implicated in [the mental] could be true, but we have scant little evidence.  (Chomsky, 1993: 85)

In fact, this seems to be the quite plausible view defended by Collins (2009: 262), quoted above in §3.3.3. Note in any case that Eliminativists usually stress the elimination of the mental (cf. Collins, 2007b, in §8.5 above); their views of the "materialist" alternatives can vary.

Now, Chomsky's discussion of the history here is certainly partly correct,30 but, as it concerns the "mind/body problem"—or, really, problems—I fear it misses some standard issues that seem perfectly expressible without that history, and, moreover, peculiar enough to our thought and talk about minds as to be puzzling independently of any controversial notion of "body" or "matter." In fact, anyone who has taught elementary introductions to the philosophy of mind does not (I hope) begin with a discussion of Descartes' contact mechanics or with any sort of general characterization of "body." It is easy enough to raise the standard problems with a conception of body that is as clear and readily available to anyone with a standard Western education, viz., our ordinary biological bodies. Not to belabor the obvious, these consist of, for example, heads, torsos, limbs, and a multitude of internal organs. People have, of course, varying degrees of familiarity with their bodies, but they surely know enough to find the following claim overwhelmingly plausible (where "current natural sciences" include current physics, chemistry, biology, and neurophysiology, and "bodily states and events" are any states or events involving human bodies describable in the proprietary terms of those sciences):31

(SBN): Simple Bodily Naturalism: Every individual bodily motion that is the result of a mental state can be explained as (also) the result of bodily states and events along standard biological/physio-chemical lines of current natural sciences.

30  Although Randy Gallistel (pc) has called my attention to a fascinating debate in the nineteenth century about the compatibility of mental–physical causation with the laws of physics, and referred me to Kremer (1990).

31  Levine (forthcoming) minimizes the metaphysical (e.g., "materialist") commitments in an insightful way by pointing out that, in the end, opposition to dualism is simply an insistence on explaining mental phenomena in any non-mental terms: mental phenomena are simply not "basic" primitive phenomena of the world. I would rely on this simpler formulation were it not for the fact that Chomsky (1996) seems prepared even to resist that idea:

It may seem offensive to common sense and sound thinking to suppose that certain matters (intentionality and aboutness, consciousness, behavior that is uncaused but appropriate, or whatever) are among "the ultimate and irreducible properties of all things" that physicists seek to catalogue (Jerry Fodor's formulation). But the stipulation is not very helpful. Why these, but not attraction and repulsion?  (Chomsky, 1996: 44)

Now, it seems to me we have every reason to rule out this and other perhaps intermediate possibilities (such as Nagel's, 2012, appeal to Aristotelian teleology), and so I want to propose (SBN) in addition to Levine's insight.

That is, the natural sciences are "complete," maybe not necessarily with regard to the universe (about which, to be sure, mere philosophers of mind should not opine), but at least with regard to the mind and human body: there are no special problems in explaining any human bodily phenomena that also appear to be produced by mental states, at least not by virtue of their being so produced. A physical movement, say the raising of an arm or the articulatory motions of speech, may often be the result of some mental deliberation, and so count as an "action," but those specific events could also be explained by reference, for example, to electrical impulses coursing down efferent nerves and causing the muscles to contract.32

(SBN) does not entail that there are no problems in natural sciences, nor even that arm movements and articulatory motions are not among them. Nor does it rule out macro-levels of description that are in some important sense causally explanatory but not "reducible" to physics. Perhaps "mental state" and "action" descriptions—for example, the raising of an arm, as opposed to its merely rising—are at such a non-reducible macro-level. All that SBN claims is that, if these states and actions somewhere down the line cause any bodily events, the latter events are not for that reason counterexamples to present natural sciences.33

For all its banality, however, SBN sets enough of a stage for us to raise several standard philosophical problems. It is important to stress that these problems are by no means confined to philosophy, but are increasingly being addressed by cognitive scientists. At the risk of belaboring issues familiar to every philosopher:

(i) "Privacy" and First-Person Privileges: Every normal human being seems to have some kind of very special access to her or his conscious mental states, but not to the states of anyone else. There seems to be what (not a philosopher, but) a leading vision scientist, Stephen Palmer (1999b: 931ff), called a "subjectivity barrier" (which, as we noted in §3.3.1, behaviorism was a strenuous effort to circumvent).

32  I would be surprised were SBN not presumed by most every serious scientist, though it is hard to find any bothering to say so in print. Pace Collins (2015: 102), SBN does rule out purported parapsychological phenomena such as telekinesis or levitation, since these would seem to present challenges to present natural science. Indeed, SBN presents a serious challenge to any dualistic view on which mental phenomena are not in some way or other constituted by present natural phenomena, since it would threaten to render mental phenomena explanatorily otiose.

33  It could turn out, of course, that certain bodily motions on a particular occasion were crucially the result of some indeterminate quantum event. But we have no reason to suppose that this was because the motion was also the result of a mental event.

As Palmer goes on to note, this first problem also raises a further, specific metaphysical problem about, for example, color that is to some extent independent of this first epistemic one:

(ii) The qualitative character of sensory experience: One kind of phenomenon of which people seem to have special knowledge, each in his or her own case, is what seems to be a distinctive "qualitative character" (or "qualia") of much of their sensory experience, for example, "what it is like" to see red, have an itch, or a toothache (cf. Nagel, 1986). Quite apart from the supposed "privacy" of such phenomena, it is extremely difficult to see how they could possibly be fully explained by bodily phenomena, much less identified with them. Palmer provides empirical support for the objective possibility of, for example, inversion of "red-green" color experiences, which he thinks casts serious doubt on the possibility of science being able to give a "complete and testable explanation of the quality of color experience, or any other kind of experience for that matter" (Palmer, 1999b: 942). As Levine (1983) famously noted, there seems to be an "explanatory gap" between the properties of the brain and the properties of sensory experience, a gap that (pace Collins, 2015: 102) seems conceptually far more problematic than gaps between, say, economics and natural science, in that it seems impossible even to imagine how it could possibly be bridged.

(iii) Rationality: Descartes made a prescient observation that still seems plausible even today in the age of fancy computers:

although machines can perform certain things as well as or perhaps better than any of us can do, they would infallibly fall short in others, by which means we may discover that they did not act from knowledge, but only from the disposition of their organs. For while reason is a universal instrument which can serve for all contingencies, these organs have need of some special adaptation for every particular action.  (Descartes, 1637/1970: 116)

Indeed, reflecting on contemporary holistic accounts of confirmation (cf. §4.4 above), Jerry Fodor (2000) raised some general reasons to be skeptical of a general Turing-computable model of reasoning that he otherwise endorsed. The problem is arguably a version of the "frame problem" discussed in computer science.34 Again, the problem here seems conceptual: how possibly could any sort of holistic confirmation work in any system of local causation and computation (cf. §11.1 above)?

34  See McCarthy and Hayes (1969), Fodor (1983: 114), and Shanahan (2017) for discussion.

(iv) Free Will: If every motion of a human body can be entirely explained by existing naturalistic theories, then how can a person's deliberate actions have been performed "freely," in a way that renders them deserving of moral praise or blame? How could they possibly have acted in a way other than they did? Chomsky (1983, 2002a: 50, 60) himself is sensitive to this particular problem (oddly not including it in his dismissal of the "mind/body problem," and uncritically presuming it is not an illusion, as many have argued it is). Discussing Cartesian contact mechanics, he writes:

crucial aspects of what people think and do would lie beyond its scope, in particular, voluntary action. Set the machine in a certain state in a particular external situation, and it will be "compelled" to act in a certain way (random elements aside). But under comparable circumstances, a human is only "incited and inclined" to do so. . . . The available evidence suggests that some aspects of the world, notably the normal use of language, do not fall within the mechanical philosophy.  (Chomsky, 1996: 3)

But Chomsky is wrong to suggest the problem is somehow confined to Descartes' "mechanical philosophy." SBN will suffice. Putting, indeed, "random" or statistical quantum considerations aside (which would of course be of no help for responsible action), the current natural sciences of human bodies seem to indicate that their states and motions are not only "inclined," but every bit as "compelled" as the states and motions of any machine. It is important not to confuse this traditional philosophical issue with the purely scientific issue that Chomsky rightly raises of whether the use of language—linguistic performance—affords a theoretically tractable domain. Just because some processes may involve immensely complex interaction effects (cf. §3.2 above), this does not entail that they somehow preclude "mechanical" explanation (even though Chomsky does often associate the two ideas, as in his 1986: 3). The movement of leaves in the wind and rain outside my window is likely not a theoretically tractable domain, but (again, barring quantum effects) is presumably as "mechanical" as any other process in nature.

Lastly, there is the problem for (SBN) that was the original motivation for my writing this entire book:

(v) Intentionality: As we have discussed above, people seem to have thoughts and perceptions that are about an indefinite variety of abstruse phenomena, real or unreal. But how is this generally possible for what seems to be (on current theories) a local physical/biological system?35

I want to stress, again, that none of these problems makes mention of any sort of Cartesian contact mechanics, or any notion of body more controversial than that of a typical human biological one.36 Indeed, SBN can be stated without any general definition of "body," "mind," or "physical." Many philosophers agree with Chomsky that such definitions are hard to come by, and that, to be sure, conceptions of what is "physical" have significantly evolved and are likely to continue to do so, especially in light of the puzzles of quantum mechanics and its relation to General Relativity. As Bilgrami and Rovane (2005: 193) point out in their defense of Chomsky's views about the mind/body problem, "our physical concepts have not proved to be very stable." However, this does not begin to invite the suggestion they go on to make that:

If [our physical concepts] are not stable, then none of the alleged problems about the mind-body relation can be stable either.  (Bilgrami and Rovane, 2005: 193)

It all depends on whether we have any reason to believe that the present instabilities in physics have anything whatsoever to do with the mind.37 Although quantum phenomena could conceivably play a role in explaining some mental phenomena, there is not the slightest reason to suppose the issues they raise have anything to do with the above five puzzles, any more than they do with problems of love, war, or the national debt. Were the above mind/body problems really not so stable!

35  Of course, Chomsky may not regard this as a problem if, as in his (1996: 44) that we quoted in fn 31, he is prepared to find intentionality to be an "ultimate and irreducible property of all things."

36  I include "biology," since Chomsky (2000: 44) seems to endorse Wittgenstein's (1953/2016: §113) claim that "we can only say of a human and what is like one that it thinks," and dismisses the question of whether machines can think as being as "meaningless" as the question whether submarines swim (a comparison first proposed by Edsger Dijkstra, 1984). Of course, given that we understand a lot less about thinking than we do about submarines and swimming, one might wonder whether the issue is really a merely verbal one, and why and whether biology is really essential to thought. See Rey (2016) for discussion.

37  Collins (2015: 95–7) notes that a number of philosophers, e.g. Lycan (2003), have responded in this way to Chomsky's claim (which can be traced back to Carl Hempel, 1969), and he cites in reply Penrose's (1989) speculation about how quantum effects on mental processing specifically raise problems for computational theories of mind. But while they (and lots of other issues) may well raise problems for such specific theories, neither he nor Penrose provides any serious reason to suppose that these quantum effects will have anything to do with the problems I have mentioned, as Collins acknowledges at the end of this passage (2015: 97).

Chomsky (1975b: pt IV; 2000: 82–3) draws a distinction between "problems," which fall within the range of our scientific abilities, and what he regards as hopeless "mysteries," which do not, and which he thinks may never be amenable to serious solution. Aside from the problems he regards as "gibberish," he seems to regard the problem of free will as falling into the latter category, a suggestion that Colin McGinn (1994) has pursued further with regard to a great many philosophical conundra. But, apart from the persistence of the conundra, neither Chomsky nor McGinn provides the slightest evidence for such a conjecture. In fact, in assessing our understanding of the mind, it is crucial to bear in mind the complexity of the religious, political, moral, psychotherapeutic, and merely everyday practical agenda that have all been at work, pulling our conceptions now this way and now that. Here the problem is not "mystery," but simply trying to keep these different issues from getting confused, which history shows is no mean feat. More importantly, as Chomsky (2016b) himself rightly acknowledges, there has in fact been some progress in understanding at least some aspects of many of these problems in the cognitive scientific literature of the last sixty years.38 Just which mind/body problems are "mysteries" and which are increasingly subtle "problems" seems for the foreseeable future a wholly ephemeral matter, one that we should not begin to decide at this early stage in the history of scientific efforts to address them.

38  Thus, much work combining research in psychology and philosophy has made interesting inroads into all five problems. For work on introspection and self-attribution that seriously qualifies the issues of privacy see, e.g., Nisbett and Wilson (1977), Burge (1988), and Carruthers (2011); on vision, audition, pain perception, and the problems of qualia, see, e.g., Melzack and Wall (1988), Aydede (2019), Palmer (1999b), Hardin (1988/93, 2008), and Block (2007); with regard to free will, Austin (1956/79, 1966/79), G. Strawson (1987), Wegner (2002), and Nahmias (2014); and on intentionality, Burge (1984, 2010), Loewer (1997), Neander (2017), and Shea (2018), as well, of course, as the last several chapters and references above. See Collins (forthcoming) for a penetrating discussion of the fragility of Chomsky's "problems/mysteries" distinction.

Oddly, Chomsky's distinction also omits four further important categories into which many mind/body problems may plausibly fall:

(a) linguistic confusions: a tendency to assimilate too simply what are complex and diverse uses of words (a point stressed by Wittgenstein, 1953/2016, Austin, 1979, and Chomsky himself, cf. §10.4 above!);

(b) conceptual confusions, regarding simply how we think of, for example, privacy, qualia, freedom, and intentionality;

(c) illusions regarding how we experience ourselves, and ineluctably think of other people, as having conscious experiences and "freedom" to choose (cf. G. Strawson, 1987, Rey, 1996, Wegner, 2002, Frankish, 2017); and

(d) what may ultimately turn out to be irresolvable tensions in our thought, for example, between our view of ourselves as physical objects and as morally significant agents in the world (cf., e.g., Kant, 1787/1968: B72–481; P. Strawson, 1962; Nagel, 1986).

I do not mean to decide here just which of the problems fall into which of the categories—that, in fact, has always been a large issue in the philosophy of mind, at least since Kant. The point here is only that, in dismissing continuing work on the mind/body problem as "gibberish," Chomsky is badly—and needlessly!—ignoring its richness and complexity. In fact, for reasons I hope have emerged in this book, I think a Chomskyan linguistics has made contributions towards solutions of some aspects of the above problems, and it is odd that Chomsky, instead of dismissing them, does not discuss how his core theories may contribute to a better understanding of them. The application of the Galilean Method to psychology, emphasizing the importance of a theory of competencies underlying superficial performance; the positing of an unconscious, innate computational system of grammar as an explanatorily adequate means of capturing human linguistic competence, perception, and understanding; the separation of a theory of meaning from a theory of truth: all of these distinctive contributions of a Chomskyan linguistics substantially deepen our appreciation of distinctions that have not yet been sufficiently incorporated into our understanding of the problems.

I have made a modest proposal of my own with respect to the fifth problem, that of explaining intentionality, suggesting how a Chomskyan theory, supplemented with a combination of probability theory and BasicAsymmetries, might serve to explain not only the sensitivities to the abstruse categories of grammar required for an explanatorily adequate theory, but also some of the ways we perceive and understand language—surely aspects of its qualitative feel! Whether these suggestions can be sustained and extended beyond those cases to sensitivities displayed in people's more general cognitive abilities seems to me to await a far better understanding of those abilities than I believe anyone has yet attained. But I hope that the discussions I have provided in this book afford good reasons to think that a Chomskyan approach, intentionalistically understood, offers a promising strategy.


References to Works of Chomsky (including with joint authors)

For ease of reference, the following are the references for the works of Chomsky cited in this book.

Chomsky, N. (1951/79), The Morphophonemics of Modern Hebrew, London: Routledge.
Chomsky, N. (1955/75), The Logical Structure of Linguistic Theory, University of Chicago (1955/79), New York: Plenum Press (the ms. was prepared and circulated in mimeograph in 1955–56, but was not published until 1975, both by Plenum Press and by the University of Chicago. Only the latter contains an invaluable index).
Chomsky, N. (1956), "Three Models for the Description of Language," IRE Transactions on Information Theory, II–2: 113–24.
Chomsky, N. (1957), Syntactic Structures, The Hague: Mouton.
Chomsky, N. (1959/64), "A Review of B. F. Skinner's Verbal Behavior," Language, 35: 26–58. Reprinted in J. Fodor and J. Katz (eds), Readings in the Philosophy of Language, Englewood Cliffs, NJ: Prentice Hall, pp. 547–78.
Chomsky, N. (1962), "Explanatory Models in Linguistics," in E. Nagel, P. Suppes, and A. Tarski, Logic, Methodology and Philosophy of Science, Stanford, CA: Stanford University Press.
Chomsky, N. (1964), Current Issues in Linguistic Theory, The Hague: Mouton.
Chomsky, N. (1965), Aspects of the Theory of Syntax, Cambridge, MA: MIT Press.
Chomsky, N. (1966), Cartesian Linguistics, New York: Harper and Row.
Chomsky, N. (1968), "Quine's Empirical Assumptions," Synthese, 19 (1/2): 53–68.
Chomsky, N. (1968/2006), Language and Mind, 3rd edn, Cambridge: Cambridge University Press.
Chomsky, N. (1970), "Remarks on Nominalization," in R. Jacobs and P. Rosenbaum (eds), Readings in English Transformational Grammar, Waltham, MA: Ginn, pp. 184–221.
Chomsky, N. (1971), Problems of Knowledge and Freedom, New York: Vintage.
Chomsky, N. (1973), For Reasons of State, New York: Pantheon Books.
Chomsky, N. (1975a), Introduction to (1955/75).
Chomsky, N. (1975b), Reflections on Language, New York: Pantheon Books.
Chomsky, N. (1977a), Dialogues avec Mitsou Ronat, Flammarion (same as 1979).
Chomsky, N. (1977b), Essays on Form and Interpretation, New York: North Holland.
Chomsky, N. (1978a), "A theory of core grammar," Glot, 1: 7–26.
Chomsky, N. (1978b), Interview with Sol Saporta, Washington State University, Department of Linguistics Working Papers in Linguistics, supplement, 4: 1–26.
Chomsky, N. (1979), Language and Responsibility, New York: Pantheon.
Chomsky, N. (1980a), Rules and Representations, Oxford: Blackwell.
Chomsky, N. (1980b), "Précis of Rules and Representations with Commentaries and Replies," Behavioral and Brain Sciences, 3: 1–61.
Chomsky, N. (1980c), "The New Organology," Behavioral and Brain Sciences, 3: 42–61.
Chomsky, N. (1981), Lectures on Government and Binding, Dordrecht: Foris.


Chomsky, N. (1983), "Things no amount of learning can teach: Noam Chomsky interviewed by John Gliedman," Omni, 6 (11): http://www.chomsky.info/interviews/198311--.htm
Chomsky, N. (1983/2004), The Generative Enterprise Revisited: Discussions with Riny Huybregts, Henk van Riemsdijk, Naoki Fukui and Mihoko Zushi (originally published in 1983 as The Generative Enterprise), Berlin: De Gruyter Mouton.
Chomsky, N. (1986), Knowledge of Language: Its Nature, Origin and Use, New York: Praeger.
Chomsky, N. (1987), "Language in a Psychological Setting," Sophia Linguistica, 22.
Chomsky, N. (1988a), Language and Problems of Knowledge: The Managua Lectures, Cambridge, MA: MIT Press.
Chomsky, N. (1988b), Language and Politics, a collection of interviews edited by C. Otero, New York: Black Rose Books.
Chomsky, N. (1991), "Linguistics and Adjacent Fields: A Personal View," in A. Kasher, The Chomskyan Turn: Linguistics, Philosophy, Mathematics and Psychology, Oxford: Blackwell, pp. 3–25.
Chomsky, N. (1993), Language and Thought, London: Moyer Bell.
Chomsky, N. (1995a), The Minimalist Program, Cambridge, MA: MIT Press.
Chomsky, N. (1995b), "Language and Nature," Mind, 104 (413): 1–61.
Chomsky, N. (1995c), "Rationality/Science," Z Papers Special Issue; https://chomsky.info/1995____02/
Chomsky, N. (1996), Powers and Prospects, Boston: South End Press.
Chomsky, N. (2000), New Horizons in the Study of Language, Cambridge: Cambridge University Press.
Chomsky, N. (2002a), On Nature and Language, Cambridge: Cambridge University Press.
Chomsky, N. (2002b), "Exchange with John Searle," New York Review of Books, July 18, 2002 issue, https://www.nybooks.com/articles/2002/07/18/chomskys-revolution-anexchange/
Chomsky, N. (2003), "Reply to Rey," in L. Antony and N. Hornstein, Chomsky and His Critics, Oxford: Blackwell, pp. 274–87.
Chomsky, N. (2004a), "Beyond Explanatory Adequacy," in A. Belletti (ed.), Structures and Beyond: The Cartography of Syntactic Structures, Oxford: Oxford University Press, pp. 104–31.
Chomsky, N. (2004b), "Biolinguistics and the Human Capacity," available at https://chomsky.info/20040517/ (delivered at MTA, Budapest, May 17, 2004); appears as chapter 7 in Chomsky (1968/2006).
Chomsky, N. (2004c), Introduction to Chomsky (1968/2004).
Chomsky, N. (2005), "Three Factors in Language Design," Linguistic Inquiry, 36: 1–22.
Chomsky, N. (2006), Preface to 3rd edition of (1968/2006).
Chomsky, N. (2007), "Approaching UG from below," in U. Sauerland and H. Gärtner (2007), pp. 1–29.
Chomsky, N. (2009a), "The Mysteries of Nature: How Deeply Hidden?" Journal of Philosophy, CVI (4): 167–200.
Chomsky, N. (2009b), "Remarks," in M. Piattelli-Palmarini (ed.), Of Minds and Language: A Dialogue with Noam Chomsky in the Basque Country, Oxford: Oxford University Press, pp. 13–43, 52–4.
Chomsky, N. (2010), "The mysteries of nature: How deeply hidden?" in J. Bricmont and J. Franck (eds), Chomsky Notebook, New York: Columbia University Press, pp. 3–33.
Chomsky, N. (2013), "Problems of Projection," Lingua, 130: 33–49.


Chomsky, N. (2014), "Science, mind and the limits of understanding," lecture at The Science and Faith Foundation (STOQ), The Vatican, January 2014; https://chomsky.info/201401__/
Chomsky, N. (2015), "Some Core Contested Concepts," Journal of Psycholinguistic Research, 44 (1): 91–104.
Chomsky, N. (2016a), What Kind of Creatures Are We?, New York: Columbia University Press.
Chomsky, N. (2016b), Transcript: Noam Chomsky on the Cognitive Revolution—Part 1, https://bertrandchomsky.wordpress.com/2016/12/05/first-blog-post/
Chomsky, N. (2017), "Two Notions of Modularity," in R. Almeida and L. Gleitman (eds), On Concepts, Modules and Language: Cognitive Science at its Core, Oxford: Oxford University Press, pp. 25–40.
Chomsky, N. (2018), "Mentality beyond consciousness," in G. Caruso (ed.), Ted Honderich on Consciousness, Determinism, and Humanity, New York: Palgrave Macmillan, pp. 33–46.
Chomsky, N. and McGilvray, J. (2012), The Science of Language: Interviews with James McGilvray, Cambridge: Cambridge University Press (note: these are extensive interviews conducted by James McGilvray in 2004 and 2009, and lightly edited and published by him as this volume, which also includes some expository appendices and commentaries by him. Single bracketed interpolations are McGilvray's; double bracketed ones mine).
Chomsky, N. and Halle, M. (1968), The Sound Pattern of English, Cambridge, MA: MIT Press.
Chomsky, N. and Stemmer, B. (1999), "An On-Line Interview with Noam Chomsky: On the Nature of Pragmatics and Related Issues," Brain and Language, 68: 393–401.
Miller, G. and Chomsky, N. (1963), "Finitary models of language users," in P. Luce, R. Bush, and E. Galanter (eds), Handbook of Mathematical Psychology, Volume 2, New York: Wiley, pp. 419–92.


General References

References to Chomsky's works are listed separately.

Adams, F. and Aizawa, K. (2017), "Causal Theories of Mental Content," Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/content-causal/
Adger, D. (2003), Core Syntax, Oxford: Oxford University Press.
Adger, D. (2019), "Linguistic representation: a note on terminology vs. ontology," available at: https://ling.auf.net/lingbuzz/004616?_s=HJCyU-Jh42SAH2Jo&_k=xxn1bUWy_ex1G4yl
Adger, D. and Svenonius, P. (2011), "Features in Minimalist Syntax," in C. Boeckx, Oxford Handbook of Minimalist Syntax, Oxford: Oxford University Press, pp. 27–51.
Allott, N. (2019), "Scientific tractability and relevance theory," in K. Scott, R. Carston, and B. Clark (eds), Relevance: Pragmatics and Interpretation, Cambridge: Cambridge University Press, pp. 29–41.
Allott, N. and Rey, G. (2017), "The Many Errors of Vyvyan Evans' The Language Myth," Linguistic Review, 34 (3): 1–20.
Allott, N. and Textor, M. (2017), "Lexical modulation without concepts: Introducing the derivation proposal," Dialectica, 71 (3): 399–424.
Allott, N. and Wilson, D. (forthcoming), "Chomsky and pragmatics," in Allott, Lohndal, and Rey (forthcoming).
Allott, N., Lohndal, T., and Rey, G. (forthcoming), The Blackwell Companion to Chomsky, New York: Wiley-Blackwell.
Almeida, R. and Gleitman, L. (eds) (2017), On Concepts, Modules and Language: Cognitive Science at its Core, Oxford: Oxford University Press.
Anderson, J. (1980), Cognitive Psychology and Its Implications, San Francisco: Freeman.
Anscombe, G. (1985), "Review of Saul Kripke's Wittgenstein on Rules and Private Language," Canadian Journal of Philosophy, 1: 103–9.
Antony, L. (2002), "How to play the flute: A commentary on Dreyfus's 'Intelligence without representation'," Phenomenology and the Cognitive Sciences, 1 (4): 395–401.
Antony, L. and Hornstein, N. (2003), Chomsky and His Critics, Oxford: Blackwell.
Antony, L. and Rey, G. (2016), "Philosophy and Psychology," in H. Cappellan, T. Gendler, and J. Hawthorne (eds), Oxford Handbook on Philosophical Methodology, Oxford: Oxford University Press, pp. 584–6.
Apperly, I. (2010), Mindreaders: The Cognitive Basis of "Theory of Mind," Hove, East Sussex: Psychology Press.
Ariew, A. (2007), "Innateness," in M. Matthen and C. Stephens (eds), Handbook of the Philosophy of Biology, Amsterdam: Elsevier.
Austin, J. (1956/79), "A Plea for Excuses," reprinted in Austin (1979), pp. 175–204.
Austin, J. (1966/79), "Three Ways of Spilling Ink," The Philosophical Review, 75 (4): 427–40; reprinted in Austin (1979), pp. 272–88.
Austin, J. (1979), Philosophical Papers, 2nd edn, ed. by J. Urmson and G. Warnock, Oxford: Oxford University Press.
Aydede, M. (2019), "Pain," entry in Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/pain/


Ayer, A. J. (1936/52), Language, Truth and Logic, 2nd edn, New York: Dover.
Bach, K. (1994), "Conversational Impliciture," Mind & Language, 9: 124–62.
Bach, K. (2002), "Seeming Semantic Intuitions," in J. Keim Campbell, M. O'Rourke, and D. Shier (eds), Meaning and Truth, New York: Seven Bridges Press, pp. 21–33.
Bach, K. and Harnish, R. M. (1979), Linguistic Communication and Speech Acts, Cambridge, MA: MIT Press.
Baillargeon, R., Setoh, P., Sloane, S., Jin, K., and Bian, L. (2014), "Infant Social Cognition: Psychological and Sociomoral Reasoning," in E. Borgida and J. Bargh (eds), APA Handbook of Personality and Social Psychology, Vol. 1, Washington, DC: APA; http://labs.psychology.illinois.edu/infantlab/articles/baillargeon_setoh_sloaneetal_2014.pdf
Baker, G. and Hacker, P. (1984), Language, Sense and Nonsense: A Critical Investigation into Modern Theories of Language, Oxford: Blackwell.
Baker, M. (2001), The Atoms of Language: The Mind's Hidden Rules of Grammar, New York: Basic Books.
Baker, M. (forthcoming), "On Chomsky's Legacy in the Study of Linguistic Diversity," in Allott, Lohndal, and Rey (forthcoming).
Barber, A. (2006), "Testimony and Illusion," Croatian Journal of Philosophy, 6: 401–30.
Bealer, G. (1984), "Mind and Anti-Mind: Thinking Has No Functional Definition," Midwest Studies in Philosophy, 9: 283–328.
Berkeley, G. (1710/1999), A Treatise Concerning the Principles of Human Knowledge, Eugene, OR: University of Oregon Press.
Berlinski, D. (1988), Black Mischief: Language, Life, Logic and Luck, Boston: Harcourt, Brace and Jovanovich.
Bermudez, J. and Cahen, A. (2015), "Non-Conceptual Content," entry in Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/content-nonconceptual/
Berwick, R. and Chomsky, N. (2011), "The Biolinguistic Program: The Current State of its Evolution and Development," in A. DiSciullo and C. Boeckx (eds), The Biolinguistic Enterprise, Oxford: Oxford University Press, pp. 19–41.
Berwick, R. C., Pietroski, P., Yankama, B., and Chomsky, N. (2011), "Poverty of the Stimulus Revisited," Cognitive Science, 35 (7): 1207–42.
Bezuidenhout, A. (2002), "Truth-Conditional Pragmatics," Philosophical Perspectives, 16: 105–34.
Bianchi, A. (ed.) (2020), Language and Reality from a Naturalistic Perspective: Themes From Michael Devitt, Cham: Springer.
Bickerton, D. (2009), Adam's Tongue: How Humans Made Language and How Language Made Humans, New York: Hill and Wang.
Bilgrami, A. and Rovane, C. (2005), "Mind, Language and the Limits of Inquiry," in McGilvray (2005), pp. 181–203.
Block, N. (1978/80), "Troubles with Functionalism," in Block (1980a), pp. 268–306.
Block, N. (1980a and b), Readings in the Philosophy of Psychology (two vols), Cambridge, MA: Harvard University Press.
Block, N. (1986), "Advertisement for a Semantics for Psychology," in P. French, T. Uehling, and H. Wettstein (eds), Studies in the Philosophy of Mind, vol. 10 of Midwest Studies in Philosophy, Minneapolis, MN: University of Minnesota Press.
Block, N. (2007), "Consciousness, Accessibility, and the Mesh Between Psychology and Neuroscience," Behavioral and Brain Sciences, 30: 481–99.
Bloomfield, L. (1933), Language, New York: Henry Holt.
Bloomfield, L. (1936/85), "Words or Ideas?" in Katz (1985a), pp. 19–25.
Bock, K. (1986), "Syntactic Persistence in Language Production," Cognitive Psychology, 18: 355–87.


Bock, K. (1995), "Sentence production: from mind to mouth," in J. Miller and P. Eimas (eds), Speech, Language and Communication, San Diego: Academic Press, pp. 181–216.
Bock, K. and Loebell, H. (1990), "Framing sentences," Cognition, 35 (1): 1–39.
Boden, M. (1988), Computational Models of the Mind, Cambridge: Cambridge University Press.
Boeckx, C. (2006), Linguistic Minimalism: Origins, Concepts, Aims, Oxford: Oxford University Press.
Bogen, J. and Woodward, J. (1988), "Saving the Phenomena," The Philosophical Review, 97: 303–52.
Bolhuis, J. and Everaert, M. (2013), Birdsong, Speech, and Language: Exploring the Evolution of Mind and Brain, Cambridge, MA: MIT Press.
Bond, Z. (2008), "Slips of the Ear," in D. Pisoni and R. Remez (eds), Handbook of Speech Perception, Malden, MA: Blackwell, pp. 290–310.
Borer, H. (1984), Parametric Syntax, Dordrecht: Foris.
Borg, E. (2001), "Review of P. Horwich, Meaning," The Philosophical Review, 110: 101–4.
Borg, E. (2004), Minimal Semantics, Oxford: Oxford University Press.
Borg, E. (2012), Pursuing Meaning, Oxford: Oxford University Press.
Botha, R. (1989), Challenging Chomsky: The Generative Garden Game, Oxford: Blackwell.
Braine, M. (1971), "On two types of models of the internalization of grammars," in D. J. Slobin (ed.), The Ontogenesis of Grammar, Cambridge, MA: Academic Press, pp. 153–86.
Breheny, R. (2011), "Experimentation-based pragmatics," in W. Bublitz and N. Norrick (eds), Handbook of Pragmatics: Volume 1 Foundations of Pragmatics, Berlin: De Gruyter Mouton, pp. 561–86.
Brentano, F. (1874/1995), Psychology from an Empirical Standpoint, transl. by A. C. Rancurello, D. B. Terrell, and L. McAlister, London: Routledge, 1973 (2nd edn, intr. by Peter Simons, 1995).
Bresnan, J. (ed.) (1983), The Mental Representation of Grammatical Relations, Cambridge, MA: MIT Press.
Bresnan, J. (2001), Lexical-Functional Syntax, Oxford: Blackwell.
Brinton, L. (2000), The Structure of Modern English, Philadelphia: John Benjamins.
Brock, J. (2007), "Language abilities in Williams syndrome: a critical review," Development and Psychopathology, 19: 97–127.
Brody, M. (2002), Towards an Elegant Syntax, London: Routledge.
Bromberger, S. and Halle, M. (1992), "The Ontology of Phonology," in S. Bromberger, On What We Know We Don't Know, Chicago: University of Chicago Press, pp. 209–28.
Brooks, R. (1991), "Intelligence without Representation," Artificial Intelligence, 47: 139–59.
Brown, R. and Hanlon, C. (1970), "Derivational complexity and order of acquisition in child speech," in J. R. Hayes (ed.), Cognition and the Development of Language, New York: Wiley, pp. 11–53.
Burge, T. (1977), "Belief De Re," Journal of Philosophy, 74 (6): 338–62.
Burge, T. (1979), "Individualism and the Mental," Midwest Studies in Philosophy, 4 (1): 73–121.
Burge, T. (1984), "Individualism and Psychology," Philosophical Review, 95 (1): 3–45.
Burge, T. (1988), "Individualism and Self-Knowledge," Journal of Philosophy, 85 (11): 649–63.
Burge, T. (1993), "Mind-Body Causation and Explanatory Practice," in J. Heil and A. Mele (eds), Mental Causation, Oxford: Oxford University Press, pp. 97–120.
Burge, T. (2003), "Reply to Chomsky," in M. Hahn and B. Ramberg (eds), Reflections and Replies: Essays on the Philosophy of Tyler Burge, Cambridge, MA: MIT Press, pp. 451–70.


Burge, T. (2010), Origins of Objectivity, Oxford: Oxford University Press.
Burton-Roberts, N., Carr, P., and Docherty, G. (2000), Phonological Knowledge: Conceptual and Empirical Issues, Cambridge: Cambridge University Press.
Byrne, A. and Hilbert, D. (2003), "Color realism and color science," Behavioral and Brain Sciences, 26: 3–64.
Campbell, K. (1990), Abstract Particulars, Oxford: Blackwell.
Carey, S. (2009), Origin of Concepts, Oxford: Oxford University Press.
Carroll, L. (1893), Sylvie and Bruno Concluded, New York: Macmillan, available online at https://ia700408.us.archive.org/12/items/sylviebrunoconcl00carriala/sylviebrunoconcl00carriala.pdf
Carroll, L. (1895), "What the tortoise said to Achilles," Mind, 104 (416): 691–3.
Carruthers, P. (2000), Phenomenal Consciousness: A Naturalistic Theory, Cambridge: Cambridge University Press.
Carruthers, P. (2006), "The case for massively modular models of mind," in R. Stainton (ed.), Contemporary Debates in Cognitive Science, Oxford: Blackwell, pp. 3–21.
Carruthers, P. (2008), "Meta-cognition in Animals: A Skeptical Look," Mind & Language, 23 (1): 58–89.
Carruthers, P. (2011), The Opacity of Mind: An Integrative Theory of Self-Knowledge, Oxford: Oxford University Press.
Carston, R. (2002), Thoughts and Utterances, Oxford: Blackwell.
Carston, R. (2019), "Ad Hoc Concepts, Polysemy and the Lexicon," in K. Scott, B. Clark, and R. Carston (eds), Relevance, Pragmatics and Interpretation: Essays in Honor of Deirdre Wilson, Cambridge: Cambridge University Press.
Cartwright, N. (1983), How the Laws of Physics Lie, Oxford: Oxford University Press.
Cartwright, R. (1960/87), "Negative Existentials," Journal of Philosophy, 57 (20/21): 629–39; reprinted in Philosophical Essays, Cambridge, MA: MIT Press, 1987, pp. 21–31.
Chater, N. (2018), The Mind is Flat: The Illusion of Depth and the Improvised Mind, New York: Penguin Random House.
Chater, N. and Christiansen, M. (2008), "Language as Shaped by the Brain," Behavioral and Brain Sciences, 31: 489–558.
Chater, N. and Manning, C. D. (2006), "Probabilistic models of language processing and acquisition," Trends in Cognitive Sciences, 10 (7): 335–44.
Chater, N., Clark, A., Goldsmith, J., and Perfors, A. (2015), Empiricism and Language Learnability, Oxford: Oxford University Press.
Cherniak, C. (2005), "Innateness and Brain-Wiring: non-genomic nativism," in A. Zilhao (ed.), Cognition, Evolution and Rationality, London: Routledge.
Chisholm, R. (1957), Perceiving: A Philosophical Study, Ithaca: Cornell University Press.
Chomsky references listed independently.
Chouinard, M. and Clark, E. (2003), "Adult Reformulations of Child Errors as Negative Evidence," Journal of Child Language, 30: 637–9.
Churchland, P. M. (1981), "Eliminative Materialism and the Propositional Attitudes," Journal of Philosophy, 78: 67–90.
Churchland, P. S. (1986), Neurophilosophy, Cambridge, MA: MIT Press.
Churchland, P. M. (1989), A Neurocomputational Perspective: The Nature of Mind and the Structure of Science, Cambridge, MA: MIT Press.
Churchland, P. S. (2013), "Foreword" to Quine (1960/2013), pp. x–iv.
Cinque, G. and Kayne, R. S. (eds) (2005), The Oxford Handbook of Comparative Syntax, Oxford: Oxford University Press.


Clark, R. and Roberts, I. (1993), "A Computational Approach to Language Learnability and Language Change," Linguistic Inquiry, 24: 299–345.
Clayton, N., Dally, J., and Emery, N. (2007), "Social cognition by food-caching corvids: The western scrub-jay as a natural psychologist," Philos Trans R Soc Lond B Biol Sci, 362 (1480): 507–22.
Cohen, J. and Rogers, J. (1991), "Knowledge, Morality and Hope: The Social Thought of Noam Chomsky," New Left Review, I/187, May–June, pp. 5–27.
Collins, J. (2004), "Review of Antony and Hornstein (2003)," Erkenntnis, 60: 275–81.
Collins, J. (2004), "Faculty Disputes," Mind & Language, 19 (5): 503–35.
Collins, J. (2006), "Between a Rock and a Hard Place: A Dialogue on the Philosophy and Methodology of Generative Linguistics," Croatian Journal of Philosophy, 6: 471–505.
Collins, J. (2007a), "Linguistic Competence without Knowledge," Philosophy Compass, 2: 880–95.
Collins, J. (2007b), "Meta-scientific Eliminativism: A Reconsideration of Chomsky's Review of Skinner," British Journal for the Philosophy of Science, 58: 625–58.
Collins, J. (2008a), "Knowledge of Language Redux," Croatian Journal of Philosophy, 22: 3–43.
Collins, J. (2008b), Chomsky: A Guide for the Perplexed, London: Continuum.
Collins, J. (2008c), "A Note on Conventions and Unvoiced Syntax," Croatian Journal of Philosophy, 7: 241–7.
Collins, J. (2009), "The Perils of Content," Croatian Journal of Philosophy, 9: 259–89.
Collins, J. (2010a), "How Long Can a Sentence Be and Should Anyone Care?" Croatian Journal of Philosophy, 10: 199–207.
Collins, J. (2010b), "Naturalism in the Philosophy of Language; or Why There is No Such Thing as Language," in S. Sawyer (ed.), New Waves in Philosophy: Philosophy of Language, London: Palgrave-Macmillan, pp. 41–59.
Collins, J. (2011), The Unity of Linguistic Meaning, Oxford: Oxford University Press.
Collins, J. (2014), "Representations without Representa: Content and Illusion in Linguistic Theory," in P. Stalmaszczyk (ed.), Semantics and Beyond: Philosophical and Linguistic Inquiries, Berlin: De Gruyter, pp. 27–64.
Collins, J. (2015), "Naturalism without metaphysics," in J. Collins and E. Fischer (eds), Experimental Philosophy, Rationalism, and Naturalism: Rethinking Philosophical Method, London: Routledge, pp. 85–109.
Collins, J. (2018), "Perceiving Language: Issues Between Travis and Chomsky," in J. Collins and T. Dobler (eds), The Philosophy of Charles Travis: Language, Thought and Perception, Oxford: Oxford University Press, pp. 155–80.
Collins, J. (2020a), "Semantic and Syntactic Intuitions: Two Sides of the Same Coin," in Schindler et al., pp. 89–108.
Collins, J. (2020b), "Invariance as the Mark of the Psychological Reality of Language," in Bianchi (2020), pp. 7–44.
Collins, J. (forthcoming), "Chomsky's problem/mystery distinction," in Allott, Lohndal, and Rey (forthcoming).
Combert, J. (1992), Metalinguistic Development, New York: Harvester-Wheatsheaf.
Corbett, G. (1991), Gender, Cambridge: Cambridge University Press.
Cowart, R. and Cairns, H. (1987), "Evidence for an Anaphoric Mechanism within Syntactic Processing: Some Reference Relations Defy Semantic and Pragmatic Constraints," Memory and Cognition, 15: 318–31.
Cowie, F. (1999), What's Within: Nativism Reconsidered, Oxford: Oxford University Press.


Cowie, F. (2008), "Innateness and Language," Stanford Encyclopedia of Philosophy, online.
Crain, S. (2012), The Emergence of Meaning, Cambridge: Cambridge University Press.
Crain, S. and Steedman, M. (1985), "On Not Being Led up the Garden Path: The Use of Context by the Psychological Syntax Processor," in D. R. Dowty, L. Karttunen, and A. M. Zwicky (eds), Natural Language Parsing: Psychological, Computational, and Theoretical Perspectives, Cambridge: Cambridge University Press, pp. 320–8.
Crane, T. (2013), The Objects of Thought, Oxford: Oxford University Press.
Crimmins, M. (1998), "Hesperus and Phosphorus: Sense, Pretense, and Reference," Philosophical Review, 107: 1–48.
Culicover, P. and Jackendoff, R. (1999), "The View from the Periphery: The English Comparative Correlative," Linguistic Inquiry, 30 (4): 543–71.
Curtiss, S. (1988), "Abnormal Language Acquisition and the Modularity of Language," in F. Newmeyer (ed.), Linguistics: The Cambridge Survey, vol. II, Cambridge: Cambridge University Press, pp. 96–116.
Davidson, D. (1963), "Actions, Reasons, and Causes," Journal of Philosophy, LX (23): 685–700.
Davidson, D. (1970), "Mental Events," in L. Foster and J. Swanson (eds), Experience and Theory, London: Duckworth, pp. 207–24.
Davidson, D. (1984), Inquiries into Truth and Interpretation, Oxford: Clarendon Press.
Davidson, D. (1984a), "Thought and Talk," in his Inquiries into Truth and Interpretation, Oxford: Clarendon Press, pp. 155–79.
Davies, M. (1986), "Externality, psychological explanation, and narrow content," Aristotelian Society Supplementary Volume, 60: 263–83.
Davies, M. (1987), "Tacit knowledge and semantic theory: Can a five per cent difference matter?" Mind, 96: 441–62.
Davies, M. (1989), "Tacit knowledge and subdoxastic states," in A. George (ed.), Reflections on Chomsky, Oxford: Blackwell, pp. 131–52.
Demopoulos, W. and Matthews, R. (1983), "On the hypothesis that grammars are internally represented," Behavioral and Brain Sciences, 6 (3): 405–6.
Dennett, D. (1969), Content and Consciousness, London: Routledge.
Dennett, D. (1971/78), "Intentional Systems," in Dennett (1978), pp. 3–24.
Dennett, D. (1975), "Why the Law of Effect Won't Go Away," Journal for the Theory of Social Behavior, 5 (2): 169–88.
Dennett, D. (1978), Brainstorms: Philosophical Essays on Mind and Psychology, Cambridge, MA: MIT Press.
Dennett, D. (1987), The Intentional Stance, Cambridge, MA: MIT Press.
Dennett, D. (1991), Consciousness Explained, Boston, MA: Little Brown & Co.
Dennett, D. (1995), "Superficialism vs. Hysterical Realism," Philosophical Topics, 22 (1/2): 530–6.
DePaul, M. and Ramsey, W. (1998), Re-thinking Intuition: The Psychology of Intuition and its Role in Philosophy, Lanham, MD: Rowman and Littlefield.
Derrida, J. (1987), Languages of the Unsayable: The Play of Negativity in Literature and Literary Theory, New York: Columbia University Press.
Descartes, R. (1637/1984), "Discourse on the Method," in J. Cottingham, R. Stoothoff, and D. Murdoch (eds and trans), The Philosophical Writings of Descartes, Vol. II, Cambridge: Cambridge University Press.
Devitt, M. (1981), Designation, New York: Columbia University Press.

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

Devitt, M. (1984/91/97), Realism and Truth, Princeton, NJ: Princeton University Press. (The 1997 Princeton edition is the 1991 Blackwell 2nd edition together with a new Afterword. Its index is a reprint of the 1991 index and so does not cover the Afterword.)
Devitt, M. (1996), Coming to Our Senses, Cambridge: Cambridge University Press.
Devitt, M. (1998/2010b), "Naturalism and the A Priori," Philosophical Studies, 92: 45–65; reprinted in Devitt (2010b), pp. 253–70.
Devitt, M. (2002), "Meaning and Use," Philosophy and Phenomenological Research, 65: 106–21.
Devitt, M. (2003), "Linguistics Is Not Psychology," in A. Barber (ed.), Epistemology of Language, Oxford: Oxford University Press, pp. 107–39.
Devitt, M. (2006a), Ignorance of Language, Oxford: Clarendon Press.
Devitt, M. (2006b), "Defending Ignorance of Language: Responses to the Dubrovnik Papers," Croatian Journal of Philosophy, VI (18): 571–609.
Devitt, M. (2006c), "Intuitions in Linguistics," British Journal for the Philosophy of Science, 57: 481–513.
Devitt, M. (2008a), "Explanation and Reality in Linguistics," Croatian Journal of Philosophy, VIII (23): 203–31.
Devitt, M. (2008b), "A Response to Collins' Note on Conventions and Unvoiced Syntax," Croatian Journal of Philosophy, VIII (23): 249–55.
Devitt, M. (2010a), "Naturalism in Philosophy of Language," in S. Sawyer (ed.), New Waves in Philosophy of Language, New York: Palgrave Macmillan, pp. 41–59.
Devitt, M. (2010b), Putting Metaphysics First, Oxford: Oxford University Press.
Devitt, M. (2011), "No Place for the A Priori," in M. Schaffer and M. Veber (eds), What Place for the A Priori?, Chicago and LaSalle: Open Court, pp. 9–32.
Devitt, M. (2013a), "Linguistic Intuitions are not 'the Voice of Competence'," in M. Haug (ed.), Philosophical Methodology: The Armchair or the Laboratory?, London: Routledge.
Devitt, M. (2013b), "Responding to a Hatchet Job: Ludlow's Review of Ignorance of Language," Discusiones Filosóficas, 14 (23).
Devitt, M. (2014), "Linguistic intuitions and cognitive penetrability," Baltic International Yearbook of Cognition, Logic and Communication, 9 (1): 4.
Devitt, M. (2015), "Should Proper Names Still Seem So Problematic?," in A. Bianchi (ed.), On Reference, Oxford: Oxford University Press, pp. 108–43.
Devitt, M. (2020a), "Linguistic Intuitions: A Response to Gross and Rey," in Schindler et al (2020), pp. 51–68.
Devitt, M. (2020b), "Stirring the Possum: Responses to the Bianchi Papers," in A. Bianchi (ed.), Language and Reality from a Naturalistic Perspective: Themes From Michael Devitt, Cham: Springer, pp. 371–455.
Devitt, M. (in prep), Overlooking Conventions: The Trouble with Linguistic Pragmatism, book proposal submitted to Oxford University Press, December 2018.
Devitt, M. and Sterelny, K. (1987/99), Language and Reality, 2nd edn, Cambridge, MA: MIT Press.
Dijkstra, E. (1984), Lecture to the Association for Computing Machinery, South Central Regional Conference, http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD898.html
Dik, S. (1997), The Theory of Functional Grammar, 2nd edn, Berlin: Mouton de Gruyter.
Dilthey, W. (1927/76), "The Understanding of Other Persons and their Life-Expressions," in Patrick Gardiner (ed.), Theories of History, trans. J. Knehl, Glencoe, IL: Free Press, 1959, pp. 213–15.
Donnellan, K. (1966), "Reference and Definite Descriptions," Philosophical Review, 75: 281–304.
Dretske, F. (1969), Seeing and Knowing, London: Routledge and Kegan Paul.
Dretske, F. (1981), Knowledge and the Flow of Information, Cambridge, MA: MIT Press.
Dretske, F. (1988), Explaining Behavior: Reasons in a World of Causes, Cambridge, MA: MIT Press.
Dreyfus, H. (2002), "Intelligence without representation—Merleau-Ponty's critique of mental representation: The relevance of phenomenology to scientific explanation," Phenomenology and the Cognitive Sciences, 1 (4): 367–83.
Drożdżowicz, A. (2015), Investigating Utterance Meaning, PhD dissertation, Department of Philosophy, Classics, and History of Art and Ideas, University of Oslo.
Drożdżowicz, A. (forthcoming), "Speakers' intuitive judgements about meaning—the voice of performance view," Review of Philosophy and Psychology.
Duhem, P. (1906/54), The Aim and Structure of Physical Theory, trans. P. Wiener, Princeton, NJ: Princeton University Press.
Dummett, M. (1981), "Objections to Chomsky," London Review of Books, 3 (16): 5–6.
Dwyer, S. and Pietroski, P. (1996), "Believing in Language," Philosophy of Science, 63: 338–73.
Dwyer, S., Huebner, B., and Hauser, M. (2010), "The Linguistic Analogy: Motivations, results, and speculations," Topics in Cognitive Science, 2: 486–510.
Egan, F. (1992), "Individualism, Computation and Conceptual Content," Mind, 101: 443–59.
Egan, F. (2014), "How to Think about Mental Content," Philosophical Studies, 170: 115–35.
Eimas, P., Siqueland, E., Jusczyk, P., and Vigorito, J. (1971), "Speech perception in infants," Science, 171: 303–6.
Elman, J. (1991), "Distributed representations, simple recurrent networks, and grammatical structure," Machine Learning, 7: 195–225.
Elman, J. (1993), "Learning and development in neural networks: The importance of starting small," Cognition, 48: 71–99.
Epstein, S., Kitahara, H., Kawashima, R., and Groat, E. (1998), A Derivational Approach to Syntactic Relations, Oxford: Oxford University Press.
Evans, G. (1981), "Semantic Theory and Tacit Knowledge," reprinted in his Collected Papers (1985), Oxford: Clarendon Press.
Evans, G. (1982), Varieties of Reference, ed. J. McDowell, Oxford: Oxford University Press.
Evans, V. (2014), The Language Myth, Cambridge: Cambridge University Press.
Evans, N. and Levinson, S. (2009), "The Myth of Language Universals: Language Diversity and Its Importance for Cognitive Science," Behavioral and Brain Sciences, 32 (5): 429–48.
Everett, D. (2012), Language: The Cultural Tool, New York: Pantheon Books.
Fernández, E. and Cairns, H. (2011), Fundamentals of Psycholinguistics, Malden, MA: Wiley-Blackwell.
Ferreira, F., Christianson, K., and Hollingworth, A. (2001), "Misinterpretations of garden path sentences: Implications for models of reanalysis," Journal of Psycholinguistic Research, 30: 3–20.
Fillmore, C., Kay, P., and O'Connor, M. (1988), "Regularity and Idiomaticity in Grammatical Constructions: The Case of Let Alone," Language, 64 (3): 501–38.
Fitch, W. (2010), "Three meanings of 'recursion': key distinctions for biolinguistics," in Larson et al. (2010), pp. 73–90.
Fodor, J. A. (1968), Psychological Explanation, New York: Random House.
Fodor, J. A. (1970), "Three reasons for not deriving 'kill' from 'cause to die'," Linguistic Inquiry, 1: 429–38.
Fodor, J. A. (1972), "Some reflections on L. S. Vygotsky's Thought and Language," Cognition, 1: 83–95.
Fodor, J. A. (1975), The Language of Thought, New York: Crowell.
Fodor, J. A. (1978/81), "Tom Swift and his procedural grandmother," in Fodor (1981a), pp. 204–24.
Fodor, J. A. (1981a), RePresentations: Philosophical Essays on the Foundations of Cognitive Science, Cambridge, MA: MIT Press.
Fodor, J. A. (1981b), "The present status of the innateness controversy," in Fodor (1981a), pp. 257–316.
Fodor, J. A. (1981c), "Three cheers for propositional attitudes," in Fodor (1981a), pp. 100–23.
Fodor, J. A. (1981d), "Introduction: What Linguistics is Talking About," in Block (1981b), pp. 197–207.
Fodor, J. A. (1980/90), "Psychosemantics, or where do truth conditions come from?," in W. Lycan (ed.) (1990), Mind and Cognition, Oxford: Blackwell, pp. 312–18.
Fodor, J. A. (1983), The Modularity of Mind, Cambridge, MA: MIT Press.
Fodor, J. A. (1986), "Why paramecia don't have mental representations," Midwest Studies in Philosophy, X (1): 3–23.
Fodor, J. A. (1987), Psychosemantics, Cambridge, MA: MIT Press.
Fodor, J. A. (1990), A Theory of Content and Other Essays, Cambridge, MA: MIT Press.
Fodor, J. A. (1991), "The dogma that didn't bark (a fragment of a naturalized epistemology)," Mind, New Series, 100 (2): 201–20.
Fodor, J. A. (1998), Concepts: Where Cognitive Science Went Wrong, Oxford: Oxford University Press.
Fodor, J. A. (2000), The Mind Doesn't Work That Way, Cambridge, MA: MIT Press.
Fodor, J. A. (2010), LOT2: The Language of Thought Revisited, Oxford: Oxford University Press.
Fodor, J. A. and Katz, J. (1963), "The structure of a semantic theory," Language, 39: 170–210.
Fodor, J. A. and Lepore, E. (1992), Holism: A Shopper's Guide, London: Blackwell.
Fodor, J. A. and Pylyshyn, Z. (1981), "How direct is visual perception? Some reflections on Gibson's 'Ecological Approach'," Cognition, 9: 139–96.
Fodor, J. A., Bever, T., and Garrett, M. (1974), The Psychology of Language, New York: McGraw Hill.
Fodor, J. D. (1978), "Parsing Strategies and Constraints on Transformations," Linguistic Inquiry, 9 (3): 427–73.
Fodor, J. D. (2001), "Learnability Theory: Triggers for Parsing with," in E. Klein and G. Martohardjono (eds), The Development of Second Language Grammars: a Generative Approach, Philadelphia, PA: John Benjamins, pp. 363–403.
Fodor, J. D. (2009), "Syntax Acquisition: an Evaluation Metric After All?," in Piattelli-Palmarini (2009), pp. 256–77.
Fodor, J. D. (unpublished), "A treelet library," talk at the Norwegian Summer Institute on Language and Mind, University of Oslo, August 2017.
Fodor, J. D. and Crowther, C. (2002), "Understanding stimulus poverty arguments," The Linguistic Review, 19: 105–45.
Fodor, J. D., Nickels, S., and Schott, E. (2017), "Center-embedded Sentences: What's Pronounceable is Comprehensible," in Roberto de Almeida and Lila Gleitman (eds), Minds on Language and Thought, Oxford: Oxford University Press.
Føllesdal, D. (2013), preface to Word and Object, 2nd edn (ed. Dagfinn Føllesdal), Cambridge, MA: MIT Press.
Føllesdal, D. and Quine, D. (eds) (2008), Quine in Dialogue, Cambridge, MA: Harvard University Press.
Frankish, K. (2017), Illusionism as a Theory of Consciousness, Exeter (UK): Imprint Academic Ltd.
Frazier, L. and Fodor, J. D. (1978), "The Sausage Machine: a New Two-Stage Parsing Model," Cognition, 6: 291–325.
Frege, G. (1879/1997), Begriffsschrift, reprinted in M. Beaney (ed.) (1997), The Frege Reader, Oxford: Blackwell.
Frege, G. (1884/1974), Die Grundlagen der Arithmetik, Breslau: Wilhelm Koebner. English translation, The Foundations of Arithmetic: A logico-mathematical enquiry into the concept of number, by J. L. Austin, Oxford: Blackwell (second revised edition, 1974).
Frege, G. (1892/1980), "Über Sinn und Bedeutung," Zeitschrift für Philosophie und philosophische Kritik, 100: 25–50. Translated as "On Sense and Reference" by M. Black in Translations from the Philosophical Writings of Gottlob Frege, P. Geach and M. Black (eds and trans), Oxford: Blackwell (third edition, 1980).
Freidin, R. (2012), "A brief history of generative grammar," in G. Russell and D. Graff Fara (eds), The Routledge Companion to the Philosophy of Language, New York: Routledge, pp. 895–916.
Freidin, R. and Vergnaud, J.-R. (2001), "Exquisite connections: Some remarks on the evolution of linguistic theory," Lingua, 111: 639–66.
Freud, S. (1901/60), The Psychopathology of Everyday Life, ed. J. Strachey, trans. A. Tyson, New York: Norton.
Fromkin, V. (ed.) (1980), Errors in Linguistic Performance: Slips of the Tongue, Ear, Pen, and Hand, New York: Academic Press.
Fudge, E. (1990), "Language as Organized Sound: Phonology," in N. E. Collinge (ed.), An Encyclopedia of Language, London: Routledge.
Gallistel, C. (1990), The Organization of Learning, Cambridge, MA: Bradford Books/MIT Press.
Gallistel, C. and King, A. (2009), Memory and the Computational Brain: Why Cognitive Science will Transform Neuroscience, New York: Wiley-Blackwell.
Garnsey, S. M., Tanenhaus, M. K., and Chapman, R. M. (1989), "Evoked Potentials in the Study of Sentence Comprehension," Journal of Memory and Language, 29: 181–200.
Gethin, A. (1990), Anti-linguistics: a Critical Assessment of Modern Linguistic Theory and Practice, Oxford: Intellect.
Gibson, E. and Fedorenko, E. (2010), "Weak quantitative standards in linguistics research," Trends in Cognitive Sciences, 14: 233–4.
Goldberg, A. (1995), Constructions: A Construction Grammar Approach to Argument Structure, Chicago: University of Chicago Press.
Goldberg, A. (2003), "Constructions: A New Theoretical Approach to Language," Trends in Cognitive Sciences, 7 (5): 219–24.
Goldberg, A. (2006), "Syntactic Constructions," in K. Brown (ed.), Encyclopedia of Language and Linguistics, 2nd edn, Vol. 12, Oxford: Oxford University Press, pp. 379–83.
Goldsmith, J. (1995), "Phonological Theory," in J. Goldsmith (ed.), The Handbook of Phonological Theory, Oxford: Blackwell, pp. 1–23.
Gomez, R. and Gerken, L. (1999), "Artificial grammar learning by 1-year-olds leads to specific and abstract knowledge," Cognition, 70: 109–35.
Goodman, N. (1949), "On Likeness of Meaning," Analysis, 10 (1): 1–7.
Goodman, N. (1951/77), The Structure of Appearance, 3rd edn, Cambridge, MA: Harvard University Press.
Goodman, N. (1953), "On Some Differences About Meaning," Analysis, 13 (4): 90–6.
Goodman, N. (1955/83), Fact, Fiction and Forecast, 4th edn, Cambridge, MA: Harvard University Press.
Goodman, N. (1969), "The Emperor's New Ideas," in S. Hook (ed.), Language and Philosophy, New York: New York University Press.
Goodman, N. (1976), Languages of Art, Indianapolis: Hackett Publishing Company.
Goodman, N. (1978), Ways of Worldmaking, Indianapolis: Hackett Publishing.
Graves, C., Katz, J., Nishiyama, Y., Soames, S., Stecker, R., and Tovey, P. (1973), "Tacit Knowledge," Journal of Philosophy, 70: 318–30.
Grice, H. (1989), "Logic and Conversation," in his Studies in the Way of Words, Cambridge, MA: Harvard University Press.
Grimes, J. (1996), "On the failure to detect changes in scenes across saccades," in K. Akins (ed.), Perception (Vancouver Studies in Cognitive Science, vol. 2), New York: Oxford University Press, pp. 89–110.
Gross, S. (2005), "The Nature of Semantics: On Jackendoff's Arguments," The Linguistic Review, 22: 249–70.
Gross, S. (forthcoming), "Probabilistic Representations in Perception: Are There Any, and What Would They Be?," Mind & Language.
Gross, S. and Culbertson, J. (2011), "Revisited Linguistic Intuitions," British Journal for the Philosophy of Science, 62: 639–56.
Gross, S. and Rey, G. (2012), "Innateness," in E. Margolis and S. Laurence (eds), Oxford Handbook on Cognitive Science, Oxford: Oxford University Press.
Gutting, G. (1998), "'Rethinking Intuition': a Historical and Metaphilosophical Introduction," in DePaul and Ramsey (1998), pp. 3–16.
Hacker, P. (1990), Wittgenstein: Meaning and Mind, Oxford: Blackwell.
Hacking, I. (1994), "Chomsky and his Critics," in Otero (1994), vol. II: 381–90.
Haegeman, L. (1994), Introduction to Government and Binding Theory, 2nd edn, Oxford: Blackwell.
Hale, M. and Reiss, C. (2008), The Phonological Enterprise, Oxford: Oxford University Press.
Hall, G. (ed.) (2004), Weaving a Lexicon, Cambridge, MA: MIT Press.
Halle, M. (1954), "The strategy of phonemics," Word, 10: 197–209.
Halle, M. (1983), "On Distinctive Features and Their Articulatory Implementation," Natural Language & Linguistic Theory, 1 (1): 91–105.
Hamlin, J. K. and Wynn, K. (2011), "Young infants prefer prosocial to antisocial others," Cognitive Development, 26 (1): 30–9.
Handbook of the International Phonetic Association (1999), Cambridge: Cambridge University Press.
Hansen, C. and Rey, G. (2016), "Files and Singular Thoughts Without Objects or Acquaintance: The Prospects of Recanati's (and Others') 'Actualism'," Review of Philosophy and Psychology, 7 (2): 421–36.
Hansen, N. (2018), "Just What Is It That Makes Travis's Examples So Different, So Appealing?," in John Collins and Tamara Dobler (eds), The Philosophy of Charles Travis: Language, Thought, and Perception, Oxford: Oxford University Press, pp. 113–34.
Hardcastle, W., Laver, J., and Gibbon, F. (eds) (2010), Handbook of Phonetic Sciences, Oxford: Blackwell.
Hardin, C. L. (1988/93), Color for Philosophers, Indianapolis: Hackett.
Hardin, C. L. (2008), "Color qualities and the physical world," in E. Wright (ed.), The Case for Qualia, Cambridge, MA: MIT Press, pp. 143–54.
Harley, H. (2012), "Lexical decomposition in modern syntactic theory," in Markus Werning, Wolfram Hinzen, and Edouard Machery (eds), The Oxford Handbook of Compositionality, Oxford: Oxford University Press, pp. 328–50.
Harman, G. (1963/87), "Generative grammars without transformation rules: a defense of phrase structure," Language, 39: 597–616. Reprinted in Walter J. Savitch, Emmon Bach, William Marsh, and Gila Safran-Naveh (eds), The Formal Complexity of Natural Language (Dordrecht, Holland: D. Reidel, 1987), pp. 87–116.
Harman, G. (1965), "The Inference to the Best Explanation," Philosophical Review, 74: 88–95.
Harman, G. (1967), "Psychological aspects of the theory of syntax," Journal of Philosophy, 64: 75–87.
Harman, G. (ed.) (1974/82), On Noam Chomsky, 2nd edn, Amherst, MA: University of Massachusetts Press.
Harman, G. (1983), "Internally represented grammars," Behavioral and Brain Sciences, 3: 408.
Harman, G. (1990), "The Intrinsic Quality of Experience," in J. Tomberlin (ed.), Action Theory and Philosophy of Mind: Philosophical Perspectives, vol. 4, Atascadero, CA: Ridgeview Publishing.
Harris, D. (forthcoming), "Semantics without semantic content," Mind & Language.
Harris, R. (1993), The Linguistic Wars, Oxford: Oxford University Press.
Harris, Z. (1951), Methods in Structural Linguistics, Chicago: University of Chicago Press (reissued as Structural Linguistics).
Hartsuiker, R. (2014), "Monitoring and Control of the Production System," in The Oxford Handbook of Linguistic Production, Oxford: Oxford University Press, pp. 417–36.
Hateren, J. H. van, Srinivasan, M. V., and Wait, P. B. (1990), "Pattern recognition in bees: Orientation discrimination," Journal of Comparative Physiology A, 167: 649–54.
Hauser, M. (1997), The Evolution of Communication, Cambridge, MA: MIT Press.
Hauser, M. and Watumull, J. (2017), "The Universal Generative Faculty: The source of our expressive power in language, mathematics, morality, and music," Journal of Neurolinguistics, 43: 78–94.
Hauser, M., Chomsky, N., and Fitch, W. (2002), "The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?," Science, 298: 1569–79.
Hawthorne, J. and LePore, E. (2011), "On Words," Journal of Philosophy, 108 (9): 447–85.
Heyes, C. (2018), Cognitive Gadgets, Cambridge, MA: Harvard University Press.
Heck, R. (2000), "Non-conceptual content and the space of reasons," Philosophical Review, 109: 483–523.
Heck, R. (2007), "Are there different kinds of content?," in J. Cohen and B. McLaughlin (eds), Contemporary Debates in the Philosophy of Mind, Oxford: Blackwell.
Hegel, G. (1812/2010), The Science of Logic, trans. George di Giovanni, Cambridge: Cambridge University Press.
Hegel, G. (1820/1942), The Philosophy of Right, trans. T. M. Knox, Oxford: Clarendon Press.
Heidegger, M. (1927/82), The Basic Problems of Phenomenology, trans. Albert Hofstadter, Bloomington: Indiana University Press.
Hempel, C. (1962), "Deductive-Nomological vs. Statistical Explanation," in H. Feigl and G. Maxwell (eds), Minnesota Studies in the Philosophy of Science, Vol. III, Minneapolis: University of Minnesota Press, pp. 98–169.
Hempel, C. (1969), "Reduction: Ontological and linguistic facets," in S. Morgenbesser, P. Suppes, and M. White (eds), Philosophy, Science, and Method: Essays in Honor of Ernest Nagel, New York: St. Martin's Press, pp. 179–99.
Hempel, C. (1980), "Comments on Goodman's Ways of Worldmaking," Synthese, 45: 193–9.
Higginbotham, J. (1983a), "Logical Form, Binding and Nominals," Linguistic Inquiry, 14: 395–420.
Higginbotham, J. (1983b), "The Logical Form of Perceptual Reports," Journal of Philosophy, 80: 100–27.
Higginbotham, J. (1991a), "Remarks on the Metaphysics of Linguistics," Linguistics and Philosophy, 14: 555–66.
Higginbotham, J. (1991b), "The autonomy of syntax and semantics," in J. L. Garfield (ed.), Modularity in Knowledge Representation and Natural-Language Understanding, Cambridge, MA: MIT Press, pp. 119–31.
Higginbotham, J. (1994), "The Autonomy of Syntax and Semantics," in Otero (1994), vol. I, pp. 458–71.
Hill, A. (ed.) (1958/62), Proceedings of the Third Texas Conference on Problems of Linguistic Analysis in English, Austin: University of Texas Press (the conference took place in 1958).
Hobbes, T. (1655/1839), De Corpore, London: Molesworth.
Hockett, C. (1955), A Manual of Phonology, Baltimore, MD: Waverly Press.
Hoff, E. (2009), Language Development, 3rd edn, Belmont, CA: Wadsworth.
Hofmeister, P., Jaeger, T., Arnon, I., Sag, I., and Snider, N. (2013), "The source ambiguity problem: Distinguishing the effects of grammar and processing on acceptability judgments," Language and Cognitive Processes, 28: 48–87.
Holt, L. and Lotto, A. (2010), "Speech Perception as Categorization," Attention, Perception, & Psychophysics, 72 (5): 1218–27.
Horn, L. (1984), "Toward a new taxonomy for pragmatic inference: Q-based and R-based implicature," in D. Schiffrin (ed.), Meaning, Form, and Use in Context: Linguistic Applications, Washington, DC: Georgetown University Press, pp. 11–42.
Horn, L. (2001), A Natural History of Negation, Stanford: CSLI Publications (originally published by University of Chicago Press, 1989).
Hornsby, J. (1997), Simple Mindedness: In Defense of Naive Naturalism in Philosophy of Mind, Cambridge, MA: Harvard University Press.
Hornstein, N. (2005), Understanding Minimalism, Cambridge: Cambridge University Press.
Horty, J. (2012), Reasons as Defaults, Oxford: Oxford University Press.
Horwich, P. (1998), Meaning, Oxford: Oxford University Press.
Horwich, P. (2004), "Wittgenstein's Meta-philosophical Development," in his From a Deflationary Point of View, Oxford: Oxford University Press, pp. 159–70.
Horwich, P. (2005), Reflections on Meaning, Oxford: Oxford University Press.
Horwich, P. (2009), "Wittgenstein's Definition of 'Meaning' as 'Use'," in D. Whiting (ed.), The Later Wittgenstein on Language, London: Palgrave Macmillan.
Householder, F. (1965), "On Some Recent Claims in Phonological Theory," Journal of Linguistics, 1: 13–34.
von Humboldt, W. (1836/2000), "On the Diversity of Human Language Construction and its Influence on the Mental Development of the Human Species," ed. M. Losonsky, trans. P. Heath, Cambridge: Cambridge University Press. See also: http://www.cambridge.org/us/academic/subjects/philosophy/philosophy-texts/humboldt-language-diversity-human-language-construction-and-its-influence-mental-development-human-species-2ndedition#HwoxEeolGUPbQ4hG.99
Hursthouse, R. (1991), "Arational Actions," The Journal of Philosophy, 88 (2): 57–68.
Husserl, E. (1913/82), Ideas Pertaining to a Pure Phenomenology and to a Phenomenological Philosophy—First Book: General Introduction to a Pure Phenomenology, trans. F. Kersten, The Hague: Nijhoff, 1982.
Israel, D. (1991), "Katz and Postal on Realism," Linguistics and Philosophy, 14: 567–74.
Israel, M. (2011), The Grammar of Polarity: Pragmatics, Sensitivity and the Logic of Scales, Cambridge: Cambridge University Press.
Jackendoff, R. (1969), Some Rules of Semantic Interpretation for English, doctoral dissertation, MIT.
Jackendoff, R. (1972), Semantic Interpretation in Generative Grammar, Cambridge, MA: MIT Press.
Jackendoff, R. (1983), Semantics and Cognition, Cambridge, MA: MIT Press.
Jackendoff, R. (1987), Consciousness and the Computational Mind, Cambridge, MA: MIT Press.
Jackendoff, R. (1991), "The Problem of Reality," Nous, 25: 411–34.
Jackendoff, R. (1992), Languages of the Mind: Essays on Mental Representation, Cambridge, MA: MIT Press.
Jackendoff, R. (1993), "The Paradox of Language Acquisition," Teaching Thinking and Problem Solving, 13 (5): 1–6. Reprinted in C.-P. Otero (ed.), Noam Chomsky: Critical Assessments, London: Routledge, pp. 445–51.
Jackendoff, R. (1997), The Architecture of the Language Faculty, Cambridge, MA: MIT Press.
Jackendoff, R. (2002), Foundations of Language, Oxford: Oxford University Press.
Jackendoff, R. (2003), "Locating Meaning in the Mind (Where It Belongs)," in Stainton (2006), pp. 219–36.
Jackendoff, R. (2004), "Towards Better Mutual Understanding," Behavioral and Brain Sciences, 26: 695–702.
Jackendoff, R. (2006), "Locating Meaning in the Mind (Where It Belongs)," in R. Stainton (ed.), Contemporary Debates in Cognitive Science, Oxford: Blackwell, pp. 237–55.
Jackendoff, R. and Culicover, P. (1999), "The View from the Periphery: The English Comparative Correlative," Linguistic Inquiry, 30 (4): 543–71.
Jakobson, R. (1960), "Closing statements: Linguistics and poetics," in T. A. Sebeok (ed.), Style in Language, Cambridge, MA: MIT Press, pp. 350–73.
Jakobson, R. and Halle, M. (1968), "Phonology in Relation to Phonetics," in B. Malmberg (ed.), Manual of Phonetics, Amsterdam: North Holland Publishing.
Jespersen, O. (1924/63), The Philosophy of Grammar, London: Allen and Unwin.
Johnson, K. (1987), Contact Mechanics, Cambridge: Cambridge University Press.
Kahneman, D. (2011), Thinking, Fast and Slow, New York: Farrar, Straus & Giroux.
Kant, I. (1785/1948), The Moral Law: Kant's Groundwork of the Metaphysic of Morals, trans. H. Paton, London: Hutchinson University Library. (Pagination is from the 2nd edition published in Kant's lifetime.)
Kant, I. (1787/1968), Critique of Pure Reason, trans. N. K. Smith, New York: Macmillan.
Kant, I. (1788/1956), Critique of Practical Reason, trans. Lewis White Beck, Indianapolis: Bobbs-Merrill.
Kant, I. (1788/2004), Critique of Practical Reason, trans. T. Abbott, Chicago: Courier.
Karmiloff-Smith, A. (1986), "From meta-processes to conscious access: evidence from children's meta-linguistic and repair data," Cognition, 23: 95–147.
Katz, J. (1972), Semantic Theory, New York: Harper and Row.
Katz, J. (1974), "Where Things Now Stand with the Analytic–Synthetic Distinction," Synthese, 28: 283–319.
Katz, J. (1979), "Neo-Classical Theory of Reference," in P. French, T. Uehling, and H. Wettstein (eds), Contemporary Perspectives on the Philosophy of Language, Minneapolis: University of Minnesota Press, pp. 103–24.
Katz, J. (ed.) (1985a), The Philosophy of Linguistics, Oxford: Oxford University Press.
Katz, J. (1985b), "Introduction" to Katz (1985a).
Katz, J. (1985c), "An Outline of Platonist Grammar," in Katz (1985a).
Katz, J. (1990), The Metaphysics of Meaning, Cambridge, MA: MIT Press.
Katz, J. (1997), "Analyticity, necessity, and the epistemology of semantics," Philosophy and Phenomenological Research, 57 (1): 1–28.
Katz, J. and Postal, P. (1991), "Realism vs. Conceptualism in Linguistics," Linguistics and Philosophy, 14: 515–54.
Kayne, R. (1975), French Syntax: The Transformational Cycle, Cambridge, MA: MIT Press.
Keil, F. (1989), Concepts, Kinds, and Cognitive Development, Cambridge, MA: MIT Press.
Kemmerling, A. (2004), "'As it were pictures'—On the Two-Faced Nature of Cartesian Ideas," in R. Schumacher (ed.), Perception and Reality: from Descartes to the Present, Berlin/New York: Mentis, pp. 43–68.
Kempen, G. (2000), "Human grammatical coding," manuscript submitted for publication.
Kingston, J. (2007), "The phonetics–phonology interface," in P. de Lacy (ed.), Handbook of Phonology, Cambridge: Cambridge University Press, pp. 435–56.
Kinzler, K. D., Shutts, K., and Spelke, E. S. (2012), "Language-based social preferences among children in South Africa," Language Learning and Development, 8: 215–32.
Klein, D. and Manning, C. (2003), "Accurate Unlexicalized Parsing," Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, Vol. I: 423–30.
Klima, E. (1964), "Negation in English," in J. A. Fodor and J. J. Katz (eds), The Structure of Language, Upper Saddle River, NJ: Prentice-Hall.
Knill, D. and Richards, W. (eds) (2012), Perception as Bayesian Inference, Cambridge: Cambridge University Press.
Knoll, A. (2015), The Explanatory Role of Intentional Content in Cognitive Science, PhD thesis, Department of Philosophy, University of Maryland, College Park.
Knoll, A. and Rey, G. (2017), "Arthropod Intentionality," in J. Beck (ed.), Routledge Handbook of Animal Minds, London: Routledge.
Kosslyn, S. (1986), Image and Mind, Cambridge, MA: Harvard University Press.
Koyré, A. (1943), "Galileo and the Scientific Revolution of the Seventeenth Century," Philosophical Review, 52 (4): 333–48.
Kremer, R. (1990), The Thermodynamics of Life and Experimental Physiology, 1770–1880, New York: Garland.
Kripke, S. (1972/1982), Naming and Necessity, Cambridge, MA: Harvard University Press.
Kripke, S. (1982), Wittgenstein on Rules and Private Language, Oxford: Blackwell.
Kuehni, R. (2001), "Focal colors and unique hues," Color: Research and Application, 26 (2): 171–2.
Kuhn, T. (1962), The Structure of Scientific Revolutions, Chicago: University of Chicago Press.
Ladd, D. (2011), "Phonetics in Phonology," in J. Goldsmith, J. Riggle, and A. C. L. Yu (eds), The Handbook of Phonological Theory, 2nd edn, Oxford: Blackwell, pp. 348–73.
Lakoff, G. (1971), "On Generative Semantics," in D. Steinberg and L. Jacobovits (eds), Semantics, Cambridge: Cambridge University Press, pp. 232–96.
Lakoff, G. (1974), "Syntactic Amalgams," in Papers from the Tenth Annual Meeting of the Chicago Linguistic Society, 1974; http://georgelakoff.files.wordpress.com/2011/01/syntactic-amalgams-lakoff-1974.pdf
Langacker, R. (1987), Foundations of Cognitive Grammar, 2 vols, Stanford, CA: Stanford University Press.
Langendoen, D. and Postal, P. (1991), "Sets and Sentences," in Katz (1985a), pp. 227–48.
Lappin, S. and Shieber, S. (2007), "Machine learning theory and practice as a source of insight into universal grammar," Journal of Linguistics, 43: 393–427.
Larson, R., Déprez, V., and Yamakido, H. (eds) (2010), The Evolution of Human Language, Cambridge: Cambridge University Press.
Lashley, K. (1948/51), "The problem of serial order in behavior," in L. Jeffress (ed.), Cerebral Mechanisms in Behavior: the Hixon Symposium, New York: Wiley, pp. 112–35.
Lasnik, H. (2001), "The Minimalist Program in Syntax," Trends in Cognitive Sciences, 6 (10): 432–7.
Lasnik, H. (2005), "Grammar, Levels and Biology," in J. McGilvray (ed.), The Cambridge Companion to Chomsky, Cambridge: Cambridge University Press, pp. 60–83.
Lasnik, H. and Lohndal, T. (2013), "Brief overview of the History of Generative Syntax," in M. den Dikken (ed.), The Cambridge Handbook of Generative Syntax, Cambridge: Cambridge University Press, pp. 26–60.
Lasnik, H. and Uriagereka, J. (2002), "On the Poverty of the Challenge," The Linguistic Review, 19: 147–50.
Lasnik, H., with Depiante, M. and Stepanov, A. (2000), Syntactic Structures Revisited, Cambridge, MA: MIT Press.
Lasnik, H. and Uriagereka, J., with Boeckx, C. (2005), A Course in Minimalist Syntax, Oxford: Blackwell.
Laurence, S. and Margolis, E. (2001), "The Poverty of Stimulus Argument," British Journal for the Philosophy of Science, 52: 217–56.
Laver, J. D. M. (1980), "Monitoring systems in the neurolinguistic control of speech production," in V. A. Fromkin (ed.), Errors in Linguistic Performance: Slips of the Tongue, Ear, Pen, and Hand, New York: Academic Press.
Laver, J. (1994), Principles of Phonetics, Cambridge: Cambridge University Press.
Leddon, E. and Lidz, J. (2006), "Reconstruction Effects in Child Language," in D. Bamman, T. Magnitskaia, and C. Zaller (eds), BUCLD 30, Somerville, MA: Cascadilla Press, pp. 328–39.
Legate, J. and Yang, C. (2002), "Empirical re-assessment of stimulus poverty arguments," The Linguistic Review, 19: 151–62.
Leibniz, G. (1704/1981), New Essays on Human Understanding, trans. P. Remnant and J. Bennett, Cambridge: Cambridge University Press.
Lenneberg, E. (1967), Biological Foundations of Language, New York: John Wiley & Sons.
Levelt, W. J. M. (1989), Speaking: from Intention to Articulation, Cambridge, MA: MIT Press.
Levine, J. (1983), "Materialism and Qualia: the Explanatory Gap," Pacific Philosophical Quarterly, 64: 354–61.
Levine, J. (forthcoming), "Chomsky and the Mind-Body Problem," in Allott, Lohndal, and Rey (forthcoming).
Levine, R. and Postal, P. (2004), "A corrupted linguistics," in P. Collier and D. Horowitz (eds), The Anti-Chomsky Reader, San Francisco, CA: Encounter Books, pp. 230–1.
Levinson, S. C. (1991), "Pragmatic reduction of the Binding Conditions revisited," Journal of Linguistics, 27: 107–61.
Lewis, D. (1969), Convention: a Philosophical Study, Cambridge, MA: Harvard University Press.
Lewis, D. (1970), "How to Define Theoretical Terms," Journal of Philosophy, 67: 427–46.
Lewis, D. (1972/1980), "Psychophysical and Theoretical Identification," in Block (1980), vol. I, pp. 207–15.
Lewis, D. (1983), "New Work For a Theory of Universals," Australasian Journal of Philosophy, 61: 343–77.
Lewis, D. (1994), "David Lewis: Reduction of Mind," in S. Guttenplan (ed.), A Companion to the Philosophy of Mind, Oxford: Blackwell.
Lewontin, R. (1990), "How much did the brain have to change for speech?," Behavioral and Brain Sciences, 13 (4): 740–1.
Liberman, A. M. (1996), Speech: A Special Code, Cambridge, MA: MIT Press.
Liberman, A. and Mattingly, I. (1985), "The motor theory of speech perception revised," Cognition, 21 (1): 1–36.
Lidz, J. (2018), "The explanatory power of linguistic theory," in N. Hornstein, H. Lasnik, P. Patel-Grosz, and C. Yang (eds), Syntactic Structures after 60 Years: The Impact of the Chomskyan Revolution in Linguistics, Berlin: De Gruyter Mouton, pp. 225–40.
Lidz, J. and Gagliardi, A. (2015), "How Nature Meets Nurture: Universal Grammar and Statistical Learning," Annual Review of Linguistics, 1 (1): 12.1–21.
Lidz, J. and Williams, A. (2009), "Constructions on holiday," Cognitive Linguistics, 20 (1).
Lightfoot, D. (1994), "Subjacency and Sex," in Otero (1994), vol. II: 721–4.
Loewer, B. (1997), "A Guide to Naturalizing Semantics," in B. Hale and C. Wright (eds), The Blackwell Companion to the Philosophy of Language, Oxford: Blackwell, pp. 108–26.
Löfqvist, A. (2010), "Theories and Models of Speech Production," in Hardcastle, Laver, and Gibbon (2010), pp. 353–77.
Ludlow, P. (2011), The Philosophy of Generative Linguistics, Oxford: Oxford University Press.
Lycan, W. (1987), Consciousness, Cambridge, MA: MIT Press.
Lycan, W. (1996), Consciousness and Experience, Cambridge, MA: Bradford Books/MIT Press.
Lycan, W. (2003), "Chomsky on the Mind-Body Problem," in Antony and Hornstein (2003).
Lyons, J. (1966), "Towards a 'notional' theory of parts of speech," Journal of Linguistics, 2: 209–36.
McCarthy, J. and Hayes, P. J. (1969), "Some Philosophical Problems from the Standpoint of Artificial Intelligence," in D. Michie and B. Meltzer (eds), Machine Intelligence 4, Edinburgh: Edinburgh University Press, pp. 463–502.
McCawley, J. (1968), "The Role of Semantics in Grammar," in E. Bach and R. Harms (eds), Universals in Linguistic Theory, New York: Holt, Rinehart and Winston, pp. 125–70.
MacCorquodale, K. (1970), "On Chomsky's review of Skinner's Verbal Behavior," Journal of the Experimental Analysis of Behavior, 13 (1): 83–99.
McCourt, M. (ms), "Semantics and modularity," talk presented to the Department of Philosophy, University of Maryland, 5 April 2019.
McDonald, F. (2009), "Linguistics, Psychology, and the Ontology of Language," Croatian Journal of Philosophy, 27: 291–301.
McDonald, F. (2012), "Why Language Exists: Stating the Obvious," Croatian Journal of Philosophy, 34: 1–12.
McDowell, J. (1985), "Functionalism and Anomalous Monism," in E. LePore and B. McLaughlin (eds), Actions and Events: Perspectives on the Philosophy of Donald Davidson, Oxford: Blackwell, pp. 387–98.
McGilvray, J. (1999), Chomsky: Language, Mind and Politics, 1st edn, Cambridge: Polity.
McGilvray, J. (ed.) (2005), Cambridge Companion to Chomsky, 1st edn, Cambridge: Cambridge University Press.
McGilvray, J. (2014), Chomsky, 2nd edn, Cambridge: Polity.
McGilvray, J. (ed.) (2017), Cambridge Companion to Chomsky, 2nd edn, Cambridge: Cambridge University Press.
McGinn, C. (1994), The Problems of Philosophy: the Limits of Inquiry, Oxford: Blackwell.
McGurk, H. and MacDonald, J. (1976), "Hearing lips and seeing voices," Nature, 264: 746–8.
Mach, E. (1914), The Analysis of Sensations, Chicago: Open Court.
McLaughlin, B. and Cohen, J. (eds) (2007), Contemporary Debates in Philosophy of Mind, Hoboken, NJ: Wiley-Blackwell.
Mahmoudzadeh, M., Dehaene-Lambertz, G., Fournier, M., Kongolo, G., Goudjil, S., Dubois, J., Grebe, R., and Wallois, F. (2013), "Syllabic discrimination in premature human infants prior to complete formation of cortical layers," Proceedings of the National Academy of Sciences, 110: 4846–51.
Malcolm, N. (1977), Thought and Knowledge, Ithaca, NY: Cornell University Press.
Malmgren, A. (2006), "Is There A Priori Knowledge by Testimony?," Philosophical Review, 115 (2): 199–241.
Mantzavinos, C. (2016), "Hermeneutics," in Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/hermeneutics/
Manzini, R. (1992), Locality, Linguistic Inquiry Monograph Series 19, Cambridge, MA: MIT Press.
Manzini, R. and Wexler, K. (1987), "Parameters, Binding Theory, and Learnability," Linguistic Inquiry, 18 (3): 413–44.
Merchant, J. (2012), "Ellipsis," in T. Kiss and A. Alexiadou (eds), Syntax: An International Handbook of Contemporary Syntactic Research, Berlin: Walter de Gruyter; http://home.uchicago.edu/~merchant/pubs/merchant.ellipsis.pdf
Marr, D. (1982), Vision, Cambridge, MA: MIT Press.
Marslen-Wilson, W. and Tyler, L. (1980), "The temporal structure of spoken language understanding," Cognition, 8: 1–71.
Mates, B. (1952), "Synonymity," in L. Linsky (ed.), Semantics and the Philosophy of Language, Champaign-Urbana: University of Illinois Press.
Matthews, R. (1979), "Are the grammatical sentences of a language a recursive set?," Synthese, 40: 209–24.
Matthews, R. (1980), "Language learning vs. language growth," Behavioral and Brain Sciences, 3: 2–26.
Matthews, R. (2006), "Knowledge of Language and Linguistic Competence," in E. Sosa and E. Villanueva (eds), Philosophy of Language: Philosophical Issues 16, Oxford: Blackwell, pp. 200–20.
May, L., Byers-Heinlein, K., Gervain, J., and Werker, J. (2011), "Language and the Newborn Brain: Does Prenatal Language Experience Shape the Neonate Neural Response to Speech?," Frontiers in Psychology, 2: 222.
Maynes, J. and Gross, S. (2013), "Linguistic Intuitions," Philosophy Compass, 8: 714–30.
Meinong, A. (1899), "Über Gegenstände höherer Ordnung und deren Verhältnis zur inneren Wahrnehmung," Zeitschrift für Psychologie und Physiologie der Sinnesorgane, XXI: 182–272. Reprinted in Meinong (1968–78), Vol. II: 377–480. Translated as "On Objects of Higher Order and Their Relationship to Internal Perception" by Marie-Luise Schubert Kalsi (1978), Alexius Meinong on Objects of Higher Order and Husserl's Phenomenology, The Hague: Martinus Nijhoff, pp. 137–208.
Melzack, R. and Wall, P. (1988), The Challenge of Pain, London and New York: Penguin Books.
Merchant, J. (2018), "Ellipsis: A survey of analytical approaches," in J. van Craenenbroeck and T. Temmerman (eds), The Oxford Handbook of Ellipsis, Oxford: Oxford University Press.
Michotte, A. (1946/63), The Perception of Causality, trans. T. R. Miles and E. Miles, London: Methuen.
Mikhail, J. (2011), Elements of Moral Cognition: Rawls' Linguistic Analogy and the Cognitive Science of Moral and Legal Judgment, Cambridge: Cambridge University Press.
Mikhail, J. (2017), "Chomsky and Moral Philosophy," in McGilvray (2017), pp. 235–54.
Miller, G. and Chomsky, N. (1963), "Finitary models of language users," in R. Luce, R. Bush, and E. Galanter (eds), Handbook of Mathematical Psychology, Volume 2, New York: Wiley, pp. 419–92.
Miller, G. and Isard, S. (1963), "Some Perceptual Consequences of Linguistic Rules," Journal of Verbal Learning and Verbal Behavior, 2: 217–28.
Millikan, R. (2000), On Clear and Confused Ideas: An Essay about Substance Concepts, Cambridge: Cambridge University Press.
Momma, S. and Phillips, C. (2018), "The relationship between parsing and generation," Annual Review of Linguistics, 4: 233–54.
Montague, R. (1974), Formal Philosophy: Selected Papers of Richard Montague, edited and with an introduction by Richmond H. Thomason, New Haven, CT: Yale University Press.
Müller, G. (2011), "Constraints on Displacement: a Phase-based Approach," http://www.uni-leipzig.de/~muellerg/mu204.pdf
Musso, M., Moro, A., Glauche, V., Rijntjes, M., Reichenbach, J., Büchel, C., and Weiller, C. (2003), "Broca's area and the language instinct," Nature Neuroscience, 6 (7): 774–81.
Nagel, T. (1969/82), "Linguistics and Epistemology," in Harman (1974/82), pp. 219–28.
Nagel, T. (1986), The View from Nowhere, Oxford: Oxford University Press.
Nagel, T. (1993/95), "The mind wins!," review of Searle (1992), New York Review of Books; reprinted as "Searle: why we are not computers" in T. Nagel (1995), Other Minds, Oxford: Oxford University Press, pp. 96–110.
Nagel, T. (2012), Mind and Cosmos: Why the Materialist Neo-Darwinian Conception of Nature is Almost Certainly False, Oxford: Oxford University Press.
Nahmias, E. (2014), "Is Free Will an Illusion? Confronting Challenges from the Modern Mind Sciences," in Walter Sinnott-Armstrong (ed.), Moral Psychology (Volume 4: Free Will and Moral Responsibility), Cambridge, MA: MIT Press, pp. 1–25.
Neale, S. (2005), "Pragmatism and Binding," in Z. Szabo (ed.), Semantics versus Pragmatics, Oxford: Oxford University Press, pp. 165–286.
Neander, K. (2017), A Mark of the Mental: In Defense of Informational Teleosemantics, Cambridge, MA: MIT Press.
Neidle, C., Kegl, J., MacLaughlin, D., Bahan, B., and Lee, R. G. (1999), The Syntax of American Sign Language: Functional Categories and Hierarchical Structure, Cambridge, MA: MIT Press.
Nelson, M. (2019), "Propositional Attitude Reports," in Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/prop-attitude-reports
Neurath, O. (1932b/1983), "Protocol Statements," in R. Cohen and M. Neurath (eds), Philosophical Papers 1913–1946, Dordrecht: Reidel, pp. 91–9.
Nevins, A., Pesetsky, D., and Rodrigues, C. (2009), "Evidence and argumentation: A reply to Everett," Language, 85 (3).
Newell, A. and Simon, H. A. (1976), "Computer science as empirical inquiry: Symbols and search," Communications of the Association for Computing Machinery, 19 (3): 113–26.
Newmeyer, F. (1986), Linguistic Theory in America, 2nd edn, New York: Academic Press.
Newmeyer, F. (1996), Generative Linguistics, London: Routledge.
Newmeyer, F. (1998), Language Form and Language Function, Cambridge, MA: MIT Press.
Newmeyer, F. (2017), "Where, if anywhere, are parameters? A critical historical overview of parametric theory," in C. Bowern, L. Horn, and R. Zanuttini (eds), On Looking into Words (and beyond), Berlin: Language Science Press, pp. 547–69.
Newmeyer, F. (forthcoming), "Chomsky and Usage-Based Linguistics," in Allott, Lohndal, and Rey (forthcoming).
Nicol, J. and Swinney, D. (1989), "The Role of Structure in Coreference Assignment During Sentence Comprehension," Journal of Psycholinguistic Research, 18: 5–19.
Ninio, J. (1998/2001), The Science of Illusions, trans. F. Phillip, Ithaca, NY: Cornell University Press.
Nisbett, R. E. and Wilson, T. D. (1977), "Telling more than we can know: Verbal reports on mental processes," Psychological Review, 84 (3): 231–59.
Noveck, I. (2018), Experimental Pragmatics: The Making of a Cognitive Science, Cambridge: Cambridge University Press.
Noveck, I. and Sperber, D. (eds) (2004), Experimental Pragmatics, New York: Palgrave Macmillan.
Otero, C. (ed.) (1994), Noam Chomsky: Critical Assessments, 3 vols, London: Routledge.
Palmer, S. (1999a), Vision Science: Photons to Phenomenology, Cambridge, MA: MIT Press.
Palmer, S. (1999b), "Color, consciousness, and the isomorphism constraint," Behavioral and Brain Sciences, 22: 923–89.
Parsons, T. (1980), Nonexistent Objects, New Haven, CT: Yale University Press.
Pateman, T. (1987), Language in Mind and Language in Society: Studies in Linguistic Reproduction, Oxford: Clarendon Press.
Peacocke, C. (1986), "Explanation in Computational Psychology: Language, Perception and Level 1.5," Mind & Language, 1 (4): 388–402.
Peacocke, C. (1992), A Study of Concepts, Cambridge, MA: MIT Press.
Peacocke, C. (1999), "Computation as Involving Content: A Response to Egan," Mind & Language, 14: 195–202.
Pearl, L. (forthcoming), "Modeling syntactic acquisition," in J. Sprouse (ed.), Oxford Handbook of Experimental Syntax.
Pearl, L. and Sprouse, J. (2013), "Computational models of acquisition for islands," in J. Sprouse and N. Hornstein (eds), Experimental Syntax and Island Effects, Cambridge: Cambridge University Press.
Penrose, R. (1989), The Emperor's New Mind, Oxford: Oxford University Press.
Penzias, A. and Wilson, R. (1965), "A Measurement of Excess Antenna Temperature at 4080 Mc/s," Astrophysical Journal, 142: 419–21.
Pereplyotchik, D. (2011), "Psychological and Computational Models of Language Comprehension: In Defense of the Psychological Reality of Syntax," Croatian Journal of Philosophy, XI (31): 31–72.
Pereplyotchik, D. (2017), Psychosyntax: The Nature of Grammar and its Place in the Mind, Cham: Springer.
Perfors, A., Tenenbaum, J., Griffiths, T., and Xu, F. (2011), "A tutorial introduction to Bayesian models of Cognitive Development," Cognition, 120 (3): 302–21.
Peters, A. (1985), "Language Segmentation: Operating Principles for the Perception and Analysis of Language," in D. Slobin (ed.), The Cross-Linguistic Study of Language Acquisition, vol. 2: Theoretical Issues, Hillsdale, NJ: Lawrence Erlbaum, pp. 1029–64.
Peters, Jr., P. and Ritchie, R. (1973), "On the Generative Power of Transformational Grammars," Information Sciences, 6: 49–83.
Peterson, G. and Barney, H. (1952), "Control Methods Used in a Study of the Vowels," The Journal of the Acoustical Society of America, 24 (2): 175–84.
Phillips, C. (1996), Order and Structure, PhD dissertation, MIT; distributed by MIT Working Papers in Linguistics.
Phillips, C. (2001), "Parsing: Psycholinguistic Aspects," in International Encyclopedia of Linguistics, Oxford: Oxford University Press.
Phillips, C. (2006), "The real-time status of island phenomena," Language, 82 (4): 795–823.
Phillips, C. (2013), "On the Nature of Island Constraints," in Sprouse and Hornstein (2013a), pp. 64–108.
Phillips, C. and Wagers, M. (2007), "Relating structure and time in linguistics and psycholinguistics," in Oxford Handbook of Psycholinguistics, Oxford: Oxford University Press, pp. 739–56.
Piattelli-Palmarini, M. (ed.) (1980), Language and Learning: The Debate between Jean Piaget and Noam Chomsky (the Royaumont debate), Cambridge, MA: Harvard University Press.
Piattelli-Palmarini, M. (ed.) (2009), Of Minds and Language: a Dialogue with Noam Chomsky in the Basque Country, Oxford: Oxford University Press.
Pickering, M. and Ferreira, V. (2008), "Structural Priming: A Critical Review," Psychological Bulletin, 134 (3): 427–59.
Pietroski, P. (2000), Causing Actions, Oxford: Oxford University Press.
Pietroski, P. (2003), "The Character of Natural Language Semantics," in Alex Barber (ed.), Epistemology of Language, Oxford: Oxford University Press, pp. 217–56.
Pietroski, P. (2005), "Meaning Before Truth," in G. Preyer and G. Peters (eds), Contextualism in Philosophy, Oxford: Oxford University Press.
Pietroski, P. (2010), "Concepts, Meanings, and Truth: First Nature, Second Nature, and Hard Work," Mind & Language, 25: 247–78.
Pietroski, P. (2015), "Vocabulary Matters. 50 Years Later: Reflections on Chomsky's Aspects," in A. Gallego and D. Ott (eds), MIT Working Papers in Linguistics #77, Cambridge, MA: MIT, pp. 199–210.
Pietroski, P. (2017), "Semantic Internalism," in McGilvray (2017), pp. 196–216.
Pietroski, P. (2018), Conjoining Meanings: Semantics without Truth Values, Oxford: Oxford University Press.
Pietroski, P. and Rey, G. (1995), "When Other Things Aren't Equal: Saving Ceteris Paribus," British Journal for the Philosophy of Science, 46: 81–110.
Pinker, S. (1994), The Language Instinct, New York: Harper.
Pinker, S. (1999), Words and Rules: The Ingredients of Language, New York: Basic Books.
Pinker, S. and Jackendoff, R. (2005), "The faculty of language: what's special about it?," Cognition, 95: 201–36.
Poeppel, D., Idsardi, W., and van Wassenhove, V. (2009), "Speech Perception at the Interface of Neurobiology and Linguistics," in B. Moore, L. Tyler, and W. Marslen-Wilson (eds), The Perception of Speech, Oxford: Oxford University Press.
Pollard, C. and Sag, I. A. (1987), Information-based Syntax and Semantics, vol. 1: Fundamentals, Stanford, CA: CSLI.
Pollard, C. and Sag, I. (1994), Head-driven Phrase Structure Grammar, Chicago: University of Chicago Press.
Postma, A. (1997), "On the mechanisms of speech monitoring," in W. Hulstijn, H. Peters, and P. van Lieshout (eds), Speech Production: Motor Control, Brain Research and Fluency Disorders, Amsterdam: Elsevier.
Postma, A. (2000), "Detection of errors during speech production: a review of speech monitoring models," Cognition, 77: 97–131.
Pratt, C. and Grieve, R. (1984), "The Development of Metalinguistic Awareness," in Tunmer et al. (1984), pp. 2–11.
Preston, J. and Bishop, M. (eds) (2002), Views into the Chinese Room, Oxford: Oxford University Press.
Priest, G. (2005), Towards Non-Being: The Logic and Metaphysics of Intentionality, Oxford: Oxford University Press.
Pullum, G. (2012), "Rules that Eat Your Brain," https://www.chronicle.com/blogs/linguafranca/2012/08/29/rules-that-eat-your-brain
Pullum, G. (2018), "Philosophy of Linguistics," in K. M. Becker and I. Thomson (eds), The Cambridge History of Philosophy, 1945–2015, Cambridge: Cambridge University Press.
Pullum, G. and Gazdar, G. (1982), "Natural Languages and Context-Free Languages," Linguistics and Philosophy, 4: 471–504.
Pullum, G. and Scholz, B. (2002), "Empirical assessment of stimulus poverty arguments," The Linguistic Review, 19: 9–50.
Putnam, H. (1960), "Minds and Machines," in S. Hook (ed.), Dimensions of Mind, New York: New York University Press, pp. 148–80.
Putnam, H. (1961/75a), "Some Issues in the Theory of Grammar," in Putnam (1975a), pp. 85–106.
Putnam, H. (1962a/75a), "It Ain't Necessarily So," Journal of Philosophy, LIX: 658–71; reprinted in Putnam (1975a), pp. 237–49.
Putnam, H. (1962b/75b), "Dreaming and Depth Grammar," in Putnam (1975b), pp. 304–24.
Putnam, H. (1965/75), "The Analytic and the Synthetic," in Putnam (1975b), pp. 33–69.
Putnam, H. (1968/75), "Is Logic Empirical?," in Robert S. Cohen and Marx W. Wartofsky (eds), Boston Studies in the Philosophy of Science, vol. 5, Dordrecht: D. Reidel, 1968, pp. 216–41. Reprinted as "The Logic of Quantum Mechanics" in Putnam (1975a), pp. 174–97.
Putnam, H. (1970), "Is Semantics Possible?," Metaphilosophy, 1: 187–201.
Putnam, H. (1975a), Mathematics, Matter and Method: Collected Philosophical Papers, vol. I, Cambridge: Cambridge University Press.
Putnam, H. (1975b), Mind, Language and Reality: Collected Philosophical Papers, vol. II, Cambridge: Cambridge University Press.
Putnam, H. (1975c), "The Meaning of 'Meaning'," Minnesota Studies in the Philosophy of Science, 7: 131–93; reprinted in Putnam (1975b), pp. 215–71.
Putnam, H. (1983), "'Two Dogmas' Re-visited," in his Collected Papers, vol. 3, Cambridge: Cambridge University Press.
Putnam, H. (1987), The Many Faces of Realism, La Salle, IL: Open Court.
Putnam, H. (1988), Representation and Reality, Cambridge, MA: MIT Press.
Pylyshyn, Z. W. (1984), Computation and Cognition: Towards a Foundation for Cognitive Science, Cambridge, MA: MIT Press.
Pylyshyn, Z. (2006), Seeing and Visualizing: It's Not What You Think, Cambridge, MA: MIT Press.
Quine, W. (1934/76), "Truth by Convention," in Quine (1976), pp. 77–106.
Quine, W. (1940/81), Mathematical Logic, revised edn, Cambridge, MA: Harvard University Press.
Quine, W. (1953/61), From a Logical Point of View, 2nd rev. edn, New York: Harper & Row.
Quine, W. (1953/61a), "On What There Is," in Quine (1953/61), pp. 1–9.
Quine, W. (1953/61b), "Two Dogmas of Empiricism," in Quine (1953/61), pp. 20–47.
Quine, W. (1953/61c), "The Problem of Meaning in Linguistics," in Quine (1953/61), pp. 47–64.
Quine, W. (1953/61d), "Notes on the Theory of Reference," in Quine (1953/61), pp. 130–8.
Quine, W. (1954/76), "Carnap and Logical Truth," in Quine (1976), pp. 107–32.
Quine, W. (1955/76), "Posits and Reality," in Quine (1976), pp. 246–54.
Quine, W. (1956), "Quantifiers and Propositional Attitudes," Journal of Philosophy, 53: 177–87.
Quine, W. (1960/2013), Word and Object, 2nd edn, ed. Dagfinn Føllesdal, Cambridge, MA: MIT Press.
Quine, W. (1961), "Reply to Professor Marcus," Synthese, 13 (4): 323–30.
Quine, W. (1969a), "Ontological Relativity," in his Ontological Relativity and Other Essays, New York: Columbia University Press, pp. 26–68.
Quine, W. (1969b), "Epistemology Naturalized," in his Ontological Relativity and Other Essays, New York: Columbia University Press, pp. 69–99.
Quine, W. (1970a), "Methodological Reflections on Linguistic Theory," Synthese, 21 (3–4): 386–98.
Quine, W. (1970b), "Philosophical progress in language theory," Metaphilosophy, 1 (1): 2–19.
Quine, W. (1975/81), "Five Milestones of Empiricism," in his Theories and Things, Cambridge, MA: Harvard University Press, pp. 67–72.
Quine, W. (1976), Ways of Paradox and Other Essays, 2nd edn, Cambridge, MA: Harvard University Press.
Quine, W. (1986), "Reply to Henryk Skolimowski," in L. Hahn and P. Schilpp (eds), The Philosophy of W. V. Quine, expanded edn, pp. 492–3.
Quine, W. (1990/2008), "The Phoneme's Long Shadow," in Quine (2008), pp. 364–7.
Quine, W. (1991/2008), "Two Dogmas in Retrospect," in Quine (2008), pp. 390–400.
Quine, W. (2008), Confessions of a Confirmed Extensionalist, ed. D. Føllesdal and D. Quine, Cambridge, MA: Harvard University Press.
Quine, W. and Ullian, J. (1978), The Web of Belief, New York: McGraw Hill.
Rabiner, L. (1989), "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, 77 (2): 257–86.
418  General References Radford, A. (1997), Syntactic Theory and the Structure of English: a Minimalist Approach, Cambridge: Cambridge University Press. Radford, A. (2004), English Syntax: an Introduction, Cambridge: Cambridge University Press. Radford, A., Atkinson, M., Britain, D., Clahsen, H., and Spencer, A. (2009), Linguistics: an Introduction, 2nd edn, Cambridge: Cambridge University Press. Rai, M. (1995), Chomsky’s Politics, London: Verso.Ramsey, W. (2007), Representation Reconsidered, Cambridge: Cambridge University Press. Rawls, J. (1971), A Theory of Justice, Cambridge, MA: Harvard University Press. Recanati, F. (2004), Literal Meaning, Cambridge: Cambridge University Press. Recanati, F. (2012), Mental Files, Oxford: Oxford University Press. Reed, C., Rabinowitz, W., Durlach, N., Braida, L., Conway-Fithian, S., Schultz, M. (1985), “Research on the Tadoma method of speech communication,” Journal of Acoustic Society of America, 77 (1): 247–57. Reimer, M. and Michaelson, E. (2019), “Reference,” Entry in Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/reference/ Reinhart, T. (1976), The Syntactic Domain of Anaphora, Ph.D. dissertatin, MIT, Cambridge, MA: distributed by MIT Working Papers in Linguistics. Reinhart, T. (1983), “Anaphora and Semantic Interpretation,” Journal of Linguistics, 21 (1): 221–6. Reiss, C. (2016), “Substance Free Phonology,” in A. Bosch, and S. Hannahs (eds), Handbook of Phonological Theory, London: Routledge; available on-line at: https://www.researchgate.net/publication/305776096_Substance_Free_Phonology Rey, G. (1981), “What Are Mental Images?,” in Block (1980b), pp. 117–27. Rey, G. (1983), “Concepts and Stereotypes,” Cognition, 15: 237–62. Rey, G. (1990), “Constitutive Causation and The Reality of Mind,” commentary on John Searle (1990), Behavioral and Brain Sciences, 13 (4). Rey, G. (1993), “The Unavailability of What We Mean: a Reply to Quine, Fodor and LePore” in Grazer Philosophica, special edition ed. by J. Fodor and E. LePore, pp. 61–101. Rey, G. (1995), “Dennett’s Unrealistic Psychology,” Philosophical Topics, 22 (1–2): 259–89/ Rey, G. (1996), “Towards a Projectivist Account of Conscious Experience,” in T. Metzinger, (ed.), Conscious Experience, Paderhorn: Ferdinand-Schoeningh-Verlag, pp. 123–42. Rey, G. (1997), Contemporary Philosophy of Mind: a Contentiously Classical Approach, Oxford: Blackwell. Rey, G. (1998), “A Naturalistic A Priori,” Philosophical Studies, 92: 25–43. Rey, G. (2001), “Digging Deeper for the A Priori, Commentary on Laurence Bonjour, In Defense of Pure Reason,” Philosophical and Phenomenological Research, Nov: 649–56. Rey, G. (2002a), “Physicalism and Psychology: a Plea for Substantive Philosophy of Mind,” in C. Gillet and B Loewer (eds), Physicalism and Its Discontents, Cambridge: Cambridge University Press, pp. 99–128. Rey, G. (2002b), “Problems with Dreyfus’ Dialectic,” Phenomenology and the Cognitive Sciences, 1 (4): 204–8. Rey, G. (2003a), “Chomsky, Intentionality and a CRTT,” in L. Antony and N. Hornstein, Chomsky and His Critics, Oxford: Blackwell, pp. 105–39. Rey, G. (2003b), “Representational Content and a Chomskyan Linguistics,” in A. Barber, (ed.), Epistemology of Language, Oxford: Oxford University Press, pp. 140–86. Rey, G. (2003c), “Why Wittgenstein Ought to Have Been a Computationalist (and What a Computationalist Can Learn from Wittgenstein),” Croation Journal of Philosophy, III (9): 231–64.
Rey, G. (2003/17), “The Analytic/Synthetic Distinction,” Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/analytic-synthetic/
Rey, G. (2005), “Philosophical Analysis as Cognitive Psychology: the Case of Empty Concepts,” in H. Cohen and C. Lefebvre (eds), Handbook of Categorization in Cognitive Science, Dordrecht: Elsevier, pp. 71–89.
Rey, G. (2006a), “The Non-Existence of Language—But Not Cars,” in Stainton (2006), pp. 237–55.
Rey, G. (2006b), “Conventions, Intuitions and Linguistic Inexistents: a Reply to Devitt,” Croatian Journal of Philosophy, VI (18): 549–70.
Rey, G. (2007), “Resisting Normativism in Psychology,” in Contemporary Debates in Philosophy of Mind, Oxford: Blackwell, pp. 69–84.
Rey, G. (2008), “In Defense of Folieism,” Croatian Journal of Philosophy, VIII (23): 177–202.
Rey, G. (2009), “Concepts, Defaults, and Internal Asymmetric Dependencies: Distillations of Fodor and Horwich,” in N. Kompa, C. Nimtz, and C. Suhm (eds), The A Priori and Its Role in Philosophy, Paderborn: Mentis, pp. 185–204.
Rey, G. (2013), “We Aren’t All Self-Blind: a Defense of a Modest Introspectionism,” Mind & Language; also available with postscript at http://sites.google.com/site/georgesrey
Rey, G. (2014a), “The Possibility of a Naturalistic Cartesianism Regarding Intuitions and Introspection,” in M. Haug (ed.), Philosophical Methodology: The Armchair or the Laboratory?, London: Routledge.
Rey, G. (2014b), “Innate and Learned: Carey, Mad Dog Nativism, and the Poverty of Stimuli and Analogies (Yet Again),” Mind & Language, 29 (2): 109–32.
Rey, G. (2016), “Analytic, A Priori, False—And Maybe Non-Conceptual,” European Journal of Analytic Philosophy, 10 (2): 85–110.
Rey, G. (2018), “A Remembrance of Jerry Fodor and His Work,” Mind & Language, 33 (4): 321–41.
Rey, G. (2020a), “A defense of the Voice of Competence,” in Schindler et al. (2020), pp. 33–49.
Rey, G. (2020b), “Explanation First: the Priority of Scientific Over ‘Commonsense’ Metaphysics,” in A. Bianchi (ed.), Language and Reality from a Naturalistic Perspective: Themes From Michael Devitt, Cham: Springer, pp. 299–327.
Rey, G. (2020c), “The Non-Primacy of Subjective Intentionality,” in A. Sullivan (ed.), Thoughts and Language: Essays in Honor of Brian Loar, London: Routledge, pp. 331–52.
Riley, J. R., Greggers, U., Smith, A. D., Reynolds, D. R., and Menzel, R. (2005), “The Flight of Honey Bees Recruited by the Waggle Dance,” Nature, 435: 205–7.
Ringe, D. and Eska, J. (2013), Historical Linguistics: Toward a Twenty-First Century Reintegration, Cambridge: Cambridge University Press.
Ritchie, W. and Bhatia, T. K. (eds) (1999), Handbook of Child Language Acquisition, San Diego: Academic Press.
Rizzi, L. (1982), Issues in Italian Syntax, Berlin: Walter de Gruyter.
Roberge, Y. (1990), Syntactic Recoverability of Null Arguments, Montreal: McGill-Queen’s Press.
Roberts, I. (1997), Comparative Syntax, Oxford: Oxford University Press.
Roberts, I. (2007), Diachronic Syntax, Oxford: Oxford University Press.
Roeper, T. and Speas, M. (2014), Recursion: Complexity in Cognition, Berlin: Springer.
Rorty, R. (1979), Philosophy and the Mirror of Nature, Princeton, NJ: Princeton University Press.
Rosenbloom, P. (1950), The Elements of Mathematical Logic, New York: Dover.
Ross, J. R. (1967/86), Constraints on Variables in Syntax, doctoral dissertation, Massachusetts Institute of Technology; published in 1986 as Infinite Syntax!, Norwood, NJ: ABLEX (available online at http://hdl.handle.net/1721.1/15166).
Rozin, P., Haidt, J., and McCauley, C. R. (1993), “Disgust,” in M. Lewis and J. Haviland (eds), Handbook of Emotions, New York: Guilford, pp. 575–94.
Russell, B. (1905), “On Denoting,” Mind, 14 (56): 479–93.
Russell, B. (1912), Problems of Philosophy, New York: Henry Holt.
Russell, B. (1919), Introduction to Mathematical Philosophy, London: George Allen and Unwin.
Russell, B. (1948), Human Knowledge: Its Scope and Limits, London: Allen and Unwin.
Russell, G. (2008), Truth in Virtue of Meaning: a Defence of the Analytic/Synthetic Distinction, Oxford: Oxford University Press.
Ryle, G. (1949/2009), The Concept of Mind, Abingdon: Routledge.
Ryle, G. (1961), “Use, Usage and Meaning,” Proceedings of the Aristotelian Society, Supplementary Volumes, 35: 223–30.
Saffran, J. (2002), “Constraints on Statistical Language Learning,” Journal of Memory and Language, 47: 172–96.
Saffran, J., Aslin, R., and Newport, E. (1996), “Statistical learning by 8-month-old infants,” Science, 274: 1926–8.
Safir, K. (2004), The Syntax of Anaphora, Oxford: Oxford University Press.
Sampson, G. (1980), Schools of Linguistics: Competition and Evolution, London: Hutchinson.
Sampson, G. (1989), “Language acquisition: growth or learning?,” Philosophical Papers, 18: 203–40.
Sampson, G. (1999), Educating Eve, London: Cassell.
Sandler, W. (2010), “The uniformity and diversity of language: Evidence from sign language. Response to Evans and Levinson,” Lingua, 120 (12): 2727–32.
Sanford, A. and Sturt, P. (2002), “Depth of Processing in Language Comprehension: not Noticing the Evidence,” Trends in Cognitive Science, 6: 382–6.
Sapir, E. (1933), “The psychological reality of phonemes,” in Selected Writings in Language, Culture and Personality, Los Angeles: University of California Press, pp. 46–60.
Sartre, J. P. (1943/1948), Being and Nothingness, trans. by H. Barnes, New York: Philosophical Library.
Sauerland, U. and Gärtner, H. (eds) (2017), Interfaces + Recursion = Language?, New York: Mouton de Gruyter.
de Saussure, F. (1914/77), Cours de linguistique générale, ed. C. Bally and A. Sechehaye, with the collaboration of A. Riedlinger, Lausanne and Paris: Payot; trans. W. Baskin, Course in General Linguistics, Glasgow: Fontana/Collins, 1977.
Savin, H. and Bever, T. (1970), “The Non-perceptual Reality of the Phoneme,” Journal of Verbal Learning and Verbal Behavior, 9: 295–302.
Schiffer, S. (1987), Remnants of Meaning, Cambridge, MA: MIT Press.
Schindler, S., Drożdżowicz, A., and Brøcker, K. (eds) (2020), Linguistic Intuitions: Evidence and Method, Oxford: Oxford University Press.
Schütze, C. (1996/2016), The Empirical Base of Linguistics: Grammaticality Judgments and Linguistic Methodology, 2nd edn, Chicago: University of Chicago Press.
Schütze, C. (2003), “Linguistic Evidence, Status of,” in L. Nadel (ed.), Encyclopedia of Cognitive Science, London: Nature Publishing Group.
Scott, R. and Baillargeon, R. (2017), “Early False-Belief Understanding,” Trends in Cognitive Science, 21 (4): 237–49.
Searle, J. (1974/82), “Chomsky’s Revolution in Linguistics,” in Harman (1974/82).
Searle, J. (1979), Expression and Meaning, Cambridge: Cambridge University Press.
Searle, J. (1980), “The Background of Meaning,” in J. Searle, F. Kiefer, and M. Bierwisch (eds), Speech Act Theory and Pragmatics, Dordrecht: Reidel, pp. 221–32.
Searle, J. (1984), Minds, Brains and Science: the 1984 Reith Lectures, Cambridge: Cambridge University Press.
Searle, J. (1987), “Indeterminacy, Empiricism, and the First Person,” The Journal of Philosophy, 84 (3): 123–46.
Searle, J. (1990), “Consciousness, Explanatory Inversion, and Cognitive Science,” Behavioral and Brain Sciences, 13 (4): 585–96.
Searle, J. (1992), The Re-discovery of the Mind, Cambridge, MA: MIT Press.
Searle, J. (2002a), “End of the Revolution. Review of Chomsky, N., New Horizons in the Study of Language and Mind,” New York Review of Books, February 28, 2002; https://www.nybooks.com/articles/2002/02/28/end-of-the-revolution
Searle, J. (2002b), reply to Bromberger, New York Review of Books, April 25, 2002; https://www.nybooks.com/articles/2002/04/25/chomskys-revolution/
Searle, J. (2002c), reply to Chomsky, New York Review of Books, 49 (12): July 18, 2002; https://www.nybooks.com/issues/2002/07/18/
Seligman, M. (1975/92), Helplessness: On Depression, Development, and Death, San Francisco: W. H. Freeman.
Sellars, W. (1962), “Philosophy and the Scientific Image of Man,” in R. Colodny (ed.), Frontiers of Science and Philosophy, Pittsburgh, PA: University of Pittsburgh Press, pp. 35–78.
Sellars, W. and Chisholm, R. (1957), “Intentionality and the Mental: Chisholm–Sellars Correspondence on Intentionality,” with introduction by Sellars, in H. Feigl, M. Scriven, and G. Maxwell (eds), Minnesota Studies in the Philosophy of Science, vol. II, Minneapolis: University of Minnesota Press, pp. 507–39.
Shanahan, M. (2017), “The Frame Problem,” Stanford Encyclopedia of Philosophy, online at: https://plato.stanford.edu/entries/frame-problem/
Shea, N. (2018), Representation in Cognitive Science, Oxford: Oxford University Press.
Sheehan, M. (forthcoming), “Parameters and linguistic variation,” in Allott, Lohndal, and Rey (forthcoming).
Shieber, S. (1985), “Evidence Against the Context-freeness of Natural Language,” Linguistics and Philosophy, 8: 333–43.
Shoemaker, S. (2000), “Introspection and Phenomenal Character,” Philosophical Topics, 28 (2): 247–73.
Skinner, B. (1938), The Behavior of Organisms, Oxford: Appleton-Century.
Skinner, B. (1953), Science and Human Behavior, New York: Macmillan.
Skinner, B. (1957), Verbal Behavior, Acton, MA: Copley Publishing Group.
Skinner, B. (1963/84), “Behaviorism at Fifty,” Behavioral and Brain Sciences, 7 (4): 615–21.
Slonimsky, N. (1988), Perfect Pitch, Oxford: Oxford University Press.
Smart, J. (1959), “Sensations and Brain Processes,” The Philosophical Review, 68 (2): 141–56.
Smith, B. (2006), “What I Know When I Know a Language,” in E. Lepore and B. Smith (eds), The Oxford Handbook of the Philosophy of Language, Oxford: Oxford University Press, pp. 941–82.
Smith, N. (1999), Chomsky: Ideas and Ideals, 1st edn, Cambridge: Cambridge University Press.
Smith, N. (2004), Chomsky—Ideas and Ideals, 2nd edn, Cambridge: Cambridge University Press.
Smith, N. (2010), Acquiring Phonology, Cambridge: Cambridge University Press.
Smith, N. and Allott, N. (2016), Chomsky—Ideas and Ideals, 3rd edn, Cambridge: Cambridge University Press.
Smith, N. and Cormack, A. (2015), “Features from Aspects via the Minimalist Program to Combinatory Categorial Grammar,” in Gallego and Ott (eds), 50 Years Later: Reflections on Chomsky’s Aspects, MIT Working Papers in Linguistics #77, Cambridge, MA: MIT Press, pp. 233–48.
Smith, N. and Tsimpli, I.-M. (1995), The Mind of a Savant: Language Learning and Modularity, London: Blackwell.
Soames, S. (1984), “Linguistics and Psychology,” Linguistics and Philosophy, 7: 155–79.
Sosa, E. (1998), “Minimal Intuition,” in DePaul and Ramsey (1998), pp. 257–69.
Spelke, E. (2003), “Developing knowledge of space: Core systems and new combinations,” in S. M. Kosslyn and A. Galaburda (eds), Languages of the Brain, Cambridge, MA: Harvard University Press.
Spelke, E. (2017), “Core Knowledge, Language, and Number,” Language Learning and Development, 13 (2): 147–70.
Sperber, D. (1998), “The Mapping Between the Mental and the Public Lexicon,” in P. Carruthers and J. Boucher (eds), Language and Thought: Interdisciplinary Themes, Cambridge: Cambridge University Press, pp. 184–200.
Sperber, D. and Wilson, D. (1986/95), Relevance: Communication and Cognition, Oxford: Blackwell.
Sprouse, J. and Almeida, D. (2017), “Setting the empirical record straight: Acceptability judgments appear to be reliable, robust, and replicable,” Behavioral and Brain Sciences, 40, E311. doi:10.1017/S0140525X1700059
Sprouse, J. and Hornstein, N. (2013a), Experimental Syntax and Island Effects, Cambridge: Cambridge University Press.
Sprouse, J. and Hornstein, N. (2013b), “Experimental Syntax and Island Effects: Towards a Comprehensive Theory of Islands,” in Sprouse and Hornstein (2013a), pp. 1–7.
Sprouse, J., Schütze, C. T., and Almeida, D. (2013), “A comparison of informal and formal acceptability judgments using a random sample from Linguistic Inquiry 2001–2010,” Lingua, 134: 219–48.
Stabler, E. (1983), “How are grammars represented?,” Behavioral and Brain Sciences, 6: 389–402.
Stainton, R. (ed.) (2006), Contemporary Debates in Cognitive Science, Malden, MA; Oxford: Wiley-Blackwell.
Stanley, J. and Williamson, T. (2001), “Know How,” The Journal of Philosophy, 98 (8): 411–44.
Stebbing, S. (1943), A Modern Elementary Logic, London: Methuen.
Stetson, R. (1945/51), Motor Phonetics: A Study of Speech Movements in Action, 2nd edn, Amsterdam: North-Holland Publishing Co.
Stevens, K. and Hanson, H. (2010), “Articulatory-Acoustic Relations as the Basis for Distinctive Contrasts,” in Hardcastle, Laver, and Gibbon (2010), pp. 424–52.
Stich, S. (1983), From Folk Psychology to Cognitive Science: The Case Against Belief, Cambridge, MA: Bradford Books/MIT Press.
Stone, T. and Davies, M. (2002), “Chomsky Amongst the Philosophers, review of Chomsky (2000),” Mind & Language, 17 (3): 276–89.
Strawson, G. (1987), Freedom and Belief, Oxford: Oxford University Press.
Strawson, G. (1994), Mental Reality, Cambridge, MA: MIT Press.
Strawson, P. (1950), “On Referring,” Mind, 59: 320–44.
Strawson, P. (1962), “Freedom and Resentment,” in G. Watson (ed.), Proceedings of the British Academy, Volume 48, Oxford: Oxford University Press, pp. 1–25.
Sturt, P. (2007), “Semantic re-interpretation and garden path recovery,” Cognition, 105: 477–88.
Szabo, Z. (1999), “Expressions and Their Representations,” Philosophical Quarterly, 29: 145–63.
Takács, O. (2013), “Changing Trends in Psycholinguistic Parsing Theory,” Argumentum, 9: 353–62.
Takahashi, E. and Lidz, J. (2008), “Beyond Statistical Learning in Syntax,” in A. Gavarró and M. João Freitas (eds), Proceedings of GALA 2007: Language Acquisition and Development, pp. 444–54. [ling.umd.edu/labs/acquisition/papers/Takahashi_Lidz.pdf]
Textor, M. (2009), “Devitt on the epistemic authority of linguistic intuitions,” Erkenntnis, 71: 395–405.
Thompson, D. (1917/2014), On Growth and Form, Cambridge: Cambridge University Press.
Titelman, M. (2010), “Not Enough There There: Evidence, Reasons, and Language Independence,” Philosophical Perspectives, 24: 477–528.
Tolman, E. (1948), “Cognitive maps in rats and men,” Psychological Review, 55: 189–208.
Tomasello, M. (2003), Constructing a Language: A Usage-Based Theory of Language Acquisition, Cambridge, MA: Harvard University Press.
Tomasello, M. (2005), Constructing a Language: A Usage-Based Theory of Language Acquisition, Cambridge, MA: Harvard University Press.
Tomasello, M. (2009), “Universal Grammar is Dead,” Behavioral and Brain Sciences, 32 (5): 470–1.
Tomasello, M. (ed.) (2014), The New Psychology of Language: Cognitive and Functional Approaches to Language Structure, 2nd edn, vol. I, Abingdon: Taylor & Francis Group.
Townsend, D. and Bever, T. (2001), Sentence Comprehension: the Integration of Habits and Rules, Cambridge, MA: MIT Press.
Travis, C. (1985), “On what is strictly speaking true,” Canadian Journal of Philosophy, 15: 187–229.
Travis, C. (1996), “Meaning’s role in truth,” Mind, 105: 451–66.
Trubetzkoy, N. (1939/1969), Principles of Phonology, trans. C. A. M. Baltaxe, Berkeley and London: University of California Press.
Tulving, E. (1985), Elements of Episodic Memory, Oxford: Oxford University Press.
Tunmer, W. and Herriman, M. (1984), “The Development of Metalinguistic Awareness: a Conceptual Overview,” in Tunmer et al. (1984), pp. 12–35.
Tunmer, W., Pratt, C., and Herriman, M. (1984), Metalinguistic Awareness in Children: Theory, Research, and Implications, New York: Springer.
Turing, A. (1952), “The Chemical Basis of Morphogenesis,” Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 237 (641): 37–72.
Unger, P. (1984), Philosophical Relativity, Oxford: Oxford University Press.
Uriagereka, J. (1998), Rhyme and Reason, Cambridge, MA: MIT Press.
Uriagereka, J. (1999), “Multiple Spell-out,” in S. Epstein and N. Hornstein (eds), Working Minimalism, Cambridge, MA: MIT Press, pp. 251–82.
Vaihinger, H. (1968), The Philosophy of “As If,” trans. by C. K. Ogden, London: Routledge.
Valian, V. (1999), “Input and Language Acquisition,” in W. Ritchie and T. Bhatia (eds), Handbook of Child Language Acquisition, San Diego, CA: Academic Press, pp. 497–527; http://maxweber.hunter.cuny.edu/psych/faculty/valian/docs/1999Input.pdf
Valian, V. (2014), “Arguing about Innateness,” Journal of Child Language, 41 (supp S1): 78–92.
van der Hulst, H. (ed.) (2010), Recursion in Human Language, Berlin: Mouton de Gruyter.
van der Wal, S. (1996), Negative Polarity Items in Dutch and English: A Lexical Puzzle, Technical Report, University of Groningen.
van Fraassen, B. (1980), The Scientific Image, Oxford: Oxford University Press.
van Gelder, T. (1995), “What Might Cognition Be, if not Computation,” Journal of Philosophy, 95: 341–81.
Verhaegh, S. (2018), “Sign and Object: Quine’s forgotten book project,” Synthese: 1–22; https://doi.org/10.1007/s11229-018-1693-z
Vigliocco, G. and Vinson, D. (2003), “Speech Production,” in L. Nadel (ed.), Encyclopedia of Cognitive Science, London: Nature Publishing Group, vol. 4: 182–9.
Vigliocco, G., Butterworth, B., and Garrett, M. (1996), “Subject–verb agreement in Spanish and English: Differences in the role of conceptual constraints,” Cognition, 61: 261–98.
Vicente, A. (2012), “On Travis Cases,” Linguistics and Philosophy, 35 (1): 3–19.
Volenec, V. and Reiss, C. (2017), “Cognitive phonetics: the transduction of distinctive features at the phonology–phonetics interface,” Biolinguistics, 11 (SI): 251–94.
von Frisch, K. (1927/53), The Dancing Bees: An Account of the Life and Senses of the Honey Bee, English translation of Aus dem Leben der Bienen, 5th rev. edn, Springer Verlag.
Wagers, M. W. (2008), The Structure of Memory Meets Memory for Structure in Linguistic Cognition, PhD dissertation, University of Maryland.
Wagers, M. and Phillips, C. (2009), “Multiple dependencies and the role of the grammar in real-time comprehension,” Journal of Linguistics, 45 (2): 395–433.
Wagers, M. and Phillips, C. (2014), “Going the distance: Memory and decision making in active dependency construction,” Quarterly Journal of Experimental Psychology, 67: 1274–304.
Wagers, M., Lau, E., and Phillips, C. (2009), “Agreement attraction in comprehension: Representations and processes,” Journal of Memory and Language, 61: 206–37.
Waismann, F. (1952/62), “The Resources of Language,” in M. Black (ed.), The Importance of Language, Englewood Cliffs, NJ: Prentice Hall, pp. 107–20.
Walton, K. (1973), “Pictures and Make-Believe,” Philosophical Review, 82: 283–319.
Wasow, T. and Arnold, J. (2005), “Intuitions in linguistic argumentation,” Lingua, 115: 1481–96.
Watson, J. (1913), “Psychology as the Behaviorist Views it,” Psychological Review, 20: 158–77.
Watumull, J., Hauser, M., Roberts, I., and Hornstein, N. (2014), “On recursion,” Frontiers in Psychology, 4: 1017.
Wedgwood, R. (2007), “Normativism Defended,” in Contemporary Debates in Philosophy of Mind, Oxford: Blackwell, pp. 85–102.
Wegner, D. (2002), The Illusion of Conscious Will, Cambridge, MA: MIT Press.
Weinberg, S. (1976), “The forces of nature,” Bulletin of the American Academy of Arts and Sciences, 29 (4): 13–29.
Werker, J. and Lalonde, C. (1988), “Cross-Language Speech Perception: Initial Capabilities and Developmental Change,” Developmental Psychology, 24 (5): 672–83. doi:10.1037/0012-1649.24.5.672
Wetzel, L. (2006), “Types and Tokens,” Entry in Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/types-tokens/#WhaWor
Whyte, L. (1967), The Unconscious Before Freud, London: Tavistock.
Wiggins, D. (1997), “Languages as Social Objects,” Philosophy, 72 (282): 499–524.
Wilson, T. (2002), Strangers to Ourselves, Cambridge, MA: Harvard University Press.
Winch, P. (1958/90), The Idea of a Social Science and Its Relation to Philosophy, 2nd edn, London: Routledge.
Windelband, W. (1894/1915), “Geschichte und Naturwissenschaft,” in Präludien. Aufsätze und Reden zur Philosophie und ihrer Geschichte, Band 2, 5. erweiterte Auflage, Tübingen: J. C. B. Mohr (Paul Siebeck), pp. 136–60.
Wittgenstein, L. (1953/2016), Philosophical Investigations, 4th rev. edn, ed. and trans. by G. E. M. Anscombe, P. M. S. Hacker, and J. Schulte, Chichester (UK): John Wiley and Sons Ltd.
Wittgenstein, L. (1977), Remarks on Colour, ed. by G. E. M. Anscombe, Oxford: Blackwell.
Wittgenstein, L. (1980), Remarks on the Philosophy of Psychology, vol. II, ed. by G. E. M. Anscombe and G. H. von Wright, Oxford: Blackwell.
Wittgenstein, L. (1981), Zettel, ed. by G. E. M. Anscombe and G. H. von Wright, Oxford: Blackwell.
Woodward, A., Pancheva, R., Hacquard, V., and Phillips, C. (2018), “The anatomy of a comparative illusion,” Journal of Semantics, 35 (3): 543–83.
Woodward, J. (2011), “Data and phenomena: a restatement and defense,” Synthese, 182: 165–79.
Wright, C. (1989), “Wittgenstein and Theoretical Linguistics,” in A. George (ed.), Reflections on Chomsky, Oxford: Blackwell, pp. 236–64.
Wright, C. (2007), “Rule-following without Reasons: Wittgenstein’s Quietism and the Constitutive Question,” in J. Preston (ed.), Wittgenstein and Reason, Oxford: Blackwell, pp. 123–44.
Wright, L. (1973), “Functions,” The Philosophical Review, 82 (2): 139–68.
Yamada, J. (1990), Laura: a Case for the Modularity of Language, Cambridge, MA: MIT Press.
Yang, C. (2002), Knowledge and Learning in Natural Language, Oxford: Oxford University Press.
Yang, C. (2003), “Universal Grammar, statistics or both?,” Trends in Cognitive Science, 8 (10): 451–6.
Yolton, J. (1984), Perceptual Acquaintance, Minneapolis: University of Minnesota Press.

Glossary of idiosyncratic terms and abbreviations

(More standard technical terms are explained on pages indicated in bold face in entries for the terms in the index.)

Expressions in small caps represent properties; e.g., red represents the property of (being) red. Expressions will be mentioned in one or more of three ways, depending upon context: by double quotes, by italics, or simply by setting the expression off apart from the main text.

Curly (“{…}”) and double curly (“{{…}}”) brackets represent (where it matters), respectively, conceptual and non-conceptual content; e.g., {square} represents the concept expressed by “square”; {{square}}, the non-conceptual content that might be had by a representation in the visual system (see §4.3 for discussion).

As per phonological conventions, expressions in square (“[…]”) brackets represent phones; expressions in forward slashes (“/…/”) represent phonemes.

*, #, ??: Strings or indices (subscripts) prefixed with these symbols indicate unacceptability due to, respectively, presumed ungrammaticality, semantic anomaly, or questionable status. In a case where omission of some item renders a string ungrammatical, the item is enclosed in parentheses and prefixed with a “*”, e.g.: I know I should go home, but I don’t want *(to)

Abstruse Phenomena: My term for non-local, non-physical or non-instantiated/-able properties or phenomena, such as being a dinosaur, a sonata, or a (perfect) triangle (see §11.1).

Aspects: Chomsky’s (1965) Aspects of the Theory of Syntax, his first major published presentation of (an early version of) his theory, which is commonly referred to in this way.

CP: This is used in three different ways in different contexts: (i) complement phrases, e.g., that cats meow, which often serve as complements of verbs such as knows (see §2.2.4, fn25); OR (ii) ceteris paribus, “other things being equal,” a clause attached to a law to rule out interferences (see §3.4.3); OR (iii) “Central Processor,” the domain of general reasoning, as opposed to cognitive processes that occur in modules that are relatively “informationally encapsulated” (see §7.2.2).
(II-)CRT: A Causal-Computational-Representational Theory, typically of Mental Processes (see §4.5). The prefix of two “I”s captures the fact that a CRT is to be understood as both intensional (a computational procedure, see §3.2) and intentional (involving the properties of intentionality or “aboutness”; see §8.1).

Intentional and intensional: these are actually three exasperatingly homophonous expressions that figure in philosophical discussions: intenSional (non-extensional; see §3.2), intenTional (involving aboutness; see §8.1), and intentional (meaning roughly intended, seldom an issue in this book).

Intentional Object: the “thing,” real or unreal, that a representation or other mental state is “of” or “about” or represents. E.g., the man Samuel Clemens is the intentional object of “the author of Tom Sawyer” and of thoughts about him; the Greek Goddess Hera is the intentional object of “the wife of Zeus” and of thoughts about her (see §8.7).

Intentional Inexistent: the intentional object of a representation when there is no such object in the real world. Thus, Hera is an intentional inexistent (see §8.7).

LSLT: The Logical Structure of Linguistic Theory, Chomsky (1955/75), a ca. 500pp treatise Chomsky wrote while a Junior Fellow at Harvard, which is standardly referred to by its initials.

Meth-Dualism: Methodological Dualism, the view that mental processes are not amenable to the methods of natural science (see §11.3 and index).

(NC)SDs: (non-conceptual) structural descriptions: systematic descriptions of the linguistic properties of SLEs, “non-conceptual” if not deployed freely as part of the “Central Processor” of the mind (see “CP” sense (iii) above, §7.2.2 and index).

question-begging: Traditionally, a move in an argument that assumes without argument precisely what is under dispute; but it has recently come independently to mean merely to raise a question. Throughout this text, I use it with its traditional meaning.

SLEs (Standard Linguistic Entities): I use this simply as a convenient umbrella for entities routinely discussed in linguistics, such as words, syllables, sentences, discourses, phonemes, phones, and their property instances (but not allophones, or specific instances of phones, which may be a special case), where the differences between them will not be at issue (see Preface, fn6).

(VoC): the view that the linguistic intuitions that interest Chomskyans are those that are the “voice of competence,” a VoC, that is, a direct manifestation of features of the I-Language (see §7.1).


Name Index Adams, F.  351n.23 Adger, D.  xxv, 76n.42, 244, 286n.28, 327, 366–8, 369n.10, 374n.15 Aizawa, K.  351n.23 Allott, N.  xn.3, xin.4, xxv, 20, 36n.24, 43, 45n.3, 47n.6, 51n.15, 63n.28, 65, 69, 74, 89, 100n.8, 131, 163n.15, 181n.33, 203n.21, 233n.13, 282n.24, 320n.25, 327n.34 Almeida, D.  23 Almeida, R.  71n.35 Anscombe, G. E. M.  123n.35 Antony, L.  207n.22, 329n.35, 376n.17 Apperly, I.  143n.20, 266 Ariew, A.  152n.3 Aristotle  14n.1, 98n.4, 201n.19, 298, 301, 351, 355, 361 Arnold, J.  222n.2, 248n.31 Austin, J.  389, 389n.38 Aydede, M.  389n.38 Ayer, A. J.  191 Bach, K.  181, 222n.2, 357n.30 Bacon, F.  16n.4 Baillargeon, R.  20, 176, 266 Baker, G. P.  ixn.2, 3, 117, 119–21 Baker, M.  24n.13, 43n.33, 59, 73 Bambrough, R.  69n.33 Barber, A.  316n.22 Barney, H.  315 Beck, J.  165n.17 Berlinski, D.   ixn.2 Bermudez, J.  138n.13 Berwick, R.  39, 78–79, 100, 151n.1, 171, 178n.30, 297, 298 Bever, T.  111, 204 Bezuidenhout, A.  357n.30 Bilgrami, A.  388 Bishop 186n.2 Blake, W.  170n.23 Block, N.  265, 381n.25

Bloomfield, L.  13, 14n.1, 16, 19, 57, 58, 102–3, 106n.16, 133n.6, 192n.8, 196, 214, 308–9 Bock, K.  110n.19, 255–6 Boden, M.  110n.19 Boeckx, C.  17n.5, 78n.44 Bogen, J.  18 Bolhuis, J.  158n.12 Bond, Z.  256–7 Borg, E.  353n.27, 357n.30 Botha, R.  ixn.2 Braine, M.  160–1 Breheny, R.  181n.33 Brentano, F.  xii, xiii, 7, 262–4, 271–2, 288–9, 291, 332, 349, 351–2, 377, 380 Bresnan, J.  92 Brinton, L.  347 Brock, J.  42n.31 Brody, M.  92 Bromberger, S.  330n.36 Brooks, R.  272n.15 Brown, R.  41 Words and Things 361 Burge, T.  98–9, 215n.32, 236n.19, 264n.5, 265n.7, 266, 273n.16, 275, 287, 336, 344n.12, 350n.20, 354n.29, 359n.31, 365n.7, 389n.38 Burton-Roberts, N.  209n.26 Byrne, A.  306n.11 Cahen, A.  137n.13 Cairns, H.  235, 253–5, 314 Calder, A.  78 Campbell, K.  321n.28 Carey, S.  143n.20, 266 Carnap, R.  49n.9 The Logical Construction of the World 45 Carroll, L.  69n.33, 114n.25, 253 Carruthers, P.  71n.35, 187n.3, 266, 389n.38 Carston, R.  357n.30, 358 Cartwright, N.  126, 291, 332


430  Name Index Carus, P.  187 Chater, N.  ix, 43, 117, 121, 163n.16, 174n.26, 175, 266 Cherniak, C.  81n.48 Chomsky, N. Aspects of the Theory of Syntax  20n.8, 55, 56n.21, 57, 58n.22, 81n.48, 109, 202, 208n.25, 211, 268n.10, 309n.15, 427 Current Issues in Linguistic Theory 58n.22 Language and Mind  56n.20, 268, 273 Language and Politics 299 Lectures on Government and Binding 71 Rules and Representations 269 Syntactic Structures  47–50, 53–4, 57, 58n.22, 62, 208n.25 The Logical Structure of Linguistic Theory (LSLT)  47–50, 53–4, 57, 58n.22, 212, 267, 281, 328, 428 The Morphophonemics of Modern Hebrew  48, 58n.22 The Sound Pattern of English  62, 322 Chouinard, M.  159 Christianson, M.  121n.33 Churchland, P. M.  266, 271n.14, 272n.15 Churchland, P. S.  x, 44, 272n.15 Clark, E.  159 Clark, R.  158n.11 Clayton, N.  275 Clements, N.  50, 82 Cohen, Jon.  140n.15 Cohen, Jos.  xxv Collins, J.  xxv, xxvi, 7, 45n.1, 46n.5, 50n.12, 51n.15, 56n.20, 66n.31, 76n.41, 104n.14, 108, 111, 136n.12, 157n.10, 193n.9, 206–7, 218, 227n.8, 261, 263n.3, 264n.4, 265n.7, 271n.14, 276–8, 279n.22, 281–6, 293, 306n.11, 321, 328, 332, 335, 370, 374n.15, 384n.29, 385n.32, 386, 388n.37 Corbett, G.  66 Cormack, A.  58n.22 Cowart, R.  255 Cowie, F.  x, 42n.32, 159, 167n.18, 169n.20, 178n.29, 180n.32, 182 Crain, S.  40n.29, 90n.56, 254 Crimmins, M.  218 Crowther, C.  170n.23 Cudworth, R.  187, 380n.22 Culbertson, J.  248n.31

Culicover, P.  69 Curtiss, S.  42 Darwin, C.  64 Davidson, D.  xiii, 101, 119n.32, 126, 265n.7, 271n.14, 273, 342, 380–81 Davies, M.  114–15, 187n.3 Demopoulos, W.  147 Dennett, D.  xiii, 117, 122, 135n.8, 217, 264n.6, 271n.14, 282n.25, 344n.11, 381n.24 Derrida, J.  199, 252 Descartes, R.  xii, 10, 15, 102, 139, 178n.30, 192, 297, 375, 380n.22, 382–4, 386–7 Devitt, M.  vii, x, xi, xxi, xxvi, 4n.3, 58n.22, 97n.3, 110n.19, 133–5, 136n.10, 137, 140, 142–3, 169n.20, 184, 191n.6, 195–213, 216, 218, 220n.37, 221, 222n.1, 223–33, 234n.15, 236n.20, 238–40, 241nn.23–4, 242–54, 257n.38, 258n.39, 286n.28, 296nn.2–3, 300n.5, 316n.22, 321, 334, 353n.25, 361, 370, 374n.15 Dijkstra, E.  388n.36 Dik, S. The Theory of Functional Grammar 70 Dilthey, W.  273, 380 Donnellan, K.  348n.16 Dreben, B.  25n.15 Dretske, F.  265, 289n.35, 349, 365n.7 Dreyfus, H.  179n.31, 207n.22 Drożdżowicz, A.  233, 235 Duhem, P.  142, 296n.2, 340–1 Dummett, M.  136n.10, 380 Dwyer, S.  178n.30, 224 Eddington, A. S.  18 Egan, F.  7, 112n.21, 271n.14, 344n.11 Eimas, P.  250 Einstein, A.  ixn.2, 18, 383n.28 Elman, J.  163n.16 Epstein, S.  83n.51 Evans, G.  114n.24, 218, 287n.31, 292, 330n.36 Evans, N.  42, 73n.38 Evans, V.  xn.3, 24n.13, 42, 73n.38, 114, 121–3 Everaert, M.  158n.12 Everett, D.  43


Name Index  431 Fechner, G. T.  187 Fernández, E.  235, 253–5, 314 Ferreira, F.  255 Ferreira, V.  256 Fillmore, C.  69 Fitch, W.  51n.15 Fodor, J. A.  9, 18, 66n.31, 67n.32, 71n.35, 103n.13, 111, 122, 123n.36, 127n.40, 136, 144, 155–6, 178, 182, 204, 206, 224, 233n.13, 234, 239, 239–40, 249n.33, 250, 264n.4, 265, 278n.21, 280n.23, 292n.40, 300, 312–13, 318n.23, 336–7, 341–3, 344n.12, 346–55, 358n.30, 361, 363, 365nn.6–7, 368, 370–1, 373, 377n.19, 378, 384n.31, 386 Fodor, J. D.  36n.23, 55, 73, 75n.40, 165n.17, 170n.23, 182, 242, 268n.10, 280 Follesdal, D.  104n.14, 106n.15 Frankish, K.  390 Frazier, L.  242 Frege, G.  45, 53, 70, 188, 190–3 Freidin, R.  46n.6, 47n.6 Freud, S.  119n.32, 135n.8, 187 Fromkin, V.  36n.24 Fudge, E.  311n.18, 315 Gagliardi, A.  132, 157, 178, 284–5, 372, 374n.15 Galileo, G.  14, 17, 18n.6, 304 Gallistel, C. R.  114n.26, 148, 265n.7, 266, 275, 337, 365n.7, 384n.30 Garrett, M.  204 Gärtner, H.  79 Gazdar, G.  49n.11 Gerken, L.  163 Gleitman, L.  73n.35 Gödel, K.  46n.4, 51n.15, 98 Goldberg, A.  167, 180 Goldsmith, J.  318 Gomez, R.  163 Goodman, N.  xii, 4, 48–9, 103, 153, 164–8, 173, 174n.26, 189n.4, 196, 199, 278, 282, 297, 300n.5, 345n.13, 360 The Structure of Appearance 45 Graves, C.  224 Greenberg, M.  202n.20 Grice, H. P.  69, 181

Grimes, J.  266 Gross, S.  23, 152n.3, 156n.9, 228, 248n.31, 375n.16 Hacker, P. M. S.  ixn.2, 2, 117, 119–21 Haegeman, L.  61 Hale, M.  311–12, 316 Halle, M.  8, 58n.23, 236, 244, 250, 268, 278, 285–6, 318, 322, 329, 330n.36, 371–3, 374n.15, 376, 379n.21 The Sound Pattern of English  62, 322 Hamann, J. G.  187 Hamlin, J. K.  176 Hanlon, C.  41 Hansen, C.  361n.33, 378n.20 Hardin, C. L.  304, 305, 389n.38 Harley, H.  67n.32 Harman, G.  53n.17, 56n.20, 107, 110n.19, 113, 114n.26, 121–2, 323n.29, 348n.15 Harnish, R. M.  181 Harris, D.  357n.30 Harris, R.  68, 70n.34, 201n.19, 221n.38 Harris, Z.  45, 48, 59, 130, 308 Hauser, M.  78, 79n.47, 80, 254, 337 Hayes, C.  xn.3, 24n.13 Hayes, P. J.  387n.34 Heck, R.  137n.13 Heidegger, M.  380 Hempel, C.  56, 267, 389n.37 Higginbotham, J.  39, 90, 154, 189, 236n.19, 325n.33 Hilbert, D.  306n.11 Hobbes, T.  297, 360 Hockett, C.  314 Hoff, E.  40, 318 Hofmeister, P.  23n.10 Holt, L.  314 Horn, L.  181 Hornsby, J.  265n.8 Hornstein, N.  xxv, 79–80, 111 Horwich, P.  9, 201, 336, 353–5, 373, 378 Hume, D.  14–15, 17n.4, 164, 361, 380n.22 Hursthouse, R.  119n.32, 136n.10 Husserl, E.  218, 380 Idsardi, W.  133n.6, 307n.13 Isard, S.  253 Israel, D.  162n.14, 191 Israel, M.  33n.20, 67


432  Name Index Jackendoff, R.  44, 69–70, 78, 179, 199, 297, 299–300, 324, 362 Jakobson, R.  67 Jespersen, O.  314 Johnson, K.  383n.28 Kant, I.  15, 26n.16, 96–7, 182, 301nn.6–7, 339, 374, 390 Katz, J.  5, 178n.30, 184, 188–93, 221–2, 223n.3, 320, 339nn.3–4, 346–8 Kayne, R.  43 Keil, F.  20 Kemmerling, A.  291 King, A.  114n.26, 337 Klein, D.  163n.16 Klima, E.  82n.50 Knill, D.  374n.15 Knoll, A.  307n.13, 365n.7 Kosslyn, S.  290n.36, 331n.37 Koyré, A.  17n.5, 18n.6 Kremer, R.  384n.30 Kripke, S.  3, 9, 123–8, 201, 270, 273–4, 287, 293n.40, 346, 349, 359n.31, 361, 380 Kuehni, R.  304 Kuhn, T.  199, 300n.5, 324 Lakoff, G.  68–9 Lalonde, C.  250 Langacker, R.  153n.5 Foundations of Cognitive Grammar 70 Langedoen, D.  189n.5, 193n.9 Lappin, S.  162, 163n.16, 168–9 Lashley, K.  46n.4 Lasnik, H.  xxv, 39n.27, 45n.1, 47nn.6–7, 50n.13, 57n.22, 86, 87n.55, 107, 132n.4, 158n.11, 165–6, 284n.27, 288n.33 Laurence, S.  151n.1, 169n.20 Laver, J.  309, 313 Leddon, E.  159, 170 Leibniz, G.  xxx, 4, 152–3, 156, 172–3, 174n.26, 177, 187, 303, 306, 315 Lenneberg, E.  42n.32 Lepore, E.  341–2, 346 Levine, J.  93n.1, 382n.27, 384n.31, 386 Levine, R.  ixn.2, 42, 73n.38, 90n.56 Lewis, D.  14n.1, 101, 167, 199n.15, 207, 343n.10, 344, 381

Lewontin, R.  64 Liberman, A.  311–1, 314, 319, 320n.25, 377 Lidz, J.  132, 157, 159, 170, 180n.32, 182, 284–5, 372–3, 374n.15 Locke, J.  14–15, 304, 315 Loebell, H.  256 Loewer, B.  351n.23, 389n.38 Lohndal, T.  39n.27, 47n.6 Lotto, A.  314 Ludlow, P.  xxv, 228n.10 Lycan, W.  287, 381n.25, 388n.37 Lyons, J.  61 MacDonald, J.  320n.24 Mach, E.  138 Malcolm, N.  119 Manning, C. D.  163n.16 Manning, P.  163n.16 Mantzavinos, C.  381n.24 Manzini, R.  78n.45, 90 Marchant, J.  30 Margolis, E.  151n.1, 169n.20 Marr, D.  7, 112, 115n.29, 175n.29 Mates, B.  264n.5 Matthews, R.  49n.11, 147, 227n.9, 279n.22 Mattingly, I.  30n.25, 312–13, 314, 319, 377 Maynes, J.  23 McCarthy, J.  387n.34 McCawley, J.  68 McCourt, M.  357–8n.30 McDowell, J.  381n.24 McGilvray, J.  xxv, xxvi, 35, 80–1, 95–6, 157n.10, 189n.4, 199, 282n.25, 298–9, 323n.30 McGinn, C.  389 McGurk, H.  320n.24 McLaughlin, B.  140n.15 Meinong, A.  91 Melzack, R.  389n.38 Michotte, A.  20 Mikhail, J.  20, 97n.2, 143n.20, 178n.30, 266 Miller, G.  65n.29, 110–11, 253 Millikan, R.  287 Momma, S.  109n.17, 112, 235n.16, 284n.27 Montague, R.  ixn.2, 101 Musso, M.  44


Name Index  433 Nagel, T.  xxvi, 5, 186–7, 221, 265n.8, 273, 380, 381n.24, 382n.26, 384n.31, 386, 390 Neander, K.  63n.26, 389n.38 Neidle, C.  44 Nelson, M.  264n.5 Neurath, O.  141 Nevins, A.  43 Newell, A.  144n.22 Newmayer, F.  19n.7, 46n.5, 55n.19, 63n.26–7, 65n.29, 69, 70n.34, 71, 73, 153, 221n.38, 253 Newton, I.  17–18, 147, 181, 213, 287, 380n.22, 383 Nietzsche, F.  187 Ninio, J.  333 Nisbett, R. E.  135n.8, 187n.3, 266, 389n.38 Noveck, I.  181 Orr, E.  73n.37 Palmer, S.  133, 265, 271n.12, 374–5, 384, 385 Parsons, T.  291 Pateman, T.  224 Payne, T.  73n.37 Peacocke, C.  116, 138, 265 Pearl, L.  31, 40, 158n.11, 169n.21, 170, 174 Peirce, C.  56, 279 Penrose, R.  388n.37 Penzias, A.  125n.37 Pereplyotchik, D.  x, 101n.10, 113n.23, 130n.2, 207n.23 Perfors, A.  163, 171 Peterson, G.  315 Peters, P.  60 Phillips, C.  109n.17, 111–12, 203n.21, 235n.16, 284n.27 Piattelli-Palmarini, M.  155n.6 Pickering, M.  256 Pietroski, P.  viii, 17, 35n.22, 67n.32, 99n.5, 100n.7, 125n.37, 168n.19, 189n.5, 224, 232, 241, 339n.3, 356–8, 360n.32 Pinker, S.  xxv, 40–1, 78, 161 Platoxii,  20, 139, 153, 162, 178n.30, 187, 321, 375 Meno 161 Pollard, C.  92, 327

Postal, P.  ixn.2, 178n.30, 188–93, 320 Post, E.  46n.4 Preminger, O.  67, 83n.52 Preston, J.  186n.2 Priestley, J.  382n.27 Pullum, G.  49n.11, 170n.23, 176–7 Putnam, H.  17n.5, 99, 100n.7, 101n.10, 103n.13, 215n.32, 220, 222, 273n.17, 287, 296n.2, 300n.5, 342, 343, 346, 348, 354n.29, 359n.31, 360, 380 Pylyshyn, Z.  238, 249n.33, 265, 290n.36, 331n.37, 365n.6, 366, 368–9 Quine, W. V. O.  viii, xii, xx, xxi, xxvi, 2, 4n.1, 48–9, 54, 57, 57n.22, 58, 68, 71–3, 101–2, 105–106, 113, 114n.24, 116, 120, 121n.33, 130, 141–3, 144n.22, 153, 158, 167, 172, 175, 191n.6, 192n.8, 198, 208–9, 214, 222, 224, 229, 264n.5, 270n.11, 278, 287n.30, 291, 300, 301n.6, 302, 304, 309n.15, 321n.27, 336, 338–45, 347–8, 359, 361, 363, 364n.4, 368, 374n.14, 379–80 “The Problem of Meaning in Linguistics” 345 “Two Dogmas of Empiricism”  105, 245–6, 339, 340n.5, 341n.6 Radford, A.  xxv, 327–8 Rai, M.  xxv Ramsey, W.  265n.8, 266, 271n.12, 376n.18, 762n.15 Rawls, J.  97n.2, 178n.30 Recanati, F.  357n.30, 361 Reed, C.  314 Reid, T.  297 Reinhart, T.  82n.50, 90n.56 Reiss, C.  311–12, 316 Rey, G.  1, 25n.15, 46n.5, 112n.21, 123n.36, 125n.37, 126n.39, 142n.19, 144n.22, 145, 147, 152n.3, 155, 186n.2, 189n.5, 222n.1, 232n.12, 233n.13, 238n.22, 264n.6, 271n.14, 290n.36, 308n.14, 329, 338n.2, 342n.8, 343n.10, 344n.12, 360n.32, 361n.33, 363n.1, 365nn.6–7, 381nn.24–5, 388n.36 Richards, W.  374n.15 Riley, J. R.  197n.14


434  Name Index Ritchie, R.  60 Rizzi, L.  43 Roberge, Y.  36n.24 Roberts, I.  43n.33, 86, 87n.55, 158n.11, 181n.33 Roeper, T.  51n.15 Rogers, J.  xxv Rorty, R.  300n.5, 360 Rosenbloom, P.  48, 50n.12 Ross, J. R.  27, 59 Rovane, C.  388 Rozin, P.  266 Russell, B.  45, 53, 70, 82n.49, 139, 291n.40, 305, 315 Problems of Philosophy 305 Ryle, G.  3, 13, 101n.11, 103n.12, 117, 119–21, 135, 383 Saffran, J.  162, 168 Sag, I.  92, 327 Samet, J.  172n.24 Sampson, G.  16, 170n.23, 176 Sanborn, S.  318n.23 Sandler, W.  44 Sapir, E.  8, 323 Sauerland, U.  79 Saussure, F. de  14n.1, 19n.7, 199, 296n.3 Schelling, F. J. W.  187 Schieber, S.  162, 163n.16, 168 Schiffer, S.  126 Scholz, B.  171n.23, 176–7 Schopenhauer, A.  187 Schütze, C.  23n.10, 222n.2, 227n.9, 233, 234n.15 Scott, R.  20 Searle, J.  x, xxvi, 5, 47, 74n.39, 127–8, 145n.24, 146, 186–7, 221, 265n.7, 274n.18, 357n.30 Seligman, M.  266 Sellars, W.  302 Shanahan, M.  387n.34 Shea, N.  376n.18, 389n.38 Shieber, S.  49n.11 Shoemaker, S.  343, 378 Simon, H. A.  144n.22 Skinner, B. F.  2, 102–4, 106n.15, 120–1, 157, 276, 365n.6 Verbal Behavior  55, 103

Smart, J. J. C.  290n.36 Smith, B.  197n.14, 233n.13, 252 Smith, N.  xxv, 20, 31, 43–4, 45n.3, 47n.6, 51n.15, 58n.22, 63n.28, 65, 69, 74, 131, 170, 171, 203n.21, 233n.13, 282n.24, 327n.34 Soames, S.  5, 184, 188, 191n.6, 193, 195, 221, 320 Socrates 161 Speas, M.  51n.15 Spelke, E.  20, 143n.20, 266 Sperber, D.  100n.8, 181, 357n.30 Sprouse, J.  23, 31, 40, 79, 111, 158n.11, 170 Stabler, E.  114n.24, 225 Stanley, J.  135n.9 Steedman, M.  254 Stent, G.  268 Sterelny, K.  133n.5, 195n.11, 204–5, 209–10, 296n.3, 361 Stetson, R. H.  309 Stich, S.  280 Strawson, G.  5, 135, 186n.2, 221, 265n.7, 389n.38, 390 Strawson, P.  356, 390 Svenonius, P.  327 Tarski, A.  46n.4, 98, 301n.6 Textor, M.  100n.8, 233n.13 Thompson, D.  65, 78n.44, 81n.48 Tomasello, M.  x, 63n.27, 167, 169n.20, 179–82 Townsend, D.  111 Travis, C.  357n.30, 358 Trubetzkoy, N.  67 Tsimpli, I.-M.  42, 44, 170, 171 Tulving, E.  266 Turing, A.  46n.4, 80, 81n.48, 103n.13, 143–5, 145n.23, 211n.28, 380n.22, 386 Uriagereka, J.  65n.29, 79, 158n.11, 166 Valian, V.  41, 160, 348n.15 van der Hulst, H.  51n.15 van Gelder, T.  272n.15 Verhaegh, S.  145n.22, 341n.6 Vigliocco, G.  37, 246n.28 Vinson, D.  246n.28 Volenec, V.  312n.19


Name Index  435 von Frisch, K.  197 von Humboldt, W.  52, 189 Wagers, M.  109n.17, 111 Wall, P.  389n.38 Wasow, T.  222n.2, 248n.31 Watson, J.  102 Watumull, J.  51n.15 Wedgwood, R.  140n.15, 381n.24 Wegner, D.  389n.38, 390 Weinberg, S.  17n.5 Weinreich, M.  320 Werker, J.  250 Wetzel, L.  311n.18, 330n.36 Wexler, K.  78n.45, 90 Whyte, L.  187 Wiggins, D.  5, 185–6, 221 Williams, A.  180n.32

Williamson, T.  135n.9 Wilson, D.  100n.8, 181, 187n.3, 357n.30 Wilson, R. W.  125n.37 Wilson, T. D.  135n.8, 266, 389n.38 Winch, P.  381n.24 Windelband, W.  380 Wittgenstein, L.  3, 9, 22n.9, 25n.15, 53, 117–21, 123, 126, 201, 358, 361n.33, 388n.36, 389 Woodward, J.  18, 38n.25 Wright, C.  126n.39, 227n.8, 236n.19 Wright, L.  63n.26 Wynn, K.  176 Yamada, J.  42 Yang, C.  75, 152n.4, 157, 163, 178, 183, 280, 284 Yolton, J.  297


General Index Numerals in bold-face indicate pages on which the term is explained or discussed in detail abduction/abductive 56, 154, 170, 176, 205, 248, 261, 279, 279n.22 models  74, 279n.22 processes 74 reasoning 264 aboutness 97n.3, 261–3, 366, 377, 384n.31, 428 See intentionality abstract-algebraic reading/strategy  7, 276, 335 abstraction  15, 38, 65, 70, 107, 109n.17, 112, 125, 167, 170, 175, 194, 204, 215n.32, 310n.16, 317 abstract objects (abstracta)  8, 184, 184n.1, 188–9, 281, 283n.26, 320–22, 325n.33, 331 abstruse properties/phenomena  122n.34, 244, 363, 365n.7, 369, 371, 376–8, 388, 427 acoustic(s)  311n.19, 319 accounts  8, 57n.22 cues 314 difference 250 events  307–8, 321n.27, 327n.34, 333 features 311n.18 languages 188 patterns  315, 319 phenomena  214, 313, 319–20, 331 properties 367 reality 322–3 realization 206 signals  235, 311, 367 signatures 318 stream  xiii, 5, 184, 321, 324 waves 308–9 acquisition  36n.24, 101n.10, 113, 167, 346 model  55, 194 of belief  105 of grammar  2, 154, 176–7, 178n.29, 284

problem 81 stability 40 see also language acquisition AI, see artificial intelligence allophones  308, 320n.25, 428 ambiguity  39, 190, 227n.9, 232, 238, 255, 286, 290n.37, 323 See also homophony analysis  24n.13, 44, 49, 59, 60, 73n.38, 106n.16, 112, 136, 140n.16, 155n.7, 175, 179, 187, 222, 242, 283, 314, 320, 338, 343, 351, 355 analytic(ity) 9, 105–6, 142n.19, 190, 192, 222, 337–9, 341–2, 345 analytic/synthetic distinction (a/s)  338, 345 claimsan  347–8, 353n.26, 358 data  338, 345–7 necessities  70, 108, 188 philosophy/-ers  70, 90, 381 sentences  339, 346 truths  191, 339–40 Analytical Behaviorism  103n.12, see also behaviorism answerability  264, 377 anti-intentionalist  272, 278, 297, 381 anti-realism/-ist  8, 9, 296–300, 360 arguments/claims  297, 300 philosophical proposals  8n.7 (anti)-realists 216 a priori  140, 143n.21, 199n.16, 222, 223n.3, 310n.16, 378, 380 analytic 342 justification 78n.44 knowledge  142, 162, 191, 227, 338 reason  64, 140, 141, 162, 342 status of logic and mathematics  345 superficialism 127 Aristotlean teleology  384n.31


General Index  437 arithmetic  51n.15, 124, 131, 136, 141, 172, 189–90, 338, 341, see also geometry, mathematics articulatory accounts 8 apparatus  311, 329 gestures  231, 309, 317, 329, 331 idealizations 316–19 movements  312, 385 objects 324 phenomena 330–1 systems  310, 312n.19, 317, 333, 368 artifactors  125, 145, 270 artifacts/artifactual machines  123, 125, 270, 274, 275n.19, 299, 301, 308, 318–19, 364 artifactual computers  123, 144, 145, 379 artificial code 44 grammar 163 intelligence (AI)  3, 275 languages 162 aspect  71, 96, 251, 263, 278, 283, 377 Aspects (confirmation) model  13, 71, 74, 75, 121, 137, 146, 156, 170, 194, 276, 280 associationism 199 asymmetric dependency  318n.23, 344n.12, 350–4 atomistic semantic theory  352 autonomous grammar 62n.25 principles 67 sciences 188 autonomy of syntax  2, 62, 65–7, 70, 71, 108, 180, 252, 256, 256n.37 baby babble/talk  41, 132 Baconian approach to science  16 banal fallibilism  340 basicality 354–5 Basicality View  353 BasicAsymmetries  9, 10, 349, 354–6, 359, 363, 364n.3, 371, 373, 374n.14, 376–7, 390 Basic Properties  353–5 Bayesian(ism), Bayesians  169, 173–4, 375n.16 algorithm 375 GenStatism 170 statistical inference  153, 169

statistical systems  171 strategies  168, 244 beads on a string (model of external language)  57, 214, 308–9, 314–15 bee dances  198, 200, 201, 203 behavior  2, 8, 25, 38, 65, 93, 103, 107, 117, 119n.32, 124–5, 145n.24, 147, 176, 186, 199n.15, 207, 212, 229, 251, 268, 272, 332n.10, 353n.25, 356, 368–9, 384, 384n.31 behavioral data 230n.11 dispositions  19n.7, 97n.3, 106, 123, 172 evidence 117 indeterminacy 175 performance 124 psychology 369 skill 136 behaviorism  45, 46n.4, 102–3, 105, 117, 120, 143, 269, 345, 385 behavioristic, behaviorists  xx, 2, 15, 49n.9, 93, 103, 107, 108, 136n.10, 158, 162, 175, 277, 341n.7, 345, 366 account of language  55 conception of a skill  136 framework 142 psychology 272 strategies 354 terms 175 belief/beliefs 20, 31, 68, 105, 121, 190n.32, 129, 140n.16, 142, 142n.18, 161n.13, 175, 186–7, 212, 222, 224, 229, 264, 266, 275, 290, 339, 340–1, 343n.10, 345–8, 355, 358 binding  31, 86, 90, 120, 226n.7 constraints 339 domains  73, 116, 180, 217 phenomena  31, 82–3, 86, 90 principles  90, 217 theory  71, 79, 91, 159 biology, biological  31, 47, 81n.48, 95, 99, 131, 141, 151, 178n.30, 344n.12, 388n.36 bodies 384 creatures 17 domains 152 gender 66 hypothesis 222 investigations 65


438  General Index biology, biological (cont.) phenomena 63 processes  xii, 14n.2, 99 properties 112n.22 systems  47, 97, 131, 388 theories 4 traits 81 blindness  266, 314 body  20, 81, 341, 382–5, 387–90, see also mind–body problem bounding theory  71 brains  15, 23, 54, 63, 116, 121, 125n.37, 140, 177, 189, 300, 326n.33, 329, 332, 364–5 brute causal  152n.2, 243 explanations 119n.32 process  152, 177 triggering  75, 155–6, 280 view of linguistic processing  199n.15 see also quasi-brute causal process Cartesian access to truths  228 appeals to a priori reason  142 contact mechanics  39n.26, 364, 387–8 explanation 225 introspections 6 knowledge 224 mind–body problem  383 Case  66, 72, 78n.43 Case theory  72, 79 causal chains  232, 355 connections 342 dependencies 352 explanations 265 import 203 manifestations 227 powers 324 processes  153, 204 reality  8, 115n.28, 116 relations  199, 321 roles 203 sensitivity 115–16 structure 308 theories of reference  199, 201 computational–representational theory (CRT) 428

c-command  46, 73, 82–89, 91, 115n.27, 116, 120, 180, 208, 213–14, 216–17, 226n.7, 240, 242, 246 center embeddings  35, 36n.23, 111, 194, 227n.8 central processor (CP)  138, 225, 228, 230n.11, 231–2, 236n.19, 238n.22, 240–1, 243, 245–7, 249–50, 258, 427–8 ceteris paribus (CP) clauses  123–4, 125n.37, 126–7, 354 checking  78, 82n.49, 217 chemistry  64, 80, 100, 217, 344n.12 childes (corpus)  159, 170, 175 children  xiv, 2, 4, 26, 31, 33, 36, 40–2, 44, 48, 51n.15, 55–6, 60, 67, 73–5, 81, 87n.55, 94, 96, 101n.10, 104, 113, 117, 119–20, 129–34, 137–9, 146, 151–2, 156–61, 163, 165–7, 170–1, 173–80, 182, 188, 206–7, 220, 226, 230n.11, 240, 250, 253, 266–7, 276, 278, 284, 313, 321, 330, 346, 368, 370–3, 982 Chomskyan theories  xxv, 1, 6, 8, 18, 31, 46, 78, 91, 95, 102, 111–12, 126n.39, 130, 133–4, 162, 199, 202, 209, 280, 307, 336, 357, 390, see also core theory, generative theories Chomsky(–Schützenberg) heirarchy 49n.11 cognition  3, 79n.101, 101, 106, 137, 142–3, 144n.22, 165, 168, 178n.30, 192, 257, 299, 302 cognitive abilities  139, 141, 152, 390 capacities  3, 129, 143, 275 constraints 168 explanations  112, 119 limitations 42 penetration 234n.15 processes/processing  42n.32, 119, 137, 152n.2, 177, 427 psychologists 271n.12 psychology  104, 144, 274, 280, 369 revolution  121, 269, 380n.22 science  ix, x, xi, 23, 108, 121, 212, 248n.31, 262, 271n.12, 272, 345, 381 scientific literature  389 scientists  x, 139, 385 structures  119, 136, 366


General Index  439 systems  79, 91, 96, 155, 157, 268, 357, 366, 376 theories 2 cognitive-functional grammar  xi cognitive-functional linguistics  179 cognitive-functionalists 221 cognize/cognizing  120, 134–6, 146, 148, 261, 267, 269–70, 273–4, 279n.22 color  xiii, 8, 14, 96, 117, 185, 211, 215–16, 217, 220, 238, 261, 303n.8, 304–5, 306nn.11–12, 307, 313, 319–20, 324, 333, 364, 377n.19, 386 Combinatory Categorical Grammar  xi communicative efficiency 310 functions  71, 180 purposes of language  23 relations 200n.17 comparative illusion  37, 38n.25 competence/performance distinction  2, 19–20, 35, 93, 97, 109, 198, 348 complement phrase (CP)  60n.24, 84–7, 427 comprehension  109, 146, 182, 187, 229, 231, 233, 275, 367, 382 computation  xi, 8, 10, 59, 79, 111, 127, 143–8, 151, 168, 211–12, 215, 217, 234n.14, 244n.26, 246, 270, 273n.16, 276, 280, 284–5, 300, 307, 312n.19, 315n.21, 316, 324, 326–7, 330, 337n.1, 364n.5, 365n.7, 367, 368, 376–7, 387 computational procedures  98, 146, 204, 326, 363n.2, 428 computational–representational of mental processes theories; (CRT)  6, 143–8, 244, 269–70, 275, 324, 363, 375, see also II-CRT of mental processes  144 as an explanatory strategy  4 of psychology  251 II-CRT  363, 375, 377 computers  99n.6, 118n.31, 123, 144–5, 163, 189, 211n.28, 217, 269, 274–5, 310, 315n.21, 379, 380n.22, 382n.26, 386 conceptual analysis 222 categories 239 competence  156, 248n.30

confusions 389 connections  349, 378 content  129, 137–8, 223, 241, 427 see also non-conceptual content conceptual-intentional (CI) interfaces 92 intuitions 241n.24 knowledge 3 necessity  78, 80n.47, 141 conceptual–intentional system (CI)  66–7, 70, 78, 79, 92, 109, 191n.6, 358n.30, 359, 366–7 conceptualism  188–9, 192 confirmation holism  142, 143, 340–2 Connection Principle  186 consciousness  186–7, 269, 384n.31 constrained homophony  39, see also ambiguity construction grammars  x, 180 constructiv-ism/-ist  48, 297 grammar x strategies 4 contact mechanics  39n.26, 383–4, 387–8 contextualism 357–8, see also I-contextualism contractions  28, 160, 313, 333 control theory  71 conventional vs. non-conventional grammars 94–7 core theory (Chomsky’s)  xii, xxv, 1, 3, 21, 25, 39n.26, 46, 91, 93, 100n.8, 111n.20, 153, 181, 219–21, 267, 270, 276, 297 creativity  38, 148, 162, 257 CRT, see “computational-representational theories” critical period crucial data  xi, xxv, 1, 13, 15–19, 21, 23n.11, 25n.15, 39n.26, 104, 113, 120, 128, 132, 147, 204, 207, 210 Darwinian selectionist explanations 63n.26 Darwinian speculations  8 Darwin’s problem  2, 75, 80 deafness  42, 44, 314 de dicto  287–8, 342 defaults 359 deflationism 354n.28 deflationist strategies  354 de re  286–7, 342, 356n.9, 377, 379

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

440  General Index derivational operations  92, 267n.9 origin 68 structure 114 theory of complexity  111 descriptive adequacy  3, 130 determiner phrases (DPs)  61, 84–5, 88, 178, 226n.7 (Dev)  232n.12, 233, 239, 247–9, 254, 257–8 deviant causal chains  232, 355 discovery procedures  16, 133n.6 disjunction problem  9, 123n.36, 127n.40, 336, 349, 351, 372–3 disjunctions  164, 370n.11, 372 dispositions  19n.7, 20, 97n.3, 104–6, 123, 131, 143, 172, 199n.15, 315, 319–20, 341, 353, 386 dispositional accounts  8 dispositional strategies  315 domain-specific computational system  xi information 179 modularized machine  179 processes  157, 161 speech perception  239 structures 161 syntactic module  180 universal grammar  181 “doorknob/doorknob” problem  155–6, 178, 182, 368 dynamical system  272 economics  20, 300, 386 efficiency  20, 79, 81, 309–10 E-language 93, 97–102, 105, 195, 199n.15, 200, 208, 219–20, 226, 230, 334 eliminativism 106n.15, 271–2, 273n.16, 276 ellipses  30, 36, 283n.26 empirical approaches 380 data, evidence  225, 252, 290n.36 hypothesis  131, 142 justification 142 methods of science  224 problems  153, 170, 172 psychology 378 sciences  229, 338

empiricism  4, 6, 6, 140, 301n.7, 339, 365n.6 empiricist conception of language acquisition  xii, 14 proposals  292n.40, 380 theory 172 empiricists  15–16, 17n.5, 139–40, 154, 167 empty terms, intentional representations  98n.4, 291n.40, 361n.33 entrenchment 167 epiphenomena  48, 217, 217n.34 epistemic argument 306 challenges 113 criterion 352 issues 113n.23 priority 196n.12 epistemology  vii, 4, 100n.9, 129, 137, 139–43, 167, 199n.16, 228, 258, 323, 341–2, 360 E-tokens 198n.15 Euclidean cube 374n.14 figures  xxxivn.8, 220n.36, 281, 306, 364 shapes 216 space 161n.13 triangles  xiii, 185 existential assumptions 290 commitment 7, 33, 264, 287, 377 quantifier  84n.53, 127 explanatory(ily) adequacy  xiv, 3, 4, 10, 47, 54–6, 91, 127n.40, 129–3, 146, 188, 284, 321, 363, 373, 382 basic epistemology  3, 129, 137, 139–42, 167 project  129, 140, 215, 263n.3, 332, 336, 378 role  xxvi, 8, 9, 65, 146, 148, 315, 316n.22, 320n.25, 340, 344n.12, 345, 363, 376n.18 strategy  xi, 4, 82, 355, 373 explicit representations  114, 115nn.26–7,148 Extended Standard Model  47, 56n.21, 59 extensionally equivalent grammars  3, 99, 107, 114n.24, 116, 204

OUP CORRECTED AUTOPAGE PROOFS – FINAL, 08/09/20, SPi

General Index  441 external physical phenomena  58, 130 physical utterances  184 social conception of language  94 social phenomenon  14 externalism  6, 200, 330 externalist analyses 320 commitments  295, 353n.27 conception of language  13 considerations 215n.32 explanatory framework  376n.18 hegemony 288 insights 197n.13 interests 219 pretense 281 social view of language  20n.8 views 262 faculty of language  131n.3, 178 broad (FLB)  78, 80 narrow (FLN)  78, 80 features  57–60, 62, 70, 78, 96, 101 Fechner–Stephens laws  265 Fibonacci patterns  65 fictionalism  8n.7, 124, 332n.38 fine thoughts  1n.1, 15n.3, 23 see also WhyNots first-person 350n.19 authority 236 privileges 385 folieism  8, 296, 331–3 frame problem  387 Frazier–Fodor theory  242 free will  39n.26, 335n.39, 387, 389 functionalism 381n.25 functionalist accounts  180n.32, 378 approaches 344 theories of language  67 theories of mental states  344 Galilean abstraction  107, 175 approach to theorizing  125 conception of liguistic competence  21 idea  xi, 31 idealization  13, 124, 316 method  17, 19, 65, 78n.44, 390, 760

  theories  16, 91
  view of science  15
game theory  20
garden path phenomena  227n.9, 254–5
GB/P&P  46n.6, 71–2, 75, 86, see also Government and Binding; Principles and Parameters
General Relativity  141, 388
General Statistical (GenStat)
  approaches  3n.3, 162
  procedures  158
  proposals  4, 170n.22, 181, 183
  strategies  151, 172
generative
  approach  43, 93, 181n.33
  grammar  xxvi, 23, 42–3, 45, 51n.15, 52–3, 62, 72, 73n.38, 74n.39, 82, 86, 104n.14, 110–11, 129–30, 133, 151, 160–1, 189, 215
  model  59, 71
  semantics  70, 71n.34
  theories  2, 71, 92, 278
generativists  36, 43, 51n.15, 53, 82, 110, 180n.32, 204, 217n.34
GenStatists  152–3, 162, 169, 173, 175, 179, 181, 183, see also General Statistical
geometric
  figures/forms/shapes  193, 324, 374n.14
  hypotheses  375
  states  377
geometry  xii, 141, 153, 161–2, 172, 178n.30, 215, 283n.26, 375, see also arithmetic, mathematics
Government and Binding (GB)  47, see also GB/P&P
Great Vowel Shift  219
Gresham’s Law  300
Grimm’s Laws  219
grue  164–9, 174n.26, 365n.6
having vs. representing properties  242
Head Directionality  72, 78n.45, 216
Head-driven Phrase Structure Grammar (HPSG)  xi, 92, 327
hermeneutic approach  268, 273, 380, 381
Hidden Markov Model  315n.21
hierarchical phrase-structure grammar  163
hierarchical structures  46n.4, 78, 173, 179, 194
holism  9, 142–3, 144n.22, 340–3, 355
homonymy  39n.28
homophony  39
homunculus/homunculi  121–2, 144
horizontal vs. vertical explanations  378
hyper-intensionality  263–4, 377
hypothesis confirmation  56, 154, 276, 370
hypothesis testing  75, 113, 155–6, 211, 277–80, 371, 376
hypothesized rules  74
hypothetical-deductive model  154
I-contextualism  357, 359n.31, 361
idealization  8, 17n.5, 19–20, 65, 78n.44, 101n.10, 124–5, 127, 189n.5, 202n.20, 316–19, see also Galilean
ideology (of a theory)  47n.7, 300, 301n.6, 344n.12, 364n.4, 368–9
I-expressions  357
I-grammar  70, 248
I-language  8, 19n.7, 24n.12, 63n.26, 79n.47, 79–80, 93, 95, 97–99, 101, 108–9, 111, 112n.22, 130, 133, 138, 143, 152, 177, 191, 194–5, 198n.15, 200–4, 214, 215n.32, 217n.33, 219–220, 223, 226–8, 230–2, 233n.13, 241, 247, 269, 284, 293, 295, 309, 321, 328, 334, 336–7, 357–8, 359, 370, 373, 376, 428
I-meanings  201n.19, 359
II-CRT: intensional and intentional CRT (q.v.)
implicit representation  114n.26
independence of grammatical competence and general intelligence  41, 171
indeterminacy of translation  105, 106n.15, 270n.11, 278
induction  153, 162, 164, 167–8, 173, 180
inexistents, see intentional inexistents; perceptual inexistents
inference to the best explanation  92, 154, 176
inflectional phrases (IP)  xxxi, 60–1, 76, 84, 86–8, 174n.27, 178, 237, 245, 281, 293
informational content  41
Informational Criterion  115
innate ideas  xii, 156, 261, 269, 278
innateness  ix, 95, 155, 157, 165, 206
Inscrutability of Reference  361
intelligence  41–2, 96, 122, 139, 144, 153, 171–2, 182
intension  108, 112, 211, 343
intensional  xiin.5, 97, 98, 102, 264, 375, 428
  approaches  97n.3
  function  152
  procedures  105, 112, 115, 363n.2, 428
  rules  101
intentional
  ascription  xiii, 123, 135n.8, 264n.6, 343n.11, 381
  circle  343–4, 352
  content  xii–xiv, 145, 263, 274, 280, 290, 291n.39, 292–3, 331, 352, 354–5, 363n.1, 378
  inexistents  xiii, xiv, 7, 8, 264, 289, 291–3, 302, 334–5
  objects  8, 266, 330
  representation  92n.57, 251, 262, 264n.4, 289, 295, 370
intentional object/representation confusion  330–1
intentionalist  7, 8, 56n.20, 266–7, 269–70, 272, 276–9, 286, 329, 335–6, 344–5, 359, 363–4, 366, 372, 378
intentionality  vii, xii–xiv, xxvi, 1, 6, 7, 9, 34, 97n.3, 114n.26, 116, 122n.34, 135n.8, 145, 251, 261–5, 267, 270n.11, 271–2, 273n.16, 278–9, 284, 286, 289, 295, 297, 326, 327n.34, 336–8, 343, 343n.13, 351–2, 354, 363, 366–7, 369n.10, 371, 377–9, 382, 384n.31, 388, 389n.38, 390, 428
intentions  63n.26, 125, 126n.39, 212, 266, 270, 290, 308, 313, 319, 320n.25, 358, 378
internalism  9, 299
internalist
  account  197n.13
  alternative to the social view  15
  computational psychology  200
  conceptions  99–100
  explanations  117, 121
  explanatory approach  14
  interests  219
  psychological theory  216
  study of language  286
  theories  24n.11, 108, 359n.31
internalist vs. social conceptions  13
International Phonetic Alphabet (IPA)  311n.17, 317–18, 329
interpretative semantics  67, 70
intuitions  xxvi, 6, 23n.10, 109, 139, 141–2, 148, 199nn.15–16, 222–8, 230–3, 234n.15, 237, 238–9, 241, 243, 248nn.30–1, 249n.34, 251, 252, 254n.35, 258, 301n.7, 339, 348, 428
IPA, see International Phonetic Alphabet
I-semantics  9, 101, 358n.30, 360, 362
island(s) (constraints)  27–8, 72, 111, 120, 160, 180, 194, 202, 226
I-syntax  101
Kanizsa figures/triangle  263, 287n.31, 292, 307, 333–4, 361n.33
Kantian
  idealism  26n.16, 199
  insight  26
  metaphysics  199n.16
  theme  46
Kant’s Problem  182
know(ledge) how/that  135–6, 135n.9, 136n.10
knowledge of language  xii, 3, 132, 135–6, 220n.37, 224, 229, 325, 368
Kripkenstein  123, 123n.36, 126n.39, 127, 349, 381n.24
language
  acquisition  x, xii, 14, 41, 42, 56, 71, 75, 95, 104, 110, 115, 120, 131, 132–3, 143, 146, 153–4, 156–7, 161–2, 168–71, 174n.26, 175, 177, 178n.30, 182, 241, 267, 277n.20, 279, 279n.22
  games  118, 302, 356, 361
  processing  204, 240, 254, 266
language of thought (LOT)  206, 245, 251, 254, 308n.14, 331, 337
Law of Effect  103, 366
laws of
  form  65, 78n.44
  logic and mathematics  345
  nature  79
  physics  78n.44, 126, 147, 384n.30
  reason  172
  thought  188, 193
  truth  191, 193
learning/learned
Leibniz problem  174n.26, 177
Leibniz’s Law  303, 306, 315
levels of representation  53, 56n.21, 79n.46, 212, 217n.34, 243, 267n.9
Lewisian program  207n.23
lexical
  categories  58, 60, 60n.24, 241
  decomposition  67n.32, 70
  items  40n.28, 50, 58, 60, 66–7, 77, 235, 268, 325, 347, 356
  polysemy  232
  properties  10
  semantics  326
Lexical Functional Grammar  xi, 92
Lexical Parameterization Hypothesis  78n.45
lexicon  58, 235
licensing  33n.20, 214
licensors  33, 65, 84–5, 116, 213, 216, 246
linguistic
  categories  40n.30, 101, 116, 169, 177, 182, 226, 282, 284, 376
  communication  185
  competence  1, 2, 15, 20, 20n.20, 24, 25, 71, 72, 92, 93, 95, 98, 104, 126, 133, 171, 181, 187, 193, 203, 209, 229, 235, 257, 258, 275, 363, 390
  conception vs. psychological conception  5, 184, 195
  conceptualists  192
  data  56, 131–2, 188, 230n.11, 268
  intuitions  xxvi, 6, 25, 109, 142, 222, 223n.3, 225, 227, 232–3, 238, 243, 258, 428
  judgments (linguistic)  22, 233n.13, 240, 241n.23
  ontology  289, 293–4, 308, 321, 335
  perception  111, 235–6, 242, 244, 248, 254, 266, 293, 363, 370
  properties  132n.4, 168n.19, 199n.15, 205, 230, 232, 243, 245–6, 248–51, 254n.35, 283, 316n.22, 367, 369, 427
  structure  xvii, 70–1, 83, 94, 134, 212, 368
  theory  x, xii, xiii, xiv, xxxv, xxvi, 48, 65, 82, 108, 130–2, 135n.9, 136, 151n.1, 153, 185, 196, 198–9, 202, 205, 209, 211, 214, 223n.3, 227n.9, 277, 282, 293–4, 297, 318, 329, 333, 336, 338, 353n.26, 362–3
  universals  ix, x, 42–3, 133, 190
linguistic reality (LR)  5, 6, 194–8, 202, 205, 207, 208n.25, 211, 224n.4, 226, 316n.22
linguists’ paradox  44
linguo-semantics  9, 297, 336–7, 363, 373
licensors (for NPIs)
“little linguist,” child as  129, 133–4, 179
Lockean secondary properties  236n.19
logical
  construction  45, 47, 203
  constructivism  48
  structure  48, 53, 70, 166
logical form (LF)  70, 236, 283, 325
logically simple languages  44
Logical Positivists  16, 17n.5, 103, 213, 338–9, 341, 345
logicians  ixn.2, 98, 328
Mach diamond  138, 241n.24, 248n.30
materialism  344n.12, 383
materialist
  alternatives  383n.29
  approaches  382n.27
  commitments  383n.31
  drive for unification  344n.12
mathematical
  abilities  96
  formalisms  63
  function  98
  generation  110n.19
  ideal  78n.44
  linguist  45
  objects  145n.23
  realism  188
mathematics  xii, 17n.5, 26, 35, 46n.4, 51, 64, 98, 102, 114n.26, 140, 142n.19, 153, 162, 172, 184, 188–91, 193–4, 224, 292n.40, 339n.3, 340, 345–6, see also arithmetic, geometry
McGurk effect  320n.24
McX response  291, 324n.31
meaningless syntax  252
meanings  19, 40n.28, 68, 98, 126n.39, 191n.6, 201–2, 205, 220, 316n.22, 327, 339, 341–2, 349, 356, 359
meaning without truth  356, 361
mechanical principles  74, 194
Meinongian realm  332
mental
  architecture  74
  ascription  140n.15
  capacities  64, 96
  life  146, 251, 361, 365
  phenomena  103, 122n.34, 215, 262, 271, 384n.31, 385n.32, 388
  processes  ix, xii, 3, 118, 120n.31, 121–2, 130, 144, 144n.22, 147n.25, 178n.31, 186, 211, 265, 388n.37, 427–8
  realists  3, 184
  representation  xii, xiii, 211, 243, 267, 269, 288, 307, 323, 325–6, 335
  states  xiv, 97n.3, 99, 102–3, 111, 186–7, 211, 263, 265n.7, 270, 278n.21, 301, 344, 363, 369, 382, 384–5
Mentalese, see language of thought
mentalism  xiii, 269
mentalistic conception  128, see also psychological conception
merge  76–81, 82n.49, 83, 92, 109n.18, 194, 217
meta-linguistic
  claims  227
  concepts  139, 241
  descriptions  250
  intuitions  252, 258
  judgments  241n.23
  perceptual reports  248n.30
  representation  242
metaphysics/-cal  36, 192, 199n.16, 291, 323
  claims  xiv
  commitments  355, 384n.31
  idealism  218
  indifference  113
  issues  113n.23, 323n.29
  view of explanation  17n.15
meta-scientific eliminativism  271n.14
methodological dualism/-ists (meth-dualism)  xiii, 3, 123n.36, 269, 272, 363, 378–9, 380, 428
methodological dualists  5, 272
mind–body problem  1, 10, 363, 379, 381–3, 388, 389
Minimalism/Minimalist Program  45, 46n.5, 47, 53n.17, 65, 70, 75, 78, 79, 80, 82n.49, 83, 92, 112n.22, 115n.27, 204
modality  17–3, 153
modularity of mind  239
modules  71, 86, 143, 165, 180, 233, 234, 236, 238, 246, 248n.30, 249n.33, 313n.20, 333, 358n.30, 427
morality  20, 178n.30
morals  97, 143, 266
morphemes  5, 40n.30, 50n.14, 57, 60–1, 68, 73, 96, 151, 155, 182, 205, 308, 373
morphology  40n.30, 106
morpho-phonological errors  40
morpho-phonological system  78
morpho-phonology  40n.30
mysteries  120, 389
native speakers  xxxii, 6, 21, 23, 24n.14, 26, 131, 223, 228–29, 239, 248n.31, 251–2
nativism  x, xii, 56n.20, 81n.48, 82, 107, 151–7, 161, 169n.20, 170n.22, 176, 178n.30, 179, 184, 225, 299
  Mad Dog Nativism  155
nativist  x, xii, 4n.4, 56n.20, 96, 151, 155–7, 180, 182, 370
natural kind  20, 99–100, 136, 167, 219–20, 274
natural language (NL)  ix, xiv, 19, 21, 31, 38, 44–5, 48–50, 51n.15, 52–4, 60, 69, 77, 82, 93–6, 97n.3, 101, 113, 122, 129, 155, 168, 176, 179, 188, 191–3, 195, 197–200, 201n.18, 206, 219, 243, 254, 292nn.41–2, 307, 308n.14, 324n.31, 331, 334, 337, 356, 359
naturalized/naturalization  164, 344, 351, 354
necessity  78, 80n.47, 141, 172, 190, 222, 345
NCSDs: non-conceptual SDs (q.v.)
negation  34n.21, 37, 85, 252
negative data  19, 41, 158–9
negative polarity items (NPI)  33, 67, 84, 86, 89, 116, 120, 173, 213, 216, 242, 246, 339
nervous system  100, 103, 365, 383n.27
neural
  collapse  328–19
  entities  xiv
  implementations  280
  items  246, 325, 326
  nets  244n.26
  organization  81n.48
  patterns  134
  phenomena  329, 364
  speculations  80
  states  xiv, 243, 281, 289, 293, 302, 321n.27, 329, 331, 335, 352
  structures  135, 299
  wiring  82n.48
neuralism  325, 329–330
Neurathianism  141, 341–42
neurobiology  128
neurophysiology  368, 383n.29, 384
new organology  20
new paradigm  x
New Riddle of Induction  166
noise  15, 93, 102–6, 130, 132n.4, 182, 229–32, 234, 239, 244, 249–50, 257, 263, 285, 309–10, 341n.7, 373
noisy transduction  311
nominalism  184n.1, 189n.4, 321
nominalist  57n.22, 184, 187, 189n.4, 195, 208
non-conceptual content  129–130, 223, 240–1, 425
non-conceptual structural descriptions (NCSDs)  138–9, 240–1, 244n.26, 247, 248n.30, 249, 250n.34, 251–2, 257–8, 428, see also structural descriptions (SDs)
non-conventional grammar  2, 94–5
non-preservative cases  306
non-sentential phenomena  36n.24
non-standard models  224
normativity  xii, 120, 126, 128n.41, 140, 304
noun phrases (NP)  26, 31, 50–2, 54, 60–1, 72, 77–8, 84–9, 114–7, 144, 159, 214, 228, 233, 237, 245, 255, 281–2, 282n.24, 294, 325n.32, 329–30, 369
NPI, see negative polarity item
null-subject  74, 78n.45, 285
ontological
  argument (Devitt’s for LR)  208
  clarifications  327
  commitments  198, 208, 212n.30, 302, 306
  criterion  307, 315
  issues  xiv, 310n.5
  options  320
  problems  295
  realism  199
  relativisms  199
ontology  xiv, 47n.7, 190, 216, 293–5, 296, 300, 301, 322n.29, 368
  of colors  211
  of economics  300
  of physics  301, 364n.4
  of representations  292
  of SLEs  xiv, 330–1
  see also linguistic ontology
over-intellectualism  137, 240
parasitic gaps  30–1, 160, 170, 190, 207
parsing  xiv, 109n.17, 112–4, 116, 136, 138, 203–4, 233, 235–6, 238nn.21–2, 241–2, 248–9, 251, 253, 254n.35, 255, 258, 266, 284–5, 381
Peircean abduction  56n.20, 154, 279
Penrose triangle  25, 247
perception  ix, xi, xiii, 26, 46, 102, 111, 115n.28, 116, 117, 177n.28, 182, 233–4, 235, 239–40, 242, 244n.26, 245–6, 248n.31, 250–1, 254n.36, 255, 256n.37, 258, 261, 262n.2, 263, 266, 284, 289, 320, 325n.33, 332n.38, 333, 344, 363, 367, 370, 374, 376n.18, 381, 383n.27, 388, 389n.38, 389, see also linguistic perception, speech perception
perceptual
  content  363
  illusions  265
  inexistents  xiii, xiv, 264, 291–2, 331–2, 333
  reality  250, 254, 321–22
performance  19–20, 34–5, 38, 43, 93, 97, 110–2, 113, 124, 128n.41, 153, 175, 180–2, 198, 202–4, 208n.25, 210, 216n.32, 217, 219, 233, 235, 275, 322, 326, 348, 357, 370, 387, 390
phenomenal
  experience  45
  item  242
  qualities  241
  world  300n.7
phenomenology/-ical  45n.5, 140, 233n.13, 241
philosophy/-ers/-ical  45n.5, 142, 233n.13, 241
PHON(E)  326–27, 328–29
phonemes  xiii, 57, 94, 130, 163n.15, 182, 211, 214, 230, 231, 240n.33, 294, 308–9, 310n.18, 313–4, 319, 320n.25, 323–24, 329, 373, 427–8
phonemic
  analyses  16
  boundaries  250
  distinctions  250
  features  309, 317
  phenomena  308, 318–9
  primitives  280–1
  representations  311
  strings  314
phones  57, 214, 308, 310, 317–8, 328, 427–28
phonetic
  analysis  314
  context  320n.25
  element  216n.32
  features  312–3, 326, 327
  forms  214, 295, 326
  input  162
  noises  231–2, 239, 249
  phenomena  309
  properties  231–2, 246, 268, 326
  representations  285, 311, 316, 322–3
  shapes  236, 250, 312, 371
  symbols  328, 367
  transcription  268, 372
  units  309, 314
  value  214, 295
phonetics  40n.30, 107, 158, 231, 309, 311n.19, 316–7, 373
phonological forms (PF)  40n.28, 70, 79n.46, 96, 151, 162, 220
phonology  39n.28, 40n.30, 58n.23, 62, 70, 100, 106–8, 190, 208, 220, 222–13, 231, 238–9, 250, 256–7, 257n.39, 311n.17, 316–7, 323n.29, 329, 347–8, 360, 365, 369, 373
phrase structure  46, 47n.7, 50, 53, 59, 114, 175, 208
phrase structure grammar  xi, 61, 92, 115, 163
physical phenomena  xii, 58, 122, 130, 189, 303, 364n.4, 370
physics  xxxvi, 15, 17, 20, 64–5, 78n.44, 80, 100, 117–8, 122n.37, 128–29, 135–6, 147, 153, 178n.30, 181, 189, 191, 280, 296n.2, 300, 305, 323, 340, 344n.12, 362, 364n.4, 364–9, 370n.11, 383n.28, 383–4, 388
Pirahã language  43
Platonism  184n.1, 187–8, 194–5
Platonistic
  entities  184, 221, 223n.3
  linguistic reality  194
  objects  190
  objections  188
  structures  194
Platonists  5
Plato’s Problem  55, 71, 153, 161
polysemy  232, 356, 358, 360, 362n.34
positive polarity items  84
Positivist(s)  17n.5, 103, 213, 338, see also Logical Positivism/-ists
  ambition  191, 192n.8
  doctrine  105
  metaphysics  199
  project  338
  skepticism  103
poverty of stimulus (PoS)  4, 153, 157, 169, 169n.20, 170, 176–7
pragmatic
  appropriateness  23n.10, 257
  cases  31
  elements  232
  factors  101
  focus  32
  functions  71
  issues  17, 65
  phenomena  33n.20, 108
  purposes  66
pragmatics  31, 65, 70, 100, 104, 153, 175, 181, 204, 306
prescriptivism  185
pretense  xxxii, 213–7, see also representational pretense
primary linguistic data  56, 132–3, 188, 230n.11, 268
principles vs. rules  48, 74–5, 177
Principle A (of binding)  87, 89–90, 159
Principle B (of binding)  87–8, 90
Principle C (of binding)  87, 89
Principles and Parameters (P&P)  42–3, 47, 62, 71, 73–6, 78n.45, 137, 170n.22, 178n.29, 194, 212, 213, 216n.4, 273, 278, 284, 371, see also GB/P&P
PRO (Big PRO)  xivn.6, 71, 206, 245, 282–3
probabilistic
  computations  376
  confirmations  373
  derivations  284
  generalization  174n.26
  inferences  285, 377
  reasoning  xiv, 376–7
  weighing  75, 280
probabilities  162–4, 169, 170n.22, 174, 244, 246, 370, 372, 375n.16, 375
probability theory  169, 362, 390
problem solving  41
processing rules  109, 197–8, 203–4
process nativism  170n.22, 178n.30, 179, see also quasi-brute process nativism
pro-drop  72n.36, 368
productivity  38, 59, 113, 162
projectibility  164
projectible predicates  151, 153
propositional attitudes  229, 262, 272, 278n.21
psychofunctionalism  381n.25
psycholinguistic research  75, 113n.23, 254
psycholinguistics  55, 113, 204, 223, 249, 266, 362
psychological
  conception  xi, xiv, xxvi, 2, 3, 54, 55n.19, 108, 112, 184, 187, 195–6, 201, 207
  explanation  63, 123, 140, 210, 261, 267, 286n.29, 344–5, 378
  reality  5, 17, 49n.9, 115, 116, 186, 196–7, 205, 211, 224n.4, 296n.4, 322–24, 332, 379n.21
psychologism  190
psychologists  ix, x, xv, 5, 8n.8, 20, 46n.4, 102, 110, 117, 145, 185, 215–6, 266, 271n.12, 300, 331, 377
psychology  ix, 7, 54, 140, 193
psychophysics  265
psycho-semantics  9, 337, 362
pure intentional content  291, 291n.39, 292–3, 331
quadding  123, 126
qualitative character  386
quantum mechanics  141, 383n.28, 388
quasi-brute causal process  177
quasi-brute process nativism  153, 161, 176
question-begging  25n.15, 428
Quinean
  challenges  116, 339, 356
  deflationist strategies  354
  demand for reduction  344
  doldrums  261
  epistemology  144, 228
  holistic empiricism  4, 6
  indeterminacies  108
  opponents  25n.15
  problems  102, 174, 177
  proposals  346
  worries about meaning  379n.21
Radical Behaviorism  103, see also behaviorism
Ramsey sentences  343–4
Ramsification  343n.10, 355, 381
Rationalism/-ists  ix, xii, 4, 15, 140, 142, 152, 154, 161–2, 178n.30, 269, 277, 379
rationality  139, 156, 178n.30, 264, 377, 379, 386
realism/-ists  vii, 3, 29, 185, 188, 195, 199–201, 207, 215, 296–8, 305–6, 323, 324
recursion  51n.15, 51–3, 80, 80n.47, 162, 210
recursive
  capacity  81
  component  193n.9
  grammar  62n.24
  I-language  376
  merge  79, 80–1, 194
  operation  79, 92
  phrase structure  53
  procedure  51n.15
  rules  51
  specification  109
reductionism  340, 343n.12, 378
reference  118, 164, 199, 201
referential
  opacity (opaque)  264, 287
  success  254
  transparency (transparent)  264
  uses of terms  348
reflexive pronouns  32, 73, 89, 170
regularities  ix, 16, 43, 95, 102, 128, 146, 173, 175–6, 205, 207, 211, 220n.37, 369
relations between forms  39, 171
representation  106, 107
representational
  levels  53, 56n.21, 79n.46, 212, 217n.34, 242, 267n.9
  pretense  xiv, xxvi, 184–5, 213, 216n.32, 218, 220n.36, 280, 290n.36, 292, 293, 324, 334
  theories  92, 144, 207n.22, 217, 267n.9
  vs. derivational views of grammar  91–2
response dependence  377n.19
revisability  9, 340–9
R-expressions  87–8
rule following  3, 123
rules and structures  3, 113
rules of grammar  69, 95, 107–8, 113, 136, 172, 176
Saussurean arbitrariness  220
schematism  107, 134, 182, 374
Scientific Behaviorism  103n.12, 117, see also behaviorism
secondary properties  14, 236n.19, 296n.2, 304, 306–7, 335n.39, 377n.19
SEM(E)  326–7, 330, see also PHON(E)
semantics  viii, 6, 31, 40n.30, 46, 49n.9, 54, 62, 62n.25, 65, 68, 70, 70n.34, 101, 106–7, 125, 179, 191, 219n.35, 222, 234n.15, 254, 256, 258n.39, 274, 287, 292n.41, 316, 336–7, 342, 344, 347, 349, 356, 357, 358n.31, 359–61, 373, 376n.18, see also linguo-semantics, psycho-semantics
sensitivities as BasicAsymmetries  371
sensitivities to abstruse phenomena  364, 365n.7, 369
sensory
  base  45
  evidence  169
  experience  46, 213, 340n.5, 376
  impressions  15
  input  74, 299, 374
  intake  132
  modalities  333
  perception  182
  predicates  174n.27
  states  229
  stimulation  104
  systems  331n.37
sensory–motor (SM) interface  79n.46
sensory/motor
  material  182
  primitives  342
  system  78
set theory  76n.41, 141, 189
sign language  44, 80–1
Simple Bodily Naturalism (SBN)  384n.31, 384, 387–8
skepticism  3, 32n.31, 49n.9, 67n.31, 100, 100n.9, 102–3, 107, 117, 135, 142n.19, 186, 186n.2, 197n.14, 204, 221–2, 224, 261, 272, 278n.21, 295, 314, 316n.22, 318, 339n.3, 342, 351, 354n.28, 374n.15, 382n.26, 386
skeptics  ix, xii, 3, 7, 90, 129, 135, 143
skills, see knowing how
SLEs, see standard linguistic entities
slips of the ear  254, 256
social conceptions “of language”, see usage based linguistics
speech perception  239, 253, 268, 319, 371
speech production
stability of acquisition  40
Standard Aspects Model  47, 54
standard linguistic entities (SLEs)  xiv, xxvi, 5–10, 78, 116, 188, 194, 198, 201, 205, 209, 211–18, 218n.34, 220n.36, 234, 242, 261–2, 280, 282–3, 283n.26, 285, 288–9, 293–6, 302, 307, 308n.14, 311nn.17–18, 314–15, 316n.22, 319–21, 324–5, 328–35, 364, 370–1, 373, 376–8, 428
statistical approaches  3, 162, 168, 169n.20, see also General Statistical (GenStat)
stimulus data  41
strings of words  1, 15, 21–2, 25, 34, 83n.51, 98, 113, 173, 230
Strong Minimalist Thesis  79
structural
  patterns  ix, 309
  priming  254–5
  relations  46, 82, 238
  rules  203
structural constraints on meaning  34
structural descriptions (SDs)  48, 55, 109, 129–30, 134, 233, 236–7, 240, 243, 247, 251, 326, 374, 428, see also non-conceptual structural descriptions (NCSDs)
structuralism  45
Structure from Motion  133
subjacency  46, 72
subset problem
Superficialism/-ist  3, 93, 103n.12, 105, 119n.32, 121, 127, 153, 174
  approaches  xi
  objections  117, 187, 380
  premises  121n.33
  views  117
symbols  46, 49n.11, 50, 58, 62n.25, 70, 108, 144–5, 203, 208, 265, 281, 297, 318, 326–7, 349, 352, 354, 367, 368–9, 373, 376n.19, 427
syntactic
  analyses  16, 43, 242, 379n.21
  cases  27, 31
  features  50, 57
  parameters  100, 151
  proposals  46, 279
  reflexes  31, 67, 339
  structures  62, 70, 82, 220, 226, 253, 298, 366, 369
  system  1, 67, 68
syntax  6, 31, 40n.30, 41, 45, 46, 49n.9, 62, 65–8, 70, 71, 80, 82n.50, 90n.56, 101, 106–7, 108, 134, 158, 171, 179–80, 199, 201n.18, 205, 207n.23, 208, 221–23, 237, 239–40, 252, 254–7, 258n.39, 293n.42, 316, 324n.31, 326–7, 330, 347–8, 360, 367, 369, 373, see also autonomy of syntax
Tadoma  314
Tarskian satisfaction  357
teleology/-ical  384n.31
  accounts  63
  explanation  64, 180
  issues  65
  theory  70, 381n.31
teleo-tyranny  4, 62–3, 153, 180, 201n.18
thematic roles/theta roles  66, 72, 77n.43
theory of mind  103n.13, 176, 182, 248n.30, 266
Theta Criterion  72
Theta theory  72
Third Factor  80–1, 112n.22
tokens  128, 189, 198, 201, 208–9, 213, 216, 228, 243, 246, 282, 282n.24, 285, 288, 293, 308, 310n.17, 312, 319–20, 325, 325n.33, 328, 329–32, 349–50, 354, 357, 362n.34, 369, 373, 376n.19
transducers  365–9
transducible  9, 366
transduction  9, 231, 310–11, 316, 366–7, 370n.12, 371
transformational grammar  53, 59–60, 110
transformations  52–3, 59–60, 62, 73, 91, 111, 121
Travis cases  358
tree structures  xivn.6, 52, 58, 78, 130, 189, 206, 213–14, 217, 226n.7, 307, 329, 377
triggering  75, 155, 178, 194, 280, 370
Turing Machine  60, 62n.25, 122, 143–5
Turing Test  274
unacceptability  25, 134, 174, 427
  judgments  22
  reactions  21–23, 106, 173, 227
  responses  xiii, 2, 83, 106, 120, 174
unconscious
  attitudes  135n.8
  beliefs and desires  186
  cognitive capacities  143
  knowing  161
  mental processes  186
  states  135, 187
universal grammar (UG)  ix, x, 4, 24, 40, 43–4, 59, 64, 72, 81, 120, 132, 151–4, 157–8, 162, 167n.18, 169, 170–1, 175, 177–9, 181–2, 205–6, 220, 233, 285n.27, 372–3
Usage Based Linguistics  xi, 63, 70, 153
usage based strategies  179
use of language  xi, 4, 16, 31, 62, 100, 109, 116, 120, 354, 387
Use/Mention (UM)  8, 58n.23, 213, 215n.31, 246, 282, 325–31, 335
veil of ignorance  188, 192
verb phrases (VP)  50, 52–4, 60–1, 72, 78, 85, 92, 114–15, 208, 212, 213–4, 233, 234n.14, 237, 281, 283, 319, 323
Verificationist Theory of Meaning  341
Verstehen tradition  382
Voice of Competence (VoC)  6, 222–5, 228, 232, 236, 239–40, 241, 252, 258, 428
Voice of Performance  233
WhyNots (see also fine thoughts)  xxv, 1, 15, 21–28, 31, 35, 38, 40, 63, 71, 82–3, 90, 92–5, 106, 111, 113, 116, 134, 157–60, 162, 173–4, 180, 201n.18, 206–7, 210, 221, 226, 234, 253
Williams syndrome  42n.31
working epistemology  3, 4, 129, 139, 141, 143, 341–2, 360
world-making  199, 297, 300n.5
X-bar theory  60