Lukáš Zámečník
Investigations of Explanatory Strategies in Linguistics
Processing and publishing this monograph was made possible by the generous financial support of the Faculty of Arts, Palacký University Olomouc: IRP_FF_2020a Strategic development; FPVC_2018_5 Models of non-causal scientific explanations in quantitative linguistics.
ISBN 978-3-11-071267-4
e-ISBN (PDF) 978-3-11-071275-9
e-ISBN (EPUB) 978-3-11-071280-3

Library of Congress Control Number: 2023931068

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the internet at http://dnb.dnb.de.

© 2023 Walter de Gruyter GmbH, Berlin/Boston and Palacký University Olomouc
Cover image: Matveev_Aleksandr/iStock/Getty Images Plus
Typesetting: Integra Software Services Pvt. Ltd.
Printing and binding: CPI books GmbH, Leck

www.degruyter.com
To my parents Julie and Vojtech
Foreword

(. . .) speaking of linguistic law in general is like trying to pin down a ghost.
Ferdinand de Saussure, Course in General Linguistics (translated by Wade Baskin)
All we have to do is read a few good books recommended to us by friends and teachers, and we never again lose the feeling of futility and inappropriateness when we are led to write a book ourselves. Since I know how many better authors there are who, for various reasons, do not have the opportunity to show their knowledge, argumentative skills, invention or creativity, I want to apologize to them all. I apologize to all the good authors for daring to write a book, one moreover published in a language that should guarantee it a wider circle of readers. From a professional point of view, I want to apologize to all the linguists whose works I analyze and comment on in this book, as well as to all the linguists who follow these works and understand them much better and in much greater complexity than I do. I am not a linguist – I have always been interested in the philosophy of science, in its analytic form and mainly in relation to physics. Therefore, I choose from the works of the classics of linguistics those passages that themselves represent a reflection on the presented linguistic theory. And that is why I am most attracted to those quantitative linguists who deliberately carry out philosophical reflections on their own theories, convinced that this will enable them to improve their approach to linguistics. Despite the careful work of the friends who read the manuscript, and of the reviewers, proofreaders, translators and editors, a number of imperfections remain in the book, for which I am solely responsible. Whatever quality the book still manages to convey would not be possible without the many good souls I have had the honor of meeting in my life. I thank Peter Grzybek for providing information on polemical texts focused on the methods of quantitative linguistics, and for his consultations.
Thanks to Luděk Hřebíček (and his wife), whose comprehensive library of quantitative linguistics has provided me with abundant non-electronic materials. Thanks to Reinhard Köhler for valuable discussions and lending a number of important resources. Many thanks to Martina Benešová and Tyler James Bennett for substantial proofreading of the English translation of the manuscript. I thank Jan Andres, Marcello Barbieri, Martin Beneš, Dan Faltýnek, Colin Garrett, Vladimír Havlík, Jan Kořenský, Daniel Kostić, Ladislav Kvasz, Ľudmila Lacková, Tonda Markoš, Jan Maršálek, Vladimír Matlach, Jiří Milička, Adam Pawłowski, Alex Reutlinger, Claudio Rodríguez, Martin Zach and all the colleagues who educate and inspire me. I thank Radek Čech, with whose conception of scientific theory I do not agree, but it stimulates my
thinking. I thank all the students who have attended my courses in philosophy of science and models of explanation in linguistics and who have moved me closer to the beliefs, ideas and some solutions which have been reflected in this book. Special thanks to Barbora Jurková and Pavel Baránek. Many thanks to Birgit Sievert, Kirstin Börgen and Matthias Wand as well as to the reviewers Sheila Embleton and Emmerich Kelih who together helped bring the book into the world. I thank my family, Petra for the patience with which she let me write, and Šimon for his questions – especially those that inspired the examples I use in the book.
Contents

Foreword VII

1 Resolving the dilemma 1
1.1 How to read this book 6

2 Philosophy of scientific explanation 10
2.1 Difficulties with models of explanation after Hempel 14
2.2 Contemporary solutions to models of explanation 23
2.2.1 The perils of non-causal explanations 25
2.2.2 Typology of non-causal explanations 31
2.3 The principle-based model of scientific explanation 36

3 Systemic explanation in linguistics 47
3.1 The missing panchronic view 48
3.2 In the name of the principle of the analysis 59

4 Formal explanation in linguistics 74
4.1 Building linguistics as a science 78
4.2 Explication of some models of formal explanation in linguistics 93
The first interlude: Herdan’s Language as choice and chance 102

5 Functional explanation in quantitative linguistics 115
5.1 General notion of functional explanation in linguistics 117
5.2 System-theoretical linguistics 124
5.2.1 What is a linguistic law? 137
5.2.2 The register hypothesis and the principle of invariance of the linguistic system structure 141
5.3 The unified approach in quantitative linguistics 146
5.3.1 The struggle with logical empiricism 149
5.3.2 Difficulties in grasping the linguistic law 154
The second interlude: The principle of compositeness 164
5.4 The diversity of theories in quantitative linguistics 170
5.4.1 The origin of linguistic law 172
5.5 Functional explanation in system-theoretical linguistics 178
5.5.1 Reassessment of the structural axiom 195
5.6 Beyond functional explanation: Topological explanation in system-theoretical linguistics 207

6 Conclusion 218

Appendix 223
References 267
Persons index 279
Subject index 283
1 Resolving the dilemma

I would like to communicate without using words or gestures, and just perceive everything that was in the brain of my interlocutor, like a photograph.
Édouard Levé, Autoportrait (translated by Lorin Stein)
The linguist is exposed to a dilemma in creating theories.[1] He or she has to choose between the purity of his or her discipline and the explanatory nature of his or her theory. He or she cannot have both, at least not until now. Therefore, we enunciate the dilemma as follows: If the conceptual foundations of the theory are to be purely linguistic, then this is possible only when the theory becomes a linguistic description without any explanatory power. If a linguistic theory is to be explanatory, then this is possible only when it moves to another discipline and bases its conceptual foundations on it.
The dilemma needs to be clarified. It is certainly true that most theories in most scientific disciplines require instruments borrowed from another scientific discipline. It is difficult to imagine a discipline that would not use any kind of logical calculus, mathematical formalism or conceptual borrowings. Here we mean something deeper – the difference between a description and an explanation.[2] Physics does not use logical and mathematical apparatus as sui generis explanatory tools.[3] For physics, this apparatus represents a framework in which theories can be formulated, although the separability of this framework from the physical content may not always be unambiguous.[4] Based on the analysis of selected representative classical linguistic texts, we dare say that linguistics, as a scientific discipline, was at least historically always in a situation where it formulated new theories in order to explain some linguistic phenomena. It approached the game of scientific rationality, which prevailed in the second half of the 19th century,[5] and which we will reflect upon only from the beginning of the 20th century.

When we analyze texts by de Saussure, Hjelmslev, Chomsky, Herdan or Köhler (and others), we always encounter efforts to clearly define their scientific methods, as well as to anchor their logical tools and mathematical borrowings. In the Cours de linguistique générale, de Saussure defines linguistics in contrast to science by rejecting the panchronic view of law. Within the synchronic point of view, de Saussure applies his own mathematical intuitions, related to the graph theory and topology of a time when the first formalizations of these areas of mathematics were being born – intuitions to which quantitative linguists would refer at the end of the short 20th century. Hjelmslev, in Omkring sprogteoriens grundlæggelse, builds a functional model of grammar based on the principle of analysis, with references to the debates of analytic philosophers related to the nascent philosophy of science. In the case of Chomsky, in the 1950s we find in Syntactic Structures the very belief that linguistics should be built in the manner of physics (cf. Chomsky 2002, 65), as a discipline capable of creating theories that provide explanations of linguistic phenomena. Chomsky apparently had the idea that he was succeeding in making a Newtonian turn in linguistics, from descriptions to explanations, just as Newton had raised descriptive kinematics to the level of the first explanatory theory of mechanics by revealing the causes of motion.[6] Unfortunately, all three theories, which we have chosen as exemplars, have taken on different forms of interpretation over time. Structuralism (both de Saussure’s and Hjelmslev’s) has been given some form of immortality, not as a corroborated theory, but as a way of describing linguistic phenomena which cannot be refuted.

We can understand the language as a system, but we cannot explain[7] the essence or origin of this system – in the sense of explaining the current state and evolution of the system, the original mathematical intuition failed to be developed beyond the means of description.[8]

Notes
[1] I owe awareness of this dilemma to Dan Faltýnek. Without my discussions and long-term intellectual encounters with him, none of the following lines would have been possible.
[2] Everything would change if we allowed a certain loosening of the concept of scientific explanation, which is typical in the current philosophy of science – we therefore exclude from our considerations the concept of metaphysical explanation (metaphysical grounding, see Bliss, Trogdon (2014)) and mathematical explanation (explanation in mathematics, see Mancosu (2018)). Both of these approaches to the concept of explanation deserve more attention, but we have no room for them. However, these concepts are not important for our further considerations, because in the line of our considerations we do not move towards analytic metaphysics.
[3] There will, of course, be philosophers of science who do not agree. Marc Lange is the most prominent figure promoting the thesis of the explanatory power of mathematical facts towards physical, biological and other facts (see Lange 2017).
[4] See, e.g., the problem of mathematical abstractions examined in Morrison (2015).
[5] We have used the vague term “scientific rationality”, but at the moment, for a historical introduction, it does not matter. Daston and Galison try to define the historical origin of the concept of scientific objectivity (see Daston, Galison 2007). And, of course, Michel Foucault relates the roots of modern philology to the turn of the 18th and 19th centuries (see Foucault 2002, 305–329).
[6] The given description of the origin of Newtonian mechanics is, of course, fragmentary and simplified. For a more complete picture, the reader can refer, for example, to Torretti (1999).
[7] In the philosophy of science, the distinction between understanding and explanation is a vast area of research (see von Wright 1971), which we will not deal with here.
[8] Köhler mentions linguistic structuralism as a source of systemic thinking about language, which relies on mathematical models (see Köhler 1986, 6).
Generativism is the linguistic mainstream through which linguistics is today in a phase of normal science. Although it should be falsifiable, we are not witnessing any accumulation of anomalies that could lead to a revolution in linguistics in the foreseeable future.[9] It is distinguished from structuralism by its topicality and its harmony with practice. While structuralism has already been absorbed into the terminology of the humanities, generativism is more closely connected with the formal and natural sciences, and as we will see (in chapter 4.2), it succeeds in building on the fields of cognitive linguistics and neurolinguistics. However, we believe that in these interdisciplinary linguistic theories generativism is no longer decisive sui generis, and the explanatory potential again rests on a cooperating non-linguistic theory. Paradoxically, with regard to Chomsky’s original physical inspiration, an analogy between generativism and pre-Newtonian physics, specifically pre-Copernican astronomy, seems to me more appropriate today. Like that astronomy, Chomsky’s generativism is beneficial for the development of formal methods (i.e. the creation of new formal grammars for building generative and transformational grammars) and in empirical work (classification of the grammars of existing languages in danger of extinction). Pre-Copernican astronomy facilitated the development of mathematical methods (perturbation theory, Fourier analysis) in the field of observable astronomical phenomena and achieved a high degree of accuracy in its predictions (see Hanson 1960). To be consistent and helpful, we present both structuralism and generativism as theories that can still be called explanatory. We will, therefore, consider the concepts of systemic and formal explanation, and we will pay due attention to both (in parts 3 and 4); here we will only point out their basic pitfalls.

From the beginning, structuralism has been suspected of rehabilitating the concept of teleology and teleological explanation.[10] If it were able to rid itself of this stigma, then, at the price mentioned in the dilemma, it would reduce itself to a description. However, as we will see, the recasting of the teleological explanation into a functional explanation was also discussed, especially with the support of the developing philosophy of science (especially Hempel 1965), and it was already fully exploited in system-theoretical linguistics in the 1980s (mainly Köhler 1986). Moreover, the functional explanation is still alive in cognitive linguistics and neurolinguistics.

Notes
[9] In the end, it was not provoked even by the media-famous clash with Daniel Everett, which related to the dispute over the (non)necessity of the principle of recursion in reconstructing the language of the Pirahã tribe (see Everett 2005).
[10] We will deal with the concept of teleological explanation only marginally. It will serve as an obligatory ornament of our book. Its conceptualization was closely related to the introduction of the concept of functional explanation, which we will encounter below (in chapters 3.2, 5.4 and 5.5).
The formal explanation is almost exclusively associated with generativism (detailed in part 4). Its drawback is the absence of an intelligible source of explanation. Although we see some possibilities to look for this source with support in the current philosophy of science,[11] we are inclined to believe that it is more natural to look at it as a formal description, especially given its strictly deductive nature. In the professional literature, of course, we also find a solution that considers the description itself to be explanatory. However, explanation by description does not fall within our general notion of scientific explanation (see the following part 2).[12] Systemic and formal explanations are prima facie reduced to descriptions, and the only alternative to them seems to be a functional explanation, which draws its explanatory power from a non-linguistic source. The language is then seen as a means of communication which has its bases in the human biological, physiological and neural constitution and which is rooted in social space. Here, hypotheses about the influence of economizing principles and of the physical, biological and cognitive limitations of speakers on the origin, anchoring and development of the linguistic system have their basis. The dilemma is, thus, revealed to us in clearer contours; on the one hand, we have a systemic and formal linguistic description, and on the other hand, a non-linguistic functional explanation. What options do we have left to solve this dilemma? We are offered two possible, interconnected ways to examine the dilemma in more detail, to attempt to eliminate it, and to reveal a linguistic theory which will have explanatory power sui generis. We will proceed along both ways simultaneously, without the risk of them getting too far apart. The first path is system-theoretical linguistics, which was created and developed by Gabriel Altmann and Reinhard Köhler in the context of quantitative-linguistic methods.

This linguistic theory (see chapter 5.2) uses a functional explanation which has two basic advantages: it is precisely formulated using the functional analysis of Carl Gustav Hempel (Hempel 1965), and it tries to base its explanatory power on the concept of the linguistic law. At first glance, the dilemma may seem to have already been overcome in the case of at least one linguistic theory. Unfortunately, however, we point out several fundamental problems that tie the current functional explanation of system-theoretical linguistics to one predictable side of the dilemma. The solution to these problems is offered by the second mentioned path, which represents the conceptual means of contemporary philosophy of science. Fortunately, system-theoretical linguistics itself supports this path, with its long-term openness to this area of philosophy (for more details, see chapters 5.3 and 5.5). One serious problem may thus be eliminated by a mere renovation of the conceptual tools of the philosophy of science, which have become outdated in system-theoretical linguistics. Solving the other, more serious problem will unfortunately be more difficult, because it will depend on the successful solving of another problem, very important for the contemporary philosophy of science. This problem is the question of the possibility of non-causal explanation.

Questions about the nature of scientific explanation form the symbolic beginning of the philosophy of science in the period after World War II, when some European analytic philosophers were transplanted into the environment of American pragmatism. Scientific explanations have received attention from a number of perspectives, the most important of which, in historical context, are: the logical structure of explanation models, the role of the causal nexus, and the variants of explanatory models with respect to the researched problem (and the chosen scientific discipline). All these perspectives are intertwined, and for our purposes they have to be viewed together – for example, a functional explanation in system-theoretical linguistics takes the form of a deductive argument, is considered non-causal[13] and reflects the complexity[14] of linguistic phenomena. Causality was first refused in explanation models (especially by Hempel), and then began to return to them (mainly thanks to Wesley Salmon). Due attention has been paid to this classical stage in the development of ideas about scientific explanation in a number of expert studies (e.g. Salmon 1998; Khalifa, Millson, Risjord 2021).

At present, the topic of causality has once again become part of the reflection on scientific explanation, for a surprising and inspiring reason – many philosophers of science advocate the concept of non-causal explanation (e.g. Reutlinger, Saatsi 2018; Lange 2017). The non-causal explanation is a heterogeneous category, based on a collection of individual examples from scientific disciplines whose theories do not assume a causal nexus and yet are considered explanatory by their creators. Our reflections on the forms of explanation in linguistic theories are, thus, a contribution to this ongoing debate. Since we consider linguistic explanations prima facie to be non-causal, this contribution could not only enrich the mentioned debate, but also contribute to a certain homogenization of the category of non-causal explanation. We have already seen that the importance of linguistic explanations is clearly connected not only with the approaches of traditional philosophy of science, as evidenced by system-theoretical linguistics, but also with current debates in the philosophy of science (more details in chapter 2.2). The success of our ambition to resolve the dilemma of linguistic explanations and descriptions will depend very sensitively on the self-sufficiency of non-causal explanations.[15] It will be necessary to determine whether the non-causal explanation is threatened by the same fate as the systemic and formal explanations. Is the non-causal explanation not reduced to a description? If not, then the dilemma will be overcome. The biggest challenge, then, is to carefully distinguish non-causal explanations from descriptions.

Since we are settled in the philosophy of science and not in linguistics, this philosophical-scientific perspective will be the leitmotif of the whole book. Linguistics, its theory and its explanations will be for us a discipline subjected to conceptual analysis. This approach is currently common in the philosophy of science in relation to the individual special sciences, although it is not yet common in linguistics.[16] Our book is, thus, an attempt to establish the philosophy of linguistics as part of the philosophy of science. However, the way we conduct our conceptual analysis will differ from the prevailing practice of philosophy of science. In some respects, it can be seen as anachronistic, more closely tied to the classical philosophy of science of the second half of the 20th century. Its aim is to do more than just explore and reflect on linguistic practice; our aim is to map the possibilities of creating distinctive linguistic explanations which show the essentials of clearly defined non-causal explanations.

Notes
[11] I tried it in the paper Zámečník (2018).
[12] Here we have in mind the conception of explanation in a model-based view of scientific theories (cf. e.g. Giere 2004).
[13] Köhler puts it explicitly: “In the case of language, however, there are no known causal laws which can connect e.g. human needs for communication and a particular property of a linguistic unit or subsystem. Moreover, it does not seem at all reasonable to postulate such kinds of laws.” (Köhler 2005, 765).
[14] The term is not meant metaphorically, but technically (see Strevens 2016).
[15] For a critique of non-causal explanations, see Skow (2014). Alisa Bokulich takes a pragmatic approach to non-causal explanations, expressing the usefulness of the parallel use of causal and non-causal explanation models; see Bokulich (2018).
[16] An exception is Scholz, Pelletier, Pullum (2015). However, this entry in the SEP deals with the topic of linguistic explanation only very marginally.

1.1 How to read this book

The reader faces a rather difficult task – acquiring the basic tools of the philosophy of science focused on the problem of scientific explanation and then observing how they are applied to a wide range of linguistic approaches. It will therefore be useful to offer the reader a schematic guide to the orientation that
introduces the structure of the book and highlights the main lines of content that unite the text throughout its length.

First, the structure of the book. In addition to the Introduction and the Conclusion, it consists of four main parts (2–5), which are further divided into chapters (2.1) and subchapters (2.1.1). The four main parts deal with Philosophy of Scientific Explanation (2), Systemic Explanation in Linguistics (3), Formal Explanation in Linguistics (4), and Functional Explanation in Quantitative Linguistics (5). The book also contains two Interludes, inserted between parts 4 and 5 and between chapters 5.3 and 5.4 of part 5. The book is supplemented by an Appendix, which expands some of the topics discussed in the main text, and by extensive footnotes, which, in addition to references to the literature, also provide several quotations from the analyzed texts.

The second part of the book deals with various concepts of scientific explanation in the philosophy of science. It briefly characterizes the main milestones of the philosophical controversies about explanation, which were initiated by the formulation of Hempel’s D-N model of explanation. It analyzes in detail current models of scientific explanation, with a particular focus on non-causal models of scientific explanation. Finally, it also formulates a new Principle-Based Model of Explanation. The issue of non-causal explanations is examined with respect to the general nature of linguistic explanations, whether in systemic, formal, or functional form. The Principle-Based Model of Explanation is designed to be applied to all the cases of linguistic explanation in the following sections of the book.

The shorter third and fourth parts focus successively on systemic and formal explanations in linguistics. The systemic explanation is explored with examples from de Saussure’s and Hjelmslev’s structuralism, while the formal explanation is examined in the context of Chomsky’s generative linguistics. With the help of classical texts, the hypothetical form of linguistic explanations is reconstructed, and these explanations are then, as mentioned above, formulated using the new Principle-Based Model of Explanation. The linguistic principles that play a central role in these explanations are, in turn, the principle of arbitrariness, the principle of analysis, and the principle of recursion. The Interlude: Herdan’s Language as Choice and Chance, which represents the connection between linguistic structuralism and quantitative linguistics, is inserted between the fourth and fifth parts.

The central and most extensive fifth part is devoted to the analysis of functional explanation in linguistics. Although its first chapter briefly assesses the position of functional explanation in linguistics from a general point of view (especially from the perspective of Martin Haspelmath and Frederick Newmeyer), the entire fifth part is devoted to an analysis of this explanation in quantitative linguistics. Given the differences in the approaches of individual quantitative linguists,
quantitative linguistics is modelled into Reinhard Köhler’s System-Theoretical Linguistics (chapter 5.2) and Gabriel Altmann’s Unified Approach in Quantitative Linguistics (chapter 5.3). The second and third chapters of the fifth part, together with the Interlude: The Principle of Compositeness and chapter 5.4, The Diversity of Theories in Quantitative Linguistics, present a mapping of the conceptual means of quantitative linguistics, the ways of building its theories, and the formulation of explanatory principles – in particular the Principle of Invariance of the Language System Structure (subchapter 5.2.2) and the Principle of Compositeness (Interlude). The common axis of these chapters is the analysis of the notion of linguistic law – the attempt to define it (subchapter 5.2.1, What is a Linguistic Law?), the difficulties with its verification (subchapter 5.3.2, Difficulties in Grasping the Linguistic Law), and the effort to clarify its origin (subchapter 5.4.1, The Origin of Linguistic Law).

The penultimate chapter (5.5) of the fifth part is devoted to an analysis of the development of the use of functional explanation in System-Theoretical Linguistics, as well as to the search for individual variants of modifications of this explanatory model that can eliminate some of its shortcomings (especially the problem of functional equivalents and the nature of the structural axiom). These alternatives are again found in the context of the development of the philosophy of science, where different interpretations of the concept of functional explanation have been proposed and the nature of the structural axiom itself has been analyzed. The outcome of these analyses then motivates the last chapter (5.6), which offers an alternative non-causal and non-functional model of explanation for System-Theoretical Linguistics. This Topological Model of Explanation is again inspired by contemporary philosophy of science and is related to some theories of quantitative linguistics (in particular, those of Luděk Hřebíček).

✶✶✶

A key feature of the book is a systematic interpretation of linguistic explanations employing contemporary philosophy of science. Philosophy of science has always been at the forefront of interest, especially in quantitative linguistics, because it provides the fundamental normative indicators for building linguistics as an exact science with explanatory potential. This book, therefore, follows this original intention and provides a new reflection on linguistics through current philosophy of science. This key characteristic also provides a glimpse of the underlying common lines that connect the various parts of the book on several levels.

The basic connecting line is the course of history, in chronological order from 1916 to the present: linguistic approaches and theories are put together and their connections are traced. Thus, in the fourth part, we can observe the relation of generative linguistics to structuralism, in the First Interlude the relation of structuralism
to emerging quantitative linguistics, and in the second chapter of the fifth part also the relation of System-Theoretical Linguistics to structuralism (similarly in the Second Interlude). General considerations about the nature of formal and functional explanations in linguistics allow us to relate quantitative linguistics to other domains of linguistics (e.g. cognitive linguistics).

Another line that runs throughout the book is based on the assumption of progress in linguistics. From the third part onwards, we follow the efforts of linguists to gradually transform their discipline along the lines of the natural sciences: to move from structural linguistic descriptions to formal explanations in generative linguistics and finally to functional explanations in quantitative linguistics. The search for a new topological model of explanation is also motivated by the vision of progress in quantitative linguistics: to try to eliminate the shortcomings of functional analysis (of which Hempel was already aware) and to offer an unproblematic model of explanation of the kind we are accustomed to from the longer-consolidated sciences – physics and biology.

Another global connecting line is the philosopher of science’s attempt to use the context of linguistics to explore a new model of explanation – the Principle-Based Model of Explanation. This is formulated in the last chapter of the second part and then applied step by step in the individual chapters: for de Saussure in the case of the principle of arbitrariness, for Hjelmslev in the case of the principle of analysis, for Chomsky in the case of the principle of recursion, for Herdan in his principle of duality, for Köhler in the principle of invariance of the linguistic system structure, up to Hřebíček’s principle of compositeness and the transformation of the structural axiom in the topological model of explanation. This philosophical endeavour is also related to the gradual expansion of the understanding of the nature of linguistic description and explanation. This understanding is all the deeper the more important the role of philosophy of science becomes in reflecting on linguistic research. The book therefore gradually supports the widespread belief in contemporary philosophy of science that it should not be the esoteric activity of a few experts, but the theoretical foundation on which a particular discipline can be built. The pursuit of this role in the philosophy of linguistics leads us to try to transform functional explanation into new forms and subsequently to seek its alternative in a non-causal topological model of explanation.
2 Philosophy of scientific explanation

Consider the phenomenon of light hitting water at one angle, and traveling through it at a different angle. Explain it by saying that a difference in the index of refraction caused the light to change direction, and one saw the world as humans saw it. Explain it by saying that light minimized the time needed to travel to its destination, and one saw the world as the heptapods saw it. Two very different interpretations. The physical universe was a language with a perfectly ambiguous grammar. Every physical event was an utterance that could be parsed in two entirely different ways, one causal and the other teleological, both valid, neither one disqualifiable no matter how much context was available.

Ted Chiang, Story of Your Life
What are concepts used for? Why is it valuable to maintain conceptual purity? Certainly there is nothing to prevent us from writing that the premises of a deductive argument explain the conclusion (1), or that the geometric scheme (Appendix 1) explains why the Pythagorean theorem is valid (2). The orthographical rules of the Czech language explain, in a sense, why we are not allowed to write: “Děti si na na jejich talířky zakrojili dorta aby si pošmákli.” (3). Generative grammar explains why it is not allowed to write in English: “Carl smarter than Marc are.” (4). The fact that 13 is a prime number in a sense explains why a mother cannot divide 13 strawberries evenly among her three children without slicing (5).17 We commonly say that Newton’s laws of motion, along with the initial and boundary conditions and the CPC,18 explain why the billiard ball bounced off the edge of the table and did not end up in the hole (6). We can read that gauge symmetries explain why the Higgs boson exists (7).19 And we could go on. Conceptual purity is important in order to be able to distinguish different types of entities, states, events, and processes. In a very pluralistic and benevolent conceptual system, nothing would prevent us from defining explanation by the set of examples we have just used. And if we did not feel completely comfortable with that, we could grant them a family resemblance. Despite this possibility, we believe that the goal of philosophy, and especially of philosophy of science, should be conceptual clarity. When Carl Gustav Hempel defined the deductive-nomological model (D-N
This is a minor modification of Marc Lange’s famous Strawberry Problem: “The fact that 23 cannot be divided evenly by 3 explains why Mother fails every time she tries to distribute exactly 23 strawberries evenly among her 3 children without cutting any (strawberries — or children!).” (Lange 2017, 6). Ceteris paribus clause – the “all other things being equal” condition for the D-N model of explanation. This is, of course, in contrast to example (6), a very rough explanatory sketch. For details, see, e.g. Stenger (2006, 268–272). https://doi.org/10.1515/9783110712759-002
model) of explanation, he strove for just such clarity – he wanted to define clearly what we can consider a scientific explanation. Viewed from this perspective, we certainly cannot grant all the instances of the concept of explanation in the above examples the status of a scientific explanation. That way, we would mix very diverse concepts, which have specific properties and different applications. The examples given contain scientific explanations, but also scientific descriptions, diagrams and arguments. Differentiating them allows us to understand the specifics of these subtler concepts. In case (1), a logician would correct us: it is more appropriate to say that the premises justify the conclusion (semantic conception), or that the conclusion follows from the premises (syntactic conception). We are speaking about a deductive argument. The premise “It’s raining” justifies the conclusion “It’s raining”; we probably would not intuitively use the term explanation here. The use of the term explanation is not appropriate for characterizing arguments. If we wanted to explain arguments as such, we would have to employ the metaphysical concept of grounding, but even in that case we would be using the term explanation in a new dimension – it is a metaphysical explanation that explains the properties of logic, not a scientific explanation (e.g., see Poggiolesi 2021).20 Examples of type (2) were made visible in philosophy of science mainly by Ronald Giere (e.g. Giere 2006) in the context of his cognitive view of models, theories and, indirectly, also of explanations. Example (2) belongs to a collection of cases of diagrams, schemes and other visual representations that facilitate understanding an (abstract) problem or enable us to solve a cognitive task. Two ways of interpreting such representations are at hand.
The first way (which is not applicable to our case) can lead to the use of diagrammatic logics, which allow us to achieve a formal expression not by symbolic, but by iconic means. If so, then some cases of type (2) would be reduced to problems of type (1).21 The other option leads to cognitive science, which can formulate scientific explanations of how type (2) understanding is possible in terms of the functioning of the human cognitive system. Thus, example (2) again does not represent a scientific explanation, but a way in which the human cognitive system can represent some mathematical truths. And it is up to cognitive science to explain this cognitive ability.22
The aforementioned Marc Lange provides an overview of “everything” that can also be described as a non-causal explanation, including a metaphysical explanation (Lange 2017, 14–20). See also note 2. A well-known example is the Venn diagram. In general, on the issue of iconic logic, see Shin (2002). Giere also interprets cases of type (1) as a kind of cognitive task, see Giere (2002).
The following example (3) is again completely different. The orthographical rules of the Czech language codify language usage;23 they comprise a set of norms that have been established by convention on the basis of codification decisions (historical experience, analogies, economizations and linguistic intuition). These norms restrict the speaker in his or her standardized speech, although they do not prevent him or her from speaking in an illiterate manner and disrespecting the norm. So even though we are allowed to communicate that Pavlík says: “Maminko, mohl bych vzít s sebou Petra?” because the orthographical rules of the Czech language say so, the word “because” does not represent a reference to any scientific explanation. No valid scientific explanation can tolerate being systematically counteracted by the very type of phenomena (here linguistic) it concerns.24 Example (4) falls into the area of formal linguistic explanations, to which we will pay separate attention below (in part 4). We have indicated above, and will continue to argue, that this is a formal description and not an explanation. The main reason is the manner in which linguistic phenomena are “explained” in generativism – we always have a sufficiently robust formal grammar (at the generative and transformational levels) that can represent the language system. A native speaker of the given language is the arbiter of a correctly created syntactic structure. After finding a suitable formal means, we can then predict new sentences – and verify these predictions with a native speaker. However, the possibility of prediction is not in itself evidence of an explanation – we mentioned pre-Copernican astronomy above.25 As we will argue below, generativism does not possess any explanatory mechanism – universal grammar could play this role, but even in its minimalist form it is ultimately a non-linguistic matter. Examples of type (4) are for us formal linguistic descriptions (see part 4 for more detail). Example (5) is, in a sense, specific to its time.
It might have looked inappropriate in classical philosophy of science, given the legacy of logical positivism. At present, it represents an example of a purely mathematical fact explaining a fact that concerns the physical world (or, e.g., biological or linguistic systems). We are convinced that, again, we cannot understand this example as a scientific explanation, at least not in its pure form. The first way of understanding it refers back to the second way of understanding example (2), with the interpretation scheme reversed. The physical properties of strawberries (these are macroscopic objects that have a
These are not only orthographical rules but also (and again incomplete) orthoepical, morphological, syntactical rules and rules of lexicalization. Pavlík could also say: “Mamí, moh bysem sebou vzít Péťu?” Norms are not scientific laws (see chapter 3.1 on de Saussure). Stephen Toulmin recalls the predictive success of Babylonian astronomy, e.g., see Toulmin, Goodfield (1999, 41, 52).
clearly defined volume and integrity), as well as the diagram of squares and triangles above (see Appendix 1), allow children and mothers to understand that 13 is not divisible by 3, similarly to the above-mentioned understanding of the Pythagorean theorem. In the same way, we can say that the Pythagorean theorem explains why I cannot assemble a square from a wooden model with one triangle missing. Thus, in this view, it is again a way in which the human cognitive system represents some mathematical truths. The other possibility of solving example (5) adds some physical facts to the explanation scheme, which together with the mathematical ones explain the indivisibility of the mentioned number of strawberries into three groups. As Erik Weber says, if the mentioned mother had thirteen bottles of lemonade, there would be nothing to stop her from distributing them (or rather their contents) evenly among three children.26 From this point of view, we can see that a mathematical fact alone is not decisive for explaining the indivisibility of physical objects into a given number of groups; it is decisive only in combination with the specific physical properties of the objects – strawberries are solid objects we would have to cut, lemonade is liquid and, for practical purposes, indefinitely divisible into parts. If we nevertheless want to elevate an explanation of type (5) to a scientific explanation, then we have to call it a physical explanation.27 Case (6) is a model example of a deductive-nomological explanation. The explanans of this explanation contains universal statements of the nature of scientific laws (Newton’s laws of motion) along with initial and boundary conditions, which specify the values of the variables appearing in the general mathematical law statements and describe the properties of the environment in which the laws are applied.
The explanandum is formed by a sentence that represents the investigated physical phenomenon (the rebound of the billiard ball in question). The whole explanation is constructed as a deductive argument, but it is promoted to the form of an explanation by the presence of the scientific law, which is true (at least approximately), relates to the phenomenon described in the explanandum, and has empirical content.28
This is a paraphrase of Weber’s argument, which he delivered at the workshop “Non-Causal Explanations: Logical, Linguistic and Philosophical Perspectives” in Ghent on May 10, 2019. Title: “Against Distinctively Mathematical Explanations of Physical Facts”. However, we will return to Lange’s arguments several more times, especially in connection with some variants of explanations in quantitative linguistics, with Luděk Hřebíček and Jan Andres (see the Second Interlude and chapter 5.6). Thus, the conditions of a valid D-N model of explanation are fulfilled, as defined by the classical text Hempel, Oppenheim (1948).
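The logical skeleton of case (6) can be written out as the standard Hempel–Oppenheim schema (our illustrative rendering, not the author’s own notation):

```latex
% Deductive-nomological (D-N) schema, instantiated for the billiard-ball case (6)
\[
\left.
\begin{array}{ll}
L_1, \ldots, L_r & \text{scientific laws (Newton's laws of motion)}\\
C_1, \ldots, C_k & \text{initial and boundary conditions, plus the CPC}
\end{array}
\right\} \text{explanans}
\qquad
\therefore\; E \quad \text{explanandum: the ball bounced off the edge}
\]
```

The explanans must deductively entail E, must essentially contain at least one law, and must have empirical content – precisely the conditions under which a deductive argument is promoted to an explanation.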
Despite the ideal type of scientific explanation defined above (6), the situation is not simple; not for nothing have we undergone seventy years of debates on scientific explanation. The central problem of the D-N model of explanation is the question of the causal nature of the laws expressed in the explanans and, of course, the definition of the concept of causality. A straightforward solution would be to state that correct scientific laws are causal laws (or variants of laws reducible to causal laws); but as we can see in example (7), many central explanations in the most prominent scientific discipline, physics, do not employ any causal law. A scientific explanation that refers to the existence of different types of symmetries (gauge symmetry is one of the central examples) is, strictly taken, a non-causal explanation (for a detailed discussion see Lange 2017, 46–95). These examples and their analysis should, therefore, not suggest that the situation in philosophy of science is clear and that linguistics should bow to the norms brought in by philosophy of science. The analysis serves to identify several important concepts that lie beneath the vague concept of explanation used in the examples. These central concepts are: the logical argument, the scientific description that is predictive, and the scientific explanation based on some form of scientific law.29
2.1 Difficulties with models of explanation after Hempel

For the logical empiricists,30 causality was an inadmissible metaphysical concept. Therefore, the D-N model is also interpreted as a purely linguistic model of scientific explanation, i.e. as a deductive relation between a specific type of sentences. And from the conditions delimited this way, in which there is no causal nexus, counterintuitive possibilities of scientific explanation can be deduced. Although Hempel originally omitted the significance of the causal nexus for understandable reasons,31 under the onslaught of a series of counterexamples, causality had to be rehabilitated in models of scientific explanation. Bromberger’s
We leave the cognitive cases present in examples (2) and (5) aside. Although the critique of the “received view” is justified in contemporary philosophy of science, its logical-empirical roots still have their philosophical potential. For example, Roman Frigg (following Munich structuralism) points to the usefulness of thinking about the empirical basis. Frigg made this statement at the workshop “Representation in Science” in Prague on May 28, 2018. Title: “Theory and Observation”. Online: http://stream.flu.cas.cz/media/romanfrigg-theory-and-observation. Causality was a metaphysical burden for logical empiricism and, consequently, for the syntactic philosophy of science – a burden that had to be removed, in the spirit of Carnap (1959).
counterexample is the best known; it became famous as the Flagpole Problem (see Bromberger 1966). The core of the counterexample is that the original scheme of the D-N model allows us to remove a condition from the explanans and exchange it with the sentence in the explanandum without the scheme ceasing to satisfy the conditions of the D-N model. In this way, we can replace one of the conditions – the height of the flagpole – with the phenomenon originally explained – the length of the flagpole’s shadow. De facto, we then explain the height of the flagpole, among other things, by the length of its shadow. The counterintuitiveness is obvious; while we can say that the height of the flagpole causes (of course, not without other conditions) the length of its shadow, the reverse statement does not seem appropriate – the shadow length does not cause the flagpole height (except under some special idealism). Causality is back, and it seems to form the core of the ideal type of scientific explanation. Bas van Fraassen provides an interesting solution to Bromberger’s problem (van Fraassen 1980).32 This solution is grounded in a pragmatic point of view. Let us say that we can always find some relevant contextual complement that authorizes the resulting counterexample, even though it is not equipped with a causal nexus. Van Fraassen suggests that we imagine the shadow length to have some further pragmatic function: for example, it has to reach an indicator at some specific time at a particular place on the globe. Then it is completely intuitive to say that the shadow length explains why a flagpole of the appropriate height was chosen (although, of course, it is still true that we cannot say that the shadow length causes the flagpole height).33 Natural counter-arguments have been raised against van Fraassen’s approach; they say that the clear line between scientific and non-scientific explanation thereby disappears.
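Before weighing these criticisms, it helps to write out the symmetry that Bromberger’s counterexample exploits (our illustration; h is the pole height, s the shadow length, θ the sun’s elevation angle):

```latex
% Rectilinear propagation of light gives one law connecting height and shadow:
\[
s = \frac{h}{\tan\theta}
\qquad\Longleftrightarrow\qquad
h = s\,\tan\theta
\]
% Because the law is invertible, the D-N schema derives s from (h, theta)
% just as legitimately as it derives h from (s, theta) -- yet only the
% first derivation strikes us as explanatory.
```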
Does it make sense to say that the length of the flagpole shadow is part of the scientific explanation of why a flagpole of such a height was chosen? A viewer asks: “Why do they hang the flag so high?” And his friend replies: “Because right now the shadow of the pole on which it hangs has to reach the feet of Joe Biden.” This does not seem to provide him with a scientific explanation. Or does it? Of course, the viewer has to understand that the laws of optics say something fundamental about the relationship between the pole height and the length of the pole’s shadow. However, in this way we could probably find substantial scientific content in a large number of our statements.

Cf. the chapter “The Pragmatics of Explanation” in van Fraassen (1980, 97–157). A critique of van Fraassen is provided by Skow (2016). For a synoptic description of the Flagpole Problem and van Fraassen’s argument, see Rosenberg (2005, 37–44).

When a child asks while going through the
passages between two streets: “Dad, why does the path run right here?”, we answer: “The road is here to connect the two streets, so that there is a shortcut.” At that moment, we “give reasons/explain” again with reference to a pragmatic function (or directly teleologically). But as in the case of the spectators in the stadium, we have to know something important about the physical world we are talking about. It is not enough to refer to pure geometry itself; the child must understand that houses are not permeable (similarly to Weber’s and our own argument in the case of the Strawberry Problem), that they are rigid, impenetrable items, etc. The answer to van Fraassen’s solution is, therefore, ambiguous. We would have to examine individual types of our statements carefully and delimit those that essentially require a reference to the vital properties of the physical world in order to achieve understanding. We believe that all statements referring, albeit indirectly, to the world can be understood in a similar way as in van Fraassen’s solution to the Flagpole Problem. Statements concerning speakers’ internal mental states will remain aside, as in the example: “Why are you angry with me?” “Because I expected you to follow me.”34 Such difficulties with defining scientific explanation, which at the same time complicate defining scientific theories,35 have led some philosophers of science to turn their questions towards the analysis of the cognitive aspects of our mental activity in modeling the physical world and building theories about it. We are thinking in particular of Ronald Giere and his contemplations on the structure of distributed cognitive systems, which we understand as one of the interesting outcomes of Quine’s naturalized epistemology. We will not follow this conception36 because it would lead us away from philosophy of science to particular research in the cognitive sciences and neurosciences.37
Also, statements that are formulated in the fictional world will, if they refer to the essential characteristics of this world, fall into the category of the case with the flagpole. Why did Harry get from the London King’s Cross railway station to Platform Nine and Three-Quarters? Because a portal has opened on the wall of the station (i.e. the impenetrability of physical objects is not an obstacle in this fictional world for creating shortcuts across spacetime). We mean the development of the concept of scientific theories from the originally syntactic view (e.g. Nagel 1961), via the semantic view (e.g. Suppe 1977, Suppes 1972), up to the model-based view (e.g. van Fraassen 1980, Giere 1999). Halvorson (2016) writes in an interesting way about this, with the defense of the syntactic view of theories. E.g. the chapter “Perspectival Knowledge and Distributed Cognition” in Giere (2006, 96–116); especially the section “Models as Parts of Distributed Cognitive Systems” (Ibid, 100–106). The reasons why some philosophers of science leave the domain of their own discipline may be justified by external factors – philosophy of science conducted as a conceptual analysis often loses contact with the reality of scientific discourse, which is not threatened by our inability to
If we are not able to delimit the concept of scientific explanation without persistent problems, we can choose a pragmatic maneuver and focus on analyzing what scientists themselves, in various scientific disciplines, consider to be a scientific explanation. This pragmatic strategy is underpinned by Ronald Giere’s model-based approach to theories, which he presented in a series of books and papers at the turn of the millennium.38 Giere notes that scientists use models of very different types to represent diverse selected aspects of the world. He circumvents the original problem of the relationship between the model and the selected aspects of the world by stating that models resemble selected aspects of the world.39 Expressed in Giere’s formula: the scientist (S) uses the model (M) for the purpose (P) of representing the world (W).40 The chosen strategy follows the semantic conception of theories, but renounces its formal foundations (cf. Halvorson 2016, 591–600) and gives it a pragmatic essence. Therefore, the focus is not on the model as a representation, but on the very activity of representing, which is motivated by the various purposes chosen by scientists. In the context of the conception of scientific explanations, the central question is not what a scientific explanation is, but how scientists in individual cases explain specific phenomena or classes of phenomena. The chosen path has undoubted advantages – the philosopher of science rises from his chair, heads to his scientific colleagues in their research centers and laboratories, starts reading scientific outputs, and tries to understand how science works in real practice. Science ceases to be examined as an ideal that is assumed intuitively but found nowhere in practice.
At the same time, there is still a chance that, after the whole diverse nature of scientific explanations has been explored, we can look forward to new syntheses created by philosophers of science. The negatives of the chosen path are known and are often pointed out. We talk about representing the world, but we have not clarified what scientific representation is (cf. Suárez 2016). We talk about scientific models, but we have not clarified their relationship to representations (are they identical with
define scientific explanation, but by cognitive fouls, biases and flaws in argumentation. For many philosophers of science, activist movements in support of science and rationality are becoming attractive, such as e.g. Novella et al. (2018). In addition to the cited Giere (2006) and Giere (1999), it is primarily the paper Giere (2004). The problem of the vagueness of the concept of similarity was discussed by Giere himself, cf. Giere (2006, 63–67). A critic of the concept of similarity is also one of the representatives of the semantic view of theories, e.g. Suárez (2003). Giere states exactly this: “S uses X to represent W for purposes P.” (Giere 2004, 743).
them?). We register an extensive catalog of scientific models, but we are only able to find family resemblances among them. We say that models are similar to the world, but we have not defined this similarity clearly (cf. Smith 1998a). We refer to the effectiveness of scientific models, but the concept of effectiveness is even less clear than that of similarity. Strictly viewed, it is also debatable whether it still makes sense to talk about scientific explanations at all in a situation when, together with Giere, we talk about Science without Laws (we will follow up on this problem below, see chapter 2.3). On the other hand, philosophers of science proceed with thoroughness and caution – they describe explanatory strategies across the sciences; they do not prescribe anything a priori. The result, of course, is significant pluralism in the conception of scientific explanations. The way of asking questions about the nature of scientific explanation has fundamentally changed in philosophy of science, and, as we will see, some areas of linguistics still strictly adhere to the classical standards typical of the received view without registering this change. This is to their detriment, indeed, because the role of philosophy of science is not purely academic and descriptive; it is intended to reflect the scientific activities of each individual discipline and its researchers. Thus, in contemporary philosophy of science, it is no longer just a matter of explicating what scientific explanation per se is (as reported by Rosenberg 2005, 26), but of describing the scientist’s explanatory activity, including the means he or she uses for it. Something similar to what we witness in the model-based view of theories has thus happened to the subject of scientific explanation.
If a theory is understood as a cluster of models, in which, on top of that, the pragmatic dimension of their application as representations of a real system is examined, then scientific explanation begins to be understood as a concept (with family resemblance as the link) that is defined by listing individual cases. When, within Giere’s intentions, in the model-based (or pragmatic) view of theories,41 we claim that:

(1) The scientist S utilizes the mechanical model M for the purpose P to represent the system W.
(2) The scientist S utilizes the topological space model M for the purpose P to represent the system W.
(3) The scientist S utilizes the graph-theory-based model M for the purpose P to represent the system W.

we simply describe very diverse models with the same pragmatic figure. For example, case (1) requires deciding whether the mechanism implements a variant of the causal nexus; we will meet it later as the concept of causal explanation. Case (2), by contrast, does not need the causal nexus ex definitione (see the next chapter). Case (3) is a scientific description rather than an explanation.42 We may, furthermore, ask whether there is still a clear distinction between scientific explanation in the traditional conception of philosophy of science and scientific description. Would it not be more apposite to talk about scientific descriptions in all the above-mentioned examples? In fact, is it still important in pragmatic philosophy of science to keep the distinction between explanation and description? De Saussure also used a certain type of mathematical theory to represent the language system, but in contemporary linguistics we are (mostly) reluctant to say that he thereby explained linguistic phenomena (examined below, see chapter 3.1). The reward for the acquired overview of the plurality of scientific research practices is a growing degree of vagueness and a blurring of the boundaries between explanations and descriptions. Reflections on the pragmatic aspects of explanations in philosophy of science after Hempel would not be sufficiently saturated if, in addition to Bas van Fraassen and Ronald Giere, the role of Nancy Cartwright were not recalled. Her model-based view of theories is similar to Giere’s in its critique of the standard conception of the scientific law.

As we stated above, Giere (2004) presents this approach in a nutshell. Again, there is no denying that this is a very interesting approach. When we consider an explanation in biology from the outside, from a philosophical armchair, the need for functional analysis and a functional explanation of the phenomena of living nature probably comes to mind. When we listen to interpretations by individual biologists, we discover how incredibly varied explanatory activities can be.
When an epidemiologist uses graph theory to predict the rate of spread of an epidemic from the nature of a graph that appropriately represents the real system in which the epidemic may spread, he does not thereby explain the cause of the spread of the epidemic. But can we say that he explains the rate of spread of the epidemic?

Again, in line with the pragmatic turn, it points to the predominant set of cases in which a scientific law represents such an idealized model of the system that it cannot be used in specific situations (cf. Cartwright 1999, Cartwright 1983). One of the outcomes of these considerations is the belief that scientific models built with the help of scientific laws are “only” useful fictions (e.g. Morrison 2015; Frigg, Nguyen 2016). While Ronald Giere’s legacy is primarily the development of cognitive approaches to scientific models and theories and the further naturalization of
philosophy of science, Cartwright’s main legacy is the visibility of simplifying assumptions in scientific models. If for Cartwright they were primarily the basic pillars of scientific fundamentalism,43 her successors are primarily interested in the very role of simplifying assumptions in scientific models (e.g. Morrison 2015). The rejection of scientific fundamentalism does not lead to the exclusion of different kinds of idealizations and abstractions from scientific modeling; on the contrary, a careful analysis of different types of scientific models shows that both are, as simplifying assumptions, a necessary part of scientific modeling.44 The fundamental question then is, strictly speaking, how false models (false because idealized and abstract) can be part of scientific explanations at all. If a philosopher of science does not want to practice the fictionalism just presented, which closely binds the original pluralism of pragmatic philosophy of science to anti-realism (see Appendix 7), then he or she has to perform a careful conceptual analysis of the individual simplifying assumptions – above all, of idealization and abstraction. Idealizations are better known because they are traditionally identified as one of the initial strategies of building modern natural science. Idealizations neglect some features of the real system which are not considered relevant and may be neglected because the external conditions allow it. Idealizations represent a simplification of the complexity of the studied system. Galileo’s relations for the mathematical pendulum are a paradigmatic example.45 Countless examples are available in contemporary physics46 and other sciences.47 Indeed, linguistics is also full of idealizations; de Saussure’s structuralism idealizes in its description of the state (synchrony) of the language system, although we
Cf. the first part, "Fundamentalism versus the patchwork of laws", of Cartwright (1999, 23–34). Cartwright draws attention to scientific fundamentalism especially in the context of physics and its basic theories, which are taken to be indispensable for all types of physical systems. Similarly, we can consider fundamentalism in some approaches of molecular biology (in the acute form of the selfish gene hypothesis), which neglect the importance of the phenotype and are disrupted by the new concepts of epigenetics, proteomics, etc., cf. Barbieri (2015). On the topic of simplifying assumptions – abstraction and idealization – see e.g. Godfrey-Smith (2009). An interesting feature of idealizations is that they can be reused in a new context as the basis of models, which can again exhibit a greater or lesser degree of agreement with reality. The analogy between a mechanical oscillator and an electromagnetic oscillator is a paradigmatic example. We still use Newtonian mechanics to describe some phenomena in the solar system even though we know it is not true. The hydrogen atom can be described in some basic states by Bohr's model, etc. The Lotka-Volterra model, which we can use for simple cases of ecological population dynamics, is particularly popular among philosophers of science (cf. Weisberg 2013). We can use a strange attractor to describe the simple activity of the neural network of the human brain (cf. Freeman 1988), etc.
know that this state results from dynamic changes in parole (diachrony). The tools of formal grammars can capture some aspects of natural language syntax (e.g., regular expressions allow us to explore corpora effectively), while recognizing that they are unable to grasp more complex aspects of sentence syntax (e.g., cross-serial dependencies in sentences). The generative model used by a field linguist in describing the grammar of a hitherto unknown language always has to remain open to revision with regard to the phenomenon under study, informed by the responses of native speakers.48 System-theoretical linguistics idealizes in describing each language subsystem individually, because it cuts the subsystem off from its naturally existing links to other subsystems (e.g. the case of the lexical and syntactic subsystems, see chapter 5.2). And a hypothetical comprehensive linguistic system will also remain an idealization if it fails to be connected with the non-linguistic subsystems studied by cognitive science and the neurosciences. Abstractions are less well known as simplifying assumptions outside philosophy of science, although they are significantly more interesting than idealizations. They are more complicated because they abstract from some obvious properties of the real system, and yet they cannot be given up without rejecting the whole theory.49 They represent the fundamentally unrealistic side of a model: when idealizing, we attribute to the investigated system only some of its properties; when abstracting, we attribute to it a property that it cannot really have. Abstractions are represented, for example, by the point model of elementary particles, the infinite ensemble of particles in renormalization theories, the inflation process in cosmology, etc.50 Abstraction also feels at home in linguistic theories. We believe that abstraction is represented by the concept of sign arbitrariness in structuralism.
Although we know that linguistic expressions are not strictly arbitrary – they result from developments in parole, which can be traced diachronically – the denial of arbitrariness would lead to the collapse of the theoretical power of structuralism, which grounds the system on oppositions and relations: the relations are real, the nodal points contentless. In generativism, we encounter an abstraction in the form of countable infinity bound up with recursion. The theoretical power of generativism requires the postulation of this abstract entity, without which the theory would become only an empirical
A famous example is Swiss German, see Shieber (1985); see also Appendix 9. At this point, we define abstractions through Margaret Morrison's approach; see the chapter "Abstraction and Idealization" in Morrison (2015, 15–49). The problem of abstractions is closely related to the definition of theoretical terms and entities. In physics, abstractions include, for example, the absolute space and time of Newtonian mechanics – necessary for the conception of gravity and rejected, with the whole of Newtonian mechanics, in the general theory of relativity.
generalization over the form of natural language sentences.51 In system-theoretical linguistics, we can consider the very concept of the language system as a communication tool to be an abstraction. Or can it even be argued that the rejection of arbitrariness and of discrete infinity turns the presumed linguistic explanations into mere descriptions? These considerations will be further discussed in the following chapters (mainly 3.1 and 4.1). Perhaps, however, structuralism does not idealize at all, because it is a pure abstraction; and, conversely, system-theoretical linguistics does not abstract, but only idealizes the real language system. Or idealizations and abstractions need to be hierarchized, and thresholds between them found, but that is not our task in this treatise. If we understood the very concept of language as a communication system as an abstraction, then we would find ourselves at the top of the hierarchy of abstractions.52 Perhaps an analogy could be drawn between today's situation in philosophy of science and the situation in linguistics after the communicative turn, or turn towards pragmatics. Philosophy of science experienced its turn towards pragmatics later than linguistics, and therefore it also began to crumble into individual subdisciplines later. But just as linguistics has preserved its traditional themes (the structuralist and generativist traditions), so has philosophy of science. They are, however, somewhat inaudible in the noise of popular topics. Yet modern themes cannot do without the traditional ones if they are not to end in superficial solutions. The original questions typical of the received view of philosophy of science then come into the spotlight.
We could observe this above when we asked, together with Margaret Morrison, what it means to use unrealistic mathematical models in explanation when, at the same time, their unrealistic nature is necessary for successful modeling (see note 49); or when Alexander Reutlinger and Juha Saatsi ask what the difference is between causal and non-causal explanations (cf. Reutlinger, Saatsi 2018), which leads us back to the need to define causality and to consider whether it makes any sense to count anything other than causal explanations as real scientific explanations (cf. Skow 2014); or when Robert Batterman (and others) examine renormalization theories and find that the traditional hierarchical model of causality suffers from fundamental problems in terms of existing explanatory strategies (cf. Bain 2013, cf. Batterman 2013); or, alternatively, when, like Marc Lange, we consider the role of mathematics in relation to the scientific disciplines (cf. Lange 2017). We examine the role of abstractions in the context of generativism in more detail in the chapter "Infinite Sets in Formal Grammar" of the paper Zámečník (2018, 253–255). This may correspond to the distinction between essentialism and externalism in philosophy of linguistics, cf. Scholz, Pelletier, Pullum (2015).
2.2 Contemporary solutions to models of explanation

In the previous chapter, we documented the development of the concept of scientific explanation in the period after Hempel, which led this overview to one typical form of considerations on explanation, associated with a pragmatic conception of philosophy of science and the model-based view of scientific theories. To a certain extent, one can say that this pragmatic conception is dominant, because it also manifests itself significantly in the philosophical reflection of the individual disciplines (at an ever finer resolution). We will now show how the traditional agenda is preserved in today's philosophy of science as it manifests itself on the topic of scientific explanation, and we will indicate in what respects this could be useful for linguistics. We will not examine the whole area devoted to this topic in philosophy of science; we will purposefully focus on those areas that can be applied in quantitative linguistics. We will then utilize the selected approaches in more detail in the final chapter, when we look for a new model of explanation for Köhler's system-theoretical linguistics (see chapter 5.6). The traditional agenda is hidden behind two current dichotomies that serve as basic orienting categories in classifying models of explanation: the dichotomy of mechanistic (e.g. Craver 2006, Craver, Darden 2013) and design (cf. van Eck, Mennes 2016) explanations, and the dichotomy of causal (cf. Woodward 2003) and non-causal (cf. Reutlinger, Saatsi 2018) explanations. Dichotomies have always been useful as a means of classifying concepts, but also misleading when it comes to a finer description of their details and differences. The same holds true for these old-new dichotomies. In addition, the two pairs are intertwined, because mechanistic explanations are prototypically causal, while design explanations fall into the set of non-causal explanations.
However, it is customary to analyze each dichotomy separately. The most popular current models of explanation are mechanisms; they have even given a name to a new conception in philosophy of science – new mechanism. According to the new mechanists, a scientist explains mechanistically if he or she manages to identify a (causally) connected chain of individual agents jointly responsible for the occurrence of a particular phenomenon, for the realization of a specific state of affairs. Mechanistic explanations draw their explanatory power from the recognition of a mechanism, or of the co-occurrence of mechanisms, which can be expressed by a model so as to represent the target system appropriately. Mechanistic explanations can absorb the causal aspect of explanation. Glennan defines the minimal mechanism as follows: A mechanism for a phenomenon consists of entities (or parts) whose activities and interactions are organized so as to be responsible for the phenomenon. (Glennan 2016, 799)
There are different typologies of mechanisms, both in relation to the scientific disciplines in which they apply and in the context of the philosophical demands that may be placed on them.53 The advantage of mechanisms is their wide applicability to a large number of different phenomena across disciplines (from physics to neuroscience).54 In order not to overload the chapter with a specific example from a specific scientific discipline, one is included in Appendix 2. Despite the many texts and books dealing with and grounded in new mechanism, we cannot free ourselves of the belief that the definition of mechanism (e.g. Glennan's above) is vague. It is also questionable whether it is useful to generalize across such systems as, on the one hand, an abstract machine – such as a von Neumann probe (cf. von Neumann, Burks 1966) – and, on the other hand, a specific complex biological process – such as a metabolic cycle – suitably modelable by identifying the individual components of a complex mechanism.55 Although new mechanism can be labelled the philosophical mainstream, there are cases where it seems more advantageous to explain phenomena via design, because the mechanism is not known, or because considering a mechanism does not seem adequate in the given case. A scientist explains by design when she or he identifies the usefulness of a complex system that is designed to realize some state of affairs allowing a phenomenon to occur.56 Design explanations draw their explanatory power from the ability to represent appropriately the function that the target system is to perform. An example of a design explanation is given in the appendix (Appendix 3). At first sight, this dichotomy in today's discussions of scientific explanation may seem to be only a revival of the classical notions of mechanism and function.
Although this pair may operate as a modern variant of causal and functional explanation, the two cannot be clearly mapped onto each other.57 If we take a closer look, we find that their innovation is mainly related to a changed point of view: explanations are no longer examined so much as abstract operations, but above all as
An overview of the various concepts and variants of mechanisms is clearly presented in Craver, Tabery (2015). However, their impact is evident even beyond the empirical sciences and the humanities. Suffice it to recall mathematics and the theory of automata, prototypically the Turing machine, which also serves as a means of defining the concept of an algorithm (see the chapter "Church's Hypothesis" in Partee, ter Meulen, Wall 1993, 515–517). On the principle of hierarchy: sub-mechanism – mechanism – super-mechanism. However, Craver, for example, believes that his conception of mechanism is applicable also to functional explanations (see Craver 2013, see also Garson 2013). Above (see note 54) we referred to a non-causal variant of mechanism – the Turing machine and the concept of the automaton in mathematics in general. The relationship between the concepts of design and function should be examined in more detail.
scientific research strategies.58 If we adopted this new approach, we could easily identify the presence of mechanistic and design strategies of explanation in most disciplines and their theories – in linguistics, at least in its quantitative form (as will be indicated in the Conclusion of the book). At the same time, we would get rid of the need to solve the abstract problems related to explanation in its prescriptive form (such as the kind of dependence between explanans and explanandum, the asymmetry of this relationship, etc.). However, we would then look at the examined systems (including the linguistic one) from a fictionalist perspective (see chapter 2.1 above) – we would examine the systems "as if" mechanical processes took place in them, or "as if" they performed certain functions. Thus, we would step outside the framework of any, even moderate, form of realism with regard to the scientific models used and the other tools of theory-building. This need not be wrong, but we will also attempt to propose another possible solution.
2.2.1 The perils of non-causal explanations

If the first dichotomy develops mainly in the context of a pragmatic view of theories and does not give rise to controversies between two dissenting camps, the second dichotomy – causal and non-causal explanations – is more strongly associated with the prescriptive conception of philosophy of science, with its greater effort to generalize the results of conceptual analysis. At the same time, it is also more often the basis of controversies about the (im)possibility of non-causal explanations (see Skow 2014). That this second dichotomy is not a mere armchair philosophical debate on the possibility of non-causal explanations59 is suggested by the fact that their discovery and today's popularity is largely based on their specific conceptualizations in the individual disciplines. The new finding is that non-causal explanations are not domain-specific; we find them across disciplines, and the extent of their utilization across the sciences is considerable. Non-causal explanations have been identified in physics
However, enormous attention was paid to mechanistic explanation, and it was developed at all possible levels of conceptual analysis (including the relationship to other explanations and the ability to accommodate them, including metaphysical conditions for defining mechanism, etc.), see Craver, Tabery (2015). From the traditional point of view, we could refer to Hempel (1965), where he describes a number of explanations other than causal ones. We will point this out below in connection with functional explanation in linguistics (see chapter 5.5 and Appendix 19).
(explanations through symmetries in fundamental physics, non-causal explanations of critical phenomena), biology and the life sciences (optimization, graph-theoretical and topological explanations), ecology (critical phenomena, topological explanations) etc. (e.g. Lange 2017, Batterman 2011, Huneman 2018, Huneman 2010, Kostić 2020, Kostić 2019). This dichotomy suffers symmetrically from the same problem as the previous one – here, too, causal explanation has been (over many decades) comprehensively investigated,60 but no comprehensive synthetic account of non-causal explanation has yet been given. There are only two comprehensive books on non-causal explanations – the already mentioned Lange (2017) Because Without Cause, and Reutlinger, Saatsi (eds.) (2018) Explanation beyond Causation. Lange's book is an impressive work, which has inspired a good deal of research on non-causal explanations in perhaps all their forms (see below), but it is more an initiating book than a comprehensive synthesis.61 The collective monograph assembled by Reutlinger and Saatsi contains a number of key contributions (especially Woodward's and Bokulich's, see below), but for understandable reasons it lacks a synthetic form even more than Lange's book does. In our opinion, the efforts to define non-causal explanations represent a positive example of an attempt to overcome the pluralism that characterizes the contemporary view of models of scientific explanation. In addition, they represent a challenge to those who promote the universality of causal explanations to re-conduct a conceptual inventory of causality. Simultaneously, the examination of non-causal explanations again creates space for emphasizing the importance of logical inference for models of explanation.
Models of explanation are already examined in this way by Khalifa, Millson, Risjord (2021), who offer the concept of sturdy inference, which could serve as a prototype explanation model (for both causal and non-causal cases) that, at the same time, preserves the asymmetric relation between explanandum and explanans. Proponents of non-causal explanations face several important challenges, of which we will briefly introduce three that are important for the further analysis in this chapter. At the same time, they are also important with regard to the application of models of non-causal explanation in linguistics. These tasks are to consider (1) whether there is a universal model of non-causal explanation,
From the classical research (see Salmon 1998) to the modern counterfactual form (see Woodward 2003). Sorin Bangu points to the oversized content of the book, see Bangu (2017).
(2) whether a non-causal explanation is feasible in a counterfactual framework,62 and (3) whether a condition of asymmetry63 between explanans and explanandum can be assured for non-causal explanations. A crucial question concerning the universality of non-causal explanations emerges: Is it possible, as with causal explanation (with examples in physics, chemistry, biology etc.), to define non-causal explanation positively? Or is it just a "residual" category delimited purely negatively – by the absence of a causal nexus? As we indicated above, at the moment we are still in a situation where we mostly collect individual examples; however, some universal possibilities are on offer, and they are closely related to the solution of the two remaining tasks (counterfactuals and asymmetry). Jansson and Saatsi (2019) are convinced that it is possible to create a universal model of non-causal explanation that would be formulated in a counterfactual framework and, at the same time, would meet the condition of asymmetry. They are convinced that it is possible to apply Woodward's counterfactual framework to formulate not only causal but also non-causal explanations.64 At the same time, Jansson and Saatsi consider this to solve the problem of asymmetry not only for causal but also for non-causal cases of explanation. The core of their view is expressed in this summary: In the counterfactual framework, explanations must involve an invariant, change-relating generalization that supports counterfactuals indicating an explanatory dependence of the explanandum on the explanans. (Jansson, Saatsi 2019, 837)
They argue that the universal counterfactual framework in Woodward's sense can be extended to all cases of non-causal explanations and can thus also express the asymmetry of the relation between explanans and explanandum – only the entities in the explanans together explain the explanandum, not vice versa. Jansson and Saatsi apply the change-relating counterfactual concept to the classical example of a non-causal explanation in graph theory – the Bridges of Königsberg (Jansson, Saatsi 2019, 841). The solution offered by Jansson and Saatsi, elegant at first glance, however hides certain conditions that may not be acceptable to advocates of non-causal explanations. First, they express the belief that sui generis non-causal explanations, without any causal admixture, do not exist (Jansson, Saatsi 2019, 841). This
We discuss the fulfillment of the counterfactual-support condition in more detail in the appendix (Appendix 4). This is the difficulty demonstrated above by the Flagpole Problem (see chapter 2.1). We focus in more detail on the issue of asymmetry in the paper Zámečník (2021); see also chapter 5.6. Woodward himself states this in Woodward (2018).
may be surprising, but it is understandable, considering that they write throughout about "explanatory abstractions". This means that causal and "quasi-causal" explanations differ in the extent to which abstractions are used in creating the explanation models (Jansson, Saatsi 2019, 821–824). Nevertheless, how should we understand this? We believe it is simply as stated above: there are no purely non-causal explanations, and, what is more, the explanatory nature of abstractions is conditioned by the presence of a causal nexus. It is easy to see that this solution will not be acceptable to Lange, who promotes the thesis of distinctively mathematical explanations of non-mathematical facts (see subchapter 2.2.2 and Appendix 5). But we believe that it should not be acceptable to any supporter of a wider set of non-causal explanations either, because that set, too, would be doomed.65 Lange (2021) rejects the universal solution offered by Jansson and Saatsi (2019): (. . .) There is no fully general account of what makes some facts explanatorily prior to others in non-causal scientific explanations. Rather, the order of explanatory priority is fixed by different considerations in different non-causal explanations. (Lange 2021, 3916)
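To fix intuitions for this dispute, it helps to make the Bridges of Königsberg case concrete: Euler's explanation turns only on the parities of vertex degrees in the multigraph representing the city, not on any causal process. The following sketch is our own illustration, not from the source; the vertex labels are arbitrary, and the seven edges encode the standard eighteenth-century bridge layout.

```python
from collections import Counter

# The seven bridges of Königsberg as edges of a multigraph.
# A = Kneiphof island, B = north bank, C = south bank, D = eastern island.
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

def odd_degree_vertices(edges):
    """Return the vertices touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)

# Euler's criterion: in a connected multigraph, a walk crossing every
# edge exactly once exists iff at most two vertices have odd degree.
odd = odd_degree_vertices(bridges)
print(odd)             # all four land sites have odd degree
print(len(odd) <= 2)   # False: no such walk exists
```

The explanandum (no walk crosses each bridge exactly once) depends counterfactually on the degree structure – add or remove a bridge and the answer can flip – which is exactly the kind of change-relating dependence Jansson and Saatsi invoke, and exactly the case where Lange denies that the counterfactual framework alone fixes the direction of explanation.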
Lange offers three possible solutions, which, however, he does not pursue in any depth. Two of them are connected with analytic metaphysics, so we will leave them without comment (Lange 2017, 23). The third solution proposes that explanatory priority is decided by the level in the hierarchy occupied by the explanatory component, or rather by its modal power – in Lange's conception, the distinction between laws and meta-laws. For Lange, this type of asymmetry is important in non-causal explanations through constraints (most often in physics) (Lange 2021, 3915–3916).66 This solution is also very important for our task; as we will see in the Second Interlude and in chapter 5.6, a similar explanatory strategy can be found in the quantitative linguistics of Luďek Hřebíček and Jan Andres (Hřebíček 2002a, Andres 2009). Nor does Lange trust the ability to express explanatory asymmetry via the counterfactual dependence between explanans and explanandum. He returns to the example of the Bridges of Königsberg and shows that neither Woodward's (2018) nor Jansson and Saatsi's (2019) solution is bulletproof. The core of Lange's argument is that in Woodward's conception, only those cases of
In this regard, it is worth recalling Morrison (2015), who rejects the independence of mathematical explanations in relation to non-mathematical facts (Morrison 2015, 50–57), but defends the distinctive importance of mathematical abstractions as sources of the explanatory power of scientific models (see also the subchapter 2.2.2 below). Cf. the chapter “There Sweep Great General Principles Which All the Laws Seem to Follow” in the book Lange (2017, 46–95).
counterfactuals are taken into account in which the explanans can be subjected to interventions, i.e. when it is not of a modal nature. However, Lange presents cases where both the explanandum and the explanans are modal (Lange 2021, 3899) and where asymmetry is thus lost (Lange 2021, 3899–3900) (for detail, see Zámečník 2021). ✶✶✶ In the last paragraphs, in an effort to get close to a current intellectual controversy, we have been immersed in the somewhat technical terminology of philosophy of science.67 This is a necessary price, because the support of counterfactuals is a necessary condition for a valid explanation. As we can see, the participants in the debate differ over whether it can also be considered a condition sufficient to express the asymmetry of explanation. Lange's argumentation is irritating in some respects – he rejects the solutions found so far but only outlines his own, without arguing for them systematically. There is, however, one thing Lange shows very nicely – a certain weakness of the counterfactual approach with respect to articulating explanatory asymmetry. In the case of causal explanations, the counterfactual solution to the asymmetry problem (Woodward 2003) works because the asymmetry is provided by the causal nexus itself (Lange speaks of a "causal arrow"). However, this is not possible for non-causal explanations: there we can postulate the existence of asymmetry (i.e. declare that apart from the "causal arrow" there is also a "non-causal arrow"), but in doing so we would make the solution too easy (Lange 2021, 3907). A defense of the counterfactual conception could be based on the claim that this conception leads us to reconsider some of our intuitions about the nature of scientific explanation.
As if it were enough for us to be able to construct a counterfactual dependence, because everything else would already presuppose an interpretation going beyond the received view – we would have to renew our intuitions about causal or other dependencies and take a step with Lange towards analytic metaphysics. On the other hand, is the intuition of explanatory asymmetry something we should get rid of, as a prejudice? Would anything then remain to distinguish explanations from descriptions? After all, if an explanation does not provide information about some type of asymmetric dependence, can we really understand it as an explanation? It seems that this approach would again lead us to a pragmatic view of theories (see chapter 2.1 above), in which a faithful reflection and recapitulation of
For a better understanding of some terms such as “change-relating counterfactuals”, “intervention”, etc. we attach Appendix 4.
scientific practice is decisive. It is therefore interesting that Lange, too, relies on scientific practice in his arguments when he declares: (. . .) scientific practice of explaining regularities is often very uniform in embracing certain directions of explanation rather than others. (Lange 2021, 3911)
However, Lange points out that these privileged directions are given by certain intuitions of scientists about the hierarchy of laws, as we have already noted above with reference to the concept of meta-laws (and see note 60). This is important for Lange because these scientific intuitions break the symmetry of counterfactual dependence – especially when scientists explain conservation laws by fundamental symmetries, but not vice versa. In Lange's conception, moreover, scientific intuition grasps some real asymmetric dependence; it would be strange if this asymmetric dependence were not real when scientists mostly respect it. Daniel Kostić proposes a pragmatic solution to the asymmetry problem, one that does not need Lange's intuitions. Asymmetry is grounded in the scientist's decision in choosing an explanatory strategy. Kostić expresses this when he defines Explanatory Perspectivism as one of the assumptions of his non-causal topological model of explanation: A is an answer to the relevant explanation-seeking question Q about B (. . .). (Kostic 2020, 2)
We will analyze Kostić's explanation model as a whole in chapter 5.6, where we will also use it to create a new non-causal model of explanation for Reinhard Köhler's system-theoretical linguistics. We will see that Kostić need not be interpreted as a strict supporter of a pragmatic approach to explanation,68 but here, by referring to his wording, we would like to point out that the pragmatic turn makes it possible to solve the problem of explanatory asymmetry very quickly. In each particular case, the arbiter of asymmetry is the scientist's intention, or the research strategy of a particular scientific community with its own history of successes and failures, through which it has found out what to rely on when answering questions of a given type.69 Bokulich's (2018) approach is also in line with this pragmatic view, showing how some phenomena can be explained in parallel in both causal and non-causal
For that, the other assumptions of his explanation model are too closely tied to the question of the veracity of scientific claims and to the question of the nature of scientific representations of the phenomena under study. At this point, we could enter the topic of the development of scientific theories, starting with Kuhn (1962).
ways. We take at least this pragmatic lesson from her definition of non-causal explanation: A non-causal explanation is one where the explanatory model is decoupled from the different possible kinds of causal mechanisms that could realize the explanandum phenomenon, such that the explanans is not a representation (even an idealized one) of any causal process or mechanism. (Bokulich 2018, 14)
We could understand this statement as a purely negative definition of non-causal explanation, i.e. an explanation in which there is no causal nexus. That, indeed, would only postpone the problem. However, we can also read the statement as the expression of a scientific strategy that systematically sets the causal nexus aside. Bokulich clarifies this a few lines above with reference to the concept of abstraction: (. . .) it is an abstraction across very different causal mechanisms – that this model explanation can be counted as non-causal. (Bokulich 2018, 14)
In this respect, then, scientists systematically, across different target systems, abstract from the causal mechanisms (which they assume to exist naturally) and thus arrive at a useful explanatory tool that is naturally called non-causal. That is, for Bokulich, non-causal explanation is delimited by a research strategy of dealing with simplifying assumptions. And while in some cases an explanation-seeking question cannot be answered via a causal interpretation at all, Bokulich shows that there are cases where both causal and non-causal explanations of the same phenomenon can be utilized for different specific purposes (Bokulich 2018, 14–31). To summarize the situation: Jansson and Saatsi (2019) search for (and, in their own words, find) a universal concept of explanatory abstractions within a counterfactual framework and with built-in explanatory asymmetry. Lange assumes a plurality of variants of non-causal explanation, and at the same time claims that stating counterfactual dependencies is not enough to define explanatory asymmetry. Kostić (as we shall see) finds a universal form of topological explanation building on a counterfactual framework, with asymmetry established, as we believe, primarily pragmatically – via the scientist's research strategy.
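What such a topological strategy looks like in practice can be shown with a toy model of our own devising, echoing the epidemiological example mentioned earlier: on an unweighted contact network, a purely structural property of the graph fixes how fast a deterministic infection saturates it, with no reference to any causal mechanism of transmission. The graphs and the helper function below are illustrative assumptions, not taken from the source.

```python
def steps_to_full_infection(adjacency, seed):
    """Deterministic SI spread: at each step, every neighbor of an
    infected node becomes infected. Returns the number of steps
    until all nodes in the (connected) graph are infected."""
    infected = {seed}
    steps = 0
    while len(infected) < len(adjacency):
        newly = {v for u in infected for v in adjacency[u]} - infected
        infected |= newly
        steps += 1
    return steps

n = 6
# Path graph 0-1-2-3-4-5: infection creeps one node per step.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
# Star graph: hub 0 connected to all leaves; two steps suffice.
star = {0: list(range(1, n)), **{i: [0] for i in range(1, n)}}

print(steps_to_full_infection(path, 0))  # 5
print(steps_to_full_infection(star, 1))  # 2: leaf -> hub -> everyone
```

Both networks have the same number of nodes; only their topology differs, and that difference alone answers the explanation-seeking question about the rate of spread – Bokulich's point that the explanans abstracts across whatever causal transmission mechanisms might realize it.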
2.2.2 Typology of non-causal explanations

Since the previous considerations are very abstract, and terminology-laden formulations may somewhat obscure the purpose of using non-causal explanations
and their place in the context of linguistic theories, we will now try to propose a typology of non-causal explanations, to help us return the considerations of the dichotomy (causal versus non-causal explanations) to a clearer level. The creation of a typology presupposes leaving some things aside – in the case of non-causal explanations, these are the considerations of metaphysical, mathematical and (partly also) distinctively mathematical explanations, which we mentioned above. Some details and references can be found in the appendix (Appendix 5). At the same time, we do not want to build the typology on an overview of already identified non-causal explanations, because this overview is very diverse (as we wrote in subchapter 2.2.1 above) and would also include a number of controversial cases (precisely in the area of distinctively mathematical explanations). Instead, we try to choose those types of non-causal explanations that can be said to be related to some mature theory, that are not explanations associated with “only” certain scientific models.70 Our criterion for a scientific non-causal explanation is simply that it has to be an explanation used by scientists within some developed theoretical construction, that is, not an ad hoc explanation. On the basis of such a purified conceptual space, and on the basis of studying contemporary approaches and finding exemplars, we dare to say that it is possible to delimit two basic types of non-causal scientific explanations: explanations that are based on the symmetry principles and explanations that are associated with the universality concept. Both are widely represented in physical theories, and the latter has an ever wider radius, overlapping with a number of other sciences and also with the social sciences.71 The symmetry principles represent the basic framework of fundamental physics,72 the basis of the standard model of particles and interactions (for an overview, see Stenger 2006).
The basic idea, which we will also utilize in linguistic examples, is the assumption that the symmetry of a certain type of space ensures the invariance of certain variables with respect to a certain set of transformations of that type of space. At the beginning of modern fundamental physics stand the space-time symmetries of classical and relativistic physics. Thus, the translational symmetry of the three-dimensional space of classical physics ensures the conservation of momentum, and the translational symmetry of one-dimensional time ensures the conservation of the energy of a dynamical system.
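The connection invoked here is, in physics, Noether's theorem; a minimal textbook-style sketch (our addition, not part of the analyzed texts) for the time-translation case runs as follows. If the Lagrangian $L(q, \dot{q})$ of a system does not depend explicitly on time, the energy function $E = \dot{q}\,\partial L/\partial\dot{q} - L$ is conserved:

```latex
% Time-translation symmetry: \partial L / \partial t = 0.
% Energy function: E = \dot{q}\,\frac{\partial L}{\partial \dot{q}} - L.
\frac{dE}{dt}
  = \ddot{q}\,\frac{\partial L}{\partial \dot{q}}
    + \dot{q}\,\frac{d}{dt}\frac{\partial L}{\partial \dot{q}}
    - \frac{\partial L}{\partial q}\,\dot{q}
    - \frac{\partial L}{\partial \dot{q}}\,\ddot{q}
  = \dot{q}\left(\frac{d}{dt}\frac{\partial L}{\partial \dot{q}}
    - \frac{\partial L}{\partial q}\right)
  = 0
```

by the Euler–Lagrange equation; analogously, if $L$ does not depend on $q$ (translational symmetry of space), the momentum $p = \partial L/\partial \dot{q}$ is conserved. Note that the derivation cites no cause of the conservation, only the symmetry.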
70 I believe that this includes a large number of examples of distinctively mathematical explanations, which are linked to specific examples (see Strawberry Problem, Bridges in Königsberg, etc.).
71 And the humanities, see Caldarelli (2007) for an overview.
72 It has long been an important topic in the discussions of philosophers of science, see, for example, van Fraassen (1989).
Consistent with what was stated by Lange above, physicists can say that these symmetries explain the conservation principles of classical physics. The explanation that physicists use in this way is not causal.73 The laws that are delimited in classical dynamics incorporate these basic symmetries. The modern standard model holds, in general, that all laws expressing relations between the variables and constants describing particles and interactions have to incorporate gauge symmetries.74 The symmetry principles establish a non-causal explanation sui generis and represent the core of fundamental physical theories. The question of the origin of the symmetry principles leads in philosophy of science (unless it is a pragmatic conception) mostly back to analytic metaphysics (see French 2014).75 We will not trace their origins; what is important is that they are widely used sources of non-causal explanations, which are an essential part of fundamental physical theories. Once the correct symmetry is formulated, it is possible to predict an appropriate conservation principle based on it, which is linked to the operationalizable properties of physical entities. That is, conservation principles can predict the existence of relevant entities. The situation is more complicated in the sense that many of these symmetries are broken; in such cases the cause of the symmetry breaking is predicted.76 In the Appendix (Appendix 6), we show by example how the finding of a broken symmetry leads in physics to the restoration of symmetry, i.e. to the finding of a more general symmetry. Although the symmetry principles are formulated mainly in the context of physics, we also find their analogies in other disciplines that rely on mathematically formulated laws; below we encounter them in the context of quantitative linguistics.77 However, it is much more common to find another variant of non-causal explanations outside the framework of fundamental physics – one based on the universality concept.
73 The classical laws of Newton’s dynamics are understood as consequences of these symmetries, or given conservation principles.
74 We comment on it briefly and with neglect of detail. For a more precise definition, see Stenger (2006). The standard model is based on quantum mechanics. For a more detailed definition, see Morrison (2015), who also shows the difference between theories that build on gauge symmetries and theories that rely on the concept of universality.
75 In addition to French’s structural realism, we also find transcendental interpretations, according to which symmetries express specific instances of the cosmological principle, which states that the correct formulation of scientific laws is point-of-view invariant; in other words, scientific laws can be found in the same form regardless of the position of the observer (of its location in space-time), see Stenger (2006).
76 A famous example is the prediction of the Higgs boson, see Baggott (2012).
77 French (2014, 324–352) proposes their application mainly in biology.
The area that incorporates the concept of universality is dynamical systems theory. While in physics (in the standard model) we witness a hierarchy of more and more general symmetries and a simplified inventory of physical entities, the theory of dynamical systems is interdisciplinary in the sense that it is not interested in the ontology of a given dynamical system under study, but in the universality of the behavior of dynamical systems across different ontologies. A characteristic problem is that of emergent phenomena in the behavior of complex dynamical systems. A specific feature of these phenomena is that their explanation does not require any knowledge of causal influences from the system’s micro level; it is enough to understand the relation (most often expressed by a power law) of several higher-level properties, represented by statistical variables.78 Behavior universality means that a simple universal mathematical representation in the form of a power law79 can be applied across systems with emergent phenomena, whether as phase transitions in thermodynamics, self-organization in biochemical systems or the emergence of regularities in human speech. Power-law exponents, then, often represent an important and often the only indicator of dynamical system behavior.80 Stephen Kellert and Peter Smith made dynamical systems theory popular in philosophy of science in the 1990s. However, their books (Kellert 1993, Smith 1998b) followed rather the epistemological and methodological implications of the theory, at that time fashionably referred to as chaos theory, than the issue of universality as a means of non-causal explanation. In this context, the current philosophy of science focuses on the above-mentioned emergent phenomena. It should be noted that the explanatory abstractions considered by Jansson and Saatsi (2019) are crucial for the theory of dynamical systems.
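The role of the power-law exponent as an indicator can be made concrete with a minimal sketch (entirely our illustration; the function name and the synthetic data are ours, not drawn from the analyzed authors): given frequencies ordered by rank — of words in a corpus, say — the exponent is read off as the negative slope of the log-log rank-frequency plot.

```python
import math

def power_law_exponent(frequencies):
    """Estimate b in f(r) ~ C * r**(-b) from a list of frequencies
    ordered by rank, via least squares on the log-log transformed data."""
    xs = [math.log(r) for r in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
            / sum((x - mx) ** 2 for x in xs)
    return -slope  # the exponent is the negative of the log-log slope

# An exact Zipf-like law f(r) = 1000 / r is recovered with exponent 1.
freqs = [1000 / r for r in range(1, 101)]
print(round(power_law_exponent(freqs), 3))  # -> 1.0
```

For real frequency data the fitted exponent varies by system; it is precisely the (near-)invariance of this single number across otherwise unrelated systems that the universality concept labels.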
Margaret Morrison (2015) points out that critical phenomena are also linked to the need to use completely unrealistic abstractions – such as the assumption of an infinite number of particles forming a dynamical system in the effective field theory. Thus, the topic of non-causal explanations encounters the issue of simplifying assumptions (see the chapter 2.1 above). We will focus on the issue of scale-free networks, in which universal behavior is typically manifested by the occurrence of power laws, in even more detail in subchapter 5.4.1. As we will see, in quantitative linguistics from the considerations of George K. Zipf and Herbert Simon, through Gustav Herdan to Gabriel Altmann and Reinhard Köhler, we can observe different ways of articulating non-causal explanations of speech behavior with support in the concept of universality. Luděk Hřebíček has popularized the direct connection with the theory of dynamical systems, and Jan Andres deals with it systematically (for both, see the Second Interlude, subchapter 5.4.1 and chapter 5.6). We believe that in philosophy of science, Philippe Huneman (2018, 2010) and the already mentioned Daniel Kostić (2020, 2019) have come closest to articulating a new model of non-causal explanation, which can incorporate both the concept of symmetry (and the specific symmetry principle) and the concept of universality. In chapter 5.6 we will try to show how a model of explanation able to replace the existing functional explanation in system-theoretical linguistics can be created by combining and modifying their two different models of topological explanation. The study of two dichotomies in the views of scientific explanation in the contemporary philosophy of science has led to the discovery of a number of conceptual means that can be applied to cases of linguistic explanations in the following chapters. In summary, we will show two applications: (1) utilization of topological explanation (Kostić and Huneman) to replace functional explanation in Köhler’s system-theoretical linguistics (see chapter 5.6); (2) utilization of the concept of meta-laws and distinctively mathematical explanation (Lange) for the case of linguistic explanations by Luděk Hřebíček and Jan Andres (see chapter 5.6 and the Second Interlude). We will also outline the possibility of utilizing mechanistic explanation (Craver and Darden) for the case of inductive research strategies in quantitative linguistics (and the unified approach) by Gabriel Altmann (see Conclusion 6).

78 Indeed, there is a connection with Haken’s synergetics and with a number of theories of self-organization, which have been developed since the 1960s and which have inspired system-theoretical linguistics (see the chapter 5.2 below).
79 Not always of the same type, as we will see below (see the chapter 5.2.1).
80 For an overview of various manifestations of universality across dynamical systems, see Caldarelli (2007).
Although linking linguistic perspectives to the results of a conceptual analysis of philosophy of science is valuable in itself, we would feel a considerable degree of inadequacy if we merely acquired instruments created by others and applied them in favor of linguistic explanations. Therefore, in the next chapter we will try to create our own minimalist model of scientific explanation, which we will, then, use as a standard in evaluating linguistic explanations found in the theories by Ferdinand de Saussure, Louis Hjelmslev, Noam Chomsky, Gustav Herdan and a number of quantitative linguists, ending up with, last but not least, Reinhard Köhler.
2.3 The principle-based model of scientific explanation

In the previous chapters, we outlined changes in the ways of philosophical reflection on scientific explanations, from the original prescriptive delimitation of what a scientific explanation should look like (the received view in the D-N model) to the current predominantly descriptive approach, which catalogues individual occurrences of scientific explanations across scientific disciplines. Probably no one doubts that these are complementary approaches that deserve their own research space. In the following chapter, we will try to draw more attention to the prescriptive point of view. We will distinguish scientific explanations from explanatory strategies – we will see scientific explanations as prescriptive tools for evaluating existing explanatory strategies. This maneuver, we believe, cannot be carried out on a purely pragmatic basis, because probably the only common denominator of all explanatory strategies would remain the usefulness of the explanatory strategy, defined vaguely. We, therefore, have to choose syntactic and semantic criteria. If a strict proponent of a pragmatic approach to explanation were to claim that these criteria are self-evident, we would argue that stating their self-evidence makes them vague.81 The other extreme, i.e. the search for a rule that should be followed by scientists under all circumstances, in all disciplines, etc., is, however, not the goal. This standard will be fulfilled in individual explanatory strategies to varying degrees, and there will be exceptions. It is the existence of exceptions that stimulates the effort for further conceptual analysis, the value of which we have justified above. In addition, we will be interested in this standard in relation to existing linguistic explanations.
At the beginning of the second part of the book, after analyzing examples (1) – (7), we came to the conclusion that we have to distinguish carefully between an argument, a scientific description, and a scientific explanation. And now we have also taxed ourselves with the task of doing so with prescriptive means. From this point of view, a scientific argument is graspable by means of logic,82 and has its specific nature, which is grounded in the structure of deductive judgment and deductive inference. Argumentation is a natural basis of scientific activity and deserves separate attention. The line between scientific description and scientific explanation is blurred, even from a prescriptive point of view. The main reason is the very wide use of the term description. We will, therefore, try to define the concept of scientific description more narrowly:

(I)a The scientific description illustrates the system under investigation through a structure that represents its recognized parts and their interrelationships.

(I)b The scientific description may also illustrate the process taking place in the system through functions with a time parameter. The scientific description therefore enables making predictions.

81 As when Giere claims that he does not need to clarify the concept of the truth of the model when analyzing the pragmatic role of models, see Giere (2006, 64–67).
82 We leave aside the topics of pragmatic aspects of argumentation procedures as well as the issue of cognitive aspects of argument-making. We remain purely prescriptive.
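The second variant, (I)b, can be rendered in miniature with Galileo's kinematic description of free fall (our toy example; the function name is ours): a function with a time parameter yields predictions without citing any cause of the motion.

```python
# Variant (I)b in miniature: Galileo's kinematics describes the free-fall
# process via a function with a time parameter, s(t) = (1/2) g t**2, and
# thereby predicts -- without citing any cause (force) of the motion.
def fall_distance(t, g=9.81):
    """Distance fallen (in meters) after t seconds, near the Earth's surface."""
    return 0.5 * g * t ** 2

print(fall_distance(2.0))  # -> 19.62
```

Only Newton's mechanics, discussed below, supplies what this description lacks: the gravitational cause behind the constant g.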
An example of the first variant of the scientific description definition can be Linnaeus’s taxonomy of species. It represents individual species and the relationships of kinship among these species on the basis of the external similarity of the life forms of species, i.e. phenetically. With the advent of molecular biology, cladistic taxonomy, which explains the kinship of species on the basis of a common evolutionary past expressed by changes in DNA, could be set in opposition to phenetic taxonomy. Mendeleev’s periodic table of elements (similarly to Linnaeus’s taxonomy) shows that scientific descriptions are often guided by the use of some type of hierarchy – from simple to complex, from lighter to heavier, from older to younger, etc. Clarifying the occurrence of a hierarchy can be part of a scientific explanation, which then replaces the description. This is the case for the periodic table of elements when the means of quantum mechanics are applied to the structure of the atom. Galileo’s kinematics is an example of the second variant of the description. It represents individual objects and their relationships via a spatial arrangement, and allows us to capture a process in the system with mathematical functions embedding a time parameter.83 As we stated above, Newton’s mechanics, via the laws of motion and gravitational interaction, offered an explanation of motion by elucidating its causes.84 In the case of linguistics, we can also find descriptions of both kinds. Linguistic structuralism as well as generativism could be examples of the first type of description, while etymological descriptions comply with the second type. In the following chapters devoted to the analysis of selected texts by de Saussure, Hjelmslev and Chomsky, we will attempt to answer whether it was possible to conceive the given linguistic approaches as explanatory. We offer one more definition of description, which specifies what description, compared to explanation, lacks:

(II) The scientific description is an illustration85 of the structure (and the process) of the system (relationships between parts of the system as a whole), without providing information on the principles, causes, etc. that are responsible for the system state and development.86

83 Torretti draws attention to those aspects of Galileo’s mechanics that have the character of an argument (Torretti 1999, 20–30).
84 Descriptions, indeed, also play an important role in modern science; it is enough to recall biology, specifically, for example, proteomics, in which the individual types of secondary and tertiary structures of proteins are described.
Statement (II) specifies that if we want to move up from describing the structure (and the process) of the system to a scientific explanation, it is necessary to identify the principles, causes, etc., that are responsible for the implementation of the system structure (and process). Everything would be even simpler if we distinguished between scientific descriptions and scientific theories. Only such a conceptual structure that explicates the principles, causes, etc. of the state and development of the system could then be called a scientific theory.87 This distinction is tempting, but it would mean a systematic terminological change in the humanities. Or should we be willing to admit that linguistic structuralism and generativism are not theories? Since it is common in linguistics to think of a systemic description as a theory, we will not introduce this terminological change. We were able to define the difference between a scientific description and a scientific explanation by referring to the obligation of the scientific explanation to explicate the principles, laws, causes, mechanisms, etc., which are responsible for the state and development of the system under study. At the same time, of course, scientific descriptions can be parts of scientific explanations. Now, we face the task of defining scientific explanation by carefully examining the possibilities of its basal component – the differences between the principles, laws, causes, mechanisms, etc. We start with a radical thesis on the nature of scientific explanation:
The Radical Thesis: All explanatory strategies that do not implement a causal nexus fall into the category of scientific descriptions. The only valid model of scientific explanation is causal explanation.88

85 These illustrations are made via representations that can be of various natures – e.g. mathematical functions (equations), graphical representations, etc. Within the model-based view of theories these representations are systematically described, see Giere (2006), Giere (1999).
86 Scientific descriptions may or may not implement logical inference, see Giere (2004, note 4, 744).
87 We will not explicate here the issue of defining a scientific theory. For a basic definition of the syntactic and semantic view of theories, see e.g. Rosenberg (2005, 69–111); for a pragmatic view of theories, see e.g. Giere (2006, 59–95).
Above, we have already clearly declared our confidence in the existence of non-causal explanations. However, an undeniable advantage of the radical thesis of scientific explanation lies in the identification of the basal component with the cause: we explain scientifically only when we discover the causes of the studied phenomena. Moreover, the radical thesis seems to be in line with basic intuitions about explanation, since the distinction between Why-questions and How-questions can intuitively resonate with the distinction between causal scientific explanation and scientific description. Although this interpretation of Why-questions and How-questions is common (e.g. see Woodward 2003, Salmon 1998) and aims at defining science as an activity seeking to discover the causes of phenomena, there is also a different and older interpretation. Following this interpretation, typical of logical positivism, scientific questions are formulated exclusively as How-questions, because Why-questions fall into the realm of metaphysics. This does not mean, however, that science is made up of only a sum of scientific descriptions. The general “descriptiveness” of science is related to the modest notion of the logical positivists that science allows us to organize empirical bases, but does not give us an insight into the structure of reality. The reader will find an explication of the relationship between realism and antirealism in the contemporary philosophy of science in the Appendix (Appendix 7). The radical thesis about the exclusive role of causality in scientific explanations leads us too close to metaphysics. Moreover, we saw above (chapter 2.2) that there are explanatory strategies and models of explanation that are strictly non-causal. From the point of view of conceptual analysis, the radical thesis requires a clear definition of the causal nexus,89 which is not achieved without difficulties.
The main difficulty is to define the hierarchy of causal relations among various levels of the system in a given discipline, but also among individual disciplines.90
88 We stated above that it would be significantly easier for us to solve the problem of the asymmetry of explanation if only causal cases were recognized as explanations. We have already referred to Skow (2016) and Skow (2014) above.
89 E.g. Reutlinger uses a Russellian definition of causality; for more details see Reutlinger (2018).
90 The simplest solutions assume the reduction of all causal relations to physical causal relations. An attempt to postulate autonomous causal potency at a level other than the basic (i.e. physical) level always leads to contradictions, or to the epiphenomenalism of the higher-level causal domain, see Kim’s dilemma for non-reductive physicalism, cf. Kim (2005, 39–52). As we saw above, when we examine specific scientific strategies and theories, we find that even at the physical level we cannot define a simple hierarchy of causal domains – see the above issue of emergence in philosophy of science (see subchapter 2.2.2).
Causal explanations do form an important part of scientific explanations, but as we stated above (chapter 2.2), we can hardly say that they are the only one. Explanatory strategies are much more varied, and it seems inadequate to impose the radical thesis on them. In addition to all the examples we gave above (the symmetry principles, the concept of universality, etc., see subchapter 2.2.2) and in addition to the new approaches to explanation, which we expressed through the illustration of two dichotomies of explanation models (see chapter 2.2), it is necessary to take into account the specific explanations found in linguistics – formal and functional explanations – which we will deal with in detail in the following parts of the book. It is, therefore, necessary for us to put more general demands on the basal component than the radical thesis requires. It is not necessary to sacrifice our efforts to find purely linguistic explanations and to assign the status of scientific descriptions to all of them, because even the physical precedent leads us to believe that the causality of scientific explanation cannot be exclusive. We will, therefore, try to delimit the explanatory thesis in a moderate form.

The Moderate Thesis: All explanatory strategies that do not implement some form of lawlike connection fall into the category of scientific descriptions. Models of explanation based on some form of lawlike connection are valid models of scientific explanations.
This thesis requires further explication with regard to the definition of the term “lawlike connection”. The debate on defining the concept of a scientific law and on its role in scientific explanation is extensive in philosophy of science.91 We are not able to evaluate the conclusions of this debate, or rather its various versions of a solution, and select the winning proposal for the correct form of the scientific law. Although this may be seen as a weakness in our effort to define scientific explanation, we do not think it is a fundamental shortcoming. We have two basic reasons for this statement, and a third reason will show the next direction of our considerations.92
91 As a certain neutral result, the scientific law can be defined as a universal statement ∀x(F(x) → G(x)), valid always and everywhere and supporting the counterfactual conditional (see subchapter 2.2.1 on asymmetry above). However, every such result is problematic, for various reasons: To what extent can scientific laws be known without knowledge of the laws of nature? How does the condition of law validity “always and everywhere” change with regard to the different concepts of time and space in different theories? Is it possible to accommodate all three conditions in accordance with the cosmological principle (see note 75)? For an overview, see Carroll (2016).
92 In the Appendix, we add an example of the pitfalls of using the term “scientific law” in its deterministic and statistical variants, see Appendix 8.
The first reason is our conviction that solving the problem of the status of scientific laws cannot be achieved without the use of analytic metaphysics. We have already stated above that this is not the way we will proceed, as we are in favour of a transcendental view of the capabilities of human reason (see also Appendix 7). The second reason is an effort to follow up on the debates that took place in philosophy of science over the texts of van Fraassen, Giere and Cartwright, because we consider them the most important events in philosophy of science at the turn of the century, when philosophies of individual disciplines began to emerge in large numbers. And for these authors, a critique of the scientific law concept is important (see van Fraassen 1989, Giere 1999, Cartwright 1999, Cartwright 1983). The last reason, and as we have stated a decisive one, relates to an issue that recurs in the mentioned texts (van Fraassen, Giere, Cartwright) – how to define a theory when we do not have the concept of law? We recalled above (see chapter 2.1) that the solution was found in the semantic or model-based view of scientific theories. In this conception, it is simply stated that a theory is composed of a set of models reflecting aspects of the world and a set of hypotheses specifying in what respect the individual models are suitable for the representation of a given aspect of the world. We will use the direct explication of the model-based view of theories (MOT) given by Giere: In this picture [MOT], scientists generate models using principles and specific conditions. The attempt to apply models to the world generates hypotheses about the fit of specific models to particular things in the world. Judgments of fit are mediated by models of data generated by applying techniques of data analysis to actual observations. Specific hypotheses may then be generalized across previously designated classes of objects. (Giere 2006, 60–61)
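Giere's picture — principles generating models, and hypotheses about the fit of models to data — can be caricatured in a few lines of code (entirely our illustration; none of the names come from Giere, and the "observations" are invented):

```python
# A toy rendering of the model-based view (MOT): a "principle" plus
# specific conditions generates a family of models; a hypothesis asserts
# that one such model fits a particular data set.
def linear_model(a, b):
    """Principle 'y depends linearly on x' plus specific conditions (a, b)."""
    return lambda x: a * x + b

def fit_error(model, data):
    """Model of data: mean squared deviation between model and observations."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

observations = [(0, 1.0), (1, 3.1), (2, 4.9)]    # hypothetical measurements
candidate = linear_model(2.0, 1.0)               # one model from the family
print(fit_error(candidate, observations) < 0.1)  # hypothesis of fit holds -> True
```

Note what is absent from the sketch: nothing in it explains why the data obey the model; judging fit is, on this view, all the theory does.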
This quotation shows that theory is viewed simply as a means for the scientist to describe the reality around him. We believe that it clearly shows that MOT, in principle, abolishes the traditional theory concept along with the concept of law. Theory is no longer seen here as something that explains some aspect of the world, but as a machine generating effective descriptions of the real system. Giere, on the other hand, refers to principles (and specific conditions). How are we to understand these principles? Giere tries to show systematically that the principles presented are simply other models (albeit the most abstract models of all,93 see Giere 2006, 61–62). So why does he use another term? Is it reminiscent of the
traditional theory concept, which is based on the found lawlike connection and thus allows explanation? No matter what Giere actually thinks in the end, we have an opportunity to grasp the proposed term “principle” and conceptualize it. We do not consider theory as a “mere” means of classification, but as a source of explanatory power, which is manifested exactly in the form of a principle. Technically, we choose the term “principle” because Giere proposes it in this way, and because the term “law” may seem somewhat anachronistic in the context of contemporary philosophy of science. Otherwise, of course, these are only terms – it is their content that is important. We will, therefore, stick to understanding lawlike connection in the sense of the dependence among entities expressed by the scientific principle. This will allow us to promote the hypotheses (which Giere is talking about) beyond their purely mapping meaning (searching for similarities) and upgrade them again, in a unified form, to a theory that reveals structural and processual invariants expressed in principle(s). We, therefore, replace the basal component in the definition of scientific explanation with the concept of a scientific principle:

The scientific principle is a universal rule that expresses the necessary and invariant properties of the system in terms of the system structure or the process running in the system.

93 And here follows the already mentioned (chapter 2.1) possibility of the naturalization of philosophy of science, when these principles as abstract models are related to the cognitive setting of a scientist, cf. the chapter “Perspectival Knowledge and Distributed Cognition” in Giere (2006, 96–116).
A detailed explication of the individual concepts (invariant, system property, system, structure and process in the system) that are part of the definition of the scientific principle will be filled with specific content (in the following parts of the book) when analyzing individual linguistic explanations (systemic in structuralism, formal in generativism, functional in system-theoretical linguistics), and then also in relation to the newly proposed variants of explanations in quantitative linguistics (topological and distinctively mathematical). Here, we only postulate the starting point of a valid model of scientific explanation, according to which systems are exposed to scientific research, to performed explanatory strategies (the pragmatic viewpoint), which can culminate in a valid scientific explanation (the syntactic and semantic viewpoint), isomorphic, not just similar, regardless of the chosen domain or scientific discipline. That is, we can define a minimal domain-independent model of scientific explanation (i.e., the same for physics as for linguistics). The original discrepancy between explanatory procedures in individual disciplines is eliminated by the choice of the principle as the universal basal component of explanation. The minimalist definition of a scientific principle can also be paraphrased so that it is possible to find scientific principles and construct scientific explanations whenever scientific activity is able to identify the structure of a system (or a
2.3 The principle-based model of scientific explanation
process that runs in the system). The difference among systems is given by the specific nature of the implementation of principles. Above, we showed the dichotomy of causal and non-causal explanations (see chapter 2.2), in which we can distinguish two types of homonymous principles. In the case of non-causal explanations, we identified cases of principles of symmetry and cases of using the concept of universality. In summary, the following are offered as variants of scientific principles for linguistics: non-linguistic principles of the special sciences (in connection with cognitive aspects of generativism and system-theoretical linguistics), causal principles (marginally included in the context of quantitative linguistics), principles based on the concept of universality (in connection with functional and topological explanations in system-theoretical linguistics), principles of symmetries (in structuralism and in system-theoretical linguistics) and purely mathematical principles (in relation to a distinctively mathematical explanation in quantitative linguistics). Based on the analysis of the concepts of lawlike connection and the scientific principle, we can proceed to a further formulation of the moderate thesis of scientific explanation:

The Adjusted Moderate Thesis: All explanatory strategies that do not implement the explanatory principle fall into the category of scientific descriptions. Valid models of scientific explanations are models of explanation based on the explanatory principle.
We will ground our formulation of the model of scientific explanation, which has already been presented in detail, on this adjusted version of the moderate thesis. We will gradually examine the nature of the explanatory principles in the context of linguistic explanations – we have to analyze thoroughly whether we do not contradict ourselves, whether some traditional linguistic explanations, which we preliminarily (within the formulated dilemma of linguistics) consider as scientific descriptions, do not follow principles that could serve as the basis component of explanation as we have enunciated it here. When formulating the model of scientific explanation, we proceed from the belief that scientific explanation has to be a valid logical argument, which incorporates the explanatory principle as a basis component. Schematically, we can illustrate the explanation model in its rough94 form as follows:
We are still speaking of a logical argument, although the diagram does not relate statements, but principles (with conditions) and phenomena.
2 Philosophy of scientific explanation
Explanans: Scientific principle(s) + initial (and boundary) conditions95
Explanandum: Observed phenomenon

Our model does not deny resemblance to Hempel's D-N model, but differs from it on a number of important points which, we believe, make the automatically arising criticism difficult. Above (see chapter 2.1), we have already referred to Halvorson's defense of the syntactic view of theories (Halvorson 2016). We agree with him that the syntactic view of theories did not reduce theories to mere linguistic entities. Therefore, we also do not formulate the model of explanation (see the diagram above) as a relationship between sentences. Strictly speaking, a deductive structure can be constructed solely between sentences, but the semantic property of our model of explanation is that the entities in both the explanans and the explanandum are realized on the basis of a certain objectively conceived structure. We believe that this semantic extension of a purely syntactic conception also prevents the re-application of an argument based on the problem of explanatory symmetry. The problem of symmetry for the D-N model emerges at the moment when we perceive it purely as a relationship between sentences. Moreover, we have shown that the causal nexus is not the only type of dependence between explanans and explanandum, and we believe that our model of explanation also incorporates non-causal explanations. We prefer the new name "principle-based model of explanation", which also expresses the other, albeit more marginal, difference compared to the D-N model of explanation. What is really important is that the principle-based model of explanation captures the necessary relationships in the structure of the examined entities. It would also be possible to reformulate Hempel's model into an N-D (nomological-deductive) model and, thus, point out the meaning of dependence, which is expressed in the principle and which is not exhausted by syntactic inference.
Schematically, we can express the principle-based model of explanation by explicating its syntactic and semantic components as follows:

Syntactic level: Deductive inference
Explanans: Sentence(s) expressing the scientific principle(s)
Sentences describing initial and boundary conditions
Explanandum: Sentence expressing the observed phenomenon

Semantic level: The structure expressed by principles
Explanans: The scientific principle(s)
Initial and boundary conditions
Explanandum: The observed phenomenon

We leave aside the CPC issue, see note 18.
With this twofold scheme, we want to emphasize that the scientific explanation does not only express a special type of deductive inference in which some general statements are given the specific status of scientific principles. The scientific explanation expresses the structure of the necessary relationships among the entities specified in the principle(s), which the conditions enter unambiguously (automatically establishing an explanatory asymmetry), and thus points to the place of the observed phenomenon in this structure. At the syntactic level, it is then possible to speak of deductive inference of sentence(s) about the observed phenomenon(a) from the sentences contained in the explanans. Therefore, we choose the name principle-based model of explanation. We believe that this conception of explanation leaves open space for both realistic and transcendental conceptions of theories (see also Appendix 7). We were able to define the necessary and sufficient components of scientific explanation, which allowed us to formulate a new universal principle-based model of explanation (PBE). The original intuition that this model has to include a logically valid argument supplied with a lawlike statement in a causal or non-causal form has matured into a shape in which we put the structure delimited by the principle(s) supplied with a valid deductive inference. We develop Halvorson's (2016) conception that the semantic structure is mapped onto a syntactic structure. In the following parts of the book, we will gradually endeavor to identify the principles of individual linguistic approaches and theories, and we will attempt to establish principle-based models of explanation based on them. If this turns out to be possible, then we will develop the minimalist PBE with regard to the specifics of the identified principle.
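The deductive core of the syntactic level can be sketched as a standard first-order inference. This is our own illustrative rendering, not a formalism proposed in the text; the symbols $C$ (the initial and boundary conditions), $E$ (the explanandum property) and $a$ (the examined system) are our choices for the sketch:

\[
\frac{\forall x\,\bigl(C(x)\rightarrow E(x)\bigr) \qquad C(a)}{E(a)}
\]

The semantic level, by contrast, reads the major premise not as a sentence but as a structural dependence that assigns the phenomenon $E(a)$ its place in the system; the asymmetry of explanation is then fixed by the direction of this dependence, not by the direction of the inference.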
We would like to point out that we will attempt to do this especially in the case of system-theoretical linguistics (see part 5), where we will try to transform a functional explanation (based on the synergetic principle) into a topological explanation (based on the invariance principle) on the basis of PBE. Indeed, there remain some open questions, which we will try to answer in the following chapters. Can it be shown that systemic and formal explanations are “only” scientific descriptions and that they do not provide the principle(s) constituting the explanations? Can the principle-based model of explanation and its extension deal with traditional problems (primarily a problem with functional
equivalents, see chapter 5.5) in the original functional explanation of system-theoretical linguistics? How exactly can we incorporate the economization and optimization principles, which we recognize in system-theoretical linguistics, into a clearly defined structure of the principle(s) of system-theoretical linguistics? And is it possible to strictly distinguish scientific description from theory in linguistics at all?96 Our task in this chapter was to make the prescriptive strategy of philosophy of science visible and, in a positive sense, to set it against the current mainstream of philosophy of science. Indeed, we agree that it is valuable to be inspired by the specific sciences – in the end, the original concepts of philosophy of science, in its received view, were inspired by science – physics – and only then were, sometimes97 too automatically, exported as universal claims to other sciences. Even in the case of linguistics, it is, therefore, valuable to analyze without prejudice the explanatory strategies that arise autonomously here and then subject them to philosophical and scientific evaluation, exposing them to the prescriptive requirement, which has been formulated here in the form of the principle-based model of explanation.98
For example, Köhler chooses the name of system-theoretical linguistics for his theory and follows structuralism positively (Köhler 1986, 6), but he does not seem to give it the status of a full-fledged linguistic theory, partly because it is unable to systematically grasp the diachronic aspects of language. After all, Hempel knew that a different type of explanation was needed for biology than for physics – and he proposed a functional analysis (see chapter 5.5 and Appendix 19). We have stated several times that today's philosophy of science is markedly pragmatic, sometimes eclectic (we have not avoided it either) and often benevolent, driven by scientific practice. Philosophers of science gradually focus on individual disciplines, often marginalized in philosophical reflection, and make their conception of theories and explanations explicit. So the fact is that when a sufficient group of linguists begins to promote a description as an explanation, contemporary philosophers of science will begin to look for a way to agree with them. The price is, indeed, the pluralism of the conceptual system and the cataloging nature of the activities of philosophy of science.
3 Systemic explanation in linguistics

Urbe imperio exuta multi in Italia sermones, una tamen lingua deest
Pendéros Fantasmatos, Varia99
In the introduction, we made a slightly radical statement that all comprehensive linguistic theories were presented by their creators as a means of scientific explanations of linguistic phenomena. But perhaps we have just fallen into the terminological trap we pointed out above – on the one hand, as philosophers of science, we automatically expect a theory to provide a means of explanation (inspired by natural science); but on the other hand, we neglect that in the humanities and social sciences, demands on the delimitation of a theory are often softer. Perhaps we subject structuralism to too strict criteria, and it would be inadequate to deny it the status of a theory. The fuzziness is removed when we recall that in the introduction, we attributed the explanatory effort to the original formulations of linguistic structuralisms100 and not to structuralism as one of the current general concepts used in the humanities. As a general tool, structuralism in the humanities is descriptive, but this descriptiveness is not necessarily part of the original concepts. Therefore, the best solution is to search the original source texts by de Saussure and Hjelmslev, to attempt to find the presumed explanatory intention in them, and to reconstruct it. The method of our conceptual analysis will change in the following chapters because we cannot approach the linguistic content of the analyzed linguistic theories in a sufficiently informed way. We also admit openly that the selected texts analyzed, however legitimately we may consider them central, constitute only a fraction of the relevant texts that could be examined. In the case of Ferdinand de Saussure's structuralism, we will analyze excerpts from the Cours de linguistique générale.101 We will get closest to the issue of linguistic explanation by analyzing de Saussure's conception of linguistic law from a synchronic, diachronic and panchronic point of view. In the case of Hjelmslev, we will
We thank Jiří A. Čepelák for discovering this remarkable statement. We refer here only to de Saussure, but of course he is not the only source of linguistic structuralism. And we refer to the Course in General Linguistics as a book by de Saussure; however, we are aware of the complicated origin of this collection of lectures by de Saussure. We will refer to the English edition: de Saussure, F. (2011) Course in general linguistics. New York: Columbia University Press, translated by Wade Baskin, edited by Perry Meisel and Haun Saussy. In some important cases, we will take into account the French version: de Saussure, F. (1971) Cours de linguistique générale. Paris: Payot. https://doi.org/10.1515/9783110712759-003
analyze the introductory chapters of the Prolegomena to a Theory of Language,102 where questions about the nature of linguistic theory and linguistic explanation emerge in connection with the definition and analysis of the principle of analysis.
3.1 The missing panchronic view

Scientific disciplines seek their symbolic origin, and linguistics finds it in de Saussure's linguistic structuralism.103 If we link structuralism with the emergence of linguistics as an autonomous science, we then look for a certain distance from the considerations of the past and the emancipation of scientific considerations about language from the tow of other scientific disciplines. The distance concerned the considerations of the second half of the 19th century, which remained closed in the philological dimension of linguistics, in the demanding diachronic research on the origin of languages by comparative-historical methods. Emancipation was related to the critique of positivism,104 which found its expression in linguistics (among other things) among the neogrammarians. The neogrammarians were inspired by the methods of the natural sciences; they sought, often successfully, to find regularities applicable to linguistic facts through the analogy of physics and linguistics, and through the ideal of an exact description of facts. Peter Grzybek identifies the fascination with the deterministic laws of physics and the effort to build the study of language as a natural science as one of the main features of the neogrammarian view. He cites Bopp's "phonetic law" as one example (Grzybek 2006, 4). He describes the tendency of the neogrammarians as follows:

Many a scholar in the second half of the 19th century would elaborate on these ideas: if linguistics belonged to the natural sciences, or at least worked with equivalent methods, then linguistic laws should be identical with the natural laws. Natural laws, however, were considered mechanistic and deterministic, and partly continue to be even today. (Grzybek 2006, 4)
Structuralism rises against the neogrammarians; however, it does not represent a separation of linguistics from the ideal of exactness. On the contrary, it represents a sophisticated use of mathematics, not as a tool for the quantitative grasp of linguistic
We will refer to the English edition: Hjelmslev, L. (1969) Prolegomena to a Theory of Language. Madison: The University of Wisconsin Press, translated by Francis J. Whitfield. For historical and systemic evaluation of structuralism see Butler (2003), Nöth (1990). Given that system-theoretical linguistics chooses the successor of logical positivism as a philosophical-scientific starting point (Mario Bunge, Carl Gustav Hempel, etc.), the question of its relation to the structuralist heritage of linguistics is all the more interesting (see the fifth part of the book).
phenomena, but as a formal means of expressing a linguistic theory (description). With general linguistics, structure and system arrive, but structure and system are concepts of formal reasoning. In de Saussure's linguistic structuralism, indeed, we can see a manifestation of the critique of positivism, but at the same time, we should confess that this critique has manifested itself in the sciences in various ways. We usually associate the concept of Geisteswissenschaften, phenomenology and psychoanalysis with the critique of positivism; but the critique of positivism is, of course, also the building of formal theories – modern logic is one of the forms of the critique of positivism. Therefore, we do not have to see structuralism in tight connection with the listed "protest theories", which are still often mentioned today, with phenomenology105 and psychoanalysis.106 We prefer to see structuralism close to the reasoning of logicians and mathematicians, at the time of the crystallization of symbolic logic (and also its iconic satellite), at the time of the birth of reflections on topology and graph theory.107 In such a view of structuralism, some concepts playing a central role in the context of the humanities become a natural aspect of defining a formal system. For example, arbitrariness could be perceived as a property of a variable in a calculus. The language system does not suffer if we replace a language expression while leaving the structure of the system preserved. However, we have to replace (substitute) the language expression systematically at all places where it occurs in the system. Although the further development of the humanities has to a large degree emptied the concept of structuralism, which has become a target of criticism of various post-structuralisms, we want to make visible its original form, reconstructed in the effort to define the system formally. Such structuralism is immune to post-structuralist criticism.
We can deconstruct (within the framework of scientific discourse) structures within our power and created by us – we can, to some extent, shift our biological essence, and reshape social reality (change power relations, institutions, etc.).108 Although phenomenology was very close to formal procedures, at least initially. We are thinking in particular of Husserl’s critique of psychologism in logic. Some formal, mathematical means have been preserved in the phenomenological method – for example, the concept of eidetic variation and finding the eidetic invariant is analogous to the above mentioned (see the subchapter 2.2.2) conception of spatial symmetries and conservation laws in physics, as introduced by Emmy Noether. In addition to positivism, structuralism was also a critique of subjectivism, from which phenomenology (and psychoanalysis) failed to detach itself. Peirce’s semiotics can be given a similar status in parallel, precisely with regard to considerations of topology and iconic logic. On iconic logic, see Shin (2002). Following the debates on realism in philosophy of science, see Appendix 7.
However, post-structuralisms often mix types of structures, and they intend to deconstruct even those structures that are not in our grasp.109 Such are not only the structures of the physical world, which Foucault leaves chastely to their own destiny, as the realm of the natural sciences, while focusing on the transformation of the human sciences (Foucault 2002, 375–422). A structure as an abstract entity that allows us to define a system of relationships is also independent. We consider that this cannot be changed in any way within the framework of scientific discourse; it cannot be denied without abandoning structure and system as means that can explicate something to us. We can disrupt structures, but not the very principle of structuring. And apparently in this way, as a purely explicatory formal framework, structuralism is intended in de Saussure's linguistic conception. But how is the linguistic theory/description intended in de Saussure's conception, and does it make sense to talk about linguistic explanations within it? We will now focus mainly on the analysis of chapter three of part one of the Course in General Linguistics (CGL), entitled "Static and Evolutionary Linguistics" (de Saussure 2011, 79–100, hereinafter referred to as CGL).110 In the first two chapters of part one of the CGL, de Saussure formulates, or rather makes visible,111 four basic principles: the arbitrariness of the sign, the linearity of the signifier, the continuity of the sign, and the variability of the sign.112 We dare say that the first two principles refer to the language system as an abstract entity, while the whole quartet refers to the language system in use by speakers. This distinction eliminates the apparent contradictions that lead, among other things, to the fact that arbitrariness is not absolute when viewed in the system used by speakers.
De Saussure directly states that a grammarian or logician can make a change in the system, but it has no effect on the speaker (CGL, 73).113 At first glance, it seems that the principle of arbitrariness of a sign and the principle of linearity of the signifier are the principles of building a formal system, and do not go beyond the systemic description. From the perspective of logic
This almost intellectual cliché was revived by Laurent Binet in The 7th Function of Language (Binet 2017). Having regard to the whole of this section entitled “General Principles” (CGL, 65–100). He states that arbitrariness and linearity are obvious. “No one disputes the principle of the arbitrary nature of the sign, but it is often easier to discover a truth than to assign to it its proper place.” (CGL, 68). “While Principle II [The Linear Nature of the Signifier] is obvious, apparently linguists have always neglected to state it, doubtless because they found it too simple; nevertheless, it is fundamental, and its consequences are incalculable.” (CGL, 70). We consider the terms that we do not explicate in this chapter to be rudimentary. Their intervention can create a new language – artificial or formal.
(discussed above), we can make the substitution of signifiers in one-dimensional notation – as is the case in other formal systems.114 When explicating the principle of continuity, de Saussure states that: "(. . .) the only real object of linguistics is the normal, regular life of an existing idiom" (CGL, 72). When using language, the speaker follows the law, which is something: "(. . .) that is tolerated and not a rule to which all freely consent" (CGL, 71). This law, viewed in the system in use, is a convention whose observance is automatic, a convention which cannot be freely changed because the sign is arbitrary, i.e. because there is no algorithm assigning the signifier to the signified. De Saussure classifies language as a human institution (CGL, 75), but unlike other institutions, language is much more prone to change (see the principle of variability) because other institutions: "(. . .) are all based in varying degrees on the natural relations of things (. . .)" (CGL, 75). Language as a system in use is exposed to the passage of time, and the arbitrariness of signs makes it more susceptible to change. "Time changes all things; there is no reason why language should escape this universal law" (CGL, 77). By defining these four principles, de Saussure prepared the constitutive elements of his linguistic description or language theory. Can we begin to build a scientific explanation from them alone? For example, the following could be formulated:

Explanans: The principle of the arbitrariness of the sign
Condition: Language is used in the linearity of writing (speech), i.e. it is used in time (this contains the principle of linearity as well as the "law" of the passing of time).
Explanandum: The language changes

Is the principle of the arbitrariness of the sign an explanatory principle that can play a part in the principle-based model of explanation?
We do not think so, although we cannot imagine a general semiotic system in which it does not apply.115 For the question arises: do we need the principle of arbitrariness to explain language change? The condition in the explanans refers to the use of language over time, and de Saussure himself states that the "law" of change in the passing of time applies; that is, the principle of arbitrariness is superfluous.
Strictly speaking, the arbitrary relation between the signifier and the signified is not based on convention of choice, but has its relative motivation. Even in the extended (bio)semiotic field, arbitrariness is stated as a necessary but not a sufficient condition of a semiotic system. Cf. Lacková, Matlach, Faltýnek (2017).
Is the principle of arbitrariness not a denial of the very possibility of any explanatory principle of linguistic change? Based on it, we can state that language changes, but we cannot predict how it will change. It has been expressed resolutely above – there is no algorithm that controls the assignment of the signifier to the signified. An explanation that offers only such a vague prediction, which is moreover a prediction of what, from a historical point of view, we "normally observe", is not a validly created explanation. The principle of the arbitrariness of the sign fails as an explanatory principle. Based on these four above-mentioned principles, we are not able to build a valid principle-based model of explanation.

✶✶✶

Does de Saussure offer any major further extension of his linguistic conception which would allow us to formulate a structuralist explanation sui generis?116 In the aforementioned central chapter three, de Saussure focuses on defining linguistics as an irreducibly dual scientific discipline that has its synchronic (or static) and diachronic (evolutionary) components.117 The fatefulness of defining the synchronic axis against the diachronic axis is complemented by de Saussure's challenge:

In these fields scholars cannot organize their research rigorously without considering both co-ordinates and making a distinction between the system of values per se and the same values as they relate to time. (CGL, 80)118
It seems that the basic difference between synchrony and diachrony is already expressed in the basic principles: arbitrariness, linearity, continuity and variability, which we pointed out above (abstract system versus system in use). In chapter three, however, de Saussure explicates the concept of the linguistic law, and this promises prima facie another possible basis for a model of scientific explanation. In his own words, de Saussure is radically at odds with the linguistic tradition when inclining clearly towards synchronic linguistics: "Ever since modern linguistics has come into existence, it has been completely absorbed in diachrony"
The term "structural explanation" is used in philosophy of science mainly in the context of structural realism. Cf. French (2014), see Appendix 7. In Herdan's linguistic theory, we find again an emphasis on linguistic duality, in his case the duality of language as choice and language as chance (see the First Interlude). At the beginning (CGL, 79), however, he argues that some disciplines are time-independent – he mentions astronomy and geology. Paradoxically, the CGL was published in 1916, the very year in which the general theory of relativity emerged, a theory that would radically change astronomy, in fact establishing cosmology, according to which the Universe originated and evolved over time. The singularity of the Big Bang represents radical symmetry breaking. As Jiří Langer put it in the discussion: "What will happen to all the symmetries in the Universe that has its beginning?"
(CGL, 82). Structuralism, however, prioritizes the view of the state of the system, of its standstill in time. De Saussure states: "That is why the linguist who wishes to understand [emphasis mine] a state must discard all knowledge of everything that produced it and ignore diachrony" (CGL, 81).119 De Saussure recalls that diachronic linguistics was seen as more scientific, while: "Classical grammar has been criticized as unscientific (. . .)" (CGL, 82). However, the newer – neogrammarian – linguistics deserves more criticism because it has not been able to distinguish clearly between "states" and "successions" (CGL, 82). These evaluations imply that for de Saussure, the criterion for being a science is the ability to define clearly the differences between synchrony and diachrony that are inherent in scientific research. This distinction will allow linguists: "(. . .) a better understanding [emphasis mine] of language-states" (CGL, 83).120 De Saussure uses the term "understand" ("comprendre"), which we will try to interpret below with regard to the nature of his theory and a possible structuralist explanation. De Saussure does not deny the significance of the diachronic fact, which is an "(. . .) independent event; the particular synchronic consequences [emphasis mine] that may stem from it are wholly unrelated to it" (CGL, 84).121 De Saussure adds that diachronic facts are not aimed at changing the system: they always strike a specific element of the system, without any intent (CGL, 84). And the system does not change by itself (see above): "Neither was the whole replaced nor did one system engender another; one element in the first system was changed, and this change was enough to give rise to another system" (CGL, 85).
And he further claims: "The diachronic perspective deals with phenomena that are unrelated to systems although they do condition [emphasis mine] them" (CGL, 85).122 Together with Gustav Herdan, we can talk about coincidence, which thus enters into the formation of the system through diachronic facts (see the First Interlude). De Saussure himself states that we come across the: "(. . .) ever fortuitous nature of a state" (CGL, 85) because:
In the French version: « Aussi le linguiste qui veut comprendre [emphasis mine] cet état doit-il faire table rase de tout ce qui l’a produit et ignorer la diachronie. » (De Saussure 1971, 117, hereinafter referred to as CLG). In the French version: « (. . .) fera mieux comprendre [emphasis mine] les états de langue. » (CLG, 119). In the French version: « Donc un fait diachronique est un événement qui a sa raison d’être en lui-même; les conséquences synchroniques [emphasis mine] particulières qui peuvent en découler lui sont complètement étrangères. » (CLG, 121). In the French version: « Dans la perspective diachronique on a affaire à des phénomènes qui n’ont aucun rapport avec les systèmes, bien qu’ils les conditionnent [emphasis mine]. » (CLG, 122).
(. . .) language is not a mechanism created and arranged with a view to the concepts to be expressed. We see on the contrary that the state which resulted from the change was not destined to signal the meaning with which it was impregnated. (CGL, 85)
At the end of paragraph 3, de Saussure defines the field of linguistics, i.e. synchronic linguistics: "Language is a system whose parts can and must all be considered in their synchronic solidarity" (CGL, 87). Diachronic facts are obvious; they randomly influence the elements of the system and, thus, "cause" a change in the system, but this area of diachrony is not an essential field of linguistics. The above-mentioned quotations ("synchronic consequences", "do condition them") imply that de Saussure considers diachronic facts as the causes of a system change. Unlike de Saussure, Reinhard Köhler (see chapter 5.2) takes seriously the possibility of including in linguistic theory the "exterior" of a system, which "squeezes" the linguistic system (through needs or requirements) in its state and even in its dynamics.123 This allows Köhler to formulate a functional model of linguistic explanation. De Saussure could take a step towards a functional explanation, especially when he even implicitly works with diachronic facts as "causes" of changes in system elements. However, he does not consider this step important, because he grounds linguistics on a synchronic point of view. From the results of the analysis reached so far, we conclude, unsurprisingly, that the principles that structuralism would be willing to incorporate into a principle-based model of explanation cannot be of a causal nature. Therefore, in addition to the unsuitability of the principle of arbitrariness, we also have to exclude any principle expressing a causal nexus as an explanatory principle. De Saussure expresses the relationship between synchrony (S) and diachrony (D) by three examples: the projection (S) of a body (D) onto a plane, a cross section (S) of a plant stem (D) and the states (S) before individual moves in a chess game (D) (CGL, 87–89). He himself values the third example the most, and it is also the most often mentioned.
It is in explicating this example that he also very aptly excludes diachrony from linguistic interest, when describing the move of a chess piece: The change effected belongs to neither state: only states matter. (CGL, 89)
Despite this, the last example does not seem to us the most appropriate, although de Saussure likens the rules of chess to “constant principles of semiology”. However, to think of a chess piece’s move according to the rules as a diachronic fact of language is misleading. The principle of arbitrariness, as a basic semiological principle, incorporates randomness, but randomness can be thought of here only as the arbitrary choice of a possible move for the selected piece. In the end, de Saussure himself concedes that the players would have to be witless; indeed, they would have to choose arbitrarily, regardless of the goal of victory in chess (CGL, 89). The examples of projection and cross section are much more interesting. They clearly associate language in use (speech, parole) with an object (in the latter case, even a biological object comes into consideration) and the language system (langue) with the section (or projection). A projection always depends on a solid figure, and individual sections reveal the structure of the stem. Individual projections and sections depend on the projected (or cut) body; in the end, they draw it as an invariant.124 In the case of the stem, the overall arrangement of the stem is also taken into account (we can also make a longitudinal cut); so here, chance is not powerful in the same way as in chess. In the end, is de Saussure’s inclination toward the chess example proof that he really establishes structuralism in linguistics as a “mere” description? The stem example evokes the method used not long before de Saussure by Henri Poincaré for the qualitative analysis of dynamical systems (see Galison 2003). Poincaré’s maps of the complex trajectories of celestial bodies revealed the invariant properties of their dynamics – dynamics that defied analytical description by “brute force”.125 Why could this not be the case with language in its diachrony? Did de Saussure thus make it impossible for himself to reach a real linguistic explanation, in this case a non-causal one (we will return to these issues in the fifth part of the book)?
This seems strange, but as we will see (in chapter 5.2), the definition of the linguistic system differs between de Saussure and Köhler – for de Saussure, it is a system of solidarity between signs; for Köhler, it is a system of mutual parameterized relationships of linguistic quantities (and outside-linguistic requirements).
When comparing the synchronic and diachronic methods, de Saussure inadvertently confirms our assumption that he is implicitly considering causal links, admitting that it is in a sense essential to know the genesis of a given state: “(. . .) the forces that have shaped the state illuminate [emphasis mine] its true nature, and knowing them protects us against certain illusions” (CGL, 90).126 However, he does not trust the result of such an approach because, although it removes
Again, the similarity of the structuralist approach to the registration of transformation invariants appears, as we have shown above (see subchapter 2.2.2) and as we will show below (see the Second Interlude and Appendix 6). In dynamical systems theory, it is possible to reveal some properties of a dynamical system by classifying its attractor, even though the complex causal network is not accessible to analysis. Cf. Peitgen, Jürgens, Saupe (2004), Smith (1998b), Kellert (1993). In the French version: « (. . .) les conditions qui ont formé cet état nous éclairent [emphasis mine] sur sa véritable nature et nous gardent de certaines illusions (. . .). » (CLG, 128).
3 Systemic explanation in linguistics
certain illusions, “it leads everywhere” (CGL, 90). The synchronic description provides certainty and unambiguity, and de Saussure clearly prefers it. We come to the final point, where de Saussure explicates the three variants of laws in linguistics: synchronic, diachronic, and panchronic laws. Synchronic and diachronic laws, although in opposition, together form a pair defined against potential panchronic laws. This pair is linked to language as a social institution: Since language is a social institution, one might assume a priori that it is governed by prescriptions analogous to those that control communities. Now every social law has two basic characteristics: it is imperative and it is general; it comes in by force and it covers all cases – within certain limits of time and place, of course. (CGL, 91)
De Saussure is generally skeptical of the concept of a “language law”, saying that: “(. . .) speaking of linguistic law in general is like trying to pin down a ghost” (CGL, 91). As de Saussure states, synchronic law is general but not imperative, while diachronic law is imperative but not general (CGL, 92–93). If we follow the analogy with the laws of society, it becomes clear why neither of them can be conceived as a principle in a well-established principle-based model of explanation. Rules in language as a system do not allow exceptions, but there is no “power” or “authority” to enforce them. After all, as de Saussure points out, the Latin language system did not allow exceptions, yet it nevertheless disappeared, for example in favor of the French language system (CGL, 92–93). “In short, if one speaks of law in synchrony, it is in the sense of an arrangement, a principle of regularity” (CGL, 93). This statement leads us to the conclusion that the central systemic approach in de Saussure’s structuralism is a systemic description, not an explanation. Conversely, diachrony postulates a cause (a “dynamic factor”): in this sense it is coercive, imperative (CGL, 93). However: “(. . .) we can speak of law only when a set of facts obeys the same rule, and in spite of certain appearances to the contrary, diachronic events are always accidental and particular” (CGL, 93).127 This again confirms that de Saussure always prioritizes the semiological principle of arbitrariness. And, as we shall see, this is also confirmed in de Saussure’s definition of the panchronic approach to language. De Saussure agrees that other than “legal” concepts of linguistic law can also be considered, i.e. laws in the spirit of science (e.g. physics), which are general and
At this point, it is appropriate to recall Foucault’s hermeneutic positivism, by means of which (among other things) Foucault dealt with structuralism; cf. Veyne (2010, 16).
valid always and everywhere.128 He even directly claims that these panchronic laws exist in linguistics (CGL, 95), but then he muddies the matter when he declares: But they are general principles existing independently of concrete facts. When we speak of particular, tangible facts, there is no panchronic viewpoint. (CGL, 95)
How can we think of a law in the spirit of science that exists independently of (concrete) facts? Does this mean that such a law is untestable? Does de Saussure mean that we cannot formulate a panchronic law from partial concrete facts through induction? If so, then his skepticism is justified, but it also means that he defines scientific law in a way that is anachronistic (perhaps it would have suited the Neogrammarians). A scientific law, as a hypothesis, has to meet the condition of falsifiability: from it we must be able to deduce testable predictions. However, if a panchronic law in linguistics has no relation to facts, it cannot be falsified and is, therefore, not a scientific law. Would it help de Saussure to distinguish between specific facts and general (abstract) facts? He hints at something of the sort when he presents an example of phonetic changes that always occur, but whose specific course cannot be predicted (CGL, 95). However, the very distinction between general principles and general (abstract) facts remains unclear. Should the ubiquity of phonetic changes not be understood rather as such a general principle? And if not, then what can we consider a general principle from the panchronic point of view? The only meaningful option left is to consider the general principles of semiology as general panchronic principles. However, this brings us back to our original attempt to formulate a principle-based model of explanation for linguistic structuralism, based on the principle of arbitrariness. We have shown that this is impossible. De Saussure’s view of panchronic law confirms our interpretation of the principle of arbitrariness as an invalid principle, in terms of the commitments that are obligatory for a principle in the principle-based model of explanation.
The principle of arbitrariness in fact stands in opposition to any meaningful concept of a panchronic law for linguistics, unless such a law is understood as the statement that time changes everything. We believe that we can now reasonably state that de Saussure’s linguistic structuralism is not explanatory and does not establish a model of explanation; it is formulated as a synchronic systemic description. De Saussure wants to
This is a somewhat vague definition; see, e.g.: “(. . .), physicists seek universality, formulating their laws so they apply widely and do not depend on the point of view of any particular observer.” Stenger (2006, 61).
understand129 the state of language. Linguistic theory/description is meant to reveal something we were not aware of, to reveal something new. However, this is compatible with the concept of linguistic theory as a description (see chapter 2.3 above). These considerations will also take on new forms in the context of today’s distinction between explanation and understanding, as defined by the philosophy of science.130 In this sense, we can interpret de Saussure as a philosopher of science who realizes that in linguistics it makes no sense to demand a law-based explanation. Given the nature of the principles of semiology, we can strive “only” to understand the individual states of language. Nevertheless, in two respects we can speak of a certain explanatory potential of de Saussure’s structuralism. De Saussure takes both of these possibilities into consideration to some degree, but does not choose them because, we think, he understands them as paths leading beyond linguistics itself. The first way (initiated primarily in the synchronic point of view) leads to non-causal structuralist explanations sui generis, to the definition of mathematical structures that are interpreted as general explanatory principles of the state, and possibly also the development, of the investigated system. The extent to which such explanations are open to linguistics, and the extent to which they are sui generis linguistic explanations, will be examined below (see chapter 4.1 and the Second Interlude). The other way (initiated in the diachronic point of view) leads to functional explanations that give up the purity of linguistic theory and mobilize the outside-systemic factors that are responsible for interventions in system elements. As we will see, Köhler follows this line in defining system-theoretical linguistics (see chapter 5.2 below).
De Saussure’s attempt to define linguistic theory is, thus, an example par excellence of our linguistic dilemma: the consistent pursuit of a purely linguistic theory prevents de Saussure from formulating this theory as explanatory. De Saussure’s supposed systemic explanation is limited to a systemic description. Therefore, we will consider another variant of systemic explanation that could succeed: Louis Hjelmslev’s systemic approach to language.
See above: “That is why the linguist who wishes to understand [emphasis mine] a state must discard all knowledge of everything that produced it and ignore diachrony.” (CGL, 81) “(. . .) a better understanding [emphasis mine] of language-states.” (CGL, 83). On the topic of understanding in the philosophy of science, see e.g.: Khalifa (2017), de Regt, Leonelli, Eigner (2009), Friedman (1974).
3.2 In the name of the principle of the analysis
At first glance, Hjelmslev offers us a clear, unambiguous answer to the question of whether he will provide us with a systemic explanation of linguistic phenomena. In chapter three, “Linguistic theory and empiricism”, of the book Prolegomena to a Theory of Language (PTL), Hjelmslev first stipulates that: “(. . .) a theory must be capable of yielding, in all its applications, results that agree with so-called (actual or presumed) empirical data” (PTL, 11). This close connection between linguistic theory and empiricism suggests almost a natural-scientific conception of theory. Subsequently, however, Hjelmslev replaces the term “theory” with the term “description” when defining the empirical principle: The description [emphasis mine] shall be free of contradiction (self-consistent), exhaustive, and as simple as possible. The requirement of freedom from contradiction takes precedence over the requirement of exhaustive description. The requirement of exhaustive description takes precedence over the requirement of simplicity. (PTL, 11)
However, we will not let ourselves be persuaded by the use of terminology alone, and we will continue to examine whether Hjelmslev’s systemic description carries elements of scientific explanation. Note that Hjelmslev sets the empirical principle against “an assertion of inductivism” (PTL, 11), that is, against the idea of building a theory through the generalization of particularities, the method of generalization and synthesis of particular empirical knowledge. He attributes this inductive approach to standard linguistics (PTL, 11–12), to which he opposes his own approach.131 Hjelmslev defines the method of language theory as deductive and empirical at the same time (PTL, 13). We interpret this conception in the spirit of the nascent syntactic view of theories (see chapter 2.1 above), the birth of which can be traced back to the texts of Karl Popper, Carl Gustav Hempel, Ernst Nagel, Rudolf Carnap, and others. These views differed from the original logical-positivist idea of building a theory inductively from experiential data, through generalization. As is well known, one of the early critiques of logical-positivist inductivism comes from Popper. We believe that this interpretation is supported primarily by the analysis of chapter five, “Linguistic theory and reality”, where Hjelmslev clarifies what he means by the term “theory”. First, he refuses to understand the theory as “a system of hypotheses” that are declared true or false through verification (PTL, 13–14). This is, we believe, precisely a rejection of inductivism and verificationism
“(. . .) induction leads from fluctuation, not to constancy, but to accident. It therefore finally comes in conflict with our empirical principle: it cannot ensure a self-consistent and simple description.” (PTL, 12).
at the same time. He then defines the theory as a double entity, with respect to its arbitrariness and appropriateness, where arbitrariness defines the theory as a structure independent of experience, while appropriateness defines the theory as applicable to the description or explanation (we do not yet know which) of experience (PTL, 14). Hjelmslev directly states that: “(. . .) the empirical data can never strengthen or weaken the theory itself, but only its applicability” (PTL, 14). Hjelmslev in fact introduces a distinction between pure and applied theory (see, e.g., Papineau 2012, 50–55). The pure theory, as an abstract structure, is incorrigible; it is a validly created structure that exists independently of experience. In the applied form, we then assign empirical content to some terms of the theory, and on this basis we can decide whether or not the deduced theorems meet the empirical conditions given by experience. Again, we can say that this conception is very close to the received view in the philosophy of science. Hjelmslev even writes that: “On the basis of a theory and its theorems we may construct hypotheses (including the so-called laws), the fate of which, contrary to that of the theory itself, depends exclusively on verification”132 (PTL, 14). Hypotheses (laws) are theorems supplemented by empirical content.133 Such a modern approach to scientific theory could, of course, be compatible with the D-N model of explanation. So, can we find an example of a valid principle-based model of explanation in Hjelmslev’s view of language? Unfortunately, we have to answer in the negative. Hjelmslev provides the justification for this negative judgment in the same chapter. The constructed structure of language theory leads Hjelmslev to the question of whether and where to look for the axioms (or postulates) of the theory of language (PTL, 14–15).
Hjelmslev assumes that they can be found in the theory of knowledge, but at the same time, he adds an important point: [Axioms] are tracked back so far and they are all of so general a nature that none would seem to be specific to linguistic theory as opposed to other theories. (PTL, 15)
As with de Saussure, we see that Hjelmslev fails to positively define the basic principles of language theory, ones that would not be overly general and that would, instead, be specifically linguistic. Beyond this, Hjelmslev’s subsequent clarification of the bond between the theory of knowledge and the theory of language provides other important information on Hjelmslev’s idea of language theory:
The explicit mention of verification (without falsification), we believe, does not connect Hjelmslev with the logical positivists, although, unlike Popper, who uses the concept of corroboration of a theory through experience, he does think of a theory as true/false. This is an example of the fact that although Hjelmslev defines the theory in the context of a syntactic view of theories, he does not conceive of the theory as a “linguistic” structure, for which the syntactic view of theories is sometimes criticized (see chapter 2.3 above).
We are thereby forced in some degree to invade the domain of epistemology [emphasis mine], (. . .) Our procedure here is based on the conviction that it is impossible to elaborate the theory of a particular science without an active collaboration with epistemology. (PTL, 15)
It is as if Hjelmslev built not one of the theories of language, that is, a particular theory within a linguistic discipline, but a metatheory of any linguistic theory. In a way, and also due to its reference to epistemology, his effort to build a linguistic (meta)theory recalls Hempel’s building of a universal model of scientific explanation.134 And although Hjelmslev does not specify what exactly “epistemology” means, Hempel, too, has to follow certain general principles of epistemology in accordance with the requirement of universality.135 In Hempel’s D-N model, the syntactic view of a scientific theory is already clearly defined – can we expect, mutatis mutandis, that Hjelmslev’s conception of linguistic (meta)theory defines his model of explanation? In general, yes. It would be a deductive conception in which axioms would play the role of Hempel’s laws, but, unfortunately, not on a specific level. Hjelmslev’s conception of a general model of language theory does not provide a specific well-formed linguistic explanation, at least until specific linguistic axioms are identified, just as Hempel’s D-N model is not a concrete well-established scientific explanation per se until specific scientific laws are identified. We have to keep examining whether Hjelmslev will bring us any closer to specific linguistic principles. And, indeed, Hjelmslev’s goal is to emphasize the specificity of language theory. This distinguishes Hjelmslev’s construction of linguistic theory from “(. . .) all previous undertakings of linguistic philosophy” (PTL, 11). Hjelmslev is probably the first philosopher of linguistics in the sense of the philosophy of science, not in the sense of the (analytic) philosophy of language. Hjelmslev concludes chapter five with a statement that dampens our hope: Linguistic theory, then, sovereignly defines its object by an arbitrary and appropriate strategy of premisses.
The theory consists of a calculation from the fewest and most general possible premisses, of which none that is specific to the theory seems to be of axiomatic nature. The calculation permits the prediction of possibilities, but says nothing about their realization. (PTL, 15)
We will have to consider below the characteristics which Hjelmslev attributes to the presuppositions of language theory if he refuses to grant them an axiomatic
Hempel sought to extend the D-N model beyond the natural sciences, cf. already Hempel, Oppenheim (1948). These are determined in the assumptions for a valid D-N model of explanation, cf. Hempel, Oppenheim (1948); see also the beginning of part 2 above.
nature. For the moment, this makes it difficult for us to build a principle-based model of explanation, because the nature of the principles has to be clear. Just as in de Saussure we can find formulations attesting to the epistemic goal of linguistic theory (see chapter 3.1), so in Hjelmslev we can read that the goal of language theory is “(. . .) to indicate a method of procedure for knowing or comprehending a given object” (PTL, 16). For Hjelmslev, this object is specifically and primarily the text, as the starting point of the analysis. Again, we may state that, as before with de Saussure, it is adequate to talk about the understanding that we gain through language theory.136 We do not think that Hjelmslev makes any clear distinction between the meta-theoretical conception of language theory and its specific use as the theory of some particular language(s). This may hold because Hjelmslev assumes that language theory can be modified with an expanding set of analyzed texts, taking into account the empirical principle, so that we still have a self-consistent and exhaustive, though not necessarily the simplest, description. Perhaps most concisely, Hjelmslev describes the goal of language theory as follows: The aim of linguistic theory is to provide a procedural method by means of which a given text can be comprehended through a self-consistent and exhaustive description. But linguistic theory must also indicate how any other text of the same premised nature can be understood in the same way (. . .). (PTL, 16)
Furthermore, Hjelmslev specifies that the theory, as a self-consistent and exhaustive description of the objects examined, should be used not only to describe and predict texts in a particular language, but also to describe and predict, “(. . .) on the basis of the information that it gives about language in general, any possible text composed in any language whatsoever” (PTL, 17). This confirms our claim that there is no strict distinction between the theory of language as a metatheory of linguistics and a specific language theory. We can understand knowledge of language in general only when we do not have texts (or processes) in mind in the first place, but the language or system (PTL, 20). We deduce from this that Hjelmslev’s language (meta)theory is a systemic description. The situation is similar to de Saussure’s view of structural description, however, without Hjelmslev aspiring to consider principles (at least for the time being) that could be potentially explanatory (like de Saussure’s principle of arbitrariness).
However, the English translation also contains the statement: “explained in the light of the structure” (PTL, 19–20).
With regard to the appropriateness and arbitrariness of language theory, Hjelmslev further specifies the method of creating a specific systemic description:137
From certain experiences, (. . .), the linguistic theoretician sets up a calculation of all the conceivable possibilities within certain frames. These frames he constructs arbitrarily: he discovers certain properties present in all those objects that people agree to call languages, in order then to generalize those properties and establish them by definition. From that moment the linguistic theoretician has – arbitrarily, but appropriately – himself decreed to which objects his theory can and cannot be applied. He then sets up, for all objects of the nature premised in the definition, a general calculus, in which all conceivable cases are foreseen. This calculus, which is deduced from the established definition independently of all experience, provides the tools for describing or comprehending [emphasis mine] a given text and the language on which it is constructed. Linguistic theory cannot be verified (confirmed or invalidated) by reference to such existing texts and languages. It can be judged only with reference to the self-consistency and exhaustiveness of its calculus. (PTL, 17–18)
Is systemic description similar in some respects to empirical generalization? Does Hjelmslev not contradict himself, given how he defended his view against inductivism? We think it is similar, because the language theorist is always bound by the set of his limited experience,138 but it differs in the process of analysis, the application of the empirical principle being always the same – in this respect, the linguistic metatheory still works. There is, however, the problem that the definition of “recognizing something as a language” is not entirely clear. How does this happen, and based on what principles? Hjelmslev’s assumption that there may be more linguistic theories that differ “(. . .) in the sense of “approximations to the ideal set up and formulated in the ‘empirical principle’”” also seems unclear to us (PTL, 19). Hjelmslev even argues that: “One of these must necessarily be the definitive one, (. . .).” (PTL, 19) We find this unclear because the language metatheory is still the same for all these theories, and what distinguishes them is actually only the breadth of experience that theorists have at their disposal. The intuition that recognizes a given object as a language has to be the same for everyone. So the only difference between theories may eventually lie in the third criterion formulated within the empirical principle – the definitive theory is the simplest theory. Suppose, then, that we have a definitive language theory that has chosen its framework so as to achieve the simplest description (with obvious fulfillment of the remaining
We believe that the analogy between Hjelmslev’s systemic description and Carnap’s Logische Syntax der Sprache would be worth thinking through thoroughly; see Carnap (1934), especially in relation to Carnap’s formation rules and transformation rules. Similarly, this problem returns in generativism with regard to the reliance on native speakers (see chapter 4.1).
conditions of the empirical principle). Let us ask what this definitiveness means. In some respects, Hjelmslev proposes that we must follow a limit path toward it, as a constant approximation to the final theory (PTL, 19). Only one thing seems certain: the empirical principle always has to apply, and the theory exists as a double entity (with regard to appropriateness and arbitrariness); that is, only the metatheory remains certain. Language theorists remain trapped in the set of their own experience, their language experience. Hjelmslev provides, in the purest possible form, a systemic description that seeks the maximum degree of linguistic autonomy. It is, in a sense, the ideal achievement of linguistic theory, one which makes do with purely linguistic means. However, as we have already stated, the price is the descriptiveness of the theory; or can paths to explanations also lead from Hjelmslev’s systemic description, as we recognized above in de Saussure’s case (see chapter 3.1)? We will attempt to answer this question once we are clear about what Hjelmslev means by the term “language/system” and once we understand the possibilities of “intuitively” inferring the linguistic nature of the object. ✶✶✶ We will now focus on Hjelmslev’s explication of the principle of the analysis of the language object. Hjelmslev launches it by introducing a system of definitions, which should ideally lead to removing all the axioms on which the theory is built (PTL, 21). He proposes the use of a system of formal definitions that will allow us “(. . .) [to anchor] them relatively in respect to other objects, similarly defined or premised as basic” (PTL, 21). This strategy again shows that Hjelmslev seeks to build a systemic description – he wants to build a coherent network of established terms (their definitions), which will be isolated from the exterior of the system and will stick to its own structure of functions.
The effort to capture the principle of analysis is part of the effort to build the system of definitions, whose “deepest strata” are those at which we “(. . .) must treat this principle of analysis. They must establish the nature of the analysis and the concepts that enter into it” (PTL, 21). Hjelmslev emphasizes the connection between the principle of analysis and the requirement of an exhaustive description (PTL, 22); thus, the principle of analysis acquires contours different from a mere method of analysis. In fact, we are beginning to understand why we should talk about a principle. Hjelmslev rejects the “naively realistic” idea that we can divide known objects into parts at our discretion. Hjelmslev adheres to a realism that makes “mutual dependences” visible: (. . .) the important thing is not the division of an object into parts, but the conduct of the analysis so that it conforms to the mutual dependences between these parts, and permits us
to give an adequate account of them. In this way alone the analysis becomes adequate and, from the point of view of a metaphysical theory of knowledge, can be said to reflect the “nature” of the object and its parts. (PTL, 22)
In the principle of analysis, we assume that the “mutual dependences” (which Hjelmslev will further specify) exist objectively, and the application of the principle of analysis lies in finding them in the object of analysis.139 Hjelmslev specifies his conception of realism as structural: (. . .) the “objects” of naive realism are, from our point of view, nothing but intersections of bundles of such dependences. That is to say, objects can be described only with their help and can be defined and grasped scientifically only in this way. The dependences, which naive realism regards as secondary, presupposing the objects, become from this point of view primary, presupposed by their intersections. (PTL, 23)
Hjelmslev considers structural realism to be scientific and domesticated in science.140 This brings us, together with Hjelmslev, to the very edge of metaphysics and poses a special dilemma: should we admit that structural realism is the true scientific interpretation of the world, while object-oriented realism is naive metaphysics? Or should we admit that in other sciences, too, we create “only” systemic descriptions? Instead of answering, we will ask a more specific form of the question: How would we react if someone suggested considering the use of, for example, “symmetry principles” in physics to be the creation of systemic descriptions of physical phenomena? Can we consider structural realism in any respect more scientific than object realism (which Hjelmslev here calls naive)? We are afraid that nothing but our metaphysical preferences entitles us to do so. The only thing that can easily be argued is that the structural approach is used to a greater extent in a certain discipline at a certain period141 – every scientific theory has to use some “metaphysical borrowings” to build its basic concepts (see the Second Interlude and Appendix 7).
Hjelmslev expresses the meaning of mutual dependences as follows: “(. . .) both the object under examination and its parts have existence only by virtue of these dependences; the whole of the object under examination can be defined only by their sum total; and each of its parts can be defined only by the dependences joining it to other coordinated parts, to the whole, and to its parts of the next degree, and by the sum of the dependences that these parts of the next degree contract with each other.” (PTL, 22–23). “The recognition of this fact, that a totality does not consist of things but of relationships, and that not substance but only its internal and external relationships have scientific existence, is not, of course, new in science, but may be new in linguistic science.” (PTL, 23). However, he admits that: “(. . .) Saussure, who sought “rapports” everywhere and asserted that a language is a form, not a substance, recognized the priority of dependences within language.” (PTL, 23). And in this form, indeed, the structural view was widespread in Hjelmslev’s time in physics. However, it is a question of what sciences Hjelmslev specifically had in mind.
3 Systemic explanation in linguistics
The ability to distinguish intuitively between different types of mutual dependencies (or linking clusters) is part of the principle of analysis. Hjelmslev distinguishes three: interdependence, determination and constellation.142 For all three, pairs of terms are then chosen, depending on whether they relate to the text/process or the language/system (PTL, 24–25). The details are not important to us; it is only worth mentioning that he subjects the (linguistic) theory itself to this principle of analysis, referring to it as a hierarchy of definitions for which it holds that: The functions between the definitions are determinations, since the definitions designed to be placed early in the process (or system) of definitions are presupposed by those designed to follow later, but not vice versa. (PTL, 25)
An interesting question arises: how to ground the set of dependencies that Hjelmslev considers real. The answer will also influence our decision whether the principle of analysis can be considered a valid principle in the principle-based model of explanation. Hjelmslev explicitly defines the principle of analysis as follows: It may be taken for granted that a text and any of its parts can be analyzed into parts defined by dependences of the sort discussed. The principle of analysis must, consequently, be a recognition of these dependences. It must be possible to conceive of the parts to which the analysis shall lead as nothing but intersection points of bundles of lines of dependence. Thus analysis cannot be undertaken before these lines of dependence are described in their main types (. . .). (PTL, 28)
What do we mean when we want to find the basis of mutual dependencies? Primarily, what are they supposed to be in terms of theory building? It is possible to say that Hjelmslev simply takes the existence of relations (and dependencies) for granted. Then, in our opinion, the principle of analysis is simply a starting postulate of the theory, in which Hjelmslev continues by specifying three types of these dependencies. To what extent is this postulation arbitrary? Could we consider that there may be other types of dependencies? Could it turn out that from some point of view some dependencies are not possible? How to understand the dependence that is neither interdependence nor determination; why in this case talk about dependence at all? Can interdependence be represented in any logical calculus?143
“The mutual dependences, in which the one term presupposes the other and vice versa, we shall call conventionally interdependences. The unilateral dependences, in which the one term presupposes the other but not vice versa, we call determinations. And the freer dependences, in which two terms are compatible but neither presupposes the other, we call constellations.” (PTL, 24). Representing determination by implication and constellation by conjunction suggests itself, and from the point of view of the completeness of the calculus it is sufficient. However, interdependence can hardly be represented by equivalence. So how is it to be represented in a logical calculus?
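The candidate encodings, and the mismatch just noted, can be made concrete with a small truth-table sketch. This is our illustration only; Hjelmslev proposes no such calculus, and the mapping of his three dependence types onto connectives is precisely what is in question.

```python
from itertools import product

# Illustrative propositional encodings of Hjelmslev's three dependence
# types (our sketch, not Hjelmslev's own formalization):
#   determination   (a presupposes b, not vice versa) -> material implication
#   constellation   (compatible, neither presupposes) -> conjunction (mere co-occurrence)
#   interdependence (each presupposes the other)      -> equivalence (the problematic case)
encodings = {
    "determination":   lambda a, b: (not a) or b,
    "constellation":   lambda a, b: a and b,
    "interdependence": lambda a, b: a == b,
}

for name, connective in encodings.items():
    table = {(a, b): connective(a, b) for a, b in product([False, True], repeat=2)}
    print(f"{name:16}", table)

# The mismatch noted above: equivalence comes out true when BOTH terms are
# absent (False == False), while interdependence is meant to capture the
# mutual presupposition of terms that actually occur together.
print(encodings["interdependence"](False, False))  # True -- hence the doubt
```

The table makes visible why equivalence is a poor fit: it is satisfied vacuously by joint absence, which has no counterpart in Hjelmslev's notion of mutual presupposition.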
3.2 In the name of the principle of the analysis
Can we create a valid principle-based model of explanation in this situation? For example, one such as this:

Explanans: The principle of analysis (i.e. there are dependencies)
Condition: There is a set of dependency types, of which at least one type is always implemented.
Explanandum: We observe/find the realization of the XY type of dependence (e.g. the relationship between the noun and the verb)

We stated above that the principle of analysis (as a postulate) suffers from several problems: the logical status of interdependence, the nature of dependence in the constellation, and the question of the closedness of the set of dependency types. If we limit the principle of analysis to the assertion that dependencies exist, it becomes vague; like the mere statement that there are (somehow interacting) objects, it will not lead us to the explanation of any particular linguistic phenomenon.

✶✶✶

In our analysis of Hjelmslev’s conception of linguistic theory, we have experienced several “movements up and down”: from theses that strongly resemble a departure from pure descriptiveness (the definition of a theory as a double entity, arbitrary and applicable; the interpretation of the principle of analysis as a constitutive principle of linguistic theory) back to the need to state that Hjelmslev builds linguistic theory, and not only at the terminological level, as a systemic description. Despite Hjelmslev’s proclamations about the nature of the principle of analysis, we believe it is more appropriate to interpret it as a method which, given implicit knowledge of language entities, leads to the identification of individual language parts on the basis of saturated sets of individual types of relations, themselves identified preliminarily on the basis of implicit knowledge. A comparison with logical calculus in general may be offered.
Logical calculus also “illuminates” certain aspects of natural languages (otherwise, logical calculus can serve in an applied form, e.g. in natural language processing), but it always does so with the starting point of having a clearly defined axiomatic system. Therefore, a certain type of logic always illuminates a certain aspect of language, but it is difficult to find a general logical calculus suitable for illuminating
natural language as a whole. And it would be difficult to say that such a logical calculus explains natural language structure.144 It therefore seems less “natural” to us to move from Hjelmslev’s systemic description towards some variant of explanation, whether structural or functional. We see no way at all to a functional explanation, because Hjelmslev does not deal with requirements outside the system. A structural explanation would be more natural; but in its case we encounter the vagueness of the concept of relation and the problem of defining the set of dependency types. Nevertheless, we will see that Luděk Hřebíček performs something we could call an elevation of the principle of analysis to an explanatory principle (see the Second Interlude). In chapter ten, “Form of the analysis”, Hjelmslev adds another type of dependence, which exists between individual objects (terms) and the whole (text) (PTL, 28). The basic property of this dependence is its uniformity: “(. . .) coordinate parts, which proceed from an individual analysis of a whole, depend in a uniform fashion on that whole” (PTL, 28). The introduction of uniformity allows Hjelmslev to express a formal definition of the analysis: “Analysis we can then define formally as description of an object by the uniform dependences of other objects on it and on each other” (PTL, 29). Hjelmslev further states that an object will be referred to terminologically as a class, and objects uniformly dependent on a class will be referred to as class components (PTL, 29). From the point of view of our interest – determining whether Hjelmslev’s theory is able to provide an explanation or only convey a description – it is interesting how, in the following lines, he traces determination within the hierarchy of the linguistic theory, down to the indication of indefinables: (. . .) the definition of component presupposes the definition of class, and the definition of class the definition of analysis.
The definition of analysis presupposes only such terms or concepts as are not defined in the specific definition system of linguistic theory, but which we posit as indefinables: description, object, dependence, uniformity. (PTL, 29)
As we can see, Hjelmslev treats description as a primitive term (an indefinable) and defines analysis via (among other things) description; therefore, we also do not find a definition of systemic description in Hjelmslev’s works. What does this say about our efforts to delimit systemic description and explanation? Let us answer with a definition: A linguistic theory is an analysis performed by describing uniform dependencies among objects in a system.
Perhaps only by analogy in the sense that the concept of grounding is used in relation to logic (see Appendix 5).
In addition to the analysis, Hjelmslev defines the deduction as: “(. . .) a continued analysis or an analysis complex with determination between the analyses that enter therein” (PTL, 31).145 Hjelmslev points out that it is possible to use the term deduction in the sense of ‘logical conclusion’ (PTL, 32). “Propositions that follow from other propositions can in our sense be said to proceed from them by an analysis: conclusions are at each step objects that depend uniformly on each other and on the premisses” (PTL, 32). This again raises the question of how to understand the relationship between Hjelmslev’s theory of language and the logical analysis of language. As we mentioned above, Hjelmslev’s theory is an analysis based on the recognition (description) of dependencies, which are specified as interdependencies, determinations and constellations. In addition, the individual levels of the theory (given by formal definitions) depend on each other (in the sense of determination). The logical analysis of language, in the sense of recognizing the logical form of a representation, is applicable only at the sentence level (the syntactic plan)146 – and this is the main difference. So can we say that at the syntactic level there is no fundamental difference between Hjelmslev’s theory of language (deduction, analysis, etc.) and the logical analysis of language? For Hjelmslev, of course, the sentence level is not sufficient. He starts with the text and descends from sentences to lexical units and further to their basic components. Here, Hjelmslev’s specific principle of analysis – which intuitively delimits a set of dependencies to be found across all linguistic plans – parts ways with analysis as it is clearly understood in logic, where it concerns sentences interlinked by logical connectives and defined by derivation rules (for example, a substitution rule and a modus ponens rule).
Hjelmslev’s systemic description, thus, meets the “logical form of the natural language representation” at the sentence level. However, it is something different with regard to other language plans. It is a way beyond the analytic philosophy of language, which was largely coextensive with logic,147 towards the use of mathematical
Hjelmslev continues: “A deduction is thus one special kind of procedure, while induction is another special kind of procedure. Let us define an operation as a description that is in agreement with the empirical principle, and a procedure as a class of operations with mutual determination.” (PTL, 31). Determination also exists between analysis and synthesis: “If a procedure consists of both an analysis and a synthesis, the relationship between the analysis and the synthesis will always be a determination, in which the synthesis premises the analysis but not vice versa (. . .).” (PTL, 31). For example, sentences can be equivalent because they are closed structures with a truth value, but this is not the case, for example, with lexical units. Again, we can refer to Carnap (1934). Hjelmslev characterizes logic in the chapter “Language and non-language” as a purely syntactic system (talking about the path from Hilbert’s formalism, through Polish logic to Carnap’s Logische Syntax der Sprache), which is only interested in the form of the expression, but not the
means for other language plans, and also for the possibility of grasping these plans in their hierarchy as a whole. This is why, in the introduction, we call Hjelmslev’s approach a functional model of grammar. Here, again, we proceed by trying to think Hjelmslev’s approach through in a form different from a purely descriptive one. However, the same structural problem repeats itself as in the case of the imperfect mirroring in logic (the problem with interdependence): the concept of function is used not strictly, but again by analogy. Hjelmslev defines a function as follows: “A dependence that fulfils the conditions for an analysis (. . .)” (PTL, 33). He points out that: “We have adopted the term function in a sense that lies midway between the logico-mathematical and the etymological sense (. . .), in formal respect nearer to the first but not identical with it” (PTL, 33–34). And his following explication removes the possibility of seeking a connection to some form of mathematical structure that could play a role in non-causal explanation: It is precisely such an intermediate and combining concept that we need in linguistics. We shall be able to say that an entity within the text (or within the system) has certain functions, and thereby think, first of all with approximation to the logico-mathematical meaning, that the entity has dependences with other entities, such that certain entities premise others – and secondly, with approximation to the etymological meaning, that the entity functions in a definite way, fulfils a definite role, assumes a definite “position” in the chain. In a way, we can say that the etymological meaning of the word function is its “real” definition (. . .). (PTL, 34)
We see that Hjelmslev does not follow the path of an exact application of the concept of function in the mathematical sense. We believe that, unlike de Saussure (see chapter 3.1), he does not propose a path to a mathematical expression of linguistic theory. For that, it would be necessary to remove the etymological part of the meaning of the term “function”, to find a mathematical expression of interdependence and, above all, to understand the principle of analysis as an explanatory
form of the content. Therefore, he also problematizes it as an example of the semiotic system (PTL, 108–111). However, he does not follow Carnap’s development towards logical semantics; he ignores the very changes of Hilbert’s metamathematics after Gödel (Gödel’s numbering introduces a double-set system; it is a sovereign semiotic step that is repeated in Turing’s solution of Entscheidungsproblem and von Neumann’s conception of self-reproducing automata). Hjelmslev’s definition of semiotics fully reflects this: “(. . .) a hierarchy, any of whose components admits of a further analysis into classes defined by mutual relation, so that any of these classes admits of an analysis into derivates defined by mutual mutation.” (PTL, 106). If semiotics was isomorphic to metamathematics, then in Hjelmslev’s metatheoretical point of view, the theory of language (semiotics) also becomes isomorphic to metamathematics. Then it could probably be said that Hjelmslev’s project, seeking a metamathematical description of all sign systems, is coextensive with “modern logic.” This again delimits the systemic explanation as a systemic description.
principle. We repeat that we recognize this step in Hřebíček’s conception of the principle of compositeness (see the Second Interlude). Hjelmslev introduces further terms that refine the foregoing account of the definition of dependencies. We cannot pursue and interpret them here, although they provide further clarification of some of the difficulties that arise with Hjelmslev’s definition of dependence as a function. In any case, the established terms allow him to conceive of the system as a correlational hierarchy and the process as a relational hierarchy (PTL, 38–39).148 Hjelmslev’s constant need to think of the theory of language simultaneously as its own metatheory also leads to an attempt to express the function between the process (text) and the system (language). This, indeed, shifts the considerations to a fundamentally different level, especially when he defines this function as determination: But the decisive point is that the existence of a system is a necessary premiss for the existence of a process: the process comes into existence by virtue of a system’s being present behind it, a system which governs and determines it in its possible development. A process is unimaginable – because it would be in an absolute and irrevocable sense inexplicable [emphasis mine] – without a system lying behind it. On the other hand, a system is not unimaginable without a process; the existence of a system does not presuppose the existence of a process. (PTL, 39)
Here is expressed the essence of Hjelmslev’s systemic description, which he also refers to as linguistic theory. Hjelmslev distinguishes between the virtual and the realized text (PTL, 40). We always start from a realized text, about which we also claim that it is determined by a system that allows it to be “explained”. However, we remain trapped in the system with its virtual games – virtual texts, or only one virtual text that is determined by the system per se, the principle of analysis. We do not explain a specific realized text – a specific phenomenon – but present a system. Hjelmslev does not offer an explanatory theory that would allow us to explain the process (the realized text); he simply states the systemic description, i.e. the system. Hjelmslev’s insistence that the system is indispensable for explaining the process is taken up by Köhler in such a way that in system-theoretical linguistics he offers a
E.g. Hjelmslev uses the logical connectives of conjunction and disjunction to distinguish the process and the system terminologically: “(. . .) in the process, in the text, is present a both-and, a conjunction or coexistence between the functives entering therein; in the system is present an either-or, a disjunction or alternation between the functives entering therein.” (PTL, 36). He is aware of the problems of this terminology, and therefore speaks of conjunction as a relation and disjunction as a correlation (PTL, 38). We leave the term functive without a definition (cf. PTL, 34).
real explanation of the process. We only need to examine further (see chapter 5.5) whether this explanation can be purely linguistic.

✶✶✶

The last question we have to ask here, after observing the variants of systemic (structural) descriptions, concerns their relation to teleological explanations. The interpretation of structuralism was in many cases, especially in the case of the Prague school, a teleological one (e.g. Lacková 2018, Kořenský 2014, Osolsobě 2003). In the case of Hjelmslev, we see this interpretation as possible in connection with his use of the etymological meaning of the term function, when he claims: “(. . .) that the entity functions in a definite way, fulfils a definite role, assumes a definite “position” in the chain” (PTL, 34). We did not write about teleological explanation in part 2 because it does not represent a valid model of explanation in the current philosophy of science or linguistics. In any case, the transformation of teleological explanation into some form of functional explanation has been a subject of great interest in the philosophy of science (especially of biology, cf. Garson 2008, Wright 1976). Hempel tried to rule out this kind of explanation by converting it to the D-N model, through its motivational variant (Hempel, Oppenheim 1948, 141–146), and then suggested ways to convert it into a functional analysis. We map Hempel’s approach in more detail in connection with the functional explanation in system-theoretical linguistics (see chapter 5.5 and Appendix 19). The basic difficulties of teleological explanation are well known.149 In structuralism, a teleological explanation was often present in the context of phonetic changes (cf. Jakobson 1962, Sériot 2014), as if the system could enforce the emergence or demise of phonetic oppositions so that it remained stable.
Teleological explanation is seductive because it allows us to enter the field of always problematic diachrony – as if teleology could be a source of diachronic laws that govern the development of language. In the etymological meaning of the concept of function, too, Hjelmslev leaves a teleological residue present.
Explanation through a purpose, a goal or an optimal state is more a cluster of similar but sufficiently different notions. There is certainly a difference between the purpose governing a human actor as a developed intentional system (here, naturally, one option is reduction to neuroscientific explanation), the goal-directed behavior of entities in the biosphere (see Garson 2008, Wright 1976), which cybernetics has challenged through the concept of feedback, and the optimizations associated with economizing principles, which represent explanations sui generis (see chapter 5.2 below). The basic counter-arguments, of course, refer to the counter-intuitiveness of the temporal interpretation of teleological explanation: we explain by means of something that does not yet exist, that is, a state in the future affects the state in the present, which is in basic conflict with a causal chain of events.
One of the frequent “blind paths” out of the explanatory dilemma of linguistics is teleological explanation. The relationship between a systemic description and an explanation can also be formulated thus: the description can be “elevated” to an explanation if it includes a variant of the teleological principle. We can formulate it, for example, as follows:

Teleological principle: The system keeps itself in (strives for) equilibrium (in the passing of time).
Then it is possible to formulate the principle-based model of explanation as follows:

Explanans: Teleological principle
Requirements for the implementation of this system in the process (initial and boundary conditions)
Explanandum: We observe the linguistic phenomenon XY (e.g. a phonetic change)

The validity of such a model of explanation is, indeed, undermined by the problematic status of the teleological principle: strictly speaking, we cannot call it a scientific principle (see note 149). We will return to the issue of teleological explanation in the analysis of functional explanations and especially in the analysis of Köhler’s system-theoretical linguistics. The core of its functional explanation is the structural axiom, which states that the language system is self-organized, the concept of self-organization being borrowed from the systemic approaches of the 1960s and 1970s (Haken’s synergetics, Prigogine’s non-equilibrium thermodynamics, the theory of dynamical systems, etc.). Köhler tried to rid functional explanation of its teleological flavor by carefully following Hempel’s functional analysis; to what extent he succeeded, we will assess in the relevant chapter (see chapter 5.5).
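The cybernetic move mentioned in note 149 – feedback replacing goal-directedness – can be sketched in a few lines. This is our own illustration, not Köhler’s model: the point is only that the apparent “striving for equilibrium” requires no future state acting on the present; a correction driven by the present deviation suffices.

```python
# Minimal feedback sketch (our illustration, not Köhler's model): the
# apparent "goal" of equilibrium is produced by a present-state correction,
# not by a future state acting backwards on the present.
def relax(x, x_eq=1.0, gain=0.5, steps=20):
    """Each step removes a fixed fraction of the current deviation from x_eq."""
    for _ in range(steps):
        x -= gain * (x - x_eq)  # correction proportional to the present error
    return x

# Whatever the starting state, the trajectory settles near x_eq:
print(round(relax(10.0), 4), round(relax(-3.0), 4))
```

The deviation shrinks geometrically (by the factor 1 − gain per step), so the teleological gloss “the system strives for equilibrium” is cashed out as a purely forward-running causal process.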
4 Formal explanation in linguistics

Later
The sentence is the passage from one point of thought to another point of thought. This passage is achieved in a thinking sleeve. Since the size of its writer’s sleeves is unknown, he finds himself judged on his passages. Soon he has the reputation of being even more lacking, more of a fool than any of his contemporaries. You forget that he had it in his sleeve to say something quite different, even the opposite of what he has said.
Henri Michaux, Ecuador (translated by Robin Magowan)
Although we have tried to see in structuralism an entity closely connected with the development of the formal disciplines – de Saussure’s connection to topology and graph theory, Hjelmslev’s connection to modern logic and algebra – it is more common to locate the beginnings of the formal character and exactness of linguistics in the postwar scientific developments of the 1940s and 1950s. It is then, in the context of the origin and development of information theory, cybernetics, automata theory and formal grammars, that generativism is born (see Turing 1937, Shannon 1948, Wiener 1948, von Neumann, Burks 1966). These theories, connected with the natural and formal sciences, divert linguistics from the grip of the philosophy of language (and the humanities in general) and provide linguistics with the means to overcome its descriptiveness – to replace systemic description with linguistic explanation. At least this is how the founder of generativism, Noam Chomsky, apprehended the situation in linguistics.150 The specific role of graph theory is addressed in several notes in the Appendix (see Appendix 9). The initial impulse was the intention to treat information as a variable expressible by means of combinatorics and probability theory, which led to an axiomatized theory of information that, in collaboration with cybernetics, enabled a quantitative grasp of the communication process and the text. However, Noam Chomsky found these technically very useful tools insufficient for building an explanatory linguistic theory (see the following chapter 4.1) – with the further tools of automata theory and formal grammars, he proceeded to build a theory of grammar that delimits the grammatical syntax of natural languages from the concatenation syntax defined in information theory. In the range of context-free and context-sensitive
See, for example, the title of the eighth chapter of Syntactic Structures: “The Explanatory Power of Linguistic Theory” (Chomsky 2002).
grammars (respectively, pushdown and linear bounded automata), the place of the most suitable formal models of natural languages has been established.151 We will not follow the whole development of generativist linguistics, because we believe that its explanatory maxim remains largely intact throughout. We will focus only on a few representative texts: the initial texts – Syntactic Structures (SS, Chomsky 2002) and Current Issues in Linguistic Theory (CILT, Chomsky 1970) – and the current form of the minimalist program – The Minimalist Program (MP, Chomsky 2015) – between which we will move as appropriate. In the context of defining a formal explanation, we will also take note of some important interpreters of generativism – Frederick Newmeyer and Martin Haspelmath, who present the concept of formal explanation in general opposition to functional explanation and also provide interesting insights into the gradual “merging” of the formal and functional explanatory perspectives (see chapter 4.2). Chomsky critically evaluated information theory in relation to language, objected to the quantitative statistical description of language, and proposed searching for universal principles of natural language grammar. He explicitly proclaimed (as we shall see below) the self-sufficiency of linguistics as an exact science governed by an explanatory power that does not have to be “borrowed” from other sources. From today’s perspective, after almost 70 years of generativism in its constant changes, it may seem more natural to see it as a sequence of formal models of language fitting the description of some aspects of language (as generativism was characterized in the introduction). However, we will again try to consider generativism a priori, without bias, as an explanatory theory.
As we will see below (see chapter 5.2), where Chomsky strictly delimited the possibilities of statistics in relation to linguistics, Köhler’s system-theoretical linguistics represents a return to (or rather a continuation of) the mathematical theory of communication. This was possible mainly because there was always a current diverging from the generativist mainstream (as we will see in the case of Gustav Herdan, see the First Interlude), one which clearly favored the importance of statistics and the quantitative modeling of language from linguistic data (corpora). Because it is common to use the term formal explanation in connection with Chomsky, we will first propose a model of formal explanation in linguistics. The basic concept of generative grammar is important for this model of explanation – we work with models of formal grammars, defined by an alphabet of symbols and a set of rules for generating strings.152 A native speaker is able to adequately For an excellent overview, see part five “Languages, Grammars, and Automata” in Partee, ter Meulen, Wall (1993, 429–557). We define the basic concepts of generativism in a simplified way: there can be several alphabets (we distinguish between terminal and nonterminal symbols); the rules are formulated as
assess the grammaticality of statements in a given natural language. The formal explanation, then, consists in the agreement of the set of strings generated by means of a formal grammar153 with the set of statements in a given natural language. Let us mention a famous example (for details, see Appendix 9) of the falsification of a generative grammar model: in the 1980s, Swiss German was shown, with empirical evidence, not to be modelable by a context-free grammar. In other words, a certain formal model (context-free grammar) was not able to generate strings corresponding to the set of Swiss German sentences that a native speaker called grammatical. It was a nice example of the falsification of a linguistic hypothesis (H1: All natural languages are modelable using context-free grammars), to which generative linguists had to respond by changing the hypothesis (H2: All natural languages are modelable using context-sensitive grammars), i.e. by modifying explanatory principles.154 This example fully corresponds to the practice of the natural sciences referred to by generativists. And it is in line with the proposal for the functioning of the falsification method as we find it in philosophy of science – either in Popper (1935) or in a sophisticated form in Lakatos (1978). Linguists incorporate an adjustment of the underlying hypotheses and the principles on which they are based in order to arrive at a true explanation of the grammar of natural languages.
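The formal core of the Swiss German case can be made concrete with the standard abstraction of cross-serial dependencies to strings of the shape aⁿ bᵐ cⁿ dᵐ, a pattern provably outside the context-free languages. The recognizer below is our illustrative sketch of that abstraction, not Shieber’s actual construction or data:

```python
import re

def cross_serial(s: str) -> bool:
    """Recognize a^n b^m c^n d^m (n, m >= 1): the 'crosswise' matching of
    two independent blocks, which a single pushdown stack -- and hence a
    context-free grammar -- cannot enforce in general."""
    m = re.fullmatch(r"(a+)(b+)(c+)(d+)", s)
    if not m:
        return False
    a, b, c, d = (len(g) for g in m.groups())
    return a == c and b == d  # a-block matches c-block, b-block matches d-block

print(cross_serial("aabbbccddd"))  # True:  n=2, m=3
print(cross_serial("aabbccd"))     # False: the blocks fail to match crosswise
```

Intuitively, matching the a-block against the c-block across the intervening b-block requires remembering two independent counts at once – the kind of bookkeeping a pushdown automaton’s single stack cannot do, which is why H1 had to give way to H2.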
As in the case of hypotheses in the natural sciences, however, what gets adjusted in generativism are the assumptions that determine the manner of implementation of the principle, rather than the explanatory principle itself.155 The explanatory principle of generativism will be referred to as the Generative Principle (GP), and we predefine it as follows:

Generative principle: In a language, it is possible to recursively form indefinitely long sentences,156 which can be represented by a correctly formed tree.
derivation rules and are crucial for defining Chomsky’s hierarchy. We will consider these concepts known in the following explications. For a basic orientation, see Partee, ter Meulen, Wall (1993, 429–557). Again, we express ourselves in a simplified way; we should specify the rules of transformational grammar. Again, for a basic orientation, see Partee, ter Meulen, Wall (1993, 553–557). We express the situation in a simplified way for the purpose of illustrating the example; for details see Shieber (1985), for a summary see Partee, ter Meulen, Wall (1993, 501–503). In the received view of philosophy of science, this is recognized as the problem of ad hoc modifications of conditions in the explanans. Popper’s strict falsificationism is naive in the context of the real operation of theory testing, as Lakatos has documented. For a review, see Rosenberg (2005, 116–125, 163–167). The length of a sentence is not limited, but of course there are no infinitely long sentences.
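The Generative Principle can be illustrated with a toy recursive phrase-structure grammar. This is our own sketch, with invented rules and vocabulary, not a fragment of any generativist analysis: a single recursive alternative already yields sentences of unbounded length, each paired with a well-formed derivation tree.

```python
# Toy phrase-structure grammar with recursive alternatives (NP -> ... VP,
# VP -> ... NP). All rules and words are invented for illustration.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "that", "VP"]],  # second option recurses
    "VP": [["V"], ["V", "NP"]],                        # second option recurses
    "N":  [["dog"], ["cat"]],
    "V":  [["chased"], ["saw"]],
}

def derive(symbol, depth):
    """Expand `symbol` into (tree, leaves). While `depth` is positive, the
    last (recursive) alternative is chosen; afterwards the first
    (terminating) one, so every derivation is a finite, well-formed tree."""
    rules = GRAMMAR.get(symbol)
    if rules is None:                      # a terminal word
        return symbol, [symbol]
    rule = rules[-1] if depth > 0 else rules[0]
    subtrees, leaves = [], []
    for sym in rule:
        t, l = derive(sym, depth - 1)
        subtrees.append(t)
        leaves.extend(l)
    return (symbol, subtrees), leaves

# Sentence length grows without bound as the recursion depth grows:
for d in (0, 2, 4):
    _, words = derive("S", d)
    print(d, len(words), " ".join(words))
```

Each returned tree is a nested `(label, children)` pair, i.e. exactly the “correctly formed tree” the principle speaks of; increasing `depth` shows that the length of derivable sentences is unbounded even though every individual sentence is finite.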
4 Formal explanation in linguistics
Based on this principle, it is then possible to design a principle-based model of explanation, for example in this form:
Explanans: Generative Principle (GP)
Condition: Specification of the type of formal grammar satisfying GP for a given natural language.
Explanandum: Grammatically correct sentence of a given natural language (produced by a native speaker)
In the following chapters (4.1 and 4.2), we will try to define more precisely both the generative principle and the principle-based model of explanation based on it. We will use an analysis of Chomsky’s definition of explanation in generativism. We will also try to evaluate whether the generative principle is a valid scientific principle that can establish a principle-based model of explanation. We will consider several variants, as well as the very proposal of a formal explanation, which linguists set against functional explanation.
✶✶✶
However, before we do so, let us focus on one more property expressed in the generative principle: sentences can be represented by a tree. When we look for formal frameworks in all the linguistic theories examined above, we are offered a common framework leading from structuralism, through generativism, even to post-structuralism – this framework is graph theory. For de Saussure, it is implicitly present in the system of oppositions that establish the structure; for Hjelmslev, it is visible in the hierarchy of structure in language plans, which in the case of the sentence plan in Chomsky’s generativism takes the form of phrase structure trees. This particular form of graph – a correctly formed tree – is conceived as the basic formal tool for modeling natural languages in generativism. The close connection of this mathematical device with the “content” of the linguistic theory is shown in an interesting respect.
The boundary of context-sensitive grammars is also the boundary with unrestricted rewriting systems (Turing automaton), in which “sentences” can no longer be exhaustively represented by trees – branches can cross and join – and the possibility of hierarchical distribution of plans is lost.157
A speaker who mastered the grammar of unrestricted rewriting systems would have to be able to organize real numbers; the continuum hypothesis would not apply to him. The concept of grammar would be fundamentally different for him, although we could still see similarities with ordinary grammars – a limited alphabet, linearity, sequential application of rules, etc. See also Zámečník (2018, 253–255).
Graph theory158 winds through the history of linguistic theories and does not omit the special path of post-structuralism. The structure of the rhizome is not hierarchical, but we do not have to understand it as “chaos”. Its formal representation is at hand via unrestricted rewriting systems, whose generated strings, although governed by strict rules, can be interpreted by natural language speakers only as associative partial utterances without a preserved phrase structure.159 Perhaps somewhere here we can sense the power of the vision of generativism, which, despite post-structuralist developments in the humanities in the last third of the 20th century, seeks a universal grammar that explains the common structure of natural languages. In fact, right at the beginning, it places a correctly formed tree in the neural network of the human brain, as a prescription, a preprinted structure directly leading to the acquisition of speech, to phrase structures. According to proponents of system-theoretical linguistics, by contrast, a speaker may only be equipped at the beginning with a randomly structured (neural) network that adapts to the tree structure typical of natural language under the influence of economizing and other constraints (pressures). Where system-theoretical linguistics seeks to draw explanatory power from the idea of a language under pressure from the world (see chapter 5.2), generativists believe in the abstract nature of “being a language”. They do not believe that grammar can emerge from the competition of activities under pressure; they look for an abstract universal principle (or principles) of grammar.160 We will try to capture these principles together with them.
And also its topological variants, cf. Gross, Tucker (1987).
This is nicely captured in Binet’s novel The 7th Function of Language (Binet 2017), on the example of the “Derrida’s speech” of one of the heroes (Binet 2017, chapter 92).
For more on the tensions among essentialism, externalism and emergentism in the philosophy of linguistics, see Scholz, Pelletier, Pullum (2015).
4.1 Building linguistics as a science
In the introduction of Syntactic Structures, Chomsky explicates linguistic theory in the spirit of “grammatical description” (for the term used, see SS, 11):
The ultimate outcome of these investigations should be a theory of linguistic structure in which the descriptive devices utilized in particular grammars are presented and studied abstractly, with no specific reference to particular languages. One function of this theory is to provide a general method for selecting a grammar for each language, given a corpus of sentences of this language. (SS, 11)
In other words, the theory deals with general means of describing any language, and should provide a method for choosing a grammar for known linguistic empirical evidence (corpus). Chomsky’s conception, defined in this way, does not look revolutionary, but in the chapter “The Independence of Grammar”, Chomsky continues:
The grammar of L will thus be a device that generates all of the grammatical sequences of L and none of the ungrammatical ones. One way to test the adequacy of a grammar proposed for L is to determine whether or not the sequences that it generates are actually grammatical, i.e., acceptable to a native speaker (. . .). (SS, 13)
And as he explicates further (in the chapter “On the Goals of Linguistic Theory”): “A grammar of the language L is essentially a theory of L” (SS, 49). With regard to the adequacy of grammar, he refers to Quine: To use Quine’s formulation, a linguistic theory will give a general explanation [emphasis mine] for what ‘could’ be in language on the basis of “what is plus simplicity of the laws whereby we describe and extrapolate what is”. (W. V. Quine, From a logical point of view (Cambridge, 1953), p. 54). (SS, note 1, 14)
In the chapter “On the Goals of Linguistic Theory”, Chomsky follows Quine’s conception and understands linguistic theory as completely isomorphic with “any scientific theory” (SS, 49), giving directly physical examples (SS, 49). He argues that:
Any scientific theory is based on a finite number of observations, and it seeks to relate the observed phenomena and to predict new phenomena by constructing general laws in terms of hypothetical constructs such as (in physics, for example) “mass” and “electron”. Similarly, a grammar of English is based on a finite corpus of utterances (observations), and it will contain certain grammatical rules (laws) stated in terms of the particular phonemes, phrases, etc., of English (hypothetical constructs). These rules express structural relations among the sentences of the corpus and the indefinite number of sentences generated by the grammar beyond the corpus (predictions). (SS, 49)
In summary, for the first time in modern linguistic approaches, Chomsky’s statement explicitly proposes to build linguistic theory as based on laws that allow us to conceive of the theory as a source of explanatory power. However, we can interpret Chomsky’s statement in several ways: strongly and weakly on a general level, and strongly and weakly on a specific level. The general level is a universal grammar as a structure of rules valid for any (natural) language. The specific level is a grammar of a particular natural language. The strong reading represents an explanatory interpretation, while the weak one means a descriptive interpretation. Given the quotation above, interpreting the statement specifically seems straightforward; after all, Chomsky speaks of the grammar of English. Then, indeed, it seems more natural to choose the weak variant of interpretation, because we can hardly understand the specific rules of English grammar as laws sui generis. Given this, it
seems natural to understand Chomsky’s linguistic approach as another in a continuous sequence of linguistic structural (systemic) descriptions, which, however, has acquired precise formal characteristics. Descriptions should be built for individual languages separately – with the obvious possibility of comparing particular grammars defined in this way. Perhaps the only way to go beyond a “mere” description would be to interpret the grammatical rules of English as empirical laws. Nevertheless, Chomsky explicitly refuses to build a grammar on the corpus of a specific language – the corpus is intended to verify (falsify) the finished grammar: “(. . .) the set of grammatical sentences cannot be identified with any particular corpus of utterances obtained by the linguist in his field work. Any grammar of a language will project the finite and somewhat accidental corpus of observed utterances to a set (presumably infinite) of grammatical utterances” (SS, 15). This interpretation of ours is also supported empirically: the way the generativist conception has been carried out corresponds to the exploration of particular languages and the construction of their grammars. If we choose a general interpretation, then both possibilities open up for us – the strong and also the weak one. In this case, we speak about general universal rules of grammar, which can either be interpreted as tools of systemic and formal description (weakly) or as laws establishing explanation (strongly). Since Chomsky explicitly compares linguistic theory with a physical one, it seems natural to consider the strong variant, which means that it should be possible to create a principle-based model of explanation along the same lines as outlined above:
Explanans: Rules (laws) of the Universal Grammar
Conditions: Special application of rules (laws) for a specific natural language.
Explanandum: Any sentence from a corpus of grammatically correct sentences (from the perspective of a native speaker)
Recall that for Hjelmslev, in the empirical principle, simplicity was the last point that could be sacrificed with respect to the rest (see chapter 3.2). In contrast, Chomsky himself advocates the simplicity of universal grammar, an emphasis which culminates in the Minimalist Program, which can be understood as a continuous effort to build a normal science that achieves explanation through the simplest possible “principles”. This is how Chomsky puts it in the “Preface to the 20th Anniversary Edition” of The Minimalist Program (2015), after recalling that the Minimalist Program is a direct continuation of his theories going back to the 1950s:
(. . .) a leading concern from the outset had been to clarify the concept “simplest grammar” and to determine how to choose the simplest grammar for each language. The basic reasons are just normal science. Since Galileo, modern science has been guided by his maxim that nature is simple and that it is the scientist’s task to show that this is the case. It has long been clear that the quest for simplicity is closely related to the quest for explanation (. . .). (MP, vii)
We can also understand this as Chomsky’s answer to our rebuke from the introduction (see part 1) that generativism proceeds by sweeping away anomalies, although it should be falsified. Chomsky believed that the scientific research program of generativism is progressive.161 He believed that it would survive in the face of anomalies, just as Newton’s mechanics endured them throughout the 18th and 19th centuries until the early 20th century, because it had no better alternative.162 Our task will therefore be to determine whether the universal principle of grammar (UPG, defining a set of rules, or even laws, of universal grammar) is a well-established scientific principle – whether it can establish a valid principle-based model of explanation. It is not yet entirely clear what exactly this principle expresses. Chomsky at first approaches the definition of UPG negatively, by rejecting some possible approaches to the definition of grammar, the first of which we have already mentioned above. We cannot define grammar: (1) by a corpus of specific statements of any extent (SS, 15), (2) by meaningful statements (SS, 15), nor (3) by statements that represent a very good statistical approximation of a given language (SS, 16).163 We will leave point (2) without any analysis.164 In the commentary on point (1), Chomsky gets close to the definition of UPG when stating: “(. . .) a grammar mirrors the behavior of the speaker who, on the basis of a finite and accidental experience with language, can produce or understand an indefinite number of new sentences” (SS, 15). This, of course, leads in psycholinguistic, cognitive-linguistic and bio-linguistic directions. In fact, one can say that the speaker reflects the theory of grammar – the innate rules of UPG allow him to generate an unlimited number of
The approach of Imre Lakatos (cf. Lakatos 1973, Lakatos 1978) distinguishes between progressive and degenerative scientific research programs (SRP).
Chomsky, therefore, bets that although individual theories of generativism may be individually falsified, the core of the SRP of generativism remains intact.
These are mainly astronomical anomalies, which ended with the discovery of new planets (with the exception of the Mercury anomaly, see e.g. Torretti 1999, 417–420).
Chomsky chooses the formulation: “high order of statistical approximation to English” (SS, 16).
This opens up a large space for reflection that we are not able to encompass here – starting with thinking about building grammar from a semantic base, see Kořenský (1984), to the tools of formal semantics, see e.g. Saeed (2016), Heim, Kratzer (1998).
sentences (i.e. predictions) and to recognize that a given sentence with which he is confronted (from the corpus) is grammatical (i.e. explanation). This mirroring of a speaker’s behavior in generative grammar should not be overlooked. To justify the speaker’s ability to perform language through a reference to the origin of his language competence in “innate” grammatical rules is different from relating these rules, as an abstract principle, to linguistic phenomena. This identification can lead us astray if we confuse the search for the neurological and biological roots of language competence with the search for a properly formed theory of generative grammar, i.e. a theory based on a valid principle of grammar. Therefore, we should not confuse the neurological (and biological) ability to form grammatical sentences with UPG, which should be the principle that makes the theory of generativism explanatory. We believe that this warning is important because the argument from the speaker’s ability to quickly or “intuitively” recognize grammatical sentences and create them, to learn grammar quickly as a child, etc., is very widespread. Chomsky pays the greatest attention to the rationale of why grammar cannot be arrived at through the statistical approximation of language (3). At the same time, however, his initial consideration somewhat distorts what we are to understand by the statistical approximation of English165 – the fact that we can always randomly create sentences, grammatical and non-grammatical, that are not in any corpus is not a proof of the inadequacy of statistical approximation. We can still focus on existing corpora and notice certain quantitative (statistical) trends in sentence syntax creation, and system-theoretical linguistics follows this path (see chapter 5.2).
Here, Chomsky is distracted from the chance to appreciate statistical analyses by the belief that the guiding point is the ability of a native speaker to form and recognize grammatically correct sentences (SS, 16). “We see, however, (. . .) that a structural analysis cannot be understood as a schematic summary developed by sharpening the blurred edges in the full statistical picture” (SS, 17). Chomsky may understand the use of statistics too narrowly, in relation to Shannon’s mathematical theory of communication.166 In the argument against point (3), there appears a
The point is that Chomsky states that two sentences – (1) Colorless green ideas sleep furiously, and (2) Furiously sleep ideas green colorless – will be excluded from statistical models of grammar, although the first is grammatical, despite never having been pronounced, while the other is ungrammatical (SS, 15–16).
This is indicated by the statement: “If we rank the sequences of a given length in order of statistical approximation to English, we will find both grammatical and ungrammatical sequences scattered throughout the list; there appears to be no particular relation between order of approximation and grammaticalness” (SS, 17), as if Chomsky pointed out – rightly, indeed – that concatenation syntax is not syntax sui generis. However, the statistical pathways to syntax analysis are broader; see Herdan and Köhler below (see chapter 5.2 and the First Interlude).
context, which will become important for us (in part 5 of the book): Chomsky points to the efforts of various authors to find a relationship between the grammatical and statistical structure of language,167 but does not trust them since, according to him, they can be meaningfully pursued only once we have already established the grammar (SS note 4, 17). Chomsky comes to a further specification of UPG when he rules out the use of finite-state automata (regular grammar) as a means of generating natural language sentences (see the chapter “An Elementary Linguistic Theory”). This restriction is familiar, related to the impossibility of realizing cross-serial dependencies – typical for natural languages – through regular expressions. The pumping lemma for regular grammars does not allow this dependence to be implemented recursively. This impossibility, as was mentioned above, leads to a decision between context-free and context-sensitive grammars, whose pumping lemmas capture cross-serial dependencies. Examples are given in the Appendix (see Appendix 9). At the end of the mentioned chapter, Chomsky delimits the boundaries of regular grammars (finite Markov models) against his basic vision of UPG. He first summarizes that:
(. . .) there are processes of sentence formation that finite state grammars are intrinsically not equipped to handle. If these processes have no finite limit, we can prove the literal inapplicability of this elementary theory. If the processes have a limit, then the construction of a finite state grammar will not be literally out of the question, since it will be possible to list the sentences, and a list is essentially a trivial finite state grammar. But this grammar will be so complex that it will be of little use or interest. (SS, 23)
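Chomsky's point can be illustrated with a minimal sketch (all names invented): a finite-state device with a fixed bound k can handle the nested dependency in a^n b^n only up to that bound, exactly as the pumping lemma for regular languages predicts, while a single unbounded counter, one step beyond finite state, handles every n.

```python
# Sketch of the argument against finite-state grammars: fixed finite memory
# tracks the dependency in a^n b^n only up to a bound k, while the language
# itself imposes no bound (pumping lemma for regular languages).

def make_bounded_recognizer(k):
    """Recognize a^n b^n for n <= k only. The state space (count in 0..k
    plus a phase flag) is finite, so for each fixed k this is a finite
    automaton; but the automaton grows with k, and no single one covers
    all n - the 'trivial list' grammar Chomsky mentions."""
    def accepts(s):
        count, seen_b = 0, False
        for ch in s:
            if ch == "a" and not seen_b and count < k:
                count += 1
            elif ch == "b" and count > 0:
                seen_b = True
                count -= 1
            else:
                return False
        return count == 0
    return accepts

def counter_recognizer(s):
    """One unbounded counter recognizes a^n b^n for every n."""
    count, seen_b = 0, False
    for ch in s:
        if ch == "a" and not seen_b:
            count += 1
        elif ch == "b" and count > 0:
            seen_b = True
            count -= 1
        else:
            return False
    return count == 0

fsa = make_bounded_recognizer(3)
print(fsa("aaabbb"), fsa("aaaabbbb"))   # True False: the bound is exceeded
print(counter_recognizer("aaaabbbb"))   # True
```

Any fixed bound forces the "list" behavior Chomsky criticizes; removing the bound is precisely the step from finite-state to richer grammars.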
This is a very important set of statements. First of all, Chomsky implicitly subscribes to the linguistic feature of productivity, or discrete infinity,168 of language – the speaker is able to generate an infinite number of sentences (of unlimited length).169 He goes on to state explicitly that even if the number of sentences were limited, grammar would become a “mere” list, which would, moreover, be very complex. Whether discrete infinity is a specific feature of natural language or “only” a transfer of a property of formal languages to natural languages, we clearly recognize here the importance of the above-mentioned principle of simplicity.
He refers to Benoît Mandelbrot and Herbert Simon (SS, notes 4, 17).
Reboul (2017, 26) uses these two terms equivalently. Productivity is one of Hockett’s features, which is combined in a modern form by Reboul, who notes the close link between the features of productivity, discreteness and decoupling, cf. Reboul (2017, 24).
From the point of view of the role of the economizing principles, this is of course doubtful; see below in the context of system-theoretical linguistics (see chapter 5.2).
The irony is that Chomsky has been fighting the same battle for decades. Always and again, the defined grammars come to resemble a list, and therefore a new form of grammatical rules is proposed, because sticking to the list would be contrary to the principle of simplicity, which is more important than other principles. Following the principle of simplicity thus leads Chomsky, within the theory of grammar, to the restoration of the principle of infinity. And this same pursuit of simplicity leads him into the arms of the neurosciences in the search for “innate grammatical rules”. Chomsky summarizes the tendency towards simplicity as follows:
In general, the assumption that languages are infinite is made in order to simplify the description [emphasis mine] of these languages. If a grammar does not have recursive devices (. . .) it will be prohibitively complex. If it does have recursive devices of some sort, it will produce infinitely many sentences. (SS, 23–24)
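The relation between a single recursive device and a countably infinite language can be shown in a few lines. The sketch below is purely illustrative (the sentence pattern is an invented toy echoing Chomsky's famous example): one recursive rule yields a countably infinite set of sentences, which can be enumerated lazily, even though any actual corpus drawn from it is finite.

```python
from itertools import count, islice

# A minimal illustration of "discrete infinity": the recursive rule
#   S -> "ideas sleep" | S + " and ideas sleep"
# generates a countably infinite language. A lazy generator enumerates it
# sentence by sentence; no finite list could replace it without a bound.

def sentences():
    for n in count(0):               # n = number of recursive steps
        yield "ideas sleep" + " and ideas sleep" * n

sample = list(islice(sentences(), 3))   # any "corpus" is a finite slice
for s in sample:
    print(s)
```

The generator never exhausts: sentence length grows without bound with n, while every sample taken from it remains finite, mirroring the contrast between the infinite set generated by the grammar and the finite empirical corpus.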
Chomsky’s conception of the relationship between the principle of simplicity and the principle of infinity is an example of the use of a mathematical entity (here an infinite recursive procedure) as a model that builds on a simplifying assumption of abstraction (see chapter 2.1). In order to fulfill the principle of simplicity – for the theory to be truly explanatory – the theory has to include a countably infinite set of sentences generated by the grammar, although we know that every empirical corpus will always be finite. Similarly, the pumping lemma (and the principle of recursion) cannot be limited in terms of the number of steps to be performed – otherwise, the grammar would again have to remain a mere list – and the simplicity of grammar theory would be lost at the moment of terminating the recursive procedure at some stage.170 Chomsky’s treatment of countable infinity is thus a nice linguistic example of the role of mathematical abstractions in the structure of theories (cf. again Zámečník 2018, 253–255). Here, we are dealing with a meta-theoretical requirement expressed in the principle of simplicity: “The principle of grammar should be as simple as possible,” which defines the space for choosing the principle of infinity: “Recursively, it is possible to form an infinite number of sentences” as a basic explanatory principle of linguistic theory, which should be part of the principle-based model of explanation.
✶✶✶
Here again, we can see the difference compared to the study of syntax in system-theoretical linguistics (see chapter 5.2), where by analyzing large corpora we examine, for example, sentence complexity, preference for left- or right-branching, etc., which for generativism remains basically an indefinabilia of the theory.
In the chapter “On the Goals of Linguistic Theory”, Chomsky subscribes to Hjelmslev’s distinction between the arbitrariness and appropriateness of the theory (SS, note 1, 50) when he defines the terms condition of generality (independent of any language) and external conditions of adequacy (judged by the native speaker) (SS, 49–50). In this context, Chomsky’s statement defining the theory as a correctable entity is interesting: “(. . .) neither the general theory nor the particular grammars are fixed for all time (. . .). Progress and revision may come from the discovery of new facts about particular languages, or from purely theoretical insights about organization of linguistic data – that is, new models for linguistic structure” (SS, 50). The theory is thus really conceived on the model of a natural science, in accordance with the received-view ideas in philosophy of science. Chomsky also reflects on the relationship between the general linguistic theory of generativism and concrete grammar, that is, how to understand the statement that a particular grammar results from the general theory. He offers three interpretations of this relationship, in terms of a discovery, decision and evaluation procedure (SS, 51). On that occasion, he uses the example of the general grammar theory as a machine that finds, decides, or evaluates grammar. He himself prefers the weakest of these procedures, the evaluation procedure: given a corpus and several possible grammars, the theory evaluates which of these grammars corresponds to the corpus best (SS, 52–53).171 Chomsky considers this to be sufficient and notes that: “There are few areas of science in which one would seriously consider the possibility of developing a general, practical, mechanical method for choosing among several theories, each compatible with the available data” (SS, 53). Here, as before in the case of Hjelmslev, we feel a certain confusion in the use of the concept of theory (see chapter 3.2).
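The evaluation procedure can be given a hypothetical sketch. Everything below is invented for illustration (the grammar names, the toy sentences, and the crude rule-count measure of simplicity are all our assumptions, not Chomsky's formalism): given a corpus and several candidate grammars, the procedure scores each grammar by corpus coverage and breaks ties by simplicity.

```python
# Hypothetical sketch of the *evaluation procedure*: given a corpus and
# several candidate grammars, pick the grammar that covers the corpus best,
# preferring the simpler grammar on ties. A grammar is represented here only
# by the set of sentences it generates (up to a length bound) plus a rule
# count; all data are invented for illustration.

def evaluate(corpus, candidates):
    """candidates: dict name -> (generated_sentences, number_of_rules)."""
    def score(name):
        generated, n_rules = candidates[name]
        coverage = sum(s in generated for s in corpus) / len(corpus)
        return (coverage, -n_rules)      # more coverage first, then fewer rules
    return max(candidates, key=score)

corpus = {"the dog sleeps", "the cat sleeps"}
candidates = {
    "G1 (list of corpus)": ({"the dog sleeps", "the cat sleeps"}, 2),
    "G2 (productive)":     ({"the dog sleeps", "the cat sleeps",
                             "the dog sees the cat"}, 4),
    "G3 (undergenerates)": ({"the dog sleeps"}, 1),
}
print(evaluate(corpus, candidates))      # -> G1 (list of corpus)
```

Note the irony of the outcome: with a raw rule count as the measure of simplicity, the mere list grammar wins on a tiny corpus. This echoes the text's point that the notion of simplicity (task 3 below) must be defined carefully before the evaluation procedure can favor productive grammars.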
Strictly speaking, we have a general theory of grammar (or that imaginary machine) and then individual theories (i.e. grammars). When Chomsky invokes the advantages of the evaluation procedure in the context of the sciences in general (in the previous quotation) and uses the term theory in the sense of an entity “compatible with the available data”, it becomes unclear how to understand the concept of theory. A possible explication is that general linguistic theory is something like a general physical theory, in the sense that there would be a general feature of the “physicality” of a scientific theory.172 Partial theories would then simply be individual physical theories for different physical ontologies. This seems unbelievable, because then
The discovery procedure, which is the strongest, would make it possible to create a grammar directly on the basis of an existing corpus. A slightly weaker decision procedure would make it possible to decide, on the basis of the corpus and the grammar already given, whether this grammar is correct for the corpus (SS, 50–51).
So, not in the sense of the physical “theory of everything”.
general linguistic theory actually becomes a description of what a linguistic theory should look like. At the same time, it would mean that individual grammars (theories) could be very diverse depending on the nature of the linguistic data. This may not be possible if we want to build an explanatory theory. It is more appropriate to understand the relationship of general theory (general grammar) and partial theories (grammars) as the relationship of a single theory (general theory of grammar) to its individual models (partial grammars). Thus, for example, we have different models of Newtonian mechanics for different mechanical systems – specific conditions (corpus) enter the “machine of equations” (the deductive process), and as a result we get individual models (grammars). However, then there is no point in labelling them as theories; strictly speaking, they are not chosen in any kind of empirical competition but are purposefully constructed. Although the previous interpretation of the relationship between the general theory of grammar and individual grammars seems to us the most appropriate, other interpretations are, indeed, possible. We can understand the general theory of grammar as an empirical generalization and individual grammars as concretizations of these generalizations. Or, indeed, nothing prevents us from perceiving the general theory of grammar as a systemic description, again concretized for individual grammars of natural languages. Based on Syntactic Structures, we cannot remove this conceptual ambiguity, but only follow Chomsky’s further considerations. Chomsky puts three tasks before linguistic theory: (1) to determine external criteria of adequacy for grammars, (2) to characterize the form of grammars in general, and (3) to define the notion of simplicity173 (SS, 53–54).
Chomsky states: “Completion of the latter two tasks will enable us to formulate a general theory of linguistic structure in which such notions as “phoneme in L”, “phrase in L”, “transformation in L” are defined for an arbitrary language L in terms of physical and distributional properties of utterances of L and formal properties of grammars of L” (SS, 54).174 We believe that behind task (2) lies the principle of recursion, which incorporates the idea of discrete infinity (productivity) of language. Chomsky’s summary of the tasks of linguistic theory is also important:
Simplicity is also important for Chomsky because, in his opinion, it leads to the fulfillment of the conditions of adequacy (SS, 55).
Chomsky specifies: “Linguistic theory will thus be formulated in a metalanguage to the language in which grammars are written – a metametalanguage to any language for which a grammar is constructed.” (SS, note 4, 54)
Our ultimate aim is to provide an objective, non-intuitive way to evaluate a grammar once presented, and to compare it with other proposed grammars. We are thus interested in describing the form of grammars [emphasis mine] (equivalently, the nature of linguistic structure) and investigating the empirical consequences of adopting a certain model for linguistic structure, rather than in showing how, in principle, one might have arrived at the grammar of a language. (SS, 56)
This formulation again presents Chomsky’s linguistic theory more as a description – unless we read the content of the parentheses as a matter of priority: the nature of the language structure (form of grammar) would then lead to the definition of a formal explanation. We do not find the answer to this uncertainty in Chomsky’s Syntactic Structures; where he refers to explanations, he does so mostly by analogy with physics. Even in the chapter “The Explanatory Power of Linguistic Theory” itself, he basically deals with describing what it means for a grammar to be adequate. Using the example of constructional homonymity, Chomsky shows how the introduction of a morphological plane is a means of explaining a particular linguistic fact: “Thus, a perfectly good argument for the establishment of a level of morphology is that this will account for the otherwise unexplained ambiguity [emphasis mine] of /əneym/” (SS, 86). However, the explanatory nature defined in this way cannot be accommodated in the form of a valid scientific explanation (a principle-based model of explanation). Rather, we can say that we make the theory (description) adequate through the introduction of another entity, here the linguistic level. There is no reference to any explanatory principle, although laws were referred to above. The above-mentioned comparison with pre-Newtonian astronomy and kinematics returns: the discrepancy with (exact) observation is solved by introducing theoretical entities (epicycles, etc.), which make the theory (description) adequate, but not explanatory. We are in a somewhat paradoxical situation, because although the concept of formal explanation is associated with generativism, we do not find this term in Chomsky’s statements, at least not explicitly.
What is expressed explicitly is only the belief that linguistics is built as an exact science, that the theory of linguistics has the nature of the theories of the natural sciences, and that it has rules at its disposal in the sense of laws. In some respects, however, Chomsky’s idea of linguistic theory, despite the above, is more modest than Hjelmslev’s. Chomsky renounces the discovery procedure, while Hjelmslev has it – albeit present in a not entirely explicit and pure form – expressed in the principle of analysis. In fact, Chomsky creates a sophisticated generative (formal) description for linguistics. The explanation, despite proclamations, occurs in generativism only later, in relation to biolinguistics, when the explanation is achieved by non-linguistic means (see chapter 4.2 below).
✶✶✶
4 Formal explanation in linguistics
Perhaps surprisingly, we find that Chomsky does not offer a linguistic explanation sui generis. We believe that Chomsky’s journey through the decades shows that a strong interpretation of the general approach (see above) is possible only at the cost of stepping away from a purely linguistic theory towards neuroscience and biology. Nevertheless, there is still a possibility, which we presented above in the principle-based model of explanation, to draw the explanatory power of generativism directly from UPG, or rather from the principle of recursion (see the beginning of part 4). If we identify the principle of recursion with a generalized pumping lemma about the insertion of a substructure into a structure, then we can state that the basic driving force of generative explanation is mathematical abstraction, which necessarily incorporates the concept of countable infinity (see above). The domain of linguistics is natural language, so can it be said that Chomsky uses mathematical abstractions – formal grammars in general and the principle of recursion in particular – to build a theory of natural language in a similar way as physicists use mathematical abstractions to build theories of physical systems? Is it possible to place the principle of recursion (historically speaking) on the same level as the principle of gravity, or the principles of symmetry? Could the principle of recursion, for example, correspond to some type of structural principle and thus establish a structural explanation of the kind we evaluated in the chapter focused on structuralism (see part 3)? The first limitation comes with regard to the scope of such a linguistic theory.
It would probably be almost exclusively a theory of syntax.175 Another limitation comes with regard to the sufficiency of this principle for explanation, although we declare it necessary on the syntactic level.176 A comparison with infinitesimal calculus in physics offers itself: in a way, it is a construction principle of physics – the classical laws of physics are designed with its help, and we cannot imagine excluding it. It is necessary in this respect. Yet, it is not a physical principle; rather, it is a means that has to be filled with physical content. Is there such a thing in generativism? Can we formulate laws of linguistics – even if only for syntax – which use the principle of recursion and remain invariably corroborated? After all, the history of generativism is a history of retreat from any such general rules (laws), with the mere preservation of the construction principle of recursion. If the principle of recursion were the explanatory principle of linguistics, it would be the same as declaring the principles of mathematical analysis the explanatory principles of physics. A person who knows calculus and is completely unfamiliar with the concepts of physical principles can reach a result purely by solving a differential equation; the solution (once he or she is told what each symbol means physically) can then be compared with empirical evidence. Can he or she state that mathematical analysis explains the phenomenon in question? Certainly not: mathematical analysis merely generates, according to its rules, a sequence of steps leading to the desired result. This may well be the core of the formal description (explanation). We therefore believe that the principle of recursion is not a valid scientific principle upon which to build a principle-based model of explanation.177 However, we admit that Chomsky does not want to explain formally – he offers formal descriptions (see above). He wants to explain not formally, but through biolinguistics (see below). We therefore see generativism as a combination of a formal linguistic description and a biolinguistic explanation; unfortunately, again without resolving the dilemma of linguistics.178
However, we do not particularly insist on this limitation because we only consider generativism within this restriction. It may be said that, implicitly, the principle of recursion is also recognized in the analysis of syntax by system-theoretical linguistics in connection with the variables of complexity and depth of embedding, cf. Köhler (2012, 154–160). See chapter 5.2 for more details.
In addition, the principle of recursion is convicted of its pure formality by the fact that it is not exclusively linguistic – crystals grow recursively, as do the bloodstream, the branching alveoli, etc. On the other hand, could the principle of recursion be interpreted as a kind of economization principle? Could it be related to Köhler’s register hypothesis in system-theoretical linguistics (see chapter 5.2)? Chomsky’s statement in the book Aspects of the Theory of Syntax is worth mentioning: “In short, the most serious problem that arises in the attempt to achieve explanatory adequacy is that of characterizing the notion of “generative grammar” in a sufficiently rich, detailed, and highly structured way. A theory of grammar may be descriptively adequate and yet leave unexpressed major features that are defining properties of natural language and that distinguish natural languages from arbitrary symbolic systems. It is for just this reason that the attempt to achieve explanatory adequacy – the attempt to discover linguistic universals – is so crucial at every stage of understanding of linguistic structure, despite the fact that even descriptive adequacy on a broad scale may be an unrealized goal. It is not necessary to achieve descriptive adequacy before raising questions of explanatory adequacy. On the contrary, the crucial questions, the questions that have the greatest bearing on our concept of language and on descriptive practice as well, are almost always those involving explanatory adequacy with respect to particular aspects of language structure.” (Chomsky 1969, 36).
✶✶✶ We will add a reflection on the chapter “Goals of Linguistic Theory” from the book Current Issues in Linguistic Theory (CILT) because there Chomsky outlines the relationship between generativism and structuralism. Chomsky again relies on the ability of the speaker and the listener to master an infinite number of sentences, which leads him, at the very beginning of generativism, to cognitive (neuroscientific and psychological) topics related to the way the speaker (and listener) master grammar (CILT, 7–8). In contrast to grammar concepts of the past, he states that: The grammar, then, is a device that (in particular) specifies the infinite set of well-formed sentences and assigns to each of these one or more structural descriptions. Perhaps we should call such a device a generative grammar to distinguish it from descriptive statements that merely present the inventory of elements that appear in structural descriptions, and their contextual variants. (CILT, 9)
Although he critically evaluates the descriptiveness of structuralism, he also puts generative grammar in direct connection with de Saussure’s differentiation of langue and parole: The generative grammar internalized by someone who has acquired a language defines what in Saussurian terms we may call langue (. . .). (. . .) Clearly the description of intrinsic competence provided by the grammar is not to be confused with an account of actual performance, as de Saussure emphasized with such lucidity (. . .). (. . .) The classical Saussurian assumption of the logical priority of the study of langue (and, we may add, the generative grammars that describe it) seems quite inescapable. (CILT, 10–11)
To comment on the delimitation of de Saussure’s structuralism, Chomsky states: Saussure (. . .), regards langue as basically a store of signs with their grammatical properties, that is, a store of word-like elements, fixed phrases and, perhaps, certain limited phrase types (. . .). He was thus quite unable to come to grips with the recursive processes underlying sentence formation, and he appears to regard sentence formation as a matter of parole rather than langue, of free and voluntary creation [emphasis mine] rather than systematic rule (. . .). There is no place in his scheme for “rule-governed creativity” [emphasis mine] of the kind involved in the ordinary everyday use of language. (CILT, 23)
It should be noted here that Chomsky himself states that the possibility of dealing with “rule-governed creativity” appeared only with the development of logic and the foundations of mathematics (CILT, 22). Therefore, this “creativity”179 governed by the principle of recursion could not have been conceptualized by Saussure. Above (see the beginning of part 4), we expressed the development from de Saussure to Chomsky in terms of the use of graph theory in the service of linguistics (from a network of signs to a correctly formed tree), which is confirmed here in Chomsky’s words. Moreover, our belief that the principle of recursion is a “construction principle” of a logical-mathematical kind, not an explicitly linguistic principle, is strengthened. This principle allows “only” the description of generative grammar.
We will see below (see the First Interlude) that the understanding of syntax as an area of “volitional creation” is present in Herdan’s conception of language theory in his book The Advanced Theory of Language as Choice and Chance, cf. Herdan (1966).
Chomsky’s self-reflective and revealing paragraph is worth noting: It is, incidentally, interesting to take note of a curious and rather extreme contemporary view to the effect that true linguistic science must necessarily be a kind of pre-Darwinian taxonomy concerned solely with the collection and classification of countless specimens, while any attempt to formulate underlying principles and to concentrate on the kinds of data that shed some light on these is taken to be some novel sort of “engineering”. Perhaps this notion, which seems to me to defy comment, is related to the equally strange and factually quite incorrect view (. . .) that current work in generative grammar is in some way an outgrowth of attempts to use electronic computers for one or another purpose, whereas in fact it should be obvious that its roots are firmly in traditional linguistics. (CILT, 25)
Over time, it is interesting to read these lines because no one doubts (not even Chomsky) that the emergence of generativism was stimulated by the development of mathematical logic and computer science. However, it is slightly ironic that it was not possible to define those “underlying principles” truly linguistically, above the level of the logical-mathematical construction principle. In other words, linguistics in a way remains a taxonomy of individual languages – one which, admittedly, describes much more effectively by the means of generative grammars than in the pre-classical period. The explanatory task was thus shifted onto the shoulders of biology. In the end, this is also fully shown in the minimalist program (MP), where Chomsky, with reference to the principle of simplicity, defines universal grammar (UG) as: “(. . .) the theory of the biological endowment of the relevant components of the faculty of language (FL)” (MP, viii). He points out how, with the gradual reduction of UG to a minimum, the basic questions came to focus on the biological origin of language and on the relationship between language and mind (or brain) (MP, viii). Chomsky relies on the problematic thesis180 that FL appeared suddenly sometime during the period when Homo sapiens left Africa (50–80 thousand years BP) (MP, viii). The core of MP is, therefore, centred on the search for a simple principle that emerged (at that time) and equipped man with FL: “Since language is clearly a computational system, the relevant laws of nature should include (and perhaps be limited to) principles of efficient computation [emphasis mine]” (MP, ix).181 Chomsky points out that: “The basic principle of language (BP) is that each language yields an infinite array of hierarchically structured expressions, each interpreted at two interfaces, conceptual-intentional (C-I) and sensorimotor (SM)” (MP, ix). It is specific to Chomsky that he prioritizes FL in the sense of a “language of thought”. The core of the MP is then expressed by Chomsky as follows: If FL is perfect, then UG should reduce to the simplest possible computational operation182 satisfying the external conditions, along with principles of minimal computation (MC) that are language-independent. The Strong Minimalist Thesis (SMT) proposes that FL is perfect in this sense. (MP, ix)183
There are a number of studies documenting an earlier origin of language. Given the anatomical and molecular-biological characteristics of Neanderthals, it can reasonably be inferred that they had FL (cf. Marshall 2018). Chomsky, therefore, applies economization; see our previous references to the relationship between the principle of recursion and economization in system-theoretical linguistics (and see chapter 5.2 for more details).
We see that the center of Chomsky’s theory in minimalism remains the basic principle of language in the form of the computational principle. We do not think it has changed much since the 1960s (see above); it is the “construction principle” that is actually “nothing more than” a formal means of creating endless hierarchies of tree structures. Its origin is shrouded in some ambiguity, but in any case it is delegated to the hands of biology and its explanations: A possible account of the origin of language is that some rewiring of the brain, presumably the result of some mutation, yielded the simplest computational operations for BP, including the link to some preexisting conceptual structures CS, providing a LOT [language of thought]. (MP, xi)
We do not want to be skeptical. It is a fact that the axis of current fruitful debates about the origin of language and the specificity of human language is the clash between language-of-thought and language-for-communication approaches. Recently, this conflict has been demonstrated in Reboul (2017), which defends the thesis that language emerged primarily as a language of thought and was only subsequently outsourced for communication (Reboul 2017, 57–58). We will see that system-theoretical linguistics implicitly occupies the position of language for communication (it is not interested in the origin of language), where the role of non-systemic economizing factors applies (see chapter 5.2). To summarize, however, we have to state that even the latest form of MP does not represent generativism as a theory that provides linguistic explanations. It is either a description based on the construction principle of recursion (cf. Merge), or it rests on extra-linguistic – neuroscientific, biological – explanations. However, these explanations are only hypothetical. As we have seen, the abrupt emergence of language is difficult to explain even by means of biology.
Chomsky refers to the simplest computational operation as the Merge (MP, ix). Chomsky makes an interesting but rather speculative statement: “It appears that the internal system, biologically determined, observes SMT and therefore ignores linear order in favor of structural distance. Linear order and other arrangements, therefore appear to be reflexes of the SM modalities for externalization, having nothing to do with core elements of language design (. . .).” (MP, xi).
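The “simplest computational operation”, Merge, can be rendered schematically. What follows is only our illustrative sketch, not an implementation of any minimalist analysis: Merge forms an unordered set of two syntactic objects, and iterating it yields hierarchical structures of unbounded depth, with linear order playing no role – echoing Chomsky’s remark that the internal system ignores linear order in favor of structural distance.

```python
# Schematic sketch of Merge (our illustration, not a minimalist-syntax
# implementation): two syntactic objects combine into an unordered set.
def merge(x, y):
    return frozenset([x, y])

def depth(obj):
    """Depth of hierarchical embedding of a (possibly merged) object."""
    if isinstance(obj, frozenset):
        return 1 + max(depth(o) for o in obj)
    return 0                                # a lexical item

# Building {{the, dog}, {sees, {the, cat}}}; nothing bounds the depth
# to which Merge can be re-applied.
s = merge(merge("the", "dog"), merge("sees", merge("the", "cat")))

print(merge("dog", "the") == merge("the", "dog"))  # True: order-free
print(depth(s))                                    # 3
```

The design choice of `frozenset` is deliberate: it encodes exactly (and only) unordered binary combination, so linear order must come from elsewhere – in minimalist terms, from externalization.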
4.2 Explication of some models of formal explanation in linguistics
We have clarified above (see chapter 2.3) that in current debates within linguistics it is most common to distinguish between formal and functional explanations, with formal explanations linked to the line of generativism and functional explanations, less explicitly, to the line of cognitive linguistics. This division corresponds, at the same time, to two basic types of philosophies of linguistics – the formal explanation is connected with essentialism and the functional explanation with externalism (cf. Scholz, Pelletier, Pullum 2015). We will demonstrate formal explanation in linguistics with three examples: texts by Paul Egré, Martin Haspelmath and Frederick Newmeyer (Egré 2015, Haspelmath 2004, Newmeyer 2016). We will return to these authors when explicating models of functional explanation in linguistics. All of the above-mentioned authors are characteristically open towards generativism in its current forms, and all attempt to show that there is no fundamental, unbridgeable inconsistency between formal and functional explanations. Newmeyer even considers “merging” the two approaches (see chapter 5.1 below). All the approaches presented are similar with regard to the definition of functional explanation (see chapter 5.1 below) and also with regard to the dichotomous approach to the nature of linguistic explanation, although they use different terms to denote the opposite pole of functional explanation. Egré contrasts internal and external explanations, using the term grammatical explanation for internal ones and two variants for external ones – historical and functional explanations (Egré 2015, 451).
The profiling of grammatical explanations allows Egré to see structuralism and generativism in a consistent way as representatives of the same concept of explanation, which, however, was brought to a higher level in generativism, through the thesis of syntax autonomy (Egré 2015, 453). In the spirit of generativism, Egré succeeds in defining a formal explanation as follows: (. . .) linguistic form is explained, according to the perspective taken by the generative program, if its structural description can be derived from a recursive set of grammatical rules, based on a finite lexicon; its absence is explained if its structural description does not appear as the output of the rules of the grammar. (Egré 2015, 453)
That is, the occurrence or non-occurrence of a linguistic form (structure), of course realized in a grammatical phenomenon, is explained by the derivation of its structural description from a recursive set of grammatical rules. We can state that such a concept of formal explanation corresponds to that which we identified in the analysis of Chomsky’s texts (see chapter 4.1).
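Egré’s formulation can be mimicked in a toy recognizer (again with our own invented grammar and code, not Egré’s): a form’s occurrence is “explained” by exhibiting a derivation from the recursive rule set, and its absence by the fact that no split of the string is licensed by any rule.

```python
# Sketch of Egre's point (our illustration, not his formalism): a form
# "occurs" if it is derivable from the recursive rule set, and its
# absence is "explained" by the fact that no derivation produces it.
RULES = {
    "S":  [("NP", "VP")],
    "NP": [("Det", "N"), ("Det", "N", "PP")],
    "PP": [("P", "NP")],          # recursive: a PP inside an NP inside a PP ...
    "VP": [("V", "NP")],
    "Det": [("the",)], "N": [("dog",), ("park",)],
    "P": [("in",)], "V": [("sees",)],
}

def derivable(symbol, tokens):
    """True iff `tokens` can be derived from `symbol`."""
    if symbol not in RULES:                      # terminal symbol
        return tokens == (symbol,)
    return any(splits(rhs, tokens) for rhs in RULES[symbol])

def splits(rhs, tokens):
    """True iff `tokens` can be partitioned to match the rule body `rhs`."""
    if len(rhs) == 1:
        return derivable(rhs[0], tokens)
    head, *rest = rhs
    return any(derivable(head, tokens[:i]) and splits(tuple(rest), tokens[i:])
               for i in range(1, len(tokens)))

print(derivable("S", tuple("the dog sees the dog in the park".split())))  # True
print(derivable("S", tuple("dog the sees".split())))                      # False
```

The recursive PP rule lets the recognizer accept unboundedly deep embeddings (“the dog in the park in the park …”), so the “explanation” of occurrence is literally the existence of a derivation.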
Egré also reinforces the importance of the explanatory nature of linguistic theory by distinguishing among the observational, descriptive and explanatory adequacy of linguistic theory, which are of increasing importance in this order. Observational adequacy is identified with the ability to classify data correctly; the distinction between descriptive and explanatory adequacy is not clearly evident from Egré’s words (Egré 2015, 453–454), but it expresses the connection between the ability to explain and the ability to grasp language acquisition and the universality of grammar across languages184 (Egré 2015, 454). Egré adds two objections to grammatical explanation in generativism, which were already expressed in classical form by Talmy Givón and Michael Levin (Givón 1979, Levin 1977). According to Egré, Givón’s critique aims at the inability of generativism to offer valid predictions, and Levin’s critique aims at the non-causal nature of grammatical explanations (Egré 2015, 455). Both of these arguments are based on the classical definition of explanation in Hempel’s D-N model and on its critiques that appeared in the 1960s and 1970s (see chapter 2.1 above).185 Egré defends generativism against both objections. He believes that even a description, if not trivial, can be predictive (Egré 2015, 455), and he argues for the non-causality of some explanations. We agree that these are not prima facie fundamental objections. We simply recognize non-causal models and unambiguously combine predictivity with generativism (see chapters 2.2 and 4.1). Moreover, due to the invalidity of the strict symmetry thesis between explanation and prediction, the possible absence of some type of prediction is not in conflict with the explanatory potential of a theory.186 Conversely, the mere possibility of prediction does not guarantee the explanatory potential of a theory. However, Egré’s reference to causality again leads us away from a purely linguistic mode of explanation.
Egré points out that: (. . .) most theories of generative grammar are indeed silent about the algorithm that will be used to implement the rules. A complete theory of language should tell us not only about abstract rules, but also about the underlying computational mechanisms. This view is compelling, but it seems fair to say that the general methodology of generative grammar is compatible with that goal (. . .). (Egré 2015, 456)
Grammar is more explanatory: “(. . .) if the units and principles it postulates better account for the way linguistic representations are learnt and deployed.” (Egré 2015, 454). And also: “(. . .) if the principles it uses are more likely to be shared with those of other languages.” (Egré 2015, 454). Egré comments on the problem of the absence of causality in the original D-N model of explanation explicitly. He also mentions the “flagpole problem” and Hempel’s defense of the possibility of non-causal explanation (Egré 2015, 455). The failure of the explanation-prediction symmetry in the context of the theory of dynamical systems is discussed by Kellert (1993, 96–103).
Egré refers to the neuroscientific aspects of explanation (Egré 2015, 456), and we could also refer more broadly to the minimalist program, as we stated above (see chapter 4.1). To summarize, we can state that Egré’s exposition helped us to define a formal or grammatical explanation, but his defense did not go beyond our findings and does not challenge our skepticism about the possibility of a valid principle-based model of explanation in generativism. ✶✶✶ Frederick Newmeyer argues that the controversy between formal and functional linguists is artificial and unnecessary because, in fact, the two approaches support each other (Newmeyer 2016, 1). Overall, he pushes for a conciliatory solution to the conflict, which again sounds more favorable to generativists – he repeatedly points out that generative linguists accept the importance of functional explanation (Newmeyer 2016, 17–19, 20–25). Newmeyer’s efforts to define the terms “formal” and “functional” explanation clearly are very important for our goals. At the same time, a linguistic dilemma again arises in their definitions:
(1) An explanation is formal if it derives properties of language structure from a set of principles formulated in a vocabulary of nonsemantic structural primitives.
(2) An explanation is functional if it derives properties of language structure from human attributes that are not specific to language. (Newmeyer 2016, 3)
The definition of a formal explanation implies Newmeyer’s association between the existence of a valid formal explanation and the thesis of an autonomous syntax. This identification clarifies for us the field of research in generativism. We can, therefore, state that “in a way” there is, for Newmeyer, a principle-based model of explanation in generativism because there is an autonomous module of syntax in language – defined by non-semantic rules (Newmeyer 2016, 3). For us, however, the arguments raised above still apply (recursion as a construction principle, see chapter 4.1). Could one say that, for Newmeyer, a formal “explanation” would be degraded to a description without the validity of the thesis of syntax autonomy? Newmeyer even believes that a formal explanation as defined by generativism is compatible with a weak variant of Hempel’s D-N model187 (Newmeyer 2016, 4). He argues that: “(. . .) a general principle is formulated and paired with a set of initial conditions. A phenomenon is said to be “explained” if it can be deduced from this pairing” (Newmeyer 2016, 4). He provides a specific example in relation to the “Wh-Criterion”: Wh-Criterion (. . .) A. A wh-operator must be in a Spec-head configuration with X0 [+wh]. B. An X0 [+wh] must be in a Spec-head configuration with a wh-operator. (. . .) Hence it is deduced that following sentence will be ungrammatical: ✶ I wonder you saw who. (Newmeyer 2016, 4)
He also recalls the problems of the D-N model recognized by Salmon: (1) the possibility of manipulating the conditions and adapting the theory against falsification; (2) the absence of a theory-independent criterion for recognizing relevant facts (Newmeyer 2016, 5).
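The deductive pattern Newmeyer describes can be caricatured in a few lines of code. This is a deliberately crude sketch of ours, not Newmeyer’s and not a serious syntactic analysis (it does not model Spec-head configurations): the “general principle” is the Wh-Criterion, the “initial conditions” describe one embedded clause, and (un)grammaticality is deduced from the pairing in D-N fashion.

```python
# Crude sketch (ours, not Newmeyer's, and not an actual syntactic
# analysis): the Wh-Criterion as a general principle paired with
# initial conditions, from which (un)grammaticality is deduced.
def wh_criterion(spec_has_wh_operator, head_is_wh):
    # A: a wh-operator must be in Spec of an X0[+wh] head
    # B: an X0[+wh] head must have a wh-operator in its Spec
    return spec_has_wh_operator == head_is_wh

# "I wonder who you saw": 'who' occupies Spec of the [+wh] head
print(wh_criterion(spec_has_wh_operator=True, head_is_wh=True))    # True

# "*I wonder you saw who": the head selected by 'wonder' is [+wh],
# but 'who' stays in situ, so Spec hosts no wh-operator
print(wh_criterion(spec_has_wh_operator=False, head_is_wh=True))   # False
```

The point of the sketch is only the logical form of the deduction; whether such a criterion is a genuine principle or merely an empirical generalization is precisely what is at issue.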
And it is exactly the “Wh-Criterion” that is supposed to be an example of a principle (rule) in the field of autonomous syntax. So are we dealing with a valid principle-based model of explanation in generativism? We are not convinced, for several reasons: (1) the “Wh-Criterion” is a specific rule of English grammar, not a universal grammatical principle; moreover, (2) we consider the “Wh-Criterion” to be an empirical generalization, which therefore cannot be an independent principle; and, as a reductio ad absurdum, (3) we could, we believe, formulate an unlimited number of such “D-N models” whose initial statement is general only “to some extent”; but such a statement is not a law (principle) – that is, the basic condition of a valid D-N model of explanation is not met. Newmeyer is aware of the limitations of the D-N model and also reflects other, alternative demands that contemporary philosophers of science place on scientific explanations – he writes about the consistency of the theory, simplicity, elegance and the ability to bring understanding (Newmeyer 2016, 6). He believes that Syntactic Structures itself clearly meant an increase in these criteria in the domain of linguistics of the 1950s (Newmeyer 2016, 6). The references to simplicity, indeed, coincide with Chomsky’s minimalist program, which we wrote about above (see chapter 4.1). In the text, we also find an argument which we have already encountered in Egré and which Givón formulated against the generativist explanation: In essence, a formal model is nothing but a restatement of the facts at a tighter level of generalization (. . .). There is one thing, however, that a formal model can never do: It cannot explain a single thing (. . .). The history of transformational-generative linguistics boils down to nothing but a blatant attempt to represent the formalism as “theory”, to assert that it “predicts a range of facts”, that it “makes empirical claims”, and that it somehow “explains” (. . .).
(Givón 1979, 5–6, cited by Newmeyer 2016, 6–7)
Newmeyer disagrees with Givón, basing his opinion on the distinction between descriptive and theoretical models; at the same time, he considers the theoretical model to be explanatory (Newmeyer 2016, 7). However, his statement implies that the theoretical model of generativism is rather a formal basis – in our terminology, the construction principle of the theory. He writes: “(. . .) a formal explanation is, by any criterion, a “real” explanation. The only questions to answer therefore are whether formal explanation is necessary and whether it is sufficient” (Newmeyer 2016, 7). He answers the first question in the affirmative, the second in the negative (Newmeyer 2016, 7). From our point of view, the situation can also be formulated thus: a formal description is necessary as a construction principle, but it does not per se constitute a valid principle-based explanation (see our argumentation in chapter 4.1 above). If Newmeyer admits that it is not sufficient, then he himself undermines the position of formal linguistics. From the point of view of the explanatory dilemma of linguistics, the situation shifts towards an explanation that requires non-linguistic principles. We believe that Newmeyer comes closest to this problem in the following passage: It is not enough just to point to formal generalizations, since all but the most extreme functionalists would agree that they exist. Rather, it is necessary to show that these generalizations are fully intertwined [emphasis mine] in a system. (Newmeyer 2016, 8)
Unfortunately, Newmeyer does not clearly explicate the term “fully intertwined”, and thus he again does not escape Givón’s critique of generativism. However, whatever “fully intertwined” means, it is clear that the basic empirical support for the validity of formal explanations is – according to Newmeyer – the thesis of the autonomy of syntax (AS), which is demonstrated above by the “Wh-constructions” example (for illustration, see Newmeyer 2016, 8–12). Newmeyer summarizes this position as follows: (. . .) languages have morphosyntactic systems that are not in lockstep with semantic or discourse properties, as is consistent with AS. It follows, then, that formal explanation is a justified mode of explanation in linguistics. (Newmeyer 2016, 12)
However, we believe that the same can again be said of a description. If we find a description to which there is no exception, can we then declare it explanatory because it is not in lockstep with semantics or discourse? As long as it is not explicated what “fully intertwined” means, generativism does not give us the means to build a clearly conclusive principle-based model of explanation that would be purely formal. Apart from the fact that the thesis of the autonomy of syntax can hardly be understood as an independent linguistic principle that could serve as a source of explanatory power, we will see that the thesis can be seriously challenged once functional linguistics, or system-theoretical linguistics, begins to explain syntax functionally – not in isolation, but in relation to other linguistic levels (see chapter 5.2). At that moment, only the construction principle really remains of the formal explanation. On the other hand, at that moment we also gain an opportunity, as we have already indicated above (see note 176), to consider some starting points and principles of generativism as outside-systemic requirements. We will be able to incorporate some basic elements of the core of the scientific research program of minimalism into the research program of system-theoretical linguistics. The most attractive vision is that some principles of generativism could result from a comprehensive system-theoretical approach linking all linguistic levels. Generativism would prove to be an approximation of system-theoretical linguistics at the level of syntax. Later (see chapter 5.1), we will focus on Newmeyer’s attempt to reconcile formal and functional explanations and try to show that his example of the combination of formal and functional explanations (Newmeyer 2016, 22–24) is explicable by means of system-theoretical linguistics. ✶✶✶ The last chosen contribution to the explication of formal explanations in linguistics is provided by Martin Haspelmath (2004). In this text, Haspelmath argues against the need for linguistic descriptions in building linguistic explanations.188 He clearly prefers to build linguistic explanations on non-linguistic foundations – this represents a limitation for generativists (for conclusions, see Haspelmath 2004, 566–568) and also for functional linguists (for conclusions, see Haspelmath 2004, 573–574). Haspelmath’s entire endeavor is framed by his belief that the discipline capable of providing an “explanation of the basic building block” of linguistics is biology (Haspelmath 2004, 557). In fact, both generativists and proponents of functional explanations face the need to base their theories on some non-linguistic foundation.
In this respect, the linguistic description is ultimately irrelevant and, moreover, the formal explanation also becomes irrelevant. Haspelmath does not even hide the fact that a "formal explanation" is nothing more than a formal description: Two general guiding principles that formal linguists use to make the choice are: (i) Choose the more economical or elegant description over the less economical/elegant description, and (ii) choose the description that fits better with your favorite view of Universal Grammar. (Haspelmath 2004, 573)
Specifically, he expresses this belief as follows: "Linguistic explanation that appeals to the genetically fixed ("innate") language-specific properties of the human cognitive system (UG) does not presuppose any kind of thorough, systemic description of human language (. . .). Linguistic explanation that appeals to the regularities of language use ("functional explanation") does not presuppose a description that is intended to be cognitively real." (Haspelmath 2004, 554–555).
4.2 Explication of some models of formal explanation in linguistics
Unlike Newmeyer, Haspelmath does not believe that we need to define something like an independent "formal explanation" that would be a prerequisite for a functional explanation – not even in the form discussed above, as a necessary but not sufficient condition for explanation. He characterizes the activities of generative linguists (especially in the 1980s and 1990s) as the creation of empirical generalizations: "(. . .) by examining a range of phenomena both within and across languages, formulating higher-level language-internal and cross-linguistic generalizations, and then building these generalizations into the model of UG" (Haspelmath 2004, 559). Thus we can say that the scheme of this activity is simple: an empirical generalization 1 (EG1) is created and confronted with empirical evidence, there is a change to EG2, etc., according to the availability of data (we gave the example of Swiss German above, see the beginning of part 4 and Appendix 9). Haspelmath cites a number of examples from the history of generativism in which the limiting criteria specifying the form of UG (in terms of syntax, morphology and phonology) were gradually refined. He endorses Newmeyer's criticism that typological evidence cannot be used to form hypotheses about UG (Haspelmath 2004, 561) and adds another argument, based on exceptions to the application of language universals: Thus, on purely statistical grounds, there is every reason to believe that there are also generalizations with exceptions that we could only observe if there existed six billion languages in the world. (Haspelmath 2004, 564)
He refers to the book Newmeyer (1998), in which Newmeyer claims: "Formal analysis of language is a logical and temporal prerequisite to language typology. That is, if one's goal is to describe and explain the typological distribution of linguistic elements, then one's first task should be to develop a formal theory." (Newmeyer 1998, 337; cited according to Haspelmath 2004, 568). By contrast, Haspelmath claims that: "(. . .) for the purposes of discovering empirical universals (. . .), it is sufficient to have phenomenological descriptions (. . .). "Observational adequacy" is sufficient. In other words, a descriptive grammar must contain all the information that a second-language learner (. . .) would need to learn to speak the language correctly, but it need not be a model of the knowledge of the native speaker." (Haspelmath 2004, 569). Haspelmath also points to a certain confusion of description and explanation that is typical of generativism: "When a situation is encountered where some non-occurring structures could just as easily be described by the current descriptive framework (= the current view of UG) as the occurring structures, this is taken as indication that the descriptive framework is too powerful and needs to be made more restrictive. In this sense, one can say that description and explanation coincide in generative linguistics (. . .)." (Haspelmath 2004, 560). According to Haspelmath, this confusion does not occur in functional linguistics (Haspelmath 2004, 560). E.g. the X-bar scheme introduced restrictions on the original UG, limited the phrase-structure rules, and UG became more empirically adequate (Haspelmath 2004, 560).
He, thus, draws attention to the fundamental limitation of empirical evidence drawn from the comparison of natural languages that actually exist – in fact, the classical problem of induction opens up before generativism. Thus, an extra-linguistic source of explanations in generativism begins to prove inevitable. Haspelmath's biological analogies of language are the most interesting. In the subchapter "Possible languages and possible organisms", he builds on considerations of hypothetical languages that could potentially be spoken, but of which only a small part is actually realized (Haspelmath 2004, 565). He writes about them: "Such languages could be acquired and used, but they would not be very user-friendly, and they would undergo change very soon if they were created artificially in some kind of experiment" (Haspelmath 2004, 565). It is a kind of adaptability of languages, which he compares with the adaptability of organisms. On that occasion, he also considers non-adaptive factors that may play a role – he gives the example of symmetries (Haspelmath 2004, note 6, 575). Personally, however, he is inclined to believe that symmetry, too, refers to some form of adaptability. For us, the consideration of symmetries is interesting in the context of the search for non-causal structural or topological explanations (see chapter 5.6). Biological analogy allows Haspelmath to go further and compare the existing constraints on the structure of possible languages with the existing constraints on biological structures given by the genetic code (Haspelmath 2004, 565). However, he does not believe much in the usefulness of generativist descriptions with regard to the identification of these structures. He says that: (. . .) the empirical study of cross-linguistic similarities does not help us in identifying the cognitive code that underlies our cognitive abilities to acquire and use language.
The cognitive code evidently allows more than is actually attested [emphasis mine], and cross-linguistic generalizations can be explained by general constraints on language use. (Haspelmath 2004, 566)
References to the cognitive code, as well as the whole network of biological analogies, prove that Haspelmath is a supporter of the existence of an extra-linguistic explanatory source of linguistics. Haspelmath believes that neuroscience will
For example, the symmetry of moving animals contrasts with the asymmetry of organisms that are static (e.g. plants) (Haspelmath 2004, 575). Haspelmath’s references to Thompson (1961) revive reflections on a structuralist and formal approach to biology, see French (2014, 324–352). See above (chapter 2.2) for considerations on non-causal explanations.
eventually lead to a solution of the basic linguistic issues of generativism, but, according to him, we still have to wait for this solution (Haspelmath 2004, 566). ✶✶✶ We have followed three current contributions to the debate on the role of formal explanations in linguistics, whose common feature is a statement of the gradual convergence of generativist and functional approaches to language. Another common feature is the move towards the biological, and specifically the neuroscientific, explanatory frontier, which would most safely secure any linguistic explanations. Of these authors, Haspelmath relies most heavily on non-systemic, non-linguistic criteria as the source of explanations, while descriptions and formal models of grammar are not needed for this task in contemporary generativism (and, as we will see, also in functional approaches to language, see the fifth part of the book). In contrast, Newmeyer defends the originality of and the need for formal explanations, which are also necessary for the implementation of functional explanations. We argued against Newmeyer, referring to our conception of the principle of recursion as a construction principle, and we questioned the independence of the formal explanation. We think we can conclude our reflections on formal explanations by stating that we have not succeeded in finding a possible valid principle-based model of explanation for them. We, therefore, do not think that, in the context of generativism, for which the formal point of view is the strongest, it is possible to formulate a purely linguistic explanation. What contemporary generativism offers, therefore, is both (1) very successful descriptions classifying existing and potential new languages and (2) possible models that seek explanatory power outside the linguistic sphere – in biology and neuroscience – within the minimalist program. Thus, generativism is fully anchored in the mentioned explanatory dilemma of linguistics.
In the First Interlude, we will attempt to build a bridge between the structural and formal approaches outlined above and the quantitative and system-theoretical linguistics that will follow (see part 5 of the book). This bridge will be Gustav Herdan's linguistic theory, whose explanatory strategy is difficult to classify within the selected categories of formal (and systemic) and functional descriptions and explanations. In this interpretation, we will labour to make visible a line of linguistic development different from the traditional one (from structuralism to generativism) – the line leading from structuralism through Herdan to quantitative and system-theoretical linguistics.
He, therefore, offers another option: “What we really need to test the outer limits of UG is experiments on the acquisition of very unlikely or (apparently) impossible languages.” (Haspelmath 2004, 566).
The first interlude: Herdan’s Language as choice and chance The first great formal theories of complex systems were those of statistical physics: first, kinetic theory (Boltzmann, Maxwell), and then the more general apparatus of statistical mechanics (Gibbs, Tolman). For many complexity theorists writing in the wake of the development of statistical physics, it was natural to apply the same statistical methods to ecosystems (Lotka), to “disorganized” complex systems in general (Weaver), and even to an imagined social science capable of predicting the future history of the galaxy (Asimov’s Foundation trilogy). Michael Strevens, Complexity Theory
Gustav Herdan's work, specifically The Advanced Theory of Language as Choice and Chance (Herdan 1966, TLCC), is interestingly situated between linguistic structuralism and quantitative linguistics. Herdan puts it this way in the introduction: The determination of the extent to which the speaker is bound by the linguistic code he uses, and, conversely, the extent to which he is free, and can be original, this is the essence of what I call quantitative linguistics. (TLCC, 5–6)
As we stated above, Herdan's place in our research is justified by the bridging role he plays between structuralism on the one hand and quantitative linguistics on the other. However, there is another reason why Herdan's work is part of our analysis – Herdan systematically pursues the analogy between linguistics and physics throughout his book. This analogy will again be useful to us in considering the nature of the explanations that his "quantitative linguistics" offers. Herdan himself understands his conception of the theory of language as closely connected with the work of de Saussure – he directly states: "(. . .) my work may be described as the quantification of de Saussure's langue-parole dichotomy" (TLCC, 13) – and with that of Boole (Laws of Thought) (TLCC, 13). He also draws on these authorities for his two central principles – de Saussure's principle of linearity and Boole's law of duality. He defines both of these principles (laws) as
Herdan is included among the classics of quantitative linguistics although his position is somewhat controversial, especially given the certain eclectic nature of his linguistic approach. In some respects, he was initially in stark opposition to emerging quantitative linguistics – for example, due to his rejection of Zipf’s law (see below). For an overview see Best, Altmann (2007). “The present work can be described as an extension of Boolean principles from symbolic logic to linguistic expression in all its aspects, the fundamental law in both fields being a law of duality: Boole’s algebraic law of duality as the fundamental law of thought, i.e. of linguistic content, and my principle of linguistic duality in language as the fundamental law of linguistic expression.” (TLCC, 13). He refers to the book An Investigation of the Laws of Thought on Which are Founded the Mathematical Theories of Logic and Probabilities, Boole (1854).
mathematical concepts necessary for grasping the structure of language (TLCC, 8). And the law of duality is more important for the structure of language because, as Herdan writes: Linearity, although the basic principle of speech, is only the Prokrustes bed of language. It needs to be counteracted by grammar indicating the points of contact between the words, according to the speaker’s choice. (TLCC, 7)
The law of duality can, thus, be thought of as a potential fundamental principle on which the principle-based model of explanation in Herdan’s linguistic theory can be based. Herdan’s mathematical conceptualization of grammar (syntax) is, thus, based on the principle of geometric duality: Thus, geometrical duality represents the mathematical model of the system of grammatical points of contact between the linguistic elements; or briefly, of the grammatical contacts between the words. It represents a ‘schema’ of the simplest kind of conceptual concatenation, and in this sense can be regarded as the basic law of grammar. (TLCC, 8)
Here, it is important to point out that Herdan takes a quick step from de Saussure's to Chomsky's view, in the sense that the system of oppositions – typical for de Saussure – refers to grammar (syntax) only implicitly, while Chomsky grounds grammar, with full autonomy, in a specific form of relational system. We interpreted the transition from de Saussure's view to Chomsky's view of linguistics as a selection of a certain kind of graphs (a correctly formed tree); in Herdan's case we see mathematical conceptualization rather at the level of analogy – geometric duality is projected onto language structure. With regard to Hjelmslev's conception of the (meta)theory of language (see chapter 3.2), it is interesting to note that Herdan explicitly states that he bases his theory on the two principles – linearity and duality – and that he is not interested in the logical structure of language (TLCC, 8). Herdan develops this further in defining "literary statistics" (or "language statistics") as a "quantitative philosophy of language" (TLCC, 9). He specifies: (. . .) literary statistics is structural linguistics raised to the level of a quantitative science, or to that of a quantitative philosophy. Thus it is not in its results irrelevant to linguistics, nor is its main function that of providing an auxiliary tool for research. (TLCC, 9–10)
He says that: “(. . .) it is not logic which is made the starting point, the ‘model’, of language, but the empirical fact of the linearity of the sequence of linguistic elements, and the relations of points and segments along that line, governed by the laws of projective geometry. Logic governs the relations between the concepts, whereas projective geometry governs the relations between the linguistic symbols (. . .).” (TLCC, 8).
He defines language statistics as an independent linguistic theory which has its basic principles (linearity and duality), uses the means of mathematics (combinatorics and "classical statistics") and benefits from analogies with statistical physics. The very plan of Herdan's work is framed by the "duality" of chance and choice – the blind "causality" of the linguistic phenomenon and our decision (as speakers) to use certain linguistic means. Thus, in the four chapters (I–IV), Herdan presents a gradual alternation of chance and choice. We cannot choose phonemes (I) – that is why chance governs phonetics. We can choose words (II) – that is why we find choice in the lexicon. We cannot choose the optimal system of linguistic oppositions (III) – therefore grammar is governed by chance. We can choose basic oppositions for an adequate description of the universe of discourse (IV) – therefore choice rules at the discursive level (TLCC, 11–12). ✶✶✶ Herdan examines the individual levels – phonetic, lexical, grammatical and discursive – in detail. In the following text, we will focus on those aspects of his theory that clarify the connection between structuralist and system-theoretical linguistics and consider some elements of Köhler's system-theoretical linguistics in their germinal form (outside the comprehensive system). We will also be interested in cases where Herdan identifies various linguistic laws that could thus establish a principle-based model of explanation.
In the first section of the book “Language as Chance I – Statistical Linguistics”, Herdan defines the relationship between linguistics and the theory of information and communication through the Fundamental Law of Communication (FLC): The proportions of linguistic forms belonging to one particular level of understanding, or to one stage of linguistic coding, – phonological, grammatical, metrical, – remain sensibly constant for a given language, at a given time of its development and for a sufficiently great and unbiassed number of observations. (TLCC, 15)
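Read operationally, the FLC claims that the relative frequencies of forms at a given coding level stay roughly constant across sufficiently large, unbiased samples of a language. A minimal sketch of how such a claim could be checked (the sample texts, and the use of letter frequencies as a crude stand-in for one level of linguistic coding, are our own illustrative assumptions, not Herdan's procedure):

```python
from collections import Counter

def relative_frequencies(text):
    """Relative frequency of each letter in a text (a crude stand-in
    for the inventory of forms at one level of linguistic coding)."""
    letters = [c for c in text.lower() if c.isalpha()]
    counts = Counter(letters)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

def max_divergence(freqs_a, freqs_b):
    """Largest absolute difference in relative frequency across symbols.
    The FLC predicts this stays small for two large samples of one language."""
    symbols = set(freqs_a) | set(freqs_b)
    return max(abs(freqs_a.get(s, 0.0) - freqs_b.get(s, 0.0)) for s in symbols)

# Two samples "of the same language" (toy data):
sample_1 = "the determination of the extent to which the speaker is bound " * 50
sample_2 = "the extent to which the speaker is free and can be original " * 50
print(max_divergence(relative_frequencies(sample_1), relative_frequencies(sample_2)))
```

The point of the sketch is only that the FLC, so read, is an empirical statement about sample-to-sample stability – which is precisely why, as argued below, it is itself in need of explanation rather than being an explanatory principle.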
Thus, language statistics is understood as an intersection of three areas: combinatorics, number statistics (this includes probability theory and statistical inference, i.e. what is usually understood as “statistics”) and statistical physics (TLCC, 10). With regard to statistical physics, he writes: “(. . .) we find that the statistical principles appropriate to the levels of language, i.e. phonemic or alphabetic, vocabulary, grammar, style, correspond to the principles of small particle statistics (. . .).” (TLCC, 10). However, this division is not strict, he relativizes it in several places, e.g.: “It will be obvious that since grammar and lexicon are not kept in water-tight compartments, neither is either purely ‘chance’ or purely ‘choice’. Each contains both elements, though in significantly varying proportions.” (TLCC, 127).
However, this "law" (FLC) can hardly be described as an explanatory principle; it is rather a regularity (rule) that should itself be explained. This particular explanation is present in Köhler's system-theoretical linguistics. Therefore, we cannot rely on FLC as an explanatory principle in the principle-based model of explanation. Herdan himself testifies to this when he relates the law to de Saussure's langue-parole dichotomy, or to the definition of langue as independent of parole: The basic law of linguistic communication as stated above is then tantamount to the statement that 'la langue' is the collective term for linguistic engrams (phonemes, word-engrams, metric form engrams) together with their particular probabilities of occurrence. (TLCC, 27)
Although this interpretation of langue provides an important link between linguistic structuralism and quantitative linguistics, it does not give the concept any content that would allow FLC to be understood as an explanatory principle. Rather, it turns out all the more clearly that this is actually a quantitative-linguistic definition of the concept of language (langue) as a system expressible by statistical means. Indeed, Herdan uses the term "the statistical view of de Saussure's dichotomy" (TLCC, 27). From Herdan's other statements, it is clear why he tends to talk about a law – langue thus conceived can be understood as governed by normative law, whereas de Saussure strictly attributed to synchronic law only generality, not imperativeness (TLCC, 28, cf. CGL, 92–93). He then completes this reasoning by using statistical terminology, comparing parole to a sample and langue to a population (TLCC, 28). If we understand that the population is "governed by laws" responsible for the stability of the inventory of forms and their frequencies, then the way opens again to Köhler's system-theoretical linguistics. However, we still have to find the "laws" that "govern the population", as well as the model of explanation that can be based on them (see the fifth part of the book).
It is well developed especially on the lexical level – through the expression of the relationships between quantities: the size of the inventory, the number of phonemes and the frequency of the lexical unit (see chapter 5.2). Herdan writes: "(. . .) if by linguistic normative laws we understand something which regulates the relative frequency of linguistic forms belonging to a certain class, then our statistical conception of 'la langue' implies such normative laws (. . .)." (TLCC, 28). Herdan complicates and confuses the situation a bit when he starts talking about collective choice: "It will be obvious that chance enters here only in the relation of 'la parole' to 'la langue', being that of random sample to population. The form of the parent distribution itself, as the precipitation of the speech community's linguistic activity during centuries, represents the element of choice, not individual choice like style, but collective choice. Using a metaphor, we may here speak of the choice made by the spirit of language, if we understand by it the principles of linguistic communication which are in conformity with the psychomental complex of the members of the particular speech community. The stability of the series is the result of chance." (TLCC, 28).

However, Herdan only implicitly approaches the definition of the requirement of "stability", which is explicitly defined by Köhler in system-theoretical linguistics (cf. Köhler 2012, 178). Herdan puts it the following way: Without a sensibly stable series of relative frequencies of linguistic symbols or forms there can be no prediction, or rather no guessing or expectation of being correct, and thus there can be no information in the sense of information theory. (TLCC, 28)

What Köhler analyzed gradually (as we will see, from the lexicon to the syntax, see chapter 5.2) is present in Herdan all at once, though of course in a less clear and complete form. He recognizes, for example, that grammatical phonemes (to the extent of language choice) contribute significantly to the stability of the phoneme frequency distribution across speakers (authors) (TLCC, 56). Herdan is more comprehensively aware of these influences across linguistic plans; he realizes that, for example, the frequency of phonemes depends on many things across system levels (see the chapter "Explanation of Stability of Linguistic Distributions", TLCC, 44–60). This foreshadows the interconnection of individual subsystems in Köhler's system-theoretical linguistics (especially in the lexicon and syntax, see chapter 5.2). ✶✶✶ One of the central passages of section two of the book "Language as choice I – stylostatistics" is a subchapter (5.9) entitled "Unsuitable mathematical models in language statistics, and their consequences". In this passage, Herdan expresses his critique of central statistical laws and economization principles: Zipf's law, Mandelbrot's canonical law, and the law of least effort. Herdan believes that these are inappropriate, sterile mathematical models that cannot be applied in the sense in which we commonly understand the concept of scientific law (TLCC, 88). He comments on Zipf's law most radically: That the decrease of frequency should be related to an increase in rank follows, not from any natural property of language structure, but merely from the fact that the word with the highest frequency is given the lowest rank, and as the frequency decreases the words are given correspondingly higher ranks. Thus the inverse relation between frequency and rank, which is at the basis of the Zipf law, is one of our own making. (TLCC, 88)

Part of this section is the assertion of the dependence of vocabulary richness on the length of the text, which is standardly referred to as Herdan's (or Heaps') law. Herdan states: "(. . .) although vocabulary grows with increasing text length, yet it does so with diminishing speed, because the rate of increase at any moment is inversely proportional to the text length." (TLCC, 76).
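The growth regularity referred to as Herdan's (or Heaps') law – vocabulary grows with text length, but with diminishing speed – is today usually written as V(N) ≈ K·N^β with 0 < β < 1. A small sketch of the vocabulary-growth curve it describes (the toy token list and the parameter values K and β below are illustrative assumptions, not fitted to any corpus):

```python
def heaps_vocabulary(tokens):
    """Observed vocabulary size after each successive token of a running text."""
    seen, growth = set(), []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth

def heaps_model(n, k=10.0, beta=0.5):
    """Heaps' law V(N) = K * N**beta: vocabulary grows with text length,
    but at a diminishing rate because beta < 1."""
    return k * n ** beta

# A toy "text" of ten word types repeated; the growth curve flattens quickly:
tokens = ("the of and a to in he that it was " * 100).split()
print(heaps_vocabulary(tokens)[-1])
# In the model, a 100-fold longer text yields only a 10-fold larger vocabulary:
print(heaps_model(100), heaps_model(10000))  # 100.0 1000.0
```

The diminishing rate of growth is exactly Herdan's "rate of increase inversely proportional to the text length" in power-law form.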
We agree with Köhler's belief that most so-called linguistic laws do not play the role of sui generis explanatory principles, but are rather empirical laws that manifest a deeper level of systemic principles (on which a functional explanation is built, see chapter 5.5); but Herdan's criticism seems too radical. Even an empirical law can itself be an important indicator of the existence of the stated systemic principles, and the evidence for it therefore always has to be perceived as a potential source of important knowledge about the nature of the system under study. We will see this below (subchapter 5.4.1) in connection with considerations of the importance of power laws in scale-free networks (cf. Caldarelli 2007). On the other hand, Herdan's critique is useful because it shows the need to distinguish the hierarchy of the linguistic principles and laws presented – he himself does so when trying to identify universal principles (duality and linearity) and laws (FLC) that could serve to explain language phenomena. It is just a question of why Herdan was not equally critical of the above-mentioned FLC, which is obviously sterile from an explanatory point of view and itself requires explanation. Herdan also believes that the law of least effort has no significance for linguistics because it makes no sense to apply it outside the physical framework – in physics, it is supposed to express the principle of minimizing the energy consumption of a physical system (TLCC, 91). More precisely, beyond Herdan's interpretation, one could even say that in physics this principle appears within the variational principle. This is how Luděk Hřebíček thinks about it (see the Second Interlude). Herdan believes that this principle cannot be meaningful in linguistics because: "(. . .) the use of the linguistic code, like that of any other man-made contrivance, is naturally controlled by what governs all conscious or intentional activity under normal conditions: economy of effort" (TLCC, 91).
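Herdan's objection to Zipf's law, and our reply to it, can both be stated computationally: sorting any frequency list in descending order guarantees, by construction, that frequency does not increase with rank, for any material whatsoever; Zipf's law proper is the much stronger, falsifiable claim that frequency falls off as roughly f(r) ≈ f(1)/r. A sketch separating the two claims (the token list is illustrative):

```python
from collections import Counter

def rank_frequency(tokens):
    """Frequencies sorted in descending order; rank r is index + 1.
    That this sequence is non-increasing is true by construction --
    Herdan's point -- for ANY token list whatsoever."""
    return sorted(Counter(tokens).values(), reverse=True)

def zipf_prediction(freqs):
    """The substantive Zipfian claim: f(r) is approximately f(1) / r.
    Whether observed frequencies match this shape is an empirical question."""
    return [freqs[0] / r for r in range(1, len(freqs) + 1)]

tokens = "a a a a a a b b b c c d".split()
freqs = rank_frequency(tokens)
print(freqs)                   # [6, 3, 2, 1] -- non-increasing by construction
print(zipf_prediction(freqs))  # [6.0, 3.0, 2.0, 1.5] -- a testable shape
```

The trivial monotonicity is indeed "of our own making", as Herdan says; but the specific hyperbolic shape is not, which is why the empirical law can still point to underlying systemic principles.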
Here we come across Herdan's belief that there is such a thing as an economy of effort of the intentional system which cannot be translated into the notions of economizing energy consumption in a physical system. We must admit that we do not understand Herdan's conception. Overall, it is striking that Herdan rejects
Herdan is similarly radical in criticizing Mandelbrot's canonical law. For example, he claims that: "This is where the great fault of Mandelbrot's argument lies. His whole argument is based upon the tacit assumption that the substitution of another variable in place of r would leave the formula, and thus the curve, identically the same." (TLCC, 89). And also: "But here comes the main defect of both models from the practical angle: their authors have forgotten to take the possible influence of sample size upon the parameters into account." (TLCC, 89). Also in this case, we think the comments we made in connection with Zipf's law apply. The variational principle in physics generalizes the meaning of the principle of least action, see e.g. Feynman, Leighton, Sands (2010, 19.1–19.14). Strictly speaking, it does not always have to be a matter of minimizing a given quantity, but may also be one of maximizing; i.e. extremization is the more general principle.
one of the most interesting connections between physical and psycho-social systems in a book that systematically maps the isomorphic structures of linguistics and physics. The second key passage of "Stylostatistics" is the introduction of a new statistical parameter, the so-called Characteristic (Herdan uses the symbols K and vm interchangeably), which Herdan considers an important text constant (TLCC, 101–102). He defines the Characteristic as follows (where σx is the standard deviation, Mx is the arithmetic mean and N is the number of words): (. . .) the Characteristic when written in the form σx/(Mx√N) is seen to represent the coefficient of variation of a mean, or the relative fluctuation of a mean, as distinct from that of a single value, σx/Mx. We shall distinguish it from the latter by the symbol vm and understand by it the coefficient of variation of the sampling distribution of means. (TLCC, 102)
Herdan considers this entity to be linguistically significant because it is independent of the text length, and can thus serve as a means for a suitable quantitative evaluation of style: "Our interpretation of the Characteristic K has thus enabled us to describe style in terms of a statistical concept: the relative fluctuation of the mean frequency of word occurrence." (TLCC, 103) Here Herdan develops one of his parallels between linguistics and statistical physics – he believes that language behaves like a physical system because the Characteristic decreases with time, which corresponds to the increase of the system's entropy (TLCC, 112–113): (. . .) vm appears indeed to decrease with time, the range of variation in the use of words around a mean frequency becoming smaller and smaller compared with that mean frequency. This admits the tentative conclusion that language, in the aspect under consideration, behaves like a physical system. (TLCC, 113)
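In modern notation, the Characteristic is simply the coefficient of variation of the mean word frequency, vm = σx/(Mx√N). A direct sketch of the computation (the occurrence counts below are invented for illustration; this follows the definition quoted above, not Herdan's full worked procedure):

```python
from math import sqrt

def characteristic(frequencies):
    """Herdan's Characteristic v_m = sigma_x / (M_x * sqrt(N)):
    the coefficient of variation of the mean frequency, here computed
    from a list of word-occurrence counts."""
    n = len(frequencies)
    mean = sum(frequencies) / n
    variance = sum((f - mean) ** 2 for f in frequencies) / n
    return sqrt(variance) / (mean * sqrt(n))

# Occurrence counts of word types in a hypothetical text sample:
counts = [50, 20, 10, 5, 5, 3, 2, 2, 2, 1]
print(round(characteristic(counts), 4))
```

The formula makes Herdan's stylistic claim concrete: the quantity measures how strongly word frequencies fluctuate around their mean, normalized so that it does not grow with sample size.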
The problem, however, is that we believe that the physical analogies used so far are not sufficiently transformed to reveal anything crucial about the functioning of the language system.207 Moreover, we cannot suppress the impression that the whole concept of Herdan's Characteristic is too simplistic. It is as if Herdan claimed that the main advantage of language statistics lies in the use of several concepts of descriptive statistics (arithmetic mean, standard deviation, etc.) and in a suitable combination of them to find a constant characterizing a text style or even a literary epoch. Herdan thus proceeds fundamentally differently than Köhler in the field of system-theoretical linguistics (see chapter 5.2). It is not essential for Köhler to find sub-indices characterizing a particular language level, but to describe the system of this level and its relation to other systems for other language levels. Although Herdan attempts to lay de Saussure's theory on a mathematical foundation,208 he rather achieves the elucidation of partial aspects of the language system. So far, even in the field of stylistics, we do not find a systematic effort to build an explanatory theory. However, some interesting analogies are present in Herdan's reference to more general trends later thematized by system-theoretical linguistics – for example, the importance of the power law. Herdan considers it in the form xy^k = const. in connection with the analysis of Chinese vocabulary and finds an analogy with biological taxonomy (TLCC, 143–147).

207 Elsewhere, however, he considers entropy change somewhat differently. For example, Herdan points out that some features of language indicate an increase of entropy (e.g. changes in vocabulary), while others indicate a decrease in entropy (e.g. the development of European languages towards monosyllabism), but that this contradiction is only apparent (TLCC, 298–299). According to him, the first case is a result of purely stochastic phenomena, while the other also includes deterministic elements (which are artificial and related to the choice of speakers) (TLCC, 298–299). He borrows Auerbach's concept of ectropy, which he interprets inversely to entropy and identifies with redundancy (TLCC, 299). For example, Herdan states: "The change in the relative frequency of gaps with the size of gap is formally identical with that of the radioactivity of isotopic material with time. (. . .) this means that we have described the average time interval between grammar forms, or their frequency, by a law of chance similar in form to that used for radioactive decay." (TLCC, 130).
208 He tries to show that language statistics can develop de Saussure's concepts and structuralist theory further: "The specifically linguistic statistics, such as vm and the coding principle of reciprocity of symbol length and frequency of use, are also valid for the 'signifiant-signifié' relation where they express the relation between the concepts and the events (objects) subsumed under them, or between content and size of concepts." (TLCC, 121).

✶✶✶

The third section of the book "Language as chance II – Optimal systems of language structure" offers a partial analysis of several other concepts, which we find later in a comprehensive form in Köhler's theory of the structure of the lexicon:

(. . .) the number of basic phonemic or alphabetic elements and their combination in syllables and words, make different claims upon our linguistic faculty. Upon the number of elements per word, say, depends the amount of physical effort required in speaking and writing; therefore, the tendency to shorten long words. The number of basic elements, on the other hand – the size of the phonemic system or of the alphabet – determines the mental effort required for learning and memorising such elements, their symbols and combinations (. . .). (TLCC, 178–179)

Herdan also quantifies the relationship between vocabulary size (in a given language) and average word length (in that language) (TLCC, 179–180). This is again found in Köhler's system-theoretical linguistics in a developed form. Generally, it is interesting to compare these partial findings on the dependence of word length on the alphabet size, vocabulary size, but also on redundancy (see below) with Köhler's quantitative expression of the length of the lexical unit:

L = LG^A · Red^Z · PH^(−P) · F^(−N)

where LG is the vocabulary size, Red is the redundancy requirement, PH is the number of phonemes and F is the frequency of a lexical unit. The values of A, Z, P and N are parameters (Köhler 1986, 77).209 In the chapter "Optimality of the word-length distribution", Herdan outlines the way he conceives an explanation of a linguistic phenomenon, here a specific "word-length distribution". Herdan writes:

(. . .) the word length which we encounter in a language at a given time of development is the result of the 'encounters', i.e. comparisons between words and the ensuing impulses for changing word length as an aid or safeguard against confusion. (. . .) particular word lengths are the results of 'encounters' by comparison, and thus, of solidarity between all lengths, the effect being proportional to the strength of the impulse. This must lead to a better and better adjustment of word lengths, and as time proceeds, their distribution approaches lognormality. An optimal linguistic code is one in which the oppositions forming the system of solidarity are least liable to be misunderstood, which is for word length achieved by its lognormal distribution. (TLCC, 204)
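Köhler's expression for the length of the lexical unit can be sketched numerically. The following Python fragment is our illustration only: the function mirrors the formula L = LG^A · Red^Z · PH^(−P) · F^(−N), but the parameter values are invented for demonstration, not Köhler's empirically fitted ones (Köhler 1986).

```python
def lexical_unit_length(LG, Red, PH, F, A, Z, P, N):
    """Köhler-style length of a lexical unit:
    L = LG^A * Red^Z * PH^(-P) * F^(-N).
    The parameter values used below are illustrative, not fitted."""
    return (LG ** A) * (Red ** Z) * (PH ** -P) * (F ** -N)

# Qualitative behaviour encoded by the signs of the exponents:
# higher frequency (F) and a larger phoneme inventory (PH) shorten a unit;
# a larger vocabulary (LG) and a higher redundancy requirement (Red) lengthen it.
rare = lexical_unit_length(LG=50_000, Red=2.0, PH=40, F=10,
                           A=0.1, Z=0.5, P=0.3, N=0.2)
frequent = lexical_unit_length(LG=50_000, Red=2.0, PH=40, F=10_000,
                               A=0.1, Z=0.5, P=0.3, N=0.2)
assert frequent < rare  # Zipf-style abbreviation: frequent units are shorter
```

The point of the sketch is only the direction of each dependency, which is what the signs of the exponents assert.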
Here again, with regard to the concept of solidarity, Herdan refers to de Saussure (TLCC, 204) and refuses to explain the structure of the language system through psychology.210 He refers to the need to explain via properties of the code and its tendency to optimality: The mass – or global – properties of language, such as the stability of occurrence of phonemes, the lognormality of word length, the frequency distribution of vocabulary etc. must be explained as consequences of the properties of the linguistic code and its gradual approach to optimal conditions [emphasis mine]. (TLCC, 205)
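The notion of redundancy that Herdan identifies with Auerbach's ectropy (see footnote 207), and that reappears as the requirement Red in Köhler's formula, has a standard information-theoretic reading: R = 1 − H/Hmax, the relative distance of a code from maximal entropy. A small Python sketch of this reading (our illustration, not Herdan's own computation):

```python
import math
from collections import Counter

def redundancy(text):
    """Shannon redundancy R = 1 - H/H_max, computed over the
    symbol inventory actually used in `text`."""
    counts = Counter(text)
    n = sum(counts.values())
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    h_max = math.log2(len(counts))  # entropy of uniform use of the inventory
    return 1 - h / h_max

# A maximally mixed use of two symbols carries no redundancy;
# a skewed use of the same inventory is partly redundant.
assert redundancy("abab") == 0.0
assert 0.0 < redundancy("aaab") < 1.0
```

On this reading, a "gradual approach to optimal conditions" is a movement of such global indices, not a claim about any individual utterance.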
We see two important findings in Herdan's reflections – the idea of the structural axiom (1) that characterizes language as a self-organized system, which system-theoretical linguistics considers as its explanatory basis (see chapter 5.5); and the

209 Apart from the lack of integrity, Herdan also differs from Köhler in his greater emphasis on the importance of combinatorics (Cf. TLCC, 182.).
210 He typically writes: "After so much has become known about language structure, it is a retrograde step to try and explain its working by psychology. This is reminiscent of the savage who watched a locomotive moving and explained it by the animal inside whose breath, due to exertion, came out at the funnel." (TLCC, 205).
The first interlude
111
problem of defining the basic mathematical form expressing the structural axiom (2). In Köhler's later considerations, the structural axiom (here, optimization conditions) results in the power law (specifically the Menzerath-Altmann law), which Köhler explicates through the register hypothesis (see subchapter 5.2.2). Herdan attributes an analogous central role to the lognormal distribution (cf. TLCC, 204).211 The other finding becomes even more interesting when we consider that the lognormal and power law distributions are not entirely easy to distinguish,212 both from a practical point of view (considering which fits the data better) and from a theoretical point of view – i.e., which is more suitable for expressing the structure of the relations between the levels of the linguistic system. We will address this key issue in more detail below (see subchapter 5.4.1).213 The analysis of the lexical level in the chapter "The 'New Statistics' on the vocabulary level" gives Herdan room to develop the most significant physical analogy in the book – the analogy between the "new statistics" needed to analyze the lexical level and quantum statistics in physics. A more detailed analysis of this very interesting consideration, which is valuable from a conceptual point of view especially for physics because it clarifies the specifics of Bose-Einstein statistics, is part of the Appendix (Appendix 10). ✶✶✶ In the penultimate section ("Language as choice II – Linguistic duality"), Herdan tries to apply the Boolean principle of duality (see above) in the form of linguistic duality, as a basic principle of linguistic theory. Herdan's brief definition of
211 We will see that there are a larger number of candidates for the position of a "central" type of statistical distribution in quantitative linguistics, see e.g. Milička (2014).
212 The similarity of the two distributions is explicated by Mitzenmacher as follows: "Despite its finite moments, the lognormal distribution is extremely similar in shape to power law distributions, in the following sense: If X has a lognormal distribution, then in a log-log plot of the complementary cumulative distribution function or the density function, the behavior will appear to be nearly a straight line for a large portion of the body of the distribution. Indeed, if the variance of the corresponding normal distribution is large, the distribution may appear linear on a log-log plot for several orders of magnitude." (Mitzenmacher 2004, 229).
213 Another interesting link to later system-theoretical linguistics can be found in Herdan's reflections on indeterministic rules (compared to deterministic ones), which he again defines through physical analogies: "These considerations might, at first sight, suggest that as the number of degrees of freedom increases, the properties of a mechanical system should become more and more complicated, and that it should become increasingly difficult to find any regularity of behaviour. Fortunately this is not true. Just when there is a very large number of degrees of freedom, the system obeys laws of a very special kind. The investigation of these laws forms a special branch of physics, called statistical physics. In other words: chance, the ever-present alternative, comes to our rescue." (TLCC, 216).
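Mitzenmacher's point can be checked in a few lines. The following sketch (ours; the sample size and thresholds are arbitrary choices) draws lognormal samples with a large σ and fits a straight line to the log-log complementary CDF; the fit is close over several decades, which is exactly why a lognormal can masquerade as a power law in data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Lognormal with a large sigma, the case Mitzenmacher highlights.
x = np.sort(rng.lognormal(mean=0.0, sigma=3.0, size=200_000))
ccdf = 1.0 - np.arange(1, x.size + 1) / x.size  # empirical P(X > x)

# Fit a line in log-log space over the body of the distribution.
mask = (ccdf > 1e-4) & (ccdf < 0.5)
lx, ly = np.log(x[mask]), np.log(ccdf[mask])
slope, intercept = np.polyfit(lx, ly, 1)
residuals = ly - (slope * lx + intercept)
r2 = 1.0 - residuals.var() / ly.var()

assert slope < 0  # decreasing, as a power-law CCDF would be
assert r2 > 0.9   # nearly a straight line across the body
```

This is the practical side of the indistinguishability; the theoretical question of which form better expresses inter-level relations is not settled by any such fit.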
112
4 Formal explanation in linguistics
Boole’s law of duality214 leads us to doubt whether this law (principle) can play a successful role in the principle-based model of explanation. We understand it as a not trivial, but still too general definition of the fact that (not only) the speaker has the ability to categorize the Umwelt through binary oppositions. Although this fact is a fundamental characteristic of speakers (and, of course, it has already earned a number of analyses, e.g. in cognitive linguistics), it can hardly be understood as a principle allowing us to make explanations specifically linguistically. We believe that its application prototypically simply leads to its reformulation at the level of explanandum. With regard to the linguistic duality itself, Herdan states: (. . .) the expression of a thought in language implies the arbitrary selection of a basic opposition between words. What is fundamentally a conceptual opposition thus becomes a linguistic opposition on the vocabulary level. (TLCC, 329)
The situation improves not even when considering duality in the context of the probability theory. Herdan cites Boole to emphasize that logic and the probability theory are not in opposition, but form two pillars of Boole’s conceptualization of thought (TLCC, 330); nevertheless, we believe that Herdan “only” points out that the principle of duality in the sense of axiom Pð AÞ = 1 − Pð⁓AÞ is embedded in the foundations of probability theory. Herdan’s definition of duality in the context of mathematics, or in particular of projective geometry is much more interesting: “(. . .) all the propositions of plane projective geometry occur in dual pairs which are such that from either proposition of a particular pair another can be immediately inferred by interchanging the parts played by the words point and line” (TLCC, 330). Herdan believes that we can define duality in language exactly in the same way, and even that duality originates from language: We shall show that it is possible, and necessary, to conceive of duality in language as a genuine mathematical duality, or, as we might say, of linguistic duality as a quantitative concept. And more than that, we may find that it is in language that duality has its real and original home. (TLCC, 331)
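Both faces of the duality Herdan invokes can be verified mechanically. A minimal Python check (ours, purely illustrative): Boole's x(1 − x) = 0 singles out the two truth values that a binary opposition allows, and the probabilistic axiom P(A) = 1 − P(∼A) is the same two-way split stated for relative frequencies.

```python
import random

# Boole's law of duality: x(1 - x) = 0, i.e. x = x^2, holds exactly
# for the two truth values 0 and 1.
assert [x for x in range(-5, 6) if x * (1 - x) == 0] == [0, 1]

# Its probabilistic counterpart: the relative frequencies of an event
# and of its complement always sum to one.
random.seed(0)
trials = [random.random() < 0.3 for _ in range(100_000)]
p_a = sum(trials) / len(trials)
p_not_a = sum(1 for t in trials if not t) / len(trials)
assert abs((p_a + p_not_a) - 1.0) < 1e-12
```

That both checks are trivial is precisely the point made in the text: the principle restates the binary split rather than explaining anything beyond it.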
We believe that Herdan here thinks through the boundaries of structuralist thinking about language, in fact trying to identify structuralism with projective geometry.215 In any case, Herdan heads towards an interesting but somewhat confusing identification of a linguistic form (type) with a geometric point and a linear sequence of these forms (token) with a line (TLCC, 333). These bold reflections by Herdan open up a number of questions; first of all, where is Herdan actually led by these reflections? We think of them as the frontier of structuralist thinking about language, but this thinking traditionally does not have statistical descriptions (or explanations) in its inventory. Significantly, although Köhler refers to structuralism as an attempt to mathematize linguistics, he abandons efforts for such an abstract structuralist description (explanation) of the language system (see chapter 5.2). On the other hand, Herdan's reduction of projective geometry to linguistics could be interpreted as a cognitive-linguistic step – to base formal systems on basic dualities that originate from specific physical predispositions of speakers. However, it is not clear enough how this purely conceptual basis of Herdan's theory relates to his statistical models, which he continuously outlines (and which form fragments of system-theoretical linguistics). Could this identification of the isomorphism of a mathematical and a linguistic structure (Cf. TLCC, 333) be used in our effort to design a new model of non-causal explanation (e.g., a topological model of explanation) for system-theoretical linguistics? Or, on the contrary, is Herdan's approach a warning against such efforts and a reductio ad absurdum of a structuralist description (explanation)? We will reconsider these possibilities in chapter 5.6. As we have seen, Herdan's theory is built on a number of rules and (physical) analogies and is demonstrated by a number of models that are tested thoroughly (and exceptions are taken into account); yet, in summary, this theoretical system does not work completely uniformly, because the general principles (especially the principle of duality, but also FLC etc.) are not linguistically interpretable in a clear and unambiguous way. We cannot simply say that the whole theory is really based on them. This may be a somewhat unfair statement, because we also found de Saussure's principle of arbitrariness unfit for establishing a valid explanation. But there is a clear difference – both de Saussure and Hjelmslev established their basic principles within their theories – and the principle of arbitrariness and the principle of analysis clearly show how a systemic description can be built. One of the main reasons why we see Herdan as an important link between structuralist linguistics and Köhler's system-theoretical linguistics is his constant effort to interpret de Saussure's linguistic system quantitatively. He attempts to express the relationship between de Saussure's principle of arbitrariness and code efficiency, or perhaps even to connect it with the general economization principles that are so important for system-theoretical linguistics. Herdan follows de Saussure in breaking away from the idea of a close connection between language and the world, and offers the following view:

As languages develop, that is become more efficient coding systems, the speaker becomes more adept in renouncing any imitation or duplication of the manifold, and in using language as a system of oppositions sui generis. The more complex, though orderly, that system of oppositions, the higher the stage of linguistic development of both language and speaker. (. . .) The code will, in general, be the more efficient, the more it is a code and not a mere duplicate, of the message. (TLCC, 343)216

214 He formally defines it by the equation: x(1 − x) = 0, or x = x², because x is a defined characteristic that distinguishes a given entity from the remainder (1 − x) (TLCC, 328–329).
215 "The view which has come more and more into prominence during the last 100 years of geometrical thought is that there is no mechanism necessary for producing a geometrical duality, but that it may reside in the texture of space itself. Thus any statement about certain structural properties of space may be balanced by another statement, its dual, in which the elements, in terms of which the first statement described the property in question, interchange their rôles." (TLCC, 332).
However, the hope of using the principle of duality as an explanatory principle is not fulfilled. We find that the relationship between the principle of arbitrariness and Herdan's principle of duality is essentially trivial. The principle of oppositions and the associated arbitrariness defined by de Saussure is simply an example of the principle of duality that we find in Herdan. Herdan's principle of duality is, thus, even less obvious, even less definable beyond the enumeration of binary oppositions, and cannot serve to create a principle-based model of explanation. ✶✶✶ In the final chapter "Linguistic duality and parity" of the last section of the book ("Statistics for the Language Seminary"), Herdan tries to connect the conception of linguistic duality exactly to the conception of conservation laws in physics and their underlying symmetries (see subchapter 2.2.2 above) – he, de facto, proposes reduction of (not only) linguistics to fundamental physics. By their nature, Herdan's final reflections are again "only" analogies (they are discussed in detail in Appendix 11). If we take note of their normative role, however, what we are presented with is an attempt at a strictly physicalist solution to the linguistic dilemma. Given Herdan's belief in the importance of physical analogies, we believe that if he had conceived his "theory of language as choice and chance" today, then he would use the above-mentioned conceptual pair of physics (see Appendix 6) as an analogy for this fundamental opposition in linguistics: symmetry maintenance and symmetry breaking. We will see certain variants of this analogy in Luďek Hřebíček (see the Second Interlude).
216 In more detail, Herdan uses Born's conception of the "restless universe" to express the stability of the relationship between linguistic expression and content (see TLCC, 344–345).
5 Functional explanation in quantitative linguistics

How do I tell this? It's hard to think sometimes amid the clamour of argument. The politics of objects. All our conversations compete. YouTube videos might be conversing among themselves – their lists and references and cuts parts of their dialect. When we bounce from song to nonsense to meme, we might be eavesdropping on arguments between images. It might be none of it's for us at all, any more than it's for us when we sit on a stool and intrude on the interactions of angles of furniture, or when we see a washing line bend under the weight of the wind or a big cloud of starlings and act like we get to be pleased.
China Miéville, The Dusty Hut, in: Three Moments of an Explosion
In the previous parts (3 and 4) of the book, we reviewed the subject of linguistic explanation from the traditional point of view of the main milestones of linguistics. In this fifth part, however, we choose one of the specific areas of present linguistic approaches. There are two reasons for this: firstly, contemporary linguistics is typified by a pluralism of ways to examine language, and there is, as far as we know, no leading theory or school. Above all, however, we associate with this particular school – with quantitative linguistics and more specifically Köhler's system-theoretical (or synergetic) linguistics – the hope that the subject of linguistic explanation may be most explicitly grasped within it, and also that it is this school that succeeds in building a concept of scientific explanation that is not a "mere" linguistic description. The fifth part will be structured, therefore, in a slightly different manner compared to previous ones, and will be more extensive. Given the importance we associate with system-theoretical linguistics, we will also give a brief and necessarily condensed description of this linguistic theory. Reinhard Köhler first built it for the lexical subsystem (Köhler 1986), and gradually expanded it until the present, when the scope of the theory extended to the field of syntax (Köhler 2012).217 We will illustrate Köhler's theory with examples from the lexical and syntactic subsystems, so that we can define the basic concepts that appear in the functional
217 Our description and analysis of Köhler's system-theoretical linguistics will be based mainly on knowledge of his key texts: Zur linguistischen Synergetik: System und Dynamik der Lexik (1986) and Quantitative Syntax Analysis (2012), and also on the valuable discussions with Reinhard Köhler in which I had the opportunity to take part at the end of March 2018 and in April 2019 in Trier, and for which I thank him warmly.
https://doi.org/10.1515/9783110712759-005
explanation – linguistic variables, parameters of these variables and extra-systemic requirements against which the system maintains its stability (see chapter 5.2). In connection with the description of Köhler's theory, an important question arises as to the significance of statistical laws identified by quantitative linguists and used by Köhler within his own system-theoretical conception. We do not have in mind just Zipf's and Menzerath-Altmann's laws, which are the best known outside the field of quantitative linguistics, but a number of statistical distributions (regularities or tendencies) that have different degrees of universality, but which have a common link to power law distributions. As we will see, Köhler offers a hypothesis (the register hypothesis, see subchapter 5.2.2) that could shed light on the origin and role of power law distributions. In this respect, the comparison of Köhler's system-theoretical linguistics and Altmann's (and Wimmer's) unified approach will also be a subject of our analysis. Both conceptions have a common origin, and coincide in many ways, but above all from the perspective of the philosophy of science, we can recognize fundamental differences between them (see chapters 5.3 and 5.4). The central chapter of this part will be the analysis of Köhler's functional explanation (see chapter 5.5), which he built with the help of texts by Carl Gustav Hempel (especially Hempel 1965). We will try to show to what extent this model of explanation is consistent with Köhler's theory and on what explanatory principle this explanation is based – a structural axiom which states the self-organization of the linguistic system. We will also try to show to what extent this model takes into account the objections already expressed by Hempel in connection with the conception of functional analysis.
We will also investigate what other objections to the concept of functional explanation have emerged since the publication of Hempel's cardinal text (Hempel 1965). In this regard, we will build on the results of our previous analyses in Zámečník (2014), Benešová, Faltýnek, Zámečník (2015) and Benešová, Faltýnek, Zámečník (2018). Original and modified models of functional explanation will be confronted with the conditions for a well-established principle-based model of explanation, so that we can decide whether system-theoretical linguistics has overcome the dilemma of linguistic theories that we encountered, unresolved, several times in previous chapters. Finally, we will also try to propose alternatives to a functional explanation that would definitively resolve the linguistic dilemma and offer an undeniably valid principle-based model of explanation (see chapter 5.6). The introductory comments imply that this part of the book is – for perhaps good reasons – somewhat asymmetrical with respect to the rest of the book. However, asymmetry also applies to the very concept of functional explanation (see chapter 5.1), which has to be properly defined at the outset. This explanation is characterized by the fact that it is interpreted differently according to the
context of the linguistic trend in which it is used (whether in cognitive linguistics, neurolinguistics, etc.) and with regard to the historical period (e.g. function and teleology in the context of the Prague Linguistic Circle). This contrasts with the relatively stable and coherent links of the formal explanation specifically to generativism. On the other hand, just as we were interested in a formal explanation in the context of the work of one particular linguist (Chomsky), so the functional explanation will be of interest in a precisely defined form in the work of another particular linguist (Köhler). We, therefore, remain faithful to the search for a line of important linguistic theories – from Saussure's and Hjelmslev's structuralist approaches (and systemic descriptions/explanations), through Chomsky's generativism (and formal explanation/description) to Köhler's system-theoretical linguistics (and functional explanation). Moreover, part 5 takes us back to the beginning of our book and concludes our research meaningfully by returning to philosophy of science, which is explicitly reflected in system-theoretical linguistics. An important characteristic of the whole circle of quantitative linguists is that they build purposefully on a strict effort to define the concepts of linguistic theory, law and, of course, explanation by means of philosophy of science. At the same time, they did not perceive this relation to philosophy of science as a mere embellishment, but as a conditio sine qua non of well-conceived scientific hypotheses and theories. In quantitative linguistics we can still find the original confidence in the philosophical system expressed by Mario Bunge,218 who is the main source of inspiration for the founders of quantitative linguistics.219
5.1 General notion of functional explanation in linguistics

As we pointed out above, the concept of a functional explanation is significantly more pluralistic than a formal explanation. We can outline two main frameworks in which the functional explanation is discussed. It should be noted that these
218 Unfortunately, we are unable to evaluate the significance of this philosopher for the established system of Köhler's linguistic theory. This is, of course, a weakness, given the importance of his work in the quantitative-linguistic community. This task is still waiting for a future solution. See Appendix 12 for more details.
219 I am indebted to Peter Grzybek for a certain critical overview of this quantitative linguistic mainstream. Peter Grzybek drew my attention to the importance of Bunge, and also introduced me to Peter Meyer's critique of the methodological and philosophical basis of quantitative linguistics (see subchapter 5.3.2).
frameworks are not strictly split apart; but a detailed interpretation of their relationship goes beyond the scope of this work. The first framework is a traditional reflection on the concept of language function stemming from structuralism (going back to the Prague Linguistic Circle, Roman Jakobson, etc.), which is also related to the problem of teleological explanation. We must modestly admit that we do not know this area well enough, and so we will not comment on this broad and very important field.220 The other framework is cognitive linguistics (linked to psycholinguistics and neurolinguistics), in which the current functional explanation is mostly applied. In this context, the concept of function is applied to express the way in which language fulfills non-linguistic requirements – it is in this sense that a functional explanation in system-theoretical linguistics is also understood. In the first framework, the function is inherent to the language; in the second, the function is implemented in response to what is external to the language. It should be emphasized that we will, indeed, limit ourselves here only to the latter context and, thus, to the meaning associated with the term 'function' in the functional explanation. One guideline may be that formal explanation draws on the capacity of mathematical models (in favor of linguistics), while functional explanation draws primarily on the link between linguistics and biology.221 And as we will see (in chapter 5.5 and Appendix 19), the creation of models of functional explanations was motivated primarily by the inability to grasp biological (but also cybernetic) systems by means of causal explanations. At the same time, care has always been taken not to tarnish the functional explanation with the suspicion that it is a variant of teleological explanation (see again Kořenský 2014).
Above (see chapter 4.2), we already gave a brief and characteristic explication of the functional explanation as found in Newmeyer:

An explanation is functional if it derives properties of language structure from human attributes that are not specific to language. (Newmeyer 2016, 3)
In such a delimited form, we see the linguistic dilemma in a strict form – it is an explanation that relies on non-linguistic entities as an explanatory principle. Newmeyer specifies how we can construct a typology of these "human attributes"; in his analysis, we find three central characteristics of human cognitive needs, which he
Hjelmslev’s structuralism is, of course, yet another different framework for reflection on the concept of function in linguistics. Above, however, we pointed out the possibility of a teleological interpretation of function in Hjelmslev (see chapter 3.2). In general, most system-theoretical conceptions are characterized by considering the system as a self-regulating entity – a quasi-biological entity sui generis. Biological metaphors and conceptual borrowings are crucial for the area of applications of functional explanation.
describes as: structure-concept iconicity, information-flow based principles and processing efficiency (Newmeyer 2016, 13–15). The notion of structure-concept iconicity hides a certain isomorphism between expression (or "linguistic representation") and content (or "concept"). Newmeyer directly states: "(. . .) the form, length, complexity, or interrelationship of elements in a linguistic representation reflects the form, length, complexity, or interrelationship of elements of the concept that the representation encodes" (Newmeyer 2016, 13).222 The second characteristic is clearly connected with the communicative function of language – the function of communication is the transfer of information, which is reflected in the construction of grammar (Newmeyer 2016, 13).223 Finally, he combines processing efficiency with Zipf's law (Newmeyer 2016, 15).224 We will see that even Köhler's functional explanation could be understood in the spirit of the "human attributes" outlined by Newmeyer, and would, thus, fall under the general definition of a functional explanation. This would mean, however, that the explanatory linguistic dilemma also fully affects Köhler's conception. We believe, nevertheless, that Köhler's conception differs in the idea that language itself is an example of a self-organized system, that it is not just an appendix or even an epiphenomenon of biology. We can base this interpretation of Köhler's conception of language on his attempt to define the universal form of a linguistic system (as we shall see in chapter 5.2). In Köhler, as we will see, we can identify, above all, two of Newmeyer's mentioned characteristics: information-flow based principles and processing efficiency.
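Zipf's law, which Newmeyer ties to processing efficiency, is easy to exhibit on synthetic data. The sketch below is our toy, not one of Newmeyer's examples: it samples a text from a Zipfian distribution f(r) ∝ 1/r and checks the characteristic rank-frequency signature.

```python
import random
from collections import Counter

random.seed(1)
# A 500-word vocabulary sampled with Zipfian weights f(r) proportional to 1/r.
vocab = [f"w{r}" for r in range(1, 501)]
weights = [1.0 / r for r in range(1, 501)]
text = random.choices(vocab, weights=weights, k=100_000)

ranked = [count for _, count in Counter(text).most_common()]
# Signature of f(r) ~ 1/r: the rank-1 word is roughly 10x as frequent
# as the rank-10 word, and roughly 100x as frequent as the rank-100 word.
assert 8 < ranked[0] / ranked[9] < 13
assert 50 < ranked[0] / ranked[99] < 200
```

The regularity itself is descriptive; whether it is explained by processing efficiency (Newmeyer) or by a system of counterbalanced requirements (Köhler) is exactly what is at issue in this chapter.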
Köhler does not venture far into the area where he would have to deal with the relationship between linguistic representation and concept.225 In the realm of the lexicon, he in fact implicitly avoids this problem when he measures the quantity 'number of meanings' against the existing codified dictionary of the given language. The connection between the two characteristics that are represented is clearly evident especially in Köhler's quantitative analysis of syntax (see also chapter 5.2 below).226
222 Newmeyer even states that: "(. . .) non-iconic temporal ordering of clauses (. . .) is both harder to process and involves different neural activation than iconic ordering (. . .)." (Newmeyer 2016, 13).
223 Newmeyer provides a number of examples: "ergative clause patterns" (Newmeyer 2016, 13–14), "ordering of the major elements within a clause" (Newmeyer 2016, 14), etc.
224 He also presents an example of a parsing-based explanation: "(. . .) grammars try to reduce the recognition time (. . .)." (Newmeyer 2016, 15).
225 Although this task would probably await him when contemplating the semantic module of system-theoretical linguistics.
226 For example, Köhler's requirements: minimization of production, encoding and decoding effort, maximization of complexity, limitation of embedding depth, minimization of structural information, etc. (Köhler 2012, 179).
In the functional explanation, Newmeyer criticizes exactly the aspect which Köhler also identifies as problematic – Newmeyer points out that there is a counterexample to each functional explanation (Newmeyer 2016, 16).227 However, this is not necessarily a fatal problem, because a linguistic system can be defined precisely via counterbalances, mutually limiting requirements: individual requirements must, of course, come into conflict and eventually reach a certain balance – in this respect, Köhler's version of functional explanation surpasses Newmeyer's version (as we will see in more detail in chapter 5.2). However, there is a problem in the form of functional equivalents, i.e. the fact that the same effect can be achieved in the system in different (parallel) ways (for more details see chapter 5.5). Egré, on the other hand, conceives the multiple realizability of a solution in a given system as an advantage (Egré 2015, 459–460). According to Newmeyer, proponents of the formal explanation in linguistics are sympathetic to the functional explanation because they see it as a necessary part of a complex of factors affecting language realization in a particular case. According to Newmeyer, the functional explanation is, thus, present in Chomsky's following enumeration under point 3:

(. . .) three factors that enter into growth of language in the individual: 1. Genetic endowment, apparently nearly uniform for the species (. . .); 2. Experience, which leads to variation, within a fairly narrow range (. . .); and 3. Principles not specific to the faculty of language. (Chomsky 2005, 6; cited by: Newmeyer 2016, 18)
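The balancing of conflicting requirements can be illustrated with a deliberately crude toy model (ours, not Köhler's): let one requirement penalize long units (production effort) and another penalize short ones (ambiguity), and let the system settle where the combined cost is minimal.

```python
# Toy model of counterbalanced requirements (illustrative only).
# Production effort grows with unit length x; ambiguity cost shrinks
# with it. The "balance" is the length minimizing the total cost;
# for cost a*x + b/x the minimum sits at x* = sqrt(b/a).
def total_cost(x, a=1.0, b=9.0):
    return a * x + b / x

lengths = [x / 10 for x in range(1, 100)]  # grid 0.1 .. 9.9
best = min(lengths, key=total_cost)
assert abs(best - 3.0) < 1e-9              # sqrt(9/1) = 3
```

The counterexamples Newmeyer collects then appear not as refutations of single requirements but as evidence that no requirement acts alone.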
Newmeyer is convinced of the compatibility of formal and functional explanations (Newmeyer 2016, 20–24) – according to him, using both methods of explanation is necessary, as is the case in other scientific disciplines. However, the examples he uses to prove it (a formal description of the game of chess and a formal analysis of a biological organ, Newmeyer 2016, 20) are unconvincing: chess is not a science, and the formal analysis of an organ remains a formal description of the structure, not an explanation. His point then sounds trivial: it goes without saying that descriptions (and formal ones in particular) are irreplaceable in science, but that does not automatically make them explanations (see also the considerations in chapters 2.3 and 4.1). This, we believe, also applies to the linguistic example (Newmeyer 2016, 22–24) in which Newmeyer tries to prove that formal and functional explanations are so intertwined in linguistics that they cannot simply be separated from each other:
For example, Newmeyer states that “Communicative Dynamism” is opposed to “Communicative Task Urgency” (Newmeyer 2016, 16), and “information flow” is opposed to “mark cognitively similar entities the same way over long stretches of discourse” (Newmeyer 2016, 17), etc.
5.1 General notion of functional explanation in linguistics
What we have here in other words are functionally motivated formal patches for dysfunctional side effects of a formal principle that is functionally motivated. I suspect that this sort of complex interplay between formal and functional modes of explanation is the norm, rather than the exception, in syntactic theory. (Newmeyer 2016, 23–24)
Newmeyer (2016, 22–24), with reference to Lightfoot (1999), demonstrates the interconnection of formal and functional aspects of explanation in linguistics: “Lightfoot demonstrates that a formal constraint, whose ultimate explanation is most likely functional, can nevertheless have dysfunctional consequences, leading grammars to resort to formal means (. . .) to overcome these consequences” (Newmeyer 2016, 22). The stated formal constraint is the Principle of Lexical Government, which according to Lightfoot is explained functionally.228 This principle leads in some cases to dysfunctions, and different languages have different formal means of eliminating them (Newmeyer 2016, 23). Despite the interesting interplay of functional and formal explanations, as presented by Newmeyer, we believe that functional explanations play the decisive or leading role here. There is always a functionally motivated choice of formal means. The fact that the functional choice leads in some cases to dysfunctions (through a formal intermediate stage) only proves that at heart, it is a question of balancing the requirements to which a language system is subjected. ✶✶✶ We have stated that Haspelmath assumes that biology (neuroscience) will (probably) be decisive in solving the basic questions of linguistics (Haspelmath 2004, 555–558, see chapter 4.2).229 The linguistic description proves to be irrelevant, both for explanations based on universal grammar (UG) and for functional explanations: Linguistic explanation that appeals to the genetically fixed (“innate”) language-specific properties of the human cognitive system (UG) does not presuppose any kind of thorough, systemic description of human language (. . .) Linguistic explanation that appeals to the
“(. . .) the general condition of movement traces (. . .) may well be functionally motivated, possibly by parsing considerations. In parsing utterances, one needs to analyze the positions from which displaced elements have moved, traces. The UG condition discussed restricts traces to certain well-defined positions, and that presumably facilitates parsing.” (Lightfoot 1999, 249; cited according to Newmeyer 2016, 22–23). A certain reductionist view could be directly derived from Haspelmath’s conception. For each discipline (with the exception of physics), he distinguishes 5 levels (we present concretizations for the case of linguistics): phenomenological description (descriptive grammar), underlying system (“cognitive grammar”), basic building blocks (“cognitive code” = elements of UG), explanation of phenomenology and system (diachronic adaptation) and explanation of basic building blocks (biology). At the same time, biology is explained by biochemistry, and biochemistry is explained by physics (Cf. Haspelmath 2004, tab. 1, 557).
regularities of language use (“functional explanation”) does not presuppose a description that is intended to be cognitively real. (Haspelmath 2004, 554–555)
Whether it is the influence of an external environment (in the case of functional explanation) or an internal setting of the speaker (genetics or a cognitive code in UG), it is not a purely linguistic explanation in either case, as both appeal to an explanatory source outside linguistics. Although Haspelmath focuses critically on both formal (or generative) and functional linguists, formal linguists come in for the greater degree of criticism. Haspelmath directly claims that: “(. . .) UG cannot be discovered on the basis of linguistic description (. . .), and that it [UG] cannot serve as an explanans for observed universals of language structure” (Haspelmath 2004, 559). Haspelmath prioritizes functional linguistics and its explanation. Unlike Newmeyer, he endorses the opinion that phenomenological descriptions are sufficient for functional linguists; contrary to Newmeyer’s idea (see above), they do not require formal descriptions (cf. Haspelmath 2004, 568) or explanations (expressed in Newmeyer’s terminology): (. . .) for the purposes of discovering empirical universals (and explaining them in functional terms), it is sufficient to have phenomenological descriptions that are agnostic about what the speakers’ mental patterns are. We do not need “cognitive” or “generative” grammars that are “descriptively adequate”. “Observational adequacy” is sufficient. (. . .) Thus, most of the issues that have divided the different descriptive frameworks of formal linguistics and that have been at the center of attention for many linguists are simply irrelevant for functional explanations. (Haspelmath 2004, 569)230
Finally, Haspelmath concludes by stating that it is necessary to reverse the perspective on the importance of non-linguistic entities for the creation of linguistic explanations. If it was originally thought (and probably still is in generativism) that non-linguistic elements only supplement grammatical descriptions (or formal explanations), then today we should understand it the other way around: “What I am saying here is that external evidence is the only type of evidence that can give us some hints about how to choose between two different observationally adequate descriptions” (Haspelmath 2004, 574). Paul Egré approaches the controversy of grammatical versus functional explanations (as we have seen in chapter 4.2 above) in the most conciliatory manner and does not clearly lean toward either side of the controversy. Like Haspelmath, he believes that the solution can be provided by neuroscience, which eliminates the
And in the following subchapter (Haspelmath 2004, 569–572), he gives a number of examples as evidence.
causal gap preventing the explanation of the realization of grammatical rules in a speaker’s brain (Egré 2015, 460–461). In connection with the above-mentioned praise of the multiple realizability of a function in the system, the optimality-theoretic (OT) explanation, in which Egré suggests a connection with functional explanation, is worth noting (he refers to Prince, Smolensky 1997 and Haspelmath 1999): In optimality theory, derivation rules of a phrase‐structure grammar are replaced by a lexicographically ordered set of possibly conflicting constraints. These constraints can be violated, but the output of OT‐based explanations is to account for the selection of distinct forms (phonological or syntactic) based on those forms that minimize the number of violations. (Egré 2015, 459)
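The selection principle described in this quotation can be sketched in a few lines of code – a minimal illustration only: the constraint names (`no_coda`, `max_len`) and the candidate forms below are invented for the example, not drawn from Prince and Smolensky or from Egré. The key point is that a ranked list of violable constraints induces a lexicographic comparison of violation profiles, and the winning form is the one with the smallest profile.

```python
# Minimal sketch of optimality-theoretic (OT) candidate selection:
# constraints are ranked, each may be violated, and the winner is the
# candidate whose violation profile is lexicographically smallest.
# Constraint names and candidate forms are illustrative assumptions.

def ot_winner(candidates, ranked_constraints):
    """Return the candidate with the lexicographically smallest
    tuple of violation counts over the ranked constraints."""
    def profile(cand):
        return tuple(c(cand) for c in ranked_constraints)
    return min(candidates, key=profile)

# Toy constraints on syllabified strings ("." separates syllables):
# no_coda penalizes syllables not ending in a vowel; max_len prefers
# shorter forms (a crude economy constraint).
no_coda = lambda form: sum(
    1 for syl in form.split(".") if not syl.endswith(("a", "e", "i", "o", "u"))
)
max_len = lambda form: len(form)

# Higher-ranked constraint first: no_coda dominates max_len.
print(ot_winner(["pa.ta", "pat", "pa.tak"], [no_coda, max_len]))  # → "pa.ta"
```

Note that "pat" is shorter, but its coda violation on the higher-ranked constraint is decisive; violation counts on lower-ranked constraints only break ties.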
We will see that, in a more general frame, Reinhard Köhler applies elements of OT-based explanation in system-theoretical linguistics when he introduces his functional explanation based on Hempel’s functional analysis. ✶✶✶ The conciliatory Egré aptly bases the pluralistic view of explanation in linguistics on the dichotomy of structure-dependencies, which are of interest to generativists (from the perspective of grammatical explanation), and cognitive constraints, which are of interest to cognitive linguists (from the perspective of functional explanation). Newmeyer systematically builds an independent space for formal explanation, and even finds a D-N form of expression for it, while the radical Haspelmath considers the explanation of language system properties possible only through non-linguistic factors. We have not found a solution to the explanatory dilemma of linguistics in any of the three examined authors. Haspelmath fully embraces the out-of-linguistic “corner” of the dilemma, while both Newmeyer and Egré balance at both ends without being able to offer a valid model of linguistic explanation that would not reduce either to a description (for formal or grammatical explanations) or to out-of-linguistic explanations from the field of cognitive science (for OT-based explanations and for functional explanations in general). In the following chapters, we will strive to shift the perspective of functional explanation away from the mainstream conception (here Newmeyer, Haspelmath, Egré) so that the constraints are not only cognitive, and so that they also include some structural dependencies. This will then allow Köhler to design a universal model for any linguistic subsystem (as well as for a system of these subsystems) and to build a model of functional explanation that accords with the view of the philosophy of science.
5.2 System-theoretical linguistics
Reinhard Köhler developed a comprehensive conception of system-theoretical linguistics (STL), i.e. synergetic linguistics,231 in the 1980s in the book Zur linguistischen Synergetik: Struktur und Dynamik der Lexik (ZLS, 1986), following the tradition of quantitative analysis of language and linking to the linguistic approaches of such authors as Gustav Herdan or Gabriel Altmann. He created it in the context of the development of dynamical systems theory (and related theories and disciplines) and in contrast to the generativist mainstream.232 It was precisely the context of dynamical systems theory that enabled him to approach linguistic research explicitly through the means offered by statistics. Probabilistic models exceed the limits of determinism and, at the same time, enable us to penetrate areas traditionally beyond scientific grasp (see Kellert 1993, Smith 1998b, see Appendix 8). His approach complements the spectrum of language research and overcomes the extremes of traditional philology (noise) and formal linguistics (deterministic regularity).233 He explicitly criticizes the Chomskyan rejection of quantitative syntax (ZLS, note 5, 12); he repeats this criticism in a later book, Quantitative Syntax Analysis (QSA, 2012). In the latter book, he also defines the distinctiveness of a quantitative approach to the study of language (quantitative linguistics) by distinguishing between qualitative and quantitative mathematics. He connects the qualitative one with linguistic structuralism (including logic, algebra and set theory), and regards the quantitative one (including analysis, probability theory, statistics, etc.) as specific to quantitative-linguistic research – to the creation of system-theoretical linguistics (QSA, 13). The main non-linguistic inspiration and conceptual starting point for Köhler is synergetics, which was established and developed by Hermann Haken.234 However,
In quantitative linguistics, it is common to use the term “synergetic linguistics”, which is based on the original title of Köhler’s book (1986). On the other hand, outside of the quantitative-linguistic community, Köhler used the name system-theoretical linguistics, which we will also use (cf. Köhler 1987). He refers to the prominent philosopher of science Patrick Suppes (see Suppes 1970). Köhler applies the theory of dynamical systems, but rather follows its finer points – in the form of applications of fractal models of the language system and of the chaotic dynamics of language evolution – through the texts of Luděk Hřebíček and Jan Andres (see the Second Interlude and chapter 5.6). He refers in particular to Haken (1978). The choice of synergetics was a product of its time, reflecting the popularity of synergetics in the 1980s. Over time, Köhler freed himself from a purely synergetic point of view, deepening his connection to the theory of dynamical systems (see below). Luděk Hřebíček considers the concept of emergence in connection with the analysis of the text
Köhler’s use of Haken’s theory remains only at the level of analogy; in none of Köhler’s texts do we find a more detailed link between linguistic synergetics and Haken’s theory, which does not lack mathematical expression and has a comprehensive conceptual structure (see Appendix 17). Köhler just uses some conceptual borrowings, such as Haken’s term “Versklavung”. However, the starting point is explicit; STL uses a biological view of language as a self-regulating system – the initial axiom of the whole STL is precisely a structural axiom of self-regulation. Köhler’s original STL, introduced in ZLS, focuses on the lexical linguistic subsystem. Köhler defines the basic concept of the lexical unit,235 in which he identifies properties that are operationalizable and quantifiable in the form of basic quantities of the lexical system. In addition to mutually correlated quantities, the lexical system is defined by a set of parameters that enable him to express relations between quantities by means of linguistically interpretable mathematical relations. The whole system is, then, limited from outside by a set of requirements (for the conceptual development see chapter 5.5) which do not belong to the lexical system.236 Graphical algebra and linear operators serve Köhler as a means of formalizing the system. We will show their application below in connection with a selected example of a quantitative relationship among requirements, quantities and parameters.237 Köhler identifies six basic quantities that are sufficient to define a lexical system, namely: the number of phonemes,238 the lexicon size, the length of a lexical
(cf. Hřebíček 1999). Achim Stephan also includes Haken’s synergetics under the concept of emergence (cf. Stephan 1999, 232–238). More precisely, in addition he also conceptualizes a structural property (in an operationalized form, it becomes a quantity) (ZLS, 39), a lexical class, a structural class, and a lexical space (ZLS, 40). 236 The definition of the system: „Die Menge der lexikalischen Eigenschaften mit ihren Relationen untereinander bildet das lexikalische System. Alle Elemente sind so miteinander verknüpft, dass keine isolierte Teilmenge entsteht.“ (ZLS, 41), the state of the system: „Der Vektor der strukturellen Eigenschaften der lexikalischen Einheiten des Systems zu einem Zeitpunkt t wird Systemzustand zu t genannt.“ (ZLS, 42), the requirements: „Unter einem Systembedürfnis verstehen wir ein Element der Systemumgebung, welches in Relation zu anderen Elementen ausserhalb des Systems steht und Änderungen im System hervorrufen kann.“ (ZLS, 43). Köhler describes this in detail in ZLS, 43–49. We consider it important to state the formal definition of the structure: „Die Struktur eines Systems ist die Menge seiner untereinander verknüpften Operatoren.“ (ZLS, 46), and of the function: „Die Funktion eines Systems ist die Gesamtheit der Wirkungen aller Operatoren des Systems.“ (ZLS, 46). Unlike Herdan (cf. TLCC, 274–281), Köhler does not independently identify the number of distinctive features as a quantity. Köhler does not build a phonetic system, but there is nevertheless an attempt to quantify distinctive features (see e.g. Winkler 1982).
unit, the polylexy, the polytextuality, and the frequency of a lexical unit.239 Although Köhler has been interested in system dynamics, most of his conclusions concern a synchronic view of the (not only) lexical system.240 Therefore, the system variables the number of phonemes and the lexicon size play a special role because they are fixed from the synchronic point of view; they define a constant characteristic of the system, which it does not make sense to relate to individual units. The remaining system quantities represent structural properties (ZLS, 53), which can be quantified for any lexical unit. Among the mentioned quantities of the lexical system, polylexy and polytextuality deserve special attention and a more detailed explication. We have already found the vocabulary (i.e. lexicon) size, the number of phonemes,241 the length of the lexical unit242 and its frequency (albeit in an unsystematic and basic form) in Herdan’s quantitative theory of language. In addition to the group of quantities that defines the properties of the whole system (the number of phonemes and the lexicon size) and the group of traditional quantities (the length and frequency of a lexical unit), a third, specific group appears, which includes polylexy and polytextuality. The difference between the second and third group of lexical quantities is associated with an interesting dilemma for Köhler, which relates to the very definition of the lexical unit for the needs of quantitative verification of operationalized hypotheses. Köhler identifies the lexical unit with the word (ZLS, 88), which entails the need to distinguish between the word form and the lemma. While it is advantageous to work with word forms to quantify the length and frequency of a lexical
Köhler expresses them systematically in a logarithmic form so that the subsequent equations can be expressed in a linear way, precisely for the needs of graphical algebra (cf. ZLS, 50). R. Hammerl and J. Maj attempted to extend Köhler’s system towards a greater explication of the diachronic aspects of lexicon evolution in the late 1980s (see Hammerl, Maj 1989). The topic was followed by a series of texts in Glottometrika in 1990 (Köhler 1990a, Hammerl 1990, Köhler 1990b). An interesting debate focuses on the nature of the explanation model used by Köhler and on the question of the extent to which causality is articulated in system-theoretical linguistics (cf. Hammerl 1990, 23–24). There is also a related question of the nature of the dependencies that Köhler finds in the lexicon model, related to the notion of the dependency direction (die Abhängigkeitsrichtung) (Hammerl 1990, 23–24), see also subchapter 5.5.1. Here, however, there are also a number of questions related to the definition of the phoneme. Köhler initially chooses to identify phonemes with graphemes (i.e. picks specific language data – written texts). Köhler justifies this by the possibility of expressing the difference between a phoneme and a grapheme using a constant factor (ZLS, 50). The issue of unit segmentation is related to length (any length generally, not just lexical). Length can be measured in the number of graphemes (phonemes), syllables or morphemes, or it can also be a time interval of signal duration. Köhler uses the number of graphemes (ZLS, 53, 90).
unit, it is necessary to work with lemmas to quantify polylexy and polytextuality (ZLS, 88–89).243 Polylexy expresses the number of meanings244 that a given lexical unit can have (ZLS, 57); it is a variable difficult to quantify, and Köhler suggests, as the only possible procedure, determining it as the number of dictionary entries associated with a lexical unit (ZLS, 91–92). We saw above (see chapter 5.1) that structure-concept iconicity concerns the relation between linguistic expression and content. We believe that polylexy refers implicitly to the requirement of structure-concept iconicity. At the same time, Köhler explicitly associates the requirement of specification with polylexy, which expresses the need to remove the ambiguity associated with polylexy (ZLS, 60). Polytextuality expresses the number of contexts in which a lexical unit can be found in a text (ZLS, 63). In the context of its operationalization, Köhler states that it is given by the number of texts in which this unit appears at least once (ZLS, 92). Köhler associates it primarily with the requirements of context economy and context specificity. The requirement of context economy acts in order to avoid a disproportionately large lexicon; the requirement of context specificity, which ensures unambiguity of expression (ZLS, 63–64), opposes it. Based on both, a balance is created between the process of globalization and the process of centralization (ZLS, 64). Just as the system variables differ in kind, so there are also a number of non-systemic requirements245 of different kinds, which refer to diametrically different properties of the system environment (die Systemumgebung). For example, the pair of requirements for minimization of encoding effort (minE) and minimization of decoding effort (minD) (ZLS, 50–51)246 is decisive for defining the number of phonemes, while the coding requirement (Cod) (ZLS, 52) is decisive for defining the lexicon size.
We can easily identify that the requirements minE and minD express the cognitive limitations of speakers, while the requirement Cod expresses the complexity of the world to be coded in the lexical system (i.e. the Umwelt).
The failure to respect the need to differentiate between word forms and lemmas in different research contexts often leads to results that are significantly skewed or completely invalid. Köhler points out that this often applies to research based on quantitative linguistics aimed at applications in Data Science (private communication, March 2018). On the problem of distinguishing grammatical and semantic meaning, see ZLS, 57; QSA, 18. There was a terminological change between the terms “need” and “requirement” (see chapter 5.5). Both requirements (as well as the requirements for minimization of memory effort (minM) and minimization of production effort (minP), see ZLS, 20) are related to Zipf’s principle of least effort (ZLS, 50–51).
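Köhler’s operationalization of polytextuality – the number of texts in which a lexical unit appears at least once (ZLS, 92) – lends itself to a direct computational sketch. The function name and the toy corpus below are our own illustrative assumptions; a real study would of course lemmatize the texts first, per the word-form/lemma distinction just discussed.

```python
# Sketch of polytextuality as operationalized by Köhler: the number
# of texts in which a lexical unit (here, a lemma) occurs at least
# once (ZLS, 92). Corpus and lemmas are invented toy data.

def polytextuality(lemma, corpus):
    """corpus: iterable of texts, each given as a list of lemmas."""
    return sum(1 for text in corpus if lemma in set(text))

corpus = [
    ["the", "bank", "opened"],
    ["she", "sat", "on", "the", "bank"],
    ["the", "meeting", "opened"],
]
print(polytextuality("bank", corpus))    # → 2
print(polytextuality("opened", corpus))  # → 2
print(polytextuality("sat", corpus))     # → 1
```

Note that raw frequency and polytextuality diverge by construction: a unit occurring many times in a single text has high frequency but polytextuality 1.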
In summary, Köhler lists ten non-systemic requirements for the lexical system, many of which we have already mentioned above. Only a few requirements remain unmentioned: transmission security – associated with redundancy (Red), the application requirement (Usg) and the minimization of inventory (minI). Each requirement is responsible for a process happening in the lexical system (ZLS, 78) and results in a change (reduction or enlargement) of a lexical quantity; Köhler identifies twelve of these processes for the lexical system (ZLS, 78). A classification of the individual non-systemic requirements, at least as thorough as the one reached in the case of the lexical quantities, seems very desirable. We have already mentioned one division above – the group of cognitive requirements (minE, minD, minP, minM and minI, respectively) and the group of requirements reflecting the complexity of the Umwelt (Cod). If we enlist the classification offered above by Newmeyer (see chapter 5.1) for help – structure-concept iconicity (SCI), information-flow based principles (IFP) and processing efficiency (PE) – then we could join our group of cognitive requirements (minE, minD, minP, minM and minI) with the PE requirements; the redundancy requirement (Red), context economy (CE), context specificity (CS) and application (Usg) with the IFP requirements; and the specification (Spc) requirement with the SCI requirements. However, this division is provisional and indeterminate; especially the line between the requirements of IFP and PE is not entirely clear.247 Köhler suggests a further classification within the syntactic system, where he already identifies more than 20 requirements (QSA, 179), attached in Appendix 13 for clarity (Figure 8). Within the syntax system, he defines three superior requirements: the communication requirement (Com),248 efficiency of coding (OC), and minimization of memory effort (minM) (QSA, 179).
Groups divided in this way may roughly, but not precisely, correspond to our division into Umwelt requirements (Com requirements), information requirements (OC requirements) and cognitive requirements (minM requirements). The Com requirement includes the Cod and Usg requirements, the OC requirement includes the minP and complexity maximization (maxC) requirements, and the minM requirement includes by far the most requirements: early immediate constituents (EIC),249 minimization of structural information (minS), preference for right-branching (RB), limitation of embedding depth (LD) and minI (QSA, Figure 4.33, see Appendix 14).
E.g., a question arises as to whether minE and minD should not be included among the IFP requirements. The Com requirement is not listed separately in the table on page 179, but its two subordinate requirements are. The EIC requirement is not included in the table on page 179; it is explicated elsewhere (QSA, 142, 193–194).
In addition to the above hierarchy, Köhler offers another one, which divides the requirements into three groups: language-constituting requirements, language-forming requirements, and control-level requirements (QSA, 177–178). Among these groups, the third – control-level requirements – holds a special position, because it contains requirements that affect the system at a higher level than those of the other groups.250 Köhler assigns two mutually counteracting requirements to this last group – the adaptation requirement and the stability requirement (QSA, 178).251 Already in the 1980s, Köhler considered a general hierarchy of requirements in the sense that he introduced a higher-level quantity, “regulatory effectiveness” (die Steuerungseffektivität), responsible for the system’s self-regulation (see Figure 1). This quantity expresses the strength of the non-systemic requirements acting on the system (ZLS, 150). This quantity is itself regulated by the two above-mentioned252 higher-level requirements: adaptation (die Anpassungsfähigkeit) and stability (die Stabilität), which counteract each other. Stability prevents changes from happening too quickly – the system would then be hit by a disaster: it would cease to function as a communication tool (ZLS, 150). Köhler points out that the “regulatory effectiveness” applies to each lower-level control circuit (and its quantities) separately (ZLS, 151).253 To illustrate, we present Köhler’s scheme showing how to imagine the higher-level regulation of the whole linguistic system (see Figure 1). In addition to system quantities and non-systemic requirements, system parameters play an irreplaceable role in Köhler’s theory of the linguistic system. Without their introduction, and especially their linguistic interpretation, the relationship between system variables (and requirements) would remain at the level of mere relations of proportionality.
For example, on average, for a statistical set of a large number of lexical units, the length proportion of two
Köhler includes in the language-constituting requirements: the coding requirement, the application requirement, the specification requirement and the de-specification requirement. Among the language-forming requirements, he includes a whole group of economization requirements (QSA, 177–178). In the original text, from which Köhler took over this classification, we also find a detailed list of economization requirements, see Köhler (1990c, 182). We reveal the superior role of the last group when reading the original text by Köhler (1990c, 182). We see that the idea of the role of these requirements has changed somewhat since the 1980s – now Köhler speaks about control-level requirements, but no longer directly about “regulatory effectiveness”. Köhler also points out that the effect of the language system on the external environment has to be evaluated, i.e. that a reverse approach has to be chosen, in which the linguistic system becomes a source of requirements imposed on its environment (see ZLS, 152).
Figure 1: Regulatory Effectiveness (ZLS, 151).
lexical units will be proportional to the inverse ratio of the polylexies (numbers of meanings) of these two lexical units.254 We will clarify the importance of the parameters using a selected example of the dependence between the polylexy (m) and the length (L) of a lexical unit (in their logarithmized form, see Figure 2, see ZLS, 57–62). Köhler first expresses the dependence of log m on the minE and minD255 requirements through the parameters Q2 and Q1, whereby the minD requirement leads to a decrease in polylexy, while minE increases it. In addition to these requirements, polylexy also depends on the length of the lexical unit, log L, through the parameter T. It holds that the greater the length of the language unit, the smaller its polylexy. Köhler connects the parameter T to the effect of the specification requirement (Spc) (ZLS, 60–61). The resulting scheme in the graphical algebra, linking the requirements minE and minD and the length log L to the polylexy log m, is shown in Figure 2.
254 Symbolically expressed: L1/L2 ≈ m2/m1, where L1 and L2 are the lengths of two lexical units and m1 and m2 are the polylexies of these lexical units. Such quantifications were typical, for example, of the early development of modern physics, where Galileo identified, for example, the proportionality h1/h2 ≈ t1²/t2², where h1 and h2 are two different heights and t1 and t2 are the two free-fall times of bodies from these heights. Köhler’s linguistic interpretation of the parameters, which allows these proportions to be expressed by a parameterized equation (an issue that Gustav Herdan failed to solve), makes him, without exaggeration, the Newton of quantitative linguistics.
255 Köhler states the requirements minE and minD in a non-logarithmized form, but strictly speaking, they also have to be logarithmized with respect to the subsequent derivation (ZLS, Tables 2, 77).
Figure 2: Relation between Polylexy and Length (ZLS, 62).
The parameter T has a specific linguistic meaning: it expresses the degree of syntheticity of the language. Köhler comments on it as follows: The natural languages offer various tools (functional equivalents) to serve this system need [Spc]: specification can – just like modification – be achieved through syntactic means (. . .) or through morphological means such as composition and affixing (. . .). (. . .) That means: The dependency of polylexy on length is all the greater, the more a language makes use of morphological versus syntactic means to specify meaning. This typological property of a language is called syntheticity and is denoted by the letter T. [translation mine]256 (ZLS, 60)
Now we have defined all the elements that occur in the scheme of graphical algebra, and we can proceed to a symbolic expression of the dependence of polylexy on length, parameters and requirements. The equation will be as follows:257
256 „Die natürlichen Sprachen bieten zur Bedienung dieses Systembedürfnisses [Spz] verschiedene Hilfsmittel (funktionale Äquivalente) an: Spezifikation kann – ebenso wie Modifikation – durch syntaktische Mittel (. . .) oder durch morphologische Mittel wie Komposition und Affigierung (. . .) erreicht werden. (. . .) Das bedeutet: Die Abhängigkeit der Polylexie von der Länge ist um so stärker, je mehr eine Sprache von morphologischen gegenüber syntaktischen Mitteln zur Bedeutungsspezifikation Gebrauch macht. Diese typologische Eigenschaft einer Sprache soll Synthetizität heissen und mit dem Buchstaben T bezeichnet werden.“ 257 We present the equation as expressed in Köhler (ZLS, 61) with the addition of logarithmized forms of requirements minE and minD; however, strictly speaking, it should look in accordance
5 Functional explanation in quantitative linguistics
log m = Q2·log minE − Q1·log minD − T·log L

Due to the properties of logarithms, we can further modify the equation:

log m = log minE^{Q2} + log minD^{−Q1} + log L^{−T}

log m = log (minE^{Q2} · minD^{−Q1} · L^{−T})

And after removing the logarithms, we obtain a characteristic power law dependence:258

m = minE^{Q2} · minD^{−Q1} · L^{−T}

The resulting equation represents one of the highlights of Köhler’s theory and is a model example of system-theoretical linguistics. Köhler attributed to this relationship the status of a universal law (Köhler 2005, 764), and it is exceptional in that Köhler was able to interpret all of its parameters linguistically. It is often presented in the following form:

m = P·L^{−T}

where the parameter P expresses the average polysemy of words of length 1 (Köhler 2005, 764).259 The characteristic power law dependence has been tested successfully many times on a range of linguistic data (see Figure 3). This is one example of the successes that system-theoretical linguistics has achieved. Although this relation is unique in the degree to which all of its parameters are determined, Köhler also managed to check dozens of other system dependencies empirically. In the ZLS alone, he succeeded in verifying – with one exception – all fundamental systemic dependencies in an empirically satisfactory way (ZLS, 87–136). The exception was the dependence of length on frequency (ZLS, 110–112). However, this led Köhler to the discovery of lexicon oscillation (ZLS, 137–146). Köhler directly stated that the anomaly he identified could be a signal of the existence of another systemic variable (ZLS, 137) – thus, he was thinking in terms of philosophy of science (directly referring to Kuhn 1962; ZLS, note 74, 137). Another example of the great success achieved by the basis model260 of system-theoretical linguistics is the successful verification of the hypothetical relation between
with the scheme as follows: log m = Q2·log minE − Q1·log minD − T·log Spc·log L, and therefore include the requirement Spc. 258 If we used the relationship involving the term log Spc, then the resulting form of the equation would be: m = minE^{Q2} · minD^{−Q1} · L^{−T·log Spc}. 259 Of course, there remains the open issue of how to further interpret the parameters Q1 and Q2, which establish the parameter P = minE^{Q2} · minD^{−Q1}. 260 Köhler distinguishes the basis model (for the lexicon) from the whole theory (linguistic theory, die Sprachtheorie) (ZLS, 147–148).
Figure 3: Polylexy as a Function of Length (Köhler 2005, 770).
frequency and polytextuality for verbs, nouns and adjectives, and combinations of all of them, both for lemmas and word forms (ZLS, 126–135). This is the relation:

f = A·p^{B}·e^{C·p}

where f is frequency, p is polytextuality, A, B and C are parameters, and e is Euler’s number (ZLS, 127). In the 1990s, subsystems for phonetics, morphology and text theory were successfully built (e.g. Altmann 1993, Krott 1999, Krott 1996, Hřebíček 1993, Hřebíček 1995). In the first decade of the 21st century, the syntax subsystem was gradually elaborated, and Köhler presented it in a comprehensive form in the already mentioned book Quantitative Syntax Analysis (QSA, 2012). Nevertheless, some unresolved tasks remain. The linguistic interpretation of (not only lexical) system parameters has been performed only for a relatively limited number of cases. At the same time, as we stated above, without a clear identification of the parameters, it is not possible to declare mathematical relations of
linguistic variables to be laws.261 System-theoretical linguistics thus still awaits further completion.262 Apart from the linguistic identification of parameters, it also awaits the development of other subsystems.263 An ambiguity in Köhler’s theory is the relationship between parameters and non-systemic requirements. Parameters belong to the system, while requirements form the system environment. However, some cases are not clear. For example, in the above-mentioned case of the dependence of polylexy on the length of a lexical unit, Köhler does state in the diagram the requirement of specification (Spc), which is manifested through the parameter T expressing the degree of syntheticity of the language; but in the resulting equation expressing the relationship between polylexy and the length of the lexical unit, there is no Spc. Therefore, in the case of a full interpretation of all system parameters, would the complete omission of non-systemic requirements be possible? Would this mean that we create a fully functioning linguistic system that is no longer delimited by its exterior, and that therefore also brings an autonomous and purely linguistic mode of explanation? What is the real relationship between parameters and requirements? We can consider this by analogy with a physical system in which parameters play the role of constants or of other physical variables. If we take the example of the equation for the instantaneous velocity of free fall, v = g·t, the relation between velocity and time is parameterized by the free fall acceleration g, which is itself a variable (and in the case of a homogeneous gravitational field can be considered constant). The free fall acceleration, of course, belongs to the physical system, just as the degree of syntheticity belongs to the linguistic system. The physical system does not call for non-systemic requirements, unless we count the general principles of symmetries, or a completely general point-of-view invariance requirement.
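The behavior of a parameterized power law of the kind discussed above can be sketched with synthetic data; the following is an illustrative computation for m = P·L^{−T}, in which the values of P and T are invented for demonstration and are not Köhler’s empirical estimates:

```python
import numpy as np

# Illustrative sketch of the law m = P * L**(-T): generate noise-free
# synthetic (length, polylexy) pairs and recover P and T by linear
# regression in log-log space, using log m = log P - T * log L.
# P = 4.2 and T = 0.8 are invented values for demonstration only.
P_true, T_true = 4.2, 0.8
L = np.arange(1, 11)              # word lengths 1..10
m = P_true * L ** (-T_true)       # polylexy predicted by the law

slope, intercept = np.polyfit(np.log(L), np.log(m), 1)
T_est, P_est = -slope, np.exp(intercept)
print(round(P_est, 3), round(T_est, 3))  # recovers 4.2 and 0.8
```

With real linguistic data the fit would of course be noisy, and the estimated parameters would then be the quantities awaiting the linguistic interpretation discussed in the text.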
261 Köhler states: „Nach eingehenden typologischen Studien (. . .) ist es möglich, die für die Ausprägung des Parameters T verantwortlichen Faktoren zu finden. Hat man dann die genaue Beziehung zwischen ihnen und T bestimmt, so lässt sich die Form der Abhängigkeit m(L) daraus gesetzmässig ableiten.“ [“After detailed typological studies (. . .) it is possible to find the factors responsible for the value of the parameter T. Once the exact relationship between them and T has been determined, the form of the dependence m(L) can be derived from it in a law-like manner.”] (ZLS, note 31, 80). 262 Also in QSA, Köhler returns to the need to interpret parameters (QSA, 76), considers the interdependence of parameters (e.g. QSA, 76), and shows a rare case (QSA, 80) where the parameter is determined directly according to the theoretical model. 263 Since the beginning of modern quantitative linguistics, which can be roughly dated to the turn of the 1970s and 1980s, there has been a gradual quantitative examination of individual linguistic levels, which, of course, did not always adhere fully to the system-theoretical approach. In this respect, the approach of Udo Strauss is interesting. He built a whole phonetics on the basis of a literal and analogous use of physical quantities. His work is also characterized by a strong grounding in the philosophy of science of Mario Bunge and Ernst Nagel (cf. Strauss 1980).
Does this mean that the linguistic system is fundamentally different – necessarily tied to non-systemic requirements? Or could we free the linguistic system from this dependence on non-systemic requirements (or “artfully hide” their role, as in the case of physics) in the case of a linguistic interpretation of all parameters? Would the linguistic system thereby become completely autonomous, and would it be possible to identify a principle-based model of explanation and eliminate the basic dilemma of linguistic theories? In QSA, which represents the pinnacle of system-theoretical linguistics, Köhler shows restraint and argues that system-theoretical linguistics has not yet become mainstream: (. . .) the study of the functional dependencies and of the interrelations among syntactic units and properties, as well as between these and units and properties of other linguistic levels and extra-linguistic factors is still in its infancy. Although functional linguistics, typology, and language universals research have gathered enormous quantities of observations, plausible interpretations, and empirical generalizations, a break-through has not yet been achieved. (QSA, 3)
In contrast to the case of the lexical subsystem, Köhler identifies almost twenty variables of the syntactic subsystem,264 and we have already stated that he also significantly expands the set of requirements and, of course, the parameters as well. However, Köhler proceeds analogously to the building of the lexical system: he identifies variables, performs their operationalization, and gradually compiles the dependencies of the control circuits of the syntactic system. This system is significantly more complex than the lexical system (see Appendix 14, Figure 9), not only due to the number of its elements, but also due to the diversification of the types of relations between elements – Köhler newly distinguishes causal and statistical dependencies, which is related to the distinction between functional and distribution laws (see subchapter 5.2.1) and to the hierarchy of relationships between requirements (see above). As an example of a syntactic control circuit, we will mention the relation between the complexity and frequency of a syntactic unit, which also involves the length of a syntactic unit and the size of the inventories of syntactic constructions and categories. In the diagram (see Figure 4), we can also distinguish between functional and distributional dependencies (solid and dashed arrows) and the hierarchized relationships of requirements. Köhler notes on this control circuit: “As in the case of lexical units, minP affects the relation between frequency and length, in that maximal economisation is realised when the most frequent constructions are the shortest ones (. . .)” (QSA, 188).
264 Köhler lists a total of 18 (QSA, 28–29).
Figure 4: The Interrelation of Complexity/Length and Frequency (QSA, 189).265
Another specific and complicating aspect of the syntactic subsystem is the existence of interdependent relations (Köhler calls this interrelation). In our example, we can identify this type of relationship between the frequency and the complexity of a syntactic unit (via the parameter H). The much more complicated nature of the syntactic subsystem is also evidenced by the fact that Köhler, with a few exceptions, does not attempt to express these relations functionally, and preserves the schemes of graphical algebra. System-theoretical linguistics, as far as we have been able to present it, represents the results of thousands of observations and ongoing ad hoc experiments with linguistic data, and of thousands of tentative hypotheses formulated by Köhler and his contemporaries and successors. Following them, hundreds of controlled
265 Credit Line: Köhler, Reinhard, Quantitative Syntax Analysis, Berlin: De Gruyter Mouton, 2012, p. 189, fig. 4.22.
and statistically significant experiments with texts have been performed. On the basis of these, dozens of functional and distributional relations have been refuted or corroborated. The corroborated relations can be described as laws of quantitative linguistics. It is the nature of these laws that will be examined in the following subchapter.
5.2.1 What is a linguistic law?

If something fundamentally distinguishes Köhler’s system-theoretical conception of language from other linguistic approaches – apart from an explicitly systematic, quantitative method of working with linguistic data, their statistical processing and the experiments performed – it is his characteristic attempt to identify linguistic laws that can form the basis of linguistic explanations. As we will see below, this effort is based on the belief that no linguistic conception has yet met the demands set by philosophy of science.266 Hempel’s D-N model of scientific explanation is the starting point for Köhler, in which scientific law (as we saw in chapter 2.1) plays an irreplaceable role. We have already emphasized that philosophy of science (in the traditional approach, developed by Mario Bunge) winds through the whole conception of quantitative research on language.267 Quantitative linguistics – as a method – together with system-theoretical linguistics – as the first real linguistic theory – are to achieve the exactness of a natural science. In QSA, Köhler points out that quantitative linguistics does not examine different objects, nor does it have different epistemological starting points; it differs from other linguistic approaches in its ontological point of view (QSA, 9): “(. . .) whether we consider a language as a set of sentences with their structures assigned to them, or we see it as a system which is subject to evolutionary processes in analogy to biological organisms (. . .)” (QSA, 9). This definition is reminiscent of the distinction between essentialists and externalists, as these positions are defined in philosophy of linguistics (see chapter 4.2). Köhler thus, for understandable reasons, defends himself against generativism when it comes to syntax. An explanation is not possible without a theory, and a theory needs laws – which are explicitly not: rules, patterns, typologies, classifications, or axiomatic systems (QSA, 3).
Köhler is, therefore, clear that generativism does not have laws:
266 “Thus, there is not yet any elaborated linguistic theory in the sense of the philosophy of science.” (QSA, 21). 267 Probably most clearly documented by Peter Grzybek (2006).
We will put some emphasis on the fact that only laws and systems of laws, i.e. theories provide means to explain and to predict. (. . .) Chomsky has always been aware of this fact and, consequently, avoided claiming that his approach would be able to explain anything. Instead, he classified the grammars which are possible within this approach into two kinds: those with descriptive adequacy, and those with explanatory adequacy (without claiming that the latter ones can explain anything). (QSA, 137)
Köhler opposes any attempt at an essentialist interpretation of language structures with a consistent conventionalism. He points out that direct observation of language is impossible and that introspection is unreliable (opposing the idea of the native speaker); therefore, only “linguistic behavior” remains (QSA, 14), through which language can be analyzed in an externalized form. Köhler directly states that langue is an abstraction from parole (QSA, 14). According to Köhler, the definition of a linguistic concept is always a matter of convention; definitions may be useful, but they may not be true (QSA, 27).268 Properties are not inherent in objects, but are attributes born of theory (QSA, 27). Many features of Köhler’s conception, even of the original lexical system, bear the hallmarks of the semantic view of theories, which gained ground in philosophy of science from the 1960s onward and was mainstream there in the 1980s (see chapter 2.1).269 Köhler’s choice of a functional variant of the D-N model of explanation (mostly affiliated with the syntactic conception of theories) does not contradict this – above, following Halvorson (2016) (see chapter 2.3), we showed that the two views on the structure of theories are reconcilable. System-theoretical linguistics can be understood as an example of a theory in the sense of the semantic view.270 The perspective of philosophy of science allows Köhler to distinguish the principles and rules found in other linguistic approaches from the laws which he seeks to define in system-theoretical linguistics. He understands the search for language universals and typologies as an inductive attempt to arrive at a general rule (or principle) and considers them unreliable for classical reasons – reminiscent of the problem of induction (QSA, 20) – while Hempel’s D-N model of explanation leads from theories to the deduction of singular statements that can be
268 At the expense of objectivism, Köhler directly writes: “There are, in fact, researchers who believe ‘new linguistic units’ can be found by means of intensive corpus studies. It should be clear, however, that this is a fundamental confusion between model and reality. Any unit is conventional, not only meter, kilogram and gallon but also our linguistic units such as phoneme, syllable etc.” (QSA, note 1, 27). 269 Köhler does not refer to the semantic view of theories of F. Suppe and P. Suppes. He only mentions Suppes’ texts (see note 232). 270 Köhler’s conception of linguistic theory is reminiscent of the Structural Semantic view of Theories advocated by Mauricio Suárez (cf. Suárez, Pero 2018).
falsified or corroborated (QSA, 20–21). It is precisely the clear definition of the theory that attracts Köhler to the classical philosophy of science: (. . .) in linguistics, the term “theory” has lost its original meaning. It has become common to refer with it arbitrarily to various kinds of objects: to descriptive approaches (e.g. phoneme “theory”, individual grammar “theories”), to individual concepts or to a collection of concepts (e.g. Bühler’s language “theory”), to formalisms (“theory” in analogy to axiomatic systems such as set theory in mathematics), to definitions (e.g. speech act “theory”), to conventions (X-Bar “theory”) etc. (QSA, 21)
For Köhler, individual linguistic laws are found by formulating hypotheses about relations in a linguistic system (the system of subsystems), which is limited by external requirements. Formulated hypotheses – such as the one above about the relationship between polylexy and the length of a lexical unit – are subjected to a confirmation process; and, if they are not falsified, they acquire the status of linguistic laws. Köhler distinguishes a total of three types of linguistic laws,271 within which there is no hierarchy in terms of dependency or importance: (1) functional laws (among them the relation between length and polysemy and Menzerath’s law), (2) distribution laws (such as Zipf’s law)272 and (3) developmental laws (such as Piotrowski’s law), which model the dynamics of a linguistic property over time. (QSA, 24)
System-theoretical linguistics grasps mainly functional and distribution laws. However, these laws do not play their explanatory role independently and in isolation. Just as in physics Newton’s laws of motion refer to deeper principles of symmetry, so in linguistics Köhler seeks a common underlying principle of which the individual laws would be instances. As we will see, what is found in Köhler’s functional explanation is not individual laws (e.g. Zipf’s law,273 but in
271 For reasons of chronology and out of respect for tradition, it is worth recalling that Gabriel Altmann had been working systematically on the search for linguistic laws since the beginning of the modern period of quantitative linguistics (see also chapter 5.3 below). In a series of papers, he first introduced the modern form of Menzerath’s law (Altmann 1980); subsequently, with V. Burdinski, he introduced the Law of Word Repetitions in Text-Blocks (Altmann, Burdinski 1982); with H. v. Buttlar, W. Rott and U. Strauss he explicated the Law of Change in Language (Altmann, von Buttlar, Rott, Strauss 1983); and with B. Kind he also tried to formulate a “semantic law,” which he refers to as „Martins Gesetz der Abstraktionsebenen“ (“Martin’s law of levels of abstraction”) (Altmann, Kind 1983). 272 Recall that we have already seen the difference between functional and distributional dependencies in the graph-algebraic scheme of the syntax subsystem. 273 The examination of the validity, and especially the application, of various forms of Zipf’s law and related laws (the Zipf-Mandelbrot law, the Zipf-Alekseev law, etc.) has received much attention and is still a subject of fruitful scientific debate. Of all of these, we will mention one project that focuses on finding the origins of Zipf’s law and on which we will comment later (see Torre, Luque, Lacasa, Kello, Hernández-Fernández 2019; see chapter 5.4).
principle not Menzerath-Altmann’s law either), but the structural axiom that defines the self-organization of the system (see chapter 5.5). The concept of self-organization, which Köhler most intensively identified with synergetics, has been a subject of interest and analysis. It seems possible to develop the “synergetic analogy” of the linguistic system into a more concrete form based on the theory of dynamical systems. Köhler himself systematically sought this development in a number of texts, which primarily analyzed the register hypothesis as an explication of the origin of Menzerath-Altmann’s law (MAL) (see subchapter 5.2.2 below). The most extensive attempt at such a development is the Dynamische Sprachtheorie of Wolfgang Wildgen and Laurent Mottron (cf. Wildgen, Mottron 1987; for more details, see Appendix 15). However, Menzerath-Altmann’s law plays a special role not only in system-theoretical linguistics, but in the reasoning of most quantitative linguists generally.274 One reason (1) is its characteristic power law nature, which is typical of a number of linguistic laws (see above the relationship between polylexy and the length of a lexical unit). It could, therefore, serve as a kind of general analytical tool – such a view would also bring it closer to Altmann’s and Wimmer’s Unified Approach (see chapter 5.3 below). The second reason (2) is “deeper”: MAL is a general relation expressing a connection across all linguistic plans, across the entire linguistic hierarchy – basically from the phonetic level, through the morphological and lexical, to the syntactic and even the supra-sentence level.275 Menzerath-Altmann’s law is usually considered in two forms. In the simplified one:276

y = A·x^{−b}

where y is the average length of the constituent, x is the average length of the construct, and A, b are positive parameters.
274 Efforts to generalize the use of MAL outside linguistics appeared very early (cf. Altmann, Schwibbe (eds.) 1989). 275 Luděk Hřebíček investigated supra-sentence structures in particular (cf. Hřebíček 1995, Hřebíček 1993). 276 Altmann paid great attention to the implementation of Menzerath’s law. The central text here is Prolegomena to Menzerath’s Law (Altmann 1980). Among other issues, Altmann presents there an important modification of Menzerath’s original formulation „Je größer das Ganze, desto kleiner die Teile.“ (“The greater the whole, the smaller the parts.”) (Menzerath 1954, 101), specifically the thesis: “The length of the components is a function of the length of language constructs.” (Altmann 1980, 3), which means that the function is a solution of the differential equation dy/y = (b/x − c) dx (Altmann 1980, 3). Altmann also lists three possible solutions; in addition to the two given in our text, he also envisages a solution for b = 0, in which case y = A·e^{−cx} (Altmann 1980, 3).
In its full form, the MAL is formulated as follows:

y = A·x^{−b}·e^{−cx}

where there is also a positive parameter c. In the construct-constituent relationship, there may be adjacent pairs of linguistic levels. Thus, we can relate the length of a supra-sentence unit (on hrebs, see Ziegler 2005) to the length of a sentence, the length of a sentence to the length of a lexical unit, and the length of a lexical unit to the length of morphemes (and perhaps even to higher or lower levels, respectively; see, e.g., Torre et al. 2019, Strauss 1980). Assuming we have a clear and unified way of segmentation,277 it seems that this could, indeed, be the universal structural principle that system-theoretical linguistics seeks.278 The second reason (2) is closely followed by the third reason (3), which is motivated by the idea (not only Köhler’s) that MAL could be incorporated into the functional explanation of system-theoretical linguistics. Köhler considers that MAL could be explanatory in terms of the D-N model of explanation, but he has not succeeded in integrating it into a functional explanation in a satisfactory manner.279 One possibility would thus be to replace the structural axiom with Menzerath-Altmann’s law. The structural axiom suffers from a considerable degree of vagueness and seems to depend significantly on a metaphorical use of Haken’s synergetics.280 We will return to this point in detail in the next subchapter (5.2.2) and in the following chapters (mainly in chapter 5.5).
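A minimal numerical sketch of the full MAL form (the parameter values below are invented for illustration only):

```python
import numpy as np

# The full MAL form y = A * x**(-b) * exp(-c*x): the mean constituent
# length y shrinks as the size x of the construct grows ("the greater
# the whole, the smaller the parts"). A, b, c are invented values.
A, b, c = 3.0, 0.3, 0.05
x = np.arange(1, 11)                 # construct size, 1..10 constituents
y = A * x ** (-b) * np.exp(-c * x)   # mean constituent length

print(np.round(y, 3))
# strictly decreasing for positive A, b and c:
assert all(y[i] > y[i + 1] for i in range(len(y) - 1))
```

Since dy/dx has the sign of (−b/x − c), which is negative for positive b and c, the decrease is guaranteed for any positive parameter values, not just these.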
5.2.2 The register hypothesis and the principle of invariance of the linguistic system structure

Köhler attempted to support this important point about Menzerath-Altmann’s law as early as the 1980s by creating a model of human language processing, the so-called “register hypothesis”,281 from which it should have been possible to derive MAL (QSA, 84). Köhler’s register hypothesis282 is framed by two assumptions:

277 However, this is far from a simple problem; see, e.g., Benešová, Faltýnek, Zámečník (2015). 278 Köhler also describes MAL as a “set of laws-schemes” (personal consultation, April 2019, Trier). 279 During a personal consultation in April 2019, Trier. 280 Köhler points out (personal consultation, April 2019, Trier) that a 3D model of synergetic linguistics could take MAL in place of the structural axiom. 281 Köhler already introduces the “register hypothesis” in Köhler (1984, 177–183). The text Köhler (1989, 108–112) presents Köhler’s derivation of MAL from the register hypothesis. 282 An interesting alternative to this hypothesis is presented in Milička (2014). Milička promises here: “(. . .) to establish a new formula for Menzerath’s Law, supported by a linguistic explanation, with easy to interpret parameters and fitting the data at least as successfully as the previous
(1) There is a special “register” (. . .) for language processing, which has to serve two requirements: (a) it must store, on each level, the components of a linguistic construction under analysis until its processing has been completed, and, at the same time, (b) it must hold the result of the analysis – the structural information about the connections among the components, (. . .). This register has a limited and more or less fixed capacity (. . .). (2) The more components the construction is composed of, the more structural information must be stored. However, the resulting increase in structural information is not proportional to the number of components, because there are combinatorial restrictions on each level (phonotactics, morphotactics, syntax, lexo- and semotactics), and because the number of possible relations and types of relations decreases with the number of already realized connections. (QSA, 84)
We believe that the assumptions defined in this way lead us back to Köhler’s effort, expressed again in the book on the synergetic model of the lexicon (see above), to build the whole system of subsystems of synergetic linguistics on higher-level regulatory mechanisms. We have seen that the higher-level system variable of the “regulatory effectiveness” (die Steuerungseffektivität) of the system is defined by two complementary higher-level requirements – Adaptation (die Anpassungsfähigkeit) and Stability (die Stabilität). We think it is possible to claim that the structural axiom of Köhler’s functional explanation (see chapter 5.5) is actually an expression of this higher-level quantity of the language system. The above-mentioned assumptions (1) and (2) lead Köhler to a conclusion that corresponds to the manifestation of Menzerath-Altmann’s law: (. . .) the memory space which is left in the register for the components of a construct depends on the number of the components, which means that there is, on each level, an upper limit to the length of constructs, and that with increasing structural information there is less space for the components, which must, in turn, get shorter. (QSA, 85)
The (new) structural information increases as the number of constituents in the construct increases (with each additional constituent), which reduces the register capacity left for structural information (see Figure 5). Köhler therefore concludes that the increase K′ in register capacity required to process structural information is inversely proportional to the number x of constituents in the construct:
one.” (Milička 2014, 86). Milička offers the formula y = a + bx, for Menzerath’s law, the advantage of which, in addition to agreement with test results, is that it allows to interpret the parameters a and b linguistically, while being based on a modification of Köhler’s register hypothesis. Milička states: “Unlike Köhler’s paper [Köhler 1989], we will assume that the amount of the structure information is independent of the number of constituents in the construct. The assumption enables us to interpret formula [a + bx] by claiming that a is the average length of a constituent that contains the plain information and b is the average length of the structure information.” (Milička 2014, 89).
Figure 5: Language Processing Register (QSA, 85).
K′ = B/x

Köhler identifies the increase K′ in register capacity required to process structural information with the decrease in capacity available for the constituents themselves. He can, therefore, write:

y′/y = B/x

where y corresponds to the length of a constituent. In this relation, he recognizes Altmann’s relation used to derive Menzerath’s law. The solution of this differential equation is the simple form of MAL:

y = A·x^{b}, for b < 0

Köhler also expresses a relationship between the parameters A and b and the size of the register R:

A + k·b ≤ R

where k represents a proportionality coefficient (Köhler 1989, 110–111).283 The idea of cognitive limits, whose effect can be formally expressed by MAL, has remained a hypothesis since the 1980s. We believe it is appropriate to place the register hypothesis in the context of cognitive linguistics and its modes of functional explanation. Surely, we could subsume the whole register hypothesis under a group of cognitive requirements, as discussed by Newmeyer (see chapter
283 The values of A and b are language-specific. A corresponds to the average length of a construct containing one constituent, and b „(. . .) ist ein Mass für den Umfang an Strukturinformation, der durchschnittlich für ein einkomponentiges Konstrukt erforderlich ist.“ [“(. . .) is a measure of the amount of structural information required on average for a one-component construct.”] (Köhler 1989, 110). Köhler also presents an extended derivation that leads to the complete MAL formula, which includes the parameter c (Köhler 1989, 111).
5.1). Above all, there is a connection with Newmeyer’s concept of “information-flow based principles”, but also with “processing efficiency” (Newmeyer 2016, 13–15). We have already pointed out the connection between the requirements defined by Newmeyer and by Köhler, but with the register hypothesis this connection acquires a deeper meaning. It becomes clear that Köhler seeks to express mathematically (in Newmeyer’s terms) the effect of cognitive requirements on speech production. System-theoretical linguistics probably reaches its limit here – on the border with cognitive linguistics and neurolinguistics. However, Köhler believes that at the syntactic level, with syntactically annotated corpora, it is possible to test the register hypothesis further.284 Suppose the hypothesis is corroborated: what kind of importance can we attribute to Menzerath-Altmann’s law? Is it possible to build a principle-based model of explanation on its platform? In the following chapters (mainly chapter 5.5), we will describe the difficulties associated with the construction of a functional explanation based on the D-N model of explanation. However, we stated above (see chapter 2.3) that the construction of a principle-based model of explanation is less demanding due to the nature of the explanatory basis of the explanation model. Of course, Menzerath-Altmann’s law is not a causal law (on causality in linguistics, see chapter 5.5 below). Considerations of system regulation lead to thinking of it as a principle of economization285 (or optimization), where economization would refer to the need to achieve the desired effect in a system with limited resources (with the limited space of the register) and optimization would correspond to choosing the optimal path towards this effect. This possibility is promising, but it runs up against Köhler’s conception of non-systemic requirements, in which he reserved only one of the three basic groups for economization requirements (see above).
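The step from the register-capacity relation to the power law can also be checked symbolically; the following is a sketch using sympy (the symbol names are ours, not Köhler’s):

```python
import sympy as sp

x = sp.symbols('x', positive=True)
B, A = sp.symbols('B A')
y = sp.Function('y')

# The register-hypothesis relation y'/y = B/x, rewritten as y' = B*y/x,
# is a separable first-order ODE.
sol = sp.dsolve(sp.Eq(y(x).diff(x), B * y(x) / x), y(x))
print(sol)  # general solution: a power law in x with exponent B

# Verify that the power-law form y = A * x**B satisfies y'/y = B/x,
# i.e. the simple form of MAL (with B < 0 in Köhler's interpretation):
candidate = A * x ** B
residual = sp.simplify(candidate.diff(x) / candidate - B / x)
assert residual == 0
```

The verification step is independent of how sympy happens to format the general solution: it confirms directly that the power law satisfies the differential relation from which Köhler derives MAL.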
In general, we could define the economization (and optimization) principle as a tendency of the system to adapt as effectively as possible to external conditions (requirements), i.e. as an ability to ensure the system's functioning (in optimal form). This makes explicit the reason why Köhler chooses the functional form of explanation as a general model. Köhler naturally understands effective adaptability as a manifestation of self-regulation (and self-organization) of the linguistic system, the fulfillment of the necessary functions of the system. Therefore, in the end, Köhler himself chooses the more general principle of the self-regulation of the "system of systems", i.e. the structural axiom, as the basic axiom of his functional explanation.
284 Köhler, personal consultation, April 2019, Trier.
285 Köhler points out that economization was associated with this power law by Menzerath himself, as the "rule of economy" (QSA, 147).
At the time when Köhler's system-theoretical linguistics was emerging, this choice must have seemed completely natural – synergetics, like many other non-reductive approaches, was very popular, and self-organization was a natural conditio sine qua non of system considerations. Unfortunately, subsequent analysis showed that self-organization is not entirely easy to grasp conceptually. We notice this problem in the analysis of Köhler's functional explanation (see chapter 5.5). Perhaps it would seem natural to move from a biological analogy (conceptual borrowing) – from language as an organism (living system)286 – to the outright reduction of linguistic explanations to neuroscientific explanations (as we saw with Haspelmath's suggestion above in chapter 5.1). This would probably end the debate on the explanatory dilemma of linguistics – we would be content with non-linguistic explanations based on neuroscientific foundations, on the one hand, and with linguistic descriptions of language and speech, on the other. Apart from the above-mentioned options – (1) a functional explanation based on the "economization" principle (or, as we will see, on the structural axiom) and (2) a neuroscientific explanation based on non-linguistic principles – there remains the option (3) of considering MAL as a principle of invariance of the linguistic system structure. We believe that the structural axiom of Köhler's functional explanation (see chapter 5.5) is an expression of the higher-level variable of the "regulatory effectiveness" (die Steuerungseffektivität) of the language system. In this option (3), the structural axiom is not conceived as an explanatory principle, but as a universal attribute of the system.
Menzerath-Altmann’s law could, then, be understood as a conservation principle of this variable.287 Preliminarily, let us assume that it is possible to formulate a principle-based model of explanation for system-theoretical linguistics, provided that we accept the interpretation of MAL as the principle of invariance of the linguistic system structure. Below we will encounter this possibility in the reflections of Luděk Hřebíček (see the Second Interlude), and finally we will follow it until the reformulation of Köhler’s functional explanation into a topological explanation (see chapter 5.6). The form of the principle-based model of explanation could be as follows:
286 One of the arguments against using this analogy is that it hides a fallacy – specifically the "hidden premise": from the analogical "language behaves like a living system" we move to the factual "language is a living system".
287 We came to related conclusions in the paper Zámečník (2014), where possible variants of conservation laws for the field of quantitative linguistics are discussed (Zámečník 2014, 117–119). At that time, we did not yet know Köhler's register hypothesis or his conception of higher-level properties and requirements.
Explanans: The principle of invariance of the linguistic system structure
Conditions: Existence of sufficient resources for the linguistic system construction (i.e. variables, parameters and requirements).
Explanandum: The Structural Axiom

We have presented in a comprehensive form Köhler's view of the nature of linguistic laws and his search for (linguistic) principles that establish these laws. Before we focus on his main solution to the problem – the construction of a functional model of explanation and the analysis of the Structural Axiom (see chapter 5.5) – we have to map the alternative paths which have formed in quantitative linguistics and which are largely prevalent in today's community of quantitative linguists. The first is the path proposed by Gabriel Altmann and Gejza Wimmer in the so-called Unified Approach to Linguistic Laws. The second is Luděk Hřebíček's view of linguistic theory based on the principle of compositeness.
5.3 The unified approach in quantitative linguistics

For pragmatically oriented quantitative linguists who work on the Unified Approach (see Wimmer, Altmann 2005), MAL is basically conceived as one of the statistical distributions that can be applied in the analysis of linguistic data, in addition to an inexhaustible number of other statistical distributions (cf. Wimmer, Altmann 1999). MAL, then, does not have the special status of an explanatory principle; it is simply reportable as a power law of a certain type in data of a certain type. Although Köhler has repeatedly stated that system-theoretical linguistics and the Unified Approach lead to the same results (QSA, 137–138), we will examine this statement because we assume that the two views of quantitative linguistics differ precisely in their concept of the linguistic law. Wimmer and Altmann understand the process of deriving individual quantitative-linguistic laws as an automatic process of deriving individual distributions (for the continuous and discrete cases separately) from a basic mathematical scheme (cf. Wimmer, Altmann 2005, 791–801). Despite Köhler's proclamation that system-theoretical linguistics and the Unified Approach are equivalent, we therefore believe that this is only seemingly so. The very intent of the two approaches is completely different – where Köhler looks for an explanatory theory based on the linguistic law (or principle), Altmann and Wimmer require a statistical tool to
be in compliance with particular empirical evidence.288 As we will see in the chapters and subchapters below, a comparison also offers itself at the level of the source of inspiration – the philosophy of science of the syntactic and semantic periods.289 While Köhler was more inspired by Hempel's functional analysis, and built different variants of a functional explanation (see chapter 5.5), Altmann was more inspired by the inductive strategy of building scientific theories (see subchapters 5.3.1 and 5.3.2 below).290 Of course, there are also many similarities (see chapter 5.3.1 and chapter 5.5 below) between the two conceptions, mainly for historical reasons, because Köhler was Altmann's pupil and is his chief follower. For Altmann, a systemic view of reality is essential in the unification of quantitative-linguistic knowledge, models and theories; and the systemic approach is a natural epistemic starting point for philosophy of science. He expresses it directly with Wimmer as follows: All things are systems. We join two domains if we find isomorphisms, parallelisms, similarities between the respective systems or if we ascertain that they are special cases of a still more general system. From time to time one must perform such an integration in order to obtain more and more unified theories and to organize the knowledge of the object of investigation. (Wimmer, Altmann 2005, 792)
When Köhler presents the synergetic theory (system-theoretical linguistics) in parallel with the Unified Approach, Altmann and Wimmer directly state that their approach is a logical extension of synergetic linguistics (Wimmer, Altmann 2005, 792). The Unified Approach (UA) is based on two assumptions considered by Altmann and Wimmer to be completely known, natural and domesticated in linguistics (Wimmer, Altmann 2005, 792). For the case of using continuous mathematical linguistic models, they express them as follows: (1) "Let Y be a continuous variable. The change of any linguistic variable, dy, is controlled directly by its actual size because every linguistic variable is
288 This also results from the application of the computational realization of this idea, the Altmann-fitter, see http://www.ram-verlag.biz/altmann-fitter/. Technically speaking, it is a choice of a distribution out of a store of options suitable for a particular empirical case.
289 We use the classification into syntactic (1940s–1950s), semantic (1960s–1970s) and pragmatic (1980s–1990s) philosophies of science. We express this classification in the book Zámečník (2015).
290 At the same time, however, Altmann's and Wimmer's attempt to unify linguistic laws, expressed in the Unified Approach, recalls the approach to explanation in philosophy of science represented by Philip Kitcher (cf. Kitcher 1981, Kitcher 1993). According to Kitcher, the purpose of an explanation is not necessarily to reveal a causal nexus (or network of causal relationships), but to unify our knowledge. Although Altmann directly refers to Bunge in this respect (cf. Wimmer, Altmann 2005, 791), we believe that this is an expression of the same trend in philosophy of science, present at the latest since the 1980s.
finite and part of a self-regulating system [emphasis mine], i.e. we can always use in modelling the relative rate of change dy/y. (2) Every linguistic variable Y is linked with at least one other variable (X) which shapes the behaviour of Y and can be considered in the given case as independent. The independent variable influences the dependent variable Y also by its rate of change, dx, which itself, in turn, is controlled by different powers of its own values that are associated with different factors, "forces" etc." (Wimmer, Altmann 2005, 792) The issue, in our opinion, is that these are not strictly linguistic assumptions, but rather system-theoretical assumptions of linguistic modeling. Unlike Köhler, who builds the entire linguistic system – classifying variables, parameters and non-systemic requirements, and expressing their interrelationships – Wimmer and Altmann actually "only" present a mathematical formalism that can be used to describe linguistic data. It is also not clear why the reference to self-regulation appears in assumption (1), and whether it can be interpreted as Köhler's structural axiom. Rather, they grasp the "principle of self-regulation" on an ad hoc basis and attach it to the mathematical formalism. Therefore, we do not consider the Unified Approach in this form to be an extension of Köhler's synergetic (i.e. system-theoretical) linguistic theory (Appendix 16 lists the basic elements of Wimmer's and Altmann's UA). On the other hand, it is appropriate to look a little further back and note that the foundations of the Unified Approach have their origins in the joint work of Altmann and Köhler on deriving MAL (cf. Altmann, Köhler 1995). This text presents the basis of the unified approach for the discrete and continuous cases, as developed later by Altmann and Wimmer.
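As a side note, the continuous assumption (1) can be connected directly with MAL. The following sketch is ours, not Wimmer and Altmann's, and its parameter values are invented: if the relative rate of change is set to dy/y = (b/x + c) dx, integration gives y = A·x^b·e^(cx), the familiar form of the Menzerath-Altmann law (for c < 0).

```python
import numpy as np

# A sketch (ours, with invented parameters) of the continuous Unified
# Approach: the relative rate of change dy/y = (b/x + c) dx integrates to
# y = A * x**b * exp(c*x), i.e. the Menzerath-Altmann law for c < 0.
def mal(x, A, b, c):
    return A * x**b * np.exp(c * x)

A, b, c = 3.0, -0.4, -0.05
x = np.linspace(1.0, 10.0, 1000)
y = mal(x, A, b, c)

# Numerical check: (dy/dx)/y matches the postulated relative rate b/x + c.
lhs = np.gradient(y, x, edge_order=2) / y
rhs = b / x + c
assert np.allclose(lhs, rhs, atol=1e-3)
```

Note that the check only confirms the consistency of the functional form with the differential assumption – exactly the kind of purely mathematical equivalence that Meyer's critique (subchapter 5.3.2) targets.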
From this point of view, this is not a fundamental difference; rather, it is interesting that the text shows a certain gradual separation between Köhler's and Altmann's thinking – Köhler holds onto the register hypothesis and the deductive structuring of theory (e.g. QSA), while Altmann divests himself of the initial axiomatic starting points – he no longer cares about them, or they have become unimportant to him – and builds the Altmann-fitter. The formalism remains the same, but the context of its fulfillment changes. In this text (Altmann, Köhler 1995), however, both authors show quantitative linguistics starting out from the concept of self-organization – namely, they use the concepts of Haken's synergetics. They use the term "language forces" – the diversification and unification forces of the speaker and the listener (Altmann, Köhler 1995, 62) – and they understand Zipf's "principle of least effort" (cf. Zipf 1949) as an analogy of physical forces (Altmann, Köhler 1995, 62). They state that: "From a modern point of view, (. . .), Zipf's "forces" can be interpreted as order parameters or other system requirements" (Altmann, Köhler 1995, 62). They
follow system-theoretical linguistics (ZLS) – it is only a partial selection of requirements, but these illustrate well the general process of linguistic system building. They explicitly refer to Haken (1978) in connection with the representation of the coding of new meanings (see also Köhler 1990a) by various means (see the relation between polylexy and length of the lexical unit in chapter 5.2 above): "In synergetics (cf. Haken, 1978), the term "order parameter" has been introduced for control variables of a supersystem which "enslave" processes in subsystems" (Altmann, Köhler 1995, 64).291 Subsequently, they move on to the formulation of general rules for the interaction between requirements – to express the self-organization of the system, in both discrete and continuous variants (as in the case of Wimmer, Altmann 2005). In summary, they claim: In all cases, the requirements – or changes of their weight – result in changes in the system and form one of the sources of the system's dynamics. The corresponding processes can be mathematically modeled by means of stochastic processes. (. . .) In most cases, the approaches are based on equilibrium assumptions [emphasis mine], and their solutions yield functions or probability distributions, which can get the status of laws if sufficient theoretical and empirical corroboration exists. (Altmann, Köhler 1995, 65)
This is the closest we come to the origin of the occurrence of the self-regulation principle in the Unified Approach, although it is expressed here through "equilibrium assumptions". We can already see here the idea of the inductive approach we discussed above in connection with Altmann and Wimmer (2005) – the status of a law is based on empirical corroboration. Perhaps the idea of the connection between the Altmann-Wimmer view and Köhler's view derives from this text. However, we believe that this connection is based mainly on the analogical use of synergetics (see also Meyer's critique in subchapter 5.3.2 below).
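The working mode of the Altmann-fitter mentioned earlier – trying a store of candidate distributions against the same empirical data and keeping the best fit – can be sketched in a few lines. The frequency data and the two candidate distributions below are invented for illustration; the real tool covers a far larger inventory and uses dedicated goodness-of-fit statistics.

```python
import math

# A toy "statistical sieve" in the spirit of the Altmann-fitter (data and
# candidates invented): fit each candidate distribution by maximum
# likelihood and keep the better-scoring one.
data = [1, 1, 1, 2, 1, 3, 2, 1, 4, 2, 1, 1, 2, 3, 1, 1, 2, 1, 5, 1]

def loglik_geometric(xs, p):
    # support x = 1, 2, ...: P(x) = (1 - p)**(x - 1) * p
    return sum(math.log((1 - p) ** (x - 1) * p) for x in xs)

def loglik_shifted_poisson(xs, lam):
    # support x = 1, 2, ...: P(x) = exp(-lam) * lam**(x - 1) / (x - 1)!
    return sum(math.log(math.exp(-lam) * lam ** (x - 1) / math.factorial(x - 1))
               for x in xs)

mean = sum(data) / len(data)
scores = {
    "geometric": loglik_geometric(data, 1 / mean),              # MLE: p = 1/mean
    "shifted Poisson": loglik_shifted_poisson(data, mean - 1),  # MLE: lam = mean - 1
}
best = max(scores, key=scores.get)
```

The point of the sketch is methodological rather than statistical: nothing in the procedure itself explains why the winning distribution should hold for language – which is precisely the worry raised in the text above.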
5.3.1 The struggle with logical empiricism Some differences between Altmann’s and Köhler’s conception of linguistic theory are subtle; both authors are very closely connected with the common history of research and the effort to build a theory of quantitative linguistics. Therefore, it
291 It is somewhat confusing that the strict distinction between system variables and requirements is lost: "Word length, for example, is the order parameter that enslaves the length of the words' syllables (. . .); sentence length is the order parameter that enslaves the length of the clauses (. . .)." (Altmann, Köhler 1995, 64). In what sense does the length of the word enslave the length of the syllables? For Haken's idea of the mathematical expression of "enslaving", see Appendix 17.
serves us well to look even deeper into the past, at Altmann's idea of the nature of linguistic theory and linguistic explanation, and at the role that philosophy of science is to play in defining linguistic theory. We find it very early, in the founding text of the modern form of quantitative linguistics, published in the first issue of the first quantitative-linguistic journal, Glottometrika – Altmann's text "Towards a theory of language" (Altmann 1978). In this text, Gabriel Altmann set out the direction of system-theoretical linguistics, including a close connection to philosophy of science, the effort to build a valid form of explanation, and a proposal for a general form of quantitative-linguistic theory. Here, Altmann proclaims the centrality of the concept of the scientific law for linguistics in a somewhat radical way: In any future theory of language one should therefore operate exclusively with laws, no matter whether deterministic or stochastic ones. To put it in a nutshell: no laws, no science (Bunge 1967, I: 318). (Altmann 1978, 4)292
For Altmann, the effort to build quantitative linguistics is based on a model of exact science aiming at creating "a theory of the structure, of functioning and of change of language" (Altmann 1978, 9). The fundamental question is how Altmann imagines exact science and its methods – how he balances inductive and deductive strategies for building theories and testing them. In this regard, it is important to evaluate Altmann's relationship to logical positivism – and to inductivism in general – in theory testing. Altmann refers to the philosophers of science connected with deductivism – Hempel, Nagel and Popper – but at the same time, his description of the research method itself (cf. Altmann 1978, 12–13) sounds not deductive-nomological but inductive. A possible solution could be based on the use of Hempel's inductive-statistical model of explanation, but we know of no text in which Altmann refers to it. By contrast, he refers explicitly to the deductive-nomological model of explanation (Altmann 1978, 5–6). Altmann speaks of an: (. . .) empirical, "quasiinductive" process – a step-by-step setting up of hypotheses which cannot at first be derived from a set of axioms. They will, however, be immediately tested and in case of a positive result they will be retained. (Altmann 1978, 12–13)
We can better understand this initial procedure when we look at the analyses presented by Peter Grzybek (Grzybek 2006, 2–3), from which it is clear that indeed
292 In the text Altmann, Burdinski (1982), he even speaks directly as follows: "We find ourselves here on the threshold of a "physics of text", which is developing into a very extensive discipline (. . .)." (Altmann, Burdinski 1982, 165).
this "quasi-inductive" phase belongs only to the beginning of research, to an immature theory – and this is what, after all, Altmann also claims (Altmann 1978, 13). The issue related to the interpretation of Altmann's theory arises at the moment when we notice where it has evolved to. The Unified Approach (UA) can be faithfully likened to a "statistical sieve" for empirical research results. The Unified Approach is based on a universal form of statistical distribution, but this universal formula does not work as an axiom from which it is possible to explain individual data deductively-nomologically. In other words, it seems that Altmann's strategy amounts to making the original quasi-inductive research phase permanent. Altmann was already aware of the issue with inductivism in the original article: "The above method seems to represent a purely inductive, phenomenological approach, as if we were merely setting up empirical generalizations which are not suitable for theory building" (Altmann 1978, 17). He therefore justifies why the situation is as it is at this initial stage. It is still possible to put forward preliminary hypotheses, but the further axiomatic level to which Altmann referred has not appeared in the form of the UA: The axiomatization and the deduction of hypotheses from the axioms can be accomplished at a later date. (. . .) The structural axioms that describe [emphasis mine] the properties and the dynamics of the universe of discourse can be set up later on. (Altmann 1978, 13)
We tend to believe that this axiomatization, this determination of the structural axiom, was performed in Köhler's system-theoretical linguistics. Most features of a mature linguistic theory, as Altmann states them (Altmann 1978, 17–19), are found fully developed in Köhler's theory. These are: (1) the use of "linear additive hypotheses" – Köhler showed that the system of linear functions is sufficient.293 The means of graphical algebra enable a consistent mathematization of individual linguistic subsystems (the lexicon and syntax in particular). Furthermore, (2) Köhler brings a linguistic interpretation of the "path coefficients", which he presents as parameters connecting individual linguistic variables. Finally, and above all, (3) he builds a functional model of explanation that incorporates, among other things, the structural axiom. Altmann foresaw this theory in the form of a theory of balance or optimality in language (cf. Altmann 1978, 20). To what extent does Altmann overcome the problem of the logical-empiricist or positivist approach to scientific testing and theory building? In the second part of the book we did not comment upon logical positivism enough, but here the reference to it proves to be necessary. Philosophy of science arises, to some extent,
293 Even in the case of lexicon oscillations, when the conditions are met (see ZLS, 145–146).
as an opposition to the logical-empiricist method of theory building and theory testing.294 Logical empiricists trusted in building scientific theories from data, via induction and generalization, and this approach has been criticized in many places (e.g. Popper 2002, 3–7). The starting point of logical empiricism is to trust data that are simply given,295 and do not need to be approached with theoretical assumptions. All theoretical entities should be, strictly speaking, reduced to empirical entities. A theory in this sense is actually just an abbreviation that reconstructs the connections between the data. Basically, it is just logical syntax (and of course mathematical syntax) connecting data.296 In the deductive approach typical of Popper and Hempel, pure logical syntax is not enough; a deductive-nomological explanation has to be based on specific sentences that are interpreted as scientific laws. These laws are about theoretical entities that are articulated by theoretical terms. Their strict removal is not possible, because they are precisely what fills scientific laws with content, which is necessary to distinguish the laws from purely logical constructions connecting data. Otherwise, the theory would again become just a description, or an empirical generalization. In Altmann, however, we find both; in fact, he inclines, at least at the beginning (in 1978), to the strategy of empirical generalization until a fully developed linguistic theory is built. On the other hand, right from the beginning, he attributes a fundamental explanatory and predictive role to linguistic laws: "(. . .) once detected, they can be used for predictions as well as explanations in a particular language; what is more, one can even hope that some kinds of rules in individual languages can be roughly derived or predicted from them" (Altmann 1978, 21). The ability to predict and explain leads us to Altmann's co-optation of the D-N model of explanation.
Into the explanans he places the antecedent conditions, i.e. the measured values of the linguistic variables x1L, x2L, . . ., xnL (in some language), and the linguistic law, in the mathematical expression y = f(x1L, x2L, . . ., xnL); into the explanandum belongs the concrete value of the linguistic variable YL. Schematically:
294 Whether it is the criticism present in Popper's Logik der Forschung (Popper 1935; for the current English edition of The Logic of Scientific Discovery, see Popper 2002), or Hempel's and Oppenheim's Studies in the Logic of Explanation (Hempel, Oppenheim 1948).
295 The problem with the empirical basis is very nicely discussed by Popper (2002, 74–94).
296 We express ourselves in a simplified way. For a comprehensive idea of logical positivism, cf. Ayer (1959).
x1L, x2L, . . ., xnL
f(x1L, x2L, . . ., xnL)
––––––––––––––––––––
yL = c (Altmann 1978, 6)297

Once we have the D-N model, the whole machinery of hypothetical-deductive testing comes onto the stage at the same time. Here again, however, Altmann begins to act more as an inductivist who seeks to confirm hypotheses than as a deductivist who attempts to falsify them. The logical-empiricist view was based on the strategy of verification and verificationism, which, as Popper demonstrated, suffers from the traditional problem of the principle of induction. In the history of philosophy of science we find a number of defenses of inductivism,298 which in some respects culminate in Bayesianism. However, Altmann does not comment on these newer possibilities, and it is difficult for us to defend him simply with these newer variations of inductivism.299 Despite some of Altmann's proclamations (Altmann 1978, 13), we feel some tension between Altmann's and Popper's views of the scientific hypothesis. Altmann's view is, strictly speaking, inductivist, where Popper is strictly deductivist in the establishment of scientific hypotheses. Altmann says: "A scientific hypothesis must always be correctable" (Altmann 1978, 18), where Popper writes about the falsification of hypotheses in the light of empirical evidence (cf. Popper 2002, 9–10). Because the falsification testing strategy promoted by Popper (2002) or Lakatos (1978) also suffers from a number of problems, we do not want to claim that Altmann adheres to a bad method; rather, the verification and falsification strategies complement each other. However, Altmann's relationship to inductivism is not entirely clear. The "duality of the method" of quantitative linguistics, as it has changed since the beginning of its scientific research program, is discussed in Appendix 21.
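Altmann's schema can be instantiated with a minimal sketch of ours; the "law" f and the measured values below are invented, and only the logical form follows the schema: the explanandum is deduced from the law plus the antecedent conditions, with no fitting involved.

```python
# A hypothetical instantiation of Altmann's D-N schema. The "law" f and the
# measured values are invented; only the logical form follows the schema:
# explanans = law + antecedent conditions, explanandum = y_L = c.
def f(x1, x2):
    # made-up linguistic law linking two measured variables to a third
    return 10.0 * x1 ** -0.5 * x2

x1_L, x2_L = 4.0, 0.6   # antecedent conditions: measured values in language L
y_L = f(x1_L, x2_L)     # explanandum: deduced from the explanans, not fitted
```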
Altmann's strategy also does not take into account the subtler problems that have been identified in philosophy of science, such as the underdetermination of theory by empirical evidence, also referred to as the Quine-Duhem thesis (see Quine 1963), let alone the problem of theory-ladenness, associated mainly with Hanson (1958) and Kuhn
297 The symbol L in the scheme means that the measured values belong to some specific language L (Altmann 1978, 6).
298 In history, we can follow this methodological tradition from David Hume, through British empiricism, to, for example, Hans Reichenbach (cf. Reichenbach 1938).
299 We find a developed form of the confirmation strategy in Carnap (1950), where he sets out the idea of the degree of confirmation of a hypothesis. Perhaps this is where we could see some link to Altmann, who trusts in the need to correct hypotheses that are subject to testing.
(1962). It remains unaffected by the development of concepts in philosophy of science related to a certain crisis of empiricism, as articulated and addressed in analytic philosophy after Quine (also by Sellars, Davidson and van Fraassen, see chapter 2.1 and Appendix 7). Not only for these reasons, it is prone to succumb to the pitfalls of confirmation bias and to fall into a permanent confirmation of the status quo, which could be a symptom of the stagnation of this scientific research program.
5.3.2 Difficulties in grasping the linguistic law Certain answers to the unclear questions of quantitative linguistics – especially in Altmann's inductive conception – can be found in Peter Meyer's polemical text Laws and Theories in Quantitative Linguistics (Meyer 2002), which analyzes the methods of quantitative linguistics as well as the status of the linguistic laws on which its explanation is based. Meyer conceives of quantitative linguistics as a scientific research program as a whole, but purposefully argues mainly with Altmann's (and Wimmer's) texts. He accuses them of being too fixated on trying to transform linguistics into the form of a natural science, which he considers to be a legacy of logical positivism. He even rebukes Altmann for not reflecting on a broader philosophical perspective and focusing exclusively on Bunge's conception (Meyer 2002, 63–64). Leaving aside the fact that Meyer advocates the explanatory nature of non-quantitative approaches to linguistics (namely comparative linguistics and generativism – the principles and parameters approach, Meyer 2002, 63), his strict critique of Altmann's and Wimmer's Unified Approach is very important to us: (. . .) we are left with some sort of metatheoretical credo that leaves open why this should be the only modus operandi allowed in modern science, let alone linguistics, particularly as the immense complexity of social and neurophysiological processes that jointly underlie the dynamics of language make it seem rather implausible that this dynamics can be modeled in any interesting way in terms of, say, a bunch of differential or equilibrium equations. (Meyer 2002, 64)
In line with our previous doubts, Meyer criticizes Altmann's approach because he does not find in it a well-applied D-N explanation or a hypothetical-deductive method of testing. He states explicitly that: (. . .) the quantitative regularities discovered so far in QL do not pass as law-like statements, that is, as analogues to what qualifies as "laws" in the natural sciences, particularly in fundamental physics. As a consequence, explanations for these regularities (in science-theoretically established sense of "explanation") are still wanting. (Meyer 2002, 64)
Meyer questions Altmann's and Wimmer's belief that it is possible in the Unified Approach to talk about deducing laws from axioms. He also strictly questions the basic assumption of the Unified Approach (see Appendix 16) that the basic formula
(whether for a continuous or discrete case) represents a kind of a unifying axiom. He directly states: (. . .) it must be stressed at the outset that a purely mathematical deduction of the negative binomial distribution from the difference equation (1) plus specification (2) does not provide us with a theoretical explanation of the distribution, since it does not embed it in a nomological network that has an independent justification. (1) and (2) are nothing but a mathematically equivalent reformulation of the probability distribution. (Meyer 2002, 66)300
Meyer is very explicitly skeptical about the UA's potential to be a real "super-theory" (Meyer 2002, 67).301 Meyer notes in more detail Wimmer's and Altmann's efforts (1994) to clarify how the Unified Approach can be used for D-N explanation and how it can be subjected to hypothetical-deductive testing of word length across human languages (Meyer 2002, 67–68), but again here he is skeptical: It is clear that principle (1) gains empirical character only by virtue of specifying g(x). However, we are not offered any theoretically well-founded restrictions on to what class of functions g(x) should belong, only some inductive evidence on what functions "worked well" in past investigations, that is, have led to a good fit for a reasonable amount of texts. Nor do we possess any criterion for predicting which selection among a set of "approved" functions will do well for a newly investigated text. In other words, (1) cannot be falsified; hence, it is not an empirical principle in any sense and, therefore, is not capable of being an "axiom" from which theorems of word length distribution could be derived. [emphasis mine] (Meyer 2002, 68)
We consider this statement to be the most serious critique of the Altmann-Wimmer conception of linguistic theory. In the theory's partial defense, we could say that, in a way, Altmann and Wimmer responded to Meyer's criticisms when, in the systematic text Wimmer, Altmann (2005), they list all variants of derivable formulas in a hierarchical sequence. On the other hand, of course, completeness results in a number of unoccupied places in the system – a number of variants of the function g(x) – which are not used (they are not realized in the case of known languages).
300 Modifications of equations (1) Px = g(x)Px−1 and (2) g(x) = (a + bx)/(cx) are found for the discrete case in Wimmer, Altmann (2005, 797–801); see also Appendix 16. Meyer further points out that Altmann was already aware of the problem in Altmann (1980), where Altmann states: "The derivation from a differential equation is not sufficient in order to award the statement (4) [Menzerath's law in a quantitative formulation, P.M.] the status of a law. It remains a theoretically not fully validated hypothesis as long as it is not set in relation to other laws of language, i.e. until it is incorporated into a system of laws. Such a system does not exist at present, we merely suspect that it is somehow connected with principle of least effort or with some not yet known principle of balance [omission mine]." (Meyer 2002, note 7, 66–67).
301 "There are no good reasons to assume that this purely formal analogy between extremely different formulas used in wildly disparate interpretations (as probability densities, as functions etc.) has a deeper reason connected somehow with (universal) properties of human language. All we get is a purely mathematical observation that has not yet any clear implications for the phenomena described with the aid of the respective formulas." (Meyer 2002, 67).
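Meyer's point that equations (1) and (2) are a mere rewriting of the negative binomial distribution can be verified directly. In this sketch of ours (parameter values invented), iterating the difference equation P_x = g(x)·P_{x−1}, with g(x) of the form (a + bx)/(cx), reproduces the closed-form negative binomial probabilities exactly – a mathematical equivalence, not an independent derivation.

```python
import math

# Sketch (ours; parameters invented) of Meyer's observation: the difference
# equation P_x = g(x) * P_{x-1} with g(x) = (a + b*x)/(c*x) is a rewriting
# of the negative binomial distribution, not an independent explanation.
def neg_binomial_pmf(x, r, q):
    # P_x = C(x + r - 1, x) * (1 - q)**r * q**x, for x = 0, 1, 2, ...
    return math.comb(x + r - 1, x) * (1 - q) ** r * q ** x

r, q = 4, 0.35
a, b, c = q * (r - 1), q, 1.0   # choices that make g(x) the NB ratio P_x / P_{x-1}

p = neg_binomial_pmf(0, r, q)   # start the recurrence at P_0 = (1 - q)**r
for x in range(1, 10):
    p *= (a + b * x) / (c * x)  # apply g(x)
    assert abs(p - neg_binomial_pmf(x, r, q)) < 1e-12
```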
156
5 Functional explanation in quantitative linguistics
Sometimes g(x) is a constant, sometimes a function (cf. Wimmer, Altmann 2005, 797–801), and again, we are left without justification why this is the case and why some variants are missing. Of course, even in physics, only some mathematical functions are implemented, but there we can find the justification why some variants are missing (see Appendix 6); and this is the core of the explanation (which we call, for example, D-N).302 Meyer’s critique is very strict;303 we will try to mitigate it somewhat by proposing an analogy between Altmann’s and Wimmer’s conception and Galilean physics (see also note 254 above), which was the right step towards Newton’s theory – the theory Meyer uses as the standard for assessing Altmann’s conception (Meyer 2002, 68). Galileo, too, primarily trusted the power of mathematical argument when constructing his theory of motion on the inclined plane. He, too, did not predict the behavior of the kinematic system strictly, but only “explained” the sequence of lengths of the measured path sections per time unit as an “additive series of odd numbers”. In Galileo, moreover, the proportionality constant (or function) between instantaneous velocity and time was not interpreted physically; today, in classical mechanics, we interpret it physically as the free-fall acceleration g (cf. Torretti 1999, 20–30) (see also note 83). We, therefore, incline to agree that the transition from Altmann’s conception of the quantitative-linguistic theory as it was born in the late 1970s (despite developments up to the Unified Approach phase) to Köhler’s conception of system-theoretical linguistics in the mid-1980s can be likened to the transformation of Galileo’s physics into Newton’s physics. However, even Köhler’s conception retains its problems, one of them being the above-mentioned problem with the linguistic interpretation of the parameters (see chapter 5.2 above), which interrelate the linguistic quantities in the system.
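Galileo’s “additive series of odd numbers” can be checked with elementary arithmetic: distances covered in successive equal time units stand as 1 : 3 : 5 : 7 . . ., so cumulative distances stand as 1 : 4 : 9 : 16 . . ., i.e. distance grows with the square of elapsed time – a regularity stated, at first, without physical interpretation of the constant:

```python
# Galileo's odd-number rule: distances covered in successive equal time
# units form the series 1, 3, 5, 7, ...; their running totals are the
# perfect squares, so total distance grows as t^2 (s = (g/2) t^2 once
# the proportionality constant is interpreted as free-fall acceleration).

odd_numbers = [2 * k - 1 for k in range(1, 11)]      # 1, 3, 5, 7, ...
cumulative = [sum(odd_numbers[:t]) for t in range(1, 11)]

print(cumulative)                                    # [1, 4, 9, 16, ...]
print(cumulative == [t * t for t in range(1, 11)])   # True
```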
Meyer’s critique thus applies, at least partially, to Köhler as well, as when Meyer again writes resolutely:

The parameters that appear in the various “proportionality functions” proposed so far in the literature suffer from a complete lack of interpretability; they are just numbers obtained by fitting the model function class to the data at hand and vary from text to text without being predictable or connectable to other empirical statements about the texts in question. To be sure, interpretations have been proposed but they are plainly ad hoc and not susceptible to any sort of confirmation. [emphasis mine] (Meyer 2002, 69)

302 A situation similar to that of Altmann-Wimmer’s would perhaps arise if we accepted a multiverse scenario in cosmology with different kinds of laws in different universes, which would be completely covered by all the mathematical formulas that these laws could take; of course, with consequences for the “physical” objects contained in the universes. 303 However, it is striking that in the canonical text of the Unified Approach (Wimmer, Altmann 2005), Altmann and Wimmer do not react to Meyer’s criticism, and we do not know of a text where this happens. This may be due to a certain aggressiveness of Meyer’s text, for example his comparison of quantitative linguistics with astrology, due to its inability to establish a framework for predictive success: “We have no idea why (1) – in a version specified for g(x) – cannot be applied to this or that text; prediction is impossible. So all we can safely say is that (1) holds whenever it is found to hold. In this respect at least, QL does not yet fare much better than astrology.” (Meyer 2002, 69).
Meyer also points to the important fact that in a well-formed theory, parameters have to occur at various points in the nomological network. In other words, two linguistic laws derived from the axioms of a linguistic theory should share some identical parameter (Meyer 2002, 70), as is the case in physical theories.304 Köhler and Altmann are naturally aware of this problem, although they would not agree that the parameters are completely uninterpreted (above we pointed out the degree of syntheticism T of a given language, see chapter 5.2). In recent years, there have also emerged new attempts to interpret some parameters in key laws (Zipf’s and Menzerath-Altmann’s) – for example, Jan Andres (cf. Andres 2010) understands parameter b in Menzerath-Altmann’s law as “the fractal dimension” of the texts examined.305 Meyer’s critique also points out that, in connection with data interpretation, a rejected, but persistent, qualitative dimension has to enter quantitative linguistics (Meyer 2002, 70). Meyer connects it with the theoretical entities of linguistics – when quantitative linguists operationalize linguistic properties, they have to rely on the qualitative concepts that alone allow them to operationalize these properties (Meyer 2002, 70). Meyer draws attention to the danger of a radical practice of quantitative linguistics that would un-reflectively treat its specific qualitative starting points as the only correct ones. He points out that quantitative linguistics is not autonomous;306 it cannot explicate its concepts through quantification alone.307
304 Already in Newtonian mechanics, we can, thus, identify standard parameters and constants, such as the standard free-fall acceleration g and subsequently the gravitational constant G identified in it. 305 “(. . .) to interpret possibly (?) the parameter b₁ (cf. the MAL formula) as a suitable (e.g. box-counting, Hausdorff–Besicovitch, . . .) dimension D = b₁, or its lower estimate D ≥ b₁, of a fractal with a sufficiently big measure of self-similarity which can be approximated by a model of a language “fractal” (. . .).” (Andres 2010, 120). See also Andres, Benešová (2012). 306 In the spirit of how Esa Itkonen distinguishes between autonomous (i.e. synchronic, descriptive) and non-autonomous (i.e. explanatory, diachronic) linguistic approaches, cf. Pateman (1985, 481). 307 Meyer is even more strict: “Even if we had a theoretically deducible stochastic regularity that works well with a certain concept (set of criteria) C1 of a word but does not work at all with another set of criteria C2, this would not tell us that we should henceforth use C1 instead of C2 in our qualitative descriptions since the viability of a qualitative concept can only be judged relative to the qualitative theory it forms a part of. It is the qualitative delimitation of the concept that gives a stochastic statement its meaning in the first place. Observable statistical regularities about artificial
We have already encountered this with Köhler; we have seen that, for example, he operationalizes the length of the lexical unit (see chapter 5.2 above) by measuring it in the number of graphemes that occur from gap to gap. At the same time (already in ZLS), however, Köhler himself admits variants of length operationalization. He also points out that difficulties and misunderstandings begin where big data are only “analyzed” without being approached with linguistic preconceptions.308 Here, therefore, Meyer’s critique is aimed more at certain proclamations about general quantifiability (see Bunge’s view below in chapter 5.5), or at the claim that quantitative linguistics can do without theoretical terms and entities (see below the discussion between Radek Čech and Martin Beneš). A typical case of what Meyer criticizes is the influence of the type of text segmentation on the resulting fit of a given linguistic law.309 This fact again points to a more general problem of testing (not only) quantitative-linguistic theories (see above note 295 for Popper’s reflections on the problem of the empirical basis). Linguistic data are not neutral; we have to model them in order to be able to confront them further with the theory. Even in Ronald Giere’s pluralistic model-based view of theories (see chapter 2.1 above), the concept of data models, which are inserted between the world and theoretical models, is used directly (see Giere 2006, 68–69). In other critical remarks, Meyer refuses to compare laws defined by Altmann’s and Wimmer’s methods with physical laws because he finds no criteria that would establish the exceptional nature of linguistic laws (Meyer 2002, 71). Here, the standard quantitative-linguistic defense will invoke the difference between deterministic and stochastic laws (we also saw this above in chapter 3.1 in Grzybek 2006), but we think it would be unfair to Meyer to accuse him of not being aware of this difference.
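The point about operationalization can be made concrete with a toy sketch (our own illustration, not Köhler’s procedure): two admissible operationalizations of “length” – graphemes from gap to gap versus a deliberately naive vowel-group syllable count – already yield different data sets to which a law would then be fitted.

```python
# Sketch of two operationalizations of word length. "Grapheme" counting
# follows the gap-to-gap idea mentioned above; the vowel-group syllable
# count is a deliberately naive stand-in for an alternative segmentation.

import re

def lengths_in_graphemes(text):
    # a word is a maximal run of letters between gaps
    return [len(w) for w in re.findall(r"[^\W\d_]+", text)]

def lengths_in_syllables(text):
    # crude heuristic: one syllable per maximal vowel group
    return [max(1, len(re.findall(r"[aeiouy]+", w.lower())))
            for w in re.findall(r"[^\W\d_]+", text)]

sample = "Linguistic data are not neutral; we have to model them."
print(lengths_in_graphemes(sample))
print(lengths_in_syllables(sample))
```

The two lists differ systematically, so the frequency distribution of “length” – and hence the fitted curve – depends on a prior qualitative decision about segmentation.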
307 (cont.) “(. . .) constructs that have no independently discernible place in a linguistic description (as might be the case with C1) are virtually meaningless. (. . .) I would like to add that the stochastic treatment is conceptually dependent on the qualitative one but not vice versa.” (Meyer 2002, 70–71). 308 In a private discussion (Köhler, March 2018, Trier). 309 In the case of MAL we gave an example of phonetic, orthographical and morphological segmentation, see Benešová, Faltýnek, Zámečník (2015, 29–37). 310 For example, quantum physics incorporates statistics, but its basic principles – expressed, for example, in Heisenberg’s well-known matrix quantum mechanics – are universal. And although there are several axiomatizations of quantum mechanics relying on different interpretations of some principles, a given variant of invariance is still necessary for a given axiomatization.

To some extent, one can argue that there are a number of ambiguities regarding the criteria for defining a law in the context of statistics (see below subchapter 5.4.1). However, if we do not want to use statistics only to fit distributions, but want to build a scientific theory, then we have to look for some invariant principles.310 We miss this form of principles in the UA,
along with Meyer. Moreover, Meyer points out that even a reference to ceteris paribus laws will not help quantitative linguists to solve this problem (Meyer 2002, 71).311 Meyer’s critique gradually reaches the basic principles that are central to Altmann’s and Köhler’s conception of linguistic theory – especially the self-organization principle. He considers this principle to be a metaphor that is time-specific (bound to the popularity of complexity theories) and, above all, claims that it is an “ill-defined formal metaphor” (Meyer 2002, 72–73): (. . .) the mere claim of analogy is not enough; it is a hypothesis that has to be proved on empirical grounds. Hence, transferring a formal model such as Haken’s synergetics to a new domain of phenomena is tantamount to setting up a new theory that must be validated independently of previous applications of the model in other domains. In other words, the right motto should be: first set up your linguistic theory, then try to find a common denominator with theories from other fields. (Meyer 2002, 73)312
Again, we have to say that this strict critique mainly concerns Altmann (and Wimmer) and not so much Köhler. As we have already stated, Altmann and Wimmer add the self-organization concept to their basic definition of the Unified Approach. However, they leave it without explication, as a metaphorical reference point for a quantitative linguist focusing on the statistical analysis of linguistic data itself. Meyer does not mention Köhler’s synergetic linguistics; he does not refer to Köhler’s key texts. We believe that although Köhler chose the name “synergetic linguistics” due to its popularity at the time, and probably in connection with Altmann, his theory corresponds rather to the cybernetic and informational views within systems theory. Haken’s synergetics, on the other hand, presents a sophisticated theory313 from which Köhler borrows some concepts selectively (see chapter 5.2 above and chapter 5.5 below), but not systematically. As a highly problematic example of such conceptual borrowing, Meyer singles out the statement that MAL can be conceived as an attractor of a dynamical system (cf. Meyer 2002, 73–76). On that occasion, Meyer even blames Altmann for
311 “It is often assumed that the inductive generalizations of QL are indeed laws proper, if only a special kind of them, to wit, ceteris paribus laws that hold only when certain necessary preconditions are satisfied. However, since those necessary preconditions can in virtue of the ceteris paribus restriction not be stated explicitly and are, therefore, not specified by the law itself, the ceteris paribus clauses amount to no more than a trivial immunization strategy.” (Meyer 2002, 71). 312 For analogies in linguistics, see also Itkonen (2005). Kellert (2008) has systematically commented on the issue of conceptual borrowings, see Appendix 15 for more details. 313 Although it has also been widely criticized, see e.g. Stephan (1999), in more detail below in chapter 5.5.
reviving teleology or the post hoc, ergo propter hoc fallacy (Meyer 2002, 75).314 Interestingly, Altmann himself considers teleological explanation to be acceptable, provided that it obtains new patronage in terms of self-regulatory systems (see chapter 5.5 below). However, Meyer somewhat undermines his own position when he refers to the concept of “complicity”, promoted in the 1990s by Cohen and Stewart (Cohen, Stewart 1994). He considers it to be a “third” path between traditional qualitative research of language and quantitative-linguistic inductivist description (Meyer 2002, 77). It has to be said that, from today’s point of view, Cohen, Stewart (1994) was again “only” another hypothesis on complexity (in addition to synergetics, non-equilibrium thermodynamics, etc.), one that did not crystallize into a comprehensive theory, so Meyer’s confidence in this particular concept (complicity) proved inadequate. At the same time, this confidence also assigned a task to functional explanation: Complicity-driven dynamics can, in many cases, be subjected to a merely qualitative or functional explanation [emphasis mine]; in other words, the dynamics of the combined system is not reducible to aspects of the dynamics of the systems that form its parts. (Meyer 2002, 77)
If a functional explanation has such a specific new role, then it is not clear what type of quantitative explanation it should be, according to Meyer, for Altmann (and possibly Köhler). In addition, Altmann considered a number of explanation variants in the context of linguistics – causal, functional and teleological (Altmann 1978, 8). As we already know, Köhler did not base his system-theoretical view on the concept of reducibility and (as we will see in more detail in chapter 5.5) explicitly opposes functional explanation to causal explanation. Meyer does not appreciate the significance of Köhler’s conception, which represents a step towards a mature quantitative-linguistic theory.
314 “For Old Russian texts after the fall of the yer vowels, Menzerath’s Law in its basic form can satisfactorily be fitted to the data. The authors conclude (Lehfeldt, Altmann 2002, 338): “In other words, the fall of the reduced vowels was directed at the elimination of these obstacles [for the law, PM].” Here we see an example of a post hoc, ergo propter hoc fallacy, that is, of an illicit causal-final reinterpretation of a merely temporal sequence of events: After the yers fell, Menzerath’s curve could be fitted again, therefore, or so the argument runs, the fall of the yers caused or was directed at reenacting Menzerath’s Law. We have no sound reason to take such a conclusion for granted since it is based on an unwarranted reification of the stipulated reason for the ceteris paribus regularity: The claim that “Menzerath’s Law holds ceteris paribus” effectively gets rephrased as follows: Menzerath’s Law is a kind of “telos”, a “driving force” that is somehow determined to change the dynamic structure of a language system in the course of time.” (Meyer 2002, 75).
Despite these problems, Meyer’s text represents an important milestone in quantitative-linguistic methodology and a strong reminder of a number of important problems that have not been resolved satisfactorily. In addition, his vision for the further development of quantitative linguistics has largely been fulfilled: (. . .) the role of QL laws in future linguistics would be a more mundane, modest one than hitherto assumed; qualitative and quantitative research would simply coexist and be directed at different goals and purposes. QL would not be able to find the deep, hidden mechanisms by which the evolution of linguistic communicative processes proceeds. (Meyer 2002, 78)
From the point of view of the dilemma of linguistics under consideration, the question also arises whether the reference to emergent phenomena (as we will see below in the Second Interlude), Köhler’s system-theoretical approach, etc. does not trivialize the specificity of linguistic explanation. We are convinced that it does not, provided we are able to draw, as Meyer points out (cf. Meyer 2002, 78), a valid analogy between the linguistic system and an external non-linguistic system in which a given kind of explanation already works (see the possibilities of non-causal explanations in Köhler’s view in chapter 5.5 below). In addition, Meyer offers an optimistic part of his vision: On the other hand, if a mathematical treatment of the way qualitative linguistic entities such as words, syllables and constituent structures emerge evolutionarily is more than a self-contradictory hope, then it is precisely this mathematics of the “complicitary” qualitative concepts [emphasis mine] we linguists live by that would have to lay the foundations for a mature Quantitative Linguistics. (Meyer 2002, 78)
Meyer’s text allows us to make an overall assessment of Altmann’s (and Wimmer’s) vision of quantitative linguistics. One problem with which Meyer’s critique has much to do (and which Meyer does not explicitly mention) goes back to the logical-empiricist basis of Altmann’s (1978) paper and concerns the reduction of theoretical entities and terms to observational entities and terms (as we have already encountered, see above subchapter 5.3.1). Meyer touches on it partially with his critique of Altmann’s “obsession” with quantification. All attempts to reduce theoretical entities (and terms) strictly to observational entities (and terms) proved unsuccessful already in the classical (semantic) period of philosophy of science (cf. Maxwell 1962). Some consensus has been reached on the partial empirical interpretability of theoretical terms, meaning that a theoretical term can only be used in the construction of a theory if it can, in some way, be associated with the results of experiments, measurements, observations or scientific simulations (e.g. Morrison 2015, 199–316). On the other hand, there are cases of theoretical terms referring to entities that are strictly unobservable. Physical cases
(thermodynamic temperature, quarks,315 the inflaton field, etc.) are the best known and the most philosophically researched but, of course, we also find them in every linguistic theory. In structuralism, it is mainly language (the langue as a system); in generativism, it is e.g. recursion; and in system-theoretical linguistics we find them in the content of the problematized structural axiom. We believe that we cannot eliminate theoretical entities from science unless we want to trivialize the concept of a scientific theory and replace it with a systemic description or a logical-mathematical network into which we include empirical evidence. However, it is precisely this appeal, contrary to our beliefs, that seems to be heard from Altmann’s conception of linguistic theory. As we have already stated (see subchapter 5.3.1 above), he does not resort to theoretical principles, but only to the machinery analyzing linguistic data. Indications of a search for theoretical entities, a sketch of an axiomatized form of linguistic theory, can be found at the beginning, in the attempt to define a quantitative-linguistic research program (again Altmann 1978), but we almost completely lack them in its final form in the Unified Approach (Wimmer, Altmann 2005). We have already stated above that the Unified Approach basically forms only a “statistical sieve” for data analysis results, which corresponds to the function of the Altmann-fitter (see note 288 above). The Altmann-fitter assigns the resulting empirical data-modeling curve to the most suitable theoretical curve, the one that fits the data best. What else can we call this besides a systematic empirical generalization? Why should we call this procedure analytical when we do not apply the theory in a hypothetico-deductive way? Where shall we identify theoretical entities that could establish a scientific explanation in the D-N form?
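The “statistical sieve” can be caricatured in a few lines (a deliberately simplified stand-in for the Altmann-fitter, whose real candidate inventory and fitting criteria are far richer): given empirical values, try several theoretical curves over a parameter grid and keep whichever fits best – nothing in the procedure says why the winning curve should hold.

```python
# Toy "statistical sieve" (a simplified stand-in for the Altmann-fitter):
# fit several candidate curves to empirical values by least squares over
# a small parameter grid and select whichever fits best. The procedure
# ranks curves; it does not explain why the winner should hold.

def sse(model, data):
    return sum((model(x) - y) ** 2 for x, y in enumerate(data, start=1))

def sieve(candidates, data):
    best = None
    for name, family in candidates.items():
        for a in [0.5 * k for k in range(1, 9)]:
            for b in [0.25 * k for k in range(1, 9)]:
                err = sse(lambda x: family(x, a, b), data)
                if best is None or err < best[0]:
                    best = (err, name, a, b)
    return best

candidates = {
    "power":       lambda x, a, b: a * x ** (-b),
    "exponential": lambda x, a, b: a * 2.718281828 ** (-b * x),
}
data = [2.0, 1.0, 0.66, 0.5, 0.4]          # roughly ~ 2/x
err, name, a, b = sieve(candidates, data)
print(name, a, b)                          # the best-fitting curve wins
```

The winner here is the power curve, simply because the toy data were generated that way; on another “text” the sieve might crown the exponential, with no criterion for predicting which in advance.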
315 Already in the 1980s, they were made visible, see Pickering (1984). An interesting feature of quarks is that an individual quark is unobservable (among other things because we would have to measure a fractional elementary charge, etc.); only those combinations of quarks are observed which together show measurable values of basic quantities, cf. Baggott (2012, 78–84).

Altmann only defines the nature of a linguistic variable, but this corresponds purely to the procedure of looking for a suitable formal mathematical framework for expressing the theory. Although he refers to the self-organization concept, it is not clear what role it is to play in constituting the Unified Approach. In fact, Altmann adheres to his original intention to define the mathematical form of a universal formula applicable to linguistic data. But while Köhler sees this formula (in the form of a power law) as an abstraction over an explanatory principle, Altmann takes it as a cornerstone, a standard by which the results of linguistic analysis are measured. In fact, Altmann remained loyal in this way to the original inductive approach. We believe that this is a more acceptable explication of the state of affairs than a variant that would proclaim the Unified Approach
being the extension of Köhler’s system-theoretical view of quantitative linguistics. This would really only be possible through a distinctively explanatory role of the mathematical statistics behind the Unified Approach. In fact, we could then call Altmann’s conception a formal variant of explanation within quantitative linguistics.316 Which of these interpretations of Altmann’s approach is more acceptable? The conception of quantitative linguistics as a “confirmatory game” with the Altmann-fitter, or a conception based on the power of mathematical-statistical formalism? We believe that, in practice, the first option is widespread. The Unified Approach serves as a systematized practice of empirical generalization. This is confirmed in some other current approaches towards the philosophical-scientific dimension of quantitative-linguistic research. The prototype may be Altmann’s adherent Radek Čech, who in a polemic with Martin Beneš expresses skepticism about theoretical constructions in linguistics.317 Following Altmann (not Köhler), Čech argues that it is necessary to reject linguistic abstractions and fruitless theoretical controversies (for example, about the langue-parole hypothesis) and to focus on the potential of data analysis.318 One source of these beliefs may be a trend in contemporary philosophy of science that emphasizes the importance of scientific practice over scientific theory. This idea is also demonstrated in views of the development of science which, according to some, draws its potential primarily from practical activities, from the inventions that accumulated in the early modern period. Modern science, thus, does not stand on the shoulders of Newton’s theory, but on the thousands of shoulders of individual small constructors of mechanical inventions, tools and other devices.319 As a common denominator of the above-mentioned approaches, we can certainly see instrumentalism (see Appendix 7) in relation to scientific theories.
As such, it is certainly a legitimate position, but in the proclamations that we find in Radek Čech, we sense a certain dogmatic stance that does not admit that instrumentalism and realism are equally justified approaches to scientific constructions. It is, then, a statement of the following type: instrumentalism has finally replaced obsolete realism. Nevertheless, as we will see in the Second Interlude, besides Köhler’s system-theoretical linguistics and Altmann’s Unified Approach we can find yet another view of linguistic theory, one entrenched in scientific realism – Hřebíček’s view, based on the principle of compositeness.

316 This is probably suggested by Hilberg (2004). 317 Unfortunately, the discussion is only available in Czech, see Beneš (2015), Čech (2017). 318 Radek Čech seems to have turned the unfulfilled ambition of Altmann’s conception into an advantage, which, according to him, corresponds to the current way of conducting science and also to the philosophical reflection of science. However, he considers the views of Paul Feyerabend and Richard Rorty to be the current philosophy of science (cf. Čech 2017, 105). This in itself should not matter; as we have noted several times, the current philosophy of science is pluralistic and characterized by a pragmatic message. However, we believe that it does not reflect the real starting points of Altmann’s view, which are older, logical-positivist and markedly inductive. We disagree with Radek Čech’s idea that science can do well without theoretical entities in the true sense of the word. We believe that this is a radical approach that removes the explanatory role from science and makes it a mere descriptive activity. 319 This view of science is defended, for example, by Boris Cvek, cf. Cvek (2015).
The second interlude: The principle of compositeness

Luděk Hřebíček320 was a prominent representative of quantitative linguistics who deserves to be mentioned in connection with the search for the nature of linguistic theory and explanation. First of all, Hřebíček was a strong advocate of the study of the fractal nature of language (cf. Hřebíček 1994), which allowed him to think – differently from Altmann and Köhler – about the origin and breadth of the use of MAL in linguistics. He proposed a method for deriving MAL (Hřebíček 1994, 85) based on an analogy with Mandelbrot’s definition of the fractal (specifically of the Koch curve).321 Technical details are not important here (see Appendix 18); most important is Hřebíček’s interpretation, which leads to the possibility that: “(. . .) the derived law is a real consequence of the fractal structures existing in language” (Hřebíček 1994, 86).322 Before evaluating where Hřebíček’s conception belongs in terms of the explanatory role of MAL and of “fractal structures in language”, it should be noted that he also considered using the concept of emergence, perceiving MAL as an emergent property of texts (with all their substructures).323 His reflections on
320 We cannot omit a personal remark: the first book that introduced us to the context of quantitative linguistics, at a time when we perceived it only as a broader part of the theory of dynamical systems (roughly in the mid-2000s), was Hřebíček’s Vyprávění o lingvistických experimentech s textem (Hřebíček 2002b). 321 Andres, Langer, Matlach (2020) point out that the way fractals are used in quantitative linguistics is too idealistic. They introduce an additional condition: “(. . .) the authors usually assume strictly self-similar structures in their studies, which is neither natural nor suitable for practical applications. In order to avoid this handicap, we required (. . .) that the structures are self-similar “only” cyclically (in blocks) (. . .).” (Andres, Langer, Matlach 2020, 3). 322 He realizes that: “Future investigation should decide whether the language system is really self-similar or whether it is affine conglomerate and its self-similarity is nothing but a consequence of the statistical approach to the measurement of the length of constituents (. . .).” (Hřebíček 1994, 86). 323 Hřebíček states: “Though Menzerath-Altmann’s law represents a principle emerging on all language levels, i.e. on different degrees of complexity, it can be presented as a language property fulfilling the principle of emergence because it could not be observed when the reductionism of the “relatively” secluded subsystems was applied in classical descriptions of languages.” (Hřebíček 1999, 44).
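The fractal notion behind Hřebíček’s derivation can be made concrete on his own example: for a strictly self-similar object assembled from N copies of itself scaled by a factor r, the similarity dimension is D = log N / log(1/r); for the Koch curve, N = 4 and r = 1/3. The arithmetic (a textbook illustration, not Hřebíček’s derivation itself):

```python
# Similarity dimension D = log N / log(1/r) for strictly self-similar
# objects; the Koch curve (N = 4 pieces, each scaled by r = 1/3) is the
# example behind Hrebicek's analogy. Only the arithmetic is shown here.

import math

def similarity_dimension(n_copies, scale):
    return math.log(n_copies) / math.log(1 / scale)

koch = similarity_dimension(4, 1 / 3)    # non-integer dimension
line = similarity_dimension(3, 1 / 3)    # a plain segment: D = 1

print(round(koch, 4))   # 1.2619
print(round(line, 4))   # 1.0
```

The non-integer value is what licenses talk of a “fractal” at all; Hřebíček’s move is to read the exponent of MAL as such a dimension, which is an analogy, not a measurement.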
the emergence concept are also related to the concept of phase transitions among text levels. He basically suggests324 that a new text level (in the hierarchy from phonemes up) emerges from a lower level, and the relationship of this emergent dependence is expressible by the power law (MAL).325 These considerations also lead him to reflect on the linguistic symmetries that may characterize texts (Hřebíček 1999, 44).326 He deals with them in more detail in the text “The elements of symmetry in text structures” (Hřebíček 2002a). He thinks about the concept of symmetry non-metaphorically (following the book Weyl 1952), together with the notion of system invariance with respect to certain types of system transformations (Hřebíček 2002a, 18). In the linguistic context, he defines a total of four types of transformations, in which he examines the presence of appropriate automorphisms (cf. Hřebíček 2002a, 18–19) and finds that some symmetries can be found in connection with MAL, or that MAL could simply be one of the sought automorphisms (Hřebíček 2002a, 31). Hřebíček came closest to defining the general theory of linguistics (in the spirit of Köhler and Altmann) in the text “Some aspects of the power law” (Hřebíček 2003).327 In a very condensed form, he refers to the connection between MAL and the principle of compositeness (Prinzip der Konstituenz), introduced by K.-H. Best (Hřebíček 2003, 1). According to Hřebíček, this principle states: “Any item of the real world always contains one or more constituents” (Hřebíček 2003, 1–2). Although MAL expresses a more specific relationship between constructs and constituents (see chapter 5.2 above), for Hřebíček, the power law is generally the best way to express the meaning of the principle of compositeness. He expresses it in the following way: Any form of the power law seems to be a more precise expression of this principle. 
It concerns not only the intensity of the discussed relationship, but it can be presented as a function describing relationships between hierarchically different items. (Hřebíček 2003, 2)328
324 He discusses this in an engaging way in the chapter “Text a kontext” (cf. Hřebíček 2002b, 66–87). 325 Again, it should be noted that these considerations are relevant in today’s context of considerations on emergent phenomena, phase transitions, effective field theories, renormalization theories, etc. (cf. Batterman 2013). Hřebíček systematically applied many of these means of natural science (cf. again Hřebíček 2002a). 326 He states: “Consequently, text can be characterised as a phenomenon having the outstanding feature of symmetry with respect to the organisation of its units carrying meanings. This symmetry is based on the Zipf-Dolinskij distribution and consequently, on Menzerath-Altmann’s law.” (Hřebíček 1999, 44). 327 However, a book was already published in English that presents Hřebíček’s concept of supra-sentence structures (see Hřebíček 1992). 328 Hřebíček also considers the relationship between the power law and the principle of emergence: “The question to be solved in the future are the limits created by the principle of emergence, as well as the ability of a system to generate new structures.” (Hřebíček 2003, 2).
5 Functional explanation in quantitative linguistics
In the lines that follow, he offers his most interesting interpretation of the power law, in connection with the principle of compositeness. At the same time, Hřebíček outlines, we believe, a general abstract formulation of Köhler’s register hypothesis. Simple concepts of mechanics serve him as conceptual aids – pressure in the limited space of the construct and the pressing force causing this pressure. He expresses it as follows: The simple idea of constituents’ affiliation to constructs calls the idea of an increasing pressure inside the space of a construct when the number of its constituents arises. This indicates the possibility of an attracting or, alternatively, pressing force participating in these processes and relations. This force (or “force”) occurs in miscellaneous shapes, one of which doubtlessly is Zipf’s ‘effort’ and ‘least effort’. (Hřebíček 2003, 2)
In fact, following Schröder (1991), he gives a comparison of Zipf’s power law and Newton’s power law (the law of universal gravitation) (Hřebíček 2003, 2). Although this comparison of Zipf’s and Newton’s law can be understood as purely formal (for the agreement of formalisms, see subchapter 5.3.2 above), Hřebíček uses it as a source for conceptualizing the pressing force, which he relates to Zipf’s ‘least effort’. This method of conceptual borrowing is inspiring and is characteristic of the use of the theory of dynamical systems. It is, of course, also prone to the problems we mentioned in the case of Altmann’s (and Wimmer’s) Unified Approach (again, see subchapter 5.3.2 above). Nevertheless, Hřebíček modestly states at the end of the text that he did not want to formulate a unified theory (Hřebíček 2003, 7); yet he comes closest of all those who have tried to do so. It is interesting to continue the analysis of this conceptual borrowing of Hřebíček’s and to see where the derivation of its ontological commitments can take us. First of all, it is important to reiterate that all the central laws of mechanics (as well as of today’s fundamental physics) are related to conservation principles, which are tied to different kinds of symmetries (for more details, see Appendix 6). However, Newton’s law of force (and Newton’s law of gravity is still a form of the law of force) is most often characterized as a definition of force (cf. Stenger 2006, 245). Hřebíček thus comes to the right conclusion, which applies equally well to physics: we cannot make do with symmetries alone; we also need the sources of their “disruption” – Newton’s force is actually the first symmetry breaking329 identified in this way. It is interesting how in our investigation we constantly (see chapters 3.1 and 3.2, the First Interlude) come across the starting
The situation is, of course, more complicated. For a more detailed insight, see Appendix 6. An interestingly clear and accessible account is provided by Stenger (2006).
The second interlude
point of the structural view, which is actually binding for the principle of compositeness: the need to identify the symmetry of a structure (with respect to transformations) and, at the same time, to consider the breaking of this symmetry, so that the structure is able to undergo change. Margaret Morrison’s ideas are inspiring here, when she considers ways of unification in science (see Morrison 2013). In connection with critical phenomena, she refers to the application of the principle of universality as a third means of unification in science, besides the other variants of unification – traditional reductive unification and the more recently recognized (and also the most studied) synthetic unification (Morrison 2013, 385–414).330 And this is where the biggest conceptual problem begins, because we do not really know whether the synthetic or the universalist unification principle is to be applied in the case of Hřebíček’s view. Synthetic (but also reductive) unification in physics proceeds by identifying the source of the symmetry breaking and removing it by means of a higher-order symmetry;331 in the case of universalist unification, this does not apply. The “disruption” remains permanent, and we only observe the universality of this symmetry breaking across different ontological domains (cf. Morrison 2013, 407–414). Therefore, if we are to complete this conceptual borrowing and want to stick to synthetic unification, we would need to formulate conservation laws for linguistic (sub)systems and identify the symmetries responsible for them (as indicated in the paper Zámečník 2014). We are afraid that Hřebíček is not able to identify linguistic symmetries, because he remains too fixated on the automorphism of MAL.
On the other hand, if Hřebíček builds a universalist project, then unfortunately the conceptual borrowing of Newton’s force fails (see Meyer’s and Kellert’s critique, mainly in subchapter 5.3.2 above), and we fear that the explanatory potential of such a theory also becomes debatable. But let us try to find a more thorough definition of the concept of linguistic symmetries in Hřebíček’s view. The path could again lead through Hřebíček’s analysis of the power law, at the moment when he links considerations about the power law with information theory and actually expresses the core of Köhler’s view of the register hypothesis: The presence of constituents gives occasion for information transition among them or with some items outside the respective construct and the relevant system can be expressed as entropy; certain configuration of a construct with its constituents and with the other constructs is described in probabilities. If in the same text its constituents are reorganized and
She presents it in the context of the theory of electro-weak interaction in physics (Morrison 2013, 393–406). An example is the elimination of inertial acceleration, see Appendix 6.
their number inside a construct increases, the average amount of information pertinent to its mean constituent decreases. (Hřebíček 2003, 2–3)
We believe that the basic characterization of Köhler’s register corresponds to the principle of compositeness – we can state that Köhler explicitly expresses the finite size of the register. The “mechanism”, which is further described by Hřebíček and which explains (sic!) the validity of Zipf’s law, can be understood as a mechanism expressing the register hypothesis: Therefore we are inclined to see the deterministic succession of the explaining ideas in play in the following way: The principle of compositeness → Decreasing probabilities → Zipf’s increasing ranks (= increasing number of constituents). (Hřebíček 2003, 3)
Hřebíček thus believes that the relevant power law results from the principle of compositeness. Although he presents the derivation of Zipf’s law here, due to their mutual relationship, we can also consider a derivation of MAL in accordance with the register hypothesis, because Hřebíček infers: (. . .) that inside the Zipf-Alekseev relation is hidden a relation formulated by the Menzerath-Altmann law, which plays the role of a kind of proportionality coefficient there, but in the form of a mathematical function. In other words, Menzerath-Altmann is hiding inside Zipf-Alekseev. [translation mine] (Hřebíček 2002b, 93)332
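The Zipf-Alekseev function referred to in the quote is commonly written in the quantitative-linguistic literature in the following form (our illustration of Hřebíček’s point, not his own notation):

```latex
y_x = C\, x^{\,a + b\ln x}
```

That is, it is a power law whose exponent is itself a function of x; it is this inner function a + b·ln x, of MAL-like form, that Hřebíček identifies as the Menzerath-Altmann relation “hiding inside” Zipf-Alekseev.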
Of course, it is still possible to look at the matter inductively (via Meyer’s critique) and to interpret all of Hřebíček’s thoughts as mere analogies and metaphors that make no sense for the linguist’s work itself. However, Hřebíček conceived the power law as something decidedly more than just one of many statistical distributions. Apparently, he understood universalist unification as something very substantial: The intention (. . .) is to indicate certain similarities between somewhat remote systems in different sciences and their descriptions. Our position is that their agreement can scarcely be treated as a pure coincidence. The seeking of their common background is doubtlessly justified. The question arises whether the principle of compositeness cannot be accepted as such a background, as something more general than, for example, the physical concept of the field of energy. (Hřebíček 2003, 7)
Certainly, the last sentence already steps somewhat outside the bounds of moderate scientific discourse, but in principle, we can still see that it follows the spirit of Hřebíček’s effort to think the presented conceptual borrowing through. However,
„Ukazuje se, že uvnitř Zipfova-Aleksejevova vztahu je skryt vztah formulovaný Menzerathovým-Altmannovým zákonem, který tam hraje roli jakéhosi koeficientu úměrnosti, ale v podobě matematické funkce. Jinak řečeno, uvnitř Zipfa-Aleksejeva se skrývá Menzerath-Altmann.“
we can reverse the perspective and recall, in this context, Hjelmslev’s principle of analysis. We stated above (in chapter 3.2) that the principle of analysis cannot be considered a principle that could establish a valid principle-based model of explanation in linguistics. How does Hřebíček’s (Best’s) principle of compositeness differ from the principle of analysis? Again, as with the principle of analysis (see chapter 3.2) or the principle of recursion (see chapter 4.1), the principle of compositeness is not a purely linguistic principle, and again, the linguistic dilemma cannot be resolved with it. It seems much more commonsensical to conceive of this principle precisely as an expression of an analytical method – to examine objects as constructs, i.e. as entities composed of constituents. And this is exactly what Hjelmslev does, starting with the text and proceeding downwards. Hřebíček understands his effort – expressed in the search for a fractal theory of language – in the spirit of the question “What is behind the language?”, which he finds in Hjelmslev’s Prolegomena as a challenge to overcome the descriptiveness of linguistics (Hřebíček 2002b, 25). On the other hand, we believe Hřebíček’s reference to the principle of compositeness and its interpretation in terms of the principle of least effort is the best available analysis of the nature of the economization principle in general. If we add conditions expressing the necessity of the limited construct size, we can deduce the characteristic power law distribution across the linguistic hierarchy. We can again create a scheme of the principle-based model of explanation:

Explanans:
Principles: The principle of compositeness; The principle of least effort
Condition: The limited size of the construct
Explanandum: The power law structure across the hierarchy of all linguistic levels

Of course, it is also possible to answer the question “What is behind the language?” more literally. Then we are back searching for a distinctively mathematical explanation, which we came across in many variants above (see e.g. chapter 4.1 and Appendix 5).333 Alternatively – in the spirit of the old τὰ μετὰ τὰ φύσικα – we can directly search for the fractal structure of reality, which is also shown to us through language.
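How a least-effort principle together with a coding constraint can yield a power law may be illustrated by Mandelbrot’s classical optimization argument (our illustration, not Hřebíček’s own derivation): minimize the average cost per unit of information carried by the ranked units.

```latex
% Mandelbrot-style optimization (illustration):
% minimize average cost per unit of information over probabilities p_r
\min_{p_r}\; \frac{C}{H},
\qquad C=\sum_r p_r c_r,
\qquad H=-\sum_r p_r \ln p_r .
% Stationarity makes \ln p_r linear in the cost c_r:
p_r \propto e^{-\beta c_r};
% with coding cost c_r \approx \ln r (cheaper codes for lower ranks):
p_r \propto r^{-\beta}.
```

The result is a Zipf-type power law obtained purely from an economization principle under a constraint, which is exactly the structure of the explanans sketched above.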
In addition to Hřebíček’s reflections, we also draw attention to those by Jan Andres (cf. Andres 2010).
5.4 The diversity of theories in quantitative linguistics

We have reached a point where, for understandable reasons, the strict lines of development in the contemporary life of this linguistic sub-discipline disappear, and the possibility of categorically judging the current state and further visions of quantitative linguistics is limited. The previous chapters of the fifth part of the book show that the concept of the linguistic law is not firmly established and that many authors trust it only to a limited extent. Zipf’s law met such a situation several decades ago (see the First Interlude above), and the younger Menzerath-Altmann’s law334 is currently facing it. However, this does not trouble us, because we can go back to Köhler’s conception, where these traditionally invoked laws do not play such a privileged role: they are “only” a “surface phenomenon” carried by a structural axiom, which Köhler attempts to define and to build into the functional model of explanation. Köhler uses the structural axiom to ask for a universal principle that would articulate a power law expressing scale invariance across linguistic planes. For Köhler, mathematical formulations do not represent the most important part of the theory per se; they are “only” means for quantifying the identified dependencies that his system-theoretical linguistics reveals. Much more important is the structure of the theory that establishes the functional model of explanation (see chapter 5.5 below). It would now be appropriate to summarize the main representatives of quantitative linguistics – Herdan, Altmann and Köhler – in terms of their view of the structure of linguistic theory. The main difference between Herdan and the Altmann-Köhler duo is that Herdan focuses more on the search for general principles (especially on the linguistic interpretation of the principle of duality), but at the same time he is significantly eclectic when building linguistic theory (see the First Interlude above).
In contrast, Altmann and Köhler seek to build linguistic theory from the ground up within a single framework, into which they also introduce a system of definitions, basic quantities, and the expression of their quantified relationships. They build the theory in accordance with the syntactic view of theories335 in philosophy of science and seek to formalize the explanatory potential of this theory. The goal is to express the theory in an axiomatized form.
Although Menzerath’s original formulation dates back to the 1950s, it became more widely known only in the 1980s, in connection with the development of Altmann’s theory. Only relatively recently has it entered non-linguistic contexts (e.g. Torre et al. 2019). We have shown that we can also find elements of the semantic view of theories in Köhler (see chapter 5.2).
The difference between Altmann and Köhler is much more subtle, mainly because Köhler uses Altmann’s conception of quantitative linguistics as the starting point of his own view. This is evident with regard to methodology, terminology, the building of a system of definitions and the finding of quantifiable properties,336 as well as with regard to trust in philosophy of science, the effort to build linguistic theory along the lines of a natural science, and the effort to prepare the axiomatization of the new linguistic theory.337 On the one hand, we can therefore state that the core of Köhler’s system-theoretical linguistics is already outlined in Altmann’s view. On the other hand, over time it has become clear that Altmann is the more inductivist quantitative linguist, for whom it is advantageous to limit the possibilities of quantitative-linguistic testing formally, while Köhler is the more deductivist linguist, who uses mathematical formalism “only” as a means of expressing the system-theoretical principles themselves so that they can be tested and used for predictions. Altmann has more confidence in individual laws, which specify the general mathematical formulas (see chapter 5.3 and Appendix 16) that establish the Unified Approach. Laws are available for practical use (in the Altmann-Fitter); axioms are presumed, but cryptically embedded in the concept of self-regulation, which is not explained. Rather, one can say that axiomatization is left ad futurum. Köhler builds a functional explanation based on a structural axiom that should represent, through the register hypothesis, a scale invariance whose mathematical expression is the power law, traditionally referred to as Menzerath-Altmann’s law. Only in Köhler’s case do we perceive a real explanatory principle behind this “principle”, on which a principle-based model of explanation can be built.
In the case of Altmann, this “principle” could only serve in the sense of a formal explanation based on the power of mathematical formalism. In summary, this is the difference between the power law indicating scale invariance (in Köhler) and the general formula of linguistic statistical distribution (in Altmann). In other words, stochastic law (in Altmann) and structure principle (in Köhler) stand against each other. An important question also concerns the compatibility of system-theoretical linguistics and the Unified Approach in terms of defining different types of laws that we find in them. Can we state that Altmann and Wimmer can also incorporate Köhler’s hierarchy (see chapter 5.2 above) into their Unified Approach, even if they strictly work only with distribution laws (discrete and continuous distributions)?
For example, Altmann built the basis of a system-theoretical conception of the lexicon in the late 1970s (cf. Altmann 1978, 8–9), as we find it later in Köhler (1986). This can also be seen, for example, in Köhler’s “regulatory effectiveness” (die Steuerungseffektivität), which Altmann anticipates in the late 1970s: “(. . .) language must remain in a balance that insures its communicative efficiency.” (Altmann 1978, 5).
We therefore consider Köhler to be the main figure of quantitative linguistics, who transcends Herdan’s eclecticism and avoids the inductivism and the seduction of the statistical method to which Altmann is more inclined. Köhler is also more measured in building an explanation model for system-theoretical linguistics. As we will see in detail in chapter 5.5, he systematically builds a functional model of explanation that does not rely on teleological explanation and is also separated from causal explanation. Altmann considers several variants of the explanatory approach. His leniency towards teleology (see subchapter 5.3.2 above), relying on the possibility of articulating a teleological explanation through a functional one – for cases of self-regulatory systems – is problematic, because the transformation of teleology into functional explanation has not been successful enough (see also Appendix 19 on Hempel’s functional analysis). In some respects, the situation is paradoxical, because Altmann acts, at least in his statements on linguistic theory and in the discussion of teleological explanations, as a “stricter proponent of synergetics” than Köhler, who chooses the name “synergetic linguistics”. For Altmann, Hermann Haken’s synergetics was probably a theory that treated teleological concepts rigorously. Below (see chapter 5.5) we will show that a strictly synergetic interpretation of the functional explanation is possible, but it leads back to the problematic concept of downward causation.
5.4.1 The origin of linguistic law

The last issue we need to address before moving to the analysis of functional explanation in system-theoretical linguistics (see chapter 5.5 below) is a more general question about the nature of the power law in comparison with other statistical distributions. To do this, we will reverse the perspective – from a specifically quantitative-linguistic question to a more general reflection: why should the power law have a more privileged position than, for example, the lognormal or the normal distribution? Here, even more than in thinking about the nature of Zipf’s law and MAL, we get into difficulties that go beyond the scope of this work. However, it seems that the whole conception of quantitative linguistics proposed by Altmann and extended by Köhler can be based on complexity theories (e.g. Caldarelli 2007; see also Meyer’s reflection on “complicity” in subchapter 5.3.2 above). Moreover, authors such as Ramon Ferrer-i-Cancho or Jan Andres, who interact with the community of quantitative linguists, clearly confirm this trend (e.g. Torre et al. 2019, Andres 2010, Ferrer-i-Cancho, Solé 2003, Ferrer-i-Cancho, Solé 2001a, Ferrer-i-Cancho, Solé 2001b).
However, Köhler does not respond to this challenge.338 What are his reasons for this? The main reason may be that in the new texts he does not find anything really new that would not already be present in his chosen synergetics and non-equilibrium thermodynamics. For example, in Caldarelli we find the following statement: Quite surprisingly amongst the several causes [of behavior in accordance with the power law, L. Z.] we also find randomness. This means that disorder (fluctuations) in the quantity values between different sub-parts of the system, is enough to produce power laws. (Caldarelli 2007, 84)
Another reason may be the fear of losing linguistic specifics, the fear that linguistic content will drown in a sea of big data. As we have already seen (see chapter 2.2 above), complexity theories are used as interdisciplinary mathematical tools that rely on various articulations of the principle of universality, providing a universal interpretative framework for a number of disciplines (see Morrison 2015, Morrison 2013, Caldarelli 2007, 79). Let us recall that we consider only an approach based on principles (the principle-based model of explanation) to be explanatory. And to solve the dilemma of linguistics, we require (also in connection with Köhler) specific linguistic principles. Therefore, let us now try to find out what can be hidden behind the stated concept of universality and whether it could be co-opted, in the form of the principle of universality, as the basis of linguistic explanation. There is still a danger that the principle of universality will again prove to be “only” a formal means of describing diverse data without specific disciplinary support (i.e. neglecting linguistic specifics). We will see that considerations about the origin of the power law (MAL), especially Köhler’s (the register hypothesis) and Hřebíček’s (the principle of compositeness), can be subsumed under general conceptions of the origin of power-law distributions. Caldarelli offers several sources of power-law behavior, the most important of which seems to be the tendency to minimize the energy requirements of the system (Caldarelli 2007, 85). In addition to this source, he also mentions self-organized criticality, which he relates to minimization (Caldarelli 2007, 85). Another source of the power law may be diffusion processes.
However, according to Caldarelli, the best explored are multiplicative processes; these are also connected with the fact that it is not always easy to distinguish between situations in which they produce a true power law and those in which they produce a lognormal distribution (Caldarelli 2007, 85–86).339 E.g. in Köhler’s latest synthetic work (Köhler 2012), neither this possibility nor Ferrer-i-Cancho himself are referred to. Caldarelli also mentions the possibility of deriving the power law from the exponential distribution (Caldarelli 2007, 85).
In summary, Caldarelli offers three paths to the power law: random processes (this includes the aforementioned diffusion), the minimization principle, and multiplicative processes. The basic lesson from such a diverse background should be that we cannot assign importance to the power law itself; we have to examine the origin of the power law in question. We cannot assume that when our data fit a power law we have made an important discovery; it is necessary to find out whether this power law originates in a specific minimization principle. This is also the reason why we do not trust the Unified Approach and the inductive strategy in general. Quantitative linguists have examined the diffusion model (in the context of catastrophe theory, cf. Wildgen 2005) as well as models of purely random processes (Meyer refers to Li’s text, see Li 1992)340 and multiplicative processes (Naranan, Balasubrahmanyan 2005, Naranan, Balasubrahmanyan 1998, Balasubrahmanyan, Naranan 1996, see also Appendix 15). However, the most frequently articulated model was based on economization principles, such as minimization – nominally in the Altmann-Wimmer view, explicitly in Köhler’s register hypothesis and also in Hřebíček’s conception. Considerations about minimization in quantitative linguistics enter interesting contexts thanks to the concept of minimization under constraint, which was introduced in the text “Optimization in complex networks” (Ferrer-i-Cancho, Solé 2003), to which Caldarelli also refers (Caldarelli 2007, 92).341 Both Köhler’s and Hřebíček’s approach to deriving MAL – or to the search for a basic explanatory principle for linguistics – can be covered by the principle of minimization under constraint.
As we have already said, the decisive factor is whether this principle can be interpreted specifically linguistically – as attempted by Hřebíček (in analogy with physics, see the Second Interlude above) and Köhler; although Köhler’s constraints are probably of cognitive nature (in the register hypothesis, see subchapter 5.2.2 above). Or whether this principle is associated with the minimization of some non-linguistic quantity – such as energy. Thus, in this case, it is a matter of proving that the linguistic laws (Zipf’s, MAL, etc.) are epiphenomena of a physical level of communication systems (cf. Torre et al. 2019). Alternatively, it would be a less radical reduction to cognitive and neuronal processes.
In this text, Li claims that Zipf’s distribution is also verifiable for random texts. Meyer directly states: “In Mandelbrot’s and Li’s interpretation, Zipf’s Law simply says that natural language texts typically behave, from a stochastic point of view, as if they were the output of a random character source. Naturally, this does not mean that such texts are such an output. Once again, the search for a mechanism behind the stochastic regularity is determined to fail.” (Meyer 2002, 76). Caldarelli also refers to Mandelbrot (1953) in connection with what he refers to as entropy minimization (Caldarelli 2007, 94–95).
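Li’s claim about random texts is easy to check in a few lines. A minimal sketch (with illustrative parameters of our own choosing): a “monkey” types letters and spaces at random, and the resulting rank-frequency curve is already approximately linear in log-log coordinates:

```python
import math
import random
from collections import Counter

random.seed(1)

# Random character source: 4 letters plus a word separator (space).
alphabet = "abcd"
text = "".join(
    " " if random.random() < 0.2 else random.choice(alphabet)
    for _ in range(200_000)
)

# Word frequencies sorted into rank order.
freqs = sorted(Counter(text.split()).values(), reverse=True)
ranks = range(1, len(freqs) + 1)

# Pearson correlation of (log rank, log frequency): a value close to -1
# indicates an approximately power-law (Zipf-like) rank-frequency curve.
lr = [math.log(r) for r in ranks]
lf = [math.log(f) for f in freqs]
n = len(lr)
mr, mf = sum(lr) / n, sum(lf) / n
cov = sum((a - mr) * (b - mf) for a, b in zip(lr, lf))
corr = cov / math.sqrt(
    sum((a - mr) ** 2 for a in lr) * sum((b - mf) ** 2 for b in lf)
)
print(round(corr, 2))
```

The strongly negative correlation produced by a purely random source is exactly why, as Meyer stresses, a good Zipfian fit on its own licenses no conclusion about an underlying mechanism.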
We will see later (see chapter 5.6 below) that minimization under constraint allows us to build an alternative non-causal model of linguistic explanation that can complement Köhler’s existing functional model of explanation. We believe that Köhler’s explication of the register hypothesis can be related to the explication of the origin of the power law from minimization principles as presented by Caldarelli (Caldarelli 2007, 92–95). Self-similarity, typical of scale-free networks – among which Caldarelli also includes linguistic data structures (cf. the chapter “Linguistic networks” in Caldarelli 2007, 224–229) – is crucial here. The basic relationship between minimization and the power law rests on the fact that the fractal arrangement of a system may be a response to the need to minimize some cost function (Caldarelli 2007, 92). Caldarelli refers here to fractals in topology, which express relationships between the subsystems of a given system (Caldarelli 2007, 62). An example of the relationship between the power law, self-similarity and minimization can be found in Appendix 20. Caldarelli offers the following explication: Often, this minimum configuration is only one of many different ones and therefore very difficult to reach during the evolution of the system. Fractals appear because self-similar configurations (with similar statistical properties corresponding to relative minima) occupy a relatively large region of configuration space. Therefore they are not only minimizing structures, but are also more easily accessible by system evolution. (Caldarelli 2007, 92)
In connection with the above, Köhler’s conception seems to us to fall naturally under the wing of dynamical systems theory (complexity theory), which is further signaled by the texts of Jan Andres, who systematically studies fractal structures in language with emphasis on the dynamic side of the issue (cf. Andres 2010, 2009). However, the key problem is associated with defining the constraints for the register hypothesis. We believe that it could be resolved globally for the case of the generally defined “regulatory effectiveness” (die Steuerungseffektivität), which we mentioned above (see chapter 5.2). However, the application of a general model conforming broadly to the rules given in Ferrer-i-Cancho and Solé (2003) has its pitfalls. The main difficulty is vagueness, which does not allow a more specific linguistic interpretation of the constraint(s) – in principle, one can say that the principle of least effort is merely repeated at a significantly higher technical level. Another drawback, associated with the previous one, is the complexity of the hierarchy of different types of constraints in Köhler’s theory, as well as the diversity of the relationships that exist at individual linguistic sublevels.342
For example, how to accommodate the need to distinguish the word forms (advantageous for the analysis of frequencies and lengths of lexical units) and the lemmas (necessary to define the polylexy of lexical units) in the general framework?
Nevertheless, we will try to identify these constraints below and incorporate them into a new topological model of explanation (see chapter 5.6). Caldarelli gives several linguistic examples of scale-free networks – Zipf’s law, co-occurrence networks and word associations (Caldarelli 2007, 224–229) – mostly tied to Zipf’s law. Nevertheless, in our opinion, it is an important challenge to involve quantitative linguistics more strongly in the research program signaled mainly by Ferrer-i-Cancho (Ferrer-i-Cancho, Solé 2003, Ferrer-i-Cancho, Solé 2001a, Ferrer-i-Cancho, Solé 2001b). Quantitative linguists have much to offer – for example, outside quantitative linguistics there is a lack of wider knowledge about Menzerath-Altmann’s law (for interesting exceptions see Matlach, Dostál, Novotný 2022, Matlach, Krivochen, Milička 2021). Nevertheless, there is a certain vacuum in contemporary quantitative linguistics, characteristic of the advent of a new generation of researchers. Followers of Altmann and his inductivist linguistic project go in one direction, defined above; in the opposite direction go those who, like Ferrer-i-Cancho, sense the broader context leading to the theory of dynamical systems (complexity theories). The advantage of Altmann’s successors is the preservation and expansion of the linguistic dimension of the research, essentially in the spirit of Meyer’s vision of the future of quantitative linguistics, which refines qualitative linguistic concepts. Their disadvantage is, of course, the disappearance of the general theory that Altmann and Köhler sought. In the case of those inspired by complexity theories, the advantage and the problem are complementary to those of the former group: what is lacking are unifying elements (we analyze this in more detail in Zámečník 2022, see also Appendix 21). ✶✶✶ We have seen that there are some difficulties in applying the principle of minimization343 in the context of system-theoretical linguistics.
Another problem, related to the definition of the power law in general, is the question of the ability to differentiate between a true power law and a lognormal distribution. Caldarelli, for example, mentions the well-known fact that in the neighborhood of critical points quantities behave as power laws (cf. Caldarelli 2007, note 30, 94). However, this claim sounds more like experience-based heuristics than something grounded in a fundamental theory (and its principles).344 Strictly speaking, difficulties arise already with its definition, because it is still not clear how to define the general principle of minimization on the basis of the many examples offered. This is not to say that the principle is not sufficiently mathematically grounded, but that there is no such thing as an axiomatics of the theory of dynamical systems. We return again to the problems related to the definition of critical phenomena (see chapter 2.2), cf. Batterman (2013). For a comparison of the power law and the lognormal distribution, see also note 212.
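The multiplicative route can be illustrated directly (a minimal sketch with arbitrary parameters of our own choosing): multiplying many independent random factors makes the logarithm of the result a sum, so by the central limit theorem the product is approximately lognormally distributed:

```python
import math
import random

random.seed(0)

def multiplicative_process(steps: int) -> float:
    """One trajectory: X_{t+1} = X_t * U_t with U_t ~ Uniform(0.5, 1.5)."""
    x = 1.0
    for _ in range(steps):
        x *= random.uniform(0.5, 1.5)
    return x

samples = [multiplicative_process(100) for _ in range(2000)]
logs = [math.log(x) for x in samples]

def skewness(data):
    n = len(data)
    m = sum(data) / n
    s2 = sum((d - m) ** 2 for d in data) / n
    return sum((d - m) ** 3 for d in data) / (n * s2 ** 1.5)

# The logs are approximately normal (skewness near 0), while the raw
# values are heavily right-skewed, as a lognormal distribution is.
print(round(skewness(logs), 2), round(skewness(samples), 1))
```

With a reflecting lower barrier on x, essentially the same process is known to generate a power law instead – precisely the kind of “small variation in the underlying model” that changes one distribution into the other.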
5.4 The diversity of theories in quantitative linguistics
Since the self-similarity of fractals, which characterizes the topology of a system, is central to any utilization of the power law, the ability to distinguish the power law from the lognormal distribution is very important – the lognormal distribution does not represent self-similarity.345 If we could claim (and perhaps we can) that minimization principles always lead to the power law, then it would be sufficient to distinguish these cases from cases where the power law arises as a result of multiplicative processes. However, even in the case of multiplicative processes, both the lognormal distribution and the power law can arise.346 Mitzenmacher notes this problem in the text “A Brief History of Generative Models for Power Law and Lognormal Distributions” (Mitzenmacher 2004). He aims to: (. . .) explain some of the basic generative models that lead to power law and lognormal distributions, and (. . .), to cover how small variations in the underlying model can change the result from one to the other. (Mitzenmacher 2004, 227)
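Why the two distributions are so easy to confuse can be seen from a minimal calculation (our illustration, not drawn from the sources discussed here). For the lognormal density f(k) = exp(−(ln k − μ)²/2σ²)/(kσ√(2π)), the local slope on a log-log plot is d ln f/d ln k = −1 − (ln k − μ)/σ². When σ² is much larger than the range of ln k under study, this slope is almost constant (close to −1), so the lognormal is nearly a straight line on a log-log plot – exactly like a power law:

```python
import math

def loglog_slope(k: float, mu: float, sigma: float) -> float:
    """Local slope d(ln f)/d(ln k) of the lognormal density
    f(k) = exp(-(ln k - mu)**2 / (2 sigma**2)) / (k sigma sqrt(2 pi))."""
    return -1.0 - (math.log(k) - mu) / sigma**2

ks = [10.0**i for i in range(1, 5)]                 # k from 10 to 10,000

wide = [loglog_slope(k, 0.0, 10.0) for k in ks]     # sigma large vs ln k
narrow = [loglog_slope(k, 0.0, 1.0) for k in ks]    # sigma comparable to ln k

# Spread of the slope over three decades of k: tiny in the first case
# (mimics a power law), large in the second (visibly curved).
print(round(max(wide) - min(wide), 3), round(max(narrow) - min(narrow), 3))
# → 0.069 6.908
```

Over three decades the large-σ lognormal changes its apparent exponent by less than 0.1 – well within the fitting noise of most empirical studies.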
It is certainly interesting for quantitative linguists that Mitzenmacher points to the persistence of these difficulties when comparing two ways of deriving the power law in the case of the distribution of word frequencies, one of which relates to Simon (1955) – preferential attachment – and the other to Mandelbrot (1953) – optimization.347 He documents the ongoing controversy over the suitability of the power law and the lognormal distribution for a number of application areas.348 Above all, however, Mitzenmacher manages to define the difference that results in either the power law or the lognormal distribution in the case of multiplicative processes. He states:
345 The problem of simplifying assumptions returns to us (see chapter 2.1 above), specifically mathematical abstraction – the strict infinity of a fractal. In the context of dynamical systems theory and philosophy of science, see also Smith (1998b), Zámečník (2018, 250–253).
346 Caldarelli specifies (where k is a variable and σ is the standard deviation): “(. . .) we have true log-normal distributions that can easily be confused with power laws. This happens whenever the log-normal distribution is studied in a range of k for which σ ≫ ln(k). On the other hand, a very similar situation also triggers the formation of true power laws (. . .).” (Caldarelli 2007, 97).
347 Mitzenmacher introduces both methods (Mitzenmacher 2004, 230–235). Mitzenmacher also points to a stormy discussion between the two authors, who fundamentally disagreed with each other (Mitzenmacher 2004, note 3, 235).
348 In particular, he points to multiplicative processes, often used in biology and ecology to express the growth of organisms or populations (cf. Mitzenmacher 2004, 236). And he states that, again in the 1950s, J. Aitchison and J. A. C. Brown (Aitchison, Brown 1954) pointed to a better fit of income distribution by means of a lognormal distribution than by means of the power law (cf. Mitzenmacher 2004, 236–237). And sums up that: “It is interesting that when examining income distribution data, Aitchison and Brown observe that for lower incomes a lognormal distribution appears a better fit, while for higher incomes a power law distribution appears better (. . .).” (Mitzenmacher 2004, 237).
5 Functional explanation in quantitative linguistics
As long as there is a bounded minimum that acts as a lower reflective barrier to the multiplicative model, it will yield a power law instead of a lognormal distribution. (Mitzenmacher 2004, 238)349
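This mechanism is easy to reproduce in a small simulation (a sketch under our own parameter choices, not a model taken from Mitzenmacher’s paper): one and the same multiplicative random walk yields a lognormal distribution when left free, and a heavy power-law-like tail once a lower reflecting barrier is imposed. For this parametrization, the standard result for reflected multiplicative processes gives a stationary tail exponent of −2μ/σ² = 2.5.

```python
import math
import random

random.seed(1)

MU, SIGMA = -0.05, 0.2     # drift and spread of ln(multiplier); assumed values
STEPS, WALKERS = 800, 1000

def run(barrier: bool) -> list[float]:
    """Multiplicative process X <- X * exp(N(MU, SIGMA)), optionally
    reflected at the lower barrier x = 1 (Mitzenmacher's bounded minimum)."""
    out = []
    for _ in range(WALKERS):
        x = 1.0
        for _ in range(STEPS):
            x *= math.exp(random.gauss(MU, SIGMA))
            if barrier:
                x = max(x, 1.0)
        out.append(x)
    return out

free = run(barrier=False)    # lognormal: ln X ~ N(STEPS*MU, STEPS*SIGMA**2)
walled = run(barrier=True)   # stationary power-law tail above the barrier

# The negative drift drags every free walker toward zero, while the
# barrier keeps feeding a heavy upper tail.
print(max(free) < 1.0, any(x > 5.0 for x in walled))
```

The only difference between the two runs is the single `max(x, 1.0)` line – a vivid instance of Mitzenmacher’s point that “small variations in the underlying model can change the result from one to the other.”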
Following our reflections on the importance of the power law in relation to some form of a basic explanatory principle, it is instructive to look at how Mitzenmacher draws attention to the pitfalls of trusting explanations of the power law through optimization, as Mandelbrot does.350 Above all, Mitzenmacher asks an interesting question: Specifically, it would be useful to know in a more formal sense in what situations the small differences between power laws and lognormal distributions manifest themselves in vastly different qualitative behavior, and in what cases a power law distribution can be suitably approximated by lognormal distributions. (Mitzenmacher 2004, 244–245)351
This question also leaves our conclusion on the nature of linguistic laws symbolically open. In the remaining chapters, we will first try to document the role of linguistic laws and principles in Köhler’s functional explanation in system-theoretical linguistics, and then try to formulate a topological explanation for system-theoretical linguistics based on the previous considerations of minimization principles, specifically minimization under constraint.
5.5 Functional explanation in system-theoretical linguistics
We have examined in great detail the nature of the linguistic law, which is – sometimes explicitly and sometimes implicitly – a means of establishing the explanatory nature of linguistic theory for quantitative linguistics. We have seen that the discussion centered on the linguistic law – most often formulated as a universal power law, which is to express invariant structural properties of a
349 On the other hand, he is a little less optimistic later: “Given the close relationship between the two models, it is not clear that a definitive answer is possible; it may be that in seemingly similar situations slightly different assumptions prevail. The fact that power law distributions arise for multiplicative models once the observation time is random or a lower boundary is put into effect, however, may suggest that power laws are more robust models.” (Mitzenmacher 2004, 244)
350 Mitzenmacher formulates this warning as follows: “Just because one finds a compelling mechanism to explain a power law does not mean that there are not other, perhaps simpler, explanations.” (Mitzenmacher 2004, 238).
351 Mitzenmacher also cites the case of the Double Pareto Distribution, which interestingly combines the properties of the power law and the lognormal distribution (Mitzenmacher 2003, 242).
linguistic system (hierarchically organized from individual linguistic levels) – is far from closed (see Köhler’s register hypothesis in subchapter 5.2.2, Hřebíček’s principle of compositeness in the Second Interlude above). One of the reasons is that the question of the nature of linguistic law has become part of a broader study of the nature of dynamical systems, of which language is an example (see Ferrer-i-Cancho, Solé 2003, Torre et al. 2019, Matlach, Dostál, Novotný 2022 above). We will now focus on the fact that, despite the ambiguity of the power law concept, Köhler managed to formulate a functional explanation model in system-theoretical linguistics which, by specifying the individual elements of the explanans, contrasts with the previous models of this kind of explanation (Haspelmath’s, Egré’s, Newmeyer’s, see chapter 5.1). We have seen above that in (most often) cognitive-linguistic cases, it suffices for linguists to state that non-systemic requirements (cognitive, biological, etc.) are the basis of the explanans. Language is such and such, or exhibits such and such properties or interplay of processes, because it has to implement functions that are bound to outside-system requirements. In addition to pointing clearly to the linguistic dilemma, an explanation formulated in this way also lacks the fundament that Köhler (but also the entire quantitative-linguistic community) has always emphasized. In summary, the difference between Köhler’s and (not only) Haspelmath’s concept of functional explanation is that Haspelmath only needs to define requirements, while Köhler requires the whole deductive structure of the theory, including axioms and implicitly – depending on the answer to the register hypothesis validity – also laws.352 As mentioned above (in chapter 5.3), Gabriel Altmann already introduced variants of explanatory strategies, namely teleological, functional and causal, in a pilot text of modern quantitative linguistics (Altmann 1978).
And he also presented the way he imagines the application of the D-N model of explanation (see subchapter 5.3.1 above). Compliance with the D-N model of explanation is also essential for Köhler (ZLS, 25–26, QSA, 174–175).353 Some aspects of the D-N model in Köhler’s conception are mentioned below in this chapter.
352 We acknowledge that views such as the principles and parameters approach that link the generativist tradition with the cognitive-linguistic tradition can lead to a similarly deductive theoretical structure (see chapter 4.1 above).
353 Köhler gives an example of the application of the D-N model in the case of explanation of phenomena at the level of the syntactic subsystem: “Behaghel’s “Gesetz der wachsenden Glieder” was an inductively found hypothesis; it was induced by observation. Therefore, it has the status of an empirical generalisation, which prevents it from being a law hypothesis or – after sufficient corroboration on data – a law. With the advance of Hawkins’ Early Immediate Constituent principle, we have the chance to formulate a law hypothesis on the basis of a plausible mechanism and even to connect it to other hypotheses and laws. If this succeeds (. . .) and enough evidence
The functional explanation is primary for Altmann, although already in this text he points to the problem of functional equivalents (Altmann 1978, 7), which we will comment on below.354 We have already stated above that Altmann admits the applicability of teleological explanations explicitly, in connection with the development of information theory and cybernetics and at the same time in connection with the functional analysis performed by Hempel (Altmann 1978, 7): Teleological explanations are at work for example when one explains the constitution of and the changes in a phoneme system by referring to the trend to symmetry, maximal discrimination, (. . .). Teleological explanations without a good theoretical background raise great difficulties, since the status to which language tends can be determined only vaguely. Success with teleological argument can be attained in the case of self-regulating systems. One can also replace teleological explanations by functional argumentation. (Altmann 1978, 7)
Köhler presented the model of functional explanation in detail already in the context of the analysis of the lexical linguistic level (ZLS, 26–33), and articulated it with regard to the comments made by Hammerl, Maj (1989) at the turn of the 1980s and 1990s (Köhler 1990a, Köhler 1990b). Köhler discusses the pitfalls and possibilities of functional explanation – probably in the most detail – in the text “Linguistische Analyseebenen, Hierarchisierung und Erklärung im Modell der sprachlichen Selbstregulation” (i.e. Köhler 1990a). This text also reflects a wide range of literature in the field of philosophy of science and explicates the problem of functional equivalents. Köhler points very carefully and clearly to the problems of using causal explanation in linguistics. The problems are related to the fact that often a cause A of a given phenomenon X is not necessary (there are alternative causes of the given phenomenon: B, C, D, . . .), and at the same time, there are situations when cause A is necessary, but because this cause has several different effects (X, Y, Z, . . .), it is not possible to state an explanation of the specific effect X (Köhler 1990a, 13–14). Therefore, he proposes the only solution: “(. . .) the functional analysis, which takes into account the causal and probabilistic interconnectedness of language entities” [translation mine] (Köhler 1990a, 14).355
is provided to support it, we may call it a law and subsume individual observations under it so that we can arrive at a scientific explanation.” (Köhler 2012, 175).
354 However, Altmann believes that the problem with functional equivalents is not as difficult in linguistics as in the social sciences: “(. . .) since the number of means of expression available in language is limited.” (Altmann 1978, 7). Indeed, the problem with functional equivalents was probably first pointed out by Robert Merton (Merton 1949, cf. Hempel 1965, note 4, 304).
355 „(. . .) die Funktionalanalyse, bei der man die kausale und probabilistische Vernetzung der Sprachentitäten berücksichtigt.“
Subsequently, Köhler confronted the functional analysis with the D-N model,356 which is not strictly correct – we will also notice Hempel’s functional analysis (below and in Appendix 19) – and this contradicts his later statements (see below, cf. Köhler 2005, 765). Köhler further presents the problem of functional equivalents and outlines a possible solution to this problem – following Altmann (1981)357 – and the scheme of the functional model of explanation itself (Köhler 1990a, 15). The functional explanation was also canonically presented in Köhler (2005, 765) and most recently in QSA (176). We first present the German version issued in the ZLS (28):
„a/ Das System S ist selbstregulierend: Für jedes Bedürfnis besteht ein Mechanismus, der den Systemzustand so verändert, dass es bedient wird.
b/ An das System S sind die Bedürfnisse B1 . . . Bk gestellt.
c/ Das Bedürfnis B kann durch die funktionalen Äquivalente E1 . . . Ef . . . En bedient werden.
d/ Zwischen den funktionalen Äquivalenten besteht die Relation R(E1, E2, . . ., En).
e/ Aufgrund der Systemstruktur besteht zwischen den Elementen s1 . . . sm des Systems S die Relation Q(s1, . . ., sm).
Ef ist Element des Systems mit der Ausprägung Rt.“
It should be noted that in his most recent statements, Köhler considers, in accordance with Hempel, the functional explanation to be a variant of the D-N model of explanation (Köhler 2005, 765).358 This forces us to reflect in more detail upon Köhler’s main source of inspiration, which is Hempel’s text The Logic of Functional Analysis (Hempel 1965).
There, Hempel seeks to build a model of explanation that will meet the requirements of the D-N model although it will not rely on the causal law.359
356 Köhler directly states: „In Unterschied zu der deduktiv-nomologischen Erklärung liegt bei der funktionalen Erklärung keine Ursache mit zwingend eintretender Wirkung vor; die Existenz und die Eigenschaften eines Explanandums werden stattdessen durch Angabe der Funktion des Explanandums im System erklärt.“ (Köhler 1990a, 14).
357 We did not have the opportunity to study this text.
358 However, Köhler (1990a) states: „Das deduktiv-nomologische Erklärungsschema von Hempel-Oppenheim (. . .), das sogar in deterministischen Fällen seine Schwächen hat (. . .) und heftige Kontroversen und Lösungsversuche in probabilistischen Fällen hervorbrachte (. . .), kann nur cum grano salis benutzt werden.“ (Köhler 1990a, 13).
359 The reason was the inadequacy of the causal model, even in the natural sciences: “(. . .) even in biology – the establishment of causal or correlational connections, while desirable and important, is not sufficient. Proper understanding of the phenomena studied in these fields is held to require other types of explanation.” (Hempel 1965, 297).
In this
context, he therefore introduces functional analysis, which is a kind of “disciplined form” of teleological explanation (Hempel 1965, 303, 304).360 It should be noted that Köhler applies Hempel’s procedure consistently361 and (as we shall see) also offers concrete ways to eliminate some of the unresolved problems of this model of functional analysis. On the other hand, some of the specific commitments of Hempel’s model in the context of system-theoretical linguistics have remained unclear. Therefore, we will now describe the basic elements of Hempel’s conception of functional analysis in order to evaluate the severity of these ambiguities (supplements to this interpretation are contained in Appendix 19). ✶✶✶ Hempel demonstrates that functional analysis falls within the more general concept of D-N explanation, both by its construction as a deductive argument (see Appendix 19) and by drawing attention to a necessary condition which ensures the explanatory nature of functional analysis – the presence of a general law in the explanans: (. . .) the assertion that a condition n constitutes a functional prerequisite for a state of some specified kind (such as proper functioning) is tantamount to the statement of a law to the effect that whenever condition n fails to be satisfied, the state in question fails to occur. Thus, explanation by functional analysis requires reference to laws. (Hempel 1965, 309)
In building the deductive structure of functional analysis, Hempel encounters a number of pitfalls, which he gradually eliminates (for more details, see Appendix 19). Above all, it is the problem of functional equivalents – fulfillment of requirements can be achieved in various equivalent ways, i.e. there is no unambiguous relation between fulfillment of the requirement and the setting of one specific state and process in the system (below we will illustrate this with a specific linguistic example).362
360 “Functional analysis, (. . .) though often formulated in teleological terms, (. . .) has a definitely empirical core.” (Hempel 1965, 304).
361 Hempel states: “Basic pattern of functional analysis: The object of the analysis is some “item” i, which is a relatively persistent trait or disposition (e.g., the beating of the heart) occurring in a system s (e.g., the body of a living vertebrate); and the analysis aims to show that s is in a state, or internal condition, ci and in an environment representing certain external conditions ce such that under conditions ci and ce (jointly to be referred to as c) the trait i has effects which satisfy some “need” or “functional requirement” of s, i.e., a condition n which is necessary for the system’s remaining in adequate, or effective, or proper, working order.” (Hempel 1965, 306).
362 Hempel reveals a biological inspiration: “This idea, incidentally, has an interesting parallel in the “principle of multiple solutions” for adaptational problems in evolution. This principle, which has been emphasized by functionally oriented biologists, states that for a given functional problem (. . .) there are usually a variety of possible solutions, and many of these are actually used by different – and often closely related – groups of organisms.” (Hempel 1965, 311).
The
solution he proposes (and which Köhler also uses) requires modification of one premise in the explanans (the whole scheme of functional analysis is in Appendix 19) defining the class of functional equivalents: (c’) I is the class of all empirically sufficient conditions for the fulfillment of requirement n in the context determined by system s in setting c. (Hempel 1965, 312)
However, it is still possible to deduce from the explanans only a weak explanandum, which does not refer to a particular item of the class I, but only to an element of the class I in general: (d’) Some one of the items included in class I is present in system s at time t. (Hempel 1965, 312)
Hempel, therefore, considers the explanatory scope363 of functional analysis to be limited – a weak explanandum specifies only the class of functional equivalents, not a specific item of this class.364 That is why Hempel generally speaks “only” of functional analysis; in fact, he speaks “only” of the explanatory (and predictive) scope of functional analysis. In addition to the problem with functional equivalents, he points out that, in most cases, functional analysis is only able to provide conditional predictions: (d’’) If s functions adequately in a setting of kind c at time t, then some one of the items in class I is present in s at t. (Hempel 1965, 316)
The general law or principle mentioned above, which was adapted by Köhler’s system-theoretical linguistics in accordance with Hempel’s recommendation, would have to be made part of the explanans so that conditional prediction became categorical prediction. We recognize the importance of Köhler’s structural axiom in Hempel’s words: (. . .) a system of the kind under analysis will – either invariably or with high probability – satisfy, by developing appropriate traits, the various functional requirements [emphasis mine] (necessary conditions for its continued adequate operation) that may arise from changes in its internal state or in its environment. Any assertion of this kind, no matter whether of strictly universal or of statistical form, will be called a (general) hypothesis of self-regulation. (Hempel 1965, 317)
Thus, Hempel himself points out that for a valid model of functional explanation, we would need a well-established self-regulation hypothesis. At the same time, in
363 Hempel uses the terms “explanatory import” and “predictive import” of functional analysis (see Hempel 1965, 308–319).
364 “Thus, functional analysis surely does not account in the manner of a deductive argument for the presence of the particular item i that it is meant to explain.” (Hempel 1965, 312).
the spirit of analytic philosophy, he always points out that it is also necessary to have well-defined terms occurring in functional analysis, paying special attention to the term “need”.365 His examination of functional analysis corresponds to the conceptual analysis of the term “emergence”, which he had performed earlier and which is closely related to the issue of self-regulation.366 Overall, we have to state that Hempel remains rather skeptical about the possibilities of functional analysis in view of the above-mentioned limitations – functional equivalents and the self-regulation hypothesis. Therefore, he does not speak explicitly of functional explanation but of functional analysis. He understands functional analysis as a means of overcoming teleological explanations367 and clearly declares that the difficulties of functional analysis can be eliminated only by linking functional analysis, even in the social sciences, to the conceptual means used in the natural sciences: (. . .) the laws of self-regulation themselves are causal [emphasis mine] in the broad sense of asserting that for systems of a specified kind [it is mainly the principle of negative feedback,368 L.Z.], any one of a class of different “initial states“ (any one of the permissible states of disturbance) will lead to the same kind of final state. (. . .) functionalist hypotheses, including those of self-regulation, can be expressed without the use of any teleological phraseology at all. There are, then, no systematic grounds for attributing to functional analysis a character sui generis not found in the hypotheses and theories of the natural sciences and in the explanations and predictions based on them. (Hempel 1965, 326)
365 “It is essential, then, for functional analysis as a scientific procedure that its key concepts be explicitly construed as relative to some standard of survival or adjustment. This standard has to be specified for each functional analysis, and it will usually vary from case to case. In the functional study of a given system s, the standard would be indicated by specifying a certain class or range R of possible states of s, with the understanding that s is to be considered as “surviving in proper working order”, or as “adjusting properly under changing conditions” just in case s remains in, or upon disturbance returns to, some state within the range R. A need, or functional requirement [emphasis mine], of system s relative to R is then a necessary condition for the system’s remaining in, or returning to, a state in R; and the function, relative to R, of an item i in s consists in i’s effecting the satisfaction of some such functional requirement.” (Hempel 1965, 323).
366 It is part of the already mentioned article Hempel, Oppenheim (1948). In the 1990s, Hempel’s pupil Jaegwon Kim performed an analogous conceptual analysis of a modern variant of emergentism – non-reductive physicalism in the philosophy of mind (cf. Stephan 1999, Zámečník 2014).
367 In this regard, Hempel’s following statement is important: “(. . .) what accounts for the present changes of a self-regulating system s is not the “future event” of s being in R, but rather the present disposition of s to return to R; and it is this disposition that is expressed by the hypothesis of self-regulation governing the system s.” (Hempel 1965, 325).
368 “Various systems controlled by negative feedback devices, such as a steam engine whose speed is regulated by a governor, or a homing torpedo, or a plane guided by an automatic pilot, show, within specifiable limits, self-regulation with respect to some particular class of states.” (Hempel 1965, 326).
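The cybernetic content of Hempel’s hypothesis of self-regulation – and of the feedback devices of note 368 – can be made concrete with a toy sketch (our illustration, with arbitrarily chosen constants): a negative-feedback loop returns the system to the privileged range R from any permissible initial state of disturbance, without any reference to a “future event”.

```python
def relax(x0: float, target: float = 1.0, gain: float = 0.3, steps: int = 50) -> float:
    """One negative-feedback loop: each step corrects a fraction (gain)
    of the current deviation from the target state (Hempel's range R)."""
    x = x0
    for _ in range(steps):
        x += gain * (target - x)   # correction proportional to the error
    return x

# Different "initial states of disturbance" all end near the same final state.
finals = [relax(x0) for x0 in (-5.0, 0.0, 3.0, 10.0)]
print(all(abs(x - 1.0) < 1e-6 for x in finals))
# prints: True
```

The deviation shrinks geometrically (by a factor of 1 − gain per step), so the convergence to R is a present disposition of the dynamics – exactly the dispositional reading Hempel gives in note 367.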
Hempel’s – perhaps surprising – comment369 makes us re-evaluate the use of functional explanation in system-theoretical linguistics. In one respect, given the connection of Köhler’s conception to the information/communication turn in linguistics, we can simply state that the principle of self-regulation expressed by the principle of feedback has been a trivial part of system dynamics since the dawn of cybernetics. We can refer to physical and chemical systems that exhibit self-organization as a natural result controlled even by a deterministic algorithm, as reported in the context of chaos theory (cf. Peitgen, Jürgens, Saupe 2004). Then, however, we need to explicate the meaning of the cited causal nexus370 – to which Hempel draws our attention – in the context of linguistics. ✶✶✶ The topic of causality has received relatively much attention in quantitative linguistics, mainly in texts by Juhan Tuldava. He is primarily interested in causality in connection with statistical surveys and the possibility of declaring correlations between phenomena to be traces of a causal nexus. In the text “On Causal Relations in Language” (Tuldava 1995a), Tuldava focuses mainly on the introduction of correlation and regression analysis methods into quantitative linguistics.371 We have to state that we do not consider Tuldava’s interpretation of the observed correlations between system variables (frequency and length, polysemy and length, etc.) as a causal nexus to be clear. Tuldava uses the method of correlation and regression analysis to examine the relationships between the frequency and length of a lexical unit and between polysemy and the length of a lexical unit (we find similar dependences in Köhler, see chapter 5.2 above), but we cannot interpret the conclusions he presents as causal, because these quantities cannot play the role of causal factors; they are operationalized properties of lexical units.
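The kind of correlational evidence at issue can be illustrated with a deliberately tiny example (our toy data, not Tuldava’s corpora): computing a Pearson coefficient between word frequency and word length. The negative r reproduces the familiar frequency–length dependence, but the coefficient itself licenses no causal reading.

```python
import math
from collections import Counter

def pearson(xs: list[float], ys: list[float]) -> float:
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy corpus standing in for the large samples used in real studies.
text = ("the cat sat on the mat the dog saw the cat "
        "a small dog and a big dog ran on the green mat").split()
freq = Counter(text)
freqs = [float(f) for f in freq.values()]
lengths = [float(len(w)) for w in freq]

r = pearson(freqs, lengths)
print(r < 0)   # frequent words tend to be short in this sample
# prints: True
```

The computation establishes an operationalized statistical dependence between two properties of lexical units – nothing more; which variable (if either) is a cause remains entirely outside the reach of the coefficient.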
For example, can we causally interpret the relationship between pressure and thermodynamic temperature in an isochoric process? Thermodynamic temperature and pressure are macroscopic quantities, between which there is a statistical dependence; causal dependences can be stated at the microscopic level (of
369 Which, by the way, is carried in a similar spirit as we already find in Hempel, Oppenheim (1948) in the analysis of explanation in non-physical science.
370 Probably in the sense expressed by Kellert – the network of causal relationships is so complex that we can only uncover the geometry of the system’s behavior and cannot uncover causal dependencies (Kellert 1993).
371 “Causal analysis in its probabilistic treatment can be considered one of the most important subsidiary methods for the description and explanation of complex systems. It calls for the use of statistical methods which, when applied in linguistics, may help discover new heuristic possibilities for the investigation of interrelations and dependences in language.” (Tuldava 1995a, 16).
course, depending on the model we choose; in the case of a quantum model, it will be a problem even at the microscopic level). Similarly, we believe, even in the case of the frequency, length, and number of meanings of a lexical unit, finding a causal nexus would require descending lower and, above all, beyond linguistic plans.372 One of the somewhat confusing problems, for example, is the question of which direction the causal nexus is heading – does higher frequency cause a shorter lexical unit length, or vice versa?373 An interesting discussion related to the problem of the causal nexus and quantification in linguistics took place between Tuldava (Tuldava 1995b, Tuldava 1995c) and Bunge (Bunge 1995a,374 Bunge 1995b). Tuldava causally interprets MAL (Tuldava 1995b, 13), citing the belief that: Causal relations between events are manifest not only in correlations between the various states of events, but also in the dependence between the levels of “uncertainty” (entropy) in a given system. This leads us to informational measures of causality (. . .). (Tuldava 1995b, 12)
Bunge reacts resolutely; the concept of causality makes sense, but in other areas of linguistics (psycholinguistics, historical linguistics) than Tuldava suggests (Bunge 1995b, 15). Bunge declares: I fail to see the relevance of the concepts of causality and probability to the study of linguistic expressions, texts, and languages in themselves, i.e. detached from speakers and linguistic communities. (Bunge 1995b, 15)
Tuldava argues primarily that causality does not have to be tied to diachrony (Tuldava 1995c, 17) and also argues with the probabilistic conception of causality
372 Perhaps in the way chosen by Torre et al. (2019).
373 For example, Tuldava states as one of the conclusions: “The problem of the determination of the direction of the connection (cause → effect) is handled in each concrete case depending on the aims and tasks of the investigation where professional-theoretical viewpoints have been taken as basis. In this way the dominating influence of the feature “frequency of occurrence”, manifesting the effect of many external and concealed factors, was revealed in the system of features under discussion.” (Tuldava 1995a, 40). Tuldava also considers the concept of an “allometric” relation between linguistic features (cf. Tuldava 1995a, 41). The question of the direction of the causal nexus between variables in the lexical system was also addressed by Köhler and Hammerl (see below).
374 Here, Bunge delivers his famous thesis that: “(. . .) only one property is, with all certainty, intrinsically qualitative, namely existence.” (Bunge 1995a, 3). He radically opposes postmodernism in philosophy of science, namely Feyerabend: “These enemies of science have been reassured by Professor Paul Feyerabend (. . .), of epistemological anarchism fame and a major philosophical mentor of the contemporary antiscience movement, that imprecision is fruitful – presumably just because most fruitful ideas are born imprecise. The moral is clear: Prevent the baby from growing up.” (Bunge 1995a, 2). He systematically criticizes the introduction of pseudoquantities into science (cf. Bunge 1995a, 5–8).
(Tuldava 1995c, 18). Without explicitly siding with Bunge, we find that Tuldava’s approach lacks an epistemological evaluation of the proposed procedure. His texts are based on a certain terminological definition of causality; then a statistical method is introduced and applied to a linguistic case, and a causal (or rather “allometric”, see note 373) dependence between linguistic quantities is stated (the procedure is similar in the text Tuldava 1998). We cannot go into more detailed analyses of the concept of causality and causal explanation, but in any case, Tuldava’s conception lacks an explication of how the causal nexus is realized at the level of the language system. We believe that if we tried to explicate this realization, we would arrive at the concept of physical realization, offered to eliminate the problems of functional explanation (see subchapter 5.5.1 below). However, this maneuver would not support Tuldava’s view; rather, it would show that all the “causal” (“allometric”) dependencies between linguistic entities – defined by Tuldava – disappear and are replaced by causal dependencies at a non-linguistic level (cognitive-scientific, neuroscientific, etc.). When defining a functional explanation, Köhler clearly diverges from the causal nexus in some places – in this case between requirements and properties (ZLS, 25) – but admits it elsewhere (ZLS, 26).375 Rather than inconsistency, this suggests a variety of types of relationships between entities in Köhler’s system-theoretical linguistics. We have seen above that Köhler (1990a) actually admits causal and probabilistic dependencies in the language system. Later, Köhler (e.g. QSA, 201) composes deterministic as well as probabilistic dependencies into the schema of the syntactic subsystem.
We believe that a rigorous assessment of the different types of relationships between individual requirements and properties, and their division into "causal" and "probabilistic" ones, should be performed systematically, and the functional model of explanation itself should be adapted accordingly.376 ✶✶✶
375 „So sagen wir vom Inventarumfang des Phonemsystems, er hat eine bestimmte Grösse, weil er aufgrund des Bedürfnisses minG den geringsten Wert annimmt, der bei einem gegebenen Wert der Grösse Phonemähnlichkeit möglich ist. Offenbar handelt es sich dabei nicht um eine Kausalerklärung.“ (ZLS, 25). ["Thus we say of the inventory size of the phoneme system that it has a certain size because, owing to the requirement minG, it assumes the smallest value that is possible at a given value of the quantity phoneme similarity. Obviously, this is not a causal explanation." – translation mine] „Für die Sprache sind jedoch Gesetze, die allgemeine Bedürfnisse der menschlichen Kommunikation (wie die Minimierung des Gedächtnisaufwands) mit einer Eigenschaft des sprachlichen Systems (wie die Grösse des Phoneminventars) kausal verknüpfen, weder bekannt, noch ist es plausibel, die Existenz solcher Gesetze anzunehmen.“ (ZLS, 26). ["For language, however, laws that would causally link general needs of human communication (such as the minimization of memory effort) with a property of the language system (such as the size of the phoneme inventory) are neither known, nor is it plausible to assume that such laws exist." – translation mine] 376 This is beyond the scope of this work. In any case, the relation R defined in the functional explanation would have to be adjusted, probably divided into a group of "causal" and a group of "probabilistic" dependencies. The question is how to keep both types of dependencies within one set of equivalents (e.g. in the syntactic system). Then a problem we saw in Tuldava, when he interpreted MAL as a causal dependence, returns – it would be necessary to clarify what is the
188
5 Functional explanation in quantitative linguistics
Let us now return to the functional model of explanation offered by Köhler in its latest form in QSA.377 For the sake of clarity, we make formal adjustments – we supplement the division into explanans and explanandum – without interfering with the content:
Explanans:
1. The system S is self-organising, i.e. it possesses mechanisms to alter its state and structure according to external requirements.378
2. The requirements N1 . . . Nk have to be met by the system.379
3. The requirement N can be met by the functional equivalents E1 . . . Ef . . . En.
4. The interrelation between those functional equivalents which are able to meet the requirement N is given by the relation RN(EN1 . . . ENn).
5. The structure of the system S can be expressed by means of the relation Q(s1 . . . sm) among the elements si of the system.
Explanandum:
Ef is an element of the system S with load RNf. (QSA, 176)
The first premise in the explanans is referred to by Köhler as the structural axiom (e.g. QSA, 177), which is the most important component of the explanans because it
information measure of causality. Köhler directly declares that: "The dashed lines represent the effect of order parameters on distributions of system variables." (Köhler 2005, 771, Fig. 53.7). Köhler defines order parameters in the context of synergetics as: "(. . .) order parameters are macroscopic entities which determine the behaviour of the microscopic mechanisms without being represented at their level themselves." (Köhler 2005, 761). Does the lexical subsystem remain with "solid lines" only? Can we really interpret all solid lines as causal dependencies (e.g. where the quantities frequency, polysemy, etc. appear)?
377 Köhler seems to be inspired, as he himself states (QSA, 176), by Altmann's text (Altmann 1981).
378 In the chapter "Synergetic linguistics" we find a slightly different formulation of the first premise: "The system S is self-organising. For each need, it possesses mechanisms to alter its state and structure in such a way that the need is met." (Köhler 2005, 765). There should be no difference in content between the two variants; in the newer version, only the term "need" was replaced by the term "requirement" (German texts use "das Bedürfnis"). Perhaps one could say that in the 2005 version a more teleological tone can be heard – we also point this out in Zámečník (2014) – while in the 2012 version we can perceive a certain neutrality. While in the 2005 version the center of the first premise is the self-organizing system, in the 2012 version the central role is reserved for the requirements. Would it be possible to conceive of their "influence" in a causal way? In Hempel, on the other hand, we find the terms "need" and "requirement" used rather synonymously (cf. note 365), although he prefers the term "functional requirement" (cf. Hempel 1965, 317). In Zámečník (2014), we ask whether the latter premise does not implicitly bring a teleological element into Köhler's functional model of explanation.
expresses the explanatory principle necessary to deduce the explanandum and, thus, to establish the D-N nature of functional explanation. Without the structural axiom as a principle, the whole deductive conception of system-theoretical linguistics would collapse. Köhler is aware of the need for such a principle to fulfill the conditions expressed by Hempel (see above and in Appendix 19). Without the structural axiom as a well-defined hypothesis about the self-regulation of the system, there can be no talk of a valid form of explanation. By a hypothesis of self-regulation, Hempel means the following: Such a hypothesis would be to the effect that within a specified range C of circumstances, a given system s (. . .) is self-regulating relative to a specified range R of states; i.e., that after a disturbance which moves s to a state outside R, but which does not shift the internal and external circumstances of s out of the specified range C, the system s will return to a state in R. A system satisfying a hypothesis of this kind might be called self-regulating with respect to R. (Hempel 1965, 324)
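Hempel's formulation is operational rather than metaphysical: self-regulation relative to R can in principle be tested by disturbing the system and observing whether it re-enters R. As a minimal sketch – with an invented linear feedback rule and invented numbers standing in for the system's actual mechanisms, not taken from Hempel or Köhler – the test might look like this:

```python
# Toy illustration of Hempel's "hypothesis of self-regulation": a system s
# is self-regulating with respect to a state range R if, after a disturbance
# that leaves s outside R, its mechanisms return it to a state in R.
# The feedback rule and all numeric values below are invented for illustration.

def step(state, target=0.0, gain=0.5):
    """One feedback cycle: move the state a fraction of the way back
    toward the target (a simple negative-feedback loop)."""
    return state + gain * (target - state)

def returns_to_range(state, low=-1.0, high=1.0, max_steps=100):
    """Test the self-regulation hypothesis for this toy system:
    does the state re-enter the range R = [low, high]?"""
    for _ in range(max_steps):
        if low <= state <= high:
            return True
        state = step(state)
    return False

# A disturbance pushes the state far outside R = [-1, 1], and the feedback
# loop brings it back, so the hypothesis is corroborated for this disturbance:
print(returns_to_range(25.0))  # True
```

The point of the sketch is only that "self-regulating with respect to R" names a checkable behavioral property, not a hidden essence: a disturbance that never returned to R within the stated circumstances would refute the hypothesis for that system.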
This is important because self-regulation is not a metaphysical principle, but a scientific hypothesis. Note that here Hempel does not determine self-regulation absolutely, but only relatively with respect to R – again in connection with his earlier analysis of emergence (Hempel, Oppenheim 1948) – and in this way self-regulation is rid of its "metaphysical aftertaste". At the same time, self-regulation with respect to R allows this hypothesis to be tested – self-regulation is not just an analogy (we have seen that this is what bothers Meyer in subchapter 5.3.2 above) or a conceptual borrowing, but a fully valid principle; the D-N nature of functional analysis is thereby completed, and we can (perhaps) talk about a real functional explanation. Of course, we are eminently interested in the nature of Köhler's structural axiom: whether it is an elaborate scientific hypothesis in the spirit of Hempel's definition, or a general analogy that does not explicitly state the conditions for its refutation. Köhler states the following about the structural axiom and also about the second premise complementing it (i.e. its relation to the requirements): There is one structural axiom which belongs to the synergetic approach itself: the axiom that language is a self-organising and self-regulating system.380 Other axioms take the form of system requirements, (. . .). In synergetic terminology, these requirements are order parameters. They are not part of the system under consideration but are linked to it and have
380 Although it was apparently not Köhler's intention, the separation of self-organization from self-regulation allows an interpretation that combines: (1) self-regulation with Wiener's and Hempel's information-cybernetic concept of the feedback loop and (2) self-organization with Haken's synergetic conception based on the concept of downward causation (see subchapter 5.5.1 below).
some influence on the behaviour of the system. In the terminology of the philosophy of science, they play the role of boundary conditions [emphasis mine]. (Köhler 2012, 177)
We would correct this statement, because it does not seem appropriate to speak of the requirements as axioms. In the D-N model of explanation, axioms are understood as universal principles (laws). The requirements are, as Köhler himself declared, boundary conditions that impose external constraints on the system.381 A better formulation would therefore be that the functional model has one structural axiom into which the requirements enter as conditions. This is entirely consistent with the D-N model, where, as we know, the explanans contains laws and conditions. Of course, it would be possible – and perhaps this is Köhler's idea – to distinguish the universal structural axiom (as a type) and its individual instances in the form of individual laws that system-theoretical linguistics associates with individual requirements. We can recall, for example, Hřebíček's consideration of Zipf's law of least effort (see the Second Interlude above). However, the question remains open whether this would be in line with the original idea of "regulatory effectiveness" (die Steuerungseffektivität, see chapter 5.2 above), which represents a higher-level variable of the system. It is as if the structural axiom were applied here twice – universally at the highest level of "regulatory effectiveness" and specifically for the individual instances associated with individual requirements. The problem is that the realization of the structural axiom in the case of "regulatory effectiveness" is something different from its postulation as an abstract type of self-regulatory action.382 If the first two premises in the explanans establish the principle(s) and conditions in accordance with the D-N model of explanation, then the third premise specifies the ways of implementing each requirement, i.e. the functional equivalents. For each requirement, there has to be a finite, non-zero number of these equivalents.
The basic problem, therefore, is how to determine their number, or how to establish that we already know all possible ways of implementing the function with respect to the given requirement. The situation is all the more complicated because we do not have an overview of the entire language system, but only of some of its subsystems, whose interconnections are only partially known. We therefore cannot know well enough whether, once the whole system is built up, some further possibility of fulfilling the given requirement will not appear.
381 We have seen that in Hempel, functional analysis also involves internal requirements (cf. Hempel 1965, 317). 382 In its own specific way, a structural axiom would also have to be incorporated in the case of Altmann's and Wimmer's Unified Approach.
In principle, therefore, the functional explanation available to system-theoretical linguistics is still only conditional, which, of course, is not unique in science. The situation is further complicated by the fact that the requirements themselves stand in mutual relations – they are hierarchized (see the hierarchy in chapter 5.2 above). It would, therefore, be useful to modify the model of functional explanation in such a way that this hierarchy is reflected in it. Fulfillment of a higher-level requirement can be achieved, for example, through two different lower-level requirements – that is, we would have another specific group of higher-level functional equivalents in connection with the requirements themselves. Köhler's view of a higher-level property and its higher-level requirements is, of course, attractive, but like Luděk Hřebíček's universalist conception (see the Second Interlude above), it is difficult to implement. The requirements are so different (see Appendix 13) that it is difficult to imagine a law (principle) that would clearly link them. Thus, we are afraid, system-theoretical linguistics runs into problems similar to those of the principles and parameters approach in generativism (and in cognitive linguistics). However, let us assume for a moment, for example in the case of the lexical subsystem, that we have managed to define all functional equivalents in relation to a given requirement, or for each requirement separately. Then, with reference to the fourth premise of the explanans, we can create the relation R, which expresses the interrelation between these finally defined equivalents. This completes the structure of the relationship between the structural axiom and the requirements, and the validity conditions of the D-N model of explanation are thus fulfilled. In the last, fifth premise of the explanans, we then only specify the structure of relations among the system elements.
On the other hand, the question remains why this premise is mentioned only at the end.383 The first premise expresses the structural axiom of self-regulation and the second premise sets the requirements constraining the system; perhaps, therefore, a reference to the system structure should already be included in the second premise. However, Köhler probably wants to express that relations (expressed in Q) have been established evolutionarily, in the process of self-regulation under the influence of the requirements, i.e. that the distribution of relations in the system can be defined only once we know all the related (relation R) functional equivalents. Below (in subchapter 5.5.1) we will point out the consequences that the introduction of a strictly synergetic interpretation of the functional explanation would have for this last premise.
383 Certainly, from a logical point of view, the premises are not order-sensitive.
Finding all functional equivalents in relation to a given requirement and finding the relation R are documented in Köhler (1990a, 15–16). In this example, it is the specification requirement (die Bedeutungsspezifikation), which can be fulfilled by four different means: lexical, morphological, syntactic and prosodic (Köhler 1990a, 15). Unfortunately, only one of these means can be formulated explicitly, because in its case we know a parameter that has a specific linguistic interpretation – the parameter T, which expresses the degree of syntheticism in the given language. Above (in chapter 5.2) we met it as a parameter of the relation between polylexy and the length of a lexical unit. The parameter T is: "(. . .) a function of the extent μM to which a language uses morphological means for Specification (. . .)" [translation mine] (Köhler 1990a, 15).384 Similar parameters could be defined hypothetically for lexical means (μL), for syntactic means (μS) and for prosodic means (μP) (Köhler 1990a, 15). And to define the relation R, or R(L, M, S, P), it is sufficient to state that:385 μL + μM + μS + μP = 1 (Köhler 1990a, 16) Köhler thus shows that, regardless of whether we really have the means to do so, we can now hypothetically determine a variant of the explanandum for any of the functional equivalents: (. . .) e.g. derive the specification by prosodic means, following the explanatory scheme, as P is an element of the system S with the form μP = 1 − μS − μM − μL [translation mine] (Köhler 1990a, 16)386
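The relation R here is a simple conservation constraint, so the explanandum for any one equivalent is derivable from the degrees of use of the other three. A minimal sketch (the function name and the numeric degrees of use are our own illustrative assumptions, not Köhler's):

```python
# Sketch of Köhler's relation R(L, M, S, P) for the specification
# requirement: the degrees to which lexical (mu_L), morphological (mu_M),
# syntactic (mu_S) and prosodic (mu_P) means are used sum to 1.

def prosodic_degree(mu_L, mu_M, mu_S):
    """Derive mu_P from the constraint mu_L + mu_M + mu_S + mu_P = 1,
    mirroring the explanandum mu_P = 1 - mu_S - mu_M - mu_L."""
    mu_P = 1.0 - mu_S - mu_M - mu_L
    if not 0.0 <= mu_P <= 1.0:
        raise ValueError("degrees of use must lie between 0 and 1")
    return mu_P

# Hypothetical degrees of use for one language:
mu_L, mu_M, mu_S = 0.4, 0.3, 0.2
mu_P = prosodic_degree(mu_L, mu_M, mu_S)
print(round(mu_P, 10))  # 0.1
# The relation R holds (up to floating-point rounding):
print(abs(mu_L + mu_M + mu_S + mu_P - 1.0) < 1e-9)  # True
```

As Köhler himself notes (see note 386), further equations would be needed to determine the four unknowns numerically; the constraint alone fixes any one degree only relative to the other three.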
384 „(. . .) eine Funktion des Ausmasses μM, in dem eine Sprache morphologische Mittel zur Spezifikation heranzieht (. . .).“ 385 Köhler assumes that: „1. Lexikalische, morphologische, syntaktische und prosodische Mittel zur Spezifikation schliessen sich gegenseitig nicht aus. 2. Die Summe aller Spezifikationsvorgänge ist die Summe der Verwendungen der vier Mittel.“ (Köhler 1990a, 16). ["1. Lexical, morphological, syntactic and prosodic means of specification do not exclude one another. 2. The sum of all specification processes is the sum of the uses of the four means." – translation mine] 386 „(. . .) z.B. der Spezifikation durch prosodische Mittel, dem Erklärungsschema folgend, als P ist Element des Systems S mit der Ausprägung μP = 1 − μS − μM − μL ableiten.“ Köhler modestly adds a comment: „Zur numerischen Bestimmungen der vier Unbekannten wären allerdings weitere Gleichungen erforderlich, die uns leider noch nicht zur Verfügung stehen. Linguistisch ausgedrückt stellt sich die Frage danach, wie die Sprachen die Verwendungsgrade der möglichen Hilfsmittel steuern – eine Frage, die heute noch nicht beantwortet werden kann.“ (Köhler 1990a, 16). ["For the numerical determination of the four unknowns, however, further equations would be required, which unfortunately are not yet available to us. Put in linguistic terms, the question arises how languages control the degrees of use of the possible means – a question that cannot yet be answered today." – translation mine]
However, Köhler still admits387 that the problem with functional equivalents is acute. In principle, each time another subsystem is added, the whole hierarchy of requirements and the relations among functional equivalents are rearranged. Therefore, individual functional explanations may work to a limited extent for subsystems, but they are only explanation sketches. In addition, the whole situation is complicated by the limited number of truly linguistically interpreted parameters (see chapter 5.2 above). A wide-ranging debate is held over each of them, and dozens of experiments with texts concern them.388 Thus, we can hypothetically define, for example, the above μP, but we do not know a parameter Π for it that would have a linguistic interpretation. To some extent, the situation in system-theoretical linguistics is similar to the one in theoretical physics – the final theory (of all basic physical interactions) is sought – but its conceptual possibilities are almost limitless, and the limitations given by the cosmological principle (by the assumed principles of symmetries) are so permissive that the final theory can take an astronomical number of forms.389 ✶✶✶ Thus, we have defined a functional explanation in system-theoretical linguistics. We can certainly state that it significantly specifies and, above all, formalizes the somewhat general definition of functional explanation found in Haspelmath and Newmeyer (see chapter 5.1 above). It is easy to interpret Köhler's explanation as an instance of the principle-based model of explanation that we tried to construct for all previous linguistic explanation models. We have already performed this schematization above – in its explanans we simply identify the principle (the structural axiom) and the conditions (the specific requirements, the functional equivalents and the relations between them, the structure of the system), and the explanandum is deductively derivable from the explanans.
Let us admit for a moment that we have managed to solve the problem with functional equivalents – for example for a certain subsystem – and that we have identified all variables and all requirements, and even all parameters can be interpreted linguistically. In such a case, is it possible to state that the principle-based model of explanation is well established? The answer certainly depends on the nature of the principle that appears in the explanans. Above, we have already considered the variants offered by Altmann and Wimmer in the Unified Approach and by Hřebíček
387 According to personal communication (Köhler, Trier 2018, 2019). 388 Let us recall a more recent discussion concerning the parameter b in MAL, which was addressed intensively by Andres in connection with Hřebíček (see the Second Interlude above). 389 In the case of M-theory, some time ago there was talk of 10¹⁰⁰ possibilities (cf. Baggott 2013, 182–207).
when considering the relationship between MAL and the symmetry principles (see chapter 5.3 and the Second Interlude). Therefore, we will now focus purely on the structural axiom as the principle of self-organization and self-regulation. We should recall that the proposed functional approach can be understood as a real explanation only provided that it overcomes the pitfalls mentioned by Hempel, who contented himself with stating that he was performing a functional analysis. The answer will depend on how we interpret the structural axiom. It was essential for Hempel that functional analysis, if it is to have predictive power, has to be based on a well-established hypothesis of self-regulation. At the same time, Hempel does not hide the fact that even in the case of self-regulation, a kind of causal action is actually at work (he meant the feedback system, see above). Let us, therefore, ask whether, for Köhler, functional explanation corresponds more to a functional analysis which points to causal relationships present behind self-regulation, but which are difficult to analyze (or which, from a practical point of view, cannot be analyzed by humans),390 or whether Köhler's functional explanation is based on the principle of self-regulation with reference to synergetics, because he believes that there is, indeed, a specific kind of causal effect of the whole imposed on its parts – of the macro-level on the micro-level – as assumed by Haken's synergetics (see Appendix 17). We set aside as unlikely the possibility that Köhler consciously sought a mere functional description, because he often proclaims that he wants to build linguistics as an explanatory discipline. The answer is significant because it is linked to our constant question of whether the dilemma of linguistics can be resolved. If we accept the first option, i.e.
functional analysis as a way of talking about an unattainable causal network, then we retain the explanatory nature of the established theory, but we again delegate its explanatory potential – as in the cases of functional explanations in cognitive-linguistic (and psycholinguistic) approaches – outside linguistics itself (to neuroscience, biology, etc.). It is sometimes forgotten that there is, indeed, a difference between the interpretation that establishes functional analysis as a means of reflecting hidden causal relationships (see also Hempel) and the assumption that the synergetic principle refers to a new kind of causal action. Although both are referred to as the causal nexus, the former represents the standard concept of causality, grounded at the basic physical level and propagating itself from that level upwards, while the latter postulates a new type of causal nexus, most commonly referred to as downward causation.
390 Here we would get close to the concept of transcendental impossibility, which was introduced by Kellert (1993) in connection with the theory of dynamical systems.
Both of these approaches suffer from specific problems, which also have an impact when they are applied in system-theoretical linguistics. To assume that we can unravel the causal network leading from the physical level of description, through the biological to the neural – and the mental, with which we can surely associate language competence – is certainly bold and unrealistic. In philosophy of mind, much effort has been devoted in recent decades to contemplating the relationship between the brain and the mind, on the key issue of reconciling physicalism – the view that all that exists at the basic level of description are physical entities in mutual interactions – with the autonomy of the mental level. What seems to be a realistic maximum in relation to the physical explanation of the mental level (and thus also of the level of speakers) was formulated by Jaegwon Kim as the procedure of functional reduction. According to this strategy, we assume that for every function exhibited at the higher levels of the system's organization there exists a – known or unknown – physical realizer which determines it in the upward direction (see Zámečník 2014).
5.5.1 Reassessment of the structural axiom
We believe that in Köhler's system-theoretical linguistics, and in its reception, we can find textual support for a total of three variants of functional explanation in terms of the interpretation of the structural axiom, which we have already indicated in the previous chapters. These are: (1) functional reduction, (2) the strictly synergetic interpretation and (3) functional description. We will describe all three of these variants in terms of their advantages and disadvantages. Unfortunately, only the second variant solves the linguistic dilemma positively; the first leaves the explanatory task to non-linguistic factors, and the third offers both a variant of linguistic description and a variant referring to non-linguistic sources of explanation. ✶✶✶ We consider the method of functional reduction (1) (a detailed analysis of this variant is presented in Zámečník 2014),391 i.e. the postulation of physical realizers which determine
391 We believe that this is also the path that Herbert Simon offers in the book The Sciences of the Artificial (Simon 1969). Referring to the importance of functional explanation, Simon points out: "An important fact about this kind of explanation is that it demands an understanding mainly of the outer environment. (. . .) thus the first advantage of dividing outer from inner environment in studying an adaptive or artificial system is that we can often predict behavior from knowledge of the system's goals and its outer environment, with only minimal assumptions about the inner environment. An instant corollary is that we often find quite different inner
linguistic functional relations upwardly,392 to be the most suitable way of interpreting the functional explanation393 in Köhler's conception. We can relate it to Köhler's reflections on the complexity of causal relations in the context of linguistics (cf. Köhler 1990a, 13–15). It is also applicable to the interpretation of some relations in the lexical and syntactic subsystems of Köhler's system-theoretical linguistics (see chapter 5.2 above).394 However, in connection with the texts of Juhan Tuldava, we have already mentioned the difficulties associated with arguing from causal dependence in linguistics. If functional reduction were carried out, then, radically, all functions would be related to their cognitive and biological realizers. Although Köhler himself does not do so, the register hypothesis could be interpreted in this way, since Köhler draws an analogy between it and "memory space" (QSA, 84). Of course, as we have already indicated, this would close the way to a truly linguistic explanation – it would be delegated to biology and the cognitive sciences.
environments, accomplishing identical or similar goals in identical or similar outer environments – airplanes and birds, dolphins and tuna fish (. . .)." (Simon 1996, 7–8). He encounters the problem of functional reduction when he defines the limits of the adaptability of systems: "(. . .) if we could always specify a protean inner system that would take on exactly the shape of the task environment, designing would be synonymous with wishing. "Means for scratching diamonds" defines a design objective, an objective that might be attained with the use of many different substances. But the design has not been achieved until we have discovered at least one realizable inner system obeying the ordinary natural laws – one material, in this case, hard enough to scratch diamonds." (Simon 1996, 12).
Kim expresses functional reduction in this way: "(. . .) [functional] reduction can be understood as consisting of three steps. The first is the conceptual step of interpreting, or reinterpreting, the property to be reduced as a functional property, that is, in terms of the causal work it is supposed to perform. Once this has been done, scientific work can begin in search of the "realizers" of the functional property – that is, the mechanisms or properties that actually perform the specified causal work – in the population of interest to us. The third step consists in developing an explanation at the lower, reductive level of how these mechanisms perform the assigned causal work. (. . .) That is, if anything has the functionalized property, it follows that it instantiates some lower-level physical realizer, and it must in principle be possible for scientific investigation to identify it. (. . .) That a property is functionalizable – that is, it can be defined in terms of causal role – is necessary and sufficient for functional reducibility.
It is only when we want to claim that the property has been reduced (for a given system) that we need to have identified its physical realizer (for that system)." (Kim 2005, 164–165). At the same time, we believe that this is the approach that would be most acceptable to Hempel, or rather that it expresses the fulfillment of his interpretation of the "hypothesis of self-regulation". There are simply so many feedback loops that we are not able to follow them all (see above and cf. Hempel 1965, 326). In a consultation (Trier, 2018), Köhler stated that he believes the causal nexus to be fundamental even for linguistics.
At the same time, however, this path would be free of the conceptual and logical problems that we encounter when considering downward causation (as we will see below). In addition, it could be argued that through this path system-theoretical linguistics converges with current modified generativism – in the form of the principles and parameters approach (see chapter 4.2 above). Many quantitative linguists would certainly not have a problem with this solution – giving up a purely linguistic explanation would be redeemed by the usefulness of quantitative-linguistic methodology in the spirit of Altmann and Wimmer's Unified Approach, in connection with the increasing use of computational methods in many (not only) linguistic areas.395 As we have already mentioned, the problem of the (non)causal interpretation of relations among entities in Köhler's system-theoretical linguistics was pointed out at the turn of the 1980s and 1990s by Hammerl and Maj (see notes 240 and 373 above). The problem is that Köhler seems to assume a causal nexus directly between individual lexical quantities. Hammerl points out that: Before examining the relationships between different properties (. . .) one must also be clear about the "general type" of these dependencies, i.e. decide whether these dependencies can be interpreted as causal dependencies in only one dependency direction (e.g. changes in frequency cause changes in the average length of lexical items), as causal dependencies in both directions (. . .) and under which conditions, or as non-causal dependencies. [translation mine] (Hammerl 1990, 22–23)396
Hammerl points out that Köhler uses the term cause – precisely when describing the relationship between the frequency and the length of a lexical unit – without defining it (Hammerl 1990, 23).397 He also points to the overly static character of Köhler's conception (in ZLS), which largely lacks a dynamic aspect – something Hammerl tries to correct in the text mentioned (and, together with Maj, also in others; cf. Hammerl, Maj 1989).
395 Data Science, Digital Social Science and Digital Humanities (cf. Zámečník 2022). 396 „Vor der Untersuchung der Zusammenhänge zwischen verschiedenen Eigenschaften (. . .) muss man sich auch über den „allgemeinen Typ“ dieser Abhängigkeiten Klarheit verschaffen, d.h. entscheiden, ob diese Abhängigkeiten als kausale Abhängigkeiten in nur einer Abhängigkeitsrichtung interpretiert werden können (z.B. Veränderungen der Frequenz verursachen Veränderungen der mittleren Länge lexikalischer Einheiten), als kausale Abhängigkeiten in beiden Richtungen (. . .) und unter welchen Bedingungen oder als nichtkausale Abhängigkeiten.“ 397 And indeed, Köhler states: „Über die Form dieser Abhängigkeit herrscht allerdings keine Übereinstimmung – selbst ihre Richtung ist umstritten (. . .). Eine Aufgabe der vorliegenden Arbeit wird der Versuch sein, im Rahmen des hier entwickelten Modelle eine befriedigende Antwort auf diese Frage zu geben.“ (ZLS, 10). ["There is, however, no agreement about the form of this dependence – even its direction is disputed (. . .). One task of the present work will be the attempt to give a satisfactory answer to this question within the framework of the model developed here." – translation mine] In a more detailed description of the relationship between the length and frequency of a lexical unit, Köhler states: „Diese Abhängigkeit betrachten wir als Funktion eines weiteren Systembedürfnisses und postuliert daher die Minimierung des Produktionsaufwands (minP) als Ursache [emphasis mine] dafür.“ (ZLS, 70). ["We regard this dependence as a function of a further system requirement and therefore postulate the minimization of production effort (minP) as its cause [emphasis mine]." – translation mine]
Köhler rejects most of Hammerl's and Maj's criticisms (cf. Köhler 1990b).398 However, he admits that it is complicated to interpret the relationship between frequency and length causally: To me, the approach of modeling the change in length and frequency as a stochastic process seems fruitful for this purpose: at every point in time t there is a probability that for a given lexical unit with length L and frequency F a new variant with length L′ (shorter or longer than L) arises at time t + 1. This probability depends on F and L at t. [translation mine] (Köhler 1990b, 45)399
Here we can see Köhler moving towards a stochastic interpretation of (some) relations between variables in the lexical subsystem. However, as we stated above (see chapter 5.2 and note 376), Köhler in QSA (24) distinguishes between functional, distribution and developmental laws. We believe that, at least for functional laws, the question of the direction of the relation among variables (which Hammerl points out above) remains problematic and unresolved. Consequently, we cannot provide a definitive answer about the role of causality in Köhler’s conception. Indeed, Köhler does not distinguish between cases when he speaks of a directed relation between system variables and cases when the cause is clearly a requirement, or what he refers to as an “order parameter” (cf. ZLS, 70; QSA, 170–173, 177, 190, 199). This brings us back to our belief (see chapter 5.2 above) that Köhler probably does not reflect the specific status of causality in synergetics.
✶✶✶
The other possibility is the strictly synergetic interpretation (2) of Köhler’s system-theoretical linguistics. It would mean taking the name of synergetic linguistics literally and claiming strictly that there is a special causal nexus running from a higher level (macro level) of the system to a lower level (micro level) of the system – downward causation. This interpretation is supported by Köhler’s references to the influence of requirements as order parameters (die Ordners) in the sense of applying the principle of enslaving (die Versklavung):
398 He argues that they do not reflect that Köhler’s basis model of the lexicon is based on system dynamics (cf. Köhler 1990b, 41–43), that it articulates real feedback and self-regulation (Köhler 1990b, 43–44) and that their variant of the basis model is contradictory (in more detail in Köhler 1990b, 44–45). Despite the disagreement, Hammerl published his book: Hammerl (1991).
399 „Mir erscheint für diesen Zweck der Ansatz fruchtbar, die Veränderung von Länge und Frequenz als stochastischen Prozess zu modellieren: Zu jedem Zeitpunkt t besteht eine Wahrscheinlichkeit dafür, dass zu einer gegebenen lexikalischen Einheit mit der Länge L und der Frequenz F eine neue Variante mit der Länge L′ (kürzer oder länger als L) zum Zeitpunkt t + 1 entsteht. Diese Wahrscheinlichkeit hängt von F und L zu t ab.“
5.5 Functional explanation in system-theoretical linguistics
199
Other crucial elements of synergetics are the enslaving principle and the order parameters: if a process A follows dynamically another process B, it is called enslaved by B; order parameters are macroscopic entities which determine the behaviour of the microscopic mechanisms without being represented on their level themselves. (QSA, 170; Köhler 2005, 761)
We have already stated that Haken’s synergetics (see chapter 5.3 above) can express even the enslaving principle in a mathematical form (see Appendix 17). If we could straightforwardly accept Haken’s synergetics as a means of conceiving self-organization, then we could interpret the structural axiom as a synergetic principle that establishes downward causation. At that moment, we could build a valid principle-based model of explanation and, at the same time, remain within a purely linguistic explanation – because we would identify not only an analogy of the enslaving principle, but also its concrete implementation in the hierarchy of linguistic levels. We would thus solve the explanatory linguistic dilemma successfully. We believe that the functional model of explanation should in this case be supplemented by adding another premise to the explanans, which would specify the enslaving principle, i.e. the ability of the order parameters to affect the lower level of the system causally. In total, we would modify the functional model of explanation as follows:
Explanans:
1. The system S is self-organising, i.e. it possesses mechanisms to alter its state and structure according to external requirements.
2. The requirements N1 . . . Nk have to be met by the system.
3. The requirement N can be met by the functional equivalents E1 . . . Ef . . . En.
4. The interrelation between those functional equivalents which are able to meet the requirement N is given by the relation RN(EN1 . . . ENn).
5. The structure of the system S can be expressed by means of the relation Q(s1 . . . sm) among the elements si of the system.
✶6. In the system S, the Ordners O1 . . . Oo enslave the elements s1 . . . sm of the system and bring about the transformation of the relation Q(s1 . . . sm) into Q′(s1 . . . sm).400
Explanandum:
Ef is an element of the system S with load RNf.
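For orientation, the mathematical core of the enslaving principle in Haken’s synergetics (treated in Appendix 17) can be sketched in its standard two-mode textbook form; this is a generic illustration, not Köhler’s own formalism:

```latex
% A slow order parameter \xi (near-zero growth rate) and a fast,
% strongly damped micro-level mode s:
\begin{aligned}
\dot{\xi} &= \lambda_u \,\xi + N_u(\xi, s), &\quad \lambda_u &\approx 0,\\
\dot{s}   &= \lambda_s \,s + N_s(\xi, s),   &\quad \lambda_s &< 0,\ \ |\lambda_s| \gg |\lambda_u|.
\end{aligned}
% Since s relaxes much faster than \xi changes, one sets \dot{s} \approx 0
% (adiabatic elimination) and solves for the enslaved mode:
\begin{equation*}
s \;\approx\; -\frac{N_s(\xi, s)}{\lambda_s} \;=\; f(\xi).
\end{equation*}
```

The micro-level mode s is thus fully determined (“enslaved”) by the macroscopic order parameter ξ; this is the relation that premise ✶6 would have to implement linguistically.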
400 In this form (with minor formal differences), we introduced the sixth premise in Zámečník (2014, 77).
The introduction of premise ✶6 completes the synergetic model of explanation. If it were counter-argued that premise ✶6 is implicitly present in the original variant of the functional explanation, then it can be replied that the difference lies precisely in the ambiguity of interpretation of the original model. Now, we believe, it is no longer possible to doubt that the structural axiom can be understood as an explanatory principle articulating downward causation. The disadvantage of the modification is, of course, the greater complexity of the model: it introduces a class of Ordners (the order parameters) that is not mentioned in the original. On the other hand, this points to a certain inconsistency of the original model, because in a number of texts (see above) Ordners are, of course, mentioned. A direct identification of Ordners with requirements would be elegant. Then the additional premise could be expressed as follows:
✶✶6. In the system S, the requirements N1 . . . Nk enslave the elements s1 . . . sm of the system and bring about the transformation of the relation Q(s1 . . . sm) into Q′(s1 . . . sm).
However, we are not sure whether we can easily identify Ordners with requirements, although Köhler does so in several places (e.g. QSA, 171–172, 177). The problem is (at least) the complicated set of requirements and their diverse nature. We have already pointed out the interpretation of requirements as boundary conditions (see above). We have to state that a comparison, or identification, of Ordners with boundary conditions would require further justification. Boundary conditions in scientific theories (most often in physics) normally do not play the role of order parameters because they have no downward-causal potency. It seems to us that there are indeed two different concepts in Köhler’s view: one that represents the standard explanation of a standard causal nexus (perhaps in the sense of Hempel’s self-regulation via feedback loop) and one that relies on downward causation. And it is in the quote below (which we have already supplemented above) that these two concepts meet and cause conceptual tension: In synergetic terminology, these requirements are order parameters. They are not part of the system under consideration but are linked to it and have some influence on the behaviour of the system. In the terminology of the philosophy of science, they play the role of boundary conditions. [emphasis mine] (QSA, 177)401
On the one hand, we face the problem of how to interpret various requirements systematically as Ordners and, on the other hand, even if it succeeds, how to treat
401 Of course, boundary conditions do not necessarily appear only in causal explanations, but it seems to us that Köhler uses them here in this way.
downward causation conceptually. If we survey the individual variants of requirements (see chapter 5.2 above, see Appendix 13) that occur in the lexical and syntactic systems, it is difficult to find a simple way of identifying them universally with Ordners. The requirements themselves in most cases do not stand out as higher-level properties. Rather, they act as “causes” or “forces” (let us recall Hřebíček’s interpretation of “least effort”), which, moreover, often affect very specific places of the linguistic system or its subsystems. The interpretation of Ordners (the order parameters) should actually be based more on individual language plans, as indicated in the MAL concept (see subchapters 5.2.1 and 5.2.2), where the level of the construct affects the form of constituents in the linguistic hierarchy (. . . – syllables – lexical units – sentences – hrebs – . . .). From this point of view, however, we run into an excessive generality of interpretation – the only theory available is actually Köhler’s register hypothesis, in which we can hardly interpret individual requirements specifically. Perhaps it would be possible to consider linking the requirements (order parameters) into a single group through MAL, which would be interpreted as an expression of the “pressure”, or enslaving, exerted by a higher-level property on a lower-level one. However, even here we get into difficulties: we understand the language system as one made up of all hierarchized units, which allows us to imagine the “pressure” of higher levels; yet when modeling system-theoretical linguistics, we connect requirements with individual linguistic quantities, not with elements of a hierarchized system. In addition, we pointed out above that Köhler even considers deterministic (and stochastic) relations among system elements in the direction leading away from the requirements. We therefore believe that we cannot simply identify the class of requirements with the class of order parameters (Ordners).
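For concreteness, MAL in its simplest form relates construct size x to mean constituent size y by a power law y = a·x^b (with b < 0 in the typical case of constituents shrinking as constructs grow). The sketch below is our own illustration, not a formula or program from Köhler or Hřebíček; the data points are fabricated from an exact power law solely to show how the parameters are recovered by log-log linearisation:

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b via log-log linearisation."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mx = sum(lx) / n
    my = sum(ly) / n
    # Ordinary least squares on the linearised model log y = log a + b log x.
    b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sum((u - mx) ** 2 for u in lx)
    a = math.exp(my - b * mx)
    return a, b

# Toy data generated from y = 3 * x**-0.4 (exact, so the fit recovers it).
xs = [1, 2, 3, 4, 5]
ys = [3 * x ** -0.4 for x in xs]
a, b = fit_power_law(xs, ys)
print(round(a, 3), round(b, 3))  # -> 3.0 -0.4
```

On real linguistic data the points scatter around the fitted line, and the full three-parameter MAL (y = a·x^b·e^{c·x}) requires nonlinear fitting instead of this closed-form shortcut.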
Moreover, we fear that the class of order parameters cannot be formulated adequately in system-theoretical linguistics because Köhler’s approach, as developed both on the lexical and the syntactic level, corresponds much more to the information-theoretical and cybernetic conception of the system than to the synergetic one. Synergetic terminology was, we believe, simply borrowed as a trendy analogy. However, there must have been a certain motivation for seeing language as an analogue of a living, evolving system. The reason is, we believe, that theories such as synergetics (non-equilibrium thermodynamics, etc.) could be seen as a means of grasping life in an exact, physics-oriented (even mathematically grounded) way. Here, however, it should be noted that the ambitions thus associated with synergetics have not been fulfilled. None of the holistic theories (criticizing scientific reductionism) prevailed; the concept of emergent quality remained very useful at the methodological level, but did not
change the ontology of biology (in Quine’s words); no self-organization theory has replaced standard molecular biology.402 However, let us admit that it is still possible to formulate a relationship between requirements and order parameters (Ordners) and that it is possible to transform and develop the register hypothesis so that it expresses the enslaving of a lower language level by a higher one. Let us now consider the other problem, embodied in the concept of downward causation, which this step would force us to postulate as a necessary hypothesis. We dealt in detail with the critique of downward causation in a previous text, where we presented Haken’s synergetics in the context of theories of self-organization and, more broadly, of emergentism (cf. Zámečník 2014, see also Stephan 1999, 232–246). Although Köhler does not explicitly use the term “downward causation” – he speaks of “enslaving” (die Versklavung) – the concept of downward causation cannot be avoided. The problem, however, is that this concept is difficult to accept. Stephan (1999) points out that in Haken we find two forms of interpretation of synergetics. The first is “descriptive”: it expresses the possibility of describing system properties and understanding the behavior of the system on the basis of macroscopic quantities – the Ordners – without having to examine the micro-level.403 With the more radical “causal thesis”, however, Haken introduces the idea that the Ordners causally affect the micro-level, the building blocks of the system (Stephan 1999, 234–235).
Stephan criticizes synergetics in its causal interpretation because it has failed to explicate this new form of causality.404 Stephan directly states: Furthermore, synergetics does not justify a scientifically supported form of “downward causation”, since the interpretation of the mathematical formalism, which leads from an inferential relationship to a causal relation directed downwards, is not plausibly justified. [translation mine] (Stephan 1999, 238)405
402 Today’s alternatives to the neo-Darwinian synthesis are completely different, moving either towards specifying the meaning of the biological code as the basis of living systems (and its relation to, for example, the language code), as in code biology (Barbieri 2015), or towards a semiotic interpretation of living systems, where the sign (not just code) nature of living systems comes to the fore (e.g. Emmeche, Kull [eds.] 2011, Markoš 2002).
403 This is a standard procedure in a number of sciences, starting with classical thermodynamics, as we pointed out above.
404 Haken also introduces a mathematical formalization; we write about it in more detail in Zámečník (2014), see also Appendix 17.
405 „Ferner begründet die Synergetik keine naturwissenschaftlich gestützte Form der “downward causation”, da die Interpretation des mathematischen Formalismus, die von einer Folgerungsbeziehung zu einer abwärts gerichteten Kausalrelation führt, nicht plausibel begründet ist.“
Even more important for us is that the critique of synergetics relates more generally to the critique of non-reductive approaches – holism and similar views – that were popular in the 1980s.406 These approaches generally sought to combine a physicalist ontology with a non-reductive understanding of higher-level properties. Non-reductive physicalism in the philosophy of mind was probably the most popular of them. At the turn of the 1980s and 1990s, Jaegwon Kim (for an overview see Kim 2005) analyzed this view systematically and concluded that non-reductive physicalisms are unable to resolve the conflict between downward causation and upward determination between system levels. The concept of downward causation comes into conflict with the concept of the causal nexus at the physical level, which is essential for physicalism.407 Kim’s critique has shown that non-reductive physicalism (and we believe this to be the case with basically all analogous approaches, including synergetics) is internally incoherent.408
✶✶✶
So far, we have indicated the path (1) of functional reduction and the path (2) of the strictly synergetic interpretation of the functional explanation. We have shown that the first path leads again to non-linguistic explanations, while the second leads to the unclear concept of downward causation. Although Köhler primarily refers to Hempel as a source of reflection on functional explanations, it should be noted that this model of explanation was explored by a number of authors in the following decades, and a whole typology of variants of functional explanation was proposed (see Garson 2008 for a summary). Previously, we examined this typology in relation to system-theoretical linguistics and concluded that it provides an opportunity to modify the structural axiom in Köhler’s model of explanation in two variants (see in detail Benešová, Faltýnek, Zámečník 2018).
The third way (3) of trying to solve the problems of functional explanation in system-theoretical linguistics unfortunately leads back to both horns of our dilemma (unlike the first way), this time also to a functional analysis in terms of systemic description. The two basic elements of the typology of functions are the consequentialist view and the etiological view. Both are further subdivided into variants, which will not all be described here; we will focus only on those
It is also related to the concept of New Age Science, (cf. Hanegraaff 1996). Of course, like any conceptual analysis, this one also has its limits. Kim associates the concept of the causal nexus with the physical level systematically, but of course we have considered above (in chapter 2.2) the importance of non-causal explanations in physics. We express ourselves briefly here because we have already argued this in detail (see Zámečník 2014). It is fair to say that Kim’s argument is subtle, because in fact he primarily cares about the fact that the higher (mental) level is able to act causally (cf. Kim 2005, 39–52).
variants that we consider applicable in the context of system-theoretical linguistics (for more details, see Garson 2008).409 The first of the applicable variants is the interest-contribution (I-C) consequentialist view of function in system-theoretical linguistics, the other is the non-representationalist (N-R) etiological view of function in system-theoretical linguistics.410 An advantage of the I-C view is that it is unproblematically constructable because in it: “(. . .) functions refer primarily to a distinctive style of explanation (“functional analysis”), and only secondarily to a distinctive object of study” (Garson 2008, 538). This conception, often referred to as the “systemic capacity” view of function (also as the Cummins function, cf. Garson 2008, 538–539), focuses on the role of the researcher who performs functional analysis according to the specific focus of their research questions. This allows for a plurality of possible functional interpretations, which may not necessarily be compatible (cf. Benešová, Faltýnek, Zámečník 2018, 17–18). For our purposes, however, it is advantageous that a purely descriptive functional analysis is sufficient here, one which does not aspire to the formulation of some universal (e.g. causal) principle that would establish an explanation (cf. Benešová, Faltýnek, Zámečník 2018, 21). We also asked ourselves whether this is not the only correct conception of functional “explanation” (analysis) that we can build without unpleasant ontological commitments (such as teleology,411 downward causation, etc.) (Benešová, Faltýnek, Zámečník 2018, 21).
409 The basic division of the etiological view (EV) is into the representationalist EV (further subdivided into mentalistic and non-mentalistic variants, which are not suitable for our purposes) and the non-representationalist EV (applicable to our needs, see below). The basic division of the consequentialist view is into four groups, of which the latter three are very similar; they are: the interest-contributions (applicable to our needs, see below), goal-contributions, good-contributions and fitness-contributions views (cf. Garson 2008, Benešová, Faltýnek, Zámečník 2018).
410 We have previously pointed out the ambiguity of functional explanation in Köhler’s conception: “It seems to us that the use of the function by SL and functional explanation oscillate between etiological and consequentionalist views. For example when generally speaking about semiotic systems, Köhler, R. et al. say that: “An explanation of existence, properties, and changes of semiotic systems is not possible without the aspect of the (dynamical) interdependence between the structure and function”, (Köhler, Altmann, Piotrowski 2005, 761) which sounds like the etiological view. In another place, it seems like the consequentionalist view of the functional analysis: “The elements under consideration have become a part of the language system, because they possess certain properties and have certain functions within the system.” (Köhler, Altmann, Piotrowski 2005, 762) However, in another place, they state that “this type of explanation /it is functional explanation in linguistics/ is a special case of the D-N explanation”, (Köhler, Altmann, Piotrowski 2005, 765) which may be possible, if ever, only in some very strong reductive type of the etiological view.” (Benešová, Faltýnek, Zámečník 2018, 18).
The other option is the N-R etiological view, which builds the view of function on the “selected effect”, where: “(. . .) having a function means having been selected for by natural selection” (Garson 2008, 533) and is buildable on Millikan’s proper function (Garson 2008, 533): (. . .) for an item A to have a function F as a ‘proper function’, it is necessary (and close to sufficient) that (. . .) A originated as a ‘reproduction’ (. . .) of some prior item or items that, due in part to possession of the properties reproduced, have actually performed F in the past, and A exists because (causally, historically because) of this or these performances. (Griffiths, 1993, 413)
The advantage of this conception is its tracking of causal history – the functional explanation in this form reflects the causal nexus and does not give up on finding it, as the I-C view of functional analysis does. Both in way (1) and in this variant of way (3), we find an explanatory basis outside linguistics. The difference between them is that way (1) represents a synchronic solution while this variant of way (3) represents a diachronic one. Equipped with these two conceptual means (the I-C and N-R views), we can now proceed to reformulating the first premise (the structural axiom) in the explanans of Köhler’s functional explanation. Let us recall Köhler’s original premise: “1. The system S is self-organising, i.e. it possesses mechanisms to alter its state and structure according to external requirements.” (QSA, 176)
In accordance with the I-C view, we can formulate the first premise as follows:
1.✶ The systemic capacity (or Cummins function) of system S enables system S to alter its state and structure according to the external requirements.
In accordance with the N-R view, then, as follows:
1.✶✶ The proper function of system S with an evolutionary history is to alter its state and structure according to the external requirements.412
411 Specifically, we write about Mayr’s teleonomy, cf. Benešová, Faltýnek, Zámečník (2018, 21).
412 In Benešová, Faltýnek, Zámečník (2018), we formulate both variants with reference to Köhler’s formulation of the functional explanation from Köhler (2005) as follows: “(1)✶ The systemic capacity (or Cummins function) of system S enables to system S, for each need, the changes in its state and structure so that the need is met. (. . .) (1)✶✶ The proper function of system S with an evolutionary history is that, for each need, it possesses mechanisms to alter its state and structure in such a way that the need is met.” (Benešová, Faltýnek, Zámečník 2018, 21–22).
We believe that these two proposed solutions are the maximum that we can extract from the application of functional explanation if it is to be in line with the current state of conceptual analysis in the philosophy of science, which also reflects the state of scientific research with respect to some older theories (synergetics, the theory of self-organization, etc.). We should accept that the structural axiom is troublesome and that a detailed explication of functional analysis does not offer a satisfactory way of transforming it into some kind of “new causal nexus”. What remains is, therefore, a research-motivated functional description (the I-C view) or the explication of the causal nexus through the evolutionary history of a system equipped with a proper function (the N-R view). However, if we relate this finding back to our task of solving the linguistic dilemma between description and explanation, then the I-C variant drives us to one horn of the dilemma and the N-R variant to the other. The I-C variant offers an effective description through functional analysis, while the N-R variant leads the system-theoretical linguist to search for a causal chain leading outside the field of linguistics. It offers an explanation, but one that is not in the hands of a system-theoretical linguist – a biological or neuroscientific explanation. Even the third way has not relieved us of the linguistic dilemma, and we therefore have to conclude that we do not consider functional explanation a model able to provide an autonomous linguistic explanation. By analyzing it, we found that the possibilities of this model of explanation are tied to external non-linguistic factors, as in the cases we encountered above with Newmeyer and Haspelmath (see chapter 5.1 above).
In other words, despite the undeniably greater sophistication of the concept of functional explanation that we find in Köhler’s view, it is not possible to find an autonomous linguistic form of it. System-theoretical linguistics seems to have failed to step out of the linguistic mainstream. Dreams of a linguistic theory per se are gradually fading, as had happened previously with generativism (see chapter 4.1 above). They dissolve in a gradual confrontation with the fate of synergetics and with the impossibility of establishing a problem-free concept of the linguistic law; and they dissolve in connection with the very development of the community of quantitative linguists (see chapter 5.4 above). In the following chapter, we will try to propose another principle-based model of explanation for system-theoretical linguistics, one which will not be functional and for which we will draw inspiration from the current range of non-causal explanation models (see chapter 2.2 above). We will try to follow in the footsteps of Köhler, Hřebíček and Andres and transform the functional model of explanation into a topological model.
5.6 Beyond functional explanation: Topological explanation in system-theoretical linguistics
We are entering quicksand here. So far, we have examined standard linguistic approaches and their explanation models, reconstructed their arguments and subjected them to criticism. We have proposed new formulations of explanatory models and drawn attention to the ambiguities and problems brought by the models created – whether original or newly designed by us. But now we want to design our own new model of explanation for system-theoretical linguistics, and this is a higher-level task that we want to carry out humbly, because we lack linguistic erudition of our own and rely strictly on the results of a conceptual analysis based on the current conceptual means of the philosophy of science. Above, we presented the current state of considerations of the forms of explanatory strategies that philosophers of various scientific disciplines reveal (e.g. mechanistic and design strategies), and also of the conceptualization of the nature of explanation as an abstract entity in relation to the causal nexus (i.e. the distinction between causal and non-causal explanations) (see chapter 2.2 above). From these conceptual sources, we will now try to use one of the variants of non-causal explanation, the one most often referred to as topological explanation, and to modify Köhler’s functional explanation accordingly. Subsequently, we evaluate its benefits, but we also add up the costs that its introduction entails for designing a linguistic system. The choice of topological explanation is to some extent arbitrary because we could rely on other variants of non-causal explanation – perhaps ones intuitively closer to system-theoretical linguistics (optimization, graph-theoretical, etc.). However, the choice is motivated by the conceptual sophistication of topological explanation, which allows the model to be used for system-theoretical linguistics without being a mere analogy.
The elaboration of the concept of topological explanation is evidenced by the analyses of Philippe Huneman (Huneman 2018, Huneman 2010) and Daniel Kostić (Kostić 2020, Kostić 2019). Topological explanation found its first application in the context of the life sciences – Huneman proposed a topological explanation scheme in the context of biology and ecology. We schematize Huneman’s proposal so that the structure of the premises in the explanans can be seen and its D-N nature examined; it is then possible to consider how to absorb the premises of the explanans of Köhler’s functional explanation. Daniel Kostić has pointed out the problems of Huneman’s model – especially that it does not meet the condition of the counterfactual conditional and therefore cannot be a valid D-N model of explanation. He has proposed a new way of defining topological explanation that meets this condition.
We will also introduce this model and use it to formulate a topological explanation in system-theoretical linguistics. The basic question is, of course: why do this at all? In its history, linguistics has mainly developed the pair of formal and functional explanations, each of which offers certain advantages – the former reveals syntactic structure, the latter shows how the language system implements functions enforced by the system environment. By applying the topological explanation model, the intuitive clarity of the functional explanation is lost. Intuitively, we seem to understand that there are functions of language, just as there are functions in living systems. We agree, but “intuitive clarity” can also mean that we are subject to the effects of a conceptual borrowing we have internalized. In any case, topological explanation is simply another variant of how to understand explanation in linguistics; it does not represent anything that should replace functional explanation. It represents a revival of structuralist considerations of explanation; a revival that is understandable in the context of its time – the ability to analyze big data allows us to model relationships between system elements on a much wider scale than before. Therefore, we believe that in our reflections we are not only guided by influential texts, but also follow an important trend that is taking place in linguistics, and not only there. In addition to the mentioned conceptual depth, the advantage of topological explanation lies in the fact that it remains universal, applicable across disciplines – from physics, through the life sciences, to the social and cognitive sciences. At the same time, this model of explanation aligns with some elements of the reflections on the basic principles of system-theoretical linguistics that we find in Köhler, Hřebíček or Andres.
When these authors talk not about functions or the fulfillment of requirements but about limitations given by the system structure, they clearly deviate from a strictly functional explanation. Then, like Hřebíček, for example, we can identify transformation invariants of the linguistic system across its linguistic levels. These considerations are present in Köhler’s register hypothesis (we also state this above, e.g. in subchapter 5.2.2), where the basic need for text sequencing (similar to Hřebíček’s compositeness) meets the finitude of the register size. The manifestation of the power law then does not indicate a universal form of function realization, but a compositional principle of a self-similar arrangement of constituents towards constructs in the hierarchy of linguistic plans. In Hřebíček, these considerations appear in an even more explicitly non-functional form when he applies the principle of least effort analogously to Newton’s law of force and tries to find, behind the principle of least effort, the basic principle of compositeness, which is responsible for the power law (MAL). We suggested (see the Second Interlude above) that it is possible to see a link between Hřebíček’s principle of compositeness
5.6 Beyond functional explanation
and Köhler’s hypothesis of the register. And in Andres, we find the most explicitly formulated idea that “behind the language” there is a fractal structure.413 A common pitfall of these approaches is the nature of the basic principle underlying the structural (topological) explanation. The principle of compositeness shows it most markedly, as it is, de facto, a general principle of analysis (see the line of reasoning in the Second Interlude above), which is not strictly linguistic. In the case of Andres’ approach, we could directly speak of another example of a mathematical explanation of the behavior of non-mathematical systems (see Lange 2017, and Appendix 1). However, as we have already stated at various places above (see chapter 2.2 and Appendix 5), Lange’s variants of explanation are problematic, and one cannot expect that a valid principle-based model of explanation can be built on them.414 In the context of contemporary philosophy of science, reflections on topological explanations first appeared in Huneman (2010).415 In its most current form, the core of Huneman’s topological explanation is expressed in the text “Diversifying the picture of explanations in biological sciences: ways of combining topology with mechanisms” (Huneman 2018). The main benefit of Huneman’s view is that he refers to topological graph theory as a conceptual basis for creating the given model of explanation (based on the findings of Gross, Tucker 1987). This outlines a broader platform for topological explanation in relation to network theory, and also to scale-free networks (see Caldarelli 2007 above). Huneman expresses the basic structure of the topological explanation as follows: Whenever the explanandum – a property, outcome, behavior of S – is explained by the fact that the system has topological properties Ti, (. . .) a topological explanation has been given.
“Explanations” here means that some fact G is entailed by the topological properties Ti, and is itself a mathematical fact that describes adequately the explanandum under focus. (Huneman 2018, 119)
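Huneman’s formulation is abstract. Purely as our own illustration (not part of Huneman’s text), the core idea – that topological properties are invariants under a class of transformations X, and thereby define equivalence classes – can be sketched for a toy graph-theoretical “system”, in the spirit of the graph-theoretical basis Huneman adopts:

```python
from itertools import permutations

# A toy "system" S modeled as a graph: vertices and undirected edges.
# (Hypothetical example; Huneman's schema does not prescribe any
# particular data structure.)
edges = {frozenset(e) for e in [(0, 1), (1, 2), (2, 0), (2, 3)]}
vertices = {0, 1, 2, 3}

def degree_sequence(vertices, edges):
    """A topological property T_i: invariant under vertex relabeling."""
    deg = {v: sum(v in e for e in edges) for v in vertices}
    return tuple(sorted(deg.values()))

def relabel(edges, mapping):
    """One transformation from the class X: a vertex permutation."""
    return {frozenset(mapping[v] for v in e) for e in edges}

base = degree_sequence(vertices, edges)
# Every relabeled graph lies in the same equivalence class C_X,
# because the property is preserved by every transformation in X:
for perm in permutations(vertices):
    mapping = dict(zip(sorted(vertices), perm))
    assert degree_sequence(vertices, relabel(edges, mapping)) == base
print(base)  # prints (1, 2, 2, 3)
```

Here the degree sequence plays the role of a topological property Ti, the vertex permutations play the role of the set X, and all relabeled graphs fall into one equivalence class CX.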
In such an explanation, it is, of course, necessary to specify what is meant by “has topological properties”. Huneman specifies it and, thus, basically reveals the basic structure of the explanans of the topological model of explanation.416
413 Alternatively, hyperfractal (cf. Andres, Rypka 2013, and also Barnsley 2006).
414 Of course, there is still a chance of using some metaphysical approach directly. As already mentioned, French’s structural realism (French 2014) could be useful in this respect.
415 However, we can trace its origin to structural and graph-theoretical ways of explanation.
416 “(. . .) a system S under focus is related to a topological space S′ (. . .); topological properties are properties of S′ that specify its invariance regarding a class of continuous transformations. (. . .) Consequently, for any set X of continuous transformations, topological properties define equivalence classes Cx, namely, classes of manifolds that are equivalent regarding X, i.e. each of them being the transform of another through a function that belongs to X.” (Huneman 2018, 117–118).
5 Functional explanation in quantitative linguistics
Explanans:
1. The system S is represented by the model of the topological space P.
2. The system S has topological properties Ti that specify its invariance regarding a class of (continuous)417 transformations.
3. For any set X of (continuous) transformations of the system S, the topological properties Ti define equivalence classes CX.
4. Def. Equivalence classes CX are classes of manifolds that are equivalent regarding X (i.e. each of them being the transform of another through a function that belongs to X).
Explanandum:
The fact G is entailed by the topological properties Ti of the topological space P.418
Now, we have to try to identify the central elements of functional explanation that can be reconstructed into a topological form. Firstly, we have to relate the whole of the linguistic system (or, in individual cases, always a specific subsystem) to the topological space. Additionally, we need to clarify – and this is a key step – how to constitute the set of requirements in a topological explanation. Then, the task remains to constitute the set of functional equivalents and the relation of their subsets to individual requirements. And it is also necessary to incorporate the expression of transformations of the system structure into the model. Let us, therefore, start from the original form of the functional explanation in its strict synergetic form and mark the explanans’ key elements, which require conversion to a topological variant of the explanation:
1. The system S is self-organising, i.e. it possesses mechanisms to alter its state and structure according to external requirements.
2. The requirements N1 . . . Nk have to be met by the system.
3. The requirement N can be met by the functional equivalents E1 . . . Ef . . . En.
4. The interrelation between those functional equivalents which are able to meet the requirement N is given by the relation RN(EN1 . . . ENn).
5. The structure of the system S can be expressed by means of the relation Q(s1 . . . sm) among the elements si of the system.
✶6. In the system S, the Ordners (order parameters) O1 . . . Oo enslave the elements of the system s1 . . . sm and achieve the transformation of the relation Q(s1 . . . sm) into Q′(s1 . . . sm).
417 In accordance with Huneman, we present “continuous transformation” here; however, in general, we can talk about transformations, which will differ in their type for different systems.
418 The whole model was created on the basis of Huneman (2018, 117–118).
As the first step (1), we propose to identify the linguistic system (system S) with a topological space P (transformation of premise 1). From a formal point of view, we have to replace the graphical algebra, through which the system is expressed in Köhler’s functional explanation, with topological graph theory. At the same time, we can include premise 5 in premise 1. We imagine the resulting new form of premise 1 as follows:
1. The linguistic system S, with the structure expressed by means of the relation Q(s1 . . . sm) among the elements si of the system, is represented by the model of the topological space P.
Perhaps surprisingly, not only does the reference to the system’s self-organization disappear from premise 1, but also the reference to the fulfillment of external requirements; this, however, is the main particularity – and advantage – of the topological model of explanation. The removal of self-organization is advantageous in view of our previous objections (see subchapter 5.5.1 above), and the removal of the reference to requirement fulfillment represents the suppression of the explanation’s functional nature. In general, the concept of requirement has to be adapted completely to the new form in relation to the system topology. The next step (2) is to assimilate the set of requirements into the topological model. The solution we propose will probably seem counterintuitive at first, as it hinges on our conception of the role of requirements. We propose to assimilate the set of requirements with the set of topological properties of the system S. The reason for our proposal is that the system’s topological properties determine system invariance with respect to the set of system-defined transformations. Thus, topological properties represent invariants of a system that is subject to transformations. In other words, topological properties express what is preserved during system changes, or determine the limits of system variability.
Originally, the requirements represented these constant limitations of the system. However, their disadvantage in the functional explanation was that they were situated outside the system – outside the linguistic system, on the edge of the structure of relations among linguistic quantities (and parameters). Therefore, when trying to reformulate the functional explanation, the variant present in cognitive-linguistic models (and in fact in the principles and parameters approach) always appeared as one of the extremes, and the linguistic dilemma always returned. In the functional view, the determining requirements are situated outside the system. Topological properties, however, are of course part of the system; they are actually an “imprint” of the requirements in the system. Finally, we have an opportunity to create an autonomous, valid linguistic explanation that is non-causal.
We therefore propose to reformulate premise 2 as follows:
2. The linguistic system S has the topological properties T1 . . . Tk that specify its invariance regarding a class of transformations.
The combination of the first two premises provides us with the core of the explanatory power of the topological explanation model. At the same time, this combination also means that we do not need the strictly synergetic premise ✶6.419 To complete the model, we still need to explicate (3) what happens to functional equivalents in the model defined in this way. In the functional case, the fulfillment of each requirement was associated with a set of functional equivalents (premise 3). And for the explanation model to be correct, it is necessary to define the relation between the functional equivalents bound to the given requirement, using the relation R (premise 4). In the topological case, the situation is simplified because each topological property is related to a set of transformations that form an equivalence class, while an equivalence class can be defined as a class of manifolds that are mutually equivalent with respect to a given transformation (cf. Huneman 2018, 118). We can, therefore, formulate premises 3 and 4 of the explanans as in Huneman’s case (in the linguistic modification) as follows:
3. For any set X1 . . . Xk of transformations of the linguistic system S, the topological properties T1 . . . Tk define equivalence classes CX1 . . . CXk.
4. Def. Equivalence classes CX1 . . . CXk are classes of manifolds that are equivalent regarding X1 . . . Xk (i.e. each of them being the transform of another through a function that belongs to X1 . . . Xk).
We have now defined all the necessary premises in the explanans of the topological explanation.
The problem with functional equivalents is completely eliminated: the whole equivalence class (which could be understood as an analogy of the relation R between functional equivalents with respect to a given requirement) provides a comprehensive overview of the transformations leading to the invariance of topological properties. The explanandum of the topological explanation can be formulated in accordance with Huneman’s definition (cf. Huneman 2018, 119) as follows:
419 Here we perform the conversion of the strictly synergetic variant of the functional explanation; but of course, we would achieve the same result by converting Köhler’s original form of the functional explanation.
The fact G is entailed by420 the topological properties Ti of the topological space P.
The fact G would correspond to the specific value of a linguistic system variable, or to the empirically determined values of the related linguistic variables. We refer directly to topological properties representing transformation invariants of the linguistic system at the points where, in the original functional explanation, the explanatory responsibility is imposed on the requirements. The explanatory basis is thus part of the linguistic system itself, and the explanatory dilemma of linguistics seems to be removed. In summary, we can formulate the topological explanation as follows:
Explanans:
1. The linguistic system S, with the structure expressed by means of the relation Q(s1 . . . sm) among the elements si of the system, is represented by the model of the topological space P.
2. The linguistic system S has the topological properties T1 . . . Tk that specify its invariance regarding a class of transformations.
3. For any set X1 . . . Xk of transformations of the linguistic system S, the topological properties T1 . . . Tk define equivalence classes CX1 . . . CXk.
4. Def. Equivalence classes CX1 . . . CXk are classes of manifolds that are equivalent regarding X1 . . . Xk (i.e. each of them being the transform of another through a function that belongs to X1 . . . Xk).
Explanandum:
The fact G is entailed by the topological properties Ti of the topological space P.
We consider this explanation model to be promising for several reasons; it could prove that the linguistic explanatory dilemma is solvable. We have managed to show that in this model, there is no problematic self-regulatory principle expressed in a working form in the structural axiom. At the same time, in this topological model there is no analogy to the problem with functional equivalents. Topological properties represent the explanatory basis of the topological explanation, and they are part of the linguistic system.
It seems, therefore, that we are well on our way to finding the principle-based explanation model we have been looking for. Of course, the introduction of this model entails commitments that may not be assessed positively at first. Therefore, let us gradually list possible counterarguments. The first obvious fact (1) that fundamentally changes the view of the
420 In addition to Huneman’s “is entailed by”, we could also use the terms “is given by” or “is realized by”.
linguistic system is actually expressed by the removal of the graphical algebra – all the arrows disappear, and with them the potential causal relations leading from outside the system. The critical question may arise – as has been the case many times in the history of linguistic theories – to what extent the structure we acquire in this way in the topological model is really linguistic per se. In the end, was the linguistic nature of the system not given precisely through the context of the non-linguistic environment? We believe that this problem is ultimately illusory, precisely because the requirements have been transformed into topological properties that are part of the system. It is indeed the same as in physics, where we also recognize the explanatory nature of the conservation principles, and do not blame physics for losing its specificity and leaving its explanatory power in the hands of mathematics.421 The second problem (2) is the strictly static nature of the topological model, which is the price for linking the original requirements with topological properties. On the other hand, the dynamic conception is in the minority in system-theoretical linguistics itself, and the solution we offer thus corresponds to the original conception; the static nature is also inherent in Köhler’s original solution for the lexicon. By contrast, a problem (3) that persists even after the change in the explanation model is the ambiguity of the set of parameters that occur in relationships among linguistic variables. A significant feature of the new explanation model is that it no longer depends so much on the specific mathematical formulation of the linguistic principle (law) – the power law – but on the linguistic conservation principles. However, this brings great demands (4), which were not so pressing in Köhler’s original conception.
Those demands are for the systematization of topological properties (analogous to the original hierarchy of requirements) so that we can understand the system of conservation principles and their corresponding symmetries. We are actually coming back to the important finding by Luděk Hřebíček (see the Second Interlude above) that it will be necessary to identify the symmetries that are responsible for the conservation principles in quantitative linguistics. Thus, we have solved the dilemma only ad futurum: if we manage to eliminate these difficulties and supplement the inventory of the topological explanation (with appropriate conservation principles and symmetries), then it will be possible to point to an explanatory linguistic conception. Despite our solution being partial, we have perhaps suggested a viable path that could resolve the Linguistic Dilemma. ✶✶✶
421 On the other hand, in chapter 2.2 above and in Appendix 5 we addressed a specific approach to non-causal explanation in Lange (2017).
However, in addition to the specifically linguistic elements of the explanans that await evaluation and solution, we also face specific technical difficulties we observed above (see chapter 2.2) in more general reflections on the forms of explanation models in philosophy of science. Daniel Kostić points out these problems in connection with topological explanation in the text “General theory of topological explanations and explanatory asymmetry” (Kostić 2020). These are the condition stipulating that the explanation has to support counterfactuals (counterfactual conditionals), and the condition that explanatory asymmetry is met. Kostić proposes a topological explanation that fulfills both conditions:422
A topologically explains B if and only if:
1. (Facticity): A and B are approximately true; and
2. Either
a. (Vertical mode): A describes a global topology of the network, B describes some general physical property, and had A not obtained, then B would not have obtained either; or
b. (Horizontal mode): A describes a set of local topological properties, B describes a set of local physical properties, and had the values of A been different, then the values of B would have been different.
(Explanatory perspectivism): A is an answer to the relevant explanation-seeking question Q about B, such that the Q determines whether to use vertical or horizontal explanatory mode. (Kostić 2020, 2)
We can see that Kostić actually refines Huneman’s topological explanation by introducing the vertical and horizontal modes while, at the same time, getting rid of Huneman’s shortcoming by adding the condition of counterfactual support to the explanation model. Of the whole scheme, part 2 is especially important to us, because the premise of Facticity – which refers to the approximate truth of A and B and with which we fully agree – leads to problems of the theories of truth (and of approximate truth), which go beyond the possibilities of our investigation. And the last premise, Explanatory perspectivism, defines the pragmatic aspect of the explanatory strategy, which is set aside (see chapter 2.1 above) in this investigation.423 We believe that Kostić’s and Huneman’s connected conceptions can be applied to Köhler’s theory so that the vertical mode applies to the hypothetical whole of system-theoretical linguistics (all subsystems together), and the horizontal mode applies to individual subsystems, or to individual subsystem circuits. Given the level of development of system-theoretical linguistics, the horizontal mode is more
422 We focus in more detail on the issue of asymmetry in the paper Zámečník (2021).
423 Although it can be used to solve the problem of explanatory asymmetry (see subchapter 2.2.1 above, and see Zámečník 2021).
important to us. Here we can stay with our identification of requirements with topological properties – specifying that these are local topological properties – and add a counterfactual condition: that is, if a given value of a topological property were not realized, then we would not measure the given values of the system (with respect to the requirements). Of course, the problem of formulating counterfactuals per partes remains; it is related to the above-mentioned need to hierarchize requirements and, on this basis, to create a system of invariants (and of the symmetries that establish them). Therefore, it is really easier for us to build the horizontal mode of topological explanation. For the vertical mode, we would have to articulate the universal “regulatory effectiveness” (“die Steuerungseffektivität”) and its relation to the individual horizontal variants. Lastly, let us write a topological model of explanation for system-theoretical linguistics with the help of Huneman’s interpretation of the topological component (based on topological graph theory) and Kostić’s treatment of the requisites of a correct explanation for the horizontal mode:
Explanans:
1. Facticity: A and B are approximately true.
2. The linguistic system S, with the structure expressed by means of the relation Q(s1 . . . sm) among the elements si of the system, is represented by the model of the topological space P.
3. A describes a set of local topological properties of the topological space P, B describes a set of local linguistic properties of the linguistic system S; and had the values of A been different, then the values of B would have been different.
4. The linguistic system S has local topological properties T1 . . . Tk that specify its invariance regarding a class of transformations.
5. For any set X1 . . . Xk of transformations of the linguistic system S, the topological properties T1 . . . Tk define equivalence classes CX1 . . . CXk.
6. Def. Equivalence classes CX1 . . . CXk are classes of manifolds that are equivalent regarding X1 . . . Xk (i.e. each of them being the transform of another through a function that belongs to X1 . . . Xk).
Explanandum:
The fact G is topologically explained by the topological properties Ti of the topological space P.
For completeness, we present the first premise (1), which states the approximate truth of sentences in both the explanans and the explanandum. The second premise (2) defines the linguistic system S composed of elements in mutual relations and states
that this system is represented by the model of the topological space P. The properties of the model of the topological space explain the properties of the linguistic system. The most important part, then (compared to Huneman), comes in the third premise (3), which fulfills the condition of counterfactual support. Had the topological structure (given by the topological properties) been different, then the properties of the linguistic system would have been different. In other words, the values of linguistic system variables – e.g. length, frequency, polylexy, etc. in the lexical system – would be different if the topological properties were different. And as topological properties are an “imprint” of the requirements directly in the system, it is also the counterfactual that interrelates linguistic principles (specified in premise 4)424 with the form and properties of linguistic facts. This is a truly linguistic explanation – the linguistic dilemma (let us repeat it once again) is thus resolved. Although we did not emphasize this above, both the model created on the basis of Huneman (2018) and the model created in combination with Kostić’s model (2020) represent variants of the principle-based model of explanation, where the principles refer to linguistic system invariants. However, as we have already stated, this is an ad futurum solution because we have not created a hierarchy of these principles for linguistics. We can, of course, refer to the fragments brought by Luděk Hřebíček, Reinhard Köhler and Jan Andres. And we can hope that in the coming years, these fragments will be composed in such a way that the topological model of explanation will solve the linguistic dilemma not only hypothetically, but also in fact.
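The counterfactual condition in premise 3 can be illustrated with a toy computation (entirely our own construction, not drawn from Köhler, Huneman or Kostić): two lexicon-like networks that differ in a local topological property (the degree of one node) yield different values of a derived system variable.

```python
# Toy illustration of the horizontal-mode counterfactual:
# A = local topological properties (node degrees of a lexicon-like graph),
# B = a derived "linguistic" variable (here: mean number of links per unit).
# All names and numbers are invented for illustration only.

def degrees(edges, n_nodes):
    deg = [0] * n_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def mean_links(edges, n_nodes):
    return sum(degrees(edges, n_nodes)) / n_nodes

actual = [(0, 1), (0, 2), (0, 3)]       # unit 0 is a hub
counterfactual = [(0, 1), (2, 3)]       # unit 0 has a lower degree

a1, a2 = degrees(actual, 4), degrees(counterfactual, 4)
b1, b2 = mean_links(actual, 4), mean_links(counterfactual, 4)
# Had the values of A been different, the values of B would have been different:
assert a1 != a2 and b1 != b2
print(b1, b2)  # prints 1.5 1.0
```

The sketch only dramatizes the dependence stated in premise 3; nothing in it fixes which linguistic variables actually stand in such a relation.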
424 The fifth (5) and sixth (6) premises are, then, identical with premises three and four in the topological explanation created on the basis of Huneman’s conception (see above).
6 Conclusion
Sometimes three or four or five of the pieces would fit together with disconcerting ease; then everything would get stuck: the missing piece would look to Bartlebooth like a kind of black India with Ceylon undetached (. . .) Of course the empty space no more looked like India than the piece which fitted it exactly looked like Britain: what mattered, in this instance, was that for as long as he carried on seeing a bird, a bloke, a badge, a spiked helmet, an HMV dog, or a Winston Churchill in this or that piece, he was quite unable to discover how the piece would slot into the others without being, very precisely, reversed, revolved, decentred, desymbolised: in a word, de-formed.
Georges Perec, Life, a User’s Manual (translated by David Bellos)
We have traveled through a hundred years of linguistic descriptions and theories to solve a dilemma that is most likely irrelevant to most linguists and uninteresting to most philosophers of science. Linguists prefer to deal with specific tasks; the question of whether they merely describe or actually explain language data is irrelevant to them, because they are the ones who understand language – they have their developed heuristics that lead them reliably to an understanding of language phenomena. Philosophers will say that the question of the dilemma itself is old-world, that it is more important to follow specific scientific strategies, describe them and compare them than to fight a long-lost battle for the normative role of philosophy of science. So why do we have this book in front of us? Because a special combination of our knowledge and our own heuristics has led us to the conclusion that we can point out some contexts and previously unconsidered possibilities that will bring new knowledge to both linguists and philosophers. And we certainly cannot have exaggerated expectations – amidst the flood of books by authors much better than us, knowledge is pouring in on us with a speed and intensity that makes it impossible to orient oneself. So what, then, is the benefit of our journey through linguistic descriptions and theories? The benefit lies in the mission that should be fulfilled by philosophy of linguistics. Linguists need to be reminded of their inattention: that they implicitly use concepts about whose content they may not have a good idea, and that significant linguistic discoveries are somehow associated with the moments and personalities that carried out the philosophical reflection of their own discipline. We met them – they were, above all, de Saussure, Hjelmslev, Chomsky, Herdan and Köhler. We do not need to recall now how this philosophical reflection was performed by each one of them; we have described it in the previous pages.
Philosophers need to be told that even if our strategy is not modern in terms of the current philosophy of science, it does not mean that it is invalid or even incorrect. We have adhered to those authors who perceive it in the same way we do – this is actually the peculiarity of this book: it uses philosophical dissent to
reflect on traditional linguistic descriptions and theories. As we have written in several places – we understand philosophy of science as a source of permanent knowledge; it is actually a gradual cultivation of conceptual analysis in favor of the sciences. This is how it started in the period between the wars, and this is the way it should continue into the future. In addition to the apology, however, we should present a modest list of what we consider to be the contribution of our journey through linguistics and of the philosophical reflection on the concepts, descriptions, and theories that have emerged in its history. We divide this list into a general and a special part – the general part concerns findings on some common features of linguistic approaches in relation to explanation, and the special one concerns mainly specific proposals for the transformation of explanatory procedures in system-theoretical linguistics. In general, we have been able to document that a certain explanatory intention can be identified in all the linguistic approaches examined, and that in many of them there is even an undisguised effort to fulfill this explanation in a way common to the natural sciences – in Chomsky, Herdan and Köhler (and other quantitative linguists). In all the approaches, it is possible to identify principles on which principle-based models of explanation can be provisionally built, but most of these principles prove to be invalid for various reasons (the principle of arbitrariness, the principle of analysis, the principle of recursion, etc.). At the same time, it also became clear – at a general level – that philosophy of science has a number of proven conceptual tools that can be used to reflect on current linguistic approaches – we mean especially the typology of explanations (causal and non-causal, mechanistic and design explanations).
The traditional view, which in linguistics recognized a systemic description and a pair of formal and functional explanations, is thus greatly expanded to the benefit of linguistic theories. The above-mentioned general findings perhaps prove that the project of building a philosophy of linguistics is useful, and not only for linguists themselves. For philosophers, it represents an opportunity to use new examples of scientific practice – currently the focus is mainly on the field of life sciences – and possibly to correct the idea of the nature of scientific theories in the context of the social sciences and humanities. In summary, the development of philosophy of linguistics can direct the whole discipline of philosophy of science to further activity and progressive development in relation to science itself (instead of the somewhat self-serving conceptual games of some metaphysical approaches to philosophy of science). A special area of benefit concerns quantitative linguistics and especially system-theoretical linguistics. This contribution is organically linked to the history of quantitative linguistics and especially to the intentions of quantitative linguists, who have repeatedly proclaimed that philosophy of science is a necessary and
very useful starting point for them to build a real linguistic theory. Reflections on contemporary philosophy of science may lead them to realize that the traditional inventory of philosophical-scientific concepts needs to be somewhat updated and that the major personalities, particularly Mario Bunge and Carl Gustav Hempel, have a number of successors who have moved the research significantly forward. A new view of linguistic explanation in quantitative linguistics is opened up by the widely researched issue of non-causal explanations. It gave us the opportunity to transform Köhler’s (and Altmann’s) functional explanation, based on Hempel’s classical functional analysis, into a topological explanation outlined by Huneman and Kostić. Consistent contemplation of this possibility, as well as the search for further inspiration in the field of non-causal explanations, will certainly lead to further fruitful discoveries in the theoretical framework of system-theoretical linguistics. For quantitative linguistics, however, wider possibilities open up, which we have indicated on the previous pages, but which we have not further developed. In quantitative linguistics and, in fact, also in the reflection of Köhler’s work itself, we find three dominant approaches, represented by (1) Altmann and his successors (among them Radek Čech), (2) Köhler and the circle connected with the synergetic approach to language, and (3) Hřebíček and the mathematical quantitative linguists (among them Jan Andres), which can be re-enriched by means of contemporary reflections on scientific explanations that we recognize in philosophy of science.
Craver’s and Darden’s contributions could inspire those quantitative linguists who strive primarily for the efficient processing of linguistic data and finding inductive paths from linguistic data to statistical distributions or linguistic laws. It can be especially suitable for those linguists who believe that theoretical entities are actually to the detriment of modern linguistics. While this approach does not resolve the linguistic dilemma, it does not really bother Altmann’s successors because without independent theoretical entities anyway, there is no point in explaining per se. This approach is likely to lead in the future in at least two directions. The former will remain with statistical modeling of language data and will likely merge with broader areas of data analysis. The other direction may have a tendency to merge with traditional branches of linguistics – the successors of generativism (in the form of principles and parameters approach, etc.) and cognitive linguistics. It is possible that this direction will also modify Altmann’s and Köhler’s functional explanations into a mechanistic form (an option we have not used). The first direction will, therefore, lead to the profiling of new forms of statistical descriptions of
6 Conclusion
linguistic data, while the other will delegate explanatory responsibility to non-linguistic levels (biological, cognitive, etc.). The mathematical direction of quantitative linguistics, founded by Hřebíček, can draw most inspiration from the distinctively mathematical explanations that we have identified, especially in Marc Lange’s works. The study of fractal structures that lie behind language and manifest in the form of the power law appears in Hřebíček’s work as a search for mathematical structures that can justify the form of empirical linguistic laws, such as the Menzerath-Altmann law. We identify these efforts mainly in Jan Andres, who sought to interpret the parameters of MAL by means of fractal analysis. Hřebíček (but also Köhler, in some interpretations of the register hypothesis) went even further in the search for mathematically expressed symmetries in quantitative linguistics. His principle of compositeness de facto transformed Hjelmslev’s principle of analysis into a means of a distinctively mathematical explanation of linguistic phenomena. In the previous pages, we have questioned whether a distinctively mathematical explanation can be considered really valid; on the other hand, we have to admit that some interpretations of topological explanations could also go in this direction. In particular, Lange’s concept of explanation by constraint (cf. Lange 2017, 46–95), which he combines with the role of symmetries in fundamental physics, is an example of this. This unresolved issue should also be explored further.425 For Köhler’s system-theoretical linguistics, in our opinion, the greatest opportunity is the concept of topological explanation promoted by Huneman and Kostić. 
We carefully explored – not only here, but also in Zámečník (2014), Benešová, Faltýnek, Zámečník (2015) and Benešová, Faltýnek, Zámečník (2018) – the possibilities of modifying the troublesome elements of functional explanation, and we found no way to preserve the linguistic nature of the functional explanation. A strictly synergetic approach to functional explanation is the only option that would allow this, but in our opinion it suffers from serious and unsolvable conceptual problems. From a certain point of view, it may seem that the transformation to a topological model of explanation is unnatural because it goes against Köhler’s original ideas about the relationship between the linguistic system and its surroundings. However, Köhler’s approach is not devoid of structuralist reasoning;426 we think it can even be said that Köhler developed structuralist considerations by means
425 This also includes the possibility of explanation through universality, as we identified it in Morrison (2015) and Morrison (2013), and the conjecture of the fractal concept in topology, as mentioned in connection with scale-free networks in Caldarelli (2007).
426 Therefore, the book includes a relatively long intermezzo dedicated to Herdan.
of functional models (under the influence of Altmann and Hempel), because he had no other effective means at his disposal. Moreover, we can follow topological traces already in the original structuralist considerations in de Saussure (and, as we have shown above, in Poincaré). We managed to transform Köhler’s functional explanation into a topological form. We argue that this explanation does not suffer from problems analogous to those of functional explanation (functional equivalents and the status of the structural axiom), and that it remains a linguistic explanation because it incorporates the requirements – which Köhler situated outside the linguistic system427 – directly into the linguistic system through the concept of topological properties. We, therefore, argue that the linguistic dilemma is solved by this step, albeit only hypothetically. In order for this hypothetical possibility to become a reality, it would be necessary to identify the symmetries and conservation principles that underpin the topological properties of the linguistic system.428 With a bit of irony, we have to conclude by saying that we have replaced one dilemma with another: linguists can now choose between a truly linguistic explanation (i.e. topological), which is not completely linguistically interpreted, and an extra-linguistic explanation (i.e. functional), which is linguistically interpreted in a very satisfactory manner.
427 Because the requirements were located outside the linguistic system, they also did not provide a linguistic basis for functional explanation. This was the structural axiom, which stated that language is a self-organizing system.
428 In fact, we could not do much more than what Luděk Hřebíček has already indicated; we have only added the concept of explanation, which allows Hřebíček’s intention to be articulated more specifically.
Appendix

The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents. We live on a placid island of ignorance in the midst of black seas of infinity, and it was not meant that we should voyage far. The sciences, each straining in its own direction, have hitherto harmed us little; but some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the deadly light into the peace and safety of a new dark age.
Howard P. Lovecraft, The Call of Cthulhu
Appendix 1 Two Faces of Pythagorean Theorem

Progressive mathematics teachers sometimes try to explain to children the meaning of the Pythagorean theorem and demonstrate its validity through an activity that visualizes the claim that: “In a right triangle, the area of the square on the hypotenuse is equal to the sum of the areas of the squares on the two legs.” The children receive two equal-sized paper squares, on which they draw lines with the help of a ruler, as shown in the diagram (Figure 6):
Figure 6: Pythagorean Theorem (Giere 2006, 10).
Some children grasp the idea behind the Pythagorean theorem by means of the visualization alone, but it is better to let the children cut both squares along the drawn lines. By comparing the resulting shapes, they readily find that the four triangles from the left square are the same size as the four triangles from the right square. This means, of course, that what was left after removing these triangles from the left square – the square on the hypotenuse – is as large as what was left after removing these triangles from the right square – the squares on the two legs. The children can then attach all three squares, above the respective sides, to one more triangle and then repeat the statement of the Pythagorean theorem.
https://doi.org/10.1515/9783110712759-007
Ronald Giere (2006) argues that it is in this way that we come to understand mathematical claims; that we grasp sophisticated abstract models of mathematics through similar schematizations, diagrams, and visualizations. These considerations are part of Giere’s conception of distributed cognitive systems (cf. Giere 2006, 101–106). It is essential for us that Giere directly uses the formulation that these diagrams: “(. . .) are used to prove the Pythagorean Theorem” (Giere 2006, 101). Here, Giere completes his pragmatic conception of theories in a peculiar way by relating it to cognitive science – all our scientific models are somehow based on our cognitive activity. Understanding mathematics is also governed by cognitive models – the basis of understanding mathematics is not deduction and proof, but the use of a visual aid that illustrates an abstract statement. It is interesting that in current philosophy of science we also find a completely reversed strategy. In Lange’s (2017) conception of distinctively mathematical explanation (see chapter 2.2 above and Appendix 5 below), it is, on the contrary, a mathematical fact, structure or statement that in some cases explains non-mathematical facts (in physics, biology or linguistics). In the case of the Pythagorean theorem and the children’s experiment that leads to its understanding, we can argue that a mathematical statement – the Pythagorean theorem – explains why a child who gets two different piles of cut paper can make a square of the same size out of each. In other words, for the child the Pythagorean theorem does not come into consideration at all; the child simply manages to assemble two squares of the same size. 
An observer familiar with mathematics (and distinctively mathematical explanations) can say that the Pythagorean theorem explains the success of this childlike endeavor, just as a generalized statement about the Euclidean plane can explain that the author of an ancient Roman mosaic managed to create a mosaic of just n stone pieces.429
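The dissection argument can also be written out algebraically (a standard reconstruction, with a and b the legs and c the hypotenuse): both paper squares have side a + b, and each contains four copies of the right triangle with legs a and b.

```latex
% Left square: four triangles plus the square on the hypotenuse.
% Right square: four triangles plus the squares on the two legs.
\begin{align*}
(a+b)^2 &= 4\cdot\tfrac{ab}{2} + c^2 && \text{(left square)}\\
(a+b)^2 &= 4\cdot\tfrac{ab}{2} + a^2 + b^2 && \text{(right square)}\\
\Rightarrow\quad c^2 &= a^2 + b^2
\end{align*}
```

The cutting activity performs exactly this cancellation of the four triangles, without the children ever writing down the algebra.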
429 Appendix 1 was published as a part of the paper Zámečník (2021, 410–411).
Appendix 2 Mechanistic Explanations
As announced in the main text, The Stanford Encyclopedia of Philosophy (SEP) contains an extensive and detailed overview of the new mechanism – Craver, Tabery (2015). In particular, the sections What Mechanisms Are Not and What Are Not Mechanisms show that the concept of a mechanism is sufficiently flexible and general (in the first case) and that it is possible to define clearly what is not a mechanism (in the second case).430 The entry also contains a comprehensive overview of the scientific areas in which mechanistic explanation has been applied by philosophers. Glennan points out that mechanisms are “(. . .) responsible for their phenomena (. . .)” (Glennan 2016, 800) in three ways: via producing, underlying, or maintaining phenomena (Glennan 2016, 800). Schematically, Glennan illustrates them as follows (Figure 7):
Figure 7: Relations between Mechanisms and Phenomena (Glennan 2016, 800).
430 These are not: entities; correlations; inferences, reasons and arguments; symmetries; fundamental laws and fundamental causal relations; relations of logical and mathematical necessity (cf. Craver, Tabery 2015).
The first way (“producing”) shows that mechanisms can be thought of as processes, while the remaining two ways (“underlying” and “maintaining”) show that it is also appropriate to view mechanisms as systems (Glennan 2016, 801). In general, these three variants enable us to subsume a wide range of types of explanation under mechanistic explanations: firstly, the causal explanations where a causal chain is easy to follow (in the case of processes), but also those where the network of causal relationships is so complicated that complete knowledge of it is not possible. The cases where a mechanism underlies a phenomenon thus seem able to include cases of emergent phenomena (where philosophers of science have traditionally invoked the principles of universality), and the cases where mechanisms maintain a phenomenon may include self-regulatory processes in systems whose behavior is traditionally explained functionally. Glennan introduces a number of examples from the field of the life sciences. Of particular interest are those examples that approach a mechanism as a system. In the case of “underlying”: (. . .) a neuron’s action potential – the rapid depolarization and repolarization of the cell membrane that travels down the cell’s axon – is a phenomenon produced by the whole cell (situated in its environment) in virtue of the operation of the cell’s parts (e.g., membranes, ions, and ion channels). A related example is muscle contraction. A muscle is composed of bundles of muscle fibers, which are individual cells (myocytes). The contraction of a muscle is not strictly produced by the contraction of the fibers; instead, the contraction of the fibers underlies the contraction of the tissues because the contraction of the tissue just is the collective and coordinated contraction of the fibers. (Glennan 2016, 800)
In the case of regulatory mechanisms (“maintaining”), he gives the following examples: (. . .) cells have mechanisms to maintain concentrations of metabolites, cardiovascular systems have mechanisms to maintain stable blood pressure, or warm-blooded animals have mechanisms to maintain constant body temperature. Many machines and mechanical systems (e.g., heating and cooling systems, steam engines with governors) also have this character. (Glennan 2016, 801)
The advantage of a mechanistic approach to explanation is its easy and clear applicability across scientific disciplines. Craver and Tabery (2015) list such applications in chronological order in: cell biology, cognitive science, neuro-economics, organic chemistry, physics, astrophysics, behavioral genetics and phylogenetics, but also in social sciences (Craver, Tabery 2015, chapter 2.6 “Philosophical work to be done”). Maziarz and Zach (2020) provide a current example of a mechanistic explanation application in epidemiology.
Appendix 3 Design Explanations
The definition of a design explanation can be found in its current form in Eck, Mennes (2016). The authors directly state that: “Design explanations are a specific type of functional explanation” (Eck, Mennes 2016, 1057). They follow the definition of a design explanation found in Wouters (2007), Wouters (2003), Wouters (1999), and consider it specifically in the field of biology. It is essential for them to define biological function as a “biological advantage”, which they define, following Wouters, through the term “trait”431 (Eck, Mennes 2016, 1057). Eck and Mennes state that design explanations answer the following type of contrastive why-questions: Why does organism o have trait t, rather than trait t′?; (. . .) Why does item i have characteristic c, rather than characteristic c′? (Eck, Mennes 2016, 1058)
The design explanation consists of three parts: (1) internal and external conditions relating to the relevant “trait” have to be defined; (2) a counterfactual comparison of the relevant organism (or item) and a hypothetical organism (or item) has to be formulated; (3) the respective advantage has to be explained on the basis of “invariant relations”432 (Eck, Mennes 2016, 1059). The whole design explanation is, as we can see, built on Woodward’s counterfactual framework (cf. Woodward 2003; for more, see Appendix 4). Eck and Mennes present an example in which a design explanation works: “Why do giant squid have large eyes (instead of small eyes)?” (Eck, Mennes 2016, 1058). In this example, all three parts of the design explanation are explicated:
431 The “trait” is defined with reference to Wouters as follows: “(a) the presence or absence of certain items (such as hearts and circulatory systems), behavioral patterns (such as the fanning behavior of a stickleback) and processes (such as the beating of a heart and the circulation of the blood) of/in individual organisms; and (b) the properties (features/characteristics) of entities under (a) (such as the structure of the heart and blood vessels and the rate of the heartbeat) or of the organism as a whole (such as the size of an elephant). (Wouters 1999, 17–18)” (Eck, Mennes 2016, 1058).
432 Eck and Mennes give an example: “For instance, the level of activity and size of many organisms requires the ability to generate certain amounts of energy, which in turn depends on certain oxygen supplies. Therefore, many organisms have a specialized organ for respiration. This functional dependence of activity level and size on a specialized organ for respiration can be further explained in terms of invariant, change relating relations between, amongst others, body mass and energy consumption, i.e., when body mass and activity levels increase so does energy consumption, and between energy production and oxygen consumption, i.e., producing more energy requires more oxygen consumption (Wouters 2007, 76).” (Eck, Mennes 2016, 1060).
With respect to the large eyes of giant squid, the three conditions [one internal and two external, L.Z.] are related to biological advantage by the fact that large eyes reduce diffraction blurring and allow for a higher flux of photons. This, in turn, allows for smaller contrasts to be detected, thereby making it possible to detect large predators by the bioluminescence they cause in a dark pelagic habitat (Nilsson et al. 2012, 683). The functional dependence relation between the life conditions and the advantage offered by large eyes because of a higher photon flux can then be further explained in terms of invariant, change relating relations explaining the relationships between eye size and amounts of diffraction blurring and photon flux. Here, diffraction principles explain how waves propagate. In this way, the advantage of large eyes gets explained in terms of invariant relations. (Eck, Mennes 2016, 1060)
The design explanation differs from more traditional variants of functional explanation precisely in its reference to contrastive why-questions. Above all, as Eck and Mennes point out, it differs from the etiological view of function and the explanations based on it (Eck, Mennes 2016, 1060–1061). It is not a matter of tracking the causal history of natural selection, but of creating hypothetical scenarios that could not even occur in our causal history. Therefore, design explanations could be classified as non-causal.
Appendix 4 A Counterfactual Theory of Explanation
It was James Woodward in particular who contributed to the construction of the counterfactual conception of explanation. He presented it in a comprehensive and concise form in the chapter “A Counterfactual Theory of Causal Explanation” in his book Making Things Happen: A Theory of Causal Explanation (Woodward 2003, 187–238). There is a consensus in philosophy of science that this conception represents a modern form of causal explanation, the way to which was paved by Wesley Salmon (e.g. Salmon 1998). In a more recent text, Woodward (2018), he tries to show that the counterfactual conception of explanation can also be applied to cases of non-causal explanations. The basis of the counterfactual view is the idea that we can approach explanation through a specific form of questions, for which the description “what-if-things-had-been-different questions” has been used. These questions are based on the need to search for counterfactual scenarios – possible states of a system, process, or behavior in a given system under study – which could have arisen if “something had been different”. Woodward puts it explicitly as follows: (. . .) what would happen to the [XY, omission mine] if we (or some natural process) were to physically intervene in the system in question. (Woodward 2003, 196)
In the cited sentence, both essential elements of Woodward’s conception of explanation can be heard: it is formulated in a counterfactual framework and it is based on the concept of an intervention in an examined system. In the original form (Woodward 2003), an intervention in the causal form is intended; in the new view (Woodward 2018), he tries to formulate a non-causal form of intervention. As an example, Woodward presents a simple mechanical system under the conditions433 defined for the validity of Galileo’s equations for the period and frequency of the oscillation of a mathematical pendulum. Thus, a change in the length of the pendulum leads to a change in the oscillation period of the mathematical pendulum, which is formally expressed in Galileo’s relation:

T_1 = 2\pi\sqrt{\frac{l_1}{g}} \;\rightarrow\; T_2 = 2\pi\sqrt{\frac{l_2}{g}}

where T_1 and T_2 are the original value of the oscillation period and the value of the period after an intervention in length, respectively; l_1 and l_2 are the original length of the pendulum and the changed length of the pendulum, respectively; and g is the constant gravitational acceleration.
433 Assuming a constant gravitational acceleration (g = 9.81 m·s⁻²) and a limited angle of oscillation (max. 6°).
In the given example, we can identify the explanatorily relevant variable l, which, when changed (intervened on), leads to a change in the variable T. Explanatorily relevant variables have to be distinguished from those that are explanatorily irrelevant: (. . .) an explanans variable S is explanatorily irrelevant to the value of an explanandum variable M if M would have this value for any value of S produced by an intervention. (Woodward 2003, 200)
An example of an explanatorily irrelevant quantity for Woodward is a classic and slightly comical counterexample to the D-N model of explanation, in which a Mr. Jones434 systematically uses hormonal contraception and, surprise surprise, does not get pregnant. Although this is a bizarre example, it satisfies all the conditions prescribed for the D-N model of explanation. We believe that in the example with Galileo’s pendulum, however, the period T is also explanatorily irrelevant. We can intervene on it – for example, by setting the pendulum oscillating at a higher speed – but this intervention will not affect the pendulum length l. Woodward’s identification of explanatory relevance is very important because it captures the difference between revealing the dependence between the explanans and the explanandum, and the “mere” inference of the explanandum from the explanans. We can infer a change in the length of the pendulum from a change in the period according to the inverted equation:

l_1 = \frac{T_1^2 g}{4\pi^2} \;\Rightarrow\; l_2 = \frac{T_2^2 g}{4\pi^2}
but this does not mean that we identify the period T as the cause of a change in the pendulum length l.435 The distinction between relevant and irrelevant interventions, thus, allows Woodward to solve the problem of explanation asymmetry (i.e. Bromberger’s original problem) in a counterfactual framework: (. . .) the counterfactual account of explanation (. . .) provides a natural (and unified) diagnosis of what is wrong with putative explanations [Ex. Mr. Jones, insertion mine] that contain explanatory irrelevancies and with explanations [Ex. the length of the pendulum is explained by its period, insertion mine] that fail to respect explanatory asymmetries; in
434 In the current socially constructed reality we have to add another condition: that Mr. Jones is a cis-man.
435 “(. . .) providing a nomologically sufficient condition for an outcome is not the same thing as answering a set of w-questions about that outcome when this latter notion is interpreted along interventionist lines.” (Woodward 2003, 198).
both cases, their inadequacy as explanations may be traced to their failure to answer any w-questions. (Woodward 2003, 200)
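For concreteness, here is a numerical instance of such an intervention (our own illustration; the particular values are assumed, not Woodward’s): doubling the length of a one-metre pendulum increases the period by a factor of √2.

```latex
% Intervention on l: l_1 = 1 m is changed to l_2 = 2 m (g = 9.81 m s^-2).
\begin{align*}
T_1 &= 2\pi\sqrt{\tfrac{1\,\mathrm{m}}{9.81\,\mathrm{m\,s^{-2}}}} \approx 2.01\,\mathrm{s},\\
T_2 &= 2\pi\sqrt{\tfrac{2\,\mathrm{m}}{9.81\,\mathrm{m\,s^{-2}}}} \approx 2.84\,\mathrm{s}.
\end{align*}
```

An intervention on T itself – pushing the pendulum to swing faster – leaves l at one metre, which is exactly the asymmetry that the w-question diagnosis captures.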
To make the picture complete, we attach the model of Woodward’s counterfactual conception of explanation:
(EXP) Suppose that M is an explanandum consisting in the statement that some variable Y takes the particular value y. Then an explanans E for M will consist of:
(a) a generalization G relating changes in the value(s) of a variable X (where X may itself be a vector or n-tuple of variables Xi) and changes in Y, and
(b) a statement (of initial or boundary conditions) that the variable X takes the particular value x.
A necessary and sufficient condition for E to be (minimally) explanatory with respect to M is that:
(i) E and M be true or approximately so;
(ii) according to G, Y takes the value y under an intervention in which X takes the value x;
(iii) there is some intervention that changes the value of X from x to x′ where x ≠ x′, with G correctly describing the value y′ that Y would assume under this intervention, where y′ ≠ y. (Woodward 2003, 203)436
In Zámečník (2021), we focus in more detail on solving the problem of asymmetry in the case of non-causal explanations (see also chapter 2.2).
Appendix 5 The Extended Typology of Non-Causal Explanations

Non-causal explanations form a category with very diverse subgroups. In addition to the group of scientific non-causal explanations (see chapter 2.2), these are metaphysical, mathematical and distinctively mathematical explanations. The last group is reflected in the main text in connection with linguistic structuralism and especially with some branches of quantitative linguistics (in texts by Köhler, Altmann, Hřebíček and Andres; see chapters 5.2, 5.3 and 5.4). More classically oriented philosophers may be confused by the presence of metaphysical and mathematical explanations in philosophy of science. Metaphysics was expelled from traditional analytic philosophy, and the founders of philosophy of science (Popper, Hempel) demonstrated their definitions of science – its ability to explain phenomena and test hypotheses – in contrast to the system of metaphysical speculations. Mathematics was seen by the same philosophers as a formal discipline in which there is no point in talking about explanation because its mission is to derive proofs of mathematical statements. However, we must state that, purely institutionally, metaphysics has been rehabilitated within philosophy of science. Especially in the humanities, where philosophy probably still belongs, it is true that the agenda of a discipline is set by its representatives to a greater extent than in the natural sciences. In other words, philosophy of science is what philosophers of science do. Moreover, even philosophers skeptical of the institutional definition of philosophy of science have to admit that in scientific theories (especially the very subtle theoretical ones in the case of physics) we can find concepts that have their origins in metaphysics – subtle scientific concepts and metaphysical concepts actually form a continuous spectrum437 (cf. Maudlin 2007). However, the non-causal explanation in metaphysics has a more specific form. 
It is related to the concept of metaphysical grounding (for a summary, see Bliss, Trogdon 2014). In this metaphysical approach, there is a specific ground relation that expresses “what grounds what” (Bliss, Trogdon 2014). It is basically a foundationalist idea that everything is grounded by something else and that all relations can be understood as metaphysically grounded. Thus, for example, we can look for the metaphysical grounding of relations in logic (cf. Poggiolesi 2021). Bliss and Trogdon (2014) clarify the relationship between metaphysical grounding and explanation when quoting Fine (2001):
437 For speculative scientific theories in physics, see Baggott (2013).
We take ground to be an explanatory relation: if the truth that P is grounded in other truths, then they account for its truth; P’s being the case holds in virtue of the other truths’ being the case. (Fine 2001, 15)
Bliss and Trogdon (2014) admit that some authors – although not explicitly hostile towards metaphysics – regard metaphysical grounding as an example of esoteric metaphysics. We will not embark here on a further assessment of what makes metaphysics better than esoteric metaphysics. We believe it is clear that metaphysical grounding has much deeper aspirations than our project. Confidence in empirical content – which we lack in metaphysical concepts – is important for our line of reasoning. Metaphysical concepts represent a reservoir of ideas that can be grasped by logic and mathematics and then filled with empirical content – until they become scientific theories. This does not disqualify metaphysics as such, but it allows us to argue that metaphysical explanation does not bring anything substantive to our understanding of scientific explanation. In other words, in Bliss and Trogdon’s reference to Strevens (2008), we sense the effort to free the question of explanation from the captivity of pragmatics and from the captivity of the participants of knowledge.438 That is a respectable goal, but as skeptics we will always ask: how do we know that we are right? ✶✶✶ Reflections on mathematical explanation, or mathematical explanation in mathematics (see Mancosu 2018), correct the distorted notion of mathematics as a science of proof. In other words, they show that the creation, search for and finding of mathematical proofs is diverse and pluralistic. And here we are talking about pure mathematics, not about its applied form. Mancosu gives examples: (. . .) providing alternative proofs for known results, giving an account for surprising analogies, or recasting an entire area of mathematics on a new basis in the pursuit of a more satisfactory ‘explanatory’ account of the area. (Mancosu 2018)
Mancosu’s first example connects mathematical explanation with the justification of which mathematical proof of a given theorem is better. As in the case of explanations in the natural sciences, we seek support in simplicity, in the number of steps of the proof, in the number of mathematical concepts involved, etc. The decision itself cannot stand on the mere valid inference of the individual steps of the
438 “The relevant notion of explanation in this case is metaphysical in character, where what this is commonly taken to mean is that whether some facts bear an explanatory relation to others doesn’t depend on our explanatory interests or what we happen to understand (Strevens 2008, Ch. 1).” (Bliss, Trogdon 2014).
proof. And these extra ingredients form an extended concept of mathematical explanation. The second example points out that much of the understanding of mathematics, and of the building of mathematics, is based on the heuristics, or even the mathematical intuitions, of mathematicians. For example, Singh (1997) presented the magnificent story of the search for a proof of Fermat’s Last Theorem and through it introduced the idea of the Langlands program – the idea that solving a specific mathematical problem in one area of mathematics can lead to solving another problem in a completely remote area of mathematics. Fermat’s Last Theorem was assumed to be true on the basis of confidence in the validity of the Taniyama-Shimura conjecture, but the proof was not provided until Andrew Wiles (with the contribution of Richard Taylor) proved the Taniyama-Shimura conjecture in the 1990s. Assessing the degree of credibility of mathematical assumptions thus again goes beyond the classical grasp of mathematical proof. The third example refers to those situations in the history of mathematics when entire areas of mathematics changed fundamentally through the introduction of new mathematical concepts. The situation can be illustrated by the introduction of new conceptualizations of numbers, the conceptualization of infinite sets, the definition of the limit of a function, the introduction of the concept of a group of transformations, etc. The very concept of proof has evolved in this historical respect. It can be said that some unsolvable problems disappeared because the problem was redefined – for example, in connection with the concept of the limit of a function, the problem of infinitesimal quantities disappeared. Understanding a mathematical problem is time-dependent in this respect. We believe that this last dimension of the topic of mathematical explanation shows a certain pitfall of Mancosu’s reasoning – it conflates the concepts of proof, explanation and understanding. 
We, by contrast, have tried to delimit the area of scientific explanations as precisely as possible (see the beginning of part 2 of this book). Mancosu claims directly: While one could easily provide myriads of evaluations by mathematicians contrasting explanatory and non-explanatory proofs of the same theorem, it is important to point out that explanations in mathematics do not only come in the form of proofs (. . .). In some cases explanations are sought in a major conceptual recasting of an entire discipline. In such situations the major conceptual recasting will also produce new proofs but the explanatoriness of the new proofs is derivative on the conceptual recasting. This leads to a more global (or holistic) picture of explanation than the one based on the focus on individual proofs. (Mancosu 2018)
Overall, as in the case of metaphysical explanation, our goal is much simpler: we are looking for a suitable model of explanation for linguistics, which would solve
the presented dilemma of linguistics. The conceptualization of mathematical explanations leads to problems that, we believe, can be solved by non-mathematical scientific means – for example, in connection with the heuristics used by mathematicians, a cognitive-scientific solution would offer itself. But a more likely path for philosophy of mathematics will lead to metaphysics – how else to judge the "quality" of a mathematical proof? If there is an intra-mathematical way, then probably only in line with the mentioned new439 conceptual foundation of mathematics (Mancosu's third example above). Proof440 of a mathematical statement, explanation of a mathematical fact, and understanding441 of mathematical truth seem to us to be very different entities. The concept of explanation seems to us not to be entirely appropriate here.

✶✶✶

The last important type of non-causal explanation is distinctively mathematical explanation (DME) of non-mathematical facts. In our text, we pay it only partial attention (and an example of the Strawberry Problem is offered) because it is reflected in some considerations of quantitative linguists. There are a number of examples. The reader will find them mainly in Lange (2017), a book mentioned several times above. It should be noted that in addition to pure examples of DME (which perhaps include the Strawberry Problem and which could also include an explanation of conservation principles through symmetry principles – but cf. chapter 2.2), there are a number of mixed examples. In mixed examples, some mathematical abstraction plays an indispensable explanatory role, but it is difficult to say that it would play an explanatory role completely independently, without any other scientific content.442 Even perhaps the most famous example, presented by Baker in the paper "Are there genuine mathematical explanations of physical phenomena?" (Baker 2005), is, we believe, an interplay of biological facts and a number-theoretic fact.
Baker states: Three species of cicada of the genus Magicicada share the same unusual life-cycle. In each species the nymphal stage remains in the soil for a lengthy period, then the adult cicada
439 In this way, can we understand, for example, the definition of an algorithm through the concept of the Turing machine?
440 We usually understand the field of mathematical proofs as a crystal temple of mathematical abstractions. For comparison, the conception offered by David Deutsch in The Beginning of Infinity is interesting when he speculates that the theory of proofs could be co-opted into natural science (see Deutsch 2011).
441 Lakoff (1990) commented on the problem of understanding mathematical reality interestingly (see Lakoff 1990, 353–369).
442 This is also the case for Morrison's reflections on mathematical abstractions (Morrison 2015, 15–49).
emerges after either 13 years or 17 years depending on the geographical area. Even more strikingly, this emergence is synchronized among all members of a cicada species in any given area. The adults all emerge within the same few days, they mate, die a few weeks later and then the cycle repeats itself. (Baker 2005, 229)
It seems that we could say that the number-theoretical fact – 13 and 17 are prime numbers – explains why the cicadas in question have precisely these life cycles because, thanks to prime numbers, the probability of the cicada's life cycle intersecting with its predators' life cycles is minimized. The biological component of the explanation is the evolutionary principle of natural selection – cicadas increase their evolutionary advantage if they minimize the mentioned intersection (Mancosu 2018). Proponents of DME would claim at this point that the mathematical fact (prime numbers) stands at the top of the whole explanatory structure. In this case, as above (in the beginning of part 2 of this book) in the case of Erik Weber's critique of the Strawberry Problem, we are inclined to believe that without the biological content, the explanation would not work; that is, it is not a DME. In the text Zámečník (2021) we express other arguments against DMEs, which we call "ad hoc explanations". We draw attention to their arbitrariness, lack of anchoring, and vagueness. At the same time, we cannot shake off the impression that the motivation to create a DME is not always related to the effort to explain systematically and scientifically.443
For example, Baker’s motivation is as follows: “I have argued that there are genuine mathematical explanations of physical phenomena, and that the explanation of the prime cycle lengths of periodical cicadas using number theory is one example of such. If this is right, then applying inference to the best explanation in the cicada example yields the conclusion that numbers exist.” (Baker 2005, 236).
Appendix 6 Symmetry Breaking and Symmetry Maintaining

The symmetry principles and the "mechanism" of symmetry breaking have their fundamental place in the standard model of particles and interactions – in this case we talk about gauge symmetries. Unlike classical mechanics, which is related to the translational and rotational symmetries of Euclidean space and the translational symmetry of time, in the physics of the standard model, symmetries are related to the state Ψ-space. The transformation of this space leaves certain quantities invariant – the principles of gauge symmetries thus determine the principles of invariance (conservation laws) of certain quantities. Stenger speaks directly of:

The principle of gauge invariance: The models of physics cannot depend on the choice of coordinate axis in Ψ-space. (Stenger 2006, 77)
We can turn our gaze to the principles of symmetries and say that whenever we identify a quantity as invariant (with respect to the above-mentioned transformations), it indicates a type of symmetry. For example, the law of conservation of electric charge in quantum electrodynamics (QED) indicates the existence of the symmetry U(1).444 This is the basic symmetry of the standard model, which is not broken – this is reflected by the fact that the photons constituting the electromagnetic interaction have zero rest mass. In contrast to the basic symmetry of QED, the symmetry of the electro-weak theory SU(2)×U(1) is characterized by the fact that it is broken – the bosons that mediate the weak interaction have a non-zero rest mass. This spontaneous symmetry breaking can be explained by the existence of the Higgs mechanism, as a means of generating the masses of the electro-weak bosons in question (Stenger 2006, 105). Therefore, the discovery of the Higgs boson (2012) completed primarily the electro-weak theory within the standard model (for details, see Baggott 2013).445 For our linguistic needs – in the context of Luděk Hřebíček's reasoning (see the Second Interlude) – however, it is more suitable to stick to the symmetries of classical mechanics and also to use another type of symmetry breaking: "When a direct, causal mechanism can be found, the process is called dynamical symmetry breaking" (Stenger 2006, 102). This brings us to the basic definition of symmetry maintaining and symmetry breaking. With some simplification, one can say that:
444 Strictly speaking, it is a local symmetry U(1). We should therefore distinguish between local and global symmetries, as well as local and global symmetry breakings, etc. (see Bangu 2013). An overview of all fundamental symmetries and invariance principles in physics can be found in Stenger (2006, 113–114). Bangu (2013) describes the "gauge logic" of the standard model very clearly.
if symmetries ensure the invariance of some quantities and the independence of physical models from space transformations (in our case the three symmetries mentioned above) – the point-of-view invariance – then symmetry breaking allows something interesting to happen in a physical system. In this respect, the first such symmetry-breaking element is Newton's force, which transforms the kinematic system of Galilean mechanics into a dynamical system. The situation is more complicated because Newton's force – or Newton's law of force – is invariant under Galilean transformations (unlike the so-called Aristotelian law of force), but in non-inertial reference frames it leads to violations of the law of action and reaction – i.e. to the emergence of virtual inertial forces. If the Galilean transformations are linked to the three stated symmetries of classical mechanics, then the violation of the law of action and reaction constitutes space translational symmetry breaking. The way to eliminate this symmetry breaking is to introduce a more general space-time symmetry within the general theory of relativity. These two mentioned aspects are demonstrated by Stenger (2006), although he does not interlink them explicitly the way we do. Therefore, we will show the paths (1) from Aristotle to Galileo and Newton, and further (2) to Einstein.

To the first point (1), let us state that the Galilean transformation for the x-coordinate determines:

x′ = x − v_x t

Aristotle's law of force, expressed in modern notation, requires any movement to be the result of an acting force, which can be noted as follows:

F = kv

where v is the velocity of an object and k is a constant. Velocity is defined in Newtonian physics as the time derivative of the position vector; for our needs, it is sufficient to consider the time derivative of the x-coordinate:

v = dx/dt

If we apply the Galilean transformation to Aristotle's law of force, we obtain:

F′ = kv′ = k dx′/dt = k d(x − v_x t)/dt = k(v − v_x)
We see that Aristotle’s law of force is not an invariant of Galilean transformations. In other words, Aristotle’s law of force depends on the reference frame
Appendix 6 Symmetry Breaking and Symmetry Maintaining
239
through which we observe. And the scientific law has to, in principle, be independent of any choice of the reference frame. Conversely, Newton’s law of force states that force is needed only to cause accelerated motion (uniform rectilinear motion and rest are equivalent, as determined by the principle of inertia): F = ma where a is acceleration of a body and m is Newtonian mass of a body. Acceleration is defined as the second time derivative of a position vector, so for us the second time derivative of the x-coordinate: a=
d2 x dt2
If we apply Galilean transformations to Newton’s law of force, we see that: F′ = ma′ = m
d2 x′ d 2 ð x − vx t Þ d ð v − vx Þ =m =m = ma = F 2 dt dt2 dt
We see that Newton’s law of force is an invariant of Galilean transformations; it does not depend on the reference frame (the observer), and therefore behaves like a real scientific law (Stenger, 2006, 208–209). ✶✶✶ We now turn to point (2), that in non-inertial reference frames, Newton’s law of force ceases to be a Galilean invariant and leads to violation of the law of action and reaction. If the reference frame moves with acceleration, the inertial observer will describe the position based on the application of Galilean transformations as follows: 1 x′ = x − vx t − at2 2 where vx is the initial velocity and a is acceleration of the system. If we express the law of force for an inertial observer describing a noninertial system, then we get: d2 x′ d dx d2 x − vx − at = m 2 − ma = F − ma F′ = m 2 = m dt dt dt dt
In non-inertial systems, from the point of view of inertial observers, virtual forces arise, which we call the inertial forces and which contradict the law of action and reaction:

F_s = m_s a_s

The space translational symmetry breaking is overcome only in the general theory of relativity by means of translational and rotational symmetries of spacetime (Stenger 2006, 62–63, 209–210).446
446 The derivation of the equations for the non-invariance of the Aristotelian law of force, the invariance of Newton's law of force and the formation of inertial forces was taken from appendices 33 and 40 of our book Zámečník (2015, 358–359, 367–368).
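The two invariance claims of this appendix can be checked numerically (a sketch of our own; the quadratic trajectory and the boost velocity are arbitrary choices, not values from Stenger 2006):

```python
def d(f, t, h=1e-5):
    """Central-difference first derivative."""
    return (f(t + h) - f(t - h)) / (2 * h)

def d2(f, t, h=1e-4):
    """Central-difference second derivative."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h ** 2

x = lambda t: 2.0 + 3.0 * t + 2.0 * t ** 2  # arbitrary trajectory x(t)
vx = 10.0                                   # Galilean boost velocity
xp = lambda t: x(t) - vx * t                # x' = x - vx*t

t0 = 1.0
print(d(x, t0) - d(xp, t0))    # = vx: velocity (hence F = kv) is frame-dependent
print(d2(x, t0) - d2(xp, t0))  # ≈ 0: acceleration (hence F = ma) is invariant
```

The first difference reproduces the boost velocity v_x exactly, mirroring F′ = k(v − v_x); the second vanishes, mirroring F′ = F.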
Appendix 7 Realism and Anti-realism

For proponents of anti-realism,447 or instrumentalism in the interpretation of scientific theories – i.e. of the claim that theories do not refer to the structure of reality, that theoretical terms do not represent entities in reality, that theories are mere tools of our empirical organization – the main argument against realism is that, as we know from the history of science, scientific theories have proved to be untrue in the light of new empirical evidence many times. The question, then, is why shouldn't the same fate befall our current scientific theories? We see that we arrive at this knowledge through experience – this process is, therefore, sometimes called pessimistic meta-induction (for details, see French 2014, 2–5).448 For proponents of realism in the interpretation of scientific theories – i.e. of the claim that theories (approximately) truly represent reality (its objects or structures of relationships) – the main argument is the predictive success of science, or the steady progress of scientific knowledge. The predictive success of science, if realism were invalid, would quite naturally give the impression of a miracle that demands a (metaphysical?) explanation.449 Typically, this problem is revealed when considering the theoretical terms and entities of scientific theories. Although we have experience with eliminating a number of theoretical entities (phlogiston, ether, etc.), we also have experience with the development of experimental and observational tools of science (the telescope, the microscope, particle accelerators, etc.; in linguistics, e.g. corpora and computational methods). Therefore, we do not want to accept that electrons, quarks or Higgs bosons do not exist – these instruments do not manufacture them; they only help extract them from reality.
In linguistics, a theoretical entity is, for example, a word or a sentence; but maybe one day, we will find an (extraterrestrial) language in which we will not be able to identify them. Similarly, a theoretical entity is generativist recursion, whose unquestionability – in terrestrial languages – has been challenged by revealing counterexamples, although they have the nature of omitted anomalies.450 A reasonable answer, in the spirit of Davidson's critique of relativism (see Davidson 1974),

447 We can classify logical positivists as anti-realists, but in a somewhat implicit mode – in their approach, there were no questions about reality. Also in Popper's original conception – in the spirit of the Logic of Scientific Discovery – anti-realism is implicitly present: hypotheses are bold assumptions that are sometimes corroborated and sometimes refuted by new empirical evidence, but we do not expect any scientific hypothesis or even theory to hold forever.
448 At the same time, French points to the second fundamental argument against realism – underdetermination (French 2014, 21–24).
449 This is Hilary Putnam's "no miracles argument" (see Putnam 1975). The term "inference to the best explanation" is also used – predictive success is best explained by the validity of realism.
450 We have already mentioned Everett's language of the Pirahã tribe (see note 9).
seems to be that if we are able to identify a hitherto undiscovered (perhaps extraterrestrial) "language", then only by recognizing our linguistic theoretical entities in it, or by being able to arrive at new theoretical entities that suit all languages, will we understand the relationship (or dependence, or reduction451) of our old theoretical entities to newly conceived theoretical entities. However, some scientific theories do not guide us unambiguously. In the context of debates on the interpretation of quantum theory, Bas van Fraassen arrives at the formulation of his constructive empiricism (for a definition of "constructive empiricism", see van Fraassen 1980, 11–13), which is often classified as one of the variants under the general concept of anti-realism.452 Quantum theory does not have a single axiomatic framework that dictates to us what to reduce our old theoretical entities to.453 Van Fraassen has pointed out that clinging to realism can hamper scientific research, or rather that we have a tendency to adapt the interpretation of the theoretical entities of theories to what we know intimately. This, in fact, corresponds to the sacrifice of the empirical stance in favor of a dogmatic prejudice. And van Fraassen understands the empirical stance as a permanent struggle against metaphysics (see van Fraassen 2002). The conflict between realism and anti-realism454 is an evergreen of philosophy of science, which may signal a certain rigidity and excessive academicity of this dispute. However, we believe that a certain minimum of realism can hardly be given up. In this context, John Searle defines "external realism" (ER), a "thesis" which enunciates that our (not only) scientific representations are representations of an independent external world; the external world is independent of our representations. For Searle, this is a clear defense against relativism.455 Regarding external realism, Searle states:

Realism is the view that there is a way that things are that is logically independent of all human representations. Realism does not say how things are but only that there is a way they are. (Searle 1997, 155)

ER is thus not a thesis nor an hypothesis but the condition of having a certain sort of theses or hypotheses. (Searle 1997, 178)

451 In the philosophy of science, it is common to distinguish between entity/theory reduction and entity/theory elimination. Newton's theory of gravitation was reduced to Einstein's theory of gravitation, while the phlogiston theory of combustion was eliminated and replaced by the oxygen theory of combustion. Newton's gravitational field was reduced to Einstein's gravitational field, while phlogiston was eliminated and replaced by oxygen.
452 However, at other times it is seen as a compromise between realism and anti-realism. See the SEP entry for details (Monton, Mohler 2017).
453 There are a number of interpretations of quantum mechanics: the many-worlds interpretation, strictly nonlocal interpretations, hidden-variable interpretations, etc.
454 The most important distinction within realism is between object and structural realism. For an object realist, the real entities/objects are those postulated by (true) theories; for a structural realist, the real structures of relations are those expressed by scientific laws/principles. Structural realism has the advantage of opening up space for non-causal explanation, but its disadvantage is that it is difficult to apply to theories outside fundamental physics. However, French tries to relate it to biology (see French 2014, 324–352).
However, Searle himself is aware that the “thesis” of external realism suffers from certain problems; he formulates a call for philosophers to try to evaluate the special epistemic status of the “thesis” that “there is a world independent of our representations.”456 It is not a trivial problem to explicate the status of the “world”457 presented in this way. Searle admits that this is a transcendental thesis.458 What else can we say about this world objectively other than that “it is”? Any structure or process that we conceive will already be our representation, which may not be objectively valid. If the world were a regulatory/transcendental idea, then it could be defined as a “place of epistemic harmony”, i.e., a “seemingly” metaphysical question would turn into an epistemological form: any knower is able to reach epistemic harmony with all other knowers by means of scientific explanations. In other words, it is not possible to arrive at fundamentally incompatible scientific theories. In other words, all knowers converge to a common interpretation of the world.459
455 Formulated primarily against social constructivism, see Searle (1997), Searle (2010). Such defenses, more broadly oriented, include Davidson (1974), Putnam (1975), Boghossian (2006).
456 Searle directly states: "What exactly is the status of propositions such as that there exists a reality independent of representations of it?" (Searle 2012, 200).
457 For Davidson, the world seems to be a transcendental idea (see Davidson 2001).
458 "(. . .) I think of ER not as a theory among others, but rather as a condition of a certain kind of intelligibility." (Searle 2012, 200).
459 We deal with this question in relation to the cosmological principle in the paper Zámečník (2012).
Appendix 8 Deterministic and Statistical Law

Conceptual difficulties in discussing the scientific law can be demonstrated by the often presented dichotomy between a deterministic law and a statistical law, which also complicates the situation in system-theoretical linguistics (see chapter 5.2). The terms deterministic law and causal law are not equivalent – although every causal law is a deterministic law, the reverse does not hold.460 We can certainly call conservation laws deterministic, even though we characterized them above as non-causal principles (see chapter 2.2). Statistical laws can be called non-deterministic laws only if we define non-deterministic laws by the absence of a causal nexus – but that would, strictly speaking invalidly, again declare the terms deterministic law and causal law to be equivalent. It would be more appropriate to talk about non-deterministic tendencies, or better, statistical tendencies, as contrasted with deterministic laws. Nevertheless, after the previous analysis, we still lack conceptual clarity – statistical tendencies often reveal (in some examples) a causal nexus,461 or statistical tendencies may refer to the existence of a scientific principle that is universal in nature and could be called deterministic (the economization principle, the optimization principle). Of course, deterministic and statistical laws are directly differentiated on the basis of experimental results – i.e. a deterministic law does not tolerate an exception (a body does not start to rise spontaneously in a homogeneous gravitational field instead of falling), while a statistical law tolerates many exceptions (and therefore we have criteria for the validity and reliability of statistical testing).
But the case of a statistical test is a case of complex dependencies in which one tendency (for example, to shorten words) can be broken by another tendency (for example, to enlarge a lexicon), even though there is still an invariant structure of the system. Finally, we have classical statistical mechanics and thermodynamics, although we fully trust the principles of classical mechanics (in an approximate sense). We also have quantum statistics (in many ways different from standard

460 A four-stage definition of determinism in connection with the theory of dynamical systems can be found in Kellert (1993, 50).
461 Above in the text (in the fifth part of the book, mainly in chapter 5.5), we touched on the relationship between statistical correlations and causality. However, we cannot proceed in this way for all statistics – for example, in quantum mechanics it is not a matter of finding a causal nexus. We are once again in close proximity to metaphysics – a reasonably defined concept of causality requires the concept of identity. French argues that quantum particles (for which the Bose-Einstein statistics apply), unlike classical ones (for which the Maxwell-Boltzmann statistics apply), have no identity. And he considers this finding to be the main argument for structural realism (against object realism, see French 2014, 33–47). In Appendix 10 we show how Herdan interprets the peculiarity of quantum statistics for the case of linguistics.
statistics), although we fully trust the conservation laws that underlie the standard model of particles and interactions. What is the best way out of these conceptual ambiguities? Apparently, the whole terminological pair of determinism and indeterminism is unsuitable for scientific discourse, and it will be best to get rid of it. There are simple and complex systems, and we often use statistical methods to describe complex systems. We are then able to find statistical tendencies,462 which, however, should not be described as sui generis laws. For both simple and complex systems, however, we assume – if we want to explain how the system "works" – certain invariant properties. The definition of these invariant properties is condensed into the form of scientific principles. This may sound too radical, but we believe that if we choose a uniform terminology and assign a basic component to the systemic approach in the form of a scientific principle, we will reach a basic level for meaningfully defining scientific explanation in the most universal form possible (see chapter 2.3). This will allow us, at least for a brief moment, to see the scientific principles of the standard model of particles and interactions in close proximity to the scientific principles of system-theoretical linguistics. And we can also use the conceptual pair of symmetry maintaining (SM) and symmetry breaking (SB) (see Appendix 6). In the context of symmetry breaking, both systems – physical and linguistic – meet again, as Hřebíček has already announced (see the Second Interlude). Maybe it is just the "quality of the process in the system" – SM is more typical of physics in the standard model, but physicists are aware of the importance of SB; while in linguistics the original SM is often almost invisible despite constant SB (structuralism made SM visible for the first time).
462 We often declare them in linguistics, for example, Zipf's law, etc.; but it is much more natural to include these tendencies among other statistical distributions (side by side with the normal and lognormal distributions, etc.). See above the chapters focused on quantitative and especially system-theoretical linguistics (chapters 5.2 and 5.3).
Appendix 9 The Pumping Theorem for a Context-free Grammar and the Case of Swiss German

This appendix is based on the book Partee, ter Meulen, Wall (1993). We will refer to relevant sections directly in the text. The pumping theorem for a context-free grammar (CFG) shows the difference between the "insertion of a loop" in a regular grammar (RG) (cf. Partee, ter Meulen, Wall 1993, 468–471) and the "insertion of a subtree" in a CFG:
1. Create a basic tree (S, A are non-terminals and u, x, z are terminals; given here in labeled bracket notation):

[S u [A x] z]

2. Insert a subtree rooted in A:

[A v A y]

3. In this way, create a tree:

[S u [A v [A x] y] z]

4. Inserting the subtree can be repeated:

[S u [A v [A v [A x] y] y] z]
The general form of a language generated by this CFG is L = uv^i xy^i z, i ≥ 0 (Partee, ter Meulen, Wall 1993, 493–494). More formally, we can express the pumping theorem as follows: If L is an infinite context-free language, then there is some constant K such that any string w in L longer than K can be factored into substrings w = uvxyz such that v and y are not both empty and uv^i xy^i z ∈ L for all i ≥ 0 (Partee, ter Meulen, Wall 1993, 494).
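The "pumping" schema above can be mirrored in a few lines of code (an illustrative sketch of our own; the string names u, v, x, y, z follow the theorem's notation):

```python
def pump(u, v, x, y, z, i):
    """Yield of the tree after inserting the A-subtree i times: u v^i x y^i z."""
    return u + v * i + x + y * i + z

print([pump("u", "v", "x", "y", "z", i) for i in range(3)])
# → ['uxz', 'uvxyz', 'uvvxyyz']
```

Each insertion of the subtree adds one v and one y simultaneously, which is exactly the paired growth that the pumping theorem exploits.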
As with RG, we use the pumping theorem to test whether a language is an infinite context-free language (CFL). That is, if the language cannot be expressed as L = uv^i xy^i z, i ≥ 0, then it is not a CFL (Partee, ter Meulen, Wall 1993, 495).

✶✶✶

Shieber (1985) has shown that Swiss German (SG) is not a CFL – the evidence seems to be bulletproof empirically as well as formally.

Example: A grammatical sentence in Swiss German (according to Partee, ter Meulen, Wall 1993, 502):

Jan säit das mer em Hans [dat.] es huus [acc.] hälfed aastriiche
[John said that we Hans(dat.) the house(acc.) helped paint]

There is a cross-serial dependence in the sentence as well as a case dependence (dative and accusative).

Empirical evidence (according to Partee, ter Meulen, Wall 1993, 502):
1. Sentences in which there is no case dependence are systematically categorized as ungrammatical by native speakers.
2. The length of such sentence constructions is not limited – in other words, the restriction is of a purely performative nature (to the extent that the speaker is able to express a sentence of a given length).

Another example (according to Partee, ter Meulen, Wall 1993, 502):

Jan säit das mer d'chind [acc.] em Hans [dat.] es huus [acc.] lönd hälfe aastriiche
[John said that we the children(acc.) Hans(dat.) the house(acc.) let help paint]
Formal confirmation (according to Partee, ter Meulen, Wall 1993, 502–503):
1. Create the regular language R (according to Partee, ter Meulen, Wall 1993, 502):

Jan säit das mer (d'chind)* (em Hans)* es huus haend wele (laa)* (hälfe)* aastriiche
John said that we (the children)* (Hans)* the house have wanted to (let)* (help)* paint

2. Create the intersection of Swiss German SG and the regular language R, i.e. L = SG ∩ R (according to Partee, ter Meulen, Wall 1993, 503):

Jan säit das mer (d'chind)^n (em Hans)^m es huus haend wele (laa)^n (hälfe)^m aastriiche

The following applies to the language L:
a) The number of nouns in the accusative (n) coincides with the number of verbs requiring the accusative (n). The number of nouns in the dative (m) coincides with the number of verbs requiring the dative (m) (Partee, ter Meulen, Wall 1993, 503).
b) All accusative-case nouns precede all dative-case nouns. All accusative-case marking verbs precede all dative-case marking verbs (Partee, ter Meulen, Wall 1993, 503).
3. The language L can be expressed formally:

L = w a^n b^m x c^n d^m y

Using the pumping theorem, one can prove that L is not a context-free language (CFL). Since the intersection of a CFL and an RL is again a CFL,463 we can conclude that SG is not a CFL (Partee, ter Meulen, Wall 1993, 503).
463 For comment on this, see Partee, ter Meulen, Wall (1993, 497).
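The counting conditions (a) and (b) can be sketched as a small recognizer for the abstract pattern L = w a^n b^m x c^n d^m y (an illustrative sketch of our own, with w, x, y as fixed filler symbols standing in for the invariant parts of the sentence):

```python
import re

def in_L(s, w="w", x="x", y="y"):
    """Recognize w a^n b^m x c^n d^m y with matching exponents n and m."""
    m = re.fullmatch(re.escape(w) + r"(a*)(b*)" + re.escape(x)
                     + r"(c*)(d*)" + re.escape(y), s)
    return bool(m) and len(m.group(1)) == len(m.group(3)) \
                   and len(m.group(2)) == len(m.group(4))

print(in_L("waabxccdy"))   # True:  2 accusative nouns/verbs, 1 dative noun/verb
print(in_L("waabxccddy"))  # False: dative counts differ (1 noun vs. 2 verbs)
```

The equality of the two exponent pairs across the intervening x is exactly the cross-serial dependency that the pumping theorem shows no CFG can enforce.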
Appendix 10 Herdan’s “New Statistics” and the Identity of Indiscernibles
249
Appendix 10 Herdan's "New Statistics" and the Identity of Indiscernibles

Herdan often uses physical analogies in the TLCC, but his use of the new quantum statistics in connection with the analysis of the lexical plane of language is by far the most notable.464 We believe that Herdan does not use physical examples as metaphors; his approach corresponds to the search for an isomorphism between the physical and linguistic systems. Thus, for example, he believes that the approach Planck chose when formulating the law of absolute black-body radiation is exactly the approach Herdan himself used to derive the random-partitioning function (RPF).465 Unlike the classical statistics that Herdan employs on the phonetic level, he uses the new quantum statistics on the lexical level – specifically the Bose-Einstein statistics.466 Herdan does not enunciate it explicitly, but we can say that he means that words behave like bosons. This comparison seems a bit eccentric, but let us keep following it because it shows how a given statistical description can act naturally in the linguistic field, even though for the physical field it means a radical redefinition of the basic unit of description. Let us take the sentence scheme "I . . . what I . . . and I . . . what I don't . . .", and consider in how many ways we can supplement it with the word "know" to fulfill the sentence scheme. The obvious answer that there is only one way will not surprise anyone. The word type "know" is realized by four tokens of "know", which are fully interchangeable (indistinguishable) – they do not have
464 We have no room to present Herdan's arguments for the need to distinguish statistics at the phonetic and lexical levels. The basic difference is related to the nature of fluctuations at both levels, where the phonetic level suffices with linear fluctuations, while the lexical one already requires quadratic fluctuations. And for Herdan, the occurrence of quadratic fluctuations is also a link to quantum statistics (see TLCC, 249–253 for details). Regarding the changes in statistics, he states: "(. . .) the investigation of the 'particles' of communication by language, such as words, phonemes, letters etc., have not attracted much attention on the part of statisticians (Yule being an exception), although communication engineers and linguists have made the study of these universes from their own angle part of their respective branches of knowledge." (TLCC, 422).
465 "Now let in the above E stand for the total text length as the outcome of energy of communication by speech, let N stand for the number of segments into which the text is divided, and P for the number of occurrences of a word, and Planck's reasoning is recognised as that which led to the derivation of the generalised R.P.F. (. . .)." (TLCC, 256). RPF plays an important role in Herdan's analyses (cf. TLCC, 219–248).
466 We encountered this kind of statistics above (see note 461) when considering the clash of object and structural realism.
their own identity. The nature of the unit in linguistics is always based on this obvious distinction between token and type. In classical statistical physics, however, the analogous situation is radically different: there we have a particle as a type realized in many instances which, however, have their own identity. In classical statistical physics, of course, this view is related to determining the number of microstates that are realized in a given system – if we want to distribute n distinguishable particles over n positions, one per position, we can do it in n! ways (that is, for n = 4, there are 24 ways). In Bose-Einstein statistics, however, this division can be realized in only one way (the individual particles, or bosons, have no identity of their own), quite analogously to our sentence: "I know what I know and I know what I don't know." Herdan expresses the basic difference between traditional and new statistics explicitly in the context of random partitioning, and he explicitly analogizes the basic units of physics with the basic units of linguistics. And he considers the indistinguishability of basic units at different levels of description to be the key issue:

The characteristic difference between the statistics used on this level of language and that used on the phonemic level where, as explained above, conventional statistics are, by and large, appropriate, consists in this: whereas combinatorics in the conventional sense works with the principle of indistinguishability of elements – phonemes and alphabetic signs, in our case –, one must assume on the vocabulary level also indistinguishability of the segments of language in the line, into which the whole has been divided. To this corresponds homogeneity of style in terms of use of vocabulary in segments of different length, which means of different duration in units of linguistic time.
But such complete indistinguishability is essentially also what differentiates BOSE-EINSTEIN statistics or the ‘New Statistics’ from ‘Classical Statistics’: in addition to the indistinguishability of the particles, it assumes also indistinguishability of the cells of phase space over which the particles are distributed (. . .). (TLCC, 426)467
Despite this definition, however, we have to admit that we do not understand why the "new statistics" should not apply to the phonetic level as well.
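The counting difference at the heart of this appendix can be made concrete in a short Python sketch (ours, not Herdan's; the token labels are purely illustrative). Treating the four tokens of "know" as labelled, classical particles yields 4! = 24 distinct fillings of the sentence scheme; treating them as indistinguishable Bose-Einstein "particles" yields exactly one. The binomial coefficient then gives the Bose-Einstein count for random partitioning of P indistinguishable occurrences over N segments:

```python
from itertools import permutations
from math import comb, factorial

# Classical (distinguishable) view: four labelled tokens, one per slot.
labelled = ("know_1", "know_2", "know_3", "know_4")
assert len(set(permutations(labelled))) == factorial(4)  # 24 distinct fillings

# Bose-Einstein view: tokens of the type "know" carry no identity of their own.
unlabelled = ("know",) * 4
assert len(set(permutations(unlabelled))) == 1           # a single filling

# More generally, distributing P indistinguishable occurrences over N segments
# gives the Bose-Einstein count C(N + P - 1, P); here N = P = 4.
assert comb(4 + 4 - 1, 4) == 35
```

The `set(permutations(...))` trick makes the identity of indiscernibles computational: once the labels are removed, all 24 orderings collapse into one.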
Appendix 11 Linguistic Duality and Parity in Physics

If we still wanted to understand Herdan's physical examples as mere analogies, his final reflections on the relationship of linguistic duality to the physical level of description make this very difficult:

As a last step, we shall now show that what I have described under the name of Linguistic Duality has also its exact counterpart in the area of physical particles. In this case it is, however, no longer a matter of similar statistical procedures, but an exact correspondence between certain fundamental properties of language structure and certain fundamental principles of elementary particles [emphasis mine]. (TLCC, 431)
Herdan attempts to connect all examples of linguistic duality and find their common source.468 Strictly speaking, if we really understood Herdan’s view not as a “mere” analogy, but as a real theory of language, then it would be a direct reduction of linguistics (and also neuroscience) to physics. Herdan explicitly states that: This theory receives support from certain developments in theoretical physics connected with the behaviour of elementary particles. These have led to conclusions that make the above observation about linguistic duality and the laterality of the speech function appear to be in full accordance with a general law of nature. (TLCC, 431)
Herdan points to a physical example of the violation of a basic conservation principle – parity invariance, which is associated with parity symmetry.469 This violation occurs in weak nuclear interactions (as opposed to strong ones).470 Herdan establishes an analogy between the physical concepts of strong and weak nuclear interactions and biological-neuroscientific concepts:

The analogue to this on the level of the human frame is the dichotomy between the strong interaction of the parts and organs of the human body, on the one hand, and the weak interactions between the units of language in use. (TLCC, 435)
If Herdan had been writing the book today, he might have used the concept of spontaneous symmetry breaking and incorporated it into his own theory of
468 "(. . .) the various forms of duality in language as established by Boole (law of duality), Trubetzkoy (phonological opposition), Jakobson (distinctive feature opposition), Information Theory (binary coding), Language statistics (repeat rate) and by myself as linguistic duality were all established independently, it would be rather unscientific to leave it at that and not aim at a unification of these forms by reference to some common cause." (TLCC, 431).
469 Herdan refers to the lecture of Blin-Stoyle (1964). For a transcription, see TLCC (432–435).
470 The current standard model can already incorporate parity violation into the Weinberg-Salam theory of the electro-weak interaction. The discovery of the Higgs boson in 2012 corroborates this theory (cf. Stenger 2006, 82–85).
language. In the 1960s, he was inspired by fresh physical discoveries and formulated the following surprising solution to the problem of parity on a linguistic, but in fact also on a psychological (and biological) level: In terms of the physics of elementary particles, we would say that the thought-spinning brain emits the elementary particles of communication (the forms of language) only laterally, lefthanded, and that the law of conservation of parity breaks down here, were it not for the fact of every particle having its anti-particle, namely every linguistic statement having its dual according to the law of duality. This restores the conservation of parity, and brings language into line with the other functions of the human body. (TLCC, 436)
This is, of course, an analogy471 that leaves questions about parity violation in physics itself completely unanswered. In fact, Herdan argues that through the principle of linguistic duality we can overcome the breaking of symmetry given by the uni-laterality of speech functions (located only in the left hemisphere) and align it with the symmetry of the biological body. Leaving aside obvious objections (e.g. that the human body is not perfectly symmetrical, that there are unpaired organs, etc.), an essential proviso remains: nothing entitles us to claim that this is a correspondence between the physical and linguistic levels of description. Although Blin-Stoyle speaks of the salvation of the conservation laws of physics by referring to the existence of antiparticles (TLCC, 435), Herdan's assertion that linguistic duality corresponds to particle-antiparticle duality is again an analogy that can only refer to a very general principle of duality. But such a general principle of duality is not a scientific principle. The unification that Herdan speaks of merely supplies another interesting example of binary oppositions; it does not define a principle that would actually explain physical, biological and linguistic phenomena. The question, however, is whether Herdan's extreme attempt is not a potential extreme of our own efforts to define a principle-based model of explanation for system-theoretical linguistics; whether it is actually an explication of the limits of our possibilities of system conceptualization – through symmetries and their breakings. Are we not ourselves venturing into the traps of the metaphysical underpinnings of scientific theories? Nevertheless, with a bit of exaggeration, we can say that Gustav Herdan is a dual projection of Roger Penrose (cf. Penrose 1997).
471 The systematic introduction of the literal (and analogous) use of physical quantities is applied to the phonetic level by Udo Strauss (1980). Strauss was directly inspired by (among others) Herdan. Thus, for example, he finds an analogy to Heisenberg's uncertainty relations for the energy concept of the phonetic system (Strauss 1980, 61).
Appendix 12 Mario Bunge

Mario Bunge (September 21, 1919, Buenos Aires – February 24, 2020, Montreal) was one of the founders of the tradition of analytic philosophy in Argentina.472 In this regard, according to Lombardi, Cordero and Pérez Ransanz (2020), he was influenced by Hans Lindemann, who emigrated to Argentina in the 1940s. Lindemann popularized Bertrand Russell's philosophy and the ideas of the Vienna Circle in Argentina. Bunge and his colleagues (Gregorio Klimovsky and Julio Rey Pastor) developed Lindemann's inspiration in further discussions and courses (Lombardi, Cordero, Pérez Ransanz 2020). However, Bunge first studied physics (1942, La Plata), then obtained a doctorate in physics (1952, La Plata, specializing in nuclear physics, supervised by the Austrian physicist Guido Beck) and was professionally active in physics (Matthews 2019, 6).473 Bunge was behind the development of philosophy of science in Argentina in the 1950s and 1960s, both professionally (see the overview of his works) and organizationally (Bunge led the Círculo Filosófico association, was active in the Agrupación Rioplatense de Lógica y Filosofía Científica, etc.) (Lombardi, Cordero, Pérez Ransanz 2020). According to Matthews, Bunge became a world-famous philosophical celebrity in 1956 at the Inter-American Philosophical Congress in Santiago de Chile (Matthews 2019, 6–7). Quine is known to have said of him later:

The star of the philosophical congress was Mario Bunge, an energetic and articulate young Argentinian of broad background and broad, if headstrong, intellectual concerns. He seemed to feel that the burden of bringing South America up to a northern scientific and intellectual level rested on his shoulders. He intervened eloquently in the discussion of almost every paper. (Quine 1985, 266; cited by Matthews 2019, 7)
At the turn of the 1950s and 1960s, Bunge worked as a professor of physics and philosophy, first at La Plata University and then in Buenos Aires. From 1966 he was a Professor of Philosophy at McGill University in Montreal (Romero 2019, 292). Bunge is the author of the very first philosophical book with an analytical focus in Latin America. It is the book Causality: The Place of the Causal Principle in Modern Science (Bunge 1959), originally published in English. A year later, he published Antología Semántica (Bunge 1960), which contains the first Spanish translations of a number of famous personalities of the analytical tradition (see
472 Lombardi, Cordero and Pérez Ransanz (2020) state that (Comtean) scientism and positivism were popular in Argentina. Among others, Bunge's uncle, the lawyer Carlos Bunge, leaned towards them.
473 Among his papers we can find a short paper for Nature, Bunge (1945).
Perez 2018). Subsequently, however, he moved from Argentina first to the USA (1963) and then to Canada (1966), where he worked at McGill University for the rest of his life (see Lombardi, Cordero, Pérez Ransanz 2020). The source of much information about the life and work of Mario Bunge is the Festschrift published on the occasion of Bunge's 100th birthday (Matthews [ed.] 2019), which also includes a complete summary of Bunge's publications. We will present here only Bunge's eight-volume Treatise on Basic Philosophy, published gradually from 1974 to 1989 by Reidel (Boston, Dordrecht):

1974 Semantics I: Sense and Reference.
1974 Semantics II: Interpretation and Truth.
1977 Ontology I: The Furniture of the World.
1979 Ontology II: A World of Systems.
1983 Epistemology and Methodology I: Exploring the World.
1983 Epistemology and Methodology II: Understanding the World.
1985 Epistemology and Methodology III: Philosophy of Science and Technology: Part I. Formal and Physical Sciences.
1985 Epistemology and Methodology III: Philosophy of Science and Technology: Part II. Life Science, Social Science and Technology.
1989 Ethics: The Good and the Right.

These volumes of the Treatise on Basic Philosophy show that Bunge, unlike most famous personalities in analytic philosophy, strove to create a comprehensive system of philosophy. In this respect, his conception of philosophy was universalist. The specificity of his system – tied to his background in analytic philosophy and philosophy of science – lies in its semantic basis, which is only then followed by ontology, epistemology and ethics. Matthews (2019, 7–20) discusses various aspects of Bunge's philosophical system. Dozens of texts in the Festschrift itself (Matthews [ed.] 2019) follow the diverse directions of Bunge's interests, with regard to the philosophy of physics, biology, sociology, the cognitive sciences, but also political science and education.
Appendix 13 An Overview of Requirements in System-theoretical Linguistics
Requirement | Symbol | Influence on
Coding | Cod | Size of inventories
Specification | Spc | Polysemy
De-specification | Dsp | Polysemy
Application | Usg | Frequency
Transmission security | Red | Length of units
Economy | Ec | Sub-requirements
Minimisation of production effort | minP | Length, complexity
Minimisation of encoding effort | minC | Size of inventories, polysemy
Minimisation of decoding effort | minD | Size of inventories, polysemy
Minimisation of inventories | minI | Size of inventories
Minimisation of memory effort | minM | Size of inventories
Context economy | CE | Polytextuality
Context specificity | CS | Polytextuality
Invariance of the expression-meaning-relation | Inv | Synonymy
Flexibility of the expression-meaning-relation | Var | Synonymy
Efficiency of coding | OC | Sub-requirements
Maximisation of complexity | maxC | Syntactic complexity
Preference of right-branching | RB | Position
Limitation of embedding depth | LD | Depth of embedding
Minimisation of structural information | minS | Syntactic patterns
Adaptation | Adp | Degree of adaptation readiness
Stability | Stb | Degree of adaptation readiness
Figure 8: Requirements in System-theoretical Linguistics (QSA, 179).
Appendix 14 Syntactic Subsystem in System-Theoretical Linguistics

Figure 9: The Structure of the Syntactic Subsystem (QSA, 201).474

Credit Line: Köhler, Reinhard, Quantitative Syntax Analysis, Berlin: De Gruyter Mouton, 2012, p. 201, tab. 4.33.
Appendix 15 Speech Self-organization

Above all, it is Wildgen's extensive subchapter "Sprache und Selbstorganisation: Anwendung der Theorie dynamischer Systeme in der Sprachwissenschaft" ["Language and self-organization: applying the theory of dynamical systems in linguistics"] (Wildgen, Mottron 1987, 87–215) which attempts to apply the theory of dynamical systems (and the concept of chaos), Thom's catastrophe theory and Haken's synergetics to a new conceptualization of the individual language levels, especially the phonetic, lexical-semantic and syntactic ones. Wildgen does so by means of analogies, but also by inferences from some schematized models, in the spirit typical of the 1980s and 1990s, when many authors exploited the interdisciplinary nature of models from dynamical systems theory (again, e.g., Kellert 1993, Smith 1998b). Unfortunately, Wildgen has failed to transfer the formal means of any of these source theories into a linguistic context in a coherent form. Nowhere in the whole text do we find the formalization so typical of dynamical systems theory. Stephen Kellert (2008) considers the impossibility of transferring the formal means of a source theory to a new context an important indicator of the unsuitability of the whole conceptual borrowing, especially in the chapters "Metaphorical chaos" and "How to criticize a metaphor" (cf. Kellert 2008, 103–148). Balasubrahmanyan and Naranan tried to connect system-theoretical linguistics closely with complexity theory in a pair of papers from the second half of the 1990s (Balasubrahmanyan, Naranan 1996, Naranan, Balasubrahmanyan 1998). They are also referred to (among others) by Ramon Ferrer-i-Cancho and Ricard V. Solé (Ferrer-i-Cancho, Solé 2001a) in rethinking the traditional concept of Zipf's law.475 These authors (see chapter 5.4 above) represent, especially thanks to Ferrer-i-Cancho, a unique line of further development of quantitative linguistics, one which does not give up the search for general principles of quantitative-linguistic theories.
On the contrary, they attempt to expand these principles beyond linguistics (in psychology, neuroscience, etc.).
475 They notice that the exponent in Zipf’s power law typically takes on two different values. “The two observed exponents divide words in two different sets: a kernel lexicon formed by about N versatile words and an unlimited lexicon for specific communication. We suggest that the change in exponents is related to the amount of words speakers are able to store and use efficiently. Human brain constraints could be involved.” (Ferrer-i-Cancho, Solé 2001a, 170).
Appendix 16 Unified Approach in Quantitative Linguistics

Altmann and Wimmer define a unified approach by means of two assumptions (see chapter 5.3 above), which they then formalize for a continuous and a discrete approach. In the case of the continuous approach, it is the differential equation:

\[
\frac{dy}{y - d} = \left( a_0 + \sum_{i=1}^{k_1} \frac{a_{1i}}{(x - b_{1i})^{c_1}} + \sum_{i=1}^{k_2} \frac{a_{2i}}{(x - b_{2i})^{c_2}} + \ldots \right) dx
\]

for c_i ≠ c_j and i ≠ j (Wimmer, Altmann 2005, 792). Wimmer and Altmann comment on the constants a_ij:

The constants a_ij must be interpreted in every case differently; they represent properties, "forces", order parameters, system requirements etc. which actively participate in the linkage between X and Y (cf. Köhler 1986; . . .) but remain constant because of the ceteris paribus condition. (Wimmer, Altmann 2005, 793)
Since the variables in the above equation are separated (Wimmer, Altmann 2005, 793), a general solution of the equation can be found for c_1 = 1 (Wimmer, Altmann 2005, 793):

\[
y = C e^{a_0 x} \prod_{i=1}^{k_1} (x - b_{1i})^{a_{1i}} \cdot \exp\left( \sum_{j \geq 2} \sum_{i=1}^{k_j} \frac{a_{ji}}{1 - c_j} \, (x - b_{ji})^{1 - c_j} \right) + d
\]
For the discrete approach, Wimmer and Altmann present the equation (Wimmer, Altmann 2005, 797):

\[
\frac{\Delta P_{x-1}}{P_{x-1}} = a_0 + \sum_{i=1}^{k_1} \frac{a_{1i}}{(x - b_{1i})^{c_1}} + \sum_{i=1}^{k_2} \frac{a_{2i}}{(x - b_{2i})^{c_2}} + \ldots
\]

where {P_0, P_1, . . .} are probability mass functions (Wimmer, Altmann 2005, 797).
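The reconstruction of the general solution can be checked numerically. The following Python sketch (ours; the parameter values are arbitrary illustrations, not Wimmer and Altmann's) takes the special case of one term with c_1 = 1 and one additional term with c_2 = 2 and verifies by central differences that the closed-form y satisfies the differential equation:

```python
import math

# Illustrative parameter values (our own choice, not from the source).
C, a0, a11, b11, a21, b21, d = 2.0, 0.1, 1.5, 0.0, 0.3, 0.5, 1.0

def y(x):
    # Closed-form solution for c1 = 1 and a single j = 2 term with c2 = 2.
    return C * math.exp(a0 * x) * (x - b11) ** a11 \
        * math.exp(a21 / (1 - 2) * (x - b21) ** (1 - 2)) + d

def rhs(x):
    # Right-hand side of the unified differential equation.
    return a0 + a11 / (x - b11) ** 1 + a21 / (x - b21) ** 2

h = 1e-6
for x0 in (2.0, 3.0, 4.5):
    dydx = (y(x0 + h) - y(x0 - h)) / (2 * h)   # central difference
    assert abs(dydx / (y(x0) - d) - rhs(x0)) < 1e-5
```

The check confirms, term by term, that dy/(y − d) equals the sum of the a-terms on the right-hand side for this special case.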
Appendix 17 Haken's Equations of Self-organization

Hermann Haken describes the process of self-organization using two differential equations (Haken 1978, 194):

\[
\dot{q}_1 = -\gamma_1 q_1 - a q_1 q_2
\]
\[
\dot{q}_2 = -\gamma_2 q_2 + b q_1^2
\]

where q_1 represents the order parameter that enslaves the subsystem q_2. The values γ_1 and γ_2 correspond to damping constants, and a and b are further parameters. Haken states directly:

(. . .) when dealing with complex systems, q1 describes the degree of order. This is the reason why we shall refer to q1 as "order parameter". In general we shall call variables, or, more physically spoken, modes "order parameters" if they slave subsystems. (Haken 1978, 195)

If we further put \dot{q}_2 = 0, then we can write (Haken 1978, 195):

\[
q_2(t) \approx \frac{b q_1^2(t)}{\gamma_2}
\]

If we substitute this solution into the first equation, we obtain (Haken 1978, 195):

\[
\dot{q}_1 = -\gamma_1 q_1 - \frac{a b q_1^3}{\gamma_2}
\]

where for γ_1 < 0 there is a steady-state solution (Haken 1978, 195):

\[
q_1 = \pm \left( \frac{|\gamma_1| \gamma_2}{a b} \right)^{1/2}
\]

Since the above corresponds only to two levels of the system (the order parameter enslaves a subsystem), Haken generalizes to n levels of the system as follows (Haken 1978, 195):

\[
\dot{q}_1 = -\gamma_1 q_1 + g_1(q_1, \ldots, q_n)
\]
\[
\dot{q}_2 = -\gamma_2 q_2 + g_2(q_1, \ldots, q_n)
\]
\[
\vdots
\]
\[
\dot{q}_n = -\gamma_n q_n + g_n(q_1, \ldots, q_n)
\]

where g_j(q_1, . . ., q_n) are nonlinear functions (Haken 1978, 196).
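Haken's two-level system can be integrated numerically. The following Python sketch (ours; the parameter values are illustrative, not Haken's) uses explicit Euler steps and checks that, for γ_1 < 0, the order parameter settles at the steady state ±(|γ_1|γ_2/ab)^(1/2) while the enslaved mode follows q_2 ≈ b q_1²/γ_2:

```python
import math

# Illustrative parameters (our own choice): gamma1 < 0 destabilizes q1,
# a large gamma2 > 0 makes q2 a fast, enslaved mode.
gamma1, gamma2, a, b = -1.0, 10.0, 1.0, 1.0

q1, q2 = 0.1, 0.0
dt = 0.001
for _ in range(int(30 / dt)):          # integrate to t = 30 by explicit Euler
    dq1 = -gamma1 * q1 - a * q1 * q2
    dq2 = -gamma2 * q2 + b * q1 ** 2
    q1 += dt * dq1
    q2 += dt * dq2

# Steady state predicted by the slaving approximation: sqrt(|g1| g2 / (a b)).
q1_star = math.sqrt(abs(gamma1) * gamma2 / (a * b))
assert abs(q1 - q1_star) < 1e-3        # order parameter at its fixed point
assert abs(q2 - b * q1 ** 2 / gamma2) < 1e-3   # q2 enslaved: q2 = b q1^2 / g2
```

Starting from a small fluctuation, q_1 grows, q_2 follows it adiabatically, and the cubic term saturates the growth at exactly the steady-state value derived above.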
Appendix 18 Hřebíček's Derivation of Menzerath-Altmann's Law

Hřebíček (1994) introduced a new procedure for deriving the Menzerath-Altmann law (MAL), which invokes the isomorphism between the mathematical expression for the fractal dimension (specifically for the Koch curve) and the MAL (Hřebíček 1994, 85–86). In the mathematical expression of a fractal dimension:

\[
D = \frac{\log N}{\log \frac{1}{r}}
\]

where N expresses the number of parts of the whole and r the similarity ratio, Hřebíček relates N to the length of the language construct x and r to the length of the constituent y (Hřebíček 1994, 85). Then he rewrites the previous mathematical relation into the following form (Hřebíček 1994, 85):

\[
D = \frac{\log x}{\log \frac{1}{y}} = -\frac{\log x}{\log y}
\]

He then modifies this equation into the form (Hřebíček 1994, 85):

\[
\log y = -\frac{1}{D} \log x
\]

Hřebíček then adds the logarithmized parameter A to this relation and substitutes b for 1/D (Hřebíček 1994, 85–86):

\[
\log y = -b \log x + \log A
\]

The last equation is identical to the MAL equation:

\[
y = A x^{-b}
\]

Hřebíček points out that this fact can be interpreted in two ways: it can refer to a mere structural homogeneity of both formulas, or it can mean the real identity of N ≡ x and r ≡ y, which would mean that these identities are:

(. . .) in agreement with reality and thus the derived law is a real consequence of the fractal structure existing in language. (Hřebíček 1994, 86)
Hřebíček further considers that the fractal dimension D = 1=b could be understood as an invariant of the process of transition between the levels of constructs and constituents in language (Hřebíček 1994, 86).
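The identification can be verified in a few lines. In the following Python sketch (ours), we compute the fractal dimension of the Koch curve (N = 4, r = 1/3) and check that, under the identification x ≡ N and y ≡ r, the MAL form y = x^(−1/D) (i.e. b = 1/D with A = 1) returns exactly the similarity ratio r:

```python
import math

# Koch curve: N = 4 self-similar parts, similarity ratio r = 1/3.
N, r = 4, 1 / 3
D = math.log(N) / math.log(1 / r)   # fractal dimension, approx. 1.2619

# Hřebíček's identification x = N, y = r gives log y = -(1/D) log x,
# i.e. the MAL form y = A x^(-b) with b = 1/D and A = 1.
b = 1 / D
y_mal = N ** (-b)                   # MAL prediction for the constituent length
assert abs(y_mal - r) < 1e-12       # recovers the similarity ratio exactly
```

The round trip is exact by construction: x^(−1/D) = exp(−log x · log(1/r)/log x) = r, which is precisely the structural homogeneity Hřebíček points to.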
Appendix 19 Hempel's Functional Analysis

Hempel (1965) builds functional analysis in three steps, in which he gradually eliminates individual problems. First (1) he points to the fallacy of affirming the consequent, then (2) he points to the problem of the "weak explanandum", and finally (3) to the problem of the difference between conditional and categorical predictions. We will demonstrate these steps schematically:

1. Fallacy of affirming the consequent (Hempel 1965, 310)
a) At time t, system s functions adequately in setting of kind c (specific internal and external conditions)
b) s functions adequately in a setting of kind c only if a certain necessary condition, n, is satisfied
c) If trait i were present in s then, as an effect, condition n would be satisfied
d) (Hence), at t, trait i is present in s

In case (1), the fallacy is realized by the highlighted part of premise (c): trait i is merely sufficient for condition n, so the presence of i cannot be inferred from n. Therefore, Hempel decides to transform premise (c) into the form we find below in (2).476

2. Weak Explanandum (Hempel 1965, 313)
a) At time t, system s functions adequately in setting of kind c (specific internal and external conditions)
b) s functions adequately in a setting of kind c only if requirement n is satisfied
c΄) I is the class of empirically sufficient conditions for n, in the context determined by s and c; and I is not empty
d΄) Some one of the items included in I is present in s at t

In case (2) we can correctly deduce only the weak explanandum (d΄). The deletion of premise (a) leads to the strengthening of the explanandum, as shown in (3).

3. Conditional predictions (Hempel 1965, 316)
a) At time t, system s functions adequately in setting of kind c (specific internal and external conditions)
b) s functions adequately in a setting of kind c only if requirement n is satisfied
476 There is also a change in premise (b), but this change is only terminological, not substantive.
c΄) I is the class of empirically sufficient conditions for n, in the context determined by s and c; and I is not empty
d΄΄) If s functions adequately in a setting of kind c at time t, then some one of the items included in I is present in s at t

In this form, as a conditional functional prediction of the strengthened explanandum (d΄΄), or rather the predicandum (d΄΄), functional analysis has solved its fundamental problems. However, it is a conditional functional prediction rather than an explanation (although the symmetry thesis would allow us to speak of a conditional explanation, Hempel does not choose this term). Hempel points out that if we want to upgrade the conditional prediction to a categorical prediction, then we need something extra:

(. . .) whenever functionalist analysis is to serve as a basis for categorical prediction (. . .), it is of crucial importance to establish appropriate hypotheses of self-regulation in an objectively testable form. (Hempel 1965, 317)
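The invalidity of schema (1) can be checked mechanically. The following Python sketch (ours) enumerates all truth-value assignments to i and n: the premises "i → n" and "n" admit an assignment in which the conclusion "i" fails, whereas modus ponens admits no such counterexample:

```python
from itertools import product

def implies(p, q):
    # material conditional: p -> q
    return (not p) or q

# Affirming the consequent: from (i -> n) and n, infer i.
counterexamples = [(i, n) for i, n in product([False, True], repeat=2)
                   if implies(i, n) and n and not i]
# One assignment makes both premises true and the conclusion false:
assert counterexamples == [(False, True)]   # the inference is invalid

# Modus ponens for contrast: from (i -> n) and i, infer n.
assert not [(i, n) for i, n in product([False, True], repeat=2)
            if implies(i, n) and i and not n]   # no counterexample: valid
```

The single counterexample (i false, n true) is exactly Hempel's point: the necessary condition n may be satisfied by some functional equivalent of the trait i, so i itself cannot be deduced.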
Appendix 20 Relation Between Power Law, Self-similarity and Minimization

Caldarelli (2007, 92–93) nicely illustrates how to understand the relation between fractal self-similarity and minimization with an example of a simple optimization problem: supplying water from a source to a certain number of customers. The basic requirement here is to minimize the amount of "pipeline" used. However, as Caldarelli notes, we can complicate the task by adding a second requirement: all customers should be as close to the source as possible (Caldarelli 2007, 92). These two requirements pull against each other to some extent, which the following diagram (Figure 10) illustrates well:
Figure 10: Optimization Task (Caldarelli 2007, 93).
In case A, we have met the requirement for a close connection, but the number of "pipelines" used is at its maximum. To quote Caldarelli: "Maximum of local benefit (. . .) but poor global optimization." (Caldarelli 2007, 93). In case B, we have minimized the number of "pipelines" used, but the local benefit is very low for those users far along the chain. Case C is an optimal solution: it minimizes the cost of the system (the cost function) – as in case B, we need 12 "pipelines" – and at the same time meets the requirement for proximity of connection better than case B. At the same time, we see that the optimal solution C exhibits self-similarity (Caldarelli 2007, 92–93).
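The trade-off can be reproduced computationally. The following Python sketch (ours; the grid layout is our own toy geometry, not Caldarelli's figure) compares a star network (case A) with a minimum spanning tree (which, like case C, economizes on pipe): the star minimizes every customer's network distance to the source but wastes pipe, while the MST minimizes total pipe length at the cost of proximity:

```python
import math
from collections import defaultdict

# Toy geometry (our own assumption): a source at the origin and 12 customers
# on a small grid.
source = (0, 0)
customers = [(x, y) for x in range(1, 5) for y in range(3)]
nodes = [source] + customers

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def tree_costs(edges):
    """Total pipe length and total network distance of customers from source."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    d, stack = {source: 0.0}, [source]   # distances along the tree (DFS)
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in d:
                d[v] = d[u] + dist(u, v)
                stack.append(v)
    return sum(dist(u, v) for u, v in edges), sum(d[c] for c in customers)

# Case A: star – every customer gets its own direct pipe to the source.
star = [(source, c) for c in customers]

# Minimum spanning tree via Prim's algorithm – minimal total pipe length.
in_tree, mst = {source}, []
while len(in_tree) < len(nodes):
    u, v = min(((u, v) for u in in_tree for v in nodes if v not in in_tree),
               key=lambda e: dist(*e))
    mst.append((u, v))
    in_tree.add(v)

star_pipe, star_dist = tree_costs(star)
mst_pipe, mst_dist = tree_costs(mst)

# The two requirements pull in opposite directions:
assert mst_pipe < star_pipe    # MST saves pipe; the star wastes it
assert star_dist < mst_dist    # the star keeps customers closer to the source
```

Both inequalities are guaranteed here: the star is itself a spanning tree, so the MST can never use more pipe, and no tree can give a customer a shorter network path than its direct Euclidean distance.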
Appendix 21 Altmann's Duality of the Method

The duality of approach in quantitative-linguistic research is confirmed by Altmann's dual definition of quantitative-linguistic practice (Altmann 1996). Altmann explicitly states that quantitative linguistics can be practiced inductively (the inductive approach) and deductively (the deductive approach). Altmann defines both approaches – but above all the inductive one – with absolute ease, which indicates his considerably detached view and a great deal of experience in working with language data. The inductive approach corresponds to what we have already described above (see subchapter 5.3.1) as Altmann's inductivism, with all the reservations we have raised against inductivism: ignoring the presence of theoretical entities (a) that lead us to define units (b) and (c), searching for a suitable empirical formula (d) and (e), and the unapproachableness of axioms (f). Altmann defines the points as follows:

(a) Do not care about what units and properties qualitative linguistics regards as relevant. It strongly selects and pursues its own aims.
(b) Define the unit in such a way that it can be identified and segmented. Do not forget that definitions are nothing but conventions, and that segmentation rules are nothing but operational definitions. The resulting entities are our constructs. (. . .)
(c) Define one or more properties of the unit, measure them and (. . .) "publish the results". Perhaps they will prompt other researchers to formulate hypotheses.
(d) Study the frequency distribution of the property under analysis, its relation(s) to other properties of the given unit – in case several of them have been defined –, to the subunits and to the superunits.
(e) Put up the first empirical formulas that describe the course of the property or the relations well. (. . .)
(f) Try to give theoretical reasons for these formulas, i.e. deduce them from theoretical assumptions. (Altmann 1996, 4–5)
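Steps (c)–(e) of the inductive approach can be mimicked on synthetic data. The following Python sketch (ours; the data are artificial, generated to lie exactly on a power law) recovers the exponent of a rank-frequency relation by least squares on the logarithms – the "first empirical formula" of step (e):

```python
import math

# Hypothetical rank-frequency data lying exactly on a power law f(r) = C r^(-a).
C_true, a_true = 100.0, 0.9
ranks = range(1, 51)
freqs = [C_true * r ** (-a_true) for r in ranks]

# Fit the empirical formula log f = log C - a log r by ordinary least squares.
xs = [math.log(r) for r in ranks]
ys = [math.log(f) for f in freqs]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
    / sum((x - mx) ** 2 for x in xs)
a_hat = -slope
assert abs(a_hat - a_true) < 1e-9   # exponent recovered from the data
```

On real corpus frequencies the fit would of course be approximate; the exactness here only illustrates that the fitting step itself is purely descriptive – the theoretical reasons demanded in step (f) are a separate matter.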
Altmann identifies the deductive strategy with Köhler's system-theoretical version of quantitative linguistics, which ultimately enables him to apply the method of sophisticated falsificationism. Altmann defines it as follows:

(a) Start out from the assumption that language is a self-regulating system, where all properties are connected with each other (. . .), that language is organized according to the needs of the members of the language community, that numerous processes take place in language, and that language forms a functionally interwoven whole (. . .). This assumption makes it possible to draw on systems theory, synergetics, which is necessary because there has not yet been any other theory of language worth mentioning.
(b) Choose any property and conclude from its values what the results for one or several other properties – its functional equivalents – must be.
(c) Measure all qualities concerned and find out if the postulated bond between them really exists. Obviously one has to use statistical tests here, whose results have to be interpreted. (. . .)
(d) In the positive case, one deals with a candidate for a law-like hypothesis, and should try to find a place for this fact in Köhler's system. That is, one should try to include the hypothesis into a system of law-like hypotheses.
(e) In the negative case, check, if the assumptions are correct, or if there is another highly correlated factor that has not been taken into account yet, if the consequence has been deduced correctly, if the quantification and the measurement of the property were adequate, if the statistical test has been interpreted correctly, etc. In other words, do not give up right away, but stick to the hypothesis until it is ultimately refuted by other evidence (. . .). (Altmann 1996, 5)
The paradox is that Altmann considers the deductive path to be significantly more uncertain than the inductive one, without at the same time noting that this is – as Popper and Lakatos state – an advantage: The deductive way is much more insecure than the inductive one, because at every step, one is exposed to the danger of falsification. It is only at point (f) that the inductive way may be misleading, yet hardly less adventurously than the results of point (b) in the deductive procedure might be. (Altmann 1996, 6)
However, he then seems to connect the true search for theory and explanation with the deductive strategy: If the only aim is the determination and description of the status quo or of the existing tendencies, then a few inductive steps are sufficient in order to achieve valuable results. If the aim is a theory, then deductive work is unavoidable (. . .). (Altmann 1996, 6)
The questions arise: Where does the unified approach belong (or where will it belong)? Why are so many quantitative linguists attracted to the inductive way? Is this state of affairs (the popularity of inductivism) compatible with Altmann's creed that "(. . .) only in quantitative linguistics grammatical or other rules may acquire a theoretical status" (Altmann 1996, 6)?
References

Aitchison, John & James Alexander Campbell Brown. 1954. On Criteria for Descriptions of Income Distribution. Metroeconomica 6 (3). 88–107.
Altmann, Gabriel. 1978. Towards a theory of language. In Gabriel Altmann (ed.), Glottometrika (vol. 1), 1–25. Bochum: Brockmeyer.
Altmann, Gabriel. 1980. Prolegomena to Menzerath's Law. In Reinhard Köhler (ed.), Glottometrika (vol. 2), 1–10. Bochum: Brockmeyer.
Altmann, Gabriel. 1981. Zur Funktionalanalyse in der Linguistik. In Jürgen Esser & Axel Hübler (eds.), Forms and Functions, 25–32. Tübingen: Narr.
Altmann, Gabriel. 1993. Phoneme counts. In Gabriel Altmann (ed.), Glottometrika (vol. 14), 54–68. Bochum: Brockmeyer.
Altmann, Gabriel. 1996. The Nature of Linguistic Units. Journal of Quantitative Linguistics 3 (1). 1–7.
Altmann, Gabriel & Violetta Burdinski. 1982. Towards a Law of Word Repetitions in Text-Blocks. In Werner Lehfeldt & Udo Strauss (eds.), Glottometrika (vol. 4), 147–167. Bochum: Brockmeyer.
Altmann, Gabriel & Bernhard Kind. 1983. Ein Semantisches Gesetz. In Reinhard Köhler & Joachim Boy (eds.), Glottometrika (vol. 5), 1–13. Bochum: Brockmeyer.
Altmann, Gabriel & Reinhard Köhler. 1995. "Language Forces" and Synergetic Modelling of Language Phenomena. In Peter Schmidt (ed.), Glottometrika (vol. 15), 62–76. Bochum: Brockmeyer.
Altmann, Gabriel, Michael H. Schwibbe & Werner Kaumanns (eds.). 1989. Das Menzerathsche Gesetz in informationsverarbeitenden Systemen. Hildesheim: Georg Olms Verlag.
Altmann, Gabriel, Haro von Buttlar, Walter Rott & Udo Strauss. 1983. A Law of Change in Language. In Barron Brainerd (ed.), Historical Linguistics, 104–115. Bochum: Brockmeyer.
Andres, Jan. 2009. On de Saussure's principle of linearity and visualization of language structures. Glottotheory 2 (2). 1–14.
Andres, Jan. 2010. On a Conjecture about the Fractal Structure of Language. Journal of Quantitative Linguistics 17 (2). 101–122.
Andres, Jan & Martina Benešová. 2012. Fractal analysis of Poe's Raven II. Journal of Quantitative Linguistics 19 (4). 301–324.
Andres, Jan, Jiří Langer & Vladimír Matlach. 2020. Fractal-based analysis of sign language. Communications in Nonlinear Science and Numerical Simulation 84 (105214). 1–14.
Andres, Jan & Miroslav Rypka. 2013. Vizualization of Hyperfractals. International Journal of Bifurcation and Chaos 23 (10). 1–12.
Ayer, Alfred Jules (ed.). 1959. Logical Positivism. New York: The Free Press.
Baggott, Jim. 2012. Higgs: The Invention and Discovery of the "God Particle". Oxford: Oxford University Press.
Baggott, Jim. 2013. Farewell to Reality: How Modern Physics Has Betrayed the Search for Scientific Truth. New York: Pegasus Books.
Bain, Jonathan. 2013. Effective Field Theories. In Robert Batterman (ed.), The Oxford Handbook of the Philosophy of Physics, 224–254. Oxford: Oxford University Press.
Baker, Alan. 2005. Are there Genuine Mathematical Explanations of Physical Phenomena? Mind 114 (454). 223–238.
Balasubrahmanyan, Viddhachalam K. & Sundaresan Naranan. 1996. Quantitative linguistics and Complex System Studies. Journal of Quantitative Linguistics 3 (3). 177–228.
Bangu, Sorin. 2013. Symmetry. In Robert Batterman (ed.), The Oxford Handbook of Philosophy of Physics, 287–317. Oxford: Oxford University Press.
https://doi.org/10.1515/9783110712759-008
Bangu, Sorin. 2017. Review of the book Because Without Cause: Non-Causal Explanations in Science and Mathematics, by Marc Lange. British Journal for the Philosophy of Science. BJPS Review of Books. http://www.thebsps.org/reviewofbooks/marc-lange-because-without-cause/ (accessed 26 November 2022)
Barbieri, Marcello. 2015. Code Biology: A New Science of Life. Dordrecht: Springer.
Barnsley, Michael Fielding. 2006. SuperFractals: Patterns of Nature. Cambridge: Cambridge University Press.
Batterman, Robert. 2011. Emergence, Singularities, and Symmetry Breaking. Foundations of Physics 41 (6). 1031–1050.
Batterman, Robert. 2013. The Tyranny of Scales. In Robert Batterman (ed.), The Oxford Handbook of Philosophy of Physics, 255–286. Oxford: Oxford University Press.
Beneš, Martin. 2015. Máme se vzdát dichotomického pohledu na „jazyk“? Studie z aplikované lingvistiky 6 (2). 181–191.
Benešová, Martina, Dan Faltýnek & Lukáš Zámečník. 2015. Menzerath-Altmann Law in Differently Segmented Texts. In Arjuna Tuzzi, Martina Benešová & Ján Mačutek (eds.), Recent Contributions in Quantitative Linguistics, 27–40. Berlin: De Gruyter Mouton.
Benešová, Martina, Dan Faltýnek & Lukáš Zámečník. 2018. Functional Explanation in Synergetic Linguistics. In Lu Wang, Reinhard Köhler & Arjuna Tuzzi (eds.), Structure, Function and Process in Texts, 15–24. Lüdenscheid: RAM-Verlag.
Best, Karl-Heinz & Gabriel Altmann. 2007. Gustav Herdan (1897–1968). Glottometrics 15. 92–96.
Binet, Laurent. 2017. The 7th Function of Language. New York: Farrar, Straus & Giroux.
Blin-Stoyle, Roger John. 1964. Radio broadcast of March 1964 and Private Communication. In Gustav Herdan. 1966. The Advanced Theory of Language as Choice and Chance, 432–435. Berlin: Springer-Verlag.
Bliss, Ricki Leigh & Kelly Trogdon. 2014. Metaphysical Grounding. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/grounding/ (last modified 6 December 2021)
Boghossian, Paul Artin. 2006. The Fear of Knowledge: Against Relativism and Constructivism. Oxford: Clarendon Press.
Bokulich, Alisa. 2018. Searching for Non-Causal Explanations in a Sea of Causes. In Alexander Reutlinger & Juha Saatsi (eds.), Explanation Beyond Causation, 141–163. Oxford: Oxford University Press.
Boole, George. 1854. An Investigation of the Laws of Thought on Which are Founded the Mathematical Theories of Logic and Probabilities. London: Walton and Maberly.
Bromberger, Sylvain. 1966. Why-questions. In Robert Colodny (ed.), Mind and Cosmos, 86–111. Pittsburgh: University of Pittsburgh Press.
Bunge, Mario. 1945. Neutron-proton scattering at 8.8 and 13 MeV. Nature 156. 301.
Bunge, Mario. 1959. Causality: The Place of the Causal Principle in Modern Science. Cambridge: Harvard University Press.
Bunge, Mario. 1960. Antología semántica. Buenos Aires: Nueva Visión.
Bunge, Mario. 1967. Scientific Research I: The Search for System. New York: Springer.
Bunge, Mario. 1995a. Quality, Quantity, Pseudo-quantity and Measurement in Social Science. Journal of Quantitative Linguistics 2 (1). 1–10.
Bunge, Mario. 1995b. Causality and Probability in Linguistics. A Comment on “Informational Measures of Causality” by Juhan Tuldava. Journal of Quantitative Linguistics 2 (1). 15–16.
Butler, Christopher. 2003. Structure and Function: A Guide to Three Major Structural-Functional Theories. Amsterdam: John Benjamins.
Caldarelli, Guido. 2007. Scale-Free Networks: Complex Webs in Nature and Technology. Oxford: Oxford University Press.
Carnap, Rudolf. 1934. Logische Syntax der Sprache. Vienna: Springer.
Carnap, Rudolf. 1950. Logical Foundations of Probability. Chicago: The University of Chicago Press.
Carnap, Rudolf. 1959. The Elimination of Metaphysics through Logical Analysis of Language. In Alfred Jules Ayer (ed.), Logical Positivism, 60–81. New York: The Free Press.
Carroll, John. 2016. Laws of Nature. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/laws-of-nature/ (last modified 16 November 2020)
Cartwright, Nancy. 1983. How the Laws of Physics Lie. Oxford: Clarendon Press.
Cartwright, Nancy. 1999. The Dappled World: A Study of the Boundaries of Science. Cambridge: Cambridge University Press.
Čech, Radek. 2017. Jazykověda bez langue. Odpověď Martinu Benešovi. Studie z aplikované lingvistiky 8 (1). 103–110.
Chomsky, Noam. 1969. Aspects of the Theory of Syntax. Cambridge: The MIT Press.
Chomsky, Noam. 1970. Current Issues in Linguistic Theory. The Hague: Mouton.
Chomsky, Noam. 2002. Syntactic Structures. Berlin: De Gruyter Mouton.
Chomsky, Noam. 2005. Three Factors in Language Design. Linguistic Inquiry 36 (1). 1–22.
Chomsky, Noam. 2015. The Minimalist Program. Cambridge: The MIT Press.
Cohen, Jack & Ian Stewart. 1994. The Collapse of Chaos: Discovering Simplicity in a Complex World. New York: Viking.
Craver, Carl. 2006. When Mechanistic Models Explain. Synthese 153 (3). 355–376.
Craver, Carl. 2013. Functions and Mechanisms: A Perspectivalist Account. In Philippe Huneman (ed.), Functions: Selection and Mechanisms, 133–158. Dordrecht: Springer.
Craver, Carl & Lindley Darden. 2013. In Search of Mechanisms: Discovery across the Life Sciences. Chicago: The University of Chicago Press.
Craver, Carl & James Tabery. 2015. Mechanisms in Science. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/science-mechanisms/ (last modified 18 November 2015)
Cvek, Boris. 2015. Instrumentality of Knowledge: Instrumentalism in Philosophy of Science. Ph.D. Thesis, Charles University in Prague. https://is.cuni.cz/webapps/zzp/detail/84799/?lang=en (accessed 25 August 2021)
Daston, Lorraine & Peter Galison. 2007. Objectivity. New York: Zone Books.
Davidson, Donald. 1974. On the Very Idea of a Conceptual Scheme. Proceedings and Addresses of the American Philosophical Association 47. 5–20.
Davidson, Donald. 2001. Subjective, Intersubjective, Objective. Oxford: Oxford University Press.
de Regt, Henk, Sabina Leonelli & Kai Eigner (eds.). 2009. Scientific Understanding: Philosophical Perspectives. Pittsburgh: University of Pittsburgh Press.
de Saussure, Ferdinand. 1971. Cours de linguistique générale. Paris: Payot.
de Saussure, Ferdinand. 2011. Course in General Linguistics. New York: Columbia University Press.
Deutsch, David. 2011. The Beginning of Infinity: Explanations that Transform the World. London: Penguin Books.
Egré, Paul. 2015. Explanation in Linguistics. Philosophy Compass 10 (7). 451–462.
Emmeche, Claus & Kalevi Kull (eds.). 2011. Towards a Semiotic Biology: Life is the Action of Signs. London: Imperial College Press.
Everett, Daniel. 2005. Cultural Constraints on Grammar and Cognition in Pirahã. Current Anthropology 46 (4). 621–646.
Ferrer-i-Cancho, Ramon & Ricard V. Solé. 2001a. Two Regimes in the Frequency of Words and the Origins of Complex Lexicons: Zipf’s Law Revisited. Journal of Quantitative Linguistics 8 (3). 165–174.
Ferrer-i-Cancho, Ramon & Ricard V. Solé. 2001b. The Small World of Human Language. Proceedings of the Royal Society of London B 268 (1482). 2261–2265.
Ferrer-i-Cancho, Ramon & Ricard V. Solé. 2003. Optimisation in Complex Networks. Lecture Notes in Physics 625. 114–126.
Feynman, Richard, Robert Leighton & Matthew Sands. 2010. The Feynman Lectures on Physics (vol. 2). New York: Basic Books.
Fine, Kit. 2001. The Question of Realism. Philosophers’ Imprint 1 (1). 1–30.
Foucault, Michel. 2002. The Order of Things: An Archaeology of the Human Sciences. London: Routledge.
Freeman, Walter. 1988. Strange Attractors Govern Mammalian Brain Dynamics, Shown by Trajectories of Electroencephalographic (EEG) Potential. IEEE Transactions on Circuits & Systems 35 (7). 781–783.
French, Steven. 2014. The Structure of the World: Metaphysics and Representation. Oxford: Oxford University Press.
Friedman, Michael. 1974. Explanation and Scientific Understanding. Journal of Philosophy 71 (1). 5–19.
Frigg, Roman. 2018. Theory and Observation. Presentation at the workshop “Representation in Science”, Prague, 28 May 2018. http://stream.flu.cas.cz/media/roman-frigg-theory-and-observation (accessed 25 August 2021)
Frigg, Roman & James Nguyen. 2016. The Fiction View of Models Reloaded. The Monist 99 (3). 225–242.
Galison, Peter. 2003. Einstein’s Clocks and Poincaré’s Maps: Empires of Time. London: W. W. Norton & Company.
Garson, Justin. 2008. Function and Teleology. In Sahotra Sarkar & Anya Plutynski (eds.), A Blackwell Companion to the Philosophy of Biology, 525–549. Oxford: Blackwell Publishing.
Garson, Justin. 2013. The Functional Sense of Mechanism. Philosophy of Science 80 (3). 317–333.
Giere, Ronald. 1999. Science without Laws. Chicago: The University of Chicago Press.
Giere, Ronald. 2002. Models as Parts of Distributed Cognitive Systems. In Lorenzo Magnani & Nancy J. Nersessian (eds.), Model-based Reasoning: Science, Technology, Values, 227–242. New York: Kluwer Academic.
Giere, Ronald. 2004. How Models Are Used to Represent Reality. Philosophy of Science 71 (5). 742–752.
Giere, Ronald. 2006. Scientific Perspectivism. Chicago: The University of Chicago Press.
Givón, Talmy. 1979. On Understanding Grammar. New York: Academic Press.
Glennan, Stuart. 2016. Mechanisms and Mechanical Philosophy. In Paul Humphreys (ed.), The Oxford Handbook of Philosophy of Science, 796–816. Oxford: Oxford University Press.
Godfrey-Smith, Peter. 2009. Abstractions, Idealizations, and Evolutionary Biology. In Anouk Barberousse, Michel Morange & Thomas Pradeu (eds.), Mapping the Future of Biology: Evolving Concepts and Theories, 47–56. Boston: Springer.
Griffiths, Paul. 1993. Functional Analysis and Proper Function. British Journal for the Philosophy of Science 44 (3). 409–422.
Gross, Jonathan L. & Thomas W. Tucker. 1987. Topological Graph Theory. New York: John Wiley & Sons.
Grzybek, Peter. 2006. Introductory Remarks: On the Science of Language in Light of the Language of Science. In Peter Grzybek (ed.), Contributions to the Science of Text and Language: Word Length Studies and Related Issues, 1–14. Dordrecht: Springer.
Haken, Hermann. 1978. Synergetics: An Introduction. Nonequilibrium Phase Transitions and Self-Organization in Physics, Chemistry and Biology. Berlin: Springer.
Halvorson, Hans. 2016. Scientific Theories. In Paul Humphreys (ed.), The Oxford Handbook of Philosophy of Science, 585–608. Oxford: Oxford University Press.
Hammerl, Rolf. 1990. Zum Aufbau eines dynamischen Lexikmodells – dynamische Mikro- und Makroprozesse der Lexik. In Luděk Hřebíček (ed.), Glottometrika (vol. 11), 19–40. Bochum: Brockmeyer.
Hammerl, Rolf. 1991. Untersuchungen zur Struktur der Lexik: Aufbau eines lexikalischen Basismodells. Trier: WVT.
Hammerl, Rolf & Jaroslaw Maj. 1989. Ein Beitrag zu Köhler’s Modell der sprachlichen Selbstregulation. In Rolf Hammerl (ed.), Glottometrika (vol. 10), 1–31. Bochum: Brockmeyer.
Hanegraaff, Wouter. 1996. New Age Religion and Western Culture: Esotericism in the Mirror of Secular Thought. Leiden: E. J. Brill.
Hanson, Norwood Russell. 1958. Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science. Cambridge: Cambridge University Press.
Hanson, Norwood Russell. 1960. The Mathematical Power of Epicyclical Astronomy. Isis 51 (2). 150–158.
Haspelmath, Martin. 1999. Optimality and Diachronic Adaptation. Zeitschrift für Sprachwissenschaft 18 (2). 180–205.
Haspelmath, Martin. 2004. Does Linguistic Explanation Presuppose Linguistic Description? Studies in Language 28 (3). 554–579.
Heim, Irene & Angelika Kratzer. 1998. Semantics in Generative Grammar. Malden: Wiley Blackwell.
Hempel, Carl Gustav. 1965. Aspects of Scientific Explanation and other Essays in the Philosophy of Science. New York: The Free Press.
Hempel, Carl Gustav & Paul Oppenheim. 1948. Studies in the Logic of Explanation. Philosophy of Science 15 (2). 135–175.
Herdan, Gustav. 1966. The Advanced Theory of Language as Choice and Chance. Berlin: Springer-Verlag.
Hilberg, Wolfgang. 2004. Some Results of Quantitative Linguistics Derived from a Structural Language Model. Glottometrics 7. 1–24.
Hjelmslev, Louis. 1969. Prolegomena to a Theory of Language. Madison: The University of Wisconsin Press.
Hřebíček, Luděk. 1992. Text in Communication: Supra-Sentence Structures. Bochum: Brockmeyer.
Hřebíček, Luděk. 1993. Text as a Construct of Aggregations. In Reinhard Köhler & Burghard B. Rieger (eds.), Contributions to Quantitative Linguistics, 33–39. Dordrecht: Kluwer.
Hřebíček, Luděk. 1994. Fractals in Language. Journal of Quantitative Linguistics 1 (1). 82–86.
Hřebíček, Luděk. 1995. Text Levels: Language Constructs, Constituents and the Menzerath-Altmann Law. Trier: WVT.
Hřebíček, Luděk. 1999. Principle of Emergence in Text and Linguistics. Journal of Quantitative Linguistics 6 (1). 41–45.
Hřebíček, Luděk. 2002a. The Elements of Symmetry in Text Structures. Glottometrics 2. 17–33.
Hřebíček, Luděk. 2002b. Vyprávění o lingvistických experimentech s textem. Praha: Academia.
Hřebíček, Luděk. 2003. Some Aspects of Power Law. Glottometrics 6. 1–8.
Huneman, Philippe. 2010. Topological Explanations and Robustness in Biological Sciences. Synthese 177 (2). 213–245.
Huneman, Philippe. 2018. Diversifying the Picture of Explanations in Biological Sciences: Ways of Combining Topology with Mechanisms. Synthese 195 (1). 115–146.
Itkonen, Esa. 1983. Causality in Linguistic Theory. London: Croom Helm.
Itkonen, Esa. 2005. Analogy as Structure and Process. Amsterdam: John Benjamins.
Jakobson, Roman. 1962. The Concept of the Sound Law and the Teleological Criterion. In Roman Jakobson, Selected Writings 1, 1–2. ‘s-Gravenhage: Mouton.
Jansson, Lina & Juha Saatsi. 2019. Explanatory Abstractions. British Journal for the Philosophy of Science 70 (3). 817–844.
Kellert, Stephen. 1993. In the Wake of Chaos: Unpredictable Order in Dynamical Systems. Chicago: The University of Chicago Press.
Kellert, Stephen. 2008. Borrowed Knowledge: Chaos Theory and the Challenge of Learning across Disciplines. Chicago: The University of Chicago Press.
Khalifa, Kareem. 2017. Understanding, Explanation, and Scientific Knowledge. Cambridge: Cambridge University Press.
Khalifa, Kareem, Jared Millson & Mark Risjord. 2021. Inference, Explanation, and Asymmetry. Synthese 198 (4). 929–953.
Kim, Jaegwon. 2005. Physicalism or Something Near Enough. Princeton: Princeton University Press.
Kitcher, Philip. 1981. Explanatory Unification. Philosophy of Science 48 (4). 507–531.
Kitcher, Philip. 1993. The Advancement of Science: Science without Legend, Objectivity without Illusions. Oxford: Oxford University Press.
Köhler, Reinhard. 1984. Zur Interpretation des Menzerathschen Gesetzes. In Joachim Boy & Reinhard Köhler (eds.), Glottometrika (vol. 6), 177–183. Bochum: Brockmeyer.
Köhler, Reinhard. 1986. Zur linguistischen Synergetik: Struktur und Dynamik der Lexik. Bochum: Brockmeyer.
Köhler, Reinhard. 1987. System Theoretical Linguistics. Theoretical Linguistics 14 (2–3). 241–257.
Köhler, Reinhard. 1989. Das Menzerathsche Gesetz als Resultat des Sprachverarbeitungsmechanismus. In Gabriel Altmann & Michael H. Schwibbe (eds.), Das Menzerathsche Gesetz in informationsverarbeitenden Systemen, 108–112. Hildesheim: Georg Olms Verlag.
Köhler, Reinhard. 1990a. Linguistische Analyseebenen, Hierarchisierung und Erklärung im Modell der sprachlichen Selbstregulation. In Luděk Hřebíček (ed.), Glottometrika (vol. 11), 1–18. Bochum: Brockmeyer.
Köhler, Reinhard. 1990b. Zur Charakteristik dynamischer Modelle. In Luděk Hřebíček (ed.), Glottometrika (vol. 11), 41–46. Bochum: Brockmeyer.
Köhler, Reinhard. 1990c. Elemente der synergetischen Linguistik. In Rolf Hammerl (ed.), Glottometrika (vol. 12), 179–187. Bochum: Brockmeyer.
Köhler, Reinhard. 2005. Synergetic Linguistics. In Reinhard Köhler, Gabriel Altmann & Rajmund G. Piotrowski (eds.), Quantitative Linguistics: An International Handbook, 760–774. Berlin: Walter de Gruyter.
Köhler, Reinhard. 2012. Quantitative Syntax Analysis. Berlin: De Gruyter Mouton.
Kořenský, Jan. 1984. Konstrukce gramatiky ze sémantické báze. Praha: Academia.
Kořenský, Jan. 2014. Proměny myšlení o řeči na rozhraní tisíciletí. Olomouc: VUP.
Kostić, Daniel. 2019. Mathematical and Non-causal Explanations: An Introduction. Perspectives on Science 27 (1). 1–6.
Kostić, Daniel. 2020. General Theory of Topological Explanations and Explanatory Asymmetry. Philosophical Transactions of the Royal Society B: Biological Sciences 375 (1796). 1–8.
Krott, Andrea. 1996. Some Remarks on the Relation between Word Length and Morpheme Length. Journal of Quantitative Linguistics 3 (1). 29–37.
Krott, Andrea. 1999. The Influence of Morpheme Polysemy on Morpheme Frequency. Journal of Quantitative Linguistics 6 (1). 58–65.
Kuhn, Thomas Samuel. 1962. The Structure of Scientific Revolutions. Chicago: The University of Chicago Press.
Lacková, Ľudmila. 2018. The Prague School, Teleology and Language as a Dynamic System. Acta Structuralica 3 (1). 105–121.
Lacková, Ľudmila, Vladimír Matlach & Dan Faltýnek. 2017. Arbitrariness is Not Enough: Towards a Functional Approach to the Genetic Code. Theory in Biosciences 136 (3–4). 187–191.
Lakatos, Imre. 1978. The Methodology of Scientific Research Programmes. Cambridge: Cambridge University Press.
Lakatos, Imre. 1998. Science and Pseudoscience. In Martin Curd & Jan A. Cover (eds.), Philosophy of Science: The Central Issues, 20–26. New York: W. W. Norton & Company.
Lakoff, George. 1990. Women, Fire and Dangerous Things. Chicago: The University of Chicago Press.
Lange, Marc. 2017. Because Without Cause: Non-Causal Explanations in Science and Mathematics. Oxford: Oxford University Press.
Lange, Marc. 2021. Asymmetry as a Challenge to Counterfactual Accounts of Non-causal Explanation. Synthese 198. 3893–3918.
Lehfeldt, Werner & Gabriel Altmann. 2002. Padenie reducirovannykh v svete zakona P. Mencerata [The Fall of the Reduced Vowels in the Light of Menzerath’s Law]. Russian Linguistics 26 (3). 327–344.
Levin, Michael. 1977. Explanation and Prediction in Grammar (and Semantics). Midwest Studies in Philosophy 2 (1). 128–137.
Li, Wentian. 1992. Random Texts Exhibit Zipf’s-law-like Word Frequency Distribution. IEEE Transactions on Information Theory 38 (6). 1842–1845.
Lightfoot, David. 1999. The Development of Language: Acquisition, Change and Evolution. Oxford: Blackwell.
Lombardi, Olimpia, Alberto Cordero & Ana Rosa Pérez Ransanz. 2020. Philosophy of Science in Latin America. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/phil-science-latin-america/ (last modified 7 February 2020)
Mancosu, Paolo. 2018. Explanation in Mathematics. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/mathematics-explanation/ (last modified 16 March 2018)
Mandelbrot, Benoît. 1953. An Informational Theory of the Statistical Structure of Languages. In Willis Jackson (ed.), Communication Theory, 486–504. London: Butterworths.
Markoš, Anton. 2002. Readers of the Book of Life: Contextualizing Developmental Evolutionary Biology. Oxford: Oxford University Press.
Marshall, Michael (ed.). 2018. Human Origins: 7 Million Years and Counting. London: Quercus.
Matlach, Vladimír, Diego Krivochen & Jiří Milička. 2021. A Method for Comparison of General Sequences via Type-token Ratio. In Adam Pawłowski, Sheila Embleton, Ján Mačutek & George Mikros (eds.), Language and Text: Data, Models, Information and Applications, 38–53. Amsterdam: John Benjamins Publishing Company.
Matlach, Vladimír, Daniel Dostál & Marian Novotný. 2022. Secondary Structures of Proteins Follow Menzerath-Altmann Law. International Journal of Molecular Sciences 23 (3). 1569.
Matthews, Michael (ed.). 2019. Mario Bunge: A Centenary Festschrift. Springer Nature Switzerland.
Matthews, Michael. 2019. Mario Bunge: An Introduction to His Life, Work and Achievements. In Michael Matthews (ed.), Mario Bunge: A Centenary Festschrift, 1–28. Springer Nature Switzerland.
Maudlin, Tim. 2007. The Metaphysics within Physics. Oxford: Oxford University Press.
Maxwell, Grover. 1962. The Ontological Status of Theoretical Entities. In Martin Curd & Jan A. Cover (eds.). 1998. Philosophy of Science: The Central Issues, 1052–1063. New York: W. W. Norton & Company.
Maziarz, Mariusz & Martin Zach. 2020. Agent‐based Modelling for SARS‐CoV‐2 Epidemic Prediction and Intervention Assessment: A Methodological Appraisal. Journal of Evaluation in Clinical Practice 26 (5). 1352–1360.
Menzerath, Paul. 1954. Die Architektonik des deutschen Wortschatzes. Bonn: Dümmler.
Merton, Robert King. 1949. Social Theory and Social Structure. New York: Free Press.
Meyer, Peter. 2002. Laws and Theories in Quantitative Linguistics. Glottometrics 5. 62–80.
Milička, Jiří. 2014. Menzerath’s Law: The Whole is Greater than the Sum of its Parts. Journal of Quantitative Linguistics 21 (2). 85–99.
Mitzenmacher, Michael. 2004. A Brief History of Generative Models for Power Law and Lognormal Distributions. Internet Mathematics 1 (2). 226–251.
Monton, Bradley & Chad Mohler. 2017. Constructive Empiricism. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/constructive-empiricism/ (last modified 13 April 2021)
Morrison, Margaret. 2013. Unification in Physics. In Robert Batterman (ed.), The Oxford Handbook of Philosophy of Physics, 381–415. Oxford: Oxford University Press.
Morrison, Margaret. 2015. Reconstructing Reality: Models, Mathematics and Simulations. Oxford: Oxford University Press.
Nagel, Ernest. 1961. The Structure of Science. New York: Harcourt, Brace & World.
Naranan, Sundaresan & Viddhachalam K. Balasubrahmanyan. 1998. Models for Power Law Relations in Linguistics and Information Science. Journal of Quantitative Linguistics 5 (1–2). 35–61.
Naranan, Sundaresan & Viddhachalam K. Balasubrahmanyan. 2005. Power Laws in Statistical Linguistics and Related Systems. In Reinhard Köhler, Gabriel Altmann & Rajmund G. Piotrowski (eds.), Quantitative Linguistics: An International Handbook, 716–738. Berlin: Walter de Gruyter.
Newmeyer, Frederick. 1998. Language Form and Language Function. Cambridge: The MIT Press.
Newmeyer, Frederick. 2016. Formal and Functional Explanation. In Ian Roberts (ed.), The Oxford Handbook of Universal Grammar, 1–25. Oxford: Oxford University Press.
Nilsson, Dan-Eric, Eric Warrant, Sönke Johnsen, Roger Hanlon & Nadav Shashar. 2012. A Unique Advantage for Giant Eyes in Giant Squid. Current Biology 22 (8). 683–688.
Nöth, Winfried. 1990. Handbook of Semiotics. Bloomington: Indiana University Press.
Novella, Steven, Bob Novella, Cara Santa Maria, Jay Novella & Evan Bernstein. 2018. The Skeptics’ Guide to the Universe: How to Know What’s Really Real in a World Increasingly Full of Fake. New York: Grand Central Publishing.
Osolsobě, Ivo. 2003. A Source of Teleological Thinking for the Prague Linguistic Circle. In Marek Nekula (ed.), Prager Strukturalismus/Prague Structuralism, 121–134. Heidelberg: Carl Winter.
Papineau, David. 2012. Philosophical Devices: Proofs, Probabilities, Possibilities, and Sets. Oxford: Oxford University Press.
Partee, Barbara H., Alice ter Meulen & Robert E. Wall. 1993. Mathematical Methods in Linguistics. Dordrecht: Kluwer Academic Publishers.
Pateman, Trevor. 1985. Review of Causality in Linguistic Theory, by Esa Itkonen. Journal of Linguistics 21 (2). 481–487.
Peitgen, Heinz-Otto, Hartmut Jürgens & Dietmar Saupe. 2004. Chaos and Fractals: New Frontiers of Science. Berlin: Springer.
Penrose, Roger. 1997. The Large, the Small and the Human Mind. Cambridge: Cambridge University Press.
Perez, Diana Ines. 2018. Analytic Philosophy in Latin America. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/latin-american-analytic/ (last modified 8 October 2018)
Pickering, Andrew. 1984. Constructing Quarks: A Sociological History of Particle Physics. Chicago: The University of Chicago Press.
Poggiolesi, Francesca. 2021. Grounding Principles for (Relevant) Implication. Synthese 198. 7351–7376.
Popper, Karl Raimund. 1935. Logik der Forschung: Zur Erkenntnistheorie der modernen Naturwissenschaft. Wien: Springer Verlag.
Popper, Karl Raimund. 2002. The Logic of Scientific Discovery. London: Routledge.
Prince, Alan & Paul Smolensky. 1997. Optimality: From Neural Networks to Universal Grammar. Science 275 (5306). 1604–1610.
Putnam, Hilary. 1975. Mathematics, Matter and Method. Cambridge: Cambridge University Press.
Quine, Willard van Orman. 1953. From a Logical Point of View: Logico-philosophical Essays. Cambridge: Harvard University Press.
Quine, Willard van Orman. 1963. From a Logical Point of View: Logico-philosophical Essays. New York: Harper Torchbooks.
Quine, Willard van Orman. 1985. The Time of My Life: An Autobiography. Cambridge: Bradford Books.
Reboul, Anne. 2017. Cognition and Communication in the Evolution of Language. Oxford: Oxford University Press.
Reichenbach, Hans. 1938. Experience and Prediction: An Analysis of the Foundations and the Structure of Knowledge. Chicago: The University of Chicago Press.
Reutlinger, Alexander & Juha Saatsi (eds.). 2018. Explanation Beyond Causation. Oxford: Oxford University Press.
Romero, Gustavo. 2019. Physics and Philosophy of Physics in the Work of Mario Bunge. In Michael Matthews (ed.), Mario Bunge: A Centenary Festschrift, 289–302. Springer Nature Switzerland.
Rosenberg, Alex. 2005. The Philosophy of Science: A Contemporary Introduction. London: Routledge.
Saeed, John. 2016. Semantics. Malden: Wiley Blackwell.
Salmon, Wesley. 1998. Causality and Explanation. Oxford: Oxford University Press.
Scholz, Barbara C., Francis Jeffry Pelletier & Geoffrey K. Pullum. 2015. Philosophy of Linguistics. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/linguistics/ (last modified 2 March 2022)
Schröder, Manfred. 1991. Fractals, Chaos, Power Laws. New York: Freeman.
Searle, John Rogers. 1997. The Construction of Social Reality. New York: The Free Press.
Searle, John Rogers. 2010. Making the Social World. Oxford: Oxford University Press.
Searle, John Rogers. 2012. Reply to Commentators. Organon F, Supplementary issue 2, 19. 200.
Sériot, Patrick. 2014. Structure and the Whole. Berlin: De Gruyter.
Shannon, Claude Elwood. 1948. A Mathematical Theory of Communication. Bell System Technical Journal 27 (3). 379–423.
Shieber, Stuart. 1985. Evidence Against the Context-Freeness of Natural Languages. Linguistics and Philosophy 8 (3). 333–343.
Shin, Sun-Joo. 2002. The Iconic Logic of Peirce’s Graphs. Cambridge: The MIT Press.
Simon, Herbert Alexander. 1955. On a Class of Skew Distribution Functions. Biometrika 42 (3–4). 425–440.
Simon, Herbert Alexander. 1969. The Sciences of the Artificial. Cambridge: MIT Press.
Singh, Simon. 1997. Fermat’s Last Theorem. New York: Fourth Estate.
Skow, Bradford. 2014. Are There Non-Causal Explanations (of Particular Events)? British Journal for the Philosophy of Science 65 (3). 445–467.
Skow, Bradford. 2016. Scientific Explanation. In Paul Humphreys (ed.), The Oxford Handbook of Philosophy of Science, 525–543. Oxford: Oxford University Press.
Smith, Peter. 1998a. Approximate Truth and Dynamical Theories. The British Journal for the Philosophy of Science 49 (2). 253–277.
Smith, Peter. 1998b. Explaining Chaos. Cambridge: Cambridge University Press.
Stenger, Victor John. 2006. The Comprehensible Cosmos: Where Do the Laws of Physics Come From? New York: Prometheus Books.
Stephan, Achim. 1999. Emergenz: Von der Unvorhersagbarkeit zur Selbstorganisation. Dresden: Dresden University Press.
Strauss, Udo. 1980. Structure and Performance of Vocal Systems. Bochum: Brockmeyer.
Strevens, Michael. 2008. Depth: An Account of Scientific Explanation. Cambridge, MA: Harvard University Press.
Strevens, Michael. 2016. Complexity Theory. In Paul Humphreys (ed.), The Oxford Handbook of Philosophy of Science, 695–716. Oxford: Oxford University Press.
Suárez, Mauricio. 2003. Scientific Representation: Against Similarity and Isomorphism. International Studies in the Philosophy of Science 17 (3). 225–244.
Suárez, Mauricio. 2016. Representation in Science. In Paul Humphreys (ed.), The Oxford Handbook of Philosophy of Science, 440–459. Oxford: Oxford University Press.
Suárez, Mauricio & Francesca Pero. 2018. The Representational Semantic Conception. Philosophy of Science 86 (2). 344–365.
Suppe, Frederick. 1977. The Structure of Scientific Theories. Urbana: University of Illinois Press.
Suppes, Patrick. 1970. Probabilistic Grammars for Natural Languages. Synthese 22 (1–2). 95–116.
Suppes, Patrick. 1972. Axiomatic Set Theory. New York: Dover Publications.
Thompson, D’Arcy Wentworth. 1961. On Growth and Form. Cambridge: Cambridge University Press.
Torre, Iván G., Bartolo Luque, Lucas Lacasa, Christopher T. Kello & Antoni Hernández-Fernández. 2019. On the Physical Origin of Linguistic Laws and Lognormality in Speech. Royal Society Open Science 6 (8). 1–22.
Torretti, Roberto. 1999. Philosophy of Physics. Cambridge: Cambridge University Press.
Toulmin, Stephen & June Goodfield. 1999. The Fabric of the Heavens: The Development of Astronomy and Dynamics. Chicago: The University of Chicago Press.
Tuldava, Juhan. 1995a. Methods in Quantitative Linguistics. Trier: Wissenschaftlicher Verlag.
Tuldava, Juhan. 1995b. Informational Measures of Causality. Journal of Quantitative Linguistics 2 (1). 11–14.
Tuldava, Juhan. 1995c. A Comment on Bunge’s “Causality and Probability in Linguistics”. Journal of Quantitative Linguistics 2 (1). 17–18.
Tuldava, Juhan. 1998. Investigating Causal Relations in Language with the Help of Path Analysis. Journal of Quantitative Linguistics 5 (4). 256–261.
Turing, Alan Mathison. 1937. On Computable Numbers, with an Application to the Entscheidungsproblem. Proceedings of the London Mathematical Society s2–42 (1). 230–265.
van Eck, Dingmar & Julie Mennes. 2016. Design Explanation and Idealization. Erkenntnis 81 (5). 1051–1071.
van Fraassen, Bas Cornelis. 1980. The Scientific Image. Oxford: Oxford University Press.
van Fraassen, Bas Cornelis. 1989. Laws and Symmetry. Oxford: Oxford University Press.
van Fraassen, Bas Cornelis. 2002. The Empirical Stance. London: Yale University Press.
Veyne, Paul. 2010. Foucault: His Thought, His Character. Cambridge: Polity Press.
von Neumann, John & Arthur W. Burks (ed.). 1966. Theory of Self-Reproducing Automata. Urbana: University of Illinois Press.
von Wright, Georg Henrik. 1971. Explanation and Understanding. London: Routledge.
Weber, Erik. 2019. Against “Distinctively Mathematical” Explanations of Physical Facts. Presentation at the workshop “Non-Causal Explanations: Logical, Linguistic and Philosophical Perspectives”, Ghent, 10 May 2019.
Weisberg, Michael. 2013. Simulation and Similarity: Using Models to Understand the World. Oxford: Oxford University Press. Weyl, Hermann. 1952. Symmetry. Oxford: Oxford University Press. Wiener, Norbert. 1948. Cybernetics: Or Control and Communication in the Animal and the Machine. Cambridge: MIT Press. Wildgen, Wolfgang. 2005. Catastrophe Theoretical Models in Semantics. In Reinhard Köhler, Gabriel Altmann & Rajmund G. Piotrowski (eds.), Quantitative Linguistics: An International Handbook, 410–423. Berlin: Walter de Gruyter. Wildgen, Wolfgang & Laurent Mottron. 1987. Dynamische Sprachtheorie: Sprachbeschreibung und Spracherklärung nach den Prinzipien der Selbstorganisation und der Morphogenese. Bochum: Brockmeyer. Wimmer, Gejza & Gabriel Altmann. 1994. The Theory of Word Length: Some Results and Generalizations. In Peter Schmidt (ed.), Glottometrika (vol. 15), 110–129. Bochum: Brockmeyer. Wimmer, Gejza & Gabriel Altmann. 1999. Thesaurus of Univariate Discrete Probability Distributions. Essen: STAMM. Wimmer, Gejza & Gabriel Altmann. 2005. Unified Derivation of Some Linguistic Laws. In Reinhard Köhler, Gabriel Altmann & Rajmund G. Piotrowski (eds.), Quantitative Linguistics: An International Handbook, 791–807. Berlin: Walter de Gruyter. Winkler, Peter. 1982. Quantitative Analyse phonetischer Transkripte. In Werner Lehfeldt & Udo Strauss (eds.), Glottometrika (vol. 4), 1–79. Bochum: Brockmeyer. Woodward, James. 2003. Making Things Happen: A Theory of Causal Explanation. Oxford: Oxford University Press. Woodward, James. 2018. Some Varieties of Non-causal Explanation. In Alexander Reutlinger & Juha Saatsi (eds.), Explanation Beyond Causation, 117–140. Oxford: Oxford University Press. Wouters, Arno. 1999. Explanation Without a Cause. Ph.D. Thesis. Utrecht University, The Netherlands. http://www.morepork.demon.nl/diss/dissertation.pdf (accessed 25 August 2021) Wouters, Arno. 2003. Four Notions of Biological Function. 
Studies in History and Philosophy of Biological and Biomedical Sciences 34 (4). 633–668. Wouters, Arno. 2007. Design Explanations: Determining the Constraints on What Can Be Alive. Erkenntnis 67 (1). 65–80. Wright, Larry. 1976. Teleological Explanations: An Etiological Analysis of Goals and Functions. Berkeley: University of California Press. Zámečník, Lukáš. 2012. External Realism as a Non-Epistemic Thesis. Organon F 19, Supplementary issue 2. 25–30. Zámečník, Lukáš. 2014. The Nature of Explanation in Synergetic Linguistics. Glottotheory 5 (1). 101–120. Zámečník, Lukáš. 2015. Nástin filozofie vědy. Brno: Host. Zámečník, Lukáš. 2018. Mathematical Models as Abstractions. Organon F 25 (2). 244–264. Zámečník, Lukáš. 2021. Towards a Universal Account of Asymmetry in Non-causal Explanations. Filozofia 76 (6). 407–422. Zámečník, Lukáš. 2022. The Role of Philosophy of Science in Quantitative Linguistics. Linguistic Frontiers 5 (1). 13–23. Ziegler, Arne. 2005. Denotative Textanalyse. In Reinhard Köhler, Gabriel Altmann & Rajmund G. Piotrowski (eds.), Quantitative Linguistics: An International Handbook, 423–447. Berlin: Walter de Gruyter. Zipf, George Kingsley. 1949. Human Behavior and the Principle of Least Effort. Cambridge: Addison-Wesley Press.
Persons index
Altmann Gabriel 4, 8, 35, 102 n, 116, 124, 133, 139 n, 140, 143, 146–166, 168, 170–172, 174, 176, 179, 180, 181, 188 n, 190 n, 193, 204 n, 220, 222, 232, 258, 264, 265 Andres Jan 13 n, 28, 35, 124 n, 157, 164 n, 169 n, 172, 175, 193 n, 206, 208, 209, 217, 220, 221, 232 Aristotle 238 Baggott Jim 33 n, 162 n, 193 n, 232 n, 237 Bain Jonathan 22 Baker Alan 235, 236 Balasubrahmanyan Viddhachalam K. 174, 257 Batterman Robert 22, 26, 175 n, 265 n Beck Guido 253 Beneš Martin 158, 163 Benešová Martina 116, 157 n, 158 n, 203, 204, 205, 221 Best Karl-Heinz 102 n, 169 Blin-Stoyle Roger John 251 n, 252 Bliss Ricki Leigh 1 n, 232, 233 Bokulich Alisa 6 n, 26, 30, 31 Boole George 102, 111, 112 Bromberger Sylvain 14, 15, 230 Bunge Mario 48 n, 117, 134 n, 137, 147 n, 150, 154, 158, 186, 187, 220, 253, 254 Burks Arthur W. 24, 74 Caldarelli Guido 32 n, 34 n, 107, 172–177 n, 209, 221 n, 263 Carnap Rudolf 14 n, 59, 63 n, 69 n, 70 n, 153 n Cartwright Nancy 19, 20, 41 Chomsky Noam 2, 3, 7, 9, 35, 38, 74–93, 96, 103, 117, 120, 124, 138, 218, 219 Cohen Jack 160 Cordero Alberto 253, 254 Craver Carl 23, 24 n, 25 n, 35, 220, 225, 226 Čech Radek 158, 163, 220 Darden Lindley 23, 35, 220 Davidson Donald 154, 241, 243 n De Saussure Ferdinand 2, 7, 9, 12 n, 19, 20, 35, 37, 47–58, 60, 62, 64, 70, 74, 77, 90, 102, 103, 105, 109, 110, 113, 114, 218, 222 Dostál Daniel 176, 179
Egré Paul 93–96, 120, 122, 123, 179 Einstein Albert 111, 238, 242 n, 244 n, 249, 250 Faltýnek Dan 1 n, 51 n, 116, 141 n, 158 n, 203, 204, 205, 221 Ferrer-i-Cancho Ramon 172, 173 n, 174, 175, 176, 179, 257 Fine Kit 232, 233 Foucault Michel 2 n, 50, 56 n French Steven 33, 47 n, 52 n, 53 n, 55 n, 56, 100 n, 209 n, 241, 242 n, 244 n Frigg Roman 14 n, 19 Galilei Galileo 20, 37, 81, 130 n, 156, 229, 230, 238 Galison Peter 2 n, 55 Garson Justin 24 n, 72, 203, 204, 205 Giere Ronald 11, 14 n, 16, 17, 18, 19, 36 n, 38 n, 41, 42, 158, 223, 224 Givón Talmy 94, 96, 97 Glennan Stuart 23, 24, 225, 226 Gross Jonathan 78 n, 209 Grzybek Peter 48, 117 n, 137 n, 150, 158 Haken Hermann 34 n, 73, 124, 125, 141, 148, 149, 159, 172, 189 n, 194, 199, 202, 257, 259 Halvorson Hans 16 n, 17, 44, 138 Hammerl Rolf 126 n, 180, 186, 197, 198 Hanson Norwood Russell 3, 153 Haspelmath Martin 7, 75, 93, 98–101, 121–123, 193, 206 Hempel Carl Gustav 3–5, 9, 10, 13 n, 14, 15, 17, 19, 21, 23, 25 n, 46 n, 48 n, 59, 61, 72, 116, 150, 152, 180–185, 188 n, 189, 190, 194, 196 n, 203, 220, 222, 232, 261, 262 Herdan Gustav 2, 7, 9, 35, 52 n, 53, 75, 82 n, 90 n, 101–114, 124, 125 n, 126, 130 n, 170, 172, 218, 219, 221 n, 244 n, 249, 251, 252 Hjelmslev Louis 2, 9, 35, 47, 48 n, 59–72, 77, 80, 85, 87, 113, 118 n, 169, 218 Hřebíček Luděk 8, 13, 28, 35, 68, 107, 114, 124 n, 125 n, 133, 140 n, 145, 164–169, 174, 193, 206, 208, 214, 217, 220–222 n, 232, 245, 260 Huneman Philippe 26, 35, 207, 209, 210 n, 212, 213, 215, 216, 217, 220
Jakobson Roman 72, 118, 251 n Jansson Lina 27, 28, 31, 34 Jürgens Hartmut 55, 185
Morrison Margaret 1 n, 19, 20, 21 n, 22, 28 n, 33 n, 34, 161, 167, 173, 221 n, 235 n Mottron Laurent 140, 257
Kellert Stephen 34, 55 n, 94 n, 124, 159 n, 167, 185 n, 194 n, 244 n, 257 Khalifa Kareem 5, 26, 58 n Kim Jaegwon 39 n, 184 n, 195, 196 n, 203 Klimovsky Gregorio 253 Köhler Reinhard 2–5 n, 8, 9, 19, 23, 30, 35, 46 n, 54, 58, 71, 73, 75, 82 n, 88 n, 89 n, 104–107, 109–111, 113, 115–120, 123–149, 151, 156, 158–161, 163–168, 170–176, 178–183, 185–209, 211–215, 217–222, 232, 256, 258, 264 Kořenský Jan 72, 81 n, 118 Kostić Daniel 26, 30, 31, 35, 207, 215, 220, 221 Krivochen Diego 176 Krott Andrea 133 Kuhn Thomas 30 n, 132, 153
Nagel Ernst 16 n, 59, 134 n, 150 Naranan Sundaresan 174, 257 Newmeyer Frederick 7, 75, 93, 95–99, 101, 118–123, 128, 143, 144, 179, 193, 206 Newton Isaac 2, 3, 10, 13, 20 n, 21 n, 33 n, 37, 81, 86, 87, 130 n, 139, 156, 157 n, 163, 166, 167, 208, 238, 239, 240n, 242 n Nguyen James 19 Nilsson Dan 228 Novotný Marian 176, 179
Lacková Ľudmila 51 n, 72 Lakatos Imre 76, 81 n, 153, 265 Lange Marc 1 n, 5, 10 n, 11 n, 13 n, 14, 22, 26, 28–31, 33, 35, 52 n, 131 n, 164 n, 197 n, 198 n, 209, 214 n, 221, 224, 235 Levin Michael 94 Li Wentian 174 Lightfoot David 121 Lindemann Hans 253 Lombardi Olimpia 253, 254 Maj Jaroslaw 126 n, 180, 197 Mancosu Paolo 1 n, 233–236 Mandelbrot Benoît 83 n, 106, 107 n, 139 n, 164, 174 n, 177, 178 Matlach Vladimír 51 n, 164 n, 176, 179 Matthews Michael 253, 254 Maudlin Tim 232 Maxwell Grover 161, 244 n Maziarz Mariusz 226 Mennes Julie 23, 227, 228 Meyer Peter 117 n, 149, 154–161, 167, 168, 172, 174. 176, 189 Milička Jiří 111 n, 141 n, 142 n, 176 Millson Jared 5, 26 Mitzenmacher Michael 111 n, 177, 178
Oppenheim Paul 13 n, 61 n, 72, 152 n, 181 n, 184 n, 185 n, 189 Osolsobě Ivo 72 Papineau David 60 Partee Barbara H. 24 n, 75 n, 76 n, 246–248 Pastor Julio Rey 253 Peitgen Heinz-Otto 55 n, 185 Pelletier Francis Jeffry 6 n, 22 n, 78 n, 93 Penrose Roger 252 Perez Diana Ines 254 Pérez Ransanz Ana Rosa 253, 254 Planck Max 249 Poggiolesi Francesca 11, 232 Poincaré Henri 55, 222 Popper Karl Raimund 59, 60 n, 76, 150–153, 158, 232, 241 n, 265 Prigogine Ilya 73 Pullum Geoffrey K. 6 n, 22 n, 78 n, 93 Quine Willard van Orman 16, 79, 153, 154, 202, 253 Reboul Anne 83 n, 92 Reutlinger Alexander 5, 22, 23, 26, 39 n Risjord Mark 5, 26 Rosenberg Alex 5 n, 18, 38 n, 76 n Russell Bertrand 253 Saatsi Juha 5, 22, 23, 26–28, 31, 84 Salmon Wesley 5, 26 n, 39, 95 n, 229
Saupe Dietmar 55, 185 Scholz Barbara C. 6 n, 22 n, 78 n, 93 Schröder Manfred 166 Searle John R. 242, 243 Sellars Wilfrid 154 Sériot Patrick 72 Shannon Claude Elwood 74, 82 Shieber Stuart 21 n, 76 n, 247 Simon Herbert 35, 83 n, 177, 195, 196 n Singh Simon 234 Skow Bradford 6 n, 15 n, 22, 25, 39 n Smith Peter 18, 20 n, 34, 55 n, 124, 177 n, 257 Solé Ricard V. 172, 174–176, 179, 257 Stenger Victor John 10 n, 32, 33 n, 57 n, 166, 237–240, 251 n Stephan Achim 125 n, 159 n, 184 n, 202 Stewart Ian 160 Strauss Udo 134 n, 139 n, 184 n, 202 Strevens Michael 5 n, 102, 233 Suárez Mauricio 17, 138 Tabery James 25 n, 53 n, 225, 226 Taylor Richard 234 Ter Meulen Alice 24 n, 75 n, 76 n, 246–248 Thom René 257 Torre Iván G. 139 n, 141, 170 n, 172, 174, 179, 186 n Torretti Roberto 2, 37 n, 81 n, 156 Trogdon Kelly 1 n, 232, 233 Tucker Thomas W. 78 n, 209
Tuldava Juhan 185–187, 196 Turing Alan Mathison 24 n, 70 n, 74, 77, 235 n Van Eck Dingmar 23 Van Fraassen Bas 15, 16, 19, 32 n, 41, 154, 242 Von Neumann John 24, 70 n, 74 Wall Robert E. 24 n, 75 n, 76 n, 246–248 Weber Erik 13, 16, 236 Weyl Hermann 165 Wiener Norbert 74, 189 n Wildgen Wolfgang 140, 174, 257 Wiles Andrew 234 Wimmer Gejza 116, 140, 146–149, 154–156, 158, 159, 161, 162, 171, 174, 190 n, 193, 197, 258 Woodward James 23, 26–29, 39, 227, 229–231 Wouters Arno 227 Wright Larry 2 n, 72 Zach Martin 226 Zámečník Lukáš 4 n, 22 n, 27 n, 29, 78 n, 84, 116, 141 n, 145 n, 147 n, 158 n, 167, 176, 177 n, 184 n, 188 n, 195, 197 n, 199 n, 202–205, 215, 221, 224 n, 231 n, 236, 240 n, 243 n Ziegler Arne 141 Zipf George K. 35, 102 n, 106, 107 n, 116, 119, 127 n, 139, 148, 157, 165 n, 166, 168, 170, 172, 174, 176, 190, 245 n, 257
Subject index
Arbitrariness 7, 9, 21, 22, 49, 50–52, 54–57, 60, 62–64, 85, 113, 114, 219, 236 Axiom – structural axiom 8, 9, 73, 110, 111, 116, 125, 140–142, 144–146, 148, 151, 162, 170, 171, 183, 188–191, 193, 194, 195, 199, 200, 203, 205, 206, 213, 222 Causality 5, 14, 15, 22, 26, 39, 40, 94, 104, 144, 185–188, 194, 198, 202, 253 – causal action 194 – causal interpretation 31, 197, 202 – causal laws 14, 144, 181, 244 – causal nexus 5, 14, 15, 19, 27, 28, 29, 31, 39, 44, 54, 185, 186, 187, 194, 197, 198, 200, 203, 205–207, 244 – causal relation 39, 185, 186, 196, 202, 214 – causal thesis 202 – downward causation 172, 194, 197–203, 205 Dependence 25, 27–30, 42, 44, 64–68, 70, 71, 79, 83, 110, 123, 130–135, 139, 165, 185–187, 195, 197, 228, 230, 242, 247 Description – formal description 4, 12, 80, 87, 89, 97, 98, 120, 122 – formal linguistic description 4, 12, 89 – functional description 101, 194, 195, 206 – grammatical description 78, 122 – linguistic description 1, 9, 51, 98, 115, 121, 122, 145, 195, 218, 219 – quantitative statistical description 75 – self-consistent and exhaustive description 62 – self-consistent and simple description 59 – structural description 75, 113, 220, 249 – structuralist description 113 – systemic description 38, 50, 56–59, 62–65, 67–69, 71, 73, 74, 80, 86, 113, 117, 121, 162, 203, 219 Explanation – causal explanation 19, 22, 26, 27, 29, 39, 40, 118, 172, 180, 187, 226, 229 – counterfactual framework 27, 31, 227, 229, 230
– design explanation 23, 24, 219, 220, 227, 228 – D-N model of explanation 7, 44, 60, 96, 138, 141, 144, 152, 179, 181, 190, 207 – explanation by description 4 – formal explanation 3, 4, 6, 7, 9, 45, 74–78, 80, 82, 87, 93, 95–99, 101, 117, 118, 120–123, 171 – functional explanation 3–5, 7–9, 24, 35, 40, 45, 46, 54, 58, 68, 72, 73, 75, 77, 93, 95, 98, 99, 101, 107, 115–123, 139, 141–145, 147, 160, 171, 172, 178–181, 183–185, 187, 189, 191, 193–196, 200, 203, 204, 207, 208, 210, 211, 213, 219–222, 227, 228 – grammatical explanation 93–95, 123 – linguistic explanation 5–8, 12, 22, 35, 36, 40, 42, 43, 50, 58, 92, 98, 101, 122, 123, 137, 145 – mathematical explanation 28, 32, 34, 43, 169, 209, 221, 224, 232–235 – mechanistic explanation 23, 35, 225, 226 – metaphysical explanation 11, 233, 234, 241 – neuroscientific explanation 145, 206 – non-causal explanation 5–7, 14, 22, 23, 25–35, 39, 43, 44, 70, 113, 161, 206, 207, 220, 229, 232, 235 – non-causal structuralist explanation 58 – scientific explanation 4–7, 10–24, 26, 28, 29, 32, 35, 36, 37, 38, 39, 40, 42, 43, 45, 47, 51, 52, 59, 61, 87, 96, 115, 137, 145, 162, 206, 220, 233, 234, 243, 245 – systemic explanation 7, 47–73 – teleological explanation 3, 72, 73, 118, 160, 172, 180, 182, 184 – topological explanation 26, 31, 35, 43, 45, 100, 145, 178, 207–210, 212–216, 220, 221 – Woodward’s counterfactual framework 27, 227 Functional equivalents 8, 120, 131, 180–184, 188, 190, 191, 192, 193, 199, 210, 212, 213, 222, 264 Functional reduction 195, 196, 203 Generalization – empirical generalization 63, 86, 96, 99, 135, 151, 152, 162, 163
– linguistic generalization 99, 100 Generativism 3, 4, 12, 21, 37, 38, 42, 43, 74–78, 81, 82, 85, 87–89, 91–101, 117, 122, 137, 154, 162, 191, 197, 206, 220 – generativist conception 80 Grammar – autonomous syntax 95, 96 – context-free grammar 76, 246 – context-sensitive grammar 77, 83 – finite state grammar 83 – formal grammar 3, 12, 21, 74, 75, 76, 77, 88 – generative grammar 10, 75, 76, 82, 90, 91, 94, 122 – regular grammar 83, 246 – simplest grammar 81 – syntactic structures 2, 75, 78, 86, 87, 96 – transformational grammar 3 – universal grammar 12, 78, 79, 80, 81, 91, 98, 121 – universal principle of grammar 81 Hypothesis – hypothesis of self-regulation 183, 189, 194 – register hypothesis 111, 116, 140, 141, 143, 144, 148, 166, 167, 168, 171, 173, 174, 175, 179, 196, 201, 202, 208, 221 Interdependence 66, 67, 70 Law – causal law 14, 144, 181, 244 – developmental law 139, 198 – diachronic law 56, 72 – distribution law 135, 139, 171 – empirical law 80, 107 – fundamental law of communication 104 – language law 56 – law of duality 102, 103, 112, 252 – law of the least effort 106, 107 – linguistic law 4, 8, 47, 48, 52, 56, 104, 107, 137, 139, 140, 146, 152, 154, 157, 158, 170, 172, 174, 178, 179, 206, 220, 221 – Mandelbrot’s canonical law 106 – Menzerath-Altmann’s law 140–146, 148, 157, 159, 164, 165, 167, 168, 170–174, 176, 194, 201, 208, 221, 222, 260
– Menzerath’s law 139, 143 – Newton’s law 10, 13, 139, 166, 208, 238, 239 – panchronic law 56, 57 – power law 34, 107, 109, 111, 116, 132, 140, 146, 162, 165–179, 208, 214, 221, 263 – scientific law 13, 14, 19, 40, 41, 57, 61, 106, 137, 150, 152, 239, 244 – stochastic law 158, 171 – synchronic law 56, 105 – Zipf’s law 106, 119, 139, 168, 170, 172, 176, 190, 257 Linguistics – biolinguistics 87, 89 – cognitive linguistics 3, 9, 81, 93, 112, 113, 117, 118, 143, 144, 191, 194, 211, 220 – comparative linguistics 154 – formal linguistics 4, 12, 89, 97, 122, 124 – generative linguistics 7, 8, 9, 96 – linguistic phenomena 1, 2, 5, 12, 19, 47, 59, 82, 221, 252 – linguistic structure 78, 85, 86, 87, 113 – neurolinguistics 3, 117, 118, 144 – psycholinguistics 81, 118, 186, 194 – quantitative linguistics 4, 7, 8, 9, 13, 23, 28, 33, 35, 42, 43, 102, 105, 115–217, 219–221, 257, 258, 264, 265 – structural linguistics 9, 103 – synergetic linguistics 115, 124, 142, 147, 159, 172, 198 – system-theoretical linguistics 4–6, 8, 9, 21–23, 30, 35, 42, 43, 45, 46, 58, 71, 73, 75, 78, 82, 92, 97, 98, 101, 104–106, 109, 113–118, 123, 124–146, 147–151, 156, 162, 170–172, 176, 178–216, 219, 220, 221, 244, 245, 255, 256, 257 Mathematical borrowings 2 – boundary conditions 10, 13, 44, 45, 73, 190, 200, 231 – graph theory 2, 19, 27, 49, 74, 77, 78, 90, 209, 211, 216 – lognormal distribution 110, 111, 172, 173, 176–178 – mathematical abstractions 84, 88 – mathematical concepts 103, 233, 234 – mathematical intuitions 2, 234 – mathematical logic 91
– mathematical structures 58, 221 – statistical distributions 116, 146, 168, 172, 220 – topological property 212, 216 – topology 2, 49, 74, 175, 177, 209, 211, 215 Methods – deductive argument 5, 10, 11, 13, 182 – falsification 76, 153, 264, 265 – functional analysis 4, 9, 18, 72, 73, 116, 123, 147, 172, 180–184, 189, 194, 203–206, 220, 261, 262 – inductivism 59, 63, 150, 151, 153, 172, 264, 265 – language theory method 59 – mathematical analysis 89 – method of generalization 59 – quantitative-linguistic methods 4 – verificationism 59, 253 Minimalist program 75, 80, 91, 95, 96, 101 Model – causal models of scientific explanations 7 – D-N model 7, 14, 15, 36, 44, 60, 61, 72, 94, 95, 96, 137, 138, 141, 144, 152, 153, 179, 181, 190, 207, 230 – explanation model 5, 14, 26, 28, 30, 39, 40, 43, 144, 172, 179, 191, 193, 206, 207, 208, 212–215 – explanatory model 5, 8, 31, 207 – formal model 75, 76, 96, 101, 159 – functional model 2, 8, 54, 70, 146, 151, 170, 172, 175, 181, 187, 188, 190, 199, 206, 222 – Huneman’s model 207 – Kostić’s model 217 – linguistic model 14, 147, 148, 211 – mathematical model 22, 103, 106, 118 – model of explanation 7–10, 23, 30, 35, 44–46, 51, 52, 54, 56, 57, 60–62, 66, 67, 72, 73, 75, 77, 80, 81, 84, 87–89, 95–97, 101, 103–105, 112–114, 116, 135, 138, 141, 144–146, 150–152, 169–173, 175, 176, 179, 181, 187, 188, 190, 193, 199, 200, 203, 206–209, 211, 216, 217, 221, 230, 234, 252 – model of human language processing 141 – models of scientific explanation 7, 14, 26, 40, 43 – principle-based model of explanation 7, 9, 44, 45, 46, 51, 52, 54, 56, 57, 62, 66, 67, 73, 77,
80, 84–89, 95, 96, 97, 101, 103–105, 112, 114, 116, 135, 144, 145, 169, 171, 173, 193, 199, 206, 209, 217, 252 – scientific model 17–20, 25, 32, 224 – standard model of particles and interactions 32, 237, 245 – topological model 8, 9, 30, 113, 176, 206, 209, 211, 213, 214, 216, 217, 221 Natural language 21, 22, 67–69, 74–80, 83, 86, 88, 100, 131 Ordners 198–202, 210 Philosophy – analytic philosophy 61, 69, 154, 184, 232, 253, 254 – conventionalism 138 – determinism 124, 245 – emergentism 202 – epistemology 16, 61, 254 – instrumentalism 163, 164, 241 – logical empiricism 149, 152 – logical positivism 12, 39, 150, 151, 154 – ontology 34, 202, 203, 254 – philosophy of language 61, 69, 74, 103 – philosophy of linguistics 6, 9, 137, 218, 219 – philosophy of mind 195, 203 – philosophy of science 2–12, 14, 16, 18–23, 25, 29, 33–35, 39–42, 46, 58, 60, 61, 72, 76, 85, 116, 117, 123, 132, 137, 138, 139, 147, 150, 151, 153, 154, 161, 163, 170, 171, 180, 190, 200, 206, 207, 209, 215, 218, 219, 220, 224, 229, 232, 242, 253, 254 – philosophy of scientific explanation 7, 10–46 – physicalism 195, 203 – realism 25, 39, 64, 65, 163, 164, 241–243 – teleology 3, 72, 117, 160, 172, 205 Principle – conservation principle 33, 145, 166, 214, 222, 235, 251 – construction principle 88, 90–92, 95–98, 101 – economization principle 106, 114, 145, 169, 174, 244 – enslaving principle 199, 202
– explanatory principle 8, 43, 51, 52, 54, 58, 68, 76, 84, 87, 88, 89, 105, 107, 114, 116, 118, 145, 146, 162, 171, 174, 178, 189, 200 – general guiding principle 98 – general principle 57, 61, 95, 113, 134, 144, 170, 209, 252, 257 – generative principle 76, 77 – linguistic principle 7, 43, 61, 90, 97, 107, 169, 173, 214, 217 – minimization principle 174, 175, 177, 178 – physical principle 88, 89, 189 – principle of analysis 2, 7, 9, 48, 64–71, 87, 113, 169, 209, 219, 231 – principle of arbitrariness 7, 9, 50, 51, 52, 54, 55, 56, 57, 62, 113, 114, 219 – principle of compositeness 8, 9, 71, 146, 164, 165, 166, 167, 168, 169, 173, 179, 208, 209 – principle of duality 9, 111–114, 170, 252 – principle of economization 144 – principle of invariance 8, 9, 32, 45, 141, 145, 146, 165, 170, 171, 210, 211, 212, 213, 216, 237, 238, 251 – principle of the linguistic system structure 9, 141, 145, 146 – principle of language 64, 91, 92 – principle of recursion 7, 9, 84, 86, 88, 89, 90, 92, 101, 169, 219 – principle of regularity 56 – principle of universality 167, 173 – scientific principle 42–45, 73, 77, 81, 89, 244, 245, 252 – structural principle 88, 141 – symmetry principle 32, 33, 35, 65, 194, 235, 237 – systemic principle 107 – teleological principle 73 – universal principle 45, 75, 78, 81, 107, 170, 190, 204 Pythagorean Theorem 10, 13, 223, 224 Requirement – adaptation requirement 129, 142, 255 – application requirement 128, 255 – coding requirement 127, 255 – context economy 127, 128, 255 – context specificity 127, 128, 255 – de-specification requirement 255
– economy requirement 107, 255 – efficiency of coding 128, 255 – flexibility of the expression-meaning-relation 255 – invariance of the expression-meaning-relation 255 – invariance requirement 134 – limitation of embedding depth 128, 255 – maximisation of complexity 128, 255 – minimisation of decoding effort 255 – minimisation of encoding effort 255 – minimisation of inventories 128, 255 – minimisation of memory 128, 255 – minimisation of production effort 255 – minimisation of structural information 128, 255 – preference of right-branching 128, 255 – redundancy 110, 128 – specification requirement 127, 128, 130, 134, 192, 255 – stability 116, 129, 142, 255 – structure-concept iconicity 119, 127, 128 – transmission security 128, 255 Representation 11, 17, 18, 31, 34, 41, 69, 78, 119, 149, 204, 242, 243 Self-organization 34, 73, 116, 140, 144, 145, 148, 149, 159, 162, 194, 199, 202, 206, 211, 259 Semantics 11, 17, 36, 41, 42, 44, 45, 95, 97, 138, 147, 161, 253, 254, 257 Semiology 54, 57, 58 Structuralism 2, 3, 7–9, 20–22, 37, 38, 42, 43, 47–50, 53–58, 72, 74, 77, 78, 88–90, 93, 101, 102, 105, 112, 113, 118, 124, 162, 232, 245 Symmetry – explanatory symmetry 44 – gauge symmetry 14 – structure symmetry 167 – symmetry breaking 33, 114, 166, 167, 237, 238, 240, 245, 251 – symmetry maintaining 33, 114, 237, 245 – translational symmetry 32, 237, 238, 240 Synergetic interpretation 172, 191, 195, 198, 203 Synergetics 73, 124, 125, 140, 141, 145, 148, 149, 159, 160, 172, 173, 194, 198, 199, 201–203, 206, 257, 264
System – axiomatic system 67, 137, 139 – coding system 114 – cognitive system 11, 13, 16, 121, 224 – dynamical system 32, 34, 35, 55, 73, 124, 140, 159, 166, 175, 176, 179, 238, 257 – language system 8, 12, 19, 20, 22, 49, 50, 55, 56, 64, 66, 73, 108, 109, 110, 113, 121, 123, 142, 145, 187, 190, 201, 208 – linguistic system 4, 9, 12, 21, 54, 111, 113, 116, 119, 120, 129, 131, 134, 135, 139, 140, 141, 144, 145, 146, 148, 149, 179, 201, 207, 208, 210–214, 216, 217, 221, 222, 249 – living system 145, 201, 208 – self-organized system 110, 119 – self-regulating system 125, 148, 180, 189, 264 – semiotic system 51 – syntactic system 128, 135, 201 – system of invariants 216 – system of oppositions 77, 103, 114 – system variables 126, 127, 129, 185, 198, 217
Theory – automata theory 74 – cybernetics 74, 180, 185 – explanatory theory 2, 71, 75, 86, 109, 146 – grammar theory 84–86 – information theory 74, 75, 106, 167, 180 – Köhler’s theory 109, 115, 116, 129, 132, 134, 151, 175, 215 – language theory 51, 59–63, 69, 71, 88 – mathematical theory of communication 75, 82 – model-based view of scientific theories 23, 41 – scientific theory 2, 60, 61, 65, 79, 85, 158, 162, 163 – theory of dynamical systems 34, 35, 73, 140, 166, 176, 257 – theory of explanation 229 – theory of syntax 88 Unified Approach 8, 35, 116, 140, 146–149, 151, 153–156, 159, 162–164, 166, 171, 174, 193, 197, 220, 265