English Pages [272] Year 2021
Rational Machines and Artificial Intelligence
Tshilidzi Marwala Professor, Department of Electrical Engineering, University of Johannesburg, Johannesburg, South Africa
Academic Press is an imprint of Elsevier 125 London Wall, London EC2Y 5AS, United Kingdom 525 B Street, Suite 1650, San Diego, CA 92101, United States 50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom Copyright © 2021 Elsevier Inc. All rights reserved No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions. This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein). Notices Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility. To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein. 
Library of Congress Cataloging-in-Publication Data A catalog record for this book is available from the Library of Congress British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library ISBN 978-0-12-820676-8 For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals
Publisher: Mara Conner Acquisitions Editor: Chris Katsaropoulos Editorial Project Manager: Emily Thomson Production Project Manager: Niranjan Bhaskaran Cover Designer: Miles Hitchen Typeset by SPi Global, India
Contents
Preface
CHAPTER 1 Introduction to machine and human rationality
1.1 Introduction
1.2 Machine rationality
1.3 Fourth industrial revolution
1.4 Artificial intelligence
1.5 Summary of the book
References
CHAPTER 2 What is machine vs human rationality?
2.1 Introduction
2.2 What is utility?
2.2.1 Static utility function
2.2.2 Dynamic utility function
2.3 What is rationality?
2.4 Utility of actions
2.5 State and action utility measurement
2.6 Bounded rationality
2.7 Seeking a bounded rational model
2.8 Human rationality: Clay pots
2.9 Machine rationality: Condition monitoring
2.10 Conclusions
References
CHAPTER 3 Rational machine
3.1 Introduction
3.2 What is a rational machine?
3.3 Artificial intelligence
3.3.1 Machine learning: Multilayer perceptron and deep learning
3.3.2 Soft computing
3.3.3 Computational intelligence
3.4 Knowledge representation
3.5 Rational machines
3.6 Conclusions
References
CHAPTER 4 Flexibly bounded rationality
4.1 Introduction
4.2 Rational decision-making
4.3 Bounded rational decision-making
4.4 Flexibly bounded rational decision-making of humans
4.5 Flexibly bounded rational decision-making of machines
4.6 Flexibly bounded rationality humans vs machines
4.7 Humans vs machines: Case of radiology
4.8 Conclusions
References
CHAPTER 5 Rational expectation
5.1 Introduction
5.2 Adaptive expectations
5.3 Knowledge representation of man vs machines
5.4 Machines vs human decision-making
5.5 Rational expectation and interstate conflict
5.6 Rational expectation of HIV infection
5.7 Conclusions
References
CHAPTER 6 Rational choice
6.1 Introduction
6.2 What is rational choice?
6.3 Information
6.4 Choices
6.5 Optimization
6.6 Rational choice
6.7 Human vs artificial intelligence rational choice
6.8 Interstate conflict and human vs machine rational choice
6.9 Conclusions
References
CHAPTER 7 Bounded rational counterfactuals
7.1 Introduction
7.2 Counterfactuals
7.3 Counterfactuals and causality
7.4 Rational counterfactuals
7.5 Bounded rational counterfactuals in politics
7.6 Bounded rational counterfactuals in history
7.7 Bounded counterfactuals in engineering
7.8 Bounded rational counterfactuals in interstate conflict
7.9 Counterfactuals and artificial intelligence
7.10 Conclusions
References
CHAPTER 8 Rational opportunity cost
8.1 Introduction
8.2 Counterfactuals
8.3 Opportunity cost
8.4 Game theory and opportunity cost
8.5 Rational opportunity cost
8.6 Artificial intelligence and rational opportunity cost
8.7 Handling rational opportunity cost
8.8 Conclusions
References
CHAPTER 9 Can machines be rational?
9.1 Introduction
9.2 What is rationality?
9.3 What did Herbert Simon say?
9.4 Data information and rationality
9.5 What happens when replacing humans with machines?
9.6 Digital and quantum computing
9.7 All models are wrong
9.8 Can machines be rational?
9.9 St. Petersburg Paradox and rationality
9.10 Conclusion
References
CHAPTER 10 Can rationality be measured?
10.1 Introduction
10.2 Information
10.3 Model: Biological and artificial brain
10.4 What about the uncertainty principle?
10.5 Optimization
10.6 Classification of rationality
10.7 Rational decision-making
10.8 What is irrationality?
10.9 Marginalization of irrationality theory
10.10 Marginalization of irrationality in decision-making
10.11 Rationality quantification
10.12 Rationality and condition monitoring
10.13 Conclusion
References
CHAPTER 11 Is machine rationality subjective?
11.1 Introduction
11.2 Optimization
11.3 Choosing optimization method
11.4 Local optimization
11.5 Global optimization
11.6 Is a single goal optimization subjective?
11.7 Does multicriteria optimization make rationality subjective?
11.8 The curse of dimensionality and model complexity
11.9 Is machine rationality subjective?
11.10 Conclusion
References
CHAPTER 12 Group vs individual rationality
12.1 Introduction
12.2 Democracy
12.3 Authoritarianism
12.4 Democracy vs authoritarianism
12.5 Committee of rational machines
12.5.1 Bayes Optimal Classifier
12.5.2 Bayesian model averaging
12.5.3 Bagging
12.5.4 Boosting
12.5.5 Stacking
12.5.6 Evolutionary committees
12.6 Theory of committee of networks
12.6.1 Equal weights
12.6.2 Variable weights
12.7 Application to condition monitoring
12.8 Conclusions
References
CHAPTER 13 Human vs machine rationality
13.1 Introduction
13.2 Human vs machine chess player
13.3 Human vs machine go player
13.4 Human-driven vs autonomous vehicles (AVs)
13.5 Human vs autonomous aircraft pilot
13.6 Human vs machine language translator
13.7 Human vs machine rational expectations
13.8 Human vs machine: Optimal stopping problem
13.9 Human vs machine rational choice
13.10 Human vs machine game theory and mechanism design
13.11 Human vs machine rational counterfactuals
13.12 Human vs machine rational opportunity cost
13.13 Human vs machine rationality subjectivity
13.14 Human vs machine group and individual rationality
13.15 Machine vs human rationality vs prospect theory
13.16 Conclusion
References
CHAPTER 14 Rational markets
14.1 Introduction
14.2 Efficient market hypothesis
14.3 Historical prices and market rationality
14.4 Causality and market rationality
14.5 Historical prices, internet, and market rationality
14.6 Historical prices, internet, private information, and market rationality
14.6.1 Generative adversarial network (GAN)
14.6.2 Simulated annealing
14.7 Rational expectations and market rationality
14.8 Bounded rationality and market rationality
14.9 Rational choice and market rationality
14.10 Information asymmetry and market rationality
14.11 Biases, heuristics, and market rationality
14.12 Prospect theory and market rationality
14.13 Irrational exuberance and market efficiency
14.14 Evolution of market efficiency
14.15 Conclusion
References
CHAPTER 15 Human vs machine ethics
15.1 Introduction
15.2 Normative vs scientific laws
15.3 Governance and ethics
15.4 Machine ethics
15.5 Data
15.6 Model: Algorithms and testing
15.7 Decisions and actuators
15.8 Contemporary ethical issues
15.9 Conclusion
References
CHAPTER 16 Conclusion
16.1 Introduction
16.2 Rationality
16.3 Bounded rationality, rational expectations, and rational choice
16.4 Rational counterfactual and opportunity cost
16.5 Rationality quantification and subjectivity
16.6 Rationality, groups, markets, and ethics
16.7 Conclusion
References
Nomenclature
Appendix A: Data
Appendix B: Subjectivity vs relativity
Appendix C: Algorithms
Index
Preface
Rationality is a concept that has been studied for more than 2000 years, from the times of Socrates to the modern times of Daniel Kahneman. Rationality, simply put, means being logical. Throughout these times, rationality has always been about humans. Now that intelligent machines powered by artificial intelligence (AI) are becoming common, the rationality of machines is emerging as an essential area of study.
Rational agents maximize their net utility. In this book, we use the concept of net utility, which is the difference between the expected utility and the utility cost needed to achieve that expected utility. Utility is the good that is derived from a particular action or object. According to Socrates, that which is good helps and preserves, whereas that which is bad or evil corrupts and destroys. Accordingly, rationality is the maximization of that which helps and preserves. Rational or logical agents must meet two criteria: (1) they must achieve their goal or objective, which is to maximize the net utility, and (2) they must use the minimum amount of energy to maximize the net utility. An agent that does not maximize its net utility is like a businessman who leaves money at the marketplace because he has gathered enough for the day. Such a businessman is not acting logically or rationally. Furthermore, not using the minimum amount of energy is akin to going from Cape Town to Johannesburg via London for no reason other than wanting to travel from Cape Town to Johannesburg.
This book studies the rationality of machines, especially in the light of the advances in AI, which are enabled by technologies such as deep learning, the abundance of data, and the increase in computational power due to Moore’s Law. This book studies the concept of bounded or limited rationality, which was proposed by the Nobel Laureate Herbert Simon, in both humans and machines, and finds that the bounds or limits of rationality in machines are more flexible than those of humans.
It also studies the theory of rational expectation of humans vs machines. Rational expectation is a theory that states that agents cannot be consistently wrong when predicting the future because they can always expand on the information they use to predict the future. This book finds that machine rational expectation is more accurate than human rational expectation.
The book also studies the theory of rational choice. Rational choice states that if two options are presented and option A offers a higher expected net utility than option B, then a rational agent prefers option A over B. The situation becomes complex if viable options are numerous and unknown. When humans make such decisions, they also have to identify all alternative options, which can also be viewed as identifying opportunity costs, and this is computationally expensive. The process of imagining alternatives is called counterfactual thinking. When an intelligent machine is used, the capability to explore alternative options is enhanced and accelerated. Machines are found to be better rational choice agents than human beings.
Furthermore, the book introduces rational counterfactuals, which are counterfactuals that maximize the attainment of a particular goal. If the alternative option is as good as the most rational option, then this is a rational opportunity cost. The difference between rational opportunity costs and rational counterfactuals is that rational counterfactuals are imagined events and do not have to be practical, while rational opportunity costs are viable and practical options that are forgone on making a choice. Machines are found to create more viable rational counterfactuals and opportunity costs than human beings.
Furthermore, this book studies the various properties of rationality. Can rationality be measured? Is rationality subjective? Is group rationality better than individual rationality? Are humans more rational than machines? Do machines make markets more rational than humans? Are machines more ethical than humans? This book concludes that it is difficult to measure rationality and therefore determines that rationality is subjective. Furthermore, it concludes that groups are more rational than individuals. It finds that markets that are populated by machines are more rational and efficient than human-based markets. In conclusion, machines are found to be more rational than humans.
Tshilidzi Marwala
Department of Electrical Engineering, University of Johannesburg, Johannesburg, South Africa
CHAPTER 1
Introduction to machine and human rationality
1.1 Introduction
If we dig down to the linguistic definition of rationality, it is the use of information and logic to arrive at a conclusion efficiently. This chapter introduces the concept of machine rationality. It studies how intelligent machines are revolutionizing politics, economics, and society. It studies some of the technologies that are driving intelligent machines, such as artificial intelligence (AI), advanced materials, and biotechnology. These technologies are sometimes grouped together and collectively called the fourth industrial revolution (4IR) (Klein, 2008; Schwab, 2017). The chapter also studies the evolution of production, including the first, second, and third industrial revolutions. It then links these to the emergence of intelligent machines and investigates an essential characteristic of an intelligent machine, which is machine rationality.
Perhaps the best way to introduce the notion of rationality is to trace the fall of the Tsars. The Tsars ruled Russia with an iron fist from the 16th century until the communist takeover in 1917 (Ferro, 1995; Warnes, 1999). The word Tsar is derived from the word Caesar, the name of the famous Roman ruler Julius Caesar (Freeman, 2008). Tsar Nicholas II was the last Emperor of Russia. He married the German princess Alexandra of Hesse, who was a granddaughter of Queen Victoria of England. Nicholas II and Alexandra were second cousins, making this union genetically irrational. In genetics, it is known that having offspring with a blood relative can be dangerous because the reduced genetic diversity can exacerbate family-related illnesses.
A shift in a political regime is often explained through the concept of “dialectical materialism.” Dialectical materialism is the theoretical basis of communism, and it leads to the material interpretation of history, which, according to Karl Marx, is “the history of class struggle” (Mandel, 1977).
Simply put, dialectical materialism holds that the type of society we live in is based on the mode of production that society adopts. This type of society continues until the internal contradictions within it, in this case class struggles, create the material conditions for a revolution that changes the identity of that society. Yet the communists took over Russia not because this concept was appealing to Russians. Instead, Tsar Nicholas II was dethroned because his reign failed: he made too many mistakes, including marrying his cousin and fighting a disastrous war with Japan. Nicholas II failed because his life was full of irrationalities. I define irrationality, in this context, as the act of contradicting the natural flow of things. In physics, there is a notion called “the path of least resistance,” which governs the dynamics of all things (Weissman, 2012) and explains why a given path is chosen in lieu of others. We see this concept of “least resistance” at work in gravity. For instance, if you drop a ball, it falls along the shortest path to the ground under the force of gravity. The concept is also associated with the principle of conservation of energy, which states that “energy cannot be created or destroyed but can only be transferred from one form to another” (Planck, 1923). Nicholas II’s life was so full of these contradictions that the communists exploited them before they overthrew him and killed his entire nuclear family.
Rational Machines and Artificial Intelligence. https://doi.org/10.1016/B978-0-12-820676-8.00006-5 Copyright © 2021 Elsevier Inc. All rights reserved.
But when exactly did things go wrong for Nicholas II? Firstly, he married his second cousin, and the couple’s only son, Alexei, the heir to the Russian throne, had a family-inherited disease, hemophilia. Hemophilia is an inherited genetic illness in which the blood is unable to form clots and, therefore, cannot stop bleeding (Peyvandi et al., 2016). To treat this disease, the Romanovs (the Russian royal family) enlisted the services of a controversial faith healer, Grigori Rasputin (Fuhrmann, 1990). Rasputin was neither a doctor nor a scientist, and there is no evidence that he learned to read and write. The Romanovs thus made another irrational decision, which was to entrust the care of their sick son to an illiterate spiritual man. Rasputin was a powermonger who became their advisor on matters of the state, including the military, much to the dismay of Prime Minister Alexander Trepov and the Commander-in-Chief, Grand Duke Nicholas. Rasputin was so powerful that, in the very sensitive era of World War I, he effectively appointed Boris Sturmer as Prime Minister, Minister of the Interior, and Minister of Foreign Affairs.
Rasputin’s desire to maximize his control of Russia (which was rational from his perspective) was irrational for Tsar Nicholas II (because it weakened Russia and, consequently, his own rule). Because of the confluence of the Tsar’s irrationality in dealing with the Russian people, disease, wars, etc., the Romanovs lost their empire and ultimately their lives as the Red October swept through Russia (Ascher, 2014).
What could Tsar Nicholas II have done to prevent the disaster that befell his family? We call the hypothetical thinking of what could have been done a counterfactual (Rescher, 1964). If the identified counterfactual could have maximized Nicholas II’s stay in power, then this is a rational counterfactual (Marwala, 2014a). Counterfactuals are just thought experiments. When the alternatives are real choices that Nicholas II could have made, forgoing them is called the opportunity cost. The opportunity cost that could have maximized the survival of the Tsar’s regime is the rational opportunity cost (Marwala and Hurwitz, 2017).
One type of counterfactual, which is the subject of this chapter, is a machine counterfactual. Suppose Tsar Nicholas II had an artificial intelligence (AI) machine to assist him in making all the decisions he had to make in his reign. What would the outcome have looked like? Would the communists have taken over Russia? Would Russia have industrialized as it did during Joseph Stalin’s reign? Would World War II have been won? One clear thing is that the AI machine would not have relied on the conman, Rasputin. At the core of this AI machine is its rationality, which is the subject of the next section.
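The distinction between a rational counterfactual and a rational opportunity cost can be illustrated with a minimal Python sketch. The alternatives, their utility scores, and the `viable` flag below are hypothetical values invented for illustration only; they are not taken from the historical record or from any model in this book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alternative:
    name: str
    utility: float  # hypothetical score: higher means the goal is better served
    viable: bool    # True if it was a real, practical option at the time


def rational_counterfactual(alternatives):
    """Imagined alternative that best serves the goal, practical or not."""
    return max(alternatives, key=lambda a: a.utility)


def rational_opportunity_cost(alternatives, chosen):
    """Best viable alternative that was forgone by making `chosen`."""
    forgone = [a for a in alternatives if a.viable and a != chosen]
    return max(forgone, key=lambda a: a.utility) if forgone else None


# Illustrative numbers only (assumptions, not data).
chosen = Alternative("rely on a faith healer", 0.1, True)
alternatives = [
    chosen,
    Alternative("consult trained physicians", 0.6, True),
    Alternative("consult an AI advisor", 0.9, False),  # imagined, not available
]

best_imagined = rational_counterfactual(alternatives)
best_forgone = rational_opportunity_cost(alternatives, chosen)
print(best_imagined.name)
print(best_forgone.name)
```

In this sketch the rational counterfactual may be an option that was never available, whereas the rational opportunity cost is restricted to options that were practical but not taken.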
1.2 Machine rationality
What is machine rationality? Rational decision-making means making logical decisions and is the ideal concept of intelligence. In its purest form, rational agents maximize their utility. If one wanted to fly from Johannesburg to London, the choice that one makes would be rational if it maximizes comfort, minimizes costs, etc. Studies by Kahneman and Tversky have shown that human beings are at best irrational (Tversky and Kahneman, 1989). The reason why Tsar Nicholas II made so many irrational decisions was that he was a human being. Had he been an AI machine, his decisions could have been more rational, and perhaps for this reason his life could have been spared. A rational machine, i.e., AI, does tasks that an irrational machine, i.e., a human, cannot do.
Rational decision-making means making logical decisions using information. Essentially, a rational agent processes all relevant information optimally to achieve its objective. Rationality has two fundamentals: the use of relevant information and the efficient processing of such information. Optimizing means that a rational agent can find the minimum or maximum solution. It means that such an agent, when traveling, can find the shortest distance between two locations, given the constraints. It means that when executing a task, it can identify the option with the minimum cost. Of course, there are complex mathematical arguments that will be explored in this book, such as the conditions under which it is possible to identify a globally optimum solution. Similarly, the efficient use of relevant and complete information is essential for an agent to act rationally. Complete relevant information is not practically attainable. The relevance and completeness of the information that the AI machine uses, together with the efficiency with which the machine executes its task, determine how rational the machine is.
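The traveling example above, in which an optimizing agent finds the shortest distance between two locations, can be sketched in a few lines. This is a minimal illustration that assumes Dijkstra's algorithm as the optimizer; the graph `GRAPH`, its node names, and its distances are hypothetical and do not come from the book.

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative weights.

    Returns (total_distance, path); (inf, []) if the goal is unreachable.
    """
    queue = [(0.0, start, [start])]   # (distance so far, node, path taken)
    best = {start: 0.0}               # best known distance to each node
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if dist > best.get(node, float("inf")):
            continue                  # stale queue entry, a shorter route was found
        for neighbour, weight in graph.get(node, {}).items():
            new_dist = dist + weight
            if new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour, path + [neighbour]))
    return float("inf"), []

# Hypothetical road network: edge weights are distances between locations.
GRAPH = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 3},
    "C": {"B": 2, "D": 7},
    "D": {},
}

distance, route = shortest_path(GRAPH, "A", "D")
print(distance, route)
```

On this small graph the globally shortest route can be found exactly. The text's caveat still applies: for many real problems a global optimum cannot be guaranteed, which is one reason rationality is bounded in practice.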
Unbounded rational decision-making is the notion of making decisions with perfect information, using a perfect brain in an optimized manner. Rational decision-making involves the optimal processing of complete information to achieve an objective. Because this entire process is optimized, it leads to the maximization of utility. Full or unbounded rationality is often unattainable and, consequently, rationality even of AI machines is always limited. Nobel Prize winner Herbert Simon called this phenomenon the theory of bounded rationality, which will be discussed later in this book (Simon, 1991).
1.3 Fourth industrial revolution
The fourth industrial revolution (4IR) is the era in which intelligent machines will perform tasks that were traditionally reserved for human beings (Marwala, 2020; Doorsamy et al., 2020). In the 4IR era, we should understand the human-machine system as one whose psychology, rationality, and effectiveness are interlinked. What then is the 4IR? To understand the 4IR, we should understand the previous industrial revolutions. It is called the fourth because first, second, and third industrial revolutions preceded it.
CHAPTER 1 Introduction to machine and human rationality
The first industrial revolution occurred in England during the 18th century. We do not know for certain why it happened in England. Given the population sizes of India and China at the time, one might have expected it to happen in those two countries instead. One possible reason is the Reformation, which led to the scientific revolution (Kuhn, 1962; Cameron, 2012), which saw developments in mathematics, physics, astronomy, biology, and chemistry. The Reformation led to the separation of Europe into the Catholic south and the Protestant north. To this day, the Catholic south is still more impoverished than the Protestant north. This scientific revolution in Britain gave us scientific luminaries such as Isaac Newton and James Watt. The first industrial revolution gave us the steam engine, which revolutionized transportation as well as the means and mode of production. Before that, manufacturing was performed by hand by trained craftsmen. The first industrial revolution allowed commodities to be manufactured in bulk in factories. Steam trains gave rise to railroads that carried massive amounts of goods far more quickly and efficiently than could be done on horseback, for example.
The second industrial revolution saw the introduction of electricity and mass production. This changed the scale and speed of manufacturing significantly. It also gave us the electric motor, which in turn gave us the assembly line. The assembly line led to the mass production of goods and services. The second industrial revolution happened in the United States with ideas from Britain on electromagnetism (Darrigol, 2000). Michael Faraday realized that if one places an electric conductor next to a magnet and moves the conductor, then electricity is generated.
The reverse is that if one passes electricity through an electric conductor located next to a magnet, the conductor moves, and this is the basis of the electric motor. The principle underlying this is called electromagnetism and was theorized by James Clerk Maxwell. Maxwell’s theory of electromagnetism was not consistent with Newtonian mechanics, so the Dutch physicist Lorentz developed correction factors (the Lorentz transformation) to make the two theories consistent. It took Einstein’s theory of relativity to unify the theories of Newton, Maxwell, and Lorentz.
The third industrial revolution happened because of the discovery of semiconductors (Amos and James, 1999). These materials conduct electricity only under certain conditions. For this reason, they can be used as efficient switches, and hence are used in digital computing. Before the advent of digital computers, we communicated digitally (using ones and zeros) by telegraph, which used analogue switches. Using semiconductor devices, Bardeen, Brattain, and Shockley invented the transistor, which ultimately led to the integrated circuit that makes modern computing possible. The third industrial revolution gave birth to the electronic age. Digital technology has improved so rapidly that the processing power of computers has doubled roughly every 2 years, a phenomenon called Moore’s law. At some stage, quantum effects will prevent further miniaturization of the integrated circuit, and this will be the end of Moore’s law.
The 4IR is an era characterized by the confluence of cyber, physical, and biological systems. An illustration of this is shown in Fig. 1.1. The cyber technologies include AI, blockchain, and the Internet of things (IoT) enabled by 5G technology. The developments in material science have led to a new material called graphene as well as robots that can perform complex tasks in hostile environments. On the biological technology side, developments in gene editing enhanced by AI are challenging the very essence of who we are as humans. An illustration of the evolution of the industrial revolutions is shown in Fig. 1.2. We saw automation in the previous three revolutions, but it was at a mechanical level, taking over labor-intensive tasks. The 4IR, however, is an entire paradigm shift. The primary driver of the 4IR is AI, and this is the subject of the next section.
FIG. 1.1 An illustration of the fourth industrial revolution. [Figure content: Cyber: AI, blockchain, quantum, Internet 1.0 (nets begin), 2.0 (social nets), 3.0 (internet of markets), 4.0 (IoT and IoET). Physical: 3-D printing, robotics, new materials, graphene. Biological: biomedical engineering, biotechnology.]
FIG. 1.2 An illustration of the evolution of industrial revolutions. [Figure content: 1st industrial revolution: knowledge formulation; Newton and James Watt; steam, mechanization, etc. 2nd industrial revolution: knowledge evolution; electromagnetism by Faraday, Maxwell, and Hans Christian Ørsted; electrification, mass production, etc. 3rd industrial revolution: knowledge distribution; transistors based on semiconductors by Bardeen, Brattain, and Shockley; computerization, Internetization, etc. 4th industrial revolution: knowledge mutation; artificial intelligence by Turing; cyber-physical systemization, artificial cognization, robotization, etc.]
1.4 Artificial intelligence
AI is a computational technique that makes machines intelligent. While computers traditionally relied on people to tell them what to do and how to react, AI is based on machines that can learn and make their own decisions. A machine is considered to be intelligent if it can analyze information and extract insights beyond the obvious. The limit of how intelligent these machines can be is not known. However, we now know that AI machines can perform many complicated tasks, such as playing chess better than human beings. There are three different types of AI: machine learning, soft computing, and computational intelligence (CI) (Bishop, 1995; Rutkowski, 2008; Chaturvedi, 2008). An illustration of AI with its subdisciplines is shown in Fig. 1.3.
Machine learning is the use of data and statistics to create intelligent machines. Machine learning picks up on patterns and mimics human intelligence and, in some instances, surpasses it. This gives AI some of the decision-making abilities that humans have. There are many types of machine learning, which include neural networks and support vector machines. The neural network was inspired by the structure of the human brain and consists of connected neurons that map inputs to outputs; an example is shown in Fig. 1.4. A type of machine learning is deep learning, which uses a neural network with many layers. For example, the input can be the face of a person and the output can be who that person is; this is a face recognition algorithm.
CI uses group intelligence observed in nature to build intelligent machines. It is the technique of building intelligence by observing how nature works and using this information to make intelligent machines. Unlike a neural network, which uses individual intelligence, CI uses group intelligence.
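To make the idea of a neural network mapping inputs to outputs concrete, the following is a minimal sketch of a forward pass through a network with three hidden layers, the shape described for Fig. 1.4. The layer sizes, the tanh and softmax activations, and the random untrained weights are all assumptions chosen for illustration; a real face recognition system would learn its weights from data.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer (random, untrained)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, layers):
    """Propagate an input vector through tanh hidden layers and a softmax output."""
    activation = x
    for index, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            activation = [math.tanh(v) for v in z]      # hidden layer
        else:
            peak = max(z)                               # subtract max for stability
            exps = [math.exp(v - peak) for v in z]
            total = sum(exps)
            activation = [e / total for e in exps]      # class probabilities
    return activation

rng = random.Random(0)
sizes = [4, 5, 5, 5, 3]   # input, three hidden layers, output (hypothetical sizes)
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
probs = forward([0.2, -0.1, 0.7, 0.4], layers)
print(probs)
```

The output is a probability for each of three hypothetical classes. Because the weights are untrained, the probabilities carry no meaning beyond showing how an input is transformed layer by layer.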
An example of CI is ant colony optimization, which uses the principles observed in the workings of the ant colony to build intelligent machines. The first person to observe the intelligence and rationality of ants was the South African poet Eugene Marais, who in his seminal book Die siel van die mier unlocked how white ants can build complicated anthills (Marais, 1937). These anthills have complicated tunnels with air conditioning systems, which are far better and more efficient than the mechanical air conditioning systems in our rooms. Examples of these ants and the corresponding anthills are shown in Figs. 1.5 and 1.6. CI requires little data and sometimes no data at all. It has been used successfully for the clustering of machines (Xing et al., 2010a), designing cellular manufacturing layouts (Xing et al., 2010b), cybersecurity (Ranjan et al., 2018), missing data estimation (Leke Betechouoh and Marwala, 2006), and designing storage and retrieval systems (Xing et al., 2010c).
FIG. 1.3 Example of artificial intelligence. [Figure content: artificial intelligence comprising machine learning (including deep learning), soft computing, and computational intelligence.]
FIG. 1.4 A deep neural network with three hidden layers. [Figure content: input layer, hidden layer 1, hidden layer 2, hidden layer 3, output layer.]
FIG. 1.5 Ants that can build complex anthills.
FIG. 1.6 Complex anthill built by ants.
Many other types of CI algorithms exploit group intelligence. One such CI algorithm, which uses the group intelligence of birds, is particle swarm optimization (PSO). It is based on the concept that individual solutions are particles that evolve into a more reliable solution. PSO is an intelligent global optimization algorithm that has been successfully used to tackle complex problems such as improving the accuracy of aircraft models (Mthembu et al., 2011; Boulkaibet et al., 2015) as well as predicting wind energy (Mbuvha et al., 2018).
Soft computing is an AI technique that requires limited data and is used to bring precision to issues such as linguistic variables. An example of soft computing is fuzzy logic. Fuzzy logic is based on vague, imprecise notions that may be true. Put differently, it is the logic of partial degrees of truth. In fuzzy logic, linguistic variables are encoded into the fuzzy domain using fuzzy membership functions, fuzzy rules are generated, the rule outputs are aggregated, and the result is defuzzified. This imitates the entire thought process a human being would have in a decision that would require a yes or a no answer. Another example of soft computing is a neuro-fuzzy system, which is a combination of fuzzy logic and neural networks. In other words, it uses a learning algorithm (neural network) to determine the parameters of a fuzzy system by processing data samples. The difference between fuzzy logic and a neuro-fuzzy system is that the neuro-fuzzy system is more accurate than fuzzy logic, which is, in turn, more transparent, i.e., interpretable, than the neuro-fuzzy system. The general law that governs fuzzy systems is that the more transparent a system is, the less accurate it is, and vice versa.
Marwala and Leke (2019) as well as Marwala (2018) extensively studied aspects of artificial intelligence such as machine learning and optimization in decision-making. Leke and Marwala (2019) and Marwala (2009) used neural networks, support vector machines, fuzzy systems, and deep learning networks for missing data estimation in engineering and biomedical systems. Xing and Marwala (2018a) used smart computing in crowdfunding, whereas Xing and Marwala (2018b) studied smart maintenance for human-robot interaction from an intelligent search and algorithmic perspective. Marwala and Hurwitz (2017) studied the impact of artificial intelligence on economic theory. Marwala et al. (2017) and Marwala (2010) used Bayesian statistics and computational intelligence to improve models of aeronautical and mechanical systems. Artificial intelligence has been used to understand causality, correlation, and rational decision-making (Marwala, 2014a, b, 2015) and to model complex economic phenomena such as interest rates, the stock market, and derivatives (Marwala, 2013a, b). Marwala (2012) used AI to model the condition of mechanical and electrical structures, whereas Marwala and Lagazio (2011) used AI to model interstate conflict. Marwala (2007) used AI to model complex systems such as HIV. The AI methods discussed in this section are summarized in Table 1.1.
Table 1.1 The summary of AI methods.
AI method | Examples | Data requirements | Applications
Machine learning | Neural networks; support vector machines | Lots of data | Prediction; classification
Computational intelligence | Ant colony optimization; particle swarm optimization; genetic algorithm | No data | Optimization
Soft computing | Fuzzy logic; rough sets | Limited data | Prediction; classification
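The fuzzy logic steps described in this section (encoding a linguistic variable with membership functions, applying rules, aggregating, and defuzzifying) can be sketched as follows. This is only an illustration under stated assumptions: the temperature and fan-speed variables, the triangular membership functions in `RULES`, and the weighted-average defuzzification (a simple zero-order Sugeno-style choice) are all hypothetical and are not taken from the book.

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 at a and c, rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Each rule: (linguistic label, membership parameters (a, b, c), output fan speed in %)
# e.g. "IF temperature is hot THEN fan speed is 100".
RULES = [
    ("cold", (-10.0, 5.0, 20.0), 0.0),
    ("warm", (10.0, 20.0, 30.0), 50.0),
    ("hot", (20.0, 35.0, 50.0), 100.0),
]

def fuzzify(temperature):
    """Map a crisp temperature to a degree of truth for each linguistic label."""
    return {label: triangular(temperature, *params) for label, params, _ in RULES}

def infer_fan_speed(temperature):
    """Aggregate the rule outputs and defuzzify with a weighted average."""
    degrees = fuzzify(temperature)
    total = sum(degrees.values())
    if total == 0.0:
        raise ValueError("temperature lies outside all membership functions")
    return sum(degrees[label] * speed for label, _, speed in RULES) / total

print(fuzzify(25.0))
print(infer_fan_speed(25.0))
```

At 25 degrees the input is partly "warm" and partly "hot", so the crisp output falls between the two rule outputs. Every rule can be read directly, which illustrates the transparency that the text attributes to fuzzy systems.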
1.5 Summary of the book
Chapter 2 defines rationality and studies the role of information, the techniques for analyzing such information, and optimization in the formulation of the concept of rationality, which is based on the principle of maximizing utility. Utility is the usefulness of a good or service. The principle of maximizing utility is called utilitarianism, a concept introduced by Jeremy Bentham and extensively studied by John Stuart Mill (Bentham, 1776; Mill and Bentham, 2004). The chapter also studies how to formulate utility and whether there is a unique utility function for a given problem. It concludes that utility formulation is subjective and investigates the implications of this conclusion for the concept of rationality.
Chapter 3 introduces the concept of a rational machine, which is based on the concept of an intelligent machine. It mainly investigates how we can and should build a rational machine and explores how AI can be used to build rational and intelligent machines. The AI concepts considered in this chapter include Bayes decision theory, artificial general intelligence, knowledge representation, the multilayered perceptron, radial basis functions, and support vector machines (Marwala, 2007).
Chapter 4 describes flexibly bounded rationality, which was proposed by Marwala (2013a) as an extension of the theory of bounded rationality. Nobel Laureate Herbert Simon proposed the theory of bounded rationality to characterize the limitation of rationality in humans. Decision-making uses information, which is imperfect and incomplete, and intelligent machines to attempt to make an optimized decision. Because of the limitations of the information and of the intelligent machines, this decision-making process is rationally bounded, or limited. In machine decision-making, big data analytics expands the amount of information used, while advances in AI through technologies such as deep learning and advances in computer processing power due to Moore’s law expand the bounds within which rationality can be exercised. This is flexibly bounded rationality.
Chapter 5 studies rational expectations, a theory that improves on the theory of adaptive expectations. Expectations are estimates of the future values of variables. For example, in economics, we might be interested in the interest rate of a country over the next 3 years. The theory of rational expectations states that the expectation of future values of variables (in this case, the interest rates of South Africa over the next 3 years) cannot be consistently wrong because agents are able to incorporate all the information at their disposal to correct future expectations. This chapter studies the theory of rational expectations within the context of an AI machine.
Chapter 6 studies the theory of rational choice, which assumes that when people make decisions, they aim to maximize their utility. To attain this objective, they ought to utilize all available information and consider all the choices available in order to select an optimal choice. This chapter studies what happens when artificially intelligent agents, rather than human beings, make decisions. In that case, the expectations of the future (predictions of the future) are more consistent, and the decisions are more rational.
Chapter 7 studies bounded rational counterfactuals, which maximize the attainment of the desired consequent. The theory of rational counterfactuals identifies the antecedent that gives the desired consequent necessary for rational decision-making. We apply rational counterfactuals and artificial intelligence to practical problems.
Chapter 8 introduces the rational opportunity cost, which is the most optimal opportunity cost forgone when an agent makes choices. A rational choice is one that gives a higher utility than the other choices available. Rational opportunity costs arise when more than one choice gives identical utility, and consequently, some of the choices forgone could have been rationally chosen as they have the same utility as those chosen. This chapter studies the implications of rational opportunity cost in the era of advances in artificially intelligent machines.
Chapter 9 studies machine rationality and explores whether machines are more rational than human beings. It observes the real reasons why humans are not rational: information is imperfect and limited, the processing power of the brain is limited and inconsistent, and decisions are not optimized. It investigates whether these limitations of humans carry over to machines.
Chapter 10 studies whether we can compute and quantify rationality. Rationality is defined as the application of complete information, which is executed using a perfect biological or physical brain, in an optimized manner. To compute rationality, one needs to quantify the completeness of the information that is used for decision-making, the perfection of the physical (for machines) or biological (for humans) brain, and the extent of the optimization of the entire decision-making system. We measure the rationality of the model (i.e., the physical or biological brain) by its expected accuracy. Furthermore, we measure the rationality of the optimization procedure as the ratio of the achieved objective (i.e., utility) to the global objective. We quantify the overall rationality of a decision by multiplying the rationality of the model by the rationality of the optimization procedure.
In Chapter 11, we explore rational decision-making and whether machine rationality is subjective. Rational decision-making is the practice of making logical decisions. A rational agent processes all relevant information optimally to attain its objective. Rationality has two foundations: the usage of relevant information and the efficient processing of such information. In reality, the relevant information is incomplete and imperfect, and the processing engine, which is a brain for humans, is suboptimal. Humans are risk-averse rather than utility maximizers. Real-world problems are mostly nonconvex, and this makes the idea of rational decision-making fundamentally unachievable (Bertsekas, 2015). There is a trade-off between the quantity of information used for decision-making and the complexity of the decision model used.
Chapter 12 studies collective decision-making by a group of intelligent machines and compares this to decision-making by individual intelligent machines. In particular, it explores whether a group of intelligent machines is more rational than individual intelligent machines. It applies this to problems in society that involve voting and suggests how society should be structured to increase aggregate rationality.
Chapter 13 compares machine rationality to human rationality. In behavioral economics, researchers such as Kahneman, Tversky, and Thaler extensively studied the rationality of humans and how humans, consequently, make decisions. This chapter compares human and machine rationality and studies how the differences affect the decision-making of machines when compared to humans.
In Chapter 14, we explore the impact of AI on the efficiency of the market. The chapter also studies theories that influence market efficiency, how advances in AI change them, and how they affect market efficiency. It surmises that advances in artificial intelligence and its applications in financial markets make markets more efficient.
In Chapter 15, we explore the difference between human and machine ethics. The discipline of ethics falls in the category of normative laws. For example, the Ten Commandments in the Bible are normative rather than natural and scientific laws. This is because, as human beings, we can choose not to obey the Ten Commandments, whereas we cannot choose not to obey the force of gravity, which is a natural law.
Chapter 16 concludes the book, outlining the critical lessons learned and describing outstanding issues.
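The Chapter 10 measure summarized above can be written compactly. The symbols $R_{\text{model}}$, $R_{\text{opt}}$, $U_{\text{achieved}}$, and $U_{\text{global}}$ are illustrative names introduced here rather than the book's notation, and the numbers in the example are hypothetical:

```latex
% Rationality of the optimization procedure: achieved utility relative to the global optimum
% Overall rationality: product of model rationality and optimization rationality
\begin{align*}
R_{\text{opt}} &= \frac{U_{\text{achieved}}}{U_{\text{global}}}, \\
R &= R_{\text{model}} \times R_{\text{opt}}.
\end{align*}
% Hypothetical example: expected model accuracy 0.9, achieved utility 80 against a global optimum of 100
\[
R = 0.9 \times \frac{80}{100} = 0.72
\]
```

Under this reading, a decision is fully rational ($R = 1$) only when the model is perfectly accurate and the optimizer reaches the global optimum.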
References Amos, S.W., James, M.R., 1999. Principles of Transistor Circuits. Butterworth-Heinemann. Ascher, A., 2014. The Russian Revolution: A Beginner’s Guide. Oneworld Publications. Bentham, J., 1776. A Fragment on Government. London. Preface (2nd para). Bertsekas, D.P., 2015. Convex Optimization Algorithms. Athena Scientific, Belmont, MA. Bishop, C., 1995. Neural Networks for Pattern Recognition. Oxford University Press. Boulkaibet, I., Mthembu, L., de Lima Neto, F.B., Marwala, T., 2015. Finite element model updating using fish school search and volitive particle swarm optimization. Integr. Comput. Aided Eng. 22 (4), 361–376. Cameron, E., 2012. The European Reformation. Oxford University Press. Chaturvedi, D.K., 2008. Soft Computing: Techniques and Its Applications in Electrical Engineering. Springer. Darrigol, O., 2000. Electrodynamics From Ampère to Einstein. Oxford University Press, New York. Doorsamy, W., Paul, S., Marwala, T., 2020. The Disruptive Fourth Industrial Revolution. Springer, London. Ferro, M., 1995. Nicholas II: Last of the Tsars. Translated by Pearce, Brian, Oxford University Press. Freeman, P., 2008. Julius Caesar. Simon and Schuster. Fuhrmann, J.T., 1990. Rasputin: A Life. Praeger Frederick. Klein, M., 2008. The technological revolution. The Newsletter of Foreign Policy Research Institute 13 (18). Kuhn, T.S., 1962. The Structure of Scientific Revolutions. University of Chicago Press, Chicago, IL. Leke Betechouoh, B., Marwala, T., 2006. Ant colony optimization for missing data estimation. In: Proceeding of the Pattern Recognition of South Africa, pp. 183–188. Leke, C.A., Marwala, T., 2019. Deep Learning and Missing Data in Engineering Systems. Springer, London. Mandel, E., 1977. The Formation of the Economic Thought of Karl Marx. Monthly Review Press, New York. Marais, E., 1937. The Soul of the White Ant (first published as Die Siel van die Mier in 1925, in Afrikaans). Discovery Miles Publisher. Marwala, T., 2007. 
Computational Intelligence for Modelling Complex Systems. Research India Publications, Delhi.
Marwala, T., 2009. Computational Intelligence for Missing Data Imputation, Estimation, and Management: Knowledge Optimization Techniques. IGI Global, Pennsylvania. Marwala, T., 2010. Finite Element Model Updating Using Computational Intelligence Techniques: Applications to Structural Dynamics. Springer, Heidelberg. Marwala, T., 2012. Condition Monitoring Using Computational Intelligence Methods. Springer, Heidelberg, ISBN: 978-1-4471-2380-4. Marwala, T., 2013a. Flexibly-Bounded Rationality and Marginalization of Irrationality Theories for Decision Making. arXiv:1305.6037. Marwala, T., 2013b. Economic Modeling Using Artificial Intelligence Methods. Springer, Heidelberg. Marwala, T., 2014a. Rational Counterfactuals. arXiv:1404.2116. Marwala, T., 2014b. Artificial Intelligence Techniques for Rational Decision Making. Springer, Heidelberg. Marwala, T., 2015. Causality, Correlation, and Artificial Intelligence for Rational Decision Making. World Scientific, Singapore. Marwala, T., 2018. Handbook of Machine Learning: Foundation of Artificial Intelligence. vol. 1 World Scientific Publication. Marwala, T., 2020. Closing the Gap: The Fourth Industrial Revolution in Africa. MacMillan, Johannesburg. Marwala, T., Hurwitz, E., 2017. Artificial Intelligence and Economic Theory: Skynet in the Market. Springer. Marwala, T., Lagazio, M., 2011. Militarized Conflict Modeling Using Computational Intelligence. Springer, Heidelberg. Marwala, T., Leke, C.A., 2019. Handbook of Machine Learning: Optimization and Decision Making. vol. 2 World Scientific Publication. Marwala, T., Boulkaibet, I., Adhikari, S., 2017. Probabilistic Finite Element Model Updating Using Bayesian Statistics: Applications to Aeronautical and Mechanical Engineering. John Wiley and Sons. Mbuvha, R., Boulkaibet, I., Marwala, T., de Lima Neto, F.B., 2018. A hybrid GA-PSO adaptive neuro-fuzzy inference system for short-term wind power prediction. In: Tan, Y., Shi, Y., Tang, Q. (Eds.), Advances in Swarm Intelligence. 
ICSI 2018. Lecture Notes in Computer Science, vol. 10941. Springer, Cham. Mill, J.S., Bentham, J., 2004. In: Ryan, A. (Ed.), Utilitarianism and Other Essays. Penguin Books, London. Mthembu, L., Marwala, T., Friswell, M.I., Adhikari, S., 2011. Finite element model selection using particle swarm optimization. In: Conference Proceedings of the Society for Experimental Mechanics Series, 1, vol. 13. Dynamics of Civil Structures, vol. 4. Springer, London, pp. 41–52. Peyvandi, F., Garagiola, I., Young, G., 2016. The past and future of haemophilia: diagnosis, treatments, and its complications. Lancet 388 (10040), 187–197. Planck, M., 1923. Treatise on Thermodynamics, third English edition translated by A. Ogg from the seventh German edition. Longmans, Green & Co., London. Ranjan, A., Selvaraj, R., Kuthadi, V.M., Marwala, T., 2018. Stealthy attacks in MANET to detect and counter measure by ant colony optimization. In: Lecture Notes in Electrical Engineering, vol. 443. Springer, Singapore. Rescher, N., 1964. Hypothetical Reasoning. North Holland Pub Co., Amsterdam. Rutkowski, L., 2008. Computational Intelligence: Methods and Techniques. Springer. Schwab, K., 2017. The Fourth Industrial Revolution. Crown Publishing Group, New York.
Simon, H., 1991. Bounded rationality and organizational learning. Organ. Sci. 2 (1), 125–134. Tversky, A., Kahneman, D., 1989. Rational choice and the framing of decisions. In: Karpak, B., Zionts, S. (Eds.), Multiple Criteria Decision Making and Risk Analysis Using Microcomputers. NATO ASI Series (Series F: Computer and Systems Sciences), vol. 56. Springer, Berlin, Heidelberg. Warnes, D., 1999. Chronicle of the Russian Tsars. Thames and Hudson Ltd, London. Weissman, D., 2012. Cage, The: Must, Should, and Ought From Is. SUNY Press. Xing, B., Marwala, T., 2018a. Smart Computing in Crowdfunding. CRC Press (Taylor and Francis), London. Xing, B., Marwala, T., 2018b. Smart Maintenance for Human–Robot Interaction: An Intelligent Search Algorithmic Perspective. Springer, London. Xing, B., Gao, W.J., Nelwamondo, F.V., Battle, K., Marwala, T., 2010a. Part-machine clustering: the comparison between adaptive resonance theory neural network and ant colony system. In: Lecture Notes in Electrical Engineering. Springer, Berlin, Heidelberg, pp. 747–755. Xing, B., Gao, W.J., Nelwamondo, F.V., Battle, K., Marwala, T., 2010b. Two-stage inter-cell layout design for cellular manufacturing by using ant colony optimization algorithms. In: Lecture Notes in Computer Science. Springer, pp. 281–289. Xing, B., Gao, W.J., Nelwamondo, F.V., Battle, K., Marwala, T., 2010c. Ant colony optimization for automated storage and retrieval system. In: Proceedings of the IEEE Conference Evolutionary Computation, pp. 1133–1139.
CHAPTER 2
What is machine vs human rationality?
2.1 Introduction
This chapter studies what rationality is. In particular, it studies the role of information, the mechanism of analyzing such information, and optimization in the formulation of the concept of rationality, which is based on the principle of maximizing the net utility (Marwala, 2014, 2015; Muller and Karpas, 2019). Furthermore, it studies the concept of utility, utility formulation, and the uniqueness of utility function formulation for a given problem. It concludes that utility formulation is subjective and studies the implications of this conclusion for the concept of rationality.
In Chapter 1, we studied the decision-making approach of Tsar Nicholas II and deemed it irrational (Lieven, 1993). Why was Nicholas II irrational in his war with Japan, in his decision to be on the front line in World War I, and in allowing Rasputin to capture state power? To answer these questions, we first need to understand what rationality is and what it achieves. Entirely rational agents efficiently achieve their goals by maximizing their net utility, which takes into account the actions undertaken to achieve a particular goal. In this chapter, we discuss what maximization and utility are. Nicholas II did not optimize (maximize) the attainment of his goals. Even if he had achieved his goals, he would not have been fully rational had he not achieved them efficiently.
The difficulty of a complicated life, like the one that Tsar Nicholas II was supposed to lead, is that utility functions are multidimensional. A utility function can be summarized as a function or mathematical equation that describes the goal that an agent seeks to achieve. For example, in Nicholas II’s war with Japan, his overall utility function was to win the war in the shortest period with minimal resources. Winning the war is a utility function on its own, as is the period it takes to win the war, and as is the use of the minimum amount of resources.
Therefore, the overall utility function is the combination of these three utility functions. Combining these three objective functions is subjective. For example, Nicholas II can decide that he wants to win the war at all costs. In this regard, he will have one objective function, which is to win the war even if it impoverishes his country for the next thousand years. The following section will describe what a utility is.
Rational Machines and Artificial Intelligence. https://doi.org/10.1016/B978-0-12-820676-8.00020-X Copyright © 2021 Elsevier Inc. All rights reserved.
2.2 What is utility?
Utility is the usefulness of a solution (Fishburn, 1970). For instance, Tshilidzi is an engineer at a company called Bearing Incorporated, and he is given the goal of creating a computer model of a bridge using software. In this regard, Tshilidzi’s goal is to develop a model of a bridge that behaves as closely to the physical bridge as possible. The closer his model is to the actual physical bridge, the more rational he is in modeling this bridge, and consequently, the more rational the model is. In this regard, we understand the utility or the usefulness of this model as a measure of the accuracy of this model. Mathematically, we can quantify the error of the model by using the distance between the model and the measured data as follows:
$$E = \sum_{i=1}^{\infty} \lVert M - D_i \rVert \tag{2.1}$$
Here, $E$ is the error, $M$ is the model, $D_i$ is the $i$th data set, and $\lVert \cdot \rVert$ is the Euclidean norm (Trèves, 1995). Eq. (2.1) allows for infinitely many data sets $D_i$ that one can extract from a structure (in our context, a bridge) to improve the accuracy of the bridge model. From Eq. (2.1), the lower the distance between the model and the data, the more useful (higher utility) the model is. Conversely, the higher the distance between the model and the data, the less useful (lower utility) the model is. Thus, we can quantify the utility ($U$) from the error in Eq. (2.1) as follows:
$$U = -E = -\sum_{i=1}^{\infty} \lVert M - D_i \rVert \tag{2.2}$$
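The utility defined above can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers are hypothetical, the model and each data set are assumed to be plain vectors of equal length, and the sum runs over a finite list of data sets rather than to infinity.

```python
import math

def euclidean_distance(model, data):
    """Euclidean norm of the difference between two equal-length vectors."""
    return math.sqrt(sum((m - d) ** 2 for m, d in zip(model, data)))

def utility(model, data_sets):
    """Negative summed distance between the model and each data set.

    Follows the form of Eq. (2.2), truncated to a finite number of data sets.
    """
    error = sum(euclidean_distance(model, d) for d in data_sets)
    return -error

# Hypothetical numbers: one model prediction and two measured data sets.
model_prediction = [1.0, 2.0, 3.0]
measurements = [[1.0, 2.0, 3.0], [1.0, 2.0, 7.0]]
print(utility(model_prediction, measurements))
```

A model that matches every data set exactly would reach the maximum utility of zero; any mismatch makes the utility negative.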
What are some of the difficulties with the formulation of utility functions? The first difficulty is that it is challenging to formulate a utility function, as one must make subjective decisions such as how to quantify the distances, what data to use, and how these data should be processed. Secondly, without the aid of a machine, even a moderately complex problem is hard, because a human being cannot formulate the utility function precisely. Human decision-making lacks a precise formulation of a utility function and instead uses heuristics to estimate it (Kahneman et al., 1982; Kahneman, 2011). Thirdly, when a utility function is a combination of many different utility functions, such as in Eq. (2.1), which combines different data types, it is not clear how to weigh each utility function in the formulation of the overall utility function. The consequence of all these issues is that utility functions are often incomplete and subjective, as there are multiple ways of formulating them. The incompleteness of the utility function, in turn, makes it difficult for a human, a machine, or a human-machine system to make fully (unboundedly) rational decisions. There are two types of utility functions: the static utility function and the dynamic utility function.
2.2.1 Static utility function
The static utility function is the type of utility in which the target or the goal does not change as a function of time. For example, the utility function in Eq. (2.2) is static in the sense that the target data (i.e., the goal) that we are making the model predict does not change as a function of time. As another example, if we direct a missile at a bridge, the utility function will be based on the distance between the Global Positioning System (GPS) coordinates of the bridge and the GPS coordinates of the missile, and this can be written mathematically as follows:
$$U = -E = -\sum \lVert M_c(U) - B_c \rVert \tag{2.3}$$
Here, $M_c$ stands for the missile GPS coordinates, $B_c$ stands for the bridge GPS coordinates, and the $U$ inside $M_c(U)$ stands for the control schedule of the missile (it shares its symbol with the utility on the left-hand side but is a different quantity). So, in this problem, the missile will actuate its control mechanism to ensure that it hits the target. In this regard, the target, which is the bridge, is stationary. Therefore, the static utility function is a function where the target data or the goal does not change.
2.2.2 Dynamic utility function
The dynamic utility function is a utility function in which the target or goal changes as a function of time. An example of this is a scenario where a missile is trying to hit a moving target, such as a moving vehicle whose GPS coordinates at time $t$ are written $V_c(t)$. In this regard, the utility function can be written as follows by adapting Eq. (2.3):
$$U = -E = -\sum \lVert M_c(U) - V_c(t) \rVert \tag{2.4}$$
Naturally, a dynamic utility function is more challenging to model than a static one. The mathematics of handling the dynamic utility function is called dynamic programming, and it was studied quite extensively by Bellman (1957a,b). Now that we understand what utility is, the next section looks at how to define rationality.
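The difference between a static and a dynamic target can be illustrated with a short Python sketch. Several things here are assumptions made for illustration only: positions are treated as planar (x, y) points rather than GPS coordinates, the sum in Eqs. (2.3) and (2.4) is taken over discrete time steps, and all trajectories are hypothetical.

```python
import math

def distance(p, q):
    """Planar Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def static_utility(missile_path, target):
    """Negative summed distance to a fixed target, in the form of Eq. (2.3)."""
    return -sum(distance(m, target) for m in missile_path)

def dynamic_utility(missile_path, target_path):
    """Negative summed distance to a target that moves at every time step,
    in the form of Eq. (2.4)."""
    return -sum(distance(m, v) for m, v in zip(missile_path, target_path))

# Hypothetical trajectories sampled at three time steps.
missile = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
bridge = (6.0, 8.0)                                # stationary target
vehicle = [(6.0, 8.0), (6.0, 8.0), (9.0, 12.0)]    # target moves at the last step

print(static_utility(missile, bridge))
print(dynamic_utility(missile, vehicle))
```

With the same missile path, the moving target yields the lower utility, because the missile arrives where the vehicle used to be.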
2.3 What is rationality?
Conventionally, a rational agent maximizes utility to achieve its goal. The problem with this formulation is that it does not consider the cost of achieving this goal. Therefore, more correctly, a rational agent must maximize its net utility. The net utility is the utility at state (i.e., position) t + 1 minus the utility at state t, less the utility (cost) of the actions needed to move from state t to t + 1. Kahneman and Tversky introduced the concept of net utility in decision-making (Kahneman, 2011). For instance, they observed that if John invests $1 million in stock and the value of this stock deteriorates to $100,000, and Peter invests $1000 in stock and the value of this stock increases to $100,000, Peter would be happier than John would. This is because Peter’s net utility is +$99,000, whereas John’s net utility is −$900,000. This is true even though both end with the same utility of $100,000. In this regard, Kahneman and Tversky observed that the utility of a state is not as important as the net utility.

The concept of maximization falls under the mathematical field of optimization (Lee, 2004). In optimization, the hardest problem to untangle is that there are some types of problems where one is not sure whether one has sufficiently optimized (Bertsekas et al., 2003). These problems are nonconvex. For these types of problems, even assuming we have identified an appropriate and perfect utility function, we are not able to know whether we are fully rational. Therefore, using this definition, if Tshilidzi is supposed to maximize the utility function in Eq. (2.2), less the utility of the actions he is supposed to undertake to achieve his goals, what then is the utility of the actions?
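The definition of net utility used in this section can be written compactly. The notation below ($s_t$ for the state, $a_t$ for the action, $C$ for the cost of the action) is introduced here only for illustration and does not come from the original text; the investment example is evaluated with the amount invested taken as the starting utility and no separate action cost.

```latex
U_{\text{net}} = U(s_{t+1}) - U(s_t) - C(a_t)

% Peter: 100{,}000 - 1{,}000 - 0 = +99{,}000
% John:  100{,}000 - 1{,}000{,}000 - 0 = -900{,}000
```

Both investors end in a state with the same utility, $100,000, yet their net utilities have opposite signs, which is the point of the example.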
2.4 Utility of actions
Suppose Khathu is a traveling salesman who sells golf clubs in South Africa. Suppose the golf clubs sell for different prices in different cities, as illustrated in Table 2.1.

Table 2.1 Traveling salesman expected utility.
City              Price ($)   Expected sold quantity   Expected utility ($)
Johannesburg      100         100                      10,000
Pretoria          120         90                       10,800
Port Elizabeth    93          70                       6510
Cape Town         80          105                      8400
Soweto            102         40                       4080

Suppose Khathu is based in Soweto and must rationally decide whether to stay in Soweto or to move. Suppose moving his merchandise from Soweto to Cape Town costs $4000, Soweto to Port Elizabeth $3000, Soweto to Pretoria $2000, and Soweto to Johannesburg $1000. Where should he move to? To decide, he needs to establish the utility of his actions (i.e., the expected cost of each move). At any given location, he has a utility at that state. The term state here refers to a location, and the utility at that state is the expected utility at that location. State-space analysis is an established field in a discipline called control systems (Friedland, 2005; Hinrichsen and Pritchard, 2005). Table 2.2 presents the state-space analysis of moving from Soweto to the various cities.

Table 2.2 State-space analysis of a move from Soweto to the other cities.
Attribute            Soweto   Johannesburg   Pretoria   Port Elizabeth   Cape Town
Utility at state 1   4080     –              –          –                –
Utility at state 2   –        10,000         10,800     6510             8400
Utility of action    –        1000           2000       3000             4000
Net utility          –        4920           4720       −570             320

The analysis shows that he should move from Soweto to Johannesburg, where the net utility of $4920 is the highest. Any other move results in a lower net utility and is therefore not the rational choice. We also see that moving from Soweto to Johannesburg is more rational than moving to Pretoria, which is more rational than moving to Cape Town, which is more rational than moving to Port Elizabeth. This analysis indicates that rationality can indeed be quantified, a topic that will be explored later in the book.
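The net utilities in Table 2.2 can be reproduced with a short Python sketch. The figures come from Tables 2.1 and 2.2 and from the moving costs quoted in the text; the variable and function names are hypothetical.

```python
# Prices, expected quantities, and moving costs (in dollars) from the text.
cities = {
    "Johannesburg": {"price": 100, "quantity": 100, "move_cost": 1000},
    "Pretoria": {"price": 120, "quantity": 90, "move_cost": 2000},
    "Port Elizabeth": {"price": 93, "quantity": 70, "move_cost": 3000},
    "Cape Town": {"price": 80, "quantity": 105, "move_cost": 4000},
}
soweto_utility = 102 * 40  # utility at state 1 (staying in Soweto)

def net_utility(city):
    """Utility at state 2, minus utility at state 1, minus the cost of moving."""
    expected = city["price"] * city["quantity"]
    return expected - soweto_utility - city["move_cost"]

net = {name: net_utility(info) for name, info in cities.items()}
ranking = sorted(net, key=net.get, reverse=True)

print(net)      # net utility per destination
print(ranking)  # destinations from most to least rational
```

Staying in Soweto corresponds to a net utility of zero, so only Port Elizabeth, with a negative net utility, is worse than not moving at all.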
2.5 State and action utility measurement
The net utility in the previous section was easy to calculate because the utilities at the different states and the utility of action were all in dollars. For many problems, as we shall see in later sections, the utility of the states and the utility of the actions are in different measurement units. For example, the utility of the state can be in terms of percentage error, whereas the actions can be in hours. In this situation, it is essential to convert the respective units into a uniform unit, and this is very difficult to achieve. Moreover, how we achieve this is often subjective and thus makes the whole concept of utility subjective. Because of limitations such as the subjectivity of the net utility, the limited efficiency of processing information, and the mathematical difficulty of optimization, unbounded or unlimited rationality does not exist. This is what Nobel Laureate Herbert Simon called the theory of bounded rationality, which we describe in the next section (Simon, 1991).
2.6 Bounded rationality
One description of rationality involves the notion of making an optimized decision. In practice, this is impossible because of limits on the accessibility of the information necessary to make the decision and the inadequacy of any device for analyzing all of that information (Simon, 1990). These limitations give rise to the theory of bounded rationality, which was first introduced by Nobel Laureate Herbert Simon (Simon, 1957). The implications of bounded rationality for economic, political, and social systems are substantial (Simon, 1991). The theory of bounded rationality does not replace the theory of rationality but only puts constraints on its applicability, as illustrated in Fig. 2.1.
FIG. 2.1 The theory of bounded rationality. [Diagram: Data (incomplete, imperfect) → Model (inconsistent, imperfect) → Decision.]
Fig. 2.1 demonstrates that in rational decision-making, the data we use to make such decisions are imperfect and incomplete, and the model that takes in the data and transforms them into decisions is also imperfect. If this model is a human brain, then it is inconsistent because it is subject to factors such as biases and heuristics. Simon coined the term satisficing, a combination of the words satisfying and sufficing (Brown, 2004), for accepting a decision that is good enough rather than optimal. Thus, bounded rationality is the act of making optimized decisions under the constraints of incomplete and imperfect information using an inconsistent and imperfect model. The theory of bounded rationality has been extended into the theory of flexibly bounded rationality, which is described later in this book (Marwala, 2015).
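The contrast between satisficing and optimizing can be sketched in Python. This is a minimal illustration under stated assumptions: satisficing is modeled as accepting the first option that meets an aspiration level, the net utilities are borrowed from Table 2.2, and the order in which the options are examined and the aspiration level of 4000 are hypothetical.

```python
def satisfice(options, aspiration):
    """Return the first option whose value meets the aspiration level, else None."""
    for name, value in options:
        if value >= aspiration:
            return name
    return None

def optimize(options):
    """Return the option with the highest value after examining every option."""
    return max(options, key=lambda option: option[1])[0]

# Net utilities from Table 2.2, in a hypothetical order of examination.
options = [("Cape Town", 320), ("Pretoria", 4720), ("Johannesburg", 4920)]

print(satisfice(options, aspiration=4000))  # stops at the first good-enough option
print(optimize(options))                    # examines everything
```

The satisficer stops early and never examines Johannesburg, which is how a bounded agent trades decision quality for a lower cost of searching.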
2.7 Seeking a bounded rational model
In this section, we return to the example of Tshilidzi trying to fit a mathematical model to measured data to create a computer model of a bridge. For Tshilidzi to increase his degree of rationality, he should account for the cost of all the actions he takes to create such a model. However, in executing such actions, Tshilidzi must contend with several limitations. The first limitation is that Tshilidzi does not have access to all the measurable and unmeasurable data from the actual bridge. Secondly, the utility function in Eq. (2.2) that Tshilidzi has formulated is just one of many possible utility functions. One example of an alternative utility function is the following:
$$U = -E = -\sum_{i=1}^{\infty} \gamma_i \lVert M - D_i \rVert \tag{2.5}$$
Here, $\gamma_i$ is the weight assigned to the $i$th type of data. Of course, we cannot possibly access all the data, and as a result, this whole process of Tshilidzi trying to build a model that is as close to the measured data as possible and accounts for the cost of all his actions is an instance of bounded rationality. Thus, we must truncate the amount of data he can use, and as a result the summation in Eq. (2.5) is limited to a finite amount of data and not to infinity. The model of a bridge considered by Tshilidzi is shown in Fig. 2.2 (Marwala, 1997). In Fig. 2.2, Marwala (1997) took measurements at the positions indicated, and he processed the data in two ways: firstly using modal analysis, and secondly using the frequency response function. It should be noted that there are many ways in which the data can be represented, and these have been covered extensively by Marwala (2012, 2013).

FIG. 2.2 A beam which represents a bridge Tshilidzi is modeling. [Diagram labels: Element 1, Element 11.]

The model of a bridge is improved to better reflect the measured data by identifying the parameters that are in doubt. This is achieved by constructing an objective function that measures the distance between the measured data and the model of this bridge (Marwala, 1997). Two types of data were used: modal data and frequency response functions (FRF). The details can be found in a book by Marwala (2010). The results of the improved model that Tshilidzi obtained are shown in Table 2.3. In this table, the improved model was obtained using three sets of data: modal data, FRF, and the combination of the two. These results indicate that the combined sets of data give the best results, followed by the modal data.

Table 2.3 Predicted data obtained using improved models.
Mode   Measured frequency (Hz)   Initial frequency (Hz)   Frequencies from the FRF (Hz)   Frequencies from the modal properties (Hz)   Updated combined data (Hz)
1      41.5                      42.3                     42.1                            41.9                                         41.6
2      114.5                     117.0                    116.8                           111.2                                        115.4
3      224.5                     227.3                    222.3                           226.7                                        225.7
4      371.6                     376.9                    368.8                           374.9                                        371.1

How would Tshilidzi have performed if he had decided to improve this model without the use of any computational model? He would not have been able to achieve the results that he did when he used a computational model, because human beings are not wired to compute multidimensional and sophisticated systems. Tshilidzi, as a human agent, has not evolved to compute precisely. For Tshilidzi to make this decision precisely, he should maximize the net utility. Here, the net utility will be the difference between the accuracy of the model and the costs incurred to build the model, which can be expressed in a regular market-based currency. The problem is that the measurement units for the value of the model (accuracy of the model) and the cost of the model (in hard currency) are incompatible, so the net utility is difficult to calculate. One way of fixing this is to convert the value of the model into hard currency and then use the derived value together with the cost of the model (in hard currency) to calculate the net utility.
If we ignore the cost of the model, Table 2.3 shows that the combined model is more valuable and thus more rational than the model based on modal data, which is, in turn, more valuable than the model that uses the frequency response data. This indicates that indeed, there are levels of rationality, an issue that will be explored later in the book.
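As an illustration of how the models in Table 2.3 might be compared, the sketch below sums the absolute frequency errors over the four modes. The error measure is an assumption chosen here for simplicity and is not necessarily the objective function used in the cited work; the variable names are hypothetical.

```python
# Natural frequencies in Hz for modes 1 to 4, taken from Table 2.3.
measured = [41.5, 114.5, 224.5, 371.6]
models = {
    "initial": [42.3, 117.0, 227.3, 376.9],
    "FRF": [42.1, 116.8, 222.3, 368.8],
    "modal": [41.9, 111.2, 226.7, 374.9],
    "combined": [41.6, 115.4, 225.7, 371.1],
}

def total_abs_error(predicted):
    """Sum of absolute differences between predicted and measured frequencies."""
    return sum(abs(p - m) for p, m in zip(predicted, measured))

errors = {name: round(total_abs_error(freqs), 1) for name, freqs in models.items()}
best = min(errors, key=errors.get)

print(errors)
print(best)
```

On this frequency-only measure the combined model has by far the smallest error and every updated model improves on the initial one. The ordering of the two single-data models is sensitive to the measure chosen: here the FRF-based model scores slightly better than the modal-based one, so the ranking given in the text presumably rests on criteria beyond these four frequencies. This sensitivity is itself an example of the subjectivity of utility formulation.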
FIG. 2.3 Pictures of clay pots.
2.8 Human rationality: Clay pots
Tshilidzi’s grandmother Vho-Tshianeo was a rural engineering teacher from a village called Duthuni in South Africa. Vho-Tshianeo knew how to predict the failure of structures before it occurred. She would often look at the sky and predict whether it would rain or not based on the color of the clouds. Vho-Tshianeo also used to make clay pots, which are shown in Fig. 2.3. The art of making clay pots is rich in lessons from fields such as supply chain management, metallurgy, applied mathematics, thermodynamics, and AI (Minsky, 1967, 2006; Russell and Norvig, 2009; Reardon, 2011). The process of making clay pots serves as a guide on how to monitor the integrity of any structure.

Firstly, one needs to identify the source of good clay, which requires knowledge of materials science. In engineering there is an entire field called materials selection, which looks at how to optimally select materials (Ashby, 1999, 2005; George, 1997). The clay is then delivered to the manufacturing place, where it is processed and formed into pots. This involves three-dimensional visualization and the ability to form shapes. Optimally, Vho-Tshianeo could have used computer-aided design to create the design for these pots, which would have been more precise than her heuristic approach (Narayan, 2008). Then the pots are put in the sun so that they can dry. After they are dried, a furnace is created where these pots are baked until they glow red from the heat of the fire. This requires knowledge of thermodynamics (Bejan, 2016). With thermodynamics, we can study the optimal way of baking these pots, but Vho-Tshianeo used heuristics to decide how much heat was needed to bake these pots. Then the fire is extinguished and the pots are cooled down slowly. Cooling the pots slowly is a process called annealing, which is studied in metallurgical engineering (Van Vlack, 1985).

Annealing is such a powerful concept that there is an optimization algorithm called simulated annealing, which accepts candidate solutions with a probability that depends on a gradually lowered temperature and thereby searches for a near-optimal solution. This has been used to solve problems such as the traveling salesman problem and scheduling (Černý, 1985; Falk et al., 2006). Vho-Tshianeo was able to understand this annealing process without the need to know the Boltzmann equation, which was formulated by the Austrian scientist Ludwig Boltzmann (Gressman and Strain, 2011). To understand simulated annealing, it is essential to understand the Boltzmann equation. Vho-Tshianeo had no understanding of the Boltzmann equation, yet she understood the practical side of annealing. This notion of knowing a concept, such as the practical aspect of annealing, without knowing the theory behind it (i.e., the Boltzmann equation) is what the Italian philosopher Antonio Gramsci called organic intellectualism (Gramsci, 1982).

It is the last stage of making clay pots that is instrumental in allowing us to predict the failure of buildings before it occurs. Vho-Tshianeo would take each clay pot, tap it, and listen to the ringing sound, and based on this, she was able to tell whether the pot was good or bad. If the pot rang for a long time, it meant it was baked well and was a good pot, but if it rang for a short time, then it was a bad pot. In vibration engineering, a structure that rings for a long time is called a lightly damped structure, and one that rings for a short time is called a heavily damped structure (Tongue, 2001). Therefore, how long the pot rings indicates whether there are pockets of air trapped inside the walls of the pot. This process of using sound to test whether the pot is good or bad is what engineers call nondestructive testing (Hellier, 2003). This procedure is routinely applied in aerospace engineering to assess whether airplanes have cracks in their bodies, and thereby to save lives.
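The idea behind simulated annealing can be shown with a minimal Python sketch. It is an illustration under stated assumptions rather than the method of the cited works: the toy cost function, the neighbor move, the geometric cooling schedule, and all parameter values are hypothetical, and worse moves are accepted with the commonly used Boltzmann-type probability exp(-delta / t). Like any simulated annealing run, it is not guaranteed to find the global optimum.

```python
import math
import random

def simulated_annealing(cost, start, neighbor, t_start=10.0, t_end=1e-3,
                        cooling=0.95, seed=0):
    """Minimize cost() by random moves whose acceptance depends on a temperature."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with probability
        # exp(-delta / t), which shrinks as the temperature t falls.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= cooling  # slow cooling, analogous to letting the pots cool slowly
    return best, best_cost

# Toy problem (hypothetical): a one-dimensional function with several local minima.
def bumpy(x):
    return x * x + 10.0 * math.sin(x)

def step(x, rng):
    return x + rng.uniform(-1.0, 1.0)

best_x, best_cost = simulated_annealing(bumpy, start=8.0, neighbor=step)
print(best_x, best_cost)
```

At high temperature the search wanders freely, including uphill; as the temperature falls it settles into a low-cost region. Cooling too quickly tends to trap the search in a poor local minimum.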
This process teaches us, “Everything has something to say, all we need to do is to know how to listen to it.” In effect, what Vho-Tshianeo was saying was that the clay pot could tell us, through the sound it made after being tapped, whether it was of good quality or not. As Vho-Tshianeo was growing old, she frequently threw away good pots. This was because her hearing was deteriorating. Thus, as far as detecting the structural integrity of a pot is concerned, Vho-Tshianeo was becoming more irrational because of her deteriorating hearing. Irrationality is usually not a deliberate act but an involuntary one. The process that Vho-Tshianeo followed is shown in Fig. 2.4.

FIG. 2.4 A human-based process of establishing whether the pots are good or bad. [Flowchart: Selection of clay → Make the pot → Put pot in the sun → Put pot in the furnace → Cool pot slowly (annealing) → Knock each pot and listen to the sound → Rings for a long time: good pot; rings for a short time: bad pot.]

Despite the critical process that Vho-Tshianeo used to make clay pots, it is full of inefficiencies and thus irrationality. Firstly, the clay selection process was based on visual inspection. A more effective system would be to conduct material testing. The process of turning clay into pots was done without the use of modern design tools such as computer-aided design (CAD) and optimization techniques to establish the optimal thickness of the walls of the pot. The temperature schedule for the baking of the pots in the furnace was not controlled and was not optimized for maximum strength. The cooling of the pots was not done using the optimal cooling schedule. In a modern factory, all these steps can be done more optimally, using intelligent machines, than Vho-Tshianeo was able to do. Moreover, the last step of knocking the pots and listening to the sound to establish the integrity of the pots is not reliable. It is less effective than using the vibration of the pots (measured by accelerometers), analyzing these data (using advanced signal processing tools such as the Fourier transform), and using AI to make sense of the data and establish whether the pots are in good condition or not (Fourier, 1878). The next section describes such an automated system.
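To make the signal processing step concrete, the sketch below finds the dominant frequency of a synthetic ring-down signal with a discrete Fourier transform. Everything in it is assumed for illustration: the signal is a decaying 50 Hz sine standing in for an accelerometer record, the sampling rate and decay rate are hypothetical, and a naive transform is written out with the standard library instead of a fast Fourier transform routine.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform for the lower half of the
    spectrum (naive O(n^2) version, adequate for short signals)."""
    n = len(signal)
    return [
        abs(sum(signal[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n)))
        for f in range(n // 2)
    ]

def dominant_frequency(signal, sample_rate):
    """Frequency in Hz of the largest spectral peak, ignoring the zero-frequency bin."""
    mags = dft_magnitudes(signal)
    peak_bin = max(range(1, len(mags)), key=lambda i: mags[i])
    return peak_bin * sample_rate / len(signal)

# Synthetic ring-down (hypothetical values): a 50 Hz tone decaying exponentially.
sample_rate = 400   # samples per second
n_samples = 200     # half a second of data, giving 2 Hz frequency resolution
decay = 5.0         # decay rate in 1/s; a larger value means heavier damping
signal = [
    math.exp(-decay * k / sample_rate) * math.sin(2 * math.pi * 50 * k / sample_rate)
    for k in range(n_samples)
]

print(dominant_frequency(signal, sample_rate))
```

The location of the peak estimates a natural frequency, while the rate at which the signal decays reflects damping, which is the quantity Vho-Tshianeo was judging by ear.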
2.9 Machine rationality: Condition monitoring
In the book “Condition Monitoring Using Computational Intelligence Methods,” Marwala (2012) took this concept of listening to objects, as Vho-Tshianeo listened to her pots, into the fourth industrial revolution (4IR) (Schwab, 2017; Marwala, 2020; Doorsamy et al., 2020). The 4IR is an era in human development in which machines are becoming increasingly intelligent. In this regard, cars are increasingly able to drive themselves, self-diagnose, and self-repair. The implications of the 4IR for the economy, society, and politics are far-reaching. As explained above, the sound that Vho-Tshianeo was listening to is referred to as vibration data in engineering. These vibration data are processed using Fourier analysis, which breaks the data into a series of cycles (sines and cosines) and identifies their natural frequencies (Fourier, 1878). The analysis of the data, represented in cycles, does not need to be done by a human brain; in the 4IR it is performed by artificially intelligent machines. The older Vho-Tshianeo became, the more she disposed of good pots because of her deteriorating hearing. AI machines do not suffer from hearing loss.

This framework can be used to monitor the safety of buildings and bridges. In this regard, data acquisition devices or sensors are embedded in buildings and bridges, and the data gathered are relayed to an artificially intelligent machine. This machine analyzes the data and decides whether a building or a bridge is in danger of collapsing or not. In cases of imminent danger, automated messages can be relayed to allow for relevant measures. This allows a building to be secured before it collapses, thereby saving lives.

An example of the use of intelligent machines to verify the integrity of structures is a scenario where cylinders are manufactured, which was studied by Marwala (2012). Each cylinder is evaluated to see whether it is a good cylinder or not. The data acquisition system in Fig. 2.5 has three main components: the cylinder excitation mechanism, the sensing mechanism, and the data processing system. The excitation mechanism excites the cylinder using a small hammer so that its vibration response can be measured. The sensing mechanism measures the response of the structure using an accelerometer, which measures the acceleration response. The third component amplifies, filters, and converts the data from analog to digital format and sends them to a computer. The data in the computer are processed using advanced techniques such as the Fourier transform. Features are then extracted from these data and are input into the AI technique. A multilayer perceptron neural network, which is a type of AI, is used to classify these features (Marwala, 2012). More details on the multilayer perceptron network are given in Chapter 3. The process of acquiring data from the cylinders and using artificial intelligence to establish whether a cylinder is good or not is illustrated in Fig. 2.6.

FIG. 2.5 Generalized data acquisition system.

FIG. 2.6 An artificial intelligence fault detection system. [Flowchart: Structure → Excite the structure by exerting force → Measure the vibration response using an accelerometer → Signal process the excitation force and the response → Extract features → Input features into artificial intelligence → Good/bad structure.]

Cylinders had seven types of faults, described digitally as [100], [010], [001], [110], [101], [011], and [111]. If the cylinder had no fault, it was represented as [000]. Each cylinder was measured three times under different boundary conditions. The confusion matrix obtained when this procedure was used is presented in Table 2.4 (Marwala, 2012).

Table 2.4 Confusion matrix from the classification of fault cases (rows: actual class; columns: predicted class).
Actual   [000]   [100]   [010]   [001]   [110]   [101]   [011]   [111]
[000]    36      0       2       1       0       0       0       0
[100]    0       3       0       0       0       0       0       0
[010]    0       0       3       0       0       0       0       0
[001]    0       0       0       3       0       0       0       0
[110]    0       0       0       0       3       0       0       0
[101]    0       0       0       0       0       3       0       0
[011]    0       0       0       0       0       0       3       0
[111]    0       0       1       0       4       3       6       25

These results show that 92.3% of [000] cases, all the one- and two-fault cases, and 64.1% of [111] cases were correctly classified. Of the three [000] cases that were classified wrongly, two were classified as [010] cases and one as a [001] case. Of the 14 [111] cases that were classified wrongly, 4 were classified as [110] cases, 3 as [101] cases, 6 as [011] cases, and 1 as a [010] case. These results show that this classification system detected every faulty case as faulty and only misclassified three no-fault cases as faulty, giving an overall fault-detection accuracy of 97% (93 of 96 cases), even though it had more difficulty assigning cases to the exact one of the eight classes (including [000]), where 79 of 96 cases (about 82%) were correct. This accuracy of 97% is higher than Vho-Tshianeo would have been able to achieve using her ear to listen to the vibration and her brain to assess the sound. The results obtained using machine intelligence demonstrate that machines are more objective than humans when assessing the structural integrity of cylinders.
Because machines are more objective and use real measured data, they are more rational than human beings, a theme that we will be addressing throughout this book.
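The percentages quoted for Table 2.4 can be checked with a short Python sketch. The orientation of the matrix (rows as actual classes, columns as predicted classes) is inferred from the percentages and misclassification counts given in the text, and the variable names are hypothetical. Two notions of accuracy are computed: exact agreement over all eight classes, and the coarser question of whether a cylinder was correctly flagged as faulty or fault-free.

```python
labels = ["000", "100", "010", "001", "110", "101", "011", "111"]

# Rows are actual classes and columns are predicted classes (Table 2.4).
confusion = [
    [36, 0, 2, 1, 0, 0, 0, 0],
    [0, 3, 0, 0, 0, 0, 0, 0],
    [0, 0, 3, 0, 0, 0, 0, 0],
    [0, 0, 0, 3, 0, 0, 0, 0],
    [0, 0, 0, 0, 3, 0, 0, 0],
    [0, 0, 0, 0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0, 0, 3, 0],
    [0, 0, 1, 0, 4, 3, 6, 25],
]

total = sum(sum(row) for row in confusion)

# Fraction of each actual class that was assigned to exactly the right class.
per_class = {labels[i]: confusion[i][i] / sum(confusion[i]) for i in range(len(labels))}

# Exact eight-class accuracy: only the diagonal counts as correct.
exact_accuracy = sum(confusion[i][i] for i in range(len(labels))) / total

# Fault-detection accuracy: a faulty cylinder counts as correct if it is
# predicted as any faulty class, whatever the fault type.
no_fault_correct = confusion[0][0]
faulty_detected = sum(sum(row[1:]) for row in confusion[1:])
detection_accuracy = (no_fault_correct + faulty_detected) / total

print(round(100 * per_class["000"], 1), round(100 * per_class["111"], 1))
print(round(100 * exact_accuracy, 1), round(100 * detection_accuracy, 1))
```

The 97% figure quoted in the text corresponds to the fault-detection accuracy (93 of 96 cases), while the exact eight-class accuracy is lower because many [111] cases were assigned to two-fault classes.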
2.10 Conclusions
This chapter observes that a rational agent maximizes the net utility rather than utility. It observes that the net utility is the difference between the utility at a position (also called the state) and the utility at a previous position less the cost of action of moving from one position to another. It observes that the measurement unit of utilities can differ from the measurement unit of the cost of action, making the net utility challenging to calculate. It studies the concept of utility in the making of clay pots.
References
Ashby, M., 1999. Materials Selection in Mechanical Design, third ed. Butterworth-Heinemann, Burlington, MA.
Ashby, M.F., 2005. Materials Selection in Mechanical Design. Elsevier, USA.
Bejan, A., 2016. Advanced Engineering Thermodynamics. Wiley.
Bellman, R., 1957a. Dynamic Programming. Princeton University Press, Princeton.
Bellman, R., 1957b. A Markovian decision process. J. Math. Sci. Mech. 38, 716–719.
Bertsekas, D.P., Nedic, A., Ozdaglar, A., 2003. Convex Analysis and Optimization. Athena Scientific, Belmont, MA.
Brown, R., 2004. Consideration of the origin of Herbert Simon’s theory of ‘satisficing’ (1933–1947). Manag. Decis. 42 (10), 1240–1256.
Černý, V., 1985. Thermodynamical approach to the traveling salesman problem: an efficient simulation algorithm. J. Optim. Theory Appl. 45, 41–51.
Doorsamy, W., Paul, S., Marwala, T., 2020. The Disruptive Fourth Industrial Revolution. Springer, London.
Falk, D.L., Rubin, D.M., Marwala, T., 2006. Enhancement of noisy planar nuclear medicine images using mean field annealing. In: Imaging the Future Medicine, Proceedings of the IFMBE. vol. 14. Springer-Verlag, Berlin Heidelberg, pp. 3581–3585.
Fishburn, P.C., 1970. Utility Theory for Decision Making. Robert E. Krieger, Huntington, NY.
Fourier, J.B., 1878. The Analytical Theory of Heat. Translated by Alexander Freeman, The University Press.
Friedland, B., 2005. Control System Design: An Introduction to State-Space Methods. Dover.
George, E.D., 1997. Overview of the materials selection process. In: ASM Handbook. Materials Selection and Design, vol. 20. ASM International (Formerly known as American Society of Metals), Cleveland, OH.
Gramsci, A., 1982. Selections From the Prison Notebooks. Lawrence and Wishart.
Gressman, P.T., Strain, R.M., 2011. Global classical solutions of the Boltzmann equation without angular cut-off. J. Am. Math. Soc. 24 (3), 771.
Hellier, C., 2003. Handbook of Nondestructive Evaluation. McGraw-Hill.
Hinrichsen, D., Pritchard, A.J., 2005. Mathematical Systems Theory I, Modelling, State Space Analysis, Stability and Robustness. Springer.
Kahneman, D., 2011. Thinking, Fast and Slow. Farrar, Straus and Giroux.
Kahneman, D., Slovic, P., Tversky, A., 1982. Judgment Under Uncertainty: Heuristics and Biases. Cambridge University Press, New York.
Lee, J., 2004. A First Course in Combinatorial Optimization. Cambridge University Press.
Lieven, D., 1993. Nicholas II, Emperor of all the Russias. Pimlico, London.
Marwala, T., 1997. A Multiple Criterion Updating Method for Damage Detection on Structures. University of Pretoria (Master’s thesis).
Marwala, T., 2010. Finite Element Model Updating Using Computational Intelligence Techniques: Applications to Structural Dynamics. Springer, Heidelberg.
Marwala, T., 2012. Condition Monitoring Using Computational Intelligence Methods. Springer, Heidelberg.
Marwala, T., 2013. Economic Modeling Using Artificial Intelligence Methods. Springer, Heidelberg, ISBN: 978-1-84996-323-7.
Marwala, T., 2014. Artificial Intelligence Techniques for Rational Decision Making. Springer, Heidelberg.
Marwala, T., 2015. Causality, Correlation, and Artificial Intelligence for Rational Decision Making. World Scientific, Singapore.
Marwala, T., 2020. Closing the Gap: The Fourth Industrial Revolution in Africa. Macmillan, Johannesburg.
Minsky, M., 1967. Computation: Finite and Infinite Machines. Prentice-Hall, Englewood Cliffs, NJ.
Minsky, M., 2006. The Emotion Machine. Simon & Schuster, New York, NY.
Muller, D., Karpas, E., 2019. Value driven landmarks for oversubscription planning. In: Twenty-Eighth International Conference on Automated Planning and Scheduling. AAAI Publications.
Narayan, K.L., 2008. Computer Aided Design and Manufacturing. Prentice Hall of India, New Delhi.
Reardon, A., 2011. Metallurgy for the Non-Metallurgist, second ed. ASM International.
Russell, S.J., Norvig, P., 2009. Artificial Intelligence: A Modern Approach, third ed. Prentice Hall, Upper Saddle River, NJ.
Schwab, K., 2017. The Fourth Industrial Revolution. Crown Publishing Group, New York.
Simon, H., 1957. A behavioral model of rational choice. In: Models of Man, Social and Rational: Mathematical Essays on Rational Human Behavior in a Social Setting. Wiley, New York.
Simon, H., 1990. A mechanism for social selection and successful altruism. Science 250 (4988), 1665–1668.
Simon, H., 1991. Bounded rationality and organizational learning. Organ. Sci. 2 (1), 125–134.
Tongue, B., 2001. Principles of Vibration. Oxford University Press.
Trèves, F., 1995. Topological Vector Spaces, Distributions and Kernels. Academic Press, Inc.
Van Vlack, L.H., 1985. Elements of Materials Science and Engineering. Addison-Wesley.
CHAPTER 3
Rational machine
3.1 Introduction
This chapter introduces the concept of a rational machine and studies how such machines can be built. Specifically, it explores how AI can be used to build rational machines. The AI concepts considered include artificial general intelligence, knowledge representation, the multilayer perceptron, the radial basis function, and support vector machines (Marwala, 2018). Examples are used to evaluate the rationality of these machines.
In Chapters 1 and 2, we saw that a rational agent maximizes its net utility to achieve its goals. To maximize its net utility, a rational agent needs complete information and must process that information optimally. In practice, complete information is not accessible, and the information that is available is always imperfect and incomplete. Moreover, the human brain cannot process all the data it needs to make a rational decision, and thus human decision-making is riddled with biases and heuristics (Kahneman et al., 1982; Kahneman, 2011). Because of these limitations, human decision-making is boundedly rational (i.e., limited) (Simon, 1957, 1990, 1991).
This chapter explores building a rational machine. A rational machine is a device designed to maximize its performance in order to achieve its goal. Yet unlimited rationality does not exist in real life, so this chapter could equally be titled “Bounded Rational Machines.” Such a bounded rational machine uses the available data to make a somewhat optimized decision.
Before we move forward, we must understand what data is. To do so, we refer to the English poet T.S. Eliot, who wrote in his poem The Rock: “Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?” (Browne, 1969). The most popular addition to these lines is a further question: “Where is the information we have lost in data?” Thus, the logical flow from data to information to knowledge to wisdom is used for decision-making.
Rational Machines and Artificial Intelligence. https://doi.org/10.1016/B978-0-12-820676-8.00011-9 Copyright © 2021 Elsevier Inc. All rights reserved.
Machines, especially deep learning machines, use data to make information (Leke and Marwala, 2019). Traditionally in AI, neural networks use information that has been extracted from data to make decisions. This process is called feature (knowledge) extraction, and the techniques traditionally used for it include principal component analysis and conventional signal (data) processing tools such as the Fourier transform or the wavelet transform (Fourier, 1878). Given all these developments, the nature of a rational machine is described in the next section.
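The feature extraction step mentioned above can be illustrated with a minimal sketch. It assumes NumPy is available; the function names `pca_features` and `fourier_features`, the synthetic data, and the choice of keeping the largest spectral peaks as features are all illustrative assumptions, not something prescribed by the chapter.

```python
import numpy as np


def pca_features(data, n_components):
    """Project the rows of `data` onto the leading principal components."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ vt[:n_components].T, explained[:n_components]


def fourier_features(signal, n_peaks):
    """Return the indices of the n_peaks largest-magnitude frequency bins."""
    spectrum = np.abs(np.fft.rfft(signal))
    spectrum[0] = 0.0  # ignore the constant (DC) term
    return np.sort(np.argsort(spectrum)[-n_peaks:])


rng = np.random.default_rng(0)

# Hypothetical data: 200 samples that vary mostly along one direction in 3-D.
latent = rng.normal(size=(200, 1))
data = latent @ np.array([[2.0, 1.0, 0.5]]) + 0.05 * rng.normal(size=(200, 3))
features, explained = pca_features(data, n_components=1)

# Hypothetical signal: two sinusoids with 5 and 12 cycles per record.
t = np.arange(256) / 256.0
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)
peaks = fourier_features(signal, n_peaks=2)

print("fraction of variance kept by one component:", explained[0])
print("dominant frequency bins:", peaks)
```

In both cases a large raw record is reduced to a few numbers (one principal-component score per sample, or two dominant frequencies per signal), which is the kind of compact information a conventional neural network would then take as input.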
3.2 What is a rational machine?
A rational machine is a decision-making system, whether computational or mathematical, that is built to optimize its decision-making. One example of such a machine is found in the field of AI. As described in Chapter 1, AI can be grouped into three types: machine learning, computational intelligence, and soft computing. This chapter explores the rationality of each of these three types of AI.
3.3 Artificial intelligence
3.3.1 Machine learning: Multilayer perceptron and deep learning
The first type of AI is machine learning. Machine learning is the use of statistics and data to build intelligent systems. One example of machine learning is the radial basis function (RBF) network. RBF networks train fast and are less prone to problems with nonstationary inputs. An RBF network contains network weights and a center for each hidden node. Training it involves estimating both: usually the centers are estimated first using the k-means clustering algorithm, and the network weights are then estimated using a linear method such as least squares. Effectively, training the RBF network involves the use of an optimization routine, an issue that will be related to rationality later in this chapter.
The support vector machine (SVM) is a supervised learning method introduced by Vapnik (1995, 1998), which is used for classification and regression. To formulate the SVM, we conceptualize a data point as a p-dimensional vector. The objective of the SVM is to separate such points with a (p − 1)-dimensional hyperplane, known as a linear classifier. To deal with nonlinear problems, we introduce a kernel function that maps the data into a feature space in which a nonlinear decision boundary becomes a linear hyperplane. Many hyperplanes can separate the data, including the one that shows the largest separation (margin) between the two classes and thus maximizes the distance from it to the nearest data point on either side. This is the maximum-margin hyperplane. It can be found by formulating the classification problem as an approximation function f, which depends on input-output training data. The training data are assumed to be independent and identically distributed samples from an unknown probability distribution P(x,y), and f is chosen so that it classifies unseen (x,y) data. To create the SVM, we have to perform optimization, and this issue will be related to rationality later in the chapter.
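Returning to the RBF network described earlier in this subsection, its two-stage training can be sketched as follows. This is a minimal illustration under stated assumptions: Gaussian basis functions with a single shared width, a plain k-means loop, and output weights solved by linear least squares are choices made for this sketch, and the function names and the sine-curve data are hypothetical.

```python
import numpy as np


def kmeans(x, n_centers, n_iter=50, seed=0):
    """Stage 1: estimate the hidden-node centers with a basic k-means loop."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=n_centers, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(n_centers):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = x[labels == j].mean(axis=0)
    return centers


def design_matrix(x, centers, width):
    """Gaussian basis activations plus a constant column for the bias."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-sq_dist / (2.0 * width**2))
    return np.hstack([phi, np.ones((len(x), 1))])


def train_rbf(x, y, n_centers, width):
    """Stage 2: with the centers fixed, the weights solve a linear problem."""
    centers = kmeans(x, n_centers)
    phi = design_matrix(x, centers, width)
    weights, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return centers, weights


def predict_rbf(x, centers, weights, width):
    return design_matrix(x, centers, width) @ weights


# Hypothetical one-dimensional regression problem: fit a sine curve.
x = np.linspace(0.0, 2.0 * np.pi, 100)[:, None]
y = np.sin(x[:, 0])
width = 0.7  # assumed shared basis-function width
centers, weights = train_rbf(x, y, n_centers=10, width=width)
rmse = np.sqrt(np.mean((predict_rbf(x, centers, weights, width) - y) ** 2))
print("training RMSE:", rmse)
```

Both stages are optimization problems: k-means minimizes the within-cluster distances, and the least-squares step minimizes the squared prediction error.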
The most successful example of machine learning is the multilayer perceptron (MLP) neural network (Marwala, 2018). The MLP contains multiple layers of computational units, usually interconnected in a feed-forward manner. Biological neurons inspired the development of these interconnections, and each neuron in one layer is connected to the neurons of the following layer (Marwala, 2009).
[Figure: network diagram with input units (x0, ..., xd), hidden units (z0 bias, z1, ...), and output units]
FIG. 3.1 Feed-forward multilayer perceptron network having two layers of adaptive weights.
We can write the MLP neural network as follows (Bishop, 1995; Marwala, 2009):
$$ y_k = f_{\mathrm{outer}}\left( \sum_{j=1}^{M} w_{kj}^{(2)} f_{\mathrm{inner}}\left( \sum_{i=1}^{d} w_{ji}^{(1)} x_i + w_{j0}^{(1)} \right) + w_{k0}^{(2)} \right) \tag{3.1} $$
In Eq. (3.1), $w_{ji}^{(1)}$ and $w_{kj}^{(2)}$ are the weight parameters in the first and second layers, respectively, from input i to hidden unit j and from hidden unit j to output unit k; d is the number of input units; M is the number of hidden units; and $f_{\mathrm{outer}}(\cdot)$ and $f_{\mathrm{inner}}(\cdot)$ are the activation functions in the outer and inner layers (usually linear in the outer layer and hyperbolic tangent in the inner layer), while $w_{j0}^{(1)}$ and $w_{k0}^{(2)}$ are the network parameters indicating the biases for hidden unit j and output unit k. The selection of these network parameters enables the function to model linear and nonlinear data of any order. The network described above is shown in Fig. 3.1, and it has one hidden layer. If the network has more than one hidden layer, it is called a deep learning network. The principle followed to estimate the weight parameters in Eq. (3.1) is the same as that followed in linear regression. When we talk about training the MLP neural network, we are referring to a process of identifying the weights in Eq. (3.1) using the observed data. To train the network, we construct an objective function representing the distance between the model prediction and the observed target data, with the weight parameters as unknown variables. We identify these weight parameters by
minimizing this objective function, thereby maximizing the capacity of the MLP to predict outputs whenever we give the neural network the input values. Minimizing the objective function corresponds to the maximum likelihood approach. If the training set $D = \{x_n, t_n\}_{n=1}^{N}$ is used, and assuming that the targets $t_n$ are sampled independently given the inputs $x_n$ and the weight parameters {w}, then we write the objective function, E, as follows using the sum-of-squares of errors objective function (Bishop, 1995; Marwala, 2009):
$$ E = \sum_{n=1}^{N} \sum_{k=1}^{K} \left( t_{nk} - y_{nk} \right)^2 = \sum_{n=1}^{N} \sum_{k=1}^{K} \left( t_{nk} - y_{nk}(\{x\}, \{w\}) \right)^2 \tag{3.2} $$
Here, n is the index for the training example, k is the index for the output units, {x} is the input vector, and {w} is the weight vector. This objective function is well matched to regression problems, whereas the cross-entropy objective function is suited to classification problems. Before training the neural network, we construct the network architecture by selecting the number of hidden units, M. A small M makes the neural network insufficiently flexible, which gives poor generalization because of high bias. A large M makes the neural network too complex and needlessly flexible, and thus gives poor generalization because of overfitting caused by high variance. The choice of an appropriate M is known as model selection. A procedure called back-propagation is used to train the MLP neural network (Marwala, 2009; Werbos, 1974). Back-propagation is a technique used for finding the derivatives of the error in Eq. (3.2) with respect to the network weights. We identify the neural network weights using the following gradient descent equation (Marwala, 2009; Bishop, 1995):
$$ \{w\}_{i+1} = \{w\}_i - \eta \frac{\partial E}{\partial \{w\}}\left( \{w\}_i \right) \tag{3.3} $$
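Here, η is the learning rate and the braces {·} denote a vector. We minimize the objective function E by calculating the derivative of the error in Eq. (3.2) with respect to the network weights. We can calculate the gradient in Eq. (3.3) with respect to the second-layer weights as follows (Bishop, 1995):
$$ \frac{\partial E}{\partial w_{kj}^{(2)}} = \frac{\partial E}{\partial a_k} \frac{\partial a_k}{\partial w_{kj}^{(2)}} = \frac{\partial E}{\partial y_k} \frac{\partial y_k}{\partial a_k} \frac{\partial a_k}{\partial w_{kj}^{(2)}} = \sum_{n} f'_{\mathrm{outer}}(a_k) \frac{\partial E}{\partial y_{nk}} z_j \tag{3.4} $$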
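Here, $z_j = f_{\mathrm{inner}}(a_j)$ and $a_k = \sum_{j=0}^{M} w_{kj}^{(2)} z_j$. The derivative of the error with respect to the first-layer weights is written using the chain rule as follows (Bishop, 1995; Marwala, 2009):
$$ \frac{\partial E}{\partial w_{ji}^{(1)}} = \frac{\partial E}{\partial a_j} \frac{\partial a_j}{\partial w_{ji}^{(1)}} = \sum_{n} f'_{\mathrm{inner}}(a_j) \sum_{k} w_{kj}^{(2)} f'_{\mathrm{outer}}(a_k) \frac{\partial E}{\partial y_{nk}} x_i \tag{3.5} $$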
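Here, $a_j = \sum_{i=1}^{d} w_{ji}^{(1)} x_i + w_{j0}^{(1)}$. For the sum-of-squares objective in Eq. (3.2), the derivative of the objective function with respect to the output is (Bishop, 1995; Marwala, 2009)
$$ \frac{\partial E}{\partial y_{nk}} = -2 \left( t_{nk} - y_{nk} \right) \tag{3.6} $$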
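The chain of Eqs. (3.1) to (3.6) can be checked numerically with a short sketch. It assumes NumPy, a hyperbolic tangent inner activation, and a linear outer activation; the toy data, the learning rate, the number of iterations, and all function names are illustrative assumptions. The analytic gradient is compared against a finite-difference estimate before plain gradient descent is run.

```python
import numpy as np


def forward(x, w1, b1, w2, b2):
    """Eq. (3.1) with f_inner = tanh and f_outer = identity."""
    z = np.tanh(x @ w1.T + b1)  # hidden outputs z_j = f_inner(a_j)
    y = z @ w2.T + b2           # network outputs y_k
    return z, y


def sum_of_squares(y, t):
    """Eq. (3.2): summed over training examples n and output units k."""
    return np.sum((t - y) ** 2)


def gradients(x, t, w1, b1, w2, b2):
    """Back-propagation: the derivatives of E with respect to every weight."""
    z, y = forward(x, w1, b1, w2, b2)
    dy = 2.0 * (y - t)                # dE/dy_nk; f_outer' = 1 for a linear output
    g_w2 = dy.T @ z                   # second-layer weights, as in Eq. (3.4)
    g_b2 = dy.sum(axis=0)
    da = (dy @ w2) * (1.0 - z**2)     # back-propagated through tanh'
    g_w1 = da.T @ x                   # first-layer weights, as in Eq. (3.5)
    g_b1 = da.sum(axis=0)
    return g_w1, g_b1, g_w2, g_b2


rng = np.random.default_rng(0)
d, M, K, N = 1, 8, 1, 50              # inputs, hidden units, outputs, examples
x = np.linspace(-1.0, 1.0, N)[:, None]
t = x**2                              # hypothetical regression target
w1 = rng.normal(scale=0.5, size=(M, d))
b1 = np.zeros(M)
w2 = rng.normal(scale=0.5, size=(K, M))
b2 = np.zeros(K)

# Finite-difference check of one first-layer weight derivative.
eps = 1e-5
w_plus, w_minus = w1.copy(), w1.copy()
w_plus[0, 0] += eps
w_minus[0, 0] -= eps
numeric = (sum_of_squares(forward(x, w_plus, b1, w2, b2)[1], t)
           - sum_of_squares(forward(x, w_minus, b1, w2, b2)[1], t)) / (2 * eps)
analytic = gradients(x, t, w1, b1, w2, b2)[0][0, 0]

# Gradient descent, Eq. (3.3), with an assumed learning rate.
initial_error = sum_of_squares(forward(x, w1, b1, w2, b2)[1], t)
eta = 0.001
for _ in range(5000):
    g_w1, g_b1, g_w2, g_b2 = gradients(x, t, w1, b1, w2, b2)
    w1 -= eta * g_w1
    b1 -= eta * g_b1
    w2 -= eta * g_w2
    b2 -= eta * g_b2
final_error = sum_of_squares(forward(x, w1, b1, w2, b2)[1], t)

print("analytic vs numeric gradient:", analytic, numeric)
print("error before and after training:", initial_error, final_error)
```

The finite-difference comparison is a common sanity check for a hand-derived gradient; if the two numbers disagree, the error is usually a wrong sign or a missing factor in one of the chain-rule terms.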
To use the gradient descent equation to update the network weights, we use an optimization method; more details on optimization methods are provided by Marwala and Leke (2019). The multilayer perceptron neural network can be extended to have multiple hidden layers, as opposed to the single hidden layer in Fig. 3.1. This extension is called deep learning, and it has revolutionized machine learning. Deep learning is therefore a class of methods within machine learning that uses nonlinear nodes organized into multiple layers, which extract and transform feature variable values from the input vector to the output vector (Deng et al., 2013; Deng and Yu, 2014). Each layer of such a system takes as its input the outputs of the preceding layer, except the input layer, which receives input signals from the outside environment. In deep learning, higher-level features are extracted from lower-level features to obtain a stratified representation of the input data via an unsupervised learning approach on different levels of the features (Deng and Yu, 2014). In this regard, different layers of representation of the data correspond to varying levels of abstraction of the data. Examples of deep learning methods include Deep Belief Networks (DBNs) (Hinton, 2009), Deep/Stacked Auto-encoder Networks (DAEs/SAEs) (Vincent et al., 2010; Larochelle et al., 2009), and Convolutional Neural Networks (CNNs) (Alex et al., 2012; LeCun and Bengio, 1995; LeCun et al., 2015). Deep learning methods have been applied in natural language processing, speech recognition, chess playing, audio recognition, condition monitoring, object recognition and detection, and computer vision. Deep learning methods learn from the data using optimization methods, and this issue will be related to rationality later.
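The layered structure described above, in which each layer consumes the outputs of the preceding one, can be sketched as a forward pass through a stack of layers. This is only an illustration under assumptions: the layer sizes, the tanh hidden activation, the linear final layer, the random untrained weights, and the function names are hypothetical choices, and no specific deep learning method from the list above is implemented.

```python
import numpy as np


def init_layers(sizes, seed=0):
    """Create one (weights, biases) pair per pair of consecutive layer sizes."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_out, n_in)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def deep_forward(x, layers):
    """Return the activations of every layer, starting with the input itself."""
    activations = [x]
    for index, (w, b) in enumerate(layers):
        a = activations[-1] @ w.T + b           # input is the previous layer's output
        is_last = index == len(layers) - 1
        activations.append(a if is_last else np.tanh(a))
    return activations


sizes = [4, 16, 8, 3]                           # input, two hidden layers, output
layers = init_layers(sizes)
x = np.random.default_rng(1).normal(size=(5, 4))  # five hypothetical input vectors
activations = deep_forward(x, layers)
shapes = [a.shape for a in activations]
print(shapes)
```

Each entry of `activations` is one level of representation of the same five inputs; training such a stack would apply the same gradient-based optimization as for the single-hidden-layer network.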
3.3.2 Soft computing
Soft computing is a computational paradigm that functions similarly to the way human beings compute. One example of soft computing is fuzzy logic. Fuzzy logic is a method of mapping input data to output data using linguistic rules based on if-then statements (Bih, 2006). The implementation of fuzzy logic involves the use of fuzzy sets, membership functions, fuzzy logic operators, and fuzzy rules (Cox, 1994; Von Altrock, 1995; Biacino and Gerla, 2002). In classical set theory, an object either is or is not an element of a specific set (Devlin, 1993; Ferreirós, 1999; Johnson, 1972). A fuzzy set does not have such clear-cut boundaries, and therefore objects have degrees of membership in a specific set (Wright and Marwala, 2006; Hájek, 1998). We can therefore represent the in-between values of objects in much the same way the human brain thinks, unlike the sharp cut-off boundaries of classical sets. A membership function quantifies the degree to which an object belongs to a set or class. The membership function maps the input space
variable to a number between 0 and 1. This represents the degree to which a specific input variable belongs to a specific set (Klir and Folger, 1988; Klir and Yuan, 1995; Klir et al., 1997). A membership function can be characterized as a curve of any shape. In the example of whether a person is tall or short, there are two overlapping subsets, one for tall and one for short. An individual can thus have partial membership in each of these sets, which determines the extent to which the person is both short and tall. Logical operators are used for creating new fuzzy sets from existing fuzzy sets. In set theory, three key operators are used: intersection, union, and complement (Kosko, 1993; Kosko and Isaka, 1993). These same operators are also used in fuzzy logic and are adapted to handle partial memberships. In the mathematics of fuzzy logic, the intersection (AND operator) of two fuzzy sets becomes a minimum operation, and the union (OR operator) of two fuzzy sets becomes a maximum operation (Novák, 1989, 2005; Novák et al., 1999). We use these logical operators to determine the rules of the overall fuzzy set output. Fuzzy rules express the conditional statements used to model the input-output relationships of a system in natural language. These linguistic rules are expressed as if-then statements that use the logical operators and membership functions to yield an output. A vital aspect of fuzzy logic is the use of linguistic variables, which take words or sentences as their values instead of numbers (Zadeh, 1965; Zimmermann, 2001; Marwala and Leke, 2019). Every linguistic variable assumes a linguistic value that matches a fuzzy set. The set of values that the linguistic variable can assume is known as the term set. For instance, the linguistic variable height could assume the term set {very tall, tall, medium, short, very short}.
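The tall/short example and the fuzzy operators above can be made concrete with a small sketch. The triangular shape of the membership functions, the breakpoints in centimetres, and the function names are assumptions made purely for illustration; the complement is taken as one minus the membership value, which is the standard choice.

```python
def triangular(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical, overlapping fuzzy sets for the linguistic variable "height".
def short(height_cm):
    return triangular(height_cm, 130.0, 150.0, 180.0)


def tall(height_cm):
    return triangular(height_cm, 160.0, 190.0, 220.0)


def fuzzy_and(a, b):
    """Intersection of two fuzzy sets: minimum of the memberships."""
    return min(a, b)


def fuzzy_or(a, b):
    """Union of two fuzzy sets: maximum of the memberships."""
    return max(a, b)


def fuzzy_not(a):
    """Complement of a fuzzy set: one minus the membership."""
    return 1.0 - a


height = 175.0
mu_short = short(height)  # partial membership in "short"
mu_tall = tall(height)    # partial membership in "tall"

print("short:", mu_short, "tall:", mu_tall)
print("short AND tall:", fuzzy_and(mu_short, mu_tall))
print("short OR tall:", fuzzy_or(mu_short, mu_tall))
print("NOT tall:", fuzzy_not(mu_tall))
```

A person of 175 cm belongs to both sets at once, to different degrees, which is exactly the partial membership that a classical set cannot express.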
An example of a fuzzy rule is as follows (Zadeh, 1965; Zimmermann, 2001; Marwala and Leke, 2019):
if x is A and y is B then z is C (3.7)
Here, A, B, and C are fuzzy sets defined on the input and output spaces. The variables x, y, and z are linguistic variables, and A, B, and C are the linguistic values described by the membership functions. Each rule contains an antecedent and a consequent expression. The antecedent is the part of the rule between “if” and “then,” and it maps the inputs x and y to the fuzzy sets A and B through the membership functions. The consequent is the part of the rule after “then,” and it maps the output z to the fuzzy set C using a membership function. The input membership values act like weighting factors that define their effect on the fuzzy output sets. A fuzzy system thus contains a list of these if-then rules, which we evaluate in parallel. We can express the antecedent in more than one linguistic variable and aggregate these inputs through the AND operator. We evaluate each rule for an input set, and the corresponding output of the rule is obtained. If the input corresponds to two linguistic values, then we evaluate the rules connected to both of these values. We also evaluate the rest of the rules, but these do not affect the result because their membership value is zero. Consequently, if the antecedent is true to a certain extent, then the consequent is also
true to a certain extent (Zadeh, 1965). We then compare the degree of each linguistic output value by executing a logical sum for each membership function and then aggregate all the sums for a linguistic variable. Next, we use the inference technique to transform the result into the output membership function (Zadeh, 1965). Finally, we defuzzify the results to produce a single numeric output. One approach is to take the maximum of all rules describing this linguistic output value and to take the output as the center of gravity of the area under the affected part of the output membership function (Mamdani, 1974). There are other inference methods such as averaging and sum mean square. Fig. 3.2 shows the steps involved in creating an input-output mapping using fuzzy logic (Wright and Marwala, 2006). Soft computing, as described here through fuzzy logic, does not involve the optimization process. This is primarily because the variables involved are linguistic,
[Figure: flowchart of fuzzy logic steps: Input Variables; Assignment of Membership Functions; Application of Fuzzy Rules]