Studies in Computational Intelligence 931
Priti Srinivas Sajja
Illustrated Computational Intelligence: Examples and Applications
Studies in Computational Intelligence Volume 931
Series Editor Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence—quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output. The books of this series are submitted for indexing to Web of Science, EI-Compendex, DBLP, SCOPUS, Google Scholar, and SpringerLink.
More information about this series at http://www.springer.com/series/7092
Priti Srinivas Sajja, PG Department of Computer Science, Sardar Patel University, Vallabh Vidyanagar, Gujarat, India
ISSN 1860-949X  ISSN 1860-9503 (electronic)
Studies in Computational Intelligence
ISBN 978-981-15-9588-2  ISBN 978-981-15-9589-9 (eBook)
https://doi.org/10.1007/978-981-15-9589-9

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
To My Parents
Preface
Artificial Intelligence (AI) and Machine Learning (ML) have become essential instruments of the modern era, applied ubiquitously to obtain the benefits of human-like intelligence. Applying these techniques has always been challenging and appealing at the same time. This book, entitled Illustrated Computational Intelligence: Examples and Applications, summarizes the necessary classical artificial intelligence and machine learning techniques in its first two chapters, while the remaining chapters present hundreds of illustrated examples and applications from real life on the constituent techniques. These techniques include fuzzy logic, artificial neural networks, genetic algorithms, and their possible hybridizations; Chapters 3 to 6 of the book present illustrations, examples, and applications of these techniques. Significant effort has been made to keep to a minimum the fundamental and background concepts that are available elsewhere, and the focus is on illustrated examples and applications with complete details. At the same time, care is taken that no fundamental and important concept is left undiscussed. This makes the book self-contained, so that readers need not refer elsewhere for the necessary but preliminary background concepts. This concise and illustrative book on computational intelligence is meant for computer professionals as well as non-computer professionals, as the application of artificial intelligence has always been ubiquitous in nature. Students and instructors at the graduate, postgraduate, and diploma levels on national and international platforms can use this book for practising computational intelligence. The solved examples demonstrated in the book will be of great help to instructors, students, non-AI professionals, and researchers. It is to be noted that every example is discussed in detail with normalization, architecture, detailed design, process flow, encoding, and sample input/output. The summary of fundamental concepts together with illustrated examples makes this book unique in its category. Some of the examples described in this book are web page classification, sales prediction, matrimonial and job profile matching, emotion detection, diagnosis of flu and viral fever such as COVID-19 in four different ways (with a neural network, fuzzy logic, a genetic algorithm, and a neuro-fuzzy system), and also the eight queens chess
problem, the knapsack problem, k-means solutions, collaborative filtering, image classification through a fuzzy convolutional network, fuzzy movie recommendations, fruit and dry-fruit grading and identification/sorting, etc., using various computational intelligence techniques such as neural networks, genetic algorithms, fuzzy logic, and their various hybridizations. Another salient feature of the book is the list of core and applied project ideas and future research possibilities in various domains using computational intelligence. Every technique of artificial intelligence and machine learning is divided into a core (pure) research area and an applied research area, and a list of applications and innovative research ideas for the technique is given in every chapter. Every chapter lists at least 40 research possibilities and applications besides the solved examples. Within the six chapters, about 215 illustrations and hundreds of examples, applications, project ideas, and research possibilities are presented.

I take this opportunity to extend my thanks to the Almighty for bestowing on me the health and determination to write this book. The project kept my morale high during these testing times amidst a world pandemic, and I found a purpose to occupy myself as lives around me changed drastically for the worse due to COVID-19. I am also indebted to Dr. Loy D'Silva, Ms. Suvira Srivastava, Ms. Raghavy Krishnan, and Mr. Gaurishankar Ayyappa for their continued support in the process of publishing this book. I am grateful to my extended family of students and colleagues at the Department of Computer Science, Sardar Patel University, Gujarat, India, for their help. I also thank my family for their blessings, unconditional love, and encouragement.

Vallabh Vidyanagar, Gujarat, India
Priti Srinivas Sajja
Salient Features of the Book (Back Cover)
The salient features of the book are as follows:

• A summary of symbolic Artificial Intelligence (AI) and computational intelligence techniques such as fuzzy logic, artificial neural networks, genetic algorithms, machine learning, and hybrid technologies.
• A focus on real-life examples and applications in the domain of computational intelligence, with architecture, detailed design, process flow, and sample input/output along with detailed methods of solution.
• Within the six chapters, about 215 illustrations and hundreds of examples, applications, project ideas, and research possibilities in various domains of computational intelligence.
• Useful for undergraduate, postgraduate, and diploma level students of various courses in computing, IT, management, science, engineering, and technology at the national and international level.
• Considering the ubiquitous nature of computational intelligence, care is taken that the content is understandable to non-computer professionals too, and the necessary solution steps are illustrated in various domains.
Contents

1 Introduction to Artificial Intelligence
   1.1 Natural Intelligence
       1.1.1 Non-Algorithmic Approach
       1.1.2 Heuristic Approach
       1.1.3 Self-Learning and Inference
   1.2 Artificial Intelligence
   1.3 Types of Artificial Intelligence and Applications
   1.4 Components of the Symbolic AI
       1.4.1 Knowledge Base
       1.4.2 Inference Engine
       1.4.3 Self-Learning
       1.4.4 Explanation and Reasoning
       1.4.5 User Interface
   1.5 Knowledge Acquisition
   1.6 Knowledge Representation
       1.6.1 Declarative Knowledge
       1.6.2 Rules
       1.6.3 Frames
       1.6.4 Scripts
       1.6.5 Semantic Networks
       1.6.6 Hybrid Knowledge Representation Structure
       1.6.7 Evaluation of Different Knowledge Representation Techniques
   1.7 Types of Knowledge-Based Systems
       1.7.1 Expert System
       1.7.2 Linked Based Systems
       1.7.3 Computer Aided Systems Engineering (CASE) Based System
       1.7.4 Knowledge-Based Tutoring Systems
       1.7.5 Agent-Based System
       1.7.6 Intelligent Interface to Data
   1.8 Testing Intelligence
   1.9 Benefits of AI
   1.10 Applications of Artificial Intelligence
   1.11 Limitations of Traditional AI Based Solutions
   1.12 Core and Applied Research Ideas in Symbolic Artificial Intelligence
       1.12.1 Core Project/Research Ideas in Symbolic Artificial Intelligence
       1.12.2 Applied Project/Research Ideas in Symbolic Artificial Intelligence
   References

2 Constituents of Computational Intelligence
   2.1 Computing Intelligence, Hard Computing, and Soft Computing
   2.2 Fuzzy Logic
   2.3 Artificial Neural Networks
   2.4 Hopfield Network
   2.5 Properties of a Hopfield Network
   2.6 Learning in Hopfield Network
   2.7 Single Perceptron
   2.8 Multilayer Perceptron
   2.9 Training Using Back Propagation in Supervised Manner
   2.10 Designing a Neural Network with Unsupervised Manner
   2.11 Kohonen Self-Organizing Maps (SOMs)
   2.12 Evolutionary Algorithms
   2.13 Encoding Individuals
   2.14 Genetic Operators
   2.15 Travelling Salesperson Problem with Genetic Algorithm
   2.16 Schema in Genetic Algorithms
   2.17 Hybrid Computational Intelligence Based Systems
   2.18 Neuro-Fuzzy Systems
   2.19 Fuzzy-Genetic Systems
   2.20 Neuro-Genetic Systems
   2.21 Other Hybrid Systems
   References

3 Examples and Applications on Fuzzy Logic Based Systems
   3.1 Fuzzy Set and Membership for Students Attendance
   3.2 Fuzzy Membership Function for the Speed of a Vehicle
   3.3 Operations of Fuzzy Sets: Numerical Example
   3.4 Fuzzy Operations: Newspapers Example
   3.5 Fuzzy Operations: Sensors Example
   3.6 Selection of a Job Based on Fuzzy Parameters
   3.7 Affordability of a Dress
   3.8 Affordability of a Software
   3.9 Membership Functions and Fuzzy Rules for Automatic Car Braking
   3.10 Fuzzy Logic Application in Share Market
   3.11 Fuzzy Relationship: Numerical Examples
   3.12 Fuzzy Relationship: Covid-19 Symptoms
   3.13 Fuzzy Relationship and Membership: Comfort While Playing with Bat
   3.14 Fuzzy Rule-Based System for Washing Machine
   3.15 Fuzzy Diagnosing for Covid-19
   3.16 Customized Presentation of Learning Material to Slow Learners
   3.17 Fuzzy Almond Sorting Example
   3.18 Type-2 Fuzzy Logic Based System
   3.19 Fuzzy and Modular Restaurant Menu Planner Systems
   3.20 Core and Applied Research Ideas in Fuzzy Logic Based Systems
       3.20.1 Core Research and Applications
       3.20.2 Applied Research and Applications
   Reference

4 Examples and Applications on Artificial Neural Networks
   4.1 Example of Linearly Separable Decision: To Issue a Credit Card or Not
   4.2 Perceptrons to Simulate Logical Functions
   4.3 Neural Network for XOR Function
   4.4 Numerical Example for Weight Update for Perceptron Learning
   4.5 Sales Prediction Using Multilayer Perceptron
   4.6 Selection of Mobile Using Neural Network
   4.7 Detection of Flu Based Viral Disease
   4.8 Neural Network for Students Course Selection and Aptitude Testing
   4.9 Selection of Tour Place
   4.10 Agricultural Advisory System Using Neural Network
   4.11 An Example of Multilayer Perceptron for Job Selection
   4.12 SVM to Identify Suitable Candidates for 'Content Writers'
   4.13 Neural Network for Best Student Selection
   4.14 Neural Network to Learn Price of a House
   4.15 Neural Network for Handwritten Digit Recognition
   4.16 Neural Network to Classify Images: Case of Captcha Control
   4.17 Speech Recognition Using Neural Network
   4.18 Recurrent Neural Network for Selecting School Children Activity Sets
   4.19 Kohonen Neural Network (SOM): Detailed Calculation
   4.20 Kohonen Neural Network Example to Categorize Fruits: An Example of Unsupervised Learning
   4.21 Application of the K-means Algorithm: Solved Example
   4.22 Numerical Example on K Nearest Neighbour: Share Market Domain
   4.23 Numerical Example on K Nearest Neighbour: Prediction of Fruits, Vegetables, or Protein
   4.24 To Purchase a House or not: An Example of Naive Bayes Technique
   4.25 Core and Applied Research and Project Ideas in Artificial Neural Network
       4.25.1 Core Applications and Project Ideas
       4.25.2 Applied Research and Project Ideas in Artificial Neural Networks
   Reference

5 Examples and Applications on Genetic Algorithms
   5.1 Function Optimization for Single Variable: Case of Profit and Investment
   5.2 Genetic Algorithm for Single Variable Example: Minimization in the Disguise of Maximization
   5.3 Numeric Example of Genetic Algorithm
   5.4 Function Optimization for Multi-Variable Function
   5.5 Selection of the Best Student Using GA
   5.6 Genetic Algorithm for Mobile Selection: Binary Encoding
   5.7 Genetic Algorithm for Mobile Selection: Decimal Encoding
   5.8 Car Selection Using the Genetic Algorithm
   5.9 Solving Knapsack Problem with Genetic Algorithm
   5.10 Survival with a Backpack: A Case of Knapsack Using Genetic Algorithm
   5.11 Solution of 8 Queens Problem with Genetic Algorithms
   5.12 To Evolve a Fixed Size Sentence Using Genetic Algorithm
   5.13 Traveling Salesperson Problem with Partially Mapped Crossover
   5.14 Traveling Sales Person Problem with Order Mapped Crossover
   5.15 Machine Learning with Genetic Algorithm: Evolving Weights
   5.16 Traveling Sales Person with Cost Matrix
   5.17 Setting Test Paper with Genetic Algorithm
   5.18 Core and Applied Applications of Genetic Algorithms
       5.18.1 Core Applications and Research Ideas
       5.18.2 Applied GA and Research Ideas

6 Examples and Applications on Hybrid Computational Intelligence Systems
   6.1 Neuro-Fuzzy Course Selection System
   6.2 Hybrid System for Web Page Classification
   6.3 Neuro-Fuzzy Disease Diagnosing Case of COVID-19
   6.4 Neuro-Fuzzy System to Find a Soul Mate: A Case of Matrimonial Profile Classification
   6.5 Genetic Fuzzy System for Fashion Design
   6.6 Fuzzy Convolutional Neural Network for Viral Disease Image Classification
   6.7 Evolutionary Neural Network
   6.8 Type-2 Neuro-Fuzzy System for Evaluation of Software Quality
   6.9 Neuro-Fuzzy-Genetic System for Evaluation of Software Quality
   6.10 Neuro-Fuzzy-Genetic System for Students Aptitude Testing
   6.11 Fuzzy Collaborative Filtering for Movie Recommendation
   6.12 Neuro-Fuzzy System for Emotion Detection
   6.13 Core and Applied Research Ideas in Hybrid Computational Intelligence Based Systems
       6.13.1 Core Research/Project Ideas and Examples
       6.13.2 Applied Research/Project Ideas and Examples
Chapter 1
Introduction to Artificial Intelligence
Abstract The chapter introduces characteristics of natural and artificial intelligence such as the non-algorithmic approach, the use of heuristics, self-learning, and the ability to handle partial inputs, along with various types of artificial intelligence and application areas. Terms like weak AI, narrow AI, classical AI, modern AI, symbolic AI, machine learning, super AI, and general AI are outlined in this chapter. As the basic source of intelligence is knowledge, systems that deal with artificial intelligence classically need to be knowledge-based. To demonstrate how such knowledge-based systems work, it is necessary to know about various types of knowledge, knowledge acquisition processes, and related heuristics. Once a sufficient variety of knowledge is collected from multiple domain experts, it needs to be represented effectively in the system. For this, various knowledge representation structures, their possible hybridizations, and a comparative evaluation are illustrated in this chapter. Besides knowledge acquisition and knowledge representation, several more components are required for a traditional knowledge-based system: the inference engine, self-learning, explanation, and the user interface. These are known as the components of symbolic AI or classical AI. The major popular symbolic AI systems, also known as knowledge-based systems, are: (i) expert systems, (ii) linked based systems, (iii) Computer Aided Systems Engineering (CASE) based systems, (iv) knowledge-based tutoring systems, (v) agent-based systems, and (vi) intelligent interfaces to data. All these systems are discussed with their general architecture and basic components. Once a knowledge-based system is developed, it must undergo testing and certification, and various types of testing mechanisms are available for this purpose; the Turing test, the Chinese room test, the Marcus test, the Lovelace test 2.0, etc. are briefly discussed in this chapter. At the end, the chapter lists the benefits, applications, and limitations of artificial intelligence, along with more than 50 applications and possibilities of further research in the core, applied, and hybrid areas.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 P. S. Sajja, Illustrated Computational Intelligence, Studies in Computational Intelligence 931, https://doi.org/10.1007/978-981-15-9589-9_1
1.1 Natural Intelligence

With the help of natural intelligence, humankind has proved its significance in the universe. From routine business transactions to spectacular achievements, intelligence helps us understand a situation, react flexibly, and determine the relative priority of things in order to make highly effective decisions. Further, the decision-making and problem-solving processes driven by intelligence do not follow any predefined script or algorithm; rather, they follow a heuristic-based and non-algorithmic approach. Figure 1.1 demonstrates a few characteristics of natural intelligence. It mentions terms such as non-algorithmic approach, heuristic, and inference, a brief description of which is provided in the following sub-sections.
1.1.1 Non-Algorithmic Approach

As stated, humans do not follow a bounded script when making a decision, mainly because every decision-making situation is unique and requires a higher level of domain knowledge. Many decisions are also taken subconsciously, as an expert's knowledge tends to be stored in the subconscious mind. How to swim, how to ride a bicycle, and how to bargain are example situations where the decisions come from wisdom (higher-level knowledge) and are taken subconsciously.
1.1.2 Heuristic Approach

A heuristic is a rule of thumb discovered and learned through experience. The majority of decision making is aided by heuristics. A heuristic is not a fixed formula for the optimum result but a practical approach to a good and acceptable solution. Applying a heuristic saves a lot of time and effort; however, the solutions offered by such an approach are not optimal.
Fig. 1.1 Characteristics of natural intelligence
For example, the rule of 72 in banking and finance says that dividing 72 by the offered interest rate gives the approximate number of years needed to double your principal. Another heuristic is that a working day between two holidays is likely to be converted into paid leave in order to obtain three consecutive days off.
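As a quick sanity check on this heuristic, the rule-of-72 estimate can be compared with the exact doubling time ln(2)/ln(1 + r) under annual compounding. The short Python sketch below is only an illustration added here (it is not part of the book's examples); the interest rates used are arbitrary.

```python
import math

def years_to_double_rule_of_72(rate_percent):
    """Heuristic estimate: years needed to double the principal."""
    return 72 / rate_percent

def years_to_double_exact(rate_percent):
    """Exact doubling time under annual compounding."""
    return math.log(2) / math.log(1 + rate_percent / 100)

for rate in (4, 6, 8, 12):
    print(f"{rate}%: rule of 72 = {years_to_double_rule_of_72(rate):.2f} years, "
          f"exact = {years_to_double_exact(rate):.2f} years")
```

The estimate is close but not exact, which is precisely the point of a heuristic: a good, quickly obtained answer rather than the optimum one.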
1.1.3 Self-Learning and Inference

Inference is the process of logically deducing conclusions with the help of available pieces of evidence. Inferencing is done in two ways, namely forward inferencing and backward inferencing. In the forward inference mechanism, all available pieces of evidence are considered and a conclusion is deduced (data driven). In backward inferencing, a hypothesis is set first and its truth value is checked against the available evidence (goal driven). A hybrid inference is also possible. Inferencing can be considered a kind of self-learning from the pieces of evidence if it produces new knowledge. Inference generally uses the knowledge pieces stored in the knowledge base and tries to derive a required piece of knowledge that is not stored there directly. Learning from cases or data and learning through interactions with stakeholders via an interactive editor are other means of self-learning. Besides these, intelligence enables the ability to understand vague, incomplete, and ambiguous inputs, and it imparts the ability to behave flexibly.
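To make the two directions of inferencing concrete, the following minimal Python sketch applies the same tiny rule base in a data-driven and a goal-driven manner. The rules and symptoms are invented purely for illustration; this is not code from the book.

```python
# Toy rule base: (set of premises, conclusion)
RULES = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "breathlessness"}, "see_doctor"),
]

def forward_chain(facts):
    """Data driven: keep firing rules until no new conclusion can be added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_chain(goal, facts):
    """Goal driven: try to prove the goal by recursively proving rule premises."""
    if goal in facts:
        return True
    return any(all(backward_chain(p, facts) for p in premises)
               for premises, conclusion in RULES if conclusion == goal)

evidence = {"fever", "cough", "breathlessness"}
print(forward_chain(evidence))                 # adds flu_suspected and see_doctor
print(backward_chain("see_doctor", evidence))  # True
```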
1.2 Artificial Intelligence

Artificial Intelligence (AI) is an attempt to simulate the above-mentioned characteristics of natural intelligence in order to make machines intelligent. If a machine can exhibit characteristics such as flexibility in decision making and problem-solving, the ability to identify the relative importance of things, and the ability to distinguish between things that look similar (and vice versa), just as people do, then the machine is considered to be intelligent. Unlike natural intelligence, a machine can be made intelligent only in a selected narrow domain. Natural intelligence is more generic in nature and possesses a high degree of flexibility; obviously, it is impossible to impart such super intelligence or general intelligence to machines. Artificial Intelligence, as defined by Elaine Rich et al. (Elaine Rich, Kevin Knight, Shivashankar B. Nair, 2008), is "the study of how to make computers do things at which, at the moment, people are better". John McCarthy (2020) defined AI as "the science and engineering of making intelligent machines, especially intelligent computer programs". There are many ways to define AI; some of the definitions are provided in Table 1.1.
Table 1.1 Possible definitions of artificial intelligence

• If a computer system simulates human intelligence and cognitive behaviour in a given narrow domain, it is called an intelligent system
• A system that follows human-like thought processing for problem-solving and decision making is considered an intelligent system
• AI is the application of a heuristic-based and non-algorithmic approach to problem-solving, as humans do
• If a machine exhibits the characteristics that are associated with natural intelligence, the machine is said to be intelligent
• A system that can handle partial information and has the ability to learn by itself can be considered an intelligent system
From Table 1.1, it can be observed that the definitions each consider a few aspects/characteristics of intelligence. These characteristics are in accordance with the characteristics of natural intelligence shown in Fig. 1.1. Since the field of artificial intelligence is loosely defined, one can take the liberty of defining artificial intelligence in one's own way, as per the application domain or need.
1.3 Types of Artificial Intelligence and Applications

Generally, most AI-based systems are implemented and expected to work in a narrow domain. Such systems are called weak AI or narrow AI. Weak AI-based systems do not have general intelligence, which is very difficult for a machine to achieve. Human beings are blessed with such generic intelligence, which is highly flexible and has greater application scope in various fields. Artificially intelligent tasks are generally classified into three basic categories, namely formal, expert, and mundane tasks. Tasks such as perception, balancing, and talking are mundane tasks, which are learned automatically by human beings and considered ordinary and easy to learn. For machines, such mundane tasks are very difficult because of the nature of the knowledge involved and the non-algorithmic approach to problem-solving. Formal tasks, on the other hand, are the easiest for machines because they are well defined by sets of formal rules; chess playing is one example. Table 1.2 lists a few examples of mundane, expert, and formal tasks.
1.4 Components of the Symbolic AI

Traditionally, all artificially intelligent systems deal with knowledge, implicitly or explicitly. As per the famous DIKW chain (Data, Information, Knowledge, and Wisdom) (Akerkar & Sajja, 2009), also known as the data pyramid, knowledge is the basic entity needed to gain wisdom and intelligence.
Table 1.2 Examples of various task types

Formal tasks:
• Integration and differentiation
• Theorem proving in mathematics
• Chess and other game playing

Expert tasks:
• Fault finding in machines
• Design of products, machines, and processes
• Medical diagnosing
• Financial analysis
• Planning, control, and monitoring

Mundane tasks:
• Perception and vision
• Video abstraction
• Image identification/understanding
• Language generation, translation, and Natural Language Processing (NLP)
• Reasoning and common sense
• Balancing, locomotion, and other robot control applications
Systems that explicitly acquire, store, and use knowledge for intelligent decision making are known as knowledge-based systems. Knowledge-based systems are a consortium of techniques developed over a knowledge base in order to facilitate the storage, retrieval, use, and inference of knowledge for problem-solving. The typical architecture of a knowledge-based system is given in Fig. 1.2. Figure 1.2a shows an abstract view of a knowledge-based system, with all facilities built around a kernel, the core part called the knowledge base. Figure 1.2b shows the major components of a knowledge-based system: the knowledge base, the inference engine, the explanation and reasoning utility, and a user interface. These components are discussed below.
Fig. 1.2 Knowledge-based systems architecture
1.4.1 Knowledge Base

A knowledge base is the core part of a typical knowledge-based system and acts as a repository of acquired and inferred knowledge stored in proper knowledge structures. A knowledge base contains knowledge in various structures such as rules, semantic networks, scripts, and frames. It also includes facts and heuristics, along with other local data, used to make decisions. A knowledge base should always keep some room to accommodate new knowledge, and its content needs to be updated from time to time to remove knowledge that is no longer needed. Such a knowledge update can be done in the following ways.

• Manual update: with the help of a knowledge engineer (the software engineer who develops the knowledge-based system).
• Update by users: the knowledge base of the system undergoes the necessary updates by the users themselves. As users are generally non-computer professionals, an interactive and user-friendly editor is needed to interact with them and to update the knowledge base at given intervals or on request.
• Machine learning: the machine itself identifies the necessary knowledge from users, data, and the surrounding environment. The learning process can be supervised (in the presence of data and a controlled learning mechanism), unsupervised, or hybrid.

A knowledge-based system needs to acquire, store, and use different types of knowledge. Table 1.3 describes different types of knowledge with examples. It is to be noted that knowledge, regardless of its type, is very abstract in nature, continuously and non-monotonically increasing, and hard to characterize. These characteristics make it very difficult to acquire, represent, and apply knowledge for decision making and problem-solving.
1.4.2 Inference Engine

An inference engine in a typical knowledge-based system is a control program that applies the knowledge available in the knowledge base of the system in order to make intelligent decisions. An inference engine can use forward chaining, backward chaining, or hybrid chaining. It should also encompass a conflict resolution strategy, which is invoked when more than one rule matches the given situation and the system cannot determine which rule has to be fired. Various approaches are used to handle such conflicts, for example rule ordering and priority setting, recency (prefer the rule that uses the most recently added data in working memory), specificity (prefer the rule with the highest number of conditions), and refractoriness (do not fire a rule again on data on which it has already fired).
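A minimal recognize-act cycle with two of these conflict resolution strategies, specificity and refractoriness, can be sketched as below. The rules and working-memory contents are invented for illustration only and are not taken from the book.

```python
# Each rule: (name, set of conditions, action to add to working memory)
RULES = [
    ("R1", {"fever"}, "suggest_rest"),
    ("R2", {"fever", "rash"}, "suggest_specialist"),  # more specific than R1
]

def run(working_memory):
    working_memory = set(working_memory)
    fired = set()  # refractoriness: a rule fires at most once here
    while True:
        # Recognize: gather every rule whose conditions hold (the conflict set)
        conflict_set = [r for r in RULES
                        if r[1] <= working_memory and r[0] not in fired]
        if not conflict_set:
            break
        # Resolve: specificity, i.e. prefer the rule with the most conditions
        name, conditions, action = max(conflict_set, key=lambda r: len(r[1]))
        fired.add(name)
        working_memory.add(action)  # Act
        print(f"fired {name} -> {action}")
    return working_memory

run({"fever", "rash"})  # fires R2 before R1 because it is more specific
```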
Table 1.3 Types of knowledge

Domain knowledge
• Knowledge that specifically belongs to a domain such as drug design, bio-informatics, banking, or finance
• Domain knowledge is one of the key ingredients of a symbolic artificial intelligence based system

Heuristic knowledge
• A rule of thumb and practical approach that gives a good and acceptable solution in acceptable time, instead of the optimum solution in impractical time
• An example is a working day embedded between two holidays, which is often converted into paid leave

Meta knowledge
• Knowledge about knowledge
• Examples include the name and type of knowledge, its frequency of usage, the path to reach the knowledge chunks in storage, and log details such as last used by whom and why

Commonsense knowledge
• Knowledge that is expected to be known by common people regardless of domain and expertise
• Examples: (i) sunglasses should be purchased in daylight, and (ii) if you do not exercise, you gain weight

Informed commonsense knowledge
• Knowledge that is expected to be trivial and known to the group of people who practise in a given domain
• For example, paracetamol is necessary for controlling fever
• Another example: even if you exercise regularly, you may gain weight because of hypothyroidism

Other types of knowledge
• Tacit knowledge lies in the subconscious mind of a human and cannot be represented easily. Such unwritten, hidden, and unspoken knowledge is based on one's experiences, intuition, and insight
• Explicit knowledge, on the other hand, can be easily represented and documented. It is a kind of knowledge that is easy to articulate and communicate through formal means
• Procedural or imperative knowledge is 'knowing how' knowledge, such as standard business routines and management procedures. Examples include the steps and tricks of vehicle or personal loan approval, fault diagnosis, recipes, and kitchen hacks while cooking
1.4.3 Self-Learning

Learning in a knowledge-based system updates its knowledge base. With the help of different learning strategies such as case-based learning, interactive learning, and machine learning, a knowledge-based system can update its knowledge base. Often, interactive editors are provided to facilitate learning from cases and to eliminate the need for a knowledge engineer. The self-learning strategies and possible methods of knowledge update have been discussed in the previous sections.
1.4.4 Explanation and Reasoning

A knowledge-based system can use the knowledge stored in it to provide brief reasoning and a detailed explanation of the decision taken. For example, a knowledge-based system having rules and facts in its knowledge base can provide a list of the rules used to reach a given decision, which is called reasoning. A detailed explanation of the decision can come from associated text files or be generated on demand.
1.4.5 User Interface

A knowledge-based system interacts with its stakeholders through an effective and friendly user interface. The credibility of a system depends on many factors, and a friendly user interface is one of the most important of them. If end-users, developers, and testers are not comfortable interacting with the system, its operational acceptability becomes low.
1.5 Knowledge Acquisition

Knowledge acquisition is one of the practical challenges faced while developing a knowledge-based system. As knowledge is difficult to characterize, voluminous, and continuously changing, knowledge acquisition has to be done with great care and caution. As experts are a rare commodity in any domain, the very first difficulty is finding and convincing suitable experts to take part in the knowledge acquisition process. Even when experts are ready to share their knowledge, they are usually non-computer professionals, and hence a knowledge engineer has to understand the domain knowledge first and encode it into the knowledge base later. Further, only limited support is available from a few fact-finding methods such as interviews, questionnaires, record reviews, and observations, along with a few knowledge acquisition
methods and tools, which makes the process even more challenging. Concept mapping (the organization of entities and relationships, also used for knowledge representation), auditing, discussion, storytelling, and experimenting are a few examples of knowledge acquisition techniques. Knowledge can be acquired from multiple sources; however, the best sources of knowledge are experts. The knowledge of many experts can be acquired by different techniques for a given domain, cross-verified, and encoded with the help of proper knowledge representation schemes. The acquisition and representation of knowledge also depend on the type of knowledge. While acquiring knowledge, one needs to consider the following heuristics.

• Most of the important knowledge usually comes from experts, who are a rare commodity in the domain.
• A knowledge engineer, the person who deals with the identified experts, has to understand the domain terminology and basic facts before interacting with the selected experts.
• Without a notion of the problem, it is challenging to acquire knowledge.
• If there are multiple experts, there are chances of contradiction as well as duplication, which need to be managed.
• It is the knowledge of the knowledge engineer that is ultimately reflected in the system.
• Interviewing, auditing, brainstorming, questionnaires, concept mapping, record reviews, storytelling, problem-solving, mining, and observation are the major tools and techniques for knowledge acquisition.

Once knowledge is acquired, it is necessary to document it using suitable knowledge representation schemes. The major and popular knowledge representation schemes are mentioned in the following section.
1.6 Knowledge Representation

As there are various types of knowledge to be represented, such as objects and facts, relationships, heuristics, rules and procedures, and meta knowledge, various representation schemes become inevitable. Knowledge representation structures are available in declarative, procedural, and structural forms. Acquired knowledge is limited to what the knowledge engineer manages, through conscious effort, to collect from various sources and multiple experts using various techniques. The knowledge needs to be represented effectively in the system for its possible applications. Depending on the type of knowledge and the application needs, suitable knowledge representation schemes must be selected. Popular knowledge representation schemes are rules, semantic networks, frames, scripts, and hybridizations of these schemes.
Fig. 1.3 Knowledge representation in rules
1.6.1 Declarative Knowledge

To represent formal entities, we normally use declarative knowledge. For example, consider these two statements: (i) "A man needs to take food"; and (ii) "Ram is a man". These facts can be represented as follows.

∀x (Man(x) → Needs_Food(x))
Man(Ram)

This type of representation, with quantifiers and operators such as implication and conjunction, is also called a Well Formed Formula (WFF).
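As a short worked illustration (not part of the book's text), an inference engine can combine the two statements above to derive a fact that is not stored explicitly: instantiating the universally quantified formula with x = Ram gives Man(Ram) → Needs_Food(Ram), and applying modus ponens with the fact Man(Ram) then yields the new fact Needs_Food(Ram), i.e. "Ram needs to take food".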
1.6.2 Rules

Rules (also called production rules) follow the 'If condition then action' format. A rule-based system needs a set of such rules, a working memory, and a recognize-act cycle. The controlling program checks the condition of a rule and, if it holds true, the production rule fires and the corresponding action is carried out. The working memory stores the current state, values of variables, data, and so on. When the condition parts of multiple rules match simultaneously, a conflict resolution strategy is also required. Figure 1.3 illustrates a few examples of rules. Using rules to represent knowledge makes the representation very natural, readable, and modular; however, the learning capability of rules as a representation tool is very limited.
1.6.3 Frames

A frame is a knowledge representation structure similar, in general, to a record data structure. A frame is used to store default knowledge about an entity; for example, students, bikes, books, and cars can be best represented with the frame structure. An example is given in Table 1.4. In the frame shown in Table 1.4, optional procedural knowledge is also attached, which is provided on demand.
Table 1.4 An example of a frame: book

Title: Computational Intelligence
Genre: Computer science
Author: Priti Srinivas Sajja
Edition: Third edition
Year: 2020
Pages: 325
Topics: If needed: (look at the content)
Many slots can be added or altered in a frame as per the need. A frame can easily be hybridized with rules or another knowledge representation scheme; as the frame offers limited opportunities to accommodate inference and learning mechanisms, such hybridization may be useful. Frames are also useful to programmers, as the knowledge represented in a frame can be mapped easily to a well-defined computer (data) structure. Frames are very flexible for expansion as well as restriction, as slots can easily be added to and removed from a frame.
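As a hint of how such a mapping might look, the sketch below represents the book frame of Table 1.4 as a small Python structure with default slots and an attached 'if needed' procedure. It is an illustrative sketch only, not an implementation prescribed by the book.

```python
class Frame:
    """A frame: named slots with default values plus optional 'if needed' procedures."""
    def __init__(self, name, slots=None, demons=None):
        self.name = name
        self.slots = dict(slots or {})    # default (declarative) knowledge
        self.demons = dict(demons or {})  # attached procedures, run on demand

    def get(self, slot):
        if slot in self.slots:
            return self.slots[slot]
        if slot in self.demons:           # 'if needed': compute the value when asked
            return self.demons[slot](self)
        return None

book = Frame(
    "Book",
    slots={"Title": "Computational Intelligence",
           "Author": "Priti Srinivas Sajja",
           "Year": 2020,
           "Pages": 325},
    demons={"Topics": lambda f: f"Look at the contents of '{f.slots['Title']}'"},
)

print(book.get("Author"))  # stored slot value
print(book.get("Topics"))  # value produced by the attached procedure
```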
1.6.4 Scripts

A script is a structure that describes situations, relevant to decision making, that are expected to happen. A script represents a series of events (scenes) with roles, props, and entry as well as exit conditions, along with procedures. A 'Going to the library' script is described in Table 1.5. A script is generally used to represent a sequence of events and actions that is normally carried out, and a script can lead to other scripts, as demonstrated in Table 1.5.

Table 1.5 An example of a script: 'Going to the library'
Roles: Librarian, students, readers, guest, etc.
Props: Library, table, books, computer, chairs, etc.
Entry: First: 'enroll in the library' script
Procedures: 'Find book' script; 'Get issue' script; 'Read book' script; 'Return book' script
Exit: Finally: 'leave library' script
Fig. 1.4 An example of a semantic network
1.6.5 Semantic Networks

A semantic network is a graphical representation of entities with relationships as connections. The entities can be people, objects, or concepts. A semantic network can be viewed as a connected graph with entities as vertices and semantic relationships as edges. Sometimes the semantic network is considered an alternative to predicate logic. Semantic networks are visual, easy to understand, and expandable. The relationship between two objects can be, for example, an 'is a' relation (inheritance) or a 'kind of' relation. Figure 1.4 illustrates an example of a semantic network. A well-known example of a semantic network is WordNet (https://wordnet.princeton.edu/). WordNet is a lexical collection of most English language components such as verbs, nouns, pronouns, and adjectives. WordNet groups the words of the English language into sets of synonyms and identifies as well as represents relationships between the words.
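A semantic network maps naturally onto a labelled graph. The sketch below, an illustration with invented entities rather than code from the book, stores the edges as (subject, relation, object) triples and answers a simple inheritance query by following 'is a' links.

```python
# Edges of a small semantic network: (subject, relation, object)
NETWORK = [
    ("Student", "is_a", "Person"),
    ("Priti", "is_a", "Student"),
    ("Student", "reads", "Book"),
    ("Book", "kind_of", "Document"),
]

def related(entity, relation):
    """Direct neighbours of an entity over one relation type."""
    return [o for s, r, o in NETWORK if s == entity and r == relation]

def inherits_from(entity, ancestor):
    """Follow 'is_a' links upward to test inheritance."""
    return any(parent == ancestor or inherits_from(parent, ancestor)
               for parent in related(entity, "is_a"))

print(related("Student", "reads"))       # ['Book']
print(inherits_from("Priti", "Person"))  # True, via Student -> Person
```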
1.6.6 Hybrid Knowledge Representation Structure

Two or more knowledge representation schemes can be combined to obtain multifold advantages and to increase the effectiveness of the representation. For example, in the semantic network shown in Fig. 1.4, the relationship between a student and a book is mentioned; here the 'student' node can be a frame containing default knowledge about a student. Such frames can also be included in a script (see Table 1.5). Similarly, rules can be accommodated in a frame for fine-tuning and for enhancing 'if needed' types of slots. A semantic network of frames and scripts can also be considered a hybrid knowledge representation structure.
1.6.7 Evaluation of Different Knowledge Representation Techniques

There are some criteria for selecting a knowledge representation scheme. A few such criteria are evaluated for the various knowledge representation schemes in Table 1.6. A similar matrix can be prepared for hybrid knowledge representation schemes.
Table 1.6 Evaluation of different knowledge representation techniques

Criteria                                 | Predicate logic | Rules      | Frame                      | Script     | Semantic network
Ability to handle incomplete knowledge   | Poor            | Good       | Average                    | Average    | Poor
Ability to incorporate default knowledge | Good            | Poor       | Good                       | Average    | Poor
Adequacy                                 | Good            | Good       | Good                       | Good       | Good
Completeness                             | Good            | Average    | Average                    | Average    | Poor
Consistency                              | Good            | Average    | Good                       | Average    | Average
Expressiveness                           | Average         | Good       | Good                       | Good       | Good
Extendibility                            | Average         | Good       | Average                    | Good       | Average
Knowledge support                        | Declarative     | Procedural | Declarative and procedural | Procedural | Declarative
Modularity                               | Good            | Good       | Average                    | Average    | Average
Naturalness                              | Good            | Good       | Good                       | Good       | Good

1.7 Types of Knowledge-Based Systems

There are many types of knowledge-based systems, the prominent ones being given below.

• Expert systems;
• Linked based systems;
• CASE based systems;
• Intelligent tutoring systems;
• Agent-based systems; and
• Intelligent interfaces to data.

These systems are discussed in brief, with their typical architectures, below.
Fig. 1.5 Architecture of an expert system
1.7.1 Expert System

An expert system is the most popular type of knowledge-based system. Because the knowledge of one or more experts is documented in a suitable knowledge structure for problem-solving, decision making, future use, and training, it is known as an expert system. It is also defined as a system that replaces an expert in a given narrow domain. Mycin and Dendral (see https://exhibits.stanford.edu) are examples of early expert systems. Typically, in an expert system, the knowledge has to be collected and stored in a knowledge base. However, modern business and applications require much more than an expert system provides, and as traditional expert systems do not scale to this demand, new/modern artificial intelligence methods need to be used. Figure 1.5 illustrates the traditional architecture of an expert system. The components such as the knowledge base, inference mechanism, explanation, reasoning, and user interface work in the manner described in Sect. 1.4 of this chapter. As illustrated in Fig. 1.5, the knowledge engineer is responsible for acquiring knowledge from one or more experts, understanding it, and representing it using a suitable knowledge structure.
Various knowledge acquisition methods and knowledge representation structures are listed in the figure. Users interact with the expert system through the given user interface. Later, the users can provide feedback either through the user interface (if such a provision is made) or to the knowledge engineer. Besides fifth-generation programming languages, many tools are available to develop expert systems. There are also ready-made expert systems with an empty knowledge base, known as expert system shells, in which the experts themselves add knowledge to the knowledge base with the help of an interactive editor.
1.7.2 Linked Based Systems

As per the well-known DIKW chain (refer to Sect. 1.4), information as well as knowledge chunks have to be synthesized or linked with each other to produce higher-level knowledge or wisdom. Linked based systems are built on this concept. Since such a system deals with knowledge stored in a well-structured knowledge base, it is considered a kind of knowledge-based system. In the era of the Internet (a collection of machines) and the Web (a collection of documents on the Internet), such systems can play an important role. Sometimes a standalone object or event does not contribute major knowledge; however, when it is linked with other related entities, it may generate a higher level of knowledge. Such linking is made possible between various types of data and entities, many of which are multimedia objects, e.g. hyper videos. Figure 1.6 demonstrates the architecture of a linked based system on a web-based platform or the internal network of an organization. The material to be presented to the user is stored as loosely coupled components or objects in multimedia format. The material is dynamically linked and a page is formed on requirement. To support this, style sheets, a template base, and document type definitions are needed, along with other data and related protocols. To impart intelligence, the knowledge base and other utilities such as inference, explanation, reasoning, and self-learning are available. Above this, there is a user interface for non-computer professionals, administrators, and knowledge engineers to use, manage, and update the system.
Fig. 1.6 Architecture of the link-based system

1.7.3 Computer Aided Systems Engineering (CASE) Based System

It is observed that systems development involves the application of knowledge at conscious as well as subconscious levels; it is a kind of art as well as a science. For the development of traditional non-intelligent systems, plenty of approaches, models, and techniques are available; however, these tools provide only guidelines, and only to some extent. System development requires more support, delivered in an intelligent manner.
Further, no significant support is available for developing a knowledge-based system. Here, a Computer Aided Systems Engineering (CASE) type of knowledge-based system helps in many ways. Major help can be provided in the areas of knowledge discovery and requirements documentation using a suitable ontology. It also helps in the selection of a robust, secure, and efficient design, in automatic programming, and in testing by generating effective test cases. Figure 1.7 illustrates the general structure of a knowledge-based system to support the software engineering process. The architecture supports the traditional activities involved in software engineering, such as elicitation of requirements, evaluation of requirements through feasibility tests, design development, coding, testing, and implementation. For these activities, inputs from experts, knowledge engineers, users (for requirements and feedback), and other resources are collected and provided to the system. The system provides support for knowledge acquisition and representation (in a manual manner) and the other traditional utilities of a knowledge-based system.
Fig. 1.7 Architecture of KBS to support CASE
1.7.4 Knowledge-Based Tutoring Systems

Knowledge-based tutoring systems also require an intelligent approach for storing and retrieving higher-level learning material, for its efficient delivery, and for identifying various types and levels of learners. Different learning objects are normally bound dynamically, on demand and according to the learner's level. Tools and technologies such as virtual classrooms, simulation, and natural language processing can be used to enhance the learning experience. Further, platforms and technologies like the Internet, the Web, cloud computing, and other such tools accelerate learning among the target audience.
Fig. 1.8 Architecture of knowledge-based tutoring system
human tutor's behavior and ability to respond to the complex reactions of learners. Figure 1.8 illustrates the architecture of a knowledge-based tutoring system with a student (learner) module, an instructor module, and a knowledge module. There may be more than one learning material repository, and repositories can be added to the system on demand. It is to be noted that the processes related to knowledge acquisition and knowledge representation, carried out by the knowledge engineer with the support of the experts and users, are not shown in Fig. 1.8.
1.7.5 Agent-Based System Agents are real-life entities that work on behalf of users when the problem is not fully known and resources such as equipment, information, and expertise are not available at a single location but are distributed across many places. Intelligent agents are autonomous, independent, proactive, knowledge-based, cooperative, social, and able to learn. Many such agents can be accommodated in a multi-agent framework, where they communicate using a common protocol and/or language, such as the Agent Communication Language (ACL) proposed by the Foundation for Intelligent Physical Agents (FIPA)3 or the Knowledge Query and Manipulation Language (KQML), to exhibit intelligent behavior.
3 https://www.fipa.org/.
Fig. 1.9 Mechanism of an agent
As stated, an agent is an entity, process, or instrument that works on behalf of its users. Agents can be learning, cooperative, autonomous, and social in nature. All agents have a mechanism to sense the environment and act accordingly. Figure 1.9 illustrates the general mechanism of an agent. An agent-based system traditionally has one or more agents along with a knowledge base and other utilities. A system having multiple agents in it is called a multi-agent system. The agents in multi-agent systems fall into a few categories such as (i) domain agents, (ii) interface agents, (iii) information agents, (iv) mobile agents, and (v) hybrid agents. Domain agents are application-specific and work on the part of the problem where domain knowledge has to be applied. The other agents can be generic. An information agent generally tries to identify and acquire information from distributed, heterogeneous (not restricted to one type, say not only tables and files) sources on a given network. An interface agent helps provide a friendly, graphical, or regional language-based environment for ease of use and smooth interaction with the users of the system. A mobile agent holds a list of IP (Internet Protocol) addresses, each uniquely identifying a computer; this list is known as a ticket, with which the agent can move in a given network environment. A hybrid agent combines the methodology of two or more agent categories; for example, an information agent can gain mobility by hybridizing mobile and information agent capabilities. Figure 1.10 illustrates the generic structure of a multi-agent system. The architecture of Fig. 1.10 is divided into three layers, i.e. the interface, domain, and information or service layers.
1.7.6 Intelligent Interface to Data In the modern era, the majority of businesses use computers and other electronic tools to support everything from routine to extraordinary transactions. This generates lots of data, much of which remains unused. If a business is large enough, these data are often unstructured, heterogeneous, voluminous, and vague, and they arrive at different velocities. Such really big data are difficult to handle when trying to get a proper understanding
Fig. 1.10 Architecture of a multi-agent system
of the business and gain the insight needed to make useful decisions. A knowledge-based system that handles such data and acts as an intelligent interface to heterogeneous data sources is becoming popular in the current scenario. Figure 1.11 illustrates a generic architecture of an intelligent interface to data-based systems. It is to be noted that all types of knowledge-based systems are based on the generic architecture illustrated in Fig. 1.2.
Fig. 1.11 Intelligent interface to data-based system
1.8 Testing Intelligence To test whether a given system is intelligent or not, a test proposed by the well-known scientist Alan Turing (1950) is traditionally used. Alan Turing presented an outline of the test, hence it is called the Turing test. In the Turing test, a human questioner interacts repeatedly with a human answerer and a machine on an intelligent task in a narrow domain. If the machine can mimic the human thought process for the domain and, during the interaction, the questioner cannot distinguish between the machine and the human answerer, then the machine is said to be as intelligent as a human being. There are limitations to this test too. If a human being normally makes mistakes in the given situations, then the machine should also make similar mistakes. If the machine acts in a way smarter than the human being in the domain, it may fail the test despite its high capabilities. Further, in the modern era, smart programs such as chatbots can easily pretend to be human and can pass the test successfully. Variations of the Turing test are also available. In the original Turing test, the computer tries to act as intelligently as a human being. In the reverse Turing test, a human tries to convince a computer that it is not a computer; the popular CAPTCHA control on web pages is an example. If the Turing test involves only objective questions with 'Yes' and 'No' as the only possible answers, then it is called the objective Turing test, or the minimum intelligent signal test. As alternatives to the Turing test, the Chinese room test, the Marcus test (watching a television episode and answering questions about it), the Lovelace test 2.0 for creative art, etc. are available. Customized tests can also be developed depending on the domain of application and the expectations from the intelligent system being developed.
1.9 Benefits of AI Generic benefits related to intelligence and the documentation of knowledge are achieved when knowledge is stored electronically. These include the following types of benefits.
• Use of knowledge for problem-solving increases productivity and effectiveness;
• Advantages related to automation (efficiency), such as speed, accuracy, and long-term storage of knowledge, can be achieved;
• The knowledge of more than one expert can be available in one place, all the time, on demand;
• The intelligent approach increases the quality of the solution; and
• Documented knowledge can be used as intellectual property for future use and training.
Table 1.7 Applications of AI in various domains
Health care: Disease diagnosing systems, health informatics
Commerce: Personalized commerce and k-commerce
Education: eLearning, personalized learning, managing learning object repositories in an intelligent way, question answering systems, student monitoring, syllabus design, feedback evaluation systems
Finance: Portfolio management, investments, budget allocation, utilization
Law: Legal referencing, preventive measures of crime, cybercrime prediction, controlling crime
Manufacturing: Assembly line robots, process automation, new product design
Media and communication: Selection of news in desktop publishing, video summarization, translation, searching from big archives
Domestic appliances: Speech recognition, home assistance robots
Military and defense: Security, crowd monitoring
Planning and scheduling: Job scheduling
Administration: Employee selection, evaluation, promotion and demotion, effective resource utilization, workflow management systems
Entertainment: Gaming, movie recommendation
…: …
1.10 Applications of Artificial Intelligence Artificial intelligence is ubiquitous in nature; any business one can name can benefit from it. Table 1.7 lists some domains with example intelligent systems. Refer to Sect. 1.12 for other applications.
1.11 Limitations of Traditional AI Based Solutions Symbolic artificial intelligence-based systems traditionally deal with knowledge stored in a knowledge base; they are often called classical artificial intelligence-based systems. As stated earlier, intelligence requires knowledge, which is hard to characterize. Further, a large amount of knowledge is needed to take even a small decision. For example, to make a move on the chessboard, which is a relatively simple and well-defined problem, a significantly large set of rules is needed along with the positions of the chess pieces. The knowledge also keeps changing: new types of knowledge can evolve, and old, least-used knowledge chunks have to be removed from the knowledge base. A knowledge engineer is supposed to perform these tasks. In the absence of a knowledge engineer, machine learning techniques can update the system parameters and the knowledge base.
As effectiveness (due to intelligence) and efficiency (due to automation) join hands, artificial intelligence might become a superpower and may not remain under human control. If machines overpower the human race, human existence is at risk. Further, there are legal, ethical, and social issues that need to be considered while using artificial intelligence as a mighty tool.
1.12 Core and Applied Research Ideas in Symbolic Artificial Intelligence Techniques of symbolic artificial intelligence can be employed for innovative research as well as for commercial applications. Some applications contribute to the core techniques of symbolic artificial intelligence, e.g. a hybrid knowledge representation structure along with its inference mechanism. Such innovative core research is generic and can be applied to multiple domains to achieve the combined advantages of multiple knowledge representation structures. Other applications, such as fault finding in a machine, are very specific to a given domain and range of machines. Fault finding is a good application of expert systems; however, it does not contribute anything towards the techniques of traditional artificial intelligence. There may also be hybrid research, where fault finding across all categories of automatic washing machines is implemented with an innovative search algorithm or a novel knowledge representation technique. It is to be noted that symbolic artificial intelligence techniques are generally used for well-defined, narrow domains.
1.12.1 Core Project/Research Ideas in Symbolic Artificial Intelligence
• Knowledge discovery and editing tools which are domain-independent;
• Implementation of modified forward chaining and backward chaining, which can be extended to develop and implement a hybrid inference engine;
• Design and development of new or hybrid knowledge structures along with suitable inferencing strategies;
• Taxonomies for knowledge and experience storing and inferencing;
• Development of customized tests for evaluating intelligence, such as modified Turing tests and the Chinese room test;
• Identification and detailed study of parameters which make a system further intelligent;
• Study of various types of intelligence such as traditional/symbolic AI, narrow AI, modern AI, and general/super AI;
• Development of software engineering models/guidelines for better development and monitoring of expert systems;
• Development of multi-agent system frameworks and methods of interaction between agents, such as KQML;
• Generic intelligent interfaces to data-based systems and interfaces for big data handling;
• Study of various evaluation criteria for knowledge representation and definition of a benchmark quality matrix for evaluating knowledge representation schemes;
• Generic user-friendly tools to generate an expert system or an expert system shell with an empty knowledge base;
• Models of link-based and web-based intelligent systems;
• Novel and hybrid searching algorithms;
• Smart programming languages, intelligent debugging agents, and interactive editors for experimenting with knowledge-based systems;
• Knowledge-oriented (just like object-oriented or aspect-oriented) programming and design tools based on natural language or multiple languages;
• etc.
1.12.2 Applied Project/Research Ideas in Symbolic Artificial Intelligence Besides the applications listed for various domains in Table 1.7, the following applications/projects are possible.
• An expert system to assist with crop selection in the agricultural domain, considering the land pattern, soil type, rain prediction, history, and irrigation as well as other infrastructure facilities;
• Expert systems for fault finding in engineering and manufacturing applications;
• Knowledge-based systems for crime investigation, law referencing, and legal advisory applications;
• Intelligent systems that suggest insurance for health, vehicles, gold ornaments, etc., besides handling claims;
• An expert system to find the aptitude of students for further studies;
• Expert systems to play games such as chess;
• Expert systems to select products such as online consumer products, courses, exercises, diet plans, movies, songs, and tour plans, or to recommend the promotion of employees;
• Planning and monitoring of various budget-related activities;
• Analysis of various heuristics used in different business domains such as farming, academics, medical diagnosis, fault finding, planning, and share market portfolio management;
• Small and medium scale business advisory systems for rural development;
• Study and collection of various types of knowledge in a given domain and representation of the knowledge in an appropriate knowledge structure; example domains are sales, finance, marketing, and planning for a business.
• Natural and regional language processing such as word sense disambiguation and semantic parsing using the semantic network;
• Computer-aided instructions and eLearning for a skill-based course;
• Secured gateways for online payment;
• Domestic security applications;
• etc.
References
Akerkar, R. A., & Sajja, P. S. (2009). Knowledge based systems. Sudbury: Jones & Bartlett Publishers.
McCarthy, J. (2020, June 14). Retrieved from https://www-formal.stanford.edu/jmc//.
Rich, E., Knight, K., & Nair, S. B. (2008). Artificial intelligence. Tata McGraw-Hill Education Pvt. Ltd.
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.
Chapter 2
Constituents of Computational Intelligence
Abstract This chapter introduces the terms hard computing, soft computing, and computational intelligence by highlighting the difference between traditional hard computing and soft computing. The chapter also discusses how the limitations of symbolic AI can be overcome by modern computational intelligence.
The major and popular constituents of computational intelligence, namely fuzzy logic, artificial neural networks, and genetic algorithms, along with their possible hybridizations, are discussed in this chapter. While discussing the fuzzy logic constituent, concepts such as fuzzy sets, crisp sets, membership functions, fuzzy operations, fuzzy relations, fuzzy rules, and fuzzy control systems are elaborated by giving the necessary illustrations and examples. The second constituent of computational intelligence discussed here is the artificial neural network. The chapter presents an introduction to the biological neuron and its simulation to demonstrate how neural network-based systems work. Different neural network models such as the Hopfield network, the perceptron, the multilayer perceptron with backpropagation, and the Kohonen self-learning model are also discussed in detail by providing their properties, design architectures/heuristics, learning mechanisms, and examples. Strategies such as supervised and unsupervised learning are also discussed, along with a brief introduction to deep learning. The third constituent discussed in detail is the genetic algorithm. Genetic and evolutionary algorithm encoding, fitness functions, and genetic operators are discussed with illustrative examples. Encoding strategies such as binary encoding, alphabet encoding, hexadecimal encoding, and tree encoding are discussed in detail, as is an example of function optimization. A brief discussion is also provided on application-specific operators and fitness functions, with an example illustrating the traveling salesperson problem. Towards the end, the notion of a schema is introduced, together with a brief note on parallel genetic algorithms. The last section of the chapter discusses possible hybridizations of the computational intelligence constituents, such as neuro-fuzzy systems, genetic-fuzzy systems, neuro-genetic systems, and neuro-fuzzy-genetic systems, by providing possible
methods of fusion, advantages, and disadvantages along with illustrations and examples.
2.1 Computing Intelligence, Hard Computing, and Soft Computing Computing can be defined as any goal-oriented activity encompassing analysis, design, processing, and often automation. Various domains such as information technology, computer engineering, computer science, information systems, and software engineering involve computing. In these fields, computing is carried out through software: tangible things like the keyboard, monitor, and processor are called hardware, while intangible things such as logic (thoughtful steps, i.e. a set of instructions known as a program) and processes (a procedure in execution) are identified as software. When it comes to the artificial intelligence domain, hard computing and soft computing have slightly different meanings. Hard computing deals with rigid models and strict definitions, which are traditionally used in non-AI based systems. Examples of hard computing generally cover lower-level but straightforward methods such as addition, percentage calculation, and averaging, as well as complex decision making such as the simplex and linear programming methods of operations research; hard computing thus also involves high-level but very rigid methods. Soft computing deals with solving real-life problems in approximate and natural ways. Soft computing techniques can also handle imprecision, uncertainty, and partial information. As modern problems require the involvement of many non-linear, dynamic, and spatial techniques, solving them with traditional hard computing methods is challenging and less effective. Traditional and classical AI also has many limitations, such as the abstract nature of knowledge, limited support for knowledge acquisition methods, limited knowledge representation structures, and the effort needed to develop knowledge-based components. Such systems quickly become obsolete too. Further, the modern world has witnessed the birth of innovative and spectacular problems which are multidisciplinary or interdisciplinary in nature. This requires employing more than one method in flexible ways for decision making and problem-solving. The term soft computing was coined by Lotfi Zadeh (1994). He defined soft computing as an approach that imitates the human mind to reason and learn in an environment of uncertainty and imprecision. As per Zadeh, soft computing is a consortium of more than one technique. Such a consortium is also known as computational intelligence, as defined by the IEEE.1 Sometimes the soft computing/computational intelligence consortium is extended with techniques such as probabilistic reasoning, swarm intelligence, and chaos theory. However, there is a fine line between soft computing and computational intelligence, and the terms are often used interchangeably. The major and popular constituents of the computational intelligence consortium are fuzzy logic, artificial neural networks, and genetic algorithms.
1 https://cis.ieee.org/.
These constituents
Table 2.1 Hard computing and soft computing
Traditional hard computing:
• Traditional, formal, and conventional techniques
• Requires and handles precise and complete data during input, output, and processing
• Applicability is less flexible and rigid
• Resembles mechanical procedures very well
• Based on binary logic and crisp set/logic theory
• Generally performs sequential or linear computations
• Results are precise and the aim is to obtain optimum results
Computational intelligence-based (soft) computing:
• Informal and non-conventional techniques
• Can handle ambiguous data as well as data with partial, vague, and imprecise content
• Applicable in a highly flexible way to real-life complex problems
• Resembles biological processes very well
• Based on approximate reasoning, multi-valued logic, self-learning, and an evolutionary approach
• Can perform sequential as well as parallel computations
• Provides approximate, good, and acceptable results
are not always in competition with each other; rather, they complement each other through proper hybridization. Later in this chapter, the possible hybridizations of these popular computational intelligence constituents are discussed. Chapter 6 of the book also discusses some practical applications of hybrid intelligent systems using computational intelligence techniques. Table 2.1 describes the major differences between traditional hard computing and soft computing/computational intelligence. As stated above, computational intelligence or soft computing is a consortium of various techniques such as fuzzy logic, artificial neural networks, and evolutionary algorithms. Figure 2.1 illustrates the different constituents of soft computing. The three constituents shown in Fig. 2.1, namely fuzzy logic, neural networks, and evolutionary algorithms, can also be called the pillars of computational intelligence. Other constituents of soft computing are swarm intelligence and chaos theory, and many more components can be added to the consortium. In the subsequent sections of the chapter, these basic constituents are discussed in brief.
2.2 Fuzzy Logic Fuzzy logic, introduced by Lotfi Zadeh, is a multi-valued logic based on fuzzy set theory. Zadeh (1994) also suggested fuzzy logic as one of the important components of the computational intelligence consortium. Traditional sets are crisp in nature, i.e. for a given set, if an entity fulfills the conditions imposed by the set definition completely, it belongs to the set, otherwise it does not; there is no intermediate or partial belongingness to the set. In reality, however, we categorize things into classes where
Fig. 2.1 Constituents of soft computing
there is no rigid boundary. A luxurious car, a handsome man, a ripe mango, a big house, a young woman, a hot temperature, and a tall person are some examples. Suppose a traditional set of tall people is defined as follows: Set of tall people = {All people whose height is >= 5.6 feet}. In the above-mentioned set of tall people, neither a person with a height of 5.4 feet nor a person with a height of 4.2 feet can be accommodated. That means both persons are treated at par; just because of falling slightly short of the 5.6 feet benchmark, the first person loses his membership in the set of tall people. In reality, we would still try to accommodate the person with a height of 5.4 feet in the set of tall people, with less significance. This is called partial membership, and a set without such a rigid boundary is called a fuzzy set. An illustration is provided in Fig. 2.2. As shown in Fig. 2.2, the crisp set (labeled (a)) has a rigid boundary and provides a tight definition of membership: all people who have a height greater than or equal to 5.6 feet can be part of the set. That is, the crisp set offers a binary status of membership to entities. If an entity satisfies the definition of the set, it becomes a complete member of the set; in this case, the belongingness degree of the entity to the set is 1 (complete belongingness). If an entity is not a member of the set, its belongingness to the set is 0 (complete non-belongingness). This gives a clear partition between the set of tall people and other people, as shown in Fig. 2.2c. On the other hand, fuzzy logic enables partial belongingness of entities to a fuzzy set. Such sets are called fuzzy as there are no rigid boundary constraints imposed on entities to become members. According to this concept, a person with a height of 4.8 feet can belong to the set of tall people with partial membership, the degree of which is defined by
Fig. 2.2 Crisp set, fuzzy set, and partial membership
a function called a membership function, as shown in Fig. 2.2d. As per the function illustrated in Fig. 2.2d, a person with a height of 4.8 feet belongs to the fuzzy set of tall people with membership degree 0.8 (see the dotted line in Fig. 2.2d). Such belongingness is also called graded membership. Fuzzy sets can be considered a superset of the crisp (classical) sets. Classical sets offer only two values of membership: complete membership (1) or no membership (0). Fuzzy sets offer multiple membership values between 0 and 1. That is why the range of values on the Y-axis is always between 0 and 1 and fuzzy logic is referred to as multi-valued logic. The mathematical notation of a fuzzy set can be given as follows. “A fuzzy set Ã in the universe of information U can be defined as a set of ordered pairs and can be represented mathematically as Ã = {(y, µÃ(y)) | y ∈ U}, where µÃ is the membership function of Ã; it assumes values in the range from 0 to 1, i.e., µÃ(·) ∈ [0, 1]. The membership function µÃ(·) maps U to the membership space M. The dot (·) in the membership function represents the element of the fuzzy set, whether discrete or continuous.” If a membership function is defined, the membership of any entity can be determined; this process is called fuzzification. We also require conversion of fuzzy values into their equivalent crisp values; this procedure is called defuzzification. The area under the curve, centroid, max-membership, mean-max, and weighted average methods are some popular defuzzification methods. As we can perform operations such as union, intersection, and complement on crisp sets, such operations can also be performed on fuzzy sets with little modification. Table 2.2 lists some operations on fuzzy sets.
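As a quick illustration of fuzzification, the short sketch below computes graded memberships for the set of tall people in Python. The piecewise-linear membership function is an assumption chosen so that a height of 4.8 feet maps to roughly the 0.8 degree mentioned for Fig. 2.2d; it is not the exact curve of the figure.

```python
# Illustrative sketch only: the ramp boundaries (4.0 and 5.0 feet) are assumed values.
def mu_tall(height_ft):
    """Membership degree of a given height (in feet) in the fuzzy set of tall people."""
    if height_ft <= 4.0:
        return 0.0          # clearly not tall: no membership
    if height_ft >= 5.0:
        return 1.0          # clearly tall: full membership
    return (height_ft - 4.0) / (5.0 - 4.0)   # linear ramp in between

for h in (4.2, 4.8, 5.4, 5.8):
    print(h, "->", round(mu_tall(h), 2))     # 4.8 ft gets partial membership 0.8
```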
Table 2.2 Operations on fuzzy sets
• Complement (Ā): µĀ(x) = 1 - µA(x)
• Union (A ∪ B): µA∪B(x) = max[µA(x), µB(x)]
• Intersection (A ∩ B): µA∩B(x) = min[µA(x), µB(x)]
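A small sketch of these operations on discrete fuzzy sets follows; the membership values of the two sets are made up purely for illustration.

```python
# Two discrete fuzzy sets over the same universe {x1, x2, x3}; values are illustrative.
A = {"x1": 0.2, "x2": 0.7, "x3": 1.0}
B = {"x1": 0.5, "x2": 0.4, "x3": 0.0}

complement_A = {x: round(1 - mu, 2) for x, mu in A.items()}   # 1 - muA(x)
union_AB     = {x: max(A[x], B[x]) for x in A}                # max of memberships
intersect_AB = {x: min(A[x], B[x]) for x in A}                # min of memberships

print(complement_A)   # {'x1': 0.8, 'x2': 0.3, 'x3': 0.0}
print(union_AB)       # {'x1': 0.5, 'x2': 0.7, 'x3': 1.0}
print(intersect_AB)   # {'x1': 0.2, 'x2': 0.4, 'x3': 0.0}
```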
Fuzzy logic is based on fuzzy sets and deals with human-like approximate reasoning and the ability to handle vague and partial information. To accommodate fuzzy logic in traditional programming, linguistic variables along with fuzzy rules are used. A variable whose possible values are strings of one or more words is called a linguistic variable. A linguistic variable is defined using five entities called a quintuple (X, T, U, G, M), where X is the name of the variable, T is the set of terms of X, U is the universe of discourse, G is a syntactic rule for generating the names of the terms, and M is a semantic rule for associating each term with its meaning, i.e. a fuzzy set defined on U. In the case of the tall-people example discussed above, height is a linguistic (fuzzy) variable, which can take values such as short, tall, and average. Modifiers such as very tall, a little tall, and not so tall can also be used; these are called fuzzy hedges. Connectives such as 'AND', 'NOT', and 'OR' can also be used with such values. As per Lotfi Zadeh (1994), “The concept of a linguistic variable provides a means of approximate characterization of phenomena which are too complex or too ill-defined to be amenable to the description in conventional quantitative terms.” Traditionally, such linguistic variables are included within 'if then else' rules to generate one or more fuzzy rules. In a fuzzy logic-based system, fuzzy rules are used to infer output based on the hypothesis, on the data available, or on both. A few examples of fuzzy rules are provided in Table 2.3. Multiple such fuzzy rules capturing the domain knowledge can be formed, along with the underlying fuzzy membership functions, and stored in a repository called a rule base.
Table 2.3 Examples of fuzzy rules: fuzzy rules for the selection of a tall person
• If height is short, put the person in the reject list
• If height is average, put the person in the waiting list
• If height is tall or very tall, put the person in the final list
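The rules above can be evaluated mechanically once membership functions for short, average, and tall are fixed. The sketch below is a minimal illustration; the ramp and triangle shapes of the membership functions are assumptions, not taken from the book.

```python
# Illustrative membership functions (assumed shapes) for the linguistic variable 'height'.
def memberships(height_ft):
    short   = max(0.0, min(1.0, (5.0 - height_ft) / 1.0))      # ramp down up to 5.0 ft
    tall    = max(0.0, min(1.0, (height_ft - 5.0) / 0.6))      # ramp up above 5.0 ft
    average = max(0.0, 1.0 - abs(height_ft - 5.2) / 0.5)       # triangle around 5.2 ft
    return {"short": short, "average": average, "tall": tall}

def decide(height_ft):
    mu = memberships(height_ft)
    # Each rule fires with the strength of its antecedent; the strongest conclusion wins.
    strengths = {"reject list": mu["short"],
                 "waiting list": mu["average"],
                 "final list": mu["tall"]}
    return max(strengths, key=strengths.get), strengths

print(decide(5.4))   # mostly 'tall', so the final list wins, with some support for waiting
```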
Fig. 2.3 Fuzzy control system
A system called a fuzzy control system handles this rule base for further decision making in an automatic way. As a machine cannot handle rules with fuzzy, verbal components directly, defuzzification needs to be performed. After defuzzification, equivalent crisp values are obtained and sent to the action interface. Later, the acknowledgment and results of the actions performed are converted back into verbal values through the fuzzification process, via the stored membership functions, and sent back to the users. This process is illustrated in Fig. 2.3. Fuzziness can also be applied to relationships. For example, a key belongs to only one particular lock, hence the relationship between a key and a lock is crisp: either the key opens the lock completely or it does not. Similarly, 'Person X is married to Person Y' is a crisp relationship. These relationships are denoted as follows. Example 1: • Let R1 be a relation between a lock (Lock1) and a key (Key1), defined as R1(Lock1, Key1); Key1 correctly applies to Lock1, hence the value of R1 is 1, i.e. R1(Lock1, Key1) = 1. • Let R2 be a relationship between the lock (Lock1) and another key (Key2), defined as R2(Lock1, Key2); Key2 does not apply to Lock1, hence the value of R2 is 0, i.e. R2(Lock1, Key2) = 0. Example 2: • Let 'Married_to' be a relationship showing whether a person is married to another person or not. In this case, Married_to(Person X, Person Y) = 1 if X and Y are a couple and married to each other.
Example 3: • Let 'Ownership' be a relationship showing whether a person owns a machine or not. In this case, Ownership(Person P, Machine M) = 1 if Machine M belongs to Person P. Fuzzy relationships, on the other hand, generally do not return only 0 or 1 (crisp) values. Consider, for example, how much comfort a person feels when using a given machine. Such a fuzzy relationship can be demonstrated as follows. Example 4: • Let 'Comfort' be determined as the level of comfort of a person working with a machine. Then Comfort(Person P, Machine M) can be 0 (zero) if Person P is not at all comfortable with Machine M. • Comfort(Person P, Machine M) can be 1 (one) if Person P is totally and completely comfortable with Machine M. • However, in most cases Comfort(Person P, Machine M) is a fuzzy value between 0 and 1, showing the level of comfort of Person P while using Machine M. A fuzzy relation can be defined as follows. A fuzzy relation is a mapping from the Cartesian space X × Y to the interval [0,1], where the strength of the mapping is expressed by the membership function of the relation, µ(x,y). The “strength” of the relation between ordered pairs of the two universes is measured with a membership function expressing various “degrees” of strength in [0,1]. In Chap. 3, a few more illustrations and examples of fuzzy logic and fuzzy relations are discussed in detail.
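A short sketch of a crisp versus a fuzzy relation, using the 'Ownership' and 'Comfort' examples above, is given below; the membership values of the Comfort relation are made up for illustration.

```python
persons  = ["P1", "P2"]
machines = ["M1", "M2", "M3"]

# Crisp relation (Example 3): each pair maps to exactly 0 or 1.
ownership = {(p, m): 1.0 if (p, m) == ("P1", "M1") else 0.0
             for p in persons for m in machines}

# Fuzzy relation (Example 4): each pair maps to a degree in [0, 1] (illustrative values).
comfort = {
    ("P1", "M1"): 1.0, ("P1", "M2"): 0.6, ("P1", "M3"): 0.0,
    ("P2", "M1"): 0.3, ("P2", "M2"): 0.9, ("P2", "M3"): 0.5,
}

print("crisp Ownership(P1, M1):", ownership[("P1", "M1")])
for p in persons:
    print(p, [comfort[(p, m)] for m in machines])   # one row of the fuzzy relation matrix
```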
2.3 Artificial Neural Networks An Artificial Neural Network (ANN) is a systematic collection of elements called neurons. An artificial neuron is a simulation of a biological neuron in the human nervous system, and the artificial neural network as a whole is inspired by that system. A biological neuron is the building block of the human nervous system. It can take multiple inputs at a time through sensory inputs (like the five senses: touch, sight, smell, hearing, and taste) and from other neurons, process the inputs through its nucleus, and output the processed information if it is significant. See Fig. 2.4a for the basic structure of a biological neuron. A single neuron is not able to contribute much to the overall decision-making process; the ability to make decisions and to think comes from the ability to work in parallel. There are billions of neurons in the human nervous system, working in parallel, each in an independent way. Since there are so many, the network performance is not affected much if some of them stop working. Hence, such a structure works in a parallel but asynchronous way
Fig. 2.4 Biological and artificial neuron
and offers a high level of fault tolerance with distributed control. As per the famous saying 'I think, therefore I am' by Rene Descartes (1637), intelligence results from the ability to think. Whether a machine would be able to think if such a concept were incorporated within it is the basic inspiration behind the artificial neural network. Figure 2.4b presents the simulation of a neuron, known as an artificial neuron. As stated, the ability to think, and hence intelligent decision making, comes from a large number of neurons working together towards a global solution. For this, the neurons should be arranged in a proper structure or topology. The Hopfield network, the perceptron and multilayer perceptron with their variations, and self-organizing maps are a few prominent topologies used to experiment with artificial neural networks. The following sections describe these artificial neural network models in brief.
2.4 Hopfield Network John Hopfield (1982) invented the Hopfield network. The basic objective of such a network is to regenerate the data fed to the system after proper noise correction. It is a network of neurons connected to other neurons through symmetric links with positive or negative weights. The neurons are also referred to as nodes. Figure 2.5 illustrates the Hopfield network.
2.5 Properties of a Hopfield Network The following is a list of properties of the Hopfield network. • It is a network of neurons where every node (neuron) is connected to all other nodes.
Fig. 2.5 Hopfield network
• The nodes may have positive or negative weights.
• All the connections are symmetric in nature.
• No looping (connection to self) of a node is permitted.
• For the update and/or learning of the network, the asynchronous method is used: a node whose current status has to be updated is selected randomly, and such updates are carried out in a parallel manner for many nodes.
• No hidden layer nodes are allowed in the network.
2.6 Learning in Hopfield Network As illustrated in Fig. 2.5, a node in the network can be active (filled circle in the figure) or inactive (empty circle in the figure). All the nodes are connected with each other through positive or negative weights, and each node is either in the active or the inactive status. The following steps are carried out to make the network learn:
• A node from the network is selected randomly.
• Consider all active neighbor nodes of the selected node and calculate the total of their weights towards the node (in-degree).
• If the calculated sum is positive, the selected node becomes active; otherwise, it becomes inactive.
• Repeat this procedure until the network becomes stable and no further activation/deactivation is possible.
Hopfield networks are used for experimentation on associative memory (associative storage) and different pattern recognition problems. A Hopfield network can be used as a content addressable memory to store patterns: even if only a partial pattern is available, the complete pattern can be retrieved.
Such networks often converge to a local minimum and may settle on a wrong local minimum rather than the expected global minimum. Further, in a Hopfield network, self-loops of the nodes are not permitted and the connections between the nodes are symmetric.
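A minimal sketch of asynchronous updating in a small Hopfield network follows. The weights here are built with the common Hebbian rule from two stored patterns; this construction and the ±1 state coding are illustrative assumptions, since the text above only describes the update procedure.

```python
import random
import numpy as np

def hebbian_weights(patterns):
    """Symmetric weight matrix with zero diagonal (no self-loops), built from stored patterns."""
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for p in patterns:
        w += np.outer(p, p)
    np.fill_diagonal(w, 0.0)
    return w / len(patterns)

def recall(w, state, steps=200):
    """Asynchronous update: pick a random node; activate it if its weighted input is positive."""
    state = state.copy()
    for _ in range(steps):
        i = random.randrange(len(state))
        state[i] = 1 if w[i] @ state >= 0 else -1
    return state

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
w = hebbian_weights(patterns)
noisy = np.array([1, -1, 1, -1, 1, 1])   # first pattern with one flipped element
print(recall(w, noisy))                   # usually settles back to the stored pattern
```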
2.7 Single Perceptron Frank Rosenblatt (1958) proposed an artificial neuron inspired by the part of the human nervous system between the eyes and the brain. The main objective of the group of neurons between the eye and the brain is to perceive the acquired image; hence the name of the suggested neuron model is 'perceptron'. As per the proposed model, the perceptron accepts multiple weighted inputs. Besides the weighted inputs, the perceptron also has a processing function, called the activation or transfer function, which processes the acquired inputs. There is also a threshold associated with the neuron which determines its further action. If the value computed from the weighted inputs is significant enough, the neuron communicates the output by sending it further, to other neurons if any. This phenomenon is called firing the output, i.e. a perceptron's basic function is to collect several weighted inputs, process them, and fire the processed value further if it is significant. As a perceptron can choose whether or not to fire the output, it can be used to divide the problem space into two classes. Hence, all linearly separable 'to be or not to be' type problems can be effectively solved with a single perceptron. Figure 2.6 illustrates a single perceptron solving the problem of selecting a course for a student to study further, based on inputs from the parents. The perceptron in Fig. 2.6 solves a classical 'to be or not to be' problem for the selection of a course. As shown in the figure, the neuron takes two inputs from the parents about a possible course and processes them through an activation function available in the nucleus of the neuron. Here, the activation function
Fig. 2.6 Selection of a course
Table 2.4 Fixed increment perceptron learning
1. Let x(n) = input vector, given as {+1, x1(n), x2(n), …, xm(n)}
   w(n) = weight vector, given as {b, w1(n), w2(n), …, wm(n)}
   b = bias
   a(n) = actual response
   r(n) = desired response
   l = learning rate parameter
2. Initialize w(0) = 0
3. Activate the perceptron by applying the input vector x(n) and the desired response r(n)
4. Compute the actual response of the perceptron: a(n) = f[w(n)x(n)]
5. If r(n) and a(n) differ, then update w(n + 1) = w(n) + l[r(n) - a(n)]x(n), where r(n) = ±1
6. Go to step 3 until all patterns are properly classified
is the weighted sum ΣWiXi. The parents' opinions are encoded as X1 (0.7) and X2 (0.4). The strengths of the relationship between the candidate who wants to select the course and the two parents are given as W1 (0.3) and W2 (0.6), respectively. With the sample values shown in the figure, the activation function calculates the value 0.45, which is less than the threshold value 0.6. Hence, the neuron decides not to fire, and the course is not selected. Different weights and values of the parents' opinions can be tried for a better understanding of the problem. This perceptron classifies the problem into two categories, to select or not to select the course; hence it is called a linearly separable problem. In the case of linearly separable problems, the data can be separated by a line (a hyper-plane in higher dimensions). A generic learning mechanism for a linearly separable problem, known as 'fixed increment perceptron learning', is given in Table 2.4. The popular tool called the Support Vector Machine (SVM) implements a similar strategy: it is a simple model that considers the given data and tries to classify them into two classes. Through the SVM, classification is done in the presence of data.
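The firing decision of Fig. 2.6 can be reproduced with a few lines of code; the sketch below simply plugs in the sample values quoted above.

```python
def perceptron_fires(inputs, weights, threshold):
    """Weighted-sum activation: fire only if the sum reaches the threshold."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    print(f"activation = {activation:.2f}, threshold = {threshold}")
    return activation >= threshold

x = [0.7, 0.4]   # parents' opinions X1, X2
w = [0.3, 0.6]   # relationship strengths W1, W2
print("select course:", perceptron_fires(x, w, threshold=0.6))   # 0.45 < 0.6 -> False
```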
2.8 Multilayer Perceptron As stated, a single perceptron cannot solve non-linearly separable problems, which are more complicated but constitute most real-life problems; hence, a multilayer perceptron structure is proposed. The structure is illustrated in Fig. 2.7. In the multilayer perceptron, neurons, often called nodes, are arranged in different layers: the input layer, hidden layers, and the output layer. The input layer is responsible for collecting input from the environment. The output layer produces the output. Hidden layers are invisible and help in learning. At least one layer of each category is required to form a multilayer neural
Fig. 2.7 Multilayer perceptron
network. Hidden layers can be many: in the case of a simple multilayer perceptron there may be one, two, or three hidden layers, whereas in the case of deep learning there are many more. Typically, there is one input layer and one output layer, though depending on the application more than one input and output layer can be planned. Table 2.5 lists popular heuristics for designing a multilayer perceptron.
2.9 Training Using Back Propagation in a Supervised Manner The multilayer perceptron learns with the help of valid data sets. Such learning, in the presence of data and a well-defined learning mechanism, is called supervised learning. It is a kind of learning from past situational data sets or experiences. The data sets contain both input and output data. Backpropagation is a technique for carrying out supervised learning; its major steps are illustrated in Table 2.6. To train a neural network through backpropagation, many training data sets are required, and they need to be labeled properly for input and output. For every data set, only the input data are provided to the network, and the network is allowed to produce whatever output it computes. As all the weights are initially set to random values, the results calculated through these random weights are incorrect. The errors are found by comparing the values calculated by the network with the correct values provided in the training data sets. These errors are backpropagated to find possible adjustments to the network weights. After the suggested weight corrections, the same input data will be
Table 2.5 Design heuristics for multilayer perceptron
1. Verify the nature of the problem. Typically, where many data are available but there is a lack of generalized logic, one may go for a multilayer perceptron ANN
2. Select critical parameters that play an important role in decision making. For this, one needs to study the data available on hand. Alternatively, a few successful cases where such decisions were made can be considered. The total number of such important and critical parameters is, say, 'n'
3. Create an input layer (I) containing 'n' neurons. Also, assign its activation function as the value of the input
4. Identify the possible choices/output options for the problem. Say this number is 'm'
5. Create one or two hidden layers (H1 and H2) containing, on average, the mean of the input and output numbers of nodes, that is ('n' + 'm')/2. Assign an activation function to each neuron of every layer. Typical activation functions are softmax, sigmoid, hyperbolic tangent, rectified linear activation unit, etc. The activation function at the first hidden layer involves the input values from the input layer nodes with their weights. The activation function at the second hidden layer involves the previous (hidden) layer nodes' values with their weights
6. Create an output layer (O) containing 'm' nodes. Assign an output activation function to each neuron in the output layer. The activation function at the output layer involves the values from the last hidden layer nodes with their weights
7. Connect all neurons in such a way that 'each neuron is connected in a forward direction to every neuron of the adjacent layer'. This makes the network a fully connected, feed-forward (as all the connections are in a forward direction only) multilayer neural network
8. Assign random weights to each connection
9. Train the network with the collected valid data sets
Table 2.6 Major steps in backpropagation
Phase 0: Initialization of the network by assigning random weights
For each training data set, do the following:
Phase 1: Forward pass
1. Take the input data from the current data set and feed it to the input layer
2. Calculate the hidden layer values and output layer values as per the assigned activation functions
3. Let the neural network output 'what it thinks'
Phase 2: Backward pass
1. Compute the error by finding the difference between the calculated output and the correct output from the data set. The error can be calculated for each neuron and stored, typically as a 'delta'
2. Backpropagate the error and, with the delta values, update the network weights using the formula: weight = weight + learning_rate * error * input
Repeat the forward and backward passes until the error is acceptable
provided to the network, and the network is asked to calculate the possible output value again. The errors are found again by comparing the calculated values with the actual values, and the weights are corrected once more. This procedure is repeated for every training data set until the calculated output matches the sample output provided in the training set, i.e. for a single training data set, multiple iterations are carried
Fig. 2.8 Forward and backward passes in a multilayer perceptron
out for weight updates, and the neural network learns to make the correct decision for the case. This is called an episode for a training data set. When input data are sent to the neural network to calculate the possible output, the phase is called the forward pass. When the errors are sent back to the previous layers to correct the weighted connections, the phase is called the backward pass. Figure 2.8 shows the forward pass and backward pass for one training data set, i.e. the network is considered trained for a single data set. However, it is necessary for a neural network to be trained on plenty of correct data sets. Once a neural network is trained, it is tested with similar but selective and fewer data sets called validation sets. After complete training, the network can take the correct decision on given input data (without any output provided). So a network undergoes designing, training, and testing phases before it is put to use. It is to be noted that for an artificial neural network such as the feed-forward multilayer perceptron learning in a supervised manner, training data sets are very important. Depending on the data, the neural network generalizes its weights. The neurons (or nodes), connections, and activation functions are well defined; it is only the set of weights that is selected randomly at the initial phase. With the help of correct and reliable data sets, the weights can be generalized. This is like providing a nice school, learning material, and other facilities to a small baby so that, with these, the baby learns by herself.
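A compact numpy sketch of the forward and backward passes described above is given below, trained on the classic XOR toy data set. The data, the layer sizes, the learning rate, and the number of epochs are illustrative choices, not values from the book; the hidden layer is made slightly wider than the strict (n + m)/2 heuristic of Table 2.5 so that this tiny non-linear task trains reliably.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs (n = 2)
Y = np.array([[0], [1], [1], [0]], dtype=float)               # desired outputs (m = 1)

n_in, n_hidden, n_out = 2, 4, 1
W1 = rng.normal(scale=0.5, size=(n_in, n_hidden))             # random initial weights
W2 = rng.normal(scale=0.5, size=(n_hidden, n_out))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5

for epoch in range(10000):
    # Forward pass: let the network output 'what it thinks'.
    H = sigmoid(X @ W1)
    out = sigmoid(H @ W2)
    # Backward pass: compute deltas and update weights (weight += lr * error * input).
    delta_out = (Y - out) * out * (1 - out)
    delta_hid = (delta_out @ W2.T) * H * (1 - H)
    W2 += lr * H.T @ delta_out
    W1 += lr * X.T @ delta_hid

print(np.round(out, 2))   # typically close to [[0], [1], [1], [0]] after training
```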
Fig. 2.9 Sample and clustering
2.10 Designing a Neural Network in an Unsupervised Manner There are situations where a lot of data are available but cannot easily be labeled; then the supervised learning mechanism discussed above is not useful. Here, instead of classification, one should employ the approach of clustering. This is a kind of learning where supervision of the learning mechanism is not needed. For example, a baby might have seen her parents' car, which has a particular color and model. When the baby sees a neighbor's car, even without having previously seen that car (with a new color and a different model), she can identify the vehicle as a car, as she recognizes features such as the four wheels, the steering wheel, and so on. In the case of supervised learning, the baby would need to be told that it is a car too, i.e. the baby would always need the label 'Car' for every new car model and even color. Unsupervised learning tries to find all possible unknown patterns in the data provided to the network. Such learning can identify the features which are important and useful for classification in the absence of the necessary labels. Further, the labeling (or analysis) process is carried out in real time. Unsupervised learning can be further grouped into clustering problems and association problems. Figure 2.9 shows various objects clustered according to some characteristics. This is a demonstration of a sample situation where the sampled items can easily be divided into three clusters according to their shapes. If the sample items are not clearly distributed, clustering becomes challenging. Kohonen Self-Organizing Maps (SOMs) are popular tools which can be used to implement the unsupervised learning mechanism in an artificial neural network; they are discussed in the next section.
2.11 Kohonen Self-Organizing Maps (SOMs) The architecture of the SOM was first proposed by the Finnish researcher Teuvo Kohonen (1982). Besides grouping similar data into clusters, the SOM also acts as a visualization and organization technique that facilitates the reduction of
high-dimensional data to a map; hence it is called a self-organizing map. That is, the SOM reduces the dimensionality of the data and highlights similarities within the data. In the case of such unsupervised learning, the learning process is data-driven because of the absence of labeled training data. Further, the features and characteristics of the input data need not be known in advance; that is why the self-organizing map is also referred to as a self-organizing feature map. In a typical SOM, the neurons are arranged in a flat grid, which can be thought of as a two-dimensional array or map. The SOM contains only an input layer and an output layer; there is typically no hidden layer. Figure 2.10 illustrates the structure of a self-organizing map. The basic steps involved in learning in a self-organizing map are shown in Table 2.7. Artificial neural networks are used for character recognition, face recognition, pattern recognition, classification, clustering, prediction, and forecasting. For detailed examples of various artificial neural networks, refer to Chap. 4, which discusses the design of various artificial neural networks including job selection, web page filtering, consumer behavior modeling, sales prediction, aptitude testing, fault finding, disease diagnosis for a novel virus such as corona, etc.
Fig. 2.10 Self-organizing map
Table 2.7 Learning in self-organizing map
1. Design a SOM network by arranging neurons in a flat grid/map and also design an input layer
2. Initialize the weights of each node
3. Choose one unit/vector of the training data set and feed it to the map
4. Let every node determine whether its weights are similar to the weights of the input vector or not, with the help of the Euclidean distance function
5. Consider the node whose weights are nearest to those of the input vector. This node is known as the winner or the 'Best Matching Unit (BMU)'
6. Calculate the neighborhood of the winning node and provide a reward to the winning node in terms of weight correction
7. Repeat from step 3 onwards for a sufficient number of iterations
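A minimal sketch of the steps in Table 2.7 on made-up two-dimensional data is given below; the grid size, learning rate, and neighborhood schedule are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.random((200, 2))                   # unlabeled input vectors
grid_h, grid_w = 5, 5
weights = rng.random((grid_h, grid_w, 2))     # step 2: random initial node weights

for t in range(1000):
    x = data[rng.integers(len(data))]                         # step 3: pick one input vector
    dist = np.linalg.norm(weights - x, axis=2)                # step 4: Euclidean distances
    bmu = np.unravel_index(np.argmin(dist), dist.shape)       # step 5: Best Matching Unit
    lr = 0.5 * (1 - t / 1000)                                 # decaying learning rate
    radius = 2.0 * (1 - t / 1000) + 1e-9                      # decaying neighborhood radius
    for i in range(grid_h):                                   # step 6: reward BMU and neighbors
        for j in range(grid_w):
            d = np.hypot(i - bmu[0], j - bmu[1])
            if d <= radius:
                weights[i, j] += lr * np.exp(-d**2 / 2.0) * (x - weights[i, j])

print(np.round(weights[0, 0], 2))   # node weights organize themselves over the input space
```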
Table 2.8 Basic steps of a genetic algorithm
1. Initialize the population with randomly selected individuals
2. Encode the individuals in the population
3. Calculate the fitness of the individuals in the population
4. Repeat until an individual of acceptable fitness is found:
5. Apply genetic operators such as selection, crossover, and mutation to generate new offspring and modify the population
6. Go to step 3
2.12 Evolutionary Algorithms The main ideas behind evolutionary algorithms are 'adaptation is intelligence' and 'survival of the fittest'. The terms evolutionary algorithms, evolutionary computing, and genetic algorithms are generally used interchangeably. There are many models and variations of evolutionary algorithms, among which the genetic algorithm is the most popular, prominent, and canonical technique. The genetic algorithm can be thought of as a population-based method of searching a large search space. Many search algorithms are available, which are numerical, graph-based, or trajectory-based methods; the first two are deterministic in nature and the last, the trajectory-based approach, is stochastic. The genetic algorithm is also a population-based stochastic method. Genetic algorithms were developed by John Holland (1975) to understand the adaptability of natural evolution based on Darwin's principle of 'natural selection'. Table 2.8 gives the basic outline of the genetic algorithm.
2.13 Encoding Individuals As mentioned in Table 2.8, a genetic algorithm starts with a number of randomly selected individuals. These individuals need to be represented in the form of a valid chromosome, a sequence of genes, where each gene represents a characteristic. Binary or hexadecimal number-based encoding, character encoding, and tree encoding are a few popular encoding strategies. Table 2.9 lists popular encoding strategies with examples and operations.
2.14 Genetic Operators Selection, crossover, and mutation are the three basic genetic operators that can modify the genes of individuals. In the case of selection, an exact copy of a parent individual is made if the parent looks promising in terms of fitness. To select good parents, tournament selection or roulette wheel selection methods are often used.
Table 2.9 Encoding strategies with operations

1. Binary encoding
Individual 1: 110 011 010 111
Individual 2: 100 101 001 011
Mutation (Individual 1, position 3): New Individual 3: 111 011 010 111
(Whatever digit is available at the 3rd position in Individual 1 is changed to another possible symbol. It is 0, so the only other possible binary symbol is 1.)
Crossover (Individual 1 and Individual 2, position 4, size 3):
New Individual 3: 110 101 010 111
New Individual 4: 100 011 001 011
(The corresponding substrings from both individuals are interchanged.)

2. Hexadecimal encoding
Individual 1: 9E7B
Individual 2: A345
Mutation (Individual 1, position 2): New Individual 3: 9A7B
(Besides E, the other possible symbols in the hexadecimal number system are the digits 0-9 and A, B, C, D, and F. Here E is replaced with A; it can be replaced by any other permissible symbol.)
Crossover (Individual 1 and Individual 2, position 2, size 2):
New Individual 3: 934B
New Individual 4: AE75

3. Character encoding
Individual 1: ABDEF
Individual 2: PQRST
Mutation (Individual 1, position 2): New Individual 3: ACDEF
Crossover (Individual 1 and Individual 2, position 4, size 2):
New Individual 3: ABDST
New Individual 4: PQREF

4. Tree encoding
Individual 1: the tree with root \, left subtree (* B A), and right subtree (+ C D)
Individual 2: the tree with root \, left subtree (+ P R), and right subtree T
Mutation (Individual 1, lower left-most node): the node B is replaced, e.g. by A, giving the tree with root \, left subtree (* A A), and right subtree (+ C D)
Crossover (Individual 1 and Individual 2, right-most subtree): the right-most subtrees of the two parents are exchanged, giving
New Individual 3: the tree with root \, left subtree (* B A), and right subtree T
New Individual 4: the tree with root \, left subtree (+ P R), and right subtree (+ C D)
In tournament selection, some individuals are selected randomly and the best among them is chosen using the given fitness function. In roulette wheel selection, the population is laid out on a wheel, similar to the popular roulette wheel in a casino, in proportion to the individuals' fitness values; the wheel is spun several times, and each time an individual is selected with the help of a pointer on the wheel. In the case of genetic mutation, a single gene value is changed to another permissible gene value depending on a probability that controls the chance of a mutation. Mutation is applied very sparingly and brings into the population diversity that was not present originally. Selection, on the other hand, does not introduce new individuals but restricts the population to similar types of individuals. The crossover operator is the most important and significant of the three. In a crossover operation, two parents are selected from the current population and corresponding groups of their genes are interchanged to generate one or more offspring. This is just like a baby inheriting the skin of her mother and the height of her father. Newly generated individuals are further tested for their fitness in the environment. By applying these genetic operators, the population can gain new and better individuals in terms of fitness. The modified population is then considered the new generation, and the genetic operators are applied to it to evolve a further generation. This process continues until the desired fitness is found, no improvement in the individuals' fitness values can be seen, memory or time constraints are hit, or any combination of these occurs. One may also stop generating new populations after a fixed number of iterations, such as 100 or 500, depending on the population size and the length of the chromosomes. Figure 2.11 illustrates the basic operations for evolution.
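The three operators can be written in a few lines for binary chromosomes. The sketch below is an illustrative implementation of roulette wheel selection, one-point crossover, and bit-flip mutation; the population, fitness values, crossover point, and mutation rate are made up.

```python
import random

def roulette_select(population, fitnesses):
    """Spin the wheel once: pick an individual with probability proportional to its fitness."""
    pick, running = random.uniform(0, sum(fitnesses)), 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return individual
    return population[-1]

def one_point_crossover(parent1, parent2, point):
    """Exchange the gene substrings after the given position."""
    return parent1[:point] + parent2[point:], parent2[:point] + parent1[point:]

def mutate(chromosome, rate=0.05):
    """Flip each bit with a small probability."""
    return "".join(('1' if bit == '0' else '0') if random.random() < rate else bit
                   for bit in chromosome)

population = ["110011", "100101", "010011"]
fitnesses = [14, 17, 11]
p1 = roulette_select(population, fitnesses)
p2 = roulette_select(population, fitnesses)
c1, c2 = one_point_crossover(p1, p2, point=3)
print(mutate(c1), mutate(c2))
```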
Fig. 2.11 Basic operations in GA
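For readers who prefer to see the operators in code, the following short Python sketch (an illustrative aid, not taken from the book) implements roulette-wheel selection, one-point crossover, and bit-flip mutation for binary-encoded individuals; the function names and rates are assumptions for demonstration.

```python
import random

def roulette_selection(population, fitness, k):
    """Select k individuals with probability proportional to their fitness."""
    total = sum(fitness)
    picks = []
    for _ in range(k):
        r = random.uniform(0, total)
        cumulative = 0.0
        for individual, fit in zip(population, fitness):
            cumulative += fit
            if cumulative >= r:
                picks.append(individual)
                break
    return picks

def one_point_crossover(parent1, parent2, point):
    """Exchange the tails of two bit strings after the given position."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

def mutate(individual, rate=0.01):
    """Flip each bit independently with a small probability."""
    return "".join(
        bit if random.random() > rate else ("1" if bit == "0" else "0")
        for bit in individual
    )

parents = roulette_selection(["110011", "100101", "001010"], [14, 17, 11], k=2)
children = one_point_crossover(parents[0], parents[1], point=3)
print([mutate(c) for c in children])
```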
Genetic algorithms are used when the search space is too large or where a traditional solution strategy is not available or feasible. The popular candidate applications of the genetic algorithm are as follows.
• Function optimization
• Scheduling and planning
• Game playing
• Fault finding
• Genetic programming
• Crime investigations
• Evolving hardware, etc.
Table 2.10 illustrates an example of function optimization with binary encoding.
2.15 Travelling Salesperson Problem with Genetic Algorithm There are many situations where the typical genetic operators do not work. Consider the traveling salesperson problem with 6 cities, encoded as alphabets A, B, C, D, E, and F. As per the traveling salesperson problem, all these cities have to be traversed
Table 2.10 Example of function optimization

Maximize profit P = f(x, y), which is defined as 2x + y + 7, where x is the investment and y is the batch size of the product to be manufactured. Let us assume that x can take values from [0, 7] in lakh bucks and y from [0, 7] tons.

Let the initial population I0 be a set of ordered pairs of x and y that claim to be a solution: I0 = {(2, 3), (3, 4), (1, 2)}.

Let us encode the initial population using a binary encoding scheme. Here the maximum value that x (as well as y) can take is 7; hence 3 binary digits (bits) are sufficient to represent the individuals.
Encoded I0 = {(010, 011), (011, 100), (001, 010)}
Fitness of I0 = {(2*2 + 3 + 7), (2*3 + 4 + 7), (2*1 + 2 + 7)} = {14, 17, 11}

As we are in search of the maximum profit value, the individual from the initial population that provides the maximum value (17) can be selected directly. If the solution is not clear or there is no significant diversity in the fitness values, we may go for roulette wheel selection. A roulette wheel is made to show the fitness proportions, and hence it randomly and naturally selects an individual with a better probability of survival. For the initial population I0, the roulette wheel proportions are Ind 1 = 14, Ind 2 = 17, and Ind 3 = 11.

So, as per the roulette wheel selection, which we have executed three times (as we want 3 individuals in the new generation), the modified population is as follows:
I1 = {(010, 011), (011, 100), (011, 100)}
One can see that the fittest individual (2nd) is selected twice in the roulette wheel selection and the least fit individual (3rd) is discarded.

Now, the crossover can be applied on individual 1 and individual 2 for the last two bits (genes). We also apply mutation on the 3rd element of the new population at the last bit. Hence, the modified population I1 is given as follows:
I1 = {(010, 011), (011, 100), (011, 100)} becomes I1 = {(010, 000), (011, 111), (011, 101)}
The fitness of I1 = {(2*2 + 0 + 7), (2*3 + 7 + 7), (2*3 + 5 + 7)} = {11, 20, 18}

It can be observed that the fitness values of two individuals in the second generation have increased. This process can be continued until a satisfactory fitness is achieved. Here, the optimum value after applying selection, crossover, and mutation is the ordered pair (7, 7), which yields profit P = 2*7 + 7 + 7 = 28.
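The arithmetic of Table 2.10 can be checked with a few lines of Python; the sketch below (an illustrative reconstruction, not the author's code) encodes the ordered pairs of I0 as 3-bit strings, evaluates the profit fitness 2x + y + 7, and derives the roulette-wheel proportions.

```python
def fitness(x, y):
    # Profit P = 2x + y + 7, as defined in Table 2.10
    return 2 * x + y + 7

def encode(x, y, bits=3):
    # 3 bits are enough because x and y both lie in [0, 7]
    return format(x, f"0{bits}b"), format(y, f"0{bits}b")

population = [(2, 3), (3, 4), (1, 2)]            # initial population I0
encoded = [encode(x, y) for x, y in population]  # [('010','011'), ('011','100'), ('001','010')]
fits = [fitness(x, y) for x, y in population]    # [14, 17, 11]

total = sum(fits)                                # 42
wheel = [f / total for f in fits]                # roulette-wheel slice sizes, roughly 0.33, 0.40, 0.26
print(encoded, fits, [round(w, 2) for w in wheel])
```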
Fig. 2.12 Travelling salesperson with genetic algorithm
only once, without missing any city, with an objective to minimise the cost and effort. Two possible routes are as follows. R1 = {A D C B F E}. R2 = {A B E F D C}. A typical mutation on R1 at position 3 asks us to change the city C to any other possible city such as A, B, D, E, or F. If we replace the city C at the third position in the first route R1 with any other city, it generates an invalid route. Similarly, a crossover between R1 and R2 for the group of the first three cities generates two more routes, R3 = {A B E B F E} and R4 = {A D C F D C}. Both the newly generated routes R3 and R4 are invalid as per the constraints of the traveling salesperson problem. In this case, there is a need to design a novel genetic operator called edge recombination. For edge recombination, an adjacency matrix is prepared from the initial population of routes, and from this adjacency information new valid routes can be generated. Refer to Fig. 2.12 for an illustration. Refer to Chap. 5 for illustrated examples of the traditional as well as novel crossover operators on the typical traveling salesperson problem.
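As a sketch of the first step of edge recombination described above, the following Python fragment builds the adjacency (edge) table of each city from the parent routes R1 and R2; it is illustrative only, and the helper name edge_table is an assumption rather than the book's notation.

```python
def edge_table(routes):
    """Collect, for every city, the set of cities adjacent to it in any parent route."""
    edges = {}
    for route in routes:
        n = len(route)
        for i, city in enumerate(route):
            neighbours = edges.setdefault(city, set())
            neighbours.add(route[(i - 1) % n])  # previous city (routes are treated as cyclic tours)
            neighbours.add(route[(i + 1) % n])  # next city
    return edges

r1 = ["A", "D", "C", "B", "F", "E"]
r2 = ["A", "B", "E", "F", "D", "C"]
print(edge_table([r1, r2]))
# e.g. city 'A' is adjacent to {'B', 'C', 'D', 'E'} across the two parent tours
```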
2.16 Schema in Genetic Algorithms A schema is a generic template of strings in an encoding scheme, particularly in binary digits. It is a template to define the design of individuals. For example, in binary encoding, schema given as 10**1 represents a set of binary strings given as follows. Schema S = 10**1.
Fig. 2.13 Search space reduction through schemas
Set of members of the schema S = {10001, 10011, 10101, 10111}. The basic idea of having a schema is to identify the behavior of a group of individuals and apply genetic operators to the schema itself. This saves a lot of computational effort and time. As stated, genetic algorithms are very effective when the search space is large. In this case, the notion of a schema further helps in dividing the search space into different partitions, where each partition is represented by a schema. To find the optimum solution, instead of taking a random individual from the initial population, the schemas representing the partitions are considered for checking each partition's fitness. In other words, a schema representing a partition reflects the average fitness of the elements of that partition. Figure 2.13 illustrates the domain of 4-digit binary numbers. Just by defining a schema with only one defined bit, at either the most significant or the least significant position, one can reduce the number of individuals by half. If the number of defined (fixed) bits in a schema is increased, the schema becomes rigid and the order of the schema is high. In the schema S mentioned above, defined as S = 10**1, the order of the schema is 3. Similarly, the defining length of a schema is the distance between its first and last defined bits. In the case of S = 10**1, the defining length is 5 − 1 = 4. It is obvious that a schema with a greater fitness value contributes more individuals to the next generation. It is also observed that schemas with good (generally above average) fitness, short defining length, and low order are prone to survive. Genetic algorithms are natural and easy to code. Further, a genetic algorithm can be easily parallelized. It is to be noted that genetic algorithms are very slow if the population size is large. In this case, parallel genetic algorithms can be employed. Modern computing devices and techniques support multi-core and multitasking computing, offering the advantages of parallelism. Genetic algorithms are used for applications involving optimization, scheduling, planning, searching, etc. For other applications and examples of genetic algorithms in different domains, refer to Chap. 5. The chapter discusses various solved examples such as single-variable and multiple-variable function optimizations, the traveling salesperson, scheduling problems, mobile selection, car selection, best student selection, evolving rule bases, etc.
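The order and defining length of a schema can be computed mechanically; the short sketch below (illustrative only) reproduces the values quoted for S = 10**1 and enumerates its member strings.

```python
from itertools import product

def order(schema):
    """Number of fixed (defined) bits in the schema."""
    return sum(1 for c in schema if c != "*")

def defining_length(schema):
    """Distance between the first and the last defined bit."""
    fixed = [i for i, c in enumerate(schema) if c != "*"]
    return fixed[-1] - fixed[0]

def members(schema):
    """All binary strings matched by the schema."""
    slots = [("0", "1") if c == "*" else (c,) for c in schema]
    return ["".join(bits) for bits in product(*slots)]

s = "10**1"
print(order(s), defining_length(s), members(s))
# 3, 4, ['10001', '10011', '10101', '10111'], matching the values quoted in the text
```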
2.17 Hybrid Computational Intelligence Based Systems As per the discussion in the previous chapter, every constituent of computational intelligence contributes towards intelligent decision making and problem-solving. However, every technique has its pros and cons. For example, fuzzy logic provides means to deal with vague information and to handle uncertainty, but it cannot deal with situations where generalized logic is not available. Artificial neural networks can learn generic knowledge from large amounts of domain data and store that knowledge in their connections; however, they cannot explain or justify the decisions taken, as no knowledge base component is available within the network. Similarly, genetic algorithms do not possess the ability to handle uncertainty or to self-learn from data, but they offer the ability to handle a large search space and to handle problems where traditional solution models are not available. Consider, for example, handling a large problem space or a big repository of unstructured data (such as big data in the modern era): this needs a search mechanism combined with the ability to handle vague and incomplete data as well as self-learning. Here, we require genetic algorithms, fuzzy logic, and artificial neural networks together. That is, it can be observed that in real-life complex systems, more than one computational intelligence constituent needs to be utilized to effectively solve the problem. To do so, one must understand the characteristics of every constituent, the various possible models, and how they can be hybridized to solve problems. Table 2.11 presents the pros and cons of various computational intelligence constituents in brief. The following sections discuss possible hybridizations of the computational intelligence constituents.
2.18 Neuro-Fuzzy Systems To take the benefits of fuzzy logic-based systems and artificial neural networks simultaneously, neuro-fuzzy hybridization is used. An artificial neural network takes normalized and crisp (non-fuzzy) data for the various parameters on which a decision is learned. However, there are situations where data may be incomplete, vague, and uncertain. Here, fuzzy membership functions can be used as an interface to the neural network. The membership functions interact with users in linguistic terms, deal with uncertainty, and accept fuzzy data in order to make the data understandable by the back-end neural network. There are many ways in which a neuro-fuzzy hybridization can be achieved. Some of the popular ways are listed below. • As an interface to a base neural network, to convert vague data into crisp data while providing input to the neural network; similarly, crisp output is converted into natural, user-friendly, and linguistic output;
Table 2.11 Pros and cons of computational intelligence constituents

Fuzzy logic
Advantages: Handling uncertainty; managing human-like approximate reasoning; multiple values between two extremes; the fuzzy rule base/knowledge base can be treated as documented knowledge and offers advantages related to knowledge management, which is a key component of knowledge commerce (k-Commerce).
Disadvantages: Cannot learn; cannot evolve; not effective in a big search space; membership functions and rules need to be designed.

Artificial neural networks
Advantages: Can self-learn from data; coding is less, and can learn with a large amount of data.
Disadvantages: Cannot explain, as there is no rule base and weights are stored in connections (further, no advantage related to knowledge management is offered); cannot evolve; not effective in a big search space; cannot handle vagueness and uncertainty; dependent mainly on training data.

Genetic algorithms
Advantages: Useful for optimization; manages a large search space effectively; useful in the absence of traditional models/techniques; mimics the evolution processes of nature very well and evolves solutions in an automatic manner; easy to parallelize and easy to code; provides multiple solutions.
Disadvantages: Cannot handle vagueness and uncertainty; cannot explain, as there is no rule base/knowledge base; cannot provide approximate reasoning.
• The output of the neural network can be fine-tuned with the help of a fuzzy rule-based system; such fine-tuning can facilitate the addition of explanation and reasoning into the neural network; • The activation function of a neural network can be fuzzy; • Weights of connections in a neural network can be fuzzy;
• The error-determining function of the backpropagation algorithm can be fuzzy; • Fuzzy rules, membership functions, or other parameters of the fuzzy system can be learned via a neural network; etc. A neural network and a fuzzy system can also be co-operative in nature and helpful to each other for various tasks. Both the fuzzy logic component and the neural network component can be active, executing concurrently and contributing to each other. Sometimes, when the two hybridized components are called one after another in a loosely coupled manner, it is called pipeline or sequential hybridization. Other approaches can be fusion, embedded, or auxiliary hybridization.
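As a toy illustration of one such coupling, the sketch below shows a fuzzy layer computing membership degrees of a raw reading and a small neural unit consuming those degrees; the membership breakpoints and the single-neuron weights are invented for demonstration and are not the book's design.

```python
import math

def triangular(x, a, b, c):
    """Triangular membership value of x for a fuzzy set peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify_speed(speed_mph):
    """Fuzzy interface: degrees of membership in Low / Medium / High (assumed ranges)."""
    return [
        triangular(speed_mph, -1, 0, 40),    # Low
        triangular(speed_mph, 20, 40, 60),   # Medium
        triangular(speed_mph, 40, 80, 121),  # High
    ]

def neuron(inputs, weights, bias):
    """A single sigmoid neuron consuming the fuzzified degrees."""
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

degrees = fuzzify_speed(55)            # e.g. [0.0, 0.25, 0.375]
print(neuron(degrees, [0.5, 1.0, 2.0], -1.0))
```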
2.19 Fuzzy-Genetic Systems Fuzzy-genetic systems hybridize fuzzy logic for uncertainty management, approximate reasoning, and managing vague inputs/outputs, and genetic algorithms for optimization, evolutionary searching, and other possible evolutionary benefits. The most popular use of genetic-fuzzy hybridization is to evolve fuzzy rules from a given set of a few rules, called the seed rule set. The following are the major objectives of genetic-fuzzy hybridization. • To evolve fuzzy rules or components of fuzzy logic-based systems; • To evolve strong membership functions and other parameters employed by fuzzy systems, including testing (genetic learning of fuzzy components); • To genetically adapt the fuzzy inference technique and inference engine parameters; • To design fuzzy genetic operators such as fuzzy crossover, fuzzy mutation, and other application-specific genetic operators; • To fine-tune the output of fuzzy rule-based systems with genetic algorithms; • etc.
2.20 Neuro-Genetic Systems Learning capability and evolutionary advantages are combined in the case of a neuro-genetic system. A neuro-genetic system can be hybridized by accommodating two computational intelligence constituents, the artificial neural network and the genetic algorithm. Major and popular uses of such a neuro-genetic system are to evolve the design of a neural network with the help of evolutionary algorithms, or to learn genetic-system-related parameters such as the fitness function through a neural network. The following are the major possibilities of such hybridization. • The topology and design of a neural network can be evolved by a genetic algorithm; • Solutions learned by a neural network can be further evolved by a genetic algorithm; however, this type of hybridization is less useful, as a neural network usually provides limited results;
• Neural network control parameters such as the learning rate, rate of momentum, and level of tolerance can be evolved by a genetic algorithm; • A neural network can be used to evaluate the fitness functions used by the genetic algorithm component; • etc.
2.21 Other Hybrid Systems Often, there is a requirement to combine more than two constituents of computational intelligence to take multi-fold advantage of the constituents. For example, evolving topologies of neuro-fuzzy systems, or self-learning (by an ANN) of a fuzzy fitness function for a genetic algorithm, requires the three computational intelligence constituents, namely artificial neural networks, genetic algorithms, and fuzzy logic-based systems. Application domains for such multi-fold hybridization can be face recognition, crowd behavior monitoring, cybercrime monitoring and security in eCommerce transactions, multiple intelligence modeling, advisory systems, etc. Detailed examples, research as well as project ideas on hybrid computational intelligent systems are discussed in detail in Chap. 6.
Chapter 3
Examples and Applications on Fuzzy Logic Based Systems
Abstract The chapter provides examples and applications based on fuzzy sets and fuzzy logic-based theory. Initially, basic and fundamental examples of day to day life are demonstrated in this chapter with necessary design details and step by step calculations. The examples included here are fuzzification of irregular students considering fuzzy attendance, speed of a vehicle, job selection, fuzzy operations for almond sorting, and viral disease diagnosis such as the Covid-19. Numeric examples of fuzzification, defuzzification, operations on fuzzy sets, fuzzy relations are also included. Applications of fuzzy logic in fashion designing, software engineering, domestic appliances such as washing machines, share market analysis, and sensor control are also discussed in this chapter. Detailed discussion is presented on restaurant menu planner and customized representation of material to slow learners by giving complete systems architectures, design of fuzzy functions, and fuzzy rules. Traditional fuzzy logic, which is known as type-1 fuzzy logic, has got some limitations. To overcome the limitations, type-2 fuzzy logic is used. This chapter introduces and demonstrates an application of type-2 fuzzy logic along with its membership function. The fuzzy logic as a constituent of computational intelligence evolves continuously and observes possibilities of many innovative research opportunities. Besides detailed discussion on approximately 20 examples as mentioned above, in the end, the chapter enlists possible research ideas in the pure fuzzy logic-based system. There are possibilities of hybrid and applied research in the field of fuzzy logic too, which are enlisted at the end of the chapter. Approximately 40 core research ideas/projects and applications, which will be helpful for the learners, professionals, and researchers, are contributed to this chapter.
3.1 Fuzzy Set and Membership for Students Attendance The fuzzy set, as discussed in Chap. 2, is considered a set with no boundary and partial belongingness. In the academic environment, the attendance of a student in a course is one of the parameters to evaluate a student. Suppose, the minimum requirement of attendance for a student to fulfill one of the mandatory requirements of the degree is
80%. In this case, a crisp set of eligible students contains all the students who have taken admission in the course and attended the lectures with 80% or more value. If a student has an attendance value of 75%, he will not be allowed to appear in the examination of the course. Figure 3.1 illustrates the crisp set of attendance. While using the crisp sets, we are treating students with 75% attendance value and 10% attendance value at par. However, a fuzzy set attendance allows partial membership to the student with different attendance values. Figure 3.1 demonstrates crisp as well as fuzzy sets of students with different attendance values. As illustrated in Fig. 3.1, numbers 6, 79, 76, and 29 cannot become members of the set because of the rigid definition of the set. However, these numbers can be part of the fuzzy set with different capacities as there is no boundary. The belongingness of a number is determined by the fuzzy membership function given in Fig. 3.2. With the help of the above mentioned fuzzy set of attendance, attendance can be thought of as a linguistic variable having values in words such as low attendance,
Fig. 3.1 Crisp and fuzzy sets of attendance (the figure contrasts a crisp set of attendance values with a rigid boundary against a fuzzy set without a boundary, which also admits values such as 6, 29, 76, and 79 with partial membership)
Fig. 3.2 Membership function for students attendance (X-Axis: attendance in %, from 0 to 80; Y-Axis: membership degree from 0 to 1; fuzzy sets shown: Very Low, Low, Average, and High; the levels 0.65 and 0.17 marked on the Y-Axis correspond to an attendance of 73%, as explained in the text)
average attendance, very low attendance, high attendance, etc. Figure 3.2 illustrates a fuzzy membership function for a student's attendance. Here, the attendance of a student is considered on the X-Axis. Since fuzzy logic is a multi-valued logic between 0 and 1, the Y-Axis always ranges between 0 and 1. As per the membership functions defined in Fig. 3.2, if the attendance of a student is 73%, it is a member of the High attendance set with membership degree 0.65 and simultaneously becomes a member of the Average attendance set with membership degree 0.17. Defining the regularity of a student is a similar type of problem (related to the attendance) as discussed in this section. Depending on the attendance only, students can be categorized into classes such as irregular students and regular students. The membership function illustrated in Fig. 3.2 can be used to demonstrate the regularity of the student with different hedges or linguistic variables. In this case, on the X-Axis the number of days or number of lectures attended may be considered instead of the percentage values of the attendance. Similarly, many other fuzzy linguistic variables for the academic environment can be defined for a big holistic system such as a student monitoring system, student aptitude testing system, or course selection advisory system. Some examples of the fuzzy membership functions used in such systems are Interest in studies, Difficulty level of the course, Future prospects of the course, Availability of infrastructure for the course, etc.
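The membership degrees quoted above (0.65 for High and 0.17 for Average at 73% attendance) come from the specific shapes in Fig. 3.2, which are not fully recoverable from the text; the sketch below shows how overlapping triangular sets of this kind are typically evaluated, with breakpoints that are assumptions rather than the book's exact values.

```python
def triangular(x, a, b, c):
    """Degree of membership of x in a triangular fuzzy set (a, b, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Assumed, illustrative breakpoints for attendance (%), not the exact Fig. 3.2 values.
attendance_sets = {
    "Very Low": (0, 10, 30),
    "Low":      (20, 40, 60),
    "Average":  (50, 65, 78),
    "High":     (70, 85, 100),
}

attendance = 73
degrees = {name: round(triangular(attendance, *abc), 2)
           for name, abc in attendance_sets.items()}
print(degrees)  # 73% attendance belongs partly to Average and partly to High
```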
3.2 Fuzzy Membership Function for the Speed of a Vehicle Consider a car (or any similar vehicle on the road) whose speed can be measured in miles per hour. If you drive in the center of a busy city, 30 miles per hour is considered high speed. However, on the highway, this speed is considered very low. Further, for different people, high speed means different values, just like the temperature in a room or the comfort while sitting on a sofa. Some people find 30 miles per hour cool in a city too! Figure 3.3 illustrates different fuzzy membership functions on the variable speed. Figure 3.3 illustrates Very high speed, High speed, Medium speed, Low speed, and Very low speed. These are five membership functions designed on a single axis. If you want more accuracy and high precision, many membership functions (say 25 to 30) can be designed. Here, for an approximate evaluation and demonstration, only five membership functions are shown in Fig. 3.3.
Fig. 3.3 Fuzzy membership function for speed of a vehicle (X-Axis: speed in miles/hour, from 0 to 80; Y-Axis: membership degree from 0 to 1; fuzzy sets shown: Very Low, Low, Medium, High, and Very High)
3.3 Operations of Fuzzy Sets: Numerical Example This example illustrates different operations on fuzzy sets such as union, intersection, complement, and difference on the fuzzy set X and the fuzzy set Y. The sets X and Y are given as follows.
Fuzzy Set X = {x1, x2, x3, x4} = {0.20, 0.40, 0.70, 0.30}
Fuzzy Set Y = {y1, y2, y3, y4} = {0.15, 0.60, 0.90, 0.10}

The union of the fuzzy sets X and Y is defined as follows.
Fuzzy X ∪ Fuzzy Y = max[µ(x), µ(y)] = {0.20, 0.60, 0.90, 0.30}

The intersection of the fuzzy sets X and Y is defined as follows.
Fuzzy X ∩ Fuzzy Y = min[µ(x), µ(y)] = {0.15, 0.40, 0.70, 0.10}

Alternatively, the intersection of the fuzzy sets X and Y is also defined as follows.
Fuzzy X ∩ Fuzzy Y = mul[µ(x), µ(y)] = {0.03, 0.24, 0.63, 0.03}
The complement of the fuzzy set X is defined as follows.
Complement(Fuzzy X) = [1 − µ(x)] = {0.80, 0.60, 0.30, 0.70}

The complement of the fuzzy set Y is defined as follows.
Complement(Fuzzy Y) = [1 − µ(y)] = {0.85, 0.40, 0.10, 0.90}

The difference between the fuzzy sets X and Y is defined as follows.
Difference(Fuzzy X, Fuzzy Y) = Fuzzy X ∩ (Complement Y) = min[µ(x), 1 − µ(y)]
= {0.20, 0.40, 0.70, 0.30} ∩ {0.85, 0.40, 0.10, 0.90} = {0.20, 0.40, 0.10, 0.30}

The bold union of the fuzzy sets X and Y is defined as follows.
BU(Fuzzy X, Fuzzy Y) = min[1, µ(x) + µ(y)] = {0.35, 1.00, 1.00, 0.40}

The bold intersection of the fuzzy sets X and Y is defined as follows.
BI(Fuzzy X, Fuzzy Y) = max[0, µ(x) + µ(y) − 1] = {0.00, 0.00, 0.60, 0.00}
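The element-wise operations of this section can be verified with a few lines of Python; the sketch below (illustrative only) reproduces the union, intersection, complement, difference, and bold operators for the sets X and Y.

```python
X = [0.20, 0.40, 0.70, 0.30]
Y = [0.15, 0.60, 0.90, 0.10]

union        = [max(x, y) for x, y in zip(X, Y)]            # [0.20, 0.60, 0.90, 0.30]
intersection = [min(x, y) for x, y in zip(X, Y)]            # [0.15, 0.40, 0.70, 0.10]
product_int  = [round(x * y, 2) for x, y in zip(X, Y)]      # [0.03, 0.24, 0.63, 0.03]
complement_X = [round(1 - x, 2) for x in X]                 # [0.80, 0.60, 0.30, 0.70]
difference   = [min(x, 1 - y) for x, y in zip(X, Y)]        # X intersected with Y', [0.20, 0.40, 0.10, 0.30]
bold_union   = [min(1.0, round(x + y, 2)) for x, y in zip(X, Y)]      # [0.35, 1.00, 1.00, 0.40]
bold_inter   = [max(0.0, round(x + y - 1, 2)) for x, y in zip(X, Y)]  # [0.00, 0.00, 0.60, 0.00]

print(union, intersection, product_int, complement_X, difference, bold_union, bold_inter)
```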
3.4 Fuzzy Operations: Newspapers Example Let there be three leading newspapers, named N1, N2, and N3, in circulation in an area. The choices of men and women in the area about the newspapers are different. Suppose the set of men is represented as M and the set of women is represented as W; their likings about the newspapers are presented as follows. M = {0.40/N1, 0.60/N2, 0.20/N3} W = {0.50/N1, 0.70/N2, 0.80/N3} Newspapers Not Liked by Men As the set M enlists the liking about the various newspapers, the disliking of newspapers by men can be found out by the complement operation.
M′ = {0.60/N1, 0.40/N2, 0.80/N3}

Newspapers Not Liked by Women
Similarly, the disliking of newspapers by women is given as follows.
W′ = {0.50/N1, 0.30/N2, 0.20/N3}

Newspapers Liked by Both the Parties
To find out the newspapers liked by members of both the sets, we use the intersection operation. The result of the intersection operation using the minimum operator is given as follows.
M ∩ W = {0.40/N1, 0.60/N2, 0.20/N3} ∩ {0.50/N1, 0.70/N2, 0.80/N3} = {0.40/N1, 0.60/N2, 0.20/N3} = M

Instead of the minimum operator, if the multiplication operator is selected, the result of the intersection operation is given as follows.
M ∩ W = {0.40/N1, 0.60/N2, 0.20/N3} ∩ {0.50/N1, 0.70/N2, 0.80/N3} = {0.20/N1, 0.42/N2, 0.16/N3}

Newspapers Liked by Either Party
To find out the newspapers liked by either of the parties, we use the union operation. The result of the union operation using the maximum operator is given as follows.
M ∪ W = {0.40/N1, 0.60/N2, 0.20/N3} ∪ {0.50/N1, 0.70/N2, 0.80/N3} = {0.50/N1, 0.70/N2, 0.80/N3} = W
3.5 Fuzzy Operations: Sensors Example Let there be three sensors named S1, S2, and S3. The gain settings and the fuzzy degrees of detection level of all three sensors are given in Table 3.1.

Table 3.1 Gain settings and detection levels of sensors

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS1:          0.20  0.41  0.53  0.66  0.74  0.83  0.89  0.97  1.00  1.00
µDS2:          0.15  0.23  0.37  0.46  0.58  0.69  0.80  0.91  1.00  1.00
µDS3:          0.00  0.08  0.12  0.20  0.32  0.47  0.54  0.69  0.81  1.00

Using the values provided in the table, let us calculate the following expressions.
(i) µDS1(x) ∪ µDS2(x) ∪ µDS3(x)
(ii) µDS1(x) ∩ µDS2(x) ∩ µDS3(x)
(iii) {µDS1(x) ∪ µDS2(x)} ∩ µDS3(x)
(iv) µDS1′(x)
(v) µDS2′(x)
(vi) µDS3′(x)
(vii) µDS1′(x) ∪ µDS2′(x)
(viii) µDS1(x) ∪ µDS3′(x)
(ix) µDS1′(x) ∩ µDS2′(x)
(x) µDS1(x) ∪ µDS2(x) ∪ µDS3′(x)
(xi) µDS1′(x) ∪ µDS2′(x) ∪ µDS3′(x)
(xii) µDS1′(x) ∩ µDS2′(x) ∩ µDS3′(x)
The solution (denoted as D) of the problem is given as follows.

(i) D = µDS1(x) ∪ µDS2(x) ∪ µDS3(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS1:          0.20  0.41  0.53  0.66  0.74  0.83  0.89  0.97  1.00  1.00
µDS2:          0.15  0.23  0.37  0.46  0.58  0.69  0.80  0.91  1.00  1.00
µDS3:          0.00  0.08  0.12  0.20  0.32  0.47  0.54  0.69  0.81  1.00
D:             0.20  0.41  0.53  0.66  0.74  0.83  0.89  0.97  1.00  1.00

(ii) D = µDS1(x) ∩ µDS2(x) ∩ µDS3(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS1:          0.20  0.41  0.53  0.66  0.74  0.83  0.89  0.97  1.00  1.00
µDS2:          0.15  0.23  0.37  0.46  0.58  0.69  0.80  0.91  1.00  1.00
µDS3:          0.00  0.08  0.12  0.20  0.32  0.47  0.54  0.69  0.81  1.00
D:             0.00  0.08  0.12  0.20  0.32  0.47  0.54  0.69  0.81  1.00
(iii) D = {µDS1(x) ∪ µDS2(x)} ∩ µDS3(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS1:          0.20  0.41  0.53  0.66  0.74  0.83  0.89  0.97  1.00  1.00
µDS2:          0.15  0.23  0.37  0.46  0.58  0.69  0.80  0.91  1.00  1.00
µDS3:          0.00  0.08  0.12  0.20  0.32  0.47  0.54  0.69  0.81  1.00
D:             0.00  0.08  0.12  0.20  0.32  0.47  0.54  0.69  0.81  1.00

(iv) D = µDS1′(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS1:          0.20  0.41  0.53  0.66  0.74  0.83  0.89  0.97  1.00  1.00
D:             0.80  0.59  0.47  0.34  0.26  0.17  0.11  0.03  0.00  0.00
(v) D = µDS2′(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS2:          0.15  0.23  0.37  0.46  0.58  0.69  0.80  0.91  1.00  1.00
D:             0.85  0.77  0.63  0.54  0.42  0.31  0.20  0.09  0.00  0.00

(vi) D = µDS3′(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS3:          0.00  0.08  0.12  0.20  0.32  0.47  0.54  0.69  0.81  1.00
D:             1.00  0.92  0.88  0.80  0.68  0.53  0.46  0.31  0.19  0.00
(vii) D = µDS1′(x) ∪ µDS2′(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS1′:         0.80  0.59  0.47  0.34  0.26  0.17  0.11  0.03  0.00  0.00
µDS2′:         0.85  0.77  0.63  0.54  0.42  0.31  0.20  0.09  0.00  0.00
D:             0.85  0.77  0.63  0.54  0.42  0.31  0.20  0.09  0.00  0.00
(viii) D = µDS1(x) ∪ µDS3′(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS1:          0.20  0.41  0.53  0.66  0.74  0.83  0.89  0.97  1.00  1.00
µDS3′:         1.00  0.92  0.88  0.80  0.68  0.53  0.46  0.31  0.19  0.00
D:             1.00  0.92  0.88  0.80  0.74  0.83  0.89  0.97  1.00  1.00

(ix) D = µDS1′(x) ∩ µDS2′(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS1′:         0.80  0.59  0.47  0.34  0.26  0.17  0.11  0.03  0.00  0.00
µDS2′:         0.85  0.77  0.63  0.54  0.42  0.31  0.20  0.09  0.00  0.00
D:             0.80  0.59  0.47  0.34  0.26  0.17  0.11  0.03  0.00  0.00
(x) D = µDS1(x) ∪ µDS2(x) ∪ µDS3′(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS1:          0.20  0.41  0.53  0.66  0.74  0.83  0.89  0.97  1.00  1.00
µDS2:          0.15  0.23  0.37  0.46  0.58  0.69  0.80  0.91  1.00  1.00
µDS3′:         1.00  0.92  0.88  0.80  0.68  0.53  0.46  0.31  0.19  0.00
D:             1.00  0.92  0.88  0.80  0.74  0.83  0.89  0.97  1.00  1.00

(xi) D = µDS1′(x) ∪ µDS2′(x) ∪ µDS3′(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS1′:         0.80  0.59  0.47  0.34  0.26  0.17  0.11  0.03  0.00  0.00
µDS2′:         0.85  0.77  0.63  0.54  0.42  0.31  0.20  0.09  0.00  0.00
µDS3′:         1.00  0.92  0.88  0.80  0.68  0.53  0.46  0.31  0.19  0.00
D:             1.00  0.92  0.88  0.80  0.68  0.53  0.46  0.31  0.19  0.00

(xii) D = µDS1′(x) ∩ µDS2′(x) ∩ µDS3′(x)

Gain setting:  0     10    20    30    40    50    60    70    80    90
µDS1′:         0.80  0.59  0.47  0.34  0.26  0.17  0.11  0.03  0.00  0.00
µDS2′:         0.85  0.77  0.63  0.54  0.42  0.31  0.20  0.09  0.00  0.00
µDS3′:         1.00  0.92  0.88  0.80  0.68  0.53  0.46  0.31  0.19  0.00
D:             0.80  0.59  0.47  0.34  0.26  0.17  0.11  0.03  0.00  0.00
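All twelve expressions above are simple element-wise maxima, minima, and complements over the rows of Table 3.1; the sketch below (illustrative only) computes a few of them and reproduces the tabulated results.

```python
DS1 = [0.20, 0.41, 0.53, 0.66, 0.74, 0.83, 0.89, 0.97, 1.00, 1.00]
DS2 = [0.15, 0.23, 0.37, 0.46, 0.58, 0.69, 0.80, 0.91, 1.00, 1.00]
DS3 = [0.00, 0.08, 0.12, 0.20, 0.32, 0.47, 0.54, 0.69, 0.81, 1.00]

union = lambda a, b: [max(x, y) for x, y in zip(a, b)]
inter = lambda a, b: [min(x, y) for x, y in zip(a, b)]
comp  = lambda a: [round(1 - x, 2) for x in a]

d_i    = union(union(DS1, DS2), DS3)                     # (i)   equals the DS1 row
d_ii   = inter(inter(DS1, DS2), DS3)                     # (ii)  equals the DS3 row
d_viii = union(DS1, comp(DS3))                           # (viii)
d_xii  = inter(inter(comp(DS1), comp(DS2)), comp(DS3))   # (xii) equals the DS1' row

print(d_viii)  # [1.00, 0.92, 0.88, 0.80, 0.74, 0.83, 0.89, 0.97, 1.00, 1.00]
```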
3.6 Selection of a Job Based on Fuzzy Parameters It is observed that people select jobs based on various parameters such as Domain of interest, Salary, Distance from home, and Future prospects. There may be many more parameters; however, for demonstration purposes, the four parameters mentioned above are selected. This example demonstrates how fuzzy membership functions for these parameters are designed and how a decision is made based on them. Let us consider that there are four jobs available, as shown in Table 3.2. Let us consider the fuzzy set A representing the interest level of all the jobs, which is given below.
A = {0.80, 0.55, 0.80, 0.25}
Similarly, we can have other fuzzy sets for Salary, Distance, and Future prospects, as defined below. These sets are called B, C, and D respectively.
B = {0.50, 0.80, 0.45, 0.15}
C = {0.70, 0.15, 0.20, 0.90}
D = {0.20, 0.60, 0.15, 0.50}
For a job, a good salary, the domain of interest, and future prospects are desirable characteristics and need to be maximized. However, distance is not a desirable factor for a job. Hence, the complement of the set C, say C′, is considered. All the sets A, B, C′, and D are desirable for a job, so the intersection is considered. The intersection with the multiplication operator yields the recommendation shown in the last column of Table 3.3. Based on the recommendation values, the ranks of the jobs are determined. Instead of the multiplication, the minimum operator for intersection can also be considered. A weighted formula can also be used to determine the rank of a job considering the preferences of users. This is considered a customized, novel, and application-specific intersection operator.

Table 3.2 Details of available jobs

Job no.   Domain of interest (A)   Salary (B)   Distance (C)   Future prospects (D)
1         0.80                     0.50         0.70           0.20
2         0.55                     0.80         0.15           0.60
3         0.80                     0.45         0.20           0.15
4         0.25                     0.15         0.90           0.50
Table 3.3 Job recommendation and rank by fuzzy logic
(Input columns: Domain of interest A, Salary B, complement of Distance C′, Future prospects D; Output: Recommendation obtained by intersection, and Rank)

Job no.   A      B      C′     D      Recommendation   Rank
1         0.80   0.50   0.30   0.20   0.0240           3
2         0.55   0.80   0.85   0.60   0.2244           1
3         0.80   0.45   0.80   0.15   0.0432           2
4         0.25   0.15   0.10   0.50   0.0019           4
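The recommendation column of Table 3.3 is just the product-based intersection of A, B, C′, and D; the sketch below (illustrative only) recomputes it and ranks the jobs.

```python
jobs = {
    1: (0.80, 0.50, 0.70, 0.20),
    2: (0.55, 0.80, 0.15, 0.60),
    3: (0.80, 0.45, 0.20, 0.15),
    4: (0.25, 0.15, 0.90, 0.50),
}

recommendation = {
    job: round(a * b * (1 - c) * d, 4)   # intersection by multiplication, distance complemented
    for job, (a, b, c, d) in jobs.items()
}
ranking = sorted(recommendation, key=recommendation.get, reverse=True)

print(recommendation)  # {1: 0.024, 2: 0.2244, 3: 0.0432, 4: 0.0019}
print(ranking)         # [2, 3, 1, 4] (job 2 is ranked first)
```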
3.7 Affordability of a Dress The selection of a woman's readymade dress for an evening party depends on a variety of factors, some of which are difficult to predict! Affordability is one such factor that plays a critical role in the selection of a dress. The affordability of a dress can be determined based on some important and logical parameters such as Design, Cloth quality, Color, and Price. Depending on the collective value of these parameters, one can determine whether the dress is affordable or not. Let us define a membership function for the design of a dress. On the X-Axis, a score based on a rank between 1 and 10 given by the user is presented. The fuzzy membership function for the Design of a dress is shown in Fig. 3.4. It can be observed that there are no proper units possible for the parameter Design on the X-Axis; hence a score/rank based on a scale of 10 is considered. Later, with type-2 fuzzy membership functions, this can be enhanced.

Fig. 3.4 Membership function of the design of a dress

Similar to this, other fuzzy membership functions are designed. As stated, other fuzzy membership functions on a scale of 10 are developed for cloth quality and color. For the price of a dress, either actual values (in a specific
range) can be considered, percentages can be considered, or a 10-based scale may be used. For each dress item, different linguistic values on various parameters such as Design, Quality, Color, and Price are collected from users. These values are in words (linguistic in nature), such as low quality of the dress, bad design, good quality cloth, etc. These fuzzy linguistic values are not understood by the system; hence, before computing these values for further decision making, they need to be converted into their equivalent crisp values through appropriate fuzzy membership functions. From the selected four parameters in this example, it is obvious that all the parameters are not equally important. Some parameters, such as Design and Quality, are more important than other parameters. Such weights for different parameters can be determined with the help of experts or can be acquired on a custom basis depending on one's likings. The major parameters are described in Table 3.4. Table 3.4 enlists the major four parameters for the selection of a readymade party dress, namely Design, Cloth quality, Color, and Price. The weights of these parameters are respectively 0.3, 0.3, 0.1, and 0.3. Four major items shortlisted for selection are given as follows.
• Item 1: Cream Silk Long Gown.
• Item 2: Black Midi Skirt Lycra/Rayon Material.
• Item 3: Formal Cocktail Dress High-Quality Satin Beige Color.
• Item 4: Red Semi Formal Party Maxi Chiffon.

Table 3.4 Weights and ranks of the dress selection examples
(Columns: Design, Cloth quality, Color, Price; weights 0.30, 0.30, 0.10, 0.30)

Item 1 (Cream silk long gown): Good, Moderate, Good, Good; crisp values 8, 6, 8, 8; weighted sum 7.40
Item 2 (Black midi skirt, lycra/rayon material): Moderate, Moderate, Good, Bad; crisp values 6, 5, 7, 4; weighted sum 5.20
Item 3 (Formal cocktail dress, high-quality satin, beige color): Bad, Moderate, Moderate, Moderate; crisp values 5, 6, 5, 6; weighted sum 5.60
Item 4 (Semi-formal party maxi, red color chiffon): Moderate, Very Bad, Bad, Moderate; crisp values 7, 3, 4, 5; weighted sum 4.90
… (further items)
Table 3.4 uses the above-mentioned shortlisted items and presents linguistic as well as crisp values of the individual parameters for each item for further computation. For every item, the user can give linguistic values as shown in the table. Along with the user-given linguistic values, equivalent crisp values are obtained through the membership functions and are listed in Table 3.4. From the obtained crisp values and the weights assigned by the experts, the weighted sum for each item is calculated. This weighted sum forms the basis of selection of the dress. The affordability of the dress is a fuzzy linguistic variable, which takes the weighted sum for an item and maps it into the interval [0, 1]. The affordability fuzzy membership function is defined as shown in Fig. 3.5.

Fig. 3.5 Affordability of a dress

The calculated weighted sum is provided to the fuzzy membership function illustrated in Fig. 3.5, from which a fuzzy value for the affordability is determined. It is to be noted that it is possible to have triangular membership functions of different heights and levels. Some membership functions may reach only up to 0.80 on the Y-Axis. Further, a membership function does not necessarily have to be symmetrical. Similarly, the affordability of many items such as any consumer product, machine, book, or software can be determined. Refer to the next section for identifying the affordability of a software product.
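The weighted sums in Table 3.4 follow directly from the crisp scores and the expert weights; the sketch below (illustrative only) recomputes them for Items 1, 2, and 4, whose crisp scores are unambiguous in the table.

```python
weights = [0.30, 0.30, 0.10, 0.30]   # Design, Cloth quality, Color, Price

crisp_scores = {
    "Item 1": [8, 6, 8, 8],
    "Item 2": [6, 5, 7, 4],
    "Item 4": [7, 3, 4, 5],
}

weighted_sums = {
    item: round(sum(w * s for w, s in zip(weights, scores)), 2)
    for item, scores in crisp_scores.items()
}
print(weighted_sums)   # {'Item 1': 7.4, 'Item 2': 5.2, 'Item 4': 4.9}
```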
3.8 Affordability of a Software The affordability of software depends on various quality factors and the price of the software. Given quality rank by experts and the price of software, one can determine the affordability of the software. Instead of having only one parameter representing
the quality of software, one can consider various quality-related factors such as reliability, usability, functionality, portability, maintainability, etc. For each of these parameters, a fuzzy membership function can be developed on a scale of 10. Here, the values of the quality factors for various software products are taken from experts of the domain. Let A be the fuzzy membership function stating the Affordability of a software product, defined as follows:
A = 0, if the price of the software is more than 5999 dollars;
A = 1 − (price/6000), otherwise.
Prices of various software products are available in Table 3.5. The table also enlists the quality ranks of various software products after rigorous testing by experts on various parameters. Please note that the quality of software is generally not given the highest value of 100% (or 1, as denoted in Table 3.5), as there is always the possibility of a better product. Here, the quality value 1 is taken for demonstration only. Let us calculate the Affordability of every software product as per the fuzzy function A defined above. The calculated values are shown in Table 3.6.

Table 3.5 Price and quality of software products

Software product   Price ($)   Quality
Product 1          6000        1.00
Product 2          400         0.30
Product 3          800         0.70
Product 4          900         0.50
Product 5          1000        0.30
Product 6          500         0.20

The software product which has good quality as well as high affordability is the best software to be purchased. This can be identified by applying the intersection
Table 3.6 Cost-effectiveness of a software

Software    Price ($)   Quality   Affordability          Cost-effectiveness    Cost-effectiveness
                                  (as per function A)    (minimum operation)   (multiplication operation)
Product 1   6000        1.00      0.00                   0.00                  0.00
Product 2   400         0.30      0.93                   0.30                  0.28
Product 3   800         0.70      0.87                   0.70                  0.61
Product 4   900         0.50      0.85                   0.50                  0.43
Product 5   1000        0.30      0.83                   0.30                  0.25
Product 6   500         0.20      0.92                   0.20                  0.18
operator on the Quality and the Affordability of the software product. The last column of Table 3.6 presents the results of the intersection operation between the Quality and the Affordability for each software product. For the intersection operation minimum as well as multiplication operators are applied. By both types of intersection operations, it can be observed that the software product named Product 3 is the most preferred.
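The affordability function and the two cost-effectiveness variants of Table 3.6 can be reproduced in a few lines of Python; the sketch below is illustrative only.

```python
products = {                 # name: (price in dollars, expert quality rank)
    "Product 1": (6000, 1.00),
    "Product 2": (400, 0.30),
    "Product 3": (800, 0.70),
    "Product 4": (900, 0.50),
    "Product 5": (1000, 0.30),
    "Product 6": (500, 0.20),
}

def affordability(price):
    """A = 0 if the price exceeds 5999 dollars, 1 - price/6000 otherwise."""
    return 0.0 if price > 5999 else round(1 - price / 6000, 2)

for name, (price, quality) in products.items():
    a = affordability(price)
    print(name, a, min(quality, a), round(quality * a, 2))
# Product 3 gives the highest value under both intersection operators.
```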
3.9 Membership Functions and Fuzzy Rules for Automatic Car Braking Let us consider a scenario where automatic cars are driven on the road. The cars need to maintain enough distance and, from time to time, need to apply the brakes in case of obstacles, signals, or arrival at the destination. We will select an example scenario that determines how fast the car can be driven and how tightly the brakes have to be applied. For this, a few variables/parameters are designed, such as (i) the Speed of the car, (ii) the Distance between the car and another vehicle on the road ahead, and (iii) how tightly one needs to apply the brakes of the car. These variables are fuzzy in nature, and to use them in the system we need a corresponding fuzzy membership function for each of the variables. Refer to Figs. 3.6, 3.7, and 3.8 for the membership functions. Figure 3.6 demonstrates the speed of a car using the scale miles/hour on the X-Axis. The figure illustrates 5 different membership functions called Very Low, Low, Average, High, and Very High on a single axis. Figure 3.7 demonstrates the distance between two vehicles with similar fuzzy titles; however, the scale on the X-Axis is measured in feet. In Figs. 3.6, 3.7, and 3.8 all the membership functions are triangular, except for the last one. Figure 3.8 presents 5 membership functions to measure the fuzzy degree of Pressure on the Brakes. These functions are called Very Low, Low, Average, High, and
Fig. 3.6 Speed of the car
Fig. 3.7 Distance between two vehicles
Fig. 3.8 Pressure on the brakes
Very High. Here, on the X-Axis, a scale of 10 is considered to measure the pressure on the brakes. Using the fuzzy membership functions on the different linguistic variables, many fuzzy rules of the following form are developed:
If (a condition involving fuzzy variables) then (an action involving fuzzy variables)
Table 3.7 presents some combinations of the fuzzy variables for possible fuzzy rules to be prepared in the above-mentioned format. The rules can be derived from the different values of the variables shown in Table 3.7. A few example rules are listed below.
Rule 1: If car Speed is High and Distance is Average then Pressure on the brakes is Medium.
Rule 2: If car Speed is Very High and Distance is Very Low then Pressure on the brakes is Very High.
Table 3.7 Values of the fuzzy variables for the car example

Rule no.   Speed   Distance   Pressure
1          HH      AV         ME
2          VH      VL         VH
3          VL      VH         VL
4          VL      VL         ME
5          VH      VH         VL
6          HH      HH         HH
…          …       …          …

The encoded values for the functions are: VL: Very Low, L: Low, AV: Average, ME: Medium, HH: High, VH: Very High
Rule 3 If car Speed is Very Low and Distance is Very High then Pressure on the brakes is Very Low. etc. Such multiple rules can be stored in a knowledge base of a system that manages driverless vehicles.
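A minimal Mamdani-style sketch of how such rules could be evaluated is given below; the membership shapes, the reduced rule set, and the weighted-average defuzzification are simplified assumptions for illustration, not the book's exact design.

```python
def triangular(x, a, b, c):
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Assumed membership functions (speed in miles/hour, distance in feet, pressure on a 0-10 scale)
speed_sets    = {"High": (40, 60, 80), "Very High": (60, 80, 101)}
distance_sets = {"Very Low": (-1, 0, 30), "Average": (30, 60, 90)}
pressure_peak = {"Medium": 5, "Very High": 9}   # representative output values

rules = [   # (speed label, distance label, pressure label)
    ("High", "Average", "Medium"),
    ("Very High", "Very Low", "Very High"),
]

def brake_pressure(speed, distance):
    """Fire each rule with min() of its antecedents and take the weighted average of outputs."""
    strengths, outputs = [], []
    for s_label, d_label, p_label in rules:
        strength = min(triangular(speed, *speed_sets[s_label]),
                       triangular(distance, *distance_sets[d_label]))
        strengths.append(strength)
        outputs.append(pressure_peak[p_label])
    total = sum(strengths)
    return sum(w * o for w, o in zip(strengths, outputs)) / total if total else 0.0

print(round(brake_pressure(speed=75, distance=20), 2))  # high pressure for a fast, close car
```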
3.10 Fuzzy Logic Application in Share Market Many parameters affect the decision about the investment amount in the share market: the brand image of a company, the return rate, the expected time of return (such as long term or short term), the price of the scrip, the number of shares/scrips to be purchased, the risk associated with the decision, etc. Many of these parameters are vague, and there is no generic set of guidelines available for people. To demonstrate the use of fuzzy logic in the domain, an example with a limited number of inputs and outputs is presented as follows. Consider two input parameters I1 and I2, defined as (i) the return rate in percentages and (ii) the brand image of a company in which the investment is to be made. It has been observed that it is very difficult to guess the correct range of the rate of return. People generally predict the average rate of return; however, the achieved (actual) return might not fall into the defined range. Further, it is to be noted that the range may include negative numbers too. Here, the range of return is considered from −25 to 125%; it may exceed both ends. Refer to Fig. 3.9 for the fuzzy membership function for the first input parameter (I1), named Rate of return. The fuzzy membership definitions are as follows. µBR = (−x)/25;
−25