Research Anthology on Multi-Industry Uses of Genetic Programming and Algorithms

Information Resources Management Association, USA
Published in the United States of America by IGI Global
Engineering Science Reference (an imprint of IGI Global)
701 E. Chocolate Avenue
Hershey PA, USA 17033
Tel: 717-533-8845
Fax: 717-533-8661
E-mail: [email protected]
Web site: http://www.igi-global.com

Copyright © 2021 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher. Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark.

Library of Congress Cataloging-in-Publication Data

Names: Information Resources Management Association, editor.
Title: Research anthology on multi-industry uses of genetic programming and algorithms / Information Resources Management Association, editor.
Description: Hershey, PA : Engineering Science Reference, [2021] | Includes bibliographical references and index. | Summary: “This book of research chapters explores the technology, uses, and implementation of genetic programming and algorithms across multiple industries creating a fundamental understanding of this technology, and how genetic programming and algorithms are implemented in fields such as healthcare, engineering, social sciences, computer science and more”-- Provided by publisher.
Identifiers: LCCN 2020055174 (print) | LCCN 2020055175 (ebook) | ISBN 9781799880486 (hardcover) | ISBN 9781799880998 (ebook)
Subjects: LCSH: Genetic programming (Computer science) | Genetic algorithms--Industrial applications. | Engineering mathematics. | Industries--Data processing.
Classification: LCC QA76.623 .R47 2021 (print) | LCC QA76.623 (ebook) | DDC 006.3/823--dc23
LC record available at https://lccn.loc.gov/2020055174
LC ebook record available at https://lccn.loc.gov/2020055175

British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.

The views expressed in this book are those of the authors, but not necessarily of the publisher.

For electronic access to this publication, please contact: [email protected].
Editor-in-Chief
Mehdi Khosrow-Pour, DBA, Information Resources Management Association, USA

Associate Editors
Steve Clarke, University of Hull, UK
Murray E. Jennex, San Diego State University, USA
Ari-Veikko Anttiroiko, University of Tampere, Finland

Editorial Advisory Board
Sherif Kamel, American University in Cairo, Egypt
In Lee, Western Illinois University, USA
Jerzy Kisielnicki, Warsaw University, Poland
Amar Gupta, Arizona University, USA
Craig van Slyke, University of Central Florida, USA
John Wang, Montclair State University, USA
Vishanth Weerakkody, Brunel University, UK
List of Contributors
Abdelkader, Mostefai / Dr. Tahar Moulay University of Saida, Algeria........................................ 1053 Abou-Msabah, Slimane / University of Science and Technology Houari Boumedienne, Bab Ezzouar, Algeria.......................................................................................................................... 1471 Acharjya, Debi Prasanna / School of Computer Science and Engineering, VIT University, Vellore, India............................................................................................................................... 1229 Adamatti, Diana Francisca / Universidade Federal do Rio Grande (FURG), Brazil..................... 1215 Agrawal, Navneet / The College of Technology and Engineering, Department of Electronics and Communication, Udaipur, India.................................................................................................... 482 Aherwar, Amit / Madhav Institute of Technology and Science, India.............................................. 676 Ait Omar, Driss / Information Processing and Decision Support Laboratory Sultan Moulay Slimane University, Morocco........................................................................................................ 328 Akter, Ruksana / Hankuk University of Foreign Studies, Seoul, Korea............................................ 181 Al Maghayreh, Eslam / King Saud University, Saudi Arabia........................................................... 495 Alaoui, Abdiya / EEDIS Laboratory, Department of Computer Science, Djillali Liabes University Sidi Belabbes, Algeria.................................................................................................. 244 Amrani, Fouzia / Ecole Nationale Polytechnique d’Oran (ENPO), Oran, Algeria........................ 1186 Aote, Shailendra / Ramdeobaba College of Engineering and Management, India.............................. 1 Ascencio, Jorge A / Polytechnic University of Quintana Roo, Cancun, Mexico................................ 947 Ashour, Amira S. / Tanta University, Egypt....................................................................................... 592 Baba-Ali, Ahmed-Riadh / University of Science and Technology Houari Boumedienne, Bab Ezzouar, Algeria.......................................................................................................................... 1471 Bansal, Ankita / Netaji Subhas Institute of Technology, Delhi, India............................................... 279 Barma, Partha Sarathi / National Institute of Technology Durgapur, West Bengal, India............... 403 Baslam, Mohamed / Information Processing and Decision Support Laboratory Sultan Moulay Slimane University, Morocco........................................................................................................ 328 Belayachi, Naima / Laboratoire d’informatique d’Oran (LIO), University of Oran 1 Ahmed Ben Bella, Oran, Algeria.................................................................................................................... 1186 Bhattacharyya, Siddhartha / RCC Institute of Information Technology, India................................. 78 Bhupendro Singh, N. / National Institute of Technology, India.......................................................... 78 Bianconi, Fabio / Università degli Studi di Perugia, Italy................................................................. 997 Born, Míriam Blank / Universidade Federal do Rio Grande (FURG), Brazil................................ 
1215 Bottura, Celso Pascoli / Unicamp, Brazil........................................................................................... 700 Bouamama, Sadok / FCIT, University of Jeddah, Jeddah, Saudi Arabia....................................... 1140 Bouamrane, Karim / Laboratoire d’informatique d’Oran (LIO), University of Oran 1 Ahmed Ben Bella, Oran, Algeria............................................................................................................. 1186
Boutekkouk, Fateh / Research Laboratory on Computer Science’s Complex Systems (ReLaCS2), University of Oum El Bouaghi, Algeria....................................................................................... 1116 Buffi, Alessandro / Università degli Studi di Perugia, Italy.............................................................. 997 Calzada-Orihuela, Gustavo / Morelos State Autonomous University, Cuernavaca, Mexico........... 947 Celik, Gaffari / Agri Ibrahim Cecen University, Turkey.................................................................... 642 Chakraborty, Shouvik / University of Kalyani, India...................................................................... 592 Chandila, Anuj / IEC-CET, Greater Noida, India............................................................................ 300 Chatterjee, Sankhadeep / University of Calcutta, India................................................................... 592 Chaudhari, Narendra S. / Indian Institute of Technology Indore, Indore, India............................ 1431 Chawla, Suruchi / Shaheed Rajguru College Delhi University, Delhi, India................................... 656 Chen, Xirui / College of Mechanical Engineering, Chongqing Technology and Business University, Chongqing, China......................................................................................................... 65 Chillarige, Raghavendra Rao / SCIS, University of Hyderabad, Hyderabad, India......................... 223 Chung, Yoojin / Hankuk University of Foreign Studies, Seoul, Korea.............................................. 181 Das, Kedar Nath / NIT Silchar, India................................................................................................. 148 De, Tanmay / National Institute of Technology Durgapur, West Bengal, India................................. 403 Deo, Ravinesh C. / University of Southern Queensland, Australia.................................................... 116 Dey, Nilanjan / Department of Information Technology, Techno India College of Technology, Kolkata, India................................................................................................................................ 592 Dhamodharavadhani S. / Periyar University, India.......................................................................... 742 Dhinesh Babu L.D. / VIT University, Vellore, India........................................................................ 1492 Dilip, Kumar / Jawaharlal Nehru University, India.......................................................................... 811 Downs, Nathan J. / University of Southern Queensland, Australia................................................... 116 Dutta, Joydeep / National Institute of Technology Durgapur, West Bengal, India........................... 403 Eid, Heba F. / Al Azhar University, Egypt.......................................................................................... 620 El Amrani, Mohamed / Information Processing and Decision Support Laboratory, Sultan Moulay Slimane University, Morocco........................................................................................... 328 Elberrichi, Zakaria / EEDIS Laboratory, Department of Computer Science, Djillali Liabes University Sidi Belabbes, Algeria.................................................................................................. 244 Eren Şenaras, Arzu / Uludag University, Turkey............................................................................ 
1207 Eroglu, Ergun / Istanbul University, Turkey...................................................................................... 790 Fakir, Mohamed / Information Processing and Decision Support Laboratory Sultan Moulay Slimane University, Morocco........................................................................................................ 328 Filippucci, Marco / Università degli Studi di Perugia, Italy............................................................. 997 Ganapathy, L. / National Institute of Industrial Engineering, Mumbai, India................................. 375 Ganeshkumar C / Indian Institute of Plantation Management (IIPM), Bangalore, India............... 1074 Garmani, Hamid / Information Processing and Decision Support Laboratory Sultan Moulay Slimane University, Morocco........................................................................................................ 328 Gautam, S. S. / Mahatma Gandhi Chitrakoot Gramodaya Vishwavidyalaya, India.......................... 550 Gharbi, Ibrahim / ENSI, Manouba University, Manouba, Tunisia................................................ 1140 Gharsellaoui, Hamza / National Engineering School of Carthage (ENIC), Carthage University, Tunis, Tunisia............................................................................................................................... 1140 Ghimire, Sujan / University of Southern Queensland, Australia...................................................... 116 Giannatsis, John / Department of industrial Management & Technology, University of Piraeus, Piraeus, Greece........................................................................................................................... 1342 Giesbrecht, Mateus / Unicamp, Brazil.............................................................................................. 700
Godandapani, Zayaraz / Pondicherry Engineering College, Pondicherry, India.......................... 1513 Gong, Xiansheng / College of Mechanical Engineering, Chongqing University, Chongqing, China & The State Key Laboratory of Mechanical Transmission, Chongqing University, Chongqing, China........................................................................................................................... 65 Gopal, Girdhar / Sanatan Dharma College, India........................................................................... 851 İnanç, Şahin / Bursa Uludağ University, Turkey............................................................................. 1207 Ismail, Ismail A. / 6 October University, Egypt................................................................................. 355 J., Jagan / VIT University, India.......................................................................................................... 98 J., Sharon Moses / Vellore Institute of Technology, India.................................................................. 829 Jain, Ashish / Indian Institute of Technology Indore, Indore, India & Manipal University Jaipur, Jaipur, India................................................................................................................................ 1431 Jain, Vivek / The College of Technology and Engineering, Department of Electronics and Communication, Udaipur, India.................................................................................................... 482 Jajoria, Sourabh / Netaji Subhas Institute of Technology, Delhi, India............................................ 279 Jayakumar L / Pondicherry University, Pondicherry, India............................................................ 1074 Jujjavarapu, Satya Eswari / National Institute of Technology Raipur, India................................... 609 Kar, Samarjit / National Institute of Technology Durgapur, West Bengal, India............................. 403 Krishna, Addepalli V. N. / Christ University, India.......................................................................... 969 Kumar, Amit / Birla Institute of Technology Mesra, India............................................................... 874 Kumar, Harendra / Gurukula Kangri Vishwavidyalaya, Haridwar, India..................................... 1156 Kumar, Pankaj / Gurukula Kangri Vishwavidyalaya, Haridwar, India......................................... 1156 Kumar, Sachin / College of Information Business Systems, National University of Science and Technology, MISiS, Russian Federation........................................................................................ 762 Kumar, T.V. Vijay / School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India............................................................................................................................ 512 Kurup, Pradeep U. / University of Massachusetts – Lowell, USA....................................................... 98 Kyriazis, Andreas / Department of Industrial Management & Technology, University of Piraeus, Piraeus, Greece........................................................................................................................... 1342 L. D., Dhinesh Babu / Vellore Institute of Technology, India............................................................ 829 Laskar, Baddrud Zaman / NIT Silchar, India................................................................................... 
427 Li, Rongrong / Guangdong University of Science and Technology, Dongguan, China.................... 190 M., Nirmala / Vellore Institute of Technology, India......................................................................... 829 Majumder, Swanirbhar / NERIST, India.......................................................................................... 427 Mali, Kalyani / University of Kalyani, India..................................................................................... 592 Malik, Sunesh / Guru Gobind Singh Indraprastha University, New Delhi, India............................. 851 Melkemi, Kamal Eddine / LMA Laboratory, University of Mohamed Khider, Biskra, Algeria........ 205 Mercangöz, Burcu Adıguzel / Istanbul University, Turkey............................................................... 790 Mewada, Shivlal / Mahatma Gandhi Chitrakoot Gramodaya Vishwavidyalaya, India.................... 550 Mezzoudj, Saliha / LaSTIC Laboratory, University Hadj Lakhdar, Batna, Algeria......................... 205 Mishra, Anjana / Department of IT, C.V.Raman College of Engineering, Mahura, India................. 35 Mishra, K. K. / MNNIT Allahabad, India.......................................................................................... 300 Mishra, Poonam Prakash / Pandit Deendayal Petroleum University, India................................... 1175 Mishra, Rajashree / KIIT University, India...................................................................................... 148 Mishra, Shashwati / Utkal University, Vani Vihar, India.................................................................. 896 Mohamed-Khireddine, Kholladi / Echahid Hamma Lakhdar University, El Oued, Algeria.......... 260 Moharam, Riham / Suez Canal University, Egypt............................................................................ 355
Morsy, Ehab / Suez Canal University, Egypt.................................................................................... 355 Mourelle, Luiza de Macedo / Universidade do Estado do Rio de Janeiro, Rio de Janeiro, Brazil............................................................................................................................................. 534 Nadimpalli, Vijaya Lakshmi V. / ACRHEM, University of Hyderabad, Hyderabad, India.............. 223 Naik, Bighnaraj / Department of Computer Application, Veer Surendra Sai University of Technology, Burla, India................................................................................................................. 35 Nautiyal, Lata / Graphic Era University, India................................................................................. 344 Nedjah, Nadia / Universidade do Estado do Rio de Janeiro, Rio de Janeiro, Brazil........................ 534 Negewo, Adem Guluma / Addis Ababa Science and Technology University, Ethiopia................... 1260 Panda, Mrutyunjaya / Utkal University, Vani Vihar, India.............................................................. 896 Pandey, Shriansh / Christ University, India...................................................................................... 969 Paul, Victer / Department of Computer Science and Engineering, Vignan’s Foundation for Science, Technology & Research, Guntur, India......................................................................... 1074 Prakash, Jay / School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India.................................................................................................................................... 512 Pundir, Ashok K. / National Institute of Industrial Engineering, Mumbai, India............................. 375 Punhani, Akash / ABES Engineering College, Ghaziabad, India..................................................... 300 Puyalnithi, Thendral / Vellore Institute of Technology(VIT), Vellore, India...................................... 50 Qiu, Linrun / Guangdong University of Science and Technology, China......................................... 190 R., Viswanathan / Galgotias University, India.................................................................................... 98 Raghuwanshi, Mukesh M. / Yeshwantrao Chavan College of Engineering, India............................... 1 Raj, Nawin / University of Southern Queensland, Australia............................................................. 116 Ram, Mangey / Graphic Era University (Deemed), India................................................................. 344 Ramkumar, Sarala / Pondicherry Engineering College, Pondicherry, India................................. 1513 Rathee, Manisha / Jawaharlal Nehru University, India.................................................................... 811 Rathee, Ritu / Indira Gandhi Delhi Technical University for Women, India.................................... 811 Rathi, R. / School of Information Technology and Engineering, VIT University, Vellore, India..... 1229 Rathipriya R. / Periyar University, India........................................................................................... 742 Ray, Jhuma / RCC Institute of Information Technology, India........................................................... 78 Recioui, Abdelmadjid / University of Boumerdes, Algeria............................................................. 
1017 Reddlapalli, Rama Kishore / Guru Gobind Singh Indraprastha University, New Delhi, India........ 851 Reyes-Salgado, Gerardo / Computer Science, National Center for Research and Technological Development (CENIDET), Cuernavaca, Mexico.......................................................................... 947 Rodríguez de Guzmán, Ignacio García / Alarcos Research Group, University of Castilla-La Mancha, Spain............................................................................................................................. 1053 Ryma, Guefrouchi / Abdelhamid Mehri Constantine2 University, Constantine, Algeria................. 260 Sager, Basma / University of Sciences and Technologies Houari Boumediene, Bab Ezzouar, Algeria......................................................................................................................................... 1471 Sahu, Sanat Kumar / Govt. K. P.G. College Jagdalpur Bastar, Jagdalpur, India............................ 917 Saini, Hemraj / Department of Computer Science and Engineering, Jaypee University of Information Technology, Waknaghat, India........................................................................ 773, 1456 Samui, Pijush / National Institute of Technology Patna, India............................................................ 98 Sanchotene de Aguiar, Marilton / Universidade Federal de Pelotas (UFPel), Brazil................... 1215 Sarda, Raghav / Christ University, India.......................................................................................... 969 Sarkar, Bikash Kanti / Birla Institute of Technology Mesra, India................................................... 874 Schiavon de Souza, Weslen / Universidade Federal de Pelotas (UFPel), Brazil............................ 1215
Sharan, Aditi / SC & SS: School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India......................................................................................................... 447 Sharma, Bharti / Shivalik College of Engineering Dehradun, India................................................ 762 Sharma, Manisha / Panjab University, Chandigarh,, India............................................................ 1156 Sharma, Oshin / PES University, Bangalore, India................................................................ 773, 1456 Sharma, Pradeep / Holkar Science College, India........................................................................... 550 Sharon Moses J. / VIT University, Vellore, India............................................................................. 1492 Shivach, Preeti / Graphic Era University, India................................................................................ 344 Shrivas, A. K. / Dr. C. V. Raman University Kota Bialspur (C.G.), Bilaspur, India.......................... 917 Sineva, Irina S. / Moscow Technical University of Communications and Informatics, Russia........ 1414 Singh, Amit / Jawaharlal Nehru University, New Delhi, India......................................................... 447 Singh, Bikesh Kumar / National Institute of Technology Raipur, India............................................ 609 Singh, Varimna / Som Lalit Institute of Management Studies, Ahmedabad, India........................... 375 Siotropos, Panayiotis / Department of Industrial Management & Technology, University of Piraeus, Piraeus, Greece............................................................................................................. 1342 Srichandan, Suresh Kumar / Department of IT, Veer Surendra Sai University of Technology, Burla, India..................................................................................................................................... 35 Srinivasan, Santhoshkumar / Vellore Institute of Technology, India............................................... 829 Tambouratzis, Tatiana / University of Piraeus, Piraeus, Greece................................................... 1342 Tarik, Boudheb / EEDIS Laboratory, Djillali Liabes University, Sidi Bel Abbès, Algeria................................................................................................................................. 570, 1318 Tavares, Yuri Marchetti / Brazilian Navy, Rio de Janeiro, Brazil.................................................... 534 Tiwari, Shailesh / CSED, ABES Engineering College, Ghaziabad, India........................................ 300 Unune, Deepak Rajendra / The LNM Institute of Information Technology, India............................ 676 Urquiza-Beltrán, Gustavo / Morelos State Autonomous University, Cuernavaca, Mexico.............. 947 Vankadara, Madhuviswanatham / Vellore Institute of Technology(VIT), Vellore, India.................. 50 Vivekanandan, Vijayalakshmi / Pondicherry Engineering College, Pondicherry, India.............. 1513 Wang, John / School of Business, Montclair State University, Upper Montclair, USA.................... 928 Wankar, Rajeev / SCIS, University of Hyderabad, Hyderabad, India.............................................. 223 Xavier, Anitha Mary / Karunya University, India........................................................................... 
1285 Xu, Shubin / College of Business and Management, Northeastern Illinois University, Chicago, USA................................................................................................................................................ 928 Zaichenko, Dmitry S. / Moscow Technical University of Communications and Informatics, Russia.......................................................................................................................................... 1414 Zakaria, Elberrichi / EEDIS Laboratory, Djillali Liabes University, Sidi Bel Abbès, Algeria................................................................................................................................. 570, 1318 Zhang, Dongbo / Guangdong Institute of Intelligent Manufacturing, China.................................... 190 Zhang, Geng / College of Mechanical Engineering, Chongqing University, Chongqing, China & College of Information and Electrical Engineering, Chongqing Radio and TV University, Chongqing, China........................................................................................................................... 65
Table of Contents
Preface.................................................................................................................................................. xxi
Volume I Section 1 Fundamental Concepts and Theories Chapter 1 Mathematical Optimization by Using Particle Swarm Optimization, Genetic Algorithm, and Differential Evolution and Its Similarities............................................................................................... 1 Shailendra Aote, Ramdeobaba College of Engineering and Management, India Mukesh M. Raghuwanshi, Yeshwantrao Chavan College of Engineering, India Chapter 2 Missing Value Imputation Using ANN Optimized by Genetic Algorithm............................................ 35 Anjana Mishra, Department of IT, C.V.Raman College of Engineering, Mahura, India Bighnaraj Naik, Department of Computer Application, Veer Surendra Sai University of Technology, Burla, India Suresh Kumar Srichandan, Department of IT, Veer Surendra Sai University of Technology, Burla, India Chapter 3 A Unified Feature Selection Model for High Dimensional Clinical Data Using Mutated Binary Particle Swarm Optimization and Genetic Algorithm........................................................................... 50 Thendral Puyalnithi, Vellore Institute of Technology(VIT), Vellore, India Madhuviswanatham Vankadara, Vellore Institute of Technology(VIT), Vellore, India
Chapter 4 PID Control Algorithm Based on Genetic Algorithm and its Application in Electric Cylinder Control................................................................................................................................................... 65 Geng Zhang, College of Mechanical Engineering, Chongqing University, Chongqing, China & College of Information and Electrical Engineering, Chongqing Radio and TV University, Chongqing, China Xiansheng Gong, College of Mechanical Engineering, Chongqing University, Chongqing, China & The State Key Laboratory of Mechanical Transmission, Chongqing University, Chongqing, China Xirui Chen, College of Mechanical Engineering, Chongqing Technology and Business University, Chongqing, China Chapter 5 Portfolio Optimization and Asset Allocation With Metaheuristics: A Review..................................... 78 Jhuma Ray, RCC Institute of Information Technology, India Siddhartha Bhattacharyya, RCC Institute of Information Technology, India N. Bhupendro Singh, National Institute of Technology, India Section 2 Development and Design Methodologies Chapter 6 Determination of Spatial Variability of Rock Depth of Chennai........................................................... 98 Pijush Samui, National Institute of Technology Patna, India Viswanathan R., Galgotias University, India Jagan J., VIT University, India Pradeep U. Kurup, University of Massachusetts – Lowell, USA Chapter 7 Optimization of Windspeed Prediction Using an Artificial Neural Network Compared With a Genetic Programming Model............................................................................................................... 116 Ravinesh C. Deo, University of Southern Queensland, Australia Sujan Ghimire, University of Southern Queensland, Australia Nathan J. Downs, University of Southern Queensland, Australia Nawin Raj, University of Southern Queensland, Australia Chapter 8 A Novel Hybrid Genetic Algorithm for Unconstrained and Constrained Function Optimization...... 148 Rajashree Mishra, KIIT University, India Kedar Nath Das, NIT Silchar, India Chapter 9 An Improved Genetic Algorithm for Document Clustering on the Cloud.......................................... 181 Ruksana Akter, Hankuk University of Foreign Studies, Seoul, Korea Yoojin Chung, Hankuk University of Foreign Studies, Seoul, Korea
Chapter 10 Research on an Improved Coordinating Method Based on Genetic Algorithms and Particle Swarm Optimization........................................................................................................................................ 190 Rongrong Li, Guangdong University of Science and Technology, Dongguan, China Linrun Qiu, Guangdong University of Science and Technology, China Dongbo Zhang, Guangdong Institute of Intelligent Manufacturing, China Chapter 11 A Hybrid Approach for Shape Retrieval Using Genetic Algorithms and Approximate Distance....... 205 Saliha Mezzoudj, LaSTIC Laboratory, University Hadj Lakhdar, Batna, Algeria Kamal Eddine Melkemi, LMA Laboratory, University of Mohamed Khider, Biskra, Algeria Chapter 12 Innovative Genetic Algorithmic Approach to Select Potential Patches Enclosing Real and Complex Zeros of Nonlinear Equation................................................................................................ 223 Vijaya Lakshmi V. Nadimpalli, ACRHEM, University of Hyderabad, Hyderabad, India Rajeev Wankar, SCIS, University of Hyderabad, Hyderabad, India Raghavendra Rao Chillarige, SCIS, University of Hyderabad, Hyderabad, India Chapter 13 Neuronal Communication Genetic Algorithm-Based Inductive Learning.......................................... 244 Abdiya Alaoui, EEDIS Laboratory, Department of Computer Science, Djillali Liabes University Sidi Belabbes, Algeria Zakaria Elberrichi, EEDIS Laboratory, Department of Computer Science, Djillali Liabes University Sidi Belabbes, Algeria Chapter 14 Genetic Algorithm With Hill Climbing for Correspondences Discovery in Ontology Mapping........ 260 Guefrouchi Ryma, Abdelhamid Mehri Constantine2 University, Constantine, Algeria Kholladi Mohamed-Khireddine, Echahid Hamma Lakhdar University, El Oued, Algeria Chapter 15 Cross-Project Change Prediction Using Meta-Heuristic Techniques.................................................. 279 Ankita Bansal, Netaji Subhas Institute of Technology, Delhi, India Sourabh Jajoria, Netaji Subhas Institute of Technology, Delhi, India Chapter 16 Environmental Adaption Method: A Heuristic Approach for Optimization........................................ 300 Anuj Chandila, IEC-CET, Greater Noida, India Shailesh Tiwari, CSED, ABES Engineering College, Ghaziabad, India K. K. Mishra, MNNIT Allahabad, India Akash Punhani, ABES Engineering College, Ghaziabad, India
Chapter 17 Decision Choice Optimization With Genetic Algorithm in Communication Networks..................... 328 Driss Ait Omar, Information Processing and Decision Support Laboratory Sultan Moulay Slimane University, Morocco Mohamed El Amrani, Information Processing and Decision Support Laboratory, Sultan Moulay Slimane University, Morocco Hamid Garmani, Information Processing and Decision Support Laboratory Sultan Moulay Slimane University, Morocco Mohamed Baslam, Information Processing and Decision Support Laboratory Sultan Moulay Slimane University, Morocco Mohamed Fakir, Information Processing and Decision Support Laboratory Sultan Moulay Slimane University, Morocco Chapter 18 Optimal Designs by Means of Genetic Algorithms............................................................................. 344 Lata Nautiyal, Graphic Era University, India Preeti Shivach, Graphic Era University, India Mangey Ram, Graphic Era University (Deemed), India Chapter 19 T-Spanner Problem: Genetic Algorithms for the T-Spanner Problem................................................. 355 Riham Moharam, Suez Canal University, Egypt Ehab Morsy, Suez Canal University, Egypt Ismail A. Ismail, 6 October University, Egypt Chapter 20 An Improved Genetic Algorithm for Solving Multi Depot Vehicle Routing Problems....................... 375 Varimna Singh, Som Lalit Institute of Management Studies, Ahmedabad, India L. Ganapathy, National Institute of Industrial Engineering, Mumbai, India Ashok K. Pundir, National Institute of Industrial Engineering, Mumbai, India Chapter 21 A Modified Kruskal’s Algorithm to Improve Genetic Search for Open Vehicle Routing Problem.... 403 Joydeep Dutta, National Institute of Technology Durgapur, West Bengal, India Partha Sarathi Barma, National Institute of Technology Durgapur, West Bengal, India Samarjit Kar, National Institute of Technology Durgapur, West Bengal, India Tanmay De, National Institute of Technology Durgapur, West Bengal, India Section 3 Tools and Technologies Chapter 22 Gene Expression Programming........................................................................................................... 427 Baddrud Zaman Laskar, NIT Silchar, India Swanirbhar Majumder, NERIST, India
Chapter 23 Genetic-Fuzzy Programming Based Linkage Rule Miner (GFPLR-Miner) for Entity Linking in Semantic Web...................................................................................................................................... 447 Amit Singh, Jawaharlal Nehru University, New Delhi, India Aditi Sharan, SC & SS: School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India Chapter 24 Implement Multichannel Fractional Sample Rate Convertor using Genetic Algorithm..................... 482 Vivek Jain, The College of Technology and Engineering, Department of Electronics and Communication, Udaipur, India Navneet Agrawal, The College of Technology and Engineering, Department of Electronics and Communication, Udaipur, India
Volume II Chapter 25 A Genetic-Algorithms-Based Technique for Detecting Distributed Predicates.................................. 495 Eslam Al Maghayreh, King Saud University, Saudi Arabia Chapter 26 A Multi-Objective Approach for Materialized View Selection........................................................... 512 Jay Prakash, School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India T.V. Vijay Kumar, School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India Chapter 27 Tracking Patterns with Particle Swarm Optimization and Genetic Algorithms.................................. 534 Yuri Marchetti Tavares, Brazilian Navy, Rio de Janeiro, Brazil Nadia Nedjah, Universidade do Estado do Rio de Janeiro, Rio de Janeiro, Brazil Luiza de Macedo Mourelle, Universidade do Estado do Rio de Janeiro, Rio de Janeiro, Brazil Chapter 28 Exploration of Fuzzy System With Applications................................................................................ 550 Shivlal Mewada, Mahatma Gandhi Chitrakoot Gramodaya Vishwavidyalaya, India Pradeep Sharma, Holkar Science College, India S. S. Gautam, Mahatma Gandhi Chitrakoot Gramodaya Vishwavidyalaya, India Chapter 29 Best Feature Selection for Horizontally Distributed Private Biomedical Data Based on Genetic Algorithms........................................................................................................................................... 570 Boudheb Tarik, EEDIS Laboratory, Djillali Liabes University, Sidi Bel Abbès, Algeria Elberrichi Zakaria, EEDIS Laboratory, Djillali Liabes University, Sidi Bel Abbès, Algeria
Chapter 30 Intelligent Computing in Medical Imaging: A Study.......................................................................... 592 Shouvik Chakraborty, University of Kalyani, India Sankhadeep Chatterjee, University of Calcutta, India Amira S. Ashour, Tanta University, Egypt Kalyani Mali, University of Kalyani, India Nilanjan Dey, Department of Information Technology, Techno India College of Technology, Kolkata, India Chapter 31 Optimization Techniques Applications in Biochemical Engineering and Controlled Drug Delivery: Current Practices and Forthcoming Challenges.................................................................. 609 Satya Eswari Jujjavarapu, National Institute of Technology Raipur, India Bikesh Kumar Singh, National Institute of Technology Raipur, India Chapter 32 Application of Computational Intelligence in Network Intrusion Detection: A Review..................... 620 Heba F. Eid, Al Azhar University, Egypt Chapter 33 Determining Headache Diseases With Genetic Algorithm................................................................. 642 Gaffari Celik, Agri Ibrahim Cecen University, Turkey Chapter 34 Web Page Recommender System using hybrid of Genetic Algorithm and Trust for Personalized Web Search.......................................................................................................................................... 656 Suruchi Chawla, Shaheed Rajguru College Delhi University, Delhi, India Chapter 35 A Multiobjective Genetic-Algorithm-Based Optimization of Micro-Electrical Discharge Drilling: Enhanced Quality Micro-Hole Fabrication in Inconel 718................................................................. 676 Deepak Rajendra Unune, The LNM Institute of Information Technology, India Amit Aherwar, Madhav Institute of Technology and Science, India Section 4 Utilization and Applications Chapter 36 Application of Natural-Inspired Paradigms on System Identification: Exploring the Multivariable Linear Time Variant Case.................................................................................................................... 700 Mateus Giesbrecht, Unicamp, Brazil Celso Pascoli Bottura, Unicamp, Brazil Chapter 37 Variable Selection Method for Regression Models Using Computational Intelligence Techniques... 742 Dhamodharavadhani S., Periyar University, India Rathipriya R., Periyar University, India
Chapter 38 Delay Optimization Using Genetic Algorithm at the Road Intersection............................................. 762 Bharti Sharma, Shivalik College of Engineering Dehradun, India Sachin Kumar, College of Information Business Systems, National University of Science and Technology, MISiS, Russian Federation Chapter 39 Energy and SLA Efficient Virtual Machine Placement in Cloud Environment Using NonDominated Sorting Genetic Algorithm................................................................................................ 773 Oshin Sharma, PES University, Bangalore, India Hemraj Saini, Department of Computer Science and Engineering, Jaypee University of Information Technology, Waknaghat, India Chapter 40 The Genetic Algorithm: An Application on Portfolio Optimization................................................... 790 Burcu Adıguzel Mercangöz, Istanbul University, Turkey Ergun Eroglu, Istanbul University, Turkey Chapter 41 DNA Fragment Assembly Using Quantum-Inspired Genetic Algorithm............................................ 811 Manisha Rathee, Jawaharlal Nehru University, India Kumar Dilip, Jawaharlal Nehru University, India Ritu Rathee, Indira Gandhi Delhi Technical University for Women, India Chapter 42 Genetic Algorithm-Influenced Top-N Recommender System to Alleviate the New User Cold Start Problem................................................................................................................................................ 829 Sharon Moses J., Vellore Institute of Technology, India Dhinesh Babu L. D., Vellore Institute of Technology, India Santhoshkumar Srinivasan, Vellore Institute of Technology, India Nirmala M., Vellore Institute of Technology, India Chapter 43 GA-Based Optimized Image Watermarking Method With Histogram and Butterworth Filtering...... 851 Sunesh Malik, Guru Gobind Singh Indraprastha University, New Delhi, India Rama Kishore Reddlapalli, Guru Gobind Singh Indraprastha University, New Delhi, India Girdhar Gopal, Sanatan Dharma College, India Chapter 44 Performance Analysis of Nature-Inspired Algorithms-Based Bayesian Prediction Models for Medical Data Sets................................................................................................................................ 874 Amit Kumar, Birla Institute of Technology Mesra, India Bikash Kanti Sarkar, Birla Institute of Technology Mesra, India
Chapter 45 Medical Image Thresholding Using Genetic Algorithm and Fuzzy Membership Functions: A Comparative Study............................................................................................................................... 896 Shashwati Mishra, Utkal University, Vani Vihar, India Mrutyunjaya Panda, Utkal University, Vani Vihar, India Chapter 46 Analysis and Comparison of Clustering Techniques for Chronic Kidney Disease With Genetic Algorithm............................................................................................................................................. 917 Sanat Kumar Sahu, Govt. K. P.G. College Jagdalpur Bastar, Jagdalpur, India A. K. Shrivas, Dr. C. V. Raman University Kota Bialspur (C.G.), Bilaspur, India Chapter 47 An Efficient Batch Scheduling Model for Hospital Sterilization Services Using Genetic Algorithm............................................................................................................................................. 928 Shubin Xu, College of Business and Management, Northeastern Illinois University, Chicago, USA John Wang, School of Business, Montclair State University, Upper Montclair, USA Chapter 48 Implementing Genetic Algorithms to Assist Oil and Gas Pipeline Integrity Assessment and Intelligent Risk Optimization............................................................................................................... 947 Gustavo Calzada-Orihuela, Morelos State Autonomous University, Cuernavaca, Mexico Gustavo Urquiza-Beltrán, Morelos State Autonomous University, Cuernavaca, Mexico Jorge A Ascencio, Polytechnic University of Quintana Roo, Cancun, Mexico Gerardo Reyes-Salgado, Computer Science, National Center for Research and Technological Development (CENIDET), Cuernavaca, Mexico Section 5 Organizational and Social Implications Chapter 49 A Secured Predictive Analytics Using Genetic Algorithm and Evolution Strategies......................... 969 Addepalli V. N. Krishna, Christ University, India Shriansh Pandey, Christ University, India Raghav Sarda, Christ University, India Chapter 50 Optimization and Evolution in Architectural Morphogenesis: Evolutionary Principles Applied to Mass Housing...................................................................................................................................... 997 Alessandro Buffi, Università degli Studi di Perugia, Italy Marco Filippucci, Università degli Studi di Perugia, Italy Fabio Bianconi, Università degli Studi di Perugia, Italy
Volume III Chapter 51 Home Load-Side Management in Smart Grids Using Global Optimization..................................... 1017 Abdelmadjid Recioui, University of Boumerdes, Algeria Chapter 52 A Novel Approach for Business Process Model Matching Using Genetic Algorithms.................... 1053 Mostefai Abdelkader, Dr. Tahar Moulay University of Saida, Algeria Ignacio García Rodríguez de Guzmán, Alarcos Research Group, University of Castilla-La Mancha, Spain Chapter 53 Performance Evaluation of Population Seeding Techniques of Permutation-Coded GA Traveling Salesman Problems Based Assessment: Performance Evaluation of Population Seeding Techniques of Permutation-Coded GA.............................................................................................. 1074 Victer Paul, Department of Computer Science and Engineering, Vignan’s Foundation for Science, Technology & Research, Guntur, India Ganeshkumar C, Indian Institute of Plantation Management (IIPM), Bangalore, India Jayakumar L, Pondicherry University, Pondicherry, India Chapter 54 Real Time Scheduling Optimization.................................................................................................. 1116 Fateh Boutekkouk, Research Laboratory on Computer Science’s Complex Systems (ReLaCS2), University of Oum El Bouaghi, Algeria Chapter 55 New Hybrid Genetic Based Approach for Real-Time Scheduling of Reconfigurable Embedded Systems.............................................................................................................................................. 1140 Ibrahim Gharbi, ENSI, Manouba University, Manouba, Tunisia Hamza Gharsellaoui, National Engineering School of Carthage (ENIC), Carthage University, Tunis, Tunisia Sadok Bouamama, FCIT, University of Jeddah, Jeddah, Saudi Arabia Chapter 56 Solving Flow Shop Scheduling Problems with Blocking by using Genetic Algorithm.................... 1156 Harendra Kumar, Gurukula Kangri Vishwavidyalaya, Haridwar, India Pankaj Kumar, Gurukula Kangri Vishwavidyalaya, Haridwar, India Manisha Sharma, Panjab University, Chandigarh,, India Chapter 57 Genetic Algorithm Approach for Inventory and Supply Chain Management: A Review................. 1175 Poonam Prakash Mishra, Pandit Deendayal Petroleum University, India
Chapter 58 A Decision-Making Tool for the Optimization of Empty Containers’ Return in the Liner Shipping: Optimization by Using the Genetic Algorithm................................................................. 1186 Naima Belayachi, Laboratoire d’informatique d’Oran (LIO), University of Oran 1 Ahmed Ben Bella, Oran, Algeria Fouzia Amrani, Ecole Nationale Polytechnique d’Oran (ENPO), Oran, Algeria Karim Bouamrane, Laboratoire d’informatique d’Oran (LIO), University of Oran 1 Ahmed Ben Bella, Oran, Algeria Chapter 59 Solving Nurse Scheduling Problem via Genetic Algorithm in Home Healthcare............................. 1207 Şahin İnanç, Bursa Uludağ University, Turkey Arzu Eren Şenaras, Uludag University, Turkey Chapter 60 Use SUMO Simulator for the Determination of Light Times in Order to Reduce Pollution: A Case Study in the City Center of Rio Grande, Brazil................................................................................. 1215 Míriam Blank Born, Universidade Federal do Rio Grande (FURG), Brazil Diana Francisca Adamatti, Universidade Federal do Rio Grande (FURG), Brazil Marilton Sanchotene de Aguiar, Universidade Federal de Pelotas (UFPel), Brazil Weslen Schiavon de Souza, Universidade Federal de Pelotas (UFPel), Brazil Chapter 61 A Rule Based Classification for Vegetable Production Using Rough Set and Genetic Algorithm... 1229 R. Rathi, School of Information Technology and Engineering, VIT University, Vellore, India Debi Prasanna Acharjya, School of Computer Science and Engineering, VIT University, Vellore, India Section 6 Critical Issues and Challenges Chapter 62 A Survey on Grey Optimization........................................................................................................ 1260 Adem Guluma Negewo, Addis Ababa Science and Technology University, Ethiopia Chapter 63 Genetic-Algorithm-Based Performance Optimization for Non-Linear MIMO System.................... 1285 Anitha Mary Xavier, Karunya University, India Chapter 64 Privacy Preserving Feature Selection for Vertically Distributed Medical Data Based on Genetic Algorithms and Naïve Bayes............................................................................................................. 1318 Boudheb Tarik, EEDIS Laboratory, Djillali Liabes University, Sidi Bel Abbès, Algeria Elberrichi Zakaria, EEDIS Laboratory, Djillali Liabes University, Sidi Bel Abbès, Algeria
Chapter 65 Applying the Computational Intelligence Paradigm to Nuclear Power Plant Operation: A Review (1990-2015)....................................................................................................................................... 1342 Tatiana Tambouratzis, University of Piraeus, Piraeus, Greece John Giannatsis, Department of industrial Management & Technology, University of Piraeus, Piraeus, Greece Andreas Kyriazis, Department of Industrial Management & Technology, University of Piraeus, Piraeus, Greece Panayiotis Siotropos, Department of Industrial Management & Technology, University of Piraeus, Piraeus, Greece Chapter 66 The Study of Genetic Type Steganographic Models to Increase Noise Immunity of IoT Systems... 1414 Dmitry S. Zaichenko, Moscow Technical University of Communications and Informatics, Russia Irina S. Sineva, Moscow Technical University of Communications and Informatics, Russia Section 7 Emerging Trends Chapter 67 An Improved Genetic Algorithm and A New Discrete Cuckoo Algorithm for Solving the Classical Substitution Cipher............................................................................................................................ 1431 Ashish Jain, Indian Institute of Technology Indore, Indore, India & Manipal University Jaipur, Jaipur, India Narendra S. Chaudhari, Indian Institute of Technology Indore, Indore, India Chapter 68 Performance Evaluation of VM Placement Using Classical Bin Packing and Genetic Algorithm for Cloud Environment...................................................................................................................... 1456 Oshin Sharma, Department of Computer Science and Engineering, Jaypee University of Information Technology, Waknaghat, India Hemraj Saini, Department of Computer Science and Engineering, Jaypee University of Information Technology, Waknaghat, India Chapter 69 A Controlled Stability Genetic Algorithm With the New BLF2G Guillotine Placement Heuristic for the Orthogonal Cutting-Stock Problem........................................................................................ 1471 Slimane Abou-Msabah, University of Science and Technology Houari Boumedienne, Bab Ezzouar, Algeria Ahmed-Riadh Baba-Ali, University of Science and Technology Houari Boumedienne, Bab Ezzouar, Algeria Basma Sager, University of Sciences and Technologies Houari Boumediene, Bab Ezzouar, Algeria
Chapter 70 Genetic Algorithm Influenced Top-N Recommender System to Alleviate New User Cold Start Problem.............................................................................................................................................. 1492 Sharon Moses J., VIT University, Vellore, India Dhinesh Babu L.D., VIT University, Vellore, India Chapter 71 A Hybrid Tabu Genetic Metaheuristic for Selection of Security Controls........................................ 1513 Sarala Ramkumar, Pondicherry Engineering College, Pondicherry, India Zayaraz Godandapani, Pondicherry Engineering College, Pondicherry, India Vijayalakshmi Vivekanandan, Pondicherry Engineering College, Pondicherry, India Index.................................................................................................................................................... xxv
Preface
Genetic programming has become a hot topic in computer science, proving its usefulness in selecting the most (or least) suitable program from a pool of candidate programs for a particular task. These tasks arise across many different types of industries, which gives genetic programming and genetic algorithms broad usability and diversity. Within computer science and operations research, high-quality solutions to optimization and search problems can be discovered in ways that the human mind could not easily work out on its own. By being able to pick optimal programs and the best solutions for problems in many fields of research, interest in genetic algorithms and genetic programming is rising along with their ability to solve a wide variety of problems, matching and possibly surpassing the abilities of neural networks and machine learning techniques. Examining current research, case studies, and applications can lead to a greater understanding of the technological and social implications, advancements, and issues related to using genetic programming and algorithms across multiple types of industries. Staying informed of the most up-to-date research trends and findings is of the utmost importance. That is why IGI Global is pleased to offer this three-volume reference collection of reprinted IGI Global book chapters and journal articles that have been handpicked by senior editorial staff. This collection will shed light on critical issues related to the trends, techniques, and uses of various applications by providing both broad and detailed perspectives on cutting-edge theories and developments. This collection is designed to act as a single reference source on conceptual, methodological, technical, and managerial issues, as well as to provide insight into emerging trends and future opportunities within the field.

The Research Anthology on Multi-Industry Uses of Genetic Programming and Algorithms is organized into seven distinct sections that provide comprehensive coverage of important topics. The sections are:

1. Fundamental Concepts and Theories;
2. Development and Design Methodologies;
3. Tools and Technologies;
4. Utilization and Applications;
5. Organizational and Social Implications;
6. Critical Issues and Challenges; and
7. Emerging Trends.

The following paragraphs provide a summary of what to expect from this invaluable reference tool.
Section 1, “Fundamental Concepts and Theories,” serves as a foundation for this extensive reference tool by addressing crucial theories essential to understanding the components and technologies within genetic programming and algorithms and how they operate. This comprehensive reference work opens with the chapter “Mathematical Optimization by Using Particle Swarm Optimization, Genetic Algorithm, and Differential Evolution and Its Similarities” by Prof. Shailendra Aote of Ramdeobaba College of Engineering and Management, India and Prof. Mukesh M. Raghuwanshi of Yeshwantrao Chavan College of Engineering, India, which discusses optimization and solutions to optimization problems using particle swarm optimization (PSO), genetic algorithms (GA), and differential evolution (DE), and gives a comparison of the three. This opening section closes with “Portfolio Optimization and Asset Allocation With Metaheuristics: A Review” by Profs. Siddhartha Bhattacharyya and Jhuma Ray of RCC Institute of Information Technology, India and Prof. N. Bhupendro Singh of National Institute of Technology, India, which assesses how portfolio optimization can be improved with the use of metaheuristics, which offer an effective means of finding near-optimal solutions to hard optimization problems within an acceptable computational time frame.

Section 2, “Development and Design Methodologies,” presents in-depth coverage of the development and design of genetic programming and algorithm technologies and how these different designs, models, and systems are used to solve different problems. This section begins with “Determination of Spatial Variability of Rock Depth of Chennai” written by Prof. Pijush Samui of the National Institute of Technology Patna, India; Prof. Viswanathan R. of Galgotias University, India; Prof. Jagan J. of VIT University, India; and Prof. Pradeep U. Kurup of the University of Massachusetts – Lowell, USA, which presents a study that adopts and compares four modeling techniques for prediction of rock depth at Chennai, India: ordinary kriging (OK), generalized regression neural network (GRNN), genetic programming (GP), and minimax probability machine regression (MPMR). The last chapter in this section, “A Modified Kruskal’s Algorithm to Improve Genetic Search for Open Vehicle Routing Problem,” by Profs. Joydeep Dutta, Partha Sarathi Barma, Samarjit Kar, and Tanmay De of National Institute of Technology Durgapur, West Bengal, India, presents a modified Kruskal’s method that increases the efficiency of a genetic algorithm in determining the path of least distance from a central point, in order to solve the open vehicle routing problem (which calls for reducing the number of vehicles used and the distance traveled simultaneously).

Section 3, “Tools and Technologies,” explores the various tools and technologies used in genetic algorithms and genetic programming for selection and detection in a variety of situations drawn from different industries. The chapter “Gene Expression Programming,” authored by Prof. Baddrud Zaman Laskar of NIT Silchar, India and Prof. Swanirbhar Majumder of NERIST, India, opens this section and explores gene expression programming (GEP), a descendant of genetic algorithms and genetic programming, discussing the different fields of GEP, GEP architectures, and an example of GEP being used to detect age from facial features as a soft-computing-based optimization problem using genetic operators.
Concluding this section is “A Multiobjective Genetic-Algorithm-Based Optimization of Micro-Electrical Discharge Drilling: Enhanced Quality Micro-Hole Fabrication in Inconel 718” by Prof. Amit Aherwar of Madhav Institute of Technology and Science, India and Prof. Deepak Rajendra Unune of LNM Institute of Information Technology, India, which attempts to optimize parameters in micro-electrical discharge drilling (µ-EDD) of Inconel 718 using genetic-algorithm-based multi-objective optimization and verifies the results experimentally.
Section 4, “Utilization and Applications,” describes how these tools can be used for prediction and detection in diverse industries and how they operate in different fields of research. Starting this section
is “Application of Natural-Inspired Paradigms on System Identification: Exploring the Multivariable Linear Time Variant Case” by Profs. Mateus Giesbrecht and Celso Pascoli Bottura of Unicamp, Brazil, which discusses the application of nature-inspired paradigms on system identification and reviews the recent applications of techniques such as genetic algorithms, genetic programming, immuno-inspired algorithms, and particle swarm optimization. Finishing the section is “Implementing Genetic Algorithms to Assist Oil and Gas Pipeline Integrity Assessment and Intelligent Risk Optimization” by Profs. Gustavo Calzada-Orihuela and Gustavo Urquiza-Beltrán of Morelos State Autonomous University, Cuernavaca, Mexico; Prof. Jorge A. Ascencio of Polytechnic University of Quintana Roo, Cancun, Mexico; and Prof. Gerardo Reyes-Salgado of National Center for Research and Technological Development (CENIDET), Cuernavaca, Mexico, which proposes an intelligent support system that provides optimized projections using a set of genetic algorithms (GAs) for effective risk management in the oil and gas industry.
Section 5, “Organizational and Social Implications,” includes chapters discussing the organizational and social impact of findings derived from using GA and GP for solving and aiding with social issues and societal concerns. The chapter “A Secured Predictive Analytics Using Genetic Algorithm and Evolution Strategies” by Profs. Addepalli V. N. Krishna, Shriansh Pandey, and Raghav Sarda attempts to identify the churning rate of customers likely to leave banks by using genetic algorithm, a method that banks could potentially use in the future to take measures to reduce these rates. The final chapter in this section is “A Rule-Based Classification for Vegetable Production Using Rough Set and Genetic Algorithm” by Prof. Carlos Adrian Catania of the National University of Cuyo, Argentina and Profs. R. Rathi and Debi Prasanna Acharjya of VIT University, Vellore, India. This chapter describes how agriculture is the main occupation in India and how the economy depends on agricultural production; accurate forecasting of future events from patterns extracted by genetic algorithms therefore plays a vital role in improving agricultural productivity.
Section 6, “Critical Issues and Challenges,” presents coverage of academic and research perspectives on challenges to using genetic algorithms and genetic programming as a detection method and problem-solving technique. Beginning the compilation of research in this section is the chapter “A Survey on Grey Optimization” by Prof. Adem Guluma Negewo of Addis Ababa Science and Technology University, Ethiopia, which provides a literature review of optimization problems in the context of grey system theory and details the computation procedures involved in solving these optimization problems with uncertainty. Finally, “The Study of Genetic Type Steganographic Models to Increase Noise Immunity of IoT Systems” by Profs. Dmitry S. Zaichenko and Irina S. Sineva of Moscow Technical University of Communications and Informatics, Russia presents a genetic coding that hides messages between internet of things devices and is capable of detecting both internal and external attacks in the intelligent infrastructure. It also shows the sufficiently high efficiency of preliminary genetic algorithm coding for objects.
Section 7, “Emerging Trends,” explores areas for future research within this field.
This final section in this reference book opens with “A Survey on Grey Optimization” by Prof. Adem Guluma Negewo of Addis Ababa Science and Technology University, Ethiopia, which explains the binary interactive algorithm approach as a problem-solving method for linear programming and quadratic programming problems with uncertainty along with a genetic-algorithm-based approach as a second problem-solving scheme for linear programming, quadratic programming, and general nonlinear programming problems with uncertainty. The concluding chapter of this book is “A Hybrid Tabu Genetic Metaheuristic for Selection of Security Controls” written by Profs. Sarala Ramkumar, Zayaraz Godandapani, and Vijayalakshmi Vivekanandan.
Although the primary organization of the contents in this multi-volume work is based on its seven sections, offering a progression of coverage of the important concepts, methodologies, technologies, applications, social issues, and emerging trends, the reader can also identify specific contents by utilizing the extensive indexing system listed at the end of each volume. As a comprehensive collection of research on the latest findings related to these systems, the Research Anthology on Multi-Industry Uses of Genetic Programming and Algorithms provides programmers, engineers, computer scientists, IT specialists, practitioners, researchers, academicians, and students with a complete understanding of the applications, concepts, and challenges to using genetic programming and algorithms. Given the need for programming that is fast, effective, and multi-purposed, the Research Anthology on Multi-Industry Uses of Genetic Programming and Algorithms encompasses the most pertinent research on how these technologies are being employed to aid in selection, detection, and optimization processes across diverse industries to provide the highest quality solutions and software.
Index
A ACO 63, 304, 405, 593, 615, 618, 672, 787, 813, 822, 824, 874, 875, 876, 882-891, 1142, 1144, 1154, 1184, 1345, 1459, 1461, 1468 Advanced Metering Infrastructure (AMI) 1022, 1052 Advanced Optimization methods 609 Agents 85, 92, 205, 206, 207, 211-215, 220, 277, 504, 505, 506, 601, 612, 615, 623, 634, 652, 653, 747, 1215-1219, 1345 Agriculture 618, 772, 1229, 1230, 1231, 1245, 1246, 1254, 1256 AIC 754-759 air quality 1215, 1216 Alignment 262, 264, 265, 276, 277, 278, 464, 479, 815, 822, 825, 1053-1072 ALSTOM gasifier 1285, 1288, 1290, 1295, 1299, 1315, 1316, 1317 approximate distance 205-211, 220 approximate distance shape context 205, 206, 210, 220 Artificial Intelligence 2, 48, 66, 93, 96, 110, 113, 143, 179, 242, 256, 257, 258, 298, 326, 342, 351, 374, 398, 401, 422, 428, 445, 479, 480, 530, 533, 592, 603-607, 618, 638, 642, 643, 644, 648-655, 694, 695, 739, 807, 817, 846, 849, 850, 892, 893, 894, 916, 926, 949, 952, 967, 1002, 1039, 1073, 1113, 1175, 1179, 1181, 1184, 1205, 1217, 1218, 1227, 1257, 1258, 1340, 1345, 1379, 1384, 1386, 1509, 1511, 1512, 1520 Artificial Neural Network 36, 108, 111, 116, 118, 126, 134, 138, 142-146, 325, 510, 592, 595, 602-612, 653, 678, 696, 870, 898, 995, 1230, 1257, 1373, 1374, 1380, 1384-1390, 1428 asset allocation 78, 79, 83, 90, 96 automated cryptanalysis 1431, 1436, 1449, 1452, 1453
B Bacterial Foraging Optimization 148, 154, 155, 176-
180 batch processing machines 929, 943-946 batch scheduling 928-933, 940, 943, 946 benchmark functions 148, 161, 163, 178, 180, 223, 300, 303, 310, 311, 316, 322 Bin Packing 775, 776, 788, 931, 1456-1461, 1465, 1466, 1469, 1490, 1491 binary genetic algorithm 50 Binary Particle Swarm Optimization 50, 54, 62, 760, 761, 1453 Binary Relation 560, 1229 Bio chemicals 609 biomedical applications 614 biomedical image analysis 592-601 blocking 1156-1174 building energy simulation 997, 1006, 1015 Business Process 569, 848, 1053, 1054, 1055, 1059, 1066, 1067, 1072, 1073, 1181, 1185, 1510 Butterworth filtering 851-861, 870, 871
C change-prone 279-282, 286, 287, 289, 295, 297, 299 Chemo-taxis 148, 149, 154, 180 Chromosome 14, 15, 30, 57, 68, 69, 70, 153, 184, 185, 186, 193, 197, 224-231, 265, 266, 268, 348, 349, 351, 358-365, 376, 384-388, 402, 407-415, 427, 428, 433, 439, 500-506, 538, 539, 565, 578-584, 644, 661-666, 748, 765, 766, 778, 780, 793-796, 804, 809, 810, 816-821, 834, 878, 879, 901, 935, 936, 952-960, 964, 974, 979, 980, 1039-1043, 1055, 1056, 1057, 1062-1065, 1119-1129, 11471150, 1162, 1163, 1174, 1194, 1210, 1211, 1216, 1217, 1222, 1231-1241, 1245, 1299, 1325, 13301334, 1418, 1463, 1475, 1496, 1518, 1519, 1524 classical ciphers 1431-1434, 1438, 1441, 1451 Classification 37, 38, 51, 62, 63, 64, 80, 82, 83, 99, 102, 108, 109, 113, 146, 188, 221, 222, 244-258, 280-285, 297, 298, 304, 325, 352, 433, 434, 435,
Volume I: 1-494; Volume II: 495-1016; Volume III: 1017-1534
439-446, 472, 549, 569-575, 581, 590-607, 614, 618, 619, 623, 626, 630-639, 643, 648, 654, 745, 748, 752, 759, 761, 847, 874-897, 916, 918, 926, 927, 945, 955, 956, 957, 998, 1020, 1059, 1066, 1083, 1087, 1138, 1196, 1229-1235, 1240, 1246, 1255, 1256, 1258, 1320, 1332, 1339, 1340, 1341, 1364, 1367, 1377, 1383, 1388, 1392, 1520 Clinical Decision-making System 50 Cloud Environment 773, 775, 776, 787, 789, 1141, 1456-1462, 1466, 1468 Clustering 9, 16, 36, 37, 48, 51, 181-189, 242, 263, 278, 326, 375, 378, 386, 398-402, 550, 551, 600, 605, 606, 632, 640, 660-666, 674, 675, 744, 745, 759, 761, 792, 808, 816, 826, 832, 833, 848, 849, 850, 896, 898, 914-927, 1053, 1214, 1392, 1494, 1495, 1510, 1511, 1512 Cold Start 657, 829-834, 840, 845-850, 1492-1496, 1503, 1508, 1509, 1512 Combinatorial Optimization 96, 355, 378, 383, 422, 739, 811, 812, 816, 817, 818, 825, 826, 882, 944, 973, 1077, 1113, 1137, 1184, 1453, 1472, 1517, 1532, 1533 Comparison 1, 4, 37, 43, 44, 49, 75, 95, 96, 98, 107113, 117, 139-146, 151, 161, 163, 164, 185, 186, 188, 195-203, 240, 254, 255, 261, 262, 263, 275, 303, 306, 310, 314-325, 351, 354, 373, 403, 405, 418, 419, 420, 433, 452, 453, 465, 466, 470, 479, 493, 512, 515, 522, 534, 535, 537, 544, 567, 615, 629, 636, 639, 659, 671, 675, 716, 718, 733, 754, 757, 769, 770, 771, 778, 785, 813, 814, 822, 823, 824, 840-845, 849, 854, 865-870, 876, 881, 888, 890, 895, 906, 907, 914, 917, 918, 949, 962, 963, 964, 986, 995, 998, 1067-1071, 1087, 1116, 1130, 1138, 1152, 1165, 1168, 1170, 1202, 1208, 1235, 1238, 1257, 1313, 1314, 1374, 1423, 1433-1438, 1446-1452, 1461, 1465, 1472, 1486, 1490, 15031512, 1517, 1524 complex root 224 Computational Intelligence 33, 111, 146, 177, 258, 276, 277, 327, 374, 444, 508, 530, 538, 593, 595, 620, 623, 636-641, 651, 653, 703, 739-747, 759, 809, 946, 1154, 1317, 1342, 1343, 1387, 1389, 1452 Computer aided diagnosis 592, 593 conflicting data set 895 conventional genetic algorithms 1116, 1117 cooperative algorithm 190, 191, 196, 202 Correlation 38, 99, 103, 108, 125, 130-137, 141, 534537, 541-549, 631, 757, 785, 832, 833, 841, 847854, 860, 862, 864, 887, 921, 985, 1230, 1389, 1392, 1423, 1494, 1495, 1504, 1512 Correspondences 260-276, 1053-1060, 1065, 1073 xxvi
cost effectiveness 1175 Crossover 3, 16-24, 30, 31, 41, 42, 56, 68, 69, 70, 83, 84, 101, 123, 150-154, 158-163, 174, 177, 181195, 199, 206, 212, 213, 224-229, 234, 246-251, 265, 266, 273, 278, 302, 347-354, 358, 362, 363, 365, 374, 376, 378, 384-390, 409, 410, 411, 415, 422-431, 447-461, 500, 502, 507, 510, 516, 517, 518, 523, 538-544, 564, 565, 566, 574, 578, 583, 626, 644, 647, 654, 661-667, 690, 692, 749, 750, 751, 766, 779-783, 794-797, 804, 805, 816, 817, 830, 834, 836, 852, 853, 857-862, 876-880, 887, 899, 901, 918, 920, 921, 935, 937, 938, 946, 953, 974, 975, 979, 980, 988-993, 1002, 1040-1043, 1052, 1056, 1057, 1062, 1063, 1067, 1075, 1088, 1111, 1125, 1126, 1137, 1145-1156, 1163, 1164, 1174, 1179, 1180, 1181, 1194, 1196, 1197, 1201, 1203, 1209, 1211, 1217, 1229-1244, 1248, 1252, 1254, 1256, 1299, 1301, 1322, 1333, 1347, 1392, 1418, 1421, 1433, 1437, 1441, 1458-1463, 1476, 1477, 1493-1499, 1518, 1519, 1525 Cross-Validation 102, 109, 113, 129, 248, 249, 254, 279, 291, 299, 466, 472, 476, 601, 888, 921, 1371 Cuckoo Search 3, 144, 304, 306, 325, 326, 533, 600, 813, 898, 915, 916, 1431, 1432, 1433, 1437-1442, 1452-1455 Cutting and Packing 1471
D Data Warehouse 512, 513, 531, 532, 533, 1416 Debugging 71, 496 Decision Making 2, 81, 90, 95, 342, 512, 513, 522, 532, 551, 607, 617, 918, 1238, 1246, 1256, 1319, 1385, 1456-1461, 1465, 1466, 1515, 1516 Decision Support System 51, 150, 176, 376, 399, 530, 569, 590, 606, 848, 884, 893, 894, 916, 927, 954, 1186, 1187, 1191, 1192, 1202-1208, 1320, 1340, 1510, 1516 Decision Tree 36, 38, 221, 256, 279-287, 292-298, 435, 446, 449, 634, 640, 649, 892, 970, 1230, 1392 Demand Response 1030, 1047-1052 Demographical Data 829, 1492 DENATRAN 1216, 1228 dependability 495 Design Problem 93, 344, 345, 353, 1002, 1006, 1184 Differential 1, 3, 27, 28, 30, 67, 68, 87, 91, 94, 172, 176, 178, 300, 302, 303, 308, 326, 349, 436, 446, 533, 591, 602, 609, 610, 614, 616, 619, 632, 635, 898, 1316, 1341, 1388, 1423 Differential Evolution 1, 3, 27, 87, 91, 94, 176, 178, 300, 303, 308, 326, 349, 436, 446, 533, 609, 610,
614, 616, 619, 635, 898, 1316, 1388 digital watermarking 851-856, 872, 873 Dispersion of polllutants 1215 distributed applications 495, 496, 498, 504, 507 Distributed Naïve Bayes 570-574, 589, 1318, 1321, 1322, 1326, 1327, 1339 distributed predicates detection 495, 496, 497, 506 Distribution Characteristics 279 Diversity in Population 181 DNA sequence 811, 813, 814, 821-827 Document Clustering 181, 182, 183, 187, 188, 189, 660, 675 Drilling 142, 676-680, 687, 692-697, 1075
E Economic Load Dispatch 164, 168, 176, 178, 180 electric cylinder 65, 67, 71-77 Electric-discharge 676 embedded method 127 Embedded Multiprocessor Architecture 1116 empty container 1186, 1187, 1193, 1195, 1199-1205, 1423 Energy Efficiency 762, 773, 788, 1025, 1030 Energy Management 1017, 1019, 1023-1026, 10471052, 1469 Energy Minimization 1456 ensemble learners 279, 280, 284, 297 entity linking 447, 448, 450, 479 Environmental Adaption Method (EAM) 300, 303 Environmental optimization 997, 1002, 1006 Evolutionary Algorithms 3, 4, 15, 31, 32, 49-54, 76, 83, 87-96, 149, 150, 180, 182, 192, 204, 211, 242, 245, 246, 256, 278, 280, 281, 287, 295, 297, 298, 304, 347, 349, 353, 354, 433, 434, 451, 530, 531, 566, 567, 599, 614-619, 652, 654, 700, 701, 746, 790, 791, 792, 809, 810, 828, 899, 918, 1002, 1112, 1143, 1144, 1155, 1179, 1213, 1216, 1227, 1375, 1418, 1452, 1457-1461, 1518, 1520, 1532, 1534 Evolutionary Approach 96, 188, 402, 533, 701, 702, 818, 1471 Evolutionary Computation 33, 34, 63, 64, 66, 76, 87, 91-96, 177, 178, 179, 189-195, 203, 242, 256259, 277, 298, 326, 327, 354, 402, 424, 448, 479, 508, 510, 601, 604, 606, 619, 644, 649, 652, 739, 740, 741, 746, 759, 760, 789, 806, 807, 808, 817, 825-828, 848, 849, 875, 892, 1113, 1137, 1138, 1154, 1155, 1209, 1213, 1217, 1342, 1376, 1418, 1452, 1512, 1532, 1533, 1534
F facial recognition 1418 Feature Extraction 206, 207, 593, 597, 606, 607, 630, 809, 876, 918, 1380 Feature Selection 2, 50-55, 59-64, 111-119, 125-133, 141, 142, 146, 188, 245, 258, 281, 292, 444, 570573, 585, 589, 590, 593, 599, 603, 605, 607, 626, 630-641, 743, 744, 745, 752, 759, 761, 874-894, 917-926, 970, 973, 988, 995, 1233, 1257, 13181324, 1339, 1340, 1341 Feature Selection Techniques 571, 572, 590, 640, 744, 875, 882, 888, 917, 923, 925, 926, 1341 Feature Subset Selection 51-64, 572, 638, 760, 809, 884, 1533 filter method 742 Fitness 3-16, 23, 24, 27, 39-42, 47, 56, 58, 66-71, 84, 88, 89, 101, 123, 125, 150-156, 160, 173, 174, 186, 192-199, 212, 213, 224-229, 245-252, 259, 264-272, 276, 280, 281, 300-310, 337, 339, 347, 349, 351, 358, 360, 361, 365, 384, 400, 409, 410, 411, 415, 419, 421, 428-441, 451-461, 465-477, 500-506, 516, 517, 535-541, 564, 566, 567, 574-580, 601, 626, 633, 661-668, 690, 691, 702, 716-725, 735-738, 745-753, 765, 766, 775, 781, 783, 788, 793, 795, 796, 802, 803, 804, 816-825, 834, 836, 853, 857, 858, 860, 870, 877-880, 896, 899, 901, 907, 921, 935, 952-962, 974-986, 1000-1004, 1008-1013, 1039-1043, 1056, 1057, 1062, 1065, 1071, 1079, 1084-1088, 1093, 1108, 1124-1129, 1137, 1149-1152, 1162, 1163, 1164, 1174, 1179, 1194, 1195, 1196, 1201, 1209, 1211, 1220, 1226, 1228, 1233-1243, 1248, 1269, 1280, 1298-1301, 1321-1333, 1340, 1347, 1418, 1432, 1433, 1437-1440, 1444, 1451, 1458, 1462, 1463, 1468, 1473-1481, 1485, 1486, 1496, 1498, 1518, 1519, 1520, 1524, 1525, 1531 Flow Shop Scheduling 659, 1118, 1138, 1156-1161, 1165, 1170-1174 Forecasting 99, 109, 114-120, 126, 127, 130, 137-147, 202, 440, 446, 593, 703, 875, 1229, 1230, 1256, 1257, 1379, 1417 Form-finding 997, 999, 1000 Fractional Multichannel Sample Rate Convertor 482, 484 fragment assembly 811-816, 822-827 Fuzzy Automata 550, 559, 561, 563 Fuzzy C-Means 48, 600, 745, 898, 917-926 Fuzzy data mining 550, 551, 569 fuzzy entropy 856, 872, 897-901, 916 Fuzzy Logic 142, 378, 434, 445, 447, 448, 456, 510, xxvii
550, 568, 569, 595, 600, 610, 613, 619-626, 635, 639, 641, 695, 763, 771, 896-901, 907, 914, 915, 916, 927, 949, 1181, 1342, 1346, 1371, 1372, 1374, 1378, 1382-1389, 1520 fuzzy neural networks 550, 556, 557, 558 Fuzzy System 456, 457, 550, 552, 635, 916, 946, 1258, 1377
G GABIL 245-254, 258 Gas 14, 27, 37, 49, 66, 71, 142, 144, 145, 186, 278, 334, 350-353, 376, 378, 433, 434, 486, 500, 501, 567, 626, 690, 702, 748, 876, 878, 935, 947-960, 965, 974, 975, 980-986, 1018, 1021, 1027, 1039, 1055, 1057, 1074, 1075, 1078, 1082, 1117, 1118, 1129, 1137, 1148, 1149, 1208, 1209, 1216, 1217, 1220, 1227, 1286-1289, 1295, 1306, 1316, 1317, 1324, 1325, 1326, 1333, 1342, 1344, 1347, 1353, 1360, 1364-1371, 1376, 1389, 1437 Gaussian membership function 896, 903-907 Gender 829-840, 845, 846, 1492-1503, 1508, 1509 Gene 14, 16, 38, 69, 70, 110, 112, 123, 124, 125, 153, 192, 325, 347, 348, 349, 353, 358, 363, 386, 387, 390, 427-439, 444, 445, 446, 502, 505, 564, 565, 567, 581, 583, 601, 602, 615, 638, 640, 661, 749, 765, 781, 794, 795, 810, 879, 894, 901, 935, 936, 937, 952, 953, 957-964, 974, 1003, 1040, 1042, 1055, 1062-1065, 1075, 1077, 1079, 1088, 1091, 1111, 1113, 1125, 1147-1151, 1163, 1194, 1330, 1333, 1500, 1519 Gene expression programming (GEP) 427, 428, 433, 434, 445 Generalized Regression Neural Network 98, 99, 107114, 613 generative design 997, 1000 Genetic Algorithm (GA) 1, 23, 83, 84, 148, 150, 182, 191, 192, 202, 205, 346, 358, 406, 427, 500, 593, 599, 614, 626, 634, 642, 643, 678, 742-748, 761, 775, 790-793, 813, 830, 877, 1019, 1039, 1158, 1209, 1415, 1431, 1433, 1493 Genetic Operators 86, 87, 123, 206, 211, 212, 220, 347, 384, 427-433, 439, 453, 501, 748, 780, 781, 802, 834, 846, 859, 875, 879, 887, 920, 921, 935, 974, 1075, 1113, 1126, 1130, 1145, 1148, 1420, 1475, 1477, 1496, 1508 Genetic Optimization Technique 482 Genetic Programming (GP) 99, 114, 116, 117, 123, 136, 193, 427, 428, 434, 448, 451, 613, 634, 874, 879 Genre 829-840, 845-850, 1492-1503, 1508-1512 Global Optimization 2, 32, 65, 67, 178, 179, 192, 257, xxviii
326, 340, 605, 896, 1017, 1074, 1075, 1112, 1233, 1380 Graph Theory 355, 550 grey linear programming 1260-1269, 1276, 1283 Grey Non-Linear Optimization 1260, 1279 grey quadratic programming 1260, 1261, 1271, 1272, 1276, 1283 grey systems 1260, 1261, 1263, 1271, 1284 guillotine constraint 1471, 1472, 1473, 1486, 1490
H Hashing Technic 1318 HBEFA 1215, 1219, 1228 headache disease 642-647, 651 Heterogeneous Fleet 375, 378, 380, 396, 397, 398, 422, 984 Heuristics 32, 81, 91, 93, 221, 245, 375, 376, 397, 400, 424, 449, 478, 514, 530, 747, 787-791, 798, 807, 810, 816, 828, 928, 930, 931, 943, 944, 975, 1002, 1055, 1075, 1076, 1112, 1141, 1148, 1156-1159, 1170, 1171, 1172, 1186, 1188, 1191, 1224, 1227, 1437, 1451, 1452, 1457, 1458, 1461, 1468, 1471, 1472, 1476, 1477, 1485, 1490, 1517, 1533 Hierarchical clustering 917-924 Hill Climbing 3, 260, 261, 265-272, 276, 278, 567, 1517 Histogram 137, 138, 206-211, 601, 851-862, 866-873, 898-903, 914, 1421, 1422, 1424 Home Healthcare 1207, 1214 hospital sterilization services 928, 931, 932, 943, 945 HybMAS-GA 205, 206, 207, 211-221 Hybrid Algorithm 50, 158, 190, 245, 254, 261, 265-275, 405, 406, 407, 412, 1180, 1520, 1524, 1526, 1531 Hybrid GA 375, 376, 378, 388, 390, 425, 600, 1075, 1152, 1180, 1392 Hybrid Genetic Based Approach 1140, 1154 Hybridization 148, 149, 157, 158, 176, 180, 192, 253, 260, 261, 266, 268, 269, 273, 276, 388, 405, 407, 418, 657, 806, 872, 1233, 1256, 1514, 1520
I identity based encryption 969, 971, 986, 988, 989, 996 Image Analysis 592-601, 607, 638, 896, 1258 Image Processing 51, 193, 222, 493, 547, 548, 549, 572, 596, 597, 601, 602, 607, 678, 695, 863, 871, 872, 873, 896, 897, 927, 1320, 1344, 1418, 1423, 1426 Image Segmentation 207, 221, 304, 325, 549, 593, 596, 597, 602-607, 653, 879, 892-899, 907, 908, 914, 915, 916 image thresholding 605, 606, 896, 897, 899, 915, 916
imbalanced data set 895 immuno-inspired algorithms 700, 704, 715-720 incomplete data set 895 Inconel 718 676-679, 689, 693, 694, 696 indiscernibility 1229, 1232 Inductive Learning 244-254, 258 Information Retrieval 180, 181, 221, 264, 274, 374, 446, 447, 604, 656-662, 672, 772, 847, 873, 1054, 1058, 1059, 1510 Information Scent 656-663, 667, 672, 674 Information Security Risk Assessment 1513 integrated inventory 1175, 1176 integrity assessment 947-960 intelligent diagnosis 642, 646 Intelligent Medical Diagnosis 651, 655, 1258 Internet of Things 950, 965, 1414-1417, 1469 Irredundant Coding 1414
K K-Means 9, 16, 182-189, 600, 660, 661, 663, 832, 848, 917-927, 1510
L Least Significant Bit Method 1414 linked data 447, 448, 478, 480 Linked Open Data 447, 448 Load Side Management 1017, 1050, 1052 Local Minima 2, 4, 9, 16, 162, 163, 181-187, 303, 309, 317-322, 348, 410, 634, 793 Local Optimum 2, 53, 153, 157, 256, 261, 267, 268, 716, 750, 766, 1043, 1177, 1237, 1421, 1437, 1471, 1484 lower and upper approximation 1232 Lower Order Model 1285
M Machine Learning 32, 37, 38, 39, 48, 58, 63, 66, 91, 109, 116, 117, 142, 177, 229, 242, 244, 245, 252, 256, 257, 258, 281, 282, 298, 299, 326, 342, 343, 352, 353, 374, 445, 446, 449, 478, 530, 535, 570, 589, 593, 595, 607, 619, 624, 629, 630, 631, 638643, 649-654, 744, 789, 792, 808, 827, 832, 833, 840, 847, 850, 884, 892, 893, 918, 919, 923, 926, 927, 944, 970, 971, 984, 986, 996, 1054, 1059, 1111, 1112, 1114, 1183, 1227, 1257, 1315, 1318, 1340, 1373, 1377, 1415, 1491, 1510, 1532 Magnetic Resonance Imaging 592, 608, 897, 899, 902 Maritime Transport Network (MTN) 1186
Markovitz Mean Variance Theory 810 mass-customized housing 999 Matcher 218, 260, 262, 264, 1054, 1058, 1059, 1066, 1068, 1069 Matching 74, 150, 178, 205, 206, 207, 211, 213, 214, 218-222, 261-265, 276, 277, 278, 449, 450, 464, 478, 479, 480, 534-537, 545-549, 664, 813, 825, 984, 1053-1061, 1066-1073, 1378, 1382, 1514, 1515, 1532 Materialized View Selection 512, 515, 522, 530-533 Mathematical Programming 80, 82, 91, 809, 930, 931, 934, 938, 1263, 1283 MATLAB code 1, 32 Mechanical Engineering 65, 96, 148, 241, 344, 353, 354, 694, 809, 1429 Medical 49, 51, 62, 71, 244-248, 252-258, 325, 327, 493, 570, 571, 572, 589-597, 601-608, 614, 618, 619, 642-655, 874-884, 888-901, 907, 908, 914918, 923, 925, 928, 929, 931, 946, 986, 1141, 1207, 1208, 1258, 1318-1321, 1339, 1340 Medical Datasets 244, 245, 649, 1339 Medical Diagnosis 62, 244, 257, 642, 643, 644, 648655, 884, 893, 908, 914, 917, 1258, 1320 Medical Imaging 325, 592, 593, 601-608, 1141 Medical Modalities 592, 608 Metaheuristics 78-86, 90, 92, 245, 258, 278, 388, 423, 596, 597, 791, 946, 1002, 1147, 1213, 1431-1434, 1445, 1451, 1453, 1471, 1520, 1531-1534 micro-holes 676-680, 687, 692, 693, 695 Minimax Probability Machine Regression 98, 99, 107, 109, 110, 114 Minimum Maximum Stretch Spanning Tree 355, 356, 368 Missing Values 35-44, 48, 49, 875, 884, 1324 m-machine flow 1156 Mono-Objective Problems 78 multi depot VRP 375, 405 multiagent system 205, 221 Multiagents 1215 Multi-Layer 35-40, 48, 510, 634, 1346 multinomial logistic regression 742, 754 Multi-Objective Optimization 33, 69, 78, 79, 80, 8692, 176, 242, 345, 351, 512, 516, 530, 616, 617, 618, 676, 678, 690, 693, 695, 773-781, 789, 791, 798, 896, 1000, 1010-1013, 1017, 1044, 1143, 1181, 1315, 1317, 1458, 1513-1520, 1529-1534 multivariable system identification 700, 711, 717 Mutated Binary Particle Swarm Optimization 50 Mutation 3, 5, 23, 28, 29, 34, 42, 54, 56, 68, 69, 70, 83, 84, 89, 101, 123, 150, 153, 154, 158-163, 174, 175, 182-195, 199, 206, 213, 224-229, 234, 247, xxix
249, 251, 260, 265-268, 272, 273, 276, 280, 302, 306, 309, 334, 342, 347-350, 358, 362-365, 371, 374, 378, 384, 386, 390, 410, 411, 415, 422-433, 438, 447-461, 500, 502, 507, 510, 516-519, 523, 538-544, 564, 566, 567, 574, 578, 583, 626, 644, 647, 654, 661-667, 690, 692, 720, 739, 746-751, 759, 779-783, 791-798, 804, 805, 817, 830, 834, 853, 857-862, 876-880, 887, 899, 901, 918, 920, 921, 935, 936, 937, 952-961, 974, 975, 980, 988993, 1002, 1039-1043, 1052, 1056, 1057, 1062, 1064, 1067, 1075, 1088, 1111, 1112, 1125, 1137, 1145-1152, 1156, 1158, 1163, 1164, 1173, 1174, 1179, 1180, 1181, 1194, 1197, 1201, 1203, 1211, 1217, 1222-1228, 1233, 1234, 1237, 1241, 1244, 1248, 1252, 1254, 1299, 1301, 1322, 1333, 1347, 1388, 1418, 1420, 1421, 1433, 1437, 1440, 1444, 1458-1463, 1476, 1477, 1486, 1493, 1496, 1518, 1519, 1520, 1525 Mutation Rate 249, 430, 459, 460, 542, 544, 647, 666, 667, 692, 804, 857, 860, 1067, 1088, 1181, 1201, 1222, 1223, 1228, 1248, 1462, 1463, 1476
N Naïve Bayes 570-577, 585, 589, 620, 624, 632, 639, 874, 876, 883, 884, 888-894, 1318-1322, 13261330, 1339 Nature-Inspired Techniques 655 NCA 116, 127, 128, 129, 133, 244-249 NCGABIL 244, 248-254 Neighborhood Component Analysis Feature Selection 116 Network Design 93, 144, 694, 1175-1183, 1188, 1204, 1205, 1206 Neural Networks 35, 37, 39, 48, 92, 93, 109-118, 122, 143-147, 246, 258, 277, 400, 550-558, 593-610, 617, 618, 619, 623, 639, 640, 649-654, 674, 701, 760, 832, 852, 879, 892, 893, 894, 926, 970, 983, 985, 995, 996, 1114, 1158, 1179, 1183, 1217, 1227, 1230, 1257, 1342-1346, 1364, 1371-1392, 1461, 1520 nonidentical machines 932 Nurse Scheduling Problem 1207, 1209, 1213
O Object Tracking 534, 548, 549 Offspring 16, 20, 24, 25, 27, 56, 83, 84, 101, 160, 185, 212, 213, 226, 227, 266, 268, 273, 308, 349, 358, 362-365, 384-388, 409, 410, 412, 438, 453, 500, 516, 517, 538, 566, 567, 583, 626, 661, 664, 749, xxx
751, 781, 794, 810, 821, 834, 837, 838, 858, 878, 901, 935, 937, 971, 974, 980, 983, 1003, 1039, 1042, 1043, 1079, 1148, 1149, 1163, 1170, 1174, 1179, 1234, 1237, 1241, 1299, 1333, 1347, 1421, 1476, 1498, 1500, 1518, 1519 Oil Pipeline Integrity 947 ontology mapping 260-269, 274, 276, 277 Open Source 124, 146, 252, 280-283, 287, 292, 297, 298, 299, 439, 478, 548, 1015, 1016, 1216, 1218, 1457 Operating Systems Tasks 1140 Optimal Design 33, 109, 344, 494, 578, 1002, 1184 optimal security controls 1513, 1514, 1516 Optimization 1-6, 32-35, 39, 42, 50, 53, 54, 58, 62-73, 78-96, 107, 113, 116, 120, 122, 123, 142-150, 154, 155, 161, 164, 168, 176-182, 188-197, 202, 203, 204, 221, 242, 245, 247, 254-269, 276-282, 298315, 322-330, 334, 340-358, 376, 378, 383, 397, 401-406, 419-436, 444, 445, 482, 487, 493, 494, 508, 510, 515, 516, 517, 522, 530-535, 539-551, 564-569, 578, 592-625, 633, 638-662, 672, 676679, 690-703, 711, 715-720, 739, 740, 744, 746, 759-781, 787-818, 825-833, 847-856, 870-882, 892-901, 915, 918, 920, 926, 931, 935, 943, 944, 947, 955, 965, 966, 971, 972, 973, 983, 986, 988, 996-1013, 1017-1022, 1033, 1034, 1039, 1040, 1044-1055, 1062, 1070-1077, 1086, 1111-1120, 1137, 1142-1145, 1150-1154, 1159, 1164, 1171, 1176-1193, 1198, 1205, 1209, 1213, 1227, 1230, 1233, 1238, 1257-1265, 1269, 1277-1285, 1298, 1299, 1302, 1311, 1315, 1316, 1317, 1345, 1353, 1367, 1371-1393, 1415, 1418, 1422, 1424, 1428, 1429, 1433, 1437, 1438, 1442, 1451-1461, 14681472, 1477, 1479, 1491, 1493, 1495, 1510-1520, 1529-1534 Optimization Algorithms 3, 53, 54, 81, 277, 280, 301, 303, 310, 322, 325, 346, 566, 567, 592, 690-697, 718, 791, 810, 900, 1002, 1311, 1455, 1514 Optimization of Traffic Flows 762 Optimization Problems 1, 2, 3, 34, 80, 81, 91-96, 148, 149, 161, 176, 179, 180, 190, 191, 197, 245, 264, 300-303, 312, 322, 326, 329, 344, 345, 346, 350355, 376, 383, 422, 516, 535, 564, 566, 567, 597, 614, 678, 701, 716, 739, 746, 761, 790-793, 798, 803, 807, 812, 816, 817, 828, 830, 875, 877, 900, 920, 935, 966, 971, 1055, 1075, 1112, 1143, 1144, 1152, 1260, 1261, 1269, 1277-1284, 1386, 1418, 1437, 1493, 1517-1520 Order Distance Vector 1074 Ordinary Kriging 98, 99, 107-114
P Particle Swarm Optimization (PSO) 1, 6, 53, 85, 191, 193, 202, 300, 534, 535, 541, 593, 610, 625, 703, 761, 813, 874, 881, 882, 898, 1142 Peak Load Reduction 1052 Perceptron 35, 38, 39, 48, 118, 119, 468, 470, 472, 613, 894, 985, 986, 1387 Personalized Web Search 656-662, 666-673, 1509 PID 9, 10, 65-76, 569, 650, 660, 1295-1301, 1315, 1316 Pipeline Risk Optimization 947 Pollutants Dispersion 1228 Population 3-16, 22-31, 39, 40, 42, 68, 69, 70, 83-88, 101, 123, 124, 134, 141, 150-163, 172, 173, 175, 181-199, 211, 212, 224-230, 234, 241, 247, 249, 250, 261-268, 273, 276, 280, 304-324, 334, 346353, 358-366, 374, 384, 386, 409-414, 427-431, 439, 451-461, 500-506, 516, 517, 518, 538-544, 564, 567, 574, 578-583, 599, 600, 625, 626, 634, 647, 649, 657, 661-667, 690, 692, 702, 703, 704, 716-724, 744-751, 765, 766, 776-783, 791-798, 803, 804, 805, 809-825, 853, 857-862, 876-881, 887, 899, 901, 920, 921, 935-938, 955, 974-982, 1000-1004, 1008, 1011, 1039-1043, 1055, 1056, 1057, 1062, 1067, 1074-1114, 1119, 1120, 1125, 1126, 1130, 1142-1152, 1158, 1162, 1163, 1164, 1179, 1185, 1186, 1191, 1194, 1195, 1201, 1203, 1209, 1210, 1211, 1216, 1217, 1220, 1228, 12331241, 1248, 1298, 1299, 1301, 1322, 1324, 1325, 1330, 1333, 1347, 1418, 1420, 1421, 1433, 1437, 1440, 1444, 1462, 1463, 1471-1480, 1484, 1485, 1486, 1514, 1518, 1519, 1520, 1524, 1525 population seeding technique 1074-1088, 1092, 1093, 1112 Portfolio Optimization 78-96, 178, 790-793, 798-810, 966 Portfolio Selection 78, 80, 82, 86, 90-96, 793, 796, 806-809, 1515 potential rectangles 223-241 Power Balance Constraint 164, 169, 180 Predictions 38, 279, 286, 287, 290, 294, 297, 444, 614, 969, 970, 985-996, 1383 Premature Convergence 31, 70, 158, 182, 188, 191, 212, 276, 350, 384, 430, 448, 539, 936, 937, 1080, 1085, 1088, 1120, 1197, 1471 Privacy Preserving Feature Selection 572, 589, 13181321, 1339, 1340 Progressive Flow 762 Prohibited operating zones 164, 170, 175, 180 Proportional-Integral-Derivative-Filter 1285, 1303 PSNR 851-864, 868
PSO 1-8, 31, 32, 33, 53, 54, 63, 85-90, 142, 149, 175-178, 191-202, 265, 300-312, 317, 319, 322, 534-549, 593, 600-603, 610, 614, 625, 633, 634, 636, 703, 761, 787, 813, 822, 824, 874, 875, 876, 881-894, 898, 1071, 1142, 1143, 1144, 1345, 1461
Q QUANTUM INSPIRED GENETIC ALGORITHMS 1116, 1117, 1118 queue length 762, 764, 770
R Ramp Rate Limit 164, 168-174, 180 Random Values 15, 570-577, 585, 586, 589, 1301, 1322, 1329, 1335, 1339 RDF 447, 448 Real Root 223 Real-Time Scheduling 510, 1139-1146, 1153, 1154 Receiver Operating Characteristics 287, 593 Recommender System 656-661, 671, 674, 829-833, 846, 848, 849, 1492-1495, 1508-1512 Reconfigurable Embedded Systems 1140, 1143, 1154 Reduct 745, 750, 752, 754, 759, 761, 1233 Regression Model 114, 119, 127, 128, 129, 742-748, 754-759, 1379 Response surface method (RSM) 609 Rio Grande 1215-1222, 1226, 1228 Risk 78, 80, 83, 90, 91, 95, 178, 421, 569, 613, 619, 790, 791, 792, 796-810, 875, 884, 898, 914, 916, 947-967, 984, 1022, 1026, 1178, 1270, 1386, 1513-1521, 1531, 1533, 1534 Risk Assessment 619, 947-955, 959, 965, 966, 967, 1513, 1516, 1520, 1521, 1531, 1534 Road Transportation 403 Robustness 66, 73, 75, 148, 192, 214, 221, 223, 345, 346, 350, 352, 545, 547, 624, 635, 690, 747, 851872, 885, 887, 1299, 1345, 1362, 1393 rock depth 98, 107, 108, 110, 114 Rough Set 605, 606, 632, 651, 744, 745, 752, 761, 894, 1229-1234, 1238, 1240, 1242, 1246, 1254-1258 rule classification 244, 252 runtime verification 495, 497, 499, 506, 507
S Search Engines 182, 656 search space reduction 223, 225, 229, 231, 234, 241 Secure Wrapper Technique 570, 1321 Security 179, 204, 433, 434, 444, 445, 464, 509, 534, xxxi
589, 620, 621, 635-641, 673, 674, 758, 761, 789, 809, 834, 850, 852, 856, 873, 915, 919, 949, 965, 969, 970, 971, 988, 994, 995, 996, 1019, 1138, 1228, 1256, 1323, 1329, 1335, 1339, 1414-1419, 1428, 1429, 1452, 1453, 1496, 1512-1517, 15211525, 1531-1534 semantic similarity 1053, 1054, 1061, 1070 Semantic Web 277, 278, 447, 448, 463, 478-481, 658, 675, 848, 1510 shape context 205-210, 214, 220 shape retrieval 205, 206, 207, 211, 213, 214, 219, 221 Sharpe Ratio 790, 798-806, 810, 1516 Shortest Path Problem 403 Simulation 34, 65-77, 83, 93, 95, 96, 113, 117, 133, 144, 175, 203, 344, 345, 350, 353, 429, 431, 535, 546, 547, 566, 618, 638, 741, 745, 763, 764, 771, 784, 788, 824, 916, 944, 945, 997, 1004-1008, 1014, 1015, 1138, 1143, 1178-1184, 1215-1228, 1291, 1301, 1316, 1317, 1346, 1372, 1373, 1377, 1382, 1388, 1389, 1453, 1457, 1464-1468 Smart Grids 1017-1024, 1047, 1050, 1052 Smart Meters 1019, 1052 Soft Computing 34, 63, 114, 176-179, 325, 354, 397, 398, 424, 427, 479, 494, 531, 613, 616, 635-641, 650, 654, 672, 674, 678, 694, 695, 806, 872, 894, 915, 916, 1073, 1114, 1213, 1230, 1298, 1316, 1317, 1375, 1452, 1454, 1520, 1532 Spatial variability 98, 102-110, 115 sphere function 1, 5, 9, 16, 310, 312, 314 steganography 1416, 1418, 1419, 1423-1429 Stock Selection 790 stretch factor 355, 356, 357, 368 SUMO 1215-1228 suppliers 1018, 1030, 1175-1180, 1188 Swarm Intelligence 3, 32, 62, 85, 89, 149, 178, 325, 326, 548, 549, 567, 643, 644, 650, 651, 742, 747, 760, 761, 771, 850, 1345, 1371, 1374, 1452, 1455, 1512 Syntactic Similarity 270, 1053 system identification 700-706, 710, 711, 715, 717, 719, 737-741
T Tabu Search 3, 93, 397, 398, 400, 404, 422, 423, 424, 650, 792, 816, 1143, 1147-1151, 1158, 1170,
1171, 1179, 1209, 1384, 1431, 1433, 1437, 1453, 1454, 1513-1532 tasks allocation 1122, 1124, 1126, 1136 Template Matching 534-537, 545-549, 1382 Test suit 1 time variant system identification 701, 704, 706, 715, 717, 719, 740 total completion time 929-939, 943, 1118, 1138, 1156-1174 Traffic Delay 762 Traffic Lights 763, 764, 1215-1227 Trapezoidal membership function 896, 903-907 Traveling Salesman Problem 199, 200, 204, 375, 424, 565, 659, 673, 1075, 1077, 1083, 1111-1114 tree t-spanner 355-359, 365, 366, 368, 374 Triangular Membership Function 458, 898-907 Trusted-Third Party 570 t-spanner subgraph 355-358, 366, 368 TSPLIB 1075, 1077, 1083, 1111
U Up Convertor 482
V vagueness (or complex) data set 895 Variable Selection 145, 615, 618, 742-745, 750-760 Vehicle Routing 354, 375, 378, 385, 396-406, 415, 421-425, 659, 671, 945, 946, 984, 1111, 11781184, 1208, 1214, 1435, 1455, 1534 Virtual Machine Placement 773, 776, 787, 788, 789, 1468, 1469, 1470 Virtual Machines 181, 773-780, 784-788, 1141, 1456-1469 Virtualization 774, 788, 1335, 1456, 1457, 1458 VM Migration 774, 1456, 1458, 1464 VM Placement 773-787, 1456-1468 voluminous data set 895
W Windspeed Prediction 116 wrapper method 742, 743 Wrapper Technique 570, 1321
Section 1
Fundamental Concepts and Theories
Chapter 1
Mathematical Optimization by Using Particle Swarm Optimization, Genetic Algorithm, and Differential Evolution and Its Similarities

Shailendra Aote
Ramdeobaba College of Engineering and Management, India

Mukesh M. Raghuwanshi
Yeshwantrao Chavan College of Engineering, India
ABSTRACT

To solve optimization problems, various methods have been proposed in different domains. Evolutionary computing (EC) is one of these methods. Commonly used EC techniques include Particle Swarm Optimization (PSO), Genetic Algorithm (GA), and Differential Evolution (DE). These techniques have different outward structures, but their inner working is essentially the same: different names and formulae are given to different tasks, yet ultimately they all do the same thing. Here we try to find the similarities among these techniques and give the working structure at each step. All the steps are illustrated with a worked example and code written in MATLAB for better understanding. We start our discussion with an introduction to optimization and to solving optimization problems with PSO, GA, and DE. Finally, we give a brief comparison of the three.
DOI: 10.4018/978-1-7998-8048-6.ch001

1. INTRODUCTION

Problem solving is one of the most complicated intellectual activities of the human brain. The process of problem solving deals with finding a solution in the presence of constraints. An exact solution to some problems might simply be infeasible, especially if the problem has high dimensionality. In those problems,
a solution near the exact value might be deemed very good and sufficient. The knapsack problem and the linear programming problem are examples of optimization problems. Another example of an optimization problem is to arrange the transistors on a computer chip so that they occupy the smallest area and the number of components used is as small as possible. Optimization is used in many problems such as scheduling, resource allocation, decision making, and industrial planning. Furthermore, these optimization techniques cover large application areas in business, industry, engineering, and computer science. Owing to their simplicity, small number of parameters, and fast convergence, many real-world problems like Economic Dispatch (Selvakumar & Thanushkodi, 2007), Scheduling Problems (Pongchairerks, 2009), Textual Entailment Recognition (Mehdad & Magnini, 2009), Term Extraction (Mehdad & Magnini, 2009), Intelligent Web Caching (Syafrullah & Salim, 2010), Text Feature Selection (Sulaiman, Shamsuddin, Forkan, & Abraham, 2008), etc. are solved using these algorithms.
1.1. Basic Concepts

Optimization means problem solving in which one tries to find the minimum or maximum value of a given function by systematically choosing the values of real or integer variables from a given set. An optimization problem can be represented as follows. Consider a function f which maps from a set X to the real numbers. Finding an element xi in X such that f(xi) ≥ f(x) for all other x in X is called maximization, and finding xi such that f(xi) ≤ f(x) for all other x in X is called minimization; our aim is to find xi. Let R be the Euclidean space and X a subset of R. Every member of X has to satisfy the set of constraints, equalities or inequalities. The search space is defined as the domain X of f, while the elements of X are called feasible solutions. The function f is called the objective function. An optimal solution is a feasible solution that minimizes (or maximizes) the value of the objective function.

In the case of optimization of a single objective function f, an optimum value is either its maximum or its minimum, depending on the nature of the problem, and this optimum must satisfy the constraints. Consider the operation of a manufacturing plant: we have to assign incoming orders to machines so that the time needed to complete them is minimized. On the other hand, we will arrange the employment of staff, the placing of commercials, and the purchase of raw materials in a way that maximizes our profit. In global optimization, optimization problems are mostly defined as minimization problems. A global optimum is the best value over the whole domain X, while a local optimum is an optimum of only a subset of X. If one wants to minimize the objective, the search aims at the global minimum point in the search space; it is possible to get stuck in a local minimum during the search, which diverts the search away from a better solution. This problem is referred to as local minima in Artificial Intelligence (AI); AI also identifies plateaus and ridges as difficulties in optimization.

Optimization problems are mainly classified into single objective optimization (SOO) and multiobjective optimization (MOO). Here our aim is to deal with single objective optimization problems. SOO problems are further classified as unimodal and multimodal optimization, and also as single-dimensional and multidimensional problems. The difficulty of solving a problem increases from unimodal to multimodal as well as from single-dimensional to multidimensional. If the problem is multimodal and multidimensional, then the obtained solution may lie far away from the true optimum.
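As a small illustration (not code from the chapter), the following MATLAB sketch evaluates a one-dimensional multimodal function over its search space; the many local minima near the integer points show why a search can get trapped away from the global minimum at x = 0.

f = @(x) x.^2 - 10*cos(2*pi*x) + 10;   % a multimodal objective (1-D Rastrigin form)
x = linspace(-5, 5, 10001);            % sampled search space X
[fbest, idx] = min(f(x));              % best value over the sampled domain
xbest = x(idx);                        % close to the global minimum at x = 0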
1.2. Classical Methods

Different methods to solve optimization problems are as follows:

1. Traditional optimization algorithms use an exact method to find the best available solution (Selvakumar & Thanushkodi, 2007). The idea is that if the problem can be solved, then the algorithm must find the global best solution. The brute force method is an example of an exact method, which tries every solution in the search space; but as the search space grows, the cost of brute force grows with it, so it is not appropriate for NP-hard problems. Other exact approaches include linear programming, divide and conquer, dynamic programming, greedy methods, etc.
2. Stochastic optimization methods try to find a near-optimal solution for NP-hard problems in polynomial time. They start from a random initial solution in the search space and then search for an optimal solution over a number of iterations. Hill climbing, simulated annealing, tabu search, etc. are known as stochastic optimization methods.
3. Evolutionary algorithms are also stochastic optimization methods, differing from the others in that EAs maintain a population of probable solutions to a problem. Different types of evolutionary algorithms are the genetic algorithm, differential evolution, genetic programming, etc. An evolutionary algorithm works as follows: a population of individuals is initialized in the search space, where each individual is a potential solution. The quality of each solution is evaluated using the given fitness function. Selection is then used to produce the next population, while mutation is performed to alter individuals and crossover to perform higher-order transformations. This process is repeated for a number of iterations or until the optimal solution is obtained (a minimal sketch of this loop is given after this list). Figure 1 shows the working principle of evolutionary algorithms.
4. Particle swarm optimization, ant colony optimization, cuckoo search, etc. are swarm intelligence techniques for optimization.
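The following MATLAB sketch illustrates the generic evolutionary loop described in item 3 above on the sphere function. It is an illustrative sketch only, not the chapter's code; the particular selection, crossover, and mutation operators are simple placeholder choices.

NP = 20; D = 2; Gmax = 100;              % population size, dimension, generations
xl = -100; xu = 100;                     % search range
pop = xl + rand(NP,D)*(xu-xl);           % initialize population
fit = sum(pop.^2, 2);                    % evaluate fitness (sphere function)
for g = 1:Gmax
    % selection: binary tournament picks NP parents
    idx = randi(NP, NP, 2);
    [~, win] = min([fit(idx(:,1)) fit(idx(:,2))], [], 2);
    parents = pop(idx(sub2ind([NP 2], (1:NP)', win)), :);
    % crossover: blend each parent with a randomly chosen mate
    mates = parents(randperm(NP), :);
    child = 0.5*(parents + mates);
    % mutation: occasional random perturbation, clipped to the range
    mask = rand(NP,D) < 0.1;
    child = child + mask .* (0.1*(xu-xl)*randn(NP,D));
    child = min(max(child, xl), xu);
    % replacement: a child survives if it is better than the current member
    cfit = sum(child.^2, 2);
    better = cfit < fit;
    pop(better,:) = child(better,:);
    fit(better) = cfit(better);
end
[bestFit, b] = min(fit);                 % best solution found: pop(b,:)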
2. LITERATURE REVIEW

The first evolutionary algorithms that were purposefully designed to obtain an approximation set were proposed in the mid-1980s (Kursawe, 1990; Schaffer, 1985; Fourman, 1985). In these schemes, a proportion of the population was selected according to each individual objective. The main difficulty with this approach is that it often creates a phenomenon known as speciation, in which solutions arise in the population that are particularly strong in a single objective and particularly poor in others. Thus, important compromise solutions remain undiscovered, since the recombination of solutions from different extreme regions of the tradeoff surface cannot usually be assumed to generate an 'intermediate' compromise. In the weighted-sum approach to MO, performance is captured in a single objective, calculated as a weighted sum of individual performance in each of the original individual objectives. The well-known drawbacks of this approach are the difficulty of setting values for the weights, and the necessary condition of convexity of the trade-off surface that is required to obtain all Pareto optimal solutions. Thus, no combination of weights exists that can generate solutions in non-convex regions of the trade-off surface, as shown geometrically by Fleming and Pashkevich (1985). However, MOEAs based on weighted-sum schemes have also been proposed. Haleja and Lin (1992) included the weight
Figure 1. A flowchart of working principle of evolutionary algorithms
vector in the solution genotype and allowed multiple weight combinations to be propagated through the population during evolution. Jin, Okabe and Sendhoff (2001) varied the weight vector over the evolution, and have also provided theoretical justification for the method (Jin, Okabe, & Sendhoff, 2001a; 2001b). Unlike these early attempts, the majority of modern MOEAs are based on the concept of Pareto dominance (Coello, 2002). The use of Pareto dominance as a basis for solution comparison in EAs was first suggested by Goldberg (1989), together with the use of a niching technique to encourage solution distribution across the trade-off surface. In the early 1990s, three much-cited techniques emerged based on Goldberg's ideas: Fonseca and Fleming's (1993) multi-objective genetic algorithm (MOGA), Horn and Nafpliotis's (1993) niched Pareto genetic algorithm (NPGA), and Srinivas and Deb's nondominated sorting genetic algorithm (NSGA) (Srinivas & Deb, 1994), although early, less well-known implementations by Ritzel and Cieniawski have also been reported (Horn & Nafpliotis, 1993; Fonseca & Fleming, 1995). The techniques differ slightly in the way in which fitness is derived from Pareto comparisons of solutions. MOGA, NPGA, and NSGA all use fitness sharing for diversity promotion (Goldberg & Richardson, 1987).

Many PSO variants have been proposed from its formation to date. Efforts have been put toward increasing efficiency and convergence speed, but both are rarely achieved in a single algorithm. Low-dimensional problems are easier to solve, whereas complexity increases with the number of dimensions and the modality. A niching method was introduced in EAs to locate multiple optimal solutions (Li & Deb, 2010). The distance-based LIPS model (Qu, Suganthan, & Das, 2013), a memetic PSO (Wang, Moon, Yang, & Wang, 2012), Adaptive PSO (Zhan, Zhang, Li, & Chung, 2009), Fractional PSO (Kiranyaz, Ince, Yildirim, & Gabbouj, 2010), AGPSO (Mirjalili, Lewis, & Sadiq, 2014), CSO (Cheng & Jin, 2015), and many other techniques have been proposed to handle higher-dimensional and multimodal problems. In spite of these techniques, trapping in local minima and the rate of convergence are two unavoidable problems in PSO and all other EAs. Though PSO is a nature-inspired algorithm, a lot of issues can still
be modeled to improve the performance. Different techniques to deal with stagnation are studied in Bonyadi, Reza, Michalewicz, and Li (2014), Li and Yao (2012), and Bonyadi, Reza, Michalewicz, and Li (2014); they control the parameters involved in the velocity update equation. To remove the problem of stagnation and obtain better performance, a lot of techniques have been proposed: GCPSO (Bergh, 2002) uses a different velocity update equation for the best particle, since its personal best and global best both lie at the same point; OPSO (Wang et al., 2007) employs opposition-based learning for each particle and applies dynamic Cauchy mutation on the best particle; QPSO (Yang, Wang, & Jiao, 2004) is a discrete particle swarm optimization algorithm based on a quantum representation of individuals, which in turn causes faster convergence. In the method called H-PSO (Janson & Middendorf, 2005), the particles are arranged in a dynamic hierarchy that is used to define a neighborhood structure. Depending on the quality of their so-far best found solution, the particles move up or down the hierarchy; this gives good particles that move up in the hierarchy a larger influence on the swarm. George I. Evers proposed RegPSO (Evers & Ghalia, 2009), where the problem of stagnation is removed by automatically triggering swarm regrouping. Efforts have also been taken to solve multimodal problems. To solve higher-dimensional problems, cooperative co-evolution variants like CPSO-SK and CPSO-HK (Bergh & Engelbrecht, 2004), CCPSO (Yang, Tang, & Yao, 2008), and CCPSO2 (Cui, Zeng, & Yin, 2008) have been proposed.
3. PARTICLE SWARM OPTIMIZATION

Scientists and engineers from all disciplines often have to deal with the classical problem of search and optimization. Optimization means the action of finding the best-suited solution to a problem within the given constraints and flexibilities. While optimizing the performance of a system, we aim at finding the set of values of the system parameters for which the overall performance of the system will be the best under some given conditions. Usually, the parameters governing the system performance are represented as a vector x = [x1, x2, x3, …, xD]. For real-parameter optimization, each parameter xi is a real number. To measure how close to the "best" performance we have come, an objective function (or fitness function) is designed for the system. Let us consider the following simple objective function, named the sphere function.

Figure 2. Objective function – sphere
F(x) = Σ_{i=1}^{n} x_i^2,  where x_i ∈ [-100, 100]
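As a quick check (an illustrative sketch, not code from the chapter), the sphere function can be written as a MATLAB anonymous function and evaluated at a sample point in the range:

sphere = @(x) sum(x.^2, 2);   % each row of x is one candidate solution
x0 = [3 -4];                  % a sample point with D = 2
f0 = sphere(x0);              % returns 25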
Particle Swarm Optimization (PSO) is a simple real-parameter optimization algorithm. It works through a simple cycle of stages, as shown in Figure 3.

Figure 3. Stages in PSO working
3.1. Representation of Solution

There are different methods for representing a solution/point in the search space (variable space). The use of a proper data structure (like a list, tree, etc.) plays a very important role in the processing of solutions. Particles are the basic building blocks of particle swarm optimization. Particle movement in the search space leads to the solution. The movement depends on the velocity (v) and displacement (x) of the particle. When particles move, they adjust their velocities so that no collision occurs between them. The velocity of a particle depends on factors such as its current velocity, its personal best, and the social (global) best of the swarm. After finding the velocity, the new displacement vector of each particle is found. Generally, in PSO, variables and objectives are represented separately:

NP = 10;
x = zeros(NP,D);
v = zeros(NP,D);
pbest = x;
fit = fun(x);
[~, g] = min(fit);   % index of the best particle
gbest = x(g,:);      % position of the best particle
3.2. Initialization

The Particle Swarm Optimization algorithm (PSO) begins by randomly initializing NP particles as D-dimensional vectors within the search space. These vectors act as candidate solutions for the given objective function. Every cycle in PSO can be considered a generation, G = 1, 2, 3, …, Gmax. Any candidate solution at any generation can be denoted as:

xi,G = [x1,i,G, x2,i,G, x3,i,G, …, xD,i,G]
There exist lower and upper bounds within the search space for each parameter of the problem; they can be denoted as Xmin = [X1,min, X2,min, …, XD,min] and Xmax = [X1,max, X2,max, …, XD,max]. To consider the effect of a normal distribution, the initial population at G = 0 must cover the maximum range between the lower and upper bounds. This can be achieved by the following equation:

Xj,i,0 = Xj,min + rand[0,1] × (Xj,max − Xj,min)

where xj,i,0 is the jth component of the ith member of the population at the 0th generation, and xj,min and xj,max are its respective lower and upper values. This type of initialization is called symmetric initialization, where the search is performed in the given range only; this range varies with the fitness function. Let us consider NP = 10 and D = 2; then the initial population for the above-mentioned objective function, with fitness values, will be as shown in Table 1.

Table 1. Initial population with fitness value

SN    Initial Population          Fitness Value
1      49.2214     4.6626         2444.48605672000
2      92.3987   -93.9459        17363.3518885000
3     -13.9432    78.0072         6279.53607808000
4     -54.05976  -77.8154         8977.69412841760
5      30.3994   -74.9137         6536.18596805000
6      76.0132    51.1828         8397.68559008000
7      56.6531    95.7127        12370.4946809000
8     -89.8707   -14.8693         8297.83880098000
9     -87.9401    80.6312        14234.8516014500
10    -40.3144     2.7629         1632.88446377000
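The fitness column of Table 1 can be reproduced with a single vectorized statement (an illustrative check, not code from the chapter); pop below holds the ten position rows of the table:

pop = [ 49.2214    4.6626;
        92.3987  -93.9459;
       -13.9432   78.0072;
       -54.05976 -77.8154;
        30.3994  -74.9137;
        76.0132   51.1828;
        56.6531   95.7127;
       -89.8707  -14.8693;
       -87.9401   80.6312;
       -40.3144    2.7629 ];
fit = sum(pop.^2, 2);   % sphere fitness of each row; matches the Fitness Value column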
3.2.1. Random Initialization (Symmetric)

Code:
function [x,v] = init_pop(NP,D,xl,xu)
for i=1:NP
    for j=1:D
        xMin = xl(j);
        xMax = xu(j);
        % Random real number between xMin and xMax
        x(i,j) = xMin + rand*(xMax-xMin);   % initialize position
        v(i,j) = xMin + rand*(xMax-xMin);   % initialize velocity
    end
end
end
The function init_pop randomly initializes a population of size NP. There are D decision variables in each solution of the population. Each variable has a lower (xl) and an upper (xu) bound and is randomly initialized between the two. If the lower and upper bounds for all variables are the same (for example, xl = -100 and xu = 100), the loops simplify to:

for i=1:NP
    for j=1:D
        x(i,j) = xl + rand*(xu-xl);
        v(i,j) = xl + rand*(xu-xl);
    end
end
Figure 4. Distribution of initial population in search space
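When the bounds are scalar, the same initialization can also be written without loops using vectorized random number generation (an alternative sketch, not the chapter's code):

x = xl + rand(NP,D)*(xu-xl);   % positions, one row per particle
v = xl + rand(NP,D)*(xu-xl);   % velocities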
3.2.2. Information Based Initialization

In most real-world problems, knowledge of the exact optimum is usually not available. Symmetric initialization creates the population around the optimal solution, whereas skewed initialization creates the population away from the actual optimum. The performance of PSO under symmetric initialization may not represent PSO's true performance when solving the same problem with a different initialization, or when solving another problem.
Initializing the population away from the global basin ensures that the algorithm must overcome a number of local minima to reach the global basin. Hence it is always better either to use a mixed initialization scheme or to test the performance of the algorithm on both types of initialization. After initialization, the population can be further tuned to ensure a well-spread, uniform distribution over the search space; various techniques such as opposition-based learning (OBL), k-means clustering, and chaotic methods can be used for this tuning (a small OBL sketch is given below).

After initialization of the population, the next job is to calculate the fitness (objective value) of each solution. For example, the sphere function (fun = 1) is a single-objective function:

function fit = func(x, func_num)
% Sphere function
if func_num == 1
    fit = sum(x.*x, 2);
end
end

% Function value for the initial position
for i = 1:1:N
    f(i) = func(x(i,:), fun);
end
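As a hedged illustration of the opposition-based learning idea mentioned above (a sketch of the usual OBL formulation, not the chapter's code), each random particle x is paired with its opposite xl + xu − x, and the better of the two is kept:

x  = xl + rand(NP,D)*(xu-xl);   % random population (scalar bounds assumed)
xo = xl + xu - x;               % opposite population
fx  = sum(x.^2, 2);             % fitness of both sets (sphere function)
fxo = sum(xo.^2, 2);
keep = fxo < fx;                % keep whichever member of each pair is better
x(keep,:) = xo(keep,:);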
Other variables are defined as shown in Table 2.

Table 2. Variables

Variable     Formula / Values
pbest        pbest = x;
gbest        p_f = f; [gbest, g] = min(p_f);
w            0.72
c1, c2       1.49
3.3. Generation of New Solution

In a particle swarm optimizer, individuals are "evolved" by cooperation and competition among the individuals themselves through generations. Each particle adjusts its flying according to its own flying experience and its companions' flying experience. Each individual is named a "particle" which, in fact, represents a potential solution to a problem. Each particle is treated as a point in a D-dimensional space. The ith particle is represented as

Xi = (Xi1, Xi2, …, XiD)

The best previous position (the position giving the best fitness value) of any particle is recorded and represented as

Pi = (Pi1, Pi2, …, PiD)
The index of the best particle among all the particles in the population is represented by the symbol g. The rate of the position change (velocity) for particle i is represented as

Vi = (Vi1, Vi2, …, ViD)

The particles are manipulated according to the equations

ViD = w × ViD + c1 × rand() × (PiD − XiD) + c2 × rand() × (PgD − XiD)    (a)

XiD = XiD + ViD    (b)
rand() is a random function which generates a value in the range [0, 1]. The second part of equation (a) is the "cognition" part, which represents the private thinking of the particle itself. The third part is the "social" part, which represents the collaboration among the particles. Equation (a) is used to calculate the particle's new velocity according to its previous velocity and the distances of its current position from its own best experience (position) and the group's best experience. Then the particle flies toward a new position according to equation (b).

Figure 5. Particle's position change
w = inertia weight, used to explore and exploit the search space
c1 = cognitive constant, i.e., the learning factor for personal movement
c2 = social constant, i.e., the learning factor for group movement

Let us consider w = 0.7 and c1 = c2 = 1.49; then

Vid = w * Vid + c1 * rand() * (Pid - Xid) + c2 * rand() * (Pgd - Xid)
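To make the update concrete, the following sketch applies equations (a) and (b) once to a single two-dimensional particle (illustrative values only; the current velocity and the personal best are made up, not taken from Table 3):

w = 0.7;  c1 = 1.49;  c2 = 1.49;
x = [49.2214  4.6626];      % current position of one particle (row 1 of Table 1)
v = [ 5.0    -3.0   ];      % hypothetical current velocity
p = [40.0     2.0   ];      % hypothetical personal best position
g = [-40.3144 2.7629];      % global best position (best row of Table 1)
v = w*v + c1*rand()*(p - x) + c2*rand()*(g - x);   % equation (a)
x = x + v;                                         % equation (b)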
Table 3. Values at 1st generation
Code:

function [x, v] = generation(x, v, pbest, g, c1, c2, w, LB, UB, D, N)
gbest = repmat(x(g,:), N, 1);
% Update velocity and position
for i = 1:N
    v(i,:) = w*v(i,:) + c1*rand()*(pbest(i,:) - x(i,:)) + c2*rand()*(gbest(i,:) - x(i,:));  % update the velocity
    x(i,:) = x(i,:) + v(i,:);                                                               % update the position
end
% Reinitialize particles on the boundary if they move outside the range
for i = 1:N
    for j = 1:1:D
        xMin = LB(j); xMax = UB(j);
        if x(i,j) > xMax
            x(i,j) = xMax;
            v(i,j) = -0.5*v(i,j);
        end
        if x(i,j) < xMin
            x(i,j) = xMin;
            v(i,j) = -0.5*v(i,j);
        end
    end % j
end
end
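Putting the pieces together, the following MATLAB sketch shows how the fitness routine, the generation function above, and the selection routine given in the next subsection could be driven by a main loop until the iteration budget is exhausted. The bounds, population size, dimension and iteration count used here are illustrative assumptions, not the chapter's settings.

% Illustrative PSO driver (assumed settings: 2-D sphere function, 100 iterations)
N = 20; D = 2; fun = 1; maxIter = 100;
LB = -100*ones(1,D); UB = 100*ones(1,D);
x = repmat(LB, N, 1) + rand(N, D).*repmat(UB - LB, N, 1);   % initial positions
v = zeros(N, D);                                            % initial velocities
for i = 1:1:N, f(i) = func(x(i,:), fun); end                % initial fitness values
pbest = x; p_f = f; [~, g] = min(p_f);
w = 0.72; c1 = 1.49; c2 = 1.49;
for iter = 1:maxIter
    [x, v] = generation(x, v, pbest, g, c1, c2, w, LB, UB, D, N);
    for i = 1:1:N, f(i) = func(x(i,:), fun); end            % fitness of the new positions
    [pbest, p_f, g] = selection(x, f, p_f, N, pbest);
end
fprintf('Best value found: %g\n', min(p_f));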
3.4. Selection of Solutions to Generate New Solutions
When particles move outside the search range, they are reinitialized on the boundary; this is shown in the table above by the bold values (-100 or 100). It is then necessary to update pbest for the next generation. To do so, the fitness of each newly computed position vector is calculated and compared with the old fitness stored for that particle. If the new fitness is better than the old one, the corresponding position becomes pbest; otherwise it remains as it is. This process continues until the stopping criterion is met or the maximum number of iterations is reached.
Code:

[pbest, p_f, g] = selection(x, f, p_f, N, pbest);

function [pbest, p_f, g] = selection(x, f, p_f, N, pbest)
% Update personal best positions
for i = 1:1:N
    if f(i) < p_f(i)     % new position is better than the stored personal best
        p_f(i) = f(i);
        pbest(i,:) = x(i,:);
    end
end
[~, g] = min(p_f);       % index of the global best particle
end

a. Expanding Crossover (betaq > 1)
No of parents: 2, Distribution index: 2, x1: 9.700000, x2: 20.100000, rnd: 0.959492, betaq: 2.311060, Dist: 5.200000, y1: -2.317513, y2: 21.717513
No of parents: 2, Distribution index: 2, x1: 9.700000, x2: 20.100000, rnd: 0.505957, betaq: 1.004003, Dist: 5.200000, y1: 4.479183, y2: 14.920817

b. Contracting Crossover (rnd

A result instance is considered acceptable if its accuracy is > 65% and its recall is > 65%. The total number of prediction instances is the sum of the number of intra-project and the number of cross-project change predictions.

Figure 4. Calculating distributional characteristics of a dataset
Table 5. Descriptions of indicators used

Indicator          | Description
Mode               | Most frequently occurring value in a population.
Median             | The middle value in the given list of numbers.
Mean               | The average value in a population.
Minimum            | The smallest value in a population.
Maximum            | The largest value in a population.
Variance           | The average of the squared deviation of the values from the mean value.
Standard Deviation | The amount of variation in a population.
Skewness           | A measure of the asymmetry of a population about its mean.
Kurtosis           | A measure of the peakedness of a population.
Range              | The difference between the highest and the lowest value.
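As an illustration of how the indicators in Table 5 could be computed for a single metric column, a minimal MATLAB sketch is given below; the function name and struct fields are illustrative choices, not part of the study's tooling.

% Distributional characteristics of one software metric (e.g., a vector of WMC values)
function ind = characteristics(m)
    m = m(:);
    n = numel(m);
    ind.mode     = mode(m);
    ind.median   = median(m);
    ind.mean     = mean(m);
    ind.minimum  = min(m);
    ind.maximum  = max(m);
    ind.variance = var(m);
    ind.stddev   = std(m);
    ind.skewness = sum(((m - mean(m)) / std(m)).^3) / n;   % third standardized moment
    ind.kurtosis = sum(((m - mean(m)) / std(m)).^4) / n;   % fourth standardized moment
    ind.range    = max(m) - min(m);
end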
Figure 5. Train-test-result dataset
4.4. Validation Methods
To validate each experiment, the following techniques are used:
4.4.1. Intra-Project Validation
In intra-project validation, the prediction model is constructed on one of the previous versions and is validated on the subsequent future versions of the same project.
4.4.2. Cross-Project Validation
In cross-project validation, different projects are used for training and testing the models, respectively. The model constructed using one project is used to predict change-prone classes of some other project.
4.4.3. Rules Generation Validation
Rules are derived from the models constructed on only one dataset, the train-test-result dataset (refer to Figure 5), as explained in the previous section. When only one dataset is available, the results will be highly biased if training and testing are performed on the same dataset. Thus, K-fold cross-validation is used, where the dataset is divided into K parts. The model is trained on K-1 sets and testing is performed on
the remaining set. This process is repeated K times and the mean values of the results are taken. K-fold cross-validation provides an accurate estimate by reducing validation bias (Carvalho, 2010). In this study, the authors applied 10-fold cross-validation to check the performance of the model constructed for rules generation.
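A minimal MATLAB sketch of the K-fold splitting described above is shown below; the data layout and the trainClassifier/evaluate routines are illustrative placeholders, not the tools actually used in the study.

% K-fold cross-validation skeleton (K = 10)
K = 10;
n = size(data, 1);                       % data: one row per instance, last column = change-proneness label
idx = randperm(n);                       % shuffle the instances once
foldId = mod(0:n-1, K) + 1;              % assign each shuffled instance to a fold
acc = zeros(K, 1);
for k = 1:K
    testRows  = idx(foldId == k);
    trainRows = idx(foldId ~= k);
    model  = trainClassifier(data(trainRows, :));   % placeholder training routine
    acc(k) = evaluate(model, data(testRows, :));    % placeholder evaluation routine
end
meanAccuracy = mean(acc);                % mean of the K results, as described in the text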
5. EXPERIMENTAL RESULTS
For the experiments, 15 datasets are used, obtained from 3 different projects: MyFaces, Struts and Wickets. First, the results for intra-project change prediction are presented, followed by the results for cross-project change prediction; finally, rules are generated for selecting the training set which can be used to create change prediction models. For each dataset, the features are filtered using Correlation-based Feature Selection (CFS) (Refaeilzadeh, 2009). The set of features selected using CFS is used for constructing the prediction model. The same sets of features are used in every testing set tested using that dataset. The initial parameter settings for all the algorithms are presented in Table 6. To construct the hybrid decision tree genetic algorithm and oblique decision tree with evolutionary learning models, the authors used the open source tool KEEL (https://www.keel.es/).

Table 6. Initial parameters for experiments

Algorithm                                        | Parameters
Adaboost                                         | Number of iterations = 10; Seed = 1; Weight threshold = 100
Logitboost                                       | Number of iterations = 10; Likelihood threshold = -1.79; Seed = 1; Shrinkage parameter = 1.0; Weight threshold = 100
Bagging                                          | Bag size percentage = 100; Number of iterations = 10; Seed = 1
Random forests                                   | Number of trees to be generated = 100; Seed = 1
C4.5 decision tree                               | Confidence factor = 0.25; Minimum number of instances per leaf = 2; Number of folds = 3; Seed = 1
Hybrid decision tree genetic algorithm           | Keel tool default settings
Oblique decision tree with evolutionary learning | Keel tool default settings
5.1. Validation Results of Intra-Project Change Prediction
The results for every intra-project change prediction pair, along with the classifier which gave those results, are shown in Table 7 for MyFaces, Table 8 for Struts, and Table 9 for Wickets, respectively. As every project has 5 datasets, there are 10 intra-project prediction instances per project; applying 4 algorithms on them results in 40 result instances. Due to space constraints, the tables show only the results of the algorithm which gave the best values of accuracy, recall, precision and AUC for every result instance.

Table 7. Intra-project prediction results for MyFaces

Train Set | Test Set | Classifier | Accuracy | Recall | Precision | AUC
mf12 | mf23 | AdaBoost   | 68.54% | 77.46% | 75.31% | 0.70
mf12 | mf34 | AdaBoost   | 60.09% | 57.19% | 33.05% | 0.62
mf12 | mf45 | AdaBoost   | 66.48% | 78.57% | 16.33% | 0.73
mf12 | mf56 | Bagging    | 65.09% | 66.98% | 54.18% | 0.68
mf23 | mf34 | AdaBoost   | 69.32% | 63.31% | 42.62% | 0.70
mf23 | mf45 | AdaBoost   | 66.34% | 84.82% | 17.12% | 0.80
mf23 | mf56 | LogitBoost | 71.92% | 66.98% | 63.09% | 0.74
mf34 | mf45 | Bagging    | 73.18% | 71.43% | 18.69% | 0.79
mf34 | mf56 | Bagging    | 71.41% | 69.03% | 61.87% | 0.77
mf45 | mf56 | Bagging    | 70.83% | 72.01% | 60.50% | 0.77
Table 7 shows the results when intra-project change prediction is done on the MyFaces dataset. Out of the 10 instances, AdaBoost gave better results in 5 instances as compared to the other algorithms. Table 8 shows the results when intra-project change prediction is done on the Struts dataset. Out of the 10 instances, LogitBoost gave better results in 4 instances as compared to the other algorithms.

Table 8. Intra-project prediction results for Struts

Train Set | Test Set | Classifier | Accuracy | Recall | Precision | AUC
st12 | st23 | RandomForest | 61.18% | 63.19% | 52.75% | 0.60
st12 | st34 | Bagging      | 58.08% | 60.73% | 26.48% | 0.60
st12 | st45 | LogitBoost   | 61.49% | 70.20% | 43.49% | 0.65
st12 | st56 | Bagging      | 62.78% | 62.68% | 23.19% | 0.65
st23 | st34 | Bagging      | 67.79% | 68.06% | 34.76% | 0.74
st23 | st45 | LogitBoost   | 69.98% | 71.85% | 52.05% | 0.77
st23 | st56 | AdaBoost     | 68.28% | 76.09% | 29.17% | 0.78
st34 | st45 | LogitBoost   | 70.91% | 68.91% | 53.36% | 0.77
st34 | st56 | LogitBoost   | 66.57% | 73.19% | 27.48% | 0.76
st45 | st56 | RandomForest | 80.23% | 81.16% | 42.18% | 0.88
Table 9 shows the results when intra-project change prediction is done on the Wickets dataset. Out of the 10 instances, Bagging gave better results in 6 instances as compared to the other algorithms. Overall, for all the datasets, 70% of the 30 intra-project prediction instances had an accuracy > 65% and a recall > 65% using the bagging learner.

Table 9. Intra-project prediction results for Wickets

Train Set | Test Set | Classifier | Accuracy | Recall | Precision | AUC
wk12 | wk23 | Bagging    | 73.42% | 69.84% | 10.51% | 0.78
wk12 | wk34 | Bagging    | 75.05% | 72.63% | 0.87%  | 0.78
wk12 | wk45 | Bagging    | 70.00% | 71.98% | 13.29% | 0.78
wk12 | wk56 | Bagging    | 73.49% | 73.33% | 0.89%  | 0.79
wk23 | wk34 | LogitBoost | 74.48% | 73.68% | 0.86%  | 0.81
wk23 | wk45 | LogitBoost | 73.71% | 65.93% | 0.14%  | 0.79
wk23 | wk56 | LogitBoost | 75.90% | 67.62% | 0.91%  | 0.79
wk34 | wk45 | Bagging    | 69.87% | 72.53% | 13.31% | 0.78
wk34 | wk56 | Bagging    | 73.19% | 71.43% | 0.87%  | 0.79
wk45 | wk56 | LogitBoost | 72.74% | 70.48% | 0.84%  | 0.79
5.2. Validation Results of Cross-Project Change Prediction
The total number of cross-project predictions performed was 150 for each of the 4 learners. Due to space constraints, the prediction results are not specified. Bagging gave the best performance as compared to the other learners: out of the 150 result instances, 63.33% of the cross-project prediction instances had an accuracy > 65% and a recall > 65% using the bagging learner.
5.3. Selecting Training Set for Cross-Project Change Prediction
As there are 30 intra-project and 150 cross-project predictions, the total number of instances is 180. Learners developed using bagging gave the largest number of instances satisfying the acceptance criteria as compared to the other models: out of the 180 instances, 116 prediction instances satisfied the results criteria. CFS is applied on the train-test-result dataset to filter the features. The number of indicators used to describe the distributional characteristics is 70 for the training set and 70 for the testing set, hence a total of 140 variables. Out of these 140 variables, 15 are selected using CFS. Thus, only this subset of variables is used to construct the decision tree prediction models. The results of 10-fold cross-validation are shown in Table 10.

Table 10. Prediction results for selecting suitable training set

Classifier                    | Accuracy | Recall | Precision
C4.5 decision tree            | 73.33    | 74.14  | 82.69
Hybrid decision tree GA       | 75.00    | 93.97  | 74.15
Oblique decision tree with EA | 75.56    | 79.31  | 82.14
Results show that the tree constructed from predictors based on C4.5 algorithm provides the prediction results with values of accuracy of 73.33%, recall value of 74.14% and precision value of 82.69%. Evolutionary algorithms also give comparable results. Tree constructed from hybrid decision tree genetic algorithm provides the results with values of accuracy of 75.00%, recall value of 93.97% and precision value of 74.15%. Oblique decision tree provides the results with values of accuracy of 75.56%, recall value of 79.31% and precision value of 82.14%. This means that distributional characteristics can be effectively used to select suitable training set for predicting change-prone classes in software systems. Based on the results for C4.5 decision tree, the authors derived certain rules for selecting suitable training data. These rules are shown in Table 11. The supporting instances denote the number of instances classified correctly using the rule. The derived rules can be used to select suitable training set to predict change-prone classes. For instance, the first rule implies that if mean NOC of testing set is less than or equal to 1.07, mean NOC of training set is greater than 1.07, kurtosis DIT of testing set greater than 2.15 and standard-deviation WMC is greater than 8.51, then the training set can be used to predict change-prone classes in the testing set.
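To make the use of such rules concrete, the following MATLAB sketch checks the first rule quoted above against the distributional characteristics of a candidate training/testing pair. The variable names and the assignment of the WMC condition to the training set are illustrative assumptions, not the authors' implementation.

% trainNOC, testNOC, testDIT, trainWMC: vectors of metric values from the two datasets
meanNOCtest  = mean(testNOC);
meanNOCtrain = mean(trainNOC);
kurtDITtest  = sum(((testDIT - mean(testDIT)) / std(testDIT)).^4) / numel(testDIT);
stdWMCtrain  = std(trainWMC);            % assumed here to refer to the training set

% First rule derived from the C4.5 tree, as described in the text
if meanNOCtest <= 1.07 && meanNOCtrain > 1.07 && kurtDITtest > 2.15 && stdWMCtrain > 8.51
    disp('Training set can be used to predict change-prone classes in the testing set');
else
    disp('Rule not satisfied; consider another candidate training set');
end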
6. DISCUSSION OF RESEARCH QUESTIONS
Based on the results, the authors answer the questions presented in the introduction section as follows:
1. Does intra-project change prediction work better than cross-project change prediction? As the results in the experiments section show, cross-project change prediction provides comparable results to intra-project change prediction. However, the number of cross-project change prediction instances (150) conducted by us is larger than the number of intra-project change prediction instances (30).
2. Can distributional characteristics be used for selecting a suitable training set for predicting change-prone classes? Distributional characteristics helped in identifying suitable training sets for cross-project change prediction. Table 10 shows the prediction results for selecting a suitable training set. The authors also generated rules for selecting a suitable training set for making prediction models, as presented in Table 11. The rules were validated using 10-fold cross-validation.
3. Can an evolutionary decision tree give comparable or better results than a traditional decision tree for selecting a suitable training set? Based on our results in Table 10, evolutionary decision trees gave comparable performance to the C4.5 decision tree for successful change prediction.
4. Which evolutionary decision tree gives the best results for selecting a suitable training set for predicting change-prone classes?
Based on accuracy and precision, the oblique decision tree with evolutionary learning gave better results than the hybrid decision tree, with an accuracy of 75.56%, a recall of 79.31% and a precision of 82.14%, as shown in Table 10.
Table 11. Rules learned by C4.5 decision tree
7. LIMITATIONS OF THE WORK
It is very difficult to generalize experimental results in software engineering. In this study, the authors used 3 open source software systems developed in the Java programming language by Apache only. The conclusions drawn from the analysis may not be valid for other systems developed in Java or for commercial, proprietary, closed software. The metrics data collected from the CRG tool, which automates Scitools' Understand tool for calculating the metrics data, can be inaccurate. The acceptable prediction result values used in this paper are not a strict criterion; previous studies have used different criteria for acceptance (He, 2012; Zimmermann, 2009). However, the results observed in using the distributional characteristics to select a suitable training set for cross-project change prediction are in line with the results observed by He in defect prediction studies. The authors used ensemble learners to construct prediction models for intra-project and cross-project change prediction; there might be other classification algorithms which can provide better results than the algorithms used in this paper. Another possible limitation of the study is the choice of software metrics used to construct the predictors. Various researchers have studied the ability of object-oriented metrics to predict change-prone classes (Zhou, 2012; Malhotra & Khanna, 2013; Romano & Pinzger, 2011; Malhotra & Khanna, 2013). But other software metrics may exhibit different results, so the authors limit their conclusions to the metrics used in this paper.
8. CONCLUSION AND FUTURE WORK
Change-proneness prediction remains an interesting and important topic for researchers. The motivation of this study is to work on cross-project change prediction. First, the authors collected the dataset from 3 open source software projects. Then, they studied intra-project and cross-project change predictions using the collected datasets. The authors applied tree based algorithms to select a suitable training dataset for change prediction using the distributional characteristics of the training dataset and test dataset. The main findings of the research are summarized as follows:
• Cross-project change prediction provides comparable results to intra-project change prediction. This information can be helpful for predicting the change-prone classes in projects having no historical data.
• Distributional characteristics of a dataset are helpful in selecting a suitable training dataset for change prediction. Selecting a suitable training dataset is a very important step in building the prediction model. The authors derived rules to select the right training dataset based on the distributional characteristics of the training dataset and testing dataset.
• Evolutionary algorithm based decision trees provide comparable results to standard decision trees. This widens the scope of applying evolutionary algorithms in the domain of software change prediction.
Future work includes carrying out change prediction on a large scale with more projects. More metrics can be considered for making models. Researchers can use more learning methods and more evolutionary algorithms can be used for making prediction models.
REFERENCES Bala, J., Huang, J., Vafaie, H., DeJong, K., & Wechsler, H. (1995, August). Hybrid learning using genetic algorithms and decision trees for pattern classification. In Proceedings of the 14th international joint conference on Artificial intelligence (Vol. 1, pp. 719-724). Bansal, A. (2017). Empirical analysis of search based algorithms to identify change prone classes of open source software. Computer Languages, Systems & Structures, 47, 211–231. doi:10.1016/j.cl.2016.10.001 Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. doi:10.1023/A:1010933404324 Brooks, F. P. (1974). Mythical Man-Month. Datamation, 20(12), 44–52. Cantú-Paz, E., & Kamath, C. (2000, October). Combining evolutionary algorithms with oblique decision trees to detect bent-double galaxies. In International Symposium on Optical Science and Technology (pp. 63-71). International Society for Optics and Photonics. 10.1117/12.403609 Cantu-Paz, E., & Kamath, C. (2003). Inducing oblique decision trees with evolutionary algorithms. IEEE Transactions on Evolutionary Computation, 7(1), 54–68. doi:10.1109/TEVC.2002.806857 Carvalho, D. R., & Freitas, A. A. (2004). A hybrid decision tree/genetic algorithm method for data mining. Information Sciences, 163(1), 13–35. doi:10.1016/j.ins.2003.03.013 De Carvalho, A. B., Pozo, A., & Vergilio, S. R. (2010). A symbolic fault-prediction model based on multiobjective particle swarm optimization. Journal of Systems and Software, 83(5), 868–882. doi:10.1016/j. jss.2009.12.023 Dietterich, T. G. (2000, June). Ensemble methods in machine learning. In International workshop on multiple classifier systems. Springer Berlin Heidelberg. Eiben, A. E., & Smith, J. E. (2003). Introduction to evolutionary computing (Vol. 53). Heidelberg: springer. Harman, M. (2010). The relationship between search based software engineering and predictive modelling. In Proceedings of PROMISE’10. He, Z., Shu, F., Yang, Y., Li, M., & Wang, Q. (2012). An investigation on the feasibility of cross-project defect prediction. Automated Software Engineering, 19(2), 167–199. doi:10.100710515-011-0090-3 Khoshgoftaar, T. M., Seliya, N., & Liu, Y. (2003, November). Genetic programming-based decision trees for software quality classification. In Proceedings 15th IEEE International Conference on Tools with Artificial Intelligence 2003 (pp. 374-383). IEEE. 10.1109/TAI.2003.1250214 Liu, Y., & Khoshgoftaar, T. M. (2001). Genetic programming model for software quality classification. In Sixth IEEE International Symposium on High Assurance Systems Engineering 2001 (pp. 127-136). IEEE. Lu, H., Zhou, Y., Xu, B., Leung, H., & Chen, L. (2012). The ability of object-oriented metrics to predict change-proneness: A meta-analysis. Empirical Software Engineering, 17(3), 200–242. doi:10.100710664011-9170-z Malhotra, R., Bansal, A., & Jajoria, S. (2016, September). An automated tool for generating change report from open-source software. In 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI) (pp. 1576-1582). IEEE. 10.1109/ICACCI.2016.7732273 298
Malhotra, R., & Bansal, A. J. (2014, September). Cross project change prediction using open source projects. In 2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI) (pp. 201-207). IEEE. 10.1109/ICACCI.2014.6968347 Malhotra, R., & Khanna, M. (2013a). Investigation of relationship between object-oriented metrics and change-proneness. International Journal of Machine Learning and Cybernetics, 4(4), 273–286. doi:10.100713042-012-0095-7 Malhotra, R., & Khanna, M. (2013b). Inter project Validation for Change-proneness Prediction using Object-Oriented Metrics. Software engineering. International Journal (Toronto, Ont.), 3(1), 21–31. Malhotra, R., & Khanna, M. (2014). The Ability of Search-Based Algorithms to Predict Change-Prone Classes. Software Quality Professional, 17(1). Refaeilzadeh, P., Tang, L., & Liu, H. (2009). Cross-validation. In Encyclopedia of database systems (pp. 532-538). Springer US. Romano, D., & Pinzger, M. (2011, September). Using source code metrics to predict change-prone java interfaces. In 2011 27th IEEE International Conference on Software Maintenance (ICSM) (pp. 303-312). IEEE. 10.1109/ICSM.2011.6080797 Scitools. (n.d.). Using understand from the command line with Und. Retrieved from https://scitools. com/support/commandline/ Subramanyam, R., & Krishnan, M. S. (2003). Empirical analysis of ck metrics for object-oriented design complexity: Implications for software defects. IEEE Transactions on Software Engineering, 29(4), 297–310. doi:10.1109/TSE.2003.1191795 Tsantalis, N., Chatzigeorgiou, A., & Stephanides, G. (2005). Predicting the probability of change in object-oriented systems. IEEE Transactions on Software Engineering, 31(7), 601–614. doi:10.1109/ TSE.2005.83 Watanabe, S., Kaiya, H., & Kaijiri, K. (2008, May). Adapting a fault prediction model to allow inter language use. In Proceedings of the 4th international workshop on Predictor models in software engineering (pp. 19-24). ACM. 10.1145/1370788.1370794 Wikipedia. (n.d.). C4.5 Algorithm. Retrieved from https://en.wikipedia.org/wiki/C4.5_algorithm Zimmermann, T., Nagappan, N., Gall, H., Giger, E., & Murphy, B. (2009, August). Cross-project defect prediction: a large scale experiment on data vs. domain vs. process. In Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (pp. 91-100). ACM. 10.1145/1595696.1595713 Zimmermann, T., Zeller, A., Weissgerber, P., & Diehl, S. (2005). Mining version histories to guide software changes. IEEE Transactions on Software Engineering, 31(6), 429–445. doi:10.1109/TSE.2005.72
This research was previously published in International Journal of Applied Metaheuristic Computing (IJAMC), 10(1); pages 43-61, copyright year 2019 by IGI Publishing (an imprint of IGI Global).
Chapter 16
Environmental Adaption Method: A Heuristic Approach for Optimization

Anuj Chandila, IEC-CET, Greater Noida, India
Shailesh Tiwari, CSED, ABES Engineering College, Ghaziabad, India
K. K. Mishra, MNNIT Allahabad, India
Akash Punhani, ABES Engineering College, Ghaziabad, India
ABSTRACT
This article describes how optimization is a process of finding the best solution among all available solutions to a problem. Many randomized algorithms have been designed to identify optimal solutions in optimization problems. Among these algorithms, evolutionary programming, evolutionary strategy, the genetic algorithm, particle swarm optimization and genetic programming are widely accepted for optimization problems. Although a number of randomized algorithms are available in the literature for solving optimization problems, their design objectives are the same. Each algorithm has been designed to meet certain goals, such as minimizing the total number of fitness evaluations to capture nearly optimal solutions, capturing diverse optimal solutions in multimodal problems when needed, and avoiding local optimal solutions in multimodal problems. This article discusses a novel optimization algorithm named the Environmental Adaption Method (EAM) for solving optimization problems. EAM is designed to reduce the overall processing time for retrieving the optimal solution of a problem, to improve the quality of solutions, and particularly to avoid being trapped in local optima. The results of the proposed algorithm are compared with the latest versions of existing algorithms such as particle swarm optimization (PSO-TVAC) and differential evolution (SADE) on benchmark functions, and the proposed algorithm proves its effectiveness over the existing algorithms in all the taken cases.
DOI: 10.4018/978-1-7998-8048-6.ch016
1. INTRODUCTION Optimization is a process of finding out the best solutions among all available solutions for a problem. In a given domain, selection of best solution is done on the basis of objective function. An optimal solution of a given problem will have either the maximum or minimum value of objective function. Thus, an optimization problem is a search problem in which optimization algorithm is used to target optimal solutions in the space of all possible solutions, known as problem search space. This search space may be continuous or discrete. Each point in this search space represents one solution. Depending on the complexity of optimization problem deterministic or randomized version of optimization algorithms are designed. If numbers of points in the search space are less then dynamic programming can be applied to retrieve exact optimal solution. This algorithm extracts the best solution by comparing the fitness values of all possible solutions. Problems with large number of solutions will require many comparisons and it will be computationally infeasible to compare all those points to retrieve exact optimal solutions. For these NP Hard problems where even searching a better solution is very typical task, so a local optimal solution or nearly optimal solution may be very valuable. Thus these problems can be solved by local search algorithms, gradient based algorithm or randomized algorithms. Local search algorithms and gradient based algorithms use mathematical approach to target local optimal solutions. Randomized algorithms search in random direction until they are able to find out nearly optimal solutions. Many randomized algorithms such as evolutionary programming, evolutionary strategy, genetic algorithm, particle swarm optimization and genetic programming are widely acceptable for solving optimization problems. Although a number of randomized algorithms are available in literature for solving optimization problems yet their design objectives are same. Each algorithm has been designed to meet certain goals like minimizing total number of fitness evaluations to capture nearly optimal solutions, to capture diverse optimal solutions in multi modal solutions when needed and also to escape from local optimal solution in multi modal problems (Elbeltagi, Hegazy & Grierson, 2005). Proposed study discusses these objectives in detail, focus on solutions implemented by existing algorithms and finally design a novel algorithm. The design objectives of all algorithms can be explained as follows.
1.1. Convergence Rate The prime objective of a newly designed algorithm is to minimize the total number of fitness evaluations for capturing optimal solutions. This can be done by improving the convergence rate of new algorithm. Convergence rate of an algorithm denotes how fast an algorithm is approaching to the optimal solution. To improve the convergence rate of new algorithm one should know on which parameters the convergence rate depends and how it can be accelerated. After a intensive review of existing randomized algorithms, we have noticed that the convergence rate of these algorithm depends on the natural phenomena used for mapping to search optimal solution. This can be explained as follows To solve complex optimization problems, one has to design a randomized algorithm which is capable to search optimal solution. So within randomized algorithm there has to be some logical steps that can guide the search. As in most of the cases nature of objective function is not known to the user, one should choose a mapping that can automatically guide the search even in absence of any information. Mapping of Natural phenomena provide such a framework for guiding the search toward optimal solution when
search direction is not clear. This is the reason why most of these randomized algorithms are inspired by nature. So it can be inferred that searching capability of any nature inspired algorithm lies within the mapping used to implement natural phenomena. Hence convergence rate of any algorithm can be improved by mapping a new natural phenomenon that can target optimal solution as soon as possible. Even though many nature inspired algorithms exist still there is a need of new optimization algorithm that can capture optimal solution as early as possible.
1.2. Ability to Capture Multiple Optimal Solutions There are two types of single objective optimization problems uni-modal optimization problems and multi modal optimization problems. Uni-modal optimization problems have only one optimal solution whereas multi modal optimization problems may have one or more global optimal solutions. To solve these problems, randomized algorithm has to search whole search space until all optimal solutions are obtained. These algorithms perform this searching with the help of some operators. These operators are designed to explore and exploit the problem search space. Random parameters used in these operators are used to control the exploration and exploitation capability of algorithm. In uni-modal optimization problem, exploration of problem search space is done to find out the probable area where an optimal solution may lie. Once this area is recovered, exploitation of this zone is done to retrieve optimal solution. Almost all randomized algorithm are very suitable for uni-modal problems because in early generations, they explore the problem search space to find out probable regions where optimal solution may lie, then exploitation of these regions are done to retrieve optimal solutions. In multi modal problems situation is very different, here even after finding out the probable area and exploiting it, it cannot be guaranteed that the obtained solution is the global optimal solution, so problem search space is explored again and again until all probable regions are explored and all optimal solutions are extracted. So to solve these problems, randomized algorithm has to repeat many cycles of exploration and exploitation to retrieve optimal solutions. As many existing randomized algorithm like GA (Holland, 1975), DE and PSO target their search toward only one optimal solution they are well suited to uni-modal problems. These algorithms are not good in capturing multiple solutions in multi modal problems (Erol & Eksin, 2006). These algorithms should be modified when they are applied to multi modal problems. For example, Simple GA can easily handle uni-modal problems because most of the operators in GA are used for the exploitation of search space (Hoffmeister & Bäck, 1990). Only mutation operator is responsible for exploration of search space. Due to lake of exploration, GA mostly capture a local optimal point in multi modal problems (Erol & Eksin, 2006). To deal with multi modal optimization problems, sharing function theory was implemented in GA. Similarly crossover and differential operator of DE should be properly designed to capture multiple optimal solutions in multi modal problems. PSO is also not good with multi modal problems (Arumugam, Rao, & Tan, 2009; Poli, Kennedy, & Blackwell, 2007). Its capability to capture multiple optimal solutions in a multi modal problem depends on the tuning of random parameters. That is why parameter tuning should be done very carefully in PSO. An algorithm with a balance in exploitation and exploration capability will work well with both types of problems.
1.3. Proper Tuning of Random Parameters Nature inspired algorithms use some random parameters for guiding their search toward optimal solutions. Like in GA random parameters are probability of crossover and mutation(Goldberg, 1989), DE
use control parameters F, CR (Mezura-Montes, Velázquez-Reyes, & Coello Coello, 2006; Qin, Huang, & Suganthan, 2009) and PSO use w,c1 and c2 (Fan & Shi, 2001; Shi & Eberhart, 1998). Proper Tuning of these parameters can improve the convergence rate of the algorithm. It may also remove the problem of stagnation which is vital for multimodal problems. Problem of stagnation in case of optimization problem is that when, we are searching for the maxima and minima for the specific problem, there may be a local maxima or local minima where the search algorithm may stuck and interpret it as the global solution which is not the actual case. At present, Proper tuning of these parameters in GA,DE and PSO has been done by a experienced programmer who has a lot of experience. However in some circumstance experts may be unavailable, this may cause frustrating results for programmer. To overcome this problem, one can design an auto learning algorithm in which this parameter setting can be done automatically. So we should design a novel algorithm which has good convergence rate and it should be able to capture multiple optimal solutions in multi modal problems. Even more it should be able to remove the problem of stagnation. Thus, in this paper a novel optimization algorithm named as Environmental Adaption Method (EAM) has been proposed to solve optimization problems. EAM is designed to reduce the overall processing time for retrieving optimal solution of the problem, to improve the quality of solutions and particularly to avoid being trapped in local optima. A primary version of this algorithm has appeared in ICCCT-2011 (Mishra, Tiwari, & Misra, 2011). Convergence rate: As the overall processing time depends on the convergence rate of algorithm.EAM maps a new natural phenomenon which is able to provide good convergence rate. This algorithm is able to generate nearly optimal solutions in minimum number of fitness evaluations. As we are exploring the natural phenomematechniques for the random exploration EAM comes with the idea which is very fast in nature in comparison to other techniques like Genetic algorithm in which evolve over generation where as EAM works in the same generation itself with multiple number of attempts. The convergence of EAM can be improved by problem specific tuning of constant parameter. Ability to capture multiple optimal solutions: EAM has been designed to capture multiple optimal solutions in multi modal optimization problems. EAM use Adaption operator for searching in random direction to capture multiple optimal solution. This operator explores whole search space again and again to find out good solutions. Alteration operator is used to exploit good regions. Here more focus is on exploration of search space, hence this algorithm give proper weight age to all obtained optimal solutions. Hence this algorithm is more suitable for multi modal problems. Auto tuning of random parameters: Parameter tuning of this algorithm has been done automatically so that this algorithm does not face problem of stagnation and it should be able to capture multiple optimal solutions in multi modal problems.The results of the proposed algorithm are compared with the latest version of existing algorithms such as particle swarm optimization (PSO-TVAC) (Ratnaweera, Halgamuge, & Watson, 2004), and differential evolution (SADE) (Qin et al., 2009) on benchmark functions and the proposed algorithm proves its effectiveness over the existing algorithms in all the taken cases. 
Before discussing the proposed algorithm,let us take a brief overview of existing optimization algorithms.
2. BIOLOGICAL MOTIVATION BEHIND PROPOSED ALGORITHM Nature in itself is a best example to solve many real life problems in an efficient and effective manner. This is the main reason why many natural phenomena were mapped to design many optimization algo-
rithms. For example, cooling phenomena of molten metal were used in Simulated Annealing algorithm, Process of natural selection and evolution was applied in Evolutionary Algorithms and intelligence of swarms were used in PSO, ACO, ABC (Basturk& Karaboga, 2006; Karaboga & Basturk, 2007; Ch & Mathur, 2012) another major idea is based on Cuckooalgorithm and has been successfully implemented and tested on various applications (Chakraborty et al., 2017) and many other algorithms like Firefly algorithm (Dey et al., 2014). (Samanta et al., 2016) suggested the idea that employs the Evolutionary algorithm using the quantamcomputing.The examples of the appliations of Evolutionary algorithms can be studied from the sources like (Giesbrecht & Bottura, 2015; Kanungo, Nayak, Naik, & Behera, 2016; Klepac, 2015; Tang, Xie, & Xue, 2015).The modified cuckoo search has been used for biomedical image enhancement employing the McCulloch’s method (Roy et al., 2017) and similar image segmentation idea has been proposed by (Chatterjee et al., 2017a). The various hybrid algorithms are also generated that have the strength of neural network and modified cuckoo search (Chatterjee et al., 2017b; Chatterjee et al., 2017c) in this authors have used the approach for the classification of dengue data. Similar to these phenomena, the process of adaptive learning can be used to create a novel optimization algorithm. In best of our knowledge, the process of adaptive learning has never been applied to design optimization algorithm. The theory of adaptive learning was given by Mark Baldwin (Baldwin, 1896). According to the theory of adaptive learning if a population (with plastic traits) finds itself in a new environment where its individuals phenotype are not optimal then due to adaptive plasticity suboptimal individuals will acquire higher fitness. This learning process is iteratively performed until population attains optimal fitness. Hence, learning improves the chances of survival of species in a new environment. In another way, this process can be understood as follows. If a population move from one environment to another environment then, there will be some changes in the environmental conditions. This variation in environment induces variation in plastic traits (behavior). This modification of the behavior of species is mainly due to the variation in its genetic or phenotypic structure. The genetic structure of a specie store inheritable information and alteration in this structure is possible only after sexual reproduction. However the phenotypic structure store the information related to the behavior, physiology, life history, and/or morphology and variation in phenotypic expressions may take place due to change in environment. These genetic/phenotypic variations might be positive or negative, positive changes always support in survival while negative changes degrade the chances of survival. These changed structures will undergo natural selection and only positive structure will be promoted. According to the Wund (Wund, 2012), this process can be explained as follows “under novel environmental conditions, some changes in behavior, physiology, life history, and/or morphology might be beneficial, allowing at least some individuals to persist until natural selection further enhances a population’s mean fitness”. Since this concept automatically produces an optimal structure, it can be used to design optimization algorithm. We will use this theory to design Environmental adaption method. 
Graphically, this theory can be explained by the diagram shown in Figure 1.
Figure 1. Graphical Representation of EAM
To develop novel algorithm, following assumptions have been made. We have taken a new variable environmental fitness, which is taken as the average fitness for current generation. This average fitness has been taken as a measurement to check the suitability of specie in a particular environment. All those species whose fitness is higher than environmental fitness are favored by the current environmental conditions. Those species whose fitness is less than environmental fitness will be struggling in current environmental conditions. Any changes in environmental conditions will motivate all species to update their structure so that they can achieve minimum fitness required for their survival, if they are well adapted to the existing environment than to achieve better fitness. To design optimization algorithm, we have divided the process of adaptive learning in to three steps which involves adaption, alteration and selection. In adaption step, new environmental conditions motivate the initial population to update their phenotypic structure. However, during phenotypic changes some alteration may occur in solutions due to environmental noise. Structures generated after adaption and alteration form intermediate population. Initial structures are then combined with intermediate structures for selection. Those structures which do not contribute in new environment will be demised. Those structures which survive in the new environment will design a new mean fitness. Since these changes are taking place in the single life span, the possibility of getting good solution is less time is very high. To understand how adaptive learning is helpful in generating better individuals. Consider this example of dengue mosquito. A simple mosquito has adapted and modified its phenotypic structure resulting into a special type of mosquito class named as Adeas. The fast development rate has become a danger to the human survival. These changes in the mosquitos are the aftereffect of changing environmental conditions. In the past, the mosquito enjoys less mortality rate as there were no means by which humans could kill mosquito automatically. Now, seeing the seriousness problem doctors/biologist have developed coils and vaporizers with pesticides to kill mosquitoes, mosquito has to refine its internal phenotypic structure to survive in new environmental conditions. Again the refinement rate was so high that within two months a new mosquito was able to adapt to the changed environment.
3. PROPOSED APPROACH Alike GA and PSO, this algorithm is also a population based algorithm. This algorithm initially generated the random process which is processed using three operators. The first operator is named as an adaption operator. The adaptation operator is responsible for updating the current generation phenotypic structure taking the account of current fitness and environmental fitness. Second operator introduces the alteration in the existing population to mimic the environmental noise. The resultant population after application both of these operators is referred as intermediate generation. The best N individuals are selected from the intermediate generation and initial generation with the help of selection operator. The operators are applied to generate new generations until the maximum generation count is reached or the desired solution has been generated. Working of EAM can be explained by Figure 2. Even though the algorithm seems to similar to evolutionary algorithm but main difference between the EAM and EA is that in EA we have the recombination operator which is basically the mixing of the two best solutions but here the inspiration is not derived from the single individual here the effect is dependent on the complete generation as we have considered Favg in the adaptation operator. The details of which can be seen in the adaptation operator. Secondly the Mutation operator is quite simple in comparison to that of then alteration operator. In alteration operator we have probability of flipping the each bit rather than for the population will go under the changes or not. EAM is also different from the advance algorithms like Cuckoo search algorithm. From the flowchart EAM we can see that the number of operations are fixed as every individual will undergo the adaption and alteration operator but in the case of Cuckoo search algorithm as described by the (Chakraborty et al 2017) the previous nests are preserved if the probability K is less than the threshold and will lead to the skipping of the some generation which will be directly propotional to Rate of discovery of alien eggs. Figure 2. Working of EAM
Box 1. Proposed Algorithm (in pseudo code)

Input: MAX_G (maximum number of generations), P_S (population size)
Output: Q* (final set)
Other variables: n (generation counter), O' (intermediate offspring), O (final offspring), P (temporary pool), Pin (ith individual of the nth generation)

Step
1. Set n = 0.
2. Generate initial population POP0 of P_S random individuals.
3. Apply the adaption operator (calculation given below) to each member i of population POPn to form O'n. Here Pin represents the decimal value of the binary string used to encode the ith individual of the nth generation:

   O'in = ( α * (Fn(Pin) / FAvg) * (decoded decimal value of the binary coding of Pin) + β ) % 2^l    (1)

   Apply the alteration operator, as per the probability of alteration, to O'n to obtain On.
4. Pn = { POPn ∪ On }
5. Increment n.
6. If n < MAX_G, go to Step 3; otherwise stop.
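A rough MATLAB sketch of one EAM generation for a binary-coded population, following the three operators described above, is given below. The constants α and β, the alteration probability, the bit-decoding scheme and the assumption of a maximization problem are illustrative choices, not the authors' implementation.

% One EAM generation for an l-bit binary-coded population (illustrative sketch)
% pop : P_S-by-l matrix of bits; fitfun : vectorized fitness handle on decoded values
alpha = 1; beta = 1; Pa = 0.05;                      % assumed constants and alteration probability
l  = size(pop, 2);
wb = 2.^(l-1:-1:0)';                                 % weights for binary decoding
dec  = pop * wb;                                     % decoded decimal value of each individual
F    = fitfun(dec);                                  % fitness of each individual
Favg = mean(F);                                      % environmental fitness (average fitness)

% Adaption operator (Equation 1)
decNew = mod(round(alpha .* (F ./ Favg) .* dec + beta), 2^l);
Oprime = zeros(size(pop));
for k = 1:l
    Oprime(:, k) = bitget(decNew, l - k + 1);        % re-encode to binary
end

% Alteration operator: flip each bit with probability Pa (environmental noise)
O = double(xor(Oprime, rand(size(Oprime)) < Pa));

% Selection: keep the best P_S structures among parents and intermediate offspring
allPop = [pop; O];
allFit = fitfun(allPop * wb);
[~, order] = sort(allFit, 'descend');                % assuming a maximization problem
pop = allPop(order(1:size(pop, 1)), :);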
The Luce Model
The Luce model is a first probabilistic choice model that incorporates the boundedly rational choice of customers (Kim & Yoon, 2004; Qi, Zhang, Zhang, & Shi, 2006; Shapiro, 2001). With this model, customers can choose the operator that will maximize their profit by choosing the one that has the maximum probability; but forcing this choice is inadequate in this area, given the existence of hidden information that has not been represented in this model. The following equation represents the probability that the customer chooses SPi:

ρi(p, q) = e^(ui/λ) / Σ_{j=1..N} e^(uj/λ)    (2)
with 𝜆∈[0,1] is the degree of customer irrationality, N is the number of SPs and p and q are respectively the vector of price and QoS. When 𝜆 tends to 0, means that customer behavior is rational, i.e., they have all the information and rules that allow them to maximize their profit, while customers are irrational when 𝜆 approaches 1.
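To make the customer choice model concrete, a direct MATLAB transcription of the choice probabilities, together with the demand model that follows, is sketched below; the numeric inputs shown in the comment are purely illustrative.

% Luce (logit) choice probabilities and demand for N service providers
% p, q : 1-by-N price and QoS vectors; alpha, lambda, n : model parameters
% e.g., p = [5 6]; q = [0.7 0.8]; alpha = 10; lambda = 0.5; n = 1000;
u   = alpha .* q - p;            % theoretical customer benefit ui = alpha*qi - pi
rho = exp(u ./ lambda);
rho = rho ./ sum(rho);           % probability of choosing each SPi (Equation 2)
D   = n .* rho;                  % demand of each SPi (Equation 3)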
Demand Model Di
We consider a market of size n (the total number of customers). The demand for operator i's services, Di, is the probability that a customer selects the operator multiplied by the size n of the market. It is expressed by:

Di(p, q) = n.ρi(p, q)    (3)
Theoretical Quality of Service
We consider Delayiu, the time required for data transmission from SPi to a user u. In telecommunications this time is expressed (M. Baslam, El-Azouzi, Sabir, & Echabbi, 2011) as a function of the bandwidth Φi available at SPi and the demand Di:

Delayiu = 1 / (Φi − Di)    (4)
This means that when demand increases the delay increases, and conversely, when the bandwidth increases the delay becomes less important. This proportionality is logical since:
• As demand increases, the number of customers connected to SPi becomes large and thus the delay becomes more important.
• As the bandwidth increases, the SP largely has the capacity to cover all customers and therefore the delay becomes smaller.
In the model of L. Kleinrock (Kleinrock, 1975) with queues, the quality of service (QoS) is the inverse of the total response time when the user wants to access the service. Let ci be the deadline for transmission of data between the service provider and SPi; the total response time accumulates ci and Delayiu. Thus the quality of service is expressed by the following equation (as in (Mohamed Baslam, Echabbi, El-Azouzi, & Sabir, 2012)):

qi = 1 / (Delayiu + ci)    (5)
From the two equations (4) and (5), we obtain the relationship between the quality of service, the demand Di(p,q) = n.ρi(p,q), and the bandwidth Φi, given by the following equation (M. Baslam et al., 2011):

qi = (Φi − Di) / (1 + ci(Φi − Di))    (6)
or, equivalently, by the following equation:

Φi = Di + qi / (1 − qi.ci)    (7)
From equation (6), we can deduce that as the demand of SPi approaches the whole bandwidth, the QoS decreases.
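As a quick check of these relations, the following MATLAB lines compute the theoretical QoS from the bandwidth, the demand and the transmission deadline; the numeric values are only an example.

% Theoretical QoS of SPi as a function of bandwidth, demand and deadline ci
Phi = 100; Di = 60; ci = 0.01;              % illustrative values
delay  = 1 / (Phi - Di);                    % Equation (4)
qi     = 1 / (delay + ci);                  % Equation (5)
qi_eq6 = (Phi - Di) / (1 + ci*(Phi - Di));  % Equation (6), equal to qi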
Problem Formulation
From equation (1), the theoretical benefit of a user is:

ui(qi, pi) = α.qi − pi    (8)

and the real benefit is:

uiR(qiR, piR) = α.qiR − piR    (9)

where qiR is the real QoS and piR is the real price.
Assumption 1: In telecommunications, there is in general no difference between the promotional (theoretical) price and the real price that the user pays when settling the invoice. Therefore, in our study we assume that pi = piR. Taking this assumption into consideration, the difference between the theoretical and real benefit becomes:

ui − uiR = α.qi − pi − (α.qiR − piR) = α(qi − qiR)
Resource Management Model
This model is used to help operators in the telecommunications field manage their resources so that the difference between what they offer and advertise and what the customer actually perceives is optimal. It is also a tool for customers who want to register with the operator that best meets their needs.

min Fi(p, q) = (1/2).α.ρi.(qi − qiR)²

under the constraints:
pi < Pmax
qmin < qiR < qi
The first constraint is related to the customer's purchasing power. The second constraint formulates the customer's needs in terms of real QoS: it must meet a minimum threshold and it should not exceed the theoretical QoS.
Discrete Choice Model of Customers
This model lets customers know the weight (sincerity) of all operators in the telecommunications market. The problem is formulated as the following multi-objective model:

min G(p, q) = ( F1(p, q), F2(p, q), …, FN(p, q) )

under the constraints:
pi < Pmax,        for all i ∈ {1, …, N}
qmin < qiR < qi,  for all i ∈ {1, …, N}

To solve the multi-objective problem (MOP), we must transform it into a single weighted objective. For this, we applied the aggregation method, and the result of the transformation is:

min G(p, q) = Σ_{i=1..N} γi.Fi(p, q)

under the constraints:
0 < γi < 1 and Σ_{i=1..N} γi = 1
pi < Pmax,        for all i ∈ {1, …, N}
qmin < qiR < qi,  for all i ∈ {1, …, N}

and
Fi(p, q) = (1/2).α.ρi.(qi − qiR)²,   i = 1, 2, …, N
GENETIC ALGORITHM Genetic Algorithms (GAs) developed by Holland (Holland, 1992) and his student Goldberg (Shapiro, 2001), are based on the mechanics of natural evolution and natural genetics(Michalewicz, 1996)-(Deep & Thakur, 2007). GAs differ from usual inversion algorithms because they do not require a starting value. The GAs use a survival-of-the-fittest scheme with a random organized search to find the best solution to a problem. Solve an optimization problem is find the optimum of a function from a finite number of choices, often very large. The practical applications are numerous, whether in the field of industrial production, transport or economics - wherever there is need to minimize or maximize digital functions in systems simultaneously operate a large number of parameters. Algorithm (1) represents the genetic algorithm used to optimize the models proposed in this work. Algorithm 1: Genetic Algorithm 1. Initialize the initial population P. 2. Evaluate P(t). 3. While No convergence do 4. a. P(t+1)= Selection of Parents in P(t). b. P(t+1) = Apply Crossing Operator on P(t+1) c. P(t+1) = Apply on Mutation Operator P(t+1) d. P(t) = Replace elders of P(t) Descendants of their P(t+1) e. Evaluate P(t) 5. End while
NUMERICAL RESULTS In this section, we present the numerical results obtained by assuming that we have SPs in this telecommunications market. We use the genetic algorithm with the parameters that will allow us to obtain the optimal solution for our proposed models.
The Real Quality of the Function Study Study of a Limited Case In the telecommunications market, the real quality of service is a function that depends on the bandwidth Φi and the demand Di of SPi. In reality, we know that when the bandwidth increases, the real quality of
service (QoS) increases, and vice versa; also, when demand increases, the real service quality decreases, and it increases when demand decreases. In this context, we observed that the real quality can be expressed as a polynomial of degree 2 in the variable xi, the ratio between Φi and Di, as follows:

qiR(Φi, Di) = αi1.(Φi/Di)² + αi2.(Φi/Di) + αi3
We used the genetic algorithm (with the parameter settings of Table 1) to find these coefficients for different values of the bandwidth Φi and of the demand Di. Figures 1 and 2 show the influence of Φi and Di, respectively, on the QoS (theoretical and real).

Table 1. Genetic algorithm parameters for the results of Figures 1 and 2

Parameter                     | Value
Population size N             | 16
Type of selection             | roulette wheel
Type of crossover             | single-point
Probability of crossover Pc   | 0.7
Type of mutation              | uniform
Probability of mutation       | 0.05
Maximum number of generations | 100
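One possible way to search for these coefficients is sketched below; it reuses the simple_ga routine sketched earlier (any GA would do), with the population size, crossover and mutation probabilities of Table 1. The sampled (Φi/Di, qiR) pairs and the coefficient bounds are illustrative assumptions, not data from the chapter.

% Fit qiR(Phi, Di) = a1*(Phi/Di)^2 + a2*(Phi/Di) + a3 by minimizing the squared error
xs    = [1.2 1.5 2.0 3.0 5.0];           % sampled ratios Phi/Di (assumed data)
qReal = [0.30 0.45 0.60 0.75 0.85];      % observed real QoS at those ratios (assumed data)
sse   = @(a) sum((a(1)*xs.^2 + a(2)*xs + a(3) - qReal).^2);
coeff = simple_ga(sse, 3, [-1 -1 -1], [1 1 1], 16, 100, 0.7, 0.05);
qFit  = @(x) coeff(1)*x.^2 + coeff(2)*x + coeff(3);   % fitted real-quality polynomial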
From Figures 1 and 2, we note that under changing bandwidth Φi and demand Di, the genetic algorithm was able to find good polynomial coefficients that minimize the gap between what is theoretical and what is real. In the next part, we do not restrict ourselves to the case presented above, but extend the study of the variation of the real quality using a discretization of the domain of definition of the theoretical quality, seeking at each point the value of the real quality by solving the resource management model.
Study of a General Case by Discretization
The theoretical quality of an SPi may vary within a range delimited by a minimum and a maximum value: qit ∈ [qmin, qmax]. To perform a numerical resolution, we discretize this interval (Figure 3), that is, we turn the problem into an approximate (discrete) one and look for the values of the real quality qiR at each point of the discrete domain, where h is a positive regular step.
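A minimal sketch of this discretization is given below; the bounds and the solver call are assumptions (the step h = 1/10 matches Table 3), and solveResourceModel stands for whatever routine minimizes the resource management model at a given theoretical quality.

% Discretize the theoretical-QoS interval and solve for the real QoS at each grid point
qmin = 0.1; qmax = 0.9; h = 1/10;            % illustrative bounds; step as in Table 3
qGrid = qmin:h:qmax;
qR = zeros(size(qGrid));
for k = 1:numel(qGrid)
    qR(k) = solveResourceModel(qGrid(k));    % assumed solver for the resource management model
end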
A) Study of the Impact of Bandwidth on qt and qr
We ran the genetic algorithm, programmed in Matlab with the parameters listed in Table 2, on the resource management model while varying the bandwidth Φi, and we obtained the results shown in Figures 4 and 5 and Table 3.
Figure 1. Variation of qit and qiR in terms of Φi
Figure 2. Variation of qit and qiR in terms of Di
Figure 3. Discretization of an interval
Table 2. Genetic algorithm parameters for the results of Figures 4 and 5

Parameter                     | Value
Population size N             | 20
Type of selection             | roulette wheel
Type of crossover             | multi-point
Probability of crossover Pc   | 0.65
Type of mutation              | non-uniform
Probability of mutation       | 0.05
Maximum number of generations | 300
Figure 4 shows the decrease of the fitness function over the iterations (generations) of the genetic algorithm. The objective function value starts around 10^3 in the first generation and reaches 10^-4 by the 253rd generation. This result shows that the decrease is remarkable. From Figure 5, we note that to achieve this minimum value the algorithm needs to run up to generation 253.

Figure 4. Decrease in the objective function over the generations
Figure 5. Changes in the real quality of service over the generations
Table 3. Convergence results of the genetic algorithm (variation of Φi)
Number of generations: 253
Minimum cost: 2.517*10^-4
Step of discretization h: 1/10
B) Impact of Demand on qt and qR
We ran the genetic algorithm, with the parameters listed in Table 4, on the resource-management model with variation of the demand Di, and we obtained the results shown in Figures 6 and 7 and in Table 5.
Table 4. Genetic algorithm parameters for the results of Figures 6 and 7
Population size N: 10
Selection type: roulette wheel
Crossover type: multi-point
Crossover probability Pc: 0.60
Mutation type: non-uniform
Mutation probability: 0.05
Maximum number of generations: 200
Figure 6. Decrease in the Objective function according generations
Figure 7. Changes in the real quality of service over the generations
Figure 6 shows the decrease of the fitness function over the iterations (generations) of the genetic algorithm. The objective function value starts near 10^2 in the first generation and reaches 5.694*10^-4 in the 167th generation. This result shows that the decrease is remarkable. From Figure 7, we note that to achieve the same minimum value, the algorithm needs to run up to the 167th generation.
Table 5. Convergence results of the genetic algorithm (variation of Di)
Number of generations: 167
Minimum cost: 5.694*10^-4
Step of discretization h: 1/20
Model Resolution of the Weight Calculation
In this part, we consider a telecommunication network system with two service providers. We use the discrete-choice model of customers to find their weights in this market while varying the bandwidth Φi and the demand Di of an operator. The calculation of these weights is a kind of decision support for customers seeking to subscribe to the services of the most sincere operator (the one in which they have more confidence, in the sense of the smallest difference between qt and qR).
Impact of Bandwidth Φ on the Weight of Operators
We vary the bandwidth of operator 1 and observe the influence on its weight and on that of its adversary. Figure 8 shows that the weight of operator 1 is an increasing function of the bandwidth φ1, while the weight of its adversary is a decreasing function of φ1. This result is consistent, since the increase in bandwidth φ1 improves the real QoS q1R, and therefore operator 1 gains a better reputation and a better weight than its adversary.
Impact of Demand D on the Weight of Operators
We vary the demand of operator 1 and observe the influence on its weight and on that of its adversary. Figure 9 shows that the weight of operator 1 is a decreasing function of the demand D1, while the weight of its adversary is an increasing function of D1. This result is consistent, because the increase in demand D1 degrades the real QoS q1R, and therefore operator 1 loses reputation and weight relative to its adversary.
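The chapter does not restate the weight formula at this point; the following sketch assumes a Luce/logit-style discrete-choice model in which a provider's weight grows with its real QoS (the chapter cites Morgan, 1974, on Luce's choice axiom). The utility form, the sensitivity parameter beta, and the coefficient values are illustrative assumptions rather than the chapter's exact model.

```python
# Illustrative sketch (assumed Luce/logit-style choice weights): each provider's
# weight grows with its real QoS, which improves with bandwidth and degrades
# with demand. Parameter values are placeholders.
import math

def real_qos(phi, d, a1=0.02, a2=0.3, a3=0.1):
    x = phi / d
    return a1 * x ** 2 + a2 * x + a3          # quadratic model of the real QoS

def weights(providers, beta=2.0):
    """providers: list of (bandwidth, demand); returns normalised choice weights."""
    utils = [math.exp(beta * real_qos(phi, d)) for phi, d in providers]
    total = sum(utils)
    return [u / total for u in utils]

# Increasing SP1's bandwidth raises its weight and lowers its rival's (cf. Figure 8).
for phi1 in (5, 10, 15, 20):
    print(phi1, [round(w, 3) for w in weights([(phi1, 10.0), (10.0, 10.0)])])
```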
CONCLUSION
In this chapter, we have studied the problem of real quality, which is the key to rational customer decision-making in telecommunications networks. We proposed an inverse problem reformulated in the form of optimization models. These models allow customers to identify the service provider that best meets their requirements, given the demand and the bandwidth they use. It is also a modeling that allows service providers to adapt their bandwidth to user demand so that the difference between what is theoretical and what is perceived is optimal. We used the genetic algorithm for the numerical solution of the optimization models; this choice is due to the efficiency it shows in the field of global optimization.
Figure 8. Change in weight of SPs increasing Φ1
Figure 9. Change in weight SPs increasing D1
In future work, we aim to merge our model with the study of cache in information centric networks (Garmani, Baslam, & Jourhmane, 2018) to understand the impact of caching on the quality of service perceived by customers.
REFERENCES
Ait Omar, D., Garmani, H., El Amrani, M., Baslam, M., & Fakir, M. (2019). A Customer Confusion Environment in Telecommunication Networks: Analysis and Policy Impact. International Journal of Cooperative Information Systems, 28(02), 1930002. doi:10.1142/S021884301930002X
Ait Omar, D., Outanoute, M., Baslam, M., Fakir, M., & Bouikhalne, B. (2017). Joint Price and QoS Competition with Bounded Rational Customers. In A. El Abbadi & B. Garbinato (Eds.), Networked Systems (pp. 457–471). Springer International Publishing. doi:10.1007/978-3-319-59647-1_33
Baslam, M., Echabbi, L., El-Azouzi, R., & Sabir, E. (2012). Joint Price and QoS Market Share Game with Adversarial Service Providers and Migrating Customers. In R. Jain & R. Kannan (Eds.), Game Theory for Networks (pp. 642–657). Springer Berlin Heidelberg.
Baslam, M., El-Azouzi, R., Sabir, E., & Echabbi, L. (2011). Market share game with adversarial Access providers: A neutral and a non-neutral network analysis. International Conference on NETwork Games, Control and Optimization (NetGCooP 2011), 1-6.
Bishop, C. M. (2006). Pattern recognition and machine learning. New York: Springer.
Coucheney, P., Maille, P., & Tuffin, B. (2013). Impact of Competition Between ISPs on the Net Neutrality Debate. IEEE eTransactions on Network and Service Management, 10(4), 425–433. doi:10.1109/TNSM.2013.090313.120326
Deep, K., & Thakur, M. (2007). A new mutation operator for real coded genetic algorithms. Applied Mathematics and Computation, 193(1), 211–230. doi:10.1016/j.amc.2007.03.046
Garmani, H., Baslam, M., & Jourhmane, M. (2018). Caching Games between ISP in Information Centric Network. International Journal of Control and Automation, 11(4), 125–142. doi:10.14257/ijca.2018.11.4.12
Holland, J. H. (1992). Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control and Artificial Intelligence. Cambridge, MA: MIT Press. doi:10.7551/mitpress/1090.001.0001
Kim, H.-S., & Yoon, C.-H. (2004). Determinants of subscriber churn and customer loyalty in the Korean mobile telephony market. Telecommunications Policy, 28(9), 751–765. doi:10.1016/j.telpol.2004.05.013
Kleinrock, L. (1975). Queueing Systems. New York, NY: Wiley-Interscience.
Lorkowski, J., & Kreinovich, V. (2018). Bounded Rationality in Decision Making Under Uncertainty: Towards Optimal Granularity. Retrieved from https://www.springer.com/gp/book/9783319622132
Michalewicz, Z. (1996). Genetic Algorithms + Data Structures = Evolution Programs (3rd ed.). Retrieved from https://www.springer.com/gp/book/9783540606765
Morgan, B. J. T. (1974). On Luce's choice axiom. Journal of Mathematical Psychology, 11(2), 107–123. doi:10.1016/0022-2496(74)90002-9
Qi, J., Zhang, Y., Zhang, Y., & Shi, S. (2006). TreeLogit Model for Customer Churn Prediction. 2006 IEEE Asia-Pacific Conference on Services Computing (APSCC'06), 70-75. doi:10.1109/APSCC.2006.111
Shapiro, J. (2001). Genetic Algorithms in Machine Learning. In G. Paliouras, V. Karkaletsis, & C. D. Spyropoulos (Eds.), Machine Learning and Its Applications: Advanced Lectures (pp. 146–168). doi:10.1007/3-540-44673-7_7
This research was previously published in Innovative Perspectives on Interactive Communication Systems and Technologies; pages 194-209, copyright year 2020 by Information Science Reference (an imprint of IGI Global).
Chapter 18
Optimal Designs by Means of Genetic Algorithms
Lata Nautiyal, Graphic Era University, India
Preeti Shivach, Graphic Era University, India
Mangey Ram, https://orcid.org/0000-0002-8221-092X, Graphic Era University (Deemed), India
DOI: 10.4018/978-1-7998-8048-6.ch018
ABSTRACT
With the advancement in contemporary computational and modeling skills, engineering design depends upon a variety of computer modeling and simulation tools to hasten the design cycles and decrease the overall budget. The most difficult design problems include various design parameters along with the tables. Finding out the design space and the ultimate solutions to those problems is still one of the biggest challenges in the area of complex systems. This chapter suggests the use of Genetic Algorithms to enhance most engineering design problems. The chapter recommends that Genetic Algorithms are highly useful for increasing the high-performance areas of engineering design. It applies Genetic Algorithms to a large number of design areas and delivers a comprehensive discussion on their use, scope, and applications in mechanical engineering.
INTRODUCTION
The design process of a product includes a number of optimization problems. For example, when we design an engine controller, the fuel economy and power performance should be optimized, and both of these factors are also affected by various other parameters such as temperature, pressure, etc. Therefore, controlling fuel injection and the air-fuel ratio with reference to these parameters is an optimization problem, and a complex one. There are various issues when we deal with engineering problems. The first one is
to improve design competency. Industries have to develop good products within a time bound because of competition or requirements. The second issue is to optimize the design process. The third issue is to achieve robustness requirements. Although design problems are typically demarcated more indefinitely and have a variety of possible answers, the methodology may require backtracking as well as iteration. The design process is an evolving process, and the solution depends on unpredicted difficulties and modifications as it matures. The Wright brothers did not come to know the problems until they actually built and tested their primary gliders. The fundamental five steps for resolving design problems are as follows:
1. Describe the overall problem
2. Collect relevant information
3. Produce several solutions
4. Evaluate and choose a solution
5. Assess and execute the solution
The first phase is Describe the overall problem. Problem description step mainly consists of a list of requirements of the customer and also information like function and features of the product among other things. In the second step, appropriate material is collected for the design and functional specifications of the product. For this purpose, similar products available in the market are reviewed. When the relevant information is collected successfully then the duty of the design teams, manufacturing teams, and marketing teams is to generate large number of options to accomplish the objectives. Evaluate and choose a solution step is all about the comprehensive analysis of the solutions and results in finding out of the absolute design that finest fits the final needs of the customer. By applying this step, a sample is developed and tested against functionality to validate and possibly revise the design. This process is depicted in Figure 1.
DESIGN PROBLEMS IN ENGINEERING
As contemporary computational and modeling technologies propagate all over the world, engineering design profoundly depends on computer modeling and simulation to quicken design cycles as well as to save cost (Xiaopeng, 2007; Roosenburg & Eekeks, 1995). A complicated design problem will comprise various design constraints. Discovering the design space and finding the best solutions are still major issues and challenges for complex systems. In a product design process, many complex multi-objective optimization problems occur. For example, in designing an engine controller, appropriate fuel injection times and air-fuel ratios have to be decided to improve engine fuel economy and power performance. But engine fuel economy and power are also affected by hundreds of other engine conditions, such as intake manifold pressure, intake manifold temperature, coolant temperature, etc. How to control fuel injection time and air-fuel ratio with respect to these conditions to achieve the optimal fuel economy and power performance is an extremely complex problem. It is essential for engineers to progress the design by using simulation and optimization techniques. A large number of challenging issues arise when complex engineering problems are solved.
Figure 1. Design Process
The first issue is all about finding design efficiency. Current industries require a high-quality product in a short time because of competition or design cycle requirements. Traditional design processes can be much improved by using computational engineering tools. The second issue is how to optimize the complex design. Engineering optimization problems are normally high dimensional and have conflicting objectives. Optimization algorithms need to be introduced to help explore the design space and find the optimal solution. The third issue is how to meet robustness requirements. Engineering design always has uncertainties due to manufacturing tolerance and perturbation in real operation. These three issues are the main focus of this dissertation. Rapid prototyping helps to speed up the design process and explore research and development ideas. Engineers are able to build complex computational models to simulate many physical dynamics, such as combustion dynamics, fluid dynamics, and vibration dynamics. Model accuracy has been improving as we understand more about the system and computational power is enhanced. Industries are able to study prototyping before any manufacturing production happens. However, even with the help of computational modeling, the design process is a long and tedious procedure and requires a lot of experiments and simulations to explore the design concept. How to improve design process efficiency is still one of the major challenges in the current industrial world.
GENETIC ALGORITHM
A Genetic Algorithm (GA) starts with a population of encoded solutions and leads the population toward an optimal solution. Therefore, the GA searches a space of possible solutions and finds the best solution.
GA algorithms belong to the class of zero-order optimization methods. The three main operations of these algorithms are selection, mutation, and crossover. Selection chooses individuals, called parents, that contribute to the population of the next generation. The mutation operation mutates distinct parents to generate children. The last operation joins two parents and generates children for the future generation. The fittest survive and the unfit die out (Tom, 2017). The success of a Genetic Algorithm depends on the selection of parameters and genetic operators. GAs belong to the class of evolutionary algorithms (Pisinger & Ropke, 2007). A GA is different from conventional algorithms in the following ways (Goldberg, 1989):
• These algorithms work on a coding of the design variables rather than the variables themselves.
• They use an objective function; no derivative is used.
• They start from a population, not from a single point.
• They can be used with discrete, integer, or continuous variables, or a mix of the three.
For example, suppose the strings 10101 and 10010 are selected for crossover, and suppose we select a crossover point of 2. Crossover is executed, and then the next iteration starts with selection. This practice carries on until a specific criterion is met.
1st Parent = 1 0 ¦ 1 0 1
2nd Parent = 1 0 ¦ 0 1 0
1st Child = 1 0 0 1 0
2nd Child = 1 0 1 0 1
There are three fundamental operators found in every genetic algorithm (Khan & Bajpai, 2013):
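A minimal sketch of the single-point crossover just described; the function name is ours.

```python
# Single-point crossover on bit strings, matching the example above (point = 2).
def single_point_crossover(p1: str, p2: str, point: int):
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

c1, c2 = single_point_crossover("10101", "10010", 2)
print(c1, c2)   # -> 10010 10101, the two children shown above
```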
Reproduction
In this operation, individuals are selected for generating the next generation. The selection of an individual is based on the fitness value of that particular individual, which is calculated by using a fitness function. For every generation, the breeding operator selects individuals that are placed into a mating pool. Random immigrants bring some entirely new, randomly generated elements into the gene pool (Branke, 2000).
Crossover
In terms of biology, crossover refers to mingling the chromosomes of the parents to generate new chromosomes for the descendants. Analogous to biological crossover, the GA also chooses two individuals randomly and then checks whether the crossover operation should be performed by using a factor called the crossover probability. If the test against the crossover probability fails, these two individuals are simply copied to the next generation; if the crossover operation is performed, then a crossover point is chosen randomly, and the new individuals created are placed in the next generation. A uniform crossover operator probability of 0.5 is recommended by various works, such as Syswerda (1989) and Spears and De Jong (1991).
Mutation
In terms of biology, mutation refers to changing the nucleotide arrangement of the genome. Unrepaired damage to DNA or RNA genomes, or faults in the course of replication, are the causes of mutation. The mutation operation of a GA is a genetic operation in which an operator is used to preserve diversity from generation to generation. It basically changes one or more genes in a chromosome from their original state and may result in a totally different solution; hence, a mutation operator has a great impact on the solution. It occurs during evolution according to a user-defined probability of mutation. The mutation probability should not be high, because if it is, the search process starts moving towards a basic random search (Goldberg, 2006; Yeh et al., 2004). Upholding and introducing variety is the main goal of the mutation operation. It should avoid local minima by stopping the population of individuals from becoming too alike to each other; therefore, it slows down the process of evolution. This logic also clarifies why many GA systems avoid selecting only the finest of the individuals in producing the next generation, and instead use a random selection with a weighting toward those that are fitter.
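A minimal sketch of the bit-flip mutation described above, applied gene by gene with a user-defined probability pm; the probability value is an illustrative choice.

```python
# Bit-flip mutation: each gene is flipped independently with probability pm.
import random

def mutate(chromosome: str, pm: float = 0.05) -> str:
    return "".join(
        ("1" if g == "0" else "0") if random.random() < pm else g
        for g in chromosome
    )

random.seed(1)
print(mutate("10101", pm=0.05))   # usually unchanged; occasionally one bit flips
```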
• Example: Maximize the function f(x) = x^2, x ∈ (0, 31). x will be represented as a five-digit binary integer. So, we select a randomly generated initial set of solutions as in Goldberg (2006):
Gene 1: 01101 (13)
Gene 2: 11000 (24)
Gene 3: 01000 (8)
Gene 4: 10011 (19)
Table 1. Reproduction
String No | Initial Population | Value of x | Fitness f(x) = x^2 | Probability | Expected Count | Actual Count
1 | 01101 | 13 | 169 | 0.14 | 0.58 | 1
2 | 11000 | 24 | 576 | 0.49 | 1.97 | 2
3 | 01000 | 8 | 64 | 0.06 | 0.22 | 0
4 | 10011 | 19 | 361 | 0.31 | 1.23 | 1
Sum | | | 1170 | 1.00 | 4.00 | 4
Average | | | 293 | 0.25 | 1.00 | 1
Maximum | | | 576 | 0.49 | 1.97 | 2
Table 2. Crossover
String No | Mating Pool | Crossover Point | Offspring After Crossover | Value of x | Fitness f(x) = x^2
1 | 0 1 1 0 ¦ 1 | 4 | 01100 | 12 | 144
2 | 1 1 0 0 ¦ 0 | 4 | 11001 | 25 | 625
3 | 1 1 ¦ 0 0 0 | 2 | 11011 | 27 | 729
4 | 1 0 ¦ 0 1 1 | 2 | 10000 | 16 | 256
Sum | | | | | 1754
Average | | | | | 439
Maximum | | | | | 729
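The tables above can be reproduced in spirit with a compact GA. The following sketch strings together fitness-proportionate (roulette) reproduction, single-point crossover, and bit-flip mutation for the same f(x) = x^2 example; the random seed, mutation rate, and number of generations are illustrative choices, not values prescribed by the chapter.

```python
# Compact sketch of the worked example: maximise f(x) = x^2 for x in (0, 31)
# with 5-bit strings, roulette reproduction, single-point crossover and
# bit-flip mutation.
import random

def fitness(s): return int(s, 2) ** 2

def roulette(pop):
    # small epsilon keeps the weights valid even if every fitness were zero
    return random.choices(pop, weights=[fitness(s) + 1e-9 for s in pop], k=1)[0]

def crossover(p1, p2):
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(s, pm=0.01):
    return "".join(("1" if b == "0" else "0") if random.random() < pm else b for b in s)

random.seed(3)
population = ["01101", "11000", "01000", "10011"]   # the initial population above
for generation in range(10):
    mating_pool = [roulette(population) for _ in population]
    next_pop = []
    for i in range(0, len(mating_pool), 2):
        c1, c2 = crossover(mating_pool[i], mating_pool[i + 1])
        next_pop += [mutate(c1), mutate(c2)]
    population = next_pop
    print(generation, max(population, key=fitness))   # best string per generation
# over the generations the best string tends toward 11111 (x = 31)
```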
VARIANTS OF GENETIC ALGORITHMS
Real Coded Genetic Algorithm
The Real Coded Genetic Algorithm (RCGA) holds a large number of benefits compared with its binary-coded equivalent when handling continuous search spaces with large dimensions where great numerical precision is needed. In an RCGA, each gene signifies a different variable of the problem, and the size of the chromosome is kept the same as the length of the solution to the problem. Consequently, an RCGA can handle large domains without the precision sacrifice of the binary implementation. Additionally, the RCGA retains the ability for local tuning of the solutions; it similarly permits assimilating domain knowledge so as to improve the performance of the GA.
Binary Coded Genetic Algorithm
A Binary Coded Genetic Algorithm is a probabilistic search algorithm that repetitively transforms a group (called a population) of mathematical entities (typically fixed-length binary character strings), each with an associated fitness value, into a new population of offspring entities, using the theory of Charles Darwin's natural selection and operations that are modeled after naturally occurring genetic operations, such as crossover (sexual recombination) and mutation. Following the model of evolution, these algorithms set up a population of individuals, where each individual corresponds to a point in the search space. An objective function is applied to each individual to rate its fitness.
Differential Evolution
Differential Evolution (DE) methods replace the traditional mutation and crossover operations with alternative differential operators. Machine intelligence and cybernetics widely use these methods, which outperform the GA in various cases. As in other evolutionary algorithms, two primary processes drive the evolution of a DE population: the variation process, which permits searching diverse areas of the search space, and the selection process, which guarantees exploitation of the developed information about the fitness landscape.
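The paragraph refers to differential operators without spelling them out; the following sketch shows the widely used DE/rand/1/bin scheme, with F, CR, and the sphere objective chosen only for illustration.

```python
# Sketch of the standard DE/rand/1/bin scheme: the differential mutation
# v = x_r1 + F*(x_r2 - x_r3) replaces the usual GA mutation/crossover.
import random

def sphere(x): return sum(v * v for v in x)

def differential_evolution(obj, dim=5, pop_size=20, F=0.8, CR=0.9, gens=200):
    pop = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        for i, target in enumerate(pop):
            r1, r2, r3 = random.sample([p for j, p in enumerate(pop) if j != i], 3)
            j_rand = random.randrange(dim)          # at least one mutated component
            trial = [r1[j] + F * (r2[j] - r3[j])
                     if random.random() < CR or j == j_rand else target[j]
                     for j in range(dim)]
            if obj(trial) <= obj(target):           # greedy one-to-one selection
                pop[i] = trial
    return min(pop, key=obj)

random.seed(0)
print(differential_evolution(sphere))               # values close to the optimum (all zeros)
```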
Least Mean Square Algorithm
Least Mean Square (LMS) methods are stochastic gradient descent methods in which the filter is adapted based on the error at the current time. They are used in adaptive filters to find the filter coefficients that produce the least mean square of the error signal (the difference between the desired and the actual signal). The LMS algorithm can be implemented without squaring, averaging, or differentiation, and it is a simple and efficient process.
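A minimal sketch of the LMS update described above, w ← w + μ·e·x, used here to identify a toy two-tap system; the filter length, step size, and signal are illustrative choices.

```python
# LMS adaptive filter: update w <- w + mu * e * x, where e = d - w.x is the
# error between the desired and the actual filter output.
import random

def lms(inputs, desired, n_taps=4, mu=0.05):
    w = [0.0] * n_taps
    for k in range(n_taps - 1, len(inputs)):
        x = inputs[k - n_taps + 1:k + 1]            # current and recent samples
        y = sum(wi * xi for wi, xi in zip(w, x))    # adaptive filter output
        e = desired[k] - y                          # error signal
        w = [wi + mu * e * xi for wi, xi in zip(w, x)]
    return w

random.seed(0)
x = [random.uniform(-1, 1) for _ in range(500)]
d = [0.4 * a + 0.2 * b for a, b in zip(x, [0.0] + x)]   # unknown system to identify
print(lms(x, d))   # the last two weights approach 0.2 and 0.4
```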
Sawtooth GA
There are various methods that are designed to enhance the robustness and effectiveness of computation. The standard GA is based on selection, crossover, and mutation. A major parameter of a GA is its population size, which affects the effectiveness of computation and the robustness of the algorithm. If the population size is too small, then the solution prematurely converges to a non-optimal solution, whereas if the population size is large, then the computational effort becomes considerable. Numerous approaches have been proposed that try to increase the diversity of the population and avoid premature convergence.
WHY GENETIC ALGORITHM IN ENGINEERING?
With the growth of the engineering industry, computer modeling and simulation are profoundly used to improve the engineering design process, which is also cost effective. Finding an optimal solution to an engineering problem is still a challenging task, and genetic algorithms are a widely used approach for optimization problems. Genetic Algorithms (GAs) are heuristic search algorithms that imitate the process of biological evolution (Goldberg, 1998). Developed by Holland (1975), GAs are based on the principle given by Darwin: neither the most intelligent nor the most powerful, but the most adaptive species will survive. The search procedure of GAs finds the best and fittest design solutions. These algorithms are used in engineering because they are easy to use. Opposite to gradient methods, GAs work with a set of solutions and are guided by probability; they are not deterministic. Optimization techniques like goal programming, linear programming, and branch and bound algorithms face problems when the number of variables increases. Each optimization algorithm is intended to solve a particular type of problem. Some algorithms produce solutions accurately but are not cost efficient, whereas some algorithms do not provide an optimal solution. Hence, while solving a problem we have to compromise between accuracy and cost. Genetic algorithms are a very popular heuristic which has been successfully applied to many optimization problems (Bhoskar et al., 2015). These algorithms are part of a revolution in the computer science field and overcome the limits of other algorithms:
• GAs are most robust.
• While executing a search in a large state space, GAs offer noteworthy profits over many other techniques.
• A GA operates on a coding of the solution set, not the solutions themselves.
• GAs search from a population, not a single solution.
• The rules are not deterministic but probabilistic.
The positive side of GA is that it handles the constraints and objectives very easily. GAs can be easily applied to a wide range of jobs such as optimization, learning, etc. Below are some advantages of using GAs:
• Each optimization problem that can be defined with the chromosome encoding can be solved by using a GA.
• It solves problems with multiple solutions.
• A structural genetic algorithm gives us the possibility to solve the solution structure and solution parameter problems all at once.
• The genetic algorithm is a method which is very easy to understand and practically does not demand knowledge of mathematics.
• Genetic algorithms are easily moved to existing simulations and models.
LIMITATIONS
• Certain optimization problems (called variant problems) cannot be solved by means of genetic algorithms. This happens because of inadequate fitness functions which produce bad chromosome blocks despite the fact that only noble chromosome blocks crossover.
• There is no absolute assurance that a genetic algorithm will find a global optimum. This happens very frequently when the populations have many subjects.
• Like other artificial intelligence techniques, the genetic algorithm cannot guarantee a stable response time, which is a major reason for not applying GAs to real-time applications.
APPLYING GENETIC ALGORITHMS IN ENGINEERING DESIGN
One of the popular heuristic search algorithms is the genetic algorithm. A GA not only has all of the characteristics of heuristic algorithms, but is also a multi-directional search method. It was originally designed for single-objective optimization problems, since it uses a fitness value for evaluation. As GAs were applied to multi-objective optimization, the fitness concept was extended to dominance rank, which is created for searching the Pareto Front. Since then, GAs have become popular in multi-objective optimization areas, especially in finding the Pareto Front. GAs use dominance rank to push the population close to the Pareto Front, and it has been proved to be an effective way to explore the Pareto Front. One of the difficulties in exploring the Pareto Front is the curse of dimensionality. As the dimension of the problem increases, the Pareto Front becomes very complicated, and GAs tend to become stuck in some local Pareto Front areas. Making the optimal solutions cover the Pareto Front well and converge quickly to the optimum is what most of the research on GAs is focusing on. How to balance proximity and diversity in exploration is a multi-objective optimization problem itself. There are many studies on diversity preservation, diversity estimation, and metric comparison to improve the population diversity. Many different GAs have been presented to improve diversity and
convergence, such as RAND, FFGA, NPGA, HLGA, VEGA, and NSGA, listed by Zitzler, Deb, and Thiele (2000). There are new research ideas such as co-evolutionary GAs and Cluster-Oriented GAs (Packham & Parmee, 2000). No matter what the GAs are, they are trying to make the search converge to the Pareto Front (or close to the Pareto Front) as quickly as possible and make the population cover the Pareto Front (or close to the Pareto Front) as evenly as possible in the least computation time. Since these goals conflict with each other, we often find that any GA is a tradeoff of these goals, and it may perform well on certain problems but badly on others.
Engineering Design Using GAs Since GAs have shown excellent performance in optimization problems, especially in multiobjective optimization, engineering design optimization problems have been explored with GAs. Engineering design optimization problems normally are multi-objective problems with high dimensional design variables. They also are complicated in that system dynamics are always non-linear and with uncertainty. In addition, engineering design problems often have constraints on design variables. All these issues have been well addressed in different GAs. As engineering design becomes more and more complex in modern industry, computer modeling is one of the essential methods to achieve reducing design cycle and improve design quality. Genetic algorithms have been used in a lot of complex design problems. There have been a number of activities from developing GA software for engineering design to improving GAs for engineering design.
Robustness in Engineering Design
Robustness is the key to designing products that work in a range of conditions. From this aspect, robustness sometimes has a higher priority than optimality. Engineering design has to deal with an uncertain environment, manufacturing tolerance, and un-modeled effects. Real industry problems have shown that uncertainty can result in failure in the field. Previous research focused on uncertain objective function problems, i.e., objective functions that return different values with the same design inputs. The techniques used for these problems estimate the distribution of the objective functions. These methods have been applied to mathematical problems to deal with uncertainty. However, this approach is limited in that engineering design has to deal with uncertain design variables, especially curves. Few research efforts are oriented toward this area at present. GAs are useful for many scientific and engineering problems, including:
• Optimization: GAs have been used in a wide variety of optimization tasks like circuit design, numerical optimization, and video and sound quality optimization.
• Automatic Programming: GAs have been used to evolve computer programs for specific tasks, and to design many computational assemblies, such as cellular automata and sorting networks.
• Machine and Robot Learning: These have been used for various applications of machine learning such as classification and prediction. GAs have also been used to design learning classifier systems.
• Ecological Models: GAs have been used to model ecological phenomena such as biological arms races, host-parasite co-evolutions, symbiosis, and resource flow in ecologies.
• Population Genetics Models: GAs have been used to study questions in population genetics, such as "under what conditions will a gene for recombination be evolutionarily viable?"
• Models of Social Systems: GAs are also used to analyze evolutionary facets of social systems, such as the evolution of interaction and trail-following behavior in ants (Tom, 2017).
Application in Mechanical Engineering
GA is a search-based optimizing technique. It is used in mechanical engineering in the following ways:
• In optimizing processes like ECM, EDM, USM, etc.
• In operation sequencing and machining parameter selection.
• In designing airplane wings.
• In planning the cyclic preventive maintenance of components.
• In the physical distribution of a service; organizations use GAs to effectively deliver their services to customers.
• In optimizing production planning and production scheduling activities in engineering.
• In designing automobile suspension systems.
CONCLUSION
With the advancement in contemporary computational and modeling skills, engineering design depends upon a variety of computer modeling and simulation tools to hasten the design cycles and decrease the overall budget. The most difficult design problems include various design parameters along with the tables, and finding out the design space and the ultimate solutions to those problems is still one of the biggest challenges in the area of complex systems. This chapter suggested the use of Genetic Algorithms to enhance most engineering design problems and recommended that Genetic Algorithms are highly useful for increasing the high-performance areas of engineering design. The chapter applied Genetic Algorithms to a large number of design areas and delivered a comprehensive discussion on their use, scope, and applications in mechanical engineering.
REFERENCES Bhoskar, T., Kukarni, O. K., Kulkarni, N. K., Patekar, S. L., Kakandikar, G. M., & Nandedkar, V. M. (2015). Genetic Algorithm and Its Applications to Mechanical Engineering: A Review. 4th International Conference on Materials Processing and Characterization. Materials Today: Proceedings, 2, 2624 – 2630. Branke, J. (2000). Efficient evolutionary algorithms for searching robust solutions. ACDM. doi:10.1007/978-1-4471-0519-0_22 Goldberg, D. E. (1998). The Design of Innovation: Lessons from Genetic Algorithms. Lessons for the Real World. University of Illinois at Urbana-Champaign. Goldberg, D. E. (2006). Genetic algorithms in search, optimization & Machine learning. Pearson Education.
353
Optimal Designs by Means of Genetic Algorithms
Holland, J. (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press. Khan, M. Z. R., & Bajpai, A. K. (2013). Genetic Algorithm and Its Application In Mechanical Engineering. International Journal of Engineering Research & Technology, 2(5), 677-683. Packham, I. S. J., & Parmee, I. C. (2000). Data analysis and Visualization of Cluster-Orient Genetic Algorithm Output. Proceedings of the International Conference on Information Visualization, 173-178. 10.1109/IV.2000.859752 Pisinger, D., & Ropke, S. (2007). A General Heuristic for Vehicle Routing Problems. Computers & Operations Research, 34(8), 2403–2435. doi:10.1016/j.cor.2005.09.012 Roosenburg, N., & Eekeks, J. (1995). Product Design: Fundamentals and Methods. John Wiley & Sons Inc. Spears, W. M., & De Jong, K. A. (1991). On the Virtues of Parameterized Unıform Crossover. Proceedings of the 4th International Conference on Genetic Algorithms, 230-236. Syswerda, G. (1989). Uniform Crossover in Genetic Algorithms. Proceedings of the 3rd International Conference on Genetic Algorithms, 2-9. Tom, M. V. (2017). Transportation System Engineering, Genetic Algorithm. Indian Institute of Technology Bombay. Xiaopeng, F. (2007). Engineering design using genetic algorithms. Retrospective Theses and Dissertations. Paper 15943. Yeh, L. J., Chang, Y. C., & Chiu, M. C. (2004). Article. Journal of Marine Science and Technology, 12(3), 189–199. Zitzler, E., Deb, K., & Thiele, L. (2000). Comparison of multi-objective evolutionary algorithms: Empirical results. Evolutionary Computation, 8(2), 173–195. doi:10.1162/106365600568202 PMID:10843520
This research was previously published in Soft Computing Techniques and Applications in Mechanical Engineering; pages 151-161, copyright year 2018 by Engineering Science Reference (an imprint of IGI Global).
Chapter 19
T-Spanner Problem: Genetic Algorithms for the T-Spanner Problem
Riham Moharam, Suez Canal University, Egypt
Ehab Morsy, Suez Canal University, Egypt
Ismail A. Ismail, 6 October University, Egypt
ABSTRACT The t-spanner problem is a popular combinatorial optimization problem and has different applications in communication networks and distributed systems. This chapter considers the problem of constructing a t-spanner subgraph H in a given undirected edge-weighted graph G in the sense that the distance between every pair of vertices in H is at most t times the shortest distance between the two vertices in G. The value of t, called the stretch factor, quantifies the quality of the distance approximation of the corresponding t-spanner subgraph. This chapter studies two variations of the problem, the Minimum t-Spanner Subgraph (MtSS) and the Minimum Maximum Stretch Spanning Tree(MMST). Given a value for the stretch factor t, the MtSS problem asks to find the t-spanner subgraph of the minimum total weight in G. The MMST problem looks for a tree T in G that minimizes the maximum distance between all pairs of vertices in V (i.e., minimizing the stretch factor of the constructed tree). It is easy to conclude from the literatures that the above problems are NP-hard. This chapter presents genetic algorithms that returns a high quality solution for those two problems.
DOI: 10.4018/978-1-7998-8048-6.ch019
INTRODUCTION Let G=(V,E) be an undirected edge-weighted graph with vertex set V and edge set E such that |V|=n and |E|=m. A spanning subgraph H in G is said to be a t-spanner subgraph if the distance between every pair of vertices in H is at most t times the shortest distance between the two vertices in G. The value of t, called the stretch factor, quantifies the quality of the distance approximation of the corresponding t-spanner subgraph. The goodness of t-spanner subgraph H is estimated by either its total weight or the distance approximation of H (stretch factor t of) (D. Peleg and J. D. Ulman, 1989). We concern with the following two problem. The first problem, called the Minimum t-Spanner Subgraph (MtSS), we are given a value of stretch factor t and the problem requires to find the t-spanner subgraph of the minimum total weight in G. The problem of finding a tree t-spanner with the smallest possible value of t is known as the Minimum Maximum Stretch Spanning Tree (MMST) problem (Y. Emek and D. Peleg, 2008). The t-spanner subgraph problem is widely applied in communication networks and distributed systems. For example, the MMST problem is applied to the arrow distributed directory protocol that supports the mobile object routing (M. J. Demmer & M. P. Herlihy, 1998). In particular, it is used to minimize the delay of mobile object routing from the source node to every client node in case of concurrent requests through a routing tree. The worst case overhead the ratio of the protocol is proportional to the maximum stretch factor of the tree (see (D. Peleg & E. Reshef, 2001)). Kuhn and Wattenhofer (2006) showed that the arrow protocol is a distributed ordering algorithm with low maximum stretch factor. Another application of the MMST is in the analysis of competitive concurrent distributed queuing protocols that intend to minimize the message transit in a routing tree (M. Herlihy, et al, 2001). Also, low-weight spanners have recently found interesting practical applications in areas such as metric space searching(G. Navarro, et al, 2002) and broadcasting in communication networks (M. Farley, 2004). A spanner can be used as a compact data structure for holding information about (approximate) distances between pairs of objects in a large metric space, say, a collection of electronic documents by using a spanner instead of a full distance matrix, significant space reductions can be obtained when using search algorithms like AESA (G. Navarro, et al, 2002) For message distribution in networks, spanners can simultaneously offer both low cost and low delay when compared to existing alternatives such as minimum spanning trees (MSTs) and shortest path trees. Experiments with constructing spanners for realistic communication networks show that spanners can achieve a cost that is close to the cost of a MST while significantly reducing delay (or shortest paths between pairs of nodes) cost (A. M. Farley, et al, 2004). It is well known that the MtSS problem is NP-complete (L. Cai, 1994). For any t≥1, the problem of deciding whether G contains a tree t-spanner is NP-complete (L. Cai and D. Corneil, 1995), and consequently the MMST problem is also NP-complete. In this chapter, we present efficient genetic algorithms to these two problems. Our experimental results show that the proposed algorithms return high quality solutions for both problems.
BACKGROUND In this section, we present results on related problems.
For an unweighted graph G, (L. Cai and D. Corneil, 1995) produced a linear time algorithm to find a tree t- spanner in G for any given t≥2. Moreover, they showed that, for any t≥4, the problem of finding a tree t-spanner in G is NP-complete. (Brandsta¨dt, et al, 2007) improved the hardness result in (L. Cai and D. Corneil, 1995) by showing that a tree t-spanner is NP-complete even over chordal graphs which each created a cycle with length 3 whenever t≥4 and chordal bipartite graphs which each created a cycle with length 4 whenever t≥5. Peleg and Tendler (D. Peleg and D. Tendler, 2001) proposed a polynomial time algorithm to determine a minimum value for t for the tree t-spanner over outerplanar graphs. In (S. P. Fekete and J. Kremer, 2001), they showed that it is NP-hard to determine a minimum value for t for which a tree t-spanner exists even for planar unweighted graphs. They designed a polynomial time algorithm that decides if the planar unweighted graphs with bounded face length contains a tree t-spanner for any fixed t. Moreover, they proved that for t=3, it can be decided whether the unweighted planar graph has a tree t-spanner in polynomial time. The problem was left open whether a tree t-spanner is polynomial time solvable in case of t≥4. Afterwards, this open problem is solved by (F. F. Dragan, et al, 2010). They proved that, for any fixed t, the tree t-spanner problem is linear time solvable not only for a planar graphs, but also for the class of sparse graphs which include graphs of bounded genus. In (Y. Emek and D. Peleg, 2008) presented an O(log n)-approximation algorithm for finding the tree t-spanner problem in a graph of size n. Moreover, they established that unless P = NP, the problem cannot be approximated additively by any o(n) term. In (M. Sigurd and M. Zachariasen, 2004) presented exact algorithm for the Minimum Weight Spanner problem, they proposed an integer programming formulation based on column generation. They showed that the total weight of a spanner constructed by the greedy spanner algorithm is typically within a few percent from the optimum. Recently, (F. F. Dragan and E. K¨ohler, 2014) examined the tree t-spanner on chordal graphs, generalized chordal graphs and general graphs. For every n-vertex m-edge unweighted graph G, they proposed a new algorithm constructs a tree (2log2n)-spanner in O(mlog n) time for chordal graphs, a tree (2𝜌log2n)-spanner O(mlog2n) time or a tree (12𝜌log2n)-spanner in O(mlogn) time for graphs that confess a Robertson- Seymour’s tree-decomposition with bags of radius at most 𝜌 in G and a tree (2t/2log2n)-spanner in O(mnlog2n) time or a tree (6tlog2n)-spanner in O(mlogn) time for graphs that confess a tree t-spanner. They produced the same approximation ratio as in (Y. Emek and D. Peleg, 2008) but in a better running time.
GENETIC ALGORITHMS
In this section, we propose genetic algorithms for the two variants of the t-spanner problem: the problem of finding a tree t-spanner in a given edge-weighted graph that minimizes the stretch factor t (MMST), and the problem of finding the t-spanner subgraph with minimum total weight for a given value t (MtSS) (see Section I). We first introduce some terminology that will be used throughout this section. Let G′ be a subgraph of G. The sets V(G′) and E(G′) denote the set of vertices and edges of G′, respectively. The shortest distance between two vertices u and v in G′ is denoted by dG′(u, v). For two subgraphs G1 and G2 of G, let G1 ∪ G2, G1 ∩ G2, and G1 – G2 denote the subgraphs induced by E(G1) ∪ E(G2), E(G1) ∩ E(G2), and E(G1) – E(G2), respectively.
Algorithm Overview
The Genetic Algorithm (GA) is an iterative optimization approach based on the principles of genetics and natural selection (Andris P. Engelbrecht, 2007). We first have to define a suitable data structure to represent individual solutions (chromosomes), and then construct a set of candidate solutions as an initial population (first generation) of an appropriate cardinality pop-size. The following typical procedure is repeated until predefined stopping criteria are met. Starting with the current generation, we use a predefined selection technique to repeatedly choose a pair of individuals (parents) in the current generation to reproduce, with probability pc, a new set of individuals (offspring) by exchanging some parts between the two parents (crossover operation). To avoid local minima, we try to keep an appropriate diversity among different generations by applying the mutation operation, with a specific probability pm, to genes of individuals of the current generation. Finally, based on the values of an appropriate fitness function, we select a new generation from both the offspring and the current generation (the more suitable solutions have more chances to reproduce). Note that determining the representation method, population size, selection technique, crossover and mutation probabilities, and stopping criteria in genetic algorithms is crucial, since they mainly affect the convergence of the algorithm (see O. Abdoun, et al, 2012; J. Hesser and R. Mnner, 1991; W. Y. LIN, et al, 2001; O. Roeva, et al, 2013; K. Vekaria and C. Clack, 1998). The rest of this section is devoted to describing the steps of the above algorithm in detail.
Representation Let G=(V,E) be a given undirected graph such that each vertex in V is assigned a distinct label from the space 1,2,..,n, i.e., V={1,2,…,n}. Clearly, each edge e∈E with end points i and j is uniquely defined by the unordered pair i,j. Moreover, every subgraph of G is uniquely defined by the set of unordered pairs of all its edges. In particular, every spanning tree T in G is induced by a set of exactly n–1 unordered pairs corresponding to its edges since T is a subgraph of G that spans all vertices in V and has no cycles. Therefore, each chromosome (t-spanner subgraph) can be represented as a set of unordered pairs of integers each of which represent a gene (edge) in the chromosome.
Initial Population Constructing an initial generation is the first step in typical genetic algorithms. We first have to decide the population size pop‑size, one of the decisions that affect the convergence of the genetic algorithm. It is expected that small population size may lead to weak solutions, while, large population size increases the space and time complexity of the algorithm. Many literatures studied the influence of the population size to the performance of genetic algorithms (see (O. Roeva, et al, 2013) and the references therein). In this chapter, we discuss the effect of the population size on the convergence time of the algorithm. One of the most common methods is to apply random initialization to get an initial population.
t-Spanner Subgraph
We compute each chromosome (spanning subgraph) in the initial population by applying the following two-phase procedure.
Figure 1. The representation of chromosome
In the first phase, we repeatedly add a new edge to the subgraph constructed so far as long as the cardinality of the set of visited vertices is less than ⌈n/2⌉. Let H denote the subgraph constructed so far by the procedure (initially, H consists of a random vertex from V(G)). We first select a random vertex v ∈ V(G) from the set of the neighbors of all vertices in H, and then add the edge e = (u, v) to H (if e ∉ E(H)), where u is the neighbor of v in H. In the second phase, we repeatedly add a new vertex to the subgraph output from the first phase as long as the number of vertices of the subgraph is less than n. We first select a random vertex v ∉ V(H) from the set of the neighbors of all vertices in H, and then add the edge e = (u, v) to H, where u is the neighbor of v in H. It is easy to verify that the above procedure returns a spanning subgraph H. The generated subgraph H is added to the initial population if it is a t-spanner, and the above algorithm is repeated as long as the size of the constructed population is less than pop-size.
Tree t-Spanner
We compute each chromosome (spanning tree) in the initial population by repeatedly applying the following simple procedure as long as the cardinality of the set of traversed edges is less than n–1. Let T denote the tree constructed so far by the procedure (initially, T consists of a random vertex from V(G)). We first select a random vertex v ∉ V(T) from the set of the neighbors of all vertices in T, and then add the edge e = (u, v) to T, where u is the neighbor of v in T. It is easy to verify that the above algorithm visits the set of all vertices in the underlying graph after exactly n–1 iterations; thus, it returns a spanning tree. The generated tree T is added to the initial population (see Figure 2 for an illustration example). The above algorithm is repeated as long as the size of the constructed population is less than pop-size.
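A minimal sketch of the spanning-tree chromosome construction just described; the adjacency-dictionary graph format and the small example graph are assumptions made only for illustration.

```python
# Random spanning-tree chromosome: start from a random vertex and repeatedly attach
# a random outside neighbour of the tree until all n vertices are covered.
import random

def random_spanning_tree(adj):
    """adj: {vertex: set_of_neighbours}; returns a set of unordered edge pairs."""
    vertices = list(adj)
    tree_vertices = {random.choice(vertices)}
    tree_edges = set()
    while len(tree_vertices) < len(vertices):
        # random vertex v outside T among the neighbours of vertices in T
        outside = [v for u in tree_vertices for v in adj[u] if v not in tree_vertices]
        v = random.choice(outside)
        u = random.choice([w for w in adj[v] if w in tree_vertices])
        tree_edges.add(frozenset((u, v)))      # unordered pair, as in the encoding
        tree_vertices.add(v)
    return tree_edges

random.seed(0)
G = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
print(random_spanning_tree(G))                 # n - 1 = 3 edges
```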
Figure 2. Construct each chromosome randomly: in (a) T started with v 4 randomly and then from its neighbors v 3 selected randomly, after that e34 added to T as in (b) this procedure will be repeated n–1 times until the set of all vertices in the underlying graph is visited as in (c).
Fitness Function
A fitness function is used to evaluate each chromosome. Here, the objective function of the underlying problem is used as the corresponding fitness function. Namely, for the MtSS problem, the fitness function is the total weight of the chromosome, i.e., the fitness value of a chromosome H equals the sum of w(e) over all edges e ∈ E(H). For the MMST problem, the maximum distance ratio over all pairs of vertices in the chromosome is used as its fitness, i.e., the fitness value of T is
max over u, v ∈ V of dT(u, v) / dG(u, v),
where dT(u, v) and dG(u, v) are the distances between u and v in T and G, respectively. Note that, in both problems, we look for a chromosome of the least fitness value.
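A sketch of the MMST fitness above, computing max dT(u, v)/dG(u, v) over all vertex pairs. BFS distances are used for brevity (unit edge weights); for the edge-weighted graphs treated in the chapter, Dijkstra's algorithm would replace BFS.

```python
# MMST fitness: maximum stretch ratio of a spanning tree T with respect to G.
from collections import deque

def bfs_dist(adj, src):
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def stretch_factor(adj_g, adj_t):
    worst = 1.0
    for u in adj_g:
        dg, dt = bfs_dist(adj_g, u), bfs_dist(adj_t, u)
        for v in adj_g:
            if v != u:
                worst = max(worst, dt[v] / dg[v])
    return worst

G = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
T = {1: {2}, 2: {1, 4}, 3: {4}, 4: {2, 3}}    # a spanning tree of G
print(stretch_factor(G, T))                    # fitness of chromosome T: 3.0
```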
Selection Process In this chapter, we present three common selection techniques: roulette wheel selection, stochastic universal sampling selection, and tournament selection. All these techniques are called fitness-proportionate selection techniques since they are based on a predefined fitness function used to evaluate the quality of individual chromosomes. Throughout the execution of the proposed algorithm, the reverse of this ratio is used as the fitness function of the corresponding chromosome. We assume that the same selection technique is used throughout the whole algorithm. The rest of this section is devoted to briefly describe these selection techniques.
Roulette Wheel Selection (RWS) In the roulette wheel selection, the probability of selecting a chromosome is based on its fitness value (Andris P. Engelbrecht, 2007, A. Chipperfield, et al, 1994). More precisely, each chromosome is selected with the probability that equals to its normalized fitness value, i.e., the ratio of its fitness value to the total fitness values of all chromosomes in the set from which it will be selected. (see Figure 3 for an illustration example).
Figure 3. Example for roulette wheel selection: The circumference of the roulette wheel is the sum of all six individual’s fitness values. Individual 5 is the most fit individual and occupies the largest interval, whereas individuals 6 and 4 are the least fit and have correspondingly smaller intervals within the roulette wheel. To select an individual, a random number is generated in the interval (0, 1) and the individual whose segment spans the random number is selected. This process is repeated until the desired numbers of individuals have been selected.
Stochastic Universal Sampling Selection (SUS)
Instead of the single selection pointer used in the roulette wheel approach, SUS uses h equally spaced pointers, where h is the number of chromosomes to be selected from the underlying population (T. Blickle and L. Thiele, 1995; A. Chipperfield, et al, 1994). All chromosomes are represented on a number line in random order, and a single pointer ptr ∈ [0, 1/h] is generated to indicate the first chromosome to be selected. The remaining h–1 individuals whose fitness spans the positions of the pointers ptr + i/h, i = 1, 2, …, h–1, are then chosen (see Figure 4 for an illustration example).
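A minimal sketch of SUS as described: h equally spaced pointers over the normalized fitness line select h individuals in a single pass. The toy population and fitness values are illustrative; for the minimization problems in this chapter, the inverted fitness discussed earlier would be used as the selection weight.

```python
# Stochastic universal sampling: one random offset, h equally spaced pointers.
import random

def sus(population, fitnesses, h):
    total = sum(fitnesses)
    ptr = random.uniform(0, 1.0 / h)              # single random pointer in [0, 1/h]
    pointers = [ptr + i / h for i in range(h)]
    chosen, cumulative, idx = [], fitnesses[0] / total, 0
    for p in pointers:
        while p > cumulative and idx < len(population) - 1:
            idx += 1
            cumulative += fitnesses[idx] / total
        chosen.append(population[idx])
    return chosen

random.seed(0)
print(sus(["A", "B", "C", "D"], [4.0, 2.0, 1.0, 1.0], h=4))   # fitter items appear more often
```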
Tournament Selection (TRWS) This is a two stages selection technique (Andris P. Engelbrecht, 2007, T. Blickle and L. Thiele, 1995). We first select a set of k 1), one hidden layer, and a continuous activation function (e.g., the sigmoid function) in each neuron are universal approximators. Comparable neural networks with more than one hidden layer are, of course, universal approximators as well. 2. Fuzzy expert systems based on multi-conditional approximate reasoning can approximate feedforward neural networks with n inputs, m outputs, one or more hidden layers, and a continuous activation function in each neuron, provided that the range of the input variable is discretized into n values and the range of the output variable is discretized into m values. 3. It follows from (1) and (2) that fuzzy expert systems of the type described in (2) are also universal approximators. 4. Fuzzy input-output controllers, that is, fuzzy expert systems based on multi-conditional approximate reasoning and a defuzzification of obtained conclusions, are universal approximators. 5. Most of these results, which are of considerable theoretical significance, are thus far of limited practical value. They are primarily existence results, which do not provide us with procedures for constructing practical approximators. In the rest of this section, we describe an example proposed by Patrikar and Provence [1993] for approximating the inference engine of a simple fuzzy controller by a feed forward neural network with one hidden layer. The fuzzy controller has two input variables, e and e’, and one output variable, v, which have the meaning as per system. Seven linguistic states are distinguished for each variable, which are represented by triangular-shape fuzzy numbers equally spread over the range of values of the variable, as depicted in Figure 1. Assume, for convenience, that the variables have the same range [—6, 6], That is, the range [—a, a] in Figure 1 is interpreted for each variable as [—6, 6]. It is obvious that any given range can be normalized to this convenient range. Assume further that the 49 fuzzy rules specified in Figure 3 are all included in the fuzzy rule base of our controller.
Feed Forward Neural Network A possible structure of a feed forward neural network with one hidden layer by which the inference engine of our fuzzy controller can be approximated is sketched in Fig. 4. It is assumed that the sigmoid function is chosen as the activation function for each neuron, and the back propagation algorithm is employed for training the neural network to map a set of given input patterns into a set of desirable output patterns.
We assume that the neural network has q neurons in the hidden layer. According to available heuristic rules (no definite formulas have been developed as yet), q ≥ 6. Figure 1.
Figure 2.
Figure 3.
Inputs of the neural network are partitioned into two subsets that correspond to variables e and e′, as shown in Figure 4. Each input is assigned to a fuzzy number that represents a particular linguistic state of the respective variable. Each output is allocated to one of thirteen equally distributed discrete points in the range [-6, 6], which are the integers -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6. Its value specifies, at the given discrete point in the range [-6, 6], the membership grade of the fuzzy number that represents the output variable v.
Figure 4.
Each fuzzy inference rule is represented in the neural network by an input vector, which specifies the linguistic states of the input variables e and e′, and an output vector, which defines (at 13 discrete points) the corresponding fuzzy number of the output variable v. For example, the fuzzy inference rule
if e is AZ and e′ is NS, then v is PS
is represented in the neural network by the input vector (0, 0, 0, 1, 0, 0, 0; 0, 0, 1, 0, 0, 0, 0) and the output vector (0, 0, 0, 0, 0, 0, 0, 0.5, 1, 0.5, 0, 0, 0),
as illustrated in Figure 4. These two vectors represent, in this application of the neural network, one input-output pair of a training set. The full training set consists of all 49 fuzzy inference rules specified in Figure 3. When this training set is applied to the backpropagation algorithm, the neural network gradually learns to associate the proper fuzzy number at the output with each of the possible pairs of input linguistic states. After training, the neural network responds correctly to each input vector in the training set. To utilize the network for producing appropriate control actions, it must be supplemented with fuzzification and defuzzification modules. The role of the fuzzification module is to feed appropriate input values into the neural network for any given measurements e = x0 and e′ = y0. For general fuzzification functions fe and fe′, these values form the input vector (a1, a2, a3, a4, a5, a6, a7; b1, b2, b3, b4, b5, b6, b7), where aj expresses the degree of compatibility of fe(x0) with the antecedent e = Aj, and bj expresses the degree of compatibility of fe′(y0) with the antecedent e′ = Bj (j = 1, 2, ..., 7). Receiving an input vector of this form and meaning, the neural network produces an output vector (c−6, c−5, c−4, c−3, c−2, c−1, c0, c1, c2, c3, c4, c5, c6), by which a fuzzy set representing the conclusion of the inference is defined. This fuzzy set must be converted to a single real number by the defuzzification module. Once trained, the neural network provides us with a blueprint for a hardware implementation. Its actual hardware implementation involves massive parallel processing of information; hence, it is computationally very efficient. If desirable, the output representation of the neural network can be made more refined by dividing the range of the output variable into more discrete values and increasing the number of output neurons as needed.
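To make the construction of this training set concrete, the sketch below builds 49 input-output pairs of the kind described above in Python. It is only an illustration of the encoding, not the authors' code: the triangular membership function, the made-up rule table, and all names are assumptions introduced here (a real controller would take the consequent of each rule from Figure 3).

```python
import numpy as np

def tri(x, center, half_width=2.0):
    """Membership grade of x in a triangular fuzzy number centred at `center`."""
    return max(0.0, 1.0 - abs(x - center) / half_width)

# Seven linguistic states per variable, centred at -6, -4, ..., 6 on [-6, 6].
centers = np.arange(-6, 7, 2)          # NL, NM, NS, AZ, PS, PM, PL
output_points = np.arange(-6, 7, 1)    # 13 discrete output points

# Hypothetical rule table: rule_table[i][j] is the index of the output state
# when e is in state i and e' is in state j (Figure 3 would supply the real one).
rule_table = [[max(0, min(6, 9 - i - j)) for j in range(7)] for i in range(7)]

X, Y = [], []
for i in range(7):              # linguistic state of e
    for j in range(7):          # linguistic state of e'
        x = np.zeros(14)
        x[i], x[7 + j] = 1.0, 1.0             # one-hot encoding of the two antecedents
        k = rule_table[i][j]                   # index of the consequent fuzzy number
        y = np.array([tri(p, centers[k]) for p in output_points])
        X.append(x)
        Y.append(y)

X, Y = np.array(X), np.array(Y)                # 49 training pairs, shapes (49, 14) and (49, 13)
print(X.shape, Y.shape)
```

Each output row has the shape (..., .5, 1, .5, ...) of a triangular fuzzy number sampled at the 13 integer points, matching the example output vector given above.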
4. FUZZY NEURAL NETWORKS As discussed in Sec. 5.3, feed forward neural networks are eminently suited for approximating fuzzy controllers and other types of fuzzy expert systems, as well as for implementing these approximations in appropriate hardware. Although classical neural networks can be employed for this purpose, attempts have been made to develop alternative neural networks, more attuned to the various procedures of approximate reasoning. These alternative neural networks are usually referred to as fuzzy neural networks.
The following features, or some of them, distinguish fuzzy neural networks from their classical counterparts:
1. Inputs are fuzzy numbers;
2. Outputs are fuzzy numbers;
3. Weights are fuzzy numbers;
4. Weighted inputs of each neuron are not aggregated by summation, but by some other aggregation operation.
A deviation from classical neural networks in any of these features requires that a properly modified learning algorithm be developed. This, in some cases, is not an easy task. Various types of fuzzy neural networks have been proposed in the literature. As an example, we describe basic characteristics of only one type. Some other types are listed, with relevant references. Fuzzy neural networks to be described here were proposed by Hayashi, Buckley, and Czogala [1993]. They are obtained by directly fuzzifying the classical feedforward neural networks with one or more layers. The following are basic features of the resulting networks: All real numbers that characterize a classical neural network become fuzzy numbers in its fuzzified counterpart. These are numbers that characterize inputs to the network, outputs of neurons at hidden layers and the output layer, and weights at all layers.
Example 1 Consider all numbers relevant to a particular output neuron, ON_k, of a single-layer feedforward neural network, as depicted in Figure 5. If ON_k is a fuzzy neuron, then the inputs X_k0, X_k1, ..., X_kn, the weights W_0, W_1, ..., W_n, and the output Y_k of this neuron are all fuzzy numbers. Figure 5.
The output of each neuron, exemplified here by the neuron characterized in Figure 5, is defined by the formula

Y_k = S_β ( Σ_{j=0}^{n} W_j X_kj ),
where S_β is the sigmoid function for some chosen value of the steepness parameter β (as in the backpropagation algorithm). Since the symbols W_j and X_kj in Figure 5 designate fuzzy numbers, the sum

A_k = Σ_{j=0}^{n} W_j X_kj

must be calculated by fuzzy arithmetic. The output of the neuron, Y_k = S_β(A_k), is then determined by using the extension principle. The error function E_p, employed in the backpropagation learning algorithm in a fuzzy neural network with m outputs, is defined for each training sample p by the formula

E_p = (1/2) Σ_k (T_k^p − Y_k^p)²,

where T_k^p is the target output and Y_k^p is the actual output of output neuron ON_k for training sample p. Here, again, fuzzy arithmetic must be used to calculate E_p; otherwise, this formula for E_p is exactly the same as its counterpart for classical neural networks. The stopping criterion for fuzzy neural networks must also be properly fuzzified. Assume that T_k^p = Y_k^p for all k, which represents a perfect match of the actual outputs with the target outputs. Then, assuming that the support of T_k^p (and, in this case, also of Y_k^p) is the interval [t_k1^p, t_k2^p], the support of E_p is included in the interval [−λ, λ], where

λ = (1/2) Σ_{k=1}^{n} (t_k2^p − t_k1^p)².

Choosing now some number ε > 0 as an acceptable deviation from the value of E_p when T_k^p = Y_k^p for all k, it is reasonable to stop the learning algorithm whenever E_p is included in the interval [−λ − ε, λ + ε]. Finally, we need to fuzzify the backpropagation learning algorithm itself. One way, proposed by Hayashi et al. [1993], is to replace the real numbers in the standard formulas with their fuzzy counterparts and apply fuzzy arithmetic to them.
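A minimal sketch of how such a fuzzy neuron could be evaluated numerically is shown below, assuming fuzzy numbers are represented by a few α-cut intervals and that all supports are nonnegative (so interval multiplication reduces to multiplying endpoints); since the sigmoid is increasing, applying it to the interval endpoints realizes the extension principle. This is an illustration of the idea, not the formulation of Hayashi, Buckley, and Czogala.

```python
import math

def sigmoid(x, beta=1.0):
    return 1.0 / (1.0 + math.exp(-beta * x))

# A fuzzy number is represented by a list of alpha-cut intervals (lo, hi),
# one interval per alpha level in ALPHAS.
ALPHAS = [0.0, 0.5, 1.0]

def triangular(a, b, c):
    """Alpha-cuts of a triangular fuzzy number (a, b, c) with 0 <= a <= b <= c."""
    return [((1 - al) * a + al * b, (1 - al) * c + al * b) for al in ALPHAS]

def fuzzy_neuron_output(weights, inputs, beta=1.0):
    """Y_k = S_beta(sum_j W_j * X_kj), evaluated level-wise on alpha-cuts.

    Assumes nonnegative supports, so the product of two intervals is
    [lo*lo, hi*hi]; the increasing sigmoid maps an interval to the
    interval of its endpoint images (extension principle).
    """
    out = []
    for level in range(len(ALPHAS)):
        lo = sum(w[level][0] * x[level][0] for w, x in zip(weights, inputs))
        hi = sum(w[level][1] * x[level][1] for w, x in zip(weights, inputs))
        out.append((sigmoid(lo, beta), sigmoid(hi, beta)))
    return out

# Example: two fuzzy inputs and two fuzzy weights (illustrative values).
W = [triangular(0.2, 0.5, 0.8), triangular(0.1, 0.3, 0.5)]
X = [triangular(0.0, 1.0, 2.0), triangular(0.5, 1.0, 1.5)]
print(fuzzy_neuron_output(W, X))
```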
5. FUZZY AUTOMATA A finite automaton (also called a finite-state machine or sequential machine) is a dynamic system operating in discrete time that transforms sequences of input states (stimuli) received at the input of the system into sequences of output states (responses) produced at the output of the system. The sequences may be finite or countably infinite. The transformation is accomplished through the concept of a dynamically changing internal state. At each discrete time, the response of the system is determined on the basis of the received stimulus and the internal state of the system. At the same time, a new internal state is determined, which replaces its predecessor. The new internal state is stored in the system to be used at the next time. An automaton is called a fuzzy automaton when its states are characterized by fuzzy sets, and the production of responses and next states is facilitated by appropriate fuzzy relations.
Definition 2 A finite fuzzy automaton, A, is a fuzzy relational system defined by the quintuple A = (X, Y, Z, R, S), where
• X is a nonempty finite set of input states (stimuli),
• Y is a nonempty finite set of output states (responses),
• Z is a nonempty finite set of internal states,
• R is a fuzzy relation on Z × Y, and
• S is a fuzzy relation on X × Z × Z.
Method Step 1: Assume that X = {x1, x2, ..., xn}, Y = {y1, y2, ..., ym}, Z = {z1, z2, ..., zq}, and let At, Bt, Ct, Et denote the fuzzy sets that characterize, respectively, the stimulus, response, current internal state, and emerging internal state (next state) of the automaton at time t. The idea of a fuzzy automaton is depicted in Figure 6. Figure 6.
Step 2: Given At and Ct at some time t, fuzzy relations R and S allow us to determine Bt and Et. Clearly, At ∈ F(X), Bt ∈ F(Y), and Ct, Et ∈ F(Z). A fuzzy set C1, which characterizes the initial internal state, must be given to make the fuzzy automaton operate. Then, Ct = Et−1 for each time t ∈ N − {1}. The equation Ct = Et−1 is assumed to be implemented by the block called storage in Figure 6: its role is to store the produced fuzzy set Et at each time t and to release it, at the next time step, as the new current state.
Example 2 Given a sequence A1, A2, ..., and an initial characterization C1 of the internal state, fuzzy relations R and S allow us to generate the corresponding sequences B1, B2, ... and C2 = E1, C3 = E2, .... Due to the roles of relations R and S, it is reasonable to call R a response relation and S a state-transition relation. Assuming the standard fuzzy set operations, the fuzzy automaton operates as follows. For any given fuzzy input state At, the ternary state-transition relation S is converted into a binary relation, S_At, on Z × Z by the formula

S_At(zi, zj) = max_k min[At(xk), S(xk, zi, zj)]   (5.1)

for all pairs (zi, zj) ∈ Z × Z. Then, assuming the present fuzzy state Ct is given, the fuzzy next state Et and the fuzzy output state Bt are determined by the max-min compositions

Et = Ct ∘ S_At,   (5.2)
Bt = Ct ∘ R.   (5.3)
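The following sketch implements one step of a fuzzy automaton exactly as in (5.1)-(5.3): it builds the binary relation S_At from the ternary relation S and then applies max-min composition. The relation S is generated randomly here for illustration; R, C1, and A1 are taken from the worked example below.

```python
import numpy as np

def s_at(A_t, S):
    """(5.1): S_At(zi, zj) = max_k min(A_t(x_k), S(x_k, zi, zj))."""
    # A_t has shape (n_inputs,); S has shape (n_inputs, n_states, n_states).
    return np.max(np.minimum(A_t[:, None, None], S), axis=0)

def maxmin(vec, rel):
    """Max-min composition of a fuzzy set (row vector) with a fuzzy relation."""
    return np.max(np.minimum(vec[:, None], rel), axis=0)

def step(A_t, C_t, S, R):
    """(5.2)-(5.3): one transition of the fuzzy automaton."""
    S_A = s_at(A_t, S)
    E_t = maxmin(C_t, S_A)   # fuzzy next internal state
    B_t = maxmin(C_t, R)     # fuzzy output state
    return E_t, B_t

# Illustrative data: 2 input states, 4 internal states, 3 output states.
S = np.random.rand(2, 4, 4).round(1)                          # hypothetical ternary relation
R = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [.5, 1, .3]])  # output relation from the example
C1 = np.array([1, .8, .6, .4])
A1 = np.array([1, .4])
print(step(A1, C1, S, R))
```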
Equations (5.1)-(5.3) are sufficient for handling sequences of fuzzy states. Consider, for example, a sequence A1, A2, ..., Ar of r fuzzy input states applied to a given initial fuzzy state C1. Then the fuzzy automaton produces the sequence of fuzzy internal states E1 = C1 ∘ S_A1, E2 = C2 ∘ S_A2, ..., Er = Cr−1 ∘ S_Ar, and the corresponding sequence of fuzzy output states B1 = C1 ∘ R, B2 = E1 ∘ R,
..., Br = Er−1 ∘ R. If we are interested only in the final internal state and the final output state, we can use the formulas Er = C1 ∘ S_A1 ∘ S_A2 ∘ ... ∘ S_Ar and Br = C1 ∘ S_A1 ∘ S_A2 ∘ ... ∘ S_Ar−1 ∘ R. Let us illustrate the concept of fuzzy automata by a simple example. Consider a fuzzy automaton with X = {x1, x2}, Y = {y1, y2, y3}, Z = {z1, z2, z3, z4}, whose output relation R is defined by the matrix

R =
      y1   y2   y3
z1 [  1    0    0  ]
z2 [  0    1    0  ]
z3 [  0    0    1  ]
z4 [ .5    1   .3  ]

and whose state-transition relation S is defined by a three-dimensional array giving the membership grade S(xk, zi, zj) for each input state xk ∈ X and each pair of internal states (zi, zj) ∈ Z × Z (one 4 × 4 matrix for x1 and one for x2).
To describe how this fuzzy automaton operates, let the fuzzy sets describing the input, output, and internal states at any time t be defined by the vectors At = [At(x1), At(x2)], Bt = [Bt(y1), Bt(y2), Bt(y3)], Ct = [Ct(z1), Ct(z2), Ct(z3), Ct(z4)].
Assume now that the initial fuzzy state of the automaton is C1 = [1 .8 .6 .4] and its fuzzy input state is A1 = [1 .4]. Then, using (5.1),

S_A1 =
      z1   z2   z3   z4
z1 [  0   .4   .4    1  ]
z2 [ .3    1    0   .4  ]
z3 [ .5    0    0    1  ]
z4 [ .4   .3    0    1  ]

For example, S_A1(z1, z3) = max(min[A1(x1), S(x1, z1, z3)], min[A1(x2), S(x2, z1, z3)]) = max(min[1, .2], min[.4, 1]) = max(.2, .4) = .4. To calculate the fuzzy next state E1 and the fuzzy output state B1 of the automaton, we now use (5.2) and (5.3):

E1 = C1 ∘ S_A1 = [1 .8 .6 .4] ∘ S_A1 = [.5 .8 .4 1],
B1 = C1 ∘ R = [1 .8 .6 .4] ∘ R = [1 .8 .6].
Assuming now that the next fuzzy input state is A2 = [0 1], relation S_A2 is obtained from (5.1) in the same way (for this input it reduces to the x2-slice of S), and the same compositions yield the next internal and output states:

E2 = E1 ∘ S_A2,
B2 = E1 ∘ R = [.5 1 .4].
Similarly, we can produce larger sequences of fuzzy internal and output states for any given sequence of fuzzy input states.
Generalization of Fuzzy Automata To define a meaningful fuzzy automaton, relations R and S cannot be arbitrary. For example, some next internal state and some output state must be assured for any given input state and present internal state. This means that:
1. For each pair (xk, zi) ∈ X × Z, we must require that S(xk, zi, zj) > 0 for at least one zj ∈ Z;
2. For each zi ∈ Z, we must require that R(zi, yl) > 0 for at least one yl ∈ Y.
These requirements may also be stated in a stronger form: for each pair (xk, zi) ∈ X × Z, S(xk, zi, zj) = 1 for at least one zj ∈ Z; and for each zi ∈ Z, R(zi, yl) = 1 for at least one yl ∈ Y.
Remark 1. When these requirements are satisfied, respectively, for exactly one zj ∈ Z and exactly one yl ∈ Y, we call the relations deterministic. 2. When all states of a fuzzy automaton are defined as crisp sets and R, S are crisp relations, we obtain a crisp automaton, which, in general, is nondeterministic. When, in addition, all states are singletons taken from the sets X, Y, Z and the relations R, S are deterministic, we obtain the classical deterministic automaton of the Moore type. The operations min and max employed in (5.1)-(5.3) may, of course, be replaced with other t-norms and t-conorms, respectively. For each replacement, we obtain an automaton of a different type. When min is replaced with the product and max is replaced with the algebraic sum, and we require, in addition, that input states are singletons and
Σ_{zi ∈ Z} C1(zi) = 1,

Σ_{yl ∈ Y} R(zi, yl) = 1 for each zi ∈ Z,

Σ_{zj ∈ Z} S(xk, zi, zj) = 1 for each (xk, zi) ∈ X × Z,

we obtain a classical probabilistic automaton of the Moore type. The concept of a fuzzy automaton is thus a broad one, which subsumes classical crisp automata (deterministic and nondeterministic) as well as classical probabilistic automata as special cases. This concept, whose further study is beyond the scope of this text, clearly extends the domain of applicability covered by the various classical types of finite automata.
6. FUZZY SYSTEMS AND GENETIC ALGORITHMS In a genetic algorithm, a population of candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem is evolved toward better solutions. Each candidate solution has a set of properties (its chromosomes or genotype) which can be mutated and altered; traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals, and is an iterative process, with the population in each iteration called a generation. In each generation, the fitness of every individual in the population is evaluated; the fitness is usually the value of the objective function in the optimization problem being solved. The more fit individuals are stochastically selected from the current population, and each individual’s genome is modified (recombined and possibly randomly mutated) to form a new generation. The new generation of candidate solutions is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. A typical genetic algorithm requires: 1. A genetic representation of the solution domain, 2. A fitness function to evaluate the solution domain. A standard representation of each candidate solution is as an array of bits. Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size, which facilitates simple crossover operations. Variable length representations may also be used, but crossover implementation is more complex in this case. Tree-like representations are explored in genetic programming and graphform representations are explored in evolutionary programming; a mix of both linear chromosomes and trees is explored in gene expression programming. Once the genetic representation and the fitness function are defined, a GA proceeds to initialize a population of solutions and then to improve it through repetitive application of the mutation, crossover, inversion and selection operators. The connection between fuzzy systems and genetic algorithms is bidirectional. In one direction, genetic algorithms are utilized to deal with various optimization problems involving fuzzy systems. One important problem for which genetic algorithms have proven very useful is the problem of optimizing
fuzzy inference rules in fuzzy controllers. In the other direction, classical genetic algorithms can be fuzzified. The resulting fuzzy genetic algorithms tend to be more efficient and more suitable for some applications. In this section, we discuss how classical genetic algorithms can be fuzzified; the use of genetic algorithms in the area of fuzzy systems is covered only by a few relevant references There are basically two ways of fuzzifying classical genetic algorithms. One way is to fuzzify the gene pool and the associated coding of chromosomes; the other one is to fuzzify operations on chromosomes. These two ways may, of course, be combined. In classical genetic algorithms, the set {0, 1} is often used as the gene pool, and chromosomes are coded by binary numbers. These algorithms can be fuzzified by extending their gene pool to the whole unit interval [0, 1]. To illustrate this possibility, let us consider the example of determining the maximum of function f(x) = 2x - x2/16 within the domain [0, 31]. By employing the gene pool [0, 1], there is no need to discretize the domain [0, 31]. Numbers in this domain are represented by chromosomes whose components are numbers in [0,1].
Example 3 The chromosome (.1, .5, 0, 1, .9) represents the number 8.5 = .1 × 2^4 + .5 × 2^3 + 0 × 2^2 + 1 × 2^1 + .9 × 2^0 in [0, 31]. It turns out that this reformulation of classical genetic algorithms tends to converge faster and is more reliable in obtaining the desired optimum. To employ it, however, we have to find an appropriate way of coding the alternatives of each given problem by chromosomes formed from the gene pool [0, 1]. To illustrate this issue, let us consider a traveling salesman problem with four cities, C1, C2, C3, and C4. The alternative routes that can be taken by the salesman may be characterized by chromosomes (x1, x2, x3, x4) in which xi corresponds to city Ci (i = 1, ..., 4) and represents the degree to which the city should be visited early. Thus, for example, (.1, .9, .8, 0) denotes the route C2, C3, C1, C4, C2. Although the extension of the gene pool from {0, 1} to [0, 1] may be viewed as a fuzzification of genetic algorithms, a more genuine fuzzification requires that the operations on chromosomes also be fuzzified. In the following, we explain, as an example, a fuzzified crossover proposed by Sanchez. Consider chromosomes x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn) whose components are taken from a given gene pool. Then, the simple crossover with crossover position i ∈ N_{n−1} can be formulated in terms of a special n-tuple t = (t1, t2, ..., tn), referred to as a template, where tj = 1 for j = 1, 2, ..., i and tj = 0 for j = i + 1, ..., n, by the formulas

x′ = (x ∧ t) ∨ (y ∧ t̄),
y′ = (x ∧ t̄) ∨ (y ∧ t),

where ∧ and ∨ are the min and max operations on tuples and t̄ = (t̄_j | t̄_j = 1 − tj).
We can see that the template t defines an abrupt change at the crossover position i. This is characteristic of the usual, crisp operation of simple crossover. The change can be made gradual by defining the crossover position approximately. This can be done by a fuzzy template f = (f1, f2, ..., fn). For example, f = (1, ..., 1, .8, .5, .2, 0, ..., 0) is a fuzzy template for some n. Assume that chromosomes x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn) are given, whose components are, in general, numbers in [0, 1]. Assume further that a fuzzy template f = (f1, f2, ..., fn) is given. Then, the operation of fuzzy simple crossover of mates x and y produces offspring x′ and y′ defined by the formulas

x′ = (x ∧ f) ∨ (y ∧ f̄),
y′ = (x ∧ f̄) ∨ (y ∧ f),

where f̄ = (f̄_j | f̄_j = 1 − fj). These formulas can be written, more specifically, as

x′_i = max[min(xi, fi), min(yi, 1 − fi)]  for i ∈ N_n,
y′_i = max[min(xi, 1 − fi), min(yi, fi)]  for i ∈ N_n.

The operation of a double crossover, as well as the other operations on chromosomes, can be fuzzified in a similar way. Experience with fuzzy genetic algorithms seems to indicate that they are efficient, robust, and better attuned to some applications than their classical, crisp counterparts.
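A small sketch of the fuzzy simple crossover just described, assuming chromosomes and templates are plain Python lists of values in [0, 1]; the example mates and templates are made up.

```python
def fuzzy_crossover(x, y, f):
    """Fuzzy simple crossover: x' = (x ^ f) v (y ^ f_bar), y' = (x ^ f_bar) v (y ^ f)."""
    f_bar = [1 - fj for fj in f]
    x_new = [max(min(xi, fi), min(yi, fbi)) for xi, yi, fi, fbi in zip(x, y, f, f_bar)]
    y_new = [max(min(xi, fbi), min(yi, fi)) for xi, yi, fi, fbi in zip(x, y, f, f_bar)]
    return x_new, y_new

def crisp_template(n, i):
    """Crisp template for crossover position i: ones up to i, zeros afterwards."""
    return [1.0] * i + [0.0] * (n - i)

x = [0.1, 0.5, 0.0, 1.0, 0.9]
y = [0.9, 0.2, 0.7, 0.3, 0.4]
print(fuzzy_crossover(x, y, crisp_template(5, 2)))        # ordinary simple crossover
print(fuzzy_crossover(x, y, [1.0, 0.8, 0.5, 0.2, 0.0]))   # gradual, fuzzy crossover
```

With the crisp template, the sketch reproduces the ordinary simple crossover; with the fuzzy template, the exchange of genetic material becomes gradual around the crossover position.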
7. LIMITATIONS There are several limitations of the use of a genetic algorithm compared to alternative optimization algorithms:
• Repeated fitness function evaluation for complex problems is often the most prohibitive and limiting segment of artificial evolutionary algorithms. Finding the optimal solution to complex high-dimensional, multimodal problems often requires very expensive fitness function evaluations. In real-world problems such as structural optimization problems, one single function evaluation may require several hours to several days of complete simulation. Typical optimization methods cannot deal with such types of problem. In this case, it may be necessary to forgo an exact evaluation and use an approximated fitness that is computationally efficient. It is apparent that amalgamation of approximate models may be one of the most promising approaches to convincingly use GAs to solve complex real-life problems.
• Genetic algorithms do not scale well with complexity. That is, where the number of elements which are exposed to mutation is large, there is often an exponential increase in search space size. This makes it extremely difficult to use the technique on problems such as designing an engine, a house, or a plane. In order to make such problems tractable to evolutionary search, they must be broken down into the simplest representation possible. Hence we typically see evolutionary algorithms encoding designs for fan blades instead of engines, building shapes instead of detailed construction plans, and airfoils instead of whole aircraft designs. The second problem of complexity is the issue of how to protect parts that have evolved to represent good solutions from further destructive mutation, particularly when their fitness assessment requires them to combine well with other parts. It has been suggested by some in the community that a developmental approach to evolved solutions could overcome some of the issues of protection, but this remains an open research question.
• The "better" solution is only better in comparison to other solutions. As a result, the stopping criterion is not clear in every problem.
• In many problems, GAs may have a tendency to converge towards local optima or even arbitrary points rather than the global optimum of the problem. This means that the algorithm does not "know how" to sacrifice short-term fitness to gain longer-term fitness. The likelihood of this occurring depends on the shape of the fitness landscape: certain problems may provide an easy ascent towards a global optimum, while others may make it easier for the search to find local optima. This problem may be alleviated by using a different fitness function, increasing the rate of mutation, or by using selection techniques that maintain a diverse population of solutions, although the No Free Lunch theorem proves that there is no general solution to this problem. A common technique to maintain diversity is to impose a "niche penalty", wherein any group of individuals of sufficient similarity (niche radius) have a penalty added, which reduces the representation of that group in subsequent generations, permitting other (less similar) individuals to be maintained in the population. This trick, however, may not be effective, depending on the landscape of the problem. Another possible technique is to simply replace part of the population with randomly generated individuals when most of the population is too similar to each other. Diversity is important in genetic algorithms (and genetic programming) because crossing over a homogeneous population does not yield new solutions. In evolution strategies and evolutionary programming, diversity is not essential because of a greater reliance on mutation.
• Operating on dynamic data sets is difficult, as genomes begin to converge early on towards solutions which may no longer be valid for later data. Several methods have been proposed to remedy this by increasing genetic diversity and preventing early convergence, either by increasing the probability of mutation when the solution quality drops (called triggered hypermutation), or by occasionally introducing entirely new, randomly generated elements into the gene pool (called random immigrants). Evolution strategies and evolutionary programming can also be implemented with a so-called "comma strategy", in which parents are not maintained and new parents are selected only from offspring. This can be more effective on dynamic problems.
• GAs cannot effectively solve problems in which the only fitness measure is a single right/wrong measure (like decision problems), as there is no way to converge on the solution (no hill to climb). In these cases, a random search may find a solution as quickly as a GA. However, if the situation allows the success/failure trial to be repeated giving (possibly) different results, then the ratio of successes to failures provides a suitable fitness measure.
• For specific optimization problems and problem instances, other optimization algorithms may find better solutions than genetic algorithms (given the same amount of computation time). Alternative and complementary algorithms include evolution strategies, evolutionary programming, simulated annealing, Gaussian adaptation, hill climbing, and swarm intelligence (e.g., ant colony optimization, particle swarm optimization), as well as methods based on integer linear programming. The question of which, if any, problems are suited to genetic algorithms (in the sense that such algorithms are better than others) is open and controversial.
REFERENCES Bastian, A. (2000). Identifying Fuzzy Models utilizing Genetic Programming. Fuzzy Sets and Systems, 113(3), 333–350. doi:10.1016/S0165-0114(98)00086-4 Bastian, A., & Hayash, I. (1995). An Anticipating Hybrid Genetic Algorithm for Fuzzy Modeling. Journal of Japan Society for Fuzzy Theory and Systems, 7(5), 997–1006. Campos, L., & Verdegay, J. L. (1989). Linear programming problems and ranking of fuzzy numbers. Fuzzy Sets and Systems, 32(1), 1–1. doi:10.1016/0165-0114(89)90084-5 Chafale, D., & Pimpalkar, A. (2014). Review on Developing Corpora for Sentiment Analysis Using Plutchik’s Wheel of Emotions with Fuzzy Logic. International Journal on Computer Science and Engineering, 2(10), 14–18. Chen, S. H. (1985). Ranking of Fuzzy Numbers with Maximizing and Minimizing set. Fuzzy Sets and Systems, 17(2), 113–129. doi:10.1016/0165-0114(85)90050-8 Cordón, O., Gomide, F., Herrera, F., Hoffmann, F., & Magdalena, L. (2004). Ten years of genetic fuzzy systems: Current framework and new trends. Fuzzy Sets and Systems, 141(1), 5–31. doi:10.1016/S01650114(03)00111-8 Cordón, O., Herrera, F., Gomide, F., Hoffman, F., & Magdalena, L. (2001). Genetic Fuzzy Systems: Evolutionary Tuning and Learning of Fuzzy Knowledge Bases. In Advances in Fuzzy Systems: Applications and Theory (Vol. 19, pp. 1–462). Singapore: World Scientific. doi:10.1142/4177 Denguir-Rekik, A., Montmain, J., & Mauris, G. (2009). A possibilistic-valued multi-criteria decisionmaking support for marketing activities in e-commerce: Feedback Based Diagnosis System. European Journal of Operational Research, 195(3), 876–888. doi:10.1016/j.ejor.2007.11.020 Dubois, D., & Prade, H. (1978). Operations on fuzzy numbers. International Journal of Systems Science, 9(6), 613–626. doi:10.1080/00207727808941724 Dubois, D., & Prade, H. (1983). Ranking of Fuzzy Numbers in the Setting of Possibility Theory. Information Sciences, 30(3), 183–224. doi:10.1016/0020-0255(83)90025-7 Hayashi, Y., Buckley, J. J., & Czogala, E. (1993). Fuzzy neural network with fuzzy signals and weights. International Journal of Intelligent Systems, 8(4), 527–537. doi:10.1002/int.4550080405 Hemba, S., & Islam, N. (2017). Fuzzy Logic: A Review. International Journal on Computer Science and Engineering, 5(2), 61–63.
Kickert, W. J., & Mamdani, E. H. (1978). Analysis of a fuzzy logic controller. Fuzzy Sets and Systems, 1(1), 29–44. doi:10.1016/0165-0114(78)90030-1 Klein, Y., Pery, R., Komem, J., & Kandel, A. (2000). Fuzzy data mining. In Intelligent systems and interfaces (pp. 131–152). Springer. doi:10.1007/978-1-4615-4401-2_5 Klir, G., & Yuan, B. (1995). Fuzzy sets and Fuzzy Logic - Theory and Applications. Prentice-Hall. Lee, C. C. (1990). Fuzzy logic in control systems: Fuzzy logic controller I. IEEE Transactions on Systems, Man, and Cybernetics, 20(2), 404–418. doi:10.1109/21.52551 Neumann, J. V., & Morgenstern, O. (2001). Theory of games and economic behavior (pp. 1–776). Princeton University Press. Ngai, E. W. T., & Wat, F. K. T. (2005). Fuzzy decision support system for risk analysis in e-commerce development. Decision Support Systems, 40(2), 235–255. doi:10.1016/j.dss.2003.12.002 Qiao, W. Z., & Mizumoto, M. (1996). PID type fuzzy controller and parameters adaptive method. Fuzzy Sets and Systems, 78(1), 23–35. doi:10.1016/0165-0114(95)00115-8 Ramik, J., & ímánek, J. (1985). Inequality between Fuzzy Numbers and its Use in Fuzzy Optimization. Fuzzy Sets and Systems, 16 2), 123–138. doi:10.1016/S0165-0114(85)80013-0 Rommelfanger, H. (1984). Entscheidungsmodellemit Fuzzy-Nutzen. In Operations Research Proceedings (pp. 559–567). Springer Berlin Heidelberg. Roubens, M. (1997). Fuzzy sets and decision analysis. Fuzzy Sets and Systems, 90(2), 199–206. doi:10.1016/ S0165-0114(97)00087-0 Sala, A., Guerra, T. M., & Babuška, R. (2005). Perspectives of fuzzy systems and control. Fuzzy Sets and Systems, 156(3), 432–444. doi:10.1016/j.fss.2005.05.041 Tanaka, H., Okuda, T., & Asai, K. (1976). A Formulation of Fuzzy Decision Problems and its Application to an Investment Problem. Kybernetes, 5(1), 25–30. doi:10.1108/eb005404 Watson, S. R., Weiss, J. J., & Donell, M. L. (1979). Fuzzy Decision Analysis. IEEE Transactions on Systems, Man, and Cybernetics, 9(1), 1–9. doi:10.1109/TSMC.1979.4310067 Yuan, Y., & Zhuang, H. (1996). A genetic algorithm for generating fuzzy classification rules. Fuzzy Sets and Systems, 84(4), 1–19. doi:10.1016/0165-0114(95)00302-9 Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353. doi:10.1016/S00199958(65)90241-X Zimmermann, H. J. (1978). Fuzzy programming and linear programming with several objective functions. Fuzzy Sets and Systems, 1(1), 45–55. doi:10.1016/0165-0114(78)90031-3
This research was previously published in Handbook of Research on Promoting Business Process Improvement Through Inventory Control Techniques; pages 479-498, copyright year 2018 by Business Science Reference (an imprint of IGI Global).
Chapter 29
Best Feature Selection for Horizontally Distributed Private Biomedical Data Based on Genetic Algorithms
Boudheb Tarik, EEDIS Laboratory, Djillali Liabes University, Sidi Bel Abbès, Algeria
Elberrichi Zakaria, https://orcid.org/0000-0002-3391-6280, EEDIS Laboratory, Djillali Liabes University, Sidi Bel Abbès, Algeria
ABSTRACT Due to the growing success of machine learning in the healthcare domain, medical institutions are striving to share their patients’ data in the intention to build more accurate models which will be used to make better decisions. However, due to the privacy of the data, they are reluctant. To build the best models, they have to make the best feature selection for horizontally distributed private biomedical data. The previous proposed solutions are based on data perturbation techniques with the loss of performance. In this article, the researchers propose an original solution without perturbation. This is so the data utility is preserved and therefore the performance. The proposed solution uses a genetic algorithm, a distributed Naïve Bayes classifier, and a trusted third-party. The results obtained by the proposed approach surpass those obtained by other researchers, for the same problem.
1. INTRODUCTION Nowadays, data mining is receiving much attention. It is widely used by many researchers. It provides the technology to transform huge data into valuable knowledge for decision-making. In the healthcare domain, it is becoming increasingly popular, if not essential (Koh & Tan, 2011). Various algorithms associated with data mining have significantly helped to understand medical data more clearly, by distinguishing
pathological data from normal data and identifying hidden complex relationships (Acharya & Yu, 2010). Better diagnose diseases with the aim to provide best cares to patients at reduced costs. Medical organizations such as hospitals, laboratories, governments want to join their efforts with ambitions to build a more accurate data mining model. However, this collaboration faces problems. Patients’ data shared during the data mining process may leave some patients worried. Their personal information could fall into the wrong hands and could be used for other purposes. Due to the privacy laws, parties are reluctant to share the data without privacy guarantees. On the other hand, data generated are too complex and voluminous, which ones are useful and relevant to build more powerful models? No relevant data decrease the actual performance of the model. Feature selection techniques have received an extensive attention from the researchers. It is used to identify and remove irrelevant attributes that may decrease the model performance. Feature selection speeds up the learning and improves the model interpretability. There are three categories of feature selection methods: Wrapper, Filter and embedded methods. The wrapper methods consider the selection of a set of features as a search problem, where different combinations are prepared, evaluated and compared to the other combinations. A predictive model is used to evaluate them and assign a score based on the model accuracy. Filter methods apply a statistical measure to assign a scoring to each attribute. The features are ranked by the score and either selected to be kept or removed from the dataset (Brownlee, 2014). Embedded methods combine the qualities of filter and wrapper methods (Saurav, 2016). Algorithms that have their own built-in feature selection implement it. In the last decades, feature selection over distributed data has been intensively studied in the context of privacy-preserving. Many approaches were proposed. Most of them are based on data perturbation technics such as K-anonymity, i.e., the information for each individual in the dataset cannot be distinguished from at least k−1 individuals whose information also appears in the dataset (Zhu, Li & Wu, 2009). The challenge with data perturbation techniques is to find a good tradeoff between accuracy and privacy, knowing that more privacy will provide less accuracy. In this paper, the researchers propose an original secure wrapper feature selection solution over horizontally distributed medical data. Contrary to earlier solutions, the original data are not perturbed, and the performances are kept intact. This is vital in the medical domain. More lives will be saved. The remainder of this paper is organized as follows: In section 2, the authors introduce the related work. In Section 3, they present the proposed approach. In section 4, they describe and discuss the experiments. Finally, section 5 presents the conclusion and future works.
2. RELATED WORK Because of the growing success of data mining in the medical field, the demand for sharing healthcare data has been growing rapidly. Data available among organizations can bring in significant benefits for both medical treatment and scientific research. Therefore, it is indispensable to develop techniques to take profit from healthcare data sharing without violation of privacy. In the latter decades, in context of privacy, many solutions were proposed to perform classification task based on various classifiers. Among them, Naïve Bayes has received a great attention. It is simple but highly effective. This combination of simplicity and effectiveness has led to its use as a baseline standard by which other classifiers are measured (Vaidya & Clifton, 2004). Naïve Bayes has been widely used recently to predict various diseases (Schurink, Lucas, Hoepelman & Bonten, 2005). The researchers
(Liu, Lu, Ma, Chen & Qin, 2016) propose a clinical decision-support system based on Naïve Bayes to improve diagnosis accuracy and reducing its time. The patients’ historical data are stored in the cloud and used to train a Naïve Bayes model without leaking any patient medical data. The authors (Vaidya & Clifton, 2004) and (Yi & Zhang, 2009) propose a two-party and a multi-party protocol to build Naïve Bayes classification model for distributed data in a semi-honest model. The researchers (Keshavamurthy, Sharma & Toshniwal, 2010) propose a secure Naïve Bayes using a trusted cloud. The data are first anonymized at a local party and then aggregation and global classification are done at the trusted third party. In the medical domain, doctors and researchers are looking for best accuracy. However, if a training dataset contains irrelevant features, the classification task may produce less accurate and less understandable models (Cheng, Wei & Tseng, 2006). Feature selection is a method where automatically a subset of relevant features is chosen from the initial data to enhance accuracy. It results in dimensionality reduction, lower computational costs, higher learning precision and better model interpretability (Chelvan & Perumal, 2017). During the last decade, the motivation for applying feature selection techniques in bioinformatics has shifted from being an illustrative example to becoming a real prerequisite for model building (Saeys, Inza & Larrañaga, 2007). Feature selection, in the context of privacy, has received a great attention from researchers. Many solutions were proposed. Most of them are based on perturbation techniques. The researchers (Chelvan & Perumal, 2017) performed privacy preserving feature selection over centralized perturbed data. The sensitive attributes, such as the quasi-identifiers, are perturbed. Information Gain (IG) was used to identify the quasi identifier attribute. Selection of features and their stability is calculated by making use of the stability measure called Kuncheva Index KI. For the numerical attributes of the original and modified datasets, the statistical measures of mean, variance and standard deviation gave almost similar results. The experiments have proved that the proposed privacy preserving algorithm gave almost stable feature selection results. At the same time, there will be a minimum change in the accuracy due to the perturbation of the datasets. The authors in (Jahan, Narsimha & Rao, 2012) performed feature selection over distorted data using singular value decomposition and sparsified singular value decomposition singular. They found the best features with a little perturbation. The researchers (Banerjee & Chakravarty, 2011) have proposed a distributed privacy preserving method to perform feature subset selection that handles both horizontal as well as vertical data partitioning. They used virtual dimensionality reduction method. It is used in the field of hyperspectral image processing for selection of the subset of hyperspectral bands. The researchers (Jena, Kamila & Mishra, 2014) performed feature selection based on a genetic algorithm and a secure Naïve Bayes classifier for horizontally distributed data. Privacy of individuals is preserved by applying K-anonymity technic, before sending data to the mixer.
3. THE PROPOSED APPROACH The researchers propose a new secure wrapper feature selection solution with the aim to preserve the patients’ privacy without loss of the accuracy. Data privacy is achieved by a simple and efficient technic based on the use of secret random values. Contrary to the previous solutions, there is no trade off to find between privacy and data utility. Original data are not perturbed, sensitive data are safe and initial performances are preserved. The authors propose to use Genetic Algorithms, a distributed Naïve Bayes classifier and a trusted third-party. They follow the previous work of (Jena, Kamila & Mishra, 2014) without using of the k-anonymity techniques which caused a loss of performances. Genetic algorithms
are used to optimize the search for a relevant subset of attributes. They are executed on a third party. The Naïve Bayes classifier is a machine learning algorithm; it is used to evaluate the performance of the candidate solutions. To reduce the computation and network communication costs between the third party and the collaborative sites, the researchers use Naïve Bayes probability matrices. Thus, the Naïve Bayes model is built only once in the whole process, and the genetic algorithm runs without needing the collaborative parties at each iteration (Figure 1). The proposed solution is designed for:
Semi-honest models, i.e., all parties follow the protocol specifications, but they try to keep a copy of intermediate results with the aim to derive extra-information; Horizontally distributed data where various parties share the same schema and hold different records of patients. i.e., each site holds a private database D with |D| transactions and shares the same attributes, including the class attribute.
Figure 1. Overview of the proposed solution
Algorithm 1: Feature selection for horizontally distributed private data.
Input:
  N;      // the maximum number of iterations
  M;      // to stop the algorithm when there are no new solutions
  DN, DT; // training and test data
Output: Relevant attributes.
Begin
  nbr_it = 0; unch_sol = 0
  Step 1: Compute the Local Probability Matrix
    Compute the distributed Naïve Bayes model (DN);
    Compute the Local Probability Matrix (DT);
    Compute the Perturbed Local Probability Matrix;
    Encode the class attribute (ECA);
    Send (PLPM, ECA) to the third party;
  Step 2: Genetic Algorithms
    Create the initial population;
    Compute the Global Probability Matrix;
    Compute the Global Encoded Class Attribute;
    While (nbr_it < N) and (unch_sol < M):
      Compute the fitness function (F-score);
      Select the best chromosomes (fitness);
      Crossover (chromosomes);
      Mutation (chromosomes);
      nbr_it = nbr_it + 1;
      If solutions are unchanged: unch_sol = unch_sol + 1; else unch_sol = 0;
    End;
  Return (relevant attributes)
End.
The algorithm above is executed in two steps. The first step is performed only once. It calculates the Perturbed Local Probability Matrix (PLPM) using a secure distributed Naïve Bayes classifier and secret random values. The second step is achieved by the third party: it securely selects relevant attributes based on genetic algorithms and the PLPM. All steps of the algorithm are detailed below in separate sections.
3.1. Step 1: Compute the Local Probability Matrix
3.1.1. Secure Naïve Bayes Protocol
Naïve Bayes is a simple probabilistic classifier. It calculates a set of probabilities by counting the frequencies and combinations of values in a given dataset. The algorithm uses Bayes' theorem and assumes all attributes to be independent given the value of the class variable. This conditional independence assumption rarely holds true in real-world applications, hence the characterization as naïve, yet the algorithm tends to perform well and learn rapidly in various supervised classification problems (Patil & Shereker, 2013). Given the class y and the dependent feature vector (x1, x2, x3, ..., xn), Bayes' theorem states the following relationship:

P(y | x1, x2, x3, ..., xn) = P(y) · P(x1, x2, x3, ..., xn | y) / P(x1, x2, x3, ..., xn)   (1)

Under the naïve conditional independence assumption,

P(x1, x2, x3, ..., xn | y) = Π_{i=1}^{n} P(xi | y)   (2)

for all i, this relationship is simplified to:

P(y | x1, x2, x3, ..., xn) = P(y) · Π_{i=1}^{n} P(xi | y) / P(x1, x2, x3, ..., xn)   (3)

Since P(x1, x2, x3, ..., xn) is constant given the input, we can use the following classification rule:

ŷ = argmax_y P(y) · Π_{i=1}^{n} P(xi | y)   (4)
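For reference, a plain (non-distributed, non-private) Naïve Bayes for nominal attributes can be written in a few lines; the distributed protocol described next computes essentially the same counts, only masked by random values. The sketch below uses simple Laplace smoothing and invented toy data; it is not the authors' implementation.

```python
import math
from collections import defaultdict

def train_naive_bayes(rows, labels):
    """Count class priors and per-attribute value frequencies (the 'model')."""
    class_count = defaultdict(int)
    value_count = defaultdict(int)            # (class, attribute index, value) -> count
    for row, y in zip(rows, labels):
        class_count[y] += 1
        for j, v in enumerate(row):
            value_count[(y, j, v)] += 1
    return class_count, value_count, len(rows)

def predict(row, model):
    """Classification rule (4): argmax_y P(y) * prod_i P(x_i | y), in log space."""
    class_count, value_count, n = model
    best, best_score = None, float("-inf")
    for y, cy in class_count.items():
        score = math.log(cy / n)
        for j, v in enumerate(row):
            # simple Laplace smoothing so unseen values do not zero out the product
            score += math.log((value_count[(y, j, v)] + 1) / (cy + 2))
        if score > best_score:
            best, best_score = y, score
    return best

rows = [["sunny", "hot"], ["rainy", "cool"], ["sunny", "cool"], ["rainy", "hot"]]
labels = ["no", "yes", "yes", "no"]
model = train_naive_bayes(rows, labels)
print(predict(["sunny", "cool"], model))
```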
The authors achieve a secure multiparty computation. They use a trusted third party and random values (Figure 2). They construct a Naïve Bayes model without sharing the local data between the collaborative sites, except the final model, as follows. Let n be the number of attributes in the dataset, i the index of a site, m the total number of collaborative sites, and nbr_class the total number of classes. The master site, site_1, initiates the process. It sends:
• To each collaborative site_i, a set of random values R_i = {RC_i, RM_i, RV_i}, where RC_i = {rc_i1, rc_i2, ..., rc_in} is used to mask the counts of occurrences, RM_i = {rm_i1, rm_i2, ..., rm_in} to mask the means, and RV_i = {rv_i1, rv_i2, ..., rv_in} to mask the variances;
• To the third party, the sums of these random values:

RC = { Σ_{i=1}^{m} rc_i1, Σ_{i=1}^{m} rc_i2, ..., Σ_{i=1}^{m} rc_in },
RM = { Σ_{i=1}^{m} rm_i1, Σ_{i=1}^{m} rm_i2, ..., Σ_{i=1}^{m} rm_in },
RV = { Σ_{i=1}^{m} rv_i1, Σ_{i=1}^{m} rv_i2, ..., Σ_{i=1}^{m} rv_in }.
Each site_i, including the master site, computes, for each class y and each attribute j in the dataset:
• If attribute j is of numeric type: C_iyj = rc_ij + count_iyj (masked count of occurrences in attribute j for class y) and M_iyj = rm_ij + Σ v_iyj (masked sum of the values of attribute j for class y);
• If attribute j is of nominal type: C_iyjv = rc_ij + count_iyjv (masked count of occurrences of value v in attribute j for class y);
• For the class attribute: C_iy = rc_ij + count_iy.
Each site then sends (C_iyj, C_iyjv, C_iy, M_iyj) to the third party. The third party computes the probability of class y:

C_y = Σ_{i=1}^{m} C_iy,
PC_y = (C_y − RC_j) / Σ_{y=1}^{nbr_class} C_y,
the probability of a nominal attribute value:

C_yjv = Σ_{i=1}^{m} C_iyjv,
PB_yjv = (C_yjv − RC_j) / C_y,

and the mean of the numerical attribute j for class y:

Mean_yj = (Σ_{i=1}^{m} M_iyj − RM_j) / (Σ_{i=1}^{m} C_iyj − RC_j).
The third party sends the computed means of the numerical attributes (Mean_yj) and the probabilities of the nominal attributes (PB_yjv) back to the collaborative sites. Each site then computes, for the numerical attributes only,

V_iyj = rv_ij + Σ (v_iyj − Mean_yj)²,

and sends V_iyj to the third party. The third party computes the variance for each class y and numerical attribute j as

Variance_yj = (Σ_{i=1}^{m} V_iyj − RV_j) / (Mean_yj − 1),

and sends the computed variances back to the collaborative sites. Figure 2. Distributed Naive Bayes
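The core trick of the protocol is that each site adds a secret random offset to its local statistic, while the third party knows only the sum of all offsets and can therefore recover the exact aggregate without seeing any individual site's true value. A minimal sketch of that masked-sum step follows; for brevity the random offsets are drawn in one place (in the protocol the master site distributes them), and the per-site counts are invented.

```python
import random

def mask_and_aggregate(local_counts):
    """Each site masks its count with a secret random value; the third party
    receives only masked values plus the total of the randoms, and recovers
    the exact global count without learning any single site's contribution."""
    randoms = [random.uniform(1, 1000) for _ in local_counts]   # rc_i, one per site
    masked = [c + r for c, r in zip(local_counts, randoms)]     # what each site sends
    rc_total = sum(randoms)                                     # sent once to the third party
    return sum(masked) - rc_total                               # third party's computation

site_counts = [120, 45, 230, 78]          # hypothetical per-site counts for one class
print(mask_and_aggregate(site_counts))    # equals sum(site_counts) = 473 (up to float rounding)
```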
3.1.2. Local Probability Matrix (LPM) After building the Naïve Bayes model, each site computes the Local Probability Matrix (LPM) using the local test dataset. LPM is a list of matrices. The number of matrices corresponds to the number of the class values. A value on the matrix represents the probability of a value of an instance of the dataset to be in a defined class (Figure 3). Figure 3. Local Probability Matrix (LPM) of sitei
The LPM contains intermediate and statistical information such as real probability values, frequencies of appearance, relations between them, etc. According to (Zhang, Li & Lou, 2011), statistical or intermediate results can leak sensitive information. In order to hide this information before sending the LPM to the third party, the real probability values are perturbed using local secret random values (Figure 4). This perturbation has no effect on the fitness function calculation, i.e., the fitness function computed from the perturbed LPM gives the same result as with the LPM, because for all A, B, r ∈ ℝ⁺ − {0}, A > B if and only if r·A > r·B, so multiplying the compared values by the same positive factor does not change which one is largest. The experiments confirm this (Section 4). Example of perturbing the LPM: suppose that site_i has two attributes (one nominal, one numerical) and two classes {cl1, cl2}, so the LPM_i of site_i has two matrices: the first corresponds to class cl1 and the second to class cl2. From the LPM_i (see Figure 4), extra information can be derived, e.g., according to the frequency of appearance of values, the first column in the LPM of class cl1 can only take two values (p1 = 0.14, p2 = 0.86, with p1 + p2 = 1). The same holds for the second column, which has four possible values (0.12, 0.15, 0.33, and 0.40). This sensitive information is hidden using a different secret random value R_i (see Figure 4).
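A tiny numerical check of that argument: if every column of the per-class probability matrices is multiplied by the same positive per-column random factor, the predicted class of an instance is unchanged, because the class products are scaled by the same constant. The probabilities and random factors below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-class probability rows for one instance (class prior folded in), 3 attributes.
p_cl1 = np.array([0.14, 0.33, 0.25])
p_cl2 = np.array([0.86, 0.12, 0.40])

r = rng.uniform(1, 100, size=3)          # one secret positive random per column

def predicted_class(row1, row2):
    return 1 if np.prod(row1) > np.prod(row2) else 2

print(predicted_class(p_cl1, p_cl2))            # using the true probabilities
print(predicted_class(p_cl1 * r, p_cl2 * r))    # using the perturbed values: same class
```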
3.1.3. Encoded Class Attribute (ECA) The class attribute of the instances is necessary to compute the fitness function on the third party. However, it is considered sensitive and should not be disclosed. Therefore, in the proposed solution, each site encodes the class attribute before sending it to the third party: each class value is substituted by its position in the sorted set of classes.
Figure 4. Example of perturbing the LPM of sitei
Example: Cv = {c1, c2, c3} is a sorted set of possible class values. Ranks: c1 = 0, c2 = 1, c3 = 2. CN = [c2, c2, c1, c3, c2, c3, c2, c1, c1, c1, c2, c3, c2] is the actual class attribute. The encoded class attribute is ECAN = [1, 1, 0, 2, 1, 2, 1, 0, 0, 0, 1, 2, 1].
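The encoding itself is a one-liner; a sketch reproducing the example above:

```python
def encode_class_attribute(values):
    """Replace each class label by its index in the sorted set of distinct labels."""
    ranks = {c: i for i, c in enumerate(sorted(set(values)))}
    return [ranks[v] for v in values]

cn = ["c2", "c2", "c1", "c3", "c2", "c3", "c2", "c1", "c1", "c1", "c2", "c3", "c2"]
print(encode_class_attribute(cn))   # [1, 1, 0, 2, 1, 2, 1, 0, 0, 0, 1, 2, 1]
```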
3.2. Step 2: Genetic Algorithms Genetic Algorithms (GA) are metaheuristic methods. They are used to generate high-quality solutions to optimization and search problem by relying on bio-inspired operators like selection, crossover and mutation (Mitchell, 1998). In the proposed solution, GA are executed on the third party (Figure 6). Before starting Genetic Algorithms, data received by the third party (LPM, ECA) are concatenated to create the Global Probability Matrix (GPM) and the Global Encoded Class Attribute (GEC) (Figure 5). They are used to compute the fitness function (Figure 8). GPM and GEA are calculated once during all the process.
3.2.1. Initial Population The initial population is a randomly generated set of chromosomes (solutions). The chromosome is a binary string of length equal to n (n is a number of attributes of the dataset). In the proposed solution, the initial population is represented as a matrix where columns design chromosomes (Figure 7).
3.2.2. Fitness Function The fitness function embodies the essential aspects of the problem to be solved (Salzberg, Searls & Kasif, 1998). The fitness function is used in genetic algorithms to guide simulations towards optimal
design solutions. It is used to test and quantify how ‘fit’ each potential solution (chromosome) is. In the proposed solution, it is the function that the algorithm is trying to optimize. It is equal to the f-score measure (Formula 9). It is updated by a higher value at each round of genetic algorithms. In the proposed solution, it is computed securely as shown in Figure 8. Figure 5. GPM and GEA
Figure 6. The process of genetic algorithms
Figure 7. Initial population
Figure 8. Data flow to compute the fitness function (F-score)
3.2.2.1. Final Probability Matrix (FPM) FPM is the result of combining matrices from the GPM with the current population of chromosomes (Figure 9). It is a list of matrices. The number of matrices corresponds to the number of the class values. A matrix is associated to a defined class. A value IPij in a matrix from the FPM represents the probability of the instance number ‘i’ from the test data to be in a defined class based on the chromosome number ‘j’. The probability of an instance is equal to the product of selected probabilities and the class probability value (Formula 3). The probability of a value in the instance is selected based on the value of the gene on the chromosome. If the gene is equal to 1, the probability value is selected (means attribute is selected); else it is discarded (attribute is discarded). Figure 9. FPM, final probability Matrix for classi
3.2.2.2. Final Classification Matrix (FCM) FCM is a matrix (Figure 10). A value v_ij in the FCM represents the encoded predicted class of instance number i from the test dataset based on chromosome number j. The class value is computed with the following formula:

Predicted_Class^chr(inst_i) = Vmax( P^chr_cl1(inst_i), P^chr_cl2(inst_i), ..., P^chr_clk(inst_i) )   (5)
Vmax returns the position (index) of the matrix from the FPM in which the maximum value occurs. 3.2.2.3. Compute the F-Score (FS) The formula used to compute the f-score (Figure 11) is defined below (Formula 8). The true classes of the test data are stored in the GEA. The predicted classes of the test dataset for each chromosome are stored in the FCM.
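Putting these pieces together, the sketch below shows how the third party could score one binary chromosome from per-class probability matrices (a GPM-like structure) and the encoded true classes. The arrays are random stand-ins rather than real data, the class prior term is omitted for brevity, and the f-score is computed directly instead of through the intermediate FPM/FCM matrices.

```python
import numpy as np

rng = np.random.default_rng(1)

n_instances, n_attributes, n_classes = 8, 5, 2
# GPM-like structure: one (instances x attributes) probability matrix per class.
gpm = rng.uniform(0.05, 0.95, size=(n_classes, n_instances, n_attributes))
gea = rng.integers(0, n_classes, size=n_instances)     # encoded true classes
chromosome = np.array([1, 0, 1, 1, 0])                  # 1 = attribute selected

def predict(gpm, chromosome):
    """For each instance, multiply the probabilities of the selected attributes
    per class and return the index of the class with the largest product."""
    selected = chromosome.astype(bool)
    scores = np.prod(gpm[:, :, selected], axis=2)       # shape (classes, instances)
    return np.argmax(scores, axis=0)

def f_score(true, pred, positive=1):
    tp = np.sum((pred == positive) & (true == positive))
    fp = np.sum((pred == positive) & (true != positive))
    fn = np.sum((pred != positive) & (true == positive))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(f_score(gea, predict(gpm, chromosome)))           # fitness of this chromosome
```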
Figure 10. FCM matrix
Figure 11. The F-score list
3.2.3. Chromosomes Selection (CS) The purpose of this task is to give preference to the more appropriate individuals, allowing them to transfer their genes to the next generation. The researchers sort the FS list (computed above) and select the ‘k’ best tuples with higher f-score values. The ‘k’ value is an input parameter in the proposed solution. The selected group of chromosomes will be used for the crossover task: CS = Select_best_chromosome (FS, k)
3.2.4. Crossover Two by two individuals are chosen from the selected chromosomes (CS). A crossover site along the bit strings is randomly chosen. The values of the two strings are exchanged up to this point, e.g., if P1 = 010010 and P2 = 110111 and the crossover point is 2 then P1’ = 010111 and P2’ = 110010. The two new offspring created from this mating are put into the next generation of the population. By recombining portions of good individuals, this process is likely to create even better individuals.
3.2.5. Mutation Mutation is an operation used to maintain genetic diversity from one generation of a population of the GA to the next. It is analogous to the biological mutation. Mutation alters one or more gene values in a chromosome. The genes are chosen randomly, the values are altered (if value is ‘0’ it will be ‘1’ and inversely).
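The selection, crossover, and mutation steps described above are the standard binary-GA operators; a compact sketch with the crossover point chosen at random (parameter names are mine):

```python
import random

def single_point_crossover(p1, p2):
    """Exchange the tails of two parents after a random crossover point."""
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(chromosome, rate=0.05):
    """Flip each gene independently with probability `rate`."""
    return [1 - g if random.random() < rate else g for g in chromosome]

p1 = [0, 1, 0, 0, 1, 0]
p2 = [1, 1, 0, 1, 1, 1]
c1, c2 = single_point_crossover(p1, p2)
print(mutate(c1), mutate(c2))
```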
4. EXPERIMENTATION
The authors performed various experiments. For evaluation, they used repeated stratified holdout to improve the reliability of the results. The master site sends the tuple (M, L, T) to all sites in order to make the data splitting uniform: each site randomly splits its local data into L% of records for training and T% for testing, and these operations are repeated M times. Therefore, the f-score is computed M times (Figure 12). By applying the generated models to the test sample, effectiveness measures such as precision (P), recall (R), and F-score (F) are computed:

Precision = True positives / (True positives + False positives)   (6)

Recall = True positives / (True positives + False negatives)   (7)
Figure 12. Data organization in the proposed solution
F-score represents the harmonic mean of precision and recall. It is computed as follows:

Fscore = 2 · Precision · Recall / (Precision + Recall)   (8)

In the proposed solution, the final f-score is equal to the mean of all computed f-scores for a chromosome:

mean(Fscore(ch_i)) = (1/N) Σ_{p=1}^{N} Fscore_p(ch_i)   (9)

where N is the number of repeated stratified holdouts and ch_i is chromosome number i. The real-life benchmarks are obtained from the UCI repository (Table 1). To simulate horizontally distributed data, the researchers divided the instances of the original dataset randomly between the collaborative sites without redundancy. The researchers experimented with the proposed solution using the PLPM technique, with the original records distributed over 4 sites and the genetic algorithm process running on the third party. The results in Table 3 show that the proposed solution securely selects relevant attributes from the initial data and increases the original f-scores (Figures 13 and 14). In datasets like Parkinson, the authors select 7 relevant attributes out of 22 original attributes and increase the f-score by 10%. In breast cancer diagnosis, they select 9 relevant attributes out of 32 and increase the f-score by 3.4%. In Vertebral 2C, they select 2 relevant attributes out of 6 and increase the f-score by 3%. In Hepatitis, they select 8 relevant attributes out of 19 and increase the f-score by 5%. In Hypothyroid, they select 6 relevant attributes out of 25 and increase the f-score by 0.5%, etc.
Table 1. Datasets from the UCI repository

Dataset | Code | Attributes | Instances | Type | Class | Missing Values
Breast Cancer Wisconsin (Original) | BCO | 10 | 699 | N | Binary | Yes
Breast Cancer Wisconsin (Diagnostic) | BCD | 32 | 569 | N | Binary | No
Parkinson | PAR | 23 | 197 | N | Binary | No
Vertebral Column 2C | VC2 | 6 | 310 | N | Binary | No
Vertebral Column 3C | VC3 | 6 | 310 | N | Multiclass | No
Hepatitis | HIP | 19 | 155 | N/C | Binary | Yes
Pima Indians Diabetes | PID | 8 | 768 | N | Binary | Yes
Hypothyroid | HYT | 25 | 3164 | N/C | Binary | Yes
Table 2. Privacy preserving feature selection based on the PLPM (4 sites; 10 repeated stratified holdout; initial population of chromosomes = 26; crossover = 0.5; mutation = 01)

Dataset | All Attributes | Initial f-Score | Relevant Attributes | Names of the Selected Attributes | Improved f-Score
BCO | 10 | 0.958 | 4 | Clump thickness, Uniformity of cell shape, Bare nuclei, Normal nucleoli | 0.967
BCD | 32 | 0.931 | 9 | Radius_mean, texture_mean, texture_se, fractal_dimension_se, texture_largest_worst, perimeter_largest_worst, area_larger_worst, concave_points_largest_worst, fractal_dimension_largest_worst | 0.965
PAR | 22 | 0.71 | 7 | Shimmer, HNR, RPDE, DFA, Spread1, Spread2, PPE | 0.81
VC2 | 6 | 0.78 | 2 | Pelvic radius, Grade of spondylolisthesis | 0.807
VC3 | 6 | 0.825 | 3 | Sacral slope, Pelvic radius, Grade of spondylolisthesis | 0.837
HIP | 19 | 0.842 | 8 | Sex, Steroid, Antivirals, Malaise, Spiders, Bilirubin, Albumin, Protime | 0.89
PID | 8 | 0.764 | 4 | Plasma glucose concentration, Body mass index, Diabetes pedigree function, Age | 0.771
HYT | 25 | 0.977 | 6 | Age, sick, goitre, TT4, T4U, FTI | 0.982
In the same context, Jena, Kamila and Mishra (2014) studied the same problem. For horizontally distributed data, they proposed a feature selection technique based on genetic algorithms and a decomposable Naïve Bayes classifier, using the K-anonymity technique to preserve the privacy of individuals. Comparing their solution with the proposed approach on the Breast Cancer Wisconsin (Diagnostic) and Adult datasets, the proposed approach does not lose performance while preserving privacy: there is no trade-off to find between privacy and data utility, because the original data are not perturbed and the secret random values do not affect the computation (Table 3). The (Jena, Kamila & Mishra, 2014) solution, in contrast, trades performance for privacy: more privacy (a larger K value) decreases performance, as shown in Table 3.
Figure 13. Relevant attributes selected by the proposed approach
Figure 14. Improvement of the f-score by the proposed approach
In order to show that privacy does not disturb the calculations in the proposed approach, the authors tested the solution in three situations (Table 4) (Figure 15). With centralized data, a single site holds all the records. With horizontally distributed data, there are multiple sites, each holding a part of the records; two sub-scenarios are tested in this setting, the first based on the LPM without random values and the second based on the PLPM using secret random float values. The outcomes in Table 4 show that the proposed solution gives exactly the same results in the three scenarios. The solution is stable: it works well in a distributed environment and performance is preserved. Thereby, the random values, as proved previously, do not affect the calculation while securing the sensitive biomedical data.
Table 3. The proposed solution vs. the (Jena et al., 2014) solution

Dataset | Proposed: Initial f-Score | Proposed: Improved f-Score | Jena et al.: Initial f-Score | Jena et al.: Improved f-Score (K=50) | (K=100) | (K=200) | (K=500)
Wisconsin Breast Cancer (BCD) | 93.10 | 96.50 | 93.28±1.78 | 95.27±1.57 | 94.77±2.09 | 93.89±2.03 | 93.04±1.09
Adult | 82.16 | 83.20 | 82.68±0.27 | 81.81±0.13 | 81.41±0.79 | 81.47±0.77 | 81.19±1.11
Table 4. Privacy vs. performance in the proposed solution (10 repeated stratified holdout, 67% - 33%; I: initial f-score with all attributes, O: improved f-score with relevant attributes)

Dataset | Centralized data, 1 site, no privacy, no third party (I / O) | Horizontally distributed data, LPM without privacy, third party, 2 to 4 sites (I / O) | Horizontally distributed data, PLPM including privacy, third party, 2 to 4 sites (I / O)
BCO | 0.958 / 0.967 | 0.958 / 0.967 | 0.958 / 0.967
BCD | 0.930 / 0.965 | 0.930 / 0.965 | 0.930 / 0.965
PAR | 0.710 / 0.81 | 0.710 / 0.81 | 0.710 / 0.81
VC2 | 0.780 / 0.807 | 0.780 / 0.807 | 0.780 / 0.807
VC3 | 0.825 / 0.837 | 0.825 / 0.837 | 0.825 / 0.837
HIP | 0.842 / 0.89 | 0.842 / 0.89 | 0.842 / 0.89
PID | 0.764 / 0.771 | 0.764 / 0.771 | 0.764 / 0.771
HYT | 0.977 / 0.982 | 0.977 / 0.982 | 0.977 / 0.982
Figure 15. Privacy vs. performances in the proposed approach
To check the reliability of the proposed approach, the authors validated the selected subsets of attributes in Weka (Figure 16) and obtained results close to those given by the proposed approach (Table 5).

Table 5. Validation of the selected attributes in Weka (names of relevant attributes are shown in Table 2)

Dataset | All Attributes | Relevant Attributes | Proposed approach (PLPM), 10 CV: Initial f-Score / Improved f-Score | Weka, 10 CV: Initial f-Score / Improved f-Score
BCO | 10 | 4 | 0.958 / 0.967 | 0.96 / 0.963
BCD | 32 | 9 | 0.931 / 0.965 | 0.93 / 0.963
PAR | 22 | 7 | 0.71 / 0.81 | 0.713 / 0.812
VC2 | 6 | 2 | 0.78 / 0.807 | 0.778 / 0.803
VC3 | 6 | 3 | 0.825 / 0.837 | 0.83 / 0.832
HIP | 19 | 8 | 0.842 / 0.89 | 0.847 / 0.882
PID | 8 | 4 | 0.764 / 0.771 | 0.76 / 0.767
HYT | 25 | 6 | 0.977 / 0.982 | 0.978 / 0.980
Figure 16. Validation of the selected attributes in Weka
4.1. Security Analysis
• The Naïve Bayes model is built on a third party using secret random values, without sharing the original data and the intermediate results;
• The genetic algorithm process is achieved on the third party based on the local perturbed probability matrices. Thus, the sensitive statistical information contained in the local probability matrices (such as the real probability values and the relations between them, frequencies of appearance, etc.) is well hidden (a toy illustration follows this list);
• The secret random values used to perturb the LPM are locally generated and not shared between collaborative sites;
• The third party cannot perceive any data or intermediate results without knowing the secret random values;
• The only data shared between collaborative sites are the Naïve Bayes model and the relevant attributes.
5. CONCLUSION
In this paper, the researchers have proposed a new secure wrapper feature selection method for horizontally distributed medical data, based on genetic algorithms and a distributed Naïve Bayes classifier. Contrary to earlier methods, the original data are not perturbed. The authors securely constructed more accurate models using only the relevant attributes. These models can be used to better diagnose diseases such as breast cancer, diabetes and Parkinson's disease at a reduced cost. Compared with the solution of Jena, Kamila and Mishra (2014), the authors' performance remained intact while preserving privacy, as shown in the experiments above, which was the researchers' intention. In the future, they plan to address the same problem with other classifiers such as ID3 and SVM.
REFERENCES Acharya, U. R., & Yu, W. (2010). Data mining techniques in medical informatics. The Open Medical Informatics Journal, 4(2), 21–22. doi:10.2174/1874431101004020021 PMID:20694156 Banerjee, M., & Chakravarty, S. (2011). Privacy preserving feature selection for distributed data using virtual dimension. In Proceedings of the 20th ACM international conference on Information and knowledge management, Glasgow, Scotland, UK (pp. 2281-2284). ACM. 10.1145/2063576.2063946 Brownlee, J. (2014, October 6). An Introduction to Feature Selection. Machine Learning Mastery. Retrieved from https://machinelearningmastery.com/an-introduction-to-feature-selection/ Chelvan, M., & Perumal, K. (2017). Stable Feature Selection with Privacy Preserving Data Mining Algorithm. In Advanced Informatics for Computing Research (pp. 227–237). Singapore: Springer. doi:10.1007/978-981-10-5780-9_21
Cheng, T. H., Wei, C. P., & Tseng, V. S. (2006). Feature selection for medical data mining: Comparisons of expert judgment and automatic approaches. In Proceedings of the 19th IEEE International Symposium on Computer-Based Medical Systems (pp. 165-170). IEEE. doi:10.1109/CBMS.2006.87 Jahan, T., Narsimha, G., & Rao, C. G. (2012). Data perturbation and feature selection in preserving privacy. In Proceedings of the Ninth International Conference on Wireless and Optical Communications Networks (WOCN) (pp. 1-6). IEEE. 10.1109/WOCN.2012.6335531 Jena, L., Kamila, N. K., & Mishra, S. (2014). Privacy preserving distributed data mining with evolutionary computing. In Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (pp. 259-267). Springer. 10.1007/978-3-319-02931-3_29 Keshavamurthy, B. N., Sharma, M., & Toshniwal, D. (2010). Privacy-preserving Naive Bayes classification using trusted third party and different offset computation over distributed databases. In Proceedings of the 1st International Conference On Parallel, Distributed and Grid Computing (PDGC 2010) (pp. 362-365). 10.1109/PDGC.2010.5679968 Koh, H. C., & Tan, G. (2011). Data mining applications in healthcare. Journal of Healthcare Information Management, 19(2), 64–72. PMID:15869215 Liu, X., Lu, R., Ma, J., Chen, L., & Qin, B. (2016). Privacy-preserving patient-centric clinical decision support system on naive Bayesian classification. IEEE Journal of Biomedical and Health Informatics, 20(2), 655–668. doi:10.1109/JBHI.2015.2407157 PMID:26960216 Mitchell, M. (1998). An introduction to genetic algorithms. Cambridge, MA: MIT press. Patil, T. R., & Sherekar, S. S. (2013). Performance analysis of Naive Bayes and J48 classification algorithm for data classification. International Journal of Computer Science and Applications, 6(2), 256-261. Saeys, Y., Inza, I., & Larrañaga, P. (2007). A review of feature selection techniques in bioinformatics. Bioinformatics (Oxford, England), 23(19), 2507–2517. doi:10.1093/bioinformatics/btm344 PMID:17720704 Salzberg, S. L., Searls, D. B., & Kasif, S. (Eds.). (1998). Computational methods in molecular biology. Amsterdam: Elsevier. Saurav, K. (2016, December 1). Introduction to Feature Selection methods with an example (or how to select the right variables?). Analytics Vidhya. Retrieved from https://www.analyticsvidhya.com/blog/2016/12/ introduction-to-feature-selection-methods-with-an-example-or-how-to-select-the-right-variables/ Schurink, C. A. M., Lucas, P. J. F., Hoepelman, I. M., & Bonten, M. J. M. (2005). Computer-assisted decision support for the diagnosis and treatment of infectious diseases in intensive care units. The Lancet Infectious Diseases, 5(5), 305–312. doi:10.1016/S1473-3099(05)70115-8 PMID:15854886 Vaidya, J., & Clifton, C. (2004). Privacy preserving naive Bayes classifier for vertically partitioned data. In Proceedings of the 2004 SIAM International Conference on Data Mining (pp. 522-526). 10.1137/1.9781611972740.59 Yi, X., & Zhang, Y. (2009). Privacy-preserving naive Bayes classification on distributed data via semitrusted mixers. Information Systems, 34(3), 371–380. doi:10.1016/j.is.2008.11.001
Zhang, N., Li, M., & Lou, W. (2011). Distributed data mining with differential privacy. In Proceedings of the IEEE International Conference on Communications (ICC), Kyoto, Japan (pp. 1-5). IEEE. doi:10.1109/icc.2011.5962863 Zhu, D., Li, X. B., & Wu, S. (2009). Identity disclosure protection: A data reconstruction approach for privacy-preserving data mining. Decision Support Systems, 48(1), 133–140. doi:10.1016/j.dss.2009.07.003
This research was previously published in International Journal of Distributed Systems and Technologies (IJDST), 10(3); pages 37-57, copyright year 2019 by IGI Publishing (an imprint of IGI Global).
Chapter 30
Intelligent Computing in Medical Imaging: A Study
Shouvik Chakraborty https://orcid.org/0000-0002-3427-7492 University of Kalyani, India Sankhadeep Chatterjee University of Calcutta, India
Amira S. Ashour Tanta University, Egypt Kalyani Mali University of Kalyani, India
Nilanjan Dey https://orcid.org/0000-0001-8437-498X Department of Information Technology, Techno India College of Technology, Kolkata, India
ABSTRACT
Biomedical imaging is considered the main procedure for acquiring valuable physical information about the human body and some other biological species. It produces specialized images of different parts of biological species for clinical analysis, and it assimilates various specialized domains including nuclear medicine, radiological imaging, positron emission tomography (PET), and microscopy. From the early discovery of X-rays, progress in biomedical imaging has continued, resulting in highly sophisticated medical imaging modalities such as magnetic resonance imaging (MRI), ultrasound, computed tomography (CT), and lung monitoring. These biomedical imaging techniques assist physicians toward faster and more accurate analysis and treatment. The present chapter discusses the impact of intelligent computing methods on biomedical image analysis and healthcare. Different artificial intelligence (AI) based approaches to automated biomedical image analysis are considered, including the AI ability to resolve various medical imaging problems, and the popular AI procedures employed to solve special problems in medicine are introduced. Artificial neural networks (ANN) and support vector machines (SVM) are widely used to classify different types of images from various imaging modalities. Diagnostic analyses, such as mammogram analysis, MRI brain image analysis, CT image analysis, PET
image analysis, and bone/retinal analysis, are performed using ANNs, feed-forward back-propagation ANNs, probabilistic ANNs, and extreme learning machines. Various optimization techniques, such as ant colony optimization (ACO), the genetic algorithm (GA), particle swarm optimization (PSO) and other bio-inspired procedures, are also frequently applied for feature extraction/selection and classification. The advantages and disadvantages of some AI approaches are discussed in the present chapter, along with some suggested future research perspectives.
DOI: 10.4018/978-1-7998-8048-6.ch030
INTRODUCTION
Medical image analysis has a major role in the detection and diagnosis of different diseases, and researchers have recently taken a strong interest in biomedical image analysis (Doi, 2007). Mainly, techniques based on machine learning, including artificial neural networks (ANNs) and bio-inspired algorithms, have drawn the attention of several researchers. Computer-aided diagnosis (CAD) is considered a rapidly developing, active area, helped by modern computer-based methods and new medical imaging modalities. Decision-support tools and intelligent analysis frameworks are significant in biomedical imaging for CAD, detection and evaluation, where accuracy is one of the major issues. CAD helps physicians through the results obtained from a computerized system for detecting and diagnosing different diseases, such as lesions and tumors, as well as for measuring the extent and effect of a specific disease. One of the foremost goals of these artificial systems is to improve the consistency and accuracy of diagnosis in such a way that the rate of false negatives is reduced.
Generally, CAD systems involve initial selection of samples for training, image pre-processing for enhancement, selection of regions of interest (ROIs), feature extraction, feature selection, classification and segmentation. The CAD system generally tries to localize and identify the disease for diagnosis. CAD systems consist of two most important processes, namely image segmentation and image classification. During image segmentation, pixels are grouped into domains based on some feature of the image, producing a set of distinct regions or objects that can be studied and quantified separately and that represent specific regions of interest of the actual image. In the classification stage, features are extracted and, depending on those features, objects are classified, e.g. as normal or abnormal, malignant or benign. The prime goal of computational intelligence is to understand and learn the principles and logic that can help a machine or artificial system behave intelligently. General approaches and methods applied in medical image analysis at various stages are shown in Figure 1.
For applying the preceding medical image analysis/processing stages, the ANN is considered one of the popular, effective nonlinear information processing systems, designed from interconnected basic processing modules, namely neurons. Some of the basic advantages of the ANN include fault tolerance, adaptive learning, parallelism, and self-adjustment. The ANN has several applications in knowledge extraction and classification, pattern recognition, forecasting of results from current data, and clinical diagnosis and analysis. In this chapter, a study is conducted on the use of NNs and other AI systems in diagnostic science.
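As a concrete illustration of these CAD stages (pre-processing, feature selection, classification), a minimal scikit-learn pipeline on a public breast-cancer feature dataset is sketched below; the particular feature selector, classifier and parameter values are illustrative assumptions, not a method advocated by the chapter.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)   # 30 pre-extracted image features

cad = Pipeline([
    ("scale", StandardScaler()),               # pre-processing
    ("select", SelectKBest(f_classif, k=10)),  # feature selection
    ("classify", SVC(kernel="rbf", C=1.0)),    # benign vs. malignant decision
])

scores = cross_val_score(cad, X, y, cv=10, scoring="f1")
print(f"10-fold cross-validated F-score: {scores.mean():.3f}")
```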
Figure 1. Different stages of biomedical image analysis
Ideally, the main objective of a biomedical imaging system is to improve the efficiency of monitoring the condition of humans and other biological species for possible illness and/or disease detection and treatment. Techniques based on AI are widely used to improve biomedical imaging. Biomedical image analysis aims at retrieving useful information that carries a certain amount of uncertainty, and it is generally performed using computational intelligence methods such as fuzzy logic (FL), GA, bio-inspired algorithms, SVM, ANN and deep neural networks. Recent research deals with different AI-based tools and methods that are effective and known for very high classification accuracy. Some related tasks, such as biological feature identification and segmentation, are performed more precisely by applying AI-based methods. Consequently, these techniques are valuable for accurate biomedical image analysis (Rastgarpour & Shanbehzadeh, 2013). Thus, advanced procedures and systems for automated biomedical image analysis are very helpful in increasing the consistency and precision of the interpretation of biomedical images. AI approaches integrated with fuzzy logic, machine learning, and deep learning are valuable for interpreting, analyzing and mining useful information from biomedical image data, and AI has already produced promising results in digital imaging. Moreover, the structural nature of the information in biomedical images can be approached efficiently using AI tools and techniques. In the biomedical imaging domain, conflicting, ambiguous, uncertain, imprecise, and complementary data are very common, and data fusion techniques can efficiently cope with these issues. The following sections illustrate in detail how the ANN is employed for image classification and analysis. Typically, the ANN has an important role in the medical image analysis domain (Miller et al., 1992). Thus, the major focus of the current chapter is on pre-processing of biomedical images, segmentation, and object recognition from biomedical images, rather than attempting to cover every research aspect of artificial computing in biomedical image analysis.
NEURAL NETWORKS IN BIOMEDICAL IMAGING
ANNs are very popular and massively used for their high performance and accuracy in classification and function approximation. ANNs have been employed successfully in various applications of biomedical image analysis, more specifically in pre-processing, segmentation, recognition and classification. An overview of some types of ANN used in this domain is reported in Table 1.
Artificial Neural Network Assisted Biomedical Image Pre-Processing
ANNs assist biomedical image pre-processing for image restoration, including noise removal/quality enhancement and reconstruction. The Hopfield neural network is considered an efficient network for the reconstruction stage of biomedical images (Warsito & Fan, 2001). The reconstruction process can be cast as an optimization problem for the NN by allowing it to converge to a stable state that minimizes its energy function. Wang et al. (2003) conducted a comparative study of reconstructed shapes against the output generated by conventional convolution techniques and algebraic reconstruction methods; this study established the efficiency of Hopfield neural networks in biomedical image reconstruction.
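To make the "reconstruction as energy minimization" idea concrete, the toy sketch below runs asynchronous Hopfield-style updates and shows that the network energy never increases; the weight matrix and state size are arbitrary illustrative choices, not an image-reconstruction model.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 16                                   # number of bipolar units (+1/-1)
W = rng.normal(size=(n, n))
W = (W + W.T) / 2                        # symmetric weights ...
np.fill_diagonal(W, 0.0)                 # ... with zero self-connections
state = rng.choice([-1.0, 1.0], size=n)  # random initial state

def energy(s):
    """Hopfield energy E(s) = -1/2 * s^T W s (lower is better)."""
    return -0.5 * s @ W @ s

for sweep in range(10):                  # asynchronous updates, one unit at a time
    for i in rng.permutation(n):
        state[i] = 1.0 if W[i] @ state >= 0 else -1.0
    print(f"sweep {sweep}: energy = {energy(state):.3f}")  # non-increasing
```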
Table 1. Some types of NN used in biomedical image processing

Type of NN | Image Pre-processing | Image Segmentation | Classification/Recognition
Feed Forward Neural Network | Yes | Yes | Yes
Fuzzy Neural Network | Yes | Yes | Yes
Probabilistic Neural Network | X | Yes | Yes
Convolution Neural Network | X | X | Yes
Radial basis function based Neural Network | X | X | Yes
Hopfield Neural Network | Yes | Yes | Yes
Self-organizing feature based Neural Network | Yes | Yes | Yes
Adaptive resonance theory based Neural Network | X | X | Yes
Artificial Neural ensemble | X | Yes | Yes
Massive training Neural Network | Yes | X | Yes
Cellular Neural Network | Yes | X | X
Reconstruction of images acquired from electrical impedance tomography requires solving a nonlinear inverse problem on data corrupted with noise, which calls for assumptions or generalization based on prior knowledge. The feed-forward ANN and the self-organizing ANN can be more advantageous for the reconstruction of such medical images than some other methods (Adler & Guardo, 1994). Generally, noise is one of the major issues in biomedical images, and noise removal is performed in the restoration stage. Suzuki et al. (2003) suggested a filter based on an NN to solve this issue; an ANN with multiple layers has been employed in the proposed filter, and with proper training this filter can realize the function of different linear and nonlinear filters. Edge detection is very popular and has well-established solutions, for example the Laplace, Prewitt, Sobel and other advanced operators. Suzuki et al. (2003) also developed a neural edge enhancement approach based on an adapted NN with multiple layers. It can prominently enhance desired edges in noisy images, has been verified to be robust, less sensitive to noise and superior to other conventional methods, and has the capability to enhance continuous edges from noisy images.
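The "trained neural filter" idea can be illustrated with a very small sketch: a multilayer perceptron is trained to map a window of noisy samples to the clean value at the window centre. The use of scikit-learn's MLPRegressor, the synthetic 1-D signal and the window size are all illustrative assumptions rather than the filter architecture of Suzuki et al.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Clean 1-D "signal" and its noisy observation (stand-ins for image rows/patches).
t = np.linspace(0, 4 * np.pi, 2000)
clean = np.sin(t)
noisy = clean + rng.normal(scale=0.3, size=t.size)

# Training pairs: a sliding window of noisy samples -> clean value at the centre.
half = 4
X = np.array([noisy[i - half:i + half + 1] for i in range(half, t.size - half)])
y = clean[half:t.size - half]

neural_filter = MLPRegressor(hidden_layer_sizes=(16,), max_iter=1000,
                             random_state=0).fit(X, y)

denoised = neural_filter.predict(X)
print("noise RMSE :", np.sqrt(np.mean((noisy[half:-half] - y) ** 2)))
print("filter RMSE:", np.sqrt(np.mean((denoised - y) ** 2)))
```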
Neural Network Based Biomedical Image Segmentation
Feed-forward ANN based segmentation is powerful and less noisy than conventional segmentation methods such as those based on the Maximum Likelihood Classifier (MLC), and the feed-forward ANN is less sensitive to the training set selection than the MLC. An overview of the ANNs used in biomedical image segmentation is given in Table 2. From the preceding literature review, it can be concluded that the NN is effective for biomedical image segmentation. However, it suffers from a slow convergence rate, and its learning parameters must be defined in advance; these disadvantages restrict the use of feed-forward ANNs in biomedical image segmentation. Recent studies have revealed, however, that ANN performance can be improved by using metaheuristics. Training an ANN essentially means finding the optimal set of weight vectors for the network, for which some error function is minimized; thus, the training can be framed as an optimization problem. Traditional backpropagation-based learning algorithms may lead to a set of weight vectors for the ANN that is not optimal, due to convergence to local optima while searching for the optimal weight vectors. The problem has been addressed in several works (Chatterjee et al., 2016a; 2016b; 2017; Hore et al., 2015a; 2016a; 2017) and can be solved by treating the training phase of the ANN as an optimization problem handled by metaheuristics (Hore et al., 2016b; Chatterjee et al., 2017a). Several metaheuristics have been proposed to train ANNs, leading to hybrid ANN models that are extremely effective in image processing applications. Typically, the Hopfield ANN has been proposed as a tool for computing efficient solutions to difficult optimization problems, which makes it an attractive alternative to conventional optimization procedures for the reconstruction of biomedical images. Some other ANNs, such as the fuzzy Gaussian basis ANN, the contextual ANN, and hybrid ANN based methods, are also employed for medical image segmentation.
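To illustrate framing ANN training as a metaheuristic optimization problem, the sketch below evolves a small MLP's weight vector with a simple (1+λ) evolution strategy instead of backpropagation; the network size, mutation scale and fitness definition are illustrative assumptions, not a scheme from the cited studies.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
rng = np.random.default_rng(0)

n_in, n_hid = X.shape[1], 8
n_weights = n_in * n_hid + n_hid           # hidden layer + linear output layer

def forward(w, X):
    W1 = w[:n_in * n_hid].reshape(n_in, n_hid)
    w2 = w[n_in * n_hid:]
    return np.tanh(X @ W1) @ w2             # raw score; > 0 means class 1

def fitness(w):
    return np.mean((forward(w, X) > 0) == y)    # training accuracy as fitness

# (1+lambda) evolution strategy: keep the best weight vector, mutate it, repeat.
best = rng.normal(scale=0.1, size=n_weights)
for gen in range(200):
    candidates = [best + rng.normal(scale=0.05, size=n_weights) for _ in range(20)]
    challenger = max(candidates, key=fitness)
    if fitness(challenger) >= fitness(best):
        best = challenger
print("training accuracy:", fitness(best))
```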
Table 2. Overview of the artificial neural networks applied for biomedical image segmentation

Algorithm | Description
Du–Yih et al. (1994) | Applied on CT images of liver structure; a three-layer BP artificial neural network has been used for segmentation.
Hasegawa et al. (1998) | Employed on radiographic images of the chest; a shift-invariant ANN has been used.
Ozkan et al. (1993) | Tested on magnetic resonance images acquired from brain tissues; a BP NN based method has been employed for characterization.
Li et al. (1999) | A method based on an LSB ANN has been illustrated.
Middleton & Damper (2004) | Applied on magnetic resonance images; a multilayer perceptron has been combined with an active contour model.
Koss et al. (1999) | Employed on CT images of the abdominal region (liver images); a Hopfield NN has been used for segmentation.
Chang & Ching (2002) | A fuzzy Hopfield NN based segmentation technique has been described.
Cheng et al. (1996) | A competitive Hopfield neural network has been discussed for biomedical image segmentation.
Sun & Wang (2005) | Applied on MRI, where a fuzzy Gaussian basis NN has been used for segmentation.
Lee & Chung (2002) | A contextual NN based segmentation method has been described.
Wang et al. (1998) | A probabilistic NN based segmentation method has been described.
Pitiot et al. (2002) | Applied on MRI/CT images, where a hybrid NN has been presented.
Neural Network Based Object Detection
In biomedical imaging, the detection and recognition of different shapes and objects, such as tumors, is significant and is a prerequisite in various applications. It is also important in disease detection and useful in the interpretation and retrieval of information from biomedical images; sometimes it is also the final stage of the biomedical image analysis. It is potentially the brightest area for the application of ANNs because it is possible to combine several preceding phases (e.g. preprocessing and feature extraction) by applying an NN, the main advantage being the capability to train the whole as a single system. A summary of the ANNs that have been employed for object detection and recognition is reported in Table 3.
Table 3. Summary of the ANNs applied for biomedical image detection

Algorithm | Description
Maclin & Dempsey (1992) | A feed-forward neural network based technique has been applied to liver images obtained from ultrasonography and used for liver image classification.
Wu et al. (1993) | A feed-forward NN based technique has been proposed for the interpretation and recognition of mammograms.
Tourassi & Floyd (1993) | A feed-forward NN based technique has been developed for localization of cold lesions in images of SPECT modality.
Ercal et al. (1994) | This method has been based on a BP NN with a skin dataset for melanoma detection.
Wolberg et al. (1995) | A feed-forward NN based technique has been carried out for breast cancer diagnosis.
Kadah et al. (1996) | A neural network based on radial basis functions has been developed for diagnosis of liver-related diseases from the ultrasonic modality.
Zhu & Yan (1997) | This approach has been based on a Hopfield NN to detect boundaries of brain tumors.
Innocent et al. (1997) | An ART NN based method has been proposed for radiographic image classification.
Chan et al. (1997) | A convolution NN has been applied for the detection of clustered microcalcifications.
Chen et al. (1998) | A probabilistic NN based model has been developed for the recognition of liver tumors.
Ashizawa et al. (1999) | A BP NN method has been tested on lung disease classification.
Pavlopoulos et al. (2000) | A fuzzy NN based model has been employed on ultrasound images of the liver.
Verma & Zakos (2001) | A fuzzy NN based model has been used to detect microcalcifications in mammogram images.
Suzuki et al. (2005) | A modified BP NN based approach has been discussed to reduce false positives; this method has been illustrated in the computerized detection of lung nodules.
Feed-forward NN based techniques have proved their efficiency and are preferred over conventional image detection and recognition methods for their recognition accuracy. ANN ensembles have been engaged for detecting cancer using a two-level construction, whose first level is applied to decide with high confidence whether a cell is normal. Each network produces two outputs corresponding to a normal or an abnormal cell, and the second level is employed to handle the cancer cells identified by the first level. The judgements of these separate networks are combined by a plurality voting method. Experimental results proved that the ANN ensemble can achieve a good rate of classification and recognition as well as minimize false negatives.
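The plurality-voting step of such an ensemble can be sketched in a few lines; the choice of three small MLP classifiers differing only in their random initialization, and the dataset, are illustrative assumptions rather than the two-level design described above.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Ensemble members: the same architecture, different random initializations.
members = [MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                         random_state=seed).fit(X_tr, y_tr) for seed in range(3)]

# Plurality (majority) vote over the members' predicted labels.
votes = np.stack([m.predict(X_te) for m in members])          # shape (3, n_test)
ensemble_pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

print("ensemble accuracy:", np.mean(ensemble_pred == y_te))
```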
ADVANTAGES OF ARTIFICIAL NEURAL NETWORKS IN BIOMEDICAL IMAGE ANALYSIS
From the preceding literature, it can be concluded that the Hopfield ANN and the feed-forward ANN are the two most widely adopted models for biomedical image analysis. The main advantage of the Hopfield ANN model in this domain is that problems can be cast as optimization problems, and the major advantage of this type of NN based technique is its capability to handle the trade-off between noise sensitivity and image reconstruction resolution. Furthermore, the self-organizing feature map (SOM) is one of the attractive alternatives to supervised methods, with the ability to learn and classify biomedical image information.
Disadvantages of Artificial Neural Networks in Biomedical Image Analysis
Despite this history of success, ANN based methods have several disadvantages compared to some other methods. The first problem is the choice of the best ANN model and its architecture. There are no definite data for selecting the ANN type to be used for a certain application, and no proper guidelines that assure the best trade-off between variance and bias for a specific volume of training samples; these networks have to be developed by empirical trial and error, which is hard to overcome. Moreover, there is a risk of overfitting an ANN, and optimizing the objective measure does not always lead to a well-generalized ANN. The second issue is that the internal implementation is hidden, like a black box: for a given input, a corresponding output is generated without any explanation of the decision taken. The third problem in biomedical image analysis is the large volume of input data. If a neural network is trained with only a few instances, its generalization power is degraded and it may not be useful for test cases; to ensure high reliability, sufficient training samples are essential, and in biomedical image analysis accuracy is imperative for computer-aided diagnostic systems. Finally, training with a large amount of data may consume a huge amount of time, while time is very important in the diagnostic profession, where timely, accurate results are needed. Moreover, there is a prominent need for detailed validation of the proposed algorithms. In addition, in the case of the Hopfield ANN for biomedical image analysis, the actual problem must be reformulated before it can be posed to the Hopfield network for a solution.
IMPACT OF THE BIO-INSPIRED AND EVOLUTIONARY ALGORITHMS IN BIOMEDICAL IMAGING
Evolutionary algorithms are inspired by living organisms. One of the effective optimization techniques based on evolution theory is the genetic algorithm (GA). The learning technique of the GA uses computational models of natural adaptation to increase efficiency by mimicking population genetics. The GA is known as a population-based method because it does not consider only a single potential solution. It has several applications in biomedical imaging that have proved to be efficient and effective. In mammogram image classification, physicians are interested in three types of classes, i.e. normal, benign and malignant. A hybrid method has been developed by Vasantha et al. (2010) for feature selection and reduction; this method can reduce up to 75% of the features, and a hybrid of the greedy stepwise technique and the GA has been developed to choose the optimal features for further classification. Cartesian genetic programming (CGP) is an efficient method in which more than one network is employed to classify mammograms (Nandi et al., 2006); here genetic programming has been adapted for implicit feature selection. To select features from the available pool, a feature selection approach with three statistical parameters, namely the Kolmogorov–Smirnov test, Student's t-test, and the Kullback–Leibler divergence, has been employed, and the proposed method achieved a system accuracy of 98%. Hernández-Cisneros and Terashima-Marín (2006) developed a classification method for microcalcification clusters using consecutive difference of Gaussian (DoG) filters. Three evolutionary ANNs have been compared against a back-propagation feed-forward ANN, and the GA has been employed to find the optimal weights for the ANN.
Das and Bhattacharya (2008) developed a computer-guided treatment development system using a neuro-fuzzy method based on the GA. The tumor features have been extracted from the boundary using Fourier descriptors to represent the features, and the proposed approach achieved 87% classification accuracy. Kharrat et al. (2010) proposed a hybrid method for classifying brain tissues obtained from MRI images; this approach was based on the GA and an SVM classifier using wavelet-based texture features. The spatial gray level dependence method is used to extract the features from both normal and tumor segments, and these features have been supplied to the SVM with an RBF kernel, giving an accuracy range of 96.38% to 98.78%. Hong and Cho (2006) developed an advanced technique to select features using the GA. Yeh and Fu (2008) developed an optimization method based on a hierarchical GA associated with a fuzzy learning-vector quantization network to segment multi-spectral human-brain images of the MRI modality; the HGALVQ method has been compared with some other well-known methods, namely fuzzy c-means, k-means, FALVQ and LVQ, provided an exact number of clusters, and outperformed the other techniques in terms of specificity. Saha and Bandyopadhyay (2007) proposed an automatic segmentation method based on genetic clustering; this clustering method is based on fuzzy point symmetry and has been applied to multispectral MRI of the brain. The proposed fuzzy-VGAPS method can automatically iterate and evolve the total number of clusters within the data set, using an elitist GA with a fixed number of generations. Kishore et al. (2000) studied the efficiency of the GA for various multiclass problems using association rules; this method has been compared with the maximum likelihood classifier. A classification function has been evolved from the training samples, where samples that belong to the same class are assigned association strengths in terms of association degrees; a distinct expression tree has been evolved for every class, which checks whether unknown test samples belong to the class under test. Parthiban and Subramanian (2008) proposed an automated analysis system based on a coactive neuro-fuzzy inference system (CANFIS) to predict heart disease using the adaptive nature of the ANN. A fuzzy logic qualitative approach has been used along with the GA to detect the disease; the GA has been used to improve the learning capability of CANFIS and to optimize the number of membership functions for the separate inputs and other control parameters, including the learning rate and the momentum coefficient. A power spectral based hybrid GA has been proposed by Khazaee and Ebrahimzadeh (2010), where an SVM classifier has been employed to classify five different types of electrocardiogram (ECG) signals: normal and four manifestations of heart arrhythmia. In the case of the SVM, the free parameters have a great impact on the classification accuracy, and this method could achieve up to 96.0% accuracy.
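Since several of the hybrids above let a GA set the SVM's free parameters, a compact sketch of that idea is given below: a tiny GA evolves the RBF-SVM parameters C and gamma against a cross-validated score. The dataset, population size and mutation scale are illustrative assumptions, not the settings of the cited studies.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)

def fitness(ind):
    """Cross-validated accuracy of an RBF SVM; ind = (log10 C, log10 gamma)."""
    clf = SVC(kernel="rbf", C=10.0 ** ind[0], gamma=10.0 ** ind[1])
    return cross_val_score(clf, X, y, cv=3).mean()

pop = rng.uniform(low=[-2, -6], high=[3, 0], size=(10, 2))   # log-scale ranges
for gen in range(10):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-4:]]                   # keep the 4 best
    children = []
    while len(children) < len(pop) - len(parents):
        p1, p2 = parents[rng.choice(len(parents), 2, replace=False)]
        child = np.where(rng.random(2) < 0.5, p1, p2)        # uniform crossover
        child += rng.normal(scale=0.2, size=2)               # Gaussian mutation
        children.append(child)
    pop = np.vstack([parents] + children)

best = pop[np.argmax([fitness(ind) for ind in pop])]
print(f"best C={10**best[0]:.3g}, gamma={10**best[1]:.3g}")
```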
Apart from the GA, other bio-inspired methods, such as PSO, ABC and cuckoo search, have had a great impact on biomedical image analysis. PSO is inspired by the social behavior of bird flocking, where the possible solutions are termed particles. The technique is easy to implement and computationally cheap, and its system requirements (mainly CPU and memory) are low. A new method has been developed by Dheeba and Selvi (2010) to detect microcalcifications in mammograms; this method is based on PSO and clustering, where fuzzy c-means clustering is used with the PSO to find the cluster centers automatically, avoiding local minima. Geetah et al. (2010) developed a model to enhance mammogram images: a median filter has been used for enhancement and normalization, the GA has been used to enhance the border, and the PSO has been applied to find the nipple position in the mammogram. The effectiveness of the algorithm has been measured by true and false positives, where the Corel image database has been used to compute the classification efficiency. Ibrahim et al. (2010) applied the PSO algorithm to detect abnormalities in brain tissue. This technique is based on four main stages: an initial population of particles is generated, the fitness of each particle is computed using the fitness function, the positions and velocities are updated, and finally a stopping criterion is applied.
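These four PSO stages can be condensed into a short sketch minimizing a generic objective (here the sphere function stands in for a segmentation or thresholding fitness); the inertia and acceleration coefficients are common textbook values and are assumptions, not those of the cited works.

```python
import numpy as np

rng = np.random.default_rng(3)

def objective(x):                      # stand-in fitness: minimize the sphere function
    return np.sum(x ** 2, axis=-1)

n_particles, dim = 20, 5
w, c1, c2 = 0.7, 1.5, 1.5              # inertia and acceleration coefficients

pos = rng.uniform(-5, 5, size=(n_particles, dim))     # stage 1: initial population
vel = np.zeros_like(pos)
pbest = pos.copy()                                    # personal bests
gbest = pos[np.argmin(objective(pos))]                # global best

for it in range(100):                                 # stage 4: stopping criterion
    fit = objective(pos)                              # stage 2: fitness computation
    improved = fit < objective(pbest)
    pbest[improved] = pos[improved]
    gbest = pbest[np.argmin(objective(pbest))]
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)   # stage 3
    pos = pos + vel

print("best solution:", gbest, "fitness:", objective(gbest))
```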
Typically, PSO based methods work efficiently on light abnormalities but give poor output when segmenting dark abnormalities. Maitra and Chatterjee (2008) developed a new multilevel thresholding procedure for MRI brain image segmentation based on the bacterial foraging algorithm, where thresholding based on the image histogram has been used. Jude et al. (2010) offered a modified counter-propagation NN (CPN) for brain MRI classification; in order to increase efficiency, the PSO has been used in addition to the CPN. Alba et al. (2007) empirically evaluated a modified PSO, namely geometric PSO, using a binary representation in Hamming space; this approach has been tested on high-dimensional microarray data, and a comparative study has been given to compare PSO and GA, both associated with an SVM and 10-fold cross-validation for the classification. More generally, several researchers have been interested in applying image processing techniques in various applications relevant to the medical domain (Setiawan, 2014; Kumar et al., 2014; Hore et al., 2016; Sambyal & Abrol, 2016; Ang et al., 2016; Naik et al., 2016; Mohanpurkar & Joshi, 2016; Wang et al., 2016; Dey et al., 2016; Manogaran & Lopez, 2017; Li et al., 2017; Khachane, 2017; Tian et al., 2017; Boulmaiz et al., 2017; Satapathy et al., 2017; Juneja et al., 2017; Sharma & Virmani, 2017).
CONCLUSION
In this chapter, a study has been conducted on the impact and the different application areas of intelligent computing methods for biomedical image analysis. Different methods, along with their basic features, limitations and advantages, have been addressed. These methods can be used as a platform for future development, and the problems associated with the discussed methods can be eliminated; more efficient systems will ensure better and faster performance in the health care industry. Moreover, some modifications of the ANN for biomedical and other image analysis applications have been noticed in recent years, and several advanced techniques, such as the SVM, are used to obtain more efficient results. The ANN is not always the only solution to classification or regression related applications; some modified versions of the NN seem to be promising. The use of some emergent ANN techniques in biomedical image processing, e.g. ANN ensembles, hybrid ANNs combined with intelligent agents, and the combination of bio-inspired methods with fuzzy fitness and ANN, represents a promising choice to enhance efficiency and accuracy in biomedical image analysis.
REFERENCES Adler, A., & Guardo, R. (1994). A neural network image reconstruction technique for electrical impedance tomography. IEEE Transactions on Medical Imaging, 13(4), 594–600. doi:10.1109/42.363109 PMID:18218537 Alba, E., Garcia-Nieto, J., Jourdan, L., & Talbi, E. G. (2007, September). Gene selection in cancer classification using PSO/SVM and GA/SVM hybrid algorithms. In Evolutionary Computation, 2007. CEC 2007. IEEE Congress on (pp. 284-290). IEEE.
Ang, L. M., Seng, K. P., & Heng, T. Z. (2016). Information Communication Assistive Technologies for Visually Impaired People. International Journal of Ambient Computing and Intelligence, 7(1), 45–68. doi:10.4018/IJACI.2016010103 Ashizawa, K., Ishida, T., MacMahon, H., Vyborny, C. J., Katsuragawa, S., & Doi, K. (1999). Artificial neural networks in chest radiography: Application to the differential diagnosis of interstitial lung disease. Academic Radiology, 6(1), 2–9. doi:10.1016/S1076-6332(99)80055-5 PMID:9891146 Boulmaiz, A., Messadeg, D., Doghmane, N., & Taleb-Ahmed, A. (2017). Design and Implementation of a Robust Acoustic Recognition System for Waterbird Species using TMS320C6713 DSK. International Journal of Ambient Computing and Intelligence, 8(1), 98–118. doi:10.4018/IJACI.2017010105 Chan, H. P., Sahiner, B., Petrick, N., Helvie, M. A., Lam, K. L., Adler, D. D., & Goodsitt, M. M. (1997). Computerized classification of malignant and benign microcalcifications on mammograms: Texture analysis using an artificial neural network. Physics in Medicine and Biology, 42(3), 549–567. doi:10.1088/0031-9155/42/3/008 PMID:9080535 Chang, C. L., & Ching, Y. T. (2002). Fuzzy Hopfield neural network with fixed weight for medical image segmentation. Optical Engineering (Redondo Beach, Calif.), 41(2), 351–358. doi:10.1117/1.1428298 Chatterjee, S., Ghosh, S., Dawn, S., Hore, S., & Dey, N. (2016b). Forest Type Classification: A hybrid NN-GA model based approach. In Information Systems Design and Intelligent Applications (pp. 227236). Springer India. doi:10.1007/978-81-322-2757-1_23 Chatterjee, S., Hore, S., Dey, N., Chakraborty, S., & Ashour, A. S. (2017). Dengue Fever Classification Using Gene Expression Data: A PSO Based Artificial Neural Network Approach. In Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications (pp. 331-341). Springer. 10.1007/978-981-10-3156-4_34 Chatterjee, S., Nag, R., Sen, S., & Sarkar, A. (2017a, June). Towards Golden Rule of Capital Accumulation: A Genetic Algorithm Approach. In IFIP International Conference on Computer Information Systems and Industrial Management (pp. 481-491). Springer. 10.1007/978-3-319-59105-6_41 Chatterjee, S., Sarkar, S., Hore, S., Dey, N., Ashour, A. S., & Balas, V. E. (2016a). Particle swarm optimization trained neural network for structural failure prediction of multistoried RC buildings. Neural Computing & Applications, 1–12. Chen, E. L., Chung, P. C., Chen, C. L., Tsai, H. M., & Chang, C. I. (1998). An automatic diagnostic system for CT liver image classification. IEEE Transactions on Bio-Medical Engineering, 45(6), 783–794. doi:10.1109/10.678613 PMID:9609943 Cheng, K. S., Lin, J. S., & Mao, C. W. (1996). The application of competitive Hopfield neural network to medical image segmentation. IEEE Transactions on Medical Imaging, 15(4), 560–567. doi:10.1109/42.511759 PMID:18215937 Das, A., & Bhattacharya, M. (2008, September). GA based neuro fuzzy techniques for breast cancer identification. In Machine Vision and Image Processing Conference, 2008. IMVIP’08. International (pp. 136-141). IEEE. 10.1109/IMVIP.2008.19
Dey, N., Ashour, A. S., Chakraborty, S., Samanta, S., Sifaki-Pistolla, D., Ashour, A. S., ... Nguyen, G. N. (2016). Healthy and unhealthy rat hippocampus cells classification: A neural based automated system for Alzheimer disease classification. Journal of Advanced Microscopy Research, 11(1), 1–10. doi:10.1166/jamr.2016.1282 Dheeba, J., & Selvi, T. (2010, December). Bio inspired swarm algorithm for tumor detection in digital mammogram. In International Conference on Swarm, Evolutionary, and Memetic Computing (pp. 404415). Springer Berlin Heidelberg. 10.1007/978-3-642-17563-3_49 Doi, K. (2007). Computer-aided diagnosis in medical imaging: Historical review, current status and future potential. Computerized Medical Imaging and Graphics, 31(4), 198–211. doi:10.1016/j.compmedimag.2007.02.002 PMID:17349778 Du–Yih, T. S. A. I. (1994). Automatic segmentation of liver structure in CT images using a neural network. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Science, 77(11), 1892–1895. Ercal, F., Chawla, A., Stoecker, W. V., Lee, H. C., & Moss, R. H. (1994). Neural network diagnosis of malignant melanoma from color images. IEEE Transactions on Bio-Medical Engineering, 41(9), 837–845. doi:10.1109/10.312091 PMID:7959811 Hasegawa, A., Lo, S. C. B., Lin, J. S., Freedman, M. T., & Mun, S. K. (1998). A shift-invariant neural network for the lung field segmentation in chest radiography. The Journal of VLSI Signal Processing, 18(3), 241–250. doi:10.1023/A:1007937214367 Hemanth, D. J., Vijila, C. K. S., & Anitha, J. (2010). Performance improved PSO based modified counter propagation neural network for abnormal MR brain image classification. Int. J. Advance. Soft Comput. Appl, 2(1), 65–84. Hernández-Cisneros, R., & Terashima-Marín, H. (2006). Classification of individual and clustered microcalcifications in digital mammograms using evolutionary neural networks. MICAI 2006. Advances in Artificial Intelligence, 1200–1210. Hong, J. H., & Cho, S. B. (2006). Efficient huge-scale feature selection with speciated genetic algorithm. Pattern Recognition Letters, 27(2), 143–150. doi:10.1016/j.patrec.2005.07.009 Hore, S., Chakraborty, S., Chatterjee, S., Dey, N., Ashour, A. S., Van Chung, L., & Le, D. N. (2016a). An Integrated Interactive Technique for Image Segmentation using Stack based Seeded Region Growing and Thresholding. Iranian Journal of Electrical and Computer Engineering, 6(6), 2773–2780. Hore, S., Chatterjee, S., Chakraborty, S., & Shaw, R. K. (2016b). Analysis of Different Feature Description Algorithm in object Recognition. Feature Detectors and Motion Detection in Video Processing, 66. Hore, S., Chatterjee, S., Santhi, V., Dey, N., Ashour, A. S., Balas, V. E., & Shi, F. (2017). Indian Sign Language Recognition Using Optimized Neural Networks. In Information Technology and Intelligent Transportation Systems (pp. 553–563). Springer International Publishing. doi:10.1007/978-3-31938771-0_54
Hore, S., Chatterjee, S., Sarkar, S., Dey, N., Ashour, A. S., Balas-Timar, D., & Balas, V. E. (2016). Neural-based prediction of structural failure of multistoried RC buildings. Structural Engineering & Mechanics, 58(3), 459–473. doi:10.12989em.2016.58.3.459 Hore, S., Chatterjee, S., Shaw, R., Dey, N., & Virmani, J. (2015a, November). Detection of chronic kidney disease: A NN-GA based approach. In CSI—2015; 50th Golden Jubilee Annual Convention. Ibrahim, S., Khalid, N. E. A., & Manaf, M. (2010, March). Empirical study of brain segmentation using particle swarm optimization. In Information Retrieval & Knowledge Management,(CAMP), 2010 International Conference on (pp. 235-239). IEEE. 10.1109/INFRKM.2010.5466910 Innocent, P. R., Barnes, M., & John, R. (1997). Application of the fuzzy ART/MAP and MinMax/MAP neural network models to radiographic image classification. Artificial Intelligence in Medicine, 11(3), 241–263. doi:10.1016/S0933-3657(97)00032-8 PMID:9413608 Juneja, D., Singh, A., Singh, R., & Mukherjee, S. (2017). A thorough insight into theoretical and practical developments in multiagent systems. International Journal of Ambient Computing and Intelligence, 8(1), 23–49. doi:10.4018/IJACI.2017010102 Kadah, Y. M., Farag, A. A., Zurada, J. M., Badawi, A. M., & Youssef, A. B. (1996). Classification algorithms for quantitative tissue characterization of diffuse liver disease from ultrasound images. IEEE Transactions on Medical Imaging, 15(4), 466–478. doi:10.1109/42.511750 PMID:18215928 Khachane, M. Y. (2017). Organ-Based Medical Image Classification Using Support Vector Machine. International Journal of Synthetic Emotions, 8(1), 18–30. doi:10.4018/IJSE.2017010102 Kharrat, A., Gasmi, K., Messaoud, M. B., Benamrane, N., & Abid, M. (2010). A hybrid approach for automatic classification of brain MRI using genetic algorithm and support vector machine. Leonardo Journal of Sciences, 17(1), 71–82. Khazaee, A., & Ebrahimzadeh, A. (2010). Classification of electrocardiogram signals with support vector machines and genetic algorithms using power spectral features. Biomedical Signal Processing and Control, 5(4), 252–263. doi:10.1016/j.bspc.2010.07.006 Kishore, J. K., Patnaik, L. M., Mani, V., & Agrawal, V. K. (2000). Application of genetic programming for multicategory pattern classification. IEEE Transactions on Evolutionary Computation, 4(3), 242–258. doi:10.1109/4235.873235 Koss, J. E., Newman, F. D., Johnson, T. K., & Kirch, D. L. (1999). Abdominal organ segmentation using texture transforms and a hopfield neural network. IEEE Transactions on Medical Imaging, 18(7), 640–648. doi:10.1109/42.790463 PMID:10504097 Kumar, S. U., Inbarani, H. H., Azar, A. T., & Hassanien, A. E. (2014). Identification of heart valve disease using bijective soft sets theory. [IJRSDA]. International Journal of Rough Sets and Data Analysis, 1(2), 1–14. doi:10.4018/ijrsda.2014070101 Lee, C. C., & Chung, P. C. (2000). Recognizing abdominal organs in CT images using contextual neural network and fuzzy rules. In Engineering in Medicine and Biology Society, 2000. Proceedings of the 22nd Annual International Conference of the IEEE (Vol. 3, pp. 1745-1748). IEEE.
Li, Y., Wen, P., Powers, D., & Clark, C. R. (1999). LSB neural network based segmentation of MR brain images. In Systems, Man, and Cybernetics, 1999. IEEE SMC’99 Conference Proceedings. 1999 IEEE International Conference on (Vol. 6, pp. 822-825). IEEE. Li, Z., Shi, K., Dey, N., Ashour, A. S., Wang, D., Balas, V. E., ... Shi, F. (2017). Rule-based back propagation neural networks for various precision rough set presented KANSEI knowledge prediction: A case study on shoe product form features extraction. Neural Computing & Applications, 1–18. Maclin, P. S., & Dempsey, J. (1992). Using an artificial neural network to diagnose hepatic masses. Journal of Medical Systems, 16(5), 215–225. doi:10.1007/BF01000274 PMID:1289469 Maitra, M., & Chatterjee, A. (2008). A novel technique for multilevel optimal magnetic resonance brain image thresholding using bacterial foraging. Measurement, 41(10), 1124–1134. doi:10.1016/j.measurement.2008.03.002 Manogaran, G., & Lopez, D. (2017). Disease surveillance system for big climate data processing and dengue transmission. International Journal of Ambient Computing and Intelligence, 8(2), 88–105. doi:10.4018/IJACI.2017040106 Middleton, I., & Damper, R. I. (2004). Segmentation of magnetic resonance images using a combination of neural networks and active contour models. Medical Engineering & Physics, 26(1), 71–86. doi:10.1016/ S1350-4533(03)00137-1 PMID:14644600 Miller, A. S., Blott, B. H., & Hames, T. K. (1992). Review of neural network applications in medical imaging and signal processing. Medical & Biological Engineering & Computing, 30(5), 449–464. doi:10.1007/BF02457822 PMID:1293435 Mohanpurkar, A. A., & Joshi, M. S. (2016). A Traitor Identification Technique for Numeric Relational Databases with Distortion Minimization and Collusion Avoidance. International Journal of Ambient Computing and Intelligence, 7(2), 114–137. doi:10.4018/IJACI.2016070106 Naik, A., Satapathy, S. C., Ashour, A. S., & Dey, N. (2016). Social group optimization for global optimization of multimodal functions and data clustering problems. Neural Computing & Applications, 1–17. Nandi, R. J., Nandi, A. K., Rangayyan, R. M., & Scutt, D. (2006). Classification of breast masses in mammograms using genetic programming and feature selection. Medical & Biological Engineering & Computing, 44(8), 683–694. doi:10.100711517-006-0077-6 PMID:16937210 Ozkan, M., Dawant, B. M., & Maciunas, R. J. (1993). Neural-network-based segmentation of multimodal medical images: A comparative and prospective study. IEEE Transactions on Medical Imaging, 12(3), 534–544. doi:10.1109/42.241881 PMID:18218446 Parthiban, L., & Subramanian, R. (2008). Intelligent heart disease prediction system using CANFIS and genetic algorithm. International Journal of Biological, Biomedical and Medical Sciences, 3(3). Pavlopoulos, S., Kyriacou, E., Koutsouris, D., Blekas, K., Stafylopatis, A., & Zoumpoulis, P. (2000). Fuzzy neural network-based texture analysis of ultrasonic images. IEEE Engineering in Medicine and Biology Magazine, 19(1), 39–47. doi:10.1109/51.816243 PMID:10659429
Pitiot, A., Toga, A. W., Ayache, N., & Thompson, P. (2002). Texture based MRI segmentation with a two-stage hybrid neural classifier. In Neural Networks, 2002. IJCNN’02. Proceedings of the 2002 International Joint Conference on (Vol. 3, pp. 2053-2058). IEEE. 10.1109/IJCNN.2002.1007457 Rastgarpour, M., & Shanbehzadeh, J. (2013). The status quo of artificial intelligence methods in automatic medical image segmentation. International Journal of Computer Theory and Engineering, 5(1), 5–8. doi:10.7763/IJCTE.2013.V5.636 Saha, S., & Bandyopadhyay, S. (2007, September). MRI brain image segmentation by fuzzy symmetry based genetic clustering technique. In Evolutionary Computation, 2007. CEC 2007. IEEE Congress on (pp. 4417-4424). IEEE. 10.1109/CEC.2007.4425049 Sambyal, N., & Abrol, P. (2016). Feature based Text Extraction System using Connected Component Method. International Journal of Synthetic Emotions, 7(1), 41–57. doi:10.4018/IJSE.2016010104 Satapathy, S. C., Raja, N. S. M., Rajinikanth, V., Ashour, A. S., & Dey, N. (2017). Multi-level image thresholding using Otsu and chaotic bat algorithm. Neural Computing & Applications, 1–23. Setiawan, N. A. (2014). Fuzzy decision support system for coronary artery disease diagnosis based on Rough set theory. International Journal of Rough Sets and Data Analysis, 1(1), 65–80. doi:10.4018/ ijrsda.2014010105 Sharma, K., & Virmani, J. (2017). A Decision Support System for Classification of Normal and Medical Renal Disease Using Ultrasound Images: A Decision Support System for Medical Renal Diseases. International Journal of Ambient Computing and Intelligence, 8(2), 52–69. doi:10.4018/IJACI.2017040104 Sun, W., & Wang, Y. (2005). Segmentation method of MRI using fuzzy Gaussian basis neural network. Neural Information Processing-Letters and Reviews, 8(2), 19–24. Suzuki, K., Horiba, I., & Sugie, N. (2003). Neural edge enhancer for supervised edge enhancement from noisy images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(12), 1582–1596. doi:10.1109/TPAMI.2003.1251151 Suzuki, K., Shiraishi, J., Abe, H., MacMahon, H., & Doi, K. (2005). False-positive reduction in computeraided diagnostic scheme for detecting nodules in chest radiographs by means of massive training artificial neural network 1. Academic Radiology, 12(2), 191–201. doi:10.1016/j.acra.2004.11.017 PMID:15721596 Tian, Z., Dey, N., Ashour, A. S., McCauley, P., & Shi, F. (2017). Morphological segmenting and neighborhood pixel-based locality preserving projection on brain fMRI dataset for semantic feature extraction: An affective computing study. Neural Computing & Applications, 1–16. Tourassi, G. D., & Floyd, C. Jr. (1993). Artificial Neural Networks for Single Photon Emission Computed Tomography: A Study of Cold Lesion Detection and Localization. Investigative Radiology, 28(8), 671–677. doi:10.1097/00004424-199308000-00002 PMID:8375998 Vasantha, M., Bharathi, D. V. S., & Dhamodharan, R. (2010). Medical image feature, extraction, selection and classification. International Journal of Engineering Science and Technology, 1(2), 2071–2076.
606
Intelligent Computing in Medical Imaging
Verma, B., & Zakos, J. (2001). A computer-aided diagnosis system for digital mammograms based on fuzzy-neural and feature extraction techniques. IEEE Transactions on Information Technology in Biomedicine, 5(1), 46–54. doi:10.1109/4233.908389 PMID:11300216 Wang, D., He, T., Li, Z., Cao, L., Dey, N., Ashour, A. S., ... Shi, F. (2016). Image feature-based affective retrieval employing improved parameter and structure identification of adaptive neuro-fuzzy inference system. Neural Computing & Applications, 1–16. Wang, Y., Adali, T., Kung, S. Y., & Szabo, Z. (1998). Quantification and segmentation of brain tissues from MR images: A probabilistic neural network approach. IEEE Transactions on Image Processing, 7(8), 1165–1181. doi:10.1109/83.704309 PMID:18172510 Wang, Y., Heng, P. A., & Wahl, F. M. (2003). Image reconstructions from two orthogonal projections. International Journal of Imaging Systems and Technology, 13(2), 141–145. doi:10.1002/ima.10036 Warsito, W., & Fan, L. S. (2001). Neural network based multi-criterion optimization image reconstruction technique for imaging two-and three-phase flow systems using electrical capacitance tomography. Measurement Science & Technology, 12(12), 2198–2210. doi:10.1088/0957-0233/12/12/323 Wolberg, W. H., Street, W. N., & Mangasarian, O. L. (1995). Image analysis and machine learning applied to breast cancer diagnosis and prognosis. Analytical and Quantitative Cytology and Histology, 17(2), 77–87. PMID:7612134 Wu, Y., Giger, M. L., Doi, K., Vyborny, C. J., Schmidt, R. A., & Metz, C. E. (1993). Artificial neural networks in mammography: Application to decision making in the diagnosis of breast cancer. Radiology, 187(1), 81–87. doi:10.1148/radiology.187.1.8451441 PMID:8451441 Yeh, J. Y., & Fu, J. C. (2008). A hierarchical genetic algorithm for segmentation of multi-spectral humanbrain MRI. Expert Systems with Applications, 34(2), 1285–1295. doi:10.1016/j.eswa.2006.12.012 Zhou, Jiang, Yang, & Chen. (2002). Lung Cancer Cell Identification Based on Artificial Neural Network Ensembles. Artificial Intelligence in Medicine, 24, 25-36. Zhu, Y., & Yan, Z. (1997). Computerized tumor boundary detection using a Hopfield neural network. IEEE Transactions on Medical Imaging, 16(1), 55–67. doi:10.1109/42.552055 PMID:9050408
KEY TERMS AND DEFINITIONS
Classification: The assignment of a physical object or event to one of a set of known groups, based on the analysis of numerical properties of the image features.
Feature Extraction: The transformation of attributes into a lower-dimensional space.
Feature Selection: The selection of the most significant attributes that provide important information about the object under consideration.
Image Analysis: The extraction of expressive information from images using digital image processing procedures.
Image Processing: The manipulation and analysis of a digitized image to improve its quality.
Image Segmentation: The process of partitioning an image into several segments.
Magnetic Resonance Imaging: A medical imaging modality that employs nuclear magnetic resonance of protons to yield images.
Medical Imaging: The methods that produce or capture medical images.
Medical Modalities: The medical instruments that capture images of the internal organs of the body to support diagnosis.
This research was previously published in Advancements in Applied Metaheuristic Computing; pages 143-163, copyright year 2018 by Engineering Science Reference (an imprint of IGI Global).
Chapter 31
Optimization Techniques Applications in Biochemical Engineering and Controlled Drug Delivery: Current Practices and Forthcoming Challenges
Satya Eswari Jujjavarapu, National Institute of Technology Raipur, India
Bikesh Kumar Singh, National Institute of Technology Raipur, India
ABSTRACT
Before starting semi-pilot/pilot production plants for the production of biochemical metabolites, it is essential to optimize the fermentation media. This chapter discusses the classical and advanced techniques of media optimization. Statistical approaches save experimental time while developing processes and improving quality. Recent years have seen the growth of integrated optimization approaches for microbial cultures. Optimization techniques such as response surface methodology, artificial neural networks, genetic algorithms, differential evolution, and ant colony optimization have received attention recently because of their major applications in various fields. Controlled release formulations have so many versatile applications in the field of pharmaceutical drugs that they have become important tools for applying the modern concept of therapeutic treatment. In the process optimization of such formulations, mathematical modelling can play an important role. This chapter discusses various methodologies for optimizing formulation conditions for drug delivery.
DOI: 10.4018/978-1-7998-8048-6.ch031
INTRODUCTION
Application of Optimization Techniques in Controlled Drug Delivery
There have been many advancements in the field of drug delivery over the past few years. These advances can be mainly attributed to the improvement of controlled drug-release dosage forms, most notably in controlled oral drug release, achieved by controlling the conditions that influence drug release. Drug release patterns can be divided into two groups: i) formulations that release the drug at a slow (zero- or first-order) rate, and ii) sustained release formulations that afford an initial dose, followed by slow (zero- or first-order) release of the active drug (Dash et al., 2010). Sustained release formulations allow the desired drug concentration to be maintained in the plasma or in any target tissue for a long period of time (Langer and Wise, 1984). Specific sustained release formulations allow control of both the rate of drug release and its duration (Li and Lee, 1987). These formulations generally release only a part of the active drug initially, to achieve the therapeutic concentration of the drug in a short time. After that, the drug release kinetics follows a well-defined behaviour so as to attain and maintain the desired drug concentration. Designing and developing controlled release formulations requires engineers and pharmacists to work together to produce more effective products. Modelling may allow the release kinetics to be predicted without, or with minimal, experimental data, which is an important step in developing formulations. Mathematical modelling can also improve the estimation of certain important parameters, such as the drug diffusion coefficient, by fitting models to experimental release data. Thus, mathematical modelling has a significant role in the process optimization of such formulations. The development of mathematical models requires an understanding of all the factors affecting drug release kinetics (Cartensen, 1996). The mathematical models involved in drug release kinetics can be viewed as mathematical metaphors of some aspects of reality which control the phenomena ruling the release kinetics (Dressman and Fleisher, 1986). Because of these general aspects, mathematical modelling is widely employed in varied fields of science and technology, ranging from genetics to medicine, psychology to economy, biology, and of course engineering. In this chapter we explain the RSM, design of experiments, ANN, RBFN, and GRNN methodologies as applied to the drug delivery process.
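To make the zero-order and first-order release models mentioned above concrete, the following minimal Python sketch fits both models to a hypothetical cumulative-release curve; the data points and rate constants are invented for illustration and are not taken from any formulation discussed in this chapter.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical cumulative-release measurements (fraction released vs. time in hours).
t = np.array([0.5, 1, 2, 4, 6, 8, 12])
release = np.array([0.08, 0.15, 0.27, 0.46, 0.60, 0.70, 0.83])

def zero_order(t, k0):
    """Zero-order release: Q(t) = k0 * t."""
    return k0 * t

def first_order(t, k1):
    """First-order release: Q(t) = 1 - exp(-k1 * t)."""
    return 1.0 - np.exp(-k1 * t)

(k0,), _ = curve_fit(zero_order, t, release)
(k1,), _ = curve_fit(first_order, t, release)

# Compare the two kinetic models by residual sum of squares.
rss0 = np.sum((release - zero_order(t, k0)) ** 2)
rss1 = np.sum((release - first_order(t, k1)) ** 2)
print(f"zero-order  k0={k0:.3f} 1/h, RSS={rss0:.4f}")
print(f"first-order k1={k1:.3f} 1/h, RSS={rss1:.4f}")
```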
Advancements of Optimization Techniques for Biochemical Production
Advancements in biochemical production require a better understanding of the bioprocesses involved in producing biochemicals from different raw materials, agro-industrial by-products, and waste streams, as well as the ability to predict the outcomes of these processes in terms of biochemical yield and substrate utilization (Olsson, 2007). Optimization of biochemical production processes is also important for increasing yield and reducing bioprocess cost. Optimization techniques such as design of experiments (DOE), fuzzy logic (FL), particle swarm optimization (PSO), genetic algorithms (GA), differential evolution (DE), neural networks (NN), and multi-objective optimization (MOO) can be used as tools for modelling and optimizing biochemical production. The basic characteristics and applications of ANN in modelling and optimization of biochemical production processes have been experimentally demonstrated (Ivana et al., 2017). Figure 1 shows a flow chart of the various optimization methods.
Figure 1. Advancement of optimization techniques in bioprocesses
CONTROLLED DRUG DELIVERY, OPTIMIZATION
During surgery, or for patients in an intensive care unit, certain drugs are administered to maintain vital physiological parameters such as body temperature and blood pressure. The concentration of such drugs has to be maintained by experienced personnel on the basis of the patient's physiological condition. This is a tedious and laborious process and thus prone to human error. Therefore, there is a need to make the process at least partially automated by the use of computational tools, which might improve the intensive care of patients. At present, computational tools such as optimization methods are used to restrain the increasing cost of health care. Drug delivery systems have not been automated to date because the mathematical models describing the effects of drugs on physiological parameters are not accurate. This has been difficult because of non-linear responses, differences in response between patients, the effects of external conditions, and the interactions amongst different drugs.
Various models, such as response surface models and artificial neural network models, can be developed and used for controlled drug delivery, and can further be evaluated experimentally.
Response Surface Methodology (RSM) Application in Controlled Drug Delivery
RSM can be used to optimise drug delivery systems, since drug delivery depends on various physical, chemical, and physiological factors. RSM has been successfully used by research groups for the optimisation of drug delivery. Kumari et al. (2009) optimised the delivery of chlorpheniramine maleate (CPM) using RSM. The CPM drug was loaded into chitosan-alanine beads for controlled release and was cross-linked using glutaraldehyde as the cross-linker. Four components were used as formulating agents and regression models were developed. The response variable (i.e., drug release) was optimised on the basis of the formulation variables, which were chitosan concentration, percentage of cross-linker, concentration of drug, and release time (RT). Factorial design (FD) analysis was used to find the effect of each parameter, and the adequacy of the model was appraised by analysis of variance (ANOVA). All four factors were found to have a significant effect on CPM release. The optimum values for the variables were found to be 0.42 g, 10-18%, and 100 mg for X1, X2, and X3, respectively, while release time (X4) had a linear effect on drug release. In this work, linear regression models were used for the controlled drug delivery process.
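As a hedged illustration of the RSM idea used by such studies, the sketch below fits a second-order (quadratic) response surface to hypothetical two-factor formulation data with ordinary least squares and locates the stationary point of the fitted surface; the factor levels, responses, and variable names are assumptions for illustration, not the data of Kumari et al. (2009).

```python
import numpy as np

# Hypothetical two-factor formulation data: X1 = chitosan (g), X2 = cross-linker (%).
# y = observed % drug release. All values are illustrative only.
levels_x1 = [0.3, 0.4, 0.5]
levels_x2 = [10, 14, 18]
X = np.array([[a, b] for a in levels_x1 for b in levels_x2])
y = np.array([61.0, 66.5, 63.2, 70.1, 78.9, 74.0, 68.3, 75.2, 71.8])

def quadratic_design_matrix(X):
    """Second-order RSM design matrix: intercept, linear, interaction, squared terms."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 * x2, x1**2, x2**2])

D = quadratic_design_matrix(X)
coef, *_ = np.linalg.lstsq(D, y, rcond=None)   # least-squares fit of the surface
y_hat = D @ coef

# Locate the stationary point of the fitted surface (candidate optimum).
b1, b2, b12, b11, b22 = coef[1:]
A = np.array([[2 * b11, b12], [b12, 2 * b22]])
x_opt = np.linalg.solve(A, -np.array([b1, b2]))
print("fitted coefficients:", np.round(coef, 3))
print("stationary point (X1, X2):", np.round(x_opt, 3))
```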
Plackett-Burman (PB) Methodology in Controlled Drug Delivery (CDD)
Plackett-Burman (PB) designs can be used for designing controlled drug delivery (CDD) systems. Porous osmotic pump tablets were designed as an osmotic drug delivery system in order to find the best formulation (Patel et al., 2016). Three categories of polymers were screened for the design of the porous osmotic pump tablets. Six independent variables, namely the osmotic agent sodium chloride (OASC), microcrystalline cellulose (MCC), sodium lauryl sulphate (SLS), sucrose, ethyl cellulose (EC), and cellulose acetate (CA), were selected for screening. Drug release was mainly affected by the osmotic agent and the pore former. The porous osmotic pump tablets contained dicloxacillin sodium and amoxicillin trihydrate as the active antibiotic drugs. The optimised drug delivery system was designed using sodium chloride (150 mg) as the osmotic agent, sodium lauryl sulphate (15 mg) as the pore former, and cellulose acetate (2%) as the coating agent. Thus, the Plackett-Burman design was successfully used for the optimisation of a drug delivery system.
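To make the screening-design idea concrete, the following sketch constructs a standard 8-run Plackett-Burman matrix (suitable for up to seven two-level factors) by cyclic shifts of the usual generator row plus a final row of low levels; the factor names are placeholders and do not reproduce the exact design of Patel et al. (2016).

```python
import numpy as np

def plackett_burman_8():
    """8-run Plackett-Burman design for up to 7 two-level factors (+1/-1).
    Rows 1-7 are cyclic shifts of the standard generator; row 8 is all -1."""
    generator = np.array([+1, +1, +1, -1, +1, -1, -1])
    rows = [np.roll(generator, shift) for shift in range(7)]
    rows.append(-np.ones(7, dtype=int))
    return np.array(rows, dtype=int)

design = plackett_burman_8()
# Placeholder factor names; map +1/-1 to the high/low level of each factor.
factors = ["NaCl", "MCC", "SLS", "sucrose", "EC", "CA", "dummy"]
print("run  " + "  ".join(f"{f:>7s}" for f in factors))
for i, run in enumerate(design, start=1):
    print(f"{i:>3d}  " + "  ".join(f"{v:>7d}" for v in run))
# Main effects would then be estimated by contrasting responses at +1 vs. -1
# once the 8 screening experiments have been run.
```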
Neural Network (NN) Methodology in Controlled Drug Delivery
Determining the release profile of a new drug formulation is a time-consuming process, because many drug compositions and other related variables have to be taken into account. The common procedure involves repeated rounds of trial experiments based on the experience of the drug formulators. An experimental study in transdermal drug delivery was performed to assess the applicability of an NN model in modelling and predicting drug release profiles (Lim et al., 2003). A Gaussian mixture model (GMM), an intelligent learning system based on neural networks, was used to model and predict the drug release profiles from experimental data. The simultaneous effect of three process variables (pH and ionic strength of the buffer, and the magnitude of the electrical current) on the iontophoresis process was examined, and the bootstrap method was used to establish the reliability of the GMM predictions. Brier et al. (1995) reported the prediction of peaks and troughs of gentamicin serum concentrations from a set of empirical data using an ANN system, and the results were comparable with those obtained using nonlinear
mixed-effects modelling; they predicted serum drug concentrations for more than 100 patients with various physical characteristics. The ability of NNs to predict the response of patients suffering from a disorder to a particular drug was demonstrated by Valafar and Valafar (1999), who also validated the approach by using a neural network to predict the response to the drug hydroxyurea in 23 sickle cell anaemia patients, reporting 86 to 100% accuracy. Pharmacokinetic models based on ANNs have also been developed to predict plasma drug concentrations and to estimate heparin concentrations in patients undergoing haemodialysis treatment (Valafar and Valafar, 1999). Some research groups have reported that the release of a corticosteroid drug can be controlled by an implant based on a biodegradable polymer. The use of NNs for modelling the resulting drug release data was examined by Rutkowski (2004), where the ANN classes multilayer perceptron (MLP), radial basis function network (RBFN), and generalized regression neural network (GRNN) were used. These NNs are capable of modelling and determining relationships among nonlinear data, approximating an arbitrary function between input and output vectors from training samples (Rutkowski, 2004). The same work also noted the effectiveness of NNs in optimizing drug release profiles for designing controlled release formulations. Many recent articles (Barmpalexis et al., 2011a; Barmpalexis et al., 2011b; Petrović et al., 2012; Chansanroj et al., 2011; Tan et al., 2012) indicate that, in the future, NNs can be integrated with additional soft computing methods, such as fuzzy logic (FL), genetic programming (GP), decision trees (DT), self-organizing maps (SOM), etc.
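As a minimal sketch of the kind of NN release-profile modelling surveyed above, the code below trains a small multilayer perceptron on synthetic formulation data; the input variables, their ranges, and the response are invented for illustration, and the scikit-learn MLP stands in for the MLP/RBFN/GRNN models cited in the literature.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic formulation data: columns are polymer fraction, drug load (mg), buffer pH.
X = rng.uniform([0.1, 50, 4.0], [0.5, 200, 7.4], size=(60, 3))
# Synthetic "fraction released at 8 h" with a nonlinear dependence plus noise.
y = 0.9 - 1.2 * X[:, 0] + 0.001 * X[:, 1] + 0.02 * X[:, 2] + rng.normal(0, 0.02, 60)

model = make_pipeline(
    StandardScaler(),                               # NN training benefits from scaled inputs
    MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=5000, random_state=0),
)
model.fit(X[:45], y[:45])                           # train on the first 45 formulations
print("held-out R^2:", round(model.score(X[45:], y[45:]), 3))
```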
OPTIMIZATION OF BIOPROCESSES
Design of Experiments and Response Surface Methodology Application for Bioprocess
The experimental analysis of any bioprocess generally requires expensive materials to extract the product or to analyse the concentration of a particular metabolite, or even of a drug molecule or its degradation product. Hence, optimization tools can play a vital role in reducing experimental effort. Various advanced optimization techniques are dealt with in the following sections.
Fuzzy Logic Methodology Application for Bioprocess
Fuzzy logic is a qualitative approach to modelling for bioprocess control. Biochemical processes can be controlled by specially designed fuzzy logic controllers (FLCs). Fuzzy logic control based on Takagi-Sugeno inference is useful for the control of baker's yeast fermentation. Strongly non-linear processes, and processes that are difficult to model because of complicated reaction kinetics, can be successfully controlled by fuzzy control; this gives fuzzy logic an advantage over other control methods. However, tuning controllers based on fuzzy control designs is a very time-consuming process (Vasičkaninová et al., 2017). The use of fuzzy logic to assess cancer risk when screening oral potentially malignant disorders has also been proposed by Scrobota et al. (2017).
Genetic Algorithms (GA) Methodology Application for Medical Signals
Genetic algorithms can be used efficiently in biomedical applications such as data extraction and classification. Cardiac diseases can be diagnosed automatically only after successful feature extraction and classification of electrocardiogram (ECG) signals, which can be achieved using wavelet packet decomposition (WPD) together with a genetic-algorithm-optimized neural network (GA-NN). This approach was proposed by Li et al. (2017). The extraction of effective features from ECG signals can be performed by WPD combined with statistical methods. The GA is applied to reduce the dimensionality of the feature sets and to optimize the weights and biases of the NN. The NN classifier with GA optimization is able to classify six types of ECG signals. Applied to an arrhythmia database, the GA-NN system was able to reduce the feature dimensionality by 50%, and the classification results were accurate up to 97.78%. Some studies have shown that the GA-BPNN method can achieve an accuracy of up to 99.33%. Thus, the approach can be usefully applied to the automatic identification of arrhythmias.
Particle Swarm Optimisation (PSO) Methodology Application for Medical Signals
PSO is another algorithm that can be used for biomedical data extraction and optimisation. It is an interesting heuristic method that has produced promising results for solving complex optimization problems. An ANN and PSO were used by Shadmand and Mashoufi (2016) to classify a patient's ECG heartbeats into five classes. A block-based neural network, built from a 2-D array of interconnected blocks, was used as the classifier, and PSO was used to optimise the structure of the network along with its weights. The input to the NN is a vector of features extracted from the ECG signals. Optimising the NN parameters with the PSO algorithm helps to overcome the likely variations in ECG signals, and the trained NN has a distinctive structure for each person. A classification accuracy of 97% was reported using the arrhythmia database.
Differential Evolution (DE) Methodology Application for Bioprocess
There have been numerous developments in the area of bioprocess modelling aimed at better model development and validation, yet the use of modern algorithms is still limited. In the case of fermentation, deviations are observed between model predictions and real conditions, which may lead to unreliable results. Evolutionary algorithms for multi-objective optimization (MOO) problems, including differential evolution (DE) and the genetic algorithm (GA), yield optimal solutions. Rocha et al. (2007) described how these two algorithms fit within evolutionary algorithms (EA) in the context of optimal control. The advantages of DE include consistent results across related mathematical representations of a process; another advantage is that it copes with noise in the state variables. On the other hand, when noisier settings are considered, other optimisation methods prove superior to differential evolution (DE). DE has been used to optimize culture conditions for the production of Chinese hamster ovary (CHO) cells. According to the authors, accounting for perturbations in the growth-medium components and optimizing the component composition are central to cell productivity. They combined an experimental model of CHO cell growth with metabolic flux analysis (MFA) and DE, and concluded that this method is excellent for connecting the input space with the output variables (Satya Eswari and Venkateswarlu, 2012). Evolutionary algorithms (GA and DE) have also been linked with
RSM models. For this purpose, numerous candidate solutions, corresponding to the population in the variable space, are generated and evaluated with the RSM model. Since RSM does not capture all of the nonlinearities, an ANN-based RSM is also used. MOO strategies derived from either the RSM model or an ANN-RSM-DE approach were used for high rhamnolipid production by Pseudomonas aeruginosa AT10. Carbon (X1, g dm−3), nitrogen (NaNO3) (X2, g dm−3), phosphate (K2HPO4/KH2PO4) (X3, g dm−3), and iron (FeSO4·7H2O) (X4, g dm−3) were considered the critical components, and the responses were the biomass (Y1, g dm−3) and rhamnolipid (Y2, g dm−3) concentrations. For pilot production of rhamnolipids, these medium components need to be optimized (Satya Eswari et al., 2013).
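The following is a minimal sketch of the differential evolution (DE/rand/1/bin) loop itself, applied to a stand-in objective function; a real bioprocess study would replace the objective with, for example, a predicted yield from an RSM or ANN model, and the bounds and control parameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def objective(x):
    """Stand-in objective (sphere function) to minimize; replace with a
    bioprocess model prediction in a real study."""
    return np.sum(x**2)

def differential_evolution(func, bounds, pop_size=20, F=0.8, CR=0.9, generations=100):
    dim = len(bounds)
    low, high = np.array(bounds).T
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fitness = np.array([func(p) for p in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # DE/rand/1 mutation: combine three distinct random vectors
            a, b, c = pop[rng.choice([j for j in range(pop_size) if j != i], 3, replace=False)]
            mutant = np.clip(a + F * (b - c), low, high)
            # Binomial crossover between target and mutant
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            # Greedy selection: keep the better of target and trial
            f_trial = func(trial)
            if f_trial < fitness[i]:
                pop[i], fitness[i] = trial, f_trial
    best = np.argmin(fitness)
    return pop[best], fitness[best]

x_best, f_best = differential_evolution(objective, bounds=[(-5, 5)] * 4)
print("best solution:", np.round(x_best, 3), "objective:", round(f_best, 4))
```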
Ant Colony Optimisation (ACO) Methodology Application for Bioprocess
The ant colony optimisation (ACO) algorithm can be applied as an optimisation method for bioprocesses. The kinetic and film-thickness parameters for the treatment of wastewater in an anaerobic fixed-bed biofilm reactor (AFBR) have been assessed by inverse modelling (IM) combined with ant colony optimization (ACO). The model parameters can be determined from experimental data on the treatment of industrial pharmaceutical wastewater (PWW). Different modelling schemes, resulting from combinations of mathematical prototypes, time-series data, and different optimization flow charts, were assessed, and a two-dimensional model with Haldane kinetics was shown to describe the treatment of the PWW best. Such mathematical and kinetic models are worthwhile for PWW treatment (Satya Eswari and Venkateswarlu, 2013). Obtaining reliable models from experimental data is important; the models are influenced by a number of decision variables, and the best decision variables must be identified. The ant colony optimization (ACO) algorithm can be used efficiently for variable selection. Pessoa et al. (2015) used Saccharomyces cerevisiae fermentation data to select variables for trail update and model comparison. ACO has also been used with flux balance analysis to optimize gene knockout identification by Salleh et al. (2015). The growing demand for biochemicals in various industries has led to the design of microbial cell factories. Natural producers mostly have very low product yields, and a gene knockout strategy can improve metabolite production. Some optimal microbial strains, to be used as cell factories, have been designed using computational algorithms developed specifically for this purpose. However, the large genome size of microorganisms poses difficulties in finding the optimal combination of genes to be knocked out. A hybrid of genetic ant colony optimization (G-ACO) and flux balance analysis (FBA), namely G-ACO-FBA, was developed to find the optimal gene knockouts for intensifying biochemical production. The best knockout genes were used to raise the production of four tested metabolites while maintaining a near-optimal growth rate in E. coli and S. cerevisiae genome-scale models, and the approach was shown to be more effective than existing methods.
Neural Network (NN)
The selection of a cell disruption technique depends on many factors, such as the type of cell, its growth status, and the molecules to be detected. The most common lysis-inducing chemical agents are EDTA, lysozyme, Triton X, and polymyxin B. The degree of cell lysis is generally detected from the soluble protein concentration in the lysate. A design-of-experiments strategy for efficient cell disruption protocols was developed by Glauche et al. (2016). Monitoring of a bioprocess can be done by using a feed-forward neural network in combination with 2-D fluorescence spectroscopy. In the traditional approach, offline measurements were needed to train the neural network. In this method, however, there is no need for
any offline measurement; instead a theoretical model of the process can be used to simulate the process state at a given time.
Multi Objective Optimization (MOO) Methodology Application for Bioprocess: Non-Dominated Sorting Genetic Algorithms (NSGA) and Non-Dominated Sorting Differential Evolution (NSDE)
Industrial bioprocesses mainly produce more than one product in a batch. Such processes need to be optimized for better utilization of the substrate and for obtaining more favourable products. MOO is concerned with simultaneously optimizing two or more objectives. In the case of conflicting objectives, optimizing one objective results in compromising the others. The non-dominated solutions in this case are called Pareto-optimal or non-inferior solutions. These Pareto-optimal solutions correspond to different optimal operating conditions and together form the Pareto set. Different methods are employed for solving conflicting multi-objective optimization problems. Classical methods of multiobjective optimization, namely weighting and ε-constraints, convert the multi-objective problem into a single-objective one and solve it to obtain Pareto-optimal solutions (Chankong and Haimes, 1983; Ray, 1989). The problem with this approach is that it iteratively changes the solution to obtain a single optimized point, using a point-by-point approach. The MOO approach, on the other hand, uses EAs, namely DE and GA, to yield many Pareto solutions. Deb (2001), Rajesh et al. (2001), and Oh et al. (2002) provide detailed information on the use of evolutionary algorithms (EA) for MOO. MOO strategies have been developed by integrating an EA with soft computing tools for the formulation of pharmacological products. Satya Eswari et al. (2017) developed RBFN-NSDE (radial basis function network with non-dominated sorting differential evolution) and applied it to the optimal formulation of a trapidil product with conflicting output (OP) features. RBFN models are established by means of orthogonal designs (OD) of the trapidil formulation variables, namely the amounts of MCC (microcrystalline cellulose) and HMC (hydroxypropyl methylcellulose) and the CP (compression pressure), and the corresponding output (OP) characteristic data of RO (release order) and RC (rate constant).
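As a small illustration of the Pareto concepts above, the sketch below extracts the non-dominated (Pareto-optimal) subset from a set of made-up two-objective candidates; it is not an implementation of NSGA or NSDE, only of the dominance test that such algorithms rely on.

```python
import numpy as np

def dominates(a, b):
    """True if candidate a dominates b when maximizing all objectives:
    a is at least as good in every objective and strictly better in one."""
    return np.all(a >= b) and np.any(a > b)

def pareto_front(points):
    """Return the indices of non-dominated points (the Pareto set)."""
    front = []
    for i, p in enumerate(points):
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i):
            front.append(i)
    return front

# Made-up candidates: columns could be (biomass yield, rhamnolipid yield), both maximized.
candidates = np.array([[2.1, 0.8], [2.5, 0.6], [1.8, 1.1], [2.4, 0.9], [2.0, 0.7]])
idx = pareto_front(candidates)
print("Pareto-optimal candidates:", candidates[idx])
```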
CONCLUSION
Experimental work in biological fields is always laborious and cumbersome. Hence, it is attractive to reduce the number of experiments by using design of experiments and modelling of the bioprocess. Optimization concepts in controlled drug delivery and bioprocess optimization can greatly reduce experimental effort and cost. Before going into large-scale production of biochemicals, or into animal trials for controlled drug delivery, it is essential to optimize the fermentation media and formulation conditions. This chapter has discussed the classical and advanced optimization methodologies that are applied to these biological problems. The statistical approaches save experimental time in developing processes and improving quality, and optimizing a process leads to a reduction in the overall cost of production. The choice of optimization method depends on simplicity, efficiency, and time consumption. This chapter has explained in detail the use of various optimization techniques in bioprocess engineering and in controlled drug delivery formulations.
REFERENCES Barmpalexis, P., Kachrimanis, K., & Georgarakis, E. (2011a). Solid dispersions in the development of animodipine floating tablet formulation and optimization by artificial neural networks and genetic programming. European Journal of Pharmaceutics and Biopharmaceutics, 77(1), 122–131. doi:10.1016/j. ejpb.2010.09.017 PMID:20934511 Barmpalexis, P., Kachrimanis, K., Tsakonas, A., & Georgarakis, E. (2011b). Symbolic regression via genetic programming in the optimization of a controlled release pharmaceutical formulation. Chemometrics and Intelligent Laboratory Systems, 107(1), 75–82. doi:10.1016/j.chemolab.2011.01.012 Brier, E., Zurada, J. M., & Aronoff, G. R. (1995). Neural network predicted peak and trough of gentamicin concentrations. Pharmaceutical Research, 12(3), 406–412. doi:10.1023/A:1016260720218 PMID:7617529 Cartensen, J. T. (1996). Modeling and data treatment in the pharmaceutical sciences. New York: Technomic Publishing Co. Inc. Chankong, V., & Haimes, Y. Y. (1983). Multiobjective Decision making theory & methodology. New York: Elsevier. Chansanroj, K., Petrović, J., Ibrić, S., & Betz, G. (2011). Drug release control and system understanding of sucrose esters matrix tablets by artificial neural networks. European Journal of Pharmaceutical Sciences, 44(3), 321–331. doi:10.1016/j.ejps.2011.08.012 PMID:21878388 Cheng, K. K., Wu, J., Lin, Z. N., & Zhang, J. A. (2014). Aerobic and sequential anaerobic fermentation to produce xylitol and ethanol using non-detoxified acid pretreated corncob. Biotechnology for Biofuels, 7(1), 166. doi:10.118613068-014-0166-y PMID:25431622 Dash, S., Murthy, P. N., Nath, L., & Chowdhury, P. (2010). Kinetic modelling on drug release from controlled drug delivery systems. Acta Poloniae Pharmaceutica-Drug Research, 67(3), 217–223. PMID:20524422 Deb, K. (2001). Multi-objective optimization using evolutionary algorithms. New York: John Wiley & Sons Limited. Dressman, J. B., & Fleisher, D. (1986). Mixing-tank model for predicting dissolution rate control or oral absorption. Journal of Pharmaceutical Sciences, 75(2), 109–116. doi:10.1002/jps.2600750202 PMID:3958917 Glauche, F., Pilarek, M., Bournazou, M. N. C., Grunzel, P., & Neubauer, O. (2016). Design of experiments based high-throughput strategy for development and optimization of efficient cell disruption protocols. Engineering in Life Sciences, 1–7. Huang, L., Shi, Y., Wang, N., & Dong, Y. (2014). Anaerobic/aerobic conditions and biostimulation for enhanced chlorophenols degradation in biocathode microbial fuel cells. Biodegradation, 25(4), 615–632. doi:10.100710532-014-9686-1 PMID:24902896
Ivana, P., Jovana, G., Jelena, D., & Aleksandar, J. (2017). Applicaton of artificial neural networks in modelling and optimization of biofuels production. Journal on Processing and Energy in Agriculture, 21(2), 66–70. doi:10.5937/JPEA1702066P Kumari, K., Prasad, K., & Kundu, P. P. (2009). Optimization of chlorpheniramine maleate (CPM) delivery by response surface methodology – four component modeling using various response times and concentrations of chitosan-alanine, glutaraldehyde and CPM. Express Polymer Letters, 3(4), 207–218. doi:10.3144/expresspolymlett.2009.27 Langer, R. S., & Wise, D. L. (Eds.). (1984). Medical Applications of Controlled Release, Applications and Evaluation (Vols. 1-2). Boca Raton, FL: CRC Press. Li, H., Yuan, D., Ma, X., Cui, D., & Cao, L. (2017). Genetic algorithm for the optimization of features and neural networks in ECG signals classification. Scientific Reports, 7. PMID:28139677 Li, V. H. K., & Lee, V. H. L. (1987). Controlled drug delivery: Fundamentals and applications (2nd ed.). Marcel Dekker Inc. Lim, C. P., Quek, S. S., & Peh, K. K. (2003). Prediction of drug release profiles using an intelligent learning system: An experimental study in transdermal iontophoresis. Journal of Pharmaceutical and Biomedical Analysis, 31(1), 159–168. doi:10.1016/S0731-7085(02)00573-3 PMID:12560060 Nair, G., Jungreuthmayer, C., & Zanghellini, J. (2017). Optimal knockout strategies in genome-scale metabolic networks using particle swarm optimization. BMC Bioinformatics, 18(1), 78. doi:10.118612859017-1483-5 PMID:28143607 Oh, P. P., Rangaiah, G. P., & Ray, A. K. (2002). Simulation and multi-objective optimization of an industrial hydrogen plant based on refinery off-gas. Industrial & Engineering Chemistry Research, 41(9), 248–2261. doi:10.1021/ie010277n Olsson, L. (2007). Biofuels. Berlin, Germany: Springer-Verlag. doi:10.1007/978-3-540-73651-6 Patel, A., Dodiya, H., Shelate, P., Shastri, D., & Dave, D. (2016). Design, Characterization, and Optimization of Controlled Drug Delivery System Containing Antibiotic Drug/s. Journal of Drug Delivery, 2016, 9024173. doi:10.1155/2016/9024173 PMID:27610247 Pessoa, C. M., Ranzan, C., Trierweiler, L. F., & Trierweile, J. O. (2015). Development of Ant Colony Optimization (ACO) Algorithms Based on Statistical Analysis and Hypothesis Testing for Variable Selection. IFAC-PapersOnLine, 48(8), 900–905. doi:10.1016/j.ifacol.2015.09.084 Petrović, J., Ibrić, S., Betz, G., & Djurić, Z. (2012). Optimization of matrix tablets controlled drug release using Elman dynamic neural networks and decision trees. International Journal of Pharmaceutics, 428(1-2), 57–67. doi:10.1016/j.ijpharm.2012.02.031 PMID:22402474 Rafieniaa, M., Amirib, M., Janmalekic, M., & Sadeghiand, A. (2010). Application of artificial neural networks in controlled drug delivery systems. Applied Artificial Intelligence, 24(8), 807–820. doi:10.1 080/08839514.2010.508606 Rajesh, J. K., Gupta, S. K., Rangaiah, G. P., & Ray, A. K. (2001). Multi-objective optimization of industrial hydrogen plants. Chemical Engineering Science, 56(3), 999–1010. doi:10.1016/S0009-2509(00)00316-X
Ray, W. H. (1989). Advanced Process Control. New York: Butterworths. Rocha, M., Pinto, J. P., Rocha, I., & Ferreira, E. C. (2007). Evaluating Evolutionary Algorithms and Differential Evolution for the Online Optimization of Fermentation Processes. In E. Marchiori, J. H. Moore, & J. C. Rajapakse (Eds.), Lecture Notes in Computer Science: Vol. 4447. Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2007. Berlin: Springer. doi:10.1007/9783-540-71783-6_23 Rutkowski, L. (2004). Generalized regression neural networks in time-varying environment. IEEE Transactions on Neural Networks, 15(3), 576–596. doi:10.1109/TNN.2004.826127 PMID:15384547 Satya, E. J., & Ch, V. (2012). Optimization of Culture Conditions for Chinese Hamster Ovary (CHO) Cells Production Using Differential Evolution. International Journal of Pharmacy and Pharmaceutical Sciences, 4(1), 465–470. Satya, E. J., & Ch, V. (2013). Evaluation of Kinetic Parameters of an Anaerobic Biofilm Reactor Treating Pharmaceutical Industry Wastewater by Ant Colony Optimization. [PubMed]. Environmental Engineering Science, 30(9), 527–535. doi:10.1089/ees.2012.0158 Satya Eswari, J., & Mohan, A. (2013). Optimum Culture Medium Composition for Rhamnolipid Production by Pseudomonas Aeruginosa AT10 Using a Novel Multiobjective Optimization Method. Journal of Chemical Technology and Biotechnology (Oxford, Oxfordshire), 88(2), 271–279. doi:10.1002/jctb.3825 Scrobotă, I., Băciuț, G., Filip, A. G., Todor, B., Blaga, F., & Băciuț, M. F. (2017). Application of Fuzzy Logic in Oral Cancer Risk Assessment. Iranian Journal of Public Health, 46(5), 612–619. PMID:28560191 Sendrescu, D. (2013). Parameter Identification of Anaerobic Wastewater Treatment Bioprocesses Using Particle Swarm Optimization. Mathematical Problems in Engineering, 2013, 103748. doi:10.1155/2013/103748 Shadmand, S., & Mashoufi, B. (2016). Personalized ECG signal classification using block-based neuralnetwork and particle swarm optimization. Biomedical Signal Processing and Control, 25, 203–208. doi:10.1016/j.bspc.2015.10.008 Tan, C., & Degim, I. T. (2012). Development of sustained release formulation of an antithrombotic drug and application of fuzzy logic. Pharmaceutical Development and Technology, 17(2), 242–250. doi:10. 3109/10837450.2010.531739 PMID:21062232 Valafar, H., & Valafar, F. (1999). Prediction of a patient’s response to a specific drug treatment using artificial neural networks. IEEE, 3694–3697. Vasičkaninová, A., Bakošová, M., & Mészáros, A. (2017). Control of a biochemical process using fuzzy approach. 2017 21st International Conference on Process Control (PC), 173-178.
This research was previously published in Design and Development of Affordable Healthcare Technologies; pages 180-190, copyright year 2018 by Medical Information Science Reference (an imprint of IGI Global).
Chapter 32
Application of Computational Intelligence in Network Intrusion Detection: A Review
Heba F. Eid Al Azhar University, Egypt
ABSTRACT
Intrusion detection systems play an important role in network security. However, network intrusion detection (NID) suffers from several problems, such as false positives, operational issues with high-dimensional data, and the difficulty of detecting unknown threats. Most of the problems with intrusion detection are caused by improper implementation of the network intrusion detection system (NIDS). Over the past few years, computational intelligence (CI) has become an effective area in extending research capabilities. Thus, NIDS based upon CI is currently attracting considerable interest from the research community. The scope of this review encompasses the concept of NID and presents the core methods of CI, including support vector machines, hidden naïve Bayes, particle swarm optimization, genetic algorithms, and fuzzy logic. The findings of this review should provide useful insights into the application of different CI methods for NIDS in the literature, allowing existing research challenges and progress to be clearly defined and promising new research directions to be highlighted.
DOI: 10.4018/978-1-7998-8048-6.ch032
INTRUSION DETECTION SYSTEM
Heady et al. (1990) define an intrusion as any set of actions that attempt to compromise the integrity, confidentiality, and availability of host or network resources. James P. Anderson (1980) divides system intruders into four categories:
1. External Intruders: Unauthorized users of the machines they attack.
2. Masquerader: A user who has gained access to the system and attempts to use the authentication information of another user. The masquerader can be either an external penetrator or another authorized user of the system.
3. Misfeasor: A user who has legitimate access to privileged information but abuses this privilege to violate the security policy of the installation.
4. Clandestine User: A user who operates at a level below the normal auditing mechanisms, perhaps by accessing the machine with supervisory privileges.
In 1980, Anderson proposed the concept of intrusion detection (ID) (Anderson, 1980). ID is based on the assumption that the behavior of intruders differs from that of a legitimate user (Stallings, 2006). An intrusion detection system (IDS) dynamically monitors the events taking place in a system and decides whether these events are symptomatic of an attack (intrusion) or constitute legitimate use of the system (Debar et al., 1999). Figure 1 presents the general structure of an intrusion detection system.
Figure 1. General structure of intrusion detection system
INTRUSION DETECTION SYSTEM TAXONOMY
There are several ways to categorize an IDS, depending on the location of the IDS in the system, the detection methodology used to generate alerts, and the response action taken against the intrusion, as shown in Figure 2. Depending on the IDS location, the first type of IDS to appear was the host-based intrusion detection system (HIDS) (Axelsson, 2000). HIDSs are installed on the host and monitor operating system information (e.g., system call sequences and application logs) (Debar et al., 1999). By checking traffic before it is sent or just after it is received, HIDSs have the advantage of being able to detect attacks from the inside. However, the main problem of HIDSs is that they can only monitor the single host they are
running on, and have to be specifically set up for each host. Thus, scalability is the main problem for HIDSs (Endorf et al., 2004; Bace & Mell, 2001). A network-based intrusion detection system (NIDS) identifies intrusions on external interfaces from network traffic among multiple hosts. A NIDS gains access to network traffic by placing sniffers at hubs or network switches to monitor packets traveling over various communication media. The main advantage of a NIDS is that a single system can be used to monitor the whole network. However, the main disadvantage of NIDSs is that they can have difficulties when processing large amounts of network packets (Sommers et al., 2004).
Figure 2. Intrusion detection system taxonomy
Depending on the detection methodology, IDSs can be divided into two techniques: misuse detection and anomaly detection (Biermann et al., 2001; Verwoerd & Hunt, 2002). Misuse intrusion detection (signature-based detection) is the most popular commercial type of IDS (Zhengbing et al., 2008). A misuse-based detection system contains a database which includes a number of signatures of known attacks. The audit data collected by the IDS is compared with the well-defined patterns of the database and, if a match is found, an alert is generated (Ilgun et al., 1995; Marchette, 1999). The main drawback of misuse-based detection is that it is not able to alert the system administrator in the case of new attacks (Bace & Mell, 2001). Anomaly intrusion detection is a behavior-based detection method; it is based on the idea of building normal traffic profiles (Denning, 1987). It identifies malicious traffic based on deviations from the normal profiles, where the normal patterns are constructed from statistical measures of the system features (Mukkamala et al., 2002; Brown et al., 2001). One of the key advantages of anomaly-based IDS over misuse detection is that it can detect new threats, or different versions of known threats (Lundin & Jonsson, 2002; Gong, 2003). According to the response technique, an IDS can be categorized as active or passive. An active IDS is also known as an intrusion detection and prevention system (IDPS). The IDPS is configured to automatically block suspected intrusions without any intervention required by an administrator. An IDPS has the advantage of providing real-time corrective action in response to an attack. A passive IDS is a
system which is configured to monitor and analyze network traffic activity and only raises an alert if a potential intrusion is found. A passive IDS is not capable of performing any protective or corrective operations on its own (Endorf et al., 2004).
COMPUTATIONAL INTELLIGENCE
Computational intelligence (CI) is a fairly new research field; the term was first used in 1990 by the IEEE Neural Networks Council (Dote & Ovaska, 2001). CI provides a combination of methods such as learning, adaptation, optimization, and evolution to create intelligent systems. Bezdek (1994) defined CI as follows: a system is computationally intelligent when it deals with only numerical (low-level) data, has pattern recognition components, and does not use knowledge in the AI sense. Eberhart et al. (1996) define CI as a methodology involving computing that exhibits an ability to learn and/or deal with new situations, such that the system is perceived to possess one or more attributes of reason, such as generalization, discovery, association, and abstraction. Poole et al. (1998) defined CI as the study of the design of intelligent agents, where an intelligent agent is a system that acts intelligently: it is flexible to changing environments and changing goals, it learns from experience, and it makes appropriate choices given perceptual limitations and finite computation. The characteristics of CI methods, such as adaptation, fault tolerance, and high computational speed, satisfy the requirements for building an IDS.
COMPUTATIONAL INTELLIGENCE METHODS
The intrusion detection problem can be approached by using different computational intelligence methods. This section presents an overview of modern CI methods that have been used to solve ID problems. These CI methods include support vector machines, hidden naïve Bayes, particle swarm optimization, genetic algorithms, and fuzzy logic.
SUPPORT VECTOR MACHINES
The support vector machine (SVM) method is a CI classification technique based on statistical learning theory (SLT). It is based on the idea of a hyperplane classifier, or linear separability. The goal of SVM is to find an optimal linear hyperplane so that the margin of separation between the two classes is maximized (Vapnik, 1998; Burges, 1998). Suppose we have N training data points \{(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_N, y_N)\}, where x_i \in R^d and y_i \in \{+1, -1\}. Consider a hyperplane defined by (w, b), where w is a weight vector and b is a bias. A new object x can be classified with the following function:

f(x) = \operatorname{sign}(w \cdot x + b) = \operatorname{sign}\left( \sum_{i=1}^{N} \alpha_i y_i (x_i \cdot x) + b \right)  (1)

In practice the data are often not linearly separable. However, one can still implement a linear model by transforming the data points via a non-linear mapping to another, higher-dimensional feature space, such that the data points become linearly separable. This mapping is done by a kernel function K. The nonlinear decision function of the SVM is then given by:

f(x) = \operatorname{sign}\left( \sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + b \right)  (2)
where K(x_i, x) is the kernel function. SVM has recently been used in many applications, since it has some advantages compared with conventional machine learning methods (Kim et al., 2005a, 2005b):
1. There are only two free parameters to be chosen: the upper bound and the kernel parameter.
2. The solution of the SVM is optimal and global, since the training of an SVM is done by solving a linearly constrained quadratic problem.
3. Good generalization performance and good robustness.
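As a hedged illustration of kernel-based SVM classification in the sense of Equations (1) and (2), the sketch below trains an RBF-kernel SVM on a small synthetic two-class dataset with scikit-learn; the data, class labels, and parameter values are placeholders rather than settings from any study reviewed here.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic two-class data standing in for "normal" vs. "attack" feature vectors.
X_normal = rng.normal(loc=0.0, scale=1.0, size=(200, 5))
X_attack = rng.normal(loc=2.0, scale=1.5, size=(200, 5))
X = np.vstack([X_normal, X_attack])
y = np.array([0] * 200 + [1] * 200)          # 0 = normal, 1 = intrusion

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# The RBF kernel corresponds to K(x_i, x) in Eq. (2); C bounds the alpha_i coefficients.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_tr, y_tr)
print("test accuracy:", round(clf.score(X_te, y_te), 3))
```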
HIDDEN NAÏVE BAYES
Hidden naïve Bayes (HNB) was first proposed by Jiang et al. (2009). HNB inherits its structure from naïve Bayes, but creates a hidden parent for each attribute to combine the influences from all other attributes. Hidden parents are defined by the average of weighted one-dependence estimators. In HNB, each attribute A_i has a hidden parent A_{hp_i}, i = 1, 2, \ldots, n, where C is the class variable and c(E) represents the class of instance E. The joint distribution represented by an HNB is given by:
P(A_1, \ldots, A_n, C) = P(C) \prod_{i=1}^{n} P(A_i \mid A_{hp_i}, C)  (3)

where

P(A_i \mid A_{hp_i}, C) = \sum_{j=1, j \neq i}^{n} W_{ij} \cdot P(A_i \mid A_j, C)  (4)

and

\sum_{j=1, j \neq i}^{n} W_{ij} = 1  (5)

The weight W_{ij} is computed directly from the conditional mutual information between the two attributes A_i and A_j:

W_{ij} = \frac{I_P(A_i; A_j \mid C)}{\sum_{j=1, j \neq i}^{n} I_P(A_i; A_j \mid C)}  (6)

where

I_P(A_i; A_j \mid C) = \sum_{a_i, a_j, c} P(a_i, a_j, c) \log \frac{P(a_i, a_j \mid c)}{P(a_i \mid c)\, P(a_j \mid c)}  (7)

The HNB classifier on E = (a_1, \ldots, a_n) is defined as follows:

c(E) = \arg\max_{c \in C} P(c) \prod_{i=1}^{n} P(a_i \mid a_{hp_i}, c)  (8)
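To illustrate how the HNB weights of Equations (6) and (7) can be obtained, the following sketch computes empirical conditional mutual information between attribute pairs on a tiny made-up categorical dataset and normalizes it into the weight matrix W; it is only a sketch of the weighting step, not a full HNB classifier.

```python
import numpy as np
from collections import Counter

def cond_mutual_info(ai_col, aj_col, c_col):
    """Empirical conditional mutual information I(A_i; A_j | C), as in Eq. (7)."""
    n = len(c_col)
    joint = Counter(zip(ai_col, aj_col, c_col))
    p_ac = Counter(zip(ai_col, c_col))
    p_bc = Counter(zip(aj_col, c_col))
    p_c = Counter(c_col)
    mi = 0.0
    for (a, b, c), cnt in joint.items():
        p_abc = cnt / n
        p_ab_given_c = cnt / p_c[c]
        p_a_given_c = p_ac[(a, c)] / p_c[c]
        p_b_given_c = p_bc[(b, c)] / p_c[c]
        mi += p_abc * np.log(p_ab_given_c / (p_a_given_c * p_b_given_c))
    return mi

# Toy categorical data: three attributes and a class label (all values are made up).
data = np.array([
    ["tcp", "http", "low",  "normal"],
    ["tcp", "http", "high", "attack"],
    ["udp", "dns",  "low",  "normal"],
    ["tcp", "ftp",  "high", "attack"],
    ["udp", "dns",  "low",  "normal"],
    ["tcp", "ftp",  "high", "attack"],
])
A, C = data[:, :3], data[:, 3]
n_attr = A.shape[1]

# Eq. (6): normalize the pairwise conditional mutual information into weights W_ij.
W = np.zeros((n_attr, n_attr))
for i in range(n_attr):
    mi = [cond_mutual_info(A[:, i], A[:, j], C) if j != i else 0.0 for j in range(n_attr)]
    total = sum(mi)
    W[i] = [m / total if total > 0 else 0.0 for m in mi]
print(np.round(W, 3))
```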
PARTICLE SWARM OPTIMIZATION
The particle swarm optimization (PSO) technique was developed by Kennedy and Eberhart (1995). PSO simulates the social behavior of organisms, such as bird flocking. PSO is initialized with a random population (swarm) of individuals (particles), where each particle of the swarm represents a candidate solution in the d-dimensional search space. To discover the best solution, at every iteration of the PSO algorithm each particle X_i is updated using the two best values pbest and gbest. Here, pbest denotes the best solution the particle X_i has achieved so far, represented by P_i = (p_{i1}, p_{i2}, \ldots, p_{id}), while gbest denotes the global best position gained by the swarm so far, represented by G_i = (g_{i1}, g_{i2}, \ldots, g_{id}) (Venter & Sobieski, 2003). The d-dimensional position of particle i at iteration t can be represented as:

x_i^t = (x_{i1}^t, x_{i2}^t, \ldots, x_{id}^t)  (9)

while the velocity (the rate of position change) of particle i at iteration t is given by:

v_i^t = (v_{i1}^t, v_{i2}^t, \ldots, v_{id}^t)  (10)

All of the particles have fitness values, which are evaluated based on a fitness function:

Fitness = \alpha \cdot \gamma_R(D) + \beta \cdot \frac{|C| - |R|}{|C|}  (11)

where \gamma_R(D) is the classification quality of condition attribute set R relative to decision D, |R| is the length of the selected feature subset, and |C| is the total number of features. The parameters \alpha and \beta correspond to the importance of classification quality and subset length, with \alpha \in [0, 1] and \beta = 1 - \alpha.
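The sketch below shows a minimal PSO loop with pbest/gbest updates, minimizing a stand-in objective; in a NIDS feature-selection setting the objective would be replaced by a classifier-based fitness such as Equation (11), and all parameter values here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

def objective(x):
    """Stand-in objective to minimize; a real NIDS study would use a
    classifier-based fitness such as Eq. (11)."""
    return np.sum((x - 3.0) ** 2)

def pso(func, dim=4, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5):
    pos = rng.uniform(-10, 10, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                 # each particle's best position
    pbest_val = np.array([func(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()         # swarm's global best position
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([func(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, func(gbest)

best, val = pso(objective)
print("best position:", np.round(best, 3), "objective:", round(val, 5))
```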
GENETIC ALGORITHM
Holland introduced the genetic algorithm (GA) as an adaptive search technique (Holland, 1975). It is a computational model designed to simulate evolutionary processes in nature (Duda et al., 2001). A GA includes three fundamental operators applied to chromosomes: selection, crossover, and mutation.
1. Selection: A population is created from a group of random individuals. The individuals in the population are then evaluated by a fitness function, and individuals are selected as parents for the next generation based on their fitness.
2. Crossover: Crossover randomly chooses a point in the two selected parents and exchanges the remaining segments to create new individuals (offspring).
3. Mutation: Mutation randomly changes one or more components of a selected individual.
This process continues until a suitable solution has been found or a certain number of generations have passed (Jiang et al., 2008). Given a well-bounded problem, GAs can find a global optimum, which makes them well suited to feature selection problems.
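As a minimal sketch of the three GA operators described above, the following code evolves binary strings toward a stand-in fitness (the number of ones); in a feature-selection setting each bit would mark a selected feature and the fitness would be replaced by, for example, classifier accuracy.

```python
import numpy as np

rng = np.random.default_rng(3)

def fitness(individual):
    """Stand-in fitness: count of ones. In feature selection this would be
    replaced by, e.g., classifier accuracy on the selected feature subset."""
    return individual.sum()

def genetic_algorithm(n_bits=20, pop_size=30, generations=50, p_cross=0.9, p_mut=0.02):
    pop = rng.integers(0, 2, size=(pop_size, n_bits))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        new_pop = []
        while len(new_pop) < pop_size:
            # Tournament selection of two parents
            i, j = rng.choice(pop_size, 2, replace=False)
            p1 = pop[i] if scores[i] >= scores[j] else pop[j]
            i, j = rng.choice(pop_size, 2, replace=False)
            p2 = pop[i] if scores[i] >= scores[j] else pop[j]
            # Single-point crossover
            if rng.random() < p_cross:
                point = rng.integers(1, n_bits)
                c1 = np.concatenate([p1[:point], p2[point:]])
                c2 = np.concatenate([p2[:point], p1[point:]])
            else:
                c1, c2 = p1.copy(), p2.copy()
            # Bit-flip mutation
            for child in (c1, c2):
                mask = rng.random(n_bits) < p_mut
                child[mask] ^= 1
                new_pop.append(child)
        pop = np.array(new_pop[:pop_size])
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmax(scores)], scores.max()

best, score = genetic_algorithm()
print("best individual:", best, "fitness:", score)
```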
FUZZY LOGIC
In 1965, Lotfi Zadeh introduced fuzzy set theory, which allows an element to belong to a set with a degree of membership rather than only a binary degree (Zadeh, 1965). This concept was extended to logic by Zadeh (1975) to introduce fuzzy logic (FL). Classical logic deals with propositions which are either true or false, but not both or in between. However, in real-world scenarios there are cases where propositions can be partially true and partially false. Fuzzy logic handles such real-world vagueness by allowing partial truth values.
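A tiny example of partial membership: the triangular membership function below assigns degrees between 0 and 1 to a made-up fuzzy set; the set name and numeric ranges are illustrative only.

```python
def triangular_membership(x, a, b, c):
    """Degree to which x belongs to a fuzzy set with a triangular membership
    function rising from a to the peak b and falling back to zero at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Illustrative fuzzy set "high connection rate" peaking at 500 connections/s.
for rate in (100, 300, 500, 700):
    degree = triangular_membership(rate, 200, 500, 800)
    print(f"rate={rate:>3d} conn/s -> membership in 'high rate' = {degree:.2f}")
```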
NETWORK INTRUSION DETECTION DATASETS
Evaluation of intrusion detection systems has often been based on researchers' proprietary data; thus, results are generally not reproducible enough to be compared and validated. To reduce this problem, commonly used intrusion detection benchmarks have been used for either misuse detection or anomaly detection (Table 1).

Table 1. Network intrusion detection benchmarks

Dataset Name                  Abbreviation    Dataset Link
1998 DARPA TCPDump Dataset    DARPA98         (DARPA98)
1999 DARPA TCPDump Dataset    DARPA99         (DARPA99)
KDD Cup 99 Dataset            KDD99           (KDD99)
NSL-KDD Dataset               NSL-KDD         (NSL-KDD)

(Source: Lincoln Laboratory, 2017a, 2017b; KDD99, n.d.; Canadian Institute for Cybersecurity, n.d.)
In 1998 and 1999 the Massachusetts Institute of Technology (MIT) Lincoln Laboratory (LL), under the sponsorship of the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), created the Intrusion Detection Evaluation (IDEVAL) benchmark (Mahoney & Chan, 2003; McHugh, 2000). The 1998 DARPA dataset includes 7 weeks of labeled training data and 2 weeks of unlabeled test data, and contains over 300 instances of 38 attacks (Fried et al., 2000). The 1999 DARPA dataset presents over 5 million connections over 5 weeks: 2 weeks were attack-free and 3 weeks included attacks. The 1998 and 1999 DARPA datasets have been used to test a large number of host-based and network-based IDSs (Durst et al., 1999). Sal Stolfo and Wenke Lee used the 1998 DARPA benchmark to prepare the KDD'99 dataset (KDD99). The KDD'99 train dataset is about four gigabytes of compressed binary TCP dump data from seven weeks of network traffic, processed into about five million connection records, each of about 100 bytes. The two weeks of test data yield around two million connection records. Each KDD'99 training connection record contains 41 features: 38 numeric features and 3 symbolic features, falling into the following four categories:
• Basic Features: 9 basic features that describe each individual TCP connection.
• Content Features: 13 domain-knowledge-related features which indicate suspicious behavior in the network traffic, such as the number of failed login attempts.
• Time-Based Traffic Features: 9 features used to summarize the connections over a 2-second temporal window, such as the number of connections that had the same destination host or the same service.
• Host-Based Traffic Features: 10 features constructed using a window of 100 connections to the same host instead of a time window, designed to assess attacks which span intervals longer than 2 seconds.
Each record is labeled as either normal or an attack, with one specific attack type. The training set contains a total of 22 attack types, while the testing set contains an additional 17 types of attacks. The attacks fall into four categories:
1. DoS: The attacker tries to prevent legitimate users from using a service, e.g., Neptune, Smurf, Pod, and Teardrop.
2. R2L: Unauthorized access to a local machine from a remote machine, e.g., Guess-password, Ftp-write, Imap, and Phf.
3. U2R: Unauthorized access to root privileges, e.g., Buffer-overflow, Load-module, Perl, and Spy.
4. Probing: e.g., Port-sweep, IP-sweep, Nmap, and Satan.
An empirical study conducted by Leung and Leckie (2005) states important issues which highly affect the performance of evaluated systems and result in a very poor evaluation of anomaly detection approaches. According to their statistical analysis, two problems are reported in the KDD'99 dataset:
1. The KDD'99 dataset contains a huge number of redundant records. The 10% portion of the full dataset contains only two types of DoS attacks (Smurf and Neptune), and these two types constitute over 71% of the testing dataset, which completely distorts the evaluation.
2. Since these attacks consume large volumes of traffic, they are easily detectable by other means, and there is no need to use anomaly detection systems to find them.
To solve these problems, the NSL-KDD dataset was suggested (Tavallaee et al., 2009). NSL-KDD consists of selected records of the complete KDD'99 dataset, in which the repeated records in the entire KDD'99 train and test sets are removed. The KDD'99 dataset contains 4,898,431 records in the train set and 311,027 records in the test set. Table 2 gives statistics of the reduction of repeated records in the KDD train and test sets; the KDD'99 train set is reduced by 78.05% and the test set by 75.15%.

Table 2. KDD'99 dataset reduction statistics

                    Train Set                               Test Set
              Repeated Records   Reduction Rate       Repeated Records   Reduction Rate
Intrusion         3,663,472          93.32%                221,058           88.26%
Normal              159,967          16.44%                 12,680           20.92%
The NSL-KDD dataset has the following advantages over the original KDD'99 dataset (Tavallaee et al., 2009):
1. The train set does not include redundant records; hence the classifiers will not be biased towards more frequent records.
2. The proposed test sets have no duplicate records; therefore, the performances of the learners are not biased by the methods which have better detection rates on the frequent records.
3. The number of records in the train and test sets is reasonable, which makes it affordable to run the experiments on the complete set without the need to randomly select a small portion. Consequently, evaluation results of different research works will be consistent and comparable.
PERFORMANCE COMPARISON OF KDD99 AND NSL-KDD
Eid et al. (2010) conducted an experimental analysis to compare the evaluation performance of the KDD99 and NSL-KDD datasets. The SVM classifier was applied to two test sets, the original KDD'99 test set (KDDTest) and the NSL-KDD test set (KDDTest+), as shown in Table 3.

Table 3. KDD'99 and NSL-KDD dataset testing accuracy comparison

Class Name    Original KDD'99 Test Set    NSL-KDD Test Set
              Test Accuracy               Test Accuracy
Normal        99.8%                       99.5%
DoS           92.5%                       97.5%
U2R           5.10%                       86.6%
R2L           70.2%                       81.3%
Probe         98.3%                       92.8%
As Table 3 shows, the redundant records in the original KDD'99 cause the learning algorithms to be biased towards the frequent records (DoS and Probe attacks), thus preventing the detection algorithms from learning the infrequent records (U2R and R2L attacks). This unbalanced distribution of the KDD'99 testing dataset strongly distorts the evaluation of the detection algorithms: 5.1% accuracy for U2R and 70.2% for R2L. The NSL-KDD test sets, in contrast, have no redundant records; hence, the performance of the learners is not biased by frequent records.
DATA PRE-PROCESSING

Data Transformation

Symbolic features appear frequently in network traffic datasets. However, most machine learning methods are designed to work with numerical data only, so a coding scheme is necessary for these methods to use the information carried by symbolic features. The most commonly used coding scheme establishes a correspondence between each category of a symbolic feature and a sequence of integer values. Yeung and Chow (2002) proposed an unsupervised anomaly NIDS designed to work with numeric data only. They use a coding scheme to transform each symbolic feature of the KDD99 dataset into numeric features by representing each symbolic feature with a group of binary-valued features. The resulting feature vectors have a total of 119 dimensions (Yeung & Chow, 2002).
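As an illustration of the binary coding scheme described above (each category of a symbolic feature becomes its own 0/1 feature), the sketch below one-hot encodes the three symbolic KDD'99 fields; the tiny sample values are hypothetical and only stand in for real connection records.

```python
import pandas as pd

# Toy connection records; protocol_type, service and flag are the symbolic KDD'99 features.
records = pd.DataFrame({
    "duration": [0, 2, 0],
    "protocol_type": ["tcp", "udp", "icmp"],
    "service": ["http", "domain_u", "ecr_i"],
    "flag": ["SF", "SF", "REJ"],
})

# Each symbolic category becomes a separate binary-valued feature,
# which is how the original feature set grows into a much larger numeric vector.
encoded = pd.get_dummies(records, columns=["protocol_type", "service", "flag"])
print(encoded.columns.tolist())
```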
Laskov et al. (2005) apply a data transformation by metric embedding, which transforms the data into a metric space. The metric embedding is performed in a two-stage procedure similar to that reported in Portnoy et al. (2001) and Eskin et al. (2002). Pereira et al. (2009) reported that most machine learning methods used for classification are not able to handle symbolic features directly. They compare three different methods for converting the symbolic features of the KDD99 dataset into numeric features suitable for machine learning methods. The three methods (indicator variables, conditional probabilities and the Separability Split Value method) are contrasted with the arbitrary conversion method. The results demonstrate that the three conversion methods improve the prediction ability of the classifiers with respect to the arbitrary and commonly used assignment of numerical values (Pereira et al., 2009).
Data Normalization

The numerical feature values of the KDD99 dataset have significantly varying resolutions and ranges. Data normalization is therefore required to prevent one feature from dominating another. Li et al. (2007) proposed an anomaly NIDS in which the KDD99 dataset features are normalized by replacing each feature value with its distance to the mean of all the values for that attribute in the instance space. To accomplish this, the mean and standard deviation vectors are calculated (Li et al., 2007).
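A minimal sketch of the normalization step described above: each numeric feature value is replaced by its distance from the feature mean, scaled by the standard deviation (the classic z-score). The matrix here is purely illustrative and stands in for any numeric feature matrix.

```python
import numpy as np

X = np.array([[0.0, 215.0, 45076.0],
              [0.0, 162.0,  4528.0],
              [0.0, 236.0,  1228.0]])   # toy numeric feature matrix

mean = X.mean(axis=0)
std = X.std(axis=0)
std[std == 0] = 1.0                      # avoid division by zero for constant features

X_norm = (X - mean) / std                # each value expressed as distance from the feature mean
print(X_norm)
```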
Data Discretization

Data discretization converts continuous feature values into a nominal space (Mizianty et al., 2010). The goal of the discretization process is to find a set of cut points that split the range of a continuous feature into a small number of intervals. The cut points are real values within the range of the continuous values; each cut point divides the range into two intervals, one greater than the cut point and the other less than or equal to the cut point value (Kotsiantis & Kanellopoulos, 2006). Eid et al. (2013) addressed the impact of applying data discretization when building a network IDS. The authors conducted several groups of experiments on the NSL-KDD dataset. Experimental results show that data discretization has a positive influence on detection speed, which is an important factor if a real-time network IDS is desired (Eid et al., 2013).
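The cut-point idea above can be sketched in a few lines: a list of cut points splits a continuous feature into intervals, and each value is replaced by the index of the interval it falls into. The cut points below are arbitrary illustrative values, not the ones used in the cited studies.

```python
import numpy as np

src_bytes = np.array([0, 105, 520, 1480, 24500])   # a continuous traffic feature
cut_points = [100, 1000, 10000]                     # illustrative cut points only

# np.digitize maps each value to the interval it falls into:
# bin 0: <=100, bin 1: (100, 1000], bin 2: (1000, 10000], bin 3: >10000
bins = np.digitize(src_bytes, cut_points, right=True)
print(bins)   # [0 1 1 2 3]
```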
Data Reduction

Extraneous network dataset features can make it harder to detect suspicious behavior patterns, leading to the curse of dimensionality problem (Shang & Shen, 2006). Hence, data reduction must be performed on high-dimensional datasets. Data reduction has been shown both to reduce the build and test time of classifiers and to improve their detection rate. Dimensionality reduction can be achieved by feature extraction or feature selection. Feature extraction methods create a new set of features by linear or nonlinear combination of the original features, whereas feature selection methods generate a new set of features by selecting only a subset of the original features (Sung & Mukkamala, 2003). Feature selection aims to choose an optimal subset of features that is necessary to increase the classifier's predictive accuracy (Dash et al., 2002; Koller & Sahami, 1996).
Different feature selection methods have been proposed to enhance the performance of IDS (Tsang et al., 2007). Based on their evaluation criteria, feature selection methods fall into two categories: the filter approach (Dash et al., 2002; Yu & Liu, 2003) and the wrapper approach (Kim et al., 2000; Kohavi & John, 1997).

•	Filter approaches evaluate and select the new set of features based on general characteristics of the data, without involving any learning algorithm. The features are ranked according to certain statistical criteria, and the features with the highest ranking values are selected (a brief example follows this list). Frequently used filter methods include the chi-square test (Jin et al., 2006), information gain (Ben-Bassat, 1982), and Pearson correlation coefficients (Peng et al., 2005).
•	Wrapper approaches use a predetermined learning algorithm and use its classification performance as the evaluation criterion to select the new feature set. Machine learning algorithms such as ID3 (Quinlan, 1986) and Bayesian networks (Jemili et al., 2009) are commonly used as the induction algorithm for wrapper approaches.
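A minimal sketch of the filter approach from the list above: every feature is scored independently of any classifier and the highest-ranked subset is kept. scikit-learn's SelectKBest with mutual information is used here purely as an example, on synthetic data rather than on a real traffic dataset.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Synthetic stand-in for a labelled network-traffic feature matrix.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)

# Filter approach: score every feature independently of any classifier,
# then keep the highest-ranked subset.
selector = SelectKBest(score_func=mutual_info_classif, k=5)
X_reduced = selector.fit_transform(X, y)

print("selected feature indices:", selector.get_support(indices=True))
```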
Performance Evaluation Criteria

The effectiveness of an IDS is evaluated by its prediction accuracy and detection speed. The prediction accuracy of an IDS is measured by precision, recall and F-measure. These three measures are calculated from the confusion matrix shown in Table 4, which lists the four possible prediction outcomes: true negatives (TN), true positives (TP), false positives (FP) and false negatives (FN) (Duda et al., 2001; Wu & Banzhaf, 2010).

Table 4. Confusion matrix

                             Predicted Class
                             Normal                  Intrusion
Actual Class   Normal        True negatives (TN)     False positives (FP)
               Intrusion     False negatives (FN)    True positives (TP)
TP and TN indicate the number of attack events and normal events, respectively, that are successfully predicted; FP refers to the number of normal events predicted as attacks, and FN refers to the number of attack events incorrectly predicted as normal.

Recall = TP / (TP + FN)    (12)

Precision = TP / (TP + FP)    (13)
Precision and recall are the most commonly used IDS metrics; the F-measure is a weighted mean that assesses the trade-off between them.
F-measure = (2 × Recall × Precision) / (Recall + Precision)    (14)
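The three measures in Equations (12) through (14) follow directly from the confusion-matrix counts. The helper below simply evaluates them for illustrative counts; the numbers are invented, not taken from any of the cited experiments.

```python
def ids_metrics(tp, fp, fn):
    """Precision, recall and F-measure as defined in Equations (12)-(14)."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * recall * precision / (recall + precision)
    return precision, recall, f_measure

# Illustrative counts: 950 attacks detected, 30 false alarms, 50 missed attacks.
print(ids_metrics(tp=950, fp=30, fn=50))
```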
Support Vector Machine in NID

Sung and Mukkamala (2003) used a support vector machine classifier to build an intrusion detection system; however, the system totally ignores the relationships and dependencies between the features (Sung & Mukkamala, 2003). Yao et al. (2006) propose an enhanced SVM intrusion detection model with a weighted kernel function, in which rough set theory is adopted to perform feature ranking and selection. Their proposed intrusion detection model outperformed the conventional SVM in precision, computation time, and false negative rate (Yao et al., 2006). Shon and Moon (2007) propose a new SVM approach for anomaly IDS, named Enhanced SVM. The overall model consists of an enhanced SVM detection engine and supplementary components such as packet profiling using SOFM, packet filtering using PTF, field selection using a genetic algorithm and packet flow-based data preprocessing; SOFM clustering was used for normal profiling. The Enhanced SVM approach demonstrated an aptitude for detecting novel attacks, while Snort and Bro, signature-based systems used in practice, have well-developed detection performance. The SVM approach exhibited a low false positive rate similar to that of real NIDS (Shon & Moon, 2007). Eid et al. (2010) propose a novel adaptive IDS based on principal component analysis (PCA) and support vector machines (SVMs). By making use of PCA, the dimension of the network data patterns is reduced significantly, and an SVM is then employed to construct classification models based on the training data processed by PCA. Experimental results on the NSL-KDD dataset show that the proposed PCA-SVM NIDS has comparable accuracy to conventional SVMs without PCA while speeding up the intrusion detection process and minimizing the memory space and CPU time cost (Eid et al., 2010). Aburomman and Reaz (2017) compare several methods for creating an SVM-based multiclass ID classifier from a set of binary SVM classifiers; their research aims to identify the multiclass SVM model best suited to the intrusion detection task. The compared methods include one-against-rest SVM (OAR-SVM), one-against-one SVM (OAO-SVM), directed acyclic graph SVM (DAG-SVM), adaptive directed acyclic graph SVM (ADAG-SVM), and error-correcting output code SVM (ECOC-SVM). They also propose a novel approach based on weighted one-against-rest SVM (WOAR-SVM), which uses differential evolution as a meta-heuristic weight optimizer to define the relationship between the decision rules of the binary SVM classifiers. Experimental results on the NSL-KDD dataset indicate that WOAR-SVM shows an improvement in terms of overall accuracy (Aburomman & Reaz, 2017).
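The PCA-SVM idea of Eid et al. (2010), reduce dimensionality first and then train an SVM on the projected data, can be sketched as a standard scikit-learn pipeline. Everything here, including the number of components and the synthetic data, is illustrative rather than the authors' actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a preprocessed NSL-KDD feature matrix.
X, y = make_classification(n_samples=2000, n_features=40, n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale, project onto a handful of principal components, then classify with an SVM.
model = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="rbf"))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```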
Hidden Naïve Bayes in NID

Valdes and Skinner (2000) developed an anomaly detection system based on naïve Bayesian networks. The detection model suffers from the problem that the child nodes do not interact with each other and their output only influences the probability of the root node (Valdes & Skinner, 2000). Eid et al. (2011) introduced and investigated the performance of a hybrid GA-HNB NIDS, in which a GA feature selection approach is used to reduce the feature space and the hidden naïve Bayes (HNB)
approach is then adapted to classify network intrusions. To evaluate the performance of the introduced hybrid GA-HNB NIDS, several experiments were conducted on the NSL-KDD dataset. The experimental results show that the proposed GA-HNB NIDS consistently selects better feature subsets, resulting in classification accuracies of about 98.63%. Moreover, the performance results of the authors' hybrid NIDS were compared with those of five well-known feature selection algorithms, such as chi-square, gain ratio and principal component analysis (PCA) (Eid et al., 2011). Koc and Carswell (2015) introduced a binary classifier model based on the HNB method as an extension of NB that reduces its naivety assumption. The authors augmented the HNB binary classifier with EMD discretization and the CONS feature selection filter method. Experiments using the classic KDD Cup 1999 ID dataset indicate that the HNB binary classification model performs better in terms of detection accuracy, error rate and area under the ROC curve than the traditional NB classifier and can be applied to the intrusion detection problem (Koc & Carswell, 2015).
Particle Swarm Optimization in NID

Srinoy (2007) reported that normal operation often produces traffic that matches attack signatures, resulting in false alarms, and that one main drawback of NIDS is the inability to detect new attacks that do not have known signatures. The author proposed an IDS model in which PSO is used to implement feature selection and SVMs with the one-versus-rest method serve as the fitness function of PSO. Experimental results show that Srinoy's model recognizes not only known attacks but also suspicious activity that may be the result of a new, unknown attack. The proposed model simplifies the features effectively and obtains a higher classification accuracy than other methods (Srinoy, 2007). Wang et al. (2009) proposed a PSO-SVM model for intrusion detection, in which the standard PSO is used to determine the free parameters of the support vector machine and binary PSO is used to obtain the optimal feature subset when building the IDS. The authors carried out a series of experiments on the KDD99 dataset to examine the effectiveness of their proposed NIDS. The experimental results indicate that PSO-SVM is not only able to select the important features but also achieves a higher detection rate than a regular SVM for IDS (Wang et al., 2009). Kumar et al. (2012) present a new collaborative filtering technique for preprocessing the probe type of attacks. They implemented a hybrid classifier based on binary PSO and the random forests algorithm for the classification of probe attacks in a network. The collaborative filtering technique and the random forests algorithm have been successfully applied to find patterns that are suitable for prediction in large volumes of data. Their experimental results demonstrated that as the number of trees used in the forest increases, the false positive rate decreases (Kumar et al., 2012). Bamakan et al. (2015) developed a new model based on multiple criteria linear programming (MCLP) and PSO to enhance the accuracy of attack detection; PSO is used to tune the parameters of MCLP in order to improve the performance of the MCLP classifier. The KDD'99 dataset was used to evaluate the performance of the proposed PSO-MCLP model. The experimental study indicated that the PSO-MCLP model achieves better performance in terms of detection rate, false alarm rate and running time compared with an MCLP model whose parameters have been chosen by the user or by cross validation (Bamakan et al., 2015). Aburomman and Reaz (2016) propose a novel ensemble construction method that uses PSO-generated weights to create an ensemble of classifiers with better accuracy for NID. The authors define an expert as a collection of five binary classifiers that together generate a binary vector of responses, and
the expert opinions are combined in three ways: (1) the PSO approach generates weights using a PSO constructed with manually selected behavioral parameters, and these weights are then used with weighted majority voting (WMV) to combine the expert opinions; (2) the meta-optimized PSO approach is similar to the first, except that the PSO behavioral parameters are optimized using local unimodal sampling (LUS); and (3) the WMA approach combines the opinions directly using the WMA. The three approaches were empirically compared using KDD99 datasets. Experimental results showed that the best results were obtained for PSO, with an average accuracy improvement of 0.756% over the accuracy of the best base expert, and the PSO-based ensemble required a relatively short time to complete its task (Aburomman & Reaz, 2016).
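To make the PSO-SVM idea concrete, the sketch below uses a bare-bones particle swarm to search the (C, gamma) space of an RBF SVM, with cross-validated accuracy as the fitness. The swarm size, iteration count, inertia/acceleration coefficients and search ranges are arbitrary choices for illustration, not values from the cited studies.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=20, n_informative=6, random_state=1)

def fitness(pos):
    # Positions are log10(C) and log10(gamma); fitness is mean CV accuracy.
    clf = SVC(C=10 ** pos[0], gamma=10 ** pos[1], kernel="rbf")
    return cross_val_score(clf, X, y, cv=3).mean()

rng = np.random.default_rng(0)
n_particles, n_iter = 10, 15
low, high = np.array([-2.0, -4.0]), np.array([3.0, 1.0])      # search box in log space

pos = rng.uniform(low, high, size=(n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(n_iter):
    r1, r2 = rng.random((n_particles, 1)), rng.random((n_particles, 1))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, low, high)
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

print("best (log10 C, log10 gamma):", gbest, "CV accuracy:", pbest_fit.max())
```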
GENETIC ALGORITHM IN NID

Crosbie and Spafford (1995) applied multiple-agent technology and Genetic Programming (GP) to detect network intrusions. For the agents, they used GP to determine anomalous network behaviors, and each agent can monitor one parameter of the network audit data. The proposed model has advantages when many small autonomous agents are used, but it suffers from problems in the communication among the agents (Crosbie & Spafford, 1995). Stein et al. (2005) use a genetic algorithm to select a subset of features for decision tree classifiers; the authors' intrusion detection model increases the detection accuracy and decreases the false alarm rate (Stein et al., 2005). Goyal and Kumar (2008) described a GA-based algorithm to classify all types of smurf attack. Their GA takes different features of the network connections into consideration to generate a classification rule set, and the authors were able to generate a rule, using the principles of evolution in a GA, that classifies all smurf attack labels in the KDD99 training dataset. Their false positive rate is quite low at 0.2% and their accuracy rate is as high as 100% (Goyal & Kumar, 2008). Khan (2011) showed that formulating a NIDS in terms of rules is an efficient approach to classifying various types of attack: DoS and Probing attacks are relatively more common and can be detected more accurately if the contributing parameters are formulated as rules. Khan used a genetic algorithm to devise such rules, and his experiments show that the accuracy of rule-based learning increases with the number of iterations (Khan, 2011). Kuang et al. (2014) present a hybrid kernel principal component analysis (KPCA), support vector machine (SVM) and genetic algorithm (GA) model to enhance the detection precision for low-frequency attacks and the detection stability. In the proposed model, a multi-layer SVM classifier is adopted, where KPCA is used as a preprocessor to reduce the dimension of the feature vectors and shorten training time, and GA is employed to optimize the punishment factor C, the kernel parameter σ and the tube size ε of the SVM. Experimental results on the KDD dataset show that the classification accuracies of the proposed hybrid KPCA-SVM-GA model are superior to those of SVM classifiers whose parameters are randomly selected (Kuanga et al., 2014). Gauthama Raman et al. (2017) introduce an adaptive and robust NID technique using a Hypergraph-based Genetic Algorithm (HG-GA) for parameter setting and feature selection in a Support Vector Machine (SVM). The hyper-clique property of the hypergraph is exploited to generate the GA's initial population, in order to prevent the local-minima trap and to speed up the search for the optimal solution. HG-GA uses a weighted objective function to maintain the trade-off between maximizing the detection rate and minimizing the false alarm rate, along with the optimal number of features. The performance of the
proposed HG-GA-SVM model was evaluated using the NSL-KDD dataset under two scenarios. Experimental results show the superiority of the HG-GA-SVM model over existing techniques in terms of classifier accuracy, detection rate, false alarm rate and runtime analysis (Raman et al., 2017).
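Several of the GA studies above (for example Stein et al., 2005, and the GA-HNB system of Eid et al., 2011) treat the chromosome as a binary mask over the feature set and use a classifier's accuracy as the fitness. Below is a compact, generic sketch of that wrapper idea; the classifier, rates and population size are illustrative choices, not the published configurations.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=25, n_informative=6, random_state=2)
rng = np.random.default_rng(2)
n_pop, n_gen, n_feat = 20, 10, X.shape[1]

def fitness(mask):
    # Wrapper fitness: cross-validated accuracy of a classifier on the masked features.
    if not mask.any():
        return 0.0
    clf = DecisionTreeClassifier(random_state=0)
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

pop = rng.integers(0, 2, size=(n_pop, n_feat))
for _ in range(n_gen):
    fit = np.array([fitness(ind) for ind in pop])
    # Tournament selection: the fitter of two random individuals becomes a parent.
    parents = np.array([pop[max(rng.integers(0, n_pop, 2), key=lambda i: fit[i])]
                        for _ in range(n_pop)])
    # One-point crossover between consecutive parents.
    children = parents.copy()
    for i in range(0, n_pop - 1, 2):
        cut = rng.integers(1, n_feat)
        children[i, cut:], children[i + 1, cut:] = parents[i + 1, cut:], parents[i, cut:]
    # Bit-flip mutation with a small probability per gene.
    flip = rng.random(children.shape) < 0.02
    children[flip] = 1 - children[flip]
    pop = children

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("selected features:", np.flatnonzero(best))
```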
FUZZY LOGIC IN NID

Dickerson and Dickerson (2000) developed the Fuzzy Intrusion Recognition Engine (FIRE), which applies fuzzy logic rules to the audit data to classify it as normal or intrusion. FIRE processes the network input data and generates fuzzy sets for every observed feature; the fuzzy sets are then used to define fuzzy rules that detect individual attacks. The FIRE model proved to be effective against port scans and probes, but its primary disadvantage is the labor-intensive rule generation process (Dickerson & Dickerson, 2000). Orfila et al. (2003) introduce a measure of IDS prediction skill computed from the false positives produced. They proposed a model that can quantify the usefulness that a multi-model IDS prediction can bring to the user, and compared the performance obtained with fuzzy thresholds against the corresponding crisp thresholds. The results of these comparisons show a relevant improvement when fuzzy thresholds are used instead of crisp logic (Orfila et al., 2003). Shanmugavadivu and Nagarajan (2012) designed a fuzzy decision-making module for intrusion detection in which an effective set of fuzzy rules for the inference approach is identified automatically using a fuzzy rule learning strategy. In their model, definite rules are first derived for attack data as well as normal data; fuzzy rules are then obtained by fuzzifying the definite rules, and these rules are given to the fuzzy system, which classifies the test data (Shanmugavadivu & Nagarajan, 2012). Elhag et al. (2015) proposed a new methodology for improving the behaviour of misuse NIDS. Their approach considers the use of Genetic Fuzzy Systems (GFS) within a pairwise learning framework for the development of a robust and interpretable NIDS. Specifically, it is based on the combination of the FARCHD algorithm, a linguistic fuzzy association rule mining classifier, and OVO binarization, which confronts all pairs of classes in order to learn a single model for each pair. The KDDCUP'99 dataset was selected as the benchmark for determining the robustness of the proposed approach from different perspectives. Experimental results show that the FARCHD-OVO approach has the best trade-off among all performance measures, especially in the mean F-measure, the average accuracy and the false alarm rate (Elhag et al., 2015).
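To give a flavour of the fuzzy-rule idea used in systems such as FIRE, the sketch below defines shoulder-shaped membership functions over two traffic features and fires a single hand-written rule. The features, membership breakpoints and rule are invented for illustration and are not taken from the cited systems.

```python
def rising(x, a, b):
    """Shoulder membership: 0 below a, rising linearly to 1 at b, 1 beyond."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Memberships for two illustrative traffic features (breakpoints are invented).
def connection_rate_high(rate):
    return rising(rate, 50, 200)

def failed_logins_many(count):
    return rising(count, 2, 8)

# Rule: IF connection rate is high AND failed logins are many THEN raise suspicion.
# A common choice for fuzzy AND is the minimum of the memberships.
def suspicion(rate, failed):
    return min(connection_rate_high(rate), failed_logins_many(failed))

print(suspicion(rate=180, failed=5))   # degree of suspicion in [0, 1]
```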
REFERENCES Aburomman, A., & Reaz, M. (2016). A novel svm-knn-pso ensemble method for intrusion detection system. Applied Soft Computing, 38, 360–372. doi:10.1016/j.asoc.2015.10.011 Aburomman, A. A., & Reaz, M. B. I. (2017). A novel weighted support vector machines multiclass classifier based on differential evolution for intrusion detection systems. Information Sciences, 414, 225–246. doi:10.1016/j.ins.2017.06.007 Anderson, J. P. (1980). Computer security threat monitoring and surveillance. Technical report. Fort Washington, PA: James P Anderson Co.
Axelsson, S. (2000). Intrusion detection systems: A survey and taxonomy. Department of Computer Engineering, Chalmers University of Technology, Tech Rep. Bace, R., & Mell, P. (2001). Nist special publication on intrusion detection systems. National Institute of Standards and Technology, Tech Rep. Bamakan, S., Amiri, B., Mirzabagheri, M., & Shi, Y. (2015). A new intrusion detection approach using pso based multiple criteria linear programming. Procedia Computer Science, 55, 231–237. doi:10.1016/j. procs.2015.07.040 Ben-Bassat, M. (1982). Pattern recognition and reduction of dimensionality, volume 1. Handbook of Statistics II. North-Holland. Bezdek, J. C. (1994). What is computational intelligence? In Computational Intelligence Imitating Life. New York: IEEE Press. Biermann, E., Cloete, E., & Venter, L. M. (2001). A comparison of intrusion detection systems. Computers & Security, 20(8), 676–683. doi:10.1016/S0167-4048(01)00806-9 Brown, D. J., Suckow, B., & Wang, T. (2001). A survey of intrusion detection systems. Academic Press. Burges, C. J. C. (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2), 121–167. doi:10.1023/A:1009715923555 Canadian Institute for Cybersecurity. (n.d.). NSL-KDD. Retrieved from http://nsl.cs.unb.ca/NSL-KDD/ Crosbie, M., & Spafford, E. (1995). Applying genetic programming to intrusion detection. Proceedings of AAAI Fall Symposium on Genetic Programming, 1–8. Dash, M., Choi, K., Scheuermann, P., & Liu, H. (2002). Feature selection for clustering⣓a filter solution. Proceedings of the Second International Conference on Data Mining, 115-122. Debar, H., Dacier, M., & Wespi, A. (1999). Towards a taxonomy of intrusion-detection systems. Computer Networks, 31(8), 805–822. doi:10.1016/S1389-1286(98)00017-6 Denning, D. (1987). An intrusion detection model. IEEE Transactions on Software Engineering, 13(2), 222–232. doi:10.1109/TSE.1987.232894 Dickerson, J., & Dickerson, J. (2000). Fuzzy network profiling for intrusion detection. 19th International Conference of the North American Fuzzy Information Processing Society (NAFIPS), 301–306. Dote, Y., & Ovaska, S. J. (2001). Industrial applications of soft computing: A review. Proceedings of the IEEE, 1243–1265. 10.1109/5.949483 Duda, R., Hart, P., & Stork, D. (2001). Pattern Classification (2nd ed.). John Wiley & Sons. Durst, R., Champion, T., Witten, B., Miller, E., & Spagnuolo, L. (1999). Testing and evaluating computer intrusion detection systems. Communications of the ACM, 42(7), 53–61. doi:10.1145/306549.306571 Eberhart, R., & Kennedy, J. (1995). A new optimizer using particle swarm theory. Sixth International Symposium on Micro Machine and Human Science, 39–43. 10.1109/MHS.1995.494215
Eberhart, R., Simpson, P., & Dobbins, R. (1996). Computational Intelligence PC Tools. Boston: Academic Press. Eid, H. F., Azar, A. T., & Hassanien, A. E. (2013). Improved real-time discretize network intrusion detection system. Seventh International Conference on Bio-Inspired Computing Theories and Applications (BIC-TA 2012) Advances in Intelligent Systems and Computing, 99–109. 10.1007/978-81-322-1038-2_9 Eid, H. F., Darwish, A., Hassanien, A. E., & Hoon Kim, T. (2011). Intelligent hybrid anomaly network intrusion detection system. In International Conference on Future Generation Communication and Networking. CCIS/ LNCS series, Jeju Island, Korea. Eid, H. F., Darwish, A., Hassanien, A. E., & Abraham, A. (2010). Principle components analysisand support vector machine based intrusion detection system. The 10th IEEE international conference in Intelligent Design and Application (ISDA2010). Elhag, S., Fernandez, A., Bawakid, A., Alshomrani, S., & Herrera, F. (2015). On the combination of genetic fuzzy systems and pairwise learning for improving detection rates on intrusion detection systems. Expert Systems with Applications, 42(1), 193–202. doi:10.1016/j.eswa.2014.08.002 Endorf, C., Schultz, E., & J., M. (2004). Intrusion Detection and Prevention. McGraw-Hill. Eskin, E., Arnold, A., Prerau, M., Portnoy, L., & Stolfo, S. (2002). A geometric framework for unsupervised anomaly detection: detecting intrusions in unlabeled data. In Data Mining in Computer Security. Kluwer. doi:10.1007/978-1-4615-0953-0_4 Fried, D. J., Graf, I., Haines, J. W., Kendall, K. R., Mcclung, D., Weber, D., . . . Zissman, M. A. (2000). Evaluating intrusion detection systems: The 1998 darpa off-line intrusion detection evaluation. Proceedings of the DARPA Information Survivability Conference and Exposition, 12-26. Gong, F. (2003). Deciphering detection techniques: Part ii anomaly-based intrusion detection. Network Associates. Goyal, A., & Kumar, C. (2008). A genetic algorithm based network intrusion detection system. Academic Press. Heady, R., Luger, G., Maccabe, A., & Servilla, M. (1990). The architecture of a network level intrusion detection system. Technical report. Computer Science Department, University of New Mexico. doi:10.2172/425295 Holland, J. (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press. Ilgun, K., Kemmerer, R. A., & Porras, P. A. (1995). State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering, 21, 181–199. Jemili, F., Zaghdoud, M., & Ahmed, M. (2009). Intrusion detection based on hybrid propagation in Bayesian networks. Proceedings of the IEEE international conference on Intelligence and security informatics, 137–142.
Jiang, B., Ding, X., Ma, L., He, Y., Wang, T., & Xie, W. (2008). A hybrid feature selection algorithm:combination of symmetrical uncertainty and genetic algorithms. The Second International Symposium on Optimization and Systems Biology OSB’08, 152–157. Jiang, L., Zhang, H., & Cai, Z. (2009). A novel Bayes model: Hidden naive Bayes. IEEE Transactions on Knowledge and Data Engineering, 2(10), 1361–1371. doi:10.1109/TKDE.2008.234 Jin, X., Xu, A., Bie, R., & Guo, P. (2006). Machine learning techniques and chi-square feature selection for cancer classification using sage gene expression profiles. Lecture Notes in Computer Science, 3916, 106-115. doi:10.1007/11691730_11 KDD99. (n.d.). Retrieved from http://kdd.ics.uci.edu/databases Khan, M. (2011). Rule based network intrusion detection using genetic algorithm. International Journal of Computers and Applications, 18(8), 26–29. doi:10.5120/2303-2914 Kim, S., Shin, K. S., & Park, K. (2005a). An application of support vector machines for customer churn analysis: Credit card case. Lecture Notes in Computer Science, 3611, 636–647. doi:10.1007/11539117_91 Kim, S., Yang, S., Seo, K. S., Ro, Y. M., Kim, J.-Y., & Seo, Y. S. (2005b). Home photo categorization based on photographic region templates. Lecture Notes in Computer Science, 3689, 328–338. doi:10.1007/11562382_25 Kim, Y., Street, W., & Menczer, F. (2000). Feature selection for unsupervised learning via evolutionary search. Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 365-369. 10.1145/347090.347169 Koc, L., & Carswell, A. D. (2015). Network intrusion detection using a hnb binary classifier. In 17th UKSIM-AMSS International Conference on Modelling and Simulation (pp. 81–85). IEEE. 10.1109/ UKSim.2015.37 Kohavi, R., & John, G. H. (1997). Wrappers for feature subset selection. Artificial Intelligence, 97(1-2), 273–324. doi:10.1016/S0004-3702(97)00043-X Koller, D., & Sahami, M. (1996). Toward optimal feature selection. Proceedings of the Thirteenth International Conference on Machine Learning, 284-292. Kotsiantis, S., & Kanellopoulos, D. (2006). Discretization techniques: A recent survey. GESTS International Transactions on Computer Science and Engineering, 32, 47–58. Kuanga, F., Xu, W., & Zhang, S. (2014). A novel hybrid kpca and svm with ga model for intrusion detection. Applied Soft Computing, 18, 178–184. doi:10.1016/j.asoc.2014.01.028 Kumar, G. S., Sirisha, C. V. K., Durga, R, K., & Devi, A. (2012). Robust preprocessing and random forests technique for network probe anomaly detection. International Journal of Soft Computing and Engineering, 6. Laskov, P., Dussel, P., Schafer, C., & K., R. (2005). Learning intrusion detection: supervised or unsupervised? Image Analysis and Processings ICIAP, 50–57.
Leung, K., & Leckie, C. (2005). Unsupervised anomaly detection in network intrusion detection using clusters. Proceedings of the Twenty-eighth Australasian conference on Computer Science, 333–342. Li, Y., Fang, B., Guo, L., & Chen, Y. (2007). Network anomaly detection based on tcmknn algorithm. In Proceedings of the 2nd ACM symposium on Information computer and communications security. ACM. Lincoln Laboratory, Massachusetts Institute of Technology. (2017a). 1998 DARPA Intrusion Detection Evaluation Data Set. Retrieved from http://www.ll.mit.edu/mission/communications/cyber/CSTcorpora/ ideval/data/1998data.html Lincoln Laboratory, Massachusetts Institute of Technology. (2017b). 1999 DARPA Intrusion Detection Evaluation Data Set. Retrieved from http://www.ll.mit.edu/mission/communications/cyber/CSTcorpora/ ideval/data/1999data.html Lundin, E., & Jonsson, E. (2002). Anomaly-based intrusion detection: Privacy concerns and other problems. Computer Networks, 34(4), 623–640. doi:10.1016/S1389-1286(00)00134-1 Mahoney, M. V., & Chan, P. K. (2003). An analysis of the 1999 darpa/lincoln laboratory evaluation data for network anomaly detection. In Sixth International Symposium on Recent Advances in Intrusion Detection (pp. 220-237). Springer-Verlag. Marchette, D. (1999). A statistical method for profiling network traffic. Proceedings of the First USENIX Workshop on Intrusion Detection and Network Monitoring, 119–128. McHugh, J. (2000). Testing intrusion detection systems: A critique of the 1998 and 1999 darpa intrusion detection system evaluations as performed by lincoln laboratory. ACM Transactions on Information and System Security, 3(4), 262–294. doi:10.1145/382912.382923 Mizianty, M., Kurgan, L., & Ogiela, M. (2010). Discretization as the enabling technique for the naïve Bayes and semi-naïve Bayes-based classification. The Knowledge Engineering Review, 25, 421–449. Mukkamala, S., Janoski, G., & Sung, A. (2002). Intrusion detection: support vector machines and neural networks. Proceedings of the IEEE International Joint Conference on Neural Networks (ANNIE), 1702–1707. 10.1109/IJCNN.2002.1007774 Orfila, A., Carbo, J., & Ribagorda, A. (2003). Fuzzy logic on decision model for intrusion detection systems. Proceedings of the 2003 IEEE International Conference on Fuzzy Systems, 1237–1242. 10.1109/ FUZZ.2003.1206608 Peng, H., Long, F., & Ding, C. (2005). Feature selection based on mutual information criteria of maxdependency, max-relevance, and min redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1226, 12–38. PMID:16119262 Pereira, H. E. (2009). Conversion methods for symbolic features: A comparison applied to an intrusion detection problem. Expert Systems with Applications, 36(7), 10612–10617. doi:10.1016/j.eswa.2009.02.054 Poole, D., Mackworth, A., & Goebel, R. (1998). Computational Intelligence: A Logical Approach. Oxford, UK: Oxford University Press.
Portnoy, L., Eskin, E., & Stolfo, S. (2001). Intrusion detection with unlabeled data using clustering. Proc. ACM CSS Workshop on Data Mining Applied to Security. Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106. doi:10.1007/ BF00116251 Raman, M. R. G., Somu, N., Kirthivasan, K., Liscano, R., & Sriram, V. S. S. (2017). An efficient intrusion detection system based on hypergraph -genetic algorithm for parameter optimization and feature selection in support vector machine. Knowledge-Based System. Shang, C., & Shen, Q. (2006). Aiding classi cation of gene expression data with feature selection: A comparative study. Computational Intelligence Research, 1, 68–76. Shanmugavadivu, R., & Nagarajan, N. (2012). Learning of intrusion detector in conceptual approach of fuzzy towards intrusion methodology. Int. J. of Advanced Research in Computer Science and Software Engineering, 2. Shon, T., & Moon, J. S. (2007). A hybrid machine learning approach to network anomaly detection. Information Sciences, 177(18), 3799–3821. doi:10.1016/j.ins.2007.03.025 Sommers, J., Yegneswaran, V., & Barford, P. (2004). A framework for malicious workload generation. In 4th ACM SIGCOMM conference on Internet measurement (pp. 82-87). ACM. Srinoy, S. (2007). Intrusion detection model based on particle swarm optimization and support vector machine. Proceeding of Computational Intelligence in Security and Defense Applications. doi:10.1109/ CISDA.2007.368152 Stallings, W. (2006). Cryptography and network security principles and practices. Prentice Hall. Stein, G., & Chen, B. W., & K, H. (2005). decision tree classifier for network intrusion detection with ga-based feature selection. Proceedings of 43rd annual Southeast regional conference, 136–141. 10.1145/1167253.1167288 Sung, A., & Mukkamala, S. (2003). Identifying important features for intrusion detection using support vector machines and neural networks. Proceedings of the 2003 Symposium on Applications and the Internet, 209–216. 10.1109/SAINT.2003.1183050 Tavallaee, M., Bagheri, E., Lu, W., & Ghorbani, A. A. (2009). A detailed analysis of the kdd cup 99 data set. Proceeding of the 2009 IEEE symposium on computational Intelligence in security and defense application (CISDA). 10.1109/CISDA.2009.5356528 Tsang, C., Kwong, S., & Wang, H. (2007). Genetic-fuzzy rule mining approach and evaluation of feature selection techniques for anomaly intrusion detection. Pattern Recognition, 40(9), 2373–2391. doi:10.1016/j.patcog.2006.12.009 Valdes, A., & Skinner, K. (2000). Adaptive model-based monitoring for cyber attack detection. Recent Advances in Intrusion Detection, 80–92. doi:10.1007/3-540-39945-3_6 Vapnik, V. (1998). Statistical learning theory. New York: Wiley.
Venter, G., & Sobieski, J. S. (2003). Particle swarm optimization. AIAA Journal, 41(8), 1583–1589. doi:10.2514/2.2111 Verwoerd, T., & Hunt, R. (2002). Intrusion detection techniques and approaches. Computer Communications, 25(15), 1356–1365. doi:10.1016/S0140-3664(02)00037-3 Wang, J., Hong, X., Ren, R., & Li, T. (2009). A real-time intrusion detection system based on pso-svm. Proceeding of the 2009 International Workshop on Information Security and Application, 319–321. Wu, S., & Banzhaf, W. (2010). The use of computational intelligence in intrusion detection systems: A review. Applied Soft Computing, 10(1), 1–35. doi:10.1016/j.asoc.2009.06.019 Yao, J., Zhao, S., & Fan, L. (2006). An enhanced support vector machine model for intrusion detection. Proceedings of the First international conference on Rough Sets and Knowledge Technology, 538–543. 10.1007/11795131_78 Yeung, D., & Chow, C. (2002). Parzen-window network intrusion detectors. International Conference on pattern recognition, 385–388. 10.1109/ICPR.2002.1047476 Yu, L., & Liu, H. (2003). Feature selection for high-dimensional data: a fast correlation-based filter solution. Proceedings of the twentieth International Conference on Machine Learning, 856-863. Zadeh, L. (1965). Fuzzy sets. Information and Control, 8, 338-352. Zadeh, L. (1975). Fuzzy logic and approximate reasoning. Synthese, 30(3), 407–428. doi:10.1007/ BF00485052 Zhengbing, H., Zhitang, L., & Junqi, W. (2008). A novel network intrusion detection system (nids) based on signatures search of data mining. In 1st international conference on Forensic applications and techniques in telecommunications information, and multimedia (pp. 1-7). ICST.
This research was previously published in Handbook of Research on Investigations in Artificial Life Research and Development; pages 153-174, copyright year 2018 by Engineering Science Reference (an imprint of IGI Global).
Chapter 33
Determining Headache Diseases With Genetic Algorithm Gaffari Celik Agri Ibrahim Cecen University, Turkey
ABSTRACT Currently, medical diagnosis has a strong relation with the artificial-intelligence-oriented approaches. Because it is practical to employ intelligent mechanisms over some input data-expert knowledge and design effective solution ways, even the biomedical engineering field is interested in taking support from artificial intelligence. If applications in this manner are taken into consideration, we can see that medical diagnoses have a big percentage. In the sense of the explanations, the objective of this chapter is to use genetic algorithm (GA) for diagnosing headache diseases. As a popular and essential technique benefiting from evolutionary mechanisms, GA can deal with many different types of real-world problems. So, it has been chosen as the solution way/algorithm over the headache disease detection problem, which shapes the research framework of the study. The chapter content gives information about the performed diagnosis application and the results.
INTRODUCTION Our life is currently run over many technological developments directing the scientific literature in a fast way. This fast way of improvements and changes are of course associated with some past discoveries and inventions, which have important roles on changing our life. Today, individuals can experience very different lives according to the ones experienced by their past ancestors. It is clear that this situation is connected with innovative developments occurred as in fields of especially electrics, electronics and computers. When it is thought about current life standards, it can be easily expressed that the standards are highly directed by computer systems supported with strong environmental technologies. In this sense, especially software and hardware components have remarkable roles to make everything better for individuals while they are interacting with computer systems. At this point, there are currently also some
important scientific fields and technologies having great roles on both changing the ways of problem solving within computer systems and introducing new research literatures to the scientific community. Artificial Intelligence is known widely as one of them. Artificial Intelligence has been always a remarkable, strong scientific field to solve different kinds of real-world based problems. In detail, it is possible to indicate that it has reached to a multidisciplinary scope as a result of successful results achieved so far for different problems of different fields (Pannu, 2015). Today, this field is even separated into different research areas – topics such as Machine Learning, Swarm Intelligence and even Cybernetics (Alpaydin, 2014; Jordan & Mitchell, 2015; Kline, 2015; Magin, 2017; Merkle & Middendorf, 2013; Michalski et al., 2013; Sayre, 2014; Waldner, 2013; Yang et al., 2013). At this point, it is also associated with all fields in our life by having stronger connections with also some of them. Medical is one the most remarkable fields in which we can see many problem solution applications of Artificial Intelligence. Although there are different kinds of problem areas of medical as Artificial Intelligence is often employed, diagnosis oriented studies are among the most popular ones nowadays. Medical diagnosis has always been a remarkable problem solution approach of the field of medical. As a result of changed technologies and problem-solving approaches, even it has improved and evaluated in time. It has been even connected with the latest technological factors appeared in time within the scientific literature. Currently, medical diagnosis has a strong relation with the Artificial Intelligence oriented approaches. Because it is practical to employ intelligent mechanisms over some input data-expert knowledge and design effective solution ways, even Biomedical Engineering field is interested in taking support from Artificial Intelligence. If applications in this manner are taken into consideration, we can see that medical diagnosis have a big percentage. In the sense of the explanations done so far, objective of this study is to use Genetic Algorithm (GA) for diagnosing headache diseases. As a popular and essential technique benefiting from evolutionary mechanisms, GA can deal with many different types of real world problems. So, it has been chosen as the solution way - algorithm over the headache disease detection problem, which shapes the research framework of the study. Briefly, the formed approach has used a simple function to run from codes of GA particles having weights for different disease symptoms, as similar to an interesting way followed by Tezel and Kose (2011) before. On the other hand, the research done here will be of course a contribution to the associated literature including different examples of medical diagnosis (Some remarkable, recent ones are: Amato et al., 2013; Chattopadhyay et al., 2013; Chikh et al., 2012; Ghumbre et al., 2011; Kharya, 2012; Lavanya & Rani, 2011; Oniśko & Druzdzel, 2013; Prasadl et al., 2011; Qasem & Shamsuddin, 2011; Shankaracharya et al., 2010; Shiraishi et al., 2011; Tripathy et al., 2013; Wall et al., 2012). In detail, the paper content gives information about the performed diagnosis application and the results. Based on the chapter subject and the research scope, the remaining content is organized as follows: The next section is focused on the Genetic Algorithms (GA) and some information on headache diagnosis. 
Following that section, the third section is devoted to brief information about the headache diagnosis approach developed with the support of GA in this study. This section also provides information about the diagnosing performance (classification success) of GA systems with different specific parameters, to enable readers to understand how good the GA-based system was at diagnosing headache disease. Finally, the chapter ends with conclusions and some future work ideas in the last section.
GENETIC ALGORITHMS AND HEADACHE DIAGNOSIS

In the study considered in this chapter, the technique of Genetic Algorithms (GA) was used to deal with a medical diagnosis problem over headache disease. In order to provide more information about the working mechanism of GA and the target headache disease problem, this section is devoted to these two important actors of the performed research.
Genetic Algorithms

Genetic Algorithms (GA) is a popular Artificial Intelligence technique used for many different kinds of real-world problems. GA is structured by taking inspiration from the Theory of Evolution (Pham & Karaboga, 2012). From this perspective, it is known as an Evolutionary Computation based technique, but it is also included in the Swarm Intelligence research topic of Artificial Intelligence (Goldberg & Koza, 1994; Yang et al., 2013). In detail, the standard GA uses mechanisms such as crossover and mutation to improve the effectiveness of the particles (individuals) searching for solution values in the target solution space (Gen & Cheng, 2000; Man et al., 2012; Sastry et al., 2014), so the technique is accepted as an intelligent optimization technique in the related literature. A typical GA employs the following algorithmic features and mechanisms (Mitchell, 1998; Sastry et al., 2014):

•	GA is initially set up with a defined number of particles, which can also be called individuals.
•	Each individual consists of genes forming a whole chromosome. In this sense, an individual is coded with genes, which can be binary values, hexadecimal values, or even alphanumeric characters or numbers (Figure 1; Fisher, 2016).
•	An individual's code is used within an objective function, or a set of functions, to calculate values or decision components associated with the target problem.
•	A typical problem solution flow of a GA is based on an iterative evaluation of each particle and the application of crossover and mutation operators to the successful ones, so that newer generations thought to lead to better solutions are produced in each iteration.
•	Alternative approaches are used to determine which individuals (particles) undergo crossover; the same holds for the mutation process.
•	Regarding the iterative problem solution flow of GA, Figure 2 represents the simple problem solution steps of the technique (Liao & Sun, 2001). A condensed sketch of this flow follows this list.
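The iterative flow summarized in the list above (and in Figure 2) can be condensed into a few lines. The toy "OneMax" objective (maximize the number of 1-bits in the chromosome) is used only so the example stays self-contained, and all rates and sizes are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pop, n_genes, n_gen, p_mut = 30, 20, 40, 0.02

def fitness(ind):
    return ind.sum()                       # toy objective: count of 1-bits (OneMax)

pop = rng.integers(0, 2, size=(n_pop, n_genes))   # initial population of binary chromosomes
for _ in range(n_gen):
    fit = np.array([fitness(ind) for ind in pop])
    # Selection: keep the better half as parents.
    parents = pop[np.argsort(fit)[-n_pop // 2:]]
    # Crossover: single-point recombination of randomly paired parents.
    children = []
    while len(children) < n_pop:
        a, b = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, n_genes)
        children.append(np.concatenate([a[:cut], b[cut:]]))
    pop = np.array(children)
    # Mutation: flip each gene with a small probability.
    flip = rng.random(pop.shape) < p_mut
    pop[flip] = 1 - pop[flip]

print("best fitness:", max(fitness(ind) for ind in pop))
```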
In the literature, there are many different, improved GA variations (Some recent, remarkable ones: Deng et al., 2015; Hsiao, 2015; Jain & Deb, 2013; Li et al., 2014; Lin et al., 2014; Qu et al., 2013; Shang et al., 2013; Yan et al., 2016; Zuo et al., 2015). But in the research expressed in this chapter, the standard model of GA has been used for diagnosis purposes over headache disease.
Figure 1. Coding individuals in Genetic Algorithm Fisher, 2016.
Figure 2. Problem solution steps of the Genetic Algorithm Liao & Sun, 2001.
Headache Disease

Headache diseases are a widely seen group of disorders, classified by the International Headache Society (IHS) into 13 classes (Olesen, 2000). In this context, headache may be seen as a common disease, but the literature indicates that 4 of these classes are accepted as primary type headache diseases while the other 9 are
within the secondary type. Although the primary ones are the common types of headache, headache diseases of the secondary type are caused by important medical problems such as cerebral hemorrhage, brain tumors and even eye diseases or problems in the neck (Tezel & Kose, 2011). Looking generally at the whole set of headache diseases, it can be seen that migraine has certain symptoms associated with it directly, while another type, cluster headache, is an unusual but serious headache disease. Because there are different serious headache diseases that should be detected, and because the majority of headaches are not diseases but only reactions to factors such as noise, stress and air pollution, it is important to run diagnosis-oriented studies for patients (Tezel & Kose, 2011). Considering the whole set of headache classes, it is possible to derive some symptom types and classify the target individuals as ill or healthy according to whether these certain symptoms are observed or not. The most essential symptoms in this manner are presented in Table 1 (Tezel & Kose, 2011).

Table 1. Symptoms for diagnosing headache disease (Tezel & Kose, 2011)

Symptom No.   Symptom Explanation
1             Is there vomitus?
2             Is there nausea?
3             Is there photophobia?
4             Is there an aura symptom?
5             Is the attack period within 4 to 72 hours?
6             Is the headache best described as throbbing?
7             Is the headache at different points?
8             Does the intensity of the headache increase with movement?
Considering the symptom types and the approach of classifying individuals as ill or healthy according to them, an intelligent diagnosis system based on GA has been designed within the scope of the research explained in this chapter. The next section provides some brief information regarding the system and some results on samples of symptoms.
DIAGNOSING APPLICATIONS WITH THE DEVELOPED SYSTEM

The GA explained above has been used in this study with its default structure. In addition, different parameter values were tried in order to understand how effective the formed GA system can be under different parameter settings. The next paragraphs provide information in this manner.
Some Brief Information About the Diagnosis Problem

The diagnosis system considered here has been formed over a different use of GA in the context of optimization. This diagnosis approach is similar to the one developed by Tezel and Kose (2011) using the Clonal Selection Algorithm of Artificial Immune Systems. Some essential information regarding the formed diagnosing approach can be expressed briefly as follows:

•	Individuals (particles) in the GA are coded as weight values for the related symptoms.
•	By using a sample dataset of headache diseases, appropriate codes of the GA particles are determined at the end of a typical GA training process.
•	The best particle provides a code set of weights for each symptom, used to determine whether a newly provided symptom set indicates an ill or a healthy individual (two-sided classification) over a function.
•	The used function is a simple summation of the weighted symptoms with some adjustment parameters; in this context, an average value has been used to decide whether the individual is ill or healthy (a short sketch of this decision function follows this list).
•	In order to obtain the final diagnosing GA, a sample set including a total of 200 symptom value sets has been used. In detail, 150 symptom value sets of the general sample set belong to ill individuals while the remaining 50 belong to healthy individuals.
•	After the training phase, the obtained GA structure was tested over a total of 100 symptom value sets.
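A minimal sketch of the decision function described in the list above: a GA chromosome of symptom weights is combined with a binary symptom vector through a weighted sum and compared against a threshold. The weights, threshold and example symptom vector here are invented for illustration; they are not the values evolved in the study.

```python
import numpy as np

# One GA individual: a weight per symptom of Table 1 (illustrative values only).
weights = np.array([0.9, 0.7, 0.6, 0.8, 0.5, 0.6, 0.3, 0.4])
threshold = 0.5                      # stands in for the average-based decision value

def diagnose(symptoms, weights, threshold):
    """Return 'ill' if the weighted average of observed symptoms exceeds the threshold."""
    score = np.dot(weights, symptoms) / weights.sum()
    return "ill" if score > threshold else "healthy"

# Example patient: vomitus, nausea, photophobia and throbbing pain present (see Table 1).
patient = np.array([1, 1, 1, 0, 0, 1, 0, 0])
print(diagnose(patient, weights, threshold))
```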
Findings Over Applications

By using the diagnosis set-up indicated in the previous paragraphs, alternative experimental studies were carried out to determine which parameter values were effective at performing the diagnosis approach over the target headache sample set. Different crossover rate and mutation rate values for a fixed population size (100 in this research) were used to see which parameter values are good at classifying the test set of 100 symptom value sets. The findings obtained over a total of 10 different applications are reported in Table 2. As can be seen from Table 2, the formed GA structure and the followed diagnosing approach are effective enough to detect ill or healthy individuals according to the provided symptom situations. On the other hand, it can be seen that the values of the critical GA parameters strongly direct the diagnosis performance; here, the third application provided the most effective diagnosis performance.
CONCLUSION

This chapter has presented a research work in which a Genetic Algorithm (GA) approach has been used for diagnosing headache diseases. Thanks to the flexible, evolution-based infrastructure of the GA, it has been possible to develop a diagnosis system that detects whether an individual has a serious headache disease or not. In detail, the chosen dataset of headache diseases was used to determine appropriate
codes of GA particles; at the end of a typical GA training process, the best particle provided a code set of weights for each symptom, used to determine over a function whether a newly provided symptom set indicates an ill or a healthy individual. The GA structure has been run with different parameters, and the most appropriate parameters for the final headache diagnosis GA system have been reported. The diagnosis process introduced here is an alternative to the related medical diagnosis applications in the associated literature.

Table 2. Findings over headache diagnosis applications

Application No.   Crossover Rate   Mutation Rate   True Classified Symptom Value Sets   False Classified Symptom Value Sets
1                 0.8              0.01            49                                   51
2                 0.5              0.05            52                                   48
3 *               0.6              0.02            89                                   11
4                 0.7              0.08            56                                   44
5                 0.4              0.10            75                                   25
6                 0.6              0.11            36                                   64
7                 0.8              0.05            79                                   21
8                 0.5              0.25            74                                   26
9                 0.9              0.02            83                                   17
10                0.8              0.03            86                                   14

* The most successful diagnosis performance.
The performed research and the successful results obtained have encouraged the author to think about possible future research works, especially on medical diagnosis. Some remarkable future works can be mentioned as follows:

•	The used simple function will be improved to include not only the two-sided classification approach but also more advanced classification over all types of headache diagnosis.
•	Different types of functions will be modeled and used by the author to see whether there are alternative ways to improve this interesting approach to medical diagnosis.
•	In addition to the GA technique used in this chapter, there are of course alternative Artificial Intelligence techniques that can deal with classification, so the performances of such different techniques will be evaluated.
•	In order to extend the GA-based medical diagnosis approach to more different types of medical diseases, experimental studies on forming hybrid systems will be performed in the future.
REFERENCES Alpaydin, E. (2014). Introduction to machine learning. MIT Press. Amato, F., López, A., Peña-Méndez, E. M., Vaňhara, P., Hampl, A., & Havel, J. (2013). Artificial neural networks in medical diagnosis. Journal of Applied Biomedicine, 11(2), 47–58. doi:10.2478/v10136012-0031-x Chattopadhyay, S., Banerjee, S., Rabhi, F. A., & Acharya, U. R. (2013). A Case‐Based Reasoning system for complex medical diagnosis. Expert Systems: International Journal of Knowledge Engineering and Neural Networks, 30(1), 12–20. doi:10.1111/j.1468-0394.2012.00618.x Chikh, M. A., Saidi, M., & Settouti, N. (2012). Diagnosis of diabetes diseases using an artificial immune recognition system2 (AIRS2) with fuzzy k-nearest neighbor. Journal of Medical Systems, 36(5), 2721–2729. doi:10.100710916-011-9748-4 PMID:21695498 Deng, Y., Liu, Y., & Zhou, D. (2015). An improved genetic algorithm with initial population strategy for symmetric TSP. Mathematical Problems in Engineering. Fisher, J. (2016). Genetic Algorithm – Programming by the Seat of Your Genes! – SlideShare.Net. Retrieved from https://www.slideshare.net/JeremyFisher1/genetic-algorithms-programming-by-the-seatof-your-genes Gen, M., & Cheng, R. (2000). Genetic algorithms and engineering optimization (Vol. 7). John Wiley & Sons. Ghumbre, S., Patil, C., & Ghatol, A. (2011). Heart disease diagnosis using support vector machine. International conference on computer science and information technology (ICCSIT’) Pattaya. Goldberg, D. E., & Koza, J. R. (1994). Genetic Algorithms & Evolutionary Computation. National Conference on Artificial Intelligence. Hsiao, F. H. (2015). Exponential Synchronization of Chaotic Cryptosystems Using an Improved Genetic Algorithm. The Scientific World Journal. PMID:26366432 Jain, H., & Deb, K. (2013). An improved adaptive approach for elitist nondominated sorting genetic algorithm for many-objective optimization. In International Conference on Evolutionary Multi-Criterion Optimization (pp. 307-321). Springer. Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255–260. Kharya, S. (2012). Using data mining techniques for diagnosis and prognosis of cancer disease. arXiv preprint arXiv:1205.1923 Kline, R. R. (2015). The cybernetics moment: Or why we call our age the information age. JHU Press. Lavanya, D., & Rani, K. U. (2011). Performance evaluation of decision tree classifiers on medical datasets. International Journal of Computers and Applications, 26(4).
649
Determining Headache Diseases With Genetic Algorithm
Li, D., Chen, S., & Huang, H. (2014). Improved genetic algorithm with two-level approximation for truss topology optimization. Structural and Multidisciplinary Optimization, 49(5), 795–814. doi:10.100700158013-1012-8 Liao, Y. H., & Sun, C. T. (2001). An educational genetic algorithms learning tool. IEEE Transactions on Education, 44(2), 20. Lin, F., Duan, H., & Qu, X. (2014). PID control strategy for UAV flight control system based on improved genetic algorithm optimization. In Control and Decision Conference (2014 CCDC), The 26th Chinese (pp. 92-97). IEEE. 10.1109/CCDC.2014.6852124 Magin, R. L. (2017). Bioengineering and Cybernetics: A Modern Caduceus. IEEE Pulse, 8(1), 44–47. doi:10.1109/MPUL.2016.2627461 PMID:28129142 Man, K. F., Tang, K. S., & Kwong, S. (2012). Genetic algorithms: Concepts and designs. Springer Science & Business Media. Merkle, D., & Middendorf, M. (2014). Swarm intelligence. In Search methodologies (pp. 213-242). Springer US. Michalski, R. S., Carbonell, J. G., & Mitchell, T. M. (Eds.). (2013). Machine learning: An artificial intelligence approach. Springer Science & Business Media. Mitchell, M. (1998). An introduction to genetic algorithms. MIT Press. Olesen, J. (2000). The IHS Members’ Handbook. Academic Press. Oniśko, A., & Druzdzel, M. J. (2013). Impact of precision of Bayesian network parameters on accuracy of medical diagnostic systems. Artificial Intelligence in Medicine, 57(3), 197–206. doi:10.1016/j.artmed.2013.01.004 PMID:23466438 Pannu, A. (2015). Artificial intelligence and its application in different areas. Artificial Intelligence, 4(10). Pham, D., & Karaboga, D. (2012). Intelligent optimisation techniques: genetic algorithms, tabu search, simulated annealing and neural networks. Springer Science & Business Media. Prasadl, B. D. C. N., Prasad, P. N. K., & Sagar, Y. (2011). An approach to develop expert systems in medical diagnosis using machine learning algorithms (asthma) and a performance study. International Journal on Soft Computing, 2(1), 26–33. doi:10.5121/ijsc.2011.2103 Qasem, S. N., & Shamsuddin, S. M. (2011). Radial basis function network based on time variant multiobjective particle swarm optimization for medical diseases diagnosis. Applied Soft Computing, 11(1), 1427–1438. doi:10.1016/j.asoc.2010.04.014 Qu, H., Xing, K., & Alexander, T. (2013). An improved genetic algorithm with co-evolutionary strategy for global path planning of multiple mobile robots. Neurocomputing, 120, 509–517. doi:10.1016/j. neucom.2013.04.020 Sastry, K., Goldberg, D. E., & Kendall, G. (2014). Genetic algorithms. In Search methodologies (pp. 93-117). Springer US. doi:10.1007/978-1-4614-6940-7_4
650
Determining Headache Diseases With Genetic Algorithm
Sayre, K. (2014). Cybernetics and the Philosophy of Mind. Routledge. Shang, R., Bai, J., Jiao, L., & Jin, C. (2013). Community detection based on modularity and an improved genetic algorithm. Physica A, 392(5), 1215–1231. doi:10.1016/j.physa.2012.11.003 Shankaracharya, D. O., Samanta, S., & Vidyarthi, A. S. (2010). Computational intelligence in early diabetes diagnosis: A review. The Review of Diabetic Studies; RDS, 7(4), 252–262. doi:10.1900/ RDS.2010.7.252 PMID:21713313 Shiraishi, J., Li, Q., Appelbaum, D., & Doi, K. (2011). Computer-aided diagnosis and artificial intelligence in clinical imaging. Seminars in Nuclear Medicine, 41(6), 449–462. doi:10.1053/j.semnuclmed.2011.06.004 PMID:21978447 Tezel, G., & Kose, U. (2011). Headache Disease Diagnosis by Using the Clonal Selection Algorithm. In 6th International Advanced Technologies Symposium (IATS’11) (pp. 144-148). Academic Press. Tripathy, B. K., Acharjya, D. P., & Cynthya, V. (2013). A framework for intelligent medical diagnosis using rough set with formal concept analysis. arXiv preprint arXiv:1301.6011 Waldner, J. B. (2013). Nanocomputers and swarm intelligence. John Wiley & Sons. Wall, D. P., Dally, R., Luyster, R., Jung, J. Y., & DeLuca, T. F. (2012). Use of artificial intelligence to shorten the behavioral diagnosis of autism. PLoS One, 7(8), e43855. doi:10.1371/journal.pone.0043855 PMID:22952789 Yan, X., Gong, R., & Zhang, Q. (2016). Application of optimization SVM based on improved genetic algorithm in short-term wind speed prediction. Power Syst. Prot. Control, 44(9), 38–42. Yang, X. S., Cui, Z., Xiao, R., Gandomi, A. H., & Karamanoglu, M. (Eds.). (2013). Swarm intelligence and bio-inspired computation: theory and applications. Newnes. doi:10.1016/B978-0-12-405163-8.00001-6 Zuo, X., Chen, C., Tan, W., & Zhou, M. (2015). Vehicle scheduling of an urban bus line via an improved multiobjective genetic algorithm. IEEE Transactions on Intelligent Transportation Systems, 16(2), 1030–1041.
ADDITIONAL READING Acuña, G., & Möller, H. (2016). Indirect training of Gray-Box Models using LS-SVM and genetic algorithms. In Computational Intelligence (LA-CCI), 2016 IEEE Latin American Conference on (pp. 1-5). IEEE. 10.1109/LA-CCI.2016.7885719 Ahmad, R., Akhtar, N., & Choubey, N. S. (2017). Applications of Artificial Bee Colony Algorithms and its Variants in Health Care. BioChemistry: An Indian Journal, 11(1). Al-Shayea, Q. K. (2011). Artificial neural networks in medical diagnosis. International Journal of Computer Science Issues, 8(2), 150–154.
651
Determining Headache Diseases With Genetic Algorithm
Antonio, L. M., & Coello, C. A. C. (2017). Coevolutionary Multi-objective Evolutionary Algorithms: A Survey of the State-of-the-Art. IEEE Transactions on Evolutionary Computation, 1. doi:10.1109/ TEVC.2017.2767023 Arsene, O., Dumitrache, I., & Mihu, I. (2015). Expert system for medicine diagnosis using software agents. Expert Systems with Applications, 42(4), 1825–1834. doi:10.1016/j.eswa.2014.10.026 Back, T. (1996). Evolutionary algorithms in theory and practice: evolution strategies, evolutionary programming, genetic algorithms. Oxford University Press. Baxt, W. G. (1995). Application of artificial neural networks to clinical medicine. Lancet, 346(8983), 1135–1138. doi:10.1016/S0140-6736(95)91804-3 PMID:7475607 Becker, K., & Gottschlich, J. (2017). AI Programmer: Autonomously Creating Software Programs Using Genetic Algorithms. arXiv preprint arXiv:1709.05703. Bodnar, T., Barclay, V. C., Ram, N., Tucker, C. S., & Salathé, M. (2014). On the ground validation of online diagnosis with Twitter and medical records. In Proceedings of the 23rd International Conference on World Wide Web (pp. 651-656). ACM. 10.1145/2567948.2579272 Castaneda, C., Nalley, K., Mannion, C., Bhattacharyya, P., Blake, P., Pecora, A., ... Suh, K. S. (2015). Clinical decision support systems for improving diagnostic accuracy and achieving precision medicine. Journal of Clinical Bioinformatics, 5(1), 4. doi:10.118613336-015-0019-3 PMID:25834725 Charniak, E., Riesbeck, C. K., McDermott, D. V., & Meehan, J. R. (2014). Artificial intelligence programming. Psychology Press. Črepinšek, M., Liu, S. H., & Mernik, M. (2013). Exploration and exploitation in evolutionary algorithms: A survey. [CSUR]. ACM Computing Surveys, 45(3), 35. doi:10.1145/2480741.2480752 Dasgupta, D., & Michalewicz, Z. (Eds.). (2013). Evolutionary algorithms in engineering applications. Springer Science & Business Media. Dilsizian, S. E., & Siegel, E. L. (2014). Artificial intelligence in medicine and cardiac imaging: Harnessing big data and advanced computing to provide personalized medical diagnosis and treatment. Current Cardiology Reports, 16(1), 441. doi:10.100711886-013-0441-8 PMID:24338557 Dutta, S. (2014). Knowledge processing and applied artificial intelligence. Elsevier. El-Dahshan, E. S. A., Mohsen, H. M., Revett, K., & Salem, A. B. M. (2014). Computer-aided diagnosis of human brain tumor through MRI: A survey and a new algorithm. Expert Systems with Applications, 41(11), 5526–5545. doi:10.1016/j.eswa.2014.01.021 Eshelman, L. J., & Schaffer, J. D. (1993). Real-coded genetic algorithms and interval-schemata. In Foundations of genetic algorithms (Vol. 2, pp. 187–202). Elsevier. Fieschi, M. (2013). Artificial intelligence in medicine: Expert systems. Springer. Foster, K. R., Koprowski, R., & Skufca, J. D. (2014). Machine learning, medical diagnosis, and biomedical engineering research-commentary. Biomedical Engineering Online, 13(1), 94. doi:10.1186/1475925X-13-94 PMID:24998888
652
Determining Headache Diseases With Genetic Algorithm
Furmankiewicz, M., Sołtysik-Piorunkiewicz, A., & Ziuziański, P. (2014). Artificial intelligence systems for knowledge management in e-health: The study of intelligent software agents. Latest Trends on Systems, 2, 551–556. Grefenstette, J. J. (1986). Optimization of control parameters for genetic algorithms. IEEE Transactions on Systems, Man, and Cybernetics, 16(1), 122–128. doi:10.1109/TSMC.1986.289288 Hager, G. D., Bryant, R., Horvitz, E., Mataric, M., & Honavar, V. (2017). Advances in Artificial Intelligence Require Progress Across all of Computer Science. arXiv preprint arXiv:1707.04352. Hamet, P., & Tremblay, J. (2017). Artificial intelligence in medicine. Metabolism: Clinical and Experimental, 69, S36–S40. doi:10.1016/j.metabol.2017.01.011 PMID:28126242 Jennings, N. R. (1996). Coordination techniques for distributed artificial intelligence. Foundations of distributed artificial intelligence, 187-210. Khan, I. Y., Zope, P. H., & Suralkar, S. R. (2013). Importance of Artificial Neural Network in Medical Diagnosis disease like acute nephritis disease and heart disease. [IJESIT]. International Journal of Engineering Science and Innovative Technology, 2(2), 210–217. Kulikowski, C. A. (1980). Artificial intelligence methods and systems for medical consultation. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-2(5), 464–476. doi:10.1109/TPAMI.1980.6592368 Londhe, V. (2017). Brain MR Image Segmentation for Tumor Detection using Artificial Neural. Brain, 6(1). Luxton, D. D. (2014). Artificial intelligence in psychological practice: Current and future applications and implications. Professional Psychology, Research and Practice, 45(5), 332–339. doi:10.1037/a0034559 Nahar, J., Imam, T., Tickle, K. S., & Chen, Y. P. P. (2013). Computational intelligence for heart disease diagnosis: A medical knowledge driven approach. Expert Systems with Applications, 40(1), 96–104. doi:10.1016/j.eswa.2012.07.032 Norman, D. (2017). Design, Business Models, and Human-Technology Teamwork: As automation and artificial intelligence technologies develop, we need to think less about human-machine interfaces and more about human-machine teamwork. Research Technology Management, 60(1), 26–30. doi:10.1080 /08956308.2017.1255051 Patel, V. L., Shortliffe, E. H., Stefanelli, M., Szolovits, P., Berthold, M. R., Bellazzi, R., & Abu-Hanna, A. (2009). The coming of age of artificial intelligence in medicine. Artificial Intelligence in Medicine, 46(1), 5–17. doi:10.1016/j.artmed.2008.07.017 PMID:18790621 Patil, R. S. (2014). Techniques for diagnostic reasoning in medicine. In Exploring Artificial Intelligence: Survey Talks from the National Conferences on Artificial Intelligence (p. 347). Morgan Kaufmann. Peek, N., Combi, C., Marin, R., & Bellazzi, R. (2015). Thirty years of artificial intelligence in medicine (AIME) conferences: A review of research themes. Artificial Intelligence in Medicine, 65(1), 61–73. doi:10.1016/j.artmed.2015.07.003 PMID:26265491
653
Determining Headache Diseases With Genetic Algorithm
Poole, D. L., & Mackworth, A. K. (2017). Artificial intelligence. Cambridge University Press. Raedt, L. D., Kersting, K., Natarajan, S., & Poole, D. (2016). Statistical relational artificial intelligence: Logic, probability, and computation. Synthesis Lectures on Artificial Intelligence and Machine Learning, 10(2), 1–189. doi:10.2200/S00692ED1V01Y201601AIM032 Rajan, J. R. (2017). Prognostic system for early diagnosis of pediatric lung disease using artificial intelligence. Current Pediatric Research, 21(1). Reiter, R. (1987). A theory of diagnosis from first principles. Artificial Intelligence, 32(1), 57–95. doi:10.1016/0004-3702(87)90062-2 Salman, M., Ahmed, A. W., Khan, O. A., Raza, B., & Latif, K. (2017). Artificial Intelligence in BioMedical Domain. Artificial Intelligence, 8(8). Sanz, J. A., Galar, M., Jurio, A., Brugos, A., Pagola, M., & Bustince, H. (2014). Medical diagnosis of cardiovascular diseases using an interval-valued fuzzy rule-based classification system. Applied Soft Computing, 20, 103–111. doi:10.1016/j.asoc.2013.11.009 Seera, M., & Lim, C. P. (2014). A hybrid intelligent system for medical data classification. Expert Systems with Applications, 41(5), 2239–2249. doi:10.1016/j.eswa.2013.09.022 Sharma, G., & Carter, A. (2017). Artificial Intelligence and the Pathologist: Future Frenemies? Archives of Pathology & Laboratory Medicine, 141(5), 622–623. doi:10.5858/arpa.2016-0593-ED PMID:28447905 Shashi, S., & Deep, K. (2016). A Novel Crossover Operator Designed to Exploit Synergies of Two Crossover Operators for Real-Coded Genetic Algorithms. Sheikhtaheri, A., Sadoughi, F., & Dehaghi, Z. H. (2014). Developing and using expert systems and neural networks in medicine: A review on benefits and challenges. Journal of Medical Systems, 38(9), 110. doi:10.100710916-014-0110-5 PMID:25027017 Shrobe, H. E. (Ed.). (2014). Exploring artificial intelligence: survey talks from the National conferences on artificial intelligence. Morgan Kaufmann. Spears, W. M. (2013). Evolutionary algorithms: The role of mutation and recombination. Springer Science & Business Media. Srinivas, M., & Patnaik, L. M. (1994). Adaptive probabilities of crossover and mutation in genetic algorithms. IEEE Transactions on Systems, Man, and Cybernetics, 24(4), 656–667. doi:10.1109/21.286385 Szolovits, P. (Ed.). (1982). Artificial intelligence in medicine (pp. 1–226). Boulder, CO: Westview Press. Travé-Massuyès, L. (2014). Bridging control and artificial intelligence theories for diagnosis: A survey. Engineering Applications of Artificial Intelligence, 27, 1–16. doi:10.1016/j.engappai.2013.09.018 Wu, T., Ozpineci, B., Chinthavali, M., Wang, Z., Debnath, S., & Campbell, S. (2017, June). Design and optimization of 3D printed air-cooled heat sinks based on genetic algorithms. In Transportation Electrification Conference and Expo (ITEC), 2017 IEEE (pp. 650-655). IEEE. 10.1109/ITEC.2017.7993346
654
Determining Headache Diseases With Genetic Algorithm
Ziuziański, P., Furmankiewicz, M., & Sołtysik-Piorunkiewicz, A. (2014). E-health artificial intelligence system implementation: Case study of knowledge management dashboard of epidemiological data in Poland. International Journal of Biology and Biomedical Engineering, 8, 164–171.
KEY TERMS AND DEFINITIONS

Artificial Intelligence: A scientific field that deals with simulating the behaviors and actions of humans and other living organisms in order to develop intelligent machines.

Genetic Algorithms: A type of intelligent algorithm, inspired by mechanisms described by the theory of evolution, for solving real-world problems within an optimization framework.

Intelligent Medical Diagnosis: Medical diagnosis performed with the support of computer systems, especially those employing artificial intelligence.

Medical Diagnosis: A type of diagnosis used for detecting a disease, medical factor, etc., by using the required input data/information.

Nature-Inspired Techniques: A set of techniques, generally artificial intelligence algorithms, inspired by aspects of nature in order to deal with real-world problems.
This research was previously published in Nature-Inspired Intelligent Techniques for Solving Biomedical Engineering Problems; pages 249-262, copyright year 2018 by Medical Information Science Reference (an imprint of IGI Global).
Chapter 34
Web Page Recommender System using hybrid of Genetic Algorithm and Trust for Personalized Web Search

Suruchi Chawla, Shaheed Rajguru College, Delhi University, Delhi, India
ABSTRACT

The main challenge in effective information retrieval is to optimize page ranking so that relevant documents are retrieved for user queries. In this article, a method is proposed which uses a hybrid of genetic algorithms (GA) and trust for generating the optimal ranking of trusted clicked URLs for web page recommendations. The trusted web pages are selected based on clustered query sessions for GA-based optimal ranking in order to bring more relevant documents up in the ranking and improve the precision of search results. The optimal ranking of trusted clicked URLs thus recommends documents relevant to the web user's search goal and satisfies the information need of the user effectively. An experiment was conducted on a data set captured in three domains, academics, entertainment and sports, to evaluate the performance of GA-based optimal ranking (with/without trust), and the results confirm the improvement in the precision of search results.
1. INTRODUCTION

Information on the Web is huge, and retrieval of documents relevant to a specific information need of web users is a big challenge for search engines. Search engines retrieve a large collection of ranked search results for a specific information need, out of which very few are relevant. Relevant documents are often found lower in the ranking of search results due to imprecise search queries, and therefore the precision of search results decreases. Research has been done on web page ranking in order to bring more and more relevant documents up in the ranking and improve the precision of search results (Peng & Lin, 2006; Selvan et al., 2012; Page et al., 1999; Xing & Ghorbani, 2004;
Ding et al., 2001; Jayanthi & Jayakumar, 2011). In (Chawla, 2016) GA was used for optimal web page ranking of high-scent clicked URLs for personalized web search, and experimental results confirmed the improvement in the precision of search results. It is realized in this research that rank optimization of clicked URLs using GA can be more effective if trust is used to select the web pages for optimization. The optimal ranking of trusted clicked URLs using GA will bring more relevant documents up in the ranking and therefore improve the precision of search results. In this paper a novel approach is proposed using a hybrid of GA and trust for the optimal ranking of clicked URLs based on clustered web query sessions. The benefit of hybridizing trust and GA is as follows: GA is parallel in nature and well suited for solving problems where the solution space is huge and the time taken to search it exhaustively is very high, while the trust value of a web page determines its relevance based on web usage data and is calculated using clustered query sessions. Thus, the use of a high trust threshold value will select those web pages which are relevant and satisfy the information need of the users most of the time when recommended in search results. An algorithm is therefore proposed for web page recommendation of an optimal ranking of trusted URLs using the hybrid of GA and trust for effective personalization of user web search. The entire processing of the proposed algorithm is divided into two phases: Phase I (offline) and Phase II (online). In Phase I, the query session keyword vectors are clustered to group the clicked URLs which satisfy a similar information need. The clicked URLs are selected based on a trust threshold value in a given cluster, and GA is applied to the population of possible rankings of trusted clicked URLs in that cluster for optimization. Thus, at the end of offline processing, each cluster is associated with an optimal ranking of trusted clicked URLs. During online processing, the input query issued for web search selects the most similar cluster for the recommendation of its optimal ranking of trusted clicked URLs. The recommendation of the optimal ranking of trusted URLs continues till the search is personalized to the information need of the user. The flowchart of the proposed approach of web page recommendation of optimal ranking of trusted URLs using GA is given below in Figure 1. An experiment was conducted on a data set of user query sessions captured on the web in three selected domains, Academics, Entertainment and Sports, to evaluate the effectiveness of the hybridization of GA and trust for the web page recommender system. The results were compared with PWS (with GA) (Chawla, 2016) and classic IR, and the improvement in the average precision of search results confirms that the rank optimization of trusted clicked URLs recommends relevant search results for effective information retrieval.
2. RELATED WORK In (Pera & Ng, 2013) group recommender system for movies based on content similarity and popularity was proposed. In (Choi, Jeong & Jeong, 2010) a hybrid recommendation algorithm using both collaborative and content based filtering was proposed. In (Beel et al., 2013) both online and offline evaluation of research paper recommender system were analyzed. Offline evaluation was found to be unsuitable for evaluation of research paper recommender systems. In (Herlocker et al., 2004) collaborative filtering recommender systems were reviewed based on user tasks, types of datasets, method of calculation of prediction quality and prediction attributes. In (Kadam & Gaikwad,2015) cold start and sparsity problem found in content based filtering and item based filtering was overcome by using interpersonal interest, social profile in personalized recommendation system to recommend interested items to users.
Figure 1. Shows the steps for clusterwise GA based optimal ranking of trusted clicked URLs
It was found that recommender systems can be more effective by incorporating trust than traditional collaborative filtering (Massa & Avesani, 2007; Levien, 2004; Lathia, Hailes & Capra, 2008; Hwang & Chen, 2007; Peng & Seng-cho, 2009). In (Massa & Bhattacharjee, 2004) a trust-based recommender system was proposed using both a trust metric and a similarity metric. In (Xue & Fan, 2008) a new trust model based on social characteristics and a reputation mechanism for the semantic web was proposed. In (Tian et al., 2008) a trust model based on reputation for peer-to-peer networks was proposed. In (Jamali & Ester, 2009) TrustWalker, a random walk model, was proposed combining both trust-based and item-based recommendation. In (Weng, Miao & Goh, 2006) collaborative filtering was improved with trust-based metrics, where a trust metric was defined for incorporating trust into the similarity computation. In (Guha et al., 2004) a method for the propagation of trust and distrust was introduced. In (Baudry, Hanh & Traon, 2000) a method was proposed for building trustable OO (object-oriented) components. In (Bedi & Sharma, 2012) a trust-based ant recommender system was proposed; it used the concept of dynamic trust among users, and the best neighborhood was selected based on the genetic image of ant colonies. In (Braysy, 2001) a genetic algorithm was used to solve the vehicle routing problem with time windows. In (Moon et al., 2002) a genetic algorithm was proposed to solve the traveling salesman problem with precedence constraints. In (Tyagi & Varshney, 2012) a genetic algorithm was used to solve the flow shop scheduling problem. In (Verma & Kaur, 2015) a recommender system was proposed which uses a hybrid of content, collaborative and demographic information; a genetic algorithm and the k-NN algorithm were used for recommendations to the user. In (Bobadilla et al., 2011) a genetic algorithm was used for improving the performance of a collaborative-filtering-based recommender system. In (Sathya & Simon, 2010) a genetic algorithm was used to generate the optimal combination of terms for effective information retrieval. In (Aley & Kolte, 2015) wireless sensor network energy issues and secure routing were addressed; a genetic algorithm was used as a filter, applying rules to each packet and checking its validity. In (Nimbalkar, Das & Wagh, 2015) a hybrid of genetic algorithms and trust was used in wireless sensor networks. In (Raha et al., 2013) a genetic-algorithm-inspired load balancing protocol was proposed for congestion control in wireless sensor networks using a trust-based routing framework. In (Agarwal & Bharadwaj, 2011) a friend recommender system for WBSN was proposed; a genetic algorithm was used for learning user preferences based on comparison of selected individual features, and trust propagation was used for solving the sparsity problem of collaborative filtering. In (Selvaraj & Anand, 2012) a novel trust model was proposed which combined both peer profiling and anomaly detection; a genetic algorithm was used to detect anomalous behavior. In (Gao, Yan, & Mu, 2014) a trust-oriented genetic algorithm (TOGA) was proposed for finding a near-optimal service composition plan with QoS constraints. It is found that the hybrid of genetic algorithm and trust has been applied in various domains and the results shown were promising. Since no work has been done using both GA and trust in information retrieval, in this research the hybrid of GA and trust is applied in the field of information retrieval for effective personalized web search based on web page recommendation. The proposed work is an extension of the work done in (Chawla, 2016), which uses GA only for optimal ranking. It is realized in this research that rank optimization of trusted web pages using GA will retrieve more relevant documents for information retrieval and improve the precision of search results. The results of the proposed approach were compared with (Chawla, 2016) to confirm its effectiveness.
3. BACKGROUND 3.1. Information Scent Information scent is the measure of sense of relevance of clicked web page with respect to the information need of user based on web usage data. The Inferring User Need by Information Scent (IUNIS)
659
Web Page Recommender System using hybrid of Genetic Algorithm and Trust for Personalized Web Search
algorithm is used to quantify the Information Scent sid of the pages Pid clicked by the user in ith query session. (Chi et al., 2001; Heer & Chi, 2002; Pirolli, 1997; Pirolli, 2004) The page access PF.IPF weight and Time are used to quantify the information scent associated with the clicked page in a query session. The information scent sid is calculated for each clicked page Pid in a given query session i for all m query sessions identified in query session mining as follows sid = PF .IPF (Pid ) ×Time (Pid ) ∀i ∈ 1..m ∀d ∈ 1..n .
PF .IPF (Pid ) =
fP
id
max fP d ∈1..n
id
(1)
× log M mP d
(2)
In PF.IPF(Pid) PF correspond to the page Pid normalized frequency fP in a given query session i id
where n is the number of distinct clicked page in session i and IPF correspond to the ratio of total number of query sessions M in the whole data set to the number of query sessions m P that contain the d
given page Pd. Time(Pid). It is the ratio of time spent on the page Pid in a given session i to the total duration of query session i. (Chawla & Bedi, 2007, 2008; Chawla, 2012a; Chawla, 2012b; Chawla, 2013; Chawla, 2014a; Chawla, 2014b; Chawla, 2015).
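The following is a minimal Python sketch (not the authors' code) of how equations (1) and (2) can be computed; the per-session frequency counts and timing values passed in are assumed to come from the mined query-session logs, and a natural logarithm is assumed.

```python
# Sketch of equations (1)-(2): information scent s_id of a clicked page
# from its PF.IPF weight and the fraction of session time spent on it.
import math

def pf_ipf(freq_in_session, max_freq_in_session, total_sessions, sessions_containing_page):
    """PF.IPF(P_id): normalized page frequency in the session times the
    log ratio of all sessions to sessions containing the page (eq. 2)."""
    pf = freq_in_session / max_freq_in_session
    ipf = math.log(total_sessions / sessions_containing_page)
    return pf * ipf

def information_scent(freq_in_session, max_freq_in_session,
                      total_sessions, sessions_containing_page,
                      time_on_page, session_duration):
    """s_id = PF.IPF(P_id) * Time(P_id) (eq. 1), where Time(P_id) is the
    fraction of the session's total duration spent on the page."""
    time_weight = time_on_page / session_duration
    return pf_ipf(freq_in_session, max_freq_in_session,
                  total_sessions, sessions_containing_page) * time_weight
```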
3.1.1. Clustering of Query Sessions Keyword Vector

Each query session keyword vector is generated from a query session, which is represented as follows: query session = (input query, (clicked URLs/Page)+), where the clicked URLs are those URLs which the user clicked in the search results of the input query before submitting another query; '+' indicates that only those sessions are considered which have at least one clicked page associated with the input query. The query session vector Qi of the ith session is defined as a linear combination of the content vectors of the clicked pages Pid, each scaled by the weight sid, which is the information scent associated with the clicked page Pid in session i. That is,

$$Q_{i} = \sum_{d=1}^{n} s_{id} \times P_{id} \quad \forall i \in 1..m \qquad (3)$$
In equation (3) n is the number of distinct clicked pages in the session i and sid (information scent) is calculated for each clicked page Pid present in a given session i as defined in equation 1. The content vector of clicked page Pid is weighted using TF.IDF. Each ith query session is obtained as weighted vector Qi using equation (3). This vector is modeling the information need associated with the ith query session. The k-means algorithm is used for clustering query sessions keyword vectors since its performance is good for document clustering. k-means is easy to understand and implement and takes less time to
execute as compared to other techniques. It can handle large data sets. (Wen, Nie & Zhang, 2002; Zhao & Karypis, 2002a) The vector space implementation of k-means uses score or criterion function for measuring the quality of resulting clusters. (Zhao & Karypis, 2002b)
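As an illustration only, the sketch below builds the session keyword vectors of equation (3) and clusters them with k-means; NumPy and scikit-learn are assumptions standing in for the components actually used in the experimental architecture described later.

```python
# Sketch of eq. (3) and the clustering step: each session vector is the
# scent-weighted sum of the TF.IDF vectors of its clicked pages, and the
# resulting session vectors are grouped with k-means.
import numpy as np
from sklearn.cluster import KMeans

def session_vector(page_vectors, scents):
    """Q_i = sum_d s_id * P_id, with page_vectors an (n, vocab) array of
    TF.IDF content vectors and scents the n information-scent weights."""
    return (np.asarray(scents)[:, None] * np.asarray(page_vectors)).sum(axis=0)

def cluster_sessions(session_vectors, k):
    """Group the m session keyword vectors into k clusters of similar
    information need; returns cluster labels and cluster mean vectors."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(session_vectors)
    return km.labels_, km.cluster_centers_
```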
3.2. Genetic Algorithms (GA)

The genetic algorithm is a search method based on the natural theory of evolution. The steps to evolve solutions to a search problem using a genetic algorithm are given below (Bremermann, 1958; Pal, Talwar & Mitra, 2002; Goldberg, 1989).

1. Initialization: In the initialization step, a population of chromosomes is initialized using problem-specific domain knowledge. The chromosomes represent different possible solutions to the given problem.
2. Evaluation: After the initialization of the population, a fitness value is defined relative to the problem. The fitness value measures the degree of goodness of a chromosome in representing a solution to the problem.
3. Selection: In the selection phase, chromosomes with high fitness values are selected and allocated more copies in the mating pool for reproduction using recombination operators. There are a number of selection methods, such as roulette-wheel selection, stochastic universal selection, ranking selection, tournament selection and truncation selection.
4. Recombination: In the recombination phase, the selected chromosomes are recombined using the crossover operator, a genetic operator for the reproduction of offspring from parent chromosomes. There are various types of crossover, such as k-point crossover, uniform crossover, uniform order-based crossover, order-based crossover and partially matched crossover (PMX).
5. Mutation: In this phase, mutation is applied to the selected chromosomes. Mutation is the genetic operator that changes the gene at a specific position in the chromosome. A common mutation type is bit-wise/point mutation.
6. Replacement: In the replacement phase, the offspring population generated using the selection, recombination and mutation operators replaces the parent population. There are a number of replacement techniques, such as elitist replacement, generation-wise replacement, steady-state-no-duplicates and steady-state replacement.

Steps 2-6 are repeated until a terminating condition is met.
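A generic Python sketch of this loop is shown below; the initialization, fitness, crossover and mutation operators are placeholders (assumptions) to be specialized for the ranking problem described later, and tournament selection with generation-wise replacement is used purely for illustration.

```python
# Minimal generic GA skeleton: initialize, evaluate, select (tournament),
# recombine, mutate, replace, and repeat for a fixed number of generations.
import random

def genetic_algorithm(initial_population, fitness, crossover, mutate,
                      generations=50, tournament_size=4, mutation_rate=0.2):
    population = list(initial_population)
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in population]

        def select():
            # Tournament selection: fittest of a random sample enters the mating pool.
            return max(random.sample(scored, tournament_size), key=lambda t: t[0])[1]

        offspring = []
        while len(offspring) < len(population):
            child = crossover(select(), select())
            if random.random() < mutation_rate:
                child = mutate(child)
            offspring.append(child)
        population = offspring            # generation-wise replacement
    return max(population, key=fitness)   # best individual of the final generation
```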
3.3. Trust

The concept of trust has been gaining an increasing amount of attention in research communities such as online recommender systems. Trust is defined as a social phenomenon, and the model of trust for an artificial world such as the web is based on how trust works between people in society (Abdul-Rahman & Hailes, 2000). Although a vast literature on trust has grown in various areas of research, with varying meanings of trust, a complete, formal, unambiguous definition of trust rarely exists in the literature (McKnight & Chervany, 2002). In (Dimitrakos, 2003) the general properties of trust in e-services were surveyed and analyzed; these properties are listed as follows:
• Trust is relevant to specific transactions only.
• Trust is a measurable belief.
• Trust is directed.
• Trust exists in time.
• Trust evolves in time, even within the same transaction.
• Trust between collectives does not necessarily distribute to trust between their members.
• Trust is reflexive.
• Trust is a subjective belief.
In (O'Donovan & Smyth, 2005) “trust” is defined as the reliability of a partner profile to deliver accurate recommendations in the past. Two models of trust, at the profile level and the item level, were described for generating reliable and accurate recommendations.
4. OPTIMAL WEB PAGE RECOMMENDATIONS BASED ON HYBRID OF GA-TRUST FOR EFFECTIVE PERSONALIZED WEB SEARCH

In this paper a novel method is proposed for the optimal ranking of trusted clicked URLs using GA based on clustered query sessions. Trust measures the relevance of a web page based on the number of times it is clicked out of the total number of times it is recommended. The web pages with a high trust value are selected for optimization using GA in order to generate the optimal ranking of trusted web pages for web page recommendations. An algorithm is proposed for web page recommendation of the optimal ranking of trusted clicked URLs for effective information retrieval. The entire processing of the algorithm is divided into two phases: Phase I (offline processing) and Phase II (online processing). In Phase I, web query sessions are collected and preprocessed to generate the keyword vectors. The information scent and the content of the clicked URLs of a web query session are used to generate its keyword vector. Clustering is performed on the keyword vectors and generates clusters of clicked URLs which satisfy a similar information need in a given domain. Initially, when the system has generated no recommendations, the trust value of a clicked URL is initialized using its information scent value (information scent measures the static relevance based on the usage statistics of the clicked URL in web query sessions). The clicked URLs in each cluster are selected based on a trust threshold value and form the population of individuals, where each individual represents a possible ranking of the trusted clicked URLs. The genetic algorithm is applied to this population of individuals for optimal ranking: the fitness value of each individual is calculated, and the best-fitness individuals are selected for reproduction using crossover and mutation in order to generate the next generation of the population. The generation of populations of individuals continues till the terminating condition is reached. Upon termination, the individual with the best fitness value is selected to determine the optimal ranking of the trusted set of URLs for the given cluster. Thus, the trusted clicked URLs in each cluster are processed using the genetic algorithm to generate the optimal ranking of trusted clicked URLs. The stepwise description of Phase I is given below.

Algorithm 1. Phase I Offline Processing
Input: Data set of Web Query Sessions
Output: Clusterwise Optimal Ranking of Clicked URLs
1. For each clicked URL in a query session, the information scent metric is calculated using Equation (1); it measures the relevancy of the clicked URL with respect to the information need of the user associated with the query session.
2. The trust of a given clicked URL d in session i is initialized with its information scent: trustid = sid, ∀d ∈ 1..n, ∀i ∈ 1..m.
3. The query session keyword vector is generated from the query session using the trust and content of the clicked URLs, where the content of a clicked URL is its TF.IDF-weighted vector, as given below:

$$Q_{i} = \sum_{d=1}^{n_{1}} trust_{id} \times P_{id} \quad \forall i \in 1..m$$
Where n1 is the number of distinct clicked URLs in a given session i 4. k-means algorithm is used for clustering query sessions keyword vector. 5. Each cluster j is associated with the mean keyword vector clusterj_mean. 6. For each cluster j select the clicked URLs in Lj where Trust(ClickedURLjd)> 𝜀 where d∈1…n 7. The initial trust value of a given cluster j is calculated as follows Trust (j)={|ClickedURLjd|: Trust(ClickedURLjd)> 𝜀 }| 8. Clicked count and recommended count are defined for each distinct clicked URLs and are initialized to zero in the list Lj associated with each cluster j. 9. For each cluster j apply the algorithm Genetic Algorithm based optimal ranking of clustered clicked URLs on the List Lj associated with the cluster j to determine the top m optimal ranking of clicked URLs in the List Lj associated with each cluster and is represented by TORj (Trusted Optimal Ranking j). Sub Algorithm. Genetic Algorithm Based Optimal Ranking of Clustered Clicked URLs Input: List Lj, cluster mean keyword vector clust_meanj, Trust threshold value ε, Information Scent threshold value ρ. Output: Ranked list of clicked URLs, TORj 1. For a given List Lj, there is a set Mj where Mj ⊆ Lj, Length(Lj)=n, Length(Mj)=m and m 𝜀 where Recommendedcount(ClickedURLjd)!=0}| RecSet(j) is the total number of recommendations made using cluster j RecSet(j)=|{ClickedURLjd| Recommendedcount(ClickedURLjd)!=0}| 11. If the user request for the next result page a. Model the partial information need of the current user profile using the trust and content of the URLs clicked so far in his partial user profile and obtain the user session keyword vector current_usersessionvectort. b. The similarity is measured for each jth cluster using the formulae MatchScorei(clusterj_mean, current_usersessionvector t)=2*(sim(current_usersessionvector t,clusterj_ mean)* Trust (j))/ Trust (j)+sim(clusterj_mean, current_usersessionvectort). when Trust (j)!=0 sim(current_usersessionvectort,clusterj_mean) when Trust (j)=0 c. Goto step 3. Else if there is another user or same user with new query for web search Goto step 1. else if sufficient time has been elapsed since the last time offline processing is performed and no user query for web search a) Invoke the subalgorithm Genetic Algorithm based optimal ranking of clustered clicked URLs on each cluster in order to update the optimal ranked list of clicked URLs TORj b) Goto step 1. end
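As an illustration of the online cluster-selection step above, the Python sketch below computes the trust-weighted match score between the partial user-session vector and each cluster mean. The flattened text leaves the operator precedence of the formula ambiguous, so this sketch assumes the intended F1-style combination 2·sim·Trust(j)/(Trust(j)+sim); the cluster trust values are assumed to be maintained elsewhere from the clicked/recommended counts.

```python
# Sketch of the online matching step: cosine similarity between the partial
# user-session vector and a cluster mean, combined with the cluster's trust.
import numpy as np

def cosine(u, v):
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

def match_score(user_vector, cluster_mean, cluster_trust):
    sim = cosine(user_vector, cluster_mean)
    if cluster_trust == 0:
        return sim                      # fall back to plain similarity
    return 2.0 * sim * cluster_trust / (cluster_trust + sim)

def select_cluster(user_vector, cluster_means, cluster_trusts):
    """Index of the cluster whose trusted optimal ranking should be recommended."""
    scores = [match_score(user_vector, mean, trust)
              for mean, trust in zip(cluster_means, cluster_trusts)]
    return int(np.argmax(scores))
```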
5. EXPERIMENTAL STUDY The experiment was conducted on a data set of web user query sessions captured in three domains Academics, Entertainment and Sports. The data set of query sessions was captured through an architecture developed using JSP, JADE, Oracle and genetic algorithm tool box of MATLAB. In order to generate the dataset, the input query was entered through a GUI based interface of the architecture and passed on to the Google search engine API. The search results were retrieved and displayed along with the check
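For illustration, a small Python sketch of this preprocessing step is given below; it fetches the text of clicked URLs and builds TF.IDF vectors with requests, BeautifulSoup and scikit-learn, which are assumptions standing in for the Web Sphinx crawler, Oracle and JADE components actually used in the architecture.

```python
# Sketch only: fetch clicked-URL text and build TF.IDF content vectors.
import requests
from bs4 import BeautifulSoup
from sklearn.feature_extraction.text import TfidfVectorizer

def fetch_text(url):
    html = requests.get(url, timeout=10).text
    return BeautifulSoup(html, "html.parser").get_text(separator=" ")

def tfidf_vectors(clicked_urls):
    """Return the TF.IDF matrix (one row per clicked URL) and the fitted vectorizer."""
    docs = [fetch_text(u) for u in clicked_urls]
    vectorizer = TfidfVectorizer(stop_words="english")
    return vectorizer.fit_transform(docs), vectorizer
```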
boxes on the user interface. A SnapShot of GUI interface of the architecture showing the Google search results for the input query “hindi song” is shown below in Figure 2. The experiment was performed on i3 processor, Windows 8 with 120 GB RAM. During offline preprocessing, web sphinx crawler was used to fetch the tf.idf vector of the clicked URLs of the web query sessions and loaded into database using Oraloader. The clustering agent developed in JADE was executed to generate the clusters of query session keyword vectors and performs the initialization of the trust of the clusters as well as the clicked URLs of the query sessions. Figure 2. Screen SnapShot of architecture displaying Google Search results along with the checkboxes
In the experimental set up for evaluating the performance of personalized web search using optimal ranked trusted URLs, the parameters used in the genetic algorithm are given below in Table 1: MAXGEN is the maximum number of generations of population generated in the evolutionary process, length (P) represents the number of chromosomes individuals in the population, crossover rate is the recombination rate of the selected chromosome individuals in the population and mutation rate is the rate of mutating the chromosomes in the population. Table 1. Shows the Genetic Algorithm Parameters
The experiment was conducted with the following values of the selected parameters: the size of the population, represented as length(P), was 120 (m!) for a given cluster, where m is the number of selected clicked URLs in the set Mj associated with each jth cluster; the tournament size in tournament selection was set to 4; the crossover probability was varied in the range [0.6-0.8] in increments of 0.1; and the mutation rate was varied in the range [0.1-0.3] in increments of 0.05. The genetic algorithm toolbox of the MATLAB software package was used for applying the genetic algorithm on the clustered data set. A snapshot of the genetic algorithm toolbox execution in MATLAB on the given cluster with length(P) = 120 is shown in Figure 3.

Figure 3. Snapshot of execution of the genetic algorithm on the given cluster; the output window at the bottom shows the optimal ranked docids of the clicked URLs of the given cluster, starting from rank 1 on the left

The GA converged faster, in fewer generations, on the selected data set as the mutation probability was increased in the range [0.1-0.3] in increments of 0.05. It was found that there was no further difference in convergence as the number of generations increased for the different mutation ratios. The optimal results, with the maximum fitness value, were obtained at a mutation rate of 0.25 with a population size of 120 in 25 generations, with the crossover rate set to 0.8, the threshold value of information scent (ρ) at 0.5 and the threshold value of trust 𝜀 set to 0.5. In this study, the process of generating populations continued till the difference in the optimum fitness value over the last 12 consecutive generations was less than the threshold value τ = 0.000001. The graph showing the fitness value versus the number of generations and variables is shown in Figure 4. Thus, the mean fitness value reached a stable value when iterated for 25 generations.
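The sketch below (an assumption-laden illustration, not the MATLAB toolbox setup) captures these settings: the 120 individuals as the permutations of the trusted clicked URLs selected for a cluster (120 = m! implies m = 5), and a stopping rule based on the best fitness of the last 12 generations.

```python
# Illustrative GA configuration and stopping rule for the ranking problem;
# the fitness function itself is problem-specific and not reproduced here.
from itertools import permutations

GA_PARAMS = {
    "population_size": 120,   # m! with m = 5 trusted clicked URLs per cluster (assumed)
    "tournament_size": 4,
    "crossover_rate": 0.8,
    "mutation_rate": 0.25,
    "max_generations": 25,
}

def initial_population(trusted_urls):
    """All possible rankings (permutations) of the trusted clicked URLs."""
    return [list(p) for p in permutations(trusted_urls)]

def has_converged(best_fitness_history, window=12, tol=1e-6):
    """Stop when the best fitness has varied by less than tol over the
    last `window` consecutive generations."""
    if len(best_fitness_history) < window:
        return False
    recent = best_fitness_history[-window:]
    return max(recent) - min(recent) < tol
```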
Figure 4. Graph shows the mean and best Fitness value during execution of genetic algorithm for 25 generations and best individual based on fitness for different number of variables
In order to evaluate the performance, the 25 test queries were selected randomly in each of the domains Academics, Entertainment and Sports. The purpose of selecting the queries in these three domains is to cover wide range of queries on the web. The relevancy of the documents was decided by the subject experts in the domains to which the queries belong. The personalized search results using optimal ranked Clicked URLs with (GA and Trust) for the input query ‘hindi song’ are shown in Figure. 5. The GUI of the Personalized results using optimal ranked clicked URLs (with GA) for the input query ‘hindi song’ in the entertainment domain is given below in Figure. 6 in which user’s clicks were captured through checkbox ticks. The performance of the PWS using optimal ranked trusted clicked URLs was evaluated from the average precision of Personalized Search results. The average precision in a given domain was computed using average of precision of test queries where precision is computed using the fraction of retrieved documents which are relevant in the personalized search results. The experimental results showing the average precision of 25 test queries computed in the domains of academics, entertainment and sports using PWS with optimal ranking based on GA(with/ without trust) and Classic IR is shown in Table 2 & Figure. 7. below.
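As a small illustration of this evaluation measure (precision as the fraction of retrieved documents judged relevant, averaged over the test queries of a domain), a hedged Python sketch is given below; the relevance judgements are assumed to come from the domain experts mentioned above.

```python
# Precision of one result list and its mean over the test queries of a domain.
def precision(retrieved, relevant):
    """Fraction of retrieved documents that are judged relevant."""
    retrieved = list(retrieved)
    if not retrieved:
        return 0.0
    relevant = set(relevant)
    return sum(1 for d in retrieved if d in relevant) / len(retrieved)

def average_precision_over_queries(results_per_query, relevant_per_query):
    """Mean per-query precision over the test queries of a domain."""
    values = [precision(r, rel)
              for r, rel in zip(results_per_query, relevant_per_query)]
    return sum(values) / len(values)
```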
Figure 5. Shows the Personalized Web Search results with trust based Optimal Ranking (with GA and Trust) where optimal ranked clicked URLs are shown with CheckBoxes to capture the user clicks
Figure 6. Shows the Personalized Web Search results with Optimal Ranking (with GA) where optimal ranked clicked URLs are shown with CheckBoxes to capture the user clicks
Figure 7. Shows the average precision of PWS with GA and Trust / with GA based optimal ranking in Academics, Sports and Entertainment
Table 2. Compares the average precision of optimal PWS results (with GA and Trust/with GA) and their percentage improvement over ClassicIR
The average precision was improved in each of the selected domains using personalized web search with cluster-based optimal ranked clicked URLs based on GA-Trust. The obtained results were analyzed using the statistical paired t-test on the average precision of PWS with optimal ranking based on GA and Trust versus both baselines (Classic IR / GA), with 74 degrees of freedom (d.f.) for the combined sample as well as in all three categories (Academics, Entertainment and Sports) with 24 d.f. each. The observed value of t for the average precision of the proposed approach (GA and Trust) versus (Classic IR / with GA) is given below in Table 3.

Table 3. Results of the statistical t-test
It was observed that the computed t value for the paired difference of average precision lies outside the 95% confidence interval in each case. Hence the null hypothesis was rejected and the alternate hypothesis was accepted in each case, and it was concluded that the average precision improved significantly when the web search was personalized using optimal ranked clicked URLs based on trust with GA. Thus, the proposed approach of PWS using optimal ranked clicked URLs based on GA (with trust) improves the precision of search results, as the use of trust for the selection of clicked URLs for optimal ranking removes those clicked URLs which prove to be untrustworthy because they were not clicked often when recommended. The optimal ranking of trusted web pages therefore brings more relevant documents up in the ranking and improves the precision of search results. Hence, PWS with optimal ranked clicked URLs using the hybrid of GA and Trust provides an effective method of page ranking and personalizes the web search with respect to the information need of the user.
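A hedged sketch of this significance test is given below; the per-query precision arrays are hypothetical placeholders (not the study's data), and scipy's paired t-test stands in for whatever statistical tool was actually used.

```python
# Paired t-test on per-query precision: proposed GA+Trust ranking vs. a baseline.
from scipy import stats

# Hypothetical per-query precision values, for illustration only.
precision_ga_trust = [0.72, 0.68, 0.75, 0.70, 0.66, 0.74]
precision_baseline = [0.55, 0.60, 0.58, 0.52, 0.57, 0.59]

t_stat, p_value = stats.ttest_rel(precision_ga_trust, precision_baseline)
# Reject the null hypothesis of equal mean precision at the 5% level when p_value < 0.05.
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```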
6. CONCLUSION

In this paper a novel approach is proposed which uses a genetic algorithm and trust for generating the cluster-based optimal ranking of trusted URLs for effective personalized web search. The use of trust for the selection of clicked URLs improves the quality of the web pages selected for optimal ranking; therefore, more relevant documents are retrieved up in the ranking and the precision of search results improves. An experimental study was conducted on the clustered query sessions data set to confirm the effectiveness of the proposed method. The proposed method shows an improvement in the average precision in each of the selected domains in comparison to PWS with optimal ranking (with GA) and classic IR. Hence, the personalization of web search using recommendations of trusted optimal ranked clicked URLs satisfies the information need of the user on the web effectively.
REFERENCES Abdul-Rahman, A., & Hailes, S. (2000). Supporting trust in virtual communities. In Proceedings of the 35th Hawaii International Conference on System Sciences, Hawaii, HI. 10.1109/HICSS.2000.926814 Agarwal, V., & Bharadwaj, K. K. (2011). Trust-enhanced recommendation of friends in web based social networks using genetic algorithms to learn user preferences. In Trends in Computer Science, Engineering and Information Technology (pp. 476-485). Springer Berlin Heidelberg. Amit, V., & Harpreet, K. V. (2015). A Hybrid Recommender System using Genetic Algorithm and kNN Approach. International Journal of Computer Science And Technology, 6(3), 131–134. Baudry, B., Hanh, V. L., & Traon, Y. L. (2000). Testing-for-trust: the genetic selection model applied to component qualification. In Proceedings of the 33rd International Conference on Technology of ObjectOriented Languages (pp. 108-119). IEEE. 10.1109/TOOLS.2000.848755 Bedi, P., & Sharma, R. (2012). Trust based recommender system using ant colony for trust computation. Expert Systems with Applications, 39(1), 1183–1190. doi:10.1016/j.eswa.2011.07.124 Beel, J., Genzmehr, M., Langer, S., Nürnberger, A., & Gipp, B. (2013). A comparative analysis of offline and online evaluations and discussion of research paper recommender system evaluation. In Proceedings of the international workshop on reproducibility and replication in recommender systems evaluation (pp. 7-14). ACM. 10.1145/2532508.2532511 Bobadilla, J., Ortega, F., Hernando, A., & Alcalá, J. (2011). Improving collaborative filtering recommender system results and performance using genetic algorithms. Knowledge-Based Systems, 24(8), 1310–1316. doi:10.1016/j.knosys.2011.06.005 Braysy, O. (2001). Genetic algorithms for the vehicle routing problem with time windows. Technical Report, 1/2001. Vaasa, Finland: University of Vaasa. Bremermann, H. J. (1958). The evolution of intelligence. The nervous system as a model of its environment (Technical Report No. 1). Department of Mathematics, University of Washington, Seattle, WA.
Chawla, S. (2012a). Trust in Personalized Web Search based on Clustered Query Sessions. International Journal of Computers and Applications, 59(7), 36–44. doi:10.5120/9563-4032 Chawla, S. (2012b). Semantic Query Expansion using Cluster Based Domain Ontologies. International Journal of Information Retrieval Research, 2(2), 13–28. doi:10.4018/ijirr.2012040102 Chawla, S., (2013). Personalized web search using ACO with information scent. International Journal of Knowledge and Web Intelligence, 4(2), 238-259. Chawla, S. (2014a). Personalised Web Search using Trust based Hubs and Authorities. International Journal of Engineering Research and Applications, 4(7), 157-170. Chawla, S. (2014b). Novel approach to query expansion using genetic algorithm on clustered query sessions for effective personalized web search. International Journal of Advanced Research in Computer Science and Software Engineering, 4(11), 73–81. Chawla, S. (2015). Domainwise web page optimization based on clustered query sessions using hybrid of trust and ACO for effective information retrieval. International Journal of Scientific and Technology Research, 4(11), 196–204. Chawla, S. (2016). A novel approach of cluster based optimal ranking of clicked URLs using genetic algorithm for effective personalized web search. Applied Soft Computing, 46, 90–103. doi:10.1016/j. asoc.2016.04.042 Chawla, S., & Bedi, P. (2007). Personalized Web Search using Information Scent. In Proceedings of the International Joint Conferences on Computer, Information and Systems Sciences, and Engineering, LNCS (pp. 483-488). Springer. Chawla, S., & Bedi, P. (2008). Improving information retrieval precision by finding related queries with similar information need using information scent. In Proceedings of the First International Conference on Emerging Trends in Engineering and Technology, ICETET’08 (pp. 486-491). IEEE. 10.1109/ICETET.2008.23 Chi, E. H., Pirolli, P., Chen, K., & Pitkow, J. (2001). Using Information Scent to model User Information Needs and Actions on the Web. In Proceedings of the International Conference on Human Factors in Computing Systems, New York, NY (pp. 490-497). Choi, S. H., Jeong, Y. S., & Jeong, M. K. (2010). A hybrid recommendation method with reduced data for large-scale application. Systems, Man, and Cybernetics, Part C: Applications and Reviews. IEEE Transactions on, 40(5), 557–566. Dimitrakos, T. (2003). A Service-Oriented Trust Management Framework. In Proceedings of the International Workshop on Deception, Fraud & Trust in Agent Societies (pp. 53-72). Ding, C., He, X., Husbands, P., Zha, H., & Simon, H. (2001). Link Analysis: Hubs and Authorities on the World (Technical Report: 47847). Gao, H., Yan, J., & Mu, Y. (2014). Trust‐oriented QoS‐aware composite service selection based on genetic algorithms. Concurrency and Computation, 26(2), 500–515. doi:10.1002/cpe.3015
Goldberg, D. E. (1989). Genetic Algorithms in Search. Boston, MA: Addison-Wesley Longman Publishing Co. Guha, R., Kumar, R., Raghavan, P., & Tomkins, A. (2004). Propagation of trust and distrust. In Proceedings of the 13th international conference on World Wide Web (pp. 403-412). Heer, J., & Chi, E. H. (2002). Separating the Swarm: Categorization method for user sessions on the web. In Proceedings of the International Conference on Human Factor in Computing Systems (pp. 243250). 10.1145/503376.503420 Herlocker, J. L., Konstan, J. A., Terveen, L. G., & Riedl, J. T. (2004). Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22(1), 5–53. doi:10.1145/963770.963772 Hwang, C., & Chen, Y. (2007). Using trust in collaborative filtering recommendation. Lecture Notes in Computer Science, 4570, 1052–1060. doi:10.1007/978-3-540-73325-6_105 Jamali, M., & Ester, M. (2009). TrustWalker: A Random Walk Model for Combining Trust-based and Item-based Recommendation. In Proceedings of the 151th ACM Conference on Knowledge Discovery and Data mining KDD ’09, Paris, France. 10.1145/1557019.1557067 Jayanthi, J., & Jayakumar, K. S. (2011). An integrated Page Ranking Algorithm for Personalized Web Search. International Journal of Computers and Applications, 12(11). doi:10.5120/1732-2350 Kadam, V. J., & Gaikwad, V. S. (2015). A Survey on Social Circle Influenced Personalized Recommendation System. International Journal of Science and Research, 4(11), 2510–2513. Lathia, N., Hailes, S., & Capra, L. (2008). Trust-based collaborative filtering. In Proceedings of the joint iTrust and PST Conference on Privacy, Trust Management and Security (pp. 119-134). Springer. 10.1007/978-0-387-09428-1_8 Levien, R. (2004). Attack-resistant Trust Metrics [Ph.D. thesis]. University of California at Berkeley, CA. Massa, P., & Avesani, P. (2007). Trust-aware Recommender Systems. In Proceedings of the ACM Conference on Recommender Systems (pp. 17-24). Massa, P., & Bhattacharjee, B. (2004). Using trust in recommender systems: An experimental analysis. In Proceedings of the Second International Conference on Trust Management, Oxford, UK (pp. 221235). 10.1007/978-3-540-24747-0_17 McKnight, D. H., & Chervany, N. L. (2002). What Trust Means in e-Commerce Customer Relationships: An interdisciplinary conceptual typology. International Journal of Electronic Commerce, 6(2), 35–59. doi:10.1080/10864415.2001.11044235 Moon, C., Kim, J., Choi, G., & Seo, Y. (2002). An efficient genetic algorithm for the traveling salesman problem with precedence constraints. European Journal of Operational Research, 140(3), 606–617. doi:10.1016/S0377-2217(01)00227-2 Nidhi, A., & Shruti, K. (2015). Energy Efficient Trust Mechanism using Genetic Algorithm in WSN. International Journal of Computer Science and Mobile Computing, 4(6), 146–156.
Nimbalkar, N. B., Das, S. S., & Wagh, S. J. (2015). Trust based Energy Efficient Clustering using Genetic Algorithm in Wireless Sensor Networks (TEECGA). International Journal of Computers and Applications, 112(9). O’ Donovan, J., & Smyth, B. (2005). Trust in Recommender Systems. In Proceedings of the 10th International Conference on Intelligent User Interfaces (pp. 167-174). Page, L, Brin, S., Motwani, R., & Wino grad, T. (1999). The PageRank Citation Ranking: Bringing order to the Web (technical report). Pal, S. K., Talwar, V., & Mitra, P. (2002). Web Mining in Soft Computing Framework: Relevance, State of the Art and Future Directions. IEEE Transactions on Neural Networks, 13(5), 1163–1177. doi:10.1109/ TNN.2002.1031947 PMID:18244512 Peng, T., & Seng-cho, T. (2009). iTrustU: A blog recommender system based on multifaceted trust and collaborative filtering. In Proceedings of the ACM Symposium on Applied Computing, New York, NY (pp. 1278-1285). 10.1145/1529282.1529571 Peng, W.-C., & Lin, Y.-C. (2006). Ranking Web Search Results from Personalized Perspective. In Proceedings of the 8th IEEE International Conference on E-Commerce Technology and the 3rd IEEE International Conference on Enterprise Computing, E-Commerce, and E-Services. Pera, M. S., & Ng, Y. K. (2013). A group recommender for movies based on content similarity and popularity. Information Processing & Management, 49(3), 673–687. doi:10.1016/j.ipm.2012.07.007 Pirolli, P. (1997). Computational models of information scent-following in a very large browsable text collection. In Proceedings of the Conference on Human Factors in Computing Systems. 10.1145/258549.258558 Pirolli, P. (2004). The use of proximal information scent to forage for distal content on the world wide web, Working with Technology, Mind: Brunswikian. Resources for Cognitive Science and Engineering. Oxford University Press. Raha, A., Naskar, M. K., Avishek, P., Chakraborty, A., & Karmakar, A. (2013). A genetic algorithm inspired load balancing protocol for congestion control in wireless sensor networks using trust based routing framework (GACCTR). International Journal of Computer Network and Information Security, 5(9), 9–20. doi:10.5815/ijcnis.2013.09.02 Sathya, S. S., & Simon, P. (2010). A document retrieval system with combination terms using genetic algorithm. International Journal of Computer and Electrical Engineering, 2(1). doi:10.7763/IJCEE.2010. V2.104 Selvan, M. P. (2012). Survey on Web Page Ranking Algorithms. International Journal of Computers and Applications, 41(19). doi:10.5120/5646-7764 Selvaraj, C., & Anand, S. (2012). Peer profile based trust model for P2P systems using genetic algorithm. Peer-to-Peer Networking and Applications, 5(1), 92–103. doi:10.100712083-011-0111-9 Tian, C.-Q., Zou, S.-H., Wang, W.-D., & Cheng, S.-D. (2008). Trust model based on reputation for peerto-peer networks. Journal of Communication, 29(4), 63–70.
674
Web Page Recommender System using hybrid of Genetic Algorithm and Trust for Personalized Web Search
Tyagi, N., & Varshney, R. G. (2012). A model to study genetic algorithm for the flowshop scheduling problem. Journal of Information and Operations Management, 3(1), 38–42. Wen, R. J., Nie, Y. J., & Zhang, J. H. (2002). Query Clustering Using User Logs, Journal. ACM Transactions on Information Systems, 20(1), 59–81. doi:10.1145/503104.503108 Weng, J., Miao, C., & Goh, A. (2006). Improving collaborative filtering with trust-based metrics. In Proceedings of the 2006 ACM symposium on Applied computing (pp. 1860-1864). 10.1145/1141277.1141717 Weng, J., Miao, C., & Goh, A. (2006). Improving Collaborative Filtering with Trust-based Metrics. In Proceedings of SAC’06, Dijon, France. ACM. Xing, W., & Ghorbani, A. (2004). Weighted PageRank Algorithm. In Proceedings of the Second Annual Conference on Communication Networks and Services Research (pp. 305-314). IEEE. 10.1109/ DNSR.2004.1344743 Xue, W., & Fan, Z. (2008). A new trust model based on social characteristic and reputation mechanism for the semantic web. In Proceedings of the Workshop on Knowledge Discovery and Data Mining. Zhao, Y., & Karypis, G. (2002a). Comparison of agglomerative and partitional document clustering algorithms. In Proceedings of the SIAM Workshop on Clustering High-dimensional Data and its Applications. 10.21236/ADA439503 Zhao, Y., & Karypis, Y. (2002b). Criterion functions for document clustering: Experiments and Analysis. Technical report. Minneapolis, MN: University of Minnesota.
This research was previously published in Journal of Information Technology Research (JITR), 11(2); pages 110-127, copyright year 2018 by IGI Publishing (an imprint of IGI Global).
Chapter 35
A Multiobjective Genetic-Algorithm-Based Optimization of Micro-Electrical Discharge Drilling: Enhanced Quality Micro-Hole Fabrication in Inconel 718
Deepak Rajendra Unune, The LNM Institute of Information Technology, India
Amit Aherwar, Madhav Institute of Technology and Science, India
ABSTRACT
The Inconel 718 superalloy finds a wide range of applications in various industries due to its superior mechanical properties, including high strength, high hardness, resistance to corrosion, etc. However, its poor machinability by conventional machining processes, especially in the micro-domain, makes it one of the "difficult-to-cut" materials. Micro-electrical discharge machining (µ-EDM) is an appropriate process for machining any conductive material, although selecting machining parameters for higher machining rate and accuracy is a difficult task. The present study attempts to optimize parameters in micro-electrical discharge drilling (µ-EDD) of Inconel 718. The material removal rate, electrode wear ratio, overcut, and taper angle have been selected as performance measures, while gap voltage, capacitance, electrode rotational speed, and feed rate have been selected as process parameters. The optimum setting of process parameters has been obtained using genetic-algorithm-based multi-objective optimization and verified experimentally.
DOI: 10.4018/978-1-7998-8048-6.ch035
INTRODUCTION

Background and Motivation

The Inconel 718 superalloy, owing to its superior mechanical properties at elevated temperatures, is a widely used material for aviation, turbine, and nuclear power plant applications (Dudzinski et al., 2004; Unune & Mali, 2016a). However, Inconel 718 is classified as a "difficult-to-cut" material because of its peculiar characteristics, such as low thermal conductivity, a high tendency to work hardening, and a high affinity for tool materials. Conventional drilling of Inconel 718 leads to the formation of built-up edges at the tool–chip interface as a result of micro-welding, and chip breaking becomes tougher as drilling proceeds. Machining quality deteriorates further when drilling small and micro holes because of drill bit breakage, low rigidity, and challenging evacuation (Dudzinski et al., 2004; Ezugwu & Machado, 1999; Zhu & Ding, 2013). However, nonconventional machining processes can be effectively used for machining difficult-to-cut materials like Inconel 718, as in these processes there is no direct contact between the cutting tool and the material. Electric discharge machining (EDM) is a nonconventional machining process which employs a series of sparks between the electrically conductive workpiece and the electrode inside a dielectric fluid, relying on thermal energy for material removal. Micro-electric discharge machining (μ-EDM) is a modified version of EDM for manufacturing micro- and miniature parts and structures. In μ-EDM, material removal takes place as micro-sized craters due to accurately regulated sparks occurring between a rotating electrode and a workpiece, and holes as small as 5-10 μm can be fabricated (Unune & Mali, 2014; Unune & Mali, 2016a). Micro-electric discharge drilling (μ-EDD) utilizes a rotating cylindrical electrode, unlike the stationary electrode in die-sinking μ-EDM, and the rotating motion of the electrode further enhances the performance of the process. μ-EDD has become a popular process due to its ability to generate high-aspect-ratio (HAR) micro-holes in any conductive material irrespective of hardness (D'Urso et al., 2015; Lee et al., 2015; Zhang et al., 2015). It is preferred especially for difficult-to-cut materials owing to its high efficiency and precision. Although μ-EDD is a very efficient process for micro-hole machining and has many advantages, it also has some disadvantages. One of them is that it is a rather slow machining process; another is that, while the workpiece is being machined, the tool electrode also wears at a rather significant rate. This tool wear leads to shape inaccuracies. A further drawback is the formation of a heat-affected layer on the machined surface: since it is impossible to remove all the molten part of the workpiece, a thin layer of molten material remains on the workpiece surface, which resolidifies during cooling (Prakash et al., 2015; Zhenlong et al., 2014). Careful selection of input parameters and their optimization plays a great role in achieving good results. Various studies have been reported investigating the effects of process parameters in μ-EDD. However, low MRR, high electrode wear ratio (EWR), overcut, circularity, and taper angle of HAR micro-holes are still major research concerns in μ-EDD.
Related Work

Liu et al. (2005) explored the effect of discharge current on the quality of holes machined in a high-nickel alloy using tungsten electrodes of 110 μm. They proposed the use of second-stage helically grooved electrodes to reduce surface roughness to 0.85 μm. Yilmaz and Okka (2010) made a comparative analysis of fast μ-EDD of Inconel 718 and Ti–6Al–4V using single- and multi-channel tubular tools of
copper and brass. They reported that single-channel electrodes yield high MRR and low EWR; however, multi-channel electrodes result in better surfaces. Ay et al. (2012) reported improved hole quality through process parameters optimized using grey relational analysis in die-sinking μ-EDM of Inconel 718. Jafferson and Hariharan (2013) examined the effect of capacitance and voltage on the precision of micro-hole edges in die-sinking μ-EDM through image processing techniques and claimed that discharge energies above 50 μJ steadily degrade the micro-hole edges. D'Urso et al. (2015a, b) proposed empirical relations for EWR and MRR in μ-EDD of stainless steel as functions of the hole depth for tungsten carbide (WC) and brass electrodes. Soft computing techniques have become a favored choice for the prediction and optimization of process performance (Cai, 2015; Mankad, 2015). Chandrasekaran et al. (2009) presented a review of the application of such computational tools to the major machining processes such as turning, milling, drilling, etc., and elaborated the applicability and methodology of utilizing such optimization techniques. Rao and Kalyankar (2014) showcased applications of soft computing techniques in micro-manufacturing and nano-manufacturing. They presented recent innovations and modifications in the micro- and nano-manufacturing domain and discussed the effects of process parameters on accuracy and performance enhancement. Bhattacharya et al. (2011) investigated the effects of process parameters in μ-EDM and provided optimized settings for rough and finish machining. Morgan et al. (2006) elaborated on material removal in μ-EDM for the fabrication of miniature features, with a focus on the trade-off in tool wear, MRR, and machining time with respect to the surface finish and accuracy of the fabricated features, and highlighted the importance of optimization in such cases. The discharge phenomenon in μ-EDM is very complex and unpredictable, due to which it is very challenging even for a skillful operator to achieve an optimal performance benchmark by optimizing the control variables. Soft computing techniques like genetic algorithms (GA), artificial neural networks (ANN), particle swarm optimization, etc., have shown great competency in handling the complex non-linearity of such multi-objective optimization problems (Cai, 2015; Unune & Mali, 2015). Mandal et al. (2007) optimized the EDM process using a GA-based multi-objective optimization technique. They modeled the relationship between three process parameters, namely peak current, pulse-on duration, and pulse-off time, and the responses MRR and EWR. Somashekhar et al. (2010) established a technique for both modeling and optimization of the μ-EDM process, using an ANN to analyze the MRR in μ-EDM. A technique for order preference by similarity to ideal solution (TOPSIS) was used by Manivannan and Kumar for multi-response optimization of micro-EDM output process parameters, such as the material removal rate, electrode wear rate, overcut, taper angle, and circularity at entry and exit points, on AISI304 alloy (Manivannan & Kumar, 2016). From the literature review, it was observed that μ-EDD is a competent machining process for fabricating micro-holes in materials like Inconel 718. Careful choice of machining conditions and optimization of process parameters are required to achieve accurate micro-holes with a higher production rate in Inconel 718.
Therefore, the objective of the current study is to examine the effects of the process parameters in μ-EDD of Inconel 718 on MRR, EWR, and the accuracy of the drilled holes in terms of overcut and taper angle, and to optimize the machining conditions with a view to obtaining accurate micro-holes.
EXPERIMENTAL METHODOLOGY

Taguchi Design of Experiments

Design of experiments is an important statistical technique for examining the unfamiliar characteristics of machining parameters in an experimental process (Taguchi, Chowdhury, & Wu, 2004). Classical experimental design methods are complex and difficult to apply; moreover, with an increase in the number of machining parameters, a larger number of experiments is required (Lin, 2002). The Taguchi method is typically used to reduce the number of experiments by using orthogonal arrays. The basic idea of the Taguchi method is to ensure quality in the design phase, which reduces the time and costs associated with experiments (Unune & Mali, 2016b). The Taguchi technique is typically used for deciding the parameter settings that produce the best level of a quality characteristic (performance measure) with minimum variation (Mahapatra & Patnaik, 2006), and it provides standard orthogonal arrays to accommodate this requirement. The S/N ratio compares the level of a desired signal to the level of background noise; if the ratio is high, the background noise is less obtrusive. Process variability influenced by the control factors is measured in terms of the S/N ratio. In general, S/N ratio characteristics can be divided into three classes: "larger-the-better", "smaller-the-better", and "nominal-the-best". Further, ANOVA is typically used to find out the statistical significance of the machining parameters (Taguchi et al., 2004).
Experimental Procedure

The μ-EDD was performed using a Mikrotools DT110 integrated multi-process micro-machining machine tool operating on an RC-pulse generator. A workpiece of commercially available Inconel 718 of dimensions 27 mm × 10 mm × 1.5 mm was prepared to perform the μ-EDD experiments. A photograph of the experimental setup is shown in Figure 1. Tungsten electrodes of 300 µm diameter were chosen as the tool material. Electric discharge grinding was used to make the electrode tip flat before each experiment by selecting positive polarity; however, during μ-EDD, negative polarity was selected for the electrode. Four input process parameters were selected, viz. gap voltage, capacitance, electrode rotation speed, and feed rate, and their effects on material removal rate (MRR), electrode wear ratio (EWR), overcut, and taper angle have been determined. The input machining parameters along with the values used for the experiments are listed in Table 1. The experiments were designed based on the Taguchi method of design of experiments. For selecting an appropriate orthogonal array, the degrees of freedom of the array are required to be calculated, which must be greater than the degrees of freedom of the input parameters. The degree of freedom is defined as the number of fair and independent comparisons needed for optimization of a process parameter and is one less than the number of levels of that parameter. In this study, there are sixteen degrees of freedom owing to the five levels of the four input parameters, so a Taguchi-based L25 orthogonal array (experimental layout, Table 2) was selected. Accordingly, twenty-five experiments were carried out to study the effects of the input μ-EDD process parameters.
Figure 1. Micro-electric discharge drilling experimental setup
Table 1. Control variables and their levels

Control Parameters (Symbol)      | Unit   | Level 1 | Level 2 | Level 3 | Level 4 | Level 5
Gap voltage (Gv)                 | V      | 80      | 90      | 100     | 110     | 120
Capacitance (C)                  | µF     | 0.0001  | 0.001   | 0.01    | 0.1     | 0.4
Electrode rotational speed (ERS) | rpm    | 100     | 300     | 500     | 700     | 900
Feed rate (Fr)                   | µm/sec | 24      | 30      | 36      | 42      | 48
The radii of the micro-holes were measured using a Zeiss digital microscope (AX10) under 500X magnification using the 20X lens. A field emission scanning electron microscope (FE-SEM; Nova NanoSEM 450) was used to capture the surface quality of the micro-holes. The average volume of material removed (MR) is divided by the machining time to calculate the MRR in mm³/min. The volume of MR from the electrode is divided by the volume of MR from the workpiece to determine the EWR. The MRR, EWR, overcut, and taper angle were computed using the following equations (Jahan et al., 2009):

$$\mathrm{MRR} = \frac{\pi}{3}\left(r_t^{2} + r_t r_b + r_b^{2}\right) \times \frac{d}{t}, \tag{1}$$

where $r_t$ is the radius of the hole at the top, $r_b$ is the radius at the bottom, $d$ is the hole depth, and $t$ is the machining time.

$$\mathrm{EWR} = \frac{TW}{\mathrm{MRR}}, \tag{2}$$

where $TW$ is the volumetric tool wear rate, computed as $TW = \dfrac{\pi D^{2} F}{4t}$, $F$ is the frontal electrode wear, and $D$ is the electrode diameter.
Table 2. Taguchi L25 orthogonal array

Expt. No. | Gap Voltage | Capacitance | Electrode Rotational Speed | Feed Rate
1  | 1 | 1 | 1 | 1
2  | 1 | 2 | 2 | 2
3  | 1 | 3 | 3 | 3
4  | 1 | 4 | 4 | 4
5  | 1 | 5 | 5 | 5
6  | 2 | 1 | 2 | 3
7  | 2 | 2 | 3 | 4
8  | 2 | 3 | 4 | 5
9  | 2 | 4 | 5 | 1
10 | 2 | 5 | 1 | 2
11 | 3 | 1 | 3 | 5
12 | 3 | 2 | 4 | 1
13 | 3 | 3 | 5 | 2
14 | 3 | 4 | 1 | 3
15 | 3 | 5 | 2 | 4
16 | 4 | 1 | 4 | 2
17 | 4 | 2 | 5 | 3
18 | 4 | 3 | 1 | 4
19 | 4 | 4 | 2 | 5
20 | 4 | 5 | 3 | 1
21 | 5 | 1 | 5 | 4
22 | 5 | 2 | 1 | 5
23 | 5 | 3 | 2 | 1
24 | 5 | 4 | 3 | 2
25 | 5 | 5 | 4 | 3
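To make the experimental layout easy to reproduce, the sketch below (Python; not part of the original study) shows how the coded levels of the L25 array in Table 2 map to the physical parameter values of Table 1. Only the first four runs are shown, and the degrees-of-freedom check mentioned above appears as a comment.

```python
# Physical values of each factor level, taken from Table 1.
levels = {
    "Gv":  [80, 90, 100, 110, 120],          # gap voltage (V)
    "C":   [0.0001, 0.001, 0.01, 0.1, 0.4],  # capacitance (uF)
    "ERS": [100, 300, 500, 700, 900],        # electrode rotational speed (rpm)
    "Fr":  [24, 30, 36, 42, 48],             # feed rate (um/sec)
}

# 4 factors x (5 levels - 1) = 16 degrees of freedom, hence the L25 array.
dof = len(levels) * (5 - 1)

# First four rows of the coded L25 array (Table 2): levels of Gv, C, ERS, Fr.
l25 = [(1, 1, 1, 1), (1, 2, 2, 2), (1, 3, 3, 3), (1, 4, 4, 4)]

for run, coded in enumerate(l25, start=1):
    physical = {name: levels[name][lvl - 1] for name, lvl in zip(levels, coded)}
    print(run, physical)
```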
The overcut ($O_c$) was calculated as:

$$O_c = \frac{D_a - D}{2}, \tag{3}$$

where $D_a$ is the average hole diameter, measured as

$$D_a = \left(r_t + r_b\right). \tag{4}$$

The taper angle ($\theta$) was calculated as:

$$\theta = \tan^{-1}\!\left(\frac{r_t - r_b}{h}\right). \tag{5}$$
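Equations (1)–(5) are simple enough to script directly; the following Python sketch reproduces them (the argument values in the example call are illustrative placeholders, not measurements from the study).

```python
import math

def responses(r_t, r_b, d, t, F, D):
    """Compute MRR, EWR, overcut and taper angle per Equations (1)-(5).

    r_t, r_b : hole radius at top and bottom (mm)
    d        : hole depth (mm)
    t        : machining time (min)
    F        : frontal electrode wear (mm)
    D        : electrode diameter (mm)
    """
    # Eq. (1): frustum volume of the hole divided by the machining time.
    mrr = (math.pi / 3.0) * (r_t**2 + r_t * r_b + r_b**2) * d / t
    # Volumetric tool wear rate, TW = pi * D^2 * F / (4 t).
    tw = math.pi * D**2 * F / (4.0 * t)
    # Eq. (2): electrode wear ratio.
    ewr = tw / mrr
    # Eq. (4): average hole diameter, Da = r_t + r_b.
    d_a = r_t + r_b
    # Eq. (3): overcut.
    overcut = (d_a - D) / 2.0
    # Eq. (5): taper angle, converted to degrees.
    taper = math.degrees(math.atan((r_t - r_b) / d))
    return mrr, ewr, overcut, taper

# Illustrative (hypothetical) values: 0.3 mm electrode, 1.5 mm deep hole.
print(responses(r_t=0.170, r_b=0.160, d=1.5, t=20.0, F=0.05, D=0.300))
```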
RESULTS AND DISCUSSION

S/N Analysis

In Taguchi analysis, the term 'signal' signifies the desired effect for the performance measure, and the term 'noise' signifies the unwanted effect for the performance measure. The S/N ratio measures how much the quality characteristic deviates from the desired value, and the level with the higher S/N ratio determines the optimal setting of a process parameter. Since a higher material removal rate is desirable for high production, the higher-the-better S/N quality characteristic was used for MRR. In contrast, lower tool wear, overcut, and taper angle signify better hole accuracy; therefore, the lower-the-better S/N quality characteristic was used for EWR, overcut, and taper angle. The higher-the-better and lower-the-better quality characteristics are calculated using Equations (6) and (7), respectively (Aherwar et al., 2014; Unune & Mali, 2015), and are presented in Table 3:

$$\frac{S}{N} = -10\log_{10}\left(\frac{1}{n}\sum_{i=1}^{n}\frac{1}{y_i^{2}}\right), \tag{6}$$

$$\frac{S}{N} = -10\log_{10}\left(\frac{1}{n}\sum_{i=1}^{n}y_i^{2}\right), \tag{7}$$

where $y_i$ is the $i$th measured value of the performance measure in an experiment and $n$ is the number of measurements in each test trial.
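A small numerical check of Equations (6) and (7): applying the higher-the-better form to the MRR of experiment 1 in Table 3 (0.000597 mm³/min) gives roughly −64.5 dB, in line with the tabulated value. The helper functions below are a minimal Python sketch.

```python
import math

def sn_higher_the_better(values):
    # Eq. (6): S/N = -10 log10( (1/n) * sum(1 / y_i^2) )
    n = len(values)
    return -10.0 * math.log10(sum(1.0 / y**2 for y in values) / n)

def sn_lower_the_better(values):
    # Eq. (7): S/N = -10 log10( (1/n) * sum(y_i^2) )
    n = len(values)
    return -10.0 * math.log10(sum(y**2 for y in values) / n)

print(sn_higher_the_better([0.000597]))  # ~ -64.5 dB (MRR, experiment 1)
print(sn_lower_the_better([0.060]))      # ~  24.4 dB (EWR, experiment 1)
```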
Table 3. Measured responses and their S/N values

Exp. No. | MRR Result (mm³/min) | MRR S/N (dB) | EWR Result (%) | EWR S/N (dB) | Overcut Result (mm) | Overcut S/N (dB) | Taper Angle Result (degree) | Taper Angle S/N (dB)
1  | 0.000597 | -64.474 | 0.060 | 24.345 | 0.0251 | 32.003 | 1.91 | -5.630
2  | 0.001126 | -58.969 | 0.080 | 21.903 | 0.0219 | 33.167 | 1.93 | -5.754
3  | 0.002139 | -53.395 | 0.105 | 19.575 | 0.0199 | 33.982 | 1.98 | -5.976
4  | 0.005748 | -44.809 | 0.175 | 15.132 | 0.0173 | 35.218 | 2.23 | -6.973
5  | 0.009819 | -40.158 | 0.304 | 10.329 | 0.0115 | 38.761 | 2.49 | -7.954
6  | 0.002716 | -51.320 | 0.102 | 19.827 | 0.0201 | 33.935 | 1.98 | -5.964
7  | 0.003178 | -49.956 | 0.119 | 18.484 | 0.0185 | 34.612 | 2.00 | -6.050
8  | 0.002927 | -50.670 | 0.140 | 17.038 | 0.0183 | 34.748 | 2.04 | -6.233
9  | 0.006352 | -43.941 | 0.217 | 13.253 | 0.0145 | 36.728 | 2.19 | -6.838
10 | 0.006881 | -43.246 | 0.228 | 12.821 | 0.0218 | 33.214 | 2.45 | -7.808
11 | 0.003746 | -48.527 | 0.140 | 17.035 | 0.0172 | 35.243 | 2.03 | -6.158
12 | 0.003135 | -50.074 | 0.163 | 15.720 | 0.0150 | 36.475 | 1.99 | -5.998
13 | 0.003093 | -50.191 | 0.182 | 14.792 | 0.0159 | 35.929 | 2.03 | -6.151
14 | 0.005527 | -45.149 | 0.157 | 16.041 | 0.0211 | 33.512 | 2.25 | -7.072
15 | 0.009075 | -40.842 | 0.245 | 12.198 | 0.0191 | 34.357 | 2.50 | -7.991
16 | 0.003243 | -49.778 | 0.179 | 14.938 | 0.0145 | 36.753 | 2.03 | -6.171
17 | 0.002654 | -51.520 | 0.190 | 14.384 | 0.0158 | 35.974 | 2.03 | -6.192
18 | 0.002406 | -52.371 | 0.146 | 16.666 | 0.0196 | 34.138 | 2.06 | -6.309
19 | 0.006772 | -43.384 | 0.192 | 14.297 | 0.0176 | 35.062 | 2.28 | -7.186
20 | 0.008263 | -41.656 | 0.262 | 11.625 | 0.0169 | 35.431 | 2.51 | -8.027
21 | 0.001741 | -55.181 | 0.206 | 13.712 | 0.0159 | 35.929 | 2.05 | -6.266
22 | 0.001137 | -58.883 | 0.175 | 15.139 | 0.0183 | 34.731 | 2.05 | -6.256
23 | 0.001019 | -59.833 | 0.183 | 14.729 | 0.0153 | 36.300 | 2.10 | -6.447
24 | 0.005574 | -45.075 | 0.221 | 13.093 | 0.0149 | 36.510 | 2.31 | -7.282
25 | 0.007925 | -42.019 | 0.267 | 11.445 | 0.0184 | 34.664 | 2.54 | -8.130
The level of a parameter with the highest S/N ratio gives the optimal level. Therefore, the optimal process parameter setting for MRR is A3B5C4D5 (Figure 2). Thus, the best combination for maximizing the MRR is a gap voltage of 100 V, a capacitance of 0.4 µF, an electrode rotation speed of 700 rpm, and a feed rate of 48 µm/sec. The analysis of the results showed that the optimum combination of machining parameters for EWR is A1B1C1D5, namely a gap voltage of 80 V, a capacitance of 0.0001 µF, an electrode rotation speed of 100 rpm, and a feed rate of 48 µm/sec (Figure 3). The optimum combination of process parameters for minimizing the overcut is A5B4C5D5, namely a gap voltage of 120 V, a capacitance of 0.1 µF, an electrode rotation speed of 900 rpm, and a feed rate of 48 µm/sec (Figure 4). Similarly, A1B1C1D1 is found to be the optimum combination for the lowest taper angle; that is, a gap voltage of 80 V, a capacitance of 0.0001 µF, an electrode rotation speed of 100 rpm, and a feed rate of 24 µm/sec (Figure 5).
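The optimal levels quoted above come from main-effects tables, i.e., the mean S/N ratio of each factor level. A minimal sketch of that bookkeeping is given below, using only the capacitance column of the first five runs of Table 2 and the corresponding MRR S/N values of Table 3 for illustration.

```python
from collections import defaultdict

def mean_sn_per_level(coded_levels, sn_values):
    """Mean S/N ratio for each coded level (1-5) of a single factor."""
    sums, counts = defaultdict(float), defaultdict(int)
    for lvl, sn in zip(coded_levels, sn_values):
        sums[lvl] += sn
        counts[lvl] += 1
    return {lvl: sums[lvl] / counts[lvl] for lvl in sorted(sums)}

# Capacitance levels and MRR S/N ratios of experiments 1-5 (Tables 2 and 3).
cap_levels = [1, 2, 3, 4, 5]
mrr_sn = [-64.474, -58.969, -53.395, -44.809, -40.158]

means = mean_sn_per_level(cap_levels, mrr_sn)
best = max(means, key=means.get)   # highest mean S/N marks the optimal level
print(means)
print(best)   # level 5, consistent with the B5 setting reported above
```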
Figure 2. Main effects plot of S/N ratios for MRR
Figure 3. Main effects plot of S/N ratios for EWR
Figure 4. Main effects plot of S/N ratios for overcut
Figure 5. Main effects plot of S/N ratios for taper angle
Effect of Cutting Parameters on Responses

It has been observed from Figure 6 that, with an increase in the capacitance value, the MRR increases considerably. This is due to the fact that, with an increase in capacitance, the discharge energy acting on the workpiece also increases (Kuriachen & Mathew, 2015). The discharge energy (DE) in an RC circuit is determined by the capacitance and the gap voltage (DE = 0.5 × Capacitance × Voltage²). The capacitance controls the amount of energy deposited; therefore, with an increase in capacitance, the DE, pulse current, and pulse interval also rise, improving the MRR. Consequently, the MRR increases significantly at higher values of capacitance (Figure 6). Also, the MRR increases initially with an increase in gap voltage up to an optimum value and then decreases with a further increase in gap voltage (Kuriachen & Mathew, 2015). It has also been noticed that with an increase in ERS the MRR increases. At a higher tool electrode speed, a higher centrifugal force is produced, which in turn helps in expelling the molten metal and debris particles present in the inter-electrode gap (IEG) (Dave et al., 2014; Unune & Mali, 2016c). The tangential velocities also cause a disturbance in the dielectric fluid and therefore enhance the circulation of dielectric in the machining gap. It has further been observed that with an increase in feed rate the MRR increases slowly: at a higher feed rate the tool electrode travels at a higher rate, which increases the machining rate and therefore the MRR.

Figure 6. Effects of control factors on material removal rate
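As a quick worked example of the discharge-energy relation quoted above (DE = 0.5 × C × V²), the span of energies covered by the levels of Table 1 can be computed directly; with the capacitance in µF and the voltage in V, the result comes out in µJ.

```python
def discharge_energy_uj(capacitance_uf, voltage_v):
    # DE = 0.5 * C * V^2; with C in microfarads the energy is in microjoules.
    return 0.5 * capacitance_uf * voltage_v**2

print(discharge_energy_uj(0.0001, 80))   # lowest setting:  0.32 uJ
print(discharge_energy_uj(0.4, 120))     # highest setting: 2880 uJ
```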
It has been noted from Figure 7 that, with an increase in both capacitance and gap voltage, the electrode wear ratio also increases. As discussed previously, the amount of discharge energy released depends on both the capacitance and the voltage values: the higher the capacitance and gap voltage, the higher the energy released. At higher discharge energy, more erosion takes place from both the workpiece and the tool electrode (Jahan et al., 2009). With an increase in electrode rotational speed, the electrode wear ratio increases; this is because a higher electrode speed promotes effective sparking, thus increasing the EWR. However, the feed rate was not seen to have a significant impact on the EWR. The overcut was found to be reduced at higher values of all input parameters (see Figure 8).

Figure 7. Effects of control factors on electrode wear ratio
From Figure 9, it can be seen that the taper angle increases with an increase in gap voltage, capacitance, electrode rotational speed, as well as feed rate. However, the taper angle increases most rapidly with an increase in capacitance. The corner wear of the tool electrode results in tapered holes. As discussed previously, an increase in DE causes more electrode wear, and therefore the hole becomes increasingly tapered as drilling progresses to greater depth. Therefore, lower values of all control variables are more suitable for drilling accurate micro-holes.
Figure 8. Effects of control factors on overcut
Figure 9. Effects of control factors on taper angle
ANOVA Analysis

Analysis of variance (ANOVA) has been utilized to determine the statistically significant factors influencing the MRR, EWR, overcut, and taper angle during µ-EDD of Inconel 718 and to determine the percentage contribution of each control factor to the responses. The ANOVA results are given in Table 4, Table 5, Table 6, and Table 7. Capacitance has been observed to be the most significant factor for MRR, contributing 89.03%. The percentage contributions of gap voltage and electrode rotational speed to the MRR were found to be the same (3.86%). From the ANOVA table, it has been observed that capacitance, gap voltage, and electrode rotational speed have statistical and physical significance for the MRR, as their P-values are lower than 0.05. Similarly, for EWR, capacitance was found to be the most significant factor with a 62.10% contribution (Table 5), followed by electrode rotational speed and gap voltage, which contribute 16.36% and 15.82%, respectively. Capacitance, gap voltage, and electrode rotational speed have statistical and physical significance for EWR. The most influential factor for overcut was electrode rotational speed, with a percentage contribution of 54.83%, followed by capacitance and gap voltage, which contribute 22.54% and 11.67%, respectively (Table 6). However, only electrode rotational speed was observed to have statistical significance for overcut. Capacitance and gap voltage were found to contribute 95.75% and 3.29% to the taper angle, and both had statistical significance for the taper angle (Table 7).

Table 4. ANOVA for MRR

Factors                  | Degree of Freedom, DF | Sum of Squares, SS | Mean Squares, MS | F Ratio | P-Value | Contribution (%)
Gap Voltage              | 4  | 0.0000068 | 0.0000017 | 5.07    | 0.025 | 3.86
Capacitance              | 4  | 0.0001567 | 0.0000392 | 116.11  | 0.000 | 89.03
Electrode rotation speed | 4  | 0.0000068 | 0.0000017 | 5.02    | 0.025 | 3.86
Feed rate                | 4  | 0.0000032 | 0.0000008 | 2.38    | 0.138 | 1.82
Error                    | 8  | 0.0000027 | 0.0000003 |         |       | 1.53
Total                    | 24 | 0.0001763 |           |         |       |
Table 5. ANOVA for EWR

Factors                  | Degree of Freedom, DF | Sum of Squares, SS | Mean Squares, MS | F Ratio | P-Value | Contribution (%)
Gap Voltage              | 4  | 0.0134787 | 0.0033697 | 8.48  | 0.006 | 15.82
Capacitance              | 4  | 0.0529228 | 0.0132307 | 33.3  | 0.000 | 62.10
Electrode rotation speed | 4  | 0.0139432 | 0.0034858 | 8.77  | 0.005 | 16.36
Feed rate                | 4  | 0.0016995 | 0.0004249 | 1.07  | 0.432 | 1.99
Error                    | 8  | 0.0031788 | 0.0003974 |       |       | 3.73
Total                    | 24 | 0.085223  |           |       |       |
Table 6. ANOVA for overcut

Factors                  | Degree of Freedom, DF | Sum of Squares, SS | Mean Squares, MS | F Ratio | P-Value | Contribution (%)
Gap Voltage              | 4  | 0.0000244 | 0.0000061 | 1.04 | 0.445 | 11.67
Capacitance              | 4  | 0.0000059 | 0.0000015 | 0.25 | 0.903 | 22.54
Electrode rotation speed | 4  | 0.0001146 | 0.0000287 | 4.87 | 0.028 | 54.83
Feed rate                | 4  | 0.000017  | 0.0000042 | 0.72 | 0.601 | 8.13
Error                    | 8  | 0.0000471 | 0.0000059 |      |       |
Total                    | 24 | 0.0002089 |           |      |       |
Table 7. ANOVA for taper angle

Factors                  | Degree of Freedom, DF | Sum of Squares, SS | Mean Squares, MS | F Ratio | P-Value | Contribution (%)
Gap Voltage              | 4  | 0.032296 | 0.008074 | 19.74  | 0.000 | 3.29
Capacitance              | 4  | 0.939096 | 0.234774 | 574.02 | 0.000 | 95.75
Electrode rotation speed | 4  | 0.001616 | 0.000404 | 0.99   | 0.466 | 0.16
Feed rate                | 4  | 0.004456 | 0.001114 | 2.72   | 0.106 | 0.45
Error                    | 8  | 0.003272 | 0.000409 |        |       | 0.33
Total                    | 24 | 0.980736 |          |        |       |
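The contribution percentages in Tables 4–7 are simply each factor's sum of squares divided by the total sum of squares; the sketch below redoes that arithmetic for Table 4 (small differences from the printed values are expected because the tabulated SS entries are rounded).

```python
# Percent contribution of each factor = SS_factor / SS_total * 100 (Table 4, MRR).
ss = {
    "Gap voltage": 0.0000068,
    "Capacitance": 0.0001567,
    "Electrode rotation speed": 0.0000068,
    "Feed rate": 0.0000032,
    "Error": 0.0000027,
}
ss_total = 0.0001763

for factor, value in ss.items():
    print(f"{factor:26s} {100.0 * value / ss_total:6.2f} %")
```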
MULTI-OBJECTIVE OPTIMIZATION USING GENETIC ALGORITHM

GAs are computerized search and optimization algorithms that work in a manner similar to the mechanics of natural genetics and natural selection (Joshi & Pande, 2011; Mahendran et al., 2010; Tyukhov, Rezk, & Vasant, 2016). Genetic algorithms are stochastic search methods that mimic the metaphor of natural biological evolution. They operate on a population of possible solutions, applying the principle of survival of the fittest to yield better and better approximations to a solution. At every generation, a new set of approximations is generated by selecting individuals according to their level of fitness in the problem domain and breeding them together using operators borrowed from natural genetics. This process results in a population of individuals that are better suited to their environment than the individuals they were generated from, just as in natural adaptation. Initially, the objective function is utilized for generating the fitness function, which is then used in the successive genetic operations. The iteration in a GA begins with a population of random strings representing the design or decision variables. After that, each string is evaluated to find its fitness value. The population is then operated on by three main operators, reproduction, crossover, and mutation, to create a new population of points. The flow chart of the genetic algorithm operation in multi-objective optimization is shown in Figure 10. The literature (Abbas et al., 2007; Rao & Kalyankar, 2014; Somashekhar et al., 2010) also documents that the GA is a better optimization methodology than traditional optimization techniques due to its robustness.
Figure 10. Flow chart of genetic algorithm based optimization
A fitness function is created to maximize the MRR and minimize the EWR, overcut, and taper angle in order to achieve the optimal performance:

Maximize MRR = f(0.000966873 − 0.0000055571 × Gv + 0.0152632 × C + 0.0000016492 × ERS + 0.0000410038 × Fr),

Minimize EWR = f(−0.071704 + 0.00164187 × Gv + 0.294416 × C + 0.0000787093 × ERS + 0.000444871 × Fr),

Minimize overcut = f(0.0293395 − 0.000069042 × Gv − 0.00139744 × C − 0.00000746554 × ERS − 0.000020463 × Fr),

Minimize taper angle = f(0.0293395 − 0.000069042 × Gv − 0.00139744 × C − 0.00000746554 × ERS − 0.000020463 × Fr),

subject to the constraints

80 ≤ Gv ≤ 120, 0.0001 ≤ C ≤ 0.4, 100 ≤ ERS ≤ 900, 24 ≤ Fr ≤ 48,

where Gv, C, ERS, and Fr are the gap voltage, capacitance, electrode rotational speed, and feed rate, respectively. The constraints on the input parameters have been decided based on the electrical capacity of the experimental setup used, and f represents the functional relationship between the performance measure and the four process parameters. The GA randomly generated a real-valued population of possible process parameter sets (chromosomes). A new population for the next generation is created from the old population using the genetic operator functions. The settings used were: number of generations, 500; population size, 20; crossover rate, 0.8; crossover mechanism, two-point; and mutation rate, 0.01. MATLAB was used for the analysis. Several GA runs have been carried out to identify the range of process parameters for optimal performance of the µ-EDD process. The optimum values of the process parameters for which the µ-EDD process is expected to offer the best MRR, EWR, overcut, and taper angle are shown in Table 8.

Table 8. Optimization results and validation

Optimum process parameters: Gap Voltage 105 V | Capacitance 0.03 µF | Electrode Rotational Speed 201 rpm | Feed Rate 26 µm/sec

Response    | Predicted | Experimental
MRR         | 0.002     | 0.0021
EWR         | 0.137     | 0.140
Overcut     | 0.02      | 0.021
Taper Angle | 2.06      | 2.09
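As a hedged illustration of the GA search described above, the following self-contained Python sketch optimizes a weighted sum of the fitted regression expressions printed before Table 8. The weights and the single-point crossover are illustrative choices, not the exact MATLAB configuration used by the authors (who report a two-point crossover); the population size and number of generations match the reported settings.

```python
import random

# Regression models fitted to the experimental data (see the expressions above).
def mrr(gv, c, ers, fr):
    return 0.000966873 - 0.0000055571*gv + 0.0152632*c + 0.0000016492*ers + 0.0000410038*fr

def ewr(gv, c, ers, fr):
    return -0.071704 + 0.00164187*gv + 0.294416*c + 0.0000787093*ers + 0.000444871*fr

def overcut(gv, c, ers, fr):
    return 0.0293395 - 0.000069042*gv - 0.00139744*c - 0.00000746554*ers - 0.000020463*fr

BOUNDS = [(80, 120), (0.0001, 0.4), (100, 900), (24, 48)]  # Gv, C, ERS, Fr

def fitness(x):
    # Weighted sum: maximize MRR, minimize EWR and overcut (illustrative weights).
    gv, c, ers, fr = x
    return 1000.0 * mrr(gv, c, ers, fr) - ewr(gv, c, ers, fr) - 10.0 * overcut(gv, c, ers, fr)

def random_individual():
    return [random.uniform(lo, hi) for lo, hi in BOUNDS]

def crossover(a, b):
    point = random.randint(1, len(a) - 1)          # single-point crossover
    return a[:point] + b[point:]

def mutate(x, rate=0.01):
    return [random.uniform(lo, hi) if random.random() < rate else xi
            for xi, (lo, hi) in zip(x, BOUNDS)]

def run_ga(pop_size=20, generations=500):
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]        # keep the fitter half
        children = [mutate(crossover(random.choice(elite), random.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    return max(population, key=fitness)

best = run_ga()
print("Gv=%.1f V, C=%.4f uF, ERS=%.0f rpm, Fr=%.1f um/s" % tuple(best))
```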
BENEFITS AND LIMITATIONS OF THE CURRENT WORK

The performance of the electric discharge machining process is highly dependent on the proper choice of the input process parameters. Moreover, in high-aspect-ratio micro-hole drilling with the µ-EDD process, the quality of the produced micro-holes is critically affected by the process parameters. Choosing the variables for (i) higher MRR results in poor quality of the fabricated micro-holes, while (ii) choosing them for the best quality of micro-holes results in low MRR. Therefore, in this work, to achieve a solution to this trade-off, the optimum combination of input process parameters has been found using GA-based optimization. High-aspect-ratio micro-hole drilling is demonstrated with improved performance of the µ-EDD process, yielding optimum MRR, EWR,
overcut, and taper angle. The results of the optimization will be of direct use in precision engineering and machining applications for both research institutes and industries. However, the interaction effects of the process parameters on the process performance are not included. Also, higher-order regression equations may further enhance the results for the µ-EDD process. In addition, the performance of different multi-objective optimization algorithms can be compared.
FUTURE RESEARCH DIRECTIONS

In this study, a genetic algorithm has been used to optimize the performance of the µ-EDD process for the fabrication of micro-holes in the difficult-to-cut Inconel 718 superalloy. The process parameters, viz. gap voltage, capacitance, electrode rotational speed, and feed rate, have been optimized to obtain the best quality of micro-holes (i.e., minimum overcut and minimum taper angle) at a faster cutting rate and with minimum tool wear. For future work, investigations of the effect of different tool materials on the quality of micro-holes, of the surface quality of the fabricated micro-holes, and of the heat-affected zone and surface integrity of the fabricated micro-holes, etc., can be taken as research objectives.
CONCLUSION

The micro-electric discharge drilling process is a widely used technique suitable for the generation of micro-holes in any conductive material irrespective of its hardness. In this study, genetic-algorithm-based optimization of the µ-EDD process was performed to optimize the performance of the process such that accurate micro-holes can be achieved in the difficult-to-cut Inconel 718 superalloy. The effects of the input machining parameters, i.e., gap voltage, capacitance, electrode rotational speed, and feed rate, on the performance measures, i.e., MRR, EWR, overcut, and taper angle, have been investigated. Based on the study performed, the following conclusions have been drawn:

• Capacitance was found to be the most significant factor for MRR, EWR, and taper angle, while electrode rotational speed was the most significant factor for overcut.
• The MRR, EWR, and taper angle increase with an increase in capacitance and gap voltage due to the increase in discharge energy during micro-hole drilling. A higher electrode rotational speed is more suitable for a low overcut of the drilled micro-holes.
• Genetic-algorithm-based multi-objective optimization shows that the optimum machining conditions, i.e., a gap voltage of 105 V, a capacitance of 0.03 µF, an electrode rotational speed of 201 rpm, and a feed rate of 26 µm/sec, can lead to accurate micro-holes while producing micro-holes in the Inconel 718 superalloy using the µ-EDD process.
ACKNOWLEDGMENT

The authors would like to thank Dr. Harlal Singh Mali and the Advanced Manufacturing and Mechatronics Laboratory at Malaviya National Institute of Information Technology Jaipur, India, for providing the facilities for conducting this research. The authors also extend their deepest regards to Prof. Pandian Vasant and IGI
Global for giving us the opportunity to present our work in the book 'Emergent Research on the Application of Optimization Algorithms'.
REFERENCES Abbas, N. M., Solomon, D. G., & Bahari, M. F. (2007). A review on current research trends in electrical discharge machining (EDM). International Journal of Machine Tools & Manufacture, 47(7-8), 1214–1228. doi:10.1016/j.ijmachtools.2006.08.026 Aherwar, A., Unune, D., Pathri, B., & Kishan, J. (2014). Statistical and Regression Analysis of Vibration of Carbon Steel Cutting Tool for Turning of EN24 Steel Using Design of Experiments. International Journal of Recent advances in Mechanical Engineering, 3(3), 137-151. doi:10.14810/ijmech.2014.3312 Ay, M., Çaydaş, U., & Hasçalık, A. (2012). Optimization of micro-EDM drilling of inconel 718 superalloy. International Journal of Advanced Manufacturing Technology, 66(5-8), 1015–1023. doi:10.100700170012-4385-8 Bhattacharya, A., Batish, A., Singh, G., & Singla, V. K. (2011). Optimal parameter settings for rough and finish machining of die steels in powder-mixed EDM. International Journal of Advanced Manufacturing Technology, 61(5-8), 537–548. doi:10.100700170-011-3716-5 Cai, T. (2015). Application of Soft Computing Techniques for Renewable Energy Network Design and Optimization. In P. Vasant (Ed.), Handbook of Research on Artificial Intelligence Techniques and Algorithms (pp. 204–225). Hershey, PA: IGI Global; doi:10.4018/978-1-4666-7258-1.ch007 Chandrasekaran, M., Muralidhar, M., Krishna, C. M., & Dixit, U. S. (2009). Application of soft computing techniques in machining performance prediction and optimization: A literature review. International Journal of Advanced Manufacturing Technology, 46(5-8), 445–464. doi:10.100700170-009-2104-x D’Urso, G., Maccarini, G., & Ravasio, C. (2015b). Influence of electrode material in micro-EDM drilling of stainless steel and tungsten carbide. International Journal of Advanced Manufacturing Technology. doi:10.100700170-015-7010-9 Dave, H. K., Mathai, V. J., Desai, K. P., & Raval, H. K. (2014). Studies on quality of microholes generated on Al 1100 using micro-electro-discharge machining process. International Journal of Advanced Manufacturing Technology, 76(1-4), 127–140. doi:10.100700170-013-5542-4 Dudzinski, D., Devillez, A., Moufki, A., Larrouquère, D., Zerrouki, V., & Vigneau, J. (2004). A review of developments towards dry and high speed machining of Inconel 718 alloy. International Journal of Machine Tools & Manufacture, 44(4), 439–456. doi:10.1016/S0890-6955(03)00159-7 DUrso, G., Maccarini, G., Quarto, M., & Ravasio, C. (2015a). Investigation on power discharge in microEDM stainless steel drilling using different electrodes. Journal of Mechanical Science and Technology, 29(10), 4341–4349. doi:10.100712206-015-0932-1 Ezugwu, E. O., Wang, Z. M., & Machado, A. R. (1999). The machinability of nickel-based alloys: A review. Journal of Materials Processing Technology, 86(1-3), 1–16. doi:10.1016/S0924-0136(98)00314-8
Jafferson, J. M., & Hariharan, P. (2013). Investigation of the Quality of Microholes Machined by µEDM Using Image Processing. Materials and Manufacturing Processes, 28(12), 1356–1360. doi:10.1080/1 0426914.2013.832302 Jahan, M. P., Wong, Y. S., & Rahman, M. (2009). A comparative experimental investigation of deephole micro-EDM drilling capability for cemented carbide (WC-Co) against austenitic stainless steel (SUS 304). International Journal of Advanced Manufacturing Technology, 46(9-12), 1145–1160. doi:10.100700170-009-2167-8 Joshi, S. N., & Pande, S. S. (2011). Intelligent process modeling and optimization of die-sinking electric discharge machining. Applied Soft Computing, 11(2), 2743–2755. doi:10.1016/j.asoc.2010.11.005 Kuriachen, B., & Mathew, J. (2015). Effect of Powder Mixed Dielectric on Material Removal and Surface Modification in Microelectric Discharge Machining of Ti-6Al-4V. Materials and Manufacturing Processes, 31(4), 439–446. doi:10.1080/10426914.2015.1004705 Lee, P. A., Kim, Y., & Kim, B. H. (2015). Effect of low frequency vibration on micro EDM drilling. International Journal of Precision Engineering and Manufacturing, 16(13), 2617–2622. doi:10.100712541015-0335-3 Lin, T. R. (2002). Optimisation Technique for Face Milling Stainless Steel with Multiple Performance Characteristics. International Journal of Advanced Manufacturing Technology, 19(5), 330–335. doi:10.1007001700200021 Liu, H.-S., Yan, B.-H., Huang, F.-Y., & Qiu, K.-H. (2005). A study on the characterization of high nickel alloy micro-holes using micro-EDM and their applications. Journal of Materials Processing Technology, 169(3), 418–426. doi:10.1016/j.jmatprotec.2005.04.084 Mahapatra, S. S., & Patnaik, A. (2006). Optimization of wire electrical discharge machining (WEDM) process parameters using Taguchi method. International Journal of Advanced Manufacturing Technology, 34(9-10), 911–925. doi:10.100700170-006-0672-6 Mahendran, S., Devarajan, R., Nagarajan, T., & Majdi, A. (2010). A Review of Micro-EDM. Proceedings of the International MultiConference of Engineers and Computer Scientist Hong Kong. Mandal, D., Pal, S. K., & Saha, P. (2007). Modeling of electrical discharge machining process using back propagation neural network and multi-objective optimization using non-dominating sorting genetic algorithm-II. Journal of Materials Processing Technology, 186(1-3), 154–162. doi:10.1016/j.jmatprotec.2006.12.030 Manivannan, R., & Kumar, M. P. (2016). Multi-response optimization of Micro-EDM process parameters on AISI304 steel using TOPSIS. Journal of Mechanical Science and Technology, 30(1), 137–144. doi:10.100712206-015-1217-4 Mankad, K. B. (2015). An Intelligent Process Development Using Fusion of Genetic Algorithm with Fuzzy Logic. In P. Vasant (Ed.), Handbook of Research on Artificial Intelligence Techniques and Algorithms (pp. 44–81). Hershey, PA: IGI Global; doi:10.4018/978-1-4666-7258-1.ch002
Morgan, C. J., Vallance, R. R., & Marsh, E. R. (2006). Micro-machining and micro-grinding with tools fabricated by micro electro-discharge machining. International Journal of Nanomanufacturing, 1(2), 242. doi:10.1504/IJNM.2006.012196 Prakash, C., Kansal, H. K., Pabla, B., Puri, S., & Aggarwal, A. (2015). Electric discharge machining A potential choice for surface modification of metallic implants for orthopedic applications: A review. Proceedings of the Institution of Mechanical Engineers. Part B, Journal of Engineering Manufacture, 230(2), 331–353. doi:10.1177/0954405415579113 Rao, R. V., & Kalyankar, V. D. (2014). Optimization of modern machining processes using advanced optimization techniques: A review. International Journal of Advanced Manufacturing Technology, 73(58), 1159–1188. doi:10.100700170-014-5894-4 Somashekhar, K. P., Ramachandran, N., & Mathew, J. (2010). Optimization of Material Removal Rate in Micro-EDM Using Artificial Neural Network and Genetic Algorithms. Materials and Manufacturing Processes, 25(6), 467–475. doi:10.1080/10426910903365760 Taguchi, G., Chowdhury, S., & Wu, Y. (2004). Taguchi’s Quality Engineering Handbook. John Wiley & Sons Inc. doi:10.1002/9780470258354 Tyukhov, I., Rezk, H., & Vasant, P. (2016). Modern Optimization Algorithms and Applications in Solar Photovoltaic Engineering. In P. Vasant & N. Voropai (Eds.), Sustaining Power Resources through Energy Optimization and Engineering (pp. 390–445). Hershey, PA: IGI Global; doi:10.4018/978-1-4666-97553.ch016 Unune, D. R., & Mali, H. S. (2014). Current status and applications of hybrid micro-machining processes: A review. Proceedings of the Institution of Mechanical Engineers. Part B, Journal of Engineering Manufacture, 229(10), 1681–1693. doi:10.1177/0954405414546141 Unune, D. R., & Mali, H. S. (2015). Artificial neural network–based and response surface methodology– based predictive models for material removal rate and surface roughness during electro-discharge diamond grinding of Inconel 718. Proc IMechE Part B: J Engineering Manufacture. doi:10.1177/0954405415619347 Unune, D. R., & Mali, H. S. (2016a). Experimental investigation on low-frequency vibration assisted micro-WEDM of Inconel 718. Engineering Science and Technology, an International Journal. doi:10.1016/j.jestch.2016.06.010 Unune, D. R., & Mali, H. S. (2016b). A study of multiobjective parametric optimisation of electric discharge diamond cut-off grinding of Inconel 718. International Journal of Abrasive Technology, 7(3), 187–199. doi:10.1504/IJAT.2016.078281 Unune, D. R., & Mali, H. S. (2016c). Experimental Investigations on Low Frequency Workpiece Vibration in Micro Electro Discharge Drilling of Inconel 718. Paper presented at the 6th International & 27th All India Manufacturing Technology, Design and Research Conference (AIMTDR-2016), Pune, India. Unune, D. R., Singh, V. P., & Mali, H. S. (2015). Experimental Investigations of Abrasive Mixed Electro Discharge Diamond Grinding of Nimonic 80A. Materials and Manufacturing Processes. doi:10.1080/ 10426914.2015.1090598
Yilmaz, O., & Okka, M. A. (2010). Effect of single and multi-channel electrodes application on EDM fast hole drilling performance. International Journal of Advanced Manufacturing Technology, 51(1-4), 185–194. doi:10.100700170-010-2625-3 Zhang, L., Tong, H., & Li, Y. (2015). Precision machining of micro tool electrodes in micro EDM for drilling array micro holes. Precision Engineering, 39, 100–106. doi:10.1016/j.precisioneng.2014.07.010 Zhenlong, W., Xuesong, G., Guanxin, C., & Yukui, W. (2014). Surface Integrity Associated with SiC/ Al Particulate Composite by Micro-Wire Electrical Discharge Machining. Materials and Manufacturing Processes, 29(5), 532–539. doi:10.1080/10426914.2014.901520 Zhu, D., Zhang, X., & Ding, H. (2013). Tool wear characteristics in machining of nickel-based superalloys. International Journal of Machine Tools & Manufacture, 64, 60–77. doi:10.1016/j.ijmachtools.2012.08.001
This research was previously published in Handbook of Research on Emergent Applications of Optimization Algorithms; pages 728-749, copyright year 2018 by Business Science Reference (an imprint of IGI Global).
APPENDIX: NOMENCLATURE

C : Capacitance (µF)
d : Hole depth (µm)
D : Electrode diameter (µm)
Da : Average hole diameter (µm)
ERS : Electrode rotational speed (rpm)
F : Frontal electrode wear (µm)
Fr : Feed rate (µm/sec)
Gv : Gap voltage (V)
Oc : Overcut (µm)
rt : Radius of hole at top (µm)
rb : Radius of hole at bottom (µm)
t : Machining time (min)
θ : Taper angle (degree)
Section 4
Utilization and Applications
Chapter 36
Application of Natural-Inspired Paradigms on System Identification: Exploring the Multivariable Linear Time Variant Case
Mateus Giesbrecht, Unicamp, Brazil
Celso Pascoli Bottura, Unicamp, Brazil
ABSTRACT
In this chapter, the application of nature-inspired paradigms to system identification is discussed. A review of the recent applications of techniques such as genetic algorithms, genetic programming, immuno-inspired algorithms, and particle swarm optimization to system identification is presented, discussing the application to linear, nonlinear, time invariant, time variant, monovariable, and multivariable cases. Then the application of an immuno-inspired algorithm to solve the linear time variant multivariable system identification problem is detailed with examples and comparisons to other methods. Finally, the future directions of the application of nature-inspired paradigms to the system identification problem are discussed, followed by the chapter conclusions.
DOI: 10.4018/978-1-7998-8048-6.ch036
INTRODUCTION
For more than 50 years, many techniques have been developed to control a huge variety of dynamic systems, which range from simple and well-known mechanical systems to the dynamics of the financial market. To develop those control techniques, it is fundamental to know the mathematical relations between the system inputs and outputs. This necessity naturally led to the development of techniques to find mathematical realizations of the systems to be controlled. These realizations are mathematical models
that, for a given domain, behave as closely as possible to the system to be modeled. The determination of the realizations of a given dynamical system is defined as system identification. Given the huge variety of existing systems, there are many models that can be used to generate a system realization and, for this reason, there are also many techniques that can be applied to reach this objective. Generally, the starting point is the information available about the system to be studied. Depending on the available information, one of three techniques may be applied: white-, grey-, or black-box modelling. Those techniques can be applied to any kind of system, including linear, nonlinear, time variant, time invariant, monovariable, and multivariable ones. If the laws that govern the system behavior are reasonably known and if there is an estimate of the parameters involved in the problem, the system equations can be written from the known dynamics using the estimated parameters, and a mathematical realization is then determined. This technique is known as white-box identification, since all the relevant details about the system are available. The models obtained with this technique are reasonably accurate for simple and well-studied systems operating on a known region of the domain. For more complex physical systems, sometimes the structure of the equations is known but the parameters are not, either because they are complicated to measure or because their values vary depending on the ambient conditions. To obtain realizations for those systems, it is possible to write the equations that relate their outputs to their inputs as functions of the unknown parameters, to collect input and output data from the system operation, and to estimate the parameters as the set that minimizes the error between the real system outputs and the model outputs. It is also possible to collect the statistical characteristics of the output signal and to determine the set of parameters for which the model produces the outputs with the maximum likelihood to the real system outputs. These methods are known respectively as error minimization and maximum likelihood. In this case, where the model structure is known but the parameters are not, the system identification using any of the methods cited is known as grey-box. If both the system structure and the parameters are unknown, it is necessary to apply techniques capable of determining both the model structure and the parameters. It is also important to establish an information criterion to reach a compromise between the model accuracy and the model complexity, in order to avoid overly complex models that do not result in an accuracy much better than that of simpler ones. For linear systems, the subspace methods allow the determination of generic state space models without the necessity of any initial guess about the system order. For non-linear systems, linear combinations of the results of non-linear functions applied to the inputs can be used to model the unknown dynamics. These nonlinear functions must be bases of the function space, such as polynomials, sigmoid functions, radial basis functions, etc. These functions can also be applied recursively, as observed in neural networks. Once a technique is defined, the identification procedure consists of determining the set of parameters and model structures that minimize the information criterion. In this case, the identification method is known as black-box, since no information about the system is available a priori.
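To make the error-minimization idea concrete, here is a minimal sketch (in Python, with an illustrative first-order model and synthetic data, none of which come from the chapter) of grey-box identification posed as an optimization problem: the parameters of an assumed model structure are chosen to minimize the squared error between measured and simulated outputs.

```python
import numpy as np

# Illustrative "true" system: y[k+1] = a*y[k] + b*u[k] with unknown (a, b).
rng = np.random.default_rng(0)
a_true, b_true = 0.8, 0.5
u = rng.standard_normal(200)
y = np.zeros(201)
for k in range(200):
    y[k + 1] = a_true * y[k] + b_true * u[k]
y[1:] += 0.01 * rng.standard_normal(200)        # measurement noise

def output_error(a, b):
    # Simulate the candidate model and accumulate the squared output error.
    y_hat = np.zeros(201)
    for k in range(200):
        y_hat[k + 1] = a * y_hat[k] + b * u[k]
    return float(np.sum((y - y_hat) ** 2))

# Grey-box identification as optimization: brute-force search over a coarse grid
# (a GA or any other optimizer could replace this loop).
candidates = [(a, b) for a in np.linspace(0, 1, 101) for b in np.linspace(0, 1, 101)]
a_hat, b_hat = min(candidates, key=lambda p: output_error(*p))
print(a_hat, b_hat)   # should be close to (0.8, 0.5)
```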
For the grey-box and the black-box approaches, system identification is an optimization problem, where the objective is to find the solutions that minimize the output error, or maximize the likelihood between the system and the model outputs, or minimize the information criterion. From this point of view, nature-inspired paradigms may be used to solve this kind of problem, since many of those paradigms, such as the evolutionary algorithms, were developed to determine the solutions of optimization problems. For time variant system identification, the application of evolutionary algorithms is even more attractive, since in this case the system parameters vary over time, and the evolutionary approach
allows the population of possible solutions to continuously evolve to the optimal set of parameters at each time instant. In this chapter, the applications of natural paradigms on system identification are discussed. For this, a review of the recent applications of those paradigms to solve this kind of problem is presented and then an immuno-inspired algorithm implemented by the authors to identify linear multivariable time variant systems is detailed.
Background

The application of natural paradigms to solve the optimization problem related to system identification has been observed since the beginning of the 21st century, following the development of nature-inspired algorithms and the enhancement of computational capacity verified in the last decade of the 20th century. An approach from the first decade of the 21st century is the one described in (Rodriguez-Vazquez, Fonseca, & Fleming, 2004). In that work, the authors applied multiobjective genetic programming to identify a nonlinear time invariant single-input single-output (SISO) system, which was a gas turbine. The genetic programming approach allowed the determination of a nonlinear auto regressive model with exogenous inputs (NARX) without previous assumptions about the system structure or system parameters, configuring a black-box system identification. The basis functions used were polynomial, and the individuals' fitness was calculated based on many criteria, involving performance indexes and model complexity, following the idea of the information criterion used in common black-box system identification. To deal with this diversity of criteria, a Pareto-optimality approach was applied. The results showed a good agreement between the model and the system output data. In the following year, an article was published showing the application of the evolutionary approach to identify the structure and parameters of a time invariant mechatronic linear system (Iwasaki, Miwa, & Matsui, 2005). In that article, the authors used a classical genetic algorithm to identify the transfer function parameters of a two-mass resonant system. The system studied was linear and time invariant. Both the system parameters and the system structure were determined by the proposed algorithm, which optimizes a fitness function that incorporates an information criterion. The results were compared to the results obtained with a white-box modelling procedure, and the conclusion was that the results obtained with the evolutionary black-box approach were closer to the actual system response due to the modelling of unpredicted phenomena. Natural approaches have also been used to identify time variant linear and non-linear systems. In the reference (Wakizono, Hatanaka, & Uosaki, 2006), an immune mechanism based on the clonal selection algorithm proposed in (de Castro & Von Zuben, 2002) was proposed for switching linear system and nonlinear system identification. The models used in that reference were based on transfer functions, and the choice of the immune mechanism was due to the capacity of this algorithm to track previous optimal solutions from the set of memory cells. The results presented in the reference showed that the approach was effective to model the systems studied. In (Zakaria, Jamaluddin, Ahmad, & Loghmanian, 2010) the authors applied a multiobjective evolutionary algorithm to determine the parameters of three time invariant nonlinear multivariable systems, two of which were benchmarks with a known model and one an actual system. The two objectives of the proposed algorithm were the minimization of the model complexity and the minimization of the mean square error between the benchmark or real system outputs and the model outputs, following the common black-box approach. The models used were NARX (nonlinear auto regressive with exogenous inputs) models,
and the algorithm chosen to solve the optimization problem was an elitist non-dominated sorting genetic algorithm, that is, a genetic algorithm with the capacity of keeping the better solutions found in the population along the generations. The results obtained for the two benchmarks showed a close agreement between the known parameters and the estimated ones, and the outputs obtained with the model created to simulate the real system were close to the real system outputs, with residuals two orders of magnitude smaller than the output data. The combination of natural techniques with other computational intelligence algorithms has also been used to solve the system identification problem. In (Li & Li, 2012), an evolutionary algorithm combining quantum computation and a particle swarm algorithm was proposed and used to identify the parameters of a second-order time invariant linear SISO system. The parameters were identified by solving an optimization problem stated as the minimization of the mean square error between the system and the model outputs. The results were compared to the ones obtained with a conventional genetic algorithm and a pure quantum-inspired genetic evolutionary algorithm, showing comparable performances, even when the system output data was corrupted by noise. The time series realization, that is, the stochastic equivalent of the system identification problem, was addressed by the authors in the reference (Giesbrecht & Bottura, 2011). In this reference, the multivariable time series realization was stated as an optimization problem and solved with an immuno-inspired method. In a recent paper (Giesbrecht & Bottura, 2016) the authors stated the time series problem in an alternative way, and also applied a nature-inspired method to find realizations of a given time series. In both papers, the results obtained with the heuristic method were more accurate than those obtained with other more usual methods, due to the stochastic nature of the problem. Time series forecasting, which is related to time series realization, was also studied based on a nature-inspired paradigm. In (Koshiyama, Escovedo, Dias, Vellasco, & Pacheco, 2012) a genetic programming approach was applied to search for a nonlinear combination of forecasts generated by different time series models in order to obtain the optimal approximation for an observed time series. The results were compared to an optimal linear combination, and the conclusion was that the nonlinear combinations obtained with the proposed method performed better than the optimal linear combination. Particle swarm optimization (PSO), which is another nature-inspired paradigm based on the movement of particles in the space, was also used to solve the system identification problem. In (Tang, Xie, & Xue, 2015) a new algorithm based on particle swarm optimization and named comprehensive learning particle swarm optimization (CLPSO) is introduced and applied to identify the parameters of structural systems. The results were compared to the standard PSO, showing better performance of the proposed method. The majority of the results presented so far concern the application of natural paradigms, mainly evolutionary techniques, to solve the SISO system identification problem.
Even in the works where single input multiple output (SIMO) or multiple input multiple output (MIMO) systems were considered, the transfer functions or the difference equations that model the system were split for each output and optimized one by one. It is possible to identify a single state space model for linear MIMO systems, but the main difficulty related to this approach is the curse of dimensionality, since the search space increases as the system order does. In (Giesbrecht & Bottura, 2015), the authors proposed an algorithm following an immuno-inspired paradigm to identify discrete state space models for time variant MIMO systems with unknown structure. To avoid the curse of dimensionality, the algorithm population was initialized using a subspace identification method. In this manner, the candidate solutions start in a region of the solution space close to the
desired solutions, allowing convergence in a reasonable amount of time. This algorithm also takes advantage of the capacity of immuno-inspired algorithms to track time variant solutions. In this manner, as the system parameters vary along the time, the individuals in the population also evolve to track the set of parameters that minimizes the output error at that instant of time. In this chapter, the algorithm proposed in (Giesbrecht & Bottura, 2015) is detailed. To do so, some important concepts related to the subspace methods for MIMO system identification, to time variant system identification and to the immuno-inspired algorithms are presented. Subsequently, the proposed algorithm is detailed, followed by its application to the identification of a benchmark system. Then the future research directions and the conclusions are presented.
MAIN FOCUS OF THE CHAPTER

State Space System Identification

State space discrete system identification is the name given to the techniques created to determine a state space model that describes a discrete time invariant causal dynamic system. Many state space system identification techniques have been developed to identify time invariant multivariable dynamic systems, such as the ones presented in the references (Verhaegen & Verdult, 2007), (Young, 2011), (Barreto, 2002). Basically, these techniques take into account all available system input and output samples, estimate the Markov parameters and, from these parameters, perform some algebraic manipulations to find the matrices of a linear time invariant model whose outputs are similar to the system outputs when excited by the same input. Since these techniques are based on the whole set of samples available, only linear time invariant systems can be modeled correctly.

The main advantage of the state space approach is that a multiple input multiple output (MIMO) dynamic system can be modeled with no prior knowledge about the system order. Depending on the approach, a special input may or may not be needed to identify the system. In the following sections two approaches are discussed. The first is the discrete impulse response method, in which a discrete impulse is applied as input to the system and the state space model matrices are found from the outputs. The second approach discussed in this section is the Multivariable Output-Error State Space (MOESP) method, proposed by Verhaegen and Dewilde (Verhaegen & Dewilde, 1992). In this approach, the system input can be any signal, and using this signal and the system output it is possible to identify the system.
Multivariable Discrete Impulse Response

A fundamental concept in discrete time MIMO state space system identification is the multivariable discrete impulse response. From this concept, the definition of the Markov parameters and the concepts of reachability and observability arise directly. These concepts are fundamental to understand the algorithm developed in this paper and also to compare the behavior of different MIMO system state space identification algorithms. For this reason, in this section the multivariable discrete impulse response is defined in a formal way.
Let $y(k) \in \Re^{l}$ be the $l$-dimensional output at instant $k$ when the time invariant discrete causal MIMO system is excited by an $m$-dimensional input $u(k) \in \Re^{m}$. Since the system is supposed to be causal, the outputs at instant $k$ depend only on the inputs at instants between 0 and $k$. The relation between the outputs and inputs can then be represented by the equation below:

$$y(k) = \sum_{i=0}^{k} G(i)\,u(k-i)$$ (1)
where $G(i) \in \Re^{l \times m}$, $0 \leq i \leq k$, are the impulse response matrices that define the system. To find the set of matrices $G(i)$ that describe a time invariant discrete causal MIMO system, a set of multivariable discrete impulses can be applied as inputs and the outputs will be the columns of the matrices $G(i)$, as shown below. By definition, let $u_j(k)$ be the $j$-th discrete impulse element, defined in the following manner:

$$u_j(k) = \begin{cases} [\delta(1,j)\ \ \delta(2,j)\ \ \ldots\ \ \delta(m,j)]^{T} & k = 0 \\ 0 \in \Re^{m} & k \neq 0 \end{cases}$$ (2)

where $\delta(i,j)$ is the following function:

$$\delta(i,j) = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$ (3)

The definition above means that the discrete impulse elements are the following:

$$u_1(k) = \begin{cases} [1\ \ 0\ \ \ldots\ \ 0]^{T} & k = 0 \\ 0 \in \Re^{m} & k \neq 0 \end{cases}$$ (4)

$$u_2(k) = \begin{cases} [0\ \ 1\ \ \ldots\ \ 0]^{T} & k = 0 \\ 0 \in \Re^{m} & k \neq 0 \end{cases}$$ (5)

$$u_m(k) = \begin{cases} [0\ \ 0\ \ \ldots\ \ 1]^{T} & k = 0 \\ 0 \in \Re^{m} & k \neq 0 \end{cases}$$ (6)
Let $u_1(k)$ be applied as input to a time invariant discrete causal MIMO system represented by equation 1. Then the outputs $y(k)$ will be, for $k=0$:

$$y(0) = \sum_{i=0}^{k} G(i)\,u_1(0-i) = G(0)\,u_1(0)$$ (7)

that is:

$$y(0) = \begin{bmatrix} G_{11}(0) & \ldots & G_{1m}(0) \\ \vdots & & \vdots \\ G_{l1}(0) & \ldots & G_{lm}(0) \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} G_{11}(0) \\ \vdots \\ G_{l1}(0) \end{bmatrix}$$ (8)

for $k=1$:

$$y(1) = \sum_{i=0}^{k} G(i)\,u_1(1-i) = G(1)\,u_1(0)$$ (9)

that is:

$$y(1) = \begin{bmatrix} G_{11}(1) & \ldots & G_{1m}(1) \\ \vdots & & \vdots \\ G_{l1}(1) & \ldots & G_{lm}(1) \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} G_{11}(1) \\ \vdots \\ G_{l1}(1) \end{bmatrix}$$ (10)

Similarly, it can be proved that if the $j$-th discrete impulse element is applied to the system, the output at instant $k$ will be:

$$y(k) = \begin{bmatrix} G_{1j}(k) \\ \vdots \\ G_{lj}(k) \end{bmatrix}$$ (11)
Consequently, if all m discrete impulse elements are applied to the system, all m columns of all matrices G(i) can be determined. Since the matrices G(i) are obtained as the system response to a multivariable discrete impulse, this set of matrices is defined as system impulse response.
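To make the relations in equations 1 to 11 concrete, the short sketch below (in Python, with illustrative names and arbitrary randomly generated matrices G(i) that do not come from the chapter) simulates the convolution of equation 1 and checks that feeding the j-th discrete impulse element returns the j-th column of every G(i), as stated in equation 11.

```python
# Minimal sketch of equations (1)-(11): simulating a causal MIMO system from its impulse
# response matrices G(i) and checking that the j-th discrete impulse element recovers the
# j-th column of every G(i). Dimensions and matrices below are arbitrary illustrations.
import numpy as np

rng = np.random.default_rng(0)
l, m, K = 2, 3, 5                                        # outputs, inputs, number of instants
G = [rng.standard_normal((l, m)) for _ in range(K)]      # arbitrary impulse response matrices

def simulate(G, u):
    """y(k) = sum_{i=0}^{k} G(i) u(k-i)  -- equation (1)."""
    return [sum(G[i] @ u[k - i] for i in range(k + 1)) for k in range(len(u))]

def impulse_element(j, m, K):
    """u_j(k) from equation (2): the j-th unit vector at k = 0, zero afterwards."""
    u = [np.zeros(m) for _ in range(K)]
    u[0][j] = 1.0
    return u

for j in range(m):
    y = simulate(G, impulse_element(j, m, K))
    for k in range(K):
        assert np.allclose(y[k], G[k][:, j])             # output at k is the j-th column of G(k)
print("each impulse element returns one column of every G(i), as in equation (11)")
```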
State Space Models and Markov Parameters

In this section two key concepts for state space system identification are discussed and related to the system impulse response matrices defined in the previous section. These concepts, which are the state space models and the Markov parameters, are fundamental to understand the algorithm developed in this paper and are the basic criteria to decide whether a time variant system identification method succeeds.
MIMO systems can be represented by their state space models, in which the past is summarized in a variable defined as the state, and just four matrices and the initial state are necessary to find the system output at any instant. The state space model is defined in the following manner:

$$\begin{aligned} x(k+1) &= A\,x(k) + B\,u(k) \\ y(k) &= C\,x(k) + D\,u(k) \end{aligned}$$ (12)

where $x(k) \in \Re^{n}$ is defined as the system state at instant $k$, $A \in \Re^{n \times n}$ is the state transition matrix, $B \in \Re^{n \times m}$ is a matrix that relates the input $u(k) \in \Re^{m}$ to the state, $C \in \Re^{l \times n}$ is the matrix that relates the output $y(k) \in \Re^{l}$ to the state and $D \in \Re^{l \times m}$ is the matrix that relates the outputs to the inputs.

It is possible to relate the impulse response matrices $G(i)$ to the state space model matrices, as will be discussed in the sequel. This is achieved by exciting each one of the $m$ system inputs with one of the $m$ discrete impulse elements and calculating the outputs as functions of the state space matrices. If the system described in equation 12 is submitted to the input $u_1(k)$ defined in equation 4 and supposing that the initial state is $x(0) = 0 \in \Re^{n}$, the state and the outputs will be, for $k=0$:

$$x(1) = \begin{bmatrix} x_1(1) \\ x_2(1) \\ \vdots \\ x_n(1) \end{bmatrix} = A\,x(0) + B\,u_1(0) = \begin{bmatrix} b_{11} \\ b_{21} \\ \vdots \\ b_{n1} \end{bmatrix}$$ (13)

$$y(0) = \begin{bmatrix} y_1(0) \\ y_2(0) \\ \vdots \\ y_l(0) \end{bmatrix} = C\,x(0) + D\,u_1(0) = \begin{bmatrix} d_{11} \\ d_{21} \\ \vdots \\ d_{l1} \end{bmatrix}$$ (14)

for $k=1$:

$$x(2) = \begin{bmatrix} x_1(2) \\ x_2(2) \\ \vdots \\ x_n(2) \end{bmatrix} = A\,x(1) + B\,u_1(1) = A \begin{bmatrix} b_{11} \\ b_{21} \\ \vdots \\ b_{n1} \end{bmatrix}$$ (15)

$$y(1) = \begin{bmatrix} y_1(1) \\ y_2(1) \\ \vdots \\ y_l(1) \end{bmatrix} = C\,x(1) + D\,u_1(1) = C \begin{bmatrix} b_{11} \\ b_{21} \\ \vdots \\ b_{n1} \end{bmatrix}$$ (16)

for $k=2$:

$$x(3) = \begin{bmatrix} x_1(3) \\ x_2(3) \\ \vdots \\ x_n(3) \end{bmatrix} = A\,x(2) + B\,u_1(2) = A^{2} \begin{bmatrix} b_{11} \\ b_{21} \\ \vdots \\ b_{n1} \end{bmatrix}$$ (17)

$$y(2) = \begin{bmatrix} y_1(2) \\ y_2(2) \\ \vdots \\ y_l(2) \end{bmatrix} = C\,x(2) + D\,u_1(2) = C A \begin{bmatrix} b_{11} \\ b_{21} \\ \vdots \\ b_{n1} \end{bmatrix}$$ (18)

Consequently, for any $k$ the system response to the first discrete impulse element $u_1(k)$ is given by the following relation:

$$\begin{bmatrix} y_1(k) \\ y_2(k) \\ \vdots \\ y_l(k) \end{bmatrix} = \begin{bmatrix} g_{11}(k) \\ g_{21}(k) \\ \vdots \\ g_{l1}(k) \end{bmatrix} = \begin{cases} \begin{bmatrix} d_{11} \\ d_{21} \\ \vdots \\ d_{l1} \end{bmatrix} & k = 0 \\ C A^{k-1} \begin{bmatrix} b_{11} \\ b_{21} \\ \vdots \\ b_{n1} \end{bmatrix} & k \neq 0 \end{cases}$$ (19)
If the system is subjected to the second discrete impulse element with a null initial state, then for $k=0$ the following states and outputs will be obtained:

$$x(1) = \begin{bmatrix} x_1(1) \\ x_2(1) \\ \vdots \\ x_n(1) \end{bmatrix} = A\,x(0) + B\,u_2(0) = \begin{bmatrix} b_{12} \\ b_{22} \\ \vdots \\ b_{n2} \end{bmatrix}$$ (20)

$$y(0) = \begin{bmatrix} y_1(0) \\ y_2(0) \\ \vdots \\ y_l(0) \end{bmatrix} = C\,x(0) + D\,u_2(0) = \begin{bmatrix} d_{12} \\ d_{22} \\ \vdots \\ d_{l2} \end{bmatrix}$$ (21)

for $k=1$:

$$x(2) = \begin{bmatrix} x_1(2) \\ x_2(2) \\ \vdots \\ x_n(2) \end{bmatrix} = A\,x(1) + B\,u_2(1) = A \begin{bmatrix} b_{12} \\ b_{22} \\ \vdots \\ b_{n2} \end{bmatrix}$$ (22)

$$y(1) = \begin{bmatrix} y_1(1) \\ y_2(1) \\ \vdots \\ y_l(1) \end{bmatrix} = C\,x(1) + D\,u_2(1) = C \begin{bmatrix} b_{12} \\ b_{22} \\ \vdots \\ b_{n2} \end{bmatrix}$$ (23)

for $k=2$:

$$x(3) = \begin{bmatrix} x_1(3) \\ x_2(3) \\ \vdots \\ x_n(3) \end{bmatrix} = A\,x(2) + B\,u_2(2) = A^{2} \begin{bmatrix} b_{12} \\ b_{22} \\ \vdots \\ b_{n2} \end{bmatrix}$$ (24)

$$y(2) = \begin{bmatrix} y_1(2) \\ y_2(2) \\ \vdots \\ y_l(2) \end{bmatrix} = C\,x(2) + D\,u_2(2) = C A \begin{bmatrix} b_{12} \\ b_{22} \\ \vdots \\ b_{n2} \end{bmatrix}$$ (25)

Consequently, for any $k$ the second impulse response is the following:

$$\begin{bmatrix} y_1(k) \\ y_2(k) \\ \vdots \\ y_l(k) \end{bmatrix} = \begin{bmatrix} g_{12}(k) \\ g_{22}(k) \\ \vdots \\ g_{l2}(k) \end{bmatrix} = \begin{cases} \begin{bmatrix} d_{12} \\ d_{22} \\ \vdots \\ d_{l2} \end{bmatrix} & k = 0 \\ C A^{k-1} \begin{bmatrix} b_{12} \\ b_{22} \\ \vdots \\ b_{n2} \end{bmatrix} & k \neq 0 \end{cases}$$ (26)
From the development above it can be noticed that, for each discrete impulse element applied to the system, one column of the system impulse response matrices is found, and these impulse response matrices are related to the matrices A, B, C and D in the following manner:

$$G(i) = \begin{cases} D & i = 0 \\ C A^{i-1} B & i \neq 0 \end{cases}$$ (27)
The impulse response matrices are also defined as Markov parameters. Since more than one quadruple {A, B, C, D} may satisfy equation 27, the state space model that represents a system is not unique, but the impulse response, that is, the set of Markov parameters, is unique. Once the impulse response matrices are determined, the state space system matrices can be found from the relation shown in equation 27 and some algebraic manipulations, as can be found in (Ho & Kalman, 1966) for continuous system identification and in (Aoki, 1987) for discrete time series realization, which is a problem similar to state space discrete system identification.
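A minimal sketch of equation 27 is given below, with hypothetical dimensions and randomly generated matrices. It also illustrates the non-uniqueness remark above: applying a similarity transformation T changes the quadruple {A, B, C, D} but leaves the Markov parameters unchanged.

```python
# Sketch of equation (27): Markov parameters from a state space quadruple, and a check that
# a similarity transform produces a different quadruple with the same impulse response.
import numpy as np

rng = np.random.default_rng(1)
n, m, l = 4, 3, 2
A = 0.5 * rng.standard_normal((n, n))
B, C, D = rng.standard_normal((n, m)), rng.standard_normal((l, n)), rng.standard_normal((l, m))

def markov_parameters(A, B, C, D, count):
    """G(0) = D, G(i) = C A^(i-1) B for i >= 1  -- equation (27)."""
    G, Ai = [D], np.eye(A.shape[0])
    for _ in range(1, count):
        G.append(C @ Ai @ B)
        Ai = Ai @ A
    return G

T = rng.standard_normal((n, n))                       # any invertible matrix (almost surely invertible)
A2, B2, C2, D2 = np.linalg.inv(T) @ A @ T, np.linalg.inv(T) @ B, C @ T, D
for G1, G2 in zip(markov_parameters(A, B, C, D, 6), markov_parameters(A2, B2, C2, D2, 6)):
    assert np.allclose(G1, G2)                        # different quadruples, same Markov parameters
```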
Observability and Reachability

From the concepts and definitions discussed above, two other useful definitions can be derived. These definitions are the observability and reachability properties of a system, and they are described below. Let a system with initial state equal to $x(0) = 0 \in \Re^{n}$ be submitted to an arbitrary input until the instant $i$ and, from this instant on, let the input be zero. Since $u(k) = 0 \in \Re^{m}$ for $k \geq i+1$, the outputs at the instants after $i$ depend only on the state at instant $i+1$, and this relation, derived from equation 12, is the following:

$$\begin{bmatrix} y(i+1) \\ y(i+2) \\ y(i+3) \\ \vdots \end{bmatrix} = \begin{bmatrix} C \\ CA \\ CA^{2} \\ \vdots \end{bmatrix} x(i+1)$$ (28)
Defining the vector on the left-hand side of equation 28 as $y^{+}(i+1)$ and the matrix on the right-hand side as $O$, equation 28 can be rewritten as:

$$y^{+}(i+1) = O\,x(i+1)$$ (29)
The matrix $O$ contains the relation between a set of future outputs and the state at instant $i+1$. For this reason, the matrix $O$ is defined as the observability matrix. In other words, from this matrix it is possible to observe the state at instant $i+1$ from the future outputs with the following relation:

$$x(i+1) = O^{\dagger}\,y^{+}(i+1)$$ (30)
where $O^{\dagger}$ is the pseudoinverse of $O$. As can be seen from the linear system shown in equation 30, if the rank of the matrix $O$ is equal to or greater than the system order $n$, any state can be determined from the corresponding set of future outputs, and the system is defined as observable. Similarly, the state at instant $i+1$ depends only on the inputs up to the instant $i$, and the relation between the state and the inputs is the following:

$$x(i+1) = \begin{bmatrix} B & AB & A^{2}B & \ldots \end{bmatrix} \begin{bmatrix} u(i) \\ u(i-1) \\ u(i-2) \\ \vdots \end{bmatrix}$$ (31)
or, similarly to the definitions made for the observability matrix, defining the block matrix in equation 31 as $R$ and the stacked input vector as $u^{-}(i)$,

$$x(i+1) = R\,u^{-}(i)$$ (32)
where $R$ is a matrix that defines how the state $x(i+1)$ is reached by the system with the inputs $u^{-}(i)$. For this reason, the matrix $R$ is defined as the reachability matrix. To reach a determined state, an input set can be defined by the following relation:

$$u^{-}(i) = R^{\dagger}\,x(i+1)$$ (33)
so, if the reachability matrix has rank equal to or greater than $n$, it is possible to define an input set that reaches any state, and the system is defined as reachable. In this work, the reachability and the observability properties were used as constraints, as discussed in the section named Time Variant Multivariable System Identification as an Optimization Problem.
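The sketch below (a hypothetical example with assumed names) builds the observability and reachability matrices for a randomly generated model, performs the rank tests just described, and recovers a state from a set of future outputs through the pseudoinverse, as in equation 30.

```python
# Sketch of the observability/reachability rank tests and of state recovery via equation (30).
import numpy as np

rng = np.random.default_rng(2)
n, l, m, k = 3, 2, 2, 6
A = 0.6 * rng.standard_normal((n, n))
B, C = rng.standard_normal((n, m)), rng.standard_normal((l, n))

O = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(k)])   # [C; CA; ...; CA^(k-1)]
R = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(k)])   # [B, AB, ..., A^(k-1)B]
print("observable:", np.linalg.matrix_rank(O) >= n, "reachable:", np.linalg.matrix_rank(R) >= n)

# With zero input after instant i, the future outputs are y+ = O x(i+1) (equation 28);
# the state is then recovered as x(i+1) = pinv(O) y+ (equation 30).
x = rng.standard_normal(n)
y_plus = O @ x
assert np.allclose(np.linalg.pinv(O) @ y_plus, x)
```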
MOESP Method

The second method discussed in this section is the Multivariable Output-Error State Space (MOESP) method, proposed by Verhaegen and Dewilde in (Verhaegen & Dewilde, 1992). This method is based on an LQ decomposition of a matrix containing output and input data. Let the system outputs and inputs from an arbitrary instant $t$ to the instant $t+k-1$ be stacked in the two vectors defined below:
$$y_{t|k-1} = \begin{bmatrix} y(t) \\ y(t+1) \\ \vdots \\ y(t+k-1) \end{bmatrix}$$ (34)

$$u_{t|k-1} = \begin{bmatrix} u(t) \\ u(t+1) \\ \vdots \\ u(t+k-1) \end{bmatrix}$$ (35)
then it is possible to show that:

$$y_{t|k-1} = O_k\,x(t) + \Psi_k\,u_{t|k-1}$$ (36)

where

$$O_k = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{k-1} \end{bmatrix}$$ (37)

$$\Psi_k = \begin{bmatrix} D & 0 & \cdots & 0 \\ CB & D & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ CA^{k-2}B & \cdots & CB & D \end{bmatrix}$$ (38)
If the matrices $u_{t|k-1}$, $y_{t|k-1}$ and $x(t)$ for $t = 0, \ldots, N-1$, where $N$ is the total number of input and output samples available, are concatenated side by side, the following matrices can be defined:

$$U_{0|k-1} = \begin{bmatrix} u_{0|k-1} & u_{1|k} & \cdots & u_{N-1|k+N-2} \end{bmatrix}$$ (39)

$$Y_{0|k-1} = \begin{bmatrix} y_{0|k-1} & y_{1|k} & \cdots & y_{N-1|k+N-2} \end{bmatrix}$$ (40)

$$X_{N-1} = \begin{bmatrix} x(0) & x(1) & \cdots & x(N-1) \end{bmatrix}$$ (41)

and then the following extended state space model can be written:

$$Y_{0|k-1} = O_k\,X_{N-1} + \Psi_k\,U_{0|k-1}$$ (42)
Concatenating the matrices $U_{0|k-1}$ and $Y_{0|k-1}$ and performing an LQ decomposition, the following relation can be found:

$$\begin{bmatrix} U_{0|k-1} \\ Y_{0|k-1} \end{bmatrix} = LQ^{T} = \begin{bmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} Q_1^{T} \\ Q_2^{T} \end{bmatrix}$$ (43)
where $L_{11} \in \Re^{km \times km}$ and $L_{22} \in \Re^{kl \times kl}$ are lower triangular matrices, $Q_1^{T} \in \Re^{km \times N}$ and $Q_2^{T} \in \Re^{kl \times N}$ are orthonormal, and $L_{21} \in \Re^{kl \times km}$. From this decomposition, equation 42 can be rewritten as shown below:

$$L_{21}Q_1^{T} + L_{22}Q_2^{T} = O_k\,X_{N-1} + \Psi_k\,L_{11}Q_1^{T}$$ (44)
Post-multiplying both sides of equation 44 by $Q_2$ results in:

$$L_{22} = O_k\,X_{N-1}\,Q_2$$ (45)
A singular value decomposition (SVD) is then applied to $L_{22}$, resulting in the following:

$$L_{22} = \begin{bmatrix} U_1 & U_2 \end{bmatrix} \begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_1 \\ V_2 \end{bmatrix} = O_k\,X_{N-1}\,Q_2$$ (46)

and then:

$$O_k = U_1\,\Sigma_1^{1/2}, \qquad X_{N-1}\,Q_2 = \Sigma_1^{1/2}\,V_1$$ (47)
Once the matrix $O_k$ is found, it is easy to find the matrices $A$ and $C$ of the state space system. Let $O_k^{\uparrow}$ be the following matrix:

$$O_k^{\uparrow} = \begin{bmatrix} CA \\ CA^{2} \\ \vdots \end{bmatrix} = \begin{bmatrix} C \\ CA \\ \vdots \end{bmatrix} A = O_k\,A$$ (48)
then:

$$A = O_k^{\dagger}\,O_k^{\uparrow}$$ (49)
and $C$ is the first $l \times n$ block of $O_k$.
The matrices $B$ and $D$ can be found by observing the following relations, which come from the orthogonality between $U_1$ and $U_2$ guaranteed by the SVD:

$$U_2^{T}\,L_{22} = 0, \qquad U_2^{T}\,O_k = 0$$ (50)
Then, multiplying both sides of equation 44 by $U_2^{T}$, the following relation is found:

$$U_2^{T}(L_{21}Q_1^{T} + L_{22}Q_2^{T}) = U_2^{T}(O_k X_{N-1} + \Psi_k L_{11} Q_1^{T}) \;\Rightarrow\; U_2^{T} L_{21} Q_1^{T} = U_2^{T} \Psi_k L_{11} Q_1^{T} \;\Rightarrow\; U_2^{T} L_{21} = U_2^{T} \Psi_k L_{11} \;\Rightarrow\; U_2^{T} L_{21} L_{11}^{-1} = U_2^{T} \Psi_k$$ (51)

Splitting $U_2^{T}$ into blocks with $l$ columns, defined as $L_i$, and splitting the matrix $U_2^{T} L_{21} L_{11}^{-1}$ into blocks with $m$ columns, defined as $M_i$, the following relation is valid:

$$\begin{bmatrix} M_1 & \ldots & M_k \end{bmatrix} = \begin{bmatrix} L_1 & \ldots & L_k \end{bmatrix} \Psi_k$$ (52)

then, expanding the matrix $\Psi_k$, the following linear system can be written:

$$\begin{aligned}
L_1 D + L_2 C B + \ldots + L_{k-1} C A^{k-3} B + L_k C A^{k-2} B &= M_1 \\
L_2 D + L_3 C B + \ldots + L_{k-1} C A^{k-4} B + L_k C A^{k-3} B &= M_2 \\
&\;\;\vdots \\
L_{k-1} D + L_k C B &= M_{k-1} \\
L_k D &= M_k
\end{aligned}$$ (53)
and from this linear system the matrices $B$ and $D$ can be found. As can be noticed from the discussion above, the MOESP method can be applied only to time invariant systems, and a large amount of data is needed to estimate the quadruple {A, B, C, D} that constitutes a state space model for the system.
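The fragment below is a condensed sketch of the MOESP steps in equations 39 to 49, written under simplifying assumptions (noise-free data, persistently exciting input, the order n chosen beforehand); the recovery of B and D from the linear system in equation 53 is omitted for brevity, and all function names are illustrative rather than taken from any library.

```python
# Condensed sketch of the deterministic MOESP steps (equations 39-49); B and D omitted.
import numpy as np

def block_hankel(signal, k):
    """Stack k-long column windows of an (m x N) signal: column j is [s(j); ...; s(j+k-1)]."""
    m, N = signal.shape
    cols = N - k + 1
    return np.vstack([signal[:, i:i + cols] for i in range(k)])

def moesp_AC(u, y, k, n):
    """Estimate A and C of a time invariant state space model from input/output data."""
    l = y.shape[0]
    U, Y = block_hankel(u, k), block_hankel(y, k)          # equations (39)-(40)
    M = np.vstack([U, Y])
    Q, Rf = np.linalg.qr(M.T)                              # M = Rf.T Q.T, the LQ form of (43)
    L = Rf.T
    L22 = L[U.shape[0]:, U.shape[0]:]
    Us, S, _ = np.linalg.svd(L22)                          # equation (46)
    Ok = Us[:, :n] * np.sqrt(S[:n])                        # extended observability matrix, (47)
    C = Ok[:l, :]
    A = np.linalg.pinv(Ok[:-l, :]) @ Ok[l:, :]             # shift relation, equations (48)-(49)
    return A, C

# Hypothetical usage: u is an (m x N) input array and y an (l x N) output array.
# A_est, C_est = moesp_AC(u, y, k=10, n=4)
```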
Although MOESP is not suitable to solve the time variant system identification problem, this method is used in the algorithm proposed in this work to create an initial estimate of the time variant system, as will be seen in one of the next sections of this paper. As shown in the section in which an example of application is presented, the results of time variant system identification are not exact if only the MOESP method is used, but the MOESP initialization solves the problem related to the search space topology, as will be seen in one of the following sections.
MOESP-VAR Algorithm

To solve the time variant system identification problem, a MOESP based algorithm was created (Tamariz, 2005; Tamariz, Bottura, & Barreto, 2005). This algorithm works as follows. The first step is the definition of time windows. Each window contains a subset of the system input and output data. The window lengths are defined in such a way that the system does not vary significantly during the time interval defined for each window. Then, the MOESP algorithm is applied to the output and input data contained in each window and a quadruple {A, B, C, D} is estimated. This quadruple represents the system during the interval defined by each window.

This algorithm works well if a large amount of data is available for each window. This means that the system variations cannot be too fast, in order to guarantee that a large amount of input and output data can be collected for each window without excessive system variations during that time interval. This happens because the decompositions defined in equations 43 and 46 need large matrices to be successful, and these large matrices can be obtained only if large $U_{0|k-1}$ and $Y_{0|k-1}$ matrices are available, which in turn requires a large number of input and output samples.

The window size for the method discussed in this chapter can be smaller than the size required by the MOESP-VAR algorithm, since the method proposed here is not based on matrix decompositions, as shown in the section that discusses the proposed algorithm. In a later section, the results for a fast time variant problem are shown using both algorithms, and the difference between them can be clearly observed.
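A sketch of the MOESP-VAR windowing idea is shown below; the helper moesp_window, which estimates one time invariant quadruple from the data of a single window, is assumed to be available (for instance, an extension of the MOESP sketch given earlier).

```python
# Sketch of the MOESP-VAR scheme: one time invariant model is estimated per data window.
def moesp_var(u, y, window, k, n, moesp_window):
    """u: (m x N), y: (l x N); moesp_window(u_w, y_w, k, n) -> (A, B, C, D) for one window."""
    N = u.shape[1]
    models = []
    for start in range(0, N - window + 1, window):
        u_w, y_w = u[:, start:start + window], y[:, start:start + window]
        models.append(moesp_window(u_w, y_w, k, n))   # system treated as invariant inside the window
    return models
```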
Immune Inspired Algorithms for Optimization

As will be discussed in a later section, in this paper the time variant system identification problem is seen as an optimization problem. To solve this optimization problem an immuno-inspired algorithm is used. In this section this class of algorithms and the advantages of using it are discussed.

Immuno-inspired algorithms are computational tools inspired by the principles of the biological immune system. Useful information about this kind of algorithm can be found in (de Castro & Von Zuben, 2000; de Castro & Timmis, 2002; de Castro & Von Zuben, 2002; de França, Coelho, Castro, & Von Zuben, 2010). Basically, the immune system works as follows: when an animal is exposed to an antigen (Ag), some B cells of its bone marrow secrete antibodies (Ab), which are molecules whose goal is to recognize and bind themselves to Ags. The B cells that produce Abs with higher affinity to the Ags are encouraged to mature into non-dividing Ab secreting cells called plasma cells. Some of them are selected to become memory cells, which will be used by the immune system to deal with similar Ags in the future, and the other ones will generate clones. These clones are subjected to random mutations (receptor editing) and are not exactly equal to the original cells. This mechanism allows B cells to produce Abs with higher affinity to some Ag. On the other hand, this process also creates clones producing lower affinity Abs.
The higher affinity clones are encouraged to proliferate and the lower affinity ones are eliminated by the immune system. These mechanisms are controlled by T cells. In addition, a set of B cells with random receptors is created in the bone marrow to maintain the population diversity. An interesting aspect of the immune response is that the process indicated above happens only with the B cells that produced higher affinity Abs. This aspect is known as the Clonal Selection Principle. Another interesting characteristic is that immune system cells do not only recognize Ags, but also recognize other cells (idiotypic network theory). The immune response is then regulated by T cells, which detect and encourage the multiplication of higher affinity B cells and the suppression of B cells that are in a shape space region near another cell with better fitness.

The ideas taken from the immune system can be used in an optimization procedure, as shown in (de Castro & Von Zuben, 2002). In this case, Abs are candidate solutions and the concept of binding an Ag is replaced by the concept of having a better fitness. The method works as follows: first of all, a set of candidate solutions (Abs) to the optimization problem is generated. The candidate solutions are applied to the objective function and the result is called fitness. The fitness measures how appropriate a candidate solution is for the problem. All of the solutions are cloned with slight mutations, producing new antibodies (receptor editing). The number of clones for each solution is proportional to its fitness (when considering a maximization problem), following the clonal selection principle. The new population is then submitted to threshold suppression, that is, the Abs that are near Abs with better fitness are eliminated from the population. Then, other new feasible solutions are generated to avoid population stagnation around a local optimum. This procedure is repeated until reasonable solutions are found. This algorithm is known as Opt-AiNet (Optimization Artificial Immune Net). In the Opt-AiNet algorithm there is no distinction between memory and plasma cells; the whole set of cells is submitted to the steps cited above on every iteration.

The use of the Opt-AiNet algorithm brings many advantages to the solution of optimization problems. The first one is that the algorithm does not stagnate around a local optimum, since new Abs are introduced on each iteration. The second advantage is that the population size is self-regulated by the suppression mechanism and, if the suppression threshold is well established, each local optimum is found. The final set of solutions tends to have the same size as the set of local optima of the problem. This happens because, if an Ab is in a local optimum region, it tends to get nearer to this point on each iteration due to the receptor editing mechanism, and all other solutions that are not the one nearest to the optimum are suppressed. If an antibody generates descendants in a region that tends to another optimum, these new ones will tend to find that other point and will not be suppressed (provided the algorithm parameters are well chosen). There are other immuno-inspired algorithms, for instance Opt-IA, proposed in (Cutello & Nicosia, 2002), and Opt-Immalg, proposed in (Cutello, Narizi, Nicosia, & Pavone, 2006). These algorithms do not use the threshold suppression and their population size is not self-regulated.
In (Cutello, Narizi, Nicosia, & Pavone, 2005) a comparison is made between Opt-Immalg and Opt-AiNet, but the second algorithm was implemented without the threshold suppression, leading to an incomplete conclusion. Since the capacity of the algorithm to self-regulate the population size is important to solve the problem presented here, an algorithm based on Opt-AiNet was used. The Opt-AiNet algorithm can be used to solve the time variant discrete multivariable state space system identification problem by transforming it into an optimization problem, as will be seen in the next section. Since this optimization problem is constrained, some modifications have to be made to the algorithm, as detailed in the following sections.
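The loop below is a simplified, generic sketch of an Opt-AiNet-style optimizer as described above: cloning proportional to normalized fitness, mutation of the clones, suppression of antibodies that are too close to a fitter one, and injection of fresh random antibodies. All parameter names and values are illustrative and are not taken from the chapter.

```python
# Simplified Opt-AiNet-style loop for a generic real-valued maximization problem.
import numpy as np

def opt_ainet_like(obj, dim, iters=200, max_clones=5, sigma=0.3, threshold=0.2, n_new=5, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 1.0, (10, dim))                      # initial random antibodies
    for _ in range(iters):
        fit = np.array([obj(p) for p in pop])
        norm = (fit - fit.min()) / (np.ptp(fit) + 1e-12)         # normalized fitness in [0, 1]
        children = [p + sigma * rng.standard_normal((1 + int(round(max_clones * f)), dim))
                    for p, f in zip(pop, norm)]                  # more clones for fitter antibodies
        pop = np.vstack([pop] + children)
        fit = np.array([obj(p) for p in pop])
        keep = []
        for i in np.argsort(-fit):                               # best antibodies first
            if all(np.linalg.norm(pop[i] - pop[j]) > threshold for j in keep):
                keep.append(i)                                   # suppress cells close to a fitter one
        pop = np.vstack([pop[keep], rng.uniform(-1.0, 1.0, (n_new, dim))])  # fresh random antibodies
    fit = np.array([obj(p) for p in pop])
    return pop[int(np.argmax(fit))], float(fit.max())

# Toy usage: maximize a simple multimodal function of two variables.
best, value = opt_ainet_like(lambda x: float(np.cos(3 * x[0]) * np.cos(3 * x[1]) - 0.1 * x @ x), dim=2)
```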
SOLUTIONS AND RECOMMENDATIONS

Time Variant Multivariable System Identification as an Optimization Problem

The time variant system identification can be defined as an optimization problem if a space of state space models is defined. This space contains all possible matrix quadruples {A, B, C, D} with the desired dimensions, which are supposed to be known a priori. Each state space model is seen as a point in this space and the idea is to find the point that minimizes the error between the system outputs and the model outputs for the same input signal. This point in the state space models space is the one that represents the system for that set of inputs and outputs. Supposing that the system to be modeled is time variant, if it suffers a variation along the time, another point in the state space models space will be the model that represents the system; but since the variations of the system are supposed to be slight, this new optimum point is in a region of the state space models space around the former model. Therefore, a heuristic algorithm can find the new model without too much effort.

In other words, in this work the time variant multivariable system identification is treated as an optimization problem in the following way: at each time instant k a set of Nw+1 inputs and outputs is taken. This set, defined as a time window, contains the inputs between the time instants k – (Nw/2) and k + Nw/2 and the outputs between k – (Nw/2) + 1 and k + Nw/2 + 1. The optimization problem is to find the matrix quadruple {Aest, Best, Cest, Dest} that describes a state space model (see equation 12) that, when excited by the inputs contained in the time window, produces the best approximation to the outputs contained in the same time window. Once this quadruple is found, the next time window is taken and the optimization algorithm runs again. This procedure is repeated until the time windows cover all the data available. It allows determining a different model for each time window and therefore can be used to model time variant systems. The details of the optimization problem and the definitions of the search space, candidate solutions, objective function, constraints and search space topology are discussed in the sequel.
Optimization Search Space

The search space in the optimization problem related to the multivariable time variant system identification is the union of the four spaces where the matrices A, B, C, and D lie, that is, $\Re^{n \times n} \cup \Re^{n \times m} \cup \Re^{l \times n} \cup \Re^{l \times m}$. This search space is huge and also has a not very well-behaved topology, as will be shown by an example at the end of this section.
Candidate Solutions

The candidate solutions to this optimization problem are quadruples {Aest, Best, Cest, Dest}, which are points in the search space defined above. These quadruples define models following the structure shown in equation 12. Each one of these models, when submitted to the input u(k), produces estimated outputs yest(k). These outputs can be compared to the system outputs, allowing the creation of the objective function described below. In the context of immuno-inspired algorithms, the quadruples {Aest, Best, Cest, Dest} are also defined as antibodies.
Objective Function

For an arbitrary time window, let $\mathrm{mod}_e$ be a vector containing in each position the absolute value of the $l$-dimensional difference between the system output and the output estimated with a candidate solution {Aest, Best, Cest, Dest}, for each sample in the window, that is:

$$\mathrm{mod}_e = \begin{bmatrix} |\,y(k - N_w/2 + 1) - y_{est}(k - N_w/2 + 1)\,| \\ |\,y(k - N_w/2 + 2) - y_{est}(k - N_w/2 + 2)\,| \\ \vdots \\ |\,y(k + N_w/2 + 1) - y_{est}(k + N_w/2 + 1)\,| \end{bmatrix}$$ (54)
The objective function to be maximized is the following:

$$F(\mathrm{mod}_e) = \frac{N_w + 1}{\displaystyle\sum_{i=1}^{N_w + 1} \mathrm{mod}_e(i)}$$ (55)
The term Nw+1 is in the numerator to allow the comparison between objective function values for cases with different time window lengths. The error sum is in the denominator, and consequently the objective function is greater when the error is smaller. In the context of immuno-inspired optimization algorithms, the value of F calculated for a candidate solution, or antibody {Aest, Best, Cest, Dest}, is defined as the fitness of this candidate solution.
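A sketch of this fitness computation is given below; the names are illustrative, the state of the candidate model inside each window is assumed to start at zero (a detail the chapter does not fix explicitly), and a small constant guards against division by zero.

```python
# Sketch of the fitness of equations (54)-(55) for a candidate quadruple over one time window.
import numpy as np

def simulate_ss(A, B, C, D, u, x0=None):
    """Simulate x(k+1) = A x(k) + B u(k), y(k) = C x(k) + D u(k) over the columns of u."""
    x = np.zeros(A.shape[0]) if x0 is None else x0
    outputs = []
    for k in range(u.shape[1]):
        outputs.append(C @ x + D @ u[:, k])
        x = A @ x + B @ u[:, k]
    return np.column_stack(outputs)

def window_fitness(candidate, u_win, y_win):
    """F = (Nw + 1) / (sum of per-sample output errors), equation (55)."""
    A, B, C, D = candidate
    y_est = simulate_ss(A, B, C, D, u_win)
    errors = np.linalg.norm(y_win - y_est, axis=0)      # Euclidean error for each sample (eq. 54)
    return u_win.shape[1] / (errors.sum() + 1e-12)      # Nw + 1 equals the number of window samples
```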
Constraints

The candidate solutions {Aest, Best, Cest, Dest} must satisfy the following constraints. The matrix Aest must yield a stable model, which in the discrete case means that its eigenvalues must be inside the unit circle. The matrices Best and Cest must produce model outputs reasonably close to the system outputs; in other words, they must not make the error go to infinity. The observability and the reachability are also constraints of the optimization problem. To check if a candidate solution defined by a quadruple {Aest, Best, Cest, Dest} yields an observable model, the matrices Aest and Cest are taken to evaluate the rank of the observability matrix. If the rank of the observability matrix is equal to or greater than the model order, the model is observable. To check the reachability property, the matrices Aest and Best are taken and the rank of the reachability matrix is evaluated. If the rank is equal to or greater than the model order, the model is reachable, as discussed before.
Search Space Topology

To illustrate the search space of the optimization problem proposed in this paper, the following simpler but similar example is proposed. Let
$$C = \begin{bmatrix} 1.231862 & -0.719364 \end{bmatrix}$$

be a 1×2 matrix and $x_0$ the following 2×1 vector:

$$x_0 = \begin{bmatrix} 0.3601 \\ 0.2725 \end{bmatrix}$$

and $y_0 = C x_0 = 0.2476$. For $i$ and $j$ varying in a range between -2 and 2 in steps of 0.01, a set of $C_{est}(i,j)$ matrices was defined in the following way: $C_{est}(i,j) = [i\ \ j]$. With these matrices, the numbers $y_{0est}(i,j) = C_{est}(i,j)\,x_0$ were calculated. For each $C_{est}(i,j)$ the fitness was calculated according to the following equation:

$$F_s(C_{est}(i,j)) = \frac{1}{|\,y_0 - y_{0est}(i,j)\,|}$$
which is equivalent to equation 55. The results of $F_s$ for each $C_{est}(i,j)$ are plotted in Figure 1. From the figure it is easy to see that the surface of the problem is not well behaved, since it has high peaks and the derivatives are also very high. Consequently, if a heuristic algorithm is used to solve the optimization problem of finding the $C_{est}(i,j)$ that best represents C, at least one of the random candidate solutions must fall in a very restricted region of the space. Since this has a low probability of happening, this is a hard optimization problem to be solved by heuristic algorithms. The problem studied in this paper is even harder than the example proposed in this section, since it has four matrices with dimensions higher than the one shown. As shown in the next section, an initialization step was implemented to deal with this problem. This step puts the candidate solutions in a region near the optimal solutions to be found.
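The experiment above can be reproduced with the short script below, which uses the values quoted in the text; matplotlib is assumed to be available for the surface plot, and the peaks are clipped only so that the surface remains visible.

```python
# Reproduction sketch of the fitness surface of Figure 1 for the simple 1x2 example.
import numpy as np
import matplotlib.pyplot as plt

C = np.array([[1.231862, -0.719364]])
x0 = np.array([[0.3601], [0.2725]])
y0 = (C @ x0).item()                                    # approximately 0.2476

grid = np.arange(-2.0, 2.0 + 1e-9, 0.01)
I, J = np.meshgrid(grid, grid, indexing="ij")
y0_est = I * x0[0, 0] + J * x0[1, 0]                    # y0_est(i, j) = [i  j] x0
Fs = 1.0 / np.maximum(np.abs(y0 - y0_est), 1e-6)        # fitness of each C_est(i, j), guarded

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(I, J, np.minimum(Fs, 1e3))              # clip the sharp peaks for visibility
ax.set_xlabel("i")
ax.set_ylabel("j")
ax.set_zlabel("Fs")
plt.show()
```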
Proposed Algorithm

As shown in the last section, the optimization problem related to the time variant system identification is constrained and has a search space that is not favorable to heuristic algorithms, since the peaks are high and have high derivatives. On the other hand, considering that the system variations are continuous in time, once a solution is found it is easy to follow slight variations of it by making small disturbances to the solution found at the previous time. More than that, if an immuno-inspired algorithm is used, it is guaranteed that, if the suppression threshold is well chosen, a solution that solved the problem in the past is kept in the population; consequently, if the system comes back to a previous situation, the solution is easier to find.
Figure 1. Results of calculating the fitness Fs for each Cest(i,j)
The analogy between the optimization problem treated in this chapter and the immune system is the following: the antibodies are the quadruples {Aest, Best, Cest, Dest} and the antigens are the time variant system matrices at each instant k, represented by the quadruple {A, B, C, D}. If the system varies in time, or in other words, if the antigen suffers mutations, the quadruple {Aest, Best, Cest, Dest} will be mutated until it fits the new system; in other words, the B cells that secrete the antibodies will suffer mutation to fit the mutated antigen. If the system changes again, the estimated quadruple will be changed again to increase the fitness, but the previous solution will be kept in the organism. If the system comes back to a situation where it was before, the organism has already learned how to deal with it and the correct antibody will be used. The main advantage of the immuno-inspired algorithms over other heuristic algorithms in this case is that previous solutions are kept in the population without overpopulating the organism, which would imply a huge computational effort.

The data used by the algorithm proposed in this paper is a set of inputs and outputs obtained from an actual or a benchmark time variant system. The outcomes are the models that describe the system at each instant of time. The algorithm steps are described below.
Algorithm Initialization

Before the main loop starts, an initialization is run to define the initial set of candidate solutions of the algorithm. The steps of this initialization are described below:
Step 1: An initialization window is defined containing Nini samples of the inputs and outputs of the dynamic system to be modeled, for time instants 0 ≤ k ≤ Nini–1. This number of samples is considerably smaller than the total volume of data available, but must be big enough for the MOESP algorithm to work. In this definition, u_ini = input(:, 1:Nini–1) and y_ini = output(:, 1:Nini–1) are the sets of the first Nini samples of inputs and outputs of the system.

Step 2: The MOESP algorithm described in the beginning of this paper is applied to the inputs u_ini and outputs y_ini contained in the initialization window to find a state space model that defines an initial quadruple {Aini, Bini, Cini, Dini} of matrices, as shown in the command below:

{Aini, Bini, Cini, Dini} = MOESP(u_ini, y_ini)

Step 3: The initial quadruple {Aini, Bini, Cini, Dini} is cloned, creating Nclonesini clones, which are copies of this initial quadruple. Each one of the clones is then disturbed, that is, each matrix inside each clone is added to a disturbance, which is a random matrix of the same dimensions. This procedure defines the initial population. The disturbance of the clones is made in such a way that it is guaranteed that the solutions satisfy the constraints defined for the problem, as will be seen later in this section.

To create the disturbance for each clone, a positive real variable defined as the Disturbance Order of Magnitude (DOM) and a pseudorandom matrix generator are used. This generator is a routine that generates matrices with a predetermined dimension filled with pseudorandom numbers taken from a uniform distribution between 0 and 1. Subtracting a matrix with all elements equal to 0.5 from the pseudorandom matrix and multiplying the result by the positive real number DOM, a random disturbance matrix with elements from a uniform distribution between –DOM/2 and DOM/2 is obtained. Let rand(i, j) be the command to generate a matrix with i rows and j columns filled with pseudorandom numbers taken from a uniform distribution between 0 and 1, and ones(i, j) the command to generate a matrix with i rows and j columns filled with ones. Then a disturbance to the matrix Aini of each clone is obtained with the following command:

Disturbance_A = DOM*(rand(n,n) - 0.5*ones(n,n))

This disturbance is then added to Aini, defining a possible first matrix of each disturbed clone, defined as A_qq:

A_qq = A_ini + Disturbance_A

The matrices must satisfy the constraints discussed before. For the matrix A this means that every eigenvalue of it must be inside the unit circle. To guarantee this, the eigenvalues of A_qq are calculated and, if they are all inside the unit circle, the matrix is considered part of a feasible solution and becomes one of the matrices of the quadruple that defines the candidate solution, receiving the name A_est. If not, a new disturbance is generated following the same rule and added again to A_ini. This is done until a feasible matrix A appears. The pseudocode is the following:

While not all eigenvalues of A_qq are inside the unit circle
    Disturbance_A = DOM*(rand(n,n) - 0.5*ones(n,n))
    A_qq = A_ini + Disturbance_A
    Check the eigenvalues of A_qq
End
A_est = A_qq

Once the A_est matrix of each clone is defined, a similar procedure is applied to disturb the B_est and the C_est matrices of each clone, using basically the same functions adapted to the dimensions of each matrix and to the constraints related to B_est and C_est:

While the pair (A_est, B_qq) is not reachable or B_qq does not satisfy its constraints
    Disturbance_B = DOM*(rand(n,m) - 0.5*ones(n,m))
    B_qq = B_ini + Disturbance_B
    Check if the pair (A_est, B_qq) is reachable
End
B_est = B_qq

While the pair (A_est, C_qq) is not observable or C_qq does not satisfy its constraints
    Disturbance_C = DOM*(rand(l,n) - 0.5*ones(l,n))
    C_qq = C_ini + Disturbance_C
    Check if the pair (A_est, C_qq) is observable
End
C_est = C_qq

For the matrix D_est there is no constraint, so it is created by simply adding the disturbance to the matrix D_ini found with the MOESP algorithm:

Disturbance_D = DOM*(rand(l,m) - 0.5*ones(l,m))
D_qq = D_ini + Disturbance_D

Each one of the clones is disturbed with the procedure described above, creating the initial population of the problem.
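A Python rendering of the disturbance procedure of Step 3 is sketched below; the function and variable names are assumptions, and the helper tests mirror the rank and eigenvalue checks discussed earlier in this chapter.

```python
# Sketch of Step 3: each matrix of a clone receives a uniform disturbance in [-DOM/2, DOM/2],
# regenerated until the constraints (stable A, reachable (A, B), observable (A, C)) hold.
import numpy as np

rng = np.random.default_rng(0)

def disturb(M, dom):
    """Add a uniform disturbance in [-dom/2, dom/2] to every element of M."""
    return M + dom * (rng.random(M.shape) - 0.5)

def stable(A):
    return np.all(np.abs(np.linalg.eigvals(A)) < 1.0)

def reachable(A, B):
    n = A.shape[0]
    return np.linalg.matrix_rank(np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])) >= n

def observable(A, C):
    n = A.shape[0]
    return np.linalg.matrix_rank(np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])) >= n

def disturbed_clone(A_ini, B_ini, C_ini, D_ini, dom):
    """Create one feasible disturbed clone of the MOESP quadruple."""
    A_est = disturb(A_ini, dom)
    while not stable(A_est):                 # redraw until every eigenvalue is inside the unit circle
        A_est = disturb(A_ini, dom)
    B_est = disturb(B_ini, dom)
    while not reachable(A_est, B_est):
        B_est = disturb(B_ini, dom)
    C_est = disturb(C_ini, dom)
    while not observable(A_est, C_est):
        C_est = disturb(C_ini, dom)
    D_est = disturb(D_ini, dom)              # D has no constraint
    return A_est, B_est, C_est, D_est

# initial_population = [disturbed_clone(A_ini, B_ini, C_ini, D_ini, DOM) for _ in range(N_clones_ini)]
```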
Algorithm Iterations

Once the initial population is defined, the algorithm starts its main loop, which contains the steps described below:

Step 4: For each instant k > Nini, a set of samples containing the system inputs between the time instants k – (Nw/2) and k + Nw/2, and the system outputs between k – (Nw/2) + 1 and k + Nw/2 + 1, is taken, where Nw is the window length. This set of data is defined as the time window and represents the input and output data of the system around the time instant k.

Step 5: The fitness of the antibodies in the population is evaluated. To calculate the fitness of each antibody, the inputs of the system taken from the set of samples of the time window are given as inputs to the model defined by the quadruple that represents each antibody in the population. Then the distance between the outputs of the model and the outputs of the system for each time
instant inside the time window is calculated as the Euclidean distance between the two vectors, and the fitness is calculated as described in equation 55. As seen before, if the antibody is a quadruple of matrices that represents a model that is a good approximation of the system, the outputs of this model will be similar to the outputs of the system and the fitness of this antibody will be high. If the antibody is a quadruple of matrices that does not represent the system, its outputs will be far from the system outputs and its fitness will be low.

Step 6: Each antibody in the population is cloned and the number of clones created for each antibody is proportional to the antibody fitness. Each clone is disturbed following the same procedure detailed in Step 3 of the algorithm initialization; that is, the disturbances, which are matrices with elements created randomly from a uniform distribution between 0 and 1, subtracted by 0.5 and multiplied by DOM, are added to each one of the four matrices of the clone until feasible matrices are obtained. As the number of iterations gets bigger, the variable DOM decays: in each iteration this variable is multiplied by a number between 0 and 1, in such a way that the disturbances become smaller as the candidate solutions converge to the optimum.

Step 7: The distances between the antibodies in the population are calculated. In this paper the distance between two quadruples of matrices is defined as the sum of the distances between the corresponding matrices in each quadruple. For example, let {A1, B1, C1, D1} and {A2, B2, C2, D2} be two quadruples of matrices that define two different antibodies and let dist() be an operator that defines distances. Then:

dist({A1, B1, C1, D1}, {A2, B2, C2, D2}) = dist(A1,A2) + dist(B1,B2) + dist(C1,C2) + dist(D1,D2)

In this paper the distance between two matrices is defined based on the Euclidean distance between two vectors, which is the square root of the sum of the squared differences between each pair of corresponding terms. So, in this paper the distance between two matrices is defined as the square root of the sum of the squared differences between the corresponding terms of the matrices. For example, let A1(i,j) be the term in line i and column j of the matrix A1, and the same for the other matrices. The distance between the two antibodies {A1, B1, C1, D1} and {A2, B2, C2, D2} is defined in this paper as:
= =
(A1(1, 1) − A2(1, 1))2 + (A1(1, 2) − A2(1, 2))2 + ... + (A1(2, 1) − A2(2, 1))2 + ... + (A1(n, n ) − A2(n, n ))2 + 2 2 2 2 + (B1(1, 1) − B 2(1, 1)) + (B1(1, 2) − B 2(1, 2)) + ... + (B1(2, 1) − B 2(2, 1)) + ... + (B1(n, m ) − B 2(n, m )) +
2
2
2
2
=
+ (C 1(1, 1) − C 2(1, 1)) + (C 1(1, 2) − C 2(1, 2)) + ... + (C 1(2, 1) − C 2(2, 1)) + ... + (C 1(l , n ) − C 2(l , n )) +
=
2 + (D1(1, 1) − D 2(1, 1)) + (D1(1, 2) − D 2(1, 2)) + ... + (D1(2, 1) − D 2(2, 1)) + ... + (D1(l , m ) − D 2(l , m))) +
2
2
2
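A direct implementation sketch of this distance, with assumed names, is shown below; np.linalg.norm of a matrix defaults to the Frobenius norm, which is exactly the square root of the sum of the squared element-wise differences.

```python
# Distance between two antibodies (quadruples of matrices), as defined in Step 7.
import numpy as np

def antibody_distance(q1, q2):
    """Sum of the Frobenius distances between the corresponding matrices of two quadruples."""
    return sum(np.linalg.norm(M1 - M2) for M1, M2 in zip(q1, q2))
```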
Step 8: The antibodies that are around another one with better fitness, that is, the ones whose distance to an antibody with better fitness is smaller than a suppression threshold, are eliminated from the population.
Step 9: The fitness of each antibody is calculated as discussed in Step 5. If the best fitness is greater than a value defined as Freq (required fitness), then k is incremented and the algorithm goes to Step 4, defining a new time window. Otherwise, random antibodies respecting the constraints of the problem are created, in a procedure similar to the one discussed in Step 3, and are added to the population. With this new population the algorithm goes to Step 6 and the solutions for the current time window are refined. When the end of the input and output data is reached, the algorithm is finalized. The pseudocode that describes the main loop is shown in Box 1.

Box 1.
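Box 1 itself is not reproduced in this text. The fragment below is only a rough sketch of the main loop of Steps 4 to 9, with assumed helper functions (clone_and_disturb, suppress, random_antibody and the window_fitness of equation 55) and an assumed even window length Nw and starting instant k0 after the initialization window.

```python
# Rough sketch of the main loop (Steps 4-9); helper functions are assumed, not defined here.
import numpy as np

def immune_identification(u, y, population, Nw, F_req, window_fitness,
                          clone_and_disturb, suppress, random_antibody, k0, n_random=5):
    models = {}
    k = k0                                                        # first instant after initialization
    while k + Nw // 2 + 2 <= u.shape[1]:
        u_win = u[:, k - Nw // 2:k + Nw // 2 + 1]                 # Step 4: inputs of the time window
        y_win = y[:, k - Nw // 2 + 1:k + Nw // 2 + 2]             # Step 4: outputs of the time window
        while True:
            fits = [window_fitness(ab, u_win, y_win) for ab in population]       # Step 5
            if max(fits) >= F_req:                                # Step 9: required fitness reached
                break
            population = clone_and_disturb(population, fits)      # Step 6
            fits = [window_fitness(ab, u_win, y_win) for ab in population]
            population = suppress(population, fits)               # Steps 7 and 8
            population = population + [random_antibody() for _ in range(n_random)]  # Step 9
        models[k] = population[int(np.argmax(fits))]              # best model for this window
        k += 1
    return models
```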
An important aspect of the algorithm is that it is implemented in such a manner that if the system does not vary between k–(Nw/2) and k+(Nw/2)+1, the same quadruple that had good fitness for the time window between k–(Nw/2) and k+(Nw/2) will have a good fitness for the time window between k–(Nw/2)+1 and k+(Nw/2)+1, and almost surely no iteration will be needed in this new time window, since the maximum fitness of the population will be bigger than the minimum fitness required.
If the system suffers a small variation between two subsequent time windows, just slight variations of the quadruple will be needed to find a quadruple that, when applied to the inputs, results in outputs with the desired fitness. In the following section an example of application is shown. In this example, the system does not vary during the first instants of time and then, in the middle of the experiment, starts varying. The behavior described in the two paragraphs above is clearly observed when the number of iterations is compared to the derivative of the system variation, as will be shown.
Example of Application

To test the algorithm proposed in this paper, a tridimensional white noise signal with N = 1000 samples was given as input to a linear time variant benchmark system with the structure shown in equation 12 and the following time variant matrices:

$$A(k) = \begin{bmatrix} 0.2128 & f(k) & 0.1979 & -0.0836 \\ 0.1808 & 0.4420 & -0.3279 & 0.2344 \\ -0.51882 & 0.1728 & -0.5488 & -0.3083 \\ 0.2252 & -0.0541 & -0.4679 & 0.8290 \end{bmatrix}$$ (56)

where:

$$f(k) = \begin{cases} 0.1360 & k < 200 \\ 0.1360 - \sin\!\left(\dfrac{k-200}{200}\right) & 200 \leq k \leq 828 \\ 0.1360 & k > 828 \end{cases}$$ (57)

$$B(k) = \begin{bmatrix} -0.0101 & 0.0317 & -0.9347 \\ -0.0600 & 0.5621 & 0.1657 \\ -0.3310 & -0.3712 & -0.5846 \\ -0.2655 & 0.4255 & 0.2204 \end{bmatrix}$$ (58)

$$C(k) = \begin{bmatrix} 0.6557 & -0.2502 & -0.5188 & -0.1229 \\ 0.6532 & -0.1583 & -0.055 & -0.2497 \end{bmatrix}$$ (59)

$$D(k) = \begin{bmatrix} -0.4326 & 0.1253 & -1.1465 \\ -1.6656 & 0.2877 & 1.1909 \end{bmatrix}$$ (60)
These matrices are similar to the benchmark that can be found in the third chapter of the reference (Tamariz, 2005). The only difference is the variation proposed in the term (1,2) of the matrix A. The variation of each element of the matrix A(k) along k is shown in Figure 2. Although only one element of only one matrix of the benchmark system changes along k, the system Markov parameters are deeply affected, as can be seen in Figures 3, 4, 5 and 6, where the variations of the first four Markov parameters $C(k)B(k)$, $C(k)A(k)B(k)$, $C(k)A^{2}(k)B(k)$ and $C(k)A^{3}(k)B(k)$ are shown.

Figure 2. Variation of each A(k) element along k for the benchmark system
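Under additional assumptions not fixed by the chapter (standard normal white noise, zero initial state), the benchmark data used in this example could be generated with a sketch such as the following; all variable names are illustrative.

```python
# Sketch of the benchmark of equations (56)-(60): white noise drives the time variant model.
import numpy as np

def f(k):                                    # time variant (1, 2) element of A(k), equation (57)
    if 200 <= k <= 828:
        return 0.1360 - np.sin((k - 200) / 200)
    return 0.1360

def A(k):
    return np.array([[ 0.2128,   f(k),    0.1979, -0.0836],
                     [ 0.1808,   0.4420, -0.3279,  0.2344],
                     [-0.51882,  0.1728, -0.5488, -0.3083],
                     [ 0.2252,  -0.0541, -0.4679,  0.8290]])

B = np.array([[-0.0101,  0.0317, -0.9347],
              [-0.0600,  0.5621,  0.1657],
              [-0.3310, -0.3712, -0.5846],
              [-0.2655,  0.4255,  0.2204]])
C = np.array([[ 0.6557, -0.2502, -0.5188, -0.1229],
              [ 0.6532, -0.1583, -0.0550, -0.2497]])
D = np.array([[-0.4326,  0.1253, -1.1465],
              [-1.6656,  0.2877,  1.1909]])

rng = np.random.default_rng(0)
N = 1000
u = rng.standard_normal((3, N))              # tridimensional white noise input
x = np.zeros(4)
y = np.zeros((2, N))
for k in range(N):
    y[:, k] = C @ x + D @ u[:, k]            # y(k) = C(k) x(k) + D(k) u(k)
    x = A(k) @ x + B @ u[:, k]               # x(k+1) = A(k) x(k) + B(k) u(k)
```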
The benchmark system outputs were collected and the algorithm proposed in this paper was applied to the benchmark system inputs and outputs with Nini = 40 and Nw = 19. The result of the algorithm was a time variant system with the following structure:

$$\begin{aligned} x_{est}(k+1) &= A_{est}(k)\,x_{est}(k) + B_{est}(k)\,u(k) \\ y_{est}(k) &= C_{est}(k)\,x_{est}(k) + D_{est}(k)\,u(k) \end{aligned}$$ (61)
where the variation of each matrix along k is shown in Figures 7, 8, 9 and 10. As can be seen from the figures, the matrices found with the algorithm proposed in this paper are different from the matrices defined for the benchmark system. More than that, the matrices $B_{est}(k)$, $C_{est}(k)$ and $D_{est}(k)$ shown in Figures 8, 9 and 10 are time variant, unlike the benchmark matrices B(k), C(k) and D(k) defined respectively in equations 58, 59 and 60, which do not vary along time. Although the estimated matrices found with the algorithm presented in this paper are different from the ones defined in the benchmark system, the time variant Markov parameters calculated with the estimated matrices are similar to the benchmark system Markov parameters, as can be seen in Figures 11, 12, 13 and 14, where the first four Markov parameters obtained with the estimated matrices are plotted together with the benchmark system Markov parameters. This shows that, even with different matrices, the estimated system impulse response is near the benchmark system impulse response, which means that the estimated system is a good approximation of the benchmark system.

Figure 3. Benchmark system C(k)B(k) Markov parameter variation along k
Figure 4. Benchmark system C(k)A(k)B(k) Markov parameter variation along k
Figure 5. Benchmark system $C(k)A^{2}(k)B(k)$ Markov parameter variation along k
Figure 6. Benchmark system $C(k)A^{3}(k)B(k)$ Markov parameter variation along k
Figure 7. Variation of each estimated Aest(k) element along k. Each plot represents one element
Figure 8. Variation of each estimated Best(k) element along k. Each plot represents one element
Figure 9. Variation of each estimated Cest(k) element along k. Each plot represents one element
Figure 10. Variation of each estimated Dest(k) element along k. Each plot represents one element
Figure 11. Markov parameters for the benchmark and the estimated system. The $C(k)B(k)$ Markov parameter of the benchmark system is plotted as a continuous blue line and the $C_{est}(k)B_{est}(k)$ Markov parameter of the estimated system as red crosses
Figure 12. Markov parameters for the benchmark and the estimated system. The $C(k)A(k)B(k)$ Markov parameter of the benchmark system is plotted as a continuous blue line and the $C_{est}(k)A_{est}(k)B_{est}(k)$ Markov parameter of the estimated system as red crosses
Figure 13. Markov parameters for the benchmark and the estimated system. The $C(k)A^{2}(k)B(k)$ Markov parameter of the benchmark system is plotted as a continuous blue line and the $C_{est}(k)A_{est}^{2}(k)B_{est}(k)$ Markov parameter of the estimated system as red crosses
Figure 14. Markov parameters for the benchmark and the estimated system. The $C(k)A^{3}(k)B(k)$ Markov parameter of the benchmark system is plotted as a continuous blue line and the $C_{est}(k)A_{est}^{3}(k)B_{est}(k)$ Markov parameter of the estimated system as red crosses
The same input used to generate the benchmark system outputs was applied to the time variant system obtained with the algorithm proposed in this paper. For comparison, this input was also applied to the quadruple {Aini, Bini, Cini, Dini} obtained during the MOESP initialization and to the time variant matrices found with the MOESP-VAR algorithm using a window with the same length as the one used for the immuno-inspired algorithm, which is 19. The outputs from the benchmark system, from the time variant system obtained by the algorithm proposed in this paper, from the system defined by the quadruple {Aini, Bini, Cini, Dini} and from the system found with the MOESP-VAR algorithm are shown in Figures 15 and 16.

In Figure 15 the time interval lies in a region where the system does not vary. From this figure it is easy to see that the outputs from the system obtained by the method proposed in this paper and the outputs from the system defined by the initialization quadruple are near the benchmark system outputs. This is as expected, since in this interval the time variant benchmark system has the same matrices that are used in the MOESP initialization step. On the other hand, the outputs obtained with the MOESP-VAR algorithm are in some cases far from the actual outputs. This happens because the window size is too small to get good estimates from the decomposition of the sample matrices.

Figure 16 shows the outputs in an interval where the benchmark system is under the greatest variation in the experiment (near k = 200). From this figure it is easy to see that the outputs obtained with the system found in the MOESP initialization are still accurate, since the model is still near the original benchmark system for which the MOESP initialization was applied. The outputs obtained with
the algorithm proposed in this paper are also accurate. The outputs obtained with the system found with the MOESP-VAR algorithm are not accurate, since at this point the system suffers a big variation and the input and output samples do not satisfy the hypothesis that the system does not change along time. Figure 17 shows a time interval where the system is farther from the system for which the initialization was made (around k = 515). As expected, the outputs obtained with the model found at the MOESP initialization are far from the actual outputs, since in that interval the system is different from the one for which the initialization was made. The outputs obtained with the system found with the MOESP-VAR algorithm are near the actual outputs, since the system is not under a big variation during this interval. The results obtained with the system estimated by the algorithm proposed in this paper are near the actual outputs.

Figure 15. Outputs in the interval 1009