Lecture Notes in Networks and Systems 187
Sabu M. Thampi · Jaime Lloret Mauri · Xavier Fernando · Rajendra Boppana · S. Geetha · Axel Sikora Editors
Applied Soft Computing and Communication Networks Proceedings of ACN 2020
Lecture Notes in Networks and Systems Volume 187
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas— UNICAMP, São Paulo, Brazil Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA Institute of Automation, Chinese Academy of Sciences, Beijing, China Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus Imre J. Rudas, Óbuda University, Budapest, Hungary Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/15179
Editors
Sabu M. Thampi, Indian Institute of Information Technology and Management Kerala, Trivandrum, Kerala, India
Jaime Lloret Mauri, Integrated Management Coastal Research Institute, Universitat Politecnica de Valencia, Valencia, Spain
Xavier Fernando, Ryerson Communications Lab, Department of Electrical and Computer Engineering, Ryerson University, Toronto, ON, Canada
Rajendra Boppana, Department of Computer Science, University of Texas at San Antonio, San Antonio, TX, USA
S. Geetha, School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India
Axel Sikora, University of Applied Sciences Offenburg, Offenburg, Germany
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-981-33-6172-0 ISBN 978-981-33-6173-7 (eBook) https://doi.org/10.1007/978-981-33-6173-7 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
The International Conference on Applied Soft Computing and Communication Networks (ACN'20), organized by the Vellore Institute of Technology (VIT), Chennai, India, was conducted as a virtual event on October 14–17, 2020. The conference served as a forum for the exchange and dissemination of ideas and the latest findings in computing and communication networks. All submissions were evaluated on the basis of their significance, novelty, and technical quality. A double-blind review process was conducted to ensure that the author names and affiliations were unknown to the TPC. This volume consists of 22 papers presented at the symposium.
Special thanks are given to the Conference Committee for its commitment to the conference organization. We would also like to thank all the authors who contributed their papers to the success of the conference. We express our most sincere thanks to all keynote speakers who shared with us their expertise and knowledge. We are grateful to the Vellore Institute of Technology (VIT), Chennai, for organizing the conference. Recognition should go to the Local Organizing Committee members, who all worked extremely hard on the details of the conference program. We appreciate the contributions of all the faculty and staff of VIT and the student volunteers who selflessly contributed their time to make this virtual conference successful. Finally, we would like to thank our publisher, Springer, and in particular Senior Editor Aninda Bose, for a very fruitful collaboration.
October 2020
Sabu M. Thampi, Trivandrum, India
Jaime Lloret Mauri, Valencia, Spain
Xavier Fernando, Toronto, Canada
Rajendra Boppana, San Antonio, USA
S. Geetha, Chennai, India
Axel Sikora, Offenburg, Germany
Organized By
Vellore Institute of Technology (VIT), Chennai, India
Conference Organization
Chief Patron
Dr. G. Viswanathan, Chancellor, VIT
Patrons
Sankar Viswanathan, Vice-President, Vellore Institute of Technology
Sekar Viswanathan, Vice-President, Vellore Institute of Technology
G. V. Selvam, Vice-President, Vellore Institute of Technology
Sandhya Pentareddy, Executive Director, Vellore Institute of Technology
Kadhambari S. Viswanathan, Assistant Vice-President, VIT
Rambabu Kodali, Vice Chancellor, Vellore Institute of Technology
S. Narayanan, Pro-VC, Vellore Institute of Technology, Vellore
V. S. Kanchana Bhaaskaran, Pro-VC, Vellore Institute of Technology, Chennai
P. K. Manoharan, Additional Registrar, VIT Chennai
Honorary General Chair
Raj Kumar Buyya, University of Melbourne, Australia
General Chairs
Jaime Lloret Mauri, Polytechnic University of Valencia, Spain
Soura Dasgupta, The University of Iowa, USA
Xavier Fernando, Ryerson University, Canada
General Executive Chair
Sabu M. Thampi, IIITM-Kerala, India
Organizing Chairs
Geetha S., VIT, Chennai
Jagadeesh Kannan R., VIT, Chennai
TPC Chairs
Jemal H. Abawajy, Deakin University, Australia
Rajendra Boppana, University of Texas at San Antonio, USA
Axel Sikora, University of Applied Sciences Offenburg, Germany
Ashok Kumar Das, IIIT, Hyderabad, India
Organizing Secretaries
Sweetlin Hemalatha C., VIT, Chennai
Suganya G., VIT, Chennai
Kumar R., VIT, Chennai
Organizing Co-chairs
Asha S., VIT, Chennai
Pattabiraman R., VIT, Chennai
Viswanathan V., VIT, Chennai
TPC Members
Tri-Thanh Nguyen, Vietnam National University, Hanoi, Vietnam
Ashwin Ashok, Georgia State University, USA
Eduard Babulak, Liberty University, USA
Visvasuresh Victor Govindaswamy, Concordia University, USA
Yiming Ji, Georgia Southern University, USA
Mohammed Kaabar, Moreno Valley College, USA
J. Mailen Kootsey, Simulation Resources, Inc., USA
Xia Li, Apple, USA
Kyriakos Manousakis, Applied Communication Sciences, USA
Haijun Pan, New Jersey Institute of Technology, USA
Shivani Sud, Intel Labs, USA
Mahbubur Syed, Minnesota State University, Mankato, USA
Abdullah Tansel, Baruch College, USA
Krishnaiyan Thulasiraman, University of Oklahoma, USA
Zhenzhen Ye, IBM, USA
Ahmed Elmisery, University of South Wales, UK
Tomasz Mach, Samsung Electronics Research, UK
Cathryn Peoples, Ulster University, UK
Omer Rana, Cardiff University, UK
Francesco Tusa, University College London, UK
Mohammad Al-Shabi, University of Sharjah, UAE
Kamran Arshad, Ajman University, UAE
Amine Dhraief, University of Manouba, Tunisia
Chin-Chen Chang, Feng Chia University, Taiwan
Lien-Wu Chen, Feng Chia University, Taiwan
Ying-ping Chen, National Chiao Tung University, Taiwan
Chien-Fu Cheng, Tamkang University, Taiwan
Ying-Ren Chien, National I-Lan University, Taiwan
Chun-I Fan, National Sun Yat-sen University, Taiwan
Gwo-Jiun Horng, Southern Taiwan University of Science and Technology, Taiwan
Jia-Chin Lin, National Central University, Taiwan
Wen-Yang Lin, National University of Kaohsiung, Taiwan
Kuei-Ping Shih, Tamkang University, Taiwan
Sheng-Shih Wang, Lunghwa University of Science and Technology, Taiwan
You-Chiun Wang, National Sun Yat-Sen University, Taiwan
Chang Wu Yu, Chung Hua University, Taiwan
Oskars Ozolins, RISE AB, Sweden
Juan Carlos Cuevas Martínez, University of Jaen, Spain
Javier Gozalvez, Universidad Miguel Hernandez de Elche, Spain
Emilio Jiménez Macías, University of La Rioja, Spain
Cristina López-Bravo, Universidade de Vigo, Spain
Nestor Mora Nuñez, Cadiz University, Spain
Miguel Sepulcre, Universidad Miguel Hernandez de Elche, Spain
Viranjay Mohan Srivastava, University of KwaZulu-Natal, South Africa
El-Sayed El-Alfy, King Fahd University of Petroleum and Minerals, Saudi Arabia
Dmitry Korzun, Petrozavodsk State University, Russia
Ramiro Barbosa, Institute of Engineering of Porto, Portugal
Jose Delgado, Technical University of Lisbon, Portugal
Anna Bartkowiak, University of Wroclaw, Poland
Marcin Paprzycki, IBSPAN, Poland
Mariusz Zal, Poznan University of Technology, Poland
Piotr Zwierzykowski, Poznan University of Technology, Poland
Azian Azamimi Abdullah, Universiti Malaysia Perlis, Malaysia
Mohd Ramzi Mohd Hussain, International Islamic University Malaysia, Malaysia
Wan Hussain Wan Ishak, Universiti Utara Malaysia, Malaysia
Sami Habib, Kuwait University, Kuwait
Akihiro Fujihara, Chiba Institute of Technology, Japan
Kenichi Kourai, Kyushu Institute of Technology, Japan
Paolo Bellavista, University of Bologna, Italy
Domenico Ciuonzo, University of Naples Federico II, Italy
Franco Frattolillo, University of Sannio, Italy
Fabrizio Granelli, University of Trento, Italy
Vincenzo Piuri, Università degli Studi di Milano, Italy
Simon Pietro Romano, University of Napoli Federico II, Italy
Angelo Trotta, University of Bologna, Italy
Ash Mohammad Abbas, Aligarh Muslim University, India
Sai Dhiraj Amuru, IIT Hyderabad, India
Manjunath Aradhya, Sri Jayachamarajendra College of Engineering, India
Shanmugapriya D., Avinashilingam Institute, India
Kaushik Das Sharma, University of Calcutta, India
Durairaj Devaraj, Kalasalingam University, India
Niketa Gandhi, Senior Member IEEE, India
Bibhas Ghosal, IIIT Allahabad, India
Akhil Gupta, Lovely Professional University, India
Ankur Gupta, Model Institute of Engineering and Technology, India
Brij Gupta, National Institute of Technology Kurukshetra, India
Hari Gupta, Indian Institute of Technology (BHU) Varanasi, India
K. Haribabu, BITS Pilani, India
Ajay Jangra, UIET, Kurukshetra University, Kurukshetra, Haryana, India
Raveendranathan Kalathil Chellappan, College of Engineering Thiruvananthapuram, India
Arzad Kherani, Indian Institute of Technology, Bhilai, India
Ravibabu Mulaveesala, Indian Institute of Technology Ropar, India
Subrata Nandi, National Institute of Technology, Durgapur, India
Manoj Kumar Parmar, Robert Bosch Engineering and Business Solutions Pvt Ltd., India
Kanubhai Patel, Charotar University of Science and Technology (CHARUSAT), India
Jaynendra Kumar Rai, Amity University Uttar Pradesh, India
Tarun Rao, Accenture Solutions Pvt Ltd, India
Mohd Sadiq, Jamia Millia Islamia, India
Aditi Sharma, Quantum University, Roorkee, Uttarakhand, India
B. H. Shekar, Mangalore University, India
Manu Sood, Himachal Pradesh University, India
Jaya Sreevalsan-Nair, IIIT Bangalore, India
Mahalakshmi T., University of Kerala, India
Petros Bithas, University of Piraeus, Greece
Sotiris Kotsiantis, University of Patras, Greece
Demetrios Sampson, University of Piraeus, Greece
Dimitrios Stratogiannis, National Technical University of Athens, Greece
Alexander Vavoulas, University of Thessaly, Greece
Dimitrios D. Vergados, University of Piraeus, Greece
Christian Veenhuis, HELLA Aglaia Mobile Vision GmbH, Germany
Dennis Pfisterer, University of Luebeck, Germany
Vincenzo Sciancalepore, NEC Laboratories Europe GmbH, Germany
Ramin Yahyapour, GWDG - University Göttingen, Germany
Mohamed Bakhouya, University of Technology of Belfort Montbeliard, France
Pascal Lorenz, University of Haute Alsace, France
Philippe Merle, Inria Lille - Nord Europe, France
Françoise Sailhan, CNAM, France
Mohamed Moustafa, Egyptian Russian University, Egypt
Huifang Chen, Zhejiang University, China
Taiping Cui, Chongqing University of Posts and Telecommunications, China
Wei-Chiang Hong, Jiangsu Normal University, China
Philip Moore, Lanzhou University, China
Jin-Yuan Wang, Nanjing University of Posts and Telecommunications, China
Kui Xu, Army Engineering University of PLA, China
Zbigniew Dziong, École de technologie supérieure, University of Quebec, Canada
Dongxiao Liu, University of Waterloo, Canada
Michael McGuire, University of Victoria, Canada
Vladimir Jotsov, ULSIT - Sofia, Bulgaria
Lucio Agostinho, Federal University of Technology - Campus Apucarana, Brazil
Elizabeth Goldbarg, Federal University of Rio Grande do Norte, Brazil
Roberta Spolon, UNESP - State University of Sao Paulo, Brazil
Carlos Becker Westphall, Federal University of Santa Catarina, Brazil
A. B. M. Alim Al Islam, Bangladesh University of Engineering and Technology, Bangladesh
Lloyd Wood, Ericsson, Australia
Malika Bourenane, University of Senia, Algeria
Moussa Diaf, Université Mouloud Mammeri, Algeria
Contents
Algebraic Modelling of a Generic Fog Scenario for Moving IoT Devices
Pedro Juan Roig, Salvador Alcaraz, Katja Gilly, and Carlos Juiz

A Novel Approach to Preserve DRM for Content Distribution over P2P Networks
Yogita Borse, Purnima Ahirao, Meet Mangukiya, and Shreeya Patel

Secure Multimodal Biometric Recognition in Principal Component Subspace
S. P. Ragendhu and Tony Thomas

EEG-Based Emotion Recognition and Its Interface with Augmented Reality
Gali Amoolya, Ashna KK, Gadde Sai Venkata Swetha, Ganga Das, Jalada Geeta, and Sudhish N. George

Profile Verification Using Blockchain
Rishab Jain and V. Sarasvathi

Distributed Denial of Service (DDoS) Attacks Detection: A Machine Learning Approach
Premson Singh Samom and Amar Taggu

Intrusion Detection Using Deep Neural Network with AntiRectifier Layer
Ritika Lohiya and Ankit Thakkar

CloudSim Exploration: A Knowledge Framework for Cloud Computing Researchers
Lakshmi Sankaran and Saleema Janardhanan Subramanian

A Comprehensive Survey on Big Data Technology Based Cybersecurity Analytics Systems
S. Saravanan and G. Prakash

An Asynchronous Leader-Based Neighbor Discovery Protocol in Static Wireless Ad Hoc Networks
Jose Vicente Sorribes, Lourdes Peñalver, and Jaime Lloret

P2P Bot Detection Based on Host Behavior and Big Data Technology
P. Sai Teja, P. Hema Sirija, P. Roshini, and S. Saravanan

Localization in Wireless Sensor Networks Using a Mobile Anchor and Subordinate Nodes
Abhinesh Kaushik and D. K. Lobiyal

Serverless Deployment of a Voice-Bot for Visually Impaired
Deepali Bajaj, Urmil Bharti, Hunar Batra, Anita Goel, and S. C. Gupta

Smart Cities and Spectrum Vulnerabilities in Long-Range Unlicensed Communication Bands: A Review
Elisha D. Markus and Johnson Fadeyi

Stability Certification of Dynamical Systems: Lyapunov Logic Learning Machine
Mongelli Maurizio and Orani Vanessa

Two-Dimensional Angle of Arrival Estimation Using L-Shaped Array
Santhosh Thota and Pradip Sircar

Interference Management Technique for LTE, Wi-Fi Coexistence Based on CSI at the Transmitter
D. Diana Josephine and A. Rajeswari

VHF OSTBC MIMO System Used to Overcome the Effects of Irregular Terrain on Radio Signals
Xolani B. Maxama and Elisha D. Markus

Modelling Video Frames for Object Extraction Using Spatial Correlation
Vinayak Ray and Pradip Sircar

Speech-Based Selective Labeling of Objects in an Inventory Setting
A. Alice Nithya, Mohak Narang, and Akash Kumar

Classification and Evaluation of Goal-Oriented Requirements Analysis Methods
Farhana Mariyam, Shabana Mehfuz, and Mohd. Sadiq

iCREST: International Cross-Reference to Exchange-Based Stock Trend Prediction Using Long Short-Term Memory
Kinjal Chaudhari and Ankit Thakkar

Author Index
Editors and Contributors
About the Editors
Sabu M. Thampi is a Professor at the Indian Institute of Information Technology and Management Kerala (IIITM-K), Technopark Campus, Trivandrum, Kerala State, India. He completed his Ph.D. in Computer Engineering at the National Institute of Technology, Karnataka. Sabu has several years of teaching and research experience in various institutions in India. His current research interests include sensor networks, Internet of things (IoT), cognitive computing, affective computing, social networks, nature-inspired computing, image forensics, and video surveillance. He has authored and edited a few books published by reputed international publishers and published papers in academic journals and international and national proceedings. Sabu has served as Guest Editor for special issues in a few international journals and as Program Committee Member for many international conferences and workshops. He has co-chaired several international workshops and conferences. He has initiated and is also involved in the organization of several annual conferences/symposiums. Sabu is currently serving as Editor for Journal of Network and Computer Applications (JNCA), Elsevier, and Journal of Applied Soft Computing, Elsevier; Associate Editor for IEEE Access and International Journal of Embedded Systems, Inderscience, UK; and Reviewer for several reputed international journals. Sabu is a Senior Member of IEEE and Member of IEEE Communications Society, IEEE SMCS, and Senior Member of ACM. He is the Founding Chair of ACM Trivandrum Professional Chapter.
Jaime Lloret Mauri received his B.Sc.+M.Sc. in Physics in 1997, his B.Sc.+M.Sc. in Electronic Engineering in 2003, and his Ph.D. in Telecommunication Engineering (Dr. Ing.) in 2006. He is a Cisco Certified Network Professional Instructor. He worked as Network Designer and Administrator in several enterprises. He is currently an Associate Professor at the Polytechnic University of Valencia. He is the Chair of the Integrated Management Coastal Research Institute (IGIC), and he is Head of the “Active and collaborative techniques and use of technologic resources in the education (EITACURTE)” Innovation Group. He is the Director of the University Diploma “Redes y Comunicaciones de Ordenadores,” and he was the Director of the University Master “Digital Post Production” for the term 2012–2016. He has served as the General Chair (or Co-Chair) of 38 international workshops and conferences. He has authored 22 book chapters and has more than 380 research papers published in national and international conferences and international journals (more than 140 with ISI Thomson JCR). He has been Co-Editor of 40 conference proceedings and Guest Editor of several international books and journals. He is the Editor-in-Chief of the “Ad Hoc and Sensor Wireless Networks” (with ISI Thomson Impact Factor), the international journal “Networks Protocols and Algorithms,” and the International Journal of Multimedia Communications, and IARIA Journals Board Chair (8 journals), and he is (or has been) an Associate Editor of 46 international journals (16 of them with ISI Thomson Impact Factor). He has been involved in more than 400 program committees of international conferences and more than 150 organization and steering committees. He leads many national and international projects. He is currently the Chair of the Working Group of the Standard IEEE 1907.1. He is a Senior Member of IEEE and IARIA Fellow.
Xavier Fernando is a Professor at the Department of Electrical and Computer Engineering, Ryerson University, Toronto, Canada. He has (co-)authored over 200 research articles and three books (one translated to Mandarin) and holds a few patents and non-disclosure agreements. He is the Director of the well-funded Ryerson Communications Lab. He was an IEEE Communications Society Distinguished Lecturer and delivered over 50 invited talks and keynote presentations all over the world. He was a Member of the IEEE Communications Society (COMSOC) Education Board Working Group on Wireless Communications. He was the Chair of IEEE Canada Humanitarian Initiatives Committee 2017–2018. He was also the Chair of the IEEE Toronto Section and IEEE Canada Central Area. His work has won 30 awards and prizes so far, including the Professional Engineers Ontario Award in 2016, IEEE Microwave Theory and Techniques Society Prize in 2010, Sarnoff Symposium Prize in 2009, Opto-Canada best poster prize in 2003, and CCECE best paper prize in 2001. He was the General Chair for IEEE International Humanitarian Technology Conference (IHTC) 2017 and General Chair of IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), 2014. He has been in the organizing/steering/technical program committees of numerous conferences and journals. He was a Member of the Board of Governors of Ryerson University during 2011–2012. He is a program evaluator for ABET (USA). He was a visiting scholar at the Institute of Advanced Telecommunications (IAT), UK, in 2008, and MAPNET Fellow visiting Aston University, UK, in 2014. Ryerson University nominated him for the Top 25 Canadian Immigrants award in 2012, in which he was a finalist.
Rajendra Boppana is a Professor in the Department of Computer Science at the University of Texas at San Antonio (UTSA). Dr. Boppana’s research interests include computer network security and performance and high performance computing. Dr. Boppana is currently working on the analysis, visualization, and mitigation of
denial of service attacks on software defined networks. He has published 75 peer-reviewed conference papers and journal articles, in addition to several book chapters on these topics. Dr. Boppana served as Principal Investigator (PI) or Co-PI for over 12 research grants from US federal funding agencies and is the sole or lead inventor for three patents. Dr. Boppana received his Ph.D. degree in computer engineering from the University of Southern California, Los Angeles, USA. Dr. Boppana directed UTSA’s Quantitative Literacy Program (QLP), a university-wide curriculum enhancement program, from 2011 to 2016. Dr. Boppana served as the Chair of the Department of Computer Science from 2012 to 2018.
S. Geetha is a Professor and Associate Dean in the School of Computer Science and Engineering, VIT University, Chennai Campus, Tamil Nadu, India. She received the B.E. and M.E. degrees in Computer Science and Engineering from Madurai Kamaraj University, India, in 2000 and Anna University of Chennai, India, in 2004, respectively, and the Ph.D. degree from Anna University in 2011. She has more than 19 years of teaching and research experience. She has published more than 100 papers in reputed international conferences and refereed journals. Her research interests include steganography, steganalysis, multimedia security, intrusion detection systems, machine learning paradigms, and information forensics. She serves on the review committees and editorial advisory boards of journals such as IEEE Transactions on Information Forensics and Security, IEEE Transactions on Image Processing, Springer Multimedia Tools and Security, and Elsevier Information Sciences. She has published 5 books. She has given many expert lectures and keynote addresses at international and national conferences. She has organized many workshops, conferences, and FDPs. She is a recipient of the University Rank and Academic Topper Award in B.E. and M.E. in 2000 and 2004, respectively. She is also the recipient of the ASDF Best Academic Researcher Award 2013, ASDF Best Professor Award 2014, Research Award 2016 and High Performer Award 2016 from VIT University, and Best Poster Award in ISCA 2018.
Axel Sikora holds a master (Dipl.-Ing.) of Electrical Engineering and a master (Dipl. Wirt-Ing.) of Business Administration, both from Aachen Technical University, Germany. He is a DAAD alumnus from 1990/91 at St Petersburg Polytechnical Institute. He completed a Ph.D. (Dr.-Ing.) in Electrical Engineering at the Fraunhofer Institute of Microelectronics Circuits and Systems, Duisburg, with a thesis on SOI-technologies. After various positions in the telecommunications and semiconductor industry, he became a professor at the Baden-Wuerttemberg Cooperative State University Loerrach in 1999. In 2011, he joined Offenburg University of Applied Sciences, where he now leads the Institute of Reliable Embedded Systems and Communication Electronics (ivESK). Since 2016, he has also been a deputy member of the board of the Hahn-Schickard Association of Applied Research, a government-funded research institute in the state of Baden-Wuerttemberg. There, he now leads two engineering divisions, “Embedded Solutions” and “Software Solutions”. Since 2019, he has also been affiliated with the Technical Faculty of Freiburg University, which allows him to serve as a primary Ph.D. supervisor.
Dr. Sikora is founder and head of Steinbeis Transfer Center Embedded Design and Networking and shareholder of STACKFORCE GmbH, an independent and successful spin-off engineering company around IoT connectivity solutions. His major interest is in the field of efficient, energy-aware, autonomous, and value-added algorithms and protocols for wired and wireless embedded communication, with a strong focus on end-to-end security concepts and implementations for the Industrial Internet of Things. Dr. Sikora is author, co-author, editor, and co-editor of several textbooks and more than 200 peer-reviewed papers in the field of embedded design and wireless & wired networking. Amongst many other duties, he serves as Chairman of the annual embedded world Conference (Nuremberg), the world’s largest event on the topic, and as Scientific Advisor of the annual Wireless Congress (Munich) and of the annual IoT Conference (Stuttgart).
Contributors Purnima Ahirao Department of Information Technology, K J Somiaya College of Engineering, Mumbai, India Salvador Alcaraz Miguel Hernández University, Elche, Spain A. Alice Nithya School of Computing, SRM Institute of Science and Technology, Kattankulathur, Kanchipuram, Chennai, TN, India Gali Amoolya National Institute of Technology Calicut, Kerala, India Deepali Bajaj Department of Computer Science, Shaheed Rajguru College of Applied Sciences for Women, University of Delhi, Delhi, India Hunar Batra Department of Computer Science, Shaheed Rajguru College of Applied Sciences for Women, University of Delhi, Delhi, India Urmil Bharti Department of Computer Science, Shaheed Rajguru College of Applied Sciences for Women, University of Delhi, Delhi, India Yogita Borse Department of Information Technology, K J Somiaya College of Engineering, Mumbai, India Kinjal Chaudhari Department of Computer Science and Engineering, Institute of Technology, Nirma University, Ahmedabad, Gujarat, India Ganga Das National Institute of Technology Calicut, Kerala, India Johnson Fadeyi Central University of Technology, Bloemfontein, South Africa Jalada Geeta National Institute of Technology Calicut, Kerala, India Sudhish N. George National Institute of Technology Calicut, Kerala, India
Editors and Contributors
xix
Katja Gilly Miguel Hernández University, Elche, Spain Anita Goel Department of Computer Science, Dyal Singh College, University of Delhi, Delhi, India S. C. Gupta Department of Computer Science, Indian Institute of Technology, Delhi, India P. Hema Sirija Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India Rishab Jain PES University, Bangalore, India D. Diana Josephine Department of ECE, Coimbatore Institute of Technology, Coimbatore, India Carlos Juiz University of the Balearic Islands, Palma de Mallorca, Spain Abhinesh Kaushik School of Computers and Systems Sciences, Jawaharlal Nehru University, New Delhi, India Ashna KK National Institute of Technology Calicut, Kerala, India Akash Kumar School of Computing, SRM Institute of Science and Technology, Kattankulathur, Kanchipuram, Chennai, TN, India Jaime Lloret Integrated Management Coastal Research Institute, Universitat Politécnica de Valencia, Valencia, Spain D. K. Lobiyal School of Computers and Systems Sciences, Jawaharlal Nehru University, New Delhi, India Ritika Lohiya Department of Computer Science and Engineering, Institute of Technology, Nirma University, Ahmedabad, Gujarat, India Meet Mangukiya Department of Information Technology, K J Somiaya College of Engineering, Mumbai, India Farhana Mariyam Department of Electrical Engineering, Faculty of Engineering and Technology, Jamia Millia Islamia (A Central University), New Delhi, India Elisha D. Markus Central University of Technology, Bloemfontein, South Africa Mongelli Maurizio National Research Council of Italy (CNR) - IEIIT Institute, Genoa, Italy Xolani B. 
Maxama Central University of Technology, Bloemfontein, South Africa Shabana Mehfuz Department of Electrical Engineering, Faculty of Engineering and Technology, Jamia Millia Islamia (A Central University), New Delhi, India Mohak Narang School of Computing, SRM Institute of Science and Technology, Kattankulathur, Kanchipuram, Chennai, TN, India
Shreeya Patel Department of Information Technology, K J Somiaya College of Engineering, Mumbai, India Lourdes Peñalver Department of Computer Engineering, Universitat Politécnica de Valencia, Valencia, Spain G. Prakash Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India S. P. Ragendhu Indian Institute of Information Technology and ManagementKerala. (Research Centre of Cochin University of Science and Technology, India), Kazhakkoottam, India A. Rajeswari Department of ECE, Coimbatore Institute of Technology, Coimbatore, India Vinayak Ray Department of Electrical Engineering, Indian Institute of Technology Kanpur, Kanpur, Uttar Pradesh, India; Presently with Advanced Micro Devices, Bengaluru, India Pedro Juan Roig Miguel Hernández University, Elche, Spain; University of the Balearic Islands, Palma de Mallorca, Spain P. Roshini Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India Mohd. Sadiq Department of Computer Science and Automation, Indian Institute of Science Bangalore Karnataka, Bangalore, India P. Sai Teja Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India Gadde Sai Venkata Swetha National Institute of Technology Calicut, Kerala, India Lakshmi Sankaran CHRIST (Deemed to be University), Bengaluru, Karnataka, India V. Sarasvathi PES University, Bangalore, India S. Saravanan Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India Premson Singh Samom North Eastern Regional Institute of Science and Technology, Nirjuli, Arunachal Pradesh, India Pradip Sircar Department of Electrical Engineering, Indian Institute of Technology Kanpur, Kanpur, Uttar Pradesh, India Jose Vicente Sorribes Department of Computer Engineering, Universitat Politécnica de Valencia, Valencia, Spain
Saleema Janardhanan Subramanian CHRIST (Deemed to be University), Bengaluru, Karnataka, India Amar Taggu North Eastern Regional Institute of Science and Technology, Nirjuli, Arunachal Pradesh, India Ankit Thakkar Department of Computer Science and Engineering, Institute of Technology, Nirma University, Ahmedabad, Gujarat, India Tony Thomas Indian Institute of Information Technology and ManagementKerala, Kazhakkoottam, India Santhosh Thota Department of Electrical Engineering, Indian Institute of Technology Kanpur, Kanpur, Uttar Pradesh, India Orani Vanessa National Research Council of Italy (CNR) - IEIIT Institute, Genoa, Italy
Algebraic Modelling of a Generic Fog Scenario for Moving IoT Devices Pedro Juan Roig, Salvador Alcaraz, Katja Gilly, and Carlos Juiz
Abstract Moving IoT devices may change position at any time, yet their intrinsic restrictions mean that their remote computing resources should always stay as close to them as possible. This makes Fog computing an ideal solution, where hosts may be distributed over the Fog domain and the Cloud domain may additionally be used as support. In this context, a generic scenario is presented, and the most common actions for managing the virtual machines associated with such moving IoT devices are modelled and verified by algebraic means, focusing on the message exchange among all the actors concerned in each action.
Keywords Algebraic modelling · Cloud support · Fog computing · IoT · Live VM migration
1 Introduction
Fog computing and its use with IoT devices is a hot topic nowadays [1]. Such devices have restricted computing power [2] in terms of CPU processing, RAM, storage capacity, bandwidth and power consumption [3]. Hence, regular communication paradigms such as client-server or peer-to-peer are not convenient for IoT communications [4]. Instead, IoT devices are better suited for paradigms where a central server carries out most of the work [5], thus minimizing the workload for the devices.
Furthermore, when IoT devices are on the move [6], it is crucial that their associated virtual machines (VMs) [7] are as close as possible to them [8], thus minimising latency [9] and jitter [10], as well as the necessary bandwidth [11]. In this context, deploying hosts in distant geographical locations within the Fog environment makes it possible to perform live VM migration among those hosts [12], so that a given VM is located in the host closest to its associated user while that user moves around [13].
The rest of this paper is organized as follows: Sect. 2 presents the process algebra used in the models, Sect. 3 introduces the topology framework in use, Sect. 4 delivers the proposed models, Sect. 5 verifies those models, and Sect. 6 draws some conclusions.

P. J. Roig (corresponding author) · S. Alcaraz · K. Gilly: Miguel Hernández University, Elche, Spain
P. J. Roig · C. Juiz: University of the Balearic Islands, Palma de Mallorca, Spain
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. M. Thampi et al. (eds.), Applied Soft Computing and Communication Networks, Lecture Notes in Networks and Systems 187, https://doi.org/10.1007/978-981-33-6173-7_1
2 Algebra of Communicating Processes
In order to model the procedures for managing VMs when moving IoT devices enter and exit the Fog environment, a process algebra called Algebra of Communicating Processes (ACP) is used. ACP is considered the most abstract of all process algebras, as it is not concerned with the real nature of the entities but only with their characteristics, as is the case with other abstract algebras such as field theory, ring theory or group theory [14]. ACP makes it possible to reason about relationships among the processes describing the behaviour of the entities being modelled [15].
ACP basically offers two atomic actions: sending a message $d$ through channel $i$, denoted by $s_i(d)$, and reading a message $d$ through channel $i$, denoted by $r_i(d)$. The nature of such a message is left undetermined, so it may be a control message, a data message or even a VM being migrated. When a send and a read of the same message on the same channel synchronise, the result is a communication, written $c_i(d)$ in the derivations of Sect. 4.
ACP also provides operators to combine atomic actions, allowing derivations to be carried out in order to reach equivalent expressions. Those derivations may be performed either with specific software or manually, as presented herein for clarity [16]. There are many operators, but only the most important ones are introduced [17]:
• sequential, denoted by the multiplication sign $\cdot$;
• alternative, denoted by the addition sign $+$;
• concurrent, denoted by the parallel sign $\parallel$;
• conditional, denoted by the structure $(\mathrm{True} \triangleleft \mathrm{condition} \triangleright \mathrm{False})$.
Likewise, the encapsulation operator, represented by $\partial_H()$, forces processes to communicate through the channels and leads to deadlock when such communications do not take place. Additionally, the abstraction operator, represented by $\tau_I()$, masks internal actions and hides internal communications, thus revealing only the external behaviour of the proposed model [18].
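The interplay between atomic actions, communication and encapsulation can be illustrated with a minimal Python sketch. It is not part of the paper's method: the `Action` record, the `DEADLOCK` marker and the function names are assumptions introduced here, and the set of blocked channels stands in for the set $H$.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Action:
    kind: str     # "s" = send, "r" = read, "c" = communication
    channel: str  # channel identifier, e.g. "WiQ"
    data: str     # message payload, left abstract as in ACP


DEADLOCK = None  # hypothetical stand-in for the deadlock constant


def communicate(a: Action, b: Action) -> Optional[Action]:
    """A send and a read of the same data on the same channel merge into
    one communication action; any other pair cannot communicate."""
    if {a.kind, b.kind} == {"s", "r"} and (a.channel, a.data) == (b.channel, b.data):
        return Action("c", a.channel, a.data)
    return DEADLOCK


def encapsulate(a: Action, blocked_channels: Set[str]) -> Optional[Action]:
    """A lone send or read on a blocked (internal) channel deadlocks;
    communications and actions on other channels pass through."""
    if a.kind in ("s", "r") and a.channel in blocked_channels:
        return DEADLOCK
    return a


send = Action("s", "WiQ", "ctr1(km)")
read = Action("r", "WiQ", "ctr1(km)")
arrival = Action("r", "in", "IoT(km)")

print(communicate(send, read))          # the pair becomes one communication
print(encapsulate(send, {"WiQ"}))       # an unmatched internal send is blocked
print(encapsulate(arrival, {"WiQ"}))    # the external read is left untouched
```

In this sketch only internal channels are placed in the blocked set, which is why an external read such as the arrival of an IoT device survives encapsulation.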
3 Topology Framework
The scenario to be modelled is a Fog computing environment for moving IoT devices within a wireless domain, with the Cloud as a backup solution. Those IoT devices may move around, and their associated computing assets may then be reallocated to another host within the Fog ecosystem. The aim is for each selected host to be situated as close as possible to its corresponding devices, so a migration takes place from a source host to a destination host that has enough available resources and is close enough to the device.
The framework is composed of 5 layers, each with a different role in the topology, although all of them work together for the system to behave as intended. Moving IoT devices are not considered an independent layer, as they are the external agents requesting the different behaviours offered by the Fog environment. Figure 1 depicts the organization of these layers.
In the first place, the wireless layer is the lowest one, and it is where moving IoT devices connect to the Fog environment, thus acting as an anchor point between the device and the Fog environment. Specifically, when a moving IoT device enters the coverage area of a particular wireless relay, identified by variable $i$ as $W_i$, it gets access to its associated VM through the Fog infrastructure.
Afterwards, the orchestration layer is the most important one, as it is in charge of managing how the whole system behaves: selecting the proper host to which a VM is to be moved, deciding where a new VM is to be created or an old VM terminated, and interacting with the Cloud in order to make it a useful backup system. In this scheme just one orchestrator is assigned, identified by $Q$, even though two physical devices may act as one by means of High Availability policies.
Fig. 1 Topology scheme. Elements shown: $W_i$ (wireless relay), $H_j$ (hosts within the Fog), $T$ (internetworking topology), $Q$ (orchestrator), $C$ (Cloud node) and the IoT device. Actions entering through channel in: Cloud IN, Migrate IN, Init VM. Actions leaving through channel out: Cloud OUT, Migrate OUT, Kill VM.
Then, the host layer is where the VMs associated with the different users are allocated. Each host is identified by variable $j$ as $H_j$. There is no one-to-one correspondence between wireless relays and hosts: hosts are located all around the geographic coverage area of the Fog domain, and a single host may be the closest one to a range of wireless relays. Hence, no migration may be needed when moving within such wireless areas, in which case the orchestration layer takes no action.
Next, the topology layer is the interconnection network that links all hosts so that they can communicate with each other, and it is identified by $T$. Recall that the orchestrator decides where and when to undertake a VM migration, thus selecting the destination host. Nevertheless, once a migration has been decided, live VM migration takes place along the shortest path available in the selected topology. Additionally, load balancing policies might be applied if redundant paths exist within the topology between a source host and a destination host.
Finally, the cloud layer is used as a backup for the Fog environment and is identified by $C$. The Cloud is not managed by the orchestrator; it just establishes communication with it in order to facilitate the transmission of VMs from Cloud to Fog and vice versa. This way, the Cloud might be considered a layer working along with the rest of the system.
Once the different layers have been introduced, the communication channels among them can be presented. In order to identify such channels and their directions, a specific nomenclature is used: traffic goes from a source layer to a destination layer, so each channel is named by both layer identifiers, in that order. Table 1 shows the communication channels used in this model.
As may be seen in both that table and Fig. 1, the orchestrator is the main element within the scheme, being the central point and making all decisions, which are forwarded on to the rest of the actors. Hence, all actions are started at the wireless layer, planned at the orchestration layer and executed at the other layers.
Apart from the internal channels in both directions among the corresponding layers of the generic Fog computing environment proposed, 2 wireless physical channels have been defined. These two channels are external to the Fog environment and represent, respectively, the way for a moving IoT device to get into the Fog domain by connecting to a wireless relay, and the way to get out of the Fog domain by disconnecting from it. The former is labelled in, whereas the latter is labelled out.
Table 1 Communication channels

| Channel ID | Direction | Meaning |
|---|---|---|
| $W_iQ$ | Incoming | Channel from the wireless relay $W_i$ to orchestrator $Q$ |
| $QW_i$ | Outgoing | Channel from orchestrator $Q$ to the wireless relay $W_i$ |
| $QH_j$ | Incoming | Channel from orchestrator $Q$ to the host $H_j$ |
| $H_jQ$ | Outgoing | Channel from the host $H_j$ to orchestrator $Q$ |
| $H_jT$ | Incoming | Channel from the host $H_j$ to Switching topology $T$ |
| $TH_j$ | Outgoing | Channel from Switching topology $T$ to the host $H_j$ |
| $QC$ | Incoming | Channel from orchestrator $Q$ to the Cloud $C$ |
| $CQ$ | Outgoing | Channel from the Cloud $C$ to orchestrator $Q$ |
| in | Inwards | Wireless physical channel coming into wireless relay $W_i$ |
| out | Outwards | Wireless physical channel going out of wireless relay $W_i$ |
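The naming rule for channels (source identifier followed by destination identifier) can be sketched in a few lines of Python. The list of linked layer pairs is read off Table 1; the function and variable names are hypothetical and only serve the illustration.

```python
# Pairs of layers that exchange messages, as listed in Table 1.
LINKED_LAYERS = [("Wi", "Q"), ("Q", "Hj"), ("Hj", "T"), ("Q", "C")]

# The two wireless physical channels are external to the Fog environment.
EXTERNAL_CHANNELS = ["in", "out"]


def channel_id(source: str, destination: str) -> str:
    """Name a channel by its source layer followed by its destination layer."""
    return f"{source}{destination}"


def internal_channels(links=LINKED_LAYERS):
    """Each linked pair of layers yields one channel per direction."""
    channels = []
    for first, second in links:
        channels.append(channel_id(first, second))
        channels.append(channel_id(second, first))
    return channels


all_channels = internal_channels() + EXTERNAL_CHANNELS
print(all_channels)
```

A list such as `internal_channels()` is also a natural candidate for the set of channels on which lone sends and reads are blocked by encapsulation.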
4 Actions to be Modelled
The 6 actions required at the wireless layer, as shown in Fig. 1, are triggered by a moving IoT device coming into or going out of a wireless relay. This type of traffic may be referred to as control actions. Conceptually, it belongs to the control plane, also referred to as the intelligence of the system, as it relates to how the different VMs are governed by the system. It does not include the user's data being forwarded by the system, which conceptually belongs to the data plane, also referred to as the forwarding plane. Likewise, it does not cover the way to access and manage the items deployed at each layer for configuration, monitoring or maintenance, which conceptually belongs to the management plane.
Those 6 actions may be divided into two categories: those required when a moving IoT device gets associated with a particular wireless relay $W_i$, meaning that it gets into the Fog domain, and those required when a moving IoT device gets disassociated, meaning that it gets out of the Fog domain. The actions shown on the left-hand side of Fig. 1 belong to the former, whereas those on the right-hand side belong to the latter.
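As a rough illustration of this split, the six actions can be catalogued in Python using the numbering adopted in Sects. 4.1 to 4.6. The enum, the string encoding of VM locations and the two helper functions are assumptions made for this sketch, not definitions taken from the paper.

```python
from enum import Enum
from typing import Optional


class FogAction(Enum):
    CLOUD_IN = 1
    MIGRATE_IN = 2
    INIT_VM = 3
    CLOUD_OUT = 4
    MIGRATE_OUT = 5
    KILL_VM = 6

    @property
    def physical_channel(self) -> str:
        """Actions 1 to 3 enter through channel in, actions 4 to 6 leave through out."""
        return "in" if self.value <= 3 else "out"

    @property
    def control_message(self) -> str:
        """Control message label, ctr1 to ctr6, matching the action number."""
        return f"ctr{self.value}"


def on_association(vm_location: Optional[str]) -> FogAction:
    """Pick the action when a device joins a relay.
    vm_location is "cloud", "fog" or None (hypothetical encoding)."""
    if vm_location == "cloud":
        return FogAction.CLOUD_IN
    if vm_location == "fog":
        return FogAction.MIGRATE_IN
    return FogAction.INIT_VM


def on_disassociation(destination: Optional[str]) -> FogAction:
    """Pick the action when a device leaves a relay.
    destination is "cloud", "fog" (another relay) or None (leaving for good)."""
    if destination == "cloud":
        return FogAction.CLOUD_OUT
    if destination == "fog":
        return FogAction.MIGRATE_OUT
    return FogAction.KILL_VM


for action in FogAction:
    print(action.name, action.physical_channel, action.control_message)
```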
4.1 Action Cloud IN
This action happens when a moving IoT device first gets into the Fog environment through wireless physical channel in, coming from the Cloud environment and connecting to any wireless relay $W_i$. In this case, the IoT device already has an associated VM located in the Cloud $C$, so the target is to migrate that VM from the Cloud domain into one of the hosts within the Fog domain, called host $H_j$.
The Cloud where that VM resides has an identifier $k$, and the VM inside that Cloud is identified by $m$. Furthermore, a VM is associated with a moving IoT device, which may be considered as user $u$. In summary, the identification of a VM may combine all three elements, namely the environment, the VM and the user, as $k_{m \to u}$. Likewise, when the VM is in a host $j$ within the Fog, the expression is $j_{m \to u}$.
This nomenclature allows an easy extension of the model to multiple users, multiple VMs and multiple platforms. First, it distinguishes the environment where the VM is located, either the Fog or the Cloud, and even allows more than one Cloud to be used by just modifying the Cloud identifier. Second, it distinguishes among different VMs allocated to a single user. Third, it distinguishes among the different users of the remote computing facilities, whether in the Fog or in the Cloud. However, to keep the algebraic expressions readable, only the first and second identifiers are shown, meaning the platform and the VM, but not the user.
To start with, an IoT device associated with a VM in the Cloud joins the wireless relay $W_i$. At that point, $W_i$ sends a control message to the orchestrator $Q$ in order to find a host $H_j$ with enough available resources and located as close as possible to $W_i$. After $Q$ has carried out the processing needed to select the proper host candidate $j$, it sends a control message with the identifier of that host $j$ and the identifier of the associated VM $k_m$ to the Cloud $C$ in order to migrate that VM from the Cloud $C$ to the selected host $H_j$, resulting in $j_m$.
Regarding the modelling of this action, the control message preparing this type of migration is labelled $\mathrm{ctr}_1$, because this action is considered as action 1. Also, some internal processes have been defined to reflect what goes on in the system: 'Select' to choose host $j$, 'MoveOut' to start the migration from Cloud to Fog, 'CloudIn' to carry out the migration from the Cloud to the destination host, 'MoveIn' to finish that migration, 'Unbind' to remove the old identifier of the associated VM and 'Bind' to include the new identifier of that VM. The modelling of action 1 is as follows:

$$
\begin{aligned}
W_i &= r_{in}(\mathrm{IoT}(k_m)) \cdot s_{W_iQ}(\mathrm{ctr}_1(k_m)) \cdot r_{QW_i}(j_m) \cdot \mathrm{Unbind}(\mathrm{IoT}(k_m)) \cdot \mathrm{Bind}(\mathrm{IoT}(j_m)) \cdot W_i \\
Q &= r_{W_iQ}(\mathrm{ctr}_1(k_m)) \cdot \mathrm{Select}(k_m) \cdot s_{QC}(\mathrm{ctr}_1(k_m, j_m)) \cdot r_{CQ}(\mathrm{CloudIn}(k_m, j_m)) \\
&\quad \cdot s_{QH_j}(\mathrm{CloudIn}(k_m, j_m)) \cdot r_{H_jQ}(\mathrm{ACK}, j_m) \cdot s_{QW_i}(j_m) \cdot Q \\
C &= r_{QC}(\mathrm{ctr}_1(k_m, j_m)) \cdot \mathrm{MoveOut}(k_m, j_m) \cdot s_{CQ}(\mathrm{CloudIn}(k_m, j_m)) \cdot C \\
H_j &= r_{QH_j}(\mathrm{CloudIn}(k_m, j_m)) \cdot \mathrm{MoveIn}(k_m, j_m) \cdot s_{H_jQ}(\mathrm{ACK}, j_m) \cdot H_j
\end{aligned}
$$

Putting all processes in parallel and applying the encapsulation operator yields the sequence of events in time order, triggered by the moving IoT device getting into the Fog environment and asking for a Cloud IN action:

$$
\begin{aligned}
\partial_H(W_i \parallel Q \parallel C \parallel H_j) &= r_{in}(\mathrm{IoT}(k_m)) \cdot c_{W_iQ}(\mathrm{ctr}_1(k_m)) \cdot \mathrm{Select}(k_m) \\
&\quad \cdot c_{QC}(\mathrm{ctr}_1(k_m, j_m)) \cdot \mathrm{MoveOut}(k_m, j_m) \cdot c_{CQ}(\mathrm{CloudIn}(k_m, j_m)) \\
&\quad \cdot c_{QH_j}(\mathrm{CloudIn}(k_m, j_m)) \cdot \mathrm{MoveIn}(k_m, j_m) \cdot c_{H_jQ}(\mathrm{ACK}, j_m) \cdot c_{QW_i}(j_m) \\
&\quad \cdot \mathrm{Unbind}(\mathrm{IoT}(k_m)) \cdot \mathrm{Bind}(\mathrm{IoT}(j_m)) \cdot \partial_H(W_i \parallel Q \parallel C \parallel H_j)
\end{aligned}
$$

Finally, the abstraction operator may be applied in order to mask internal processes and internal communications, thus keeping only the external behaviour of the model. All layers may be included, whether or not they take part in this action, as those not taking part have no influence on the overall outcome:

$$
\tau_I(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)) = r_{in}(\mathrm{IoT}(k_m)) \cdot \tau_I(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T))
$$
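The message exchange of action 1 can be replayed with a small Python sketch that runs one round of the four processes side by side and only lets a send proceed when a matching read is ready. This is an informal illustration rather than an ACP derivation: the step encoding, the `run` scheduler and the string labels are assumptions, and the Cloud is assumed to read the control message on channel QC, the channel on which the orchestrator sends it.

```python
# Each step is (kind, label, argument):
#   "s" / "r" = send / read on channel `label`,
#   "a"       = action performed alone (internal process or external channel).
ACTION_1 = {
    "Wi": [("a", "r_in", "IoT(km)"), ("s", "WiQ", "ctr1(km)"), ("r", "QWi", "jm"),
           ("a", "Unbind", "IoT(km)"), ("a", "Bind", "IoT(jm)")],
    "Q": [("r", "WiQ", "ctr1(km)"), ("a", "Select", "km"),
          ("s", "QC", "ctr1(km,jm)"), ("r", "CQ", "CloudIn(km,jm)"),
          ("s", "QHj", "CloudIn(km,jm)"), ("r", "HjQ", "ACK,jm"), ("s", "QWi", "jm")],
    "C": [("r", "QC", "ctr1(km,jm)"), ("a", "MoveOut", "km,jm"),
          ("s", "CQ", "CloudIn(km,jm)")],
    "Hj": [("r", "QHj", "CloudIn(km,jm)"), ("a", "MoveIn", "km,jm"),
           ("s", "HjQ", "ACK,jm")],
}


def find_reader(processes, position, sender, channel, data):
    """Return the name of a process whose next step reads `data` on `channel`."""
    for name, steps in processes.items():
        if name != sender and position[name] < len(steps):
            if steps[position[name]] == ("r", channel, data):
                return name
    return None


def run(processes):
    """Run one round of all processes; return the trace and whether all finished."""
    position = {name: 0 for name in processes}
    trace = []
    progressed = True
    while progressed:
        progressed = False
        for name, steps in processes.items():
            if position[name] == len(steps):
                continue
            kind, label, arg = steps[position[name]]
            if kind == "a":
                trace.append(f"{label}({arg})")
                position[name] += 1
                progressed = True
            elif kind == "s":
                reader = find_reader(processes, position, name, label, arg)
                if reader is not None:
                    trace.append(f"c_{label}({arg})")
                    position[name] += 1
                    position[reader] += 1
                    progressed = True
            if progressed:
                break
    finished = all(position[n] == len(s) for n, s in processes.items())
    return trace, finished


trace, finished = run(ACTION_1)
for event in trace:
    print(event)
print("all processes completed:", finished)
```

If a read were placed on the wrong channel, the scheduler would stop early with `finished` set to `False`, which is the sketch's counterpart of the deadlock produced by encapsulation.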
4.2 Action Migrate IN
This action happens when a moving IoT device connects to a wireless relay $W_i$ through wireless physical channel in, although it comes from another wireless relay $W_{i^*}$ within the Fog domain. In this case, the IoT device already has an associated VM located in a host $H_j$, so the target is either to migrate that VM from host $H_j$ to another host $H_{j'}$ closer to the moving IoT device, or to leave that VM in the same host $H_j$ if no host within the Fog environment is closer to the new wireless relay $W_i$.
To begin with, an IoT device associated with a VM located in a particular host gets to wireless relay $W_i$. At that point, $W_i$ sends a control message to the orchestrator $Q$ in order to determine whether there is a host $H_{j'}$ with enough available resources and located closer than the current $H_j$ where the VM is located at that stage. Upon receiving that message, $Q$ carries out the processing needed to find out whether such a new host is available, and if so, it is selected as the proper host candidate $j'$. Hence, two things may happen. If such a better host exists, the orchestrator sends a control message to host $H_j$ with the identifier of the VM $j_m$, which shows the current host where the VM resides, and the identifier that the VM will bear after the migration from host $j$ to host $j'$, which is $j'_m$. Alternatively, if no migration is to occur, an acknowledgement message is sent back to the IoT device, stating that no VM migration has been carried out.
Regarding the modelling of this action, the control message preparing this type of migration is labelled $\mathrm{ctr}_2$, because this action is considered as action 2. Two pairs of entities are involved: a pair of wireless relays, $W_{i^*}$ and $W_i$, and a pair of hosts, $H_j$ and $H_{j'}$. With respect to the former pair, $i$ represents the wireless relay where the IoT device has just arrived, whereas $i^*$ represents the one it came from. With regard to the latter pair, $j$ represents the host where the VM is located on arriving at the new wireless relay $i$, whilst $j'$ represents the possible new host if VM migration takes place. So this model includes $W_i$, $H_j$ and $H_{j'}$.
Migration within the Fog domain may be described from two opposite points of view, that of the source wireless relay or that of the destination one. The former accounts for action 5, that is, migration out, whereas the latter accounts for action 2, that is, migration in. The latter is formally described here, whereas the former is described in Sect. 4.5.
Furthermore, many of the processes defined for the previous action are reused, along with three new ones: 'MigrateOut' to carry out the migration from the source host to the destination host from the point of view of the former, 'MigrateIn' to do the same from the point of view of the latter, and 'Networking' to embrace the whole set of redundant paths available between the source host and the destination host, which depend on the topology selected for the switches to which the hosts are attached. The modelling of action 2 is as follows:

$$
\begin{aligned}
W_i &= r_{in}(\mathrm{IoT}(j_m)) \cdot s_{W_iQ}(\mathrm{ctr}_2(j_m)) \cdot \big( r_{QW_i}(j'_m) \cdot \mathrm{Unbind}(\mathrm{IoT}(j_m)) \cdot \mathrm{Bind}(\mathrm{IoT}(j'_m)) + r_{QW_i}(\mathrm{ACK}) \big) \cdot W_i \\
Q &= r_{W_iQ}(\mathrm{ctr}_2(j_m)) \cdot \big( s_{QH_j}(\mathrm{ctr}_2(j_m, j'_m)) \cdot r_{H_{j'}Q}(\mathrm{ACK}, j'_m) \cdot s_{QW_i}(j'_m) \triangleleft \mathrm{Select}(j_m) \triangleright s_{QW_i}(\mathrm{ACK}) \big) \cdot Q \\
H_j &= r_{QH_j}(\mathrm{ctr}_2(j_m, j'_m)) \cdot \mathrm{MoveOut}(j_m, j'_m) \cdot s_{H_jT}(\mathrm{MigrateOut}(j_m, j'_m)) \cdot H_j \\
H_{j'} &= r_{TH_{j'}}(\mathrm{MigrateIn}(j_m, j'_m)) \cdot \mathrm{MoveIn}(j_m, j'_m) \cdot s_{H_{j'}Q}(\mathrm{ACK}, j'_m) \cdot H_{j'} \\
T &= r_{H_jT}(\mathrm{MigrateOut}(j_m, j'_m)) \cdot \mathrm{Networking}(j_m, j'_m) \cdot s_{TH_{j'}}(\mathrm{MigrateIn}(j_m, j'_m)) \cdot T
\end{aligned}
$$

Putting all processes in parallel and applying the encapsulation operator yields the sequence of events in time order, triggered by the moving IoT device getting into the Fog and asking for a Migrate IN action:

$$
\begin{aligned}
\partial_H(W_i \parallel Q \parallel H_j \parallel H_{j'} \parallel T) &= r_{in}(\mathrm{IoT}(j_m)) \cdot c_{W_iQ}(\mathrm{ctr}_2(j_m)) \\
&\quad \cdot \big( c_{QH_j}(\mathrm{ctr}_2(j_m, j'_m)) \cdot \mathrm{MoveOut}(j_m, j'_m) \cdot c_{H_jT}(\mathrm{MigrateOut}(j_m, j'_m)) \\
&\qquad \cdot \mathrm{Networking}(j_m, j'_m) \cdot c_{TH_{j'}}(\mathrm{MigrateIn}(j_m, j'_m)) \cdot \mathrm{MoveIn}(j_m, j'_m) \\
&\qquad \cdot c_{H_{j'}Q}(\mathrm{ACK}, j'_m) \cdot c_{QW_i}(j'_m) \cdot \mathrm{Unbind}(\mathrm{IoT}(j_m)) \cdot \mathrm{Bind}(\mathrm{IoT}(j'_m)) \\
&\qquad \triangleleft \mathrm{Select}(j_m) \triangleright c_{QW_i}(\mathrm{ACK}) \big) \cdot \partial_H(W_i \parallel Q \parallel H_j \parallel H_{j'} \parallel T)
\end{aligned}
$$

Finally, the abstraction operator may be applied in order to mask internal processes and internal communications, hence keeping just the external behaviour of the model. All layers are included, considering that those not participating do not influence the result in any way:

$$
\tau_I(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel H_{j'} \parallel T)) = r_{in}(\mathrm{IoT}(j_m)) \cdot \tau_I(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel H_{j'} \parallel T))
$$
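The conditional branch taken by the orchestrator in action 2 can be sketched as a small decision function. The paper leaves the internals of 'Select' abstract, so everything concrete here is an assumption: hosts are placed on a hypothetical one-dimensional coordinate, capacity is a plain integer, and the returned tuples merely mimic the two outcomes (a $\mathrm{ctr}_2$ message or an ACK).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Host:
    name: str
    position: float  # hypothetical 1-D coordinate of the host
    free_units: int  # hypothetical measure of spare capacity


def migrate_in_decision(relay_position: float, current: Host,
                        hosts: List[Host], needed_units: int) -> Tuple[str, ...]:
    """Return ("ctr2", source, target) if a closer host with enough
    resources exists, otherwise ("ACK",) meaning no migration."""
    current_distance = abs(current.position - relay_position)
    better = [
        h for h in hosts
        if h.name != current.name
        and h.free_units >= needed_units
        and abs(h.position - relay_position) < current_distance
    ]
    if not better:
        return ("ACK",)
    target = min(better, key=lambda h: abs(h.position - relay_position))
    return ("ctr2", current.name, target.name)


h1 = Host("H1", 0.0, 1)
h2 = Host("H2", 10.0, 4)
h3 = Host("H3", 9.0, 0)  # close to the second relay but without spare capacity

print(migrate_in_decision(9.5, h1, [h1, h2, h3], needed_units=2))
print(migrate_in_decision(1.0, h1, [h1, h2, h3], needed_units=2))
```

The first call models a device arriving at a relay far from its current host, so a migration is ordered; the second models a relay for which the current host is still the closest, so only an acknowledgement is returned.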
4.3 Action Init VM
This action happens when a moving IoT device first gets into the Fog environment by means of wireless physical channel in and does not have any associated VM in the Cloud. Hence, it is the case of an IoT device being registered for the first time to get its associated VM, so the target is to create and initialize that VM in one of the hosts within the Fog domain.
In the beginning, the IoT device connects to the wireless relay $W_i$ without any associated VM whatsoever, which is shown by the sign $-$. Then, $W_i$ sends a control message to the orchestrator $Q$ so as to find a host $H_j$ with enough resources situated as close as possible to $W_i$. Next, when $Q$ finds an available host $H_j$, a VM is created there, and afterwards its identifier is sent back to the IoT device.
As for the modelling of this action, the control message to create and initialize that VM is named $\mathrm{ctr}_3$, because this action is considered as action 3. Additionally, the process 'Init' is defined to undertake the creation and initialization of the VM. The model for action 3 is as follows:

$$
\begin{aligned}
W_i &= r_{in}(\mathrm{IoT}(-)) \cdot s_{W_iQ}(\mathrm{ctr}_3(-)) \cdot r_{QW_i}(j_m) \cdot \mathrm{Bind}(\mathrm{IoT}(j_m)) \cdot W_i \\
Q &= r_{W_iQ}(\mathrm{ctr}_3(-)) \cdot \mathrm{Select}(-) \cdot s_{QH_j}(\mathrm{ctr}_3(j_m)) \cdot r_{H_jQ}(\mathrm{ACK}, j_m) \cdot s_{QW_i}(j_m) \cdot Q \\
H_j &= r_{QH_j}(\mathrm{ctr}_3(j_m)) \cdot \mathrm{Init}(j_m) \cdot s_{H_jQ}(\mathrm{ACK}, j_m) \cdot H_j
\end{aligned}
$$

Applying the encapsulation operator gives the result of an Init VM action, which creates a new VM:

$$
\begin{aligned}
\partial_H(W_i \parallel Q \parallel H_j) &= r_{in}(\mathrm{IoT}(-)) \cdot c_{W_iQ}(\mathrm{ctr}_3(-)) \cdot \mathrm{Select}(-) \cdot c_{QH_j}(\mathrm{ctr}_3(j_m)) \\
&\quad \cdot \mathrm{Init}(j_m) \cdot c_{H_jQ}(\mathrm{ACK}, j_m) \cdot c_{QW_i}(j_m) \cdot \mathrm{Bind}(\mathrm{IoT}(j_m)) \cdot \partial_H(W_i \parallel Q \parallel H_j)
\end{aligned}
$$

Finally, after applying the abstraction operator, only the external behaviour remains:

$$
\tau_I(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)) = r_{in}(\mathrm{IoT}(-)) \cdot \tau_I(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T))
$$
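The effect of the abstraction operator on the Init VM sequence can be mimicked by filtering a trace of events. This Python sketch is a simplification made for illustration: hidden steps are simply dropped instead of being turned into silent steps and then removed by the axioms, and the string labels and the `is_external` rule (only actions on the physical channels in and out are visible) are assumptions.

```python
def is_external(event: str) -> bool:
    """Only reads on channel in and sends on channel out are visible from outside."""
    return event.startswith(("r_in(", "s_out("))


def abstract(trace):
    """Keep the externally visible events and drop internal processes and communications."""
    return [event for event in trace if is_external(event)]


# Encapsulated sequence of action 3 (Init VM), written as plain labels.
init_vm_trace = [
    "r_in(IoT(-))",
    "c_WiQ(ctr3(-))",
    "Select(-)",
    "c_QHj(ctr3(jm))",
    "Init(jm)",
    "c_HjQ(ACK,jm)",
    "c_QWi(jm)",
    "Bind(IoT(jm))",
]

external_behaviour = abstract(init_vm_trace)
print(external_behaviour)
```

Only the arrival of the IoT device without a VM survives the filter, in line with the external behaviour obtained for action 3.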
4.4 Action Cloud OUT
This action happens when a moving IoT device leaves the Fog environment by disconnecting from a wireless relay $W_i$ through wireless physical channel out, without reconnecting to any other one. In this case, the VM associated with that IoT device leaves the Fog domain and gets into the Cloud domain, so the target is to migrate that VM from the Fog environment, specifically from a host $H_j$, to the Cloud environment.
The process is quite similar to the one explained for the migration from Cloud to Fog, presented in action 1 (Cloud IN), although the operations go the other way around. Note that the migration of the VM to the Cloud is undertaken first, and only when it is done does the IoT device move out of the Fog environment.
Focusing on the modelling, the control message preparing this type of migration is called $\mathrm{ctr}_4$, because this action is considered as action 4. Furthermore, the process 'Find' is defined to look up the identifier of the Cloud to which the VM will be sent. The modelling of action 4 is as follows:

$$
\begin{aligned}
W_i &= s_{W_iQ}(\mathrm{ctr}_4(j_m)) \cdot r_{QW_i}(k_m) \cdot \mathrm{Unbind}(\mathrm{IoT}(j_m)) \cdot \mathrm{Bind}(\mathrm{IoT}(k_m)) \cdot s_{out}(\mathrm{IoT}(k_m)) \cdot W_i \\
Q &= r_{W_iQ}(\mathrm{ctr}_4(j_m)) \cdot \mathrm{Find}(j_m) \cdot s_{QH_j}(\mathrm{ctr}_4(j_m, k_m)) \cdot r_{H_jQ}(\mathrm{CloudOut}(j_m, k_m)) \\
&\quad \cdot s_{QC}(\mathrm{CloudOut}(j_m, k_m)) \cdot r_{CQ}(\mathrm{ACK}, k_m) \cdot s_{QW_i}(k_m) \cdot Q \\
H_j &= r_{QH_j}(\mathrm{ctr}_4(j_m, k_m)) \cdot \mathrm{MoveOut}(j_m, k_m) \cdot s_{H_jQ}(\mathrm{CloudOut}(j_m, k_m)) \cdot H_j \\
C &= r_{QC}(\mathrm{CloudOut}(j_m, k_m)) \cdot \mathrm{MoveIn}(j_m, k_m) \cdot s_{CQ}(\mathrm{ACK}, k_m) \cdot C
\end{aligned}
$$

After the application of the encapsulation operator, this is the result of a Cloud OUT action:

$$
\begin{aligned}
\partial_H(W_i \parallel Q \parallel H_j \parallel C) &= c_{W_iQ}(\mathrm{ctr}_4(j_m)) \cdot \mathrm{Find}(j_m) \cdot c_{QH_j}(\mathrm{ctr}_4(j_m, k_m)) \\
&\quad \cdot \mathrm{MoveOut}(j_m, k_m) \cdot c_{H_jQ}(\mathrm{CloudOut}(j_m, k_m)) \cdot c_{QC}(\mathrm{CloudOut}(j_m, k_m)) \\
&\quad \cdot \mathrm{MoveIn}(j_m, k_m) \cdot c_{CQ}(\mathrm{ACK}, k_m) \cdot c_{QW_i}(k_m) \cdot \mathrm{Unbind}(\mathrm{IoT}(j_m)) \\
&\quad \cdot \mathrm{Bind}(\mathrm{IoT}(k_m)) \cdot s_{out}(\mathrm{IoT}(k_m)) \cdot \partial_H(W_i \parallel Q \parallel H_j \parallel C)
\end{aligned}
$$

Finally, after applying the abstraction operator, the external behaviour is shown:

$$
\tau_I(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)) = s_{out}(\mathrm{IoT}(k_m)) \cdot \tau_I(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T))
$$
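4.5 Action Migrate OUT
Both this case, Migrate OUT (action 5), and Migrate IN (action 2) carry out a VM migration, although this one takes the point of view of the source wireless relay, which the IoT device leaves, whilst the other one takes that of the destination wireless relay, which the IoT device joins. Hence, the processes seen in action 2 apply.
Focusing on the modelling, the control message preparing this type of migration is referred to as $\mathrm{ctr}_5$, because this action is considered as action 5. Recall that after changing the wireless relay, two options are possible: there may or may not be a VM migration. The modelling of action 5 is as follows:

$$
\begin{aligned}
W_i &= s_{W_iQ}(\mathrm{ctr}_5(j_m)) \cdot \big( r_{QW_i}(j'_m) \cdot \mathrm{Unbind}(\mathrm{IoT}(j_m)) \cdot \mathrm{Bind}(\mathrm{IoT}(j'_m)) \cdot s_{out}(\mathrm{IoT}(j'_m)) \\
&\qquad + r_{QW_i}(\mathrm{ACK}) \cdot s_{out}(\mathrm{IoT}(j_m)) \big) \cdot W_i \\
Q &= r_{W_iQ}(\mathrm{ctr}_5(j_m)) \cdot \big( s_{QH_j}(\mathrm{ctr}_5(j_m, j'_m)) \cdot r_{H_{j'}Q}(\mathrm{ACK}, j'_m) \cdot s_{QW_i}(j'_m) \triangleleft \mathrm{Select}(j_m) \triangleright s_{QW_i}(\mathrm{ACK}) \big) \cdot Q \\
H_j &= r_{QH_j}(\mathrm{ctr}_5(j_m, j'_m)) \cdot \mathrm{MoveOut}(j_m, j'_m) \cdot s_{H_jT}(\mathrm{MigrateOut}(j_m, j'_m)) \cdot H_j \\
H_{j'} &= r_{TH_{j'}}(\mathrm{MigrateIn}(j_m, j'_m)) \cdot \mathrm{MoveIn}(j_m, j'_m) \cdot s_{H_{j'}Q}(\mathrm{ACK}, j'_m) \cdot H_{j'} \\
T &= r_{H_jT}(\mathrm{MigrateOut}(j_m, j'_m)) \cdot \mathrm{Networking}(j_m, j'_m) \cdot s_{TH_{j'}}(\mathrm{MigrateIn}(j_m, j'_m)) \cdot T
\end{aligned}
$$

After the application of the encapsulation operator, this is the result of a Migrate OUT action:

$$
\begin{aligned}
\partial_H(W_i \parallel Q \parallel H_j \parallel H_{j'} \parallel T) &= c_{W_iQ}(\mathrm{ctr}_5(j_m)) \\
&\quad \cdot \big( c_{QH_j}(\mathrm{ctr}_5(j_m, j'_m)) \cdot \mathrm{MoveOut}(j_m, j'_m) \cdot c_{H_jT}(\mathrm{MigrateOut}(j_m, j'_m)) \\
&\qquad \cdot \mathrm{Networking}(j_m, j'_m) \cdot c_{TH_{j'}}(\mathrm{MigrateIn}(j_m, j'_m)) \cdot \mathrm{MoveIn}(j_m, j'_m) \\
&\qquad \cdot c_{H_{j'}Q}(\mathrm{ACK}, j'_m) \cdot c_{QW_i}(j'_m) \cdot \mathrm{Unbind}(\mathrm{IoT}(j_m)) \cdot \mathrm{Bind}(\mathrm{IoT}(j'_m)) \cdot s_{out}(\mathrm{IoT}(j'_m)) \\
&\qquad \triangleleft \mathrm{Select}(j_m) \triangleright c_{QW_i}(\mathrm{ACK}) \cdot s_{out}(\mathrm{IoT}(j_m)) \big) \\
&\quad \cdot \partial_H(W_i \parallel Q \parallel H_j \parallel H_{j'} \parallel T)
\end{aligned}
$$

Finally, the abstraction operator is applied to obtain the external behaviour of the model:

$$
\tau_I(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel H_{j'} \parallel T)) = \big( s_{out}(\mathrm{IoT}(j'_m)) + s_{out}(\mathrm{IoT}(j_m)) \big) \cdot \tau_I(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel H_{j'} \parallel T))
$$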
4.6 Action Kill VM It happens when a moving IoT device leaves the Fog environment by means of wireless physical channel OUT, not getting back to the Cloud environment. This fact makes that the IoT device does not need its associated VM anymore, so it may be terminated. Hence, the target here is to kill that VM located in the Fog environment, specifically into a host H j . The process herein is quite similar to that explained for the creation of initialization of a VM in the Fog, explained in action 3, although the operations to be done work the other way around. It is to be noted that the process of killing the VM is undertaken in the first place, and when it is done, the IoT devices move away from the Fog environment. Focussing on the modelling action, it is to be said the control message to prepare this type of migration is referred to as ctr 6 , because this action is going to be considered as action 6. Additionally, the process ‘K ill’ is going to be defined to undertake the termination of the VM. Anyway, here it comes the model for action 6: · Wi Wi = sWi Q ctr 6 ( jm ) · r QWi − · Unbind IoT ( jm ) · sout − ·Q Q = r Wi Q ctr 6 ( jm ) · Find( jm ) · s Q H j ctr 6 ( jm ) · r H j Q ACK, − · s QWi − H j = r Q H j ctr 6 ( jm ) · Kill( jm ) · s H j Q ACK, − · H j
After the application of the encapsulation operator, this is the result of a a Kill VM action: ∂ H Wi || Q || H j = cWi Q ctr 6 ( jm ) · Find( jm ) · c Q H j ctr 6 ( jm ) · Kill( jm ) ·c H j Q ACK, − · c QWi − · Unbind IoT ( jm ) · sout IoT , − · ∂ H Wi || Q || H j
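Finally, the abstraction operator is applied to obtain the external behaviour of the model:
\[ \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)\big) = s_{\mathrm{out},\mathrm{IoT}}(-) \cdot \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)\big) \]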
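As an informal illustration of how abstraction reduces the encapsulated expression of action 6 to its external behaviour, the following minimal Python sketch lists the steps of that expression in order and hides the internal ones. The tuple encoding, the set `INTERNAL` and the function name `external_behaviour` are assumptions made only for this sketch; they are not part of the algebraic model.

```python
# Each step of the encapsulated Kill VM expression as (kind, label).
# "c"   marks a communication on an internal channel,
# "op"  marks an internal operation (Find, Kill, Unbind),
# "ext" marks an action on the external wireless channel OUT.
KILL_VM_TRACE = [
    ("c", "Wi->Q: ctr6(jm)"),
    ("op", "Find(jm)"),
    ("c", "Q->Hj: ctr6(jm)"),
    ("op", "Kill(jm)"),
    ("c", "Hj->Q: ACK,-"),
    ("c", "Q->Wi: -"),
    ("op", "Unbind_IoT(jm)"),
    ("ext", "s_out,IoT(-)"),
]

# Assumed content of the abstraction set I: everything that is not external.
INTERNAL = {"c", "op"}


def external_behaviour(trace):
    """Hide every internal step and keep only the externally visible actions."""
    return [label for kind, label in trace if kind not in INTERNAL]


print(external_behaviour(KILL_VM_TRACE))
```

Only the send action on OUT survives, which matches the single visible action in the external behaviour obtained above.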
5 Verification of the Models
To wrap up, let us review the results obtained for the 6 actions studied and compare them with the behaviour of real systems for those same actions:
P. J. Roig et al.
Action 1: Cloud IN
• External behaviour of the model:
\[ \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)\big) = r_{\mathrm{in},\mathrm{IoT}}(k_m) \cdot \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)\big) \]
• External behaviour of the real system: An IoT device with an associated VM in the Cloud gets into the Fog domain:
\[ X = r_{\mathrm{in},\mathrm{IoT}}(k_m) \cdot X \]
Action 2: MIGRATE IN
• External behaviour of the model:
\[ \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel H_{j'} \parallel T)\big) = r_{\mathrm{in},\mathrm{IoT}}(j_m) \cdot \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel H_{j'} \parallel T)\big) \]
• External behaviour of the real system: An IoT device with an associated VM already in the Fog gets to another wireless relay:
\[ X = r_{\mathrm{in},\mathrm{IoT}}(j_m) \cdot X \]
Action 3: INIT VM
• External behaviour of the model:
\[ \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)\big) = r_{\mathrm{in},\mathrm{IoT}}(-) \cdot \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)\big) \]
• External behaviour of the real system: An IoT device without any associated VM gets into the Fog domain, so a new VM is assigned:
\[ X = r_{\mathrm{in},\mathrm{IoT}}(-) \cdot X \]
Action 4: Cloud OUT
• External behaviour of the model:
\[ \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)\big) = s_{\mathrm{out},\mathrm{IoT}}(k_m) \cdot \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)\big) \]
• External behaviour of the real system: An IoT device with an associated VM in the Fog gets out to the Cloud domain:
\[ X = s_{\mathrm{out},\mathrm{IoT}}(k_m) \cdot X \]
Action 5: MIGRATE OUT
• External behaviour of the model:
\[ \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel H_{j'} \parallel T)\big) = s_{\mathrm{out},\mathrm{IoT}}(j_m) + s_{\mathrm{out},\mathrm{IoT}}(j'_m) \cdot \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel H_{j'} \parallel T)\big) \]
• External behaviour of the real system: An IoT device with an associated VM already in the Fog moves to another wireless relay, so it may or may not migrate:
\[ X = s_{\mathrm{out},\mathrm{IoT}}(j_m) + s_{\mathrm{out},\mathrm{IoT}}(j'_m) \cdot X \]
Action 6: KILL VM
• External behaviour of the model:
\[ \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)\big) = s_{\mathrm{out},\mathrm{IoT}}(-) \cdot \tau_I\big(\partial_H(W_i \parallel Q \parallel C \parallel H_j \parallel T)\big) \]
• External behaviour of the real system: An IoT device with an associated VM in the Fog leaves and does not go to the Cloud:
\[ X = s_{\mathrm{out},\mathrm{IoT}}(-) \cdot X \]
In summary, the two recursive expressions in each pair are multiplied by the same factors. Hence, the expressions in each pair are rooted branching bisimilar, as they share the same actions and have the same branching structure. This is a sufficient condition for a model to be verified, and therefore all 6 proposed models are verified.
6 Conclusions
In this paper, a formal algebraic model of a generic Fog scenario with Cloud support for moving IoT devices has been presented. To this end, a generic topology has been introduced, integrating a range of wireless relays, an orchestrator, and a set of hosts, with a switching topology connecting them all together and the Cloud acting as a backup system. The study has focused on modelling the movement of VMs that is triggered when a moving IoT device connects to or disconnects from a wireless relay within the Fog ecosystem, accounting for a total of 6 different scenarios, which cover the most common cases. All those actions have been modelled by algebraic means and, in turn, verified.
References
1. Prabhu CSR (2017) Overview - fog computing and internet-of-things (IoT). In: EAI endorsed transactions on cloud systems 2017, issue 10, article 5, pp 1–24. China Communications Magazine, Co., Ltd., Beijing. https://doi.org/10.4108/eai.20-12-2017.154378
2. Yousefpour A, Ishigaki G, Jue JP (2017) Fog computing: towards minimizing delay in the internet of things. In: Narhstedt K, Zhu H (eds) 2017 IEEE 1st international conference on edge computing, pp 17–24. https://doi.org/10.1109/IEEE.EDGE.2017.12
3. Lee HG, Chang L (2015) Powering the IoT: storage-less and converter-less energy harvesting. In: 20th Asia and South Pacific design automation conference 2015, vol 1, pp 124–129. https://doi.org/10.1109/ASPDAC.2015.7058992
4. Habibi P et al (2020) Fog computing: a comprehensive architectural survey. IEEE Access 8:69105–69133. https://doi.org/10.1109/ACCESS.2020.2983253
5. Waqas M et al (2018) Mobility-aware fog computing in dynamic environments: understandings and implementation. IEEE Access 7:38867–38879. https://doi.org/10.1109/ACCESS.2018.2883662
6. Mukherjee M, Gorlatova M, Gross J, Aazam M (2020) Sustainable infrastructures, protocols, and research challenges for FOG computing. IEEE Access 8:110943–110946. https://doi.org/10.1109/ACCESS.2020.3000845
7. Sethi P, Sarangi SR (2017) Internet of things: architectures, protocols, and applications. J Elect Comput Eng 1:1–25. https://doi.org/10.1155/2017/9324035
8. Oh S, Kim JH, Fox G (2010) Real-time performance analysis for publish/subscribe systems. Future Gener Comput Syst 26:318–323. https://doi.org/10.1016/j.future.2009.09.001
9. Osanaiye O, Chen S, Yan Z, Lu R, Choo KR, Dlodlo M (2017) From cloud to fog computing: a review and a conceptual live VM migration framework. IEEE Access 5:8284–8300. https://doi.org/10.1109/ACCESS.2017.2692960
10. Mahmud R, Ramamohanarao K, Buyya R (2019) Latency-aware application module management for fog computing environments. ACM Trans Internet Technol 19(1–9):1–21. https://doi.org/10.1145/3186592
11. Yang H, Bai W, Yu A, Yao Q, Zhang J, Lin Y, Lee Y (2018) Bandwidth compression protection against collapse in fog-based wireless and optical networkings. IEEE Access 6:54760–54768. https://doi.org/10.1109/ACCESS.2018.2872467
12. Kaur P, Rani A (2015) Virtual machine migration in Cloud computing. Int J Grid Distrib Comput 8(5):337–342. https://doi.org/10.14257/ijgdc.2015.8.5.33
13. Bittencourt LF, Lopes MM, Petri I, Rana OF (2015) Towards virtual machine migration in fog computing. In: International Conference 3PGCIC 2015, vol 1, pp 1–8. https://doi.org/10.1109/3PGCIC.2015.85
14. Padua D (2011) Encyclopedia of parallel computing. Springer, Heidelberg. https://doi.org/10.1007/978-0-387-09766-4
15. Groote JF, Mousavi MR (2014) Modeling and analysis of communicating systems. MIT Press, Cambridge. https://dl.acm.org/citation.cfm?id=2628007
16. Bergstra JA, Klop JW (1985) Algebra of communicating processes with abstraction. Theor Comput Sci 37:77–121. Elsevier, Amsterdam. https://doi.org/10.1016/0304-3975(85)90088-X
17. Fokkink W (2000) Introduction to process algebra. Springer, Heidelberg. https://doi.org/10.1007/978-3-662-04293-9
18. Fokkink W (2007) Modelling distributed systems. Springer, Heidelberg. https://doi.org/10.1007/978-3-540-73938-8
A Novel Approach to Preserve DRM for Content Distribution over P2P Networks
Yogita Borse, Purnima Ahirao, Meet Mangukiya, and Shreeya Patel
Abstract The digital revolution has made the sharing of information very easy. Peer-to-Peer (P2P) technology plays a significant role in this, as millions of users exchange a large number of files concurrently. However, with the significant growth of P2P file sharing, copyrighted files are also actively exchanged among users. This results in illegal sharing of copyrighted content and hence in violation of copyright law, so copyright infringement has become a serious issue. In this paper, a DRM-protected content distribution system is proposed that uses a web browser-based P2P system for data transfer. The system provides hassle-free transfer of digital media content such as audio, video, and movies through the Web Real-Time Communications (WebRTC) API. It aims to provide the best user experience by selecting peers so as to minimize latency in digital content sharing, while imposing secured and restricted distribution to enforce Digital Rights Management (DRM).
Keywords Digital Rights Management · Peer-to-Peer · WebRTC · Content Distribution Network · Edge Computing · Secure Streaming
Y. Borse (B) · P. Ahirao · M. Mangukiya · S. Patel
Department of Information Technology, K J Somiaya College of Engineering, Mumbai, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. M. Thampi et al. (eds.), Applied Soft Computing and Communication Networks, Lecture Notes in Networks and Systems 187, https://doi.org/10.1007/978-981-33-6173-7_2

1 Introduction
A P2P network is a distributed network of computers (peers) connected via the internet. It is a decentralized system, meaning no central server is required for communication or data transfer. All the nodes are equal: everyone serves (seeds) to other peers, and everyone requests (leeches) from them. The role of each node therefore changes dynamically, depending on whether it is acting as a server for the requests of others, requesting a service as a client, or playing both roles at once. P2P has been used extensively for file sharing. But since it is a decentralized system, there is no control over file distribution. Digital media content such as songs, music, movies, and TV shows can be shared without anyone having to buy it from the original content creator. In other words, P2P makes piracy very easy. One person who has the content can share it with millions of other users, leading to huge losses for labels, digital studios, and so on.
To prevent this, companies came up with a technology called DRM, which usually meant that consumers had to install certain software from the vendor through which they could access the content. A music label, for example, would develop music player software with security features that takes care of verification, authentication, and authorization in the background and then plays the encrypted content. Such software usually relies on encryption and decryption to prevent unauthorized distribution of the content. It gets the content from the server in encrypted form and stores it in encrypted form. The content is decrypted only while it is being played, in Random Access Memory (RAM). This does not prevent users from sharing the files, but it prevents the recipients from being able to play, read, or watch the content. Most popular DRM systems share a common flow: getting the encrypted content from a Content Delivery Network (CDN), getting a license from the server, decrypting the content using the key obtained from the license server, and re-encrypting each frame after it has been played. DRM in browsers is implemented via the World Wide Web Consortium (W3C) standard of Encrypted Media Extensions (EME).
EME provides an interface that allows HTML5 video players to use DRM playback. DRM services such as Widevine, PlayReady, and FairPlay each use their own customized DRM implementation; no standard DRM implementation model is available. EME relies on a component called the Content Decryption Module (CDM) that handles decryption, license requests, and so on. These modules differ from service to service. The CDM is a secure sandboxed environment that runs in a privileged Trusted Execution Environment (TEE) to ensure that none of the decryption keys are left exposed in user space. Of course, such systems can still be reverse engineered and broken if someone figures out how the licensing is handled and how to get the keys from the server, and then mimics the behavior of the player in order to decrypt the content, intercept the decrypted content, and store it on disk, ready to be distributed without any rights or consent over any P2P file-sharing system. The practical aim is therefore to make the system so difficult to exploit that the resources required for exploitation make the attempt not worthwhile. The goal of the proposed system is to bring the best streaming performance to the users by complementing edge computing with P2P. It further ensures that P2P is used for its performance benefits while its main vulnerability, the lack of control over content, is patched by adding a secured DRM layer to it.
1.1 Literature Review
Zhang et al. [1] proposed a copyright protection solution for BitTorrent-like P2P systems. In their system, a tracker site responds to the authorization requests, computes re-encryption keys, and provides decryption keys to the leechers. To use the content, users have to get licenses from content owners through some payment mechanism. The tracker site needs a high-performance server, since it has to serve many simultaneous authorization requests, which poses challenges to the scalability and efficiency of P2P systems. The centralized authorization server is the main performance bottleneck; however, because it only connects peers and does not serve the content itself, content distribution still scales well.
Kitahara et al. [2] proposed a solution that used Bitcoin. In this method, the content provider generates coins for each license that is issued. The coins represent an authorized copy/license for accessing the content. Whenever a user buys the content from the content creator, a coin is generated and sent to the user's wallet. Only users holding coins for a particular content can decrypt and use it. The paper did not explain how the coins would be used to control the content or its encryption and decryption. A further problem is that a blockchain-based system requires mining and verification of transactions, which do not happen in real time, and no solution is proposed for how to actually implement the DRM.
Herzberg et al. [3] proposed a theory of white-box security, showing how white-box security can be implemented to enable trusted and secure execution on untrusted machines, which would solve many computer security problems, including DRM. In this way, one can be sure that the program is not compromised and that the client code requesting the content is the code that was actually written, and no other. This would help prevent a client from being modified to maliciously distribute the data it receives without checking the authorization of peers. It also helps prevent the scenario where the client program is altered to save the decrypted content locally.
Gu et al. [4] proposed a way to protect digital content over P2P systems by creating a PLI (Public License Infrastructure, like PKI but for licenses instead of keys). The core idea is to encrypt the content with a key; the decryption key, which would normally be stored in licenses, is stored on the PLI, with a partial key held by each server in a group of servers. Each server has to authenticate the user before the user can completely decrypt the content.
Qiu et al. [5] proposed a system based on re-encryption. Content creators upload content to the system in encrypted form, and this encrypted content is distributed via P2P. A user who wishes to use the content sends an authorization request, containing the encrypted content and the user's re-encryption keys, to an authorization proxy. The authorization proxy checks the authorizations and, if the user is authorized, re-encrypts the ciphertext with the user's key, making it decryptable by the user. The re-encrypted content is then transferred back to the user. The biggest drawback is the significant network overhead on both the user and the server side: users have to download the content twice to be able to use it, which is not an ideal solution for streaming purposes.
Bellini et al. [6] discuss using a P2P distributed network along with a DRM system. They describe granting authorization using different servers, such as a license server that stores licenses and provides authorization for devices/players. An anonymous file-sharing system is proposed by Chen and Wu [7], which uses a Diffie–Hellman group and a threshold scheme to provide anonymity of users, efficient multimedia file sharing, and effective computation and communication costs. A trustworthy P2P system is proposed by Kuntze et al. [8], who add security and trust to the existing P2P protocol as an extension, using the BitTorrent file-sharing technique. The combination of CDN and P2P is discussed by Zhao et al. [9], who propose using a CDN together with P2P, as it helps in decreasing the load on the server. Their system focuses on P2P cache management and replacement strategy and on choosing the best CDN network for media broadcasting and streaming. Ohzahata and Kawashima [10] analyzed the behavior of peers joining the P2P network Winny, which is a common approach to controlling DRM.
2 Challenges and Contributions
The following challenges in achieving DRM over a P2P network without affecting the user experience were considered while designing the proposed system.
• In a pure P2P system, it is difficult to enforce control because it is a distributed system with no hierarchy. Peer locations are public knowledge in a P2P system, which makes it difficult to prevent data leaks through malicious nodes: they can simply publish their locations, and other peers can then get data from them.
• Even if the location/IP address of the peers is shared only after authenticating the user, the authenticated user may itself choose to behave as a malicious node and leak the connection information. Without a time-based parameter, actions are easy to replay, so the system is vulnerable to replay attacks.
• If either seeds or peers can initiate connections autonomously, the system becomes vulnerable. Authorizations cannot be delegated to peers.
Our contributions, as presented in this proposed system, are:
• Peers cannot connect directly to each other without having been authenticated and authorized through the WebRTC signaling server, which prevents the data leak problem without compromising the benefits of P2P.
• Session keys are used for protecting the content, so keys are not reused and replay attacks are avoided.
• Blending edge computing and P2P introduces a hierarchy into the distributed network, which helps ensure that no peer can bypass the protocol and behave maliciously.
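To make the replay concern concrete, the following minimal Python sketch shows one way a time-based parameter and a per-session nonce could be bound to an authorization. It is only an illustration under stated assumptions: the paper does not specify a ticket format, so the field layout, the 30-second lifetime, the shared `SERVER_KEY`, and the names `issue_ticket` and `Peer.accept` are all hypothetical.

```python
import hashlib
import hmac
import secrets
import time

# Hypothetical secret shared by the signaling server and the seeding peers.
SERVER_KEY = secrets.token_bytes(32)


def issue_ticket(user, content, ttl=30, now=None):
    """Signaling-server side: bind user, content, a fresh nonce and an expiry time.

    Assumes user and content identifiers do not contain the '|' separator.
    """
    now = time.time() if now is None else now
    nonce = secrets.token_hex(16)
    expires = int(now) + ttl
    payload = f"{user}|{content}|{nonce}|{expires}"
    tag = hmac.new(SERVER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload, tag


class Peer:
    """Seeding peer that accepts each valid ticket at most once."""

    def __init__(self):
        self.seen_nonces = set()

    def accept(self, payload, tag, now=None):
        now = time.time() if now is None else now
        expected = hmac.new(SERVER_KEY, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, tag):
            return False  # forged or altered ticket
        _user, _content, nonce, expires = payload.split("|")
        if now > int(expires) or nonce in self.seen_nonces:
            return False  # expired, or a replay of an earlier ticket
        self.seen_nonces.add(nonce)
        return True
```

With such a scheme, a captured ticket is useless once it has been presented or has expired, which is the property the session-key contribution relies on.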
3 Proposed System
The proposed system is a web-based movie streaming system that aims at improving streaming performance by combining edge computing with P2P data transfers. It is implemented as an application that uses in-browser P2P data transfers while imposing constraints on those transfers in order to provide DRM protection. The approach can easily be extended to movies, music streaming services, data transfers in subscription-based systems, etc.
3.1 System Approach
A web browser provides an environment in which websites show content and make it interactive and dynamic via JavaScript, while imposing security constraints on what they can do. Web browsers provide APIs for tasks such as reading a file, taking a picture from a webcam, or playing a song or a movie. If these things were to be done without a browser, one would have to handle file management and camera management, interact with drivers, take care of audio and video codecs, and much more; browsers abstract all of this behind easy-to-use APIs for creating web applications. One such API is the WebRTC API (Web Real-Time Communications), which is used for real-time communications such as video/voice calls, live video streaming, and screen sharing. These real-time communications take place directly from one client to another, i.e., they work like a P2P connection. The APIs provided by WebRTC have been leveraged to implement a movie streaming service that aims at improving performance by combining edge computing with P2P data transfers, while also imposing DRM constraints on the distribution of content to ensure that only authorized users can access it.
3.2 Physical Architecture/Layout
There are four major components in the system, as shown in Fig. 1.
• Frontend Server: The frontend server is the application server that handles all the user requests initiated by the browser, excluding WebRTC communication. It does everything other than content distribution (the movie data stream). It renders web pages for users based on their requests, their login state, etc.
• WebRTC Signaling Server: The WebRTC signaling server manages WebRTC connection initialization between peers and checks authorizations before establishing communication channels for data transfers. It also weighs peers according to the distance between them, their load, etc., so that clients connect to the peers offering the best performance.
• CDN Network: The CDN network is a network of servers distributed geographically in every major city and/or country to reduce the latency between the peers and servers. The nodes in the CDN network are viewed as 24 × 7 peers that stream the content in order to ensure uninterrupted service to the authorized users.
• Users/Clients: The client browsers that interact with and consume our system.
Fig. 1 Physical architecture/layout
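The peer weighting performed by the signaling server can be illustrated with a short Python sketch. The paper only states that peers are weighed by distance, load, and similar factors; the linear cost function, the weights, and the names `PeerInfo` and `rank_peers` below are assumptions chosen for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class PeerInfo:
    peer_id: str
    distance_km: float    # estimated distance to the requesting client
    load: float           # fraction of upload capacity in use, 0.0 to 1.0
    is_cdn: bool = False  # CDN nodes act as always-available peers


def rank_peers(peers, distance_weight=1.0, load_weight=500.0):
    """Return peers ordered from most to least preferable (lowest cost first).

    The cost is an assumed linear mix of distance and load; a real deployment
    would tune or replace it.
    """
    def cost(peer):
        return distance_weight * peer.distance_km + load_weight * peer.load

    return sorted(peers, key=cost)


candidates = [
    PeerInfo("far", 900, 0.1),
    PeerInfo("near_busy", 50, 0.9),
    PeerInfo("near_idle", 60, 0.1),
    PeerInfo("cdn", 300, 0.2, is_cdn=True),
]
print([p.peer_id for p in rank_peers(candidates)])
```

In this example the nearby idle peer is preferred, the CDN node comes next, and the nearby but heavily loaded peer is ranked below both, which reflects the idea that distance alone should not decide the choice.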
3.3 The P2P Network
The web browser is software used to access the internet and browse the web. Initially, browsers were used to make it simpler to access documents over a network or the internet. As time passed, people realized that this technology and the APIs it exposed could also be used to create application experiences that go beyond plain reading. Browsers gradually implemented and exposed more and more APIs, allowing web developers to create almost any application they require, and new APIs and standards continue to be introduced. While exposing all these APIs, browsers have to be very careful about security: any security vulnerability can put millions of browser users at risk. One such security feature is that browsers cannot act as servers. They are supposed to be, and were developed to be, clients only. In addition, routers on home networks, office networks, etc., block all incoming connections to secure those networks. Connecting to a client is therefore not as simple as opening a socket given an IP address, as it would be in a native application using OS-level APIs. These gaps were bridged by the WebRTC APIs, first introduced in 2011. WebRTC requires a signaling server that clients communicate with in order to locate each other and obtain addresses they can connect to, even behind routers, which requires Network Address Translation (NAT) traversal. This signaling server can be compared to a tracker site in BitTorrent systems. WebRTC is used for establishing data channels between peers. The signaling server prevents peer connections whenever a peer tries to access content for which it is not authorized. This helps impose a part of DRM, since there is no way to connect to peers without the signaling server: one peer cannot simply connect to another by sending a request to its IP address, because the system runs in browsers whose security features limit this functionality. The clients streaming a movie X become peers for other clients also streaming movie X. The peers for client A are the clients that hold in their buffers the fragment of the movie that client A needs. That is, all the clients that have fragment Z in their buffers become seeders of that fragment for other or newly joined clients. A peer joins the network when it starts streaming a particular movie and leaves the network when the user closes the browser tab or finishes streaming. Figure 2 shows the flowchart of streaming a movie. The flow is as follows:
• User opens the browser.
• User visits streaming website: On visiting any page, the browser sends a request for that page and renders the HTML returned by the server. Here, the browser sends the request to the frontend server.
• Frontend server returns the rendered HTML.
• Browser renders the HTML: HTML is used to define the structure and semantics of the content. Each HTML tag is used for a particular reason and is rendered in a particular way conforming to predetermined standards. CSS is used to style the page. JavaScript is used to add dynamic behavior to the webpage.
• Browser loads the JavaScript specified in the HTML into a JavaScript VM and starts executing it, providing an execution environment that implements the standard Web JavaScript APIs.
• Player takes care of background coordination with the network and peers for data transfer, access, and playing the content.
• Player sends request for content: The player sends a request for content to the WebRTC signaling server.
• WebRTC signaling server validates authorizations: The WebRTC signaling server uses a shared database to validate the authorizations of the user requesting content, that is, whether or not the user has bought the content (this simplified check is used to authorize the user in place of a license). If the user is not authorized, an unauthorized-user error is returned, in which case the player redirects the user to the payment page. The signaling server maintains a list of active peers, the content they are streaming, and the parts of the content they hold, which is used to build the final list of suitable peers for a particular user.
• WebRTC signaling server notifies the peers: The WebRTC signaling server notifies the peers that user X is authenticated and is authorized to access a particular content.
• WebRTC signaling server connects user with peers/CDN peers: A handshake takes place; the peers validate the user's identity and the requested content against the notification previously received from the signaling server.
– Handshake completes.
– Data streaming between user and peers starts.
– Player renders the data stream as the video plays.
– Users are able to view the streamed movie.
Fig. 2 Streaming flow diagram
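The authorization, peer selection, notification, and handshake steps of the flow above can be sketched in plain Python. This is a simplified in-memory model rather than WebRTC code: the classes `SignalingServer` and `SeedPeer`, the `purchases` dictionary standing in for the shared database, and the `(movie, fragment)` bookkeeping are assumptions made for illustration.

```python
class Unauthorized(Exception):
    """Raised when the user has not bought the requested content."""


class SeedPeer:
    def __init__(self, peer_id, fragments):
        self.peer_id = peer_id
        self.fragments = set(fragments)  # (movie, fragment index) pairs in the buffer
        self.expected = set()            # (user, movie) pairs announced by the server

    def notify(self, user, movie):
        # Called by the signaling server before the user connects.
        self.expected.add((user, movie))

    def handshake(self, user, movie):
        # Accept only users previously announced for this content.
        return (user, movie) in self.expected


class SignalingServer:
    def __init__(self, purchases, peers):
        self.purchases = purchases  # stand-in for the shared database: user -> movies
        self.peers = peers          # active peers and CDN nodes

    def request_content(self, user, movie, fragment):
        if movie not in self.purchases.get(user, set()):
            raise Unauthorized(f"{user} has not bought {movie}")
        suitable = [p for p in self.peers if (movie, fragment) in p.fragments]
        for peer in suitable:
            peer.notify(user, movie)
        return suitable


seed1 = SeedPeer("seed1", [("X", 0)])
seed2 = SeedPeer("seed2", [("X", 1)])
server = SignalingServer({"alice": {"X"}}, [seed1, seed2])
peers_for_alice = server.request_content("alice", "X", 0)
print([p.peer_id for p in peers_for_alice])
```

The point of the sketch is that a seeding peer never accepts a handshake on its own authority; it only honours requests that the signaling server has announced, which mirrors the statement that authorizations cannot be delegated to peers.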
3.4 The Secured DRM
The DRM part of the system is maintained by the video player, which acts as the client that handles communication with the signaling server and other peers. The player consists of video playing modules, cryptography modules, and WebRTC modules. The functions of the player are as follows: maintaining a look-ahead buffer, loading encrypted content into the buffer, storing and managing the encryption keys used for the different parts of the buffer coming from different peers, decrypting the part of the content that is currently being played (in-frame), and re-encrypting the decrypted content after it has finished playing. This is a simple re-encryption with the same key, just to ensure that plaintext is not left in memory. It is different from partial re-encryption schemes, which require a pool of servers that each re-encrypt parts of the content.
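The buffer handling described above can be sketched as follows. The sketch is a toy under stated assumptions: the XOR keystream built from SHA-256 is only a stand-in for a real cipher and must not be used for protection, and the `Player` class with its `load` and `play` methods is a hypothetical structure, not the authors' player. It shows the ordering of the steps only, since Python may keep temporary copies of decrypted bytes in memory.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (NOT secure): XOR data with a SHA-256 based keystream.

    Applying it twice with the same key restores the original bytes.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class Player:
    def __init__(self):
        self.buffer = {}  # fragment index -> bytearray, encrypted unless in-frame
        self.keys = {}    # fragment index -> session key from the supplying peer

    def load(self, index, ciphertext, key):
        """Store an encrypted fragment and the key that belongs to it."""
        self.buffer[index] = bytearray(ciphertext)
        self.keys[index] = key

    def play(self, index, render):
        """Decrypt one fragment, hand it to the renderer, then re-encrypt it."""
        key = self.keys[index]
        fragment = self.buffer[index]
        fragment[:] = keystream_xor(key, bytes(fragment))      # decrypt in place
        try:
            render(bytes(fragment))
        finally:
            fragment[:] = keystream_xor(key, bytes(fragment))  # same key, re-encrypt


session_key = b"k1"
frame = b"frame-0 data"
encrypted = keystream_xor(session_key, frame)

player = Player()
player.load(0, encrypted, session_key)
shown = []
player.play(0, shown.append)
```

After `play` returns, the buffer again holds only the encrypted fragment, while the renderer has received the plaintext frame exactly once.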
4 Discussion
Based on the lessons learned from the literature and the challenges listed above, the proposed system aims to address those challenges. Table 1 lists the features of the proposed system and describes how each of them is achieved.
Table 1 Features of proposed system
Features | Description
Prevention of replay attack | Data is protected using session keys
Prevention against malicious peer | As data is protected using session keys, forwarded data cannot be rendered
Prevention against data leak problem | Data is encrypted and shared with authorized peers only
DRM protection | Data is decrypted and rendered by the player of authorized peers only
Fault tolerance | Edge computing and P2P are used as basic building blocks, which inherently provides fault tolerance
Scalability | New users can register to the CDN and become part of the system
User experience | As peers are selected based on the distance ranking, latency is minimal
Load balancing | Content delivery is done by the CDN as well as by peers
5 Conclusion
The paper proposes a way to improve content delivery performance beyond edge computing by combining P2P with edge computing. A simplified version of the system has been implemented in which, instead of licensing, data uploaded by the user is considered for authorization. When a user buys a particular content item, the purchase is recorded in a database, which the WebRTC signaling server consults before setting up the peer connection for the requested content delivery. All registered users are assumed to have bought the content, so this would be just one more check in the system. The implemented simplified check works as a proof of concept that connections are controlled and that more conditions can be added. The proposed system also reduces the load on the CDN by allowing content delivery through peers. Peers are selected using the peer ranking feature in order to minimize latency. As peers cannot connect to each other without an initial check by the WebRTC signaling server, security is not traded off for performance benefits, and DRM remains intact and in place.
References
1. Zhang X, Liu D, Chen S (2008) Towards digital rights protection in BitTorrent-like P2P systems, 15th Multimed. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.115.4421&rep=rep1&type=pdf
2. Kitahara M, Kawamoto J, Sakurai K (2014) A method of digital rights management based on Bitcoin protocol. https://doi.org/10.1145/2557977.2558034
3. Herzberg A, Shulman H, Saxena A, Crispo B (2009) Towards a theory of white-box security. Emerging challenges for security, privacy and trust. 297:342–352. https://doi.org/10.1007/978-3-642-01244-0_30
4. Gu G, Zhu BB, Li S, Zhang S (2003) PLI: a new framework to protect digital content for P2P networks. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinform) 2846:206–216. https://doi.org/10.1007/978-3-540-45203-4_16
5. Qiu Q, Tang Z, Yu Y (2011) A decentralized authorization scheme for DRM in P2P file-sharing systems. In: 2011 IEEE Consumer Communications and Networking Conference CCNC'2011, pp 136–140. https://doi.org/10.1109/CCNC.2011.5766438
6. Bellini P, Nesi P, Pazzaglia F (2014) Exploiting P2P scalability for grant authorization in digital rights management solutions. Multimed Tools Appl 72(2):1611–1637. https://doi.org/10.1007/s11042-013-1468-y
7. Chen Y-M, Wu W-C (2014) An anonymous DRM scheme for sharing multimedia files in P2P networks. Multimedia Tools Appl 69(3):1041–1065. https://doi.org/10.1007/s11042-012-1166-1
8. Kuntze N, Rudolph C, Fuchs A (2010) Trust in peer-to-peer content distribution protocols. pp 76–89. https://doi.org/10.1007/978-3-642-12368-9_6
9. Zhao J, Yu F, Zhang J, Sun S, Guo J, Gong J, Zhao Y (2010) The research of key technologies of streaming media digital resources transmission based on CDN and P2P. 317:442–448. https://doi.org/10.1007/978-3-642-12220-0_64
10. Ohzahata S, Kawashima K (2011) An experimental study of peer behavior in a pure P2P network. J Syst Softw 84(1):21–28. https://doi.org/10.1016/j.jss.2010.08.025
Secure Multimodal Biometric Recognition in Principal Component Subspace
S. P. Ragendhu and Tony Thomas
Abstract In most biometric recognition systems, the type of features extracted from the biometric template depends on the biometric modality. A generalized feature representation is convenient and efficient in a multimodal biometric system, as it avoids the complexity involved in fusing or protecting the heterogeneous features corresponding to different biometric modalities. The dimensionality of the features can also be reduced effectively if generalized feature vectors can be extracted without compromising the matching performance. This work proposes two schemes: (i) a generalized yet efficient matching technique based on the cosine similarity of the principal component subspaces and (ii) a protection scheme that fuses the feature vectors of multiple modalities in the principal components domain and performs matching in that protected domain. The key advantage of the proposed protection scheme over existing schemes is that it does not require a user-specific key for ensuring irreversibility. A user-specific key is used only for ensuring the revocability of the protected template. Results show that the proposed method achieves a matching performance of 98.57% for a multimodal system with face, fingerprint, and finger vein templates in the fused domain.
S. P. Ragendhu (B)
Indian Institute of Information Technology and Management-Kerala (Research Centre of Cochin University of Science and Technology, India), Kazhakkoottam, India
e-mail: [email protected]
T. Thomas
Indian Institute of Information Technology and Management-Kerala, Kazhakkoottam, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. M. Thampi et al. (eds.), Applied Soft Computing and Communication Networks, Lecture Notes in Networks and Systems 187, https://doi.org/10.1007/978-981-33-6173-7_3

1 Introduction
Biometrics is among the most reliable available solutions for identifying or authenticating a person. Unimodal systems utilize the discriminating capability of only one modality for identifying or authenticating an individual, whereas multimodal biometric systems utilize the benefits of more than one modality to recognize a person. Multimodal biometric systems are increasingly needed, as unimodal biometric systems lag behind in security and recognition accuracy. Multimodal biometric systems are commonly employed for person authentication and identification even in high-security applications such as military applications and border control [4]. Government initiatives such as the Aadhaar scheme (UIDAI) of the Government of India also promote multimodal biometric-based person authentication. Multimodal systems have the additional advantages of flexibility and increased user convenience.

The main design issue in a multi-biometric system is dealing with the complexity of the system [26, 30, 40]. The design of a multi-biometric system must be customized to handle the representation schemes of the modalities included in that system. Most feature recognition schemes are designed for a particular biometric trait and support only a specific representation. For example, point set representation is widely used for fingerprint minutiae [6]; binary string representation is commonly used for iris patterns [5]; vector representation (Eigenfaces) is often employed in face biometrics [44]. These modality-dependent feature representations have proved efficient in unimodal systems. In multimodal systems, however, they cause the additional overhead of combining different recognition schemes using suitable fusion techniques at appropriate levels.

The potential risk associated with any biometric recognition system is that users are obliged to reveal their biometric data, which is a highly sensitive piece of personal information [11]. The security risk is even higher in multimodal systems, as the amount of sensitive data stored in these systems is larger than in unimodal systems.
Since almost all commonly used feature representations deal with local distinguishing features of each modality, an intruder can feasibly reconstruct the original template from these features with minimal effort [1, 27, 28]. Cancellable biometrics and biometric cryptosystems are protection schemes designed specifically to accommodate the special characteristics of biometric templates. However, a major drawback of prevailing template protection mechanisms is that each of them is designed to handle a particular modality [25, 29]. For example, fuzzy vault is suitable for fingerprint minutiae [21], while fuzzy commitment is appropriate for iris code [24]. It is cumbersome to employ different protection mechanisms for different biometric traits in a multimodal biometric system [34].

The use of statistical features can eliminate the problem of heterogeneous features, as they can be generalized to any biometric template. Local features can slightly outperform global statistical features in matching accuracy, as they are specific to a particular modality. However, a multimodal system holds more information about the person (data from more than one biometric modality), and its performance is the combined performance of the unimodal modules associated with the system. Hence, statistical features are more suitable for multimodal systems than for unimodal systems [26, 39]. In addition, since only global statistical features are used for authentication, there is no need to store the intrinsic details of the biometric data. Hence, the reconstruction of templates from the features can be prevented to an extent.
This work is an extension of the work by the authors [22], where the principal component similarity metric is used in fingerprint recognition. In this work, we analyze the performance of principal component-based schemes for other biometric modalities and their suitability for secure multimodal authentication. The similarity of biometric templates is calculated by estimating the cosine similarity of the corresponding principal component subspaces [41]. The calculations are simple and straightforward, which makes the method simpler and faster. Multi-biometric template fusion is proposed for template security: the principal components of one modality are projected using the principal components of another modality through orthogonal projections, and the similarity calculation is done in this projected domain. This not only ensures the security of the system but also reduces the template size.

The rest of this paper is organized as follows. Section 2 discusses related literature in the area of multimodal biometric recognition. The biometric template matching scheme based on principal component similarity and the proposed protection scheme for multimodal templates are explained in Sect. 3. Section 4 analyzes the experimental results and performance of the proposed matching scheme for individual modalities (face, fingerprint, and finger vein) as well as for fused feature vectors. The security analysis of the proposed protection scheme is described in Sect. 5. Section 6 summarizes the conclusions drawn from the work.
2 Related Works
In multimodal biometric systems, the protection schemes proposed in the literature can generally be divided into two categories:
1. A specific protection mechanism for each modality, with fusion at score or decision level;
2. A generalized protection mechanism for all the modalities (with modality-specific feature representation), which projects the templates to an encrypted domain by fusing the feature vectors of different modalities.
Works in both categories use the concept of cancellable biometrics in one way or another to ensure the non-invertibility and revocability of the templates. However, there are many security pitfalls and complexity issues in multi-biometric template protection [23, 34].

The majority of the works in multimodal template protection are based on feature level fusion using different techniques. A watermarking-based template protection for face and fingerprint templates is proposed by Nafea et al. [19]. In their scheme, the templates are fused using Discrete Wavelet Transform and Singular Value Decomposition with user-specific keys. However, the security of the system depends solely on the user-specific key, and the stolen-token scenario is not explained. Yang et al. [42] propose template protection for fingerprint and finger vein templates using feature level fusion based on a technique called enhanced partial discrete Fourier transform. The computational complexity of the fusing scheme is comparatively high in this work. Gomez et al. [8] discuss a homomorphic encryption-based template protection. Experiments were done with different types of fusion schemes (feature, score, and decision levels) in multimodal systems. The results show that feature level fusion gives better performance than the other schemes. However, there is always a trade-off between complexity and accuracy in multimodal systems. In the work by Raghavendra et al. [31], Bloom filters are applied individually to face and periocular templates for protection, and score level fusion is proposed for multimodal recognition. However, there are works demonstrating the vulnerability of Bloom filters in template protection [2]. Walia et al. [36] proposed an adaptive graph-based feature level fusion, after computing modality-specific feature vectors based on a set of key images [34]. This scheme also requires storing the set of key images corresponding to each person for authentication, which is an additional overhead. The same concept is implemented using deep features by the same authors in [35]. Keshav et al. [9] proposed a feature level fusion scheme that projects the original features to a random plane using a user-specific key. The projected points are transformed to cylindrical coordinates to obtain a combined cancellable template. This scheme also depends on a user-specific key for generating secure templates.

Many works based on projecting original biometric templates using orthonormal matrices have been proposed for the protection of biometric templates [14, 32, 38]. Biohashing is one of the important schemes in this category [12]. However, the problem with all such methods is that the transformation matrix may be available to the intruder.
Another important work in the domain of cancellable biometrics is biophasor, which generates a cancellable template by mixing the original template with a user-specific random number [33]. The irreversibility of the template is independent of the availability of user-specific tokens. We propose a security scheme that combines the advantages of both the biohashing and biophasor schemes.
3 Proposed Method
The proposed work is based on the fact that data can be optimally represented in the principal components' domain, as the principal components are orthonormal vectors. Equation 1 represents the data as a linear combination of principal components:

$$x \approx \sum_{i=1}^{m} w_i a_i + c, \tag{1}$$

where $a_i$, $i = 1, 2, \ldots, m$, represent the principal components, $w_i$ represent the weights associated with each principal component, and $c$ is the minimal squared reconstruction error while reconstructing the data $x$. If the dimensionality of the data $x$ is $n$, Eq. 1 shows how principal components can be used to reconstruct $x$ in a dimensionally reduced space ($m < n$).

Principal Component Analysis (PCA) is mainly used as a tool for dimensionality reduction [18]. In this paper, we propose another application of principal components, namely measuring the similarity between biometric templates [13, 41]. This work is an extension of the previous work by the authors [22] and discusses the performance of the principal component similarity metric in different biometric modalities and in the fused domain. A biometric image can be considered as multivariate data, and hence the similarity of templates can be calculated based on the similarity of principal component subspaces [37]. Principal components essentially represent an optimized coordinate system for the given data [15]. The process of extracting principal components for the proposed method is explained in the steps given below.

Step 1: Normalize the input image $I_{(r \times c)}$ by subtracting the mean image of size $(r \times c)$ to obtain $\tilde{I}_{(r \times c)}$.

Step 2: Calculate the covariance matrix of $\tilde{I}_{(r \times c)}$ using either Eq. 2 or Eq. 3 (depending on the need for dimensionality reduction):

$$\mathrm{CovIm}_{(r \times r)} = \tilde{I}_{(r \times c)} \times \tilde{I}^{T}_{(c \times r)}, \tag{2}$$

$$\mathrm{CovIm}_{(c \times c)} = \tilde{I}^{T}_{(c \times r)} \times \tilde{I}_{(r \times c)}. \tag{3}$$

Step 3: Find the eigenvectors $e_i$, $i = 1, 2, \ldots, c$ or $r$, corresponding to the eigenvalues $\lambda_i$, $i = 1, 2, \ldots, c$ or $r$ (sorted in descending order), of the covariance matrix. The eigenvectors $e_i$ represent the principal components, and the amount of variance is given by the corresponding eigenvalues $\lambda_i$.
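The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the "mean image" of Step 1 is assumed here to mean per-column means, the covariance follows Eq. 3, and only the leading principal component is extracted, using power iteration instead of a full eigendecomposition.

```python
import math

def center_columns(image):
    """Step 1 (assumed form): subtract each column's mean from that column."""
    rows, cols = len(image), len(image[0])
    means = [sum(image[r][c] for r in range(rows)) / rows for c in range(cols)]
    return [[image[r][c] - means[c] for c in range(cols)] for r in range(rows)]

def covariance_cc(centered):
    """Step 2, Eq. 3: the (c x c) matrix obtained as transpose(I~) times I~."""
    rows, cols = len(centered), len(centered[0])
    return [[sum(centered[r][i] * centered[r][j] for r in range(rows))
             for j in range(cols)] for i in range(cols)]

def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def leading_component(cov, iterations=500):
    """Step 3 for the largest eigenvalue only, via power iteration."""
    n = len(cov)
    v = [float(i + 1) for i in range(n)]          # arbitrary non-symmetric start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:                           # zero matrix: nothing to extract
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))  # Rayleigh quotient
    return eigenvalue, v

# Toy 4 x 3 "image" used only for illustration.
image = [[2, 0, 1], [4, 1, 3], [6, 1, 5], [8, 2, 7]]
cov = covariance_cc(center_columns(image))
eigenvalue, component = leading_component(cov)
```

Power iteration is chosen only to keep the sketch dependency-free; a numerical library with a symmetric eigensolver would return all components at once, sorted by eigenvalue.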
3.1 Similarity Measure Based on Principal Components
If $X_1$ and $X_2$ represent two biometric templates of size $p \times q$, PCA can model $X_1$ and $X_2$ using $k$ principal components each ($k$ […]

Stability Certification of Dynamical Systems: Lyapunov Logic Learning Machine

[…] 0, is called Lyapunov surface or level surface. As $c$ decreases, the level surface shrinks to the equilibrium point, identified in this paper as the origin. A trajectory that crosses the Lyapunov surface $V(x) = c$ moves, because of condition (3), inside the set $S_c = \{x \in \mathbb{R}^n \mid V(x) < c\}$ without a chance to get out again, approaching the origin as time progresses. A Lyapunov function should be regarded as a certificate that guarantees the stability of a dynamical system. For a stable system, several Lyapunov functions can be constructed, with different shapes and degrees. The most common construction, introduced in Sect. 2.2, uses the Sum of Squares (SOS) approach.

2.2 SOS Lyapunov Functions
In general, finding a Lyapunov candidate $V$ is a difficult task. There exist stability analysis tools based on computational methods. The Sum of Squares (SOS) approach [24] generalizes a well-known algorithmic tool in linear control theory. This method restricts Lyapunov functions $V(x)$ to be polynomial; thus, SOS should be used only for polynomial dynamical systems. SOS Lyapunov functions are constructed as $V(x) = Z(x)^T Q Z(x)$, where $Q$ is an unknown positive semi-definite matrix and $Z(x)$ is a vector of a priori fixed monomial features in the elements of $x$, with degree $\le d$. Proving that $V(x)$ is a sum of squares thus becomes a Semi-Definite Program (SDP): the problem of finding a $Q \ge 0$. Suppose that the desired Lyapunov function $V(x)$ is of degree $2d$; in [24] the authors define a supporting polynomial $\varphi(x) = \sum_{i=1}^{n} \sum_{j=1}^{d} \xi_{ij} x_i^{2j}$, with $\xi_{ij} > 0$ for all $i$ and $j$. The SDP then seeks a polynomial $V(x)$ such that $V(0) = 0$ and

$$V(x) - \varphi(x) \ \text{is SOS}; \tag{4}$$

$$-\dot{V}(x) = -\sum_{i=1}^{n} \frac{\partial V}{\partial x_i} f_i(x) \ \text{is SOS}. \tag{5}$$

These constraints, exploited in the SDP, ensure the positive-definiteness of $V$ and the negative-definiteness of $\dot{V}$. The choice of monomials in $Z(x)$ is the key to outlining the ROA using an SOS Lyapunov function $V(x)$. Thus, with a suitable choice of $Z(x)$ and the SOS approach, we can prove that the equilibrium at zero of a dynamical system is stable in the sense of Lyapunov. Further, starting from the Lyapunov function $V(x)$ found, we can estimate the ROA $S_c$.
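The Gram-matrix form $V(x) = Z(x)^T Q Z(x)$ can be made concrete with a small numerical sketch. The example below is an assumption made for illustration only: it takes the quadratic $V(x) = 1.5x_1^2 - x_1x_2 + x_2^2$ with $Z(x) = [x_1, x_2]$, writes the matching symmetric $Q$ by hand, and checks positive definiteness with Sylvester's criterion. It does not solve an SDP; in practice $Q$ is the unknown returned by the solver.

```python
def quadratic_form(Q, z):
    """Evaluate z^T Q z for a square matrix Q and a vector z."""
    n = len(z)
    return sum(z[i] * Q[i][j] * z[j] for i in range(n) for j in range(n))

def is_positive_definite_2x2(Q):
    """Sylvester's criterion for a symmetric 2 x 2 matrix."""
    return Q[0][0] > 0 and Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0] > 0

# Hypothetical Gram matrix: the off-diagonal entries split the -x1*x2 term evenly.
Q = [[1.5, -0.5],
     [-0.5, 1.0]]

def V(x1, x2):
    """V(x) = Z(x)^T Q Z(x) with the monomial vector Z(x) = [x1, x2]."""
    return quadratic_form(Q, [x1, x2])
```

Since a positive definite $Q$ factors as $Q = L^T L$, the form $Z^T Q Z = \|L Z\|^2$ is a sum of squares, which is why the search for $V$ reduces to a search over such matrices.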
2.3 Classical ROA Estimation
Given a Lyapunov function $V(x)$, this section considers the problem of estimating the ROA of an equilibrium point, identified before as the origin, i.e., $\bar{x} = 0$. The aim is to find a set in which the derivative of the Lyapunov function $V(x)$ is negative, called a positively invariant subset. A typical approach is to solve the following optimization problem:

$$c = \inf_{x \in \mathbb{R}^n} V(x) \quad \text{subject to} \quad \dot{V}(x) = 0, \; x \neq 0. \tag{6}$$

In this case, the invariant subset is given by the connected component of the Lyapunov level set $S_c = \{x \in \mathbb{R}^n \mid V(x) < c\}$. Intuitively, since the region $S_c$ is invariant, the Lyapunov function $V(x)$ can only decrease in this region. With a positive-definite $V(x)$, trajectories starting in $S_c$ remain in this safe level set and converge to the equilibrium point $\bar{x} = 0$.
3 Method
Finding a ROA with a data-driven approach is simple: trajectories are sampled via simulation and the results are mapped into a regular supervised learning problem. The outcome of this learning problem is the foundation of the present paper, as the intelligibility of the solution is exploited to enter the logic of stability conditions and drive Lyapunov certification.
M. Maurizio and O. Vanessa
3.1 Intelligible Analytics
A supervised classification problem consists of finding the best boundary function $g(x)$ separating points into classes (stability and instability here). The derivation of $g(x)$ is formulated under the Logic Learning Machine (LLM) [25] setting. The $g(x)$ model is described by a set of $m$ intelligible rules $r_k$, $k = 1, \ldots, m$, of the type if (premise) then (consequence), where (premise) is a logical product (AND, ∧) of $d_k$ conditions $c_{l_k}$, with $l_k = 1_k, \ldots, d_k$, and (consequence) provides a class assignment for the output, i.e., $\hat{y} = g(x)$. In the LLM algorithm, the model $g(x)$ is obtained through a three-step process, which gives LLM a higher precision than other intelligible models. For the interested reader, further discussion about the LLM algorithm is available in [26].
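The rule structure described above (conditions joined by AND inside a rule, a class assigned by the consequence) can be sketched as a small data structure. This is only an illustration of how such a model is evaluated, not the LLM training procedure: the tuple encoding, the function names, and the thresholds are all hypothetical, and the rules here all predict the stable class, with the unstable class as the fallback.

```python
import math

# A condition is (variable index j, lower, upper), read as lower < x[j] <= upper.
# A rule is a list of conditions joined by AND; its consequence is class 0 (stable).

def satisfies(rule, x):
    """True when x meets every condition in the premise of the rule."""
    return all(lower < x[j] <= upper for j, lower, upper in rule)

def g(rules, x, default=1):
    """Predict 0 (stable) if any rule fires, otherwise the default class."""
    return 0 if any(satisfies(rule, x) for rule in rules) else default

# Hypothetical rules; the thresholds are illustrative only.
rules = [
    [(0, -1.0, 1.0), (1, -1.0, 1.0)],          # -1 < x1 <= 1 AND -1 < x2 <= 1
    [(0, -0.5, 0.5), (1, -math.inf, 2.0)],     # -0.5 < x1 <= 0.5 AND x2 <= 2
]
```

For instance, `g(rules, (0.2, 1.5))` fires the second rule only, while `g(rules, (2.0, 0.0))` fires none and falls back to the unstable class.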
3.2 Value Ranking
Consider a set of $m$ rules, each consisting of a set of conditions $c_{l_k}$. Let $X_1, \ldots, X_n$ be the input variables, s.t. $X_j = x_j \in \mathcal{X} \subseteq \mathbb{R}$ $\forall j = 1, \ldots, n$; then, a condition $c_{l_k}$ regarding the variable $X_j$ can have one of the following forms:

$$X_j > \lambda, \qquad X_j \le \mu, \qquad \lambda < X_j \le \mu,$$

where $\lambda, \mu \in \mathcal{X}$. Introducing a machine learning algorithm also introduces a classification uncertainty. Thus, for each rule generated by the algorithm, we can define a confusion matrix, obtaining four indices: $TP(r_k)$ and $FP(r_k)$, defined as the number of examples $(x_j, y_j)$ which satisfy all the conditions in rule $r_k$ with $\hat{y} = y_j$ and $\hat{y} \neq y_j$, respectively; $TN(r_k)$ and $FN(r_k)$, the number of examples $(x_j, y_j)$ which do not satisfy at least one condition in rule $r_k$, with $\hat{y} \neq y_j$ and $\hat{y} = y_j$, respectively. Consequently, we can define some useful characteristic quantities, such as the covering $C(r_k)$, the error $E(r_k)$, and the precision $P(r_k)$:

$$C(r_k) = \frac{TP(r_k)}{TP(r_k) + FN(r_k)}, \qquad E(r_k) = \frac{FP(r_k)}{FP(r_k) + TN(r_k)}, \qquad P(r_k) = \frac{TP(r_k)}{TP(r_k) + FP(r_k)}.$$

The covering and the precision are adopted as measures of relevance for a rule $r_k$; the greater the covering and the precision, the higher the generality and the correctness of the corresponding rule. In order to obtain a measure of relevance $R(c_{l_k})$ for a condition, we compare the rule $r_k$ in which condition $c_{l_k}$ occurs with the same rule without condition $c_{l_k}$, called $r'_k$. Since the premise part of $r'_k$ is less stringent, we obtain that $E(r'_k) \ge E(r_k)$; thus the quantity $R(c_{l_k}) = (E(r'_k) - E(r_k))\,C(r_k)$ can be used as a measure of relevance for the condition of interest $c_{l_k}$. Each condition $c_{l_k}$ refers to a specific variable $X_j$ and is verified by some values $\nu_j \in \mathcal{X}$; thus we can derive a measure of relevance $R_{\hat{y}}(\nu_j)$ for every value assumed by $X_j$ as

$$R_{\hat{y}}(\nu_j) = 1 - \prod_{k} \left(1 - R(c_{l_k})\right),$$

where the product is computed over the rules $r_k$ that include a condition $c_{l_k}$ verified when $X_j = \nu_j$. Since the measure of relevance $R_{\hat{y}}(\nu_j)$ takes values in $[0, 1]$, it can be thought of as the probability that value $\nu_j$ occurs to predict $\hat{y}$. The same argument can be extended to intervals $I \subseteq \mathcal{X}$. In [9], data visualization through the interval value ranking is applied to profile vehicle collision behavior, while in this work the interval value ranking is used to inspect the most relevant safe regions in terms of states $x$.
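The quantities defined in this section can be computed directly from a labelled data set. The sketch below is a plain reading of the definitions and is not taken from the LLM software: the rule encoding (lists of `(index, lower, upper)` conditions), the function names, and the toy examples are assumptions made for illustration.

```python
def rule_counts(rule, examples, target):
    """Return (TP, FP, TN, FN) for a rule whose consequence is class `target`."""
    tp = fp = tn = fn = 0
    for x, y in examples:
        fires = all(lower < x[j] <= upper for j, lower, upper in rule)
        if fires and y == target:
            tp += 1
        elif fires:
            fp += 1
        elif y == target:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def covering(rule, examples, target):
    tp, _, _, fn = rule_counts(rule, examples, target)
    return tp / (tp + fn) if tp + fn else 0.0

def error(rule, examples, target):
    _, fp, tn, _ = rule_counts(rule, examples, target)
    return fp / (fp + tn) if fp + tn else 0.0

def precision(rule, examples, target):
    tp, fp, _, _ = rule_counts(rule, examples, target)
    return tp / (tp + fp) if tp + fp else 0.0

def condition_relevance(rule, index, examples, target):
    """R(c) = (E(r') - E(r)) * C(r), where r' is the rule without condition `index`."""
    relaxed = rule[:index] + rule[index + 1:]
    gain = error(relaxed, examples, target) - error(rule, examples, target)
    return gain * covering(rule, examples, target)

def value_relevance(condition_relevances):
    """R(v) = 1 - prod(1 - R(c)) over the conditions verified by the value v."""
    product = 1.0
    for relevance in condition_relevances:
        product *= 1.0 - relevance
    return 1.0 - product

# Toy one-dimensional data: (state, label), label 0 = stable, 1 = unstable.
examples = [((0.0,), 0), ((0.5,), 0), ((2.0,), 1), ((-2.0,), 1), ((0.9,), 1)]
rule = [(0, -1.0, 1.0)]
```

On this toy data the rule fires on three points, two of them stable, so removing its only condition raises the error from 1/3 to 1 and the condition receives a relevance of 2/3.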
3.3 LLM with Zero Error Another key element of the machine learning proposed here relies on forcing the g(x) model to describe the stable class with zero (statistical) error [6, 9]. Together with value ranking, it helps build a safety margin from instability, thus giving robustness to the further derivation of Lyapunov level sets. Stable points are identified in the LLM discretization step with fine granularity and clusters of the stable class are assigned without overlap with unstable points. All the resulting rules are then joined together under logical OR (∨) to build the g(x) predictor. This approach builds a strict boundary around safe points that could be even too conservative, especially if only the rules with the highest covering are used to simplify the structure of g(x), in place of all rules (see the example in the performance evaluation).
3.4 ROA Explanation
Previous sections have treated Lyapunov theory and machine learning separately. Now the two disciplines are joined together. The stable region is denoted by $S_c$ in terms of states $x$; the $c$ suffix is related to Lyapunov level sets as follows.

Algorithm 1 ROA explanation
Input: dynamical system $\dot{x} = f(x)$, safe region $S_c$, number of states $n$.
Output: explained ROA $R$
Initialization:
  x = x_0
LOOP Process
  for i = 1 to n do
    x = x_i
    if (x ∈ S_c) then
      y = 0
    else
      y = 1
    end if
  end for
  build the LLM model g(x)
  compute $R_{\hat{y}=0}(x)$ for all x s.t. y = 0
  find an explained ROA R in terms of x
  return R
Given a dynamical system $\dot{x} = f(x)$, local stability is ensured by searching for an SOS Lyapunov function $V(x)$ with the toolbox SOSTOOLS [24] and the SDP solver SeDuMi [27] in MATLAB. Then, the optimization problem defined in Sect. 2.3 is solved to obtain a stable region $S_c$, in which, by definition, all trajectories converge to the origin. The data collected consist of different states $x$ and labels $y$, equal to zero if $x$ belongs to $S_c$, and one otherwise. The LLM algorithm is then applied to those data points, deriving the model $g(x)$, from which the measure of relevance for each state $R_{\hat{y}=0}(x)$ (when the predicted class is zero) is readily available. The first region $S_c$, defined through the level surface of an SOS Lyapunov function $V(x) = c$, should have an elliptical shape, while the second region $R$, defined through linear rules, should have a tiered shape, which approximates the ellipse in $x$ under $V(x) = c$. Value ranking $R_{\hat{y}=0}(x)$ outlines the margins from instability of $S_c$.

It is worth noting that a more general stability region, denoted now $S$, may be defined on the basis of experimented trajectories of the system, rather than through the Lyapunov ROA, still preserving the same structure of Algorithm 1. Although the latter has no Lyapunov certification, it often gives rise to larger stability boundaries than those of the SOS $S_c$. The idea is then to exploit the ROA explanation of the simulated $S$ to push an enlargement of Lyapunov level sets. This is the subject of the second algorithm.
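The contrast between the elliptical region $S_c$ and its tiered, rule-based approximation can be illustrated with a small sketch. It is a stand-in built on assumptions, not the LLM algorithm: the quadratic $V$, the level $c = 2.0$, the grid, and the band construction are all hypothetical. Points are labelled as in Algorithm 1, and each horizontal band of $x_2$ receives one rectangle that contains only stable grid points.

```python
def V(x1, x2):
    """Quadratic Lyapunov function, assumed here for illustration."""
    return 1.5 * x1 * x1 - x1 * x2 + x2 * x2

def label(x1, x2, c):
    """Labelling of Algorithm 1: 0 inside S_c, 1 outside."""
    return 0 if V(x1, x2) < c else 1

def tiered_rules(c, half_width=3.0, steps=60, band=5):
    """One rectangle (x1_low, x1_high, x2_low, x2_high) per band of x2.

    Within a band, an x1 grid value is kept only if every x2 grid value of the
    band is labelled stable, so each rectangle has zero error on the grid.
    Taking min/max of the kept values relies on S_c being convex.
    """
    h = 2.0 * half_width / steps
    grid = [-half_width + i * h for i in range(steps + 1)]
    rules = []
    for start in range(0, steps, band):
        rows = grid[start:start + band + 1]
        inside = [x1 for x1 in grid if all(label(x1, x2, c) == 0 for x2 in rows)]
        if inside:
            rules.append((min(inside), max(inside), rows[0], rows[-1]))
    return rules

rules = tiered_rules(c=2.0)
```

Stacking the rectangles band by band gives the staircase outline mentioned in the text; narrower bands track the ellipse more closely at the cost of more rules.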
3.5 ROA Candidates for Certification
3.5.1 Data-Driven ROA
Simulations and machine learning are now combined. The behavior of the system is reproduced under several initial conditions, assigning a label $y = 0$ if a trajectory remains in the stable region $S$ and ends in the origin, a label $y = 2$ if a trajectory leaves $S$ but ends in the origin, and a label $y = 1$ otherwise (i.e., instability). The data collected consist of the initial states and labels. As before, to make explicit the boundaries found in $S$ in terms of $x$, the relevance $R_{\hat{y}=0}(x_0)$ is computed for all initial points considered, denoted by $x_0$. In the beginning, the candidate stable region $S$ does not depend on any level surface $c$ of the Lyapunov function $V(x)$. A further Lyapunov certificate is found on the basis of the ROA explanation of $S$. The ROA approximation may follow different approaches in terms of safety margins from instability. A first ROA candidate, $R_\alpha$, is defined in terms of the subsets of $x$ satisfying the following constraints: $x_i > \theta^{low}_{x_i} \wedge x_i < \theta^{up}_{x_i}$, $i = 1, \ldots, n$, with $\theta^{low}_{x_i}$ and $\theta^{up}_{x_i}$ being the lower and upper bounds on $x_i$ for stability in value ranking. Another candidate, $R_\beta$, comes from sensitivity analysis of the thresholds above through the further introduction of positive variables $\varepsilon^{low}_{x_i}$ and $\varepsilon^{up}_{x_i}$, giving rise to variations of the type: $x_i > \theta^{low}_{x_i} + \varepsilon^{low}_{x_i} \wedge x_i < \theta^{up}_{x_i} - \varepsilon^{up}_{x_i}$, $i = 1, \ldots, n$. The second candidate safe region $R_\beta$ is defined by slightly shrinking $R_\alpha$.
The latter may be simply interpreted as a more prudent approach when value ranking is not fully trusted. Such a concept of trust is strictly connected with the concept of generalization in machine learning. Roughly speaking, the following question arises: is $R_\alpha$ applicable to trajectories not used for training in Algorithm 2, still experiencing stability? Answering the question (again, from the machine learning point of view) means cross-validating $\varepsilon^{low}_{x_i}$ and $\varepsilon^{up}_{x_i}$, $i = 1, \ldots, n$. Cross-validation means testing stability conditions in different subsets not used for training, while restricting the boundaries of the initial ROA candidate $R_\alpha$, until stability is achieved over all test sets. The resulting ROA $R_\beta$ may be smaller than $R_\alpha$, but it is more robust to noise or uncertainty of some parameters in the dynamical system [6, 9]. This topic is of paramount importance for the certification of Artificial Intelligence, see, e.g., [7] (the inherent Appendix G, in particular: Implications for off-line training).

Algorithm 2 Computational ROA
Input: dynamical system $\dot{x} = f(x)$, number of initial points $n$, number of trajectory steps $m$.
Output: proposed ROA $R$
Initialization:
  for i = 1 to n do
    x = x_i
    Trajectories simulation:
    for j = 1 to m do
      x_ij = x_j
    end for
    if ∀j x_ij ∈ S then
      y = 0
    else if (∃ x_ij ∉ S) & (x_im ∈ B(0, ε)) then
      y = 2
    else
      y = 1
    end if
  end for
  build the LLM model g(x)
  compute $R_{\hat{y}=0}(x_0)$ for all x_0 s.t. y = 0
  find an explained ROA R in terms of x
  return R
3.5.2 Lyapunov Certification of Machine Learning
The regions found by the machine learning technique are still not certified. The stability of region $R_s$, $s = \alpha, \beta$, is studied by employing the Lyapunov function $V(x)$ and by solving the following optimization problem:

$$c_s = \inf_{x \in \mathbb{R}^n} V(x) \quad \text{subject to} \quad \dot{V}(x) \le 0, \; x \in R_s. \tag{7}$$
The level surface of the Lyapunov function $V(x) = c_s$ is searched, corresponding to the initial states found through machine learning. The new Lyapunov ROA is defined as the set $S_{c_s} = \{x \in R_\alpha \mid V(x) < c_s\}$. Still under SOS Lyapunov, the region $S_{c_s}$ has an elliptic shape. The region of attraction can be expanded further through the definition of a positive vector parameter, $\varepsilon_x$, extending the search space in the candidate region $R_{\varepsilon_x}$, for the best level set with $\dot{V}(x) < 0$; $\varepsilon_x$ defines the size of the extension over the value ranking thresholds, namely, $R_{\varepsilon_x}$ is an extension of $R_\alpha$ of size $\varepsilon_x$. The new optimization problem is defined as

$$\max_{\varepsilon_x} \inf_{x \in \mathbb{R}^n} V(x) \quad \text{subject to} \quad \dot{V}(x) \le 0, \; x \in R_{\varepsilon_x}. \tag{8}$$

The focus of the optimization is finding the largest oscillation around the value ranking thresholds such that the derivative of the Lyapunov function is still negative. Once the optimal $\varepsilon_x$, denoted by $\varepsilon_x^*$, is found, the new corresponding optimal level set is $S_{c_{\varepsilon_x^*}} = \{x \in R_\varepsilon \mid V(x) < c_{\varepsilon_x^*}\}$, with $c_{\varepsilon_x^*}$ being the minimum level set within the region where $\varepsilon_x$ has been optimized. Finally, as before, the ROA can be viewed as an elliptic-shape level surface of $V(x)$, or as a tiered-shape surface, in terms of states $x$, in case a further ROA explanation is applied.

The ROA candidates for Lyapunov certification are based on interval value ranking, which, in turn, may be based on different settings of $g(x)$: LLM with and without zero error (the former profiles the border between the classes with high precision, the latter drives better generalization), just focusing on the few rules with largest covering, with and without cross-validation. As to the subsequent parameters $\varepsilon$ and $\delta_x$ and optimization problems (7) and (8), different level sets of the Lyapunov function $V(x)$ may be interrogated. Moreover, as $V(x)$ functions with increasing complexity may be considered [18, 22], the joint adoption of machine learning and Lyapunov theory provides flexibility, while restricting the computational effort of numerical optimization to the regions profiled by value ranking.
3.5.3 Rules Versus Neural Networks
It is finally worth noting that the use of thresholds and their sensitivity in the optimization problems above is intrinsic to the application of rule-based mechanisms; the application of neural networks in Algorithm 2 and in problems (7) or (8), in place of LLM, would lead to a more complex sensitivity analysis of candidate ROAs [18].
4 Results
The Van der Pol oscillator in reverse time (9) is considered, as in Example 8.5 of [1] and Section VI.A of [10]; its true ROA is depicted in Fig. 1 (stability in blue, instability in red dots). The system has one equilibrium point at the origin and an unstable limit cycle on the border of the true ROA. The LLM algorithm is implemented in Rulex software¹; the simulation of the dynamical system is developed in C.²

Fig. 1 True ROA of Van der Pol oscillator (9) (legend: stable, unstable)

$$\dot{x}_1 = -x_2, \qquad \dot{x}_2 = x_1 + (x_1^2 - 1)\,x_2. \tag{9}$$
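The trajectory labelling of Algorithm 2 can be sketched for system (9) as follows. This is an illustrative Python sketch, not the authors' C code: the fourth-order Runge-Kutta integrator, the step size, the horizon, the escape radius, the convergence tolerance, and the box used as candidate region $S$ (with half-width 1.7) are all assumptions.

```python
import math

def f(x1, x2):
    """Reverse-time Van der Pol dynamics, Eq. (9)."""
    return -x2, x1 + (x1 * x1 - 1.0) * x2

def rk4_step(x1, x2, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = f(x1, x2)
    k2 = f(x1 + 0.5 * dt * k1[0], x2 + 0.5 * dt * k1[1])
    k3 = f(x1 + 0.5 * dt * k2[0], x2 + 0.5 * dt * k2[1])
    k4 = f(x1 + dt * k3[0], x2 + dt * k3[1])
    return (x1 + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
            x2 + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)

def in_box(x1, x2, bound):
    """Membership in the assumed candidate region S = [-bound, bound]^2."""
    return abs(x1) <= bound and abs(x2) <= bound

def label_trajectory(x1, x2, bound=1.7, dt=0.01, steps=4000, eps=1e-2, escape=10.0):
    """Return 0 (stays in S, converges), 2 (leaves S, converges) or 1 (otherwise)."""
    left_region = not in_box(x1, x2, bound)
    for _ in range(steps):
        x1, x2 = rk4_step(x1, x2, dt)
        if math.hypot(x1, x2) > escape:
            return 1                      # treated as diverging, stop early
        if not in_box(x1, x2, bound):
            left_region = True
    if math.hypot(x1, x2) >= eps:
        return 1                          # did not reach the ball B(0, eps)
    return 2 if left_region else 0
```

Initial points close to the limit cycle converge slowly and may be labelled unstable with a finite horizon, so `steps` and `eps` trade labelling accuracy against simulation time.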
As rules generate rectangles in the feature space, they may be thought to have less powerful approximation capabilities than neural networks [18] when facing the complex shape of non-linear regions; a single rule is shown in Fig. 3. However, the LLM clustering process with zero error in ROA explanation (of $S_c$ in Algorithm 1 and $S$ in Algorithm 2) allows the generation of a family of overlapping rules of sufficiently small size, able to cover the true ROA completely, without touching any unstable point outside the region (Fig. 2). As a result of Algorithm 2, such a family consists of over 70 rules for the stable class of the type:

if $(x_1 > t^{low}_{x_1} \wedge x_1 < t^{up}_{x_1}) \wedge (x_2 > t^{low}_{x_2} \wedge x_2 < t^{up}_{x_2})$ then stable

with variations in the thresholds $t^{low}_{x_1}$, $t^{up}_{x_1}$, $t^{low}_{x_2}$, $t^{up}_{x_2}$. The rules are applied all together in logical OR (∨) to find the best approximation of the true ROA. Figure 3 shows the progressive covering by an increasing number of rules, until the entire region is covered (the plots of the progressive covering of $S_c$ and $S$ are not reported for space limitations).

1 Rulex data analytics platform; www.rulex.ai.
2 Code and datasets are available at https://github.com/mopamopa/Liapunov-Logic-LearningMachine, [28].
Fig. 2 Value ranking of x1 for stable class in Algorithm 2

Fig. 3 ROA approximated by a single rule (left), and ROA approximated by several rules (right)
The inherent value ranking for the stable class (Fig. 2 for $x_1$; the $x_2$ case is very similar) shows the largest thresholds still maintaining a safe margin from the unstable class. The ranking assigns the highest score to several intervals whose minimum and maximum values are −1.70 and 1.71, respectively; the evidence for those thresholds is confirmed by the intervals $(-\infty, -1.70]$ and $[1.71, +\infty)$, whose score is significantly low (two bottom bars in the figure). This is also confirmed by the value ranking for the unstable class, not shown here. It is worth noting that there is still no Lyapunov guarantee in that rule-based approximation. The thresholds are later used as lower and upper bounds to enlarge the ROA border certificates.

Fig. 4 Minimum level set of V(x) under V̇(x) = 0 (solid line), and minimum level set of V(x) driven by value ranking (dashed line)

For the above system, we consider the Lyapunov function

$$V(x) = 1.5x_1^2 - x_1x_2 + x_2^2, \tag{10}$$

and its derivative

$$\dot{V}(x) = -(x_1^2 + x_2^2) - (x_1^3x_2 - 2x_1^2x_2^2). \tag{11}$$
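Equation (11) follows from (10) and the dynamics (9) by the chain rule. The intermediate steps are not given in the text; written out, they read:

```latex
\begin{align*}
\dot{V}(x) &= \frac{\partial V}{\partial x_1}\,\dot{x}_1 + \frac{\partial V}{\partial x_2}\,\dot{x}_2 \\
           &= (3x_1 - x_2)(-x_2) + (-x_1 + 2x_2)\bigl(x_1 + (x_1^2 - 1)\,x_2\bigr) \\
           &= -3x_1x_2 + x_2^2 - x_1^2 + 2x_1x_2 - x_1^3x_2 + x_1x_2 + 2x_1^2x_2^2 - 2x_2^2 \\
           &= -(x_1^2 + x_2^2) - (x_1^3x_2 - 2x_1^2x_2^2).
\end{align*}
```

The mixed terms in $x_1x_2$ cancel ($-3 + 2 + 1 = 0$), which leaves a negative-definite quadratic part plus quartic terms that can change the sign of $\dot{V}$ only away from the origin.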
Before machine learning certification, problem (6) is solved as a performance comparison with traditional ROA solutions. The minimum level set of $V(x)$ under the $\dot{V}(x) = 0$ constraint leads to $V(x) < c_1$, $c_1 = 2.42$, Fig. 4 (canonical). Two other ROAs are available under system linearization and intersection with $\dot{V}(x) \le 0$: $c_{11} = 0.618$, and linearization with the variable change $x_1 = \rho\cos\theta$, $x_2 = \rho\sin\theta$: $c_{12} = 0.803$ (p. 319 of [1]). Then, problems (7) and (8) are solved. Problem (7) with $R_\alpha$ (value ranking thresholds) gives a level set with $V(x) < c_{21}$, $c_{21} = 1.41$; (7) with $R_\beta$ (more prudent thresholds) gives $V(x) < c_{22}$, $c_{22} = 0.836$. On the other hand, the solution of (8) achieves the highest level set $V(x) < c_2^*$, $c_2^* = 3.736222$, able to touch the border of the true ROA, Fig. 4 (value ranking, dashed line). The subsequent ROA explanation of $V(x) < c_2^*$ leads to a tiered-shape approximation of the $c_2^*$ level set; the inherent number of rules is 91, thus denoting the fine granularity needed to track the shape of the stability border properly.
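A problem of the form (6) can be approximated by brute force on a grid, which gives a rough cross-check of a level value without any solver. The sketch below rests on assumptions: it uses $V$ and $\dot{V}$ as in (10) and (11), a hypothetical box $[-3, 3]^2$ with 301 points per axis, and it relaxes the equality $\dot{V}(x) = 0$ to $\dot{V}(x) \ge 0$, since grid points almost never fall exactly on the zero contour. The returned value approximates the true infimum from above and depends on the grid resolution.

```python
def V(x1, x2):
    """Lyapunov function of Eq. (10)."""
    return 1.5 * x1 * x1 - x1 * x2 + x2 * x2

def V_dot(x1, x2):
    """Derivative of V along the dynamics, Eq. (11)."""
    return -(x1 * x1 + x2 * x2) - (x1 ** 3 * x2 - 2.0 * x1 * x1 * x2 * x2)

def estimate_level(half_width=3.0, steps=300):
    """Smallest V over grid points, origin excluded, where V_dot >= 0."""
    h = 2.0 * half_width / steps
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1):
            x1 = -half_width + i * h
            x2 = -half_width + j * h
            if abs(x1) < 1e-12 and abs(x2) < 1e-12:
                continue                  # skip the equilibrium itself
            if V_dot(x1, x2) >= 0.0:
                value = V(x1, x2)
                if best is None or value < best:
                    best = value
    return best

c_grid = estimate_level()
```

By construction, every grid point with $V(x) <$ `c_grid` other than the origin has $\dot{V}(x) < 0$, so the sublevel set is a grid-level candidate for $S_c$; it is not a certificate, because points between grid nodes are not checked.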
5 Conclusion and Future Work

The paper combines Lyapunov theory and intelligible machine learning. The contribution is twofold: first, stability regions are explained in terms of the system state; second, a new computational method finds stability conditions over larger regions than traditional methods.
M. Maurizio and O. Vanessa
Future work deals with further performance evaluation, also with respect to cyber-physical systems whose behavior is governed by dynamics with noise, uncertainty, and control feedback over a communication channel (e.g., vehicle platooning). Barrier functions are also of interest for safety certification. Further investigation is devoted to the states that leave and then return to the candidate stability region along their trajectories (classified in Algorithm 2 with the y = 2 tag), as they may also indicate how the region can be enlarged.

Acknowledgements The authors gratefully acknowledge colleagues Elisabetta Punta and Fabrizio Dabbene for suggestions on Lyapunov stability theory and Marco Muselli for the usage of LLM with zero error and inherent value ranking.
References

1. Khalil H (2002) Nonlinear systems, 3rd edn. Prentice Hall, Upper Saddle River
2. Valmorbida G, Anderson J (2017) Region of attraction estimation using invariant sets and rational Lyapunov functions. Automatica 75:37–45. http://www.sciencedirect.com/science/article/pii/S0005109816303387
3. Korda M, Henrion D, Jones CN (2013) Controller design and region of attraction estimation for nonlinear dynamical systems
4. Zhai C, Nguyen HD (2019) Region of attraction for power systems using Gaussian process and converse Lyapunov function, part I: theoretical framework and off-line study
5. Lun YZ, D'Innocenzo A, Smarra F, Malavolta I, Benedetto MDD (2019) State of the art of cyber-physical systems security: an automatic control perspective. J Syst Softw 149:174–216. http://www.sciencedirect.com/science/article/pii/S0164121218302681
6. Mongelli M, Muselli M, Ferrari E, Fermi A (2018) Performance validation of vehicle platooning via intelligible analytics. IET Cyber-Phys Syst: Theory & Appl 10:2018
7. Road vehicles safety of the intended functionality PD ISO PAS 21448:2019, International Organization for Standardization, Geneva, CH, Standard (2019)
8. Glassman E, Desbiens AL, Tobenkin M, Cutkosky M, Tedrake R (2012) Region of attraction estimation for a perching aircraft: a Lyapunov method exploiting barrier certificates. In: IEEE international conference on robotics and automation, pp 2235–2242
9. Mongelli M, Ferrari E, Muselli M (2019) Achieving zero collision probability in vehicle platooning under cyber attacks via machine learning. In: 2019 4th international conference on system reliability and safety (ICSRS)
10. Jones M, Mohammadi H, Peet MM (2017) Estimating the region of attraction using polynomial optimization: a converse Lyapunov result. In: 2017 IEEE 56th annual conference on decision and control (CDC), pp 1796–1802
11. Zečević AI, Šiljak DD (2010) Regions of attraction. Springer, Boston, pp 111–141
12. Burchardt H, Ratschan S (2007) Estimating the region of attraction of ordinary differential equations by quantified constraint solving. In: Proceedings of the 3rd WSEAS international conference on DYNAMICAL SYSTEMS and CONTROL (CONTROL'07). WSEAS Press, pp 241–246
13. Mongelli M, Ferrari E, Muselli M, Scorzoni A (2019) Accellerating prism validation of vehicle platooning through machine learning. In: 2019 4th international conference on system reliability and safety (ICSRS)
14. Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D (2018) A survey of methods for explaining black box models. ACM Comput Surv 51(5):93:1–93:42. http://doi.acm.org/10.1145/3236009
15. Noroozi N, Paknoosh K, Fatemeh S, Hamed J (2008) Generation of Lyapunov functions by neural networks. Lect Notes Eng Comput Sci 2170:07
16. Petridis V, Petridis S (2006) Construction of neural network based Lyapunov functions 01(2006):5059–5065
17. Chang Y-C, Roohi N, Gao S (2019) Neural Lyapunov control 01:2019
18. Richards SM, Berkenkamp F, Krause A (2018) The Lyapunov neural network: adaptive stability certification for safe learning of dynamic systems. arXiv:1808.00924
19. Diao R, Vittal V, Logic N (2010) Design of a real-time security assessment tool for situational awareness enhancement in modern power systems. IEEE Trans Power Syst 25:957–965
20. He M, Zhang J, Vittal V (2013) Robust online dynamic security assessment using adaptive ensemble decision-tree learning. IEEE Trans Power Syst 28(4):4089–4098
21. Bastani O, Pu Y, Solar-Lezama A (2018) Verifiable reinforcement learning via policy extraction. arXiv:1805.08328
22. Kozarev A, Quindlen J, How J, Topcu U (2016) Case studies in data-driven verification of dynamical systems. In: Proceedings of the 19th international conference on hybrid systems: computation and control HSCC 2016, 04 2016, pp 81–86
23. Khalil HK (2002) Nonlinear systems, 3rd ed. Prentice-Hall, Upper Saddle River; the book can be consulted by contacting: PH-AID: Wallet, Lionel. https://cds.cern.ch/record/1173048
24. Papachristodoulou A, Anderson J, Valmorbida G, Prajna S, Seiler P, Parrilo P (2013) Sostools version 3.00 sum of squares optimization toolbox for matlab, 10
25. Cangelosi D, Muselli M et al (2014) Use of attribute driven incremental discretization and logic learning machine to build a prognostic classifier for neuroblastoma patients. BMC Bioinf 15(suppl):5
26. Muselli M, Ferrari E (2011) Coupling logical analysis of data and shadow clustering for partially defined positive Boolean function reconstruction. IEEE Trans Knowl Data Eng 23(1):37–50
27. Sturm JF (1999) Using sedumi 1.02, a matlab toolbox for optimization over symmetric cones. Opt Methods Softw 11(1–4):625–653. https://doi.org/10.1080/10556789908805766
28. Mongelli M, Orani V, Git repository of liapunov logic learning machine. https://github.com/mopamopa/Liapunov-Logic-Learning-Machine
Two-Dimensional Angle of Arrival Estimation Using L-Shaped Array

Santhosh Thota and Pradip Sircar
Abstract We present a novel technique to estimate the two-dimensional angles of arrival (2D-AOAs), namely, the azimuth and incidence (complementary to elevation) angles, of multiple narrowband sources using an L-shaped array. The approach forms a polynomial from the received data matrices; the AOAs are calculated from the roots of this polynomial. The proposed method uses singular value decomposition (SVD) to reduce the effect of noise. We derive an expression for the Cramer–Rao bound (CRB) of the 2D-AOA estimation problem. Performance analysis of the proposed method through simulation shows that it performs better than an existing method. We show how the proposed method compares with the CRB.

Keywords 2D angle of arrival (AOA) · 2D direction of arrival (DOA) · Cramer–Rao (CR) bound
S. Thota (corresponding author) · P. Sircar
Department of Electrical Engineering, Indian Institute of Technology Kanpur, Kanpur 208016, Uttar Pradesh, India
e-mail: [email protected] (S. Thota); [email protected] (P. Sircar)
Presently with Sierra Wireless, Richmond, Canada.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. M. Thampi et al. (eds.), Applied Soft Computing and Communication Networks, Lecture Notes in Networks and Systems 187, https://doi.org/10.1007/978-981-33-6173-7_16

1 Introduction

Two-dimensional (2D) angle of arrival (AOA) estimation is an important problem in array signal processing, with applications in radar, sonar, radio astronomy, and mobile communication systems [1, 2]. Various array geometries, such as the planar array [3], parallel-shaped array [4–6], circular array [7], rectangular array [8], and L-shaped array [9–11], are employed for 2D-AOA estimation. The L-shaped array is often preferred because it decomposes the 2D problem into two independent 1D problems, and the decoupling reduces complexity and enhances accuracy. However, the decomposition also introduces the problem of pair matching between the azimuth and incidence angle estimates [12, 13].

With an L-shaped array antenna, the approaches used for the estimation of 2D AOAs, namely, the azimuth and incidence (complementary to elevation) angles, are classified as MUSIC-based [14, 15], ESPRIT-based, matrix pencil [9, 10], and propagator [11] methods. In this paper, we propose a new technique for the estimation of AOAs of multiple narrowband signals arriving from the far field of an antenna array. The narrowband signals are assumed to be of the same frequency. We extend the polynomial-based approach presented in [16] to 2D-AOA estimation. We show that the proposed method performs better than the propagator method presented in [11]. We also derive an expression for the theoretical bound on the performance of AOA estimation.

The paper is organized as follows: Sect. 2 presents the array configuration and system model. Section 3 contains the proposed method. Section 4 presents the simulation results. Section 5 concludes the paper. In the Appendix, we derive an expression for the Cramer–Rao (CR) bound of the 2D-AOA problem.
2 System Model

We consider an L-shaped antenna array in the X-Z plane as shown in Fig. 1. Two uniform linear arrays (ULAs) of $p$ sensors each, with inter-element spacing $d$, are placed along the X-axis and the Z-axis with a common sensor at the origin [11]. The gain of an antenna element at position $(x, y, z)$, referenced to the origin, for a signal arriving from the direction $(\theta, \phi)$ is given by

$$a(\theta, \phi) = e^{j(2\pi/\lambda)(x\sin\theta\cos\phi + y\sin\theta\sin\phi + z\cos\theta)}. \tag{1}$$
Suppose that there are $q$ narrowband sources with the same wavelength $\lambda$ impinging on each sensor. The $l$th source, $l = 1, 2, \ldots, q$, has an azimuth angle $\phi_l$ and an incidence angle $\theta_l$ (complementary to the elevation angle). The signal received at the $i$th element of the Z-subarray is given by [11]

$$z_i(t) = \sum_{l=1}^{q} a_{z,i}(\theta_l, \phi_l)\, s_l(t) + n_{z,i}(t), \tag{2}$$

where

$$a_{z,i}(\theta_l, \phi_l) = e^{j(i-1)\psi_l}, \quad \psi_l = 2\pi(d/\lambda)\cos\theta_l, \tag{3}$$

$s_l(t)$ is the complex envelope of the $l$th signal impinging from the direction $(\theta_l, \phi_l)$, and $n_{z,i}(t)$ is zero-mean additive white Gaussian noise (AWGN) whose properties are stated in Sect. 3.
Fig. 1 The L-shaped array configuration
The signal received at the $i$th element of the X-subarray is given by

$$x_i(t) = \sum_{l=1}^{q} a_{x,i}(\theta_l, \phi_l)\, s_l(t) + n_{x,i}(t), \tag{4}$$

where

$$a_{x,i}(\theta_l, \phi_l) = e^{j(i-1)\xi_l}, \quad \xi_l = 2\pi(d/\lambda)\sin\theta_l\cos\phi_l, \tag{5}$$
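and $n_{x,i}(t)$ is the zero-mean AWGN. If we accumulate the outputs of all the sensors on the Z-subarray into a $p \times 1$ vector, we have

$$\mathbf{z}(t) = \sum_{l=1}^{q} \mathbf{a}_z(\theta_l, \phi_l)\, s_l(t) + \mathbf{n}_z(t), \tag{6}$$

where $\mathbf{z}(t) = [z_1(t)\ z_2(t)\ \cdots\ z_p(t)]^T$, $\mathbf{a}_z(\theta_l, \phi_l)$ is the $m \times 1$ steering vector of the Z-subarray toward the direction $(\theta_l, \phi_l)$ given by $\mathbf{a}_z(\theta_l, \phi_l) = [a_{z,1}(\theta_l, \phi_l)\ a_{z,2}(\theta_l, \phi_l)\ \cdots\ a_{z,m}(\theta_l, \phi_l)]^T$, and $\mathbf{n}_z(t) = [n_{z,1}(t)\ n_{z,2}(t)\ \cdots\ n_{z,m}(t)]^T$. Since these vectors must have equal length, $m = p$ is the number of sensors in each subarray; both symbols are used in what follows.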
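We can rewrite (6) as

$$\mathbf{z}(t) = \mathbf{A}_z \mathbf{s}(t) + \mathbf{n}_z(t), \tag{7}$$

where $\mathbf{A}_z = [\mathbf{a}_z(\theta_1, \phi_1)\ \mathbf{a}_z(\theta_2, \phi_2)\ \cdots\ \mathbf{a}_z(\theta_q, \phi_q)]$ and $\mathbf{s}(t) = [s_1(t)\ s_2(t)\ \cdots\ s_q(t)]^T$. Similarly, for the X-subarray, we have

$$\mathbf{x}(t) = \sum_{l=1}^{q} \mathbf{a}_x(\theta_l, \phi_l)\, s_l(t) + \mathbf{n}_x(t), \tag{8}$$

where $\mathbf{x}(t) = [x_1(t)\ x_2(t)\ \cdots\ x_m(t)]^T$, $\mathbf{a}_x(\theta_l, \phi_l) = [a_{x,1}(\theta_l, \phi_l)\ a_{x,2}(\theta_l, \phi_l)\ \cdots\ a_{x,m}(\theta_l, \phi_l)]^T$, and $\mathbf{n}_x(t) = [n_{x,1}(t)\ n_{x,2}(t)\ \cdots\ n_{x,m}(t)]^T$. Note that Eq. (8) can be rewritten as

$$\mathbf{x}(t) = \mathbf{A}_x \mathbf{s}(t) + \mathbf{n}_x(t), \tag{9}$$

where $\mathbf{A}_x = [\mathbf{a}_x(\theta_1, \phi_1)\ \mathbf{a}_x(\theta_2, \phi_2)\ \cdots\ \mathbf{a}_x(\theta_q, \phi_q)]$.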
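The data model of (2)–(9) can be simulated with a few lines of NumPy. The sketch below is illustrative only: the function names, the unit-power complex Gaussian sources, and the independent noise draw for the shared origin sensor are assumptions of this example, not details taken from the paper.

```python
import numpy as np

def subarray_matrix(phases, m):
    """Return the m x q matrix whose (i, l) entry is exp(j*(i-1)*phase_l)."""
    i = np.arange(m)[:, None]                      # i - 1 = 0, ..., m - 1
    return np.exp(1j * i * np.asarray(phases)[None, :])

def simulate_snapshots(theta, phi, m, M, snr_db, d_over_lambda=0.5, seed=0):
    """Generate Z and X (each m x M) following Eqs. (7) and (9).

    theta, phi: incidence and azimuth angles in radians (length q).
    Sources are unit-power complex Gaussian (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    theta, phi = np.asarray(theta, float), np.asarray(phi, float)
    q = theta.size
    psi = 2 * np.pi * d_over_lambda * np.cos(theta)                 # Eq. (3)
    xi = 2 * np.pi * d_over_lambda * np.sin(theta) * np.cos(phi)    # Eq. (5)
    A_z, A_x = subarray_matrix(psi, m), subarray_matrix(xi, m)

    def cgauss(rows, cols):
        # Circular complex Gaussian samples with unit variance.
        return (rng.standard_normal((rows, cols))
                + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

    S = cgauss(q, M)
    sigma = 10 ** (-snr_db / 20)                   # noise std for unit-power sources
    Z = A_z @ S + sigma * cgauss(m, M)
    X = A_x @ S + sigma * cgauss(m, M)
    return Z, X, psi, xi
```

For instance, `simulate_snapshots(np.deg2rad([60, 45]), np.deg2rad([30, 45]), m=16, M=200, snr_db=20)` returns two 16 × 200 data matrices together with the true phases $\psi_l$ and $\xi_l$.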
3 Proposed Method

Consider first the Z-subarray, whose processing provides the estimates of the incidence angles. Substituting the values of $a_{z,i}(\theta_l, \phi_l)$ into $\mathbf{A}_z$ in (7) for $i = 1, 2, \ldots, m$ and $l = 1, 2, \ldots, q$, we have

$$\mathbf{A}_z = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ e^{j\psi_1} & e^{j\psi_2} & \cdots & e^{j\psi_q} \\ \vdots & \vdots & \ddots & \vdots \\ e^{j(m-1)\psi_1} & e^{j(m-1)\psi_2} & \cdots & e^{j(m-1)\psi_q} \end{pmatrix}, \tag{10}$$

where $\{\psi_l\}$ are defined in (3). By collecting $M$ equally spaced snapshots from each sensor, we can form a data matrix $\mathbf{Z}$ as follows:

$$\mathbf{Z} = \mathbf{A}_z \mathbf{S} + \mathbf{N}_z, \tag{11}$$

where $\mathbf{Z} = [\mathbf{z}(t_1)\ \mathbf{z}(t_2)\ \cdots\ \mathbf{z}(t_M)]$, $\mathbf{S} = [\mathbf{s}(t_1)\ \mathbf{s}(t_2)\ \cdots\ \mathbf{s}(t_M)]$, and $\mathbf{N}_z = [\mathbf{n}_z(t_1)\ \mathbf{n}_z(t_2)\ \cdots\ \mathbf{n}_z(t_M)]$. The matrices $\mathbf{A}_z$ and $\mathbf{S}$ are unknown and are assumed to be full-rank. Let $y_1 = e^{j\psi_1}$, $y_2 = e^{j\psi_2}$, ..., $y_q = e^{j\psi_q}$, and assume that $y_1, y_2, \ldots, y_q$ are roots of the polynomial equation

$$P(y) = 1 + c_1 y + c_2 y^2 + \cdots + c_{m-1} y^{m-1} = 0. \tag{12}$$

Our aim now is to calculate the coefficients that satisfy this assumption and then to solve for the roots.
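3.1 Method Without Noise

From (10) and (12), it can be easily shown that the following equation holds in a noise-free scenario ($\mathbf{N}_z = \mathbf{0}$):

$$\mathbf{A}_z^T \mathbf{c} = \mathbf{0}, \tag{13}$$

where $\mathbf{c} = [1, c_1, c_2, \ldots, c_{m-1}]^T$. Using (11), we get

$$\mathbf{Z}^T \mathbf{c} = \mathbf{0} \quad \text{or} \quad \mathbf{P}\mathbf{c} = \mathbf{0}, \tag{14}$$

where $\mathbf{P} = [\mathbf{p}_1\ \mathbf{p}_2\ \cdots\ \mathbf{p}_m]$ and $\mathbf{p}_i = [z_i(t_1)\ z_i(t_2)\ \cdots\ z_i(t_M)]^T$. Rewriting (14), we get the following matrix equation:

$$\mathbf{P}_1 \mathbf{c}_1 = -\mathbf{p}_1, \tag{15}$$

where $\mathbf{P}_1 = [\mathbf{p}_2\ \cdots\ \mathbf{p}_m]$ and $\mathbf{c}_1 = [c_1, c_2, \ldots, c_{m-1}]^T$. With more equations than unknowns, this is an overdetermined system, and the least squares solution for the coefficient vector $\mathbf{c}_1$ is given by

$$\mathbf{c}_1 = -\mathbf{P}_1^{\#} \mathbf{p}_1, \tag{16}$$

where $\mathbf{P}_1^{\#}$ is the Moore–Penrose inverse of $\mathbf{P}_1$, given by $\mathbf{P}_1^{\#} = (\mathbf{P}_1^H \mathbf{P}_1)^{-1} \mathbf{P}_1^H$ when $\mathbf{P}_1$ has full column rank. Substitute the coefficients $c_1, c_2, \ldots, c_{m-1}$ obtained from (16) into (12) and solve for the roots of $P(y)$. There are $(m-1)$ roots, of which $q$ are of interest. From the $(m-1)$ roots, choose the $q$ roots, say $\{\gamma_l\}$, whose magnitudes are nearest to unity:

$$\gamma_l = e^{j2\pi(d/\lambda)\cos\theta_l}, \quad l = 1, 2, \ldots, q. \tag{17}$$

Then, the estimate of $\psi_l$ is

$$\hat{\psi}_l = \angle\gamma_l. \tag{18}$$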
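The steps of (14)–(18) can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function name is hypothetical, and `np.linalg.lstsq` with a small `rcond` cutoff is used to obtain the minimum-norm least squares solution, which also handles the noise-free case where $\mathbf{P}_1$ has rank $q < m - 1$ and $(\mathbf{P}_1^H \mathbf{P}_1)^{-1}$ does not exist.

```python
import numpy as np

def polynomial_phases(Z, q, rcond=1e-10):
    """Estimate q phases psi_l from an m x M noise-free data matrix Z."""
    P = Z.T                                    # M x m; columns are p_1, ..., p_m of Eq. (14)
    p1, P1 = P[:, 0], P[:, 1:]
    # Minimum-norm least squares solution of P1 c1 = -p1, cf. Eqs. (15)-(16).
    c1, *_ = np.linalg.lstsq(P1, -p1, rcond=rcond)
    coeffs = np.concatenate(([1.0], c1))       # 1, c_1, ..., c_{m-1} (ascending powers)
    roots = np.roots(coeffs[::-1])             # np.roots expects descending powers
    # Keep the q roots whose magnitudes are nearest to unity, cf. Eq. (17).
    nearest = np.argsort(np.abs(np.abs(roots) - 1.0))[:q]
    return np.sort(np.angle(roots[nearest]))   # phase estimates, cf. Eq. (18)
```

Applying the same function to the X-subarray data matrix gives the estimates of $\xi_l$. The returned phases are sorted, so pairing them across the two subarrays is left to the caller.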
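Following a similar approach for the X-subarray, calculate $\{\beta_l\}$ defined by

$$\beta_l = e^{j2\pi(d/\lambda)\sin\theta_l\cos\phi_l}, \quad l = 1, 2, \ldots, q, \tag{19}$$

and hence

$$\hat{\xi}_l = \angle\beta_l. \tag{20}$$

Therefore, the estimates of the incidence angles $\theta_l$ and azimuth angles $\phi_l$ are given by

$$\hat{\theta}_l = \cos^{-1}\left(\frac{\hat{\psi}_l}{2\pi(d/\lambda)}\right), \quad l = 1, 2, \ldots, q, \tag{21}$$

$$\hat{\phi}_l = \cos^{-1}\left(\frac{\hat{\xi}_l}{2\pi(d/\lambda)\sin\hat{\theta}_l}\right), \quad l = 1, 2, \ldots, q. \tag{22}$$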
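The mapping from phase estimates to angles in (21) and (22) is short enough to write out directly. In the sketch below the function name is hypothetical, and clipping the arguments of the inverse cosine to [−1, 1] is an added safeguard against noisy estimates, not part of the original equations. The function assumes $\sin\hat{\theta}_l \neq 0$.

```python
import math

def angles_from_phases(psi_hat, xi_hat, d_over_lambda=0.5):
    """Map phase estimates to (theta_hat, phi_hat) in radians via Eqs. (21)-(22)."""
    k = 2.0 * math.pi * d_over_lambda

    def clip(v):
        # Added safeguard: keep the acos argument inside [-1, 1].
        return max(-1.0, min(1.0, v))

    theta_hat = math.acos(clip(psi_hat / k))                         # Eq. (21)
    phi_hat = math.acos(clip(xi_hat / (k * math.sin(theta_hat))))    # Eq. (22)
    return theta_hat, phi_hat
```

With half-wavelength spacing, a source at $\theta = 60°$, $\phi = 30°$ produces $\psi = \pi\cos 60°$ and $\xi = \pi\sin 60°\cos 30°$, and the function maps these two phases back to the original angles.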
3.2 Method with Noise

Note that in (6) and (8), the vectors $\mathbf{n}_z(t)$ and $\mathbf{n}_x(t)$ are random noise components on the Z- and X-subarrays, respectively. The noise is assumed to be additive, temporally white, and Gaussian with zero mean. These properties can be expressed mathematically as follows [17]:

$$E[\mathbf{n}_z(t)] = E[\mathbf{n}_x(t)] = \mathbf{0}, \tag{23}$$

$$E[\mathbf{n}_z(t)\mathbf{n}_z^H(t)] = E[\mathbf{n}_x(t)\mathbf{n}_x^H(t)] = \sigma^2 \mathbf{I}, \tag{24}$$

$$E[\mathbf{n}_z(t)\mathbf{n}_z^T(t)] = E[\mathbf{n}_x(t)\mathbf{n}_x^T(t)] = \mathbf{0}, \tag{25}$$

and

$$E[\mathbf{n}_z(t_i)\mathbf{n}_z^H(t_j)] = E[\mathbf{n}_z(t_i)\mathbf{n}_z^T(t_j)] = \mathbf{0} \quad \text{for } i \neq j, \tag{26}$$

$$E[\mathbf{n}_x(t_i)\mathbf{n}_x^H(t_j)] = E[\mathbf{n}_x(t_i)\mathbf{n}_x^T(t_j)] = \mathbf{0} \quad \text{for } i \neq j, \tag{27}$$

where $E$ denotes the expectation operation, $H$ denotes the Hermitian operation, $T$ denotes transposition, and $t_i$ and $t_j$ are the $i$th and $j$th time instants, respectively. The noise degrades the accuracy of the estimated polynomial coefficients. To reduce its effect, we follow the same procedure described for the noise-free system, except for the calculation of the inverse of the matrix $\mathbf{P}_1$ in (15), where we use the singular value decomposition (SVD) as discussed below. Let the SVD of the matrix $\mathbf{P}_1$ be
$$\mathbf{P}_1 = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H, \tag{28}$$

where $\mathbf{U}$ and $\mathbf{V}$ are the unitary matrices containing the singular vectors of $\mathbf{P}_1$, and $\boldsymbol{\Sigma}$ is a diagonal matrix with the same dimensions as $\mathbf{P}_1$. The diagonal elements of $\boldsymbol{\Sigma}$ are non-negative real values sorted in decreasing order,

$$\boldsymbol{\Sigma} = \mathrm{diag}(\tau_1, \tau_2, \ldots, \tau_q, \ldots, \tau_{m-1}), \tag{29}$$

where $\tau_1 \ge \tau_2 \ge \cdots \ge \tau_q \ge \cdots \ge \tau_{m-1}$ are the singular values of $\mathbf{P}_1$. To reduce the effect of noise, replace the last $(m - q - 1)$ singular values with zero when calculating the inverse of $\mathbf{P}_1$:

$$\boldsymbol{\Sigma}_1 = \mathrm{diag}(\tau_1, \tau_2, \ldots, \tau_q, 0, \ldots) \tag{30}$$

and

$$\boldsymbol{\Sigma}_1^{-1} = \mathrm{diag}(1/\tau_1, 1/\tau_2, \ldots, 1/\tau_q, 0, \ldots). \tag{31}$$

The dimensions of $\boldsymbol{\Sigma}_1^{-1}$ are the same as those of $\mathbf{P}_1^H$. Thus, the least squares approximation of $\mathbf{P}_1^{\#}$ is given by [18]

$$\mathbf{P}_1^{\#} = \mathbf{V}\boldsymbol{\Sigma}_1^{-1}\mathbf{U}^H. \tag{32}$$

Hence, an estimate of the coefficient vector $\mathbf{c}_1$ in (15) is

$$\mathbf{c}_1 = -\mathbf{V}\boldsymbol{\Sigma}_1^{-1}\mathbf{U}^H \mathbf{p}_1. \tag{33}$$

Following a similar approach, we can obtain the coefficient vector and roots for the X-subarray. The computation of the AOAs from the roots is the same as in the noiseless case.
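The rank-$q$ truncation of (28)–(33) can be sketched with `np.linalg.svd`. The function names are hypothetical, and the code is only a minimal illustration of the truncated pseudo-inverse followed by the root selection of Sect. 3.1; it uses the economy-size SVD, which yields the same product $\mathbf{V}\boldsymbol{\Sigma}_1^{-1}\mathbf{U}^H$.

```python
import numpy as np

def truncated_svd_coefficients(P1, p1, q):
    """Return c1 = -V Sigma_1^{-1} U^H p1, keeping the q largest singular values."""
    U, tau, Vh = np.linalg.svd(P1, full_matrices=False)   # tau is sorted in decreasing order
    inv_tau = np.zeros_like(tau)
    inv_tau[:q] = 1.0 / tau[:q]                # Eq. (31): zero out the remaining entries
    return -(Vh.conj().T * inv_tau) @ (U.conj().T @ p1)   # Eq. (33)

def noisy_polynomial_phases(Z, q):
    """Estimate q phases from a noisy m x M data matrix Z."""
    P = Z.T                                    # columns p_1, ..., p_m
    c1 = truncated_svd_coefficients(P[:, 1:], P[:, 0], q)
    roots = np.roots(np.concatenate(([1.0], c1))[::-1])   # descending powers for np.roots
    nearest = np.argsort(np.abs(np.abs(roots) - 1.0))[:q]
    return np.sort(np.angle(roots[nearest]))
```

Keeping only the $q$ dominant singular values restricts the solution to the estimated signal subspace, which is why the number of sources $q$ must be known or estimated beforehand.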
4 Simulation Results

In this section, we present the results of computer simulation to evaluate the 2D-AOA estimation performance of the proposed polynomial-based method. The estimation performance is compared with the propagator method described in [11]. The CR bound of 2D-AOA estimation is derived in the Appendix. An L-shaped antenna array as discussed in Sect. 2 is used with a total of $2p - 1 = 31$ elements. The elements are spaced half a wavelength apart. For the simulations, we consider three independent signal sources of equal power arriving at $(\theta, \phi)$ angles of (60°, 30°), (45°, 45°), and (30°, 60°). We take $M = 200$ samples from each sensor. The signal-to-noise ratio (SNR) is varied from 0 to 30 dB. The estimation process is run $L = 500$ times for each SNR level.
Fig. 2 RMSE of Source 1 (30°, 60°); Our method *, Propagator method o, CRB +
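Figures 2, 3, and 4 show the root-mean-square error (RMSE) of estimation versus SNR for Signal Sources 1–3, respectively. The theoretical RMSE is approximated by the sample RMSE,

$$\mathrm{RMSE} = \sqrt{E\left[(\hat{\theta}_l - \theta_l)^2 + (\hat{\phi}_l - \phi_l)^2\right]} \approx \sqrt{\frac{1}{L}\sum_{k=1}^{L}\left[(\hat{\theta}_{l,k} - \theta_l)^2 + (\hat{\phi}_{l,k} - \phi_l)^2\right]}, \quad l = 1, 2, 3. \tag{34}$$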
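The sample RMSE in (34) translates directly into code. The helper below is a hypothetical illustration for one source: it takes the per-run estimates $\hat{\theta}_{l,k}$ and $\hat{\phi}_{l,k}$ and the true angles, all in the same unit.

```python
import math

def sample_rmse(theta_hats, phi_hats, theta_true, phi_true):
    """Sample RMSE of Eq. (34) over L independent runs for one source."""
    L = len(theta_hats)
    total = sum((t - theta_true) ** 2 + (p - phi_true) ** 2
                for t, p in zip(theta_hats, phi_hats))
    return math.sqrt(total / L)
```

For example, two runs that each miss the incidence angle by one degree and estimate the azimuth exactly give an RMSE of one degree.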
We observe from the RMSE plots that the performance of our method deteriorates when the azimuth angle φ is close to 90◦ , as it is for Source 3. In such cases, the X-subarray cannot estimate the angles accurately because of its dependence on cos φ.
Fig. 3 RMSE of Source 2 (45°, 45°); Our method *, Propagator method o, CRB +
Fig. 4 RMSE of Source 3 (60°, 30°); Our method *, Propagator method o, CRB +
However, if a Y-subarray is added to the system, the angles can be estimated accurately in such cases because the Y-subarray measurements depend on $\sin\phi$.
5 Conclusion

In this paper, we address the problem of estimating the azimuth and incidence angles of arrival of multiple signals received by an L-shaped array of sensors. 2D-AOA estimation finds applications in mobile satellite communications, radar ranging, indoor M2M communications, and many other scenarios. The proposed method does not require the transmitted signals to be known a priori, and it has better estimation accuracy than the propagator method described in [11]. We derive the CR bound for the 2D-AOA estimation problem and compare the performance of the proposed SVD-based polynomial method with the CR bound. A detailed comparison of various methods in terms of performance and computational complexity can be taken up as future work.
Appendix 1: CR Bound

We derive an expression for the CR bound (CRB) for 2D-AOA estimation. The Fisher information matrix (FIM) is computed below; its inverse gives the CRB.
Appendix 1.1: Data Model

The received signal vector at time instant $t_i$ is

$$\mathbf{x}(t_i) = \sum_{l=1}^{q} \mathbf{a}(\theta_l, \phi_l)\, s_l(t_i) + \mathbf{e}(t_i) = \mathbf{A}\mathbf{s}(t_i) + \mathbf{e}(t_i), \tag{35}$$

where $\mathbf{a}(\theta_l, \phi_l)$ is the steering vector in the direction $(\theta_l, \phi_l)$, $\mathbf{e}(t_i)$ is the error vector, and $s_l(t_i)$, $\theta_l$, and $\phi_l$ are as defined in Sect. 2.
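Appendix 1.2: Log-Likelihood Function

The likelihood function of the data is

$$L(\mathbf{x}(t_1), \mathbf{x}(t_2), \ldots, \mathbf{x}(t_M)) = \frac{1}{(2\pi)^{pM}(\sigma^2/2)^{pM}} \exp\left\{-\frac{1}{\sigma^2}\sum_{i=1}^{M}[\mathbf{x}(t_i) - \mathbf{A}\mathbf{s}(t_i)]^H[\mathbf{x}(t_i) - \mathbf{A}\mathbf{s}(t_i)]\right\}. \tag{36}$$

The log-likelihood function is

$$\Lambda = \log L = K - pM\log\sigma^2 - \frac{1}{\sigma^2}\sum_{i=1}^{M}\|\mathbf{e}(t_i)\|^2, \tag{37}$$

where

$$\mathbf{e}(t_i) = \mathbf{x}(t_i) - \mathbf{A}\mathbf{s}(t_i) = \mathbf{x}(t_i) - \sum_{l=1}^{q}\mathbf{a}(\theta_l, \phi_l)\, s_l(t_i). \tag{38}$$

Here $K$ is a constant, $\log$ is the natural logarithm, and $\|\cdot\|$ denotes the vector norm. The properties of $\mathbf{e}(t_i) = \mathbf{n}(t_i)$ are given in (23)–(27).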
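Appendix 1.3: Fisher Information Matrix

First, we calculate the first derivatives of $\Lambda$ with respect to $\sigma^2$, $\mathbf{s}_R(t_i) = \mathrm{Re}\{\mathbf{s}(t_i)\}$, $\mathbf{s}_I(t_i) = \mathrm{Im}\{\mathbf{s}(t_i)\}$, $\Theta$, and $\Phi$, where

$$\Theta = [\theta_1\ \theta_2\ \cdots\ \theta_q]^T, \quad \Phi = [\phi_1\ \phi_2\ \cdots\ \phi_q]^T.$$

Differentiating (37) with respect to $\sigma^2$,

$$\frac{\partial\Lambda}{\partial\sigma^2} = -\frac{pM}{\sigma^2} + \frac{1}{\sigma^4}\sum_{i=1}^{M}\mathbf{e}^H(t_i)\mathbf{e}(t_i). \tag{39}$$

Differentiating (37) with respect to $\theta_l$,

$$\frac{\partial\Lambda}{\partial\theta_l} = -\frac{1}{\sigma^2}\sum_{i=1}^{M}\left[\mathbf{e}^H(t_i)\frac{\partial\mathbf{e}(t_i)}{\partial\theta_l} + \frac{\partial\mathbf{e}^H(t_i)}{\partial\theta_l}\mathbf{e}(t_i)\right]. \tag{40}$$

Now from (38), we have

$$\frac{\partial\mathbf{e}(t_i)}{\partial\theta_l} = -\frac{\partial\mathbf{a}(\theta_l, \phi_l)}{\partial\theta_l}\, s_l(t_i) \tag{41}$$

and

$$\frac{\partial\mathbf{e}^H(t_i)}{\partial\theta_l} = -s_l^*(t_i)\frac{\partial\mathbf{a}^H(\theta_l, \phi_l)}{\partial\theta_l}. \tag{42}$$

Substitution of (41) and (42) into (40) yields

$$\frac{\partial\Lambda}{\partial\theta_l} = \frac{1}{\sigma^2}\sum_{i=1}^{M}\left[\mathbf{e}^H(t_i)\frac{\partial\mathbf{a}(\theta_l, \phi_l)}{\partial\theta_l}\, s_l(t_i) + s_l^*(t_i)\frac{\partial\mathbf{a}^H(\theta_l, \phi_l)}{\partial\theta_l}\mathbf{e}(t_i)\right]. \tag{43}$$

Therefore,

$$\frac{\partial\Lambda}{\partial\theta_l} = \frac{2}{\sigma^2}\sum_{i=1}^{M}\mathrm{Re}\left\{s_l^*(t_i)\frac{\partial\mathbf{a}^H(\theta_l, \phi_l)}{\partial\theta_l}\mathbf{e}(t_i)\right\}, \quad l = 1, 2, \ldots, q. \tag{44}$$

Similarly,

$$\frac{\partial\Lambda}{\partial\phi_l} = \frac{2}{\sigma^2}\sum_{i=1}^{M}\mathrm{Re}\left\{s_l^*(t_i)\frac{\partial\mathbf{a}^H(\theta_l, \phi_l)}{\partial\phi_l}\mathbf{e}(t_i)\right\}, \quad l = 1, 2, \ldots, q. \tag{45}$$

From (44), we have

$$\frac{\partial\Lambda}{\partial\Theta} = \frac{2}{\sigma^2}\sum_{i=1}^{M}\mathrm{Re}\left\{\mathbf{S}_D^H(t_i)\mathbf{D}_\theta^H\mathbf{e}(t_i)\right\}, \tag{46}$$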
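and from (45), we have

$$\frac{\partial\Lambda}{\partial\Phi} = \frac{2}{\sigma^2}\sum_{i=1}^{M}\mathrm{Re}\left\{\mathbf{S}_D^H(t_i)\mathbf{D}_\phi^H\mathbf{e}(t_i)\right\}, \tag{47}$$

where

$$\mathbf{S}_D(t_i) = \mathrm{diag}\left(s_1(t_i), \ldots, s_q(t_i)\right), \quad \mathbf{D}_\theta = \left[\frac{\partial\mathbf{a}(\theta_1, \phi_1)}{\partial\theta_1}\ \cdots\ \frac{\partial\mathbf{a}(\theta_q, \phi_q)}{\partial\theta_q}\right], \quad \mathbf{D}_\phi = \left[\frac{\partial\mathbf{a}(\theta_1, \phi_1)}{\partial\phi_1}\ \cdots\ \frac{\partial\mathbf{a}(\theta_q, \phi_q)}{\partial\phi_q}\right].$$

Differentiating $\Lambda$ with respect to $\mathbf{s}_R(t_i) = \mathrm{Re}\{\mathbf{s}(t_i)\}$, we get

$$\frac{\partial\Lambda}{\partial\mathbf{s}_R(t_i)} = \frac{1}{\sigma^2}\left[\mathbf{A}^H\mathbf{e}(t_i) + \mathbf{A}^T\mathbf{e}^*(t_i)\right] = \frac{2}{\sigma^2}\mathrm{Re}\left\{\mathbf{A}^H\mathbf{e}(t_i)\right\}. \tag{48}$$

Similarly, the first derivative of $\Lambda$ with respect to $\mathbf{s}_I(t_i) = \mathrm{Im}\{\mathbf{s}(t_i)\}$ is

$$\frac{\partial\Lambda}{\partial\mathbf{s}_I(t_i)} = \frac{1}{\sigma^2}\left[-j\mathbf{A}^H\mathbf{e}(t_i) + j\mathbf{A}^T\mathbf{e}^*(t_i)\right] = \frac{2}{\sigma^2}\mathrm{Im}\left\{\mathbf{A}^H\mathbf{e}(t_i)\right\}. \tag{49}$$

To evaluate the elements of the Fisher information matrix, we need the following identities [19].

P1:

$$E\left[\mathbf{e}^H(t_i)\mathbf{e}(t_i)\mathbf{e}^H(t_j)\mathbf{e}(t_j)\right] = \begin{cases} p^2\sigma^4 & \text{for } i \neq j \\ p(p+1)\sigma^4 & \text{for } i = j. \end{cases} \tag{50}$$

P2:

$$E\left[\mathbf{e}^H(t_i)\mathbf{e}(t_i)\mathbf{e}^T(t_j)\right] = \mathbf{0} \quad \text{for all } i \text{ and } j. \tag{51}$$

P3:

$$\mathrm{Re}\{\mathbf{x}\}\mathrm{Re}\{\mathbf{y}^T\} = \frac{1}{2}\left[\mathrm{Re}\{\mathbf{x}\mathbf{y}^T\} + \mathrm{Re}\{\mathbf{x}\mathbf{y}^H\}\right], \tag{52}$$

$$\mathrm{Im}\{\mathbf{x}\}\mathrm{Im}\{\mathbf{y}^T\} = -\frac{1}{2}\left[\mathrm{Re}\{\mathbf{x}\mathbf{y}^T\} - \mathrm{Re}\{\mathbf{x}\mathbf{y}^H\}\right], \tag{53}$$

$$\mathrm{Re}\{\mathbf{x}\}\mathrm{Im}\{\mathbf{y}^T\} = \frac{1}{2}\left[\mathrm{Im}\{\mathbf{x}\mathbf{y}^T\} - \mathrm{Im}\{\mathbf{x}\mathbf{y}^H\}\right]. \tag{54}$$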
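P4: Let $\mathbf{Q}$ be a nonsingular complex matrix, and let $\mathbf{G}$ be its inverse, $\mathbf{G} = \mathbf{Q}^{-1}$. Then, with the subscripts $R$ and $I$ denoting real and imaginary parts,

$$\begin{pmatrix} \mathbf{Q}_R & -\mathbf{Q}_I \\ \mathbf{Q}_I & \mathbf{Q}_R \end{pmatrix}^{-1} = \begin{pmatrix} \mathbf{G}_R & -\mathbf{G}_I \\ \mathbf{G}_I & \mathbf{G}_R \end{pmatrix}. \tag{55}$$

Now let us continue with the evaluation of the Fisher information matrix, which is given by [19]

$$\mathrm{FIM} = E\left[\Psi\Psi^T\right], \tag{56}$$

where

$$\Psi^T = \partial\Lambda / \partial\left[\sigma^2, \mathbf{s}_R(t_1), \mathbf{s}_I(t_1), \ldots, \mathbf{s}_R(t_M), \mathbf{s}_I(t_M), \Theta^T, \Phi^T\right].$$

From (39) and (50), we derive

$$E\left[\left(\frac{\partial\Lambda}{\partial\sigma^2}\right)^2\right] = \frac{pM}{\sigma^4}. \tag{57}$$

From (39), together with (46), (47), (48), and (49), we get

$$E\left[\frac{\partial\Lambda}{\partial\sigma^2}\left(\frac{\partial\Lambda}{\partial\Theta}\right)^T\right] = E\left[\frac{\partial\Lambda}{\partial\sigma^2}\left(\frac{\partial\Lambda}{\partial\Phi}\right)^T\right] = E\left[\frac{\partial\Lambda}{\partial\sigma^2}\left(\frac{\partial\Lambda}{\partial\mathbf{s}_R(t_i)}\right)^T\right] = E\left[\frac{\partial\Lambda}{\partial\sigma^2}\left(\frac{\partial\Lambda}{\partial\mathbf{s}_I(t_i)}\right)^T\right] = \mathbf{0}. \tag{58}$$

Next, we use (51) and the fact that $E[\mathbf{e}(t_i)\mathbf{e}^T(t_j)] = \mathbf{0}$ for all $i$ and $j$ to obtain all the following expressions. From (48),

$$E\left[\frac{\partial\Lambda}{\partial\mathbf{s}_R(t_i)}\left(\frac{\partial\Lambda}{\partial\mathbf{s}_R(t_j)}\right)^T\right] = \frac{2}{\sigma^2}\mathrm{Re}\left\{\mathbf{A}^H\mathbf{A}\right\}\delta_{i,j}, \tag{59}$$

where $\delta_{i,j}$ is the Kronecker delta given by

$$\delta_{i,j} = \begin{cases} 1 & \text{for } i = j \\ 0 & \text{for } i \neq j. \end{cases}$$

From (48) and (49),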
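$$E\left[\frac{\partial\Lambda}{\partial\mathbf{s}_R(t_i)}\left(\frac{\partial\Lambda}{\partial\mathbf{s}_I(t_j)}\right)^T\right] = -\frac{2}{\sigma^2}\mathrm{Im}\left\{\mathbf{A}^H\mathbf{A}\right\}\delta_{i,j}. \tag{60}$$

From (46) and (48),

$$E\left[\frac{\partial\Lambda}{\partial\mathbf{s}_R(t_i)}\left(\frac{\partial\Lambda}{\partial\Theta}\right)^T\right] = \frac{2}{\sigma^2}\mathrm{Re}\left\{\mathbf{A}^H\mathbf{D}_\theta\mathbf{S}_D(t_i)\right\}. \tag{61}$$

Similarly, from (47) and (48), we have

$$E\left[\frac{\partial\Lambda}{\partial\mathbf{s}_R(t_i)}\left(\frac{\partial\Lambda}{\partial\Phi}\right)^T\right] = \frac{2}{\sigma^2}\mathrm{Re}\left\{\mathbf{A}^H\mathbf{D}_\phi\mathbf{S}_D(t_i)\right\}. \tag{62}$$

From (49),

$$E\left[\frac{\partial\Lambda}{\partial\mathbf{s}_I(t_i)}\left(\frac{\partial\Lambda}{\partial\mathbf{s}_I(t_j)}\right)^T\right] = \frac{2}{\sigma^2}\mathrm{Re}\left\{\mathbf{A}^H\mathbf{A}\right\}\delta_{i,j}. \tag{63}$$

From (46) and (49),

$$E\left[\frac{\partial\Lambda}{\partial\mathbf{s}_I(t_i)}\left(\frac{\partial\Lambda}{\partial\Theta}\right)^T\right] = \frac{2}{\sigma^2}\mathrm{Im}\left\{\mathbf{A}^H\mathbf{D}_\theta\mathbf{S}_D(t_i)\right\}. \tag{64}$$

Similarly, from (47) and (49), we have

$$E\left[\frac{\partial\Lambda}{\partial\mathbf{s}_I(t_i)}\left(\frac{\partial\Lambda}{\partial\Phi}\right)^T\right] = \frac{2}{\sigma^2}\mathrm{Im}\left\{\mathbf{A}^H\mathbf{D}_\phi\mathbf{S}_D(t_i)\right\}. \tag{65}$$

From (46), we derive

$$E\left[\frac{\partial\Lambda}{\partial\Theta}\left(\frac{\partial\Lambda}{\partial\Theta}\right)^T\right] = \frac{2}{\sigma^2}\sum_{i=1}^{M}\mathrm{Re}\left\{\mathbf{S}_D^H(t_i)\mathbf{D}_\theta^H\mathbf{D}_\theta\mathbf{S}_D(t_i)\right\} = \Gamma_{\theta,\theta}. \tag{66}$$

Similarly,

$$E\left[\frac{\partial\Lambda}{\partial\Theta}\left(\frac{\partial\Lambda}{\partial\Phi}\right)^T\right] = \frac{2}{\sigma^2}\sum_{i=1}^{M}\mathrm{Re}\left\{\mathbf{S}_D^H(t_i)\mathbf{D}_\theta^H\mathbf{D}_\phi\mathbf{S}_D(t_i)\right\} = \Gamma_{\theta,\phi} \tag{67}$$

and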
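$$E\left[\frac{\partial\Lambda}{\partial\Phi}\left(\frac{\partial\Lambda}{\partial\Phi}\right)^T\right] = \frac{2}{\sigma^2}\sum_{i=1}^{M}\mathrm{Re}\left\{\mathbf{S}_D^H(t_i)\mathbf{D}_\phi^H\mathbf{D}_\phi\mathbf{S}_D(t_i)\right\} = \Gamma_{\phi,\phi}. \tag{68}$$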
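Appendix 1.4: The Cramer–Rao Bound

The CRB covariance matrix is given by

$$\mathrm{CRB}(\sigma^2, \mathbf{s}_R(t_1), \mathbf{s}_I(t_1), \ldots, \mathbf{s}_R(t_M), \mathbf{s}_I(t_M), \Theta, \Phi) = \Omega = (\mathrm{FIM})^{-1}, \tag{69}$$

where FIM is defined in (56). We introduce the following notation:

$$\mathrm{var}_{CR}(\sigma^2) = \frac{\sigma^4}{pM}, \quad \mathbf{Q} = \frac{2}{\sigma^2}\mathbf{A}^H\mathbf{A}, \quad \mathbf{G} = \mathbf{Q}^{-1},$$

$$\Delta_{\theta,i} = \frac{2}{\sigma^2}\mathbf{A}^H\mathbf{D}_\theta\mathbf{S}_D(t_i), \quad \Delta_{\phi,i} = \frac{2}{\sigma^2}\mathbf{A}^H\mathbf{D}_\phi\mathbf{S}_D(t_i),$$

$$\Delta_{\theta,i,R} = \mathrm{Re}\{\Delta_{\theta,i}\}, \quad \Delta_{\theta,i,I} = \mathrm{Im}\{\Delta_{\theta,i}\}, \quad \Delta_{\phi,i,R} = \mathrm{Re}\{\Delta_{\phi,i}\}, \quad \Delta_{\phi,i,I} = \mathrm{Im}\{\Delta_{\phi,i}\}, \quad \Delta_i = [\Delta_{\theta,i}\ \Delta_{\phi,i}].$$

Inserting (57)–(68) into (69) and using the above notation, we get a block-diagonal FIM whose inverse is

$$\Omega = \begin{pmatrix} pM/\sigma^4 & \mathbf{0} \\ \mathbf{0} & \Omega_1^{-1} \end{pmatrix}^{-1} = \begin{pmatrix} \mathrm{var}_{CR}(\sigma^2) & \mathbf{0} \\ \mathbf{0} & \Omega_1 \end{pmatrix}. \tag{70}$$

The matrix $\Omega_1$ is given by

$$\Omega_1 = \mathrm{CRB}(\mathbf{s}_R(t_1), \mathbf{s}_I(t_1), \ldots, \mathbf{s}_R(t_M), \mathbf{s}_I(t_M), \Theta, \Phi) = \begin{pmatrix} \mathbf{Q}_R & -\mathbf{Q}_I & & & & \Delta_{1,R} \\ \mathbf{Q}_I & \mathbf{Q}_R & & \mathbf{0} & & \Delta_{1,I} \\ & & \ddots & & & \vdots \\ & \mathbf{0} & & \mathbf{Q}_R & -\mathbf{Q}_I & \Delta_{M,R} \\ & & & \mathbf{Q}_I & \mathbf{Q}_R & \Delta_{M,I} \\ \Delta_{1,R}^T & \Delta_{1,I}^T & \cdots & \Delta_{M,R}^T & \Delta_{M,I}^T & \Gamma \end{pmatrix}^{-1}, \tag{71}$$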
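where $\Delta_{1,R} = [\Delta_{\theta,1,R}\ \Delta_{\phi,1,R}]$, $\Delta_{1,I} = [\Delta_{\theta,1,I}\ \Delta_{\phi,1,I}]$ (and similarly for $i = 2, \ldots, M$), and

$$\Gamma = \begin{pmatrix} \Gamma_{\theta,\theta} & \Gamma_{\theta,\phi} \\ \Gamma_{\theta,\phi}^T & \Gamma_{\phi,\phi} \end{pmatrix}.$$

Now we use the matrix inversion lemma together with P4 to obtain

$$\{\mathrm{CRB}(\Theta, \Phi)\}^{-1} = \Gamma - \left[\Delta_{1,R}^T\ \Delta_{1,I}^T\ \cdots\ \Delta_{M,R}^T\ \Delta_{M,I}^T\right] \begin{pmatrix} \mathbf{G}_R & -\mathbf{G}_I & & & \\ \mathbf{G}_I & \mathbf{G}_R & & \mathbf{0} & \\ & & \ddots & & \\ & \mathbf{0} & & \mathbf{G}_R & -\mathbf{G}_I \\ & & & \mathbf{G}_I & \mathbf{G}_R \end{pmatrix} \begin{pmatrix} \Delta_{1,R} \\ \Delta_{1,I} \\ \vdots \\ \Delta_{M,R} \\ \Delta_{M,I} \end{pmatrix}. \tag{72}$$

Observe that

$$\begin{pmatrix} \mathbf{G}_R & -\mathbf{G}_I \\ \mathbf{G}_I & \mathbf{G}_R \end{pmatrix}\begin{pmatrix} \Delta_R \\ \Delta_I \end{pmatrix} = \begin{pmatrix} \mathbf{G}_R\Delta_R - \mathbf{G}_I\Delta_I \\ \mathbf{G}_I\Delta_R + \mathbf{G}_R\Delta_I \end{pmatrix} = \begin{pmatrix} \mathrm{Re}\{\mathbf{G}\Delta\} \\ \mathrm{Im}\{\mathbf{G}\Delta\} \end{pmatrix} \tag{73}$$

and that

$$\left[\Delta_R^T\ \Delta_I^T\right]\begin{pmatrix} \mathrm{Re}\{\mathbf{G}\Delta\} \\ \mathrm{Im}\{\mathbf{G}\Delta\} \end{pmatrix} = \mathrm{Re}\left\{\Delta^H\mathbf{G}\Delta\right\}. \tag{74}$$

From the above equations,

$$\{\mathrm{CRB}(\Theta, \Phi)\}^{-1} = \Gamma - \sum_{i=1}^{M}\mathrm{Re}\left\{\Delta_i^H\mathbf{G}\Delta_i\right\}. \tag{75}$$

Therefore,

$$\mathrm{CRB}(\Theta, \Phi) = \left[\Gamma - \sum_{i=1}^{M}\mathrm{Re}\left\{\Delta_i^H\mathbf{G}\Delta_i\right\}\right]^{-1}, \tag{76}$$

which is the required result.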
References

1. Pillai SU (1989) Array signal processing. Springer, New York
2. Ohmori S, Wakana H, Kawase S (1998) Mobile satellite communications. Artech House, Boston
3. Swindlehurst AL, Kailath T (1993) Azimuth/elevation direction finding using regular array geometries. IEEE Trans Aerosp Electron Syst 29(1):145–156
4. van der Veen AJ, Ober PB, Deprettere EF (1992) Azimuth and elevation computation in high resolution DOA estimation. IEEE Trans Acoust Speech, Signal Process 40(7):1828–1832
5. Kedia VS, Chandna B (1997) A new algorithm for 2-D DOA estimation. Signal Process 60(3):325–332
6. Wu Y, Liao G, So HC (2003) A fast algorithm for 2-D direction-of-arrival estimation. Signal Process 83(8):1827–1831
7. Zoltowski MD, Mathews CP (1993) Closed-form 2D angle estimation with uniform circular array via phase mode excitation and ESPRIT. In: Proceedings of 27th asilomar conference on signals, systems and computers, vol 1, pp 169–173
8. Zoltowski MD, Haardt M, Mathews CP (1996) Closed-form 2D angle estimation with rectangular arrays in element space or beamspace via unitary ESPRIT. IEEE Trans Signal Process 44(2):316–328
9. Hua Y, Sarkar TK, Weiner DD (1991) An L-shaped array for estimating 2-D directions of wave arrival. IEEE Trans Antennas Propag 39(2):143–146
10. del Rio JEF, Catedra-Peraz MF (1997) The matrix pencil method for two-dimensional direction of arrival estimation employing L-shaped array. IEEE Trans Antennas Propag 45(11):1693–1694
11. Tayem N, Kwon HM (2005) L-shape 2-dimensional arrival angle estimation with propagator method. IEEE Trans Antennas Propag 53(5):1622–1630
12. Shu T, Liu X, Lu J (2008) Comments on "L-shape 2-dimensional arrival angle estimation with propagator method", vol 56, Issue 5, pp 1502–1503
13. Liang J, Liu D (2010) Joint elevation and azimuth direction finding using L-shaped array. IEEE Trans Acoust Speech, Signal Process 58(6):2136–2141
14. Changuel H, Harabi F, Gharsallah A (2006) 2-L-shape two-dimensional arrival angle estimation with a classical subspace algorithm. Proc IEEE Int Symp Ind Electron 1:603–607
15. Porozantzidou MG, Chryssomallis MT (2010) Azimuth and elevation angles estimation using 2-D MUSIC algorithm with an L-shape antenna. In: Proceedings of IEEE antennas and propagation society international symposium, ON, Canada, Toronto 11–17 July 2010
16. Singh P, Sircar P (2011) Time delays and angles of arrival estimation using known signals. Signal, Image Video Process 6(2):171–178
17. Papoulis A, Pillai SU (2002) Probability, random variables and stochastic processes, 4th edn. McGraw-Hill, New York
18. Dewilde P, Deprettere EF (1988) Singular value decomposition: an introduction. In: Deprettere EF (ed) SVD and signal processing: algorithms, applications, and architectures. Elsevier Science, North Holland, pp 3–41
19. Stoica P, Nehorai A (1989) MUSIC, maximum likelihood, and Cramer-Rao bound. IEEE Trans Acoust Speech Signal Process 37(5):720–741
Interference Management Technique for LTE, Wi-Fi Coexistence Based on CSI at the Transmitter

D. Diana Josephine and A. Rajeswari
Abstract Coexistence of wireless devices is an attractive option when devices operate in the 2.4/5 GHz unlicensed radio spectrum. In recent times, LTE and Wi-Fi are the most promising wireless technologies that coexist. Joint operation of LTE and Wi-Fi in the same license-exempt bands is a real possibility, but it requires advanced intelligent techniques. When such devices coexist, they suffer from significant interference, which leads to performance degradation. This paper focuses on quantifying the impact of interference on the performance metrics and implements a novel Channel Condition-Based Equipment (CCBE) to manage such interference. Results show that, for a fixed SNR, there is a reasonable reduction in BER when transmissions are allowed only after estimating the behavior of the channel. The coexistence environment is deployed in MATLAB with the aid of the WLAN and LTE toolboxes.

Keywords Coexistence · Unlicensed spectrum · Interference · Channel estimation
D. D. Josephine (corresponding author) · A. Rajeswari
Department of ECE, Coimbatore Institute of Technology, Coimbatore, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. M. Thampi et al. (eds.), Applied Soft Computing and Communication Networks, Lecture Notes in Networks and Systems 187, https://doi.org/10.1007/978-981-33-6173-7_17

1 Introduction

In the past two decades, the use of wireless communication has exceeded the use of personal or Human-to-Human (H2H) communication. However, the deployment of wireless communication systems has faced several severe challenges due to the current spectrum scarcity. India is one of the spectrum-starved countries, where each operator gets only about one-fourth of the licensed spectrum available to operators in other countries, so unleashing the potential of the unlicensed spectrum is attractive. The situation has nevertheless been worsened by the development of new wireless technologies and standards. Hence, coexistence among heterogeneous networks has become a major research focus in academia and industry. As more and more wireless devices use the same 5/2.4 GHz radio spectrum, coexisting wireless devices in this band suffer from significant mutual interference, which degrades the performance of the entire system [1]. As demand for mobile data access continues to escalate, LTE and Wi-Fi coexistence has become a promising way to enhance the user's mobile data experience.

From the technological perspective, LTE is the preferable choice for traffic offload due to its optimal usage of radio resources, but it requires a backhaul connection. On the other hand, the low deployment cost of Wi-Fi and its broad adoption make it competitive in spite of its lower radio resource usage. It is challenging to support coexistence between LTE and Wi-Fi. LTE has been used in a controlled environment with little interference and does not have any additional mechanism to avoid interference. Wi-Fi, on the other hand, is mainly used in the ISM band and uses Carrier Sense Multiple Access (CSMA) to avoid interference. If they are used in the same spectrum, it is likely that LTE dominates the spectrum while Wi-Fi devices keep deferring to LTE transmissions [7]. Even if Wi-Fi gets a chance to access the channel, it is easily interfered with as soon as the LTE signal becomes stronger, which degrades the performance of both systems. This results from the Wi-Fi protocol operation, CSMA, which provides channel access only in low-interference situations. Therefore, mechanisms for enabling LTE/Wi-Fi coexistence and managing the interference are essential. The contributions of this work are as follows:

• Develop a channel estimation-based interference management model
• Estimate the impact of interference
• Show the benefit and accuracy of the model

The paper is organized as follows. Section 2 presents a brief review of LTE/Wi-Fi coexistence issues and their potential benefits from the literature. Section 3 presents the approaches for enabling LTE/Wi-Fi coexistence. Simulation results and discussions are included in Sect. 4. Finally, the conclusion and future work are given in Sect. 5.
2 Related Works

The problem of Wi-Fi and LTE coexistence and the potential impact of one network on the other have recently been studied, and simulation results have been presented in a handful of research and industry publications. Paper [1] reviewed the coexistence between LTE and Wi-Fi access systems operating in the 5 GHz ISM spectrum with respect to channel selection and transmit power control. An LTE/Wi-Fi coexistence mechanism that wakes up Wi-Fi nodes during the LTE blank sub-frame period was proposed. The Wi-Fi APs identify the LTE blank sub-frame period and call their Wi-Fi stations to access the channel. Interference between the Wi-Fi stations is avoided by beacon procedures. A system-level simulation was carried out with different LTE blank sub-frame configurations for an indoor scenario with varying numbers of APs and nodes. The results show that, in general, blanking LTE sub-frames increases the per-user throughput in Wi-Fi networks. However, LTE throughput decreases
both because it loses time resources and because it suffers interference from Wi-Fi nodes that are unable to confine their transmissions within the allotted time. Paper [2] surveys the issues that arise from simultaneous operation of LTE and Wi-Fi in the same spectrum bands, the technology that is affected the most, and the factors that determine the impact of LTE and Wi-Fi coexistence. It also identifies the strengths and weaknesses of existing solutions and suggests potential strategies to improve the performance of the two technologies. Paper [3] addressed the channel occupation and channel selection problem for LTE-U coexisting with a Wi-Fi system in the unlicensed band and reviewed different solutions to the coexistence issue. The works discussed are based either on fair spectrum allocation between LTE-U and Wi-Fi technologies or on the game theory paradigm. In paper [7], interference management is achieved by monitoring, at the Wi-Fi device, the signal energy on a communication channel in a frequency band associated with the Wi-Fi device. The monitored signal energy is then compared with a known waveform signature corresponding to LTE operation, and the presence of an LTE interferer on that communication channel is detected. Paper [9] provides an analytical model to evaluate the performance of the LBT-based coexistence scheme in a scenario with multiple LTE SBSs. It also derives expressions for the occupancy time of LTE, Wi-Fi, and access overhead, and for the collision probability caused by the back-off mechanisms of the LTE SBSs and Wi-Fi. This LBT scheme is taken as the foundation for the channel estimation-based interference management technique. The numerical results show that the LBT-based scheme is not fair to the Wi-Fi system in a coexistence scenario with multiple LTE SBSs.
To achieve good performance, the number of LTE SBSs and the length of the LTE frame should be carefully designed to protect the Wi-Fi system from starving. Other techniques discussed in the literature to combat interference are frequency-based isolation (which includes OFDM subcarrier suppression and fine-grained frequency fragmentation), MIMO, auction-based cooperation and competition, the cognitive coexistence scheme that enables spectrum sharing between U-LTE and Wi-Fi networks (referred to as CU-LTE), carrier aggregation and selection, duty cycle-based LTE-U, LAA for LTE, the LBT-based coexistence scheme, interference-aware channel scheduling, LTE blank sub-frames, transmission power control, resource allocation, etc.
3 Interference Modeling

When two wireless devices transmit at the same time, their radio signals collide and become distorted. A natural approach to support coexistence between multiple transmissions is to separate them in time, frequency, or space, thereby
avoiding interference. Characterizing interference is critical, and understanding the performance of a wireless heterogeneous network is more difficult still. Much protocol and algorithmic work fundamentally depends on such interference characterization. However, current research considers interference models that are either oversimplified or too abstract, with unknown parameters that limit their use in practice. 802.11 devices operating on the same channel use a Clear Channel Assessment (CCA) check to avoid these collisions. However, the CCA check may not detect a transmission occurring on a different channel.
3.1 Channel Estimation

One way to support the coexistence of LTE and Wi-Fi is to separate them in either the time domain or the frequency domain, as used by Qualcomm. However, even with accurate demand estimation, only one of them can be active at a given time and frequency, which limits the total throughput and performance. A novel Channel Condition-Based Equipment (CCBE) scheme is proposed in this paper (shown in Fig. 1), which performs channel estimation to identify the best channel for
Fig. 1 Proposed Channel condition-based equipment
[Flowchart blocks: Start; New frame to send; Perform CCA; Is channel free? (Yes/No); N = rand(1, 2, ..., q); Wait N time slots and perform CCA; Is channel free? (Yes/No); N = N − 1; Is N = 0? (Yes/No); Perform channel estimation using pilot symbols; Classifier (Best channel / Average channel / Bad channel); Occupy the channel and transmit the frames; Enter contention window with BEB strategy]
transmission. This is done because identifying a good channel and then transmitting is more worthwhile than transmitting frequently on a bad channel and ending up with multiple retransmissions. The CCBE is proposed with an understanding of the Frame-Based Equipment (FBE) and Load-Based Equipment (LBE) schemes discussed in the literature. Channel estimation is also essential for signal decoding [6]. The conventional channel persistence method and a channel estimation scheme using pilot symbols are adopted to classify the channel. Transmission is then carried out when at least an average channel condition is achieved. The channel estimation is done at the transmitter, since it is difficult to obtain the reference or pilot signals for channel estimation at the receiver: as the received signal is a mix of LTE and Wi-Fi signals, the pilot symbols recovered there for channel estimation would be erroneous.
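The access procedure described above can be sketched in a few lines of Python. This is only an illustrative reading of Fig. 1, not the authors' implementation: the function names, the gain thresholds used by the classifier, the rule that the counter N decreases only on idle slots, and the `max_slots` guard are all assumptions introduced here. The channel estimate is a plain least-squares estimate (received pilot divided by transmitted pilot).

```python
import random
from statistics import mean


def ls_channel_estimate(tx_pilots, rx_pilots):
    """Least-squares estimate per pilot symbol: h = y / x."""
    return [y / x for x, y in zip(tx_pilots, rx_pilots)]


def classify_channel(h_est, best_gain=0.8, bad_gain=0.3):
    """Label the channel from the mean pilot gain (thresholds are hypothetical)."""
    gain = mean(abs(h) for h in h_est)
    if gain >= best_gain:
        return "best"
    if gain <= bad_gain:
        return "bad"
    return "average"


def ccbe_access(channel_free, tx_pilots, rx_pilots, q=8, max_slots=1000, rng=random):
    """One CCBE access attempt.

    channel_free: callable returning True when a CCA check finds the channel idle.
    Returns (action, label, slots_waited).
    """
    slots_waited = 0
    if not channel_free():                  # initial CCA found the channel busy
        n = rng.randint(1, q)               # N = rand(1, 2, ..., q)
        while n > 0 and slots_waited < max_slots:
            slots_waited += 1               # wait one time slot, then repeat CCA
            if channel_free():
                n -= 1                      # assumed: count down only on idle slots
        if n > 0:                           # guard added here: never got the channel
            return "backoff", "unknown", slots_waited
    label = classify_channel(ls_channel_estimate(tx_pilots, rx_pilots))
    # Best or average channel: occupy the channel and transmit.
    # Bad channel: enter the contention window (BEB strategy, not modeled here).
    action = "transmit" if label in ("best", "average") else "backoff"
    return action, label, slots_waited
```

For example, `ccbe_access(lambda: True, pilots, pilots)` with identical transmitted and received pilots yields an ideal unit-gain estimate, so the sketch classifies the channel as best and transmits without waiting.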
4 LTE and Wi-Fi Simulation Modeling

MATLAB can simulate wireless systems (LTE, WLAN, 5G, etc.) and then connect the designs to a range of RF instruments and Software-Defined Radio (SDR) devices. The LTE and Wi-Fi coexistence is studied in two phases. In Phase 1, the LTE and Wi-Fi signals are generated independently in a clean environment and their Bit Error Rate (BER) is tested. In Phase 2, the LTE and Wi-Fi signals are generated in a coexistence environment, i.e., transmitted jointly, and their BER is tested. The two phases are illustrated as a block diagram in Fig. 2.
Fig. 2 LTE and Wi-Fi Deployment Scenario
[Block diagram. Phase 1, homogeneous network environment: Wi-Fi and LTE signal generation in a homogeneous environment, followed by performance metric measurement and analysis. Phase 2, heterogeneous network environment: Wi-Fi and LTE signal generation in a heterogeneous environment, followed by performance metric measurement and analysis. Both phases feed an interference model based on measurement analysis.]
Table 1 LTE PHY/MAC parameter setup
4.1 Channel Model The wlanTGacChannel System object filters an input signal through an 802.11ac (TGac) multipath fading channel.
4.2 LTE/LTE-A Transceiver

Generating LTE waveforms requires a deep understanding of the LTE standards. LTE System Toolbox offers complete control of LTE waveform generation, including standard-compliant Reference Measurement Channels (RMCs) and Fixed Reference Channels (FRCs) for uplink and downlink, and downlink E-TM test models as defined by 3GPP; they comply with the definitions found in TS 36.101. Table 1 shows the LTE system properties and their typical values or configuration types for the eNodeB, the Physical Downlink Control Channel (PDCCH), and the Physical Downlink Shared Channel (PDSCH), as set for the generation of the LTE waveform.
4.3 WLAN 802.11ac Transceiver WLAN Toolbox provides standards-compliant functions for the design, simulation, analysis, and testing of wireless LAN communications systems. The following Table 2 shows the wireless LAN Very High Throughput Configuration (wlanVHTConfig), wlanTGacChannel and AWGN channel properties and their typical values or configuration type set for the generation of Wi-Fi signal.
Table 2 Wi-Fi PHY/MAC parameter setup
4.4 Generation of Clean LTE and Wi-Fi Signal

Generation of a clean LTE and Wi-Fi signal involves producing the test waveform at the transmitter, passing it through the fading channel, and receiving it at the receiver with proper synchronization and demodulation. To achieve coexistence and reasonable system performance, channel estimation and equalization are performed. The step-by-step process for generating the WlanVHT waveform and recovering it is as follows.

STEP 1: Create a random bit stream.
STEP 2: Create a VHT configuration object and generate an 80 MHz VHT waveform.
STEP 3: Create a TGac and an AWGN channel (model the sampling frequency as equal to the channel bandwidth).
STEP 4: Calculate the noise variance for the receiver (nVar is equal to kTBF, where k is Boltzmann's constant, T is the ambient temperature of 290 K, B is the bandwidth (sample rate), and F is the receiver noise figure (9 dB)).
STEP 5: Pass the VHT waveform through the noisy TGac and AWGN channel.
STEP 6: Configure the recovery object and recover the payload bits using a perfect channel estimate.
STEP 7: Compare the recovered bits against the transmitted bits.

The concatenated PSDU bits for all packets, data, are passed as an argument to wlanWaveformGenerator along with the VHT packet configuration object vhtCfg. This configures the waveform generator to synthesize an 802.11ac VHT waveform with 4 packets, and the idle time between the packets is set to 15e-6 s. Figure 3 shows the generated baseband IEEE 802.11ac waveform and Fig. 4 shows the same waveform after passing through the TGac and AWGN channel.
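As a worked example of STEP 4, the noise variance kTBF can be evaluated directly for the stated values (T = 290 K, B = 80 MHz, noise figure 9 dB). The Python sketch below is illustrative only; the function name is hypothetical and the paper itself performs this step in MATLAB.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann's constant k in J/K


def noise_variance(bandwidth_hz, noise_figure_db, temperature_k=290.0):
    """Thermal noise power kTBF in watts, with F converted from dB to linear."""
    noise_factor = 10 ** (noise_figure_db / 10)
    return BOLTZMANN * temperature_k * bandwidth_hz * noise_factor


n_var = noise_variance(bandwidth_hz=80e6, noise_figure_db=9.0)
n_var_dbm = 10 * math.log10(n_var / 1e-3)
print(f"nVar = {n_var:.3e} W ({n_var_dbm:.2f} dBm)")
```

With these inputs the noise power is roughly 2.5e-12 W, or about −85.9 dBm, which matches the familiar rule of thumb of −174 dBm/Hz plus 10 log10(B) plus the noise figure.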
Fig. 3 WlanVHT waveform
The idle gap between the packets is disturbed when the waveform passes through the channel, as is clearly depicted in Fig. 4. Figure 5 shows the corresponding spectrum plot. The lteRMCDL function (downlink reference measurement channel configuration) returns a configuration structure for the reference channel. The structure contains the configuration parameters required to generate a given reference channel waveform using the Reference Measurement Channel (RMC). The RMC is selected as R.13, and the parameters are set up as defined in Table 1. For R.13, the number of resource blocks is set to 50 and the number of antennas to 4. Figure 6 depicts the 10 sub-frames of the generated R.13 RMC waveform. There are four plots, each corresponding to one of the four transmit antennas for R.13. The sixth sub-frame (sub-frame 5) includes a hole, which indicates that the sixth sub-frame does not contain any Physical Downlink Shared Channel (PDSCH) data transmission. In the sixth sub-frame, the dark blue plot includes more data because the Primary and Secondary Synchronization Signals (PSS/SSS) are located in that sub-frame and transmitted from only one antenna.
Fig. 4 WlanVHT waveform after passing through TGac and AWGN channel
Fig. 5 Spectrum of WlanVHT waveform before and after passing through TGac and AWGN channel
4.5 LTE and Wi-Fi Coexistence Deployment Scenario

The LTE and Wi-Fi coexistence deployment scenario involves the Phase 2 heterogeneous environment mentioned earlier. The generated LTE signal is added to the generated Wi-Fi signal, and the sum is passed through the same channel at the same time. Both signals now suffer from severe degradation, which can be measured quantitatively through the Bit Error Rate (BER). This analysis is done in Sect. 5. Figure 7 shows the time domain plot of the Wi-Fi signal when the LTE signal is added to it. The packet separation is affected more strongly, and the magnitude of the Wi-Fi signal has also dropped.
Fig. 6 LTE waveform
5 Performance Evaluation and Results

The receiver receives the signal shown in Fig. 7 and demodulates it. After demodulation, channel estimation is performed and the channel coefficients are obtained. A sample of the channel coefficients obtained after demodulation is shown in Fig. 8. The channel estimate for data and pilot subcarriers is specified as a matrix or array of size NST-by-NSTS-by-NR, where NST is the number of occupied subcarriers, NSTS is the number of space–time streams, and NR is the number of receive antennas. NST and NSTS must match the cfg configuration object settings for channel bandwidth and number of space–time streams. NST increases with channel bandwidth, as described in Table 3. After data recovery, the transmitted and received bits are compared and their BER is calculated. The simulation is performed for sample SNR values of 50 and 100. The BER results of transmitting a pure WLAN signal, a pure LTE signal, and a WLAN signal mixed with an LTE signal with channel estimation are obtained in the command window and are shown in Fig. 9. It is clear from the command window that the BER of the WLAN signal mixed with the LTE signal is lower with channel estimation than without it. This is because data recovery with known channel coefficients yields the best results, since the behavior of the channel is known. It is also evident that, due to interference, the BER of the mixed WLAN and LTE signal is greater than that of a
Fig. 7 Time domain plot of LTE and Wi-Fi signal when passed together
Table 3 Configuration object settings
Fig. 8 Sample Channel co-efficient values
pure WLAN signal, so the interference is quantitatively measured. Figure 10 shows the BER vs SNR plot for the simulated environment. It is evident from the plot that knowledge of CSI, though of little benefit at lower SNR ranges in a homogeneous environment (Phase 1), helps at higher SNR ranges. In contrast, in a heterogeneous environment (Phase 2), it improves performance at both lower and higher SNR ranges. As the channel state information is known at the transmitter, the throughput of the system is quite high and the packet loss is low. Simulation results show that the application throughput for the modeled system is 4793.0535 Mbps. The simulation time for the modeled system is 13.086 s. This is the simulation time for the first iteration; it is large because channel estimation is performed, and the time for subsequent transmissions is reduced by 1.084 s. The BER results are good for the first 8–10 transmissions, after which the BER increases gradually. This is because the condition of the channel is time-varying, so channel estimation has to be restarted.
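The BER comparison used throughout this section (transmitted bits against recovered bits) reduces to counting mismatches. The following Python sketch shows that calculation under the assumption that both bit streams are available as equal-length sequences of 0s and 1s; the function name is hypothetical, and the paper's own results come from MATLAB.

```python
def bit_error_rate(tx_bits, rx_bits):
    """Fraction of positions where the recovered bit differs from the sent bit."""
    if len(tx_bits) != len(rx_bits):
        raise ValueError("bit streams must have the same length")
    if not tx_bits:
        raise ValueError("bit streams must not be empty")
    errors = sum(t != r for t, r in zip(tx_bits, rx_bits))
    return errors / len(tx_bits)


# One error in four bits gives a BER of 0.25.
example_ber = bit_error_rate([0, 1, 1, 0], [0, 1, 0, 0])
```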
6 Conclusion and Future Work

The coexistence environment between LTE and WLAN is simulated in MATLAB and the interference is quantitatively measured. Results show that there is an average decrease in BER of 0.01 when the channel estimation scheme is used and CSI is
Fig. 9 Sample BER output values a SNR = 50 b SNR = 100

Fig. 10 BER vs SNR plot
[Plot of BER (vertical axis, approximately 0.48 to 0.52) against SNR (horizontal axis, 50 to 100). Curves: pure WLAN signal; LTE+WLAN signal without channel estimation; LTE+WLAN signal with channel estimation.]
known by the transmitter in a coexistence environment. This work can be extended to practical implementation through over-the-air (OTA) testing.
References
1. Abinader, Fuad M., Vicente A. de Sousa, Sayantan Choudhury, Fabiano S. Chaves, André M. Cavalcante, Erika P. L. Almeida, Robson D. Vieira, Esa Tuomaala, and Klaus Doppler. "LTE/Wi-Fi Coexistence in 5 GHz ISM Spectrum: Issues, Solutions and Perspectives." Wireless Personal Communications 99, no. 1: 403–430 (2018).
2. Quang-Dung Ho, Daniel Tweed, Tho Le-Ngoc. "U-LTE With Wi-Fi: A Survey." Long Term Evolution in Unlicensed Bands, Springer Briefs in Electrical and Computer Engineering, pp. 43–48 (2017).
3. Hager Ben Hafaiedh, Leila Azouz Saidane, and Abdellatif Kobbane. "LTE-U and WiFi coexistence in the 5 GHz Unlicensed Spectrum: A Survey." IEEE International Conference on Performance Evaluation and Modelling in Wired and Wireless Networks (PEMWN), pp. 1–7 (2017).
4. Chaves, F. S., A. M. Cavalcante, E. P. L. Almeida, F. M. Abinader Jr, R. D. Vieira, S. Choudhury, and K. Doppler. "LTE/Wi-Fi coexistence: Challenges and mechanisms." In XXXI Simposio Brasileiro De Telecomunicacoes (2013).
5. Mafakheri, Babak, Leonardo Goratti, Roberto Riggio, Chiara Buratti, and Sam Reisenfeld. "LTE transmission in unlicensed bands: Evaluating the impact over clear channel assessment." In 2018 27th International Conference on Computer Communication and Networks (ICCCN), pp. 1–8. IEEE (2018).
6. Yun, Sangki, and Lili Qiu. "Supporting WiFi and LTE co-existence." In 2015 IEEE Conference on Computer Communications (INFOCOM), pp. 810–818. IEEE (2015).
7. Madan, Ritesh K., Vikram Chandrasekhar, Andrea Goldsmith, and Santhosh Krishna. "Dynamic channel selection algorithms for interference management in Wi-Fi networks." U.S. Patent 9,131,391, issued September 8, 2015.
8. Valliappan, Nachiappan, and Ahmed Kamel Sadek. "Long term evolution interference management in unlicensed bands for Wi-Fi operation." U.S. Patent 9,332,465, issued May 3, 2016.
9. Pei, E., Meng, D., Li, L., and Zhang, P. "Performance analysis of listen before talk based coexistence scheme over the unlicensed spectrum in the scenario with multiple LTE small bases." IEEE Access 5: 10364–10368 (2017).
VHF OSTBC MIMO System Used to Overcome the Effects of Irregular Terrain on Radio Signals

Xolani B. Maxama and Elisha D. Markus
Abstract Due to the irregular terrain in Goodhouse in the Northern Cape Province of South Africa, radio signal reception is a challenge, and no form of electronic communication is available in the area. This has negatively affected the power utility Eskom in this region, as it has substations and other high-voltage apparatus that must be monitored using the Supervisory Control and Data Acquisition (SCADA) system. This paper addresses this challenge by making use of the Very High Frequency Orthogonal Space–Time Block Code, Multiple-In, Multiple-Out (VHF OSTBC MIMO) system. The simulation results are generated using MATLAB software and provide coverage predictions and receive signal levels for two OSTBC MIMO systems operating at two different VHF frequencies. The results reveal that employing a low VHF band OSTBC MIMO transceiver system in rough terrain environments can greatly improve radio signal reception.

Keywords Modulation · OSTBC · Propagation · Irregular terrain · MIMO · Fading · BER (Bit Error Rate)
X. B. Maxama · E. D. Markus (B), Central University of Technology, Bloemfontein, South Africa. e-mail: [email protected]; X. B. Maxama e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. M. Thampi et al. (eds.), Applied Soft Computing and Communication Networks, Lecture Notes in Networks and Systems 187, https://doi.org/10.1007/978-981-33-6173-7_18

1 Introduction

The terrain along the Orange River valley between the towns of Goodhouse and Vioolsdrift in the Northern Cape Province is irregular. The area has mountains as high as 1258 m above sea level. The South African electricity supply authority, Eskom, has High Voltage (HV) plants in this area that supply electricity to the surrounding farming communities. However, all forms of electronic communication, including the Supervisory Control and Data Acquisition (SCADA) service which Eskom uses to monitor its HV plants remotely, are unavailable due to the uneven terrain [1]. Eskom is therefore currently unable to monitor its HV plants in this area. To high-frequency signals, whose wavelengths are small, mountains appear as very large obstacles, which results in reflection of the transmitted signal [3]. High frequencies are susceptible to attenuation due to uneven terrain [2]. According to [3], this is because high frequencies have short wavelengths, in the range of 1–10 mm. Therefore, when they come into contact with a large obstruction such as a mountain, they are reflected and attenuated [4]. The inability to monitor HV plants places Eskom's power grid at risk and needs to be addressed with urgency. The terrain in the area of interest, from Vioolsdrift Radio Site (the transmit site) to Henkriesmond Substation (the receive site), is uneven and mountainous. As seen in the line of sight diagram in Fig. 1, there is no line of sight between Henkriesmond Substation and Vioolsdrift Radio Site, because the high mountain peaks reflect the radio signals. Table 1 illustrates the performance results of the current Eskom radio link in the
Fig. 1 Line of sight diagram
Table 1 Link performance of the existing Eskom radio link
Parameters | Vioolsdrift RS (TX) | Henkriesmond SS (RX)
Effective radiated power (Watts) | 12.13 | 7.92
Effective radiated power (dBm) | 40.84 | 38.99
Free space loss (dB) | 115.19 | 115.19
Polarization | Vertical | Vertical
Net path loss (dB) | 160.74 | 157.07
RX signal (dBm) | −123.75 | −120.08
Fade margin (dB) | −16.75 | −13.08
area, obtained using Pathloss 4 software. The effect of the mountain can be seen in Table 1, where the path losses increase due to reflection of the radio signals by the mountain, with the net path loss at 157.07 dB.
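Two of the rows in Table 1 can be cross-checked with simple arithmetic: the conversion of effective radiated power from watts to dBm, and the fade margin as received signal level minus receiver sensitivity. The Python sketch below assumes a receiver sensitivity of −107 dBm, the value listed later in Tables 3 and 4; that this value also applies to the existing link is an assumption made here, and the function names are hypothetical.

```python
import math


def watts_to_dbm(power_w):
    """Convert power in watts to dBm (decibels relative to 1 mW)."""
    return 10 * math.log10(power_w * 1000)


def fade_margin_db(rx_signal_dbm, rx_sensitivity_dbm):
    """Fade margin: how far the received level sits above the sensitivity."""
    return rx_signal_dbm - rx_sensitivity_dbm


erp_tx_dbm = watts_to_dbm(12.13)          # Vioolsdrift RS, about 40.84 dBm
erp_rx_dbm = watts_to_dbm(7.92)           # Henkriesmond SS, about 38.99 dBm
margin_tx = fade_margin_db(-123.75, -107.0)   # -16.75 dB
margin_rx = fade_margin_db(-120.08, -107.0)   # -13.08 dB
```

A negative fade margin means the received signal is below the receiver sensitivity, which is consistent with the link being unusable.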
2 Methods to Mitigate Propagation Challenges in Irregular Terrain

This section categorizes methods that have been used previously to combat radio signal propagation challenges in irregular terrain. Different modulation schemes and propagation methods in a Rayleigh fading channel are listed with their advantages and disadvantages in Table 2.
3 Proposed Model

The proposed OSTBC MIMO transceiver model over a Rayleigh fading channel is shown in Fig. 2. The model is simulated using MATLAB/Simulink software. The following component blocks make up the system:

Random Binary Generator
The random binary generator block generates an unpredictable binary sequence. It uses the Bernoulli Binary Generator to generate the random bits, which simulate the data to be transmitted over the channel.

Quadrature Phase Shift Keying (QPSK) Modulator
The binary data from the random bit generator is modulated into a QPSK constellation. In this block, a splitter divides the binary data into two data streams, dq(t) and di(t). The symbol duration for both the dq(t) and di(t) streams is twice the bit duration (Tsym = 2Tb), thus increasing the data rate [12].

Orthogonal Space–Time Block Code (OSTBC) Encoder
The modulated data symbols from the QPSK Modulator are fed to the input of the OSTBC Encoder, which encodes the data symbols using either the Alamouti code [14] for two transmit antennas or an orthogonal code for three or four transmit antennas [15]. In this simulation, a 4 × 4 MIMO system with four antennas is employed [13].

Multiple-In, Multiple-Out (MIMO) Rayleigh Fading Channel
The MIMO Rayleigh Fading Channel block models a multi-path Rayleigh fading channel typical of a non-line of sight, irregular terrain. The block is configured as a 4 × 4 MIMO channel. The Rayleigh fading model is chosen over other models such as
Table 2 Methods used to mitigate wireless communication challenges in irregular terrain
Method | Description | General advantages | General disadvantages | Typical application in wireless communications
Multiple-In Multiple-Out (MIMO) | A transceiver which uses multiple antennas at the transmitter and receiver | Exploits multi-path propagation to improve signal reception. Creates spatial diversity. Improves SNR and link performance [5]. Increased channel capacity. Largely overcomes fading. Reduces the impact of time dispersion, path loss and interference [15] | Commercially, it is currently available in microwave and UHF frequency applications | Largely used in telecommunication environments where there is heavy fading [5]
HF band coupled with a discone antenna | The HF band (20–30 MHz) used with a discone antenna to achieve groundwave propagation [6] | Lower transmission power levels. Low background noise. Groundwave signals diffract. Groundwaves propagate in the area between the ground and the air for short distances of up to 100 km over land and 300 km over sea [7] | Large antennas. Low data rates. Limited working bandwidth. Susceptible to noise [8] | Groundwave communication, a transmission mode suitable for areas where radio communication is unavailable due to mountainous terrain [9]
HF MIMO OFDM | A transceiver system combining MIMO and OFDM operating in the HF band | Improved noise reduction. Increased bandwidth utility in the HF band. Increased data rates [10] | Large antennas. High cost | According to [11], this technique works well in irregular terrain environments
Fig. 2 Proposed model for OSTBC MIMO
the Rician fading model because this study investigates a non-line of sight environment, for which the Rayleigh model is used. The Rician model, on the other hand, models a multi-path channel with at least one strong line of sight path [16].

Additive White Gaussian Noise (AWGN)
The AWGN Channel block models symbol errors resulting from a noisy channel.

Orthogonal Space–Time Block Code (OSTBC) Decoder
The OSTBC Combiner stage combines the signals from the receive antennas with the Channel State Information (CSI) so that the information of the symbols encoded by the OSTBC encoder can be recovered.

Quadrature Phase Shift Keying (QPSK) Demodulator
This block demodulates the signal recovered by the OSTBC Decoder into the originally sent signal.

Frame Error Rate
The Frame Error Rate (FER) block calculates the system FER by comparing the demodulated bits with the original bits sent per frame, to detect errors.
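The chain of blocks described above (bit source, QPSK modulator, OSTBC encoder, fading channel, OSTBC combiner, QPSK demodulator) can be illustrated with a minimal Python sketch. To keep it short, the sketch uses the two-transmit-antenna Alamouti code [14] with a single receive antenna and a noiseless flat channel, not the 4 × 4 configuration simulated in the paper; the Gray-coded bit mapping and all function names are assumptions made for illustration.

```python
import math


def qpsk_modulate(bits):
    """Map bit pairs to unit-energy QPSK symbols (assumed Gray mapping)."""
    scale = 1 / math.sqrt(2)
    return [complex(1 - 2 * bits[i], 1 - 2 * bits[i + 1]) * scale
            for i in range(0, len(bits), 2)]


def qpsk_demodulate(symbols):
    """Recover bit pairs from the signs of the real and imaginary parts."""
    bits = []
    for s in symbols:
        bits += [int(s.real < 0), int(s.imag < 0)]
    return bits


def alamouti_encode(s1, s2):
    """Return (antenna 1, antenna 2) symbols for two consecutive time slots."""
    return [(s1, s2), (-s2.conjugate(), s1.conjugate())]


def alamouti_combine(r1, r2, h1, h2):
    """Combine two received samples using known channel gains h1 and h2 (CSI)."""
    gain = abs(h1) ** 2 + abs(h2) ** 2
    s1_hat = (h1.conjugate() * r1 + h2 * r2.conjugate()) / gain
    s2_hat = (h2.conjugate() * r1 - h1 * r2.conjugate()) / gain
    return s1_hat, s2_hat


def transmit_block(bits, h1, h2):
    """Send bits (length a multiple of 4) over a noiseless 2x1 flat channel."""
    symbols = qpsk_modulate(bits)
    recovered = []
    for i in range(0, len(symbols), 2):
        (a1, a2), (b1, b2) = alamouti_encode(symbols[i], symbols[i + 1])
        r1 = h1 * a1 + h2 * a2      # received sample in time slot 1
        r2 = h1 * b1 + h2 * b2      # received sample in time slot 2
        recovered.extend(alamouti_combine(r1, r2, h1, h2))
    return qpsk_demodulate(recovered)
```

Because the code is orthogonal, the combiner separates the two symbols exactly when the channel gains are known, so `transmit_block(bits, h1, h2)` returns the original bits for any nonzero channel in this noiseless setting. Adding noise and drawing `h1` and `h2` from a complex Gaussian distribution would turn the sketch into a small Rayleigh fading experiment.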
4 Simulation Results

The aim of this section is to investigate the validity of the model proposed in Sect. 3. This was done by analyzing the Bit Error Rate (BER) results of the OSTBC MIMO
system when two different MIMO antenna arrays were used at both ends of the link. The Receive Signal Strength (RSS) results obtained with two different low-band VHF frequencies were also investigated. These scenarios in a Rayleigh fading channel are used to investigate the following:

• Whether the Receive Signal Strength (RSS) can be improved significantly by employing the OSTBC MIMO system in the low VHF frequency band.
• Whether the Bit Error Rate can be reduced significantly by employing MIMO antenna arrays.
4.1 The Receive Signal Strength Results of the VHF OSTBC MIMO System Using Two Different VHF Frequencies

Figure 3 shows a proposed radio link path into Henkriesmond Substation. The repeater at Doringwater Substation makes use of a 12-element Yagi antenna at a height of 60 m and two different VHF frequencies, namely 135 and 70 MHz. The distance between the transmitter and receiver is 11 km. The interfering obstacle (mountain) is 700 m above sea level. This part of the simulation focuses only on the radio signal propagation performance of the repeater at Doringwater Substation toward Henkriesmond Substation. The design of the microwave link from Vioolsdrift Radio Site to Doringwater Substation is not dealt with in this paper.

A 4 × 4 OSTBC MIMO System Operating at a VHF Frequency of 135 MHz
Table 3 presents the performance results taken from a simulated MATLAB SiteViewer propagation prediction program for a VHF OSTBC MIMO repeater situated
Fig. 3 Google earth image of the proposed radio link path into Henkriesmond substation
Table 3 Performance profile of the VHF repeater to Henkriesmond substation at 135 MHz

Doringwater substation (TX)
Parameters | Value
Latitude | −29.08333333
Longitude | 17.94527778
Frequency (MHz) | 135
TX power (watts) | 10
Elevation (m) | 700
Antenna height (m) | 60
Antenna model | Yagi (12 El) Y425_12
Antenna gain (dBi) | 16
MIMO array | 4 × 4
Azimuth (°) | 30

Henkriesmond substation (RX)
Parameters | Value
Latitude | −28.90111111
Longitude | 18.13694444
Antenna model | Yagi (12 El) Y425_12
Antenna height (m) | 60
RX sensitivity criteria (dBm) | −90
RX sensitivity (dBm) | −107
RX signal strength (dBm) | −96.92
at Doringwater Substation toward Henkriesmond Substation operating at a frequency of 135 MHz. As seen in Table 3, the Receive Signal Strength (RSS) of this system is −96.92 dBm. At this level, the Remote Terminal Unit (RTU) at Henkriesmond Substation will fail, because the RSS threshold for RTU operation at Eskom is set between −60 and −90 dBm.

A 4 × 4 OSTBC MIMO System Operating at a VHF Frequency of 70 MHz
Table 4 also contains performance results taken from the simulated MATLAB SiteViewer propagation prediction program. It is similar to Table 3 except that the operating frequency is much lower, at 70 MHz. The RSS of this system has now improved to −89.71 dBm (see Table 4). At this level, the RTU at Henkriesmond Substation will operate, because the RSS is within the Eskom RTU operating threshold of −60 to −90 dBm [17].
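Part of the improvement at 70 MHz can be anticipated from the free-space path loss formula, which grows with 20 log10 of the frequency. The Python sketch below evaluates that formula for the stated 11 km path at both frequencies and checks the two reported RSS values against the −60 to −90 dBm RTU window. It is a free-space illustration only: it does not reproduce the terrain-aware SiteViewer prediction, and the function names are hypothetical.

```python
import math


def fspl_db(distance_km, frequency_mhz):
    """Free-space path loss in dB for distance in km and frequency in MHz."""
    return 20 * math.log10(distance_km) + 20 * math.log10(frequency_mhz) + 32.44


def within_rtu_threshold(rss_dbm, lower_dbm=-90.0, upper_dbm=-60.0):
    """True when the RSS lies inside the RTU operating window quoted in the text."""
    return lower_dbm <= rss_dbm <= upper_dbm


loss_135 = fspl_db(11, 135)               # about 95.9 dB
loss_70 = fspl_db(11, 70)                 # about 90.2 dB
free_space_gain_db = loss_135 - loss_70   # about 5.7 dB in favor of 70 MHz

rtu_ok_135 = within_rtu_threshold(-96.92)   # False: below the -90 dBm limit
rtu_ok_70 = within_rtu_threshold(-89.71)    # True: just inside the window
```

The free-space difference of about 5.7 dB is smaller than the 7.21 dB improvement between the two reported RSS values; the remainder presumably reflects terrain effects that the free-space formula ignores.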
Table 4 Performance profile of the VHF repeater to Henkriesmond substation at 70 MHz

Doringwater substation (TX)
Parameters | Value
Latitude | −29.08333333
Longitude | 17.94527778
Frequency (MHz) | 70
TX power (watts) | 10
Elevation (m) | 700
Antenna height (m) | 60
Antenna model | Yagi (12 El) Y425_12
Antenna gain (dBi) | 16
MIMO array | 4 × 4
Azimuth (°) | 30

Henkriesmond substation (RX)
Parameters | Value
Latitude | −28.90111111
Longitude | 18.13694444
Antenna model | Yagi (12 El) Y425_12
Antenna height (m) | 60
RX sensitivity criteria (dBm) | −90
RX sensitivity (dBm) | −107
RX signal strength (dBm) | −89.71
4.2 The Bit Error Rate Results of the OSTBC MIMO System Using Two Different MIMO Antenna Arrays

Performance of a 2 × 2 OSTBC MIMO System in a Rayleigh Fading Channel
Figure 4 shows the MATLAB simulation results as a BER versus $E_b/N_0$ curve for a 2 × 2 OSTBC MIMO transceiver system. The number of transmit ($N_t$) and receive ($N_r$) antennas is 2 each. As $E_b/N_0$ increases, the bit error rate (BER) shows a slight, gradual decrease. Assume that the system is operating at 100 Mb/s, which is $10^8$ bits per second. According to Fig. 4, the lowest BER value that can be read conveniently from the curve occurs at an $E_b/N_0$ of 16 dB and is approximately $10^{-3}$. This means that one bit error occurs every $10^3$ bits. Therefore, the time it would take to receive $10^3$ bits before an error occurred would be approximately:

$$\frac{\text{Number of Bits Sent}}{\text{Bit Rate}} = \frac{10^{3}}{10^{8}}\ \text{s} = 0.01\ \text{ms} \tag{1}$$
Fig. 4 The BER curve of a 2 × 2 OSTBC MIMO System
Therefore, 0.01 ms is the average time it would take before an error could occur in this system.

Performance of a 4 × 4 OSTBC MIMO System in a Rayleigh Fading Channel
Figure 5 shows the MATLAB simulation results as a BER versus $E_b/N_0$ curve for a 4 × 4 OSTBC MIMO transceiver system. The number of transmit ($N_t$) and receive ($N_r$) antennas in this scenario is 4 each. As $E_b/N_0$ increases, the bit error rate (BER) shows a more significant decrease than that seen in Fig. 4. Again assume that the system is operating at 100 Mb/s, which is $10^8$ bits per second. According to Fig. 5, the lowest BER value that can be read from the curve occurs at an $E_b/N_0$ of 18 dB and is approximately $10^{-5}$. This means that one bit error occurs every $10^5$ bits. Therefore, the time taken to receive $10^5$ bits before an error could occur in this system would be approximately:

$$\frac{\text{Number of Bits Sent}}{\text{Bit Rate}} = \frac{10^{5}}{10^{8}}\ \text{s} = 1\ \text{ms} \tag{2}$$
X. B. Maxama and E. D. Markus
Fig. 5 The BER curve of a 4 × 4 OSTBC MIMO System
Therefore, 1 ms is the average time before an error occurs in this system. The use of 4 antennas at both ends of the link thus greatly improves the performance of the system: in this scenario, the radio link performs far better than in the 2 × 2 scenario of Fig. 4. The simulation results obtained in this section confirm that using the OSTBC MIMO system at lower VHF frequencies does improve radio signal reception.
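The two error-interval estimates above follow from the same relation, time = (1/BER) / bit rate. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the BER values are the approximate readings quoted in the text, not new simulation output.

```python
def mean_time_between_errors(ber: float, bit_rate_bps: float) -> float:
    """Average time in seconds needed to receive 1/BER bits at the given bit rate."""
    bits_per_error = 1.0 / ber          # on average one error every 1/BER bits
    return bits_per_error / bit_rate_bps


BIT_RATE = 100e6  # 100 Mb/s, the operating rate assumed in the text

# Approximate BER readings quoted in the text for Fig. 4 (16 dB) and Fig. 5 (18 dB)
t_2x2 = mean_time_between_errors(1e-3, BIT_RATE)
t_4x4 = mean_time_between_errors(1e-5, BIT_RATE)

print(f"2 x 2 system: {t_2x2 * 1e3:.2f} ms between errors")
print(f"4 x 4 system: {t_4x4 * 1e3:.2f} ms between errors")
```

Under these readings the interval between errors grows by a factor of 100 when moving from the 2 × 2 to the 4 × 4 configuration.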
5 Conclusion and Discussion
This paper has presented simulation results for two OSTBC MIMO configurations. The results illustrate that using the OSTBC MIMO system at low Very High Frequency (VHF) frequencies does greatly improve signal reception in irregular terrain. Although MIMO has been exploited widely in the cellular network industry, its application in the low VHF band (49–108 MHz) for SCADA purposes has not been widely investigated. Furthermore, the application of OSTBC MIMO technology in the low VHF band for improving signal reception in irregular terrain has not been explored in the mountainous Northern Cape in South Africa.
Therefore, this study presents useful information not only for radio engineers designing radio systems in mountainous areas but also for the Eskom business, which can use it to provide quality service, improve safety and increase revenue. This study should also encourage researchers to identify further research gaps and improve the system. Further work, in which this technique is tested in the field using real data, is planned.
Modelling Video Frames for Object Extraction Using Spatial Correlation
Vinayak Ray and Pradip Sircar
Abstract Object extraction forms a critical part of object-based video processing. However, most of the available techniques concentrate only on surveillance and tracking. Ordinary video sequences do not have a steady background, and hence these techniques cannot be applied to them. In our work, we propose an elegant method to model the background and foreground based on histogram data. We use the 2D continuous wavelet transform to spatially localize the object and create an object mask that approximates its silhouette. From the available histograms of object pixels and background pixels, we obtain probability density functions by normalizing the area under each histogram. In order to retain smoothness in the density functions, we use curve-fitting techniques to approximate them.
Keywords Modelling video frames · Object extraction · Spatial correlation · 2D continuous wavelet transform · Histogram
V. Ray · P. Sircar (B), Department of Electrical Engineering, Indian Institute of Technology Kanpur, Kanpur 208016, Uttar Pradesh, India
V. Ray, presently with Advanced Micro Devices, Bengaluru, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. M. Thampi et al. (eds.), Applied Soft Computing and Communication Networks, Lecture Notes in Networks and Systems 187, https://doi.org/10.1007/978-981-33-6173-7_19
1 Introduction
The need for content-based video processing is an expression of our relentless pursuit of video processing that is more akin to our visual system, a complex visual processor which responds more to object shape than to pixel value [1]. Extracting and classifying the objects present in a video frame and post-processing them, rather than the whole frame, would be aesthetically more pleasing. Features can be enhanced and the background can be altered. Furthermore, motion estimation based on objects rather than on blocks would be an elegant solution. Modelling the probability density functions of foreground and background in order to segregate them motivated us to undertake this work.
Surveillance and tracking are some of the areas where object extraction is extensively used. These systems are implemented through background subtraction: they calculate the probability of a pixel belonging to the background and, if it is below a certain threshold, assign the pixel to the foreground. A large body of literature is available in the area of surveillance and tracking, and most of these works identify the background by modelling it with a suitable distribution. A classical paper in this regard is that of Stauffer and Grimson [2], which models each pixel as an adaptive Gaussian mixture model (GMM) that updates itself with time as new objects arrive in the scene. One drawback of the Stauffer–Grimson (SG) method of background subtraction becomes apparent when the foreground objects are of similar colour to the background. Several extensions of the SG method have been proposed to alleviate this drawback by using properties of the local image, e.g., optical flow [3, 4], spatiotemporal blocks [5, 6], or texture information [7]. Another drawback of the SG method is the lack of consistency in the states of adjacent pixels, which leads to noisy segmentation of foreground and background. Various extensions of the SG method have been proposed to encourage consistency by using image modelling [8–10], or by allowing the state of one pixel to influence the states of neighbouring pixels [11]. Other methods of background subtraction differentiate the 'salient' motion of the foreground from the motion of the background [12, 13], where saliency is defined as the motion of a pixel in a consistent direction. Most of the above methods, together with the SG method, are based on the assumption that the background is static over short time durations. Thus, the proposed models can adapt only to slowly varying backgrounds.
These models are bound to fail in video object extraction, where the background often changes drastically. In the present work, we adopt a new approach to object extraction based on image properties. Our goal is to utilize the spatial correlation existing in the image to model the background as well as the foreground. We assume that the probability of a pixel having a particular grayscale value depends on the pixel values in the neighbouring region [14]. Thus, we use histograms to model the probability distribution. However, due to varying lighting conditions, it is often difficult to capture and differentiate the spatial variations through a single probability density function model: some regions are overexposed and some are underexposed. In such a scenario, a single-model probability density function is bound to give false results. Intuitively, separating the image with a properly selected threshold divides it into overexposed and underexposed regions. For a multi-modal image, the number of thresholds should increase so that the distribution model can adapt better to the different lighting conditions. Another major roadblock is localizing the objects spatially and obtaining an approximate object region. Since the objects in a video are in 'focus' most of the time, we use this observation for object localization, and the 2D continuous wavelet transform is used for this purpose. Finally, we solve the binary hypothesis problem, i.e., whether a pixel belongs to the background or the foreground, using Bayesian testing.
The paper is divided into sections according to the background knowledge used in our work. Section 2 deals with global thresholds, where we use Otsu's method [15] to segregate the image according to the level of illumination. Section 3 covers the basics of the 2D continuous wavelet transform (2D-CWT), with emphasis on the Mexican Hat and Morlet analysing wavelets, which are the most widely used wavelets for image analysis. Section 4 touches upon the foundation of detection theory and the use of binary hypothesis testing in our algorithm. Section 5 presents object extraction and the algorithm implemented for it. Section 6 draws the final conclusions and outlines future work.
2 Global Threshold: Otsu's Method
The selection of a gray-level threshold is often important from the viewpoint of object extraction. Many papers propose methods based on analysis of the histogram. In the ideal case, the histogram has a deep, sharp valley between two peaks representing object and background. In real scenarios, however, the valley may be flat and broad, with unequal peaks. Otsu's algorithm overcomes most of these problems [15]. The algorithm assumes that the image consists of two classes of pixels, background and foreground. The optimum threshold gives the minimal intra-class variance or, equivalently, the maximal inter-class variance. Let $n_i$ be the number of pixels at level $i$, where $i \in [1, \ldots, L]$. Then
$$p_i = \frac{n_i}{N}, \quad p_i \ge 0, \quad \sum_{i=1}^{L} p_i = 1 \tag{1}$$
$$N = n_1 + n_2 + \cdots + n_L \tag{2}$$
The image pixels are segregated into two classes $C_0$ and $C_1$ (objects and background) by a threshold at level $k$: $C_0$ denotes pixels with levels $[1, 2, \ldots, k]$ and $C_1$ denotes pixels with levels $[k + 1, k + 2, \ldots, L]$. The probabilities of class occurrence are given by
$$w_0 = \Pr(C_0) = \sum_{i=1}^{k} p_i = w(k) \tag{3}$$
$$w_1 = \Pr(C_1) = \sum_{i=k+1}^{L} p_i = 1 - w(k) \tag{4}$$
and the class mean levels by
$$\mu_0 = \sum_{i=1}^{k} i \Pr(i|C_0) = \sum_{i=1}^{k} \frac{i\, p_i}{w_0} = \frac{\mu(k)}{w(k)} \tag{5}$$
$$\mu_1 = \sum_{i=k+1}^{L} i \Pr(i|C_1) = \sum_{i=k+1}^{L} \frac{i\, p_i}{w_1} = \frac{\mu_T - \mu(k)}{1 - w(k)} \tag{6}$$
where
$$\mu(k) = \sum_{i=1}^{k} i\, p_i \tag{7}$$
$$\mu_T = \mu(L) = \sum_{i=1}^{L} i\, p_i \tag{8}$$
The last term $\mu_T$ represents the total mean level of the original picture. We note that the following equations hold:
$$w_0 \mu_0 + w_1 \mu_1 = \mu_T \tag{9}$$
$$w_0 + w_1 = 1 \tag{10}$$
where $w_0$ and $w_1$ have been defined in (3) and (4), respectively. The variances of the classes are given by
$$\sigma_0^2 = \sum_{i=1}^{k} (i - \mu_0)^2 \Pr(i|C_0) = \sum_{i=1}^{k} (i - \mu_0)^2 \frac{p_i}{w_0} \tag{11}$$
$$\sigma_1^2 = \sum_{i=k+1}^{L} (i - \mu_1)^2 \Pr(i|C_1) = \sum_{i=k+1}^{L} (i - \mu_1)^2 \frac{p_i}{w_1} \tag{12}$$
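The class statistics in (3)–(12) can be computed directly from a histogram. The sketch below is a plain-Python illustration under the assumption that gray levels are numbered 1 to L as in the text; the function name and the returned dictionary keys are hypothetical.

```python
def class_statistics(hist, k):
    """Class probabilities, means and variances for a threshold at level k.

    hist[i - 1] holds n_i, the pixel count at gray level i (levels 1..L).
    """
    total = sum(hist)
    n_levels = len(hist)
    p = [n / total for n in hist]                      # eq. (1)
    w0 = sum(p[:k])                                    # eq. (3)
    w1 = 1.0 - w0                                      # eq. (4)
    if not 0.0 < w0 < 1.0:
        raise ValueError("threshold k leaves one class empty")
    mu_k = sum(i * p[i - 1] for i in range(1, k + 1))          # eq. (7)
    mu_t = sum(i * p[i - 1] for i in range(1, n_levels + 1))   # eq. (8)
    mu0 = mu_k / w0                                    # eq. (5)
    mu1 = (mu_t - mu_k) / w1                           # eq. (6)
    var0 = sum((i - mu0) ** 2 * p[i - 1] for i in range(1, k + 1)) / w0              # eq. (11)
    var1 = sum((i - mu1) ** 2 * p[i - 1] for i in range(k + 1, n_levels + 1)) / w1   # eq. (12)
    return {"w0": w0, "w1": w1, "mu0": mu0, "mu1": mu1,
            "mu_T": mu_t, "var0": var0, "var1": var1}


# Toy 5-level histogram (illustrative data only)
stats = class_statistics([4, 6, 0, 2, 8], k=2)
# Identity (9): the weighted class means recover the total mean
assert abs(stats["w0"] * stats["mu0"] + stats["w1"] * stats["mu1"] - stats["mu_T"]) < 1e-12
```

The final assertion checks identity (9) numerically for the toy histogram.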
In order to evaluate the optimum threshold, a discriminant criterion measure is to be determined and maximized. The measure $\eta$ below is the ratio of the inter-class variance $\sigma_B^2$ to the total variance $\sigma_T^2$; the inter-class variance depends only on first-order statistics (the class means).
$$\eta = \frac{\sigma_B^2}{\sigma_T^2} \tag{13}$$
where
$$\sigma_B^2 = w_0 (\mu_0 - \mu_T)^2 + w_1 (\mu_1 - \mu_T)^2 \tag{14}$$
Since $\sigma_T^2$ does not depend on $k$, maximizing $\eta$ is the same as maximizing $\sigma_B^2$, and the optimal threshold $k^*$ is
$$\sigma_B^2(k^*) = \max_{1 \le k < L} \sigma_B^2(k) \tag{15}$$
[…] $> C_{11}$. For the Bayesian test, the expression finally comes out as
$$\frac{p_{r|H_1}(R|H_1)}{p_{r|H_0}(R|H_0)} \underset{H_1}{\overset{H_0}{\lessgtr}} \frac{P_0 (C_{10} - C_{00})}{P_1 (C_{01} - C_{11})} \tag{34}$$
$$\Lambda(R) = \frac{p_{r|H_1}(R|H_1)}{p_{r|H_0}(R|H_0)} \tag{35}$$
Note that (35) is the likelihood ratio and is single-dimensional regardless of the dimension of $R$. The quantity
$$\eta = \frac{P_0 (C_{10} - C_{00})}{P_1 (C_{01} - C_{11})} \tag{36}$$
is the threshold of the test, which varies with the costs and the a priori information. In our work, the two hypotheses $H_0$ and $H_1$ correspond to deciding whether a pixel belongs to the object or to the background. The a priori information is in the form of object and background histograms, which can be used to model the probability density functions.
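The likelihood ratio test of (34)–(36) reduces to comparing one number against a threshold. The following sketch illustrates this; the function names are hypothetical, and it assumes that $C_{ij}$ denotes the cost of deciding $H_i$ when $H_j$ is true, which is the usual convention of [18] but is not restated in the surviving text.

```python
def bayes_threshold(p0, p1, c00, c01, c10, c11):
    """Threshold eta of eq. (36), built from the priors P0, P1 and the four costs.

    Assumed convention: c_ij is the cost of deciding H_i when H_j is true.
    """
    return (p0 * (c10 - c00)) / (p1 * (c01 - c11))


def decide(likelihood_h1, likelihood_h0, eta):
    """Return 'H1' if the likelihood ratio of eq. (35) exceeds eta, else 'H0'.

    The comparison is cross-multiplied so that a zero likelihood under H0
    does not cause a division by zero.
    """
    return "H1" if likelihood_h1 > eta * likelihood_h0 else "H0"


# With zero cost for correct decisions and unit cost for errors,
# eta reduces to P0 / P1 (minimum probability of error).
eta = bayes_threshold(p0=0.7, p1=0.3, c00=0.0, c01=1.0, c10=1.0, c11=0.0)
print(decide(0.6, 0.2, eta))  # ratio 3.0 against eta of about 2.33
print(decide(0.4, 0.2, eta))  # ratio 2.0 against eta of about 2.33
```

Raising the prior $P_0$ or the cost $C_{10}$ raises $\eta$, so stronger evidence is needed before $H_1$ is chosen.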
5 Object Extraction
Image analysis in general involves the study of feature extraction, segmentation and classification techniques.
1. Feature Extraction: The features to be extracted could be spatial features, edges, transform features, or complicated texture. Features are always extracted to help us achieve the next stage of segmentation.
2. Segmentation: The image is segmented into different components or objects by extracting their boundaries.
3. Classification: Image classification maps different regions or segments into one of several objects, each identified by a label.
In our work, we deal with the extraction of the foreground from video frames. In video frames, the subject (object of interest) is always in focus and hence is rich in high-frequency content, which can be extracted by the 2D-CWT at low scales. The commonly used edge detectors, such as gradient or Laplace operators, not only generate the object contour but also pick up sharp variations in the background. Though we were able to localize the spatial position of the object, obstacles remained in the form of broken edges, which would not let us extract the boundaries of the object. While identifying objects, the spatial correlation between the object pixels is often ignored. Though attempts have been made to model the background of surveillance cameras with a Gaussian mixture model (GMM) [2], we do not have the liberty to use such models, because the background of video frames changes drastically with time and therefore cannot be modelled with the existing models. Since the background as well as the objects have spatial correlation to some extent, we use histograms to model the probability density functions of both background and objects. We use the 2D-CWT for spatial localization of the object [16] and generate a silhouette through contour filling, which may not give the exact boundary.
The next part of our algorithm is binary hypothesis testing: whether a given pixel belongs to the background or the foreground. However, modelling a whole background or object through a single probability density function can be erroneous, as both can have large spatial variation, which will not be captured by a histogram due to its global nature. If each pixel resulted from a particular surface under particular lighting, a single Gaussian model would be sufficient to model the pixel value [2]. In our scenario, neither condition is likely to hold, so Gaussian modelling has to be ruled out. Otsu's method [15] decides the threshold of an image by maximizing the inter-class variance. Our experimentation with Otsu's method shows that the images extracted after thresholding reflect the lighting conditions, i.e., each image obtained corresponds to a brightness level of the lighting (see Fig. 6). Thus, sufficiently lighted and dimly lighted areas can be segregated. In our case, we decide to use two thresholds and obtain three images:
• Well exposed areas (CLASS3)
• Medium lighting (CLASS2)
• Dim lighting (CLASS1)
We are aware that these three classes cannot capture the myriad variations in lighting. So, we still stick with our intuition for arbitrary distributions, and do not attempt to impose any particular family of distributions. Modelling the background and foreground with histogram data and performing Bayesian hypothesis testing on each pixel to decide whether it belongs to the foreground or the background form the next leg of our algorithm. Thus, for a particular class,
$$H_0: \text{Pixel Value} \in \text{Foreground} \tag{37}$$
$$H_1: \text{Pixel Value} \in \text{Background} \tag{38}$$
Fig. 6 Original image and its corresponding threshold images with two thresholds using Otsu's method: (a) original image, (b) CLASS1 image, (c) CLASS2 image, (d) CLASS3 image
The Bayesian testing is done independently on all classes since they are mutually exclusive:
$$C_1 \cap C_2 \cap C_3 = \varnothing \tag{39}$$
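The two-threshold split into CLASS1, CLASS2 and CLASS3 described above can be illustrated with an exhaustive multi-level extension of Otsu's criterion, which picks the pair of thresholds maximizing the inter-class variance summed over three classes. This is a sketch under assumptions rather than the authors' implementation: gray levels are indexed from 0 here, the function names are hypothetical, and the brute-force search is chosen for clarity, not speed.

```python
def otsu_two_thresholds(hist):
    """Return (k1, k2) maximizing the three-class inter-class variance.

    Classes are levels [0..k1], [k1+1..k2] and [k2+1..L-1]; hist[i] is the
    pixel count at level i. Returns None if no split has three non-empty classes.
    """
    total = float(sum(hist))
    n_levels = len(hist)
    cum_p = [0.0] * (n_levels + 1)   # cumulative probability
    cum_m = [0.0] * (n_levels + 1)   # cumulative first moment
    for i, n in enumerate(hist):
        cum_p[i + 1] = cum_p[i] + n / total
        cum_m[i + 1] = cum_m[i] + i * n / total
    mu_t = cum_m[n_levels]           # total mean level

    best_var, best_pair = -1.0, None
    for k1 in range(n_levels - 2):
        for k2 in range(k1 + 1, n_levels - 1):
            bounds = ((0, k1 + 1), (k1 + 1, k2 + 1), (k2 + 1, n_levels))
            var_b, valid = 0.0, True
            for lo, hi in bounds:
                w = cum_p[hi] - cum_p[lo]            # class probability
                if w <= 0.0:                         # skip splits with an empty class
                    valid = False
                    break
                mu = (cum_m[hi] - cum_m[lo]) / w     # class mean
                var_b += w * (mu - mu_t) ** 2
            if valid and var_b > best_var:
                best_var, best_pair = var_b, (k1, k2)
    return best_pair


def lighting_class(level, k1, k2):
    """Map a gray level to 1 (dim), 2 (medium) or 3 (well exposed)."""
    if level <= k1:
        return 1
    if level <= k2:
        return 2
    return 3


# Toy histogram with three separated peaks at levels 1, 5 and 9
k1, k2 = otsu_two_thresholds([0, 10, 0, 0, 0, 10, 0, 0, 0, 10, 0])
print(k1, k2, [lighting_class(g, k1, k2) for g in (1, 5, 9)])
```

For a 256-level image the double loop evaluates roughly 32,000 threshold pairs, which is still inexpensive because each evaluation uses the precomputed cumulative sums.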
5.1 Implementation of Algorithm
1. Preprocess the image with the 2D-CWT using the Mexican Hat wavelet at low scales to extract the details of the image, which will invariably consist mostly of edges of the object in focus. Subsequent use of Otsu's single-threshold method [15] gives a binary image in the form of the object contour (see Fig. 7).
2. With the contour obtained, flood-fill the contour to obtain the object mask, which has the value 1 in the region of the object and 0 where it is absent. The mask obtained may not be exact and can have some artefacts, so we move to the next steps so that the exact contour of the object may be extracted.
3. Threshold the image into three classes based on Otsu's method [15]. This creates images with distinct lighting conditions.
4. Logically AND the object mask with the three classes separately in order to find the percentage of foreground in each image class. These percentages form our a priori probabilities in the binary hypothesis testing:
$$P(H_0|C_1) = \text{Probability of Foreground when belonging to CLASS1} \tag{40}$$
$$P(H_1|C_1) = \text{Probability of Background when belonging to CLASS1} \tag{41}$$
$$P(H_0|C_2) = \text{Probability of Foreground when belonging to CLASS2} \tag{42}$$
$$P(H_1|C_2) = \text{Probability of Background when belonging to CLASS2} \tag{43}$$
$$P(H_0|C_3) = \text{Probability of Foreground when belonging to CLASS3} \tag{44}$$
$$P(H_1|C_3) = \text{Probability of Background when belonging to CLASS3} \tag{45}$$
5. Obtain the histograms of objects and background for each class. We use a curve-fitting technique to smooth the histogram and normalize its area to obtain the probability density function.
• Calculate the histogram of object pixels corresponding to each class. Similarly, calculate the histogram of background pixels for each class.
• Use a spline-fitting technique on the histogram data and normalize the curve so that its area is unity, a constraint on any probability density function.
6. The probability of pixel $p$ being in the foreground, with grayscale $G_p$, is given by
$$\Pr(G_p|H_0, C_1) = P(G_p \in \text{Foreground}|C_1) = \int_{G_p - 1}^{G_p} p_{01}(p)\, dp \tag{46}$$
where $p_{01}(p)$ is the probability density function of grayscale values, given that $H_0$ (the hypothesis of belonging to the foreground) and $C_1$ are true. Similarly, the probability of the above pixel belonging to the background is given by
$$\Pr(G_p|H_1, C_1) = P(G_p \in \text{Background}|C_1) = \int_{G_p - 1}^{G_p} p_{11}(p)\, dp \tag{47}$$
where $p_{11}(p)$ is the probability density function of grayscale values, given that $H_1$ (the hypothesis of belonging to the background) and $C_1$ are true. Thus, the binary hypothesis testing gives
$$\Pr(G_p|H_0, C_i) \cdot P(H_0|C_i) \underset{H_0|C_i}{\overset{H_1|C_i}{\lessgtr}} \Pr(G_p|H_1, C_i) \cdot P(H_1|C_i) \tag{48}$$
where $i = 1, 2, 3$ covers all the classes. Whether the pixel is in the foreground or the background is decided by this inequality.
Fig. 7 Original image and subsequent operations on it for edge extraction: (a) original, (b) after CWT, a = 0.001, (c) Otsu's method, (d) Sobel edge
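Steps 4 to 6 can be sketched for the pixels of a single lighting class as follows. The code is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, a short moving average stands in for the spline fit of step 5, and the integral over one gray level in (46) and (47) is approximated by the smoothed, normalized histogram value at that level.

```python
def normalized_pdf(values, n_levels=256, half_window=1):
    """Smoothed histogram of integer gray values, normalized to unit area."""
    hist = [0.0] * n_levels
    for v in values:
        hist[v] += 1.0
    smoothed = []
    for g in range(n_levels):
        lo, hi = max(0, g - half_window), min(n_levels, g + half_window + 1)
        smoothed.append(sum(hist[lo:hi]) / (hi - lo))   # moving average (spline stand-in)
    area = sum(smoothed)
    return [s / area for s in smoothed]


def refine_mask(gray, mask, n_levels=256):
    """Re-label the pixels of one lighting class with the test of eq. (48).

    gray: integer gray values of the pixels in this class.
    mask: rough object mask for the same pixels (1 = foreground, 0 = background).
    Both foreground and background must be present in the class.
    """
    fg = [g for g, m in zip(gray, mask) if m == 1]
    bg = [g for g, m in zip(gray, mask) if m == 0]
    prior_fg = len(fg) / len(gray)          # P(H0 | Ci), step 4
    prior_bg = 1.0 - prior_fg               # P(H1 | Ci)
    pdf_fg = normalized_pdf(fg, n_levels)   # step 5, foreground
    pdf_bg = normalized_pdf(bg, n_levels)   # step 5, background
    # Step 6: decide foreground (1) when the foreground side of (48) is larger
    return [1 if pdf_fg[g] * prior_fg > pdf_bg[g] * prior_bg else 0 for g in gray]


# Toy data: the last pixel looks like the object but the rough mask missed it
gray = [10, 11, 12, 10, 11, 200, 201, 199, 200, 11]
mask = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(refine_mask(gray, mask))
```

In the toy data the last pixel is recovered as foreground because its gray level is far more likely under the foreground density than under the background density. Running the same routine once per lighting class corresponds to $i = 1, 2, 3$ in (48).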
5.2 Results
Figure 8b, c shows the objects extracted after applying our algorithm to Fig. 8a. The contours of the extracted objects closely follow the object boundaries without any artifacts and appear quite natural.
Fig. 8 Sea turtles: original and extracted images: (a) original, (b) turtle 1, (c) turtle 2
Fig. 9 Player in action: original and extracted images: (a) original, (b) extracted
Fig. 10 Player in gratitude: original and extracted images: (a) original, (b) extracted
However, with Fig. 9a the background is more complicated to model; hence we see some artifacts in the extracted object. If we observe Fig. 10 closely, we find that the portions of the image where background is falsely depicted as foreground have spatial variation and intensity quite similar to our object. This is a false alarm, which is characteristic of hypothesis testing. The probability of false alarm cannot be reduced without also reducing the probability of detection [18].
6 Conclusion and Future Work
The result of our algorithm largely depends on how well the objects have been localized. One can argue that if our algorithm depends on how good the contour is, why would one require such an algorithm in the first place? However, our results show that even if the contour is not perfect but captures a substantial variation of the object, we can get sufficiently good results, and by applying the algorithm repeatedly we can improve the results further. The proposed algorithm is useful in cases where one has to extract objects for motion estimation and interpolation instead of using the present block-estimation method. It can enable object-based video processing, such as enhancing the object features or, at times, changing the background of the object for aesthetic purposes. For global motion estimation, the optimal procedure is the full search (FS) method. However, the FS method has very high computational complexity, and alternative techniques which are simple and fast are explored for special applications [19]. Our proposed algorithm for object detection by using image profiles has low complexity, and the developed technique can be employed for such applications.
Future work includes better modelling of the PDF. We have considered a single frame here, whereas video backgrounds do not change abruptly from frame to frame, so multiple frames can be used for modelling. Even when the frames are in transition, for low frame rates, they are almost superimposed on one another; hence the algorithm should work well in that case also. The frame thresholds were decided by considering the frames to be multimodal (two thresholds in the present case). However, we can decide on the thresholds dynamically from the peaks in the histogram. The same algorithm can be implemented as block-level processing in order to extract local features more correctly and capture variations more accurately. In that case, our probability distribution can be a normal distribution. However, our intuition tells us that a block-level implementation and a higher number of image classes (i.e., more thresholds) should have the same effect.
References
1. Gonzalez RC, Woods RE (2002) Digital image processing. Prentice-Hall of India
2. Stauffer C, Grimson WEL (1999) Adaptive background mixture models for real-time tracking. In: Proceedings of IEEE conference on computer vision and pattern recognition (Cat. No PR00149), vol 2, pp 246–252
3. Elgammal A, Harwood D, Davis L (2000) Non-parametric model for background subtraction. In: European conference on computer vision. Springer, Berlin, pp 751–767
4. Mittal A, Paragios N (2004) Motion-based background subtraction using adaptive kernel density estimation. In: IEEE conference on computer vision and pattern recognition, vol 2, pp II–II
5. Latecki LJ, Miezianko R, Pokrajac D (2004) Motion detection based on local variation of spatiotemporal texture. In: IEEE conference on computer vision and pattern recognition workshop, pp 135–135
6. Kahl F, Hartley R, Hilsenstein V (2004) Novelty detection in image sequences with dynamic backgrounds. In: International workshop on statistical methods in video processing. Springer, Berlin, pp 117–128
7. Heikkila M, Pietikainen M (2006) A texture-based method for modeling the background and detecting moving objects. IEEE Trans Pattern Anal Mach Intell 28(4):657–662
8. Oliver NM, Rosario B, Pentland AP (2000) A Bayesian computer vision system for modeling human interactions. IEEE Trans Pattern Anal Mach Intell 22(8):831–843
9. Li Y (2004) On incremental and robust subspace learning. Pattern Recognit 37(7):1509–1518
10. Sheikh Y, Shah M (2005) Bayesian modeling of dynamic scenes for object detection. IEEE Trans Pattern Anal Mach Intell 27(11):1778–1792
11. Dalley G, Migdal J, Grimson W (2008) Background subtraction for temporally irregular dynamic textures. In: IEEE workshop on applications of computer vision, pp 1–7
12. Wixson L (2000) Detecting salient motion by accumulating directionally-consistent flow. IEEE Trans Pattern Anal Mach Intell 22(8):774–780
13. Tian Y-L, Hampapur A (2005) Robust salient motion detection with complex background for real-time video surveillance. In: IEEE workshop on applications of computer vision & workshop on motion and video computing, Breckenridge, CO 2005, pp 30–35. https://doi.org/10.1109/ACVMOT.2005.106
14. Jain AK (1989) Fundamentals of digital image processing. Prentice Hall, Englewood Cliffs
15. Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst, Man, Cybern 9(1):62–66
16. Reddy VK, Siramoju KK, Sircar P (2014) Object detection by 2-D continuous wavelet transform. In: IEEE international conference on computational science and computational intelligence (CSCI), vol 1, pp 162–167
17. Antoine JP, Carrette P, Murenzi R, Piette B (1993) Image analysis with two-dimensional continuous wavelet transform. Signal Process 31(3):241–272
18. Van Trees HL (2004) Detection, estimation, and modulation theory, part I: detection, estimation, and linear modulation theory. Wiley, New York
19. Albu F, Florea C, Zamfir A, Drimbarean A (2008) Low complexity global motion estimation techniques for image stabilization. In: Digest of technical papers-IEEE international conference on consumer electronics, pp 1–2
Speech-Based Selective Labeling of Objects in an Inventory Setting
A. Alice Nithya, Mohak Narang, and Akash Kumar
Abstract Object detection has been extensively used and is considered one of the prerequisites for various vision-based applications such as object recognition, instance segmentation, and pose estimation. This field has attracted much research attention in recent years due to its close relationship with scene analysis and image understanding. Traditional methods extract features and use shallow machine learning architectures to detect objects. These methods face many difficulties in combining the extracted low-level features with the high-level context of the detector and classifier. Though recent developments in deep learning architectures have helped visual scene analysis and detection methods to perform remarkably well, much more attention is required on voice-based scene analysis and object detection. In this paper, a voice-based scene analysis and object detection method is proposed, which aims at detecting a specific object in inventory storage through voice input. An automatic speech recognizer based on a deep learning architecture is used to perform speech-to-text conversion. The converted text is used to identify which object is to be located in the scene. A dictionary-based search algorithm is used to reduce the search time for the class of interest. Object detection is performed using the Faster RCNN architecture. The system performance is evaluated on the retail product checkout dataset. Experimental results show that this approach makes it possible for the model to segment only the specific object while ignoring all the other classes.
Keywords Object detection · Speech-to-text conversion · Faster RCNN · Speech recognizer · Selective labeling
A. Alice Nithya (B) · M. Narang · A. Kumar, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Kanchipuram, Chennai, TN, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. M. Thampi et al. (eds.), Applied Soft Computing and Communication Networks, Lecture Notes in Networks and Systems 187, https://doi.org/10.1007/978-981-33-6173-7_20
1 Introduction
Selective object detection and labeling serve as a prerequisite for many vision-based applications such as semantic segmentation [1, 2], face or iris recognition [3, 4], pose estimation [5] and object-based reasoning [6]. Object detection combines object localization in a scene with object classification. Many of the recently developed object detectors follow a two-stage framework [7, 8]: (i) discriminate foreground objects from background objects and assign class labels; (ii) detect the required object and remove duplicate detections of the same object, if any. The selective labeling algorithm [9] aims to help people whose jobs involve spotting specific things in a pile of different things. It aims to reduce the time and energy that they spend every day searching for specific objects. The areas of application include places where segregation happens, for example, an inventory or a big wholesale supermarket. In [10], a Convolutional Neural Network (CNN) with bounding-box extraction was proposed for object detection. The authors used a self-made dataset and applied the proposed method in two situations, one on a conveyor belt and the other on a shelf. Mean Average Precision (mAP), Maximum Average Precision (APmax), and Minimum Average Precision (APmin) were used to evaluate the model performance, and the authors obtained an mAP of 0.84 on the conveyor belt and 0.79 on the shelf. In [11], Faster RCNN and the Structure Inference Net (SIN) were proposed for object detection. The Pascal VOC and MS-COCO datasets were used to test the experimental methodologies, with mAP as the performance metric. The authors obtained an mAP of 76.0% on VOC 2007 and 73.1% on the VOC 2012 dataset. On the MS-COCO dataset, they reported the overall mAP averaged over IoU thresholds from 0.5 to 0.95.
The Intersection over Union network (IoU-Net) with a ResNet Feature Pyramid Network (ResNet-FPN) backbone was proposed in [12] for object detection. IoU-guided Non-Maximum Suppression (NMS) was used to construct the bounding boxes for the detected objects and remove duplicates. Experiments on the MS-COCO dataset showed 40.6% AP with ResNet101-FPN compared to the baseline of 38.5%, an improvement of 2.1 percentage points. Collaborative Self-Paced Curriculum Learning (C-SPCL) for object detection and labeling was proposed in [13]. The Pascal VOC 2007 test set and the MS-COCO dataset were used to evaluate the model performance. With mAP as the performance metric, the method achieved a 1.3–11.7% higher average mAP than other state-of-the-art methods. Ren et al. [14] proposed Faster RCNN, which combines Fast RCNN with a Region Proposal Network (RPN), for object detection. On a GPU, this object detection method had a frame rate of 5 fps and achieved state-of-the-art object detection accuracy on PASCAL VOC 2007 (73.2% mAP) and PASCAL VOC 2012 (70.4% mAP).
Thus, in the existing research works, object detection and labeling are first performed on the whole scene, and users then identify the object they need from the labels provided. To make object detection and labeling suitable for an inventory environment, this paper proposes voice-based scene analysis followed by object detection and labeling, which aims at detecting a specific object in inventory storage through voice input. An automatic speech recognizer based on a deep learning architecture is used to perform the speech-to-text conversion. The converted text is used to identify which object is to be located in the scene. A dictionary-based search algorithm is used to reduce the search time for the class of interest. Object detection is performed using the Faster RCNN architecture. Since the detection task requires a lot of data and training, a transfer learning approach is used to reduce the training time. The dataset is split into training, testing, and validation sets to evaluate the model's performance. The rest of this paper is organized as follows. Section 2 explains the proposed experimental methodology. The experimental results and additional experimental validation are presented in Sect. 3. Section 4 concludes the main contributions of our study.
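Several of the detectors surveyed above are evaluated at IoU thresholds and rely on NMS to remove duplicate detections of the same object. The following is a minimal sketch of IoU and of plain greedy NMS, not of the IoU-guided variant in [12]; the box format (x1, y1, x2, y2), the function names and the 0.5 threshold are assumptions made for illustration.

```python
def box_area(box):
    """Area of a box given as (x1, y1, x2, y2)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping any box that overlaps a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))  # indices of the boxes that survive suppression
```

The second box overlaps the first with an IoU of 0.81, so it is suppressed as a duplicate, while the third box does not overlap and is kept.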
2 Experimental Methodology
In general, object detection aims at localizing, detecting, and classifying the objects in an image and marking each of them with a bounding box. In this work, a voice-based scene analysis and object detection method is proposed that detects a specific object in inventory storage through voice input and draws a bounding box around it. The methodology is tested using COIL-100, created by the Center for Research on Intelligent Systems at the Department of Computer Science, Columbia University [16]. Initially, a data augmenter [17] is used to increase the size of the dataset and apply the desired transformations. An automatic speech recognizer based on the Google API is then used to perform speech-to-text conversion [18]. The converted text is used to identify which object is to be located in the scene. A dictionary-based search algorithm is used to reduce the search time for the class of interest [19]. Object detection is performed using the Faster RCNN architecture, followed by generation of class labels [20, 21]. Figure 1 shows the workflow of the proposed methodology.
2.1 Data Preprocessing
Given the complexity of detecting objects accurately in images, it is important to have a dataset rich in the number of samples as well as in their orientation. There should be a wide variety of objects with different transformations in the samples to obtain better model performance. The samples present in the dataset should have
Fig. 1 Workflow of proposed methodology
different noise factors such as objects of varied size, varied clarity, different orientations, and color variations. Since the selected dataset contains a limited number of images per class, a data augmenter was used to create new samples from the existing images by transforming them under different conditions, so as to obtain objects with different scale, rotation, translation, and color saturation. In this work, the open-source tool Augmentor [17] was used to augment the dataset and apply the desired transformations. This increased the size of the dataset from 72 images per class to over 200 images per class.
All the images in the dataset are then resized to the same resolution, which further helps to improve the model performance. This is accomplished by identifying the resolution of the images and zero-padding the smaller images in the dataset to avoid data loss. Then, to prepare the training data for the object detection model, an open-source tool [22] is used to annotate the regions of interest with labels (object classes). The training dataset is manually labeled for each class to obtain a Pascal Visual Object Classes (VOC) file for each image. The VOC file has the extension .xml and is used during model training.
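As an illustration of what such an annotation file holds, the following minimal sketch reads the class name and bounding box from a Pascal VOC style XML document using only the Python standard library. The sample annotation (file name, class name, coordinates) is made up for illustration and is not taken from the authors' data.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for one image; all values are illustrative.
SAMPLE_VOC = """
<annotation>
  <filename>obj1__0.png</filename>
  <size><width>128</width><height>128</height><depth>3</depth></size>
  <object>
    <name>cup</name>
    <bndbox><xmin>20</xmin><ymin>30</ymin><xmax>100</xmax><ymax>110</ymax></bndbox>
  </object>
</annotation>
"""

def read_voc_boxes(xml_text):
    """Return (filename, [(class_name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text.strip())
    filename = root.findtext("filename")
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"),) + coords)
    return filename, boxes

filename, boxes = read_voc_boxes(SAMPLE_VOC)
print(filename, boxes)
```

Each `object` element carries one labeled region, so an image with several annotated objects simply yields several tuples.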
2.2 Voice-Based Object Detection
2.2.1 Speech-to-Text Conversion
Voice input plays an important role in the proposed methodology, as it is used to determine the item that is to be found in the image in a short span of time and with less human intervention. The voice input is taken from the user along with the image and is used to identify the class which is to be localized in the input image. The voice of the user is captured using the Python bindings for PortAudio, the cross-platform audio I/O library [23]. This library enables the use of a microphone to gather the voice input from the user. The voice input is captured as an analog waveform and then converted to text. An automatic speech recognizer based on a deep learning architecture (Google speech API) is used to perform this speech-to-text conversion [18]. The recognized text is then mapped to the available classes on which the model is trained. This mapping is done using an efficient dictionary-based class searching algorithm.
2.2.2 Dictionary-Based Class Searching Algorithm
From the recognized speech, a list of possible results is returned, where each element of the list is a tuple containing a word that the user could have spoken along with its recognition accuracy. The word with the highest recognition accuracy is the first element of this list. It is selected and searched for in the dictionary of class labels. The dictionary is maintained such that its keys are alternate words that a user might speak to describe a class, and the corresponding values are the actual class names. In this way, the user's speech input is mapped to the correct class name and the desired object is localized in the image. This search algorithm has two main advantages. Firstly, since a large number of alternate words are mapped to each class, the correct class is identified even if noise disturbs the speech recognition. Secondly, the dictionary uses a hash-based key-value lookup, which makes the average search complexity O(1).
The dictionary keys are searched for the detected text, and if a match is found, the corresponding class name is returned as the target class to be identified in the input image. The label description file required for object detection is then altered according to the identified class so that only the desired object is detected. Thus, the class labels required for detecting the objects are generated and given to the trained model for further processing.
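The lookup described in this subsection can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the alias dictionary, the class names, and the function name `select_target_class` are assumptions introduced here.

```python
# Alias dictionary: keys are words a user might say, values are class names.
# The aliases and class names below are illustrative assumptions.
ALIASES = {
    "cup": "cup",
    "mug": "cup",
    "cap": "cup",        # plausible misrecognition of "cup"
    "car": "toy_car",
    "toy car": "toy_car",
}

def select_target_class(hypotheses, aliases=ALIASES):
    """Pick the most confident hypothesis and map it to a class name.

    hypotheses: list of (text, confidence) tuples from the recognizer.
    Returns the class name, or None when the word is not in the dictionary.
    """
    if not hypotheses:
        return None
    best_text, _ = max(hypotheses, key=lambda h: h[1])
    # Python dicts are hash tables, so this lookup is O(1) on average.
    return aliases.get(best_text.strip().lower())

print(select_target_class([("Mug", 0.91), ("mud", 0.40)]))
```

Returning `None` for an unknown word lets the caller ask the user to repeat the request instead of searching for a wrong class.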
2.3 Faster RCNN
Object detection is the process of localizing instances of objects in an image. Detecting an object is effectively a two-step process. In the first step, different areas of the image are classified as either ‘area of interest’ or background, and bounding boxes are proposed on the areas of interest using a Region Proposal Network (RPN) [20]. In the second step, pooling is applied to the regions of interest proposed previously, and these regions are classified into object categories. The deep learning architecture takes the input image and runs it through trained convolutional layers which extract features from the image. These features represent different transformations of the image and help the network understand it. Collectively known as feature maps, they are given as input to the RPN, where different parts of the image are classified as background or object and localization coordinates are proposed for the objects. These proposals on the feature maps are then sent to a Box Classification layer, where the objects are classified into their target labels and the localization coordinates are refined. The RPN thus creates regions on the input image where the probability of an object being present is high. These regions are called proposals. In the classification step, each proposal is considered separately and classified for the presence of an object. A proposal whose classification score is high for a specific class is labeled with a bounding box along with the name of that class. Figure 2 illustrates the process of object detection, in which the feature maps are generated using trained convolution layers. The RPN classifies the areas on the feature maps as background or object and proposes bounding boxes. With the help of a Box Classifier, these proposed areas of interest are refined and classified into object categories.
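The final labeling step, in which only high-scoring boxes of the requested class are kept, can be sketched as below. The detection record layout, the function name, and the 0.5 score threshold are assumptions made for this illustration; the paper does not state them.

```python
def select_detections(detections, target_class, min_score=0.5):
    """Keep boxes of the requested class whose score clears a threshold.

    detections: list of dicts with keys "box" (xmin, ymin, xmax, ymax),
    "label" (class name) and "score" (classification score in [0, 1]).
    The dict layout and the default threshold are assumptions.
    """
    kept = [d for d in detections
            if d["label"] == target_class and d["score"] >= min_score]
    # Highest-scoring boxes first, so the caller can draw the best match.
    return sorted(kept, key=lambda d: d["score"], reverse=True)

# Made-up detector output for one image.
sample = [
    {"box": (10, 10, 60, 60), "label": "cup", "score": 0.62},
    {"box": (70, 20, 120, 90), "label": "toy_car", "score": 0.95},
    {"box": (12, 8, 58, 61), "label": "cup", "score": 0.88},
    {"box": (0, 0, 20, 20), "label": "cup", "score": 0.30},
]
for det in select_detections(sample, "cup"):
    print(det["label"], det["score"], det["box"])
```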
For this work, a Faster RCNN ResNet-50 model [20] is used because of its high mAP score and better computation time for localizing objects. The training stage of this model requires the input images along with the corresponding VOC label files created in the data preprocessing step. Together, the input images and VOC label files are used to train the Faster RCNN model. Since training a network from scratch requires a lot of time and computation, transfer learning is employed: a network that has already been trained on similar data is fine-tuned on the new dataset. This reduces the time taken by the network to converge and thus reduces the
Fig. 2 Faster Region-based Convolutional Neural Network (FRCNN) architecture
overall computation cost of the developed model. Once the model is trained, the weights are converted to a frozen inference graph, which is used for making detections together with a class label description file that tells the model which objects to look for in the image. This graph is deployed along with the speech recognition, class selection, and label generation methods in order to achieve the desired results. The results are discussed in the following sections of this paper.
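One way the restricted class label description file could be produced is sketched below. It assumes, as a guess not confirmed by the paper, that the file uses the text layout of `item { id ... name ... }` entries common to TensorFlow Object Detection API label maps. The function name and class ids are hypothetical.

```python
def restricted_label_map(class_ids, target_class):
    """Build label-map text that lists only the requested class.

    class_ids: dict mapping every trained class name to its integer id.
    The original id is kept so that it still matches the trained model.
    """
    class_id = class_ids[target_class]
    return f"item {{\n  id: {class_id}\n  name: '{target_class}'\n}}\n"

# Hypothetical ids assigned at training time.
TRAINED_CLASSES = {"cup": 1, "toy_car": 2}
print(restricted_label_map(TRAINED_CLASSES, "toy_car"))
```

Keeping the id from training matters here: renumbering the single remaining class would make the model's predicted ids point at the wrong name.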
3 Results and Discussion
The algorithm proposed in this work was tested using images from the COIL-100 dataset, created by the Center for Research on Intelligent Systems at the Department of Computer Science, Columbia University [16]. This dataset contains color images of 100 objects. All the objects were positioned on a power-driven turntable against a black backdrop, and images were acquired at pose intervals of 5°. Figure 3 shows some of the sample images from the dataset used in this work. Initially, a data augmenter was used to increase the sample size of the dataset and apply the desired transformations. This was done to help improve the classifier accuracy. After increasing the sample size of the input data, an automatic
Fig. 3 Sample images taken from COIL-100 dataset
speech recognizer based on the Google API was used to perform speech-to-text conversion. This converted the speech input given by the user into a text query for the object. The converted text was used to identify which object is to be located in the scene. A dictionary-based search algorithm was then used to limit the search to the object types described in the text. This step helped in reducing the search time required to identify the class of the object of interest. Object detection was performed using the Faster RCNN architecture, followed by generation of class labels. Figure 4 shows the classification accuracy of Faster RCNN for (a) a specific object type and (b) all object types. Figure 5 shows how four different losses of the detection model change with the number of training steps. Figure 5a shows the loss of the Box Classifier in predicting the class name for the bounding box. Figure 5b shows the loss of the Box Classifier in refining the coordinates of the bounding boxes. Figure 5c depicts the loss of the RPN in identifying the coordinates of the bounding boxes on the regions of interest. Figure 5d shows the loss of the RPN in classifying the regions of the image as object or background. To obtain accurate detection results, it is important for all four losses to converge. The losses reached a minimum at around 2000 training steps, and hence further training of the model was stopped.
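As a reading aid, the four curves of Fig. 5 can be viewed as terms of a single training objective. The unweighted sum in the first line is an assumption made here (implementations often weight the terms). The second line is the RPN loss as defined in the Faster R-CNN paper [21], where $p_i$ is the predicted objectness of anchor $i$, $p_i^*$ its ground-truth label, $t_i$ and $t_i^*$ the predicted and ground-truth box coordinates, and $\lambda$ a balancing weight.

```latex
L_{\text{total}} = L_{\text{box,cls}} + L_{\text{box,loc}} + L_{\text{RPN,loc}} + L_{\text{RPN,obj}}

L_{\text{RPN}}(\{p_i\},\{t_i\}) =
  \frac{1}{N_{\text{cls}}} \sum_i L_{\text{cls}}(p_i, p_i^*)
  + \lambda \, \frac{1}{N_{\text{reg}}} \sum_i p_i^* \, L_{\text{reg}}(t_i, t_i^*)
```

The first term of $L_{\text{RPN}}$ corresponds to the objectness curve (Fig. 5d) and the second to the RPN localization curve (Fig. 5c).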
Fig. 4 Classification accuracy for (a) a specific object, (b) all objects
Fig. 5 Graphs representing (a) Box Classifier classification loss, (b) Box Classifier localization loss, (c) RPN localization loss, and (d) RPN objectness loss
4 Conclusion
Object detection has been used extensively and is considered one of the prerequisites for various vision-based applications. In the proposed work, it was observed that increasing the sample size of the input images using an augmenter helped in improving the accuracy of the Faster RCNN classifier. To the best of our knowledge from the literature, this is the first work on speech-based selective labeling of objects in an inventory setting. By applying transfer learning to the existing architecture, the computational cost of the model is greatly reduced while accuracy improves. The dictionary-based search algorithm developed in this work helped in reducing the search time required to identify the class of the object of interest. From the results, it was found that Faster RCNN helped in minimizing the classification loss and localization loss of the proposed model. As future work, the proposed method will be tested on different benchmark datasets and extended to annotate videos as well.
References
1. Chen X, Girshick R, He K, Dollár P (2019) Tensormask: a foundation for dense object segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 2061–2069
2. Pinheiro PO, Lin TY, Collobert R, Dollár P (2016) Learning to refine object segments. In: European conference on computer vision. Springer, Berlin, pp 75–91
3. Nithya AA, Lakshmi C (2019) On the performance improvement of non-cooperative iris biometrics using segmentation and feature selection techniques. Int J Biom 11(1):1–21
4. Taigman Y, Yang M, Ranzato M, Wolf L (2014) Deepface: closing the gap to human level performance in face verification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1701–1708
5. Toshev A, Szegedy C (2014) Deeppose: human pose estimation via deep neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1653–1660
6. Desta MT, Chen L, Kornuta T (2018) Object-based reasoning in VQA. In: 2018 IEEE winter conference on applications of computer vision (WACV). IEEE, pp 1814–1823
7. Girshick R (2015) Fast r-cnn. In: The IEEE international conference on computer vision (ICCV)
8. He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-CNN. In: The IEEE international conference on computer vision (ICCV)
9. Zhang Y, Miao D, Zhang Z, Xu J, Luo S (2018) A three-way selective ensemble model for multi-label classification. Int J Approx Reason 103:394–413
10. Yi W, Sun Y, Ding T, He S (2019) Detecting retail products in situ using CNN without human effort labelling. In: CVPR
11. Liu Y, Wang R, Shan S, Chen X (2018) Structure inference net: object detection using scene-level context and instance-level relationships. In: CVPR
12. Jiang B, Luo R, Mao J, Xiao T, Jiang Y (2018) Acquisition of localization confidence for accurate object detection. In: European conference on computer vision (ECCV)
13. Zhang D, Han J, Zhao L, Meng D (2018) Leveraging prior-knowledge for weakly supervised object detection under a collaborative self-paced curriculum learning framework. Int J Comput Vis 127:363–380
14. Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39. https://doi.org/10.1109/TPAMI.2016.2577031
15. Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149
16. Nene SA, Nayar SK, Murase H (1996) Columbia object image library (COIL-100)
17. Bloice M, Stocker C, Holzinger A (2017) Augmentor: an image augmentation library for machine learning. J Open Source Softw 2. https://doi.org/10.21105/joss.00432
18. Cloud Speech-to-Text API: Language support, Google. https://cloud.google.com/speech-to-text/. (Accessed 17 Dec 2019)
19. Large J, Bagnall A, Malinowski S, Tavenard R (2019) On time series classification with dictionary-based classifiers. Intell Data Anal 23(5):1073–1089
20. Huang J, Rathod V, Sun C, Zhu M, Korattikara A, Fathi A, Fischer I, Wojna Z, Song Y, Guadarrama S, Murphy K (2017) Speed/accuracy trade-offs for modern convolutional object detectors. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7310–7311
21. Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: towards real-time object detection with region proposal networks. In: Advances in neural information processing systems, pp 91–99
22. Tzutalin, labelImg (2019) A graphical image annotation tool. https://github.com/tzutalin/labelImg. (Accessed 17 Dec 2019)
23. PyAudio (2019) PortAudio v19 Python Bindings. https://people.csail.mit.edu/hubert/pyaudio/. (Accessed 20 Dec 2019)
Classification and Evaluation of Goal-Oriented Requirements Analysis Methods
Farhana Mariyam, Shabana Mehfuz, and Mohd. Sadiq
Abstract Goal-oriented requirements analysis (GORA) is a sub-process of goal-oriented requirements engineering, which is used for the identification and analysis of the high-level objectives of an organization. Different GORA methods such as AGORA, PRFGOREP, FAGOSRA, and Tropos have been developed to deal with different issues related to GORA, such as reasoning with goals, selection and prioritization of goals and requirements, stakeholder analysis, and detection of conflicts among goals. The objective of this paper is to classify and evaluate the GORA methods based on goal concepts, goal links, and the soft computing techniques used in GORA methods to deal with imprecision and vagueness during the decision-making process. Based on the evaluation, we also discuss the future scope in the field of GORA methods.
Keywords Goal-oriented requirements analysis · Soft computing · Computational intelligence · Fuzzy logic · Genetic algorithms
F. Mariyam (B) · S. Mehfuz
Department of Electrical Engineering, Faculty of Engineering and Technology, Jamia Millia Islamia (A Central University), New Delhi 110025, India
Mohd. Sadiq
Department of Computer Science and Automation, Indian Institute of Science Bangalore Karnataka, Bangalore 560012, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. M. Thampi et al. (eds.), Applied Soft Computing and Communication Networks, Lecture Notes in Networks and Systems 187, https://doi.org/10.1007/978-981-33-6173-7_21
1 Introduction
Goal-oriented requirements engineering (GORE) has been used for the identification, representation, and analysis of software requirements, the capture of alternatives, and the detection of conflicts among requirements using goals. A goal is a high-level objective of an organization [1, 2]. Goals in requirements engineering (RE) provide a definite criterion for sufficient completeness of the requirements specification, requirements appropriateness, etc. They are also helpful in detecting discordances among stakeholders [2]. Based on the different RE processes discussed in [3], we have classified GORE into five sub-processes, i.e., goal-oriented requirements elicitation, goal-oriented requirements modeling, goal-oriented requirements analysis (GORA), goal-oriented requirements management, and goal-oriented requirements verification and validation. Among the various sub-processes of GORE, our work focuses on GORA because it is concerned with the identification and analysis of goals. Over the past two decades, different GORA methods have been developed for the analysis of goals, such as “Knowledge Acquisition for Automated Specification” (KAOS), the “Non-Functional Requirements” (NFRs) Framework, and the “Attributed Goal Oriented Requirement Analysis” (AGORA) method [1–4], and these methods use different techniques for the analysis of goals. For example, in [4] a fuzzy-based approach was employed for the analysis of goals during the GORA process. This motivates us to systematically identify the methods which have been developed for the analysis of goals in GORA and to evaluate them on the basis of goal concepts, goal links, and soft computing techniques, so that new research directions can be identified in the area of GORA methods. Different studies have focused on the analysis and evaluation of GORE techniques. For example, Horkoff and Yu [5] compared and evaluated goal-oriented satisfaction analysis techniques such as the NFR framework, the i* framework, Tropos, and GRL. Zickert and Beck [6] evaluated the KAOS methodology. Anwar and Ikram [7] presented a critical study of GORE techniques. In the past few years, different GORA methods have been developed with the support of soft computing techniques.
Based on our review, we found that no study focuses on the classification and evaluation of GORA methods based on soft computing techniques, goal concepts, and goal links. Therefore, to address this research issue, in this paper we present the classification and evaluation of the GORA methods. The contributions of the paper are as follows: (a) classification of GORA methods and (b) evaluation of GORA methods based on goal concepts, goal links, and the soft computing techniques used in the development of intelligent GORA methods. The remaining part of this paper is organized as follows: In Sect. 2, we discuss the classification of GORA methods. An evaluation of GORA methods based on goal concepts, goal links, and soft computing techniques is discussed in Sect. 3. Finally, Sect. 4 concludes the paper and gives suggestions for future work in the area of GORA.
2 Classification of GORA Methods
In the GORE literature, different methods have been developed for the elicitation, analysis, and modeling of software requirements. To identify those GORE techniques which focus more on the analysis of goals, we have evaluated the GORE techniques on the basis of the different sub-processes of RE; and as a result, we have
identified 13 GORE techniques which are dedicated to goal analysis; and these techniques are referred to as “goal-oriented requirements analysis” (GORA). Based on the applications of the GORA methods, we have classified the GORA methods into three parts, i.e., (a) GORA methods for the analysis of both functional requirements (FRs) and non-functional requirements (NFRs), (b) GORA method(s) dedicated to NFRs only, and (c) GORA method(s) for social modeling.
2.1 GORA Methods for the Analysis of FRs and NFRs
Based on our review, we have identified the following GORA methods which are used for the analysis of the FRs and NFRs of a system.
The knowledge acquisition for automated specifications (KAOS) method was developed in 1993 to build requirements models. It can be used in different industries such as steel, healthcare, and telecommunications. KAOS helps to generate ideas related to the problem description, improves the problem analysis process by applying a systematic approach to the elicitation and structuring of requirements, and helps to specify the responsibilities of different stakeholders. Four types of models are used in KAOS, i.e., the KAOS goal model, KAOS responsibility model, KAOS object model, and KAOS operational model [8].
The Goal-Based Requirements Analysis (GBRA) method was proposed by Anton [9] in 1996 to “identify, elaborate, refine, and organize goal for requirements specifications”. Based on the concept of scenario analysis, the GBRA method was developed for goal analysis and goal evolution. It uses the following concepts for goal analysis and evolution: achievement goals, maintenance goals, goal decomposition, and goal obstacles. The achievement keywords of the GBRA method are “achieve, make, improve, speedup, increase, satisfied, completed, and allocated”. Its maintenance keywords are “maintain, keep, ensure, avoid, know, monitor, track, provide, supply, and found out”.
The Attributed Goal-Oriented Requirements Analysis (AGORA) method was developed in 2002 by Kaiya et al. [10] for the analysis of goals using two attribute values, i.e., contribution values (CV) and preference matrices (PM). A requirements analyst attaches these attributes to an AGORA graph. The CV specifies the “contribution of the sub-goal to the achievement of its parent goal”. The PM of a goal represents the “preferences of the goal for each stakeholder”.
The following steps are used to construct an AGORA graph: (a) identify the starting goal of the stakeholders, (b) decompose and refine the goals into sub-goals, (c) select goals from the list of decomposed goals, and (d) detect and resolve conflicts among goals.
Tropos is a Greek word which means easily changeable and adaptable [11]. Tropos was developed by Bresciani et al. [11] in 2004 as an agent-oriented method to support the following activities: (a) “early requirements phase”, (b) “late requirements phase”, (c) “architectural design phase”, (d) “detailed design phase”, and (e) implementation. The concept of actors and goals in Tropos was adopted
from the i* framework. The objective of requirements analysis in Tropos is to find out the different types of requirements of the software.
The goal-oriented idea generation (GOIG) method was developed in 2003 by Oshiro et al. [12] to support the following activities: “refinement and decomposition of goals into more concrete sub-goals” and “collaborative activities by stakeholders to elicit goals and to construct a goal graph”. In GOIG, an idea generation method is incorporated so that a group of stakeholders can identify the sub-goals.
The goal-oriented and ontology-driven requirements elicitation (GOORE) method was developed in 2007 by Shibaoka et al. [13]; it uses domain knowledge for the decomposition and refinement of goals. In GOORE, domain knowledge is represented as an ontology, and the ontological system is embedded in goal-oriented analysis. In this method, an ontology is considered as a “thesaurus of words and inference rules on it”. The process of GOORE is divided into two parts, i.e., the “set of activities that requirement analyst should perform” and the tasks that can be automated.
In 2014, Sadiq and Jain [14] developed a method for the prioritization of requirements using a fuzzy-based approach in the goal-oriented requirements elicitation process (PRFGOREP). It was the first attempt to use fuzzy-based multi-criteria decision-making (MCDM) methods in the GORE literature for the selection and prioritization of software requirements [15]. PRFGOREP was developed by combining the α-level weighted F preference relation, the extent fuzzy analytic hierarchy process (AHP), and the binary sort tree method for the prioritization of the requirements of an institute examination system (IES). In 2015, Sadiq and Jain [16] proposed a “fuzzy-based approach for selection of goals in goal-oriented requirements elicitation process” (FSGGOREP).
This method was developed to choose and adopt a goal from the alternatives of the decomposed goals when several stakeholders are involved in the decision-making process and the stakeholders use linguistic variables to specify their preferences. In this method, the α-level weighted F preference relation was also used to deal with the linguistic variables. The applicability of this method was discussed using the requirements of the IES. The objective of this method was to support one of the activities of the AGORA method using fuzzy logic.
Motivated by the work of Sadiq and Jain [14–16], in 2017 Garg et al. [17] developed a goal-oriented approach for software requirements elicitation and prioritization (GOASREP) using the AHP method. In this method, a goal-oriented method was used for the elicitation of goals, and the elicited requirements were prioritized using AHP. Cost and effort were used as criteria for the prioritization of the requirements. The function point approach was used to compute the cost of each requirement.
In 2016, Mohammad et al. [4] developed a fuzzy-attributed goal-oriented software requirements analysis (FAGOSRA) method to strengthen a few activities of the AGORA method using fuzzy logic. Similar to the PRFGOREP and FSGGOREP methodologies, fuzzy logic was also used in FAGOSRA for the analysis of the goals [4, 18]. In AGORA, crisp values were used in the CV and PM, whereas in FAGOSRA, fuzzy CV and fuzzy PM were used with an AND/OR graph for goal analysis. This method was applied to the analysis of the requirements of the IES. In this method, the $L^{-1}R^{-1}$ inverse function arithmetic principle of addition was used
to model the linguistic variables. This method was extended by the same research group to consider multiple stakeholders during the decision-making process, and the extension is referred to as FAGOSRA_MS [19].
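To give a feel for how linguistic preferences from several stakeholders can be handled numerically, the sketch below maps linguistic terms to triangular fuzzy numbers, averages them, and takes an α-cut. It is a generic illustration only: the scale, the averaging rule, and the function names are assumptions, and it does not reproduce the α-level weighted F preference relation or the $L^{-1}R^{-1}$ procedure of the cited methods.

```python
# Illustrative linguistic scale as triangular fuzzy numbers (a, b, c).
# The terms and numbers are assumptions, not the scale of the cited methods.
SCALE = {
    "low": (0.0, 0.25, 0.5),
    "medium": (0.25, 0.5, 0.75),
    "high": (0.5, 0.75, 1.0),
}

def add_tfn(x, y):
    """Component-wise addition of two triangular fuzzy numbers."""
    return tuple(p + q for p, q in zip(x, y))

def aggregate(terms):
    """Average the fuzzy ratings given by several stakeholders."""
    total = (0.0, 0.0, 0.0)
    for term in terms:
        total = add_tfn(total, SCALE[term])
    return tuple(v / len(terms) for v in total)

def alpha_cut(tfn, alpha):
    """Interval (lower, upper) of a triangular fuzzy number at level alpha."""
    a, b, c = tfn
    return (a + alpha * (b - a), c - alpha * (c - b))

# Two stakeholders rate the same goal "low" and "high".
rating = aggregate(["low", "high"])
print(rating, alpha_cut(rating, 0.5))
```

A larger α narrows the interval toward the peak of the fuzzy number, which is one common way to turn a vague rating into a range that can be compared across goals.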
2.2 GORA Method(s) for the Analysis of NFRs
Based on our review, we found that the NFR framework, proposed by Mylopoulos et al. [20], is dedicated to the analysis of NFRs. The complexity of a system is determined by its functionality, i.e., “what the system does”, and its non-functionality, i.e., “how the system is supposed to be”. Different types of NFRs, such as security, usability, and reliability, are used as criteria during the decision-making process. This framework consists of five major components, i.e., (a) “set of goals for representing NFRs design decision”, (b) set of link types for relating goals, (c) refinement of goals, (d) correlation rules, and (e) labeling procedure. The basic idea behind the NFR framework is the softgoal interdependency graph, which is used for the analysis of softgoals.
2.3 GORA Method(s) for Social Modeling
The i* framework was developed by Yu [21, 22] in 1995 for social modeling and reasoning about an information system. This framework has two components, i.e., strategic dependency (SD) and strategic rationale (SR). The SD model is employed in the i* framework to show the direct dependency relationships among actors. There are four types of dependency in an SD model, i.e., goal dependency, task dependency, resource dependency, and softgoal dependency. The SR model focuses on the interests and concerns of an organization and of its stakeholders. In this framework, a means-ends link is employed to “connect a task to a goal, indicating a specific way to achieve the goal”. The year-wise development of different GORA methods is exhibited in Fig. 1.
3 Evaluation of GORA Methods
The objective of this section is to evaluate the GORA methods based on the (a) goal concepts, i.e., achievement goal (AG), maintenance goal (MG), softgoal (SG), agent (AT), belief (BL), assumption (AS), constraints (CT), (b) goal links, i.e., intergoal contribution link (IGCL), AND/OR operationalization link (AOR), responsibility link (RL), decomposition link (DL), means-ends link (MEL), and (c) soft computing techniques (SCT). A brief description about goal concepts, goal links, and soft computing is given in Table 1.
Fig. 1 Development of different GORA methods
The first two criteria, i.e., goal concepts and goal links, are the key components of GORA methods, and they help the requirements analyst to select a GORA method for the decomposition and refinement of goals. The third criterion helps the requirements analyst to identify those GORA methods which deal with imprecision and vagueness during the decision-making process. In GORA methods, the high-level objective of an organization is decomposed and refined into sub-goals. These sub-goals are further decomposed and refined into sub-sub-goals. This process continues until the responsibilities of the last sub-goals are assigned to some agents. The sub-goals are represented by an AND/OR graph which has two decomposition links, i.e., AND decomposition and OR decomposition. In our evaluation, we have also considered the different types of links which are present in GORA methods. Different types of goals are used in GORA methods, such as achievement goals, maintenance goals, and softgoals. Based on our review, as shown in Table 2, we found that achievement goals and maintenance goals are defined in KAOS and the GBRA method. Softgoals are supported by the NFR framework, the i* framework, Tropos, and GOORE, but they are only conditionally defined in KAOS, the GBRA method, AGORA, GOIG, FSGGOREP, PRFGOREP, GOASREP, FAGOSRA, and FAGOSRA_MS. Only a few GORA methods support soft computing-based techniques for the analysis of the goals. Most of the focus in GORA has been on the use of crisp data during the decision-making process for goals and software requirements. For example, in the AGORA method [10], crisp values were used for the analysis of the goals using the AGORA graph. It has been observed that in real-life applications, stakeholders specify their
Classification and Evaluation of Goal-Oriented Requirements Analysis Methods
Table 1 Criteria for the evaluation of GORA methods

| Criteria | Objective |
| --- | --- |
| Achievement goal | It is an objective of an organization or system |
| Maintenance goal | Anton [3] states that “maintenance goals are those goals that are satisfied while their target condition remains true. They tend to be operationalized as actions or constraints that prevent certain states from being reached.” |
| Soft goal | A soft goal can be defined as a goal that has no clear-cut criteria for its achievement, e.g., “delivery of important document quickly” is a soft goal because the notion of “quickly” is highly subjective |
| Agent | An agent is any component whose cooperation is needed to define the operationalization of a goal, i.e., how the goal is going to be provided by the system-to-be |
| Belief | A belief is an interpretation of the system. It is a very useful goal concept that can prevent stakeholders from repeating the same discussion over and over again |
| Assumptions | A goal under the responsibility of a single agent in the software-to-be becomes a requirement, whereas a goal under the responsibility of a single agent in the environment of the software-to-be becomes an assumption |
| Constraints | These are the requirements that must be met for goal completion. A constraint places a condition on the achievement of a goal |
| Inter-goal contribution (IGC) link | It is used to capture the contribution of requirements to goals. The contribution value of an edge in an AND/OR graph stands for the degree of contribution of the sub-goal to the achievement of its parent goal |
| AND/OR operationalization link | KAOS introduces AND/OR operationalization links to relate goals to operations |
| Responsibility link | In KAOS, responsibility links are introduced to relate the goal and agent sub-models. A goal may be assigned to alternative agents through OR responsibility links |
| Decomposition link | A decomposition link connects a goal/task with its components |
| Means-ends link | It is mostly used with goals and specifies alternative ways to achieve them |
| Soft computing techniques | Soft computing techniques, like fuzzy logic, rough set theory, etc., are used to deal with imprecision and vagueness |
preferences using linguistic variables. For example, in the statement “the system should be more reliable”, “more” is a linguistic variable. Fuzzy logic is a suitable tool to model such linguistic variables. It was developed by Lotfi A. Zadeh in 1965 to deal with imprecision and vagueness during the decision-making process [7]. In 2014, Sadiq and Jain [14] introduced fuzzy logic into the GORE literature and proposed two methods to deal with imprecision and vagueness during the decision-making process, i.e., FSGGOREP and PRFGOREP. After that, some other researchers
Table 2 Framework for the evaluation of GORA methods on the basis of goal concepts, goal links, and soft computing techniques
Columns: S. No; GORA methods; Goal concept(s): AG, MG, SG, AT, BL, AS, CT; Goal links: IGCL, AOR, RL, DL, MEL; SCT
1 KAOS: ✓ ✓ ✓ × ✓ ✓ ✓ ✓ ×
2 NFR framework: × × ✓ × × ✓ ✓ ✓ ✓ ✓ ×
3 i* framework: × × ✓ ✓ × × × ✓ ✓ ✓ ✓ ×
4 GBRAM: ✓ ✓ ✓ × × ✓ × × × × × ×
5 AGORA: × × × × × × ✓ ✓ × × ×
6 GOIG: × × × × × × × × × × × ×
7 TROPOS: × × ✓ ✓ ✓ × × ✓ ✓ ✓ ✓ ×
8 GOORE: × × ✓ × × × × × × × × × ×
9 FSGGOREP: × × × × × ✓ ✓ × ✓ × ✓
10 PRFGOREP: × × × × × ✓ ✓ × ✓ × ✓
11 GOASREP: × × × × × × ✓ × ✓ × ×
12 FAGOSRA: × × × × × ✓ ✓ × ✓ × ✓
13 FAGOSRA_MS: × × × × × ✓ ✓ × ✓ × ✓
Defined (✓), Conditionally Defined (), and Not Defined (×)
have also used fuzzy logic to strengthen some of the activities of GORA, and they developed extended versions of AGORA, i.e., FAGOSRA and FAGOSRA_MS. In [4, 14, 16, 19], fuzzy-based multi-criteria decision-making was used for the selection and prioritization of goals and software requirements during the decision-making process. To model linguistic terms, the simplest form of fuzzy numbers, i.e., triangular fuzzy numbers (TFNs), was used in the fuzzy-based GORA methods. Serrano et al. [23] developed a fuzzy-based method to deal with softgoals in GORA at run time. The authors combined propagation rules, fuzzy logic, and multi-agent systems to provide support for dealing with softgoals at run time.
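To make the idea of modelling a linguistic term with a triangular fuzzy number more concrete, the following minimal sketch defines a TFN (l, m, u) with its standard membership function and component-wise addition. The class name `TFN`, the three-term scale, and its numeric values are hypothetical choices for illustration and are not taken from any of the surveyed methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TFN:
    """Triangular fuzzy number (l, m, u) with l <= m <= u."""
    l: float
    m: float
    u: float

    def membership(self, x: float) -> float:
        """Degree (between 0 and 1) to which x belongs to this fuzzy number."""
        if x == self.m:
            return 1.0
        if self.l < x < self.m:
            return (x - self.l) / (self.m - self.l)
        if self.m < x < self.u:
            return (self.u - x) / (self.u - self.m)
        return 0.0

    def __add__(self, other: "TFN") -> "TFN":
        """Component-wise addition of two triangular fuzzy numbers."""
        return TFN(self.l + other.l, self.m + other.m, self.u + other.u)


# Hypothetical linguistic scale on [0, 1]; the values are illustrative only.
SCALE = {
    "low": TFN(0.0, 0.0, 0.5),
    "medium": TFN(0.0, 0.5, 1.0),
    "high": TFN(0.5, 1.0, 1.0),
}

# A stakeholder rating of "medium" fully covers 0.5 and partially covers 0.25.
full = SCALE["medium"].membership(0.5)
partial = SCALE["medium"].membership(0.25)
```

In a sketch like this, ratings from several stakeholders could be aggregated by adding their TFNs and then defuzzifying the result; the exact aggregation used by each surveyed method is described in the cited papers.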
4 Conclusion
Goal-oriented requirements engineering (GORE) is an active research area in the field of software engineering, which is used for the identification and analysis of software requirements according to the consensus of the stakeholders so that a successful software product can be developed. Based on the different processes of RE, in this paper, we divide GORE into five sub-processes and focus on one of its key sub-processes, i.e., goal-oriented requirements analysis (GORA). In the past two decades, different GORA methods have been developed for the identification
and analysis of the goals of software, such as KAOS, AGORA, PRFGOREP, FAGOSRA, and the i* framework. In this paper, we have classified these methods into three parts, i.e., GORA methods for the elicitation and analysis of FRs and NFRs, GORA methods dedicated to NFRs only, and GORA methods for social modeling. Based on our review, we observe that in the GORA literature, most of the focus is on reasoning with goals and on the elicitation and analysis of goals; less attention is given to MCDM methods and their applications during the selection and prioritization of software requirements. We have identified two methods in which fuzzy logic was introduced for the selection and prioritization of goals and requirements in the goal-oriented requirements elicitation and analysis process, i.e., PRFGOREP and FSGGOREP. Based on these two methods, some other methods have also been developed for the analysis of goals and the detection of conflicts among goals and stakeholders, i.e., FAGOSRA and FAGOSRA_MS. One of the limitations of the fuzzy-based GORA methods is that little attention is given to the different types of goals and goal concepts. In future, we shall try to develop intelligent techniques in the context of the goal concepts and goal types in the GORA domain. Much work is needed to apply different intelligent techniques, such as rough set theory, genetic algorithms, and neural networks, to deal with different issues of goal analysis in GORA. There are different applications of machine learning in the area of software engineering. For example, it can be used for the classification of goals from the set of elicited goals during the GORA process.
References
1. Mylopoulos J, Chung L, Liao S, Wang H, Yu E (2001) Exploring alternatives during requirements analysis. IEEE Softw 93–96
2. van Lamsweerde A (2001) Goal-oriented requirements engineering: a guided tour. In: Proceedings fifth IEEE international symposium on requirements engineering, Canada, pp 249–262
3. Sadiq M, Jain SK (2012) An insight into requirements engineering processes. In: 3rd international conference on advances in communication, network, and computing LNCSIT, Chennai, pp 313–318
4. Mohammad CW, Shahid M, Husain SZ (2016) FAGOSRA: fuzzy attributed goal oriented software requirements analysis method. In: 9th international conference on contemporary computing, pp 384–389
5. Horkoff J, Yu E (2013) Comparison and evaluation of goal oriented satisfaction analysis techniques. Requirements Eng 18:199–222
6. Zickert F, Beck R (2010) Evaluation of the goal oriented requirements engineering method KAOS. In: 16th American conference on information systems, Peru, pp 1–9
7. Anwer S, Ikram N (2006) Goal oriented requirement engineering: a critical study of techniques. In: 13th Asia Pacific software engineering conference, Kanpur, pp 121–130
8. KAOS method: https://www.objectiver.com/index.php?id=25. Accessed on August 10, 2020
9. Anton AI (1996) Goal-based requirements analysis. In: Proceedings of the second international conference on requirements engineering, Colorado Springs, pp 136–144
10. Kaiya H, Horai H, Saeki M (2002) AGORA: attributed goal-oriented requirements analysis method. In: Proceedings IEEE joint international conference on requirements engineering, Essen, Germany, pp 13–22
11. Bresciani P, Perini A, Giorgini P et al (2004) Tropos: an agent-oriented software development methodology. Auton Agent Multi-Agent Syst 8:203–236
12. Oshiro K, Watahiki K, Saeki M (2003) Goal-oriented idea generation method for requirements elicitation. In: Proceedings 11th IEEE international requirements engineering conference, Monterey Bay, pp 363–364
13. Shibaoka M, Kaiya H, Saeki M (2007) GOORE: goal-oriented and ontology driven requirements elicitation method. In: Hainaut JL et al (eds) Advances in conceptual modeling – foundations and applications. ER 2007. Lecture notes in computer science, vol 4802. Springer, Berlin
14. Sadiq M, Jain SK (2014) Applying fuzzy preference relation for requirements prioritization in goal oriented requirements elicitation process. Int J Syst Assur Eng Manag 5(4):711–723
15. Sadiq M (2017) Fuzzy logic driven goal oriented requirements elicitation processes. PhD thesis in computer engineering, Department of Computer Engineering, National Institute of Technology Kurukshetra, India
16. Sadiq M, Jain SK (2015) A fuzzy based approach for the selection of goals in goal oriented requirements elicitation process. Int J Syst Assur Eng Manag 6(2):157–164
17. Garg N, Sadiq M, Agarwal P (2017) GOASREP: goal oriented approach for software requirements elicitation and prioritization using analytic hierarchy process. In: Satapathy S, Bhateja V, Udgata S, Pattnaik P (eds) Proceedings of the 5th international conference on frontiers in intelligent computing: theory and applications. Advances in intelligent systems and computing, vol 516. Springer, Singapore
18. Horkoff J, Aydemir FB, Cardoso E, Li T, Mate A, Paja E, Salnitri M, Piras L, Mylopoulos J, Giorgini P (2019) Goal-oriented requirements engineering: an extended systematic mapping study. Requirements Eng 24:133–160
19. Mohammad CW, Shahid M, Hussain SZ (2018) Fuzzy attributed goal oriented software requirements analysis with multiple stakeholders. Int J Inf Technol 1–9
20. Mylopoulos J, Chung L, Nixon B (1992) Representing and using non-functional requirements: a process-oriented approach. IEEE Trans Software Eng 18(6):483–497
21. Yu ESK (1995) Modelling strategic relationships for process reengineering. PhD dissertation, Department of Computer Science, University of Toronto
22. Yu ESK (1997) Towards modeling and reasoning support for early-phase requirements engineering. In: 3rd IEEE international symposium on requirements engineering, pp 226–235
23. Serrano M, Serrano M, do Prado Leite JCS (2011) Dealing with softgoals at runtime: a fuzzy logic approach. In: 2nd international workshop on requirements@run.time, Trento, pp 23–31
iCREST: International Cross-Reference to Exchange-Based Stock Trend Prediction Using Long Short-Term Memory
Kinjal Chaudhari and Ankit Thakkar
Abstract Stock market investments are primarily aimed at gaining higher profits; a large number of companies get listed on various stock exchanges to initiate trading through the stock market. To expand their market trading, several companies may choose to get listed on multiple exchanges, which may be domestic and/or international. In this article, we propose an international cross-reference to exchange-based stock trend (iCREST) prediction approach to study how historical stock market data of a company listed on internationally located stock exchanges can be integrated. We consider the timezone and currency variations in order to unify the data; we also incorporate data integration-based pre-processing to eliminate loss of useful stock price information. We calculate the difference between exchange prices of a company and adopt long short-term memory (LSTM) models to predict the one-day-ahead stock trend on the respective exchanges. Our work can be considered as one of the novel approaches that integrate international stock exchanges to predict the stock trend of the corresponding markets. For the experiment, we take datasets of five companies listed on National Stock Exchange (NSE), Bombay Stock Exchange (BSE), as well as New York Stock Exchange (NYSE); the prediction performance is evaluated using directional accuracy (DA), precision, recall, and F-measure metrics. The results using these metrics indicate performance improvement with international exchanges, and hence the potential adaptability of the proposed approach.
K. Chaudhari (B) · A. Thakkar Department of Computer Science and Engineering, Institute of Technology, Nirma University, Ahmedabad 382 481, Gujarat, India e-mail: [email protected] A. Thakkar e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. M. Thampi et al. (eds.), Applied Soft Computing and Communication Networks, Lecture Notes in Networks and Systems 187, https://doi.org/10.1007/978-981-33-6173-7_22
1 Introduction
The stock market has been an attractive field for an enormous number of investors, traders, as well as researchers. It is a highly volatile market; it experiences non-linear fluctuations because of a large number of events that have a direct or indirect influence on market behaviours. One of the primary goals of investing in such markets is to attain higher returns on the invested amount; while some studies have shown the non-linearity of stock market returns [1, 7], the hidden patterns among market price movements have been explored to derive potential market characteristics, and hence to predict the future trend. Market behaviours can be fundamentally analyzed using knowledge of company specifications, as well as expertise to infer investment opportunities; they can also be technically studied by deriving various indicators and analyzing the historical market trends [2, 6]. Such analyses can be useful to forecast the future stock price and/or trend. In financial markets, it is important to operate through a platform where financial instruments can be bought and sold; a stock exchange is such a platform, which helps companies extend their growth and profits; subsequently, investors and traders benefit from greater opportunities. National Stock Exchange (NSE) and Bombay Stock Exchange (BSE) are two of the largest stock exchanges in India [15]; other widely known stock exchanges include New York Stock Exchange (NYSE), NASDAQ, Japan Exchange Group (JPX), and London Stock Exchange (LSE), to name a few. While the financial markets have evolved from manual money transfers to digital currency trading, a considerable number of companies prefer to get listed on stock exchange(s) during their growing phase; while an initial public offering (IPO) is a direct way to get listed, several private companies may pursue an indirect reverse takeover (RTO) to acquire listing status on a stock exchange [13].
Such public listing allows the company to trade its stocks through a stock exchange [12]. Therefore, in order to expand the profitability of a company and increase the opportunities to attract a higher number of potential investors, a company may further choose to get listed on more than one stock exchange [8, 17]. While the primary stock exchange is likely to be domestic, i.e., the stock exchange belongs to the same country as the company, selection of the other stock exchange may depend on various factors such as the targeted investors, potential benefits, the listing requirements, etc. [5]. Some companies may prefer to get listed on stock exchanges of the same country, whereas some companies may aim to get listed on international stock exchange(s) alongside the domestic stock exchange(s). The diversity of information collected from different stock exchanges, and hence the inherent patterns, can be fused to derive useful stock market predictions [19]. A recent study has been carried out on companies that are listed on two domestic stock exchanges, NSE and BSE, in Ref. [18]; the authors proposed to integrate cross-referencing through stock exchanges to predict the stock price movement direction. As a company gets listed on multiple stock exchanges, the company’s profile, popularity, relation with investors, dominating and/or influential factors, etc. play important roles in its market trend with respect to each stock exchange [20]; the correlations
can be studied for different stock exchanges and potential trends can be predicted. With a motivation to associate information of the same company listed on multiple stock exchanges, this article proposes to study the international cross-reference to exchange-based stock trend (iCREST) prediction approach. We consider companies having been listed on two or more stock exchanges at domestic, as well as international levels, and prepare a systematic procedure to advance the stock trend prediction.
1.1 Motivation
Stock market volatility is likely to fluctuate because of various factors of the country’s economy; on the other hand, the financial developments based on such markets can also drive the economy of the respective countries [14]. Therefore, it is important to study the financial aspects of the company, economic conditions, as well as potential risk factors of the surrounding environment and the country as a whole. Companies may have a primary goal of increasing their market reach and expanding business dimensions [17], wherein the cross-country exchange of trades has been commonly observed with the global spread of financial markets. Also, such companies may choose to get listed on different stock exchanges. Evidence has demonstrated the impact of such listing; for example, after getting an American Depositary Receipt (ADR) listing, Russian stocks showed an increased volatility of stock returns [16], whereas Indian stocks indicated higher liquidity gains [9]. Some of the major motivations behind getting domestic as well as international listings may be access to a well-established market, stringent legal protections based on legal bonds, increased valuation of the company, as well as stronger corporate governance [10]. Hence, it is worthwhile to study the impact of a company’s trades on internationally listed stock exchanges. In this article, our primary focus is to study companies listed on two or more international stock exchanges and analyze the impact of one stock exchange on the other. Here, we consider five companies listed on NSE, BSE, and NYSE and apply the proposed iCREST prediction approach using long short-term memory (LSTM) models as given by Ref. [18]. The variations in currency formats, as well as timezones, between the considered stock exchanges are given key attention.
The major contributions of the proposed approach can be summarized as follows:
• International cross-reference to exchange-based stock trend (iCREST) prediction approach is proposed and presented.
• Dissimilarities between timezones and currencies are studied and an illustrative example is provided for a better understanding of the work. Also, the data cleaning process is enhanced with data integration to eliminate the loss of potential stock price information.
• Experiments are carried out on datasets of five companies, namely, Dr. Reddy’s Laboratories Limited (DRREDDY), HDFC Bank Limited (HDFCBANK), ICICI Bank Limited (ICICIBANK), Infosys Limited (INFY), and Wipro Limited (WIPRO), each of which has been listed on NSE, BSE, and NYSE.
• One-day-ahead stock trend prediction is evaluated for NSE and NYSE and similarly, for BSE and NYSE, for each of the five companies.
• The impact of iCREST prediction is analyzed using directional accuracy (DA), precision, recall, and F-measure metrics.
The remaining article is organized as follows: a detailed step-by-step procedure of the proposed iCREST prediction approach is provided in Sect. 2 along with an illustrative example; the experimental parameter specifications are included and the prediction results are analyzed in Sect. 3; the concluding remarks are given in Sect. 4 along with the potential future scope.
2 Proposed Approach
In financial markets, the stock exchange provides a platform to buy and sell various financial instruments. Companies may choose to get listed on one or more stock exchanges to expand their reach to a large number of investors and traders, wherein selection of the stock exchange may be critical; the potential impact of domestic exchanges on each other has been demonstrated in Ref. [18]. Motivated to study such effects on global exchanges, we propose the iCREST prediction approach to forecast the stock trend of a company listed on multiple international stock exchanges. The step-by-step procedure of the proposed approach is discussed below; an illustrative example is also provided for a better understanding of the work. A series of operations is carried out under the pre-processing, preparation, and prediction steps of iCREST prediction; a structural overview is provided in Fig. 1.
2.1 Time-Series Data Collection
The primary step to initiate pre-processing for the proposed iCREST prediction approach is to collect the time-series stock market data. We download the data for the target companies from the Yahoo Finance website. Our main focus is a company listed on multiple stock exchanges that are situated in the domestic as well as foreign countries. For the experiments, we collect the historical time-series data of five companies, namely, Dr. Reddy’s Laboratories Limited (DRREDDY), HDFC Bank Limited (HDFCBANK), ICICI Bank Limited (ICICIBANK), Infosys Limited (INFY), and Wipro Limited (WIPRO), each of which is listed on NSE, BSE, and NYSE; for simplicity of understanding, we discuss the following steps with respect
Fig. 1 A structural overview of international cross-reference to exchange-based stock trend (iCREST) prediction approach (diagram labels: pre-processing, preparation, and prediction stages; time-series data collection, mapping and cleaning, currency conversion, normalization, feature selection, difference, input samples; LSTM Models I (WIT), II (WIPRO.NS), and III (difference); training, validation (optional), testing; price prediction, trend prediction, performance evaluation)
to WIPRO; however, the same operations are also applied to each of the considered companies. The downloaded WIPRO data from NSE, BSE, and NYSE are represented using the ticker symbols WIPRO.NS, WIPRO.BO, and WIT, respectively. The time-series data are collected for the period 01-01-2009 to 01-01-2020 for each stock exchange. Note that WIPRO.NS and WIPRO.BO present stock prices in INR, whereas WIT denotes stock prices in USD. To unify the given price values, we require currency conversion; therefore, we also download the foreign exchange rate, i.e., USD/INR, for the same period. Figure 1 demonstrates the iCREST prediction approach for WIPRO listed on NSE and NYSE; the proposed approach can similarly be evaluated for WIPRO listed on BSE and NYSE, separately. The same can be adopted for any other company listed on two stock exchanges; however, the foreign exchange rate must be chosen accordingly. The daily historical stock market data in its raw form includes date, open, high, low, close, and volume information. Here, open and close indicate the opening and closing prices, respectively, whereas high and low denote the highest and the lowest prices reached on the given trading day, respectively. The number of shares traded is given by the volume; however, volume is not available for the foreign exchange rate. The collected data are then passed to the pre-processing steps. To develop a clear understanding of the proposed work and to simplify the illustration, we consider WIT and WIPRO.NS data along with USD/INR for a sample duration of one week, i.e., 01-01-2019 to 07-01-2019, and conduct the series of operations carried out in the proposed approach; Table 1 shows the considered sample data. As explained above, WIT prices are in USD, WIPRO.NS prices are in INR, and USD/INR denotes rupees per dollar, i.e., it is expressed in INR.
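The raw record layout described above (date, open, high, low, close, volume) can be sketched as follows. This is a minimal illustration that assumes the downloaded data is a CSV file with the columns Date, Open, High, Low, Close, Volume and that a day without data carries the literal string `null`; the column names, the `null` marker, and the `parse_bars` helper are assumptions made for this sketch, not details given in the text. The values are the first four WIT days of the sample week.

```python
import csv
import io
from typing import Dict, Optional

# Hypothetical CSV excerpt for WIT (prices in USD); layout is an assumption.
WIT_CSV = """Date,Open,High,Low,Close,Volume
01-01-2019,null,null,null,null,null
02-01-2019,5.07,5.16,5.07,5.11,466800
03-01-2019,5.07,5.09,5.02,5.05,676500
04-01-2019,5.05,5.08,5.02,5.06,1356100
"""

Bar = Dict[str, float]  # keys: open, high, low, close, volume


def parse_bars(text: str) -> Dict[str, Optional[Bar]]:
    """Map each trading date to its OHLCV values, or None when the day has no data."""
    bars: Dict[str, Optional[Bar]] = {}
    for row in csv.DictReader(io.StringIO(text)):
        date = row.pop("Date")
        if any(value in ("", "null") for value in row.values()):
            bars[date] = None  # keep the date so it can be mapped across datasets later
        else:
            bars[date] = {name.lower(): float(value) for name, value in row.items()}
    return bars


wit = parse_bars(WIT_CSV)
```

Keeping empty days as explicit `None` entries, rather than dropping them at load time, leaves the decision about removal or filling to the later mapping and cleaning step.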
Table 1 An illustrative example: Step-1 time-series data collection

| Dataset | Trading data | 01-01-2019 | 02-01-2019 | 03-01-2019 | 04-01-2019 | 05-01-2019 | 06-01-2019 | 07-01-2019 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| WIT (in USD) | Open | Null | 5.07 | 5.07 | 5.05 | Null | Null | Null |
| | High | Null | 5.16 | 5.09 | 5.08 | Null | Null | Null |
| | Low | Null | 5.07 | 5.02 | 5.02 | Null | Null | Null |
| | Close | Null | 5.11 | 5.05 | 5.06 | Null | Null | Null |
| | Volume | Null | 466800 | 676500 | 1356100 | Null | Null | Null |
| WIPRO.NS (in INR) | Open | 248.06300 | 246.60001 | 245.25000 | 243.37500 | Null | Null | Null |
| | High | 249.56300 | 248.47501 | 245.96300 | 245.13800 | Null | Null | Null |
| | Low | 244.05000 | 242.25000 | 242.28799 | 239.73801 | Null | Null | Null |
| | Close | 244.98801 | 244.16299 | 244.12500 | 243.33800 | Null | Null | Null |
| | Volume | 2018271 | 4411069 | 4723040 | 3207036 | Null | Null | Null |
| USD/INR (in INR) | Open | 69.71000 | 69.44300 | 69.96000 | 70.10050 | Null | Null | 69.52500 |
| | High | 69.73000 | 70.23300 | 70.51000 | 70.13030 | Null | Null | 69.92000 |
| | Low | 69.43000 | 69.44300 | 69.96000 | 69.60000 | Null | Null | 69.08000 |
| | Close | 69.71000 | 69.71000 | 69.96000 | 70.30000 | Null | Null | 69.52500 |
| | Volume | – | – | – | – | – | – | – |
2.2 Mapping and Cleaning
Market instruments are traded within a specific duration as per the trading-day timings; also, there are holidays or certain situations when the market remains closed. Such days may differ between stock exchanges; this issue arises especially while working with stock exchanges of different countries. In the proposed approach, we consider NSE and NYSE (similarly, BSE and NYSE); therefore, it is important to identify common trading days for which data are available from both stock exchanges. The difference in currencies must also be handled; to obtain a uniform representation of the data, different currencies are converted into a single currency using the exchange rate of that particular trading day. However, the above-mentioned concern of null or missing data also arises with the foreign exchange rate. One option is to identify common trading days by mapping the available data using their trading dates and eliminating every trading day for which any of the three datasets, i.e., WIT, WIPRO.NS, and USD/INR, does not contain trading information. Though the resultant data would be clean, i.e., without null or missing values, such operations also erase useful information available in one or more exchanges for the specific day; loss of this information can result in performance degradation. To overcome such situations, we consider data integration in the proposed approach. Initially, we map data according to their trading dates, wherein the days without any information are marked as null. With the help of such mapping, we identify trading days that have null information for all three datasets and remove those days. Hence, the resulting incomplete data consist of missing values for one or more datasets for a specific trading day, but not for all. Therefore, we fill the missing
Table 2 An illustrative example: Step-2 mapping and cleaning

| Dataset | Trading data | 01-01-2019 | 02-01-2019 | 03-01-2019 | 04-01-2019 | 07-01-2019 |
| --- | --- | --- | --- | --- | --- | --- |
| WIT (in USD) | Open | 5.15 | 5.07 | 5.07 | 5.05 | 5.05 |
| | High | 5.16 | 5.16 | 5.09 | 5.08 | 5.08 |
| | Low | 5.10 | 5.07 | 5.02 | 5.02 | 5.02 |
| | Close | 5.13 | 5.11 | 5.05 | 5.06 | 5.06 |
| | Volume | 316600 | 466800 | 676500 | 1356100 | 1356100 |
| WIPRO.NS (in INR) | Open | 248.06300 | 246.60001 | 245.25000 | 243.37500 | 243.37500 |
| | High | 249.56300 | 248.47501 | 245.96300 | 245.13800 | 245.13800 |
| | Low | 244.05000 | 242.25000 | 242.28799 | 239.73801 | 239.73801 |
| | Close | 244.98801 | 244.16299 | 244.12500 | 243.33800 | 243.33800 |
| | Volume | 2018271 | 4411069 | 4723040 | 3207036 | 3207036 |
| USD/INR (in INR) | Open | 69.71000 | 69.44300 | 69.96000 | 70.10050 | 69.52500 |
| | High | 69.73000 | 70.23300 | 70.51000 | 70.13030 | 69.92000 |
| | Low | 69.43000 | 69.44300 | 69.96000 | 69.60000 | 69.08000 |
| | Close | 69.71000 | 69.71000 | 69.96000 | 70.30000 | 69.52500 |
data with the previously available trading day’s information from the same dataset. In this way, we integrate the data so as to ensure that the historical stock information of any dataset is not lost while cleaning. In contrast, a feature with missing or incomplete values, such as the volume information of USD/INR, is removed from the corresponding dataset. Such cleaning operations are necessary to reduce error propagation in the succeeding steps. It can be observed from Table 1 that the trading days 05-01-2019 and 06-01-2019 have null values for all three datasets considered for the illustration; in addition, 01-01-2019 and 07-01-2019 contain null values for WIT, and 07-01-2019 has null values for WIPRO.NS. Therefore, by mapping the respective datasets using dates, we eliminate the trading days with null information in all three datasets, viz. 05-01-2019 and 06-01-2019. To perform data integration, we replace the null values with the previous trading day’s information, such that the trading days 01-01-2019 and 07-01-2019 of WIT are filled with the values of 31-12-2018 and 04-01-2019, respectively; similarly, trading day 07-01-2019 of WIPRO.NS is filled with the values of 04-01-2019. In this way, a sparse historical data matrix undergoes data integration during this step. We remove volume from USD/INR as it does not contain any information; the derived dataset after cleaning is shown in Table 2. Note that cleaning depends on the associated datasets; when mapping trading days of datasets from other stock exchanges, for example, WIPRO listed on NYSE and the Frankfurt Stock Exchange (FSE), the cleaning operation may result in a different set of trading days from the one illustrated here. Also, the currencies of NYSE and FSE are USD and Euro, respectively; therefore, the USD/EUR foreign exchange rate must be taken into consideration.
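The mapping and cleaning step described above can be sketched in a few lines. The following is a minimal illustration on the close prices of the sample week only, assuming dates are supplied in chronological order; the function name `map_and_clean` and the `seed` argument (which carries in the last value from before the sample window) are hypothetical helpers introduced for this sketch, not part of the authors' implementation.

```python
from typing import Dict, List, Optional

Series = Dict[str, Optional[float]]  # trading date -> close price (None = no data)

DATES = ["01-01-2019", "02-01-2019", "03-01-2019", "04-01-2019",
         "05-01-2019", "06-01-2019", "07-01-2019"]

# Close prices of the sample week as listed in Table 1.
wit = dict(zip(DATES, [None, 5.11, 5.05, 5.06, None, None, None]))
wipro_ns = dict(zip(DATES, [244.98801, 244.16299, 244.12500, 243.33800, None, None, None]))
usd_inr = dict(zip(DATES, [69.71, 69.71, 69.96, 70.30, None, None, 69.525]))


def map_and_clean(datasets: Dict[str, Series], dates: List[str],
                  seed: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Drop dates that are empty in every dataset, then forward-fill remaining gaps."""
    # Mapping: keep a date if at least one dataset has a value for it.
    kept = [d for d in dates if any(s.get(d) is not None for s in datasets.values())]
    cleaned: Dict[str, Dict[str, float]] = {}
    for name, series in datasets.items():
        last = seed.get(name)  # value carried in from before the window, if any
        filled: Dict[str, float] = {}
        for d in kept:
            value = series.get(d)
            if value is not None:
                last = value
            if last is None:
                raise ValueError(f"no earlier value available to fill {name} on {d}")
            filled[d] = last  # data integration: reuse the previous available value
        cleaned[name] = filled
    return cleaned


cleaned = map_and_clean(
    {"WIT": wit, "WIPRO.NS": wipro_ns, "USD/INR": usd_inr},
    DATES,
    seed={"WIT": 5.13},  # 31-12-2018 close of WIT, as it appears in Table 2
)
```

With these inputs, 05-01-2019 and 06-01-2019 are dropped because all three datasets are empty, while 07-01-2019 is kept because USD/INR has a value, and the WIT and WIPRO.NS entries for that day are filled from 04-01-2019, matching the Close rows of Table 2.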
2.3 Currency Conversion
One of the crucial operations of the proposed approach is to convert the currency to attain uniform data. It is a significant pre-processing step wherein the price values of one of the datasets undergo currency conversion. In the proposed approach, NSE (similarly, BSE) and NYSE-based price values are available in different currencies, i.e., INR and USD, respectively. Instead of converting all the prices with a single day’s exchange rate, we adopt the daily foreign exchange rate data for the currency conversion. Here, the timezones of the corresponding stock exchanges play an important role; while NSE and BSE follow Indian Standard Time (IST), NYSE follows Eastern Daylight Time (EDT). With respect to Greenwich Mean Time (GMT), IST and EDT are equivalent to GMT+5:30 and GMT-4, respectively; that is, IST is five and a half hours ahead of GMT, whereas EDT is four hours behind GMT, which creates a gap of nine and a half hours between IST and EDT. The foreign exchange rate, USD/INR, that we require for the conversion follows British Summer Time (BST), i.e., GMT+1. Hence, for a particular trading day, the open price of NSE (similarly, BSE) becomes available first, followed by the open value of the USD/INR exchange rate, and then the open price of NYSE. Taking the timezone difference into consideration, we convert USD into INR using the open value of the exchange rate for the considered trading day. For our experiments, the USD/INR exchange rate is downloaded from the Yahoo Finance website for the same duration as the other datasets, i.e., 01-01-2009 to 01-01-2020. To illustrate currency conversion, we extend our example in Table 2. We take the open rate of each trading day to convert WIT data from USD to INR for that specific day.
As the foreign exchange rate indicates rupees for one dollar, we multiply each price value of WIT with the open rate of USD/INR and derive the converted dataset as shown in Table 3. Here, observation indicates that the data integration, given by a replica of the previous trading day, may get updated with different exchange rates; for example, Table 2 contains trading days, 04-01-2019 and 07-01-2019, as a replica of each other, however, these days’ information in WIT gets modified with specific USD/INR rates as given in Table 3. It must be noted that we limit the conversion based on the open rate of USD/INR by following the chronological order of timezones, however, the same may be adjusted subject to the timezone followed by the considered stock exchanges.
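The per-day conversion can be illustrated with the following minimal sketch. It is not the authors' code: the function name `convert_usd_to_inr`, the dictionary layout, and the USD prices and exchange rates are hypothetical values chosen for the example. The sketch assumes that only price fields are multiplied by the rate and that the volume is left unchanged.

```python
from typing import Dict

PRICE_FIELDS = ("open", "high", "low", "close")  # fields expressed in currency


def convert_usd_to_inr(
    rows_usd: Dict[str, Dict[str, float]], usdinr_open: Dict[str, float]
) -> Dict[str, Dict[str, float]]:
    """Multiply each day's USD prices by that day's USD/INR open rate."""
    converted = {}
    for day, row in rows_usd.items():
        rate = usdinr_open[day]  # rupees per dollar at that day's open
        converted[day] = {
            field: (value * rate if field in PRICE_FIELDS else value)
            for field, value in row.items()
        }
    return converted


# Hypothetical data: 07-01-2019 replicates 04-01-2019 in USD terms,
# but each day is converted with its own (assumed) exchange rate.
wit_usd = {
    "04-01-2019": {"open": 5.0, "close": 5.5, "volume": 1356100},
    "07-01-2019": {"open": 5.0, "close": 5.5, "volume": 1356100},
}
usdinr_open = {"04-01-2019": 70.0, "07-01-2019": 69.5}
wit_inr = convert_usd_to_inr(wit_usd, usdinr_open)
```

As in the illustration, two days that are replicas of each other in USD end up with different INR values because each day uses its own open rate.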
2.4 Normalization
It can be observed that various features may fall into different ranges. For example, the historical data of WIPRO.NS for 02-01-2019 indicates 244.16299 as the close price, whereas 4411069 as the volume. Such large differences may dominate the prediction results. Operating various features belonging to different data scales may result in
iCREST: International Cross-Reference to Exchange-Based …
Table 3 An illustrative example: Step-3 currency conversion

| Dataset | Trading data | 01-01-2019 | 02-01-2019 | 03-01-2019 | 04-01-2019 | 07-01-2019 |
|---|---|---|---|---|---|---|
| WIT (in INR) | Open | 359.00649 | 352.07602 | 354.69719 | 354.00754 | 351.10126 |
| WIT (in INR) | High | 359.70359 | 358.32589 | 356.09639 | 356.11055 | 353.18701 |
| WIT (in INR) | Low | 355.52099 | 352.07602 | 351.19919 | 351.90452 | 349.01551 |
| WIT (in INR) | Close | 357.61229 | 354.85374 | 353.29799 | 354.70854 | 351.79651 |
| WIT (in INR) | Volume | 316600 | 466800 | 676500 | 1356100 | 1356100 |
| WIPRO.NS (in INR) | Open | 248.06300 | 246.60001 | 245.25000 | 243.37500 | 243.37500 |
| WIPRO.NS (in INR) | High | 249.56300 | 248.47501 | 245.96300 | 245.13800 | 245.13800 |
| WIPRO.NS (in INR) | Low | 244.05000 | 242.25000 | 242.28799 | 239.73801 | 239.73801 |
| WIPRO.NS (in INR) | Close | 244.98801 | 244.16299 | 244.12500 | 243.33800 | 243.33800 |
| WIPRO.NS (in INR) | Volume | 2018271 | 4411069 | 4723040 | 3207036 | 3207036 |
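Table 4 An illustrative example: Step-4 normalization

| Dataset | Trading data | 01-01-2019 | 02-01-2019 | 03-01-2019 | 04-01-2019 | 07-01-2019 |
|---|---|---|---|---|---|---|
| WIT | Open | 1 | 0.12331 | 0.45488 | 0.36764 | 0 |
| WIT | High | 1 | 0.78859 | 0.44646 | 0.44863 | 0 |
| WIT | Low | 1 | 0.47045 | 0.33567 | 0.44409 | 0 |
| WIT | Close | 1 | 0.52568 | 0.25817 | 0.50072 | 0 |
| WIT | Volume | 0 | 0.14449 | 0.34622 | 1 | 1 |
| WIPRO.NS | Open | 1 | 0.68793 | 0.39996 | 0 | 0 |
| WIPRO.NS | High | 1 | 0.75412 | 0.18644 | 0 | 0 |
| WIPRO.NS | Low | 1 | 0.58256 | 0.59137 | 0 | 0 |
| WIPRO.NS | Close | 1 | 0.50000 | 0.47697 | 0 | 0 |
| WIPRO.NS | Volume | 0 | 0.88466 | 1 | 0.43951 | 0.43951 |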
biased outputs. Therefore, it is important to normalize each feature within a specific range. We apply normalization to each feature of the illustrative example given in Table 3 using Eq. (1) [23].

$$X' = (b - a) \cdot \frac{X - X_{\min}}{X_{\max} - X_{\min}} + a \qquad (1)$$

where a and b are two arbitrary points such that each feature of the dataset is transformed to the range [a, b] from the original data range given by [X_min, X_max]; X' indicates the normalized value of a feature value X [23]. As in study [18], we normalize each feature within the [0, 1] range. In the illustrated example, each feature is normalized as shown in Table 4. Such normalized data are useful in eliminating the possible dominance of features with higher values, such as volume in the considered example.
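As a quick check of Eq. (1), the following sketch applies min-max normalization with a = 0 and b = 1 to the WIT open prices from Table 3. The function name `min_max_normalize` and the handling of a constant feature are assumptions made for this example, not part of the described approach.

```python
from typing import List


def min_max_normalize(values: List[float], a: float = 0.0, b: float = 1.0) -> List[float]:
    """Scale values from [min(values), max(values)] to [a, b] as in Eq. (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature is mapped to the lower bound a.
        return [a for _ in values]
    return [(b - a) * (v - lo) / (hi - lo) + a for v in values]


# WIT open prices (in INR) for the five trading days of Table 3.
wit_open = [359.00649, 352.07602, 354.69719, 354.00754, 351.10126]
normalized = [round(x, 5) for x in min_max_normalize(wit_open)]
```

Rounded to five decimals, the result agrees with the WIT open row of Table 4 (1, 0.12331, 0.45488, 0.36764, 0).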
2.5 Feature Selection
One of the important criteria of a prediction model is the selection of appropriate feature(s). The normalized data from the previous step are used for the feature selection operation. In the proposed approach, we choose to predict the one-day-ahead stock trend of the NYSE open price based on the other stock exchange, i.e., NSE or BSE, as considered in Ref. [18]. However, the iCREST prediction approach consists of data from different timezones such that, chronologically, the close price of NSE (similarly, BSE) for any trading day is likely to be available before the opening of NYSE for that day. To utilize the most recent information of a stock exchange when predicting the stock open price of WIT, we choose the close price of WIPRO.NS (similarly, WIPRO.BO); therefore, we select the open and close features for WIT and WIPRO.NS, respectively. Although we restrict the selection to a single feature in the proposed approach, a combination of multiple features can also be selected from various technical indicators.
2.6 Input Samples
The features considered in the previous step belong to the WIT and WIPRO.NS datasets. The primary aim of the proposed iCREST prediction approach is to integrate historical data from different stock exchanges to predict the future stock trend. Therefore, we calculate the difference between the close price of WIPRO.NS and the open price of WIT for the given trading day using Eq. (2) [18].

$$\mathrm{difference} = \mathrm{open}(\mathrm{WIT}) - \mathrm{close}(\mathrm{WIPRO.NS}) \qquad (2)$$
A total of three datasets, WIT, WIPRO.NS (similarly, WIPRO.BO), and difference, are given as the input samples as shown in Fig. 1; for simplicity of the representation, the step-blocks containing specific datasets are displayed in particular colours. These data inputs are divided into training, validation (optional), and testing samples in the preparation step and then given to the prediction model. Following the existing approach [18], we divide each dataset in an 80:20 training-testing ratio.
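The construction of the difference series from Eq. (2) and the 80:20 division can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: Eq. (2) is applied here to the normalized values of Table 4, and the split is taken chronologically without shuffling. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def price_difference(open_wit: Sequence[float], close_wipro_ns: Sequence[float]) -> List[float]:
    """Eq. (2): difference = open(WIT) - close(WIPRO.NS), day by day."""
    return [o - c for o, c in zip(open_wit, close_wipro_ns)]


def chronological_split(samples: Sequence[float], train_ratio: float = 0.8) -> Tuple[list, list]:
    """Assumed split: the first 80% of days for training, the rest for testing."""
    cut = int(len(samples) * train_ratio)
    return list(samples[:cut]), list(samples[cut:])


# Normalized WIT open and WIPRO.NS close values from Table 4.
wit_open = [1.0, 0.12331, 0.45488, 0.36764, 0.0]
wipro_ns_close = [1.0, 0.50000, 0.47697, 0.0, 0.0]

difference = [round(d, 5) for d in price_difference(wit_open, wipro_ns_close)]
train, test = chronological_split(list(range(10)))  # ten dummy samples
```

Each of the three series (WIT, WIPRO.NS, and the difference) would be split in the same way before being passed to its own model.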
2.7 iCREST Prediction
The main goal of the proposed iCREST prediction approach is to forecast the future stock price movement, and therefore, the derived data samples of each dataset are further given to individual models. Here, we represent Model-I for WIT, Model-II for WIPRO.NS, and Model-III for the difference (Eq. (2)) as shown in Fig. 1.
An LSTM architecture, 1-32-16-1, is adopted as the prediction model, as specified in Ref. [18]; here, the notation gives the number of neurons in the input layer, the two hidden layers, and the output layer, respectively. A separate model is assigned to each of the three datasets along with the respective data samples. While Model-I and Model-II are given the open price of WIT and the close price of WIPRO.NS, respectively, Model-III is provided with the difference between these price values. These models operate on their respective data and predict the corresponding one-day-ahead open price of WIT, close price of WIPRO.NS, and difference of price values. In the proposed iCREST prediction approach, the predicted data of one of the two stock exchanges are combined with the predicted difference to predict the trend of the other stock exchange. As shown in Fig. 1, the blocks separate iCREST prediction such that the predicted open price of WIT and the predicted difference are integrated to predict the close price movement direction of WIPRO.NS; similarly, the predicted close price of WIPRO.NS and the predicted difference are utilized to derive the stock trend of the open price of WIT. Here, we adopt LSTM prediction models to predict the stock trend of a company listed on multiple stock exchanges located in different countries. However, other potential stock prediction models can be evaluated for the proposed approach.
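The combination step can be illustrated by rearranging Eq. (2). The sketch below is an interpretation under stated assumptions, not the authors' code: the function names are hypothetical, the predicted values are made up, and the trend label (1 for upward, 0 otherwise) is assumed to compare the derived one-day-ahead value with the current day's value.

```python
def icrest_nse_close(pred_open_wit: float, pred_difference: float) -> float:
    """Rearranged Eq. (2): close(WIPRO.NS) = open(WIT) - difference."""
    return pred_open_wit - pred_difference


def icrest_nyse_open(pred_close_wipro_ns: float, pred_difference: float) -> float:
    """Rearranged Eq. (2): open(WIT) = close(WIPRO.NS) + difference."""
    return pred_close_wipro_ns + pred_difference


def trend(next_value: float, current_value: float) -> int:
    """Assumed labelling: 1 if the next value is higher than the current one, else 0."""
    return 1 if next_value > current_value else 0


# Hypothetical normalized predictions from Model-I, Model-II, and Model-III.
pred_open_wit, pred_close_wipro_ns, pred_difference = 0.62, 0.50, 0.10

derived_close_nse = icrest_nse_close(pred_open_wit, pred_difference)
derived_open_nyse = icrest_nyse_open(pred_close_wipro_ns, pred_difference)
nyse_trend = trend(derived_open_nyse, current_value=0.55)
```

In this sketch, the Model-I and Model-II baselines would label the trend directly from their own predictions, whereas the iCREST variants label it from the derived values.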
2.8 Performance Evaluation
In the proposed approach, we predict one-day-ahead stock open price of two exchanges, NSE and NYSE (similarly, BSE and NYSE) as well as the price difference between such exchanges. The combinations of the predicted values are further evaluated using Eq. (2) to forecast one-day-ahead stock trend of the open price. For the classification approaches, accuracy, precision, recall, and F-measure metrics are generally used [3, 4, 11, 21, 22]. To evaluate the prediction performance of the proposed approach, we consider DA, precision, recall, and F-measure metrics as adopted in Ref. [18]. Here, it must be noted that the derived results of iCREST prediction require a fair comparison. Therefore, stock trend of the forecasted price values from individual LSTM models, i.e., Model-I and II, are evaluated using DA, precision, recall, and F-measure; the same are further compared with the results of iCREST prediction. The forecasting performance indicates one-day-ahead stock trend of open price of NYSE and close price of NSE (similarly, BSE) for each company.
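For reference, the four metrics can be computed from binary trend labels as in the sketch below. It uses the standard textbook definitions and assumes that DA is the percentage of days whose direction is predicted correctly and that the upward trend (label 1) is the positive class; the exact definitions used in Ref. [18] may differ. The function name and the label sequences are hypothetical.

```python
from typing import Dict, Sequence


def trend_metrics(actual: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Return DA, precision, recall, and F-measure in percent for 0/1 trend labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)

    da = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    )
    return {
        "DA": 100 * da,
        "precision": 100 * precision,
        "recall": 100 * recall,
        "F-measure": 100 * f_measure,
    }


# Hypothetical actual and predicted trends for eight trading days.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 1]
metrics = trend_metrics(actual, predicted)
```

The same function would be applied to the trends obtained from Model-I, Model-II, and the iCREST combinations so that all approaches are compared on identical terms.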
3 Experimental Analysis
In this article, we propose to integrate the information of a company listed in two stock exchanges, domestic and international, and predict the one-day-ahead stock open price movement direction. We consider five datasets listed on NSE, BSE, and NYSE for the experiment; these historical stock prices of DRREDDY, HDFCBANK,
Table 5 Parameter specifications for the proposed approach

| Parameter | Value |
|---|---|
| Prediction frequency | Daily [18] |
| Number of epochs | 50 [18] |
| Datasets | (i) RDY, DRREDDY.NS, DRREDDY.BO; (ii) HDB, HDFCBANK.NS, HDFCBANK.BO; (iii) IBN, ICICIBANK.NS, ICICIBANK.BO; (iv) INFY, INFY.NS, INFY.BO; (v) WIT, WIPRO.NS, WIPRO.BO |
| Data specification | 01-01-2009 to 01-01-2020 |
| Normalization | [0, 1] [18] |
| Input feature | Open price of dataset from NYSE; Close price of dataset from NSE and BSE |
| Prediction model (architecture) | LSTM (1-32-16-1) [18] |
| Activation | Linear [18] |
| Loss | Mean squared error (MSE) [18] |
| Optimizer | Adam [18] |
| Batch | 1 [18] |
| Training : Testing | 80% : 20% [18] |
ICICIBANK, INFY, and WIPRO are symbolized by RDY, HDB, IBN, INFY, and WIT to indicate NYSE datasets, respectively. Here, we select companies from different sectors, viz. Pharmaceuticals (e.g., DRREDDY), Banks (e.g., HDFCBANK and ICICIBANK), and Information Technology (e.g., INFY and WIPRO) to demonstrate the operability of the proposed approach; the symbols written with the suffixes ".NS" and ".BO" indicate NSE and BSE data, respectively. The parameter specifications given in Table 5 are adopted from Ref. [18]. As shown in Fig. 1, three LSTM models are adopted for each company, given by Model-I, II, and III, and the predicted price values are applied to Eq. (2) to calculate iCREST NSE (similarly, iCREST BSE) and iCREST NYSE; for example, the predicted price of WIT and the predicted difference are used to calculate the WIPRO.NS trend. Here, it must be noted that the same architecture as well as the other specifications provided in Table 5 are carried forward. To reduce potentially biased outcomes, we execute the proposed approach with ten random seeds for each dataset of NSE and NYSE, as well as BSE and NYSE, and take an average of the predicted values. To present the prediction performance, we calculate the average percentage of DA, precision, recall, and F-measure for each company and show it for NSE and NYSE, as well as BSE and NYSE, in Tables 6 and 7, respectively. Here, one-day-ahead stock
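Table 6 Comparative analysis of the stock trend prediction for NSE and NYSE

| Company | Stock exchange | Approach | DA | Precision | Recall | F-measure |
|---|---|---|---|---|---|---|
| DRREDDY | NYSE | Model-I | 50.018 | 60.519 | 50.396 | 62.132 |
| DRREDDY | NYSE | iCREST | 53.152 | 47.232 | 57.077 | 41.759 |
| DRREDDY | NSE | Model-II | 50.000 | 50.000 | 45.709 | 62.740 |
| DRREDDY | NSE | iCREST | 50.806 | 63.831 | 48.342 | 51.590 |
| HDFCBANK | NYSE | Model-I | 49.562 | 17.103 | 53.601 | 33.046 |
| HDFCBANK | NYSE | iCREST | 49.562 | 13.172 | 51.695 | 29.369 |
| HDFCBANK | NSE | Model-II | 51.874 | 9.745 | 57.510 | 30.654 |
| HDFCBANK | NSE | iCREST | 50.158 | 45.636 | 55.508 | 47.309 |
| ICICIBANK | NYSE | Model-I | 49.615 | 23.003 | 56.096 | 37.210 |
| ICICIBANK | NYSE | iCREST | 49.492 | 32.935 | 51.637 | 40.562 |
| ICICIBANK | NSE | Model-II | 53.205 | 23.984 | 51.600 | 38.921 |
| ICICIBANK | NSE | iCREST | 53.765 | 19.492 | 48.086 | 39.403 |
| INFY | NYSE | Model-I | 46.690 | 9.129 | 55.366 | 25.570 |
| INFY | NYSE | iCREST | 48.984 | 19.871 | 67.727 | 26.513 |
| INFY | NSE | Model-II | 50.630 | 19.452 | 54.805 | 32.419 |
| INFY | NSE | iCREST | 49.422 | 6.815 | 40.805 | 16.906 |
| WIPRO | NYSE | Model-I | 50.290 | 62.334 | 49.217 | 61.998 |
| WIPRO | NYSE | iCREST | 52.353 | 19.284 | 53.614 | 30.524 |
| WIPRO | NSE | Model-II | 53.656 | 10.264 | 51.183 | 20.827 |
| WIPRO | NSE | iCREST | 51.767 | 42.568 | 47.588 | 56.635 |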
open price of NYSE and close price of NSE (similarly, BSE) are predicted using Model-I and II, respectively, along with the difference prediction using Model-III. Based on the predicted price and difference values, the one-day-ahead stock open trend of NYSE and close trend of NSE (similarly, BSE) are forecasted using the iCREST prediction approach. To demonstrate the significance of iCREST prediction, we compare its results with the outputs of Model-I and II; these models do not include the difference values. Here, the comparison for stock trend prediction of the considered datasets is provided using NYSE and NSE, as well as NYSE and BSE. It can be observed that a comparable or higher overall prediction performance is achieved using iCREST prediction as compared to the prediction without integrating the difference of price values, i.e., the outputs of Model-I and II. The analysis also indicates that the prediction results are in line with those given in Ref. [18]. A company listed on multiple exchanges is likely to have separate operations, a diverse set of potential investors, as well as varying environmental effects depending on the country in which it is located; however, the results present a prospective way to study the relationship between two or more stock exchanges of different countries. Also, it can be observed that even though NSE and BSE belong to the same country, they have a distinct impact on NYSE;
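Table 7 Comparative analysis of the stock trend prediction for BSE and NYSE

| Company | Stock exchange | Approach | DA | Precision | Recall | F-measure |
|---|---|---|---|---|---|---|
| DRREDDY | NYSE | Model-I | 50.123 | 60.000 | 50.613 | 67.209 |
| DRREDDY | NYSE | iCREST | 53.538 | 50.761 | 56.994 | 45.277 |
| DRREDDY | BSE | Model-II | 48.231 | 65.253 | 44.793 | 60.315 |
| DRREDDY | BSE | iCREST | 50.210 | 70.156 | 48.074 | 52.462 |
| HDFCBANK | NYSE | Model-I | 49.475 | 20.724 | 50.588 | 48.021 |
| HDFCBANK | NYSE | iCREST | 49.352 | 12.966 | 59.814 | 26.941 |
| HDFCBANK | BSE | Model-II | 51.471 | 22.862 | 49.666 | 52.589 |
| HDFCBANK | BSE | iCREST | 50.070 | 51.920 | 48.352 | 54.741 |
| ICICIBANK | NYSE | Model-I | 49.545 | 25.154 | 51.836 | 41.111 |
| ICICIBANK | NYSE | iCREST | 49.545 | 19.386 | 54.806 | 40.731 |
| ICICIBANK | BSE | Model-II | 53.433 | 21.725 | 46.200 | 43.125 |
| ICICIBANK | BSE | iCREST | 53.730 | 20.000 | 46.009 | 42.262 |
| INFY | NYSE | Model-I | 47.391 | 5.048 | 63.728 | 20.697 |
| INFY | NYSE | iCREST | 46.803 | 1.524 | 92.227 | 2.895 |
| INFY | BSE | Model-II | 51.407 | 0.316 | 50.000 | 2.062 |
| INFY | BSE | iCREST | 52.864 | 7.368 | 63.574 | 27.306 |
| WIPRO | NYSE | Model-I | 49.961 | 60.643 | 50.451 | 58.470 |
| WIPRO | NYSE | iCREST | 52.159 | 26.704 | 55.282 | 30.261 |
| WIPRO | BSE | Model-II | 54.276 | 5.838 | 54.110 | 13.499 |
| WIPRO | BSE | iCREST | 50.909 | 35.973 | 44.602 | 44.315 |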
therefore, iCREST prediction and related studies can be broadened to extensively derive potential correlations among various exchanges.
4 Concluding Remarks and Potential Future Scope
The non-linear stock market prediction has attracted researchers from various backgrounds to study market behaviours using inherent patterns. The informativeness of the available historical stock market data can be exploited to derive useful market characteristics, and hence predict the future stock trend. In this article, we have proposed to integrate the concept of international stock exchanges; companies having been listed on two or more exchanges, domestic as well as international, have been considered. With the motivation to cross-reference stock exchanges [18], we have extended the operability of cross-reference to exchange approach to the international stock exchanges. Here, we have considered five companies, namely, DRREDDY, HDFCBANK, ICICIBANK, INFY, and WIPRO, belonging to three different sectors; the datasets for these companies have been collected
for three stock exchanges, i.e., NSE, BSE, and NYSE. One of the significant aspects is to minimize data loss while pre-processing the raw data; along with mapping and cleaning, we have adopted data integration to ensure that useful information from either of the exchanges, as well as the foreign exchange rate, is not removed. Also, to handle different currencies, we have proposed a currency conversion pre-processing step with the help of the USD/INR foreign exchange rate. Subsequently, the open price of NYSE and the close price of NSE (similarly, BSE) have been considered in order to operate across the differing timezones, and the respective prices, as well as the difference of the price values, have been predicted using three separate LSTM models. In this article, we have demonstrated the usefulness of considering the price differences along with one of the stock exchanges to predict the stock trend of the other exchange. A comparable or higher overall performance has been observed for iCREST prediction as compared to predictions without integrating the difference values; these prediction results have also varied between NSE and BSE. Hence, such an approach can potentially be fused to derive useful relationships among various stock exchanges. While the stock exchanges follow a particular timetable to trade the corresponding stocks of a company, a substantial number of events can influence the stock markets. It can be inferred that the prediction performance may be affected by the potential distinctions between exchanges, as well as certain dominant factors such as mergers or separations of companies, political events, government rules and regulations, etc. [17]; consideration of such facets to construct a fusion-based approach can be significant in deriving market characteristics [19]. Also, the study can be detailed with intraday time-series data. For the considered companies and the respective stock exchanges, we have primarily evaluated the open and close features; this study can be further extended by analysing the market behaviours using technical, as well as fundamental, aspects along with the timezone and currency. Such data can also be integrated with the corresponding weight information, and such schemes can be deployed [20] for predicting inherent market patterns. One of the potential applications of such cross-referencing is to recommend which stock exchange to invest in for the same company; in addition, interested investors may be recommended a portfolio with diverse assets from domestic as well as international stock exchanges.
References
1. Brock WA (2018) Nonlinearity and complex dynamics in economics and finance. The economy as an evolving complex system. CRC Press, New York, pp 77–97
2. Cavalcante RC, Brasileiro RC, Souza VL, Nobrega JP, Oliveira AL (2016) Computational intelligence and financial markets: a survey and future directions. Expert Syst Appl 55:194–211
3. Chaudhari K, Thakkar A (2019) A comprehensive survey on travel recommender systems. Arch Comput Methods Eng 1–27
4. Chaudhari K, Thakkar A (2019) Survey on handwriting-based personality trait identification. Expert Syst Appl 124:282–308
5. Chemmanur TJ, He J, Fulghieri P (2008) Competition and cooperation among exchanges: effects on corporate cross-listing decisions and listing standards. J Appl Corp Fin 20(3):76–90
6. Chen HY, Lee CF, Shih WK (2016) Technical, fundamental, and combined information for separating winners from losers. Pacif-Basin Fin J 39:224–242
7. Hartman D, Hlinka J. Nonlinearity in stock networks. Chaos: Interdiscip J Nonlinear Sci
8. Hassan OA, Skinner FS (2016) Analyst coverage: Does the listing location really matter? Int Rev Fin Anal 46:227–236
9. Majumdar S (2007) A study of international listing by firms of Indian origin. Money and Finance
10. Makanga IM, Gateri MW (2014) Effects of cross-listing on valuation and firm performance
11. Mungra D, Agrawal A, Thakkar A (2009) A voting-based sentiment classification model. Intelligent communication, control and devices. Springer, Berlin, pp 551–558
12. Park M, Song H (2017) Does public listing affect bank profitability? Evidence from US banks
13. Pavabutr P (2019) White knights or machiavellians? Understanding the motivation for reverse takeovers in Singapore and Thailand. Rev Quant Fin Acc 1–19
14. Sadorsky P (2010) The impact of financial development on energy consumption in emerging economies. Energy Pol 38(5):2528–2535
15. Securities & Exchange Board of India: SEBI Details of Stock Exchanges (2020). Accessed 17 Jan 2020. https://www.sebi.gov.in/stock-exchanges.html
16. Smirnova E et al (2004) Impact of cross-listing on local stock returns: case of Russian ADRs
17. Thakkar A, Chaudhari K (2020) A comprehensive survey on portfolio optimization, stock price and trend prediction using particle swarm optimization. Arch Comput Methods Eng 1–32
18. Thakkar A, Chaudhari K (2020) CREST: Cross-reference to exchange-based stock trend prediction using long short-term memory. Procedia Comput Sci 167:616–625
19. Thakkar A, Chaudhari K (2020) Fusion in stock market prediction: a decade survey on the necessity, recent developments, and potential future directions. Inf Fusion
20. Thakkar A, Chaudhari K (2020) Predicting stock trend using an integrated term frequency–inverse document frequency-based feature weight matrix with neural networks. Appl Soft Comput 106684
21. Thakkar A, Lohiya R (2020) Attack classification using feature selection techniques: a comparative study. J Ambient Intell Hum Comput 1–18
22. Thakkar A, Mungra D, Agrawal A (2020) Sentiment analysis: an empirical comparison between various training algorithms for artificial neural network. Int J Innov Comput Appl 11(1):9–29
23. Zhang H, Lin H, Li Y (2015) Impacts of feature normalization on optical and SAR data fusion for land use/land cover classification. IEEE Geosci Remote Sens Lett 12(5):1061–1065
Author Index

A
Ahirao, Purnima, 17
Alcaraz, Salvador, 1
Alice Nithya, A., 301
Amoolya, Gali, 41
Ashna KK, 41

B
Bajaj, Deepali, 189
Batra, Hunar, 189
Bharti, Urmil, 189
Borse, Yogita, 17

C
Chaudhari, Kinjal, 323

D
Das, Ganga, 41

F
Fadeyi, Johnson, 207

G
Geeta, Jalada, 41
George, Sudhish N., 41
Gilly, Katja, 1
Goel, Anita, 189
Gupta, S. C., 189

H
Hema Sirija, P., 163

J
Jain, Rishab, 57
Josephine, D. Diana, 255
Juiz, Carlos, 1

K
Kaushik, Abhinesh, 177
Kumar, Akash, 301

L
Lloret, Jaime, 145
Lobiyal, D. K., 177
Lohiya, Ritika, 89

M
Mangukiya, Meet, 17
Mariyam, Farhana, 313
Markus, Elisha D., 207, 269
Maurizio, Mongelli, 221
Maxama, Xolani B., 269
Mehfuz, Shabana, 313

N
Narang, Mohak, 301

P
Patel, Shreeya, 17
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. M. Thampi et al. (eds.), Applied Soft Computing and Communication Networks, Lecture Notes in Networks and Systems 187, https://doi.org/10.1007/978-981-33-6173-7
Peñalver, Lourdes, 145
Prakash, G., 123

R
Ragendhu, S. P., 27
Rajeswari, A., 255
Ray, Vinayak, 281
Roig, Pedro Juan, 1
Roshini, P., 163

S
Sadiq, Mohd., 313
Sai Teja, P., 163
Sai Venkata Swetha, Gadde, 41
Sankaran, Lakshmi, 107
Sarasvathi, V., 57
Saravanan, S., 123, 163
Singh Samom, Premson, 75
Sircar, Pradip, 237, 281
Sorribes, Jose Vicente, 145
Subramanian, Saleema Janardhanan, 107

T
Taggu, Amar, 75
Thakkar, Ankit, 89, 323
Thomas, Tony, 27
Thota, Santhosh, 237

V
Vanessa, Orani, 221