Advances in Intelligent Systems and Computing 1302
Miguel Botto-Tobar Omar S. Gómez Raúl Rosero Miranda Angela Díaz Cadena Editors
Advances in Emerging Trends and Technologies Proceedings of ICAETT 2020
Advances in Intelligent Systems and Computing Volume 1302
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia.

The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results.

Indexed by SCOPUS, DBLP, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST), SCImago. All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/11156
Miguel Botto-Tobar · Omar S. Gómez · Raúl Rosero Miranda · Angela Díaz Cadena
Editors
Advances in Emerging Trends and Technologies Proceedings of ICAETT 2020
Editors Miguel Botto-Tobar Eindhoven University of Technology Eindhoven, Noord-Brabant, The Netherlands
Omar S. Gómez Escuela Superior Politécnica de Chimborazo Riobamba, Ecuador
Raúl Rosero Miranda Escuela Superior Politécnica de Chimborazo Riobamba, Ecuador
Angela Díaz Cadena Universitat de Valencia Valencia, Valencia, Spain
ISSN 2194-5357 ISSN 2194-5365 (electronic) Advances in Intelligent Systems and Computing ISBN 978-3-030-63664-7 ISBN 978-3-030-63665-4 (eBook) https://doi.org/10.1007/978-3-030-63665-4 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The Second International Conference on Advances in Emerging Trends and Technologies (ICAETT) was held on the main campus of the Escuela Superior Politécnica de Chimborazo, in Riobamba, Ecuador, from October 26 to 30, 2020. It was proudly organized by the Facultad de Informática y Electrónica (FIE) of the Escuela Superior Politécnica de Chimborazo and supported by GDEON. The ICAETT series aims to bring together top researchers and practitioners working in different domains of computer science to exchange their expertise and to discuss perspectives on development and collaboration. The content of this volume is related to the following subjects:

• Communications
• e-Government and e-Participation
• e-Learning
• Electronic
• Intelligent Systems
• Machine Vision
• Security
• Technology Trends
ICAETT 2020 received 116 submissions written in English by 160 authors coming from 15 different countries. All these papers were peer-reviewed by the ICAETT 2020 Program Committee consisting of 165 high-quality researchers. To assure a high-quality and thoughtful review process, we assigned each paper at least three reviewers. Based on the peer reviews, 27 full papers were accepted, resulting in a 23% acceptance rate, which was within our goal of less than 40%.
We would like to express our sincere gratitude to the invited speakers for their inspirational talks, to the authors for submitting their work to this conference, and to the reviewers for sharing their experience during the selection process.

October 2020
Miguel Botto-Tobar Omar S. Gómez Raúl Rosero Miranda Ángela Díaz Cadena
Organization
General Chairs
Miguel Botto-Tobar, Eindhoven University of Technology, The Netherlands
Omar S. Gómez, Escuela Superior Politécnica de Chimborazo, Ecuador
Organizing Committee
Miguel Botto-Tobar, Eindhoven University of Technology, The Netherlands
Omar S. Gómez, Escuela Superior Politécnica de Chimborazo, Ecuador
Raúl Rosero Miranda, Escuela Superior Politécnica de Chimborazo, Ecuador
Ángela Díaz Cadena, Universitat de Valencia, Spain
Steering Committee
Miguel Botto-Tobar, Eindhoven University of Technology, The Netherlands
Ángela Díaz Cadena, Universitat de Valencia, Spain
Publication Chair
Miguel Botto-Tobar, Eindhoven University of Technology, The Netherlands
Program Chairs

Technology Trends
Miguel Botto-Tobar, Eindhoven University of Technology, The Netherlands
Sergio Montes León, Universidad de las Fuerzas Armadas (ESPE), Ecuador
Hernán Montes León, Universidad Rey Juan Carlos, Spain
Jean Michel Clairand, Universidad de las Américas, Ecuador
Ángel Jaramillo Alcázar, Universidad de las Américas, Ecuador
Electronics
Ana Zambrano Vizuete, Escuela Politécnica Nacional, Ecuador
David Rivas, Universidad de las Fuerzas Armadas (ESPE), Ecuador
Edgar Maya-Olalla, Universidad Técnica del Norte, Ecuador
Hernán Domínguez-Limaico, Universidad Técnica del Norte, Ecuador
Intelligent Systems
Guillermo Pizarro Vásquez, Universidad Politécnica Salesiana, Ecuador
Janeth Chicaiza, Universidad Técnica Particular de Loja, Ecuador
Gustavo Andrade Miranda, Universidad de Guayaquil, Ecuador
William Eduardo Villegas, Universidad de las Américas, Ecuador
Diego Patricio Buenaño Fernández, Universidad de las Américas, Ecuador
Machine Vision
Julian Galindo, LIG-IIHM, France
Erick Cuenca, Université de Montpellier, France
Pablo Torres-Carrión, Universidad Técnica Particular de Loja, Ecuador
Jorge Luis Pérez Medina, Universidad de las Américas, Ecuador
Communication
Óscar Zambrano Vizuete, Universidad Técnica del Norte, Ecuador
Pablo Palacios Jativa, Universidad de Chile, Chile
Iván Patricio Ortiz Garces, Universidad de las Américas, Ecuador
Nathaly Verónica Orozco Garzón, Universidad de las Américas, Ecuador
Henry Ramiro Carvajal Mora, Universidad de las Américas, Ecuador
Security
Luis Urquiza-Aguiar, Escuela Politécnica Nacional, Ecuador
Joffre León-Acurio, Universidad Técnica de Babahoyo, Ecuador
e-Learning
Miguel Zúñiga-Prieto, Universidad de Cuenca, Ecuador
Verónica Fernanda Falconí Ausay, Universidad de las Américas, Ecuador
Doris Macias, Universitat Politécnica de Valencia, Spain
e-Business
Angela Díaz Cadena, Universitat de Valencia, Spain
e-Government and e-Participation
Alex Santamaría Philco, Universidad Laica Eloy Alfaro de Manabí, Ecuador
Program Committee Abdón Carrera Rivera Adrián Cevallos Navarrete Alba Morales Tirado Alejandro Ramos Nolazco Alex Santamaría Philco
Alex Cazañas Gordon Alexandra Velasco Arévalo Alexandra Elizabeth Bermeo Arpi Alfonso Guijarro Rodríguez Alfredo Núñez Allan Avendaño Sudario Almílcar Puris Cáceres Ana Guerrero Alemán Ana Santos Delgado Ana Núñez Ávila
University of Melbourne, Australia Griffith University, Australia University of Greenwich, UK Instituto Tecnológico y de Estudios Superiores Monterrey, Mexico Universitat Politècnica de València, Spain/ Universidad Laica Eloy Alfaro de Manabí, Ecuador The University of Queensland, Australia Universität Stuttgart, Germany Universidad de Cuenca, Ecuador Universidad de Guayaquil, Ecuador New York University, USA Università degli Studi di Roma “La Sapienza”, Italy Universidad Técnica Estatal de Quevedo, Ecuador University of Adelaide, Australia Universidade Federal de Santa Catarina (UFSC), Brazil Universitat Politècnica de València, Spain
Andrea Mory Alvarado Andrés Calle Bustos Andrés Jadan Montero Andrés Molina Ortega Andrés Robles Durazno Andrés Vargas González Andrés Barnuevo Loaiza Andrés Chango Macas Andrés Cueva Costales Andrés Parra Sánchez Ángel Plaza Vargas Angel Vazquez Pazmiño Ángela Díaz Cadena Angelo Vera Rivera Antonio Villavicencio Garzón Audrey Romero Pelaez Bolívar Chiriboga Ramón Byron Acuna Acurio Carla Melaños Salazar Carlos Barriga Abril Carlos Valarezo Loiza Cesar Mayorga Abril César Ayabaca Sarria Christian Báez Jácome Cintya Aguirre Brito Cristian Montero Mariño Daniel Magües Martínez Daniel Silva Palacios Daniel Armijos Conde Danilo Jaramillo Hurtado David Rivera Espín David Benavides Cuevas Diana Morillo Fueltala Diego Vallejo Huanga Edwin Guamán Quinche Efrén Reinoso Mendoza Eric Moyano Luna Erick Cuenca Pauta Ernesto Serrano Guevara Estefania Yánez Cardoso Esther Parra Mora Fabián Corral Carrera Felipe Ebert
Universidad Católica de Cuenca, Ecuador Universitat Politècnica de València, Spain Universidad de Buenos Aires, Argentina Universidad de Chile, Chile Edinburgh Napier University, UK Syracuse University, USA Universidad de Santiago de Chile, Chile Universidad Politécnica de Madrid, Spain University of Melbourne, Australia University of Melbourne, Australia Universidad de Guayaquil, Ecuador Université Catholique de Louvain, Belgium Universitat de València, Spain George Mason University, USA Universitat Politècnica de Catalunya, Spain Universidad Politécnica de Madrid, Spain University of Melbourne, Australia Flinders University, Australia Universidad Politécnica de Madrid, Spain University of Nottingham, UK Manchester University, UK Universidad Técnica de Ambato, Ecuador Escuela Politécnica Nacional (EPN), Ecuador Wageningen University & Research, The Netherlands University of Portsmouth, UK University of Melbourne, Australia Universidad Autónoma de Madrid, Spain Universitat Politècnica de València, Spain Queensland University of Technology, Australia Universidad Politécnica de Madrid, Spain University of Melbourne, Australia Universidad de Sevilla, Spain Brunel University London, UK Universitat Politècnica de València, Spain Universidad del País Vasco, Spain Universitat Politècnica de València, Spain University of Southampton, UK Université de Montpellier, France Université de Neuchâtel, Switzerland University of Southampton, UK University of Queensland, Australia Universidad Carlos III de Madrid, Spain Universidade Federal de Pernambuco (UFPE), Brazil
Fernando Borja Moretta Franklin Parrales Bravo Gabriel López Fonseca Gema Rodriguez-Perez Georges Flament Jordán Germania Rodríguez Morales Ginger Saltos Bernal Gissela Uribe Nogales Glenda Vera Mora Guilherme Avelino Héctor Dulcey Pérez Henry Morocho Minchala Holger Ortega Martínez Iván Valarezo Lozano Jacqueline Mejia Luna Jaime Jarrin Valencia Janneth Chicaiza Espinosa Jefferson Ribadeneira Ramírez Jeffrey Naranjo Cedeño Jofre León Acurio Jorge Quimí Espinosa Jorge Cárdenas Monar Jorge Illescas Pena Jorge Lascano Jorge Rivadeneira Muñoz Jorge Charco Aguirre José Carrera Villacres José Quevedo Guerrero Josue Flores de Valgas Juan Barros Gavilanes Juan Jiménez Lozano Juan Romero Arguello Juan Zaldumbide Proaño Juan Balarezo Serrano Juan Lasso Encalada Juan Maestre Ávila Juan Miguel Espinoza Soto Juliana Cotto Pulecio Julio Albuja Sánchez Julio Proaño Orellana Julio Balarezo Karla Abad Sacoto Leopoldo Pauta Ayabaca
University of Edinburgh, UK Universidad Complutense de Madrid, Spain Sheffield Hallam University, UK LibreSoft/Universidad Rey Juan Carlos, Spain University of York, UK Universidad Politécnica de Madrid, Spain University of Portsmouth, UK Australian National University, Australia Universidad Técnica de Babahoyo, Ecuador Universidade Federal do Piauí (UFP), Brazil Swinburne University of Technology, Australia Moscow Automobile And Road Construction State Technical University (Madi), Russia University College London, UK University of Melbourne, Australia Universidad de Granada, Spain Universidad Politécnica de Madrid, Spain Universidad Politécnica de Madrid, Spain Escuela Superior Politécnica de Chimborazo, Ecuador Universidad de Valencia, Spain Universidad Técnica de Babahoyo, Ecuador Universitat Politècnica de Catalunya, Spain Australian National University, Australia Edinburgh Napier University, UK University of Utah, USA University of Southampton, UK Universitat Politècnica de València, Spain Université de Neuchâtel, Switzerland Universidad Politécnica de Madrid, Spain Universitat Politécnica de València, Spain INP Toulouse, France Universidad de Palermo, Argentina University of Manchester, UK University of Melbourne, Australia Monash University, Australia Universitat Politècnica de Catalunya, Spain Iowa State University, USA Universitat de València, Spain Universidad de Palermo, Argentina James Cook University, Australia Universidad de Castilla La Mancha, Spain Universidad Técnica de Ambato, Ecuador Universidad Autónoma de Barcelona, Spain Universidad Católica de Cuenca, Ecuador
Lorena Guachi Guachi Lorenzo Cevallos Torres Lucia Rivadeneira Barreiro Luis Carranco Medina Luis Pérez Iturralde Luis Torres Gallegos Luis Benavides Luis Urquiza-Aguiar Manuel Beltrán Prado Manuel Sucunuta España Marcia Bayas Sampedro Marco Falconi Noriega Marco Tello Guerrero Marco Molina Bustamante Marco Santórum Gaibor María Escalante Guevara María Molina Miranda María Montoya Freire María Ormaza Castro María Miranda Garcés Maria Dueñas Romero Mariela Barzallo León Mario González Mauricio Verano Merino Maykel Leiva Vázquez Miguel Botto-Tobar Miguel Arcos Argudo Mónica Baquerizo Anastacio Mónica Villavicencio Cabezas Omar S. Gómez Orlando Erazo Moreta Pablo León Paliz Pablo Ordoñez Ordoñez Pablo Palacios Jativa Pablo Saá Portilla Patricia Ludeña González Paulina Morillo Alcívar Rafael Campuzano Ayala
Università della Calabria, Italy Universidad de Guayaquil, Ecuador Nanyang Technological University, Singapore Kansas State University, USA Universidad de Sevilla, Spain Universitat Politècnica de València, Spain Universidad de Especialidades Espíritu Santo, Ecuador Universitat Politècnica de Catalunya, Spain University of Queensland, Australia Universidad Politécnica de Madrid, Spain Vinnitsa National University, Ukraine Universidad de Sevilla, Spain Rijksuniversiteit Groningen, The Netherlands Universidad Politécnica de Madrid, Spain Escuela Politécnica Nacional, Ecuador/Université Catholique de Louvain, Belgium University of Michigan, USA Universidad Politécnica de Madrid, Spain Aalto University, Finland University of Southampton, UK University of Leeds, UK RMIT University, Australia University of Edinburgh, UK Universidad de las Américas, Ecuador Eindhoven University of Technology, The Netherlands Universidad de Guayaquil, Ecuador Eindhoven University of Technology, The Netherlands Universidad Politécnica de Madrid, Spain Universidad Complutense de Madrid, Spain Université du Quebec À Montréal, Canada Escuela Superior Politécnica del Chimborazo (ESPOCH), Ecuador Universidad de Chile, Chile/Universidad Técnica Estatal de Quevedo, Ecuador Université de Neuchâtel, Switzerland Universidad Politécnica de Madrid, Spain Universidad de Chile, Chile University of Melbourne, Australia Politecnico di Milano, Italy Universitat Politècnica de València, Spain Grenoble Institute of Technology, France
Rafael Jiménez Ramiro Santacruz Ochoa Richard Ramírez Anormaliza Roberto Larrea Luzuriaga Roberto Sánchez Albán Rodrigo Saraguro Bravo Rodrigo Cueva Rueda Rodrigo Tufiño Cárdenas Samanta Cueva Carrión Sergio Montes León Tania Palacios Crespo Tony Flores Pulgar Vanessa Echeverría Barzola Vanessa Jurado Vite Verónica Yépez Reyes Victor Hugo Rea Sánchez Voltaire Bazurto Blacio Washington Velásquez Vargas Wayner Bustamante Granda Wellington Cabrera Arévalo Xavier Merino Miño Yan Pacheco Mafla Yessenia Cabrera Maldonado Yuliana Jiménez Gaona
Escuela Politécnica del Litoral (ESPOL), Ecuador Universidad Nacional de La Plata, Argentina Universidad Estatal de Milagro, Ecuador/ Universitat Politècnica de Catalunya, Spain Universitat Politècnica de València, Spain Université de Lausanne, Switzerland Escuela Superior Politécnica del Litoral (ESPOL), Ecuador Universitat Politècnica de Catalunya, Spain Universidad Politécnica Salesiana, Ecuador/ Universidad Politécnica de Madrid, Spain Universidad Politécnica de Madrid, Spain Universidad de las Fuerzas Armadas (ESPE), Ecuador University College London, UK Université de Lyon, France Université Catholique de Louvain, Belgium Universidad Politécnica Salesiana, Ecuador South Danish University, Denmark Universidad Estatal de Milagro, Ecuador University of Victoria, Canada Universidad Politécnica de Madrid, Spain Universidad de Palermo, Argentina University of Houston, USA Instituto Tecnológico y de Estudios Superiores Monterrey, Mexico Royal Institute of Technology, Sweden Pontificia Universidad Católica de Chile, Chile Università di Bologna, Italy
Organizing Institutions
Contents
Communications

Digital Gap Reduction with Wireless Networks in Rural Areas of Tosagua Town
Cesar Moreira-Zambrano, Walter Zambrano-Romero, Leonardo Chancay-García, Mauricio Quimiz-Moreira, and Darwin Loor Zamora

Development of an Industrial Data Server for Modbus TCP Protocol
Violeta Maldonado, Silvana Gamboa, María Fernanda Trujillo, and Ana Rodas

Industrial Communication Based on MQTT and Modbus Communication Applied in a Meteorological Network
Erwin J. Sacoto Cabrera, Sonia Palaguachi, Gabriel A. León-Paredes, Pablo L. Gallegos-Segovia, and Omar G. Bravo-Quezada

REGME-IP Real Time Project
David Cisneros, Mónica Zabala, and Alejandra Oñate

Physical Variables Monitoring to Contribute to Landslide Mitigation with IoT-Based Systems
Roberto Toapanta, Juan Chafla, and Angel Toapanta

e-Government and e-Participation

Open Data in Higher Education - A Systematic Literature Review
Ivonne E. Rodriguez-F, Gloria Arcos-Medina, Danilo Pástor, Alejandra Oñate, and Omar S. Gómez
e-Learning

Online Learning and Mathematics in Times of Coronavirus: Systematization of Experiences on the Use of Zoom® for Virtual Education in an Educational Institution in Callao (Peru)
Roxana Zuñiga-Quispe, Yesbany Cacha-Nuñez, Milton Gonzales-Macavilca, and Ivan Iraola-Real

Implementation of an Opensource Virtual Desktop Infrastructure System Based on Ovirt and Openuds
Washington Luna Encalada, Jonny Guaiña Yungán, and Patricio Moreno Costales

Virtual Laboratories in Virtual Learning Environments
Washington Luna Encalada, Patricio Moreno Costales, Sonia Patricia Cordovez Machado, and Jonny Guaiña Yungán

Mathematical Self-efficacy and Collaborative Learning Strategies in Engineering Career Aspirants
Ivan Iraola-Real, Ling Katterin Huaman Sarmiento, Claudia Mego Sanchez, and Christina Andersson

Electronic

Economic Assessment of Hydroponic Greenhouse Automation: A Case Study of Oat Farming
Flavio J. Portero A., Jorge V. Quimbiamba C., Angel G. Hidalgo O., and Ramiro S. Vargas C.

Proportional Fuzzy Control of the Height of a Lubricant Mixture
Fátima de los Ángeles Quishpe Estrada, Angel Alberto Silva Conde, and Miguel Ángel Pérez Bayas

Intelligent Systems

Analysis of Anthropometric Measurements Using Receiver Operating Characteristic Curve for Impaired Waist to Height Ratio Detection
Erika Severeyn, Alexandra La Cruz, Sara Wong, and Gilberto Perpiñan

Classification of Impaired Waist to Height Ratio Using Machine Learning Technique
Alexandra La Cruz, Erika Severeyn, Sara Wong, and Gilberto Perpiñan

Machine Vision

Creating Shapefile Files in ArcMap from KML File Generated in My Maps
Graciela Flores and Cristian Gallardo
Dynamic Simulation and Kinematic Control for Autonomous Driving in Automobile Robots
Danny J. Zea, Bryan S. Guevara, Luis F. Recalde, and Víctor H. Andaluz

Analysis of RGB Images to Identify Local Lesions in Rosa sp. cv. Brighton Leaflets Caused by Sphaerotheca Pannosa in Laboratory Conditions
William Javier Cuervo-Bejarano and Jeisson Andres Lopez-Espinosa

EEG-Based BCI Emotion Recognition Using the Stock-Emotion Dataset
Edgar P. Torres P., Edgar Torres H., Myriam Hernández-Álvarez, and Sang Guun Yoo

Security

Guidelines and Their Challenges in Implementing CSIRT in Ecuador
Fernando Vela Espín

A Blockchain-Based Approach to Supporting Reinsurance Contracting
Julio C. Mendoza-Tello and Xavier Calderón-Hinojosa

Implementation of an Information Security Management System Based on the ISO/IEC 27001: 2013 Standard for the Information Technology Division
Mario Aquino Cruz, Jessica Noralinda Huallpa Laguna, Herwin Alayn Huillcen Baca, Edgar Eloy Carpio Vargas, and Flor de Luz Palomino Valdivia

Technology Trends

Industry 4.0 Technologies in the Control of Covid-19 in Peru
Jessica Acevedo-Flores, John Morillo, and Carlos Neyra-Rivera

Proposal for a Software Development Project Management Model for Public Institutions in Ecuador: A Case Study
José Antonio Quiña-Mera, Luis Germán Correa Real, Ana Gabriela Jácome Orozco, Pablo Andrés Landeta-López, and Cathy Pamela Guevara-Vega

Development of a Linguistic Conversion System from Audio to Hologram
María Rodríguez, Michael Rivera, William Oñate, and Gustavo Caiza
Exploring Learning in Near-Field Communication-Based Serious Games in Children Diagnosed with ADHD
Diego Avila-Pesantez, Sandra Santillán Guadalupe, Nelly Padilla Padilla, L. Miriam Avila, and Alberto Arellano-Aucancela

"Women Are Less Anxious in Systems Engineering": A Comparative Study in Two Engineering Careers
Ling Katterin Huaman Sarmiento, Claudia Mego Sanchez, Ivan Iraola-Real, Mawly Latisha Huaman Sarmiento, and Héctor David Mego Sanchez

Sorting Algorithms and Their Execution Times an Empirical Evaluation
Guillermo O. Pizarro-Vasquez, Fabiola Mejia Morales, Pierina Galvez Minervini, and Miguel Botto-Tobar

Author Index
Communications
Digital Gap Reduction with Wireless Networks in Rural Areas of Tosagua Town

Cesar Moreira-Zambrano(1), Walter Zambrano-Romero(2), Leonardo Chancay-García(2), Mauricio Quimiz-Moreira(2), and Darwin Loor Zamora(2)

(1) Escuela Superior Politécnica Agropecuaria de Manabí, Calceta, Manabí, Ecuador
[email protected]
(2) Universidad Técnica de Manabí, Portoviejo, Manabí, Ecuador
{walter.zambrano,leonardo.chancay,mauricio.quimiz,patricio.loor}@utm.edu.ec

Abstract. Access to information and communication technologies in the rural areas of Manabí Province, Ecuador, is limited, which is an obstacle to the development and quality of life of its inhabitants. The barriers range from the high cost of investment in infrastructure, low population density, and little interest from the authorities owing to the areas' location, to the difficulties of access to connectivity. The wireless network implementation presented here has so far benefited more than 250 houses whose population is engaged in agriculture and livestock, allowing them to have Internet access. The V-cycle methodology was applied, with 10 access points deployed in the town's two rural parishes. To validate the operation, the infrastructure was tested and its response times were verified to be adequate. A survey of the inhabitants of the rural communities, with a 95% confidence level and a 7% margin of error, asked what they use the Internet for: 45.26% use it for education and learning, 36.84% for social networks, and 17.89% to search for jobs and information. In addition, 90% of the respondents use the Internet daily, which shows that the project is well received in the communities.
Keywords: Wireless cities · Radio links · Internet · Radio Mobile · Wi-Fi rural areas

1 Introduction
Wireless communication technologies emerged as one of the main forces in the recent development and transformation of the world economy [15]. New planning strategies that emphasize the adoption and adaptation of Information and Communications Technology (ICT) have attracted considerable attention from academics and policymakers. In the context of the world information economy, the use of wireless technologies has become a key indicator of a city's competitiveness. Certain cities, such as Singapore and Taipei in Asia; Philadelphia,
San Francisco, and Boston in the United States; and Perth in Australia, have expressed their intention to establish a wireless network or have launched specific plans to develop wireless cities. The National Development Plan 2017–2021 of Ecuador, in its Objective 6 (develop productive and environmental capacities to achieve food sovereignty and Good Rural Living) and its Policy 6.6, promotes access to health services, education, water and sanitation services, citizen security, and rural social protection with territorial relevance, as well as national connectivity and roads [14]. In Ecuador, the use of the Internet has increased access to sources of information and improved communication in rural and urban areas, which has a positive impact at the economic, social, and technological levels. According to INEC, in 2017, 81.3% of the population of Galápagos used the Internet, while Chimborazo was the province with the lowest usage at 45.1%; in the province of Manabí specifically, Internet use was 53.0%, as indicated in Fig. 1. Manabí should nevertheless improve, since it ranks only 12th among the provinces in Internet use [9]. Ecuador still has a long way to go to satisfy the connectivity needs that society requires in the 21st century to support education and ensure "Buen Vivir" for its citizens. The National Plan for Telecommunications and Information Technologies of Ecuador 2016–2021, prepared by the Ministry of Telecommunications and the Information Society, is the instrument for planning and managing the telecommunications sector, allowing for greater digital inclusion and competitiveness. Its vision is to position Ecuador by 2021 as a regional reference in terms of connectivity, access, and production of information and communication technology (ICT), for the benefit of the economic and social development of the country [1].
Fig. 1. Percentage of people by province that use the Internet (2017)
When the people in a community have quick and affordable access to information, they benefit directly from what the Internet offers [6]. The time and effort saved by having Internet access result in a better quality of life. Likewise, communities connected to the network at high speed gain a presence in global markets, where transactions happen rapidly. People around the world have seen that Internet access gives them a voice to discuss their problems and to obtain information that is important to them and cannot be found on television. What recently seemed like an illusion has now come true, and it is based on wireless networks [3,7]. The authors of [4] carried out a detailed study on the dissemination of information at points frequently visited by the population of a place, with the objective of spreading information in urban scenarios through wireless communication, while [5] presents an analysis of information forwarding from highly frequented points in the cities of San Francisco and Rome for the delivery of tourist information. Tosagua has urban and rural areas; however, these areas do not have complete coverage of Internet service in homes, and where there is coverage it usually has a cost. Since it is a town whose population has very limited resources, the city council planned to offer this service for free at 10 access points in the two rural parishes. For these reasons, the project proposed in this study is important, since it benefits the two rural parishes, Bachillero with 2622 inhabitants and Ángel Pedro Giler with 3553 inhabitants, by providing free Internet access and thereby reducing the digital gap between the population who have access to technology and those who do not. For all the aforementioned reasons, the objective was to implement a wireless broadband network to provide free Internet in the rural areas of the Tosagua canton, in order to reduce the digital divide and promote the use of information and communication technologies among the general public.
2 Material and Method
The research was carried out in the two rural parishes of Tosagua, a canton in the province of Manabí, with a total population of 6175 inhabitants over the age of 16 (data obtained from the Consejo Nacional Electoral in 2017). Its main source of income is agriculture. The purpose of this research is to reduce the digital gap of the inhabitants through free Internet access provided by 10 access points using standard 802.11a/n/ac wireless technology, which provides a minimum speed of 2 Mbps and a maximum that exceeds 25 Mbps, with data delivery bursts of up to 35 Mbps every 4 s; a primary node was established in Tosagua. The execution was carried out using the V-cycle methodology, applying the following phases.

2.1 Specifications Phase
A preliminary information survey was carried out in order to obtain the requirements for the implementation of the infrastructure. The first strategy was
an unstructured interview with the mayor of the Tosagua canton, who mentioned that free Internet was required in the following rural areas: Ciudadela El Recreo, San Roque Abajo, Parque Central (Parroquia San José de Bachillero), Parque Central (Parroquia Ángel Pedro Giler), Comunidad Caleño, Comunidad Los Micos, Comunidad Monte Oscuro, and the Casical, Mutre Afuera, and El Tambo communities. This made it possible to define the financial resources for the acquisition of technological equipment, since the municipality is the entity that finances the project. The widening digital divide between small towns and big cities is reflected above all in poor Internet access. Furthermore, networking in rural areas is a complicated task, requiring a relatively large investment in network infrastructure against low profitability when seen from the perspective of a service provider. Rural networks and their applications in health, education, and business community development are vital but require an initial investment [10]. The second strategy was based on the survey carried out, which determined that the residents of the two rural parishes of Bachillero and Ángel Pedro Giler do not have access to Internet service, so the implementation of the Tosagua Conectado project is necessary to reduce the existing digital gap in the canton and to provide residents with a free service that allows them to search for information, support education, and use content managers and social networks. The third strategy was developed in three activities. The first was to identify the environmental conditions in the benefited areas for the deployment of the network, such as temperature, humidity, and wind speed; this information was obtained from the Development and Territorial Planning Plans (PDOT) of the Tosagua, Ángel Pedro Giler, and San José de Bachillero parishes. From this information, the obstruction variable was analyzed and determined through visits to each of the areas; the values obtained are not mean values but constants collected by the PDOT, as shown in Table 1.

Table 1. Distance analysis of wireless access points
Township | Longitude | Latitude | Terrain elevation (m) | Distance (m)
Tosagua Tower | –0.781850° | –80.238926° | 35.0 | 3273
El Recreo | –0.784725° | –80.222254° | 10.7 | 1881
San Roque Abajo | –0.799444° | –80.236030° | 42.5 | 1983
Bachillero | –0.766670° | –80.213703° | 25.0 | 3273
Estancilla | –0.817080° | –80.215902° | 30.0 | 4680
Caleño | –0.801289° | –80.230322° | 14.2 | 2364
Los Micos | –0.798843° | –80.233369° | 25.0 | 1988
Monte Oscuro | –0.764643° | –80.184488° | 25.0 | 6348
Cacical | –0.771251° | –80.256329° | 10.9 | 2266
Mutre Afuera | –0.762412° | –80.254612° | 34.3 | 2777
El Tambo | –0.759236° | –80.243969° | 10.1 | 2576
Fig. 2. Logical design of the implemented infrastructure
2.2 High-Level and Detailed Design Phase
After the collection of information, the logical design of the infrastructure to be implemented was produced, including the topology of the wireless network, for which SmartDraw 2015 was used. In this work, the need for a link simulator was identified, so an analysis of the different simulators available was carried out to determine the software to be used. For a correct design, the Radio Mobile software was used to reduce the margin of error of manual and visual calculations [2]. With this tool, each of the links was simulated, verifying whether the site where the equipment was expected to be placed and its height were adequate and, if necessary, changing these parameters. A cost-benefit analysis was made to decide on the technology to be used, taking into account aspects such as the useful life of the equipment, costs, safety, and the applicability of the technology in the area, among others. According to the information collected in the requirements phase, the logical design of the infrastructure to be implemented was drawn up, including the wireless network as well as the perimeter security solution. This design is shown in Fig. 2.
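For reference only (this calculation is not reported by the authors), the free-space path loss that a link-planning tool such as Radio Mobile evaluates for each hop can be approximated with the standard formula; applied to the longest link in Table 1 (Monte Oscuro, roughly 6.35 km) at 5.8 GHz it gives:

$$\mathrm{FSPL\ (dB)} = 32.44 + 20\log_{10}(d_{\mathrm{km}}) + 20\log_{10}(f_{\mathrm{MHz}}) = 32.44 + 20\log_{10}(6.35) + 20\log_{10}(5800) \approx 124\ \mathrm{dB}$$

The antenna gains and transmit power of each link must exceed this loss, plus a fade margin, for the link to close.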
2.3 Implementation Phase
In this phase, each of the devices was installed at the sites defined in the design from the previous phase, beginning with the physical infrastructure, that is, the towers and masts (Fig. 3). The link from the City Hall to the main tower located on the hill of the San Cristobal township was then established, and the communication equipment that would play the role of repeaters was installed. We then proceeded with the configuration of all the equipment in the planned areas. Connection tests were carried out between each pair of devices in order to obtain a response
Fig. 3. Topological diagram of physical node
and verify the connection. For this, the PING command was used from the Windows console, as well as the Ping Test tool built into the equipment used. Once the wireless network was implemented, the firewall was installed and configured. Before that, an analysis was carried out to determine the firewall to be implemented, taking into account, among other aspects, the support information available. Afterwards, a survey was carried out, through a checklist and an interview with the department director, to determine the systems and servers to be protected from the external network and to define the security policies that would protect the internal network. The firewall security system was then implemented on an available server with the following characteristics: Intel Xeon 3.0 GHz processor, 4 GB of RAM, 1 TB hard drive, and 2 Gigabit Ethernet network cards. The first step was to back up the information stored on the computer and then format it and install the software. Once the installation process was completed, the two network cards of the server were defined, one for the LAN and the other for the WAN, and then configured. Next, the server administration web interface was accessed with the default credentials (admin as user and pfsense as password), which were later modified for security reasons. Once web administration was configured, the firewall was set up by adding the virtual IPs of the servers, assigning 1:1 NAT to translate the public IPs to private ones, and creating rules to allow access to certain computers through specific ports, among other configurations to protect the internal network from the external network. In addition, a package was installed to control user access to the network, allowing certain websites at certain times, as well as to control the bandwidth at each of the access points. The equipment used in the implementation comprised 2 Rocket M5 antennas working at a frequency of 5.8 GHz with a gain of 30.0 dBi, a RouterBoard 1100 edge router to allow efficient network management, 7 NanoStation M5 radios of 10 dBi, and 3 outdoor
Table 2. Environmental conditions of the benefited areas and presence of obstructions

Township | Temperature (Celsius) | Humidity (%) | Block | Wind (m/s)
Ciudadela El Recreo | 26/25 | 81 | No | 1.6
San Roque Abajo | 26/25 | 81 | Yes | 1.6
Central Park Bachillero | 25/27 | 81 | Yes | 1.6
Central Park Estancilla | 26 | 81 | No | 1.6
Community Caleño | 26/25 | 81 | No | 1.6
Community Los Micos | 26/25 | 81 | Yes | 1.6
Community Monte Oscuro | 26/27 | 81 | Yes | 1.6
Community Casical | 26/25 | 81 | Yes | 1.6
Community Mutre Afuera | 26/25 | 81 | Yes | 1.6
Community El Tambo | 26/25 | 81 | No | 1.6
AP/CPE/point-to-point braided 18 dBi PowerBeam M5 units with RP-SMA connectors and a cable cover for moisture protection. This device is compatible with the new 802.11ac standard, which offers a higher transmission speed (866 Mbps with 20/40/80 MHz channels); these radios were configured point-to-point with the Rocket M5 antennas, and the distribution to the station equipment was then defined; the unit has an opening angle of 90°. The CPE device is a high-speed, low-cost 5 GHz MIMO outdoor wireless unit equipped with a powerful 600 MHz CPU, 64 MB of RAM, a 16 dBi dual-polarization antenna, PoE, power supply, and mounting kit. The equipment also included a 2.4 GHz PicoStation M2 Wi-Fi router to provide wireless browsing within the coverage area, a tower with 30-meter turnbuckle supports, and a mast that allows line of sight between the links.

2.4 Unit-Test Phase - Integration and Operational
In this phase, the verification tests were carried out. One of them was to check Internet access from the end-user computers in the different benefited areas; another was to corroborate the proper functioning of the perimeter security system, for which attempts were made to access certain websites that were blocked according to the established security policies; finally, the bandwidth at the end computers was evaluated. The devices are monitored with the Ubiquiti administration software to verify their correct operation, complemented by site visits for physical verification and a review of the concurrent users connected to the Internet service.
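In the paper these reachability checks were done manually with the Windows ping command and the radios' built-in Ping Test tool. As an illustrative sketch only (not part of the original work), a small Java routine could automate the same check against the access points listed in Table 4; note that InetAddress.isReachable uses an ICMP echo request when privileges allow and otherwise falls back to a TCP echo request:

```java
import java.net.InetAddress;

public class AccessPointCheck {
    public static void main(String[] args) throws Exception {
        // Management addresses of the access points (taken from Table 4)
        String[] accessPoints = {
            "10.10.10.2", "10.10.10.3", "10.10.10.4", "10.10.10.5", "10.10.10.13",
            "10.10.10.14", "10.10.10.15", "10.10.10.16", "10.10.10.17", "10.10.10.18"
        };
        for (String ip : accessPoints) {
            InetAddress host = InetAddress.getByName(ip);
            long start = System.nanoTime();
            boolean up = host.isReachable(2000);              // 2-second timeout
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(ip + (up ? " reachable, ~" + elapsedMs + " ms" : " unreachable"));
        }
    }
}
```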
3 Results and Discussion
This section describes the three activities that were developed. The first was to identify the environmental conditions in the areas benefited by the network deployment, such as temperature, humidity, and wind speed; this information was obtained from the Development and Territorial Planning Plans of the parishes of
Table 3. Link group equipment

Name | Description | Device
TX-Principal | Main Device | Rocket M5
RX-El Recreo | Recreo Device | NanoStation M5
RX-San Roque Abajo | San Roque Abajo Device | NanoStation M5
TX-Bachillero | Bachillero Main Device | Rocket M5
TX-Estancilla | Estancilla Main Device | Rocket M5
RX-Caleño | Caleño Device | NanoStation M5
RX-Los Micos | Los Micos Device | NanoStation M5
RX-Monte Oscuro | Monte Oscuro Device | PowerBeam M5
TX-Casical | Torre Casical Device | Rocket M5
RX-Mutre Afuera | Mutre Afuera Device | NanoStation M5
RX-El Tambo | El Tambo Device | PowerBeam M5
Table 4. Access group equipment

Name | IP Address | Device
El Recreo | 10.10.10.3 | PicoStation M2
Bachillero | 10.10.10.2 | PicoStation M2
Casical | 10.10.10.4 | PicoStation M2
Ángel Pedro Giler | 10.10.10.5 | PicoStation M2
Caleño | 10.10.10.13 | PicoStation M2
Los Micos | 10.10.10.14 | PicoStation M2
Mutre | 10.10.10.17 | PicoStation M2
San Roque Abajo | 10.10.10.15 | PicoStation M2
Monte Oscuro | 10.10.10.16 | PicoStation M2
El Tambo | 10.10.10.18 | PicoStation M2
Tosagua, Ángel Pedro Giler, and San José de Bachillero. Based on this information, the obstruction variable was analyzed and determined through visits to each of the areas, as shown in Table 2. In the second activity, the transmission frequencies of the communication devices already operating in the town of Tosagua were diagnosed. For this, an informal survey was carried out of the number and percentage of devices working at each frequency: 58.82% at 2.4 GHz and 41.18% at 5.8 GHz. With this information, it was determined that the frequency to be used should be 5.8 GHz, because it is less saturated in the places that will receive the Internet service, and that the channel should be 44, because it is not used by any other equipment, thus avoiding interference on the links.
Next, each of the computers on the network was configured; they can be grouped into two parts. The first group comprises the equipment that provides the connection between the links; this information is shown in Table 3. In the same way, the equipment that functions as access points, to which citizens connect in order to access the Internet, was configured; the information on these devices is shown in Table 4. To complete the implementation of the wireless network, connectivity tests were carried out between the different links using the ping command from the Windows console, resulting in response times of about one millisecond between a link and a computer in the main tower. The implementation also included the installation and configuration of the firewall. This objective included certain activities: first, it was determined that the security system to be used would be pfSense, a free firewall distribution; then came its installation and configuration, in which different rules were added to protect the internal network from the external network. Because the institution hosts web servers, the respective mapping of public addresses to LAN addresses was made. Also, packages were installed in the security solution to block websites for end users and to control the bandwidth at the access points.

Table 5. Response times from access points to the server
Min (ms) Max (ms) Average (ms)
El Recreo
4,13
5,96
5,27
Bachillero
7,26
18,09
11,09
Casical 7,11 ´ Angel Pedro Giler 4,71
63,04
27,76
10,77
7,98
Cale˜ no
7,5
16,48
11,19
Los Micos
8,06
17,29
12,08
Mutre
4,03
9,45
7,25
San Roque Abajo 5,59
16,13
8,07
Monte Oscuro
6,08
16,23
8,2
El Tambo
4,26
5,92
5,03
For verification, three tasks were carried out to corroborate the proper functioning of the infrastructure. First, tests were run with the Ping Test tool from the installed equipment to the server to verify the response times; Table 5 summarizes the times obtained, which can be seen to be low. Advances in information and communication technologies have driven a paradigm shift, one aspect of which is distance education in rural areas through the Internet [11]. The fifth strategy applied was a survey of the two rural parishes, with 6175 inhabitants, at the 10 installed access points
Table 6. Availability of a PC at home

Availability | Yes | No
El Recreo | 18 | 4
Bachillero | 21 | 4
Casical | 17 | 8
Ángel Pedro Giler | 10 | 8
Caleño | 8 | 10
Los Micos | 5 | 7
Mutre | 2 | 10
San Roque Abajo | 7 | 6
Monte Oscuro | 6 | 13
El Tambo | 9 | 17
Total | 115 | 75
Table 7. Free Internet use in rural Tosagua

Uses | Social networks | Education | Work
El Recreo | 9 | 10 | 3
Bachillero | 12 | 8 | 5
Casical | 5 | 9 | 4
Ángel Pedro Giler | 7 | 12 | 3
Caleño | 7 | 8 | 3
Los Micos | 3 | 5 | 4
Mutre | 5 | 5 | 2
San Roque Abajo | 3 | 7 | 2
Monte Oscuro | 9 | 11 | 5
El Tambo | 10 | 11 | 3
Total | 70 | 86 | 34
of the Tosagua Conectado project, with a confidence level of 95% and a margin of error of 7%. It was concluded that 190 surveys had to be carried out, of which 90 were distributed in the Bachillero parish and its communities, which have four access points, and 100 in the Ángel Pedro Giler parish and its communities, which have six access points. The inhabitants were asked whether they have a computer at home; the results show that 60.53% have computer equipment and 39.47% do not, as shown in Table 6. The inhabitants chosen in the sample were also asked whether there is a provider in their community that offers Internet service and whether they have contracted it, with 16.24% responding that they have contracted Internet service, and whether they use the free Tosagua Conectado Internet, obtaining as a
result that 100% answered that they use the free Internet; 90% of the respondents use the free Tosagua Conectado Internet for more than two hours a day and 10% for less than two hours. Access to the Internet through the deployment of infrastructure where it was not previously available, especially in rural areas, makes a difference by giving people greater opportunities to acquire knowledge and become more productive, benefiting more than 20 households whose population is engaged in agriculture [13]. The reasons why the inhabitants of the two rural areas connect are education and learning (45.26%), social networks (36.84%), and job and information searches (17.89%), which shows that the project has been accepted and is helping to reduce the digital divide, as shown in Table 7. Having family members who connect online has a positive effect on individual use, as it encourages the use of the Internet as a tool for communication, entertainment, information seeking, and social media; this finding is consistent with previous research [12]. The development of wireless cities must address not only the construction of hardware facilities but also the educational and support systems that improve Internet access. Therefore, not only the amount of wireless access but also the quality of public access to the digital world deserves the attention of public policymakers. There are currently several initiatives to implement wireless networks in Ecuador, each with similarities and differences; for example, [8] presents an analysis of wireless technologies aimed at improving the design of the existing communications network in the central rural sector of the province of Morona Santiago. The objective of providing free Internet to the urban and rural areas of the Tosagua canton, with the intention of reducing the digital divide and making ICT available to citizens, was achieved, as was the satisfaction of its inhabitants.
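The paper does not state the sampling formula used for the survey described above; however, the reported figure of 190 surveys is consistent with the standard finite-population sample-size formula, assuming maximum variability (p = 0.5) and z = 1.96 for 95% confidence:

$$n=\frac{N\,z^{2}\,p(1-p)}{e^{2}(N-1)+z^{2}\,p(1-p)}=\frac{6175\cdot 1.96^{2}\cdot 0.25}{0.07^{2}\cdot 6174+1.96^{2}\cdot 0.25}\approx 190$$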
4 Conclusions
Wireless technologies are an increasingly decisive element in urban and rural development, making it essential to adopt the notion of a wireless city with Internet access through public Wi-Fi, which has become an integral part of a city's economic competitiveness for the benefit of its inhabitants and a way to considerably reduce the digital divide. In the Tosagua canton, public Wi-Fi service was not available in either the urban area or the rural areas. In 2016 the municipality of this canton adopted as public policy the provision of this service to its inhabitants, installing eight access points in parks and neighborhoods of the urban area; subsequently, ten points were installed in parks and community areas to reduce the digital divide, as part of a planned process, since the second phase of the project contemplates offering free computing courses for residents of the parishes. The results of the analysis show that the gap in terms of Internet access continues to be appreciable in the rural sector of Ecuador. The initiative taken by the Tosagua canton partially reduced the digital gap. However, the use of the
Internet is very important, given that a large part of the rural population has low educational levels and is not in contact with ICT. Therefore, reducing the digital divide requires improving access conditions through the adequate development of infrastructure and the provision of high-speed Internet services, and reducing the costs associated with connecting to the Internet by promoting greater competition among service providers. As future work, an agreement should be established between the city hall, the Universidad Técnica de Manabí, and the Escuela Superior Politécnica Agropecuaria de Manabí so that computer and Internet training courses are held to reduce the digital divide in the use of ICT among the inhabitants of the rural areas of the Tosagua canton.
References

1. Ministerio de Telecomunicaciones: Plan Nacional de Telecomunicaciones y Tecnologías de Información del Ecuador. https://www.telecomunicaciones.gob.ec/wpcontent/uploads/2016/08/Plan-de-Telecomunicaciones-y-TI.pdf. Accessed 20 Feb 2019
2. Barbecho, B., Carlos, R.: Estudio, diseño e implementación de un enlace inalámbrico de largo alcance con antenas direccionales de la empresa Compufácil. B.S. thesis, Universidad Israel, Quito (2011)
3. Butler, J., Pietrosemoli, E., Zennaro, M., Fonda, C., Okay, S., Aichele, C., Buttrich, S.: Redes inalámbricas en los países en desarrollo. The WNDW Project (2013)
4. Chancay-Garcia, L., Hernández-Orallo, E., Manzoni, P., Calafate, C.T., Cano, J.C.: Evaluating and enhancing information dissemination in urban areas of interest using opportunistic networks. IEEE Access 6, 32514–32531 (2018)
5. Chancay-García, L., Herrera-Tapia, J., Manzoni, P., Hernández-Orallo, E., Calafate, C.T., Cano, J.C.: Improving information dissemination in vehicular opportunistic networks. In: 2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC), pp. 1–6. IEEE (2020)
6. Contreras, F.K., Oliveira, F.B.D., Martins, E.S.: Internet: monitored freedom. JISTEM-J. Inf. Syst. Technol. Manag. 9(3), 459–472 (2012)
7. Heller, A.: The sensing internet - a discussion on its impact on rural areas. Future Internet 7(4), 363–371 (2015)
8. Hidalgo Bourgeat, J.E.: Análisis de tecnologías inalámbricas para mejorar el diseño de la red de comunicaciones en el sector rural centro de Morona Santiago. Master's thesis, Escuela Superior Politécnica de Chimborazo (2013)
9. INEC: Tecnologías de la Información y la Comunicación (TIC) (2019). https://www.ecuadorencifras.gob.ec/documentos/webinec/Estadisticas Sociales/TIC/2017/Tics 202017 270718.pdf. Accessed 20 Feb 2019
10. Khalil, M., Shamsi, Z., Shabbir, A., Samad, A.: A comparative study of rural networking solutions for global internet access. In: 2019 International Conference on Information Science and Communication Technology (ICISCT), pp. 1–5. IEEE (2019)
11. Lembani, R., Gunter, A., Breines, M., Dalu, M.T.B.: The same course, different access: the digital divide between urban and rural distance education students in South Africa. J. Geography Higher Educ. 44(1), 70–84 (2020)
12. Martínez-Domínguez, M., Mora-Rivera, J.: Internet adoption and usage patterns in rural Mexico. Technol. Soc. 60, 101226 (2020)
13. Morales, F., Valencia, H., Toasa, R.M.: Implementation of a wireless network for an ISP in the Alhajuela community from Ecuador. In: The International Conference on Advances in Emerging Trends and Technologies, pp. 280–290. Springer (2019)
14. Senplades: Secretaría Nacional de Planificación y Desarrollo Ecuador, Política del Buen Vivir (2020). http://www.planificacion.gob.ec. Accessed 21 July 2020
15. Wang, M., Liao, F.H., Lin, J., Huang, L., Gu, C., Wei, Y.D.: The making of a sustainable wireless city? Mapping public Wi-Fi access in Shanghai. Sustainability 8(2), 111 (2016)
Development of an Industrial Data Server for Modbus TCP Protocol

Violeta Maldonado, Silvana Gamboa, María Fernanda Trujillo, and Ana Rodas
Escuela Politécnica Nacional, Quito, Ecuador
{violeta.maldonado,silvana.gamboa,maria.trujillo01,ana.rodas}@epn.edu.ec

Abstract. This document presents the development of a low-cost industrial data server for communicating Modbus TCP devices and Windows applications by using a database as a temporary storage space. The performance of the developed data server was evaluated by integrating it with a commercial industrial Windows application, where the data server demonstrated adequate behavior. The proposed application is based largely on free software, Java and MySQL, to reduce its cost and make it an accessible option for industries with low investment capacity. This development thus aims to offer a low-cost alternative for implementing a supervisory control system.

Keywords: Modbus TCP · Communication · Data server · Java · Database · MySQL

1 Introduction
Today, process automation goes beyond mechanical action on process components or the measurement of variables and their regulation by electrical and electronic devices. Automation now aims to extend through all company levels, covering not only the plant floor but also business areas. To achieve this objective, information exchange from the process network to the company network is essential. In this regard, several tools such as communication drivers or data servers have been developed to enable data integration between industrial equipment and the computer systems that execute process monitoring and control applications. However, the high initial investment required for licensed industrial communication software is an important limitation in the implementation of global automation, especially for small and medium enterprises (SMEs) due to their low investment capacity [1,8]. In recent years, free software has enabled the development of low-cost industrial applications. For example, in [6] an OPC server, which exchanges information with industrial devices through Ethernet, was developed in Python. Likewise, SCADA systems such as Argos, FreeSCADA, Indigo, IntegraXor, Likindoy, Lintouch, Proview, PVBrowser, among others, were developed using free software [5]. As a result, small industries with low purchasing power can access these tools
that will allow them to improve their productivity and supervise their production. Although industrial applications developed with free software are not new, a local development will further facilitate access to these technologies for small industry in Ecuador. It is important to mention that Modbus is one of the most used communication protocols in industrial automation. Modbus TCP, a Modbus version over TCP/IP and Ethernet, is currently the most widely used variation due to, among other things, its high transmission speed, the accessibility of Ethernet network devices and its relatively low cost compared to other industrial technologies [2]. With this background, this work is focused on developing an industrial data server for communicating Modbus TCP devices and industrial Windows applications by using a database as a kind of temporary storage memory. The implementation was performed in Java, free software characterized by the portability of its applications, and the MySQL database management system, because it is an open source database [3]. With this implementation, based largely on free software, we seek to reduce the cost of the software needed to implement monitoring and control systems in industries with low investment capacity. The developed application uses 8 of the most commonly used Modbus function codes to establish communication between Modbus TCP devices and a MySQL database in which process input data is stored so that it can be accessed by Windows applications later. The use of a database as a register mechanism also helps to implement a kind of process historian. This work is organized as follows. Section 2 presents a review of the Modbus TCP protocol and details the structure of its data frame according to the Modbus function codes. Then, requirements for the software development are exposed in Sect. 3. In Sect. 4, the procedure for mapping Modbus frames over TCP is detailed. Later, communication with the database is explained in Sect. 5. Section 6 describes the development and operation of the server configuration interface. In Sect. 7 tests and their results are presented, and finally, the conclusions are drawn in Sect. 8.
2 Modbus TCP Protocol
Modbus is one of the most widely used protocols in industry since it is an open protocol that is versatile and easy to implement at the different levels of process automation. This application layer protocol is based on a client/server architecture in which the client sends a request message to the server, which performs the requested action and returns a response message to the client with the corresponding data. In case the requested action cannot be performed by the server, an error message is sent to the client [7]. Figure 1 shows the Modbus TCP frame, in which the PDU is the Protocol Data Unit, independent of the underlying communication layers and composed of two fields, Function Code and Data. The Data field may or may not contain certain blocks depending on the Function Code field. The ADU is the Application Data Unit, composed of the Modbus Application Protocol (MBAP) header and the PDU. The MBAP header contains four fields: (1) transaction identifier, (2) protocol identifier, (3) length, and (4) unit identifier.
The Transaction Identifier and Protocol Identifier are zero, the Length field holds the number of bytes from the unit identifier to the last byte of the frame, and the Unit Identifier varies according to the Modbus TCP server. Despite differences in the PDU, the MBAP header is the same for all function codes. In the following paragraphs, the PDU for the eight implemented Modbus function codes [7] is reviewed.
Fig. 1. Modbus TCP frame
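As a concrete illustration of this frame layout, the following sketch (our own, not code from the paper) builds the ADU for an arbitrary PDU in Java; the seven-byte MBAP header uses big-endian byte order.

    import java.nio.ByteBuffer;

    /** Illustrative helper that prepends the 7-byte MBAP header to a Modbus PDU. */
    public class MbapFrame {

        /** ADU = MBAP header + PDU; ByteBuffer writes in big-endian order by default. */
        public static byte[] buildAdu(int transactionId, int unitId, byte[] pdu) {
            ByteBuffer adu = ByteBuffer.allocate(7 + pdu.length);
            adu.putShort((short) transactionId);     // (1) transaction identifier
            adu.putShort((short) 0);                 // (2) protocol identifier, always 0 for Modbus
            adu.putShort((short) (1 + pdu.length));  // (3) length = unit identifier + PDU bytes
            adu.put((byte) unitId);                  // (4) unit identifier
            adu.put(pdu);                            // PDU = function code + data
            return adu.array();
        }

        public static void main(String[] args) {
            // PDU for function code 03: read 2 holding registers starting at address 0
            byte[] pdu = {0x03, 0x00, 0x00, 0x00, 0x02};
            System.out.println("ADU length: " + buildAdu(1, 255, pdu).length); // prints 12
        }
    }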
2.1 Read Coil Status
Function Code 01 allows the client to read the status of coils in the server; from 1 to 2000 states can be read. Table 1 details the PDUs of the Request and Response messages for this function code, where N is the number of bytes in which the status of the read coils is returned.

Table 1. Request and response to read coils

REQUEST                           RESPONSE
Field              Value          Field           Value
Function code      01             Function code   01
Starting address   0 to 65535     Byte count      N
Quantity           1 to 2000      Values          Read values

2.2 Read Discrete Input Status
Function Code 02 enables the client to read up to 2000 states of discrete inputs or contacts in the server. The PDUs of the Request and Response messages are similar to those shown in Table 1, but the function code field must be 02.

2.3 Read Holding Register

Function Code 03 allows the client to read holding registers, which can be accessed as integer type (2 bytes) or as floating type (4 bytes) in the server. In the case of integer data, up to 125 registers can be read. Table 2 shows the PDUs of the Request and Response messages for this code, where N is the number of elements to read from the device. If the elements are integer type the byte count will be twice N; if they are floating type, it will be four times N.

Table 2. Request and response to read holding register

REQUEST                           RESPONSE
Field              Value          Field           Value
Function code      03             Function code   03
Starting address   0 to 65535     Byte count      2*N
Quantity           1 to 125       Values          Read values

2.4 Read Input Register
Function Code 04 enables the client to obtain the contents of input registers, which can be integer or floating type. It is important to highlight that, for floating type, Big Endian or Little Endian format must be specified to define the byte order. Table 2 also shows the PDUs of the Request and Response messages, but the function code field must be 04.

2.5 Force Single Coil

Function Code 05 allows changing a coil between two states in the server: ON (1) or OFF (0). The PDU of the Request and Response messages for this function code is shown in Table 3.

Table 3. Request and response to write single coil

FIELD            VALUE
Function code    05
Address          0 to 65535
Value            0 or 1 (ON or OFF)

2.6 Write Single Holding Register

Function Code 06 allows writing an integer holding register. The data field in the PDU is similar for the Request and Response messages. Table 3 shows the structure, but the value in the function code field must be 06.

2.7 Force Multiple Coils

Function Code 15 allows forcing multiple contiguous coils. Table 4 shows the PDUs of the Request and Response messages. Each byte in the Values field of the Request message can contain up to 8 coil states.
Table 4. Request and response to write multiple coils

REQUEST                            RESPONSE
Field              Value           Field              Value
Function code      15              Function code      15
Starting address   0 to 65535      Starting address   0 to 65535
Quantity           1 to 1968       Quantity           1 to 1968
Byte count         N
Values             Data to write

2.8 Write Multiple Holding Register
Function Code 16 enables the client to write one or more holding registers in the server. Because this function code allows writing floating type elements, in such a case it is necessary to express the number in the IEEE 754 single-precision format and then to organize the bytes according to the Big Endian or Little Endian format, which depends on the server device. Table 5 shows the PDUs of the Request and Response messages.

Table 5. Request and response to write multiple holding registers

REQUEST                            RESPONSE
Field              Value           Field              Value
Function code      16              Function code      16
Starting address   0 to 65535      Starting address   0 to 65535
Quantity           1 to 123        Quantity           1 to 123
Byte count         2*N
Values             Data to write
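For illustration, the byte ordering required when writing a floating value with function code 16 can be sketched as follows (our code, not the authors'; the word-swapped variant is the "CD AB" arrangement that some Modbus devices expect):

    import java.nio.ByteBuffer;

    /** Illustrative encoding of a float into the 4 register bytes written by function code 16. */
    public class FloatEncoding {

        /** Returns IEEE 754 single-precision bytes in "AB CD" (big-endian) or word-swapped "CD AB" order. */
        public static byte[] toRegisterBytes(float value, boolean bigEndian) {
            byte[] abcd = ByteBuffer.allocate(4).putFloat(value).array(); // A B C D
            return bigEndian ? abcd : new byte[] {abcd[2], abcd[3], abcd[0], abcd[1]};
        }

        public static void main(String[] args) {
            byte[] be = toRegisterBytes(25.5f, true);
            byte[] ws = toRegisterBytes(25.5f, false);
            System.out.printf("AB CD: %02X %02X %02X %02X%n", be[0], be[1], be[2], be[3]);
            System.out.printf("CD AB: %02X %02X %02X %02X%n", ws[0], ws[1], ws[2], ws[3]);
        }
    }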
3 Proposal for Industrial Data Server Implementation

3.1 Industrial Data Server Requirements
The requirements that must be accomplished by the data server to be developed are: (1) the ability to enable communication between Modbus TCP devices and Windows applications, (2) facilitating the recording of data temporarily or short-term, as well as historical or long-term data storage, and (3) a reduced investment cost. The last requirement is achieved by using free software such as Java for the server implementation. As regards the second requirement, the use of a database is considered since it enables temporary data registration as well as long-term data storage
in a similar way to a process historian. Besides this, the database will be a link between the developed data server and Windows applications, such as operator interfaces that run on a PC. Therefore, the database facilitates accomplishing the first proposed requirement. The creation and management of this database is carried out using MySQL as the management system because it is an open source set of tools that contributes to reducing the implementation cost.

3.2 Software Implementation
For the data server implementation, two modes of operation are considered. The first mode allows testing the eight basic function codes of the Modbus protocol described in the previous section; in this mode, the user selects the code and configures the required parameters for its execution. The second mode enables automatic reading and writing of the industrial device through function codes 2 and 4 to read discrete inputs (Contact) and analog inputs (Input Register) respectively. Meanwhile, function codes 5, 6 and 16 are used to write outputs to field devices: code 5 for a discrete output (Coil), code 6 for an integer analog output (Holding Register) and code 16 for floating values. The execution of these codes is carried out according to the first digit of the Modbus register address, since it defines the variable type. Table 6 shows the function code according to the register address. For example, if the address is 30001, function code 4 is executed, which corresponds to reading an input register of the industrial device.

Table 6. Register address relationship with function code

ADDRESS   VARIABLE TYPE      FUNCTION CODE
0XXX      Coil               05
1XXX      Contact            02
3XXX      Input register     04
4XXX      Holding register   03 (Integer), 16 (Float)
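A rough sketch of this address-based dispatch (our illustration; it assumes 5-digit register addresses and simply follows Table 6):

    /** Illustrative dispatch of Modbus function codes from 5-digit register addresses (per Table 6). */
    public class AddressDispatch {

        public static int functionCodeFor(int address, boolean isFloat) {
            switch (address / 10000) {              // first digit of the register address
                case 0:  return 5;                  // 0XXXX coil
                case 1:  return 2;                  // 1XXXX contact (discrete input)
                case 3:  return 4;                  // 3XXXX input register
                case 4:  return isFloat ? 16 : 3;   // 4XXXX holding register
                default: throw new IllegalArgumentException("Unsupported address: " + address);
            }
        }

        public static void main(String[] args) {
            System.out.println(functionCodeFor(30001, false)); // prints 4, as in the example above
        }
    }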
In the second mode, the data server acts as an interface between the industrial device and the database. Regarding the database, two tables are created for each configured Modbus TCP device. The first table is intended for temporary recording of values in real time, and the other table for the long-term storage of process variables. Figure 2 shows a schema of our implementation, in which the Windows application is a Human Machine Interface (HMI).
Fig. 2. Communication between industrial device and a HMI through developed server
4 Sending Modbus Frames over TCP
In order to send Modbus frames over the TCP and IP protocols from the industrial data server to a Modbus device, a communication connection is opened by using the Java Socket class, since TCP/IP communication is based on sockets, i.e., the connection with a Modbus field device is fully defined by its IP address and logical port number. Afterwards, the "Request Frame" is assembled according to the corresponding Modbus function code and sent by using the getOutputStream().write(b) method. Subsequently, the industrial data server waits for the "Response Frame" from the Modbus field device, whose reception is accomplished by the getInputStream().read(b) method. Once the "Response Frame" is received by the data server and its error check field has been verified, the values in the data field are read. The read data is then organized according to the Big-Endian or Little-Endian format, depending on the Modbus field device features. The Endian formats are shown in Fig. 3a) and 3b) respectively. Also, if the data type of the read values is float, they are handled according to the IEEE 754 single-precision format. An example of an assembled frame for Function Code 3 is presented in Fig. 4, in which a reading of Holding Registers is accomplished. This code requires as input parameters (1) the starting read address, (2) the quantity of registers to read, and (3) the variable data type (Integer or Floating). An MBAP header is added to the frame too, in which the address and quantity fields use the byte order according to the Big-Endian format.
Fig. 3. a) Big-Endian order data (AB CD), b) Little-Endian order data (CD AB)
Fig. 4. Example for function code 03
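A minimal sketch of the procedure just described, assuming a hypothetical device at 192.168.0.10:502 (our illustration, not the authors' implementation): it opens the socket, writes a function-code-03 request like the one in Fig. 4, and decodes the first register of the response.

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.Socket;

    public class Fc03RequestExample {
        public static void main(String[] args) throws Exception {
            // Hypothetical Modbus TCP device; 502 is the standard Modbus port.
            try (Socket socket = new Socket("192.168.0.10", 502)) {
                OutputStream out = socket.getOutputStream();
                InputStream in = socket.getInputStream();

                byte[] request = {
                    0x00, 0x01,   // transaction identifier
                    0x00, 0x00,   // protocol identifier
                    0x00, 0x06,   // length: unit identifier + 5-byte PDU
                    (byte) 0xFF,  // unit identifier
                    0x03,         // function code 03 (Read Holding Registers)
                    0x00, 0x00,   // starting address
                    0x00, 0x02    // quantity of registers
                };
                out.write(request);
                out.flush();

                byte[] response = new byte[260];
                int n = in.read(response);
                if (n >= 11) {
                    int byteCount = response[8] & 0xFF;   // number of data bytes that follow
                    int register0 = ((response[9] & 0xFF) << 8) | (response[10] & 0xFF);
                    System.out.println("byte count = " + byteCount + ", register[0] = " + register0);
                }
            }
        }
    }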
5 Database Implementation
In automatic communication mode, the data server must connect to the database. As previously mentioned, two tables are created in the database for each Modbus device. The first table contains the real-time data, so it saves records of name, address, value, and the time of the last entry. The second table stores historical data corresponding to the time and value of the last 1000 registered values. Additionally, there is a table to save the configuration of the industrial data server, if the user selects such an option. Since the implemented database is managed through the MySQL database management system (DBMS), interaction between the developed data server and the DBMS is established by using SQL statements. Among the statements used are SELECT for reading the table with the real-time values, UPDATE for modifying the values of the records, INSERT for adding a new row to a table, and CREATE TABLE for creating a new table.
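A minimal sketch of how such statements can be issued from Java through JDBC (our illustration; the table and column names are assumptions, and the MySQL Connector/J driver must be on the classpath):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.Timestamp;

    public class RealTimeTableUpdate {
        public static void main(String[] args) throws Exception {
            // Hypothetical connection string, credentials and table/column names.
            String url = "jdbc:mysql://localhost:3306/dataserver";
            try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
                // Update the real-time table with the latest value read from a device register.
                String sql = "UPDATE realtime_plc1 SET value = ?, time = ? WHERE address = ?";
                try (PreparedStatement ps = conn.prepareStatement(sql)) {
                    ps.setDouble(1, 25.4);
                    ps.setTimestamp(2, new Timestamp(System.currentTimeMillis()));
                    ps.setInt(3, 40001);
                    ps.executeUpdate();
                }
            }
        }
    }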
6 Server Configuration Interface
A graphical user interface (GUI) for configuring the data server is implemented considering that the two operating modes described above can be executed, i.e., the GUI allows the user to test each function code individually through the Function Code Test Mode or to establish continuous communication between the server and a Modbus device through the Continuous Communication Mode. In the following, the implementation of the two operation modes in the GUI is described.

6.1 Function Code Test Mode
For this mode (Fig. 5) the IP address, Modbus register address and register quantity are mandatory input parameters. If the function code is a write code (05, 06, 15, and 16), an input field for every register to write is enabled in the lower left-hand side of the window for entering the data that will be written to the Modbus device. For reading function codes (01, 02, 03, and 04), the read data is displayed at the lower right-hand side.

6.2 Continuous Communication Mode
In this mode three tabs have been implemented: Configuration, Database, and Data (Fig. 6). In the Configuration tab, the user enters the data related to the configuration of the Modbus device. As can be seen, several devices can be added even after starting a connection. When the CONNECT button is pressed, communication between the configured devices and the data server is established. It is important to consider that if the table cannot be found in the database, the user should create it first. Additionally, the SAVE CONFIGURATION button saves the current configuration, and the LOAD CONFIGURATION button restores the last saved configuration. The Database tab enables the user to enter the name and address of
Fig. 5. Function code test interface
Fig. 6. Continuous communication mode - configuration tab
registers in the specified Modbus device, as shown in Fig. 7a). Once such configuration is done, the user must press the UPDATE button to update the table in the database. The Data tab allows the user to monitor a database table. In this tab the user can enter the table name and, after pressing the SHOW button, the name, value, and time columns are shown. It is important to highlight that the database tables created in this mode allow data exchange with Windows applications such as an HMI that runs on a PC. In our work an HMI developed in InTouch is used to validate this functionality. The HMI application communicates with the database through the SQL functions available in InTouch, so the creation of a Bind List to associate tagnames with table columns is required.
Fig. 7. Second mode interface a) Database and b) Data tab
7 Tests and Results
Each server operating mode was tested with PLCs from different brands to validate its performance. In the Function Code Test Mode, the performance of the implemented Modbus function codes was tested using a PLC M580 from Schneider and a Modbus PLC emulator as Modbus field devices. Figure 8 shows experiments for function codes 16 and 3, in which the left side shows the M580 PLC programming and the right side shows the data server interface. Meanwhile, Fig. 9 shows the test for function code 4 with the PLC emulator. In this case, two floating values are read from address 30002 with Big Endian format. In both cases the data server reads the correct values; through these tests we verify the correct functioning of this first mode. For the Continuous Communication Mode, tests were carried out by simulating a chocolate process which has a Supervisory Control System. This process has 3 PLCs, a Micro850 and an ML1400 from Allen Bradley, and an S7-1200 from Siemens, all of them configured as Modbus TCP servers. Figure 10 shows the configuration for the PLC ML1400; the left side shows the connection configuration and the right side the address configuration of the Modbus registers. As mentioned above, each Modbus device has two tables in the database, historical and real-time. Figure 11 shows the historical table on the left side and the real-time table on the right side for PLC ML1400. Figure 12 shows the process interface developed in InTouch, which is connected with the database tables and interacts with them
Fig. 8. Function code test mode with PLC M580
Fig. 9. Function code test mode with Emulator
Fig. 10. Continuous communication mode with ML1400
Fig. 11. Database tables for ML1400
through SQL statements. It was verified that the values from the inputs and outputs of every PLC match the values displayed in the HMI. After the experiments described above, Table 7 presents a comparison between the developed server and a commercial data server, the Wonderware MBTCP DAServer. The MBTCP features shown in the table were taken from its User Guide [4].
Table 7. Comparison between developed server and MBTCP DAServer

Description | MBTCP DAServer | Developed server
Operating system | Windows | Any
End-user investment in software | Paid license | Free
Connectivity with other applications | Windows applications acting as a DDE, Suite Link, or OPC client | Applications that handle SQL language to access data in MySQL
Register address size | 4, 5 and 6 digits | 4, 5 and 6 digits
Device sampling time | Configurable for the server | Configurable for each device
Response timeout | Configurable, default 3 sec | 10 msec
Item configuration | In Device Item tab | In Database tab
Data visualisation | Visualization in console | Visualization in Data tab
Database | Database in SQL Server | Database in MySQL
Fig. 12. Chocolate process interface
8 Conclusions
1. The development of an industrial data server for communicating Modbus TCP devices and Windows applications by using a database as a temporary storage space was presented.
2. In the Function Code Test Mode, the eight function codes can be tested individually; however, in the Continuous Communication Mode the implementation of all 8 function codes is not necessary for a correct communication.
3. Using open source software can reduce the cost of engineering software. Besides, the advantage of Java is the portability of its programs, which allows execution on any operating system.
4. The developed server and its database allow the monitoring of data in real time as well as historical data of a process.
5. The implementation of the system provides a low-cost server for small industries or industries with limited investment capacity, which gives them the possibility to implement a supervisory control system.
6. As a recommendation, low-cost data servers for other protocols could be developed, since other communication protocols are also used in industry even though Modbus is one of the most used industrial protocols. This would expand the number of beneficiaries of low-cost industrial software.
References

1. Aguirre, D., Gamboa, S., Rodas, A.: Low-cost supervisory control and data acquisition systems. In: 2019 IEEE 4th Colombian Conference on Automatic Control (CCAC), pp. 01–06 (2019)
2. Alonso, C.G.M., Rafael, S.F., Francisco, M.P., Gabriel, D.O., Elio, S.R., Miguel, S.P.V., Javier, S.B., María, F.A.J., Pau, M.C., Gregorio, Y.C.J., et al.: Comunicaciones industriales: principios básicos. Editorial UNED (2017)
3. de Guevara, J.M.L.: Fundamentos de programación en Java. Editorial EME (2011)
4. Invensys Systems, Inc.: Wonderware MBTCP DAServer user's guide (2007)
5. Acosta, R.L.: Evaluación de sistemas. Ph.D. thesis, Universidad Central Marta Abreu de Las Villas (2014)
6. Minchala, L.I., Ochoa, S., Velecela, E., Astudillo, D.F., Gonzalez, J.: An open source SCADA system to implement advanced computer integrated manufacturing. IEEE Latin America Trans. 14(12), 4657–4662 (2016)
7. Modbus Organisation: Modbus application protocol specification v1.1b3 (2015)
8. Warren, J.C.: Industrial networking. IEEE Ind. Appl. Mag. 17(2), 20–24 (2011)
Industrial Communication Based on MQTT and Modbus Communication Applied in a Meteorological Network

Erwin J. Sacoto Cabrera(B), Sonia Palaguachi, Gabriel A. León-Paredes, Pablo L. Gallegos-Segovia, and Omar G. Bravo-Quezada

Universidad Politécnica Salesiana, Cuenca, Ecuador
{esacoto,gleon,pgallegos,obravo}@ups.edu.ec
[email protected], http://www.ups.edu.ec

Abstract. In this paper, we propose an architecture to enable hierarchical sensor networks that use the MQTT protocol for sensor data acquisition to interact with the MODBUS protocol to send information to a SCADA system. The proposed architecture is implemented in three stages. In the first stage, we configure the MQTT protocol in the sensors. In the second stage, we configure the middleware that transports the data to the SCADA system through the MODBUS protocol. And in the third stage, the SCADA system is implemented. We conclude that the proposed architecture is feasible for hierarchical sensor networks that use the MQTT protocol and interact with MODBUS industrial networks.

Keywords: MQTT · MODBUS · Industrial communication · SCADA

1 Introduction
The Internet of Things (IoT) will change the way we live. Its technological impact will give intelligence to almost everything, and it will allow the development of cloud computing since more data can be collected, stored and analyzed than ever before, as described by the authors in [21,23,26,40]. In the same way, the use of IoT in industry is called Industrial IoT (IIoT). It focuses on the integration between Operational Technology (OT) and Information Technology (IT), as described in [14,26,39]. In this regard, the development of protocols for data acquisition in industrial communication networks has focused on the comparison of different communication protocols and how they can improve the performance of companies, as described by the authors in [3,18,25]. Likewise, research has focused mainly on comparisons of different communication protocols for data acquisition in sensor networks for data analysis and on improvements of business-to-business services across a wide range of industry sectors. In this context, a set of scalable and lightweight protocols have come up with the IIoT and provided the necessary features for the increasing number of industrial processes that need to be automatized, such as MQTT [4,17], MQTT for Sensor
Networks (MQTT-SN) [10,34], Data Distribution Service (DDS) [27], Constrained Application Protocol (CoAP) [32], Advanced Message Queuing Protocol (AMQP) [38], Java Message Service (JMS) [27], and Extensible Messaging and Presence Protocol (XMPP) [31]. Similarly, industrial communications have developed based on the concept of a distributed bus, as described in [12]. Specifically, in [12,15], the authors describe many versions and standards of field bus technology, such as Controller Area Network (CAN), Process Field Bus (Profibus), and Foundation Fieldbus (MODBUS). Currently, several problems have arisen in the integration of IoT sensors and nodes with legacy industrial communication protocols. In this respect, some research is focused on the study of protocol interaction. The authors in [28] compare the most common protocols in the field of industrial automation and IoT, such as OPC Unified Architecture (OPC UA), Data Distribution Service (DDS), Robot Operating System (ROS), and MQTT. In [11,29,30,36,37], an architecture is proposed based on the use of the OPC UA standard and the use of MQTT messages to regulate the data transmission process between the sensors and the Supervisory Control and Data Acquisition (SCADA) systems available in the industry. Two scenarios for building the IIoT environment with the MODBUS TCP and MQTT protocols are compared in [18]. The first scenario uses MODBUS TCP autonomously to build the IIoT environment, i.e., it considers MODBUS TCP as an IoT protocol. The authors conclude that the first scenario is efficient only when the request-response pattern is needed, and that it complies with most industrial applications [18]. In the second scenario, MQTT is used in conjunction with MODBUS TCP. The authors conclude that the second scenario is used when MQTT publish-subscribe is implemented in the sensors as a complement to MODBUS TCP; however, MQTT lacks interoperability. Nevertheless, the described studies do not analyze a model that allows integrating end to end the protocols used in IoT with industrial networks, specifically MQTT and MODBUS. Besides, these works are focused on specific aspects of the comparison of the characteristics of the IIoT protocols.

1.1 Paper Contributions and Outline
In this paper, we propose an architecture to enable hierarchical sensor networks that use MQTT protocols to interact with MODBUS networks. We study the feasibility of designing and implementing a communication system whose key challenge is to integrate the MQTT and MODBUS protocols so that IoT components from different vendors can communicate with existing industrial networks. The proposed system is analyzed in three stages. The first stage studies MQTT as a communication protocol between the sensors and the node. The second stage studies the design of middleware to transport the data from the sensor nodes to the MODBUS network. In the third stage, the system is implemented in a network of meteorological stations that communicate via MODBUS with the SCADA system. The rest of this paper is organized as follows: In Sect. 2, we describe the system architecture, communication protocols, and network scheme in detail.
Section 3 provides details of the development and implementation of the system. Section 4 shows and discusses the results. Finally, in Sect. 5, we present the conclusions and future work.
2 General System Architecture
The general system architecture is inspired by the model described in [7,16]. In those studies, the authors propose a four-layer architecture that is different from the OSI architecture [2]. The proposed general system architecture is suitable for the Unified Network and Universal Service of sensor networks [1,5,9], and integrates network resources, services and customers [7]. The architecture proposed in our model has the following layers:
– Industrial Sensor Layer (ISL). This layer is made up of the sensors/actuators installed on machines in industries, and it is responsible for collecting the information to be sent to the SCADA system [9,22]. For the implementation, we designed sensors to measure meteorological parameters, based on Arduino Yun boards.
– Industrial Gateway (IG). The IG receives the data from the ISL layer through the MQTT protocol over a WiFi connection [30]. The IG sends the information to the SCADA system via the MODBUS protocol over Ethernet [22].
– Management Layer (MSL). This layer manages the connection between the sensors/actuators and the application layer (SCADA system) [30]. For this study, we consider that the standard MQTT policies for authentication and authorization of client subscriptions through message topics are sufficient to provide safety to the sensor network [30,36].
– Application Layer (AL). The system architecture is completed with monitoring and control through the SCADA system, which is used in numerous industrial processes [13,30]. In this layer, the SCADA system saves measurement information in a database and extracts the data for processing. We use the MODBUS protocol between the SCADA system and the IG layer.
In Fig. 1 we show the General System Architecture described above. The main elements that are part of the architecture are described below:

2.1 First Stage
We consider a hierarchical network of sensors formed by the meteorological stations and the nodes that are in charge of collecting the information and then sending it to the SCADA system. The devices that make up this stage are:
– Sensors for Weather Stations. The ISL is composed of the sensors installed in the weather stations. Each station is built around an Arduino Yun device to which temperature, humidity, wind speed, wind direction, barometric pressure, precipitation and solar radiation sensors are connected, as shown in Fig. 2.
Fig. 1. General system architecture
We develop the communication system between the sensors and the receiver node using the MQTT protocol because it is a light and simple messaging protocol thanks to its publication/subscription architecture. This allows the management of thousands of remote clients, in our case, the weather stations.
– Receptor Nodes. These are part of the IG layer and receive the information sent by the sensors. For reading the information sent by each sensor, we consider an M/M/1 queue with a Round Robin (RR) discipline [33]. This discipline allows reading each one of the sensors that depend on this receiver node. The receiver node is implemented on a Raspberry Pi on which the MQTT broker was deployed, and a MySQL database is used for storing the sensor data. Figure 3 shows the receiver node and its elements.
Fig. 2. Weather stations sensors
2.2 Second Stage
In this stage, we develop middleware to connect the sensors, which send their information via MQTT, with the SCADA system, which uses MODBUS. For this purpose, we use the Receptor Nodes described in the previous section. To send the data collected from the sensors to the SCADA system, we use NODE-RED to configure MODBUS in the Receptor Nodes. Figure 3 shows the essential elements used in this stage.
Fig. 3. Receptor nodes
2.3 Third Stage
In this stage, we define the Communication System Protocol and the SCADA System that we implement, as described below:
– Communication System Protocol. Traditional SCADA protocols (e.g., Modbus) do not support connections with IoT protocols such as MQTT [8,12,19]. In [18], the authors compare the performance of MODBUS and MQTT with respect to the network resources needed for IoT [18]. In this study, the authors conclude that MODBUS is more efficient than MQTT and optimizes data transmission to the AL layer of industrial applications such as the SCADA system, as shown in Fig. 4.
Fig. 4. Comparison of MQTT and MODBUS [18].
Even though the MQTT protocol cannot replace MODBUS in industrial environments, because MODBUS uses a frame format optimized for industrial PLCs and SCADA as shown in Fig. 4, the MQTT protocol is suitable for subscription/publication between sensors. As indicated above, we use the MODBUS protocol to interconnect the nodes with the AL layer, i.e., with the SCADA system. The middleware described in Sect. 2.2 converts the information obtained from the meteorological stations (sensors), received via the MQTT protocol, and sends it to the upper layer via the MODBUS protocol.
– SCADA System. In our development, we use the SCADA Laquis system [6], which is software used for data acquisition and process control in industrial control systems (ICS). The most relevant configuration characteristics that we use in our development are:
• Monitoring: We configure the SCADA system as recommended by the vendor for this type of application, as a distributed industrial control system [6]. We configure the SCADA system as a WEB server because it allows exporting control panels, information and reports, and distributing them to the PCs of the users of the weather network. This configuration allows data to be obtained in real time through an integrated remote network, as we propose in this study.
• Architecture: We configure the data obtained from the IG layer with identification labels in the SCADA system database. The SCADA system uses tags in a database that contains properties to control what will be read or written from the lower layers, i.e., the SCADA system contains properties for reading or writing control of the equipment [6]. However, for this development, we configure the SCADA system database only to read the equipment data, because we do not have actuators in the weather station network.
• Control Server (CS): We configure in the SCADA system a supervisory control host, called the Control Server (CS), which is designed to communicate with a lower-level control layer (ISL) [35]. Specifically, the CS allows the SCADA system to access the subordinate control modules via the ICS network layer using the MODBUS protocol.
• Data Historian (DH): The DH is the centralized SCADA system database for logging the data collected by the sensor network through the IG layer [35].
• Reports Module: We configure this module to generate daily, monthly or annual meteorological reports of the different parameters, and to verify the data process end to end [35].
We use the Laquis SCADA system after a survey of use cases considering different user profiles and roles, taking into account the requirements and best practices in the management of users of the meteorological system.
3 Development and Implementation

3.1 MQTT Implementation
The information from the weather station sensors is sent via the MQTT protocol. As in similar studies, we configure the MQTT protocol so that the clients can subscribe to the topics that pertain to them and thereby receive whatever messages are published on those topics [20,36]. The MQTT configuration for this project is described below:
– Topics and Subscriptions. In our model, the MQTT broker identifies the status of a receiver, which is a meteorological station, before sending a command. A receiver can also identify the status of the broker before doing an action. This configuration is possible because, as described in [20,30], it is a characteristic of the MQTT protocol, i.e., publishers send messages on topics and clients receive the messages of the topics they have subscribed to.
– Publish/Subscribe. As described in [17], the principle of the publish/subscribe (pub/sub) communication model is that components register their interest if they want to consume information. In this model, we configure the publisher and subscriber as clients of the system and define the topics of the MQTT messages [20,30], i.e., MQTT subscriptions and publications can only be made on a specified set of topics. The following is the configuration for the MQTT connection between the sensors and the Receptor Node:
• MQTT Client: With the sentence Client<IPStack, Countdown> client = MQTT::Client(ipstack) we create a client object and pass it a network class instance.
• Void Connect(): This class contains the parameters for the connection between the Sensors and the Receptor Nodes. The parameters that we configure are:
∗ Hostname: Contains the MQTT server IP.
∗ Port: Number of the port (1883) through which the MQTT protocol communicates.
∗ rc: Return code used to verify that the client-server connection was established.
∗ Protocol Version: Contains the default protocol version (data.MQTTVersion = 3.1).
∗ Client Name: Contains the client name (data.clientID.string = char*).
The sensor data is transferred across the network when connecting to the message broker. The sensor network also allows multiple users to subscribe to messages of a given subject, because we need to control several meteorological sensors by groups; the message broker then broadcasts to the different users, as allowed by the MQTT protocol [17].
• Broker: The MQTT broker distributes the published data to those applications that have subscribed to the corresponding topic [17]. We configure the MQTT broker with the Mosquitto software, an open-source tool suitable for this requirement. The Mosquitto MQTT broker is set up on the Raspberry Pi to publish and subscribe to the sensor messages. Besides, it has a client running, responsible for subscribing to the desired topics, receiving the data, decoding it and storing it appropriately in a MySQL database [24,30]. We configure the broker using a configuration file as described in [24].
• Data Base. The database installed on the Raspberry Pi, in which the information sent by the sensors is stored, has the structure shown in Fig. 5. We use Java to store in the database the information sent to the MQTT broker. The main classes we used are:
∗ Clase Suscriber(): It contains the main elements to make the connection between the MQTT broker and Java, for which we use the following libraries:
· import org.eclipse.paho.client.mqttv3.MqttClient
· import org.eclipse.paho.client.mqttv3.MqttException
The methods used in the Subscriber class are:
· Subscriber() void: It contains the connectivity properties of the MQTT topic sent by the sensors to the broker (Mosquitto).
· String BrokerURL: Contains the broker DNS or IP and port.
· Client Username: Contains the name of the final user of the information.
· Start: Receives the answer of the MQTT client.
· Clase Conexion: Allows defining the connection to the MySQL database.
· Clase ManagerConexion: Contains the different methods for inserting data into the MySQL database.
· GetConexion: We use this object to create SQL statements.
Fig. 5. Data structure MySQL
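For reference, a condensed, self-contained sketch of a subscriber along the lines of the Suscriber class described above (our illustration; the broker URL, topic, table names and numeric payload format are assumptions, and the Eclipse Paho and MySQL Connector/J libraries must be on the classpath):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import org.eclipse.paho.client.mqttv3.MqttClient;

    public class SubscriberSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical broker URL, client id, topic and database credentials.
            MqttClient client = new MqttClient("tcp://192.168.0.20:1883", "receptor-node");
            client.connect();

            Connection db = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/weather", "user", "password");

            // Store every message published on the station topic in MySQL.
            client.subscribe("station/1/data", (topic, message) -> {
                String payload = new String(message.getPayload());
                try (PreparedStatement ps = db.prepareStatement(
                        "INSERT INTO measurements(topic, value) VALUES (?, ?)")) {
                    ps.setString(1, topic);
                    ps.setDouble(2, Double.parseDouble(payload));
                    ps.executeUpdate();
                }
            });
        }
    }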
3.2 MODBUS-MQTT Configuration
The following tools are used to configure the MODBUS protocol:
– Node-RED. It is a graphical programming tool included in the Raspberry Pi. In Node-RED, the conversion from the MQTT protocol to MODBUS is carried out by graphically programming the following nodes:
• Node MQTT IN: Connects to the MQTT broker and subscribes to the messages of the data topic.
• Node Function: It performs the conversion and division of the data using a script. This node has seven outputs, each one representing one of the values obtained in the reading of the sensors via MQTT.
• Node Modbus TCP client: Connects to the MODBUS TCP server and writes the messages.
• Node Modbus TCP read: Connects to the MODBUS TCP server to read values from the register assigned to a sensor.
– MODBUS Slave. It is a function configured in Laquis SCADA to listen for and obtain the data sent by the network node. Each node of the weather network is created and configured in the SCADA with a new connection using the function MODBUS TCP/IP connection. For each node, the IP address and the Modbus communication port, which is 502, must be configured.
– SCADA Laquis Configuration. The variables are configured in the SCADA system for each weather station. The variables are: humidity (hum), temperature (temp), wind speed (vel), rain (luv), pressure (pres) and irradiation (irr).
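The conversion performed by the Function node is a short script; the following Java rendering of the same idea is only illustrative (the comma-separated payload format and the scaling convention are assumptions, since the actual node is written in JavaScript):

    /** Illustrative split of a comma-separated weather payload into values for seven Modbus registers. */
    public class PayloadSplitter {

        public static int[] toRegisters(String payload) {
            String[] parts = payload.split(",");
            int[] registers = new int[parts.length];
            for (int i = 0; i < parts.length; i++) {
                // scale by 10 so one decimal digit survives in a 16-bit integer register (assumed convention)
                registers[i] = (int) Math.round(Double.parseDouble(parts[i]) * 10);
            }
            return registers;
        }

        public static void main(String[] args) {
            // assumed order: humidity, temperature, velocity, rain, pressure, direction, irradiation
            int[] regs = toRegisters("48.9,24,0,0,749,0,1");
            System.out.println(java.util.Arrays.toString(regs));
        }
    }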
4 Results
This section presents the results of the implementation of the weather station network through MQTT communication over MODBUS. Table 1 shows a sample of the data obtained in the SCADA system for sensor 1. For the implementation of the weather stations, several tests and calibrations of the sensors were carried out. Besides, a data sending stage was implemented between the Raspberry Pi and the SCADA system. The information obtained by each weather station is sent by MQTT to the Raspberry Pi, where it is stored in a MySQL database. Through the configuration made in NODE-RED, the information is converted from MQTT to MODBUS and transmitted to the SCADA system (Fig. 6).
Fig. 6. Results data transmission (Temperature)

Table 1. Results table sensor 1

Time      Name  Rain (l/m2)  Velocity (km/h)  Press (Pa)  Humidity (%)  Temperature (°C)  Direction  Irradiation (mW/m2)
11.20.45  S 1   0            0                749         48.9          24                0          1
11.25.45  S 1   0            0                749         48.9          24                0          1
11.30.45  S 1   0            0.63             749         55.7          21.7              0          5
11.51.09  S 1   0            0                749         42.8          24.7              0          8
12.01.09  S 1   0            1.64             749         46.2          23.9              0          8
12.06.09  S 1   0            0.56             748         47.6          24                0          7
12.11.09  S 1   0            0.56             748         43.4          24                0          8
12.16.09  S 1   0            0.72             748         41.5          25.6              0          9
12.21.09  S 1   0            1.87             749         41.7          26                0          10
5 Conclusion
In this paper, we proposed an architecture to enable a hierarchical sensor network that uses the MQTT protocol to interact with the MODBUS protocol in an industrial network. We demonstrated that the three design stages of the industrial communication system based on the MQTT and MODBUS protocols can be implemented in an industrial system, as follows:
– The first stage was implemented with MQTT as the communication protocol between the sensors and the node.
– The second stage was implemented as middleware that transports the data from the sensor nodes to the MODBUS network, on a Raspberry Pi using NODE-RED.
– In the third stage, the system was implemented in a network of meteorological stations that communicate via MODBUS with the SCADA Laquis system.
References

1. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor networks: a survey. Comput. Netw. 38(4), 393–422 (2002)
2. Alani, M.M.: OSI model. In: Alani, M.M. (ed.) Guide to OSI and TCP/IP Models, pp. 5–17. Springer, Cham (2014)
3. Balaji, S., Nathani, K., Santhakumar, R.: IoT technology, applications and challenges: a contemporary survey. Wireless Pers. Commun. 108, 1–26 (2019)
4. Banks, A., Gupta, R.: MQTT version 3.1.1. OASIS Standard 29, 89 (2014)
5. Brinsfield, J.: Unified network management architecture (UNMA). In: IEEE International Conference on Communications - Spanning the Universe, pp. 1135–1141. IEEE (1988)
6. Consultoria, L.L.: Laquis SCADA (2017). http://laquisscada.com/index.html
7. Dai, H., Han, R.: Unifying micro sensor networks with the Internet via overlay networking [wireless networks]. In: 29th Annual IEEE International Conference on Local Computer Networks, pp. 571–572. IEEE (2004)
8. Encalada, P., Isabel, S.: Diseño, desarrollo e implementación de una estación meteorológica basada en una red jerárquica de sensores, software libre y sistemas embebidos para la Empresa ELECAUSTRO en la Minicentral Gualaceo utilizando comunicación MQTT y MODBUS. B.S. thesis (2018)
9. Fahmy, H.M.A.: Wireless Sensor Networks. Springer, Heidelberg (2020)
10. Fette, I., Melnikov, A.: The websocket protocol. Technical report (2011)
11. Gallegos-Segovia, P.L., Bravo-Torres, J.F., Argudo-Parra, J.J., Sacoto-Cabrera, E.J., Larios-Rosillo, V.M.: Internet of things as an attack vector to critical infrastructures of cities. In: 2017 International Caribbean Conference on Devices, Circuits and Systems (ICCDCS), pp. 117–120. IEEE (2017)
12. Gilchrist, A.: Designing industrial internet systems. In: Industry 4.0, pp. 87–118. Springer (2016)
13. Goldenberg, N., Wool, A.: Accurate modeling of Modbus/TCP for intrusion detection in SCADA systems. Int. J. Crit. Infrastruct. Prot. 6(2), 63–75 (2013)
14. Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.: Internet of things (IoT): a vision, architectural elements, and future directions. Future Generation Comput. Syst. 29(7), 1645–1660 (2013)
15. Guohuan, L., Hao, Z., Wei, Z.: Research on designing method of CAN bus and Modbus protocol conversion interface. In: 2009 International Conference on Future BioMedical Information Engineering (FBIE), pp. 180–182. IEEE (2009)
16. Han, D., Zhang, J., Zhang, Y., Gu, W.: Convergence of sensor networks/internet of things and power grid information network at aggregation layer. In: 2010 International Conference on Power System Technology, pp. 1–6. IEEE (2010)
17. Hunkeler, U., Truong, H.L., Stanford-Clark, A.: MQTT-S – A publish/subscribe protocol for wireless sensor networks. In: 2008 3rd International Conference on Communication Systems Software and Middleware and Workshops (COMSWARE 2008), pp. 791–798. IEEE (2008)
18. Jaloudi, S.: Communication protocols of an industrial Internet of Things environment: a comparative study. Future Internet 11(3), 66 (2019)
19. Johnson, C.: Securing the participation of safety-critical SCADA systems in the industrial Internet of Things (2016)
20. Lampkin, V., Leong, W.T., Olivera, L., Rawat, S., Subrahmanyam, N., Xiang, R., Kallas, G., Krishna, N., Fassmann, S., Keen, M., et al.: Building smarter planet solutions with MQTT and IBM websphere MQ telemetry. IBM Redbooks (2012)
21. León-Paredes, G.A., Barbosa-Santillán, L.I., Sánchez-Escobar, J.J., Pareja-Lora, A.: Ship-sibiscas: A first step towards the identification of potential maritime law infringements by means of LSA-based image. Sci. Program. 2019, 14 (2019)
22. Low, K.S., Win, W.N.N., Er, M.J.: Wireless sensor networks for industrial environments. In: International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC 2006), vol. 2, pp. 271–276. IEEE (2005)
23. Madakam, S., Ramaswamy, R., Tripathi, S.: Internet of Things (IoT): a literature review. J. Comput. Commun. 3(05), 164 (2015)
24. mosquitto.org: MQTT man page (2019). https://mosquitto.org/man/mqtt-7.html. Accessed 19 Jul 2019
25. Nolan, M., McGrath, M.J., Spoczynski, M., Healy, D.: Adaptive industrial IoT/CPS messaging strategies for improved edge compute utility. In: Proceedings of the Workshop on Fog Computing and the IoT, pp. 16–20. ACM (2019)
26. Palattella, M.R., Thubert, P., Vilajosana, X., Watteyne, T., Wang, Q., Engel, T.: 6TiSCH wireless industrial networks: determinism meets IPv6. In: Internet of Things, pp. 111–141. Springer (2014)
27. Pardo-Castellote, G.: OMG data-distribution service: architectural overview. In: 23rd International Conference on Distributed Computing Systems Workshops, 2003. Proceedings, pp. 200–206. IEEE (2003)
28. Profanter, S., Tekat, A., Dorofeev, K., Rickert, M., Knoll, A.: OPC UA versus ROS, DDS, and MQTT: performance evaluation of industry 4.0 protocols. In: Proceedings of the IEEE International Conference on Industrial Technology (ICIT) (2019)
29. Quishpi, A., Paola, G., Rodríguez Bustamante, J.L.: Implementación de un sistema de supervisión y monitorización del consumo de energía eléctrica y agua potable, utilizando redes Het-Net para la transmisión de datos; con la finalidad de obtener información oportuna para una eficiente facturación y disminuir pérdidas en la dotación de los servicios. B.S. thesis (2016)
30. Sacoto-Cabrera, E., Rodriguez-Bustamante, J., Gallegos-Segovia, P., Arevalo-Quishpi, G., León-Paredes, G.: Internet of Things: informatic system for metering with communications MQTT over GPRS for smart meters. In: 2017 CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), pp. 1–6. IEEE (2017)
31. Saint-Andre, P.: Extensible messaging and presence protocol (XMPP): Core. Technical report (2011)
32. Shelby, Z., Hartke, K., Bormann, C.: The constrained application protocol (CoAP). Technical report (2014)
33. Singh, A., Goyal, P., Batra, S.: An optimized round robin scheduling algorithm for CPU scheduling. Int. J. Comput. Sci. Eng. 2(07), 2383–2385 (2010)
34. Stanford-Clark, A., Truong, H.L.: MQTT for sensor networks (MQTT-SN) protocol specification. International Business Machines (IBM) Corporation, version 1 (2013)
35. Stouffer, K., Falco, J.: Guide to supervisory control and data acquisition (SCADA) and industrial control systems security (2006)
36. Vimos, V., Sacoto, E., Morales, D.: Conceptual architecture definition: implementation of a network sensor using Arduino devices and multiplatform applications through OPC UA. In: 2016 IEEE International Conference on Automatica (ICAACCA), pp. 1–5. IEEE (2016)
37. Vimos, V., Sacoto Cabrera, E.J.: Results of the implementation of a sensor network based on Arduino devices and multiplatform applications using the standard OPC UA. IEEE Latin America Trans. 16(9), 2496–2502 (2018)
38. Vinoski, S.: Advanced message queuing protocol. IEEE Internet Comput. 6, 87–89 (2006)
39. Yan, L., Zhang, Y., Yang, L.T., Ning, H.: The Internet of Things: From RFID to the Next-Generation Pervasive Networked Systems. CRC Press, Boca Raton (2008)
40. Zhang, Y., Mukherjee, M., Wu, C., Zhou, M.T.: Advanced industrial networks with IoT and big data. Mobile Netw. Appl. 24, 1–3 (2019)
REGME-IP Real Time Project

David Cisneros1, Mónica Zabala2(B), and Alejandra Oñate2

1 Instituto Geográfico Militar, Quito, Ecuador
[email protected]
2 Escuela Superior Politécnica de Chimborazo, Riobamba, Ecuador
{m_zabala,mayra.onate}@espoch.edu.ec
Abstract. The high demand for positioning-based services has driven the development of measurement correction techniques for GPS data, with Real-Time techniques presenting the best results. The architecture of a Real-Time correction system is based on the RTCM message transport protocol NTRIP and consists of a server called a caster that concentrates the streams of the GNSS stations and distributes them to the users through the internet, acting as a gateway. Currently there are international projects such as IGS-RTS or EUREF-IP and, in Latin America, SIRGAS-RT, which encourages all member countries to participate in the implementation of this service in real time. Being responsible for the administration of the Continuous Monitoring Network of Ecuador, the Military Geographical Institute optimizes its technological resources and implements the national caster and a backup caster, in collaboration with Escuela Superior Politécnica de Chimborazo, to provide the correction service in real time through the NTRIP protocol, releasing GNSS data to Ecuadorian citizens and including Ecuador among the countries that comply with the SIRGAS requirements. Keywords: GNSS · NTRIP · Real time corrections
1 Introduction

In Ecuador, positioning activities based on Global Navigation Satellite Systems (GNSS) are developed mainly with the relative Static Differential positioning method, post-processing with specialized software being mandatory to obtain points with coordinates linked to the geocentric framework, the official reference, within an acceptable level of accuracy. Obviously, this GNSS positioning technique takes time to deliver results; however, it is important for maximum accuracy work, in which it is mandatory to achieve the least possible error in the coordinates, at the millimeter level, for example when establishing high accuracy GNSS networks, national, regional or global reference frameworks, precision engineering, among others. However, there are several georeferenced activities which do not require maximum accuracy; it is enough to obtain coordinates at the sub-metric and centimetric level, in which case it is not mandatory to apply the post-processed Static Differential relative positioning method, since results are required faster and in less time, opening the possibility
of applying relative positioning methods in Real Time. The techniques and procedures for observation and acquisition of geospatial information evolve over time, and the world trend is to provide GNSS information in real time, evolving from the transmission of traditional DGNSS corrections, based on radio frequency and on downloading files for post-processing, to Internet-based data transmission; this is addressed with the use of relative GNSS positioning methods in Real Time, mainly based on the NTRIP protocol.
2 International Real Time Projects: Related Work

2.1 EUREF-IP

With the increased capacity of the Internet, applications that transfer continuous data streams by IP packages, such as Internet Radio, have become well-established services. Consequently, EUREF decided in June 2002 to set up and maintain a Real-Time GNSS infrastructure on the Internet using stations of its European GPS/GLONASS Permanent Network (EPN). The goal of the pilot project is to evaluate and stimulate the use of the NTRIP technology. All data are sent to a EUREF Broadcaster from where they can be received by authorized users. EUREF-IP is a trial service. Data are available for test and evaluation purposes only [1].

2.2 IGS Real-Time Service (IGS-RTS)

IGS-RTS is an operational scientific service of the International Association of Geodesy and one of several services contributing to the Global Geodetic Observing System (GGOS), developing the Real-Time service that uses the Network Transport of RTCM by Internet Protocol (NTRIP) for internal operations and for the delivery of Real-Time products to its user community [2].

2.3 Geocentric Reference System for the Americas (SIRGAS)

SIRGAS Resolution No. 6 of May 29, 2008 on the SIRGAS Real-Time policy project resolves: Establish a pilot project called SIRGAS in Real Time (SIRGAS-RT), which will aim to investigate the fundamentals and applications associated with the distribution, in the SIRGAS regions, of observations and corrections to GNSS measurements in real time using NTRIP or any other long-range means [3]. Under this activity, a service called Experimental SIRGAS Caster has been implemented with the main purpose of publishing GNSS data in real time using NTRIP. This caster is hosted by the Laboratory of the Rosario Geodesy Satellite Group, at the National University of Rosario, Argentina [4].

2.4 Experimental Caster Server ESPOCH

A pilot project was proposed in 2016, consisting of implementing the experimental caster with the BKG standard version software as a cooperation between the Faculty of Computer Science and Electronics and the National Electricity Company of the city of Riobamba, only for internal use. The infrastructure consists of a single GNSS station called
EREC connected to the caster server, the main element of the project, monitored continuously during two years of operation. The integrity of the transmitted information was checked using the BNC software and by applying the Real-Time corrections on Mobile Mapper 20 GPS devices, reaching centimeter-level accuracy in the PVT solution [5].
3 Red de Monitoreo Continuo del Ecuador (REGME)
The GNSS Continuous Monitoring Network of Ecuador (REGME) constitutes the principal national georeferencing infrastructure, installed and administered by the Military Geographical Institute (IGM) since 2008 [6]. It is made up of 45 CORS-type stations (see Fig. 1), distributed throughout the national territory, tracking all the observables given by the L-band frequencies of GPS L1/L2, GLONASS L1/L2, and (in the near future) GALILEO E1/E5, and providing the information needed to execute correction activities through the Static Differential and Real-Time relative positioning methods. REGME was designed to provide a coverage radius of 100 [km] for each station. User requirements and applications have led to redefining this coverage radius, decreasing it to 50 [km] and increasing the number of GNSS stations and their equipment, with a goal of 50 permanent stations.
Fig. 1. REGME, Coverage radio around 50 [km] (August 2020).
4 Networked Transport of RTCM via Internet Protocol (NTRIP)
NTRIP is an application-level protocol that supports GNSS data transmission over the Internet, based on the HTTP/1.1 hypertext transfer protocol. It was developed to transmit differential correction data or other types of GNSS data to stationary or mobile users over the Internet. NTRIP is an open, non-proprietary protocol. Significant characteristics of the NTRIP dissemination technique are the ability to distribute any kind of GNSS data, the potential to support mass usage, the dissemination of hundreds of streams
simultaneously to up to a thousand users when applying modified Internet Radio broadcasting software, and the ability to stream over any mobile IP network thanks to the use of TCP/IP [7]. NTRIP constitutes the transport layer, and the transmitted data is in the RTCM format, generally in versions 2.x and 3.x. The messages contain observables, the antenna type, the coordinates of the reference station linked to the national official reference system, and code and phase corrections from GNSS constellations such as GPS, GLONASS, and GALILEO. RTCM version 3.0 additionally transmits a network solution message, made up of the differential corrections of several permanent stations, which increases the consistency and quality of Real-Time positioning solutions.
4.1 Architecture
The NTRIP architecture is based on the following components: NTRIPSources, NTRIPServers, NTRIPCasters, and NTRIPClients (see Fig. 2) [8]; a minimal client request sketch is given after Fig. 2.
• NTRIPSource: the GNSS reference station that generates and transmits the corrections (data streams) in RTCM format. REGME stations currently generate versions 2.3 and 3.0, depending on the receiver datasheets.
• NTRIPServer: acts as an intermediary in the data transmission between the GNSS reference station and the caster server; its use is recommended only when the GNSS station does not natively support the NTRIP protocol.
• NTRIPCaster: the transmitting agent. Its primary function is the dissemination of GNSS corrections to end users; it is also responsible for monitoring the quality and integrity of the received data and for authenticating users by username and assigned password, which requires a constant and uninterrupted Internet connection.
• NTRIPClient: the final devices or rovers connected to the caster. Final devices include PCs, tablets, GPS receivers, mobile phones, and PDAs, together with the software and hardware needed to support the NTRIP protocol [9].
Fig. 2. NTRIP Architecture
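To make the data flow between these components concrete, the following minimal sketch (illustrative only, not part of the REGME-IP project) opens an NTRIP 1.0 session against a caster and reads the raw RTCM stream of one mount point. The caster host, mount point, and credentials shown are placeholders; a registered account on the corresponding caster is assumed.

```python
import base64
import socket

# Placeholder values; replace with a real caster, mount point, and credentials.
CASTER_HOST, CASTER_PORT = "caster.example.org", 2101
MOUNTPOINT, USER, PASSWORD = "MOUNTPOINT", "user", "password"

credentials = base64.b64encode(f"{USER}:{PASSWORD}".encode()).decode()
request = (
    f"GET /{MOUNTPOINT} HTTP/1.0\r\n"
    f"User-Agent: NTRIP SimpleClient/1.0\r\n"
    f"Authorization: Basic {credentials}\r\n\r\n"
)

with socket.create_connection((CASTER_HOST, CASTER_PORT), timeout=10) as sock:
    sock.sendall(request.encode())
    header = sock.recv(1024)           # NTRIP 1.0 casters answer "ICY 200 OK"
    print(header.decode(errors="ignore").splitlines()[0])
    rtcm = sock.recv(4096)             # raw RTCM bytes, ready to feed a rover or decoder
    print(f"received {len(rtcm)} bytes of correction data")
```

Requesting the root path instead of a mount point returns the caster's source table, which lists the available streams.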
5 RTCM Standard
The Radio Technical Commission for Maritime Services (RTCM) was established in 1947 as an advisory committee to the US Department of State. Currently, it is an independent organization made up of government agencies, manufacturers, and service providers. The high demand for Real-Time DGPS applications led to the establishment of RTCM Special Committee 104 to standardize an industry format for correction messages [10]. The RTCM format has evolved through new versions that improve data delivery and protect its integrity.
5.1 RTCM 2.x
The RTCM version 2 format was designed based on the structure of the GPS Navigation Message, since word sizes, parity algorithms, and format are identical. The GPS Navigation Message is transmitted at 50 bits per second, and a word has 30 bits. A data link carries information from a reference station to a rover or remote receiver. The reference receiver produces GNSS data, converts it into a suitable format for transmission, and modulates the data on a carrier that is transferred to the mobile user [11].
5.2 RTCM 3.x
RTCM SC-104 introduced a new standard, known as RTCM SC-104 version 3.0, to overcome earlier disadvantages and improve RTK operations. Its renewed structure benefits RTK operations and supports RTK networks thanks to the efficient use of bandwidth during the broadcast. The RTCM Type 1003 message is a GPS RTK dual-frequency message based on raw pseudo-range and carrier-phase measurements. RTCM version 3.0 divides the L1 and L2 corrections into dispersive and non-dispersive components: the Ionospheric Carrier Phase Correction Difference (ICPCD) and the Geometric Carrier Phase Correction Difference (GCPCD) [12].
6 REGME-IP Project
Implementing a precise Real-Time positioning service requires a physical and technical infrastructure equipped mainly with geodetic equipment such as GNSS antennas and receivers, wireless communication technologies, security and information-integrity policies, and communication protocols that distribute DGNSS correction streams in RTCM format over the Internet. The REGME-IP Real-Time project represents an integrated system that includes all of these elements to improve the accuracy of the final Position, Velocity, and Time (PVT) solution. The permanent GNSS stations of REGME are compatible with the Real-Time positioning technique and generate streams of differential corrections in RTCM format, distributed through the NTRIP protocol, changing the traditional DGPS technique, in which the correction information is transmitted by radio frequency, into packet-switched communication based on TCP/IP. The term REGME-IP denotes the current and functional
REGME plus the transmission of the GNSS station information over Internet Protocol (IP), following the international recommendations of related works. The GNSS station streams allow correcting common errors that the GNSS signal suffers during its travel from satellites in MEO orbit to the Earth, such as satellite geometry and propagation effects in the ionosphere and troposphere, which produce inaccuracies in the final position solution. The correction is generated based on the coordinates of each of the permanent stations. The REGME-IP service is expected to reduce the operating costs of outdoor activities, improving the survey process because it does not require post-processing activities that delay the delivery of results. The service's availability increases citizen satisfaction in obtaining products and services related to the generation of geoinformation, the dissemination of geospatial sciences, and other specialized services.
6.1 Principal Caster Server
The Real-Time positioning system's architecture has a fundamental element, called the caster server, which acts as a gateway between the GNSS stations and the end users. The caster's primary function is to authenticate users and transmit the stations' streams in RTCM format to the rovers. In 2018, the Military Geographical Institute (IGM) and the Escuela Superior Politécnica de Chimborazo (ESPOCH) started the project to implement the Principal Experimental Caster Server to distribute the streams of differential GNSS corrections in Real-Time, through the NTRIP protocol, using REGME stations. International Real-Time projects recommend using the BKG NTRIP Caster software, which disseminates GNSS data streams in RTCM 2.x and 3.x format based on the NTRIP protocol. The BKG NTRIP Caster software was developed within the framework of the EUREF-IP project. It is based on the ICECAST Internet Radio software, written in the C programming language under the GNU General Public License (GPL). REGME-IP has redundancy to guarantee the national service: the Principal Caster Server is installed at IGM in Quito, Ecuador; in case of physical or operational problems, a backup caster server, a mirror of the principal server, is available and assumes the connection requests from registered users. The backup caster server is located in the Faculty of Computer Science and Electronics of ESPOCH in Riobamba, Ecuador. The address of each caster server is given below:
• Principal Caster Server: regme-ip.igm.gob.ec:2101
• Backup Caster Server: regme-ip.espoch.edu.ec:2101
6.2 Operating Considerations and Recommendations
The Real-Time correction technique takes its principles from the GPS/GNSS differential technique [13]. After the test period executed in outdoor environments, the following recommendations can be considered.
DGPS
• The maximum/optimal operating distance between the NTRIP Source (REGME station) and a dual-frequency L1/L2 NTRIP Client (rover) is 50 [km] for a fixed solution.
• The maximum/optimal operating distance between the NTRIP Source (REGME station) and a single-frequency L1 NTRIP Client (rover) is 20 [km] for a fixed solution.
• The Real-Time positioning time is defined by the NTRIP Client, e.g., 15, 30, or 60 s.
• GNSS corrections are transmitted in RTCM 2.3 and 3.0 format, which involve pseudo-range and carrier corrections, and centimeter-level horizontal and vertical accuracy is expected, depending on the survey conditions.
The Real-Time technique implies changing the communication channel from radio frequency in the VHF/UHF bands to transmission based on the Internet Protocol. The rover device must support an Internet connection through any available technology such as WiFi, hotspot, or the Advanced Mobile Service (AMS). In outdoor conditions, the mobile cellular network becomes the primary connectivity option.
Advanced Mobile Service (AMS)
The mobile operators in Ecuador are Conecel S.A. (Claro), Otecel (Movistar), and CNT. New technologies and the demand for low latency in digital services require updating towards technologies that integrate an evolution in devices, antennas, operating frequency bands, and services. Each step in this evolution is identified by a generation of mobile networks (2G, 3G, 3G+, 4G) corresponding to new technologies (GSM, GPRS, EDGE, UMTS), increasing the performance of the mobile cellular network and the quality of service (see Fig. 3).
Fig. 3. AMS Generation
AMS Coverage
Figure 4 shows the evolution of the radio base stations installed in the national territory for the period 2008-2020. As of March 2020, there were 18,857 radio base stations, an increase of 10.48% compared to March 2019, when there were 17,069 [14].
Fig. 4. Radio bases installed in 2G, 3G, 4G technologies, nationally, 2015–2020
Size of Generated Data
The size of the packet generated in RTCM format depends on the number of satellites the rover is tracking [15].

Table 1. Data transfer in RTCM format (bits/sec)

Format      6 Satellites   9 Satellites   12 Satellites
RTCM 2.3    3,900          5,400          7,000
RTCM 3.0    2,500          3,000          3,550
RTCM 3.0    2,000          2,400          2,800
[CM]R1      1,400          1,800          2,100
[CM]R       900            1,300          1,600
IGM technical staff record a daily working day of 8 h of data collection, from 08:00 to 16:00. Considering 12 satellites as the maximum number of satellites available above the rover, the RTCM 2.3 format (see Table 1) generates the daily data volume calculated in (1):

Total = bits generated per second * 3600 [sec] * 8 [hours]    (1)
Total = 7000 [bits/sec] * 3600 [sec] * 8 [hours]
Total = 25.2 Mbytes

In the case of the RTCM 3.0 correction format, with maximum satellite availability, the total volume generated is given by (2):

Total = bits generated per second * 3600 [sec] * 8 [hours]    (2)
Total = 3550 [bits/sec] * 3600 [sec] * 8 [hours]
Total = 12.78 Mbytes
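As an illustration of the calculation above (a minimal sketch, not part of the original project), the following snippet reproduces the daily data-volume estimate; the bit rates are the Table 1 values for 12 tracked satellites and the 8-hour working day is the one stated in the text.

```python
# Daily data volume generated by a rover, following Eqs. (1) and (2).
RATES_BPS = {"RTCM 2.3": 7000, "RTCM 3.0": 3550}   # bits per second, 12 satellites

HOURS_PER_DAY = 8                    # working day from 08:00 to 16:00
SECONDS_PER_DAY = HOURS_PER_DAY * 3600

for fmt, bps in RATES_BPS.items():
    total_bits = bps * SECONDS_PER_DAY
    total_mbytes = total_bits / 8 / 1e6      # bits -> bytes -> megabytes
    print(f"{fmt}: {total_mbytes:.2f} Mbytes per day")
# Expected output: RTCM 2.3 -> 25.20 Mbytes per day, RTCM 3.0 -> 12.78 Mbytes per day
```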
Latency Based on AMS Generation
Depending on the available AMS generation, the rover's latency will vary. Consider the RTCM 2.3 format with full satellite visibility (see Table 1), which generates 7000 bits/sec; with 2G, assuming an average theoretical rate of 0.1 [Mbytes/s] (see Fig. 3), the estimated latency is 1.09 [msec]. For the RTCM 3.0 format with full satellite visibility above the rover, which generates 3550 bits/sec, the estimated latency with 2G is 0.55 [msec]. The BNC software, developed by BKG, emulates an NTRIP client on a computer [16]; this tool allows measuring the latency and throughput of the connection to the caster server, where the measured average latency is 0.9 [sec].
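The transmission-time part of this estimate is simply the message size divided by the effective link rate. The sketch below (illustrative only) shows that ratio; the effective rate of 800,000 bytes/s used in the example is an assumption chosen because it reproduces the figures quoted above, and is not a value given in the text.

```python
# Rough transmission-time estimate for one second of RTCM data over a mobile link.
def transmission_time_ms(message_bits: int, link_rate_bytes_per_s: float) -> float:
    message_bytes = message_bits / 8
    return message_bytes / link_rate_bytes_per_s * 1000.0

# Assumed effective rate of 800,000 bytes/s (illustrative assumption only):
print(transmission_time_ms(7000, 800_000))   # ~1.09 ms for RTCM 2.3
print(transmission_time_ms(3550, 800_000))   # ~0.55 ms for RTCM 3.0
```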
7 Methodology
7.1 Static Differential Post-Process vs. Real-Time NTRIP Positioning Test Using the REGME-IP Service
To analyze and quantify the accuracy reached in meters, a sample of test points surveyed with both methods is evaluated. The Sucre Canton, Bahía de Caráquez city, Manabí Province, is chosen for the tests. The collected data correspond to 150 points located at different surrounding sites. The REGME stations used as references for the Static Differential post-process and Real-Time NTRIP methods are located in the cities of Portoviejo and Chone; their stream identifiers are POEC and ONEC, respectively (see Fig. 5).
Fig. 5. Geographic location of the area used for field testing
Differential Static Positioning, Post-Process
For static positioning:
• Two Trimble R8 GNSS L1/L2 (GPS + GLONASS) rover receivers were used.
• Survey duration per point: 30 min.
• The survey of the 150 planned points was completed.
• Files from the GNSS reference stations were downloaded and post-processed using the Trimble TBC software.
• For the final coordinates, the REGME stations located in Portoviejo (POEC) and Chone (ONEC) are the GNSS reference stations used in the adjustment.
Real-Time Positioning NTRIP, REGME-IP Service
The Real-Time positioning technique requires the rover to have Internet service to connect to the Principal Caster Server. A Huawei P8 cell phone was used as a hotspot to provide connectivity to the rover. The test considered the RTCM 3.0 message (see Fig. 6).
Fig. 6. Trimble R8 GNSS L1/L2 rover equipment (GPS + GLONASS) configured to connect to the Principal Caster using a cell phone as hotspot
The maximum distance from the working area in the Sucre Canton to the GNSS station is 58 [km], and the minimum distance is 25 [km]. Each point is surveyed for 60 s, obtaining FIXED solutions. The NTRIP technique is affected by latency due to the distance between the Principal Caster Server and the work site; it depends on the packet-switching delay when transmitting the information at a specific rate, and on the technology and coverage of the mobile network offered by the user-selected operator. It is essential to consider these observations; for this example, the distance between the Principal Caster Server and the site in the Sucre Canton is approximately 230 [km] (see Fig. 7). Another distance that must be considered is the one to the GNSS reference station; it is recommended to use the nearest one, based on the DGPS technique (see Sect. 6.2), and a small distance-based selection sketch is given after Table 2. For the Sucre Canton, the nearest available GNSS stations are described in Table 2; consider the operational condition of each station before selecting it; it is recommended to check its status online on the official web site [17]. Depending on the location of the sampled points, the distances are as follows (Table 2):
Table 2. Distance between the nearest GNSS Reference Station to Sucre Canton

PARROQUIA        REGME reference station with NTRIP service   Distance [km]
CHARAPOTO        PORTOVIEJO - POEC                            26
SAN ISIDRO       PORTOVIEJO - POEC                            68
BAHIA            CHONE - ONEC                                 38
LEONIDAS PLAZA   PORTOVIEJO - POEC                            47
SAN AGUSTIN      PORTOVIEJO - POEC                            51
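As a sketch of the "use the nearest operative station" rule described above (illustrative only, not part of the original project), the following example computes great-circle distances from a work site to candidate reference stations and picks the closest mount point. The coordinates are rough placeholders, not the official REGME station coordinates.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two WGS-84 points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

# Illustrative coordinates only (not the official REGME values).
stations = {"POEC (Portoviejo)": (-1.05, -80.46), "ONEC (Chone)": (-0.70, -80.09)}
work_site = (-0.60, -80.42)   # e.g., a point in the Sucre Canton

name, coords = min(stations.items(), key=lambda kv: haversine_km(*work_site, *kv[1]))
print(f"Nearest mount point: {name}, {haversine_km(*work_site, *coords):.1f} km away")
```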
Fig. 7. Distance between Principal Caster Server and Sucre Canton
8 Results
The DGPS technique consists of eliminating the common errors between a GNSS reference station and the rover; this technique is conditioned by the distance, the epoch, and the satellites in view shared between them. The first scenario in the analysis (see Table 3) is the 150-point sample survey in the Sucre Canton (Bahía de Caráquez); the results show an acceptable level of agreement between the Static Post-Process Differential and Real-Time NTRIP methods. In some cases, the difference in the horizontal component reaches the millimeter order, considering that the Differential Static Positioning had an occupation of 30 min per point, whereas with the Real-Time NTRIP technique the occupation was only 60 s. Both methods show a similar accuracy level. The horizontal component (N, E) has less error than the vertical component (ellipsoidal height h). Table 4 shows some results obtained by comparing the Static Post-Process Differential and Real-Time NTRIP methods in Bahía de Caráquez city (Fig. 8). The second scenario is proposed in the central zone of the country, specifically the cities of Ambato, Riobamba, and Latacunga, to evaluate the precision achieved using the Real-Time
Table 3. Scenario characteristics in Bahía de Caráquez

Rover equipment used:                       Trimble R8 GNSS L1/L2 GPS + GLONASS
Coordinates:                                UTM zone 17 South, ITRF08, reference epoch 2016.4
NTRIP caster server:                        REGME_NTRIP
REGME mount point used as base stream:      CHONE - ONEC
Distance from the base to the study area:   38 [km]
Correction message stream/version:          RTCM 3.0
Data recording time per point:              60 seconds
Fig. 8. Distance between ONEC reference station and Bahía de Caráquez city
correction technique with the RTCM 2.3 and RTCM 3.0 formats. For demonstration purposes, only the correction case between the cities of Ambato and Riobamba is analyzed. For the complete reference maps and results, see [18]. Figure 9 shows the GNSS reference station named EREC, located in the city of Riobamba, and the furthest observation point in the city of Ambato, 49.5 [km] away. It is observed (see Table 5) that the Real-Time correction at the measurement point in the city of Ambato with respect to the EREC station achieves, with the RTCM 2.3
Table 4. Difference between Post-Process and Real-Time results for Bahía de Caráquez city

            N       E       h
Mean        0.011   0.011   0.029
Max         0.036   0.033   0.055
Min         0.001   0.000   0.001
Std. dev.   0.010   0.009   0.016
Fig. 9. Distance between EREC GNSS base station in Riobamba and Ambato cities
Table 5. Observed correction in Ambato with respect to the EREC GNSS base station, RTCM 2.3 and RTCM 3.0

Format     E [m]        N [m]         Elevation [m]   HRMS [m]   VRMS [m]
RTCM 2.3   765163.028   9864514.592   2595.585        0.014      0.026
RTCM 3.0   765163.017   9864514.592   2595.593        0.011      0.018
format, an HRMS (horizontal error) of 1.4 [cm] and a VRMS (vertical error) of 2.6 [cm], while with the RTCM 3.0 format an HRMS of 1.1 [cm] and a VRMS of 1.8 [cm] are obtained [18]. The survey is repeated at several points around the city of Ambato, increasing the radius by 1 [km] and taking the GNSS reference station as the pivot. The accuracy reached in the corrections of each point with respect to both GNSS reference stations, EREC and CXEC, in the cities next to Ambato is acceptable; the results support the creation of a reference accuracy map for the central Andean zone (see Fig. 10), applying the Real-Time correction technique for each RTCM format. The Principal Caster located at IGM is the server that provides access to the Real-Time corrections. After some tests in the outdoor environment, the experiences, and
Fig. 10. Riobamba’s Accuracy reference map with EREC GNSS reference station
results, some feedback, conclusions, and recommendations regarding the use of REGME-IP can be stated. The Real-Time correction technique is a service based on correction data transmitted over the Internet Protocol. The rover device must maintain a constant and stable connection through an Internet Service Provider (ISP) of the user's preference. The Real-Time positioning method using the NTRIP protocol does not replace the static differential positioning method; it is a different technique that gives greater productivity in the outdoor survey environment. The static post-process differential method provides greater accuracy than the NTRIP Real-Time method but requires more processing time to deliver the results. The advantage of the Real-Time technique is that high accuracy can be achieved with a shorter GNSS positioning time; the results for the horizontal and vertical components are obtained immediately, without post-process correction. The Static Differential method requires 30 min of positioning per point in the outdoor environment, while the Real-Time NTRIP method requires only 60 s per point. The horizontal component (N, E) has less error than the vertical component (ellipsoidal height h); for the vertical component, an error two to three times the horizontal error is expected. Statistical results from the 150-point sample indicate an acceptable level of agreement between the Static Post-Process Differential and Real-Time NTRIP methods. In some cases, the difference in the horizontal component reaches the order of millimeters,
taking into account that the Differential Static Positioning had an occupation of 30 min per point, compared to 60 s with the Real-Time NTRIP technique, and at different distances from the REGME station used as the mount point, from which the RTCM 3.0 differential correction messages are generated. The REGME-IP Real-Time service is available to users at the national level, under their own responsibility, with operational equipment and Internet access through an ISP. The Principal and Backup casters have been operational since January 2019, initially working with the BKG NtripCaster software, Standard version; the project was planned in installation, testing, and bug-fixing stages, and constant monitoring prevented damage to the server and rovers that could compromise the service's quality and continuity. In 2020, the IGM plans to provide Real-Time positioning as a national service to benefit Ecuadorian citizens; this demands updating to the BKG NtripCaster software, Professional version, to support all the technical requirements and provide a quality service, including more simultaneous user and station connections, network management, and security and quality policies. Some tools have been created to facilitate the use of the service; the web site http://www.geoportaligm.gob.ec/ is available with information on how to register, access the Real-Time service, and consult the official documentation generated by the responsible entity. The collaboration between public entities and universities represents the effort, support, and teamwork needed to reach larger objectives through knowledge transfer from their researchers for the benefit of Ecuadorian society.
9 Future Work
The DGPS technique consists of eliminating common errors produced by the ionosphere and troposphere; the correction can reduce the error and increase the accuracy of the final solution to the centimeter level; however, the correction of orbital errors is not considered. Future work includes improving the REGME-IP service, adding a Precise Point Positioning (PPP) service, and interconnecting caster servers from international projects. The full operation of the new GNSS constellations demands an analysis of the Real-Time technique to upgrade the service to multi-constellation support.
References 1. EUREF Permanent GNSS Network. http://www.epncb.oma.be/_organisation/WG/previous/ euref_IP/index.php. Accessed 26 Jul 2020 2. BKG GNSS data center. https://kb.igs.org/hc/en-us/articles/115002196848-Real-Time-Ser vice-Fact-Sheet-2014. Accessed 26 Jul 2020 3. Sistema de Referencia Geocéntrico de las Américas. http://www.sirgas.org/fileadmin/docs/ Boletines/Bol15/18_Hoyer_et_al_Reporte_SIRGAS_RT.pdf. Accessed 26 Jul 2020 4. Proyecto SIRGAS Tiempo Real. www.fceia.unr.edu.ar/gps/caster. Accessed 28 Jul 2020 5. Zabala, M.: Implementación del caster experimental para la distribución de medidas de GPS en tiempo real a través de NTRIP. In: Conference 2018, Vol. 13, pp. 117–120. Congreso de Ciencia y Tecnología ESPE (2018) 6. Geoportal REGME. http://www.geoportaligm.gob.ec/wordpress/. Accessed 28 Jul 2020
7. Federal agency of cartography and geodesy. https://igs.bkg.bund.de/NTRIP/about. Accessed 28 Jul 2020 8. Zabala, M.: Implementación del caster experimental para la distribución de medidas de GPS en tiempo real a través de NTRIP. In: Conference 2018, Vol. 13, pp. 117–120, Congreso de Ciencia y Tecnología ESPE (2018) 9. Hoyer, M.: Conceptos Básicos del Posicionamiento GNSS en Tiempo Real. In: SIRGAS-RT, Conference (2002) 10. Bonilla, L., Damián, C.: Diseño de un mapa de referencia de corrección en observaciones GPS considerando las restricciones de rango y mensajes de RTCM de DGPS del caster experimental medido en la zona centro del país, Tesis, 35–36, Escuela Superior Politécnica de Chimborazo (2020) 11. RTCM: Recommended standard for differential GNSS (Global Navigation Reference System) Service Version 2.3 (2001) 12. RTCM: Recommended Standard for Differential GNSS (Global Navigation Reference System) Service Version 3.0 (2016) 13. Franklin, L., Ortega, A., Zabala, M.: Análisis e implementación de diferencial GPS en configuración simple y doble. Vol. 7, N° 1, pp. 46–53, Noviembre 2017. https://doi.org/10.24133/ maskay.v7i1.343 14. Infraestructura y Cobertura, Boletín N° 2020–02, Agencia de Regulación y Control de las Telecomunicaciones (ARCOTEL). https://www.arcotel.gob.ec/wp-content/uploads/2015/01/ BoletinEstadistico-May2020-AMS-CoberturaInfraestructura.pdf. Accessed 15 Sept 2020 15. EUSKADI, Conexión a la red GNSS a través de Internet (2012) Last Access. http://www. gps2.euskadi.net/internet.php 16. Carranza, A., Reyes, J., Zabala, M.: Análisis e implementación de diferencial de GPS en tiempo real a través de la tecnología NTRIP para la EERSA”, Tesis, Escuela de Ingeniería en Telecomunicaciones y Redes, ESPOCH, Riobamba, Ecuador (2017) 17. Geoportal. http://www.geoportaligm.gob.ec/geodesia/index.php/informacion-serviciosntrip/. Accessed 19 Sept 2020 18. Bonilla, L., Damiá,n C.: Diseño de un mapa de referencia de corrección en observaciones GPS considerando las restricciones de rango y mensajes de RTCM de DGPS del caster experimental medido en la zona centro del país, Tesis, 46–47, Escuela Superior Politécnica de Chimborazo (2020)
Physical Variables Monitoring to Contribute to Landslide Mitigation with IoT-Based Systems
Roberto Toapanta1(B), Juan Chafla1, and Angel Toapanta2
1 Pontificia Universidad Católica del Ecuador, EC17012184 Quito, Ecuador
[email protected], {jchafla390,rctoapanta}@puce.edu.ec
2 Instituto de Investigación Geológico y Energético, EC170518 Quito, Ecuador
[email protected]

Abstract. This document proposes the design of a low-cost platform that contributes to the monitoring of the physical variables involved in the process of generating landslides through a low-power wide-area network. The system consists of three components. The first is the electronic module in charge of integrating sensors, sampling, digitizing, and sending data. The second is the LoRa wireless communication system that links all the nodes located in strategic sites in the study area with the gateway; the bidirectional messages (data) are then sent from the gateway to the server hosted in the cloud through the MQTT protocol, where all the data acquired from the sensor network and the web server are stored. The third component is the display of information to the user through HTTP requests from any multi-platform device. The platform generates a large amount of data that will serve as input for future investigations related to landslides. This design will allow the generation of more reliable early warning systems.

Keywords: Internet of Things · MQTT protocol · Cloud computing · landslides · LoRa

1 Introduction
The development of the Internet has generated significant changes in people's lives [1] and, with it, the introduction of new technological applications that have revolutionized the world, as is the case of the Internet of Things (IoT) [2]. In this respect, anything in the physical world of interest to be observed and controlled by human beings will be connected and will offer services through the Internet [3]. IoT offers benefits that traditional cities can adopt in order to transform into modern, smart cities. These smart cities use connectivity, geographically distributed environmental sensors, and computerized intelligent management systems to solve immediate problems and to anticipate, mitigate, and even prevent crises. These mechanisms make it possible to proactively offer better services, alerts, and information to citizens [4].
Large cities and metropolitan areas are considered complex systems with various challenges. Therefore, urban planning and the development of dynamic decision mechanisms that consider citizen participation processes’ growth and inclusion are becoming increasingly important [4]. In this context, according to the World Bank [5], 90% of urban expansion in developing countries is in areas close to risk zones, resulting in informal and unplanned settlements. The interaction between humans and nature throughout history has caused considerable material and human losses [6]. Quito, study case, has a population of around 2,781,641 inhabitants [7]; due to its geographical and climatic characteristics, Quito can be affected by natural phenomena such as landslides. These phenomena are of short duration but potentially destructive, and among their causes are geological, hydrological, and anthropogenic effects. Quito’s geography has about 11 significant ravines according to a study by [8], and they are areas of most significant risk from landslides, e.g., the event in the Yacupugro ravine, which was affected by a landslide on March 22, 2019, causing economic losses to the closest inhabitants [9]. The dynamic and unpredictable behavior of nature influences the complexity of developing the monitoring infrastructure of environmental variables needed for early warning systems. Rain and earthquakes are factors identified as triggers for the origin of landslides. Under these conditions, the sensory component is responsible for measuring variables such as ambient temperature, relative humidity, soil moisture, precipitation, wind, and soil movement. Likewise, it is essential to develop devices that allow us to obtain signals in real-time and to handle massive amounts of data with accuracy, integrity, as well as to ensure the transmission of data with low latency and efficient use of energy. The proposed technological infrastructure based on IoT technology will allow the monitoring of physical variables associated with landslides, contributing to improving early warning systems [10]. 1.1
Related Research Review
With his study in Southern California, Russell H. Campbell foresaw the possibility of forecasting landslides produced by rainfall. He suggested that three main elements could implement Landslide Early Warning Systems (LEWS): 1. A system of rain gauges to record rainfall every hour. 2. A meteorological mapping system on the rain gauges’ location allows determining the centers of high-intensity rain, thus generating slope maps or topographic maps. 3. An information management system includes collecting, transmission, and administration of information from the different measuring instruments. His study has been the key to any modern scientific attempt at predicting and early warning of landslides [11]. [12] proposes an intelligent IoT system based on a disaster management platform with data integration for monitoring, emergency response coordination. This platform comprises the wireless sensor network with information exchange through web service and disaster notification by mobile phone. [13] describes a system for predicting the probability of flooding in a river basin based on IoT that integrates machine learning. The model uses a modified
sensor network over ZigBee for data collection and a GPRS module for data transmission over the Internet; a neural network analyzes them. A study carried out by [14] developed a system of interconnected intelligent modules to facilitate data acquisition; these modules are composed of multiple sensors useful for monitoring smart cities and disaster management. Since measurement systems are lost in a landslide, selecting a low-cost and efficient system becomes a challenge, so [15] presents an early warning system for land-slides low-cost WiFi sensor networks using micro-controllers. In the research of [16], they present a prototype of an early warning system for landslide risks using technologies related to sensor networks (WSN), reuse and adaptation of web services through API RESTful hosted on the Web of Things (WoT) and Big Data. [17] describes a comprehensive framework for monitoring and early warning of landslides using TDR (Time Domain Reflectometry). He details an updated procedure for installation, data acquisition, identification of displacement deformities, and processing of early warnings.
2
Materials and Methods
The proposed design for the acquisition and monitoring of physical variables that influence landslides is structured as follows:
– a sensory component
– data acquisition
– an embedded system
– a data communication block
– an energy source (see Fig. 1).
Fig. 1. Hardware description [18].
Sensory Component. One of the essential components of IoT is the sensors; each node has integrated sensors that monitor its local environment, turning the physical parameter into an electrical signal. The most relevant requirements for most sensors are accuracy, resolution, and sensitivity. In addition, these sensors must be low-cost, since they can be lost in a landslide. The system is composed of the sensors shown in Table 1.

Table 1. Sensor types.

Sensor type              Sensor
Environmental sensing    Temperature, humidity, wind speed, precipitation, soil moisture
Location                 Global positioning system
Motion                   Geophone

Temperature and Relative Humidity Sensor. The DHT-22 (AM2302) sensor acquires temperature and humidity data; its output is a calibrated digital signal with a sampling period of 2 s. It has a resolution of 0.1 °C for temperature and 0.1% RH for relative humidity, with low power consumption.
Wind Speed Sensor. This sensor measures wind speed accurately; it starts the measurement when wind acts on its hemispherical cups, generating analog voltage values proportional to the wind speed. This analog signal enters the microcontroller's 10-bit analog channel, where it is digitized to deliver the speed value in m/s. The speed sensor used in this investigation has the following specifications:
– Style: three cups
– Supply voltage: 9–24 VDC
– Resolution: 0.1 m/s
– Output signal: 0–5 V
– Measurement range: 0–30 m/s
The wind speed is calculated, according to [19], as described in Eq. (1):

Ws = 6 × Vo    (1)
where Ws is the wind speed (in m/s), obtained as the product of the conversion factor and the output voltage, and Vo is the analog output voltage (in volts).
Precipitation Sensor. The precipitation sensor, or rain gauge, is responsible for collecting and measuring the precipitation in the study area. It uses a mechanism of calibrated tipping buckets of identical proportions that indicate the collected precipitation volume through a switching action. The tipping-bucket mechanism empties the collected rain from one of the buckets and makes the other bucket
ready to continue the measurement process [20]. With a pull-up resistor, this action changes the state of the digital signal. The main characteristics of the rainfall sensor are:
– Accuracy: ±1% at 5.1 in/hr
– Output: 0.1-s switch closure
– Maximum contact rating: 3 W, 0.25 A, 24 Vdc
Soil Moisture Sensor. A low-cost, low-power-consumption YL-692 module (3.3 V–5 V) detects the percentage of soil moisture in the study area. This module delivers an analog value. To calculate the moisture percentage corresponding to the sensor reading, Eq. (2) is applied:

M = ((1023 − Ao) / (1023 − 395)) × 100    (2)

where M is the moisture percentage and Ao is the analog value of the sensor [21].
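As a minimal illustration of Eqs. (1) and (2) above (a sketch, not firmware from the actual nodes), the following functions convert raw readings into wind speed and soil-moisture percentage; the 10-bit ADC range (0-1023), the 0-5 V input, and the calibration constants are the ones given in the text, while the sample readings are invented for the example.

```python
def wind_speed_ms(v_out: float) -> float:
    """Eq. (1): wind speed in m/s from the anemometer output voltage (0-5 V)."""
    return 6.0 * v_out

def soil_moisture_percent(a_out: int) -> float:
    """Eq. (2): moisture percentage from the soil-moisture module's 10-bit ADC value."""
    return (1023 - a_out) / (1023 - 395) * 100.0

# Example readings (illustrative values only)
adc_counts = 512                      # raw 10-bit reading from the anemometer channel
v_out = adc_counts * 5.0 / 1023.0     # convert counts to volts for a 0-5 V input
print(f"wind speed: {wind_speed_ms(v_out):.1f} m/s")
print(f"soil moisture: {soil_moisture_percent(700):.1f} %")
```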
Soil Motion Sensor. The motion sensors (geophones) detect the vibrations caused by landslides. Geophones are the signal receivers most used in surface-wave studies; this sensor is composed of an inertial mass suspended from a spring, with a magnet attached to the geophone case and a coil used as a proof mass. The movement of the magnet inside the coil induces an electromagnetic field that produces a potential difference (voltage) proportional to the movement. The geophone does not need any external power supply [22], which benefits the IoT framework, mainly in terms of energy management. The main characteristics of the geophone are:
– Low-distortion vertical geophone
– Spurious frequency above 240 Hz, allowing full bandwidth at 2-ms sampling
– Sensitivity of 28.8 V/m/s
Global Positioning System (GPS). The L80 GPS is an ideal module for IoT because of its low power consumption; it obtains the position of an end node. The information is extracted from the recommended minimum position data of the $GPRMC sentence of the standard NMEA 0183 protocol [23]. Table 2 shows the fields required to obtain the position of the end node.

Table 2. Structure of the GPRMC sentence.

Field   Description                                Format
3       Latitude                                   ddmm.mmmm (degrees and minutes)
4       Lat. hemisphere (N = North, S = South)     N/S
5       Longitude                                  dddmm.mmmm (degrees and minutes)
6       Long. hemisphere (E = East, W = West)      E/W
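As a sketch of how fields 3-6 of Table 2 are turned into a usable position (illustrative only), the following example parses the latitude/longitude fields of an NMEA 0183 RMC sentence into decimal degrees; the sample sentence is a generic textbook-style one, not a log captured from the deployed L80 module.

```python
def dm_to_decimal(value: str, hemisphere: str) -> float:
    """Convert NMEA ddmm.mmmm / dddmm.mmmm plus hemisphere into decimal degrees."""
    dot = value.index(".")
    degrees = float(value[: dot - 2])     # everything before the two minute digits
    minutes = float(value[dot - 2:])
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal

def parse_gprmc(sentence: str):
    """Return (lat, lon) in decimal degrees from fields 3-6 of a $GPRMC sentence."""
    fields = sentence.split(",")
    return dm_to_decimal(fields[3], fields[4]), dm_to_decimal(fields[5], fields[6])

# Generic illustrative sentence (not captured from the deployed nodes):
example = "$GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A"
print(parse_gprmc(example))   # -> approximately (48.1173, 11.5167)
```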
Actuator. The actuator is an element that performs an action determined by a control device. In this case, a loudspeaker emits an audible signal to alert people near the source of risk. The triggers for these alerts will depend on the processed data. This actuator requires a control signal given by the embedded system port.
Data Acquisition. Each end node has a signal conditioning stage that depends on the number of connected sensors, with a maximum of seven sensors, and allows amplification, analog-to-digital conversion, and filtering.
External Analog-to-Digital Converter. The ADS1115 converter improves the resolution available to the Arduino embedded system to 16 bits, with up to 860 samples per second, low power consumption, and an I2C serial data bus. For example, the signal delivered by the geophone enters the ADS1115 converter differentially, where it is digitized and a programmable gain is applied to amplify the input signal.
Open-Source Embedded Systems. IoT applications use the Arduino open-source embedded system (hardware and software) because of its low cost and its C- and C++-based programming [24]. This component is in charge of executing the logic (algorithms) related to the acquisition, conversion, quantification, and communication of the sensor values.
Energy Source. Deploying end nodes in a remote and hostile region is a limiting factor for mobility and power supply [25]. A large part of the total energy is consumed by signal processing and wireless communication [26]. To counteract this challenge, and with the idea of keeping data acquisition online at all times, two methods are used (see Fig. 1). In the first one, the system uses power management as an exploitation technique, which consists of automatically setting an end node to standby mode when it is inactive for a specific time, reducing the node's energy consumption [26]. In the second one, the system uses the sun as an alternative energy source to harvest energy continuously. This approach is suitable (due to the fixed infrastructure) for supplying energy to each node [21]. Solar cells are responsible for harvesting the sun's energy and storing it in batteries to maintain the energy supply at all times.
2.1
Network
LoRa Gateway. It connects the LoRa wireless network to an IP network via WiFi, Ethernet, or 3G/4G cellular; these interfaces provide a flexible way to connect sensor networks to the Internet. LoRa wireless allows users to send data over long ranges at low data rates. It provides ultra-long-range spread-spectrum communication and high interference immunity. It also supports 50 to 100 sensor nodes [27]. LoRa uses the unlicensed ISM band and, due to the geographical location and the corresponding regulations, it is configured in the
915 MHz band. Also, a low Spreading Factor (SF) was selected so that the transmission time is not very long, which saves energy by avoiding long transmission times.
2.2
Communication Protocol
MQTT - Message Queue Telemetry Transport. MQTT is a publish/subscribe messaging transport protocol built on top of TCP to provide ordered, lossless, bidirectional connections. It is used because it is lightweight, open, and designed for low-bandwidth, high-latency networks, and it guarantees a certain degree of reliability in the delivery [21,28–30]. MQTT involves three roles: Publisher, Broker, and Subscriber. The Broker is the intermediary between Publishers and Subscribers; there is no need to directly identify a publisher and a consumer by physical aspects (such as an IP address) [29]. MQTT allows simpler and more straightforward data acquisition. MQTT forwarding has two options, uplink and downlink (see Fig. 2): in the uplink, the sensor node sends data to the LoRa gateway via LoRa wireless; the gateway processes these data and forwards them to a remote MQTT broker hosted in the cloud (AWS) through the Internet. For the downlink, the gateway subscribes to a topic in the MQTT broker; when there is an update on the topic, the gateway is notified and broadcasts the data to the local LoRa network [31].
Fig. 2. General MQTT structure [27].
The quality of service (QoS) in MQTT is defined and controlled by the sender, which means that each sender can have a different policy [29]. In order to reduce bandwidth usage, QoS = 0 is selected.
The developed software allows registering unique topics for each end node in a hierarchical way, and the MQTT broker only processes the data of end nodes that were previously registered (see Fig. 3):
Topic format: System Name/Station Code/Gateway Code/Node Code/option
Example: EnvironmentalMonitoring/QUPI/QUPI-GW1/QUIPI-ND1/data
Fig. 3. Example of hierarchical structure topic.
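As a sketch of the publish side of this scheme (illustrative only), a node or gateway could publish a reading on its registered hierarchical topic with QoS 0 as shown below. The broker address and JSON payload are placeholders, and the paho-mqtt client library is an assumption; the paper does not name the client library actually used.

```python
import json
import paho.mqtt.client as mqtt   # assumed client library, not named in the paper

BROKER_HOST = "broker.example.com"                                # placeholder broker
TOPIC = "EnvironmentalMonitoring/QUPI/QUPI-GW1/QUIPI-ND1/data"    # registered topic

payload = json.dumps({"temperature": 14.2, "humidity": 81.5})     # illustrative reading

client = mqtt.Client()
client.connect(BROKER_HOST, 1883, keepalive=60)
client.publish(TOPIC, payload, qos=0)   # QoS 0, as selected to reduce bandwidth usage
client.disconnect()
```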
2.3
Software Description
The application is developed under the MVC pattern (Model - View - Controller) so that the code is ordered, facilitating the maintenance and continuous development by any developer. We design the software using the PHP Laravel 7.0 framework in the backend and VueJs in the frontend. The application contains middleware to filter HTTP requests so that the access to each module is limited to each registered user by their assigned role in the application. It also contains security for cross-site request forgery (CSRF) attacks to avoid unauthorized requests through registered users. The application consists of five modules: Monitoring, Reports, Configuration, Inventory, and Administration (see Fig. 4).
Fig. 4. User interface.
2.4
Architecture
The architecture proposed in this research consists of three layers, each fulfilling a specific function: the IoT Platform Side, the AWS Server Side, and the Client Side (see Fig. 5).
IoT Platform Side. It is mainly in charge of the interaction between the gateway and each end node (sensory component) through the LoRa wireless network.
AWS Server Side. This component uses cloud services; Amazon Web Services (AWS) is used to perform the cloud computing. The AWS services include storage, a web server, and a data lake, which provide scalability, high availability, and fault tolerance, given the criticality of the IoT application.
Client Side. Through user HTTP requests to the web server, it displays all the information obtained from the sensory part, so the user can visualize it on different devices such as cell phones, tablets, and computers.
Fig. 5. IoT-based landslide monitoring architecture [32].
3
Results and Discussion
Figure 6 shows the three nodes installed approximately 350 m from the gateway. The cellular network enables communication between the gateway and the cloud.
Fig. 6. Study area.
Figure 7 illustrates the connection and subscription of the software, called SlandioT, to the MQTT broker, and the result of processing a message from an unregistered device. The software discards these messages.
Fig. 7. Messaging processing from unregistered device.
Figure 8 shows the result of processing a message from a registered device and storing the message's data, proving that only data from registered devices are stored.
Fig. 8. Messaging processing from registered device.
The data from the installed sensors can be displayed both in real time (see Fig. 9) and as historical data (see Fig. 10). According to the results from the study area, low-cost nodes implemented with a LoRa network and energy independence (energy harvesting) are viable for this type of application in remote and hostile areas. Also, their low implementation cost allows their replacement and, therefore, continuous monitoring.
Fig. 9. Humidity trend.
Fig. 10. Temperature trend.
Based on the environmental conditions of the study area, we establish a low spreading factor (SF) in order to encode more data per second, which implies less transmission time and less energy consumption. The data obtained from each installed node's sensors are stored in the cloud to build a data lake that can serve as input for future research to determine the correlation between the measured parameters and the generation of landslides, in order to develop reliable early warning systems.
4
Conclusions
Over the years, landslides have caused countless losses of human life and property due to the lack of an early warning system. The main contribution of this research is the use of open-source and low-cost hardware and software that allows
monitoring the variables involved in the generation of landslides through sensors that capture the data in real time; with the LoRa wireless network, all the nodes located in strategic sites communicate with the gateway in the free 915 MHz ISM band. The MQTT protocol manages the publication and subscription of the messages sent by each node to the cloud. Due to the criticality of monitoring variables that trigger landslides, Amazon Web Services (AWS) is used to maintain high system availability with a low failure rate. Cloud computing provides all the services, such as storage, applications, database, and web server hosting. The power supply for each system component relies on energy management and energy harvesting to maintain continuity of monitoring.
References 1. Cobos, A.: Dise˜ no e implementaci´ on de una arquitectura IoT basada en tecnolog´ıas. Open source (2016) 2. Zhou, Q., Zheng, K., Hou, L., Xing, J., Xu, R.: Design and implementation of open LoRa for IoT. IEEE Access 7, 100649–100657 (2019). https://doi.org/10. 1109/access.2019.2930243 3. Tsiatsis, V., Karnouskos, S., H¨ oller, J., Boyle, D., Mulligan, C.: Origins and IoT landscape. Internet Things 9–30 (2019). https://doi.org/10.1016/b978-0-12814435-0.00013-4 4. Bouskela, M., Casseb, M., Bassi, S., De Luca, C., Marcelo, F.: La ruta hacia las smart cities: Migrando de una gesti´ on tradicional a la ciudad inteligente (2016) 5. Banco Mundial: Desarrollo urbano: panorama general. https://www. bancomundial.org/es/topic/urbandevelopment/overview 6. Statista: Los desastres naturales en el mundo - Datos estad´ısticos. https://es. statista.com/temas/3597/desastres-naturales/ 7. INEC-Instituto Nacional de Estad´ısticas y Censos: Proyecciones Poblacionales. https://www.ecuadorencifras.gob.ec/proyecciones-poblacionales/ 8. Arg¨ uello, A., Arboleda, D., Menoscal, J., Maldonado, D., Urresta, S.: Monitoreo de la reforestaci´ on en las quebradas en el Norte de Quito. Enfoque UTE 3, 42–63 (2012). https://doi.org/10.29019/enfoqueute.v3n2.4 9. Secretar´ıa General de Seguridad y Gobernabilidad: Informe T´ecnico de Evento/Emergencia Barrio Pinar Alto, pp. 5–10 (2019) 10. International Telecommunication Union: Disruptive technologies and their use in disaster risk reduction and management, pp. 1–39. International Telecommunication Union (2019) 11. Campbell, R.H.: Soil Slips, Debris Flows, and Rainstorms in the Santa Monica Mountains and Vicinity, Southern California. U.S. Geological Survey professional paper 851, 51 p. (1975) 12. Liaw, T.L.D.: Development of an intelligent disaster information- integrated platform for radiation monitoring (2014). https://doi.org/10.1007/s11069-014-1565-x 13. Mitra, P., Ray, R., Chatterjee, R., Basu, R., Saha, P., Raha, S., Barman, R., Patra, S., Biswas, S.S., Saha, S.: Flood forecasting using Internet of Things and artificial neural networks. In: 7th IEEE Annual Information Technology, Electronics and Mobile Communication Conference, IEEE IEMCON 2016 (2016). https://doi.org/ 10.1109/IEMCON.2016.7746363
14. Sakhardande, P., Hanagal, S., Kulkarni, S.: Design of disaster management system using IoT based interconnected network with smart city monitoring. In: 2016 International Conference on Internet of Things and Applications, IOTA 2016, pp. 185–190 (2016). https://doi.org/10.1109/IOTA.2016.7562719 15. Biansoongnern, S., Plungkang, B., Susuk, S.: Development of low cost vibration sensor network for early warning system of landslides. Energy Procedia 89, 417–420 (2016). https://doi.org/10.1016/j.egypro.2016.05.055 16. Omar, M., Foughali, K., Fathallah, K., Frihida, A., Claramunt, C.: Landsliding early warning prototype using MongoDB and Web of Things technologies. Procedia Comput. Sci. 98, 578–583 (2016). https://doi.org/10.1016/j.procs.2016.09.090 17. Chung, C.C., Lin, C.P.: A comprehensive framework of TDR landslide monitoring and early warning substantiated by field examples. Eng. Geol. 262, 105330 (2019). https://doi.org/10.1016/j.enggeo.2019.105330 18. Tan, Y.K.: Energy Harvesting Autonomous Sensor Systems: Design, Analysis, and Practical Implementation. CRC Press, Boca Raton (2013) 19. DFROBOT: Wind Speed Sensor Voltage. https://wiki.dfrobot.com/Wind Speed Sensor Voltage Type 0-5V SKU SEN0170 20. Alcides, P., Le´ on, J., Arlex, I., Andalia, I.: Captaci´ on de lluvia con pluvi´ ografos de cubeta y su postprocesamiento. Ing. Hidr´ aulica y Ambient 34, 73–87 (2013) 21. Hassan, Q.F.: Internet of Things A to Z: Technologies and Applications. WileyIEEE Press, Piscataway (2018) 22. Dai, K., Li, X., Lu, C., You, Q., Huang, Z., Wu, H.: A low-cost energy-efficient cableless geophone unit for passive surface wave surveys. Sensors 15, 24698–24715 (2015). https://doi.org/10.3390/s151024698 23. Quetec: L80 GPS protocol specification (2014) 24. Mohamed, K.S., Mohamed, K.S.: The Era of Internet of Things. Springer, Cairo (2019). https://doi.org/10.1007/978-3-030-18133-8 25. Ramesh, M.V.: Design, development, and deployment of a wireless sensor network for detection of landslides. Ad Hoc Netw. 13, 2–18 (2014). https://doi.org/10. 1016/j.adhoc.2012.09.002 26. Siu, C.: IoT and Low-Power Wireless: Circuits, Architectures, and Techniques. CRC Press, Boca Raton (2018) 27. Dragino: LG01N/OLG01N LoRa Gateway User Manual (2020) 28. Tantitharanukul, N., Osathanunkul, K., Hantrakul, K., Pramokchon, P., Khoenkaw, P.: MQTT-topics management system for sharing of open data. In: 2017 International Conference on Digital Arts, Media and Technology, pp. 62–65 (2017). https://doi.org/10.1109/ICDAMT.2017.7904935 29. Lea, P.: Internet of Things for Architects: Architecting IoT Solutions by Implementing Sensors, Communication Infrastructure, Edge Computing, Analytics, and Security. Packt Publishing, Birmingham (2018) 30. Ejaz, W., Anpalagan, A.: Internet of Things for smart cities: technologies, big data security (2019). https://doi.org/10.1007/978-3-319-95037-2 1 31. Dragino: MQTT Forward Instruction - Wiki for Dragino Project. http://wiki. dragino.com/index.php?title=MQTT Forward Instruction 32. Alqinsi, P., Edward, I.J.M., Ismail, N., Darmalaksana, W.: IoT-based UPS monitoring system using MQTT protocols. In: Proceeding of the 2018 4th International Conference on Wireless and Telematics, ICWT 2018, pp. 1–5 (2018). https://doi. org/10.1109/ICWT.2018.8527815
e-Government and e-Participation
Open Data in Higher Education - A Systematic Literature Review Ivonne E. Rodriguez-F1 , Gloria Arcos-Medina2(B) , Danilo Pástor3 Alejandra Oñate1 , and Omar S. Gómez2
,
1 Escuela Superior Politécnica de Chimborazo, Riobamba 060155, Ecuador 2 GrIISoft Research Group – Escuela Superior Politécnica de Chimborazo,
Riobamba 060155, Ecuador [email protected] 3 Universidad Nacional de Colombia, Medellín 050034, Colombia
Abstract. The open data (OD) initiative fosters open access to public information in an organization. It is a relatively new but very significant and potentially powerful approach that starts being adopted in different organizations. In the context of higher education, the adoption of this initiative starts being reported in the literature. In order to get a better understanding of open data (OD) in higher education, we conducted a systematic literature review. From a sample of 595 papers, we identified and selected 25 relevant papers for the review. Our findings suggest that open science, open government, and open university are main reasons to adopt OD. Models, frameworks, methods, and standards about OD have been adopted in higher education context. Different technological tools have been used for: data modeling, RDF data generation, data publishing, linking, and exploitation. However, some issues have been reported in the moment of adopting OD initiatives, mainly related to intellectual property, standardization, legal and operation aspects. Finally, the largest number of OD initiatives reported in the context of higher educations have been applied in European countries. Keywords: Open data · Higher education · Higher education institutions · HEIs · Transparency
1 Introduction
The adoption of open data is mainly motivated by expected benefits such as transparency and accountability, stimulating economic growth, improving the technical and operational functions of government agencies and institutions, exchanging knowledge, and encouraging innovation [1–3]. In this context, different sectors, both public and private, have developed a growing number of open data (OD) initiatives. Currently, the university needs to open its information frontiers and adopt a perspective of knowledge sharing and collaboration with other universities and the rest of the world [4]. This entails developing guidelines or policies that establish a culture of publishing information, which institutions currently hold in non-accessible formats. As OD initiatives become more common in
society, users will want to test and evaluate their usefulness before migrating their data. This creates the need to introduce frameworks, methodologies, and tools that allow data to be modeled and published. The knowledge acquired can be reused and linked in such a way that it is understandable by machines. In the field of higher education, the authors in [5, 6] highlight the power of open data, which lies in its value as an educational resource and as an engine for the development of "Open Science" initiatives. Additionally, it constitutes a tool to activate the participation of the university community in the improvement of higher education through innovation and accountability. Universities produce a significant amount of data that can be published and shared under conditions that allow its reuse; thus, it can become an essential and fundamental source of benefit for society in general. In this context, there is a lack of OD initiatives that establish a line of work for higher education institutions (HEIs) to publish their data and promote the transparency of their information. This paper therefore presents a systematic review of the literature on open data in the field of higher education. The rest of the paper is organized as follows: Sect. 2 presents an overview of open data, Sect. 3 presents the method used, and Sect. 4 presents the results; finally, Sect. 5 discusses the findings and presents the conclusions.
2 Open Data Overview Open data (OD) is a widely accepted practice and stream for publishing and sharing data. It has also been seen as a tool that supports the open government doctrine, which holds that citizens have the right to access government information (documents) to allow for effective public oversight. According to the International Open Data Charter [7], open data is digital data that is available with all the necessary technical and legal characteristics and can be used, reused, and redistributed freely, anytime, anywhere. Technology has driven an exponential increase in the quantity and variety of data, giving rise to the so-called data revolution. The data revolution is also an approach considered within the “Agenda 2030 for Sustainable Development” of the United Nations [8]. In this context, OD is viewed as an input for innovation aimed at sustainable development, and it contributes to the exchange of knowledge. It is worth noting that open data is not the same as big data, but big data can be part of open data [9]. Nowadays, public sector institutions are the primary producers of open data, and the so-called Open Government Data (OGD) initiative embraces them. According to the Organization for Economic Co-operation and Development (OECD), OGD is more than an open data type; it is a philosophy defined as a set of open government policies. It promotes transparency, accountability, and value creation by making government data available to society at large [10]. In the literature, we can find different works related to OD. These works discuss global methodologies and tools aimed at measuring open data initiatives (including OGD); for example, the Open Data Barometer (ODB) [11] and the Global Open Data Index (GODI) [12]. Another example is the five-star framework for Linked Open Data (LOD) qualification proposed by Tim Berners-Lee, which highlights legal and technical openness. The LOD concept promotes interoperability through the transformation of “data on the web” [13].
To fully benefit from OD, it is crucial to put information and data into a context that creates new knowledge and enables the development of services and applications; here comes into play the use of Linked Open Data (LOD) as a relevant mechanism for information management and integration under the OD umbrella. In this sense, government agencies in several countries (prominently, in the UK and the US) have adopted Linked Open Data for transparency and public information purposes, as have publishing companies such as Elsevier [14]. Open data portals have been launched for data release; most of them have adopted platforms such as the Comprehensive Knowledge Archive Network (CKAN), the Drupal Knowledge Archive Network (DKAN), and SOCRATA, among others. Also, there is a variety of metadata vocabularies used for sharing data; for example, the Data Catalog Vocabulary (DCAT) and the Dublin Core Metadata Element Set [15, 16]. The open data maturity model is another contribution, proposed by the Open Data Institute (ODI). Five areas form the base of this model: data management processes, knowledge and skills, customer attention and commitment, investment and financial performance, and strategic supervision. The model considers publication and reuse capacity as aspects of open data practice, and through a measurement matrix, it allows organizations to identify their maturity levels [17].
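To make the role of these portal platforms and metadata vocabularies concrete, the following minimal sketch shows how a CKAN-based portal is typically queried over its standard Action API; the portal URL, search term, and helper function are illustrative assumptions and do not come from the works reviewed here.

```python
# Minimal sketch, assuming a CKAN-based open data portal that exposes the
# standard Action API (package_search). The portal URL is hypothetical.
import requests

PORTAL = "https://data.example-university.edu"  # hypothetical portal URL


def search_datasets(query: str, rows: int = 5) -> list[dict]:
    """Search published datasets and print basic catalog metadata."""
    resp = requests.get(
        f"{PORTAL}/api/3/action/package_search",
        params={"q": query, "rows": rows},
        timeout=30,
    )
    resp.raise_for_status()
    datasets = resp.json()["result"]["results"]
    for ds in datasets:
        # Fields such as title, notes, and license_id correspond roughly to
        # DCAT/Dublin Core catalog metadata (dct:title, dct:description, dct:license).
        print(ds["name"], "|", ds.get("title"), "|", ds.get("license_id"))
    return datasets


if __name__ == "__main__":
    search_datasets("course")
```

Portals built on DKAN or SOCRATA also expose HTTP APIs, so the choice of platform mainly affects metadata conventions rather than the overall publish-and-consume workflow.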
3 Method The research method used in this SLR follows the guidelines presented in [18]; it consists of three main stages: 1) Planning, in which the SLR protocol is developed. This implies a rigorous and iterative process to develop the general plan of the SLR. In this stage, the research questions are stated and the objectives are defined; the sources used to perform the searches are specified, the language of the papers is considered, and the exclusion and inclusion criteria are established. 2) Execution, in which the protocol is executed and the defined search string is run on the defined sources. Document results are assessed according to the inclusion and exclusion criteria, and relevant information from relevant papers (primary studies) is synthesized and recorded. 3) Reporting, which involves reporting the SLR findings. In the following, we describe the protocol used for this SLR. 3.1 Planning The main goal of this work is to gain a better insight into open data used in higher education institutions. To fulfill this goal, we defined the following research questions: RQ1. In what application context are the relevant works framed? RQ2. Have models, frameworks, methods, or standards related to OD been reported in the context of HEIs? RQ3. What supported tools have been used for OD? RQ4. What drawbacks have been reported when implementing OD initiatives in HEIs? RQ5. In which countries have experiences of applying OD been reported in higher education institutions?
Identification and Source Selection. Among several available sources, we worked with the Scopus database. It is a well-known database that includes a wide range of scientific literature, and it offers a reliable and friendly search engine and a range of result exportation facilities [19]. Search String Definition. The search string includes different terms extracted from the research questions stated for this SLR. We combined these terms with logical operators such as “AND” and “OR”. The resulting search string was defined as follows: “TITLE-KEY-ABS((“open data” OR opendata OR open-data) AND (“higher education institutions” OR HEI OR university OR academia))” Inclusion and Exclusion Criteria. After the source for searching was selected and the search string defined, we delimited the primary studies selected in terms of the inclusion criteria (IC) and exclusion criteria (EC). The primary studies (papers) were selected based on the title, abstract, and keywords to determine their relevance. The selection of the papers took into account compliance with the following inclusion criteria: papers related to open data in HEIs written in the English language, and complete papers published until April 2020 in prestigious venues such as journals, proceedings, and book chapters subject to a peer review process. The rejection of the papers took into account the following exclusion criteria: duplicated papers, papers whose main contribution is not related to open data in HEIs, posters, non-English literature, and short communications such as letters to the editor.
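As an illustration of how criteria like these can be applied systematically to records exported from the database, the sketch below screens a CSV export; the column names and helper functions are assumptions made for the example and are not the authors' actual tooling.

```python
# Illustrative sketch only: screening exported search results against
# inclusion/exclusion criteria similar to those described above. The CSV column
# names ("Title", "Year", "Language", "Document Type") are assumptions about
# the export format, not part of the reported protocol.
import csv

PEER_REVIEWED_TYPES = {"Article", "Conference Paper", "Book Chapter"}


def passes_criteria(record: dict) -> bool:
    """Inclusion: English-language, peer-reviewed papers published up to 2020."""
    if record.get("Language", "").strip().lower() != "english":
        return False
    if record.get("Document Type", "").strip() not in PEER_REVIEWED_TYPES:
        return False
    year = int(record.get("Year") or 0)
    return 0 < year <= 2020


def screen(path: str) -> list[dict]:
    """Drop duplicates by title and keep only records meeting the criteria."""
    seen_titles: set[str] = set()
    selected = []
    with open(path, newline="", encoding="utf-8") as f:
        for record in csv.DictReader(f):
            title = record.get("Title", "").strip().lower()
            if not title or title in seen_titles:  # exclusion: duplicated papers
                continue
            seen_titles.add(title)
            if passes_criteria(record):
                selected.append(record)
    return selected
```

A scripted first pass like this can only handle the mechanical criteria; judgments such as whether a paper's main contribution actually concerns open data in HEIs still require manual reading, as described in Sect. 3.2.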
3.2 Execution In this stage, the defined protocol is executed: the search string is run on the defined sources, and the document results are assessed according to the inclusion and exclusion criteria. Relevant information from relevant papers (primary studies) is synthesized and recorded. To identify relevant documents, we conducted four phases, as shown in Fig. 1. After running the search string (April 2020), we obtained the metadata of 595 results from the Scopus database. In the initial phase, documents were selected and discarded based on title, abstract, and keywords; we selected 53 potential documents (relevant papers). In the next phase, these 53 documents were assessed against the inclusion and exclusion criteria; after this, we discarded ten documents. In the last phase, we discarded 17 papers based on their content, because their content did not help us answer any of the research questions stated in this SLR. At the end of this process, we selected 25 relevant papers for reporting the present SLR. Extraction and Synthesis Strategy. After selecting the primary studies, it was possible to start extracting the most relevant information from them, using a strategy based on a set of possible answers for each of the defined research questions. The response-based strategy allows the same data extraction criteria to be applied to all selected papers. To help answer the research questions, a synthesis was carried out, consisting of the representation of the information through text, tables, and graphics.
Fig. 1. Phases of the process followed in this Systematic Literature Review.
Quality Assessment. Each paper we identified as relevant for this SLR was evaluated to assess its quality level, according to the nine evaluation criteria shown in Table 1. This evaluation guide was adapted and agreed upon by each one of the authors of this SLR.

Table 1. Quality assessment criteria.

Assessment criteria | Scoring
- Is the paper published in an impact journal? | Yes (5 points), No (0 points)
- Year of publication | Less than 5 years (5 points), more than 5 years (2 points)
- Is the objective of the study clearly stated? | Likert scale (1 lowest, 5 highest)
- Is the context of the problem clearly described? | Likert scale (1 lowest, 5 highest)
- Does the paper contain a results section? | Likert scale (1 lowest, 5 highest)
- Is it possible to replicate the study? | Likert scale (1 lowest, 5 highest)
- Does the paper answer your own research questions? | Likert scale (1 lowest, 5 highest)
- Does the study report clear, unambiguous findings based on evidence and argument? | Likert scale (1 lowest, 5 highest)
- Is the paper well/appropriately referenced? | Likert scale (1 lowest, 5 highest)
The resulting score was used as a reference for assessing the quality of the relevant papers. None of the papers was rejected based on the evaluation criteria used. The total score of each paper was computed on a percentage scale. Table 2 shows the evaluation results of the relevant papers according to the quality evaluation criteria previously described. All the papers included in this literature review have a quality score that ranges from good to excellent. The average of this evaluation was 76%, which we consider a good enough quality level for the relevant papers.
Table 2. Quality assessment result of the relevant papers used in this SLR (columns: percentage score category, number of papers, percentage of papers).
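To make explicit how the percentage scores behind Table 2 could be computed from the Table 1 criteria, the following minimal sketch shows one plausible computation; the maximum obtainable score of 45 points (5 for journal, 5 for recency, and seven Likert items at 5 points each) is our reading of the scale, not a figure stated by the authors.

```python
# Sketch of the percentage quality score implied by Table 1. The maximum of 45
# points (journal 5 + recency 5 + seven Likert items at 5 each) is an assumption
# derived from the criteria, not a value given in the study.
MAX_SCORE = 5 + 5 + 7 * 5  # = 45


def quality_percentage(in_impact_journal: bool, recent: bool, likert_scores: list[int]) -> float:
    """Return a paper's quality score on a 0-100 percentage scale."""
    assert len(likert_scores) == 7 and all(1 <= s <= 5 for s in likert_scores)
    total = (5 if in_impact_journal else 0) + (5 if recent else 2) + sum(likert_scores)
    return round(100 * total / MAX_SCORE, 1)


# Example: a recent journal paper scoring 4 on every Likert criterion -> 84.4%.
print(quality_percentage(True, True, [4] * 7))
```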
4 Results RQ1. In what application context are the relevant works framed? We classified the university context into four categories according to the main functions of the university: 1) research, 2) teaching, 3) social responsibility, and 4) management. The first three categories are substantive functions of a university, because they contribute to increasing, transmitting, and applying knowledge [20]. The fourth category refers to organizational management, which guarantees the achievement of the university goals. The relevant works found are located in each of the categories mentioned above, according to the context in which open data is applied in a university setting. Additionally, a subclassification was necessary due to the variety of aspects related to each category. Table 3 shows the categories and their application contexts.

Table 3. Categories and application context of the relevant works found.

Category | Context | No. of papers | References
Research | Open Science | 6 | [21–25]
Research | Scholarly | 2 | [26, 27]
Teaching | Courses | 6 | [28–33]
Teaching | Library | 1 | [34]
Social responsibility | Innovation | 2 | [21, 35]
Management | Open government and open university | 10 | [4, 14, 36–43]
From the obtained results, we observe a significant contribution to the Management category: 40% (10 papers) of the selected works correspond to this group. According to these studies, universities collect the information generated by their substantive functions in management systems. They record and track information related to human resources, planning, organizational effectiveness, quality of management, and economic and financial resources. This information has begun to be made publicly available through open data, and its use in assessing the impact of open data on the governance of universities has started.
The research category represents 30% of the papers analyzed (8 papers). We observe that open science, which corresponds to the research category, accounts for 22% (6 papers) of the works that have addressed this topic. It implies scientific dissemination and shows the efforts of universities to share their research output. Other aspects related to this category concern human resources and information from the documentary databases where research output is published. Another point worth noting is the application of open data to expose open educational resources, such as the development of curricula, the design of educational strategies, and the improvement of the teaching and learning process. It comprises 26% (7 papers) of the identified works (corresponding to the Teaching category). We have identified two open innovation works based on the university's collaboration with external entities that promote research and business development and share the benefits of such cooperation. These studies represent 7% (2 papers) and correspond to the Social responsibility category. Finally, management represents 37% (10 papers). The results found largely coincide with [5], who states that there are three confluent paths through which to face open data in higher education: 1) educational development, 2) open science, and 3) transparency in all the information that the university generates. RQ2. Have models, frameworks, methods, or standards related to OD been reported in the context of HEIs? The rationale of this RQ is to gain a better insight into the higher abstraction levels related to the models, frameworks, methods, and standards used in the context of open data in higher education institutions. At the highest abstraction level, we have models, which are simplified representations of a reality; in this category, we identified five works [21, 23, 27, 30, 35]. The work in [21] proposes a model for the governance of open science and innovation at universities in a digital world. The model distinguishes four principles of open science in the digital era that guide the work of research teams at universities: transparency and accessibility of science outputs, and authorization and participation in science production. The work in [23] describes a model of open data in university-industry partnerships. This model proposes a mediated and collaborative approach between these two parties. We also found two other works that discuss the use of ontologies. The work in [30] describes an ontology that amalgamates selected ontologies and facilitates the full description of course-related information in the HEI context. Similarly, the work in [27] describes an ontology that is associated with other proposed ontologies belonging to the HEI context. The last model [35] describes transparent open data access with context awareness, which automatically suits data-access characteristics to the user's needs based on the data-access context. Descending to the next level of abstraction, we have frameworks. They help in providing the structure needed to implement a model. In this category, we found four works [4, 25, 29, 39]. For example, the work in [4] proposes a framework for university openness and co-creation performance under the umbrella of open data. Another example is found in [29], where a general framework is depicted for publishing open educational contents as linked data. In this work, the authors present a framework to integrate heterogeneous OpenCourseWare repositories and publish their metadata as linked data [29].
Another framework describes the Malaysian Technical University Network (MTUN) open data framework [39]. The last framework we identified [25] discusses open data adoption at the University of Cambridge.
At the next descending level, we have methods, nested inside frameworks. Methods provide an approach for achieving a specific goal. With regard to this category, we found four works [32, 34, 37, 42]. For example, the work in [34] describes the steps for exploring options for adding links, transforming them into specific semantics, and deploying them in the RDF standard. In [32], the authors propose a learning method dealing with the open data concept, since the availability of this data allows students to propose its use in real projects on database design. Their method focuses on the development of a design project for a database that serves to solve a problem posed around a set of open data. Students form groups, study the available open data, and propose an original scenario where they use open data to simulate a specific problem. In [42], the proposed methodology comprises six differentiated phases inspired by methods commonly used to publish linked data. Similarly, the work in [37] describes a method of using open data around a project. Finally, we have standards, which are technical documents designed to be used as a rule, guideline, specification, or definition. They are built by consensus and can be used consistently to ensure that materials, products, processes, and services are fit for their purpose. In the context of open data, we identified three works that apply the RDF (Resource Description Framework) standard [14, 31, 40]. Table 4 shows the relevant works grouped by the categories previously described.

Table 4. Number of relevant works grouped by the different abstraction categories.

Category | Number of papers | References
Model | 5 | [21, 23, 27, 30, 35]
Framework | 4 | [4, 25, 29, 39]
Method | 4 | [32, 34, 37, 42]
Standard | 3 | [14, 31, 40]
RQ3. What supported tools have been used for OD? Concerning this question, ten works were identified. To compare and analyze the technological tools used for open data, five characteristics were defined: data modeling, generation of RDF data, data publication, linking and exploitation, and software libraries for working with RDF; the comparison is shown in Table 5. These characteristics were established and adapted from the phases of specific methodologies used to publish open data, such as [29, 42, 44]. The first characteristic, data modeling, refers to the tools aimed at data modeling through ontologies or the creation of vocabularies of classes and properties. The next characteristic, generation of RDF data, indicates the tools used to generate data in the RDF standard format. The data publication characteristic describes possible platforms used to store and process data in RDF format. The next characteristic, linking and exploitation, refers to the tools that allowed the linking process, i.e., linking one data set to another. Finally, the last characteristic, software libraries for working with RDF, reveals possible libraries or software for handling and manipulating ontologies and RDF.
Table 5. Reported tools used in the context of open data in HEIs.

Tool used for: | Tool name
Data modelling | NeOn Toolkit [42], Protégé [27, 30], OpenVocab [38]
RDF data generation | D2RQ platform [38, 42], kOre framework [24], Jena scripts [29], PHP scripts [30, 31], Python scripts [27]
Data publishing | SPARQL endpoint [42], Virtuoso server [24, 27, 29, 30, 34], Apache Marmotta [30, 31], Fuseki server [40]
Linking and exploitation | Silk tool [29, 42], RESTful API [24, 31, 37]
Libraries or frameworks | Apache Jena [29], ARC2 [31]
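To give a concrete sense of the "RDF data generation" and "data publishing" phases grouped in Table 5, the following minimal sketch uses a Python script with rdflib, one of the scripting approaches listed above; the vocabulary choice, base URI, and course data are illustrative assumptions and do not reproduce any of the reviewed systems.

```python
# Minimal sketch of the "RDF data generation" phase, assuming a Python script
# with rdflib. The base URI, vocabulary choice (schema.org), and course data are
# hypothetical and chosen only for the example.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS, XSD

UNI = Namespace("http://data.example-university.edu/resource/")  # hypothetical base URI
SCHEMA = Namespace("http://schema.org/")

g = Graph()
g.bind("schema", SCHEMA)

course = UNI["course/OD101"]
g.add((course, RDF.type, SCHEMA.Course))
g.add((course, RDFS.label, Literal("Introduction to Open Data", lang="en")))
g.add((course, SCHEMA.courseCode, Literal("OD101")))
g.add((course, SCHEMA.numberOfCredits, Literal(6, datatype=XSD.integer)))

# Serializing to Turtle yields a file that can be loaded into a triple store
# (e.g., Virtuoso, Fuseki, or Marmotta) and exposed through a SPARQL endpoint,
# which corresponds to the "data publishing" row of Table 5.
print(g.serialize(format="turtle"))
```

The "linking and exploitation" phase would then add links from these resources to external datasets (the role played by tools such as Silk) and expose the data through query or REST interfaces.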
We observe very few tools aimed at data modeling; Protégé is the most common tool for carrying out this process. To generate RDF data, there are several tools and platforms; these must have some relationship with the source of the data or the platform where the data will be published. Scripts are coded in languages such as PHP and Python or built on Jena, and platforms or frameworks like D2RQ, kOre, and ConvertToRDF are also used. Regarding tools for publishing open data, we observe that the Virtuoso server is the most used, followed by Apache Marmotta and Fuseki; however, in some works the authors prefer to develop their own system or data portal. Regarding the tools for linking and exploitation, RESTful APIs are very common in the analyzed works. However, it seems that not all the works reach this phase, which relates to the goal stated by Tim Berners-Lee [13] of making five-star datasets available. RQ4. What drawbacks have been reported when implementing OD initiatives in HEIs? Most of the relevant papers agree that open data increases the transparency of an organization and that data can be reused for future analyses, thus helping to accelerate innovation and facilitate collaborative work. However, in the environment of HEIs there are drawbacks that limit the development of this topic. We have observed some drawbacks reported regarding intellectual property; as mentioned in [23], revealing R&D open data may leak valuable information to competitors. In this sense, work on intellectual property law in science projects with companies and with other research organizations [21] is needed. Regarding standardization, we observe a lack of open science standards in areas such as data governance, infrastructure, practices, publishing protocols, skills, and technical support [21]. We observe confusing publishing practices and no metadata standardization in the relevant works identified [22]. For example, the considered datasets have been developed in isolation from each other, with different modeling principles, designs, and vocabularies being used [14], thus causing a lack of standardization. We also observe legal drawbacks; jurisprudence, ethical, and commercial interests can affect whether the data can be open access (sensitive data) [22]. Current laws and guides about how to open data in different public sectors do not consider the complexity and idiosyncrasy of HEIs [37]. Another drawback we observe is related to operability when implementing open data in HEIs; for example, no infrastructure (long-term storage) is available [22], and there are insufficient information channels and a gap between researchers
and management units [22]. We also observe a lack of use and an ignorance of open data in academic activity by university teachers [33]. RQ5. In which countries have experiences of applying OD been reported in higher education institutions? We have identified 14 studies that report experiences of applying open data in public and private universities located in different countries. We observe that most of the studies describe the use of open data through portals centered on exposing and sharing information mainly about educational resources for the teaching-learning process (courses), academic information, management, and research. For example, in [37], the authors describe the OpenData4U project, aimed at the development of open data portals for universities. Universities such as the University of Alicante and the University of La Laguna in Spain [41, 42], ITMO University in Russia [24], and the Open University in the United Kingdom [40] have developed open data portals; [14] reports other examples of portals. We also found studies that describe the development of applications that adopt open data [26–28, 30–32, 34, 36]. In the open data field, entities can publish data instead of reusing it or, conversely, consume data and publish little or no information at all [17]. Most of the studies found are oriented to the publication of data [14, 24, 27, 28, 30, 31, 34, 36, 37, 40–42], whereas, to a lesser extent, works describe the reuse of existing data [26, 32]. Concerning these studies, 83% of them reported the implementation of open data initiatives in Europe, while the rest (17%) describe open data initiatives implemented in America. Table 6 shows a summary of the works found per country.

Table 6. Countries where open data in HEIs initiatives have been reported.

Country | Number of papers | References
Hungary | 3 | [28, 30, 31]
Spain | 4 | [32, 37, 41, 42]
United Kingdom | 1 | [40]
The Netherlands | 1 | [36]
Italy | 1 | [26]
Russia | 1 | [24]
Other European countries (consortium) | 1 | [14]
USA | 1 | [34]
Ecuador | 1 | [27]
5 Discussion and Conclusions With regard to our findings, we observe the existence of initiatives in the field of open data in HEIs concerning research, teaching, social responsibility, and management. Most of the relevant identified papers discuss open data from a management and research viewpoint. Proposed models, frameworks, methods, and standards related to open data initiatives in HEIs have begun to appear in the literature. These models address issues related to collaboration among HEIs and industry, the governance of open science and innovation, and teaching. Proposed frameworks and methods have begun to provide guidelines for implementing open data initiatives. With regard to standardization, the RDF standard seems to be widely adopted. Different supported tools for open data have been identified in the context of HEIs; we observe the use of tools that address different aspects of open data such as data modeling, RDF data generation, data publishing, linking and exploitation, and libraries or frameworks; the use of RESTful APIs seems to be broadly adopted as a mechanism for exposing open data. The main issues observed when embracing an open data initiative in an HEI are those related to intellectual property, standardization, legal, and operational (implementation) aspects. Finally, the largest number of open data initiatives reported in the relevant papers have been developed in European HEIs. Study Limitations. We acknowledge that a secondary study such as the one presented in this work is subject to interpretation in all its stages and steps, which conveys a risk of bias. To address this issue, we analyzed and refined our criteria for document selection. As described in the stages followed for document selection (Sect. 3), we cross-checked the documents initially assigned to each of us to decide whether or not a document should be discarded. This allowed us to reach a common consensus on the document selection process. In this sense, all borderline cases where there was doubt in the selection and analysis were discussed to further reduce bias. Although the risk of missing relevant papers was present, we consider that the documents selected for this review (relevant papers) represent a good enough sample. We did not consider grey literature, as we assume that good quality grey literature on this subject will appear as journal or conference papers; because of this, publication bias may be present, since negative results are not usually published. Another issue to discuss is that we did not consider documents published in a non-English language; although this is not a limitation in our regional context, it can be a reflection of the limitations imposed on us by the available research in this area (updated and peer-reviewed literature is published in English). The adoption of open data initiatives has begun to be reported in the context of HEIs. The adoption of open data in the higher education context will allow further transparency and accountability of the HEIs, as well as knowledge sharing and collaboration between HEIs, industry, and society. In this paper we have presented a systematic literature review of open data in the field of higher education. Our findings reported in this SLR can serve as a reference for conducting future research on this topic.
References 1. Altayar, M.S.: Motivations for open data adoption: an institutional theory perspective. Gov. Inf. Q. 35, 633–643 (2018) 2. Zuiderwijk, A., Janssen, M., Davis, C.: Innovation with open data: essential elements of open data ecosystems. Inf. Polity. 19, 17–33 (2014) 3. Welle Donker, F., van Loenen, B.: How to assess the success of the open data ecosystem? Int. J. Digit. Earth. 10, 284–306 (2017) 4. Krumova, M.: Higher education 2.0 and open data: A framework for university openness and co-creation performance. In: ACM International Conference Proceeding Series. pp. 562–563 (2017) 5. Atenas, J., Atenas, J., Havemann, L., Priego, E.: Open data as open educational resources: towards transversal skills and global citizenship. Open Prax. 7, 377–389 (2015) 6. Davies, T., Walker, S.B., Rubinstein, M., Perini, F.: The State of Open Data: Histories and Horizons. African Minds (2019) 7. Open Data Charter: International Open Data Charter. Principles, https://opendatacharter.net/ principles/, last accessed 2020/06/14 8. United Nations: A new global partnership: eradicate poverty and transform economies through sustainable development-the report of the High level Panel of Eminent Persons on the Post2015 Development Agenda (2013) 9. Open Data Institute: What is open data? https://theodi.org/what-is-open-data Accessed 20 June 2020 10. OECD: Open Government Data. http://www.oecd.org/gov/digital-government/open-govern ment-data.htmAccessed 09 July 2020 11. World Wide Web Foundation: Open Data Barometer. https://opendatabarometer.org/ Accessed 17 June 2020 12. Knowledge Foundation Open: Global Open Data Index. https://index.okfn.org/ Accessed 17 June 2020 13. Berners-Lee, T.: Linked Data - Design Issues. (2009) 14. d’Aquin, M.: On the use of linked open data in education: current and future practices. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). pp. 3–15. Springer Verlag (2016) 15. Neumaier, S., Umbrich, J., Polleres, A.: Automated quality assessment of metadata across open data portals. J. Data Inf. Qual. 8, 1–29 (2016) 16. Assaf, A., Troncy, R., Senart, A.: HDL - Towards a harmonized dataset model for open data portals. In: USEWOD-PROFILES@ ESWC. pp. 62–74. CEUR-WS, Portoroz, Slovenia (2015) 17. Open Data Institute: Open Data Maturity Model. https://theodi.org/article/open-data-mat urity-model-2/ Accessed 17 June 2020 18. Kitchenham, B.A., Budgen, D., Brereton, P.: Evidence-Based Software Engineering and Systematic Reviews. Chapman & Hall/CRC (2015) 19. Dieste, O., Grimán, A., Juristo, N.: Developing search strategies for detecting relevant experiments. Empir. Softw. Eng. 14, 513–539 (2009) 20. Mora, M.M., Espinosa, M.R., Delgado, M.R.: Approach of processes for the distribution of economic resources in public university of ecuador. Int. Res. J. Manag. IT Soc. Sci. 5, 25–35 (2018) 21. Vicente-Saez, R., Gustafsson, R., Van den Brande, L.: The dawn of an open exploration era: Emergent principles and practices of open science and innovation of university research teams in a digital world. Technol. Forecast. Soc. Change. 156, 120037 (2020)
22. Borgerud, C., Borglund, E.: Open research data, an archival challenge? Arch. Sci. 10, 1–24 (2020). https://doi.org/10.1007/s10502-020-09330-3 23. Perkmann, M., Schildt, H.: Open data partnerships between firms and universities: the role of boundary organizations. Res. Policy 44, 1133–1143 (2015) 24. Karmanovskiy, N., Mouromtsev, D., Navrotskiy, M., Pavlov, D., Radchenko, I.: A case study of open science concept: linked open data in university. Commun. Comput. Inf. Sci. 674, 400–403 (2016) 25. Riechert, M., Roberson, O., Wastl, J.: Research information standards adoption: Development of a visual insight tool at the University of Cambridge. Procedia Comput. Sci. 106, 39–46 (2017) 26. Di Iorio, A., Peroni, S., Poggi, F.: Open data to evaluate academic researchers: an experiment with the italian scientific habilitation. In: Catalano, G., Daraio, C., Gregori, M., Moed, H.F., and Ruocco, G. (eds.) Proceedings of the 17th International Conference on Scientometrics and Informetrics, ISSI 2019, Rome, Italy, September 2–5, 2019. pp. 2133–2144. ISSI Society (2019) 27. Cadme, E., Piedra, N.: Producing linked open data to describe scientific activity from researchers of Ecuadorian universities. In: 2017 IEEE 37th Central America and Panama Convention, CONCAPAN 2017. pp. 1–6. IEEE Inc. (2018) 28. Fleiner, R., Szász, B., Micsik, A.: OLOUD - an ontology for linked open university data. Acta Polytech. Hungarica. 14, 63–82 (2017) 29. Piedra, N., Tovar, E., Colomo-Palacios, R., Lopez-Vargas, J., Chicaiza, J.A.: Consuming and producing linked open data: The case of OpenCourseWare. Program. 48, 16–40 (2014) 30. Szász, B., Fleiner, R., Micsik, A.: A case study on linked data for university courses. In: 5th International Workshop on Methods, Evaluation, Tools and Applications for the Creation and Consumption of Structured Data for the e-Society (2016) 31. Piros, P., Fleiner, R., Kovács, L.: Linked data generation for courses and events at Óbuda University. In: 2017 IEEE 15th International Symposium on Applied Machine Intelligence and Informatics (SAMI). pp. 253–258 (2017) 32. Mazón, J., Lloret, E., Gómez, E., Aguilar, A., Mingot, I., Pérez, E., Quereda, L.: Reusing open data for learning database design. In: 2014 International Symposium on Computers in Education (SIIE). pp. 59–64 (2014) 33. Rivas-Rebaque, B., Gértrudix-Barrio, F., De Britto, J.C.D.C.: The perception of the university professor regarding the use and value of open data. Educ. XX1. 22, 141–163 (2019) 34. Cole, T.W., Han, M.J., Weathers, W.F., Joyner, E.: Library marc records into linked open data: challenges and opportunities. J. Libr. Metadata. 13, 163–196 (2013) 35. Wrzeszcz, M., Kitowski, J., Słota, R.: Towards trasparent data access with context awareness. Comput. Sci. 19, 201 (2018) 36. Conradie, P., Choenni, S.: On the barriers for local government releasing open data. Gov. Inf. Q. 31, S10–S17 (2014) 37. Zubcoff, J.J., Vaquer, L., Mazón, J.N., Maciá, F., Garrigós, I., Fuster, A., Carcel, J.V.: The university as an open data ecosystem. Int. J. Des. Nat. Ecodynamics. 11, 250–257 (2016) 38. Ma, Y., Xu, B., Bai, Y., Li, Z.: Building linked open university data: Tsinghua university open data as a showcase. In: Joint International Semantic Technology Conference. pp. 385–393 (2011) 39. Ismael, S.N., Mohd, O., Rahim, Y.: Implementation of open data in higher education: a review. J. Eng. Sci. Technol. 13, 3489–3499 (2018) 40. Daga, E., d’Aquin, M., Adamou, A., Brown, S.: The open university linked data data.open.ac.uk. Semant. Web J. 
7, 183–191 (2015) 41. Vaquer, L., Carcel, J.V., Fuster, A., Garrigós, I., Maciá, F., Mazón, J.-N., Jacobozubcoff, J.: Developing a data map for opening public sector information. Int. J. Des. Nat. Ecodynamics. 11, 370–375 (2016)
42. Juanes, G.G., Barrios, A.R., García, J.L.R., Medina, L.G., Adán, R.D., Yanes, P.G.: Linked data strategy to achieve interoperability in higher education. In: Proceedings of the 10th International Conference on Web Information Systems and Technologies (WEBIST-2014). pp. 57–64 (2014) 43. Van Schalkwyk, F., Willmers, M., McNaughton, M.: Viscous open data: The roles of intermediaries in an open data ecosystem. Inf. Technol. Dev. 22, 68–83 (2016) 44. Konstantinou, N., Spanos, D.-E.: Deploying linked open data: methodologies and software tools. Materializing the Web of Linked Data, pp. 51–71. Springer, Cham (2015). https://doi. org/10.1007/978-3-319-16074-0_3
e-Learning
Online Learning and Mathematics in Times of Coronavirus: Systematization of Experiences on the Use of Zoom® for Virtual Education in an Educational Institution in Callao (Peru) Roxana Zuñiga-Quispe, Yesbany Cacha-Nuñez, Milton Gonzales-Macavilca(B) and Ivan Iraola-Real
Universidad de Ciencias y Humanidades, Los Olivos, Lima, Peru [email protected]
Abstract. The objective of the present work is the systematization of the educational experiences developed for the teaching of Mathematics (Geometry, Algebra, and Arithmetic) through the Zoom® platform to students of regular basic education at an educational institution of the Constitutional Province of Callao (Peru). The qualitative approach of this research allows us to understand the different factors that have allowed students to successfully achieve their competences and capacities through the compulsory modality of virtual education due to the social isolation caused by COVID-19. Finally, we conclude that the joint work of all educational agents (parents, teachers, students, and institution) is essential for these experiences to be successfully achieved. Keywords: Virtual education · Zoom® · Mathematics · Digital skills
1 Introduction This work results from an accelerated period of implementation of digital-technological resources in education, due to the compulsory social isolation caused by COVID-19. With this systematization of experiences, we seek to share with the academic community the strategies used to carry out this process. Specifically, we refer to the teaching-learning dynamics in Mathematics through the Zoom® platform, which was implemented in a private educational institution in the Constitutional Province of Callao (Peru) during the first quarter of the 2020 school year (March-May). In general, the development of the Mathematics area in the regular basic education courses in Peru (EBR) requires a detailed analysis, since we are still far from reaching optimal levels in the different national and international tests that measure mathematical competences in students. A separate challenge is the fact of teaching the subjects in this area under the virtual education modality, since it restricts the levels of interaction that were common in face-to-face teaching. Furthermore, it should be noted that the majority of pedagogical strategies were designed for face-to-face education, not for virtual education, which substantially increased the difficulty of this new challenge. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 91–102, 2021. https://doi.org/10.1007/978-3-030-63665-4_7
In this sense, it is essential that the experiences can be systematized and shared with the educational, scientific, and academic community, since it helps to evaluate and compare the various strategies that have been implemented in different realities; this, undoubtedly, helps to improve both the individual practices carried out by teachers in the classroom and the educational policies that affect institutional and even state practices at a much higher level. Now, in the case of the educational institution that is the object of our study, although at the beginning of the quarantine there was a protocol for teaching Mathematics in the virtual system, the challenges that were presented required modifications to said protocol. All these changes are what motivated the development of this work, since the final results were favorable for the students who lived through all these experiences. Finally, this document has been organized as follows: the preliminary aspects of distance education, the description of the methodology applied in this work, the explanation of the context of the health emergency in which the experience was developed, the systematization itself, and, finally, the analysis of the results to evaluate the achievements and difficulties that arose throughout this first quarter (March-May) of virtual education.
2 Preliminary Aspects The distance education modality is not as new as it may seem, since it has been used since the 19th century due to the need to maintain the teaching-learning ties between students and teachers who were not in the same physical space [1]. Then, from the beginning and through the middle of the 20th century, with the emergence and incorporation of Information and Communication Technologies (ICT) [2], the entire educational system underwent revolutionary changes. Thus, distance education, virtual education, and online learning grew exponentially due to the extraordinary demands of the student sector, which called for taking advantage of technological advances in the search for and development of new educational strategies, all to overcome difficulties such as the massive increase in students and the limited flexibility of schedules [2]. Thanks to the development of ICT, all these limitations could be overcome decisively, since three formats emerged that revolutionized and democratized education worldwide: b-learning (blended), e-learning (virtual), and a-learning (automatic) [3]. However, even though physical contact or face-to-face interaction between teacher and student is restricted in distance education, dedication, enthusiasm, and commitment have not been affected; therefore, online learning does not represent a threat or limitation in terms of acquiring skills and knowledge [4]. Currently, the inclusion of online learning at all educational levels is essential because it helps meet the diverse needs of a knowledge society like the one we live in today [5]. It is a society in which students, from their initial training stage, are already part of and operate naturally in various online learning environments, which also enhances their abilities of self-motivation and autonomy to carry out their learning activities [6]. On the other hand, these changes in the teaching-learning dynamics also demand essential changes in the teaching role: teachers must be willing to break the paradigm of traditional education to stimulate meaningful learning in their students [7]. For this
reason, we are interested in systematizing the experiences that have allowed students to reach the expected learning levels in this challenging context of a health emergency, largely thanks to the proper use of different technological-digital tools, in this case specifically the Zoom® videoconferencing platform. However, it is essential to emphasize that teaching in virtual education goes beyond the use of digital tools, since one of its pillars is to promote good and productive interaction between educational agents [8]. For example, to solve a mathematical problem it is not enough for the student to learn to use the functions of a program; the teacher must also be able to stimulate their curiosity and creativity to find the solution [9]. In the experiences that we systematize below, we used the Zoom® platform for real-time classes, together with its different virtual tools, to create an enabling environment where students occupy the leading role in the teaching-learning of Mathematics.
3 Methodology This work belongs to the qualitative approach and to the specific type of systematization of experiences (SE), which seeks to explore both the contexts and the participating subjects to provide descriptions and explanations of the subjective reality that frames the action [10]. Likewise, the three minimum elements that make up the SE are experience, participants, and achievements [11]. It is also essential to differentiate the SE from a report or summary of chronologically ordered activities, since what the SE really seeks is to answer three questions: what experiences were carried out? How were they carried out? And what were they carried out for? [11]. Finally, to carry out this work, the steps or stages proposed by Unesco for the systematization of innovative educational experiences [12] have been followed, which are reflected in the organization of the following subtitles.
4 The Context of the Experience In recent years, the virtual education system showed rapid growth, which implied further development and incorporation of ICT; this became the differentiating element with respect to classical education, which little by little was incorporated into this transformation process [2]. However, in developing countries such as Peru, at the beginning of 2020 this transformation process accelerated rapidly, because the pandemic caused by COVID-19 forced compulsory social isolation. In order not to lose the school year, the Ministry of Education promulgated Emergency Decree No. 026-2020, which ordered the suspension of face-to-face classes so as not to spread the disease. A short time later, Ministerial Resolution No. 160-2020-MINEDU was approved so that the school year throughout the country would be carried out entirely in a virtual format. Likewise, as part of the strategy to make this decision sustainable, the State implemented the “Aprendo en Casa” program, with which it sought to guarantee, above all, distance educational service in public institutions through means such as radio, television, and the Internet; likewise, private institutions were allowed to develop their own methodology [13].
The educational agents of the Callao private educational institution, which is the object of our study, responded to this situation from different perspectives; for example, the principal implemented new communication mechanisms with the teaching staff during compulsory social isolation to strengthen ties between them. On the other hand, for the teachers, it was quite a challenge to adapt their pedagogical strategies and processes to the virtual system. In most cases, the transformation of teachers was more complex than the adaptation of students, since the latter, being digital natives, already had various competencies [14] that favored their adaptation to the new virtual teaching-learning system. Finally, the parents at first showed uncertainty about this system, since for most of them it also meant an entirely new change, which required other dynamics and processes. Still, after this surprise and initial concern, all of them decided to collaborate with the teachers of the institution regarding the time and organization of activities together with their children, since the students are the leading actors in this new virtual education system [13]. Following all that has been stated so far, we will carry out the systematization of the experiences developed during the first quarter of the academic year (March-May) in a private educational institution in Callao (Peru) with students of regular basic education, in the context of virtual education due to the mandatory social isolation caused by COVID-19.
5 Systematization of the Experience The experiences to be systematized are the pedagogical and didactic strategies used to teach Mathematics classes (Geometry, Algebra, and Arithmetic) through the Zoom® digital videoconferencing platform to the fourth-grade students of primary education of a Callao private school (Peru). The importance of this systematization lies in the fact that these strategies allowed high levels of interaction in the classroom and the achievement of the projected competences for these subjects. Below we present the general overview of the activities carried out between March and May 2020 (Fig. 1). 5.1 Location of the Experience This experience was carried out in coordination with a private educational institution in the Constitutional Province of Callao (Peru), where the population is economically characterized as middle class and 40% of the students of the entire province are enrolled in a private institution [15]. Regarding the members of the school where the experience was carried out, 100% had Internet access and mobile telephones, which favored the development of virtual education. 5.2 Agents of the Experience The experience was carried out with the support of the teaching, management, and administrative staff, as well as the students and parents.
Fig. 1. Map of the general structure of the experience. Source: Own elaboration.
The educational institution only partially had the financial resources for the development of distance education: although it provided equipment (laptops) to the teachers, it was unable to acquire paid platforms for the virtual classes, so it was decided to use the free version of the Zoom® platform. Forty-one students from the 4th grade of primary education, aged between 9 and 10, participated in this experience, of whom 25% are girls and 75% boys. 5.3 Description of the Experience Within the Mathematics area, only the subjects of Geometry, Algebra, and Arithmetic were considered for the development of the experience. Below, in Fig. 2a, 2b, we systematically detail all the activities carried out in each of the stages (planning, execution, and evaluation) (Fig. 3, 4 and 5). 5.4 Relations with the Population Services Provided to the Population. The teaching work consisted of guiding the students to obtain the best benefits from the use of digital platforms, specifically Zoom®, for learning Mathematics. Likewise, we had to propose activities that complement the virtual work, carry out reprogramming reports, and keep ourselves in constant training to offer the best educational service [13]. Also, we had to reformulate our entire educational approach and methodologies to adapt to the new demands of digital environments, where students show their leading role, since it is not possible to continue viewing them as mere recipients of knowledge [17]. The Role of the Population. In an everyday family environment, there is little control over the use that minors make of technological devices, and excessive use can be detrimental to the acquisition of skills [18]. Therefore, parents must assume a fundamental role for
A. Planning
Antecedents: Pronouncement of the Ministry of Education through Ministerial Resolution No. 160-2020-MINEDU, which ordered the suspension of face-to-face classes so as not to spread COVID-19.
The measure adopted by the educational institution (Callao, Peru): Implementation of online education through the use of virtual platforms.
Platforms: Zoom (synchronous), Classroom (asynchronous).
B. Execution
First week (first contact). Motivation and welcome. Characteristics of the general sessions: A rule was agreed upon so that only the teacher activates his microphone. In the Algebra session, a potential problem was presented. There were problems in identifying the first didactic process, called "familiarization with the problem", since the students showed minimal interaction, which did not favor the socialization process. To comply with the second didactic process, called "search for strategies", we asked them to propose different ways of solving the problem themselves; for this, the microphones of all the students were activated, but time was insufficient. During the explanation, we tried to get all the students to participate, but it was very time consuming and affected the fluid development of the session, so the process called "mathematical representation", where concrete materials had to be used, was omitted; the formalization process, which consists of expressing their conclusions, was accomplished by merely summarizing the explanation in their notebooks. At the end of the session, the students were asked to participate through the chat, but this option was unsuccessful due to incorrect use, since they shared irrelevant information and this generated distraction.
Second week (reflection). Characteristics of the general sessions: Active participation with the use of the microphone was implemented to work on the mathematical process of understanding the problem. Here the students had to answer the questions posed by activating their microphones. To develop processes such as the so-called "search for strategies", "mental calculation", and "resolution of basic operations", participation was requested with free use of the microphone; however, this encouraged disorder because everyone was talking at the same time and their responses could not be clearly understood. Due to these difficulties, personalized tutoring meetings were organized to improve the use of the microphone in class (see Fig. 3).
Fig. 2. Systematization of the experience. Source: Own elaboration.
Third week (the challenge). Characteristics of the general sessions: It was proposed that the next interventions with the microphone be carried out in list order, which sought to favor each student's understanding of the problem. By giving language a central role, this helped to improve the interpretation of Mathematics [16]. To strengthen the strategy of orderly participation, the Zoom® "raise hand" tool was proposed. In this way, the students could share, through oral presentation, the new knowledge they had obtained in the session. Here, the student generates the need to reason logically and proposes to use this product in his or her life [8]. In the Geometry session, the Zoom® "virtual whiteboard" tool was used to model geometric shapes. Here the students were able to draw figures using the mouse as a pencil (see Fig. 4). To work on the ability to argue claims about geometric relationships, it was proposed to use the private chat so that each student shares their opinion only with the teacher; thus, Zoom® saved each of the contributions in an orderly manner on the teacher's PC. This information served as evidence of learning.
Fourth week (adaptation). Characteristics of the general sessions: In the Arithmetic session, the Zoom® "screen sharing" tool was used. This made it possible to visualize the place value board so that the students could take note of it, and with it an essential didactic process called "representation from the graphic to the abstract" was completed. Also in the Arithmetic session, a survey was carried out using the group chat tool to find out the opinions of the students. Their answers were then used to prepare frequency tables (see Fig. 5), and with this strategy the competence called "solves data management and uncertainty problems" was developed. The next strategy consisted of choosing two students at random to translate a problem that the teacher presented to them into algebraic and verbal language. For this, the students had to use the microphone and the chat, and their classmates had to check whether they translated the equation correctly. With this, the competence "solves problems of regularity, equivalence, and change" was worked on.
C. Evaluation
Achievements: 1. More organized and ordered classes. 2. Active student participation. 3. Volunteers gave the opportunity to the rest. 4. Orderly participation in the chat. 5. The Mathematics area generated more interest in the students. 6. They strengthened their mathematical competences.
Assessment type: Throughout all the sessions, we worked with formative assessment, which (unlike summative assessment) allowed us to highlight the achievements and strengths of the students, and also to know and attend to their difficulties.
Fig. 2. (continued)
Fig. 3. Personalized tutoring to reinforce the use of the microphone in Zoom®. Source: Own elaboration.
Fig. 4. Geometric figures drawn by students on the Zoom® virtual whiteboard.
Fig. 5. Frequency table of the types of food that most students consume, according to the survey carried out in the Zoom® group chat.
the education of their children. They must contribute to, supervise, and collaborate with the development of the learning sessions; they are also directly responsible for organizing the times for study and rest. For their part, the students are responsible for putting in all their effort so that the experience is meaningful. Both are critical agents for distance learning. The Relationship between the Promoter Team and the Population. The relationship between the directors of the institution, the teacher, the students, and the parents was conducive to the development of activities in favor of distance learning. To achieve this, a whole process had to be gone through, which ranged from familiarization with the use of virtual platforms to economic agreements to face this exceptional period caused by the health crisis. At all stages, dialogue was essential to carry this entirely virtual education modality forward.
6 Results and Limitations The strategies that were proposed for the context of virtual education through the Zoom® platform allowed students to satisfactorily achieve the expected learning levels. In other words, the strategies helped to overcome the limitations of direct interaction due to social isolation, and to obtain the necessary skills for the area of Mathematics (see Fig. 6).
Fig. 6. Mathematical skills and competencies that were developed with the strategies and activities carried out on the Zoom® platform. Source: Own elaboration. The mathematical competencies and capacities were taken from the National Curriculum of Regular Basic Education of Peru.
Likewise, the tutoring strategy allowed 100% of the students to be trained to manage the virtual platforms promptly. On the other hand, one of the most important limitations is linked to connectivity issues. Although everyone had an Internet connection, the use of the Zoom® platform required high data traffic, which could affect the flow of the session. Also, all students needed to keep their web cameras and microphones active, but some did not have adequate devices for this. Regarding the work team (educational institution and teaching colleagues), the limitations were twofold. In the first place, it was challenging to coordinate with other teachers on issues associated with improving teaching, since not everyone was familiar with the use of virtual environments. It should be noted that the level of teacher training in the use of ICT influences the positive attitude that teachers may have toward the use of these tools [19]. However, the social situation we are going through has made it clear that teaching work, now more than ever, is linked to digital skills; therefore, mastering these skills now becomes an essential requirement for all education professionals [20]. Secondly, the educational institution started this whole process with very little empathy toward the needs of its teachers; for this reason, it was essential to strengthen the motivational aspect of all educational personnel. Thanks to this, institutional improvements were evident as the weeks went by. Finally, an external condition that influenced distance education was the general saturation of the network during school hours, which affected the normal development of some sessions.
7 Conclusions The objective of this work was to systematize the pedagogical and didactic strategies that were used to teach Mathematics classes (Geometry, Algebra, and Arithmetic) through the Zoom® videoconferencing platform to fourth-grade primary students of a private school in Callao (Peru). These experiences allowed students to achieve the expected skills and capacities despite not having the usual physical space (the school) that favored the interaction and teaching of these subjects. As general conclusions, we can mention:
1. The indispensable factors for completing these experiences were: the support of parents, a good Internet connection, having adequate technological devices (at least one smartphone), the help of the institution with complementary activities (for example, technical support and counseling talks, both for teachers and students), and the willingness of teachers to train, self-train, and adapt their strategies to the new teaching-learning dynamics that are established in digital environments.
2. This exceptional situation of social isolation and online learning made it possible to assess more precisely the benefits that ICT offers for education. Likewise, creative planning strategies are essential so that students can feel involved and not be passive recipients. It should be noted that, due to its problem-solving approach, the teaching of Mathematics requires very particular strategies, for which it is crucial that students learn to use responsibly the tools offered by the different platforms, in this case Zoom®.
3. Regarding the aspects to be improved, it is essential to encourage and facilitate teacher training in digital skills. For its part, the educational institution should formally consider continuing with this virtual modality even when face-to-face activities return to normal, perhaps for complementary subjects or for courses that combine face-to-face and virtual activities; in this way, both teachers and students will be able to continue developing the digital competencies that are so necessary for today's education [21]. Families, having already gone through the adaptation process, must now help to strengthen this new modality of virtual education. Lastly, we hope that the dissemination of these experiences helps to make visible the benefits that can be obtained from the proper use of digital environments for education and the importance of collective work among teachers, students, parents, and educational institutions; and, finally, that the different pedagogical and didactic strategies described here help the academic and teaching community to implement improvements in their own work.
References
1. Chaves, A.: La educación a distancia como respuesta a las necesidades educativas del siglo XXI. Revista Academia y Virtualidad 10(1), 23–41 (2017). https://doi.org/10.18359/ravi.2241
2. United Nations Educational Scientific Cultural Organization, Ministerio de Educación del Perú: Docentes y sus aprendizajes en modalidad virtual. Punto & Grafía S.A.C., Lima (2017)
3. Rama, C.: Políticas, tensiones y tendencias de la educación a distancia y virtual en América Latina. EUCASA (Universidad Católica de Salta), Salta (2019)
4. García, L.: Bosque semántico: ¿educación/enseñanza/aprendizaje a distancia, virtual, en línea, digital, eLearning…? RIED. Revista Iberoamericana de Educación a Distancia 23(1), 09–28 (2020). https://doi.org/10.5944/ried.23.1.25495
5. United Nations Educational Scientific Cultural Organization: Aprendizaje abierto y a distancia: consideraciones sobre tendencias, políticas y estrategias. Editorial Trilce, Uruguay (2002)
6. Wang, Q.: Developing a technology-supported learning model for elementary education level. Mimbar Sekolah Dasar 6(1), 141–146 (2017). https://doi.org/10.17509/mimbar-sd.v6i1.15901
7. Guimarães, I., Tejada, J., Pozos, K.: Formación docente para la educación a distancia: la construcción de las competencias docentes digitales 24(51), 69–87 (2019). http://dx.doi.org/10.20435/serie-estudos.v24i51.1296
8. Vega, J., Niño, F., Cárdenas, Y.: Enseñanza de las matemáticas básicas en un entorno eLearning: un estudio de caso de la Universidad Manuela Beltrán Virtual. Revista EAN 79, 172–187 (2015)
9. Rodríguez, M., Mosqueda, K.: Aportes de la pedagogía de Paulo Freire en la enseñanza de la matemática: hacia una pedagogía liberadora de la matemática. Revista Educación y Desarrollo Social 9(1), 82–95 (2015)
10. Restrepo, M., Tabares, L.: Métodos de investigación en educación. Revista de Ciencias Humanas 21 (2000)
11. Barbosa-Chacón, J., Barbosa, J., Rodríguez, M.: Concepto, enfoque y justificación de la sistematización de experiencias educativas. Perfiles Educativos 37(149), 130–149 (2015)
12. Unesco: Metodología de sistematización de experiencias educativas innovadoras. Cartolan, Lima (2016)
13. Ministerio de Educación: Aprendo en casa. Ministerio de Educación del Perú, Lima (2020)
14. Quiñonez, S., Zapata, A., Canto, P.: Competencia digital en niños de educación básica del sureste de México. Revista Iberoamericana de las Ciencias Sociales y Humanísticas 9(17), 289–311 (2020). https://doi.org/10.23913/ricsh.v9i17.199
15. Estadística de la calidad educativa y Ministerio de Educación del Perú: Presentación del proceso censal 2017. Ministerio de Educación, Callao (2017)
16. Mendoza, H., Burbano, V., Valdivieso, M.: El Rol del Docente de Matemáticas en Educación Virtual Universitaria. Un Estudio en la Universidad Pedagógica y Tecnológica de Colombia. Formación universitaria 12(5), 51–60 (2019). https://dx.doi.org/10.4067/S0718-50062019000500051
17. Pando, V.: Tendencias didácticas de la educación virtual: Un enfoque interpretativo. Propósitos y Representaciones 6(1), 463–505 (2018). http://dx.doi.org/10.20511/pyr2018.v6n1.167
18. Escardíbul, J., Mediavilla, M.: El efecto de las TIC en la adquisición de competencias. Un análisis por tipo de centro educativo. Revista Española de Pedagogía 74(264), 317–335 (2016)
19. Fernández, J., Torres, J.: Actitudes docentes y buenas prácticas con TIC del profesorado de Educación Permanente de Adultos en Andalucía. Revista Complutense de Educación 26, 33–49 (2015). https://doi.org/10.5209/rev_RCED.2015.v26.43812
20. Prendes, M., Gutierrez, I., Martinez, F.: Competencia digital: una necesidad del profesorado universitario en el siglo XXI. RED. Revista de Educación a Distancia 56(7), 01–31 (2018). https://doi.org/10.6018/red/56/7
21. Area, M.: De la enseñanza presencial a la docencia digital. Autobiografía de una historia de vida docente. RED. Revista de Educación a Distancia 56(1), 01–31 (2018). http://dx.doi.org/10.6018/red/56/1
Implementation of an Opensource Virtual Desktop Infrastructure System Based on Ovirt and Openuds Washington Luna Encalada(B) , Jonny Guaiña Yungán, and Patricio Moreno Costales Escuela Superior Politécnica de Chimborazo, Panamerica Sur Km 1 1/2, Riobamba, Ecuador {wluna,jguaina,pmoreno}@espoch.edu.ec
Abstract. This article describes the implementation of a virtualization infrastructure using the opensource tools oVirt and OpenUDS for the deployment of virtual laboratories, and measures the performance of the available hard disk drivers under the workload that virtual machines generate when virtual laboratories are in use. The purpose of this research is to shed some light on hard disk driver performance and to determine which driver performs better in the stress testing of a virtual laboratory environment. Keywords: Virtual desktop infrastructure · Cloud computing · OpenUDS · oVirt · Virtual labs · VDI
1 Introduction The potential of cloud computing is one of the most relevant topics to have emerged in recent years, going from IaaS (Infrastructure as a Service) to SaaS (Software as a Service), and the scope of cloud computing continues to expand [1]. In just a few short years, many companies like Microsoft, Apple, and Amazon have started offering cloud computing services. In the same period, virtual desktops delivered through DaaS (Desktop as a Service) have increasingly become a clear choice for companies and organizations, due to their high processing speed and centralized administration, which simplifies the administrative costs of clients [2]. Today, user applications consume a more significant amount of computing resources, which has caused personal computer capabilities to keep improving to satisfy applications' needs. However, not all users use all the capacity that their computers offer. One very convenient answer to this problem is the VDI (virtual desktop infrastructure) solution because, when using DaaS, a single server can provide resources to several users, using its resources better and without wasting processing capacity. This article does not use any of the most popular paid VDI solutions on the market, such as VMware, Citrix, and Microsoft, but instead uses the OpenSource oVirt solution. Although there is no specialized technical support for the project, oVirt has a large community that is continuously reporting and resolving bugs that arise with each new software version.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 103–113, 2021. https://doi.org/10.1007/978-3-030-63665-4_8
Also, since oVirt is the base version for the commercial RedHat RHEV (RedHat Enterprise Virtualization) product, it is used to test new functionalities for RHEV. For this reason, oVirt regularly receives security and functionality updates, making it a solution that should not be overlooked for use and implementation. To facilitate the management of the life cycle of virtual desktops and the introduction of virtual laboratories, the OpenSource OpenUDS project (Open Unified Desktop Service) is adopted, which extends the concept of DaaS and allows virtual laboratories to be offered as a cloud service. OpenUDS provides centralized management and control of all available virtual machines to avoid issues related to virtual laboratories. For instance, it permits assigning access times to virtual machines. Also, it allows the creation of networks that differentiate and control access to virtual laboratories, as well as the access permissions to the resources of the infrastructure.
2 System Frameworks oVirt [3] is based on opensource software projects like KVM (Kernel-based Virtual Machine), which allows users to run GNU/Linux and MS Windows operating systems with 32- and 64-bit architectures. oVirt is also based on different community projects [4], such as libvirt, Gluster, PatternFly, and Ansible. It provides centralized management of virtual machines and of processing, memory, network, and storage resources. The oVirt [5] architecture comprises the following logical elements: oVirt engine, oVirt node, and storage node. The oVirt engine is a Java application that runs on a web server and is responsible for managing the creation and deployment of virtual machines. An oVirt node is a server running CentOS 7 with the KVM virtualization technology; it maintains communication with the oVirt engine through VDSM (Virtual Desktop and Server Manager) and manages the life cycle of virtual machines using libvirt (a minimal example of querying a node through libvirt is sketched after Fig. 2). The oVirt node controls the different resource management tasks [6] of the virtual machines (processing, memory, network, storage). The storage node is a set of storage servers connected to the system which maintain the virtualization images and save the information of the disks assigned to the virtual machines. Figure 1 shows the basic architecture of oVirt. FreeNAS [7] (Free Network Attached Storage) is a free, open-source, FreeBSD-based operating system that offers network-attached storage (NAS) services. FreeNAS provides a graphical configuration interface via the web. Its design shares the storage capacity of a computer with personal computers or client servers and allows access from the network. OpenUDS is a project that fulfills the functions of a multiplatform connection broker, built under the Django framework to provide web support. OpenUDS [8] is software that administers and manages the life cycle of virtual desktops. It allows remote and local users to access virtualization platforms and physical resources within the data center. It functions as an intermediary by redirecting, allowing, or denying access to the virtual resources defined by the infrastructure manager. The OpenUDS [9] architecture comprises the following logical elements: connection clients, OpenUDS server, and service provider. Figure 2 shows this architecture and describes how the clients (the different devices, such as personal computers and smart devices used by registered users) connect through OpenUDS to enter and use the services provided by the system.
Fig. 1. oVirt architecture
Fig. 2. OpenUDS architecture
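Since each oVirt node drives KVM through libvirt, the virtual machines hosted on a node can also be inspected directly with the libvirt Python bindings. The following is only an illustrative sketch of that lower layer, not part of the deployment described in this article; the connection URI and the availability of the bindings on the node are assumptions.

```python
import libvirt  # provided by the libvirt-python package

# Connect to the local KVM/QEMU hypervisor on a node (read-only is enough for inspection).
# 'qemu:///system' is the usual URI for the system-wide hypervisor; adjust if needed.
conn = libvirt.openReadOnly('qemu:///system')

# List every defined domain (virtual machine) and whether it is currently running.
for dom in conn.listAllDomains():
    state = 'running' if dom.isActive() else 'stopped'
    print(f'{dom.name():30s} {state}')

conn.close()
```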
The OpenUDS server [10] manages access to the virtual desktop services, the communication with the hypervisors, and their life cycle. It is composed of an administrative database, a connection broker, and a tunneller. The administrative database is used for the internal management of the OpenUDS connection broker. It also allows the authentication of the connected clients in the OpenUDS web application. The OpenUDS connection broker is an administrative web interface that uses Python 2.7 as its core and is developed with the Django framework. It runs under the Apache web server and uses the task manager daemon to communicate between the web application and the service providers. The UDS Actor is a desktop application written in Python 2.7 for Linux-based operating systems and in C# for MS Windows operating systems. It works as a service that provides communication and data transmission between the virtual machine installed in the service provider and the task manager daemon.
Tunneling is a web service used by OpenUDS for HTML5 RDP transport, through which any HTML5-compatible browser can use the virtual services offered by OpenUDS, using a connection encrypted by SSL. The service provider is the virtualization infrastructure managed from the connection broker in charge of creating, running, and deleting virtual desktops.
3 Building Process For the implementation of the VDI system in the data center, the oVirt and OpenUDS VDI infrastructure is used. It is deployed on six physical servers with the technical specifications shown in Table 1.

Table 1. Server specifications

Server                   | Processor                | RAM   | Disk               | Net
HP ProLiant DL360 G7     | Intel Xeon E5649         | 6 GB  | 300 GB, 7,000 rpm  | 1 Gbps
HP ProLiant DL360 G7     | Intel Xeon E5649         | 6 GB  | 300 GB, 7,000 rpm  | 1 Gbps
HP ProLiant ML310e Gen8  | Intel Xeon E3-1280 v2    | 16 GB | 7,000 rpm          | 10 Gbps
HP ProLiant DL360 Gen9   | Intel Xeon E5-2600 v3    | 32 GB | 600 GB, 7,000 rpm  | 10 Gbps
HP ProLiant DL360 Gen9   | Intel Xeon E5-2600 v3    | 32 GB | 600 GB, 7,000 rpm  | 10 Gbps
HP ProLiant DL380 Gen9   | Intel Xeon E5-2600 v4 ×2 | 64 GB | 300 GB, 7,000 rpm  | 10 Gbps
A. FreeNAS deployment The FreeNAS 11 operating system [11] is installed on a 32 GB USB 3.0 pen drive because the operating system takes over the full capacity of its boot device; this way it is not installed on the drives used for storage. It is worth mentioning that the configuration of the storage drives in RAID 1, shown in Table 1 for the HP ProLiant ML310e Gen8 server, is done through the HPE Smart Storage Administrator wizard provided by HP servers, so that the FreeNAS operating system can recognize the server's hard drives. For storage sharing, the FreeNAS iSCSI service [12] is deployed in three steps, all of them using the FreeNAS web interface: first, the RAID 1 volumes are added to the FreeNAS server; second, the iSCSI domain is created; and finally, the RAID 1 storage share is configured so that it can be accessed over the network by oVirt. B. oVirt deployment The oVirt construction process consists of three parts. For the oVirt engine, the first HP ProLiant DL360 G7 server is used. In the case of the oVirt nodes, we used three
servers, two HP ProLiant DL360 Gen9 and one HP ProLiant DL380 Gen9; finally, the storage node role is fulfilled by the server previously described in the FreeNAS deployment section. The oVirt engine [11] is deployed on a virtualized CentOS 7 operating system and is installed in a simplified way using the yum package manager. As Fig. 1 suggests, there may be more than one node attached to the oVirt engine. oVirt embeds KVM virtualization in the Linux source code, creating a CentOS-based distribution called oVirt Node [12]. This operating system is installed on all three HP servers so that they function as oVirt nodes. Three pre-configurations must be made to add the oVirt Node servers to the oVirt engine. First, the Cluster is configured; a cluster is a group of physical hosts that function as a set of resources for virtual machines. Second, the Data Center is configured; it is a data center [15] that acts as a logical entity defining the set of physical and logical resources used in the virtual environment. During this configuration, the data center is integrated with the iSCSI network storage described previously in the FreeNAS deployment. Finally, the Management Network is defined; it is used by the Data Center for communication between the oVirt engine and the oVirt nodes. C. OpenUDS deployment OpenUDS consists of three elements: the OpenUDS server, the tunneller, and the administration database. They are deployed on the second HP ProLiant DL360 G7 server. The database is deployed on a virtualized Debian 9 operating system. The MySQL server is installed on this operating system using the apt-get package manager to simplify the installation process. The installation of MySQL creates a blank database controlled by a user with administrative rights over it. To deploy the OpenUDS server [16], a virtualized Debian 8 server is used. The Python 2.7 libraries necessary for the operation of OpenUDS are installed on it using the pip command [17], a package manager used to install and manage software packages written in Python. Table 2 details the necessary libraries along with their versions.

Table 2. Python 2.7 libraries

Django = 1.9.9
Geraldo = 0.4.17
MySQL-python = 1.2.5
cryptography = 1.9
decorator = 4.0.11
django-appconf = 1.0.2
django-compressor = 2.1.1
django-configurations = 2.0
dnspython = 1.12.0
enum34 = 1.1.6
gyp = 0.1
ipython-genutils = 0.2.0
lxml = 3.8.0
mod-wsgi = 4.5.20
python-distutils-extra = 2.39
python-ldap = 2.4.10
python-memcached = 1.58
virtualenv-clone = 0.3.0
virtualenvwrapper = 4.8.2
wcwidth = 0.1.7
The project is then downloaded from its official repository (https://github.com/dkmstr/openuds). The MySQL database is configured so that it can be managed by OpenUDS, and the schema is migrated into the blank database with the command python manage.py migrate (a sketch of this database configuration is shown after Table 3). To deploy the OpenUDS tunneller, a virtualized Debian 8 server is used together with the OpenSource Apache Guacamole project [18], an HTML5 web application that provides access to virtual desktop environments using remote desktop protocols (such as VNC or RDP). The Guacamole libraries and the guacd daemon are installed using the apt-get package manager. The web application is compiled from the project found in the OpenUDS repository (https://github.com/dkmstr/openuds/tree/master/guacamole-tunnel). This project is written in the Java programming language; therefore, the web server used for the deployment of the OpenUDS tunneller is Apache Tomcat. Table 3 shows a summary of the servers and their installed roles.

Table 3. Operating systems and installed roles

Server                   | Operating system   | Roles installed
HP ProLiant DL360 G7     | CentOS 7           | KVM virtualization; CentOS 7 VM with oVirt engine; Debian 8 VM with Guacamole tunnel; Debian 8 VM with OpenUDS 2.2
HP ProLiant DL360 G7     | CentOS 7           | KVM virtualization; Ubuntu Server 16 VM with MySQL; Debian 8 VM with Guacamole tunnel; Debian 8 VM with OpenUDS 2.2
HP ProLiant ML310e Gen8  | FreeNAS 11.1       | FreeNAS
HP ProLiant DL360 Gen9   | oVirt Node 4.2.7.1 | oVirt node
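As a rough illustration of the database step mentioned above, the Django settings of the broker have to point at the blank MySQL database before python manage.py migrate can populate it. The fragment below is only a generic Django 1.9-style sketch under our own assumptions (database name, user, password, and host are placeholders); the actual OpenUDS settings module may organize this differently.

```python
# settings.py (fragment) -- hypothetical values, shown only to illustrate the idea.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',  # MySQL backend, served through MySQL-python
        'NAME': 'uds',          # the blank database created during the MySQL installation
        'USER': 'uds_admin',    # user with administrative rights over that database
        'PASSWORD': 'change-me',
        'HOST': '192.0.2.10',   # placeholder address of the VM that hosts MySQL
        'PORT': '3306',
    }
}
```

Once the broker can reach the database, running python manage.py migrate creates the tables that the connection broker and the task manager daemon use.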
4 Experimentation A. Design of the Investigation An experimental design is used to find the performance difference of the hard disk when using the two types of disk drivers that oVirt offers (IDE and VirtIO) during the stress testing of a laboratory environment. This allowed measuring the possible effect that the hard disk driver could have on the read and write speed of the disk. B. Stage Design Two stress environments were designed to measure hard disk read and write performance; each environment consists of a virtual laboratory with 20 computers. The virtual labs and their virtual machines were created using OpenUDS. To create the two virtual labs, two base virtual machines running the Ubuntu 18.04 operating system were used, one with an IDE disk driver and the other with VirtIO; they were later used as templates for the creation of the 20 virtual machines of each virtual laboratory. Table 4 shows the description of the virtual machines installed for each virtual laboratory.
Table 4. Virtual machine specifications

Driver | O.S.                 | Disk                      | CPU                | RAM                          | Net
IDE    | Ubuntu 18.04 64 bits | 30 GB, qcow2; 5 GB, qcow2 | Intel Broadwell ×4 | 2048 MB (1024 MB guaranteed) | 100 Mbps
VirtIO | Ubuntu 18.04 64 bits | 30 GB, qcow2; 5 GB, qcow2 | Intel Broadwell ×4 | 2048 MB (1024 MB guaranteed) | 100 Mbps
C. Data collection Dependent samples were used for data collection because each observation in one sample can be paired with an observation in the other sample. Four work scenarios were designed in which the read load and the write load on the hard disk were measured; after the measurements were collected for each scenario, each pair of observations was paired. In scenario 1, the graphical tool gnome-disks was used to measure the read speed of the hard disk; in scenario 2, the hdparm command was used to measure the read speed; in scenario 3, gnome-disks was used to measure the write speed; and in scenario 4, the dd command was used to measure the write speed. The measurements were made on the 5 GB hard drive of the virtual machines shown in Table 4 (a sketch of how such measurements can be scripted is shown after Table 5).

Table 5. Scenarios for the IDE laboratory and the VirtIO laboratory

Stage   | Workload    | Virtual laboratory                                                                        | Measurement
Stage 1 | gnome-disks | Lab 1 with 20 virtual machines with Ubuntu 18.04 operating system and IDE disk driver    | Read speed on the hard drive
Stage 2 | hdparm      | Lab 2 with 20 virtual machines with Ubuntu 18.04 operating system and VirtIO disk driver | Read speed on the hard drive
Stage 3 | gnome-disks | Lab 1 with 20 virtual machines with Ubuntu 18.04 operating system and IDE disk driver    | Write speed on the hard drive
Stage 4 | dd          | Lab 2 with 20 virtual machines with Ubuntu 18.04 operating system and VirtIO disk driver | Write speed on the hard drive
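As a rough sketch of how the hdparm and dd measurements described above could be scripted inside each virtual machine, the fragment below wraps both commands with Python's subprocess module and averages three repetitions, mirroring the repetition scheme used in the evaluation. The device path, test-file path, and transfer size are assumptions for illustration; the remaining scenarios used the graphical gnome-disks tool, which is not scripted here.

```python
import re
import statistics
import subprocess

DEVICE = '/dev/vdb'                  # assumed: the 5 GB secondary disk of the VM
TEST_FILE = '/mnt/data/ddtest.img'   # assumed mount point of that disk

def read_speed_hdparm():
    """Buffered disk read speed in MB/s, as reported by `hdparm -t` (usually needs root)."""
    out = subprocess.run(['hdparm', '-t', DEVICE],
                         capture_output=True, text=True, check=True).stdout
    # hdparm prints something like: "... = 123.45 MB/sec"
    return float(re.search(r'=\s*([\d.]+)\s*MB/sec', out).group(1))

def write_speed_dd(size_mb=512):
    """Write speed in MB/s from dd, bypassing the page cache with direct I/O."""
    out = subprocess.run(
        ['dd', 'if=/dev/zero', f'of={TEST_FILE}', 'bs=1M',
         f'count={size_mb}', 'oflag=direct', 'conv=fdatasync'],
        capture_output=True, text=True, check=True).stderr
    # dd reports its throughput on stderr, e.g. "... 123 MB/s" (comma decimals in some locales)
    return float(re.search(r'([\d.,]+)\s*MB/s', out).group(1).replace(',', '.'))

def average(measure, repetitions=3):
    """Repeat a measurement and return its mean, as done in the evaluation."""
    return statistics.mean(measure() for _ in range(repetitions))

if __name__ == '__main__':
    print('read  (hdparm):', round(average(read_speed_hdparm), 2), 'MB/s')
    print('write (dd):    ', round(average(write_speed_dd), 2), 'MB/s')
```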
5 Evaluation This section shows the results of the measurement of the read/write speed on the hard disk. These results were obtained using the gnome-disks, hdparm, and dd tools. Each
experiment was repeated three times, and the results were averaged. Figures 5 and 8 show the comparison of the read and write speed, respectively, obtained when using the IDE and VirtIO drivers. Each scenario compares the IDE and VirtIO labs using the same measurement tool. The scenarios are detailed in Table 5. Figure 3 shows the measurements of the read speed of the hard disk, both in the laboratory with IDE virtual machines and in the laboratory with VirtIO virtual machines. The measurements in megabytes per second obtained with the gnome-disks tool show that the VirtIO driver achieves a disk read speed 28.5% higher than the IDE driver.
Fig. 3. Scenario 1, response to disk read speed using gnome-disks.
Figure 4 shows the measurements of the read speed of the hard disk obtained with the hdparm command, both in the laboratory with IDE virtual machines and in the laboratory with VirtIO virtual machines. The measurements in megabytes per second obtained with the hdparm tool show that the IDE driver achieves a read speed 7.4% higher than the VirtIO driver.
Fig. 4. Scenario 2, response to disk read speed using hdparm
In Fig. 5, the average disk read performance, combining the gnome-disks and hdparm measurements, was analyzed for each driver. The measurements in megabytes per second show a 17.7% higher performance for the VirtIO driver over the IDE driver (a sketch of this aggregation is shown after Fig. 5).
Fig. 5. The average speed of gnome-disks plus hdparm on hard drive read
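The relative differences reported in this section (for example, the 17.7% read advantage of VirtIO in Fig. 5) can be reproduced by averaging the per-machine measurements of each laboratory and comparing the two means. The numbers below are invented placeholders, not the authors' data; the snippet only illustrates the calculation.

```python
from statistics import mean

# Hypothetical per-VM read speeds in MB/s for the 20 machines of each laboratory
# (gnome-disks and hdparm results would each be averaged this way, then combined).
ide_read    = [5.1, 5.3, 5.0, 5.4, 5.2, 5.1, 5.3, 5.2, 5.0, 5.4,
               5.1, 5.2, 5.3, 5.0, 5.2, 5.1, 5.4, 5.2, 5.1, 5.3]
virtio_read = [6.0, 6.2, 6.1, 6.3, 5.9, 6.1, 6.2, 6.0, 6.1, 6.3,
               6.0, 6.2, 6.1, 5.9, 6.2, 6.1, 6.0, 6.3, 6.1, 6.2]

ide_avg, virtio_avg = mean(ide_read), mean(virtio_read)
improvement = (virtio_avg - ide_avg) / ide_avg * 100

print(f'IDE average:    {ide_avg:.2f} MB/s')
print(f'VirtIO average: {virtio_avg:.2f} MB/s')
print(f'VirtIO is {improvement:.1f}% faster than IDE for this placeholder sample')
```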
Figure 6 shows the measurements of the write speed of the hard disk, both in the laboratory with IDE virtual machines and in the laboratory with VirtIO virtual machines. The measurements in megabytes per second obtained with the gnome-disks tool show that the VirtIO driver achieves a write speed 8.9% higher than the IDE driver.
Fig. 6. Scenario 3, response to disk write speed using gnome-disks
Figure 7 shows the measurements of the write speed of the hard disk obtained with the dd command for each virtual laboratory. The measurements in megabytes per second show an efficiency 5.7% higher for the VirtIO driver over the IDE driver. In Fig. 8, the average disk write performance, combining the gnome-disks and dd measurements, was analyzed for each laboratory. The measurements in megabytes per second show a 6.6% higher performance for the VirtIO driver over the IDE driver.
Fig. 7. Scenario 4, response to disk write speed using dd
Fig. 8. The average speed of gnome-disks plus dd on disk writing
As can be seen in Fig. 3, for scenario 1 (see Table 5), the read speed of the hard disk obtained with the VirtIO driver is 28.5% higher than with the IDE driver. However, the same does not happen in Fig. 4, scenario 2, where the read speed obtained with the IDE driver is 7.4% higher than with the VirtIO driver. In Fig. 6, scenario 3, the write speed obtained with the VirtIO driver exceeds the IDE driver by 8.9%, and in Fig. 7, scenario 4, the write speed obtained with the VirtIO driver is again higher, by 5.7%.
6 Conclusion
✓ This research work analyzes and compares the performance of the hard disk drivers that oVirt provides, using a virtualization infrastructure successfully implemented with only Opensource tools.
✓ The present research work allows selecting the hard disk driver that provides the best performance for the deployment and use of virtual laboratories, through the measurement and evaluation of the performance of two virtual laboratories in terms of hard disk access under different scenarios.
✓ The research results indicate that the VirtIO driver offers a read performance 17.7% better than that provided by the IDE driver, and it also provides a write speed 6.6% higher than that provided by the IDE driver.
✓ The research paper also provides an introductory contribution to the use of virtual laboratories as a cloud service that will be expanded in future research.
✓ Maximizing the use of hardware resources is still a great challenge to solve in the field of desktop virtualization infrastructures.
References
1. Calyam, P., Rajagopalan, S., Seetharam, S., Selvadhurai, A., Salah, K., Ramnath, R.: VDC-Analyst: design and verification of virtual desktop cloud resource allocations. Comput. Networks 68, 110–122 (2014)
2. Shu, S., Shen, X., Zhu, Y., Huang, T., Yan, S., Li, S.: Prototyping efficient desktop-as-a-service for FPGA-based cloud computing architecture. In: Proceedings of the 2012 IEEE Fifth International Conference on Cloud Computing, pp. 702–709 (2012)
3. oVirt. https://www.ieeexplore.ieee.org/document/7883099/metrics. Accessed 01 Feb 2019
4. Proyectos comunitarios. https://www.ovirt.org/. Accessed 12 Oct 2018
5. Arquitectura de oVirt. https://www.ovirt.org/documentation/architecture/architecture/. Accessed 16 Oct 2018
6. FreeNAS. https://www.ixsystems.com/documentation/freenas/11.2/intro.html. Accessed 24 Nov 2018
7. OpenUDS. https://www.udsenterprise.com/media/filer_public/76/b1/76b1b745-32c2-45c2-8ef9-b0449654a976/uds-enterprise-caracteristicas-generales.pdf. Accessed 28 Nov 2018
8. Arquitectura OpenUDS. https://www.udsenterprise.com/es/uds-enterprise/introduccion/. Accessed 04 Dec 2018
9. Servidor OpenUDS. http://dspace.espoch.edu.ec/bitstream/123456789/6787/1/18T00688.pdf. Accessed 29 July 2018
10. Instalación FreeNAS. https://www.ixsystems.com/documentation/freenas/11.2/install.html. Accessed 01 Dec 2018
11. Nodos. https://ieeexplore.ieee.org/document/7883085. Accessed 01 Dec 2018
12. Despliegue de ISCSI. https://www.ixsystems.com/documentation/freenas/11.2/sharing.html?highlight=iscsi#block-iscsi. Accessed 11 Sept 2019
13. Instalación oVirt node. https://ovirt.org/documentation/install-guide/chap-oVirt_Nodes/. Accessed 14 Jan 2019
14. Instalación de nodos. https://ovirt.org/documentation/quickstart/quickstart-guide/#install-ovirt-node. Accessed 11 Jan 2019
15. Despliegue de OpenUDS. https://github.com/dkmstr/openuds/blob/v2.2/server/documentation/intro/install.rst#id1. Accessed 18 Jan 2019
16. Apache Guacamole. https://guacamole.incubator.apache.org/doc/gug/preface.html. Accessed 26 Jan 2019
Virtual Laboratories in Virtual Learning Environments Washington Luna Encalada(B) , Patricio Moreno Costales, Sonia Patricia Cordovez Machado, and Jonny Guaiña Yungán Escuela Superior Politécnica de Chimborazo, Panamerica Sur Km 1 1/2, Riobamba Canton, Ecuador {wluna,pmoreno,jguaina}@espoch.edu.ec, [email protected]
Abstract. In this article, we present the structure and publication of a virtual learning environment built with the PACIE methodology, incorporating resources such as virtual laboratories so that students can carry out information technology practices, accessing them from computers and from devices such as smartphones and tablets. Because of the pandemic caused by the COVID-19 virus, students cannot enter the physical laboratories of the university. An analysis is carried out comparing the academic performance of the students who received face-to-face lessons in the previous semester with the academic performance of the students who are receiving online classes in this exceptional period. Keywords: Virtual laboratories · PACIE · EVA · Academic performance
1 Introduction The appearance of the COVID-19 pandemic has tested the ability of higher education institutions to react to this health emergency. It has forced them to respond immediately to guarantee the continuity of students' education and of administrative processes such as enrollment, career changes, and student applications, entirely online. In the university educational process, the use of technological tools has grown exponentially to facilitate the teaching-learning process and to contribute to the preparation of students. Some subjects require theoretical concepts and experimental preparation through laboratory practices to make a direct connection with knowledge and to promote cooperative work among students. The possibility of linking knowledge with experimentation in a virtual space that allows the apprehension of theoretical foundations makes virtual laboratories a powerful instrument. Virtual laboratories are very useful for learning experimental and engineering sciences, which need to make constant use of sophisticated and expensive equipment. Globally, education has been one of the areas most affected by the COVID-19 pandemic, causing the total or partial closure of 20,000 higher education institutions and forcing 200 million students from more than 102 countries to leave the classroom and replace face-to-face teaching with virtual online education.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 114–126, 2021. https://doi.org/10.1007/978-3-030-63665-4_9
In this reality, low-income countries face a lack of educational resources as well as a poor understanding of the details and nuances of online education [1]. In all the departments of the Faculty of Computer Science and Electronics, there are physical laboratories where the students carry out their practices for different subjects. However, when the virtual/online modality started, the students could no longer access this type of physical equipment, which is very important for their education, because the university was forced to restrict access to its facilities due to the declaration of the emergency caused by the COVID-19 virus. The ESPOCH had to rethink the teaching-learning process in the face of the COVID-19 pandemic. It was necessary to create, at short notice, training courses for professors to introduce them to virtual education. The teachers of the institution were accustomed to working most of their time in a face-to-face teaching-learning model. In some cases, professors worked in a hybrid or blended learning model using the Moodle platform as a tool for asynchronous interaction with students. Advantageously, the university already had an adequate technological and management infrastructure to face the challenge of implementing the telework modality and continuing with online education and research activities. The available resources were a repowered institutional data center, a new data center for research at the Faculty of Computer Science and Electronics, the Moodle platform [2], the document management systems [3], Microsoft active volume licensing agreements [4], and access to the Zoom platform through an agreement with CEDIA [4]. Also, there were various ongoing research projects related to online education, and the Institute of Online and Distance Education had recently been created. By disposition of the Polytechnic Council, emergency training was carried out for all teachers in the Faculty of Computer Science and Electronics, coordinated by the Institute of Online and Distance Education. The training contents were related to the use and configuration of virtual classrooms using the PACIE methodology [5]. PACIE is a methodology that allows the use of ICTs as a support for learning and self-learning processes, highlighting the pedagogical scheme of real education. The creator of this methodology is Pedro X. Camacho P., MWA, current director of FATLA, the Latin American Technology Update Foundation [6]. One of the main aspects of the methodology is that it allows the teacher to concentrate on developing educational processes based on constructivist foundations that facilitate the practice and experimentation of their students in a motivating environment. This methodology appeals to diverse resources, activities, and evaluations, which depend on the creativity of professors. In general, the methodology depends on the interest of the educational institutions in inserting their students into the Information Society. With this background, the members of the project called "Models, methodologies, and technological frameworks for the implementation and use of virtual laboratories" configured the virtual learning environment of the university. It allowed professors of some subjects to give their students access to IT laboratories to perform practices. Through this environment, students can learn the contents of the different topics that make up the lessons.
At the same time, the professor can carry out the reinforcement tasks of the PACIE methodology complying with the constructivist philosophy of the methodology.
This guarantees the learning processes of the students and counteracts the fact that they cannot go to the physical computer laboratories. The important thing about this virtual environment is that students can access the virtual laboratories from different devices, such as the cell phones and tablets that most of them have, and not necessarily only from PCs or laptop computers. To document the impact that the pandemic is generating in the different institutions of higher education in Ecuador, and specifically in the Faculty of Computer Science and Electronics of the ESPOCH, we have carried out an analysis of the academic performance of students in the last two academic periods. We have taken into account the students who took classes in the semester October 2019-April 2020 in the face-to-face modality, to compare them with the academic performance of the students who are receiving lessons in the current exceptional academic period with the online modality. Professors in the current period are teaching through the Virtual Learning Environment (VLE) of the university, which is implemented with Moodle and the PACIE methodology. They complement their lessons with technological resources such as those offered by Office 365, highlighting Microsoft Teams and Zoom for synchronous videoconference classes. It is also important to mention the incorporation of virtual computer labs for information technology practices. The article is organized as follows: Sect. 2 presents the state of the art based on the literature about the experiences of some universities and of technology companies that offer practices through virtual laboratories; Sect. 3 describes the methodology of the study; Sect. 4 indicates the structure of the virtual classrooms developed based on the PACIE methodology, highlighting the incorporation of videoconference resources and virtual laboratories for practices; Sect. 5 describes the results of the academic performance achieved in the two periods in the face-to-face and online modalities; finally, Sect. 6 presents the conclusions and future work to be carried out on the subject.
2 State of the Art Education is a process of constant reflection that aims to provide the best education to students at the lowest cost. In this scenario, online teaching plays an essential role and needs new models and methodologies that effectively link the contents with pedagogical and technological aspects. To obtain practical online education in a computer laboratory, it is necessary to use several tools. These tools should follow the new concepts of immersive and global education; that is, they must follow the educational pillars and not impose technological barriers [1]. In a VLE, getting the student to make a connection between theory and practice is a big challenge. People learn by association, building ideas or skills. This learning process takes place through active discovery, dialogue, participation, exchange of information, and experiences in learning communities. There is not much evidence of the construction and development of a VLE using virtual laboratories [7]. The educational literature reveals several ideas and new technologies for studying the future of education, such as cloud computing, social networks, virtual laboratories, virtual reality, virtual worlds, MOOCs, BYOD, etc. Many theories support the view that it is not enough to learn new theoretical concepts. These theories also suggest that there should
be scenarios that allow the student to link theory and practice. The UNESCO declaration contains all these theories [8]. The use of cloud computing for education was embraced in the last years by many leading IT companies such as Microsoft, Google, Amazon, HP, VMware, and IBM. These companies have provided learning tools to support educational institutions. There are many simple, highly available, scalable, and inexpensive cloud applications (such as Google Apps, Dropbox, Office 365, etc.) that are being used by students. For example, Google Apps for Education and Microsoft Office 365 offer online applications such as word processing, spreadsheets, and presentations to be used in class. Teachers, students, and administrative staff can use a Google or Microsoft e-mail account with their institutional domain. Besides, they can use Zoom, Microsoft Teams, Adobe Connect, Cisco WebEx Meetings, or YouTube channels to transmit videoconferences, or storage from OneDrive or Dropbox. But all this technology does not provide resources for IT practices. Face-to-face education tries to follow all the educational pillars. In particular, it focuses on learning through practice by using laboratories. The need to learn information technology in an applied way faces several limitations, such as the high cost of laboratory equipment, space availability, and problems with schedules when several students need to use a lab at the same time. In 2008, Siddiqui stated that teaching based only on theoretical aspects may be insufficient and inefficient [9], so it is necessary to include activities supported by technology to align with the practical objectives of education. A study by Boukil and Ibriz claimed that one of the most exciting applications of e-learning is virtual laboratories based on cloud computing, and defines them as a collection of computing resources, storage, and network elements implemented for educational purposes. These laboratories provide the necessary infrastructure to simulate a classic laboratory by using resources that are accessible through the Internet. Therefore, it is essential to move from traditional laboratories to virtual laboratories [10]. There is evidence that in 2009 the Software Foundation launched a platform called VCL that was used by 30,000 students and professors, so VCL can mainly be considered as middleware for a private cloud. Other studies analyzed the process of creating and reserving virtual machines in a public cloud such as Amazon EC2. These studies focused on aspects such as scalability and administration through the creation or termination of virtual machines. Some researchers also created prediction models to adapt better to demand, for example, algorithms to predict the load of cloud-based learning infrastructures and to provision resources dynamically, monitoring resources such as CPU load, disk I/O, process queue status, or interrupts per second. A study of several VLEs concludes that most only offer theoretical classes composed of a typical sequence of videos, readings, formative self-evaluation, and a summative evaluation at the end of each lesson or course, without the student being able to apply the theory. Virtual desktop environments with pre-configured software are generally used in different educational disciplines. For example, [6] presents a virtual desktop application called Dumbogo. This application runs on a private VMware-based cloud, where students and teachers can share documents and multimedia resources.
The authors claim that virtual desktops are used in e-Learning due to the ubiquity of cloud computing and
because teachers can offer appropriate material. Other studies describe the use of virtual desktops, called Ulteo, which uses open source technology [11]. Harvard University offers students a course on Fundamentals of Cloud Computing with Microsoft Azure. The content of this course covers the fundamental architecture and design patterns required to create highly available and scalable solutions using the Microsoft Azure Platform as a Service (PaaS) [12]. Berkeley University offers a Cloud Computing course in which teachers cover technology trends, architecture, and the design of cloud deployments. This course is a mixture of lectures, workshops, and student presentations, and it includes a practical project using virtual machines [13]. The University of Cambridge has a free course on cloud computing available. This course aims to teach students the fundamentals of cloud computing, with topics such as virtualization, data centers, resource management, storage, and popular cloud applications. Besides, the course includes tutorials on different cloud infrastructure technologies and a practical project [14]. Another promising way to gain IT skills is to study online courses (MOOCs) with practice scenarios that provide access to virtual machines. The leading online platforms that offer training courses in information technology are edX and Coursera. edX offers various IT courses, such as the Internet of Things, Virtualization, etc.; these courses include practices that use certain cloud computing services like IaaS and PaaS [20]. Coursera offers several courses for studying technology. It uses Google Cloud and AWS platforms to deliver virtual machines to students for practices, but requires special software, such as PuTTY, for remote desktop connection and access [15].
3 Methodology To know how the ESPOCH, and specifically the Department of Computer Science, has faced the teaching-learning process during the COVID-19 pandemic, a study was carried out based on the following sources (see Fig. 1):
1. Data obtained from the OASIS institutional academic system referring to the student enrollment process, considering the last six academic periods.
2. Surveys carried out on 152 EIS students, obtaining the students' degree of satisfaction with the move from face-to-face education to virtual education, the desired outcomes of the execution of online classes, and the proposals presented by students so that virtual teaching-learning produces the expected results regarding the acquisition of knowledge, in a setting made up of a virtual classroom modeled on the MOODLE platform following the PACIE methodology and with the use of Microsoft Teams for the real-time process.
3. Academic performance reports for the periods April-September 2020 and September 2019-February 2020.
4 Structure of the Virtual Classroom Professors structured the virtual classrooms following the PACIE methodology, which defines five phases: Presence, Scope, Training, Interaction, and E-learning [16]. Presence phase. Create an institutional presence by maintaining the institutional corporate image in the virtual environment. It can be achieved by providing information
Fig. 1. Diagram of the methodology used to obtain information
through multiple media. This phase emphasizes the importance of the image of the virtual classroom: creating visual impact, ranking information, and presenting content, resources, and activities with easy navigation in a friendly environment. Scope phase. Establish a real scope by setting international standards, keeping in mind that the virtual classrooms are open to students worldwide. From the pedagogical perspective, this phase plans a sequence that allows gradual access to virtual education, avoiding rejection. Training phase. It offers permanent training to teachers based on the constructivist paradigm, contemplating not only the use of technological tools but, fundamentally, the development of tolerance, interaction, and socialization skills. It proposes the Design Cycle as a model to create and evaluate educational solutions in five stages: research, planning, creation, evaluation and research, learning by doing, and self-learning. Interaction phase. The learning community interacts through various means both inside and outside the Virtual Campus (email, videoconference, chat, 3D avatars, etc.). This phase generates interactive processes in the classroom, motivating participation and socialization to achieve the shared construction of knowledge. E-learning phase. Education evolves with the inclusion of ICTs and e-learning processes based on constructivist pedagogy. Constructivist pedagogy goes beyond the transmission of information to lead to the generation of knowledge, and it appeals to different evaluation and self-evaluation techniques. In each of the phases, the methodology incorporates technological resources, highlighting the inclusion of virtual laboratories in the training phase and the inclusion of the videoconference resource using Microsoft Teams and Zoom. When lessons are given by videoconference, the videos are recorded and then published in the virtual classroom. The technological framework defines the virtual resources available to students from the EVA (Fig. 2). OpenUDS acts as a traffic manager, which allows or denies access to
the resources defined by the teacher. Virtual machines are installed on various hypervisors hosted in the cloud, such as oVirt, Hyper-V, KVM, and vSphere. Virtual resources hosted in free public cloud tiers such as Amazon or VMware can also be accessed. The open structure of OpenUDS enables switching between operating systems, hypervisors, authenticators, and protocols. Linking all these technological tools is possible thanks to a cross-platform connection broker and a tunneller, which configure an ecosystem for laboratory education for IT practices.
Fig. 2. Linking virtual laboratories in the EVA
The PACIE methodology suggests implementing well-organized virtual classrooms with several different sections, with resources both inside and outside the virtual classroom. This ensures that correct interaction processes can be generated and that students are motivated to experiment through practices, developing knowledge and facilitating the learning processes. A good structure must be divided into sections. Each block contains sections that group resources and activities according to their functionality and usability [17]. This is observed in Table 1.

Table 1. Efficiency parameter metrics.

Section | 0 Block                                              | Academic Block               | Closing block
1       | Information                                          | Exposition                   | Feedback
2       | Communication                                        | Rebound                      | Transaction
3       | Interaction (synchronous activity, videoconference)  | Construction (Hands-on Labs) |
4       |                                                      | Verification                 |
The structure of block 0 is indicated in Fig. 3, and the structure of the academic block and closing block can be seen in Fig. 4.
Fig. 3. Structure of the 0 block
Fig. 4. Structure of the academic block
5 Results This section presents the results of the information obtained through the different sources presented in the methodology of Sect. 3. Once the data has been processed, it allows us to carry out the corresponding analysis of the identified sources.
- OASIS Institutional Academic System. The data on the number of students who have enrolled in the Department of Computer Science in the last six academic periods is analyzed, as shown in Table 2, to establish a history that allows a comparative analysis with the current period affected by the COVID-19 pandemic and to establish whether there is an impact on the enrollment process.

Table 2. Analyzed periods and students enrolled.

Num | Period                          | Enrolled students
1   | October 2017 – March 2018       | 397
2   | April – August 2018             | 404
3   | September 2018 – February 2019  | 441
4   | March – July 2019               | 430
5   | September 2019 – February 2020  | 405
6   | April – September 2020          | 407
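As a quick check on the figure quoted in the next paragraph, the average enrollment of the six periods in Table 2 and the deviation of the pandemic-affected period from that average can be recomputed as follows.

```python
enrolled = {
    'October 2017 - March 2018': 397,
    'April - August 2018': 404,
    'September 2018 - February 2019': 441,
    'March - July 2019': 430,
    'September 2019 - February 2020': 405,
    'April - September 2020': 407,   # period affected by the pandemic
}

average = sum(enrolled.values()) / len(enrolled)       # 414 students
pandemic = enrolled['April - September 2020']
below_average = (average - pandemic) / average * 100   # about 1.69%

print(f'Average enrollment: {average:.0f} students')
print(f'Pandemic period is {below_average:.2f}% below the average')
```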
The academic period number six, April – September 2020, began with a delay due to the pandemic and had to change from the face-to-face to the virtual modality. This implied a challenge for the institution, its teachers, students, administrative staff, and support staff. Advantageously, appropriate and timely decision-making meant that the administrative processes were geared in such a way that the assumed low presence of students in virtual teaching did not occur. Enrollment in the period affected by the pandemic is only 1.69% below the average of the six semesters (414 students), which is not representative. This is because in recent years the ESPOCH has implemented the online registration system, which had been used from the beginning by the EIS, so this process did not imply anything new for the students, who completed it without inconveniences.
- Students. To know the impact that the virtual education process, which they entered abruptly, had on the EIS students, surveys were conducted after three months of virtual education, asking about their degree of satisfaction with the classes received in this modality; Fig. 5 summarizes their responses. In Fig. 5, it can be observed that 38% of the students are indifferent to the way the classes are taken; the virtual modality does not cause them a significant impact, due to the nature of the career, which always requires them to use technological tools. These students also had the necessary devices in advance to face the online modality. 20% of the surveyed students say they are not very dissatisfied.
Fig. 5. Degree of student satisfaction in online classes
11% are not satisfied with this teaching method because they were used to attending classes at the ESPOCH, have lost in-person contact with their friends, and do not have the resources to maintain a good Internet connection. Before the online modality, they used to have free Internet access on campus as part of the university facilities. Now, students are required to have a computer with good features, and students who come from other cities, who are around 60%, do not know what to do with the place they rent and where they keep their belongings; they had to decide whether to keep paying for it or to stop doing so and go back to their hometown. 6% consider that this type of teaching keeps them satisfied because they can stay at home, which reduces the possibility of getting sick with COVID-19. They are close to their parents, because some students came from other cities to study or are from the countryside; now they can stay with their family and strengthen family ties, which the students appreciate in a significant way at a time of unease due to the presence of the pandemic.
- Academic Performance Reports. For this point, we have compiled reports on the academic performance in the periods September 2019 – February 2020 and April – September 2020 for 17 subjects of the career that make use of computer laboratories and that in the latter period could use virtual laboratories. For this, we consider the following hypotheses:
Ho: There are no significant differences between the mean qualifications of the face-to-face and virtual modalities.
H1: There are significant differences between the mean qualifications of the face-to-face and virtual modalities.
Confidence level: 95%. Experimental group: virtual modality. Control group: face-to-face modality. It is a cross-sectional study, since we analyzed the qualifications at the same point in time. We applied Student's t-test for independent samples because we have a fixed variable that creates two groups (Modality: 1. Face-to-face, 2. Virtual) and a numerical random variable (Score). The Shapiro-Wilk normality test was performed since the sample comprised 17 cases, i.e. fewer than 50. In both statistical tests, we obtained the significance values or P-values (Table 3).
Table 3. Statistical data

                | Face-to-face | Virtual
N Valid         | 17           | 17
N Lost          | 0            | 0
Mean            | 6.9012       | 7.5559
Median          | 6.86         | 7.5900
Mode            | 6.67         | 8.30
Std. deviation  | 0.55762      | 0.68781
Variance        | 0.311        | 0.473
Minimum         | 5.38         | 6.34
Maximum         | 8            | 8.60
• For face-to-face modality, the P-value = 0.049 is less than 0.05, which is why it is determined that it does not have a normal distribution, so we apply a Non-Parametric test, to determine a significant difference of the averages. • For Virtual mode, we have P-value = 0.436, the same that is greater than 0.05, which determines that the data have a normal distribution, so we applied a Parametric test to determine a significant difference in the averages. The statistical test to compare means T-Test is applied for independent samples. Initially, the test of equality of variance is carried out with the Levene test. The group of statistics that appears in Table 4 shows the results of the application of the T-Test for independent samples. Table 4. Group statistics
        Modality       N    Mean     Std. deviation   Std. error mean
Score   Face-to-face   17   6.9012   0.55762          0.13524
        Virtual        17   7.5559   0.68781          0.16682
P-value = 0.005 < α = 0.05. In Levene's test of equality of variances, the P-value = 0.141 is greater than the significance level of 0.05; therefore, it is determined that the variances are equal. For the t-test for equality of means, a P-value of 0.005 is obtained, which is lower than the significance level of 0.05. Therefore, it is concluded that there are significant differences between the grade averages of the face-to-face and virtual modalities. The estimated mean of the virtual modality (7.5559) is greater than that of the face-to-face modality (6.9012), so the null hypothesis is rejected and the researcher's hypothesis is accepted.
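The statistical workflow described above (normality check, Levene's test, then the independent-samples t-test) was run in a standard statistical package. As a reading aid, the following is a minimal Python sketch of the same sequence; the grade arrays below are synthetic stand-ins generated to resemble the summary statistics in Tables 3 and 4, not the study data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic stand-ins for the 17 per-subject grade means of each modality.
face_to_face = rng.normal(6.90, 0.56, size=17)
virtual = rng.normal(7.56, 0.69, size=17)

# Shapiro-Wilk normality test (n = 17 < 50 for each group)
for name, grades in (("face-to-face", face_to_face), ("virtual", virtual)):
    stat, p = stats.shapiro(grades)
    print(f"Shapiro-Wilk {name}: W = {stat:.3f}, p = {p:.3f}")

# Levene's test for equality of variances, then the independent-samples t-test
lev_stat, lev_p = stats.levene(face_to_face, virtual)
t_stat, t_p = stats.ttest_ind(face_to_face, virtual, equal_var=lev_p > 0.05)
print(f"Levene p = {lev_p:.3f}; t = {t_stat:.3f}, p = {t_p:.3f}")
```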
6 Conclusions
This paper presents a study of how the teaching–learning process was affected by the COVID-19 pandemic, which forced the Department of Computer Science at ESPOCH to implement virtual education using e-Learning tools. The methodology used to create the virtual classrooms in the VLE provided essential resources, such as the possibility of using virtual laboratories to mitigate the fact that students could not physically access the computer laboratories in the subjects that require practical assignments and projects. The inclusion of synchronous tools for collaboration and communication in lecture presentations, and their recording for subsequent publication, has influenced the students' acceptance of virtual education. COVID-19 did not affect the student enrollment process. 38% of the students surveyed are indifferent to how the classes are conducted; that is, the virtual modality does not have a significant impact on them. The academic performance of the students increased by 11.30% with respect to the learning achievements stated in the syllabus. The students show an improvement in academic performance because significant differences are observed in the grade means of the face-to-face and virtual modalities, the virtual-modality average of 7.5559 being greater than the face-to-face average of 6.9012. The students who used virtual laboratories feel motivated because they could carry out their practical work as often as they needed, which enhances their learning. Finally, virtual laboratories constitute a useful didactic resource that arouses interest and motivation in students and facilitates the apprehension of theoretical concepts.
References 1. Encalada, W.L.: Nube Social para eseñanza práctica de Tecnología de Información. Universidad Nacional Mayor de San Marcos, Lima, Peru/ ESPOCH (2018) 2. ESPOCH, “EVA ESPOCH,” (2020). https://elearning.espoch.edu.ec/. Accessed 02 August 2020 3. Quipux, “Sistema Documentas”. https://oficina.espoch.edu.ec/. Accessed 02 Aug 2020 4. Microsoft, “Volume licenses,” (2020). https://www.microsoft.com/Licensing/servicecenter/ default.aspx. Accessed 02 Aug 2020 5. Ferrer, K.M.F., de la Soledad Bravo, M.: Methodology pacie in virtual learning environments for the achievement of collaborative learning. Rev. Electrónica Diálogos Educ. 12, 3–17 (2012) 6. P.X.C.P, “FATLA,” (2020). https://www.fatla.org/becas/contacto.html. Accessed 02 Aug 2020 7. De, La Roca, M., Morales, M., Amado-Salvatierra, H., Rizzardini, R.H.: Vinculando teoría y práctica a través de una secuencia de actividades de aprendizaje en un MOOC (2018) 8. Butcher, N.: Technologies in Higher Education : Mapping the Terrain (2014) 9. Siddiqui, A., Khan, M., Akhtar, S.: Supply chain simulator: a scenario-based educational tool to enhance student learning. Comput. Educ. 51(1), 252–261 (2008)
10. Boukil, N., Ibriz, A.: Architecture of remote virtual labs as a service in cloud computing. In: Proceedings 2015 International Conference Cloud Computer Technology Applications CloudTech (2015) 11. Wang, Q., Ye, X.D., Chen, W.D., Xu, Y.F.: A study of cloud education environment design and model construction. In: Proceedings of 2012 2nd International Conference on Consumer Electronics, Communications and Networks, CECNet 2012, pp. 3082–3085 (2012) 12. University, H.: Fundamentals of Cloud Computing with Microsoft Azure (2019). https:// online-learning.harvard.edu/course/fundamentals-cloud-computing-microsoft-azure%0D. Accessed 31 July 2019 13. Berkeley University, “Cloud computing: Systems, Networking, and Frameworks,” (2019). http://people.eecs.berkeley.edu/~istoica/classes/cs294/11/. Accessed 31 July 2020 14. D. of C. S. and T. Cambrige University, “Cloud Computing,” 2019. https://www. cl.cam.ac.uk/teaching/1819/CloudComp/, https://www.cl.cam.ac.uk/teaching/1819/CloudC omp/. Accessed 31 July 2020 15. Google Cloud, “Essential Google Cloud Infrastructure: Foundation| Coursera.,” (2019). https://www.coursera.org/learn/gcp-infrastructure-foundation?specialization=gcp-. Accessed 25 Oct 2019 16. Medina, G.A.: Pacie Y AULAS VIRTUALES (2018) 17. Pástor, D., Jiménez, J., Arcos, G., Romero, M., Urquizo, L.: Patrones de diseño para la construcción de cursos on-line en un entorno virtual de aprendizaje. Ingeniare 26(1), 157–171 (2018)
Mathematical Self-efficacy and Collaborative Learning Strategies in Engineering Career Aspirants

Ivan Iraola-Real1(B), Ling Katterin Huaman Sarmiento1, Claudia Mego Sanchez1, and Christina Andersson2
1 Universidad de Ciencias Y Humanidades, Lima 15314, Peru
[email protected], [email protected], [email protected]
2 Frankfurt University of Applied Sciences, 60318 Frankfurt am Main, Germany
[email protected]
Abstract. In the Peruvian context, the use of collaborative learning strategies is essential for learning mathematics, both to achieve adequate academic performance and, at the same time, to develop an adequate level of confidence in one's mathematical abilities, called self-efficacy. The present study aims to identify the predictive relationship of collaborative learning strategies and academic performance with mathematical self-efficacy in students aspiring to engineering careers. The sample consisted of 122 students (56 men (45.9%) and 66 women (54.1%)) between 15 and 26 years old (Mage = 17.81, SD = 1.74). The results identified that women have a lower mean in mathematical self-efficacy than men, but similar results in academic performance and in the use of collaborative strategies for learning mathematics. Finally, the multiple linear regression analysis shows that collaborative strategies in learning mathematics and academic performance in mathematics positively predict mathematical self-efficacy. Keywords: Collaborative learning · Learning strategies · Mathematical self-efficacy · Academic performance
1 Introduction
Learning mathematics is fundamental to professional training. Nevertheless, since the 1970s society has recognized that mathematical education was insufficient and inadequate [1]. Over the years this has been improving thanks to the appearance of diverse methodologies that rethink how the learning of mathematics can be revolutionized. It is necessary to bear in mind that mathematics is a subject that is difficult to understand and to solve because of its characteristic complexity [2]; that is why many young people have deficits in general and specific knowledge of the subject when facing a university entrance assessment, and these assessments are even more demanding and difficult for applicants to engineering careers.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 127–136, 2021. https://doi.org/10.1007/978-3-030-63665-4_10
Engineers need mathematical skills to face socioeconomic demands [1]. Therefore, problems such as those mentioned above have to be studied in order to favor progress in the learning process and to change the negative thoughts, and frequently the fear, that students feel toward mathematics. Teachers therefore search for better learning strategies for their students, exploring diverse methodologies to find the most suitable one and the one that offers significant benefits to the learning process [3]. In this context, active methodologies arise, emphasizing the role of students as the main actors in their own learning, in which they develop capabilities for searching, analyzing, and evaluating information while exchanging comments and experiences with their peers [2]. Applying strategies focused on the relationship between two or more people therefore fits the needs of today's society [4], bearing in mind its social essence and the interaction with others.
1.1 Benefits of Collaborative Learning in Mathematics
Collaborative learning is one of the most suitable knowledge-building strategies, and one with major results in mathematics [5]. In this model, students study in teams [6], and this reciprocity favors learning [7]. Collaborative learning offers many benefits: it improves the learning process in mathematics because it allows the group to define the values of the variables involved in a problem so that the solution obtained is the most suitable one [2]; it allows ties to be established between different thoughts and opinions and the various ways of solving the problem situations that appear, all directed toward a shared goal [8]; it fosters appropriate attitudes within an approach of autonomous learning [6]; and it generates inspiring situations founded on the acquisition of competences [9], so that, thanks to the combined and interconnected contribution of the team members, each participant gains. The quality of the ideas and of the solutions proposed with collaborative learning strategies has also been confirmed [10]. Because of these benefits, through teamwork students seek new solutions [2] when facing unknown situations and problems [11], thus achieving collaborative learning [12]. It is therefore important to bear in mind that in collaborative learning students develop their reasoning and their ability to solve problems, and they use and make suggestions to their classmates [13]. In this way, collaborative strategies favor organized work and satisfaction with learning [14].
1.2 Collaborative Learning in Mathematics, Self-efficacy and Academic Performance
When learning mathematics it is also necessary to have a suitable level of confidence in order to solve mathematical problems. This confidence is called mathematical self-efficacy [15]. Self-efficacy develops from the following sources: the experience of mastery, the understanding of the sensations and emotions experienced by oneself or by others, the encouragement to act in ways not initially considered, and the body's reactions to stress [16], allowing suitable levels of academic performance. This type
of performance is predicted principally by self-efficacy and by the collaborative strategy [17]: students who apply methods that involve collaborating with others to achieve a result obtain better performance in mathematics [8], thus demonstrating the fruits of the whole educational process [18]. From another perspective, the achievements obtained through excellent academic performance constitute experiences that favor the development of suitable self-efficacy [19], because an experience of mastery or success allows students to trust in their own capacities [16]. The learning produced by collaboration has a positive impact on academic results [20, 21] and encourages the development of competences [2]. It must be pointed out that this type of learning does not consist only of carrying out a task as a group: the collaborative strategy requires commitment and must be repeated and maintained over time [22]. Likewise, students learn actively because there is a joint objective [7], which is to find the solution to mathematical problems related to their immediate environment [23].
1.3 Objective and Hypothesis
As analyzed above, the present research aims to identify the predictive relationship of collaborative learning strategies and academic performance [8, 20] with self-efficacy in mathematics [16, 19] in a sample of students aspiring to engineering careers. In accordance with this objective, the following hypothesis is proposed:
– Collaborative strategies in learning mathematics significantly predict mathematical self-efficacy in students aspiring to engineering careers, provided they possess suitable levels of academic performance (see Fig. 1).
Fig. 1. Hypothetical Model.
2 Methodology
2.1 Participants
The participants were 122 students who were candidates for engineering careers (56 men (45.9%) and 66 women (54.1%)), between 15 and 26 years old (Mage = 17.81, SD = 1.74) (see Fig. 2).
Fig. 2. Frequency (Gender).
2.2 Measures
Collaborative Strategies in Learning Mathematics Scale. This scale evaluates interactive learning strategies used when studying mathematics. It is one-dimensional and consists of 4 items with response options on a 7-point Likert scale (from 1, totally disagree, to 7, totally agree). Validity was assessed with exploratory factor analysis (EFA); the Kaiser-Meyer-Olkin sampling adequacy measure (KMO) was .75, and Bartlett's test of sphericity was significant (χ2 = 320.843, df = 6, p < .001). Regarding reliability, the Cronbach's alpha coefficient was .86 (adequate [24]).
Mathematical Self-Efficacy Scale [25]. This is an abbreviated and adapted version of Palenzuela's scale to assess confidence levels in mathematical abilities. It is a one-dimensional scale with 5 items. Validity was assessed with EFA; the Kaiser-Meyer-Olkin sampling adequacy measure (KMO) was .79, and Bartlett's test of sphericity was significant (χ2 = 307.040, df = 10, p < .001). Regarding reliability, the Cronbach's alpha coefficient was .85 (adequate [24]).
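As an illustration of the reliability figures reported above, Cronbach's alpha can be computed directly from an item-response matrix. The sketch below uses a synthetic 4-item response matrix as a stand-in for the scale data, which are not reproduced in the paper.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of Likert scores."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Synthetic 122 x 4 matrix of 1-7 Likert responses (illustrative only).
rng = np.random.default_rng(1)
base = rng.integers(1, 8, size=(122, 1))
responses = np.clip(base + rng.integers(-1, 2, size=(122, 4)), 1, 7)
print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
```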
3 Results
After the analysis of the validity and reliability of the instruments, gender difference analyses were carried out.
3.1 Gender Differences
First, since the sample consisted of 122 participants (>50), the statistical distribution was analyzed with the Kolmogorov-Smirnov (KS) test [26] according to gender, and the variables were shown not to have a normal distribution. Table 1 shows that the test was significant (p < .05) for the three analysis variables.
Table 1. Normality test (Kolmogorov-Smirnov).
Variable                                            Gender   Statistic   df   p
Collaborative strategies in learning mathematics    Men      .175        46   .001
                                                    Women    .199        60   .000
Mathematical self-efficacy                          Men      .193        46   .000
                                                    Women    .135        60   .008
Academic performance in mathematics                 Men      .180        46   .001
                                                    Women    .158        60   .001
Table 2 shows the Mann-Whitney test analysis according to gender (women and men). The asymptotic significance value (p < .05) for the variable mathematical self-efficacy confirmed lower levels in women. However, for academic performance in mathematics and collaborative strategies in learning mathematics, men and women did not show significance values indicating relevant differences (p > .05). These results suggest that although women have less confidence in their mathematical abilities, they use collaborative learning strategies to study in the same way and achieve the same levels of academic performance in mathematics as men. This finding is consistent with studies of students at different educational levels: for example, studies with French primary school students identified that girls had lower levels of mathematical self-efficacy [27], and these findings coincided with other studies with samples of high school students [28] and of university students [29].
3.2 Relations Between Variables
Relationships between variables were analyzed with Pearson's r, assuming Cohen's correlation criteria [30] for social studies (light, r = .10-.23; moderate, r = .24-.36; strong, r = .37 or more). The results showed that the collaborative strategies in
Table 2. Mann-Whitney test.

Ranks
Variable                                            Gender   N    Mean rank   Sum of ranks
Collaborative strategies in learning mathematics    Men      56   59.61       3338.00
                                                    Women    66   63.11       4165.00
                                                    Total    122
Mathematical self-efficacy                          Men      56   71.50       4004.00
                                                    Women    66   53.02       3499.00
                                                    Total    122
Academic performance in mathematics                 Men      46   53.59       2465.00
                                                    Women    60   53.43       3206.00
                                                    Total    106

Test statistics (grouping variable: gender)
                               Collaborative strategies   Mathematical self-efficacy   Academic performance
Mann-Whitney U                 1742.000                   1288.000                     1376.000
Wilcoxon W                     3338.000                   3499.000                     3206.000
Z                              -0.547                     -2.886                       -0.026
Asymptotic sig. (two-tailed)   .584                       .004                         .979
learning mathematics were related to mathematical self-efficacy (r = .26**, p < .01). These results indicate that when students use collaborative strategies more frequently, this is associated with higher levels of mathematical self-efficacy, and when they use such strategies less frequently, their self-efficacy is lower. Collaborative strategies in learning mathematics were also related to academic performance in mathematics (r = .21*, p < .05): as in the previous case, if students employ collaborative strategies frequently this favors better grades in mathematics, whereas less frequent use is associated with poorer performance. Finally, academic performance in mathematics was related to mathematical self-efficacy (r = .31**, p < .01). This last analysis confirms that the highest levels of academic performance (expressed in grades) are associated with the higher levels of confidence that a student has when studying mathematics (and vice versa) (see Table 3).
3.3 Linear Regression
Then, a multiple linear regression analysis was used to identify the predictive power [31] of collaborative strategies in learning mathematics and academic performance in mathematics (as predictor variables) on mathematical self-efficacy (as output variable).
Table 3. Relations between variables.

Variables                                             1       2       3
1 Collaborative strategies in learning mathematics    –
2 Mathematical self-efficacy                          .26**   –
3 Academic performance in mathematics                 .20*    .31**   –
Note. Asterisks (*, **, ***) show significant relationships. * p < .05, ** p < .01, *** p < .001 (bilateral).
This analysis found that collaborative strategies in learning mathematics (β = .18; p < .05) and academic performance in mathematics (β = .10; p < .01) positively predict mathematical self-efficacy, explaining 13% of the variance (R2 = .13). The results indicate that if students apply collaborative strategies more frequently when studying mathematics (such as dialogue, questioning, explanation, and debate) and at the same time have a high level of academic performance, they can be predicted to have adequate levels of confidence in their mathematical abilities (and vice versa) (see Fig. 3).
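For readers who wish to reproduce this kind of model, the following is a minimal sketch of a two-predictor linear regression with standardized coefficients. The data are synthetic stand-ins, since the study data set is not distributed with the paper, and the original analysis was run in a standard statistical package.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 122
# Synthetic stand-ins for the three observed variables (z-scored below).
collab = rng.normal(5.0, 1.2, n)                       # collaborative strategies
performance = rng.normal(14.0, 2.5, n)                 # grades in mathematics
self_efficacy = 0.2 * collab + 0.3 * performance + rng.normal(0, 2.0, n)

def zscore(x):
    return (x - x.mean()) / x.std(ddof=1)

# Standardizing all variables makes the OLS coefficients the betas.
X = np.column_stack([np.ones(n), zscore(collab), zscore(performance)])
y = zscore(self_efficacy)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ coef
r2 = 1 - ((y - y_hat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(f"beta_collab = {coef[1]:.2f}, beta_perf = {coef[2]:.2f}, R^2 = {r2:.2f}")
```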
Fig. 3. Multiple linear regression.
4 Discussion
In the present study, the objective was to identify the predictive relationship of collaborative strategies in learning mathematics and academic performance [8, 20] with self-efficacy in mathematics [16, 19] in a sample of students aspiring to engineering careers.
In the previous analyses, significant gender differences were identified, indicating that the women had lower mathematical self-efficacy than the men. These results confirm the study of Joët, Usher, and Bressoux [27], who analyzed a sample of 395 French primary school students and identified that boys possessed higher levels of mathematical self-efficacy. These findings were consistent with studies on secondary education samples [28] and with a sample of 326 university students studying algebra, in which the men showed significantly higher mathematical self-efficacy levels [29]. It was also possible to identify positive relationships among all the variables, which shows that collaborative learning strategies favor learning as reflected in academic performance [19]; this is possible because working collaboratively favors problem solving through consultation and mutual suggestions, which in the end favors learning [13], the creation of solution strategies, comprehension, and satisfaction with the study [14]. At the same time, academic performance represents a set of experiences that help to develop suitable levels of personal efficacy [19], feeding the so-called mastery experiences necessary for the development of mathematical self-efficacy [27]. Nevertheless, there are studies that differ concerning the positive effects of collaborative strategies. For example, the experimental studies of Retnowati, Ayres, and Sweller [20], carried out with two samples of 182 and 122 students in Indonesia, demonstrated that in some cases individual learning was more efficient for achieving understanding of algebra when worked examples were presented, while collaborative strategies were more beneficial when problem-solving search techniques were used; that study concluded that collaborative learning is helpful when new solutions must be sought, but not when the teacher exhibits examples of solutions. Likewise, the study of Zhan, Fong, Mei, and Liang [21] with 588 university students identified that the achievements of collaborative work depended on the number of women and men making up the groups, concluding that learning achievements were adequate in teams formed only by women and in mixed groups with the same number of men and women (two women and two men), in contrast to groups in which there were only men or fewer women. Then, in the multiple linear regression analysis, it was observed that collaborative learning strategies are the stronger predictor of suitable levels of mathematical self-efficacy. This result confirms the studies of Joët, Usher, and Bressoux, who affirm that successful experiences, such as obtaining positive results in solving mathematical problems, feed self-efficacy [27]; in this way, collaborative strategies, by favoring mathematical learning [5] and learning in all subjects in general [8], allow, from Albert Bandura's perspective, the development of suitable levels of mathematical self-efficacy [16]. Finally, the linear regression analysis demonstrated that high levels of academic performance favor mathematical self-efficacy.
This finding again confirms the empirical studies of Joët, Usher, and Bressoux, who identified that mathematical performance contributes to a suitable level of development of self-efficacy [27]. It also coincides with the study of Wilson and Narayan [19], who observed in a sample of 96
university students that appropriate levels of academic performance favor suitable levels of self-efficacy, because good performance is perceived by the student as a mastery experience that allows them to trust their own capacities [16]. Comparable results were found by Peters, whose study with a sample of 326 university students confirmed that suitable performance in algebra can contribute to a high level of mathematical self-efficacy [29].
References 1. Vásquez, R., Romo, A., Trigueros, M.: Un contexto de modelación para la enseñanza de matemáticas en las ingenierías. Ponencia presentada en la XIV CIAEM Conferencia Interamericana de Educación Matemática en Tuxtla, Chiapas, México (2015) 2. Herrada, R., Baños, R.: Experiencias de aprendizaje cooperativo en matemáticas. Espiral. Cuadernos del Profesorado: Revista Multidisciplinar de Educación 23(11), 99–108 (2018) 3. Huaney, J.: El aprendizaje colaborativo para la mejora del rendimiento académico de los estudiantes de la escuela profesional de ingeniería civil de la Universidad Católica Los Ángeles de Chimbote, M.S. thesis, Universidad Católica Los Ángeles de Chimbote, Chimbote (2019) 4. Contreras, F.: La evolución de la didáctica de la matemática. Horizonte de la Ciencia 2(2), 20–25 (2015) 5. Yadav, R.: Effect of collaborative learning on mathematics, M.S. thesis, department of mathematics education central department of education university campus Tribhuvan university Kirtipur, Kathmandu, Nepal (2017) 6. Avello, R., Marín, V.: La necesaria formación de los docentes en aprendizaje colaborativo. Profesorado. Revista de Currículum y Formación de Profesorado 3(20), 687–713 (2016) 7. Fernández, M., Valverde, J.: Comunidades de practica: un modelo de intervención desde el aprendizaje colaborativo en entornos virtuales. Revista Comunicar 42, 97–105 (2014) 8. Roselli, N.: Collaborative learning: theoretical foundations and applicable strategies to university. Propósitos y Representaciones 4(1), 219–280 (2016) 9. Londoño, G.: Aprendizaje colaborativo presencial, aprendizaje colaborativo mediado por computador e interacción: Aclaraciones, aportes y evidencias. Revista Q, 4 (2) (2017) 10. Iglesias, J., López, T., Fernández, J.: La enseñanza de las matemáticas a través del aprendizaje cooperativo en 2º curso de educación primaria. Contextos Educativos, Extraordinario 2, 47–64 (2017) 11. Barkley, E., Cross, K., Major, C.: Collaborative Learning Techniques: A Handbook for College Faculty, 2nd edn. Jossey-Bass, San Francisco, CA (2014) 12. Loes, C.: Applied learning through collaborative educational experiences. New Dir. High. Educ. 188, 13–21 (2019) 13. Davidson, N., Major, C.: Boundary crossings: cooperative learning, collaborative learning, and problem-based learning. J. Excellence Coll. Teach. 25(3&4), 7–55 (2014) 14. García, A., Basilotta, V., López, C.: Las TIC en el aprendizaje colaborativo en el aula de Primaria y secundaria. Comunicar 42, 65–74 (2014) 15. Muñoz, A., Gómez, V.: Evaluación de una experiencia de aprendizaje colaborativo con TIC desarrollada en un centro de educación primaria. Edutec. Revista Electrónica de Tecnología Educativa 51, 291 (2015) 16. Bandura, A.: Social Foundations of Thought and Action - A Social Cognitive Theory, Englewood Cliffs. Prentice Hall, N.J. (1986) 17. Bandura, A.: Toward a unifying theory of behavioral change. Psychol. Rev. 2(84), 195–215 (1977)
18. Huaney, J.: El aprendizaje colaborativo para la mejora del rendimiento académico de los estudiantes de la escuela profesional de ingeniería civil de la universidad católica Los Ángeles de Chimbote, M.S. thesis, Universidad Católica Los Ángeles de Chimbote, Chimbote (2019) 19. Wilson, K., Narayan, A.: Relationships among individual task self-efficacy, self-regulated learning strategy use and academic performance in a computer-supported collaborative learning environment. Educ. Psychol. 36(2), 236–253 (2014) 20. Retnowati, E., Ayres, P., Sweller, J.: Can collaborative learning improve the effectiveness of worked examples in learning mathematics. J. Educ. Psychol. 1(14), 666 (2016) 21. Zhan, Z., Fong, P., Mei, H., Liang, T.: Effects of gender grouping on students’ group performance, individual achievements, and attitudes in computer-supported collaborative learning. Comput. Hum. Behav. 48, 587–596 (2015) 22. Mercado, C., Andrade, E., Reynoso, J.: El efecto de la autoeficacia y el trabajo colaborativo en estudiantes novatos de programación. Investigación y Ciencia: de la Universidad Autónoma de Aguascalientes 74, 73–80 (2018) 23. Plaza, L.: Modelación matemática en ingeniería. IE Revista de investigación educativa de la REDIECH 13(7), 47–57 (2016) 24. Field, A.: Discovering statistics using SPSS, 3era edn. Sage Publications, Lóndres (2009) 25. Palenzuela, D.: Construcción y validación de una escala de autoeficacia percibida específica de situaciones académicas. Análisis y Modificación de Conducta 9(21), 185–219 (1983) 26. Guillaume, F.: The signed Kolmogorov-Smirnov test: why it should not be used, GigaScience, 4 (1) , s13742–015–0048–7 (2015). https://doi.org/10.1186/s13742-015-0048-7 27. Joët, G., Usher, E., Bressoux, P.: Sources of self-efficacy: an investigation of elementary school students in France. J. Educ. Psychol. 103(3), 649–663 (2011) 28. Reiner, S.: Students’ mathematics self-efficacy, anxiety, and course level at a Community College, Ph.D. dissertation, Dept. Education., Walden Univ., Washington, (2017) 29. Peters, M.: Examining the relationships among classroom climate, self-efficacy, and achievement in undergraduate mathematics: a multi-level analysis. Int. J. Sci. Math. Educ. 11(2), 459–480 (2013). https://doi.org/10.1007/s10763-012-9347-y 30. Cohen, J.: A power primer. Psychol. Bull. 112(1), 155–159 (1992) 31. Bingham, N., Fry, J.: Regression: Linear models in statistics. Springer, New York (2010)
Electronic
Economic Assessment of Hydroponic Greenhouse Automation: A Case Study of Oat Farming

Flavio J. Portero A.1, Jorge V. Quimbiamba C.1, Angel G. Hidalgo O.1, and Ramiro S. Vargas C.2(B)

1 Universidad Tecnica de Cotopaxi, Simón Rodriguez Av., Latacunga, Ecuador
2 Obuda University, Becsi 94-96, Budapest, Hungary
[email protected]
Abstract. The following research details a hydroponic greenhouse automation process. The former greenhouse, referred to as the control group, presented water waste and excessive seed consumption. After a brief assessment, an automated irrigation process was implemented using a PLC-LOGO; this configuration is referred to as the experimental group. The control irrigation system was entirely manual, whereas the experimental irrigation system was set for specific periods and remained flexible for manual control. Moreover, better pre-planting treatment of the seeds was carried out. Water consumption was significantly optimized, with a reduction in waste of over 450 L. Additionally, the seed used was reduced from 1 kg to 0.31 kg per tray. The results in terms of crop length and weight demonstrated a substantial improvement, which was further evaluated: the economic productivity of the experimental group surpasses that of the control group by nearly 30%. Finally, the experimental crop was used for a feeding experiment, in which four test subjects demonstrated the advantages of the experimental crop over the control crop. Keywords: Total-Factor Productivity (TFP) · Automated irrigation system · Hydroponics
1 Introduction The rapid increase in population has resulted in deforestation for housing construction, agriculture, and urban development. Forests have been the primary source of fuelwood, construction, and furniture raw materials [1]. During the economic development of any country, deforestation is part of the process, which has affected the carbon cycle and air quality. Additionally, developed countries have started reforestation actions during the twentieth century with a promising landscape turnaround [2]. Livestock and poultry industries comprise over 27% of the global water usage. Moreover, nearly 11% of greenhouse gas emissions (GHG) comes from agriculture [3]. Ruminant based products (beef, lamb, goat, dairy, among others) are partially linked to deforestation since grassland is needed for better product quality. The livestock industry disproved to be sustainable despite its accelerated growth. Studies from the decade © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 139–150, 2021. https://doi.org/10.1007/978-3-030-63665-4_11
2005–2015 reveal the need for better land usage, more efficient water management, and several challenges regarding adopting innovative solutions [4]. Some studies prove that only consumers can potentially mitigate the environmental effects of livestock and agriculture regarding carbon footprint. Dietary change with fewer animal products is crucial for a potential GHG reduction [5]. Ecuador can be considered as a developing country at an early stage of economic growth. Besides petroleum production, agriculture and livestock are two of the leading industries due to the favorable diverse climates in the country. According to the National Institute of Statistics and Census of Ecuador (INEC), agricultural production is predominant along the coastal region while livestock production is prevalent in the highland region. Furthermore, in Ecuador, 5.3 million hectares are considered fertile land; most of it is destined for grassland farming [6]. Consequently, Ecuadorian small and medium enterprises are required to enhance agricultural management and implement new technologies. The following case aims to prove the efficiency of a hydroponic greenhouse for oat production as a feeding source for rabbits.
2 Hydroponics Evolution
Hydroponics (hydros – water and ponos – labor) started as a novel concept of soilless agriculture in 1937 [7]. However, some ancient floating gardens (Babylonian and Aztec) are considered the first applications of hydroponics in history. It has been widely defined as the cultivation of plants in a soilless medium doped with all the necessary nutrients [8]. After several implementations of hydroponic greenhouses, some advantages and disadvantages have been identified (see Table 1).
Table 1. Advantages and disadvantages of hydroponics.
Advantages:
- In urban areas, where space for farming is limited, hydroponic crops are considered a food security solution; people can organically grow their food at home [9].
- Due to better control of the nutrients in the water, hydroponics reduces nutrient leaching, so it significantly contributes to enhanced water management and environmental protection [10].
Disadvantages:
- The initial cost and the skilled operators required are relatively higher compared to soil culture, and in the case of any disease, it rapidly spreads through the crop [11].
- Bigger plants can hardly be grown through hydroponics, as they do not buttress themselves; this method is highly suggested for herbs and leafy vegetables [10].
2.1 Hydroponic Systems
The different hydroponic systems can be defined based on the irrigation and enrichment (addition of air) process of the solution. Figure 1 summarizes the 2D description of the most common hydroponic systems.
Electrical conductivity (EC) and pH of the solution are crucial parameters for a successful hydroponic system. A study conducted on ebb-flow hydroponic systems confirmed that high EC and low pH result in higher crop yield [13].
Fig. 1. Types of hydroponic systems [12].
Table 2 details the main characteristics of the different hydroponic systems. Even though some of the types described are advisable for home gardeners, the space available for a hydroponic greenhouse should be considered. Hydroponic systems deliver increased crop yields and healthier plants, and as the plants grow in a controlled environment, there is less risk of pests [10].
2.2 Hydroponics in Latin America and Ecuador
Endemic plant species in Latin America have potential in the food and medicine industries, and some of them can be grown with hydroponic techniques [14]. Moreover, the United States of America has slightly decreased its percentage of agriculture production
Table 2. Hydroponic system types [12].
Wick: The nutrient solution is supplied through a wick or fibrous material (typically nylon) that transports water from the reservoir to the root area by capillary action.
Drip: The solution is irrigated to the plants using a pump. A timer controls the amount of solution irrigated according to specific requirements. A sub-classification into recovery and non-recovery systems can be made based on whether the solution is reused; non-recovery systems result in unfavorable pH changes.
Ebb-flow: The plants are flooded using a pump while a tubing system is controlled to establish a specific flooding time. Higher control techniques are required to avoid overflooding and mold in the reservoir.
Water culture: The plants' roots are suspended in a reservoir of the solution and receive oxygen through an air pump.
NFT: In contrast with the ebb-flow system, the solution amount can be controlled by the pump and the tray slope.
Aeroponic: The solution is sprayed onto the roots in the form of a mist. Since the roots hang freely, the oxygen of the air contributes to growth. Precise control of the spraying is required, as the roots can dry quickly.
during the last decade [15]. Thus, the North American Free Trade Agreement (NAFTA) has strengthened the hydroponic agricultural production from Mexico [16]. On the other hand, Brazil started their hydroponic production in 1987. After a successful production of lettuce, Brazil started to grow arugula and watercress [17]. The Food and Agriculture Organization of the United Nations (FAO) has conducted studies to prove the urgency of implementing Simplified Hydroponics in Latin America and Caribe (LAC). The rapid increase of population in urban and peri-urban areas lead to undernutrition [18]. Due to El Niño-Southern Oscillation (ENSO), during the winter season, farming presents some difficulties. Hence, since 2015 the Ministry of Agriculture and Livestock from Ecuador (MAGAP) has motivated small and medium enterprises to grow lettuce and strawberries through hydroponic systems [19]. In Ecuador, large scale experimental hydroponic production started in 1987. The pioneer GreenLab officially introduced hydroponic-based lettuce to the market in 1994 [20].
3 Methodology
3.1 Former Greenhouse Description
The former system (control group or C-group [21]) comprised two wooden racks (4.6 × 1.6 × 0.5 m); however, the structure was not stable enough. The system included fifty-six trays (43.5 × 2.5 × 32 cm), in each of which 1 kg of oat seed was grown. This process
did not emphasize the proper conditioning of the seed. Besides, the irrigation system required handwork, and there was no water consumption monitoring of the sprinklers. Therefore, mold and insects appeared in the oat crop. Finally, one of the disadvantages was the excess of high temperatures since the structure was located inside a plastic greenhouse (see Fig. 2).
Fig. 2. Former greenhouse structure.
3.2 Proposed Greenhouse Description To have an enhanced irrigation technique with a controlled environment [22], the proposed system (see Fig. 3), known as the experimental group or E-group [21], comprised metal racks (168 × 150 × 50 cm). It included two shelves for 20 trays each (four groups of five), which were the same as the former system. This structure was not placed inside the initial plastic greenhouse to prevent heat and, consequently, water steam excess. The amount of oat seed grown was upgraded to 0.32 kg.
Fig. 3. Proposed greenhouse system. Front view (left) Isometric view (right).
144
F. J. Portero A. et al.
Figure 3 also illustrates the automated irrigation system. It comprised a water storing tank, pipes, a pump, two shelves where trays were located, and two gutters used for collecting water after its utilization. Commonly, hydroponics grows agricultural plants by providing a nutrient solution [22]; however, this system utilized purified tap water. Oat seeds received a water spray mist from sprinklers as well as from drippers. As displayed in Fig. 4, trays were located on the greenhouse system with a tilt of 11º. This inclination was set to ease the collection of water in gutters.
Fig. 4. Trays on the proposed greenhouse system.
The automated irrigation system was implemented with a PLC LOGO 230RCE. It was programmed to activate a pump three times a day (2 min per cycle). The specific hours for pump activation were defined using a trial-and-error approach [23], and the following parameters were considered for the final setting: irrigation intervals and volumes, water stress, and light [24]. In case manual pump activation is needed on a different schedule, manual control buttons were included. Figure 5 illustrates the algorithm programmed in the PLC, whereas Table 3 shows the main elements of the automated irrigation system.
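The scheduling logic described above is simple enough to sketch in a few lines. The following Python fragment is only an illustration of the timer-plus-manual-override behavior attributed to the PLC program in Fig. 5; the activation times and the function names are assumptions, since the paper states three 2-minute cycles per day but does not list the exact hours.

```python
from datetime import datetime, time, timedelta

# Assumed daily activation times (illustrative, not from the paper).
SCHEDULE = [time(7, 0), time(13, 0), time(19, 0)]
CYCLE = timedelta(minutes=2)

def pump_should_run(now: datetime, manual_button: bool) -> bool:
    """Return True while the pump must be energized."""
    if manual_button:          # manual override, as on the real control panel
        return True
    for start in SCHEDULE:
        start_dt = now.replace(hour=start.hour, minute=start.minute,
                               second=0, microsecond=0)
        if start_dt <= now < start_dt + CYCLE:
            return True
    return False

print(pump_should_run(datetime(2020, 5, 4, 13, 1), manual_button=False))  # True
```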
4 Results After a farming process of fifteen days, crop yield can be assessed in terms of length and weight per tray. Besides, water consumption was also measured to estimate the productivity of the control and experimental group. 4.1 Weight The increasing weight per day was tracked (Fig. 6). Regarding the control group, the 1 kg oat seed was planted per tray for fifteen days. At the end of the period, an average of 2.1 kg of crop weight was observed per tray. Based on requirements for feeding test subjects (rabbits) and an expected growth ratio of five, a weight of 310 g oat seed was set for the experimental group. The average crop weight was 1.51 kg per tray as expected.
Fig. 5. Programming algorithm for PLC.
Table 3. Main elements of the automated irrigation system.
Element        Brief description
Pump           1 HP
PLC            LOGO 230 RCE
Pipes          ½" Ø
Storing tank   72 L capacity
Thus, the control hydroponic greenhouse overproduced 600 g per tray, which was usually wasted or could hardly be sold to neighboring farms. The final growth ratio was 2.1 for the control group and 4.83 for the experimental group.
4.2 Length
The average length was tracked over the fifteen-day farming period (Fig. 7). During the first three days, the control group seemed to grow faster than the experimental group. Nevertheless, the final control group length was 13.50 cm, whereas the experimental group length was 18.94 cm. A difference of nearly 6 cm demonstrates the efficiency of the automated water dosing. Furthermore, the volume and appearance of the experimental group showed greener and healthier plants.
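The growth ratio quoted above is simply the harvested crop mass divided by the seeded mass per tray; a short check using the per-tray figures reported in Sect. 4.1 is shown below (the small difference from the reported 4.83 presumably comes from unrounded seed mass).

```python
# Per-tray figures from Sect. 4.1 of the paper.
control = {"seed_kg": 1.00, "crop_kg": 2.10}
experimental = {"seed_kg": 0.31, "crop_kg": 1.51}

for name, tray in (("control", control), ("experimental", experimental)):
    ratio = tray["crop_kg"] / tray["seed_kg"]
    print(f"{name}: growth ratio = {ratio:.2f}")
# The paper reports 2.1 and 4.83, respectively.
```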
Fig. 6. Average weight increase per day (one tray).
Fig. 7. Average plant growth per day.
4.3 Water Consumption
The control hydroponic greenhouse did not have any irrigation return flow (IRF). Of the 900 L available, only 405 L were consumed by the plants or evaporated during the 15 days (Table 4). The experimental hydroponic greenhouse included a reservoir where water was recycled using a pump: of 630 L, 247.5 L were effectively consumed by the plants and only 22.5 L were wasted (Table 4). The IRF (360 L) can be used for another 15-day farming period after a doping process to restore an optimal solution.
Table 4. Total water consumption.
Water consumption        Control [L]   Experimental [L]
Irrigation return flow   0.00          360.00
Plant consumption        405.00        247.50
Waste                    495.00        22.50
Figure 8 demonstrates the ineffective water consumption of the control group as the wastes represent more than the value of plant consumption.
Fig. 8. Water consumption percentage.
4.4 Economic Productivity
The control group required more seeds, water, and handwork. The control total-factor productivity was 0.67 kg/$ (i.e., kg of oat per USD), versus 0.87 kg/$ for the experimental group (see Table 5). The automation process took over 60 days, and the overall cost was around 1,500 USD. The automated hydroponic system ensured constant irrigation of the crop; PLC control replaced the handwork with more effective intervals. Besides, the drip hydroponic system with extra sprinklers to moisturize the plants improved the volume and health of the plants. After implementing the current project, automated hydroponic systems proved
Table 5. Total-factor productivity (TFP) comparison.
             Control   Experimental   Improvement
TFP [kg/$]   0.67      0.87           29.85%
to be advantageous for small and medium farming producers from Latacunga, Ecuador. The greenhouse effectively produced the expected oat weight using less handwork and fewer resources. Regarding the environmental impact, only harmless chemicals were used for cleaning the seeds, and no extra chemicals were included in the solution during the process.
4.5 Animal Feeding Experiment
The final product was used to feed three out of four rabbits; the fourth rabbit was fed with alfalfa using the same portions as the other test subjects. After six weeks, the initial and final weights of the test subjects were tabulated to estimate the side benefit of hydroponic oat farming (see Fig. 9). The test subjects fed with oat increased their weight faster than the fourth test subject (fed with alfalfa). Animal husbandry is a primary economic activity in the area where the project was developed; rabbits are commonly fed with alfalfa, carrots, and commercial feed, so fodder production is crucial for the economic growth of this rural area of Ecuador.
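As a check on the productivity figures in Table 5, total-factor productivity here is simply the oat mass produced per dollar of input; the short sketch below recomputes the reported improvement from the two published ratios (no cost breakdown is given in the paper, so only the ratios are used).

```python
# Total-factor productivity reported in Table 5 (kg of oat per USD of input).
tfp_control = 0.67
tfp_experimental = 0.87

improvement = (tfp_experimental - tfp_control) / tfp_control * 100
print(f"TFP improvement: {improvement:.2f}%")   # ~29.85%, as reported
```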
Fig. 9. Weight increase of test subjects (6 weeks).
5 Conclusions and Further Research The automated greenhouse production increased by 29.85% as compared to the initial manual hydroponic greenhouse. Primary sources (seeds, water, and handwork) were
supplied more effectively to the plant. The final product delivers a larger volume and average height (18.94 cm). The average weight rate increase is over 4.8, as expected. Based on the control Pareto charts, the main weakness of the control greenhouse was identified. Excess and lack of moisture due to uneven irrigation and inadequate seed cleaning affected the crop. The control farming process required 900L of water that was significantly reduced to 630L. A closed-loop irrigation system reduced water waste and could irrigate a second hydroponic greenhouse. Besides, the automation system replaced a person that irrigated the control greenhouse. Hydroponic greenhouses are beneficial not only for oat farming, but they can be used for high-quality germination. Regarding the test subjects, further studies should be done to determine meat protein quality. Despite the remarkable weight difference, oat retains more water than alfalfa.
References 1. Kaplan, J.O., Krumhardt, K.M., Zimmermann, N.: The prehistoric and preindustrial deforestation of Europe. Quat. Sci. Rev. 28(27–28), 3016–3034 (2009). https://doi.org/10.1016/j. quascirev.2009.09.028 2. Walker, R.: Deforestation and economic development. Can. J. Reg. Sci. 16(3), 481–497 (1993) 3. Scanes, C.G.: Impact of agricultural animals on the environment. In: Animals and Human Society, pp. 427–449. Elsevier (2018) 4. Herrero, M., et al.: Livestock and the environment: what have we learned in the past decade? Ann. Rev. Environ. Resour. 40(1), 177–202 (2015). https://doi.org/10.1146/annurev-environ031113-093503 5. Poore, J., Nemecek, T.: Reducing food’s environmental impacts through producers and consumers. Science 360(6392), 987–992 (2018). https://doi.org/10.1126/science.aaq0216 6. I. N. de E. y C. INEC: Encuesta de Superficie y Producción Agropecuaria Continua (ESPAC) 2018, Quito (2018). https://www.ecuadorencifras.gob.ec/documentos/web-inec/Estadisticas_ agropecuarias/espac/espac-2018/Presentaciondeprincipalesresultados.pdf 7. Gericke, W.F.: Hydroponics–crop production in liquid culture media. Science 85(2198), 177– 178 (1937). https://doi.org/10.1126/science.85.2198.177 8. Jones Jr., J.B.: Hydroponics: A Practical Guide for the Soilless Grower. CRC Press, Boca Raton (2016) 9. Tripp, T.: Hydroponics Advantages and Disadvantages: Pros and Cons of Having a Hydroponic Garden. Speedy Publishing LLC (2014) 10. Stone, M.: How to Hydroponics: A Beginner’s and Intermediate’s in Depth Guide to Hydroponics. Martha Stone (2014) 11. Sharma, N., Acharya, S., Kumar, K., Singh, N., Chaurasia, O.P.: Hydroponics as an advanced technique for vegetable production: an overview. J. Soil Water Conserv. 17(4), 364 (2018). https://doi.org/10.5958/2455-7145.2018.00056.5 12. Lee, S., Lee, J.: Beneficial bacteria and fungi in hydroponic systems: types and characteristics of hydroponic food production methods. Sci. Hortic. 195, 206–215 (2015). https://doi.org/ 10.1016/j.scienta.2015.09.011 13. Wortman, S.E.: Crop physiological response to nutrient solution electrical conductivity and pH in an ebb-and-flow hydroponic system. Sci. Hortic. 194, 34–42 (2015). https://doi.org/10. 1016/j.scienta.2015.07.045 14. Rodríguez-Delfín, A.: Advances of hydroponics in Latin America. Acta Hortic. 947, 23–32 (2012). https://doi.org/10.17660/ActaHortic.2012.947.1
15. USDA: 2017 census of agriculture (2019). https://www.nass.usda.gov/Publications/AgC ensus/2017/Full_Report/Volume_1,_Chapter_1_US/usv1.pdf 16. Mathias, M.: Emerging hydroponics industry. Pract. Hydroponics Greenhouses (144), 18 (2014) 17. Furlani, P.R.: Hydroponic vegetable production in Brazil. Acta Hortic. 481, 777–778 (1999). https://doi.org/10.17660/ActaHortic.1999.481.98 18. Izquierdo, J.: Simplified hydroponics: a tool for food security in Latin America and the Caribbean. In: International Conference and Exhibition on Soilless Culture, ICESC 2005, vol. 742, pp. 67–74 (2005) 19. MAGAP: Se promueve cultivo hidropónico de frutilla. Ministerio de Agricultura y Ganadería. https://www.agricultura.gob.ec/se-promueve-cultivo-hidroponico-de-frutilla/. Accessed 24 Mar 2020 20. GreenLab: Quienes somos – GreenLab. http://www.greenlab.com.ec/. Accessed 23 Mar 2020 21. Fitz-Gibbon, C.T., et al.: How to Design a Program Evaluation. SAGE Publications, Thousand Oaks (1987) 22. Tunio, M.H., et al.: Potato production in aeroponics: an emerging food growing system in sustainable agriculture for food security. Chil. J. Agric. Res. 80(1), 118–132 (2020). https:// doi.org/10.4067/S0718-58392020000100118 23. Lamers, L.P.M., et al.: Ecological restoration of rich fens in Europe and North America: from trial and error to an evidence-based approach. Biol. Rev. 90(1), 182–203 (2015). https://doi. org/10.1111/brv.12102 24. Lizarraga, A., Boesveld, H., Huibers, F., Robles, C.: Evaluating irrigation scheduling of hydroponic tomato in Navarra, Spain. Irrig. Drain. 52(2), 177–188 (2003). https://doi.org/10. 1002/ird.86
Proportional Fuzzy Control of the Height of a Lubricant Mixture

Fátima de los Ángeles Quishpe Estrada1(B), Angel Alberto Silva Conde2, and Miguel Ángel Pérez Bayas3
1 Freelance Scientist, Riobamba 060155, Ecuador
[email protected]
2 DINELEC, Engineering Department, Riobamba 060102, Ecuador
[email protected]
3 Escuela Superior Politécnica de Chimborazo, Riobamba 060155, Ecuador
[email protected]
Abstract. In the manufacture of lubricating oils, different components are mixed to obtain the required viscosities and lubrication qualities. This paper presents a proportional fuzzy control model for the height of this mixture inside a stirred tank with heating, acting through the flow of its two components, the base and the additive, according to the variation in the opening of the valves and their corresponding inputs and outputs in the system. Three alternatives are studied: the variation in the number of rules used and in the so-called negative errors, the variation in the flow range, and the change in the membership function. The paper also shows the importance of defining the limits of the linguistic labels and of determining an appropriate stabilization time; the system ultimately exhibits a slow dynamic response, and the greater the system's input flow, the shorter the stabilization time. Keywords: Lubricant · Fuzzy · Labels · Response
1 Introduction
The sale of lubricating oils in the Ecuadorian market is a growing business, and 60% of the product volume marketed in the country is of national origin [1]. This production is categorized by application: diesel and gasoline engines, hydraulic systems, industrial equipment, two-stroke engines, marine engines, automatic transmissions, manual transmissions and differentials, and others, as detailed in Fig. 1. When the majority participation of national producers is analyzed statistically, a larger local production can be argued for. A company dedicated to the production of this product [2] has an average mixing production of 420,000 gal/month; it recently launched its own brand of national lubricant and wants to increase its sales volume without affecting the production of the other brands, so it must increase its current mixing capacity to 690,000 gal/month (2611.934 m3/month). The mixing is currently done in six tanks, and the possibility of implementing three tanks of 1000 m3 is analyzed. The process for making these oils is:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 151–163, 2021. https://doi.org/10.1007/978-3-030-63665-4_12
Fig. 1. Production trends in 2019 (Jan.–Jun.) (Source: Authors, Graphic adapted form ARCH Half-Yearly Newsletter, Ecuador, 2019 in http://apel.ec/biblioteca/boletin-del-sector-lubricantesjunio-2019/)
1. Entry of the bases and additives into the storage tank.
2. Mixing and stirring of the substances to obtain the required viscosities and qualities and to increase heat transfer.
3. Heating to decrease the viscosity of the product and improve its fluidity, facilitating its mixing.
4. Output of the finished product.
2 Methodology The system consists of a stirred tank with heating, Fig. 2, with a capacity of 1000 m3 and a height of 10 m, which is fed by two pipes: one for the base component and the other for the additive component. The constant parameters used in the system are shown in Table 1; Table 2 shows the system variables, and Table 3 shows the dependent parameters. It should note that the thermal and energy balance analyzes for this system were taken from the specialized literature on the subject, as shown in Sect. 2.1, and are not part of this research but are the starting point of understanding the system’s dynamics analyzed. 2.1 Material Balance If the material balance [7] is made inside the tank, we have: Accumulated = Entry-Exit
Proportional Fuzzy Control of the Height of a Lubricant Mixture
153
Fig. 2. Stirred tank model with heating (Source: Authors) Table 1. Constant system parameters. Symbol
Description
Value
Units
g
Gravity acceleration
9.8
m/s2
tm
Discretization step
1.0
s
A
Deposit area
100
m2
Ks
Opening constant of the tank outlet valve
1.0
–
Kv
Burner power supply opening constant
1.0
–
ρm
Lubricant mix density
880 [6]
Kg/m3
Cp
Specific heat of the fluid
1880 [6]
J/Kg.°C
U
Heat transfer coefficient
340 [6]
W/m2 . °C
X1
Feeder composition 1
90*
%
X2
Feeder composition 2
10*
%
T1
Feeder temperature 1
20*
°C
T2
Feeder temperature 2
10*
°C
P1
Burner feed gas pressure
253312.5**
Kg/m.s2
* Experimentation values ** Experimentation value for a 25 mm pitch effective section Table 2. Systems variables Symbol Description H
Units
Height of the liquid in the mixing tank m
This balance can apply to both: the total flow and the component, in particular, considering the contributions of the two inlets and outlet flows of the tank: ρA
dH = F1 + F2 − F3 dt
(1)
154
F. Á. Quishpe Estrada et al. Table 3. Dependent parameters Symbol Description
Units
F1, F2
Feeding mass flows rate Kg/s
F3
Output mass flow rate
Kg/s
X3
Outlet composition
%
dM3 = F1 X1 + F2 X2 − F3 X3 dt
(2)
where M3 = MX3 = ρAHX3 The tank outlet flow would be: F3 = Ks 2gH
(3)
where K s is the valve opening constant that varies between 0 and 1. 2.2 Energy Balance In the same way as in the material balance, in the energy balance [8, 9], both the input and output contributions and the burner are considered. d MCp T3 = F1 Cp T1 + F2 Cp T2 − F3 Cp T3 + Q dt
(4)
d (MT3 ) = F1 Cp T1 + F2 Cp T2 − F3 Cp T3 + Q dt
(5)
Cp Where:
• M (density, Area, Volume) the total mass of the volume of the tank, • Q the heat contributed by the burner. • T3 outlet temperature 2.3 Difference Equation Model The model proposed in differential equations can discretize when the variables are sampled with a period Tm, resulting in the following difference equations: Hk − Hk−1 1 = (F1 + F2 − F3 ) Tm ρm A
(6)
M3k − M3k−1 = F1 X1 + F2 X2 − F3 X3 Tm
(7)
Mk T3k − Mk−1 T3k−1 1 = F1 T1 + F2 T2 − F3 T3 + Q Tm Cp
(8)
where the subscript k denotes the current sample instant and the subscript k−1 refers to the previous instant. Thus:

Q = U·A·P2   (9)

P2 = Kv·P1·Sv   (10)

X3 = M3/(ρm·A·H)   (11)
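To make the discrete model concrete, the following is a minimal Python sketch (not the authors' Matlab implementation) of the height update given by Eqs. (3) and (6); the parameter values are taken from Table 1, and the clamping of the height to the 0–10 m tank range is an added assumption.

```python
import math

RHO_M = 880.0   # lubricant mix density [Kg/m3] (Table 1)
AREA = 100.0    # deposit area [m2]
KS = 1.0        # outlet valve opening constant [-]
G = 9.8         # gravity acceleration [m/s2]
TM = 1.0        # discretization step [s]

def plant_step(F1, F2, H_prev):
    """One discrete step of the tank model: Eq. (3) for the outlet flow and
    Eq. (6) for the height update. F1 and F2 are the inlet mass flows [Kg/s]
    and H_prev is the previous liquid height [m]."""
    F3 = KS * math.sqrt(2.0 * G * max(H_prev, 0.0))        # Eq. (3)
    H = H_prev + (TM / (RHO_M * AREA)) * (F1 + F2 - F3)    # Eq. (6)
    return F3, min(max(H, 0.0), 10.0)                      # keep H inside the 10 m tank
```

This plays the same role as the IA0_ModeloPlanta plant function mentioned next, returning the outlet flow F3 and the new height H from the inlet flows and the previous state.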
For the design of a fuzzy controller with the Matlab software and its Fuzzy Logic Toolbox, acting on the control of the height of the liquid through the inlet flow, a Matlab plant model with the following variables and parameters will be used: function [F3, H] = IA0_ModeloPlanta(F1, F2, F3o, Ho)
where Eqs. (3) and (6) are the most suitable for this model.

2.4 Procedure
The height of the liquid in the tank is controlled by manipulating the total flow of the inlet valves at a ratio of 1:3, taking into account the initial consideration that it is a mixing tank:

F1/4 + 3·F2/4 = F   (12)

The flow variation ranges from zero, corresponding to the two equal valves fully closed, to a maximum flow corresponding to the same valves with one fully open and the other open to 0.25 of its range. This total flow F is the quantity handled by the fuzzy controller. Here, the outlet valve F3 is kept free for a constant flow. The implemented controller [10, 11] is a proportional control whose input is the error in the height measurement (Fig. 3). The maximum values of the error will be defined by:

error = Hreferencia(i) − Hsalida(i − 1)   (13)
where:
• Hmax = reference when the tank is empty.
• Zero = reference when the tank is full.
Inside the Fuzzy interface in Matlab, we define the inputs, outputs, and the controller itself, as seen in Fig. 4.
System Inputs. They would be defined by the error and by its derivative and integral; here, since it is a proportional control, the only input is the error. Error states can occur within a positive or negative range (Fig. 5), within the full filling or emptying height of the tank (±10 m), for the following cases:
Fig. 3. Height Control System by flow (Source: Authors).
Fig. 4. Initial Fuzzy overview of the case study (Source: Authors).
Fig. 5. Membership functions of errors (Source: Authors).
• If the height of the liquid is below the setpoint, the tank must be filled, and the error can go from 0 up to Hmax (a positive range).
• If the liquid’s height is above the setpoint, it is required to empty the tank, and the error can only be within a negative range. It is logical to think that for each negative error value, it is prudent that the inlet flow F is always kept at zero to avoid that the tank continues to overflow, so there will be a single negative error condition, which will be demonstrated later in the results section. As seen, five linguistic labels are defined. Systems Outputs They are defined by the variable “Flow F,” which is an output of the fuzzy system but a control input of the plant model. The flux variation goes from zero to a maximum “Flow F,” the variation range is chosen coherently, initially assuming a maximum flux of 40 m3 /s (coherent for a volume of 1000 m3 ), for which four linguistic labels are assigned (Fig. 6):
Fig. 6. Membership functions of the output (Source: Authors).
Then the control rules, defined by the expert (Fig. 7), are:
Fig. 7. Fuzzy control rules are shown in Matlab (Source: Authors).
Table 4 shows the equivalences for the control rules. An algorithm written in Matlab [11] is used to check the control design: the ".fis" files are loaded and evaluated with the "evalfis" command, and the obtained data are displayed through a Guide interface, Fig. 8, which graphs the corresponding height and flow curves until the desired control is achieved, that is, until the height reaches the setpoint corresponding to the quantity H of mixture that the tank must hold.
Table 4. Equivalences of control rules.
Error H | Flux variation opening
Bajo (−) | Fully closed
Cero | Fully closed
Bajo (+) | Little open
Medio (+) | Half-open
Alto (+) | Fully open
Fig. 8. Application interface guide (Source: Authors).
In this interface, it is necessary to initially indicate the setpoint value corresponding to the quantity H of mixture that the tank must reach. A simplified sketch of this closed loop is given below.
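As a rough illustration of the procedure, the sketch below (written in Python rather than with the authors' Matlab Fuzzy Logic Toolbox, which loads a .fis file and evaluates it with evalfis) implements a proportional fuzzy controller following the rule equivalences of Table 4; the membership breakpoints, the representative flow values, and the weighted-average defuzzification are illustrative assumptions, not the authors' exact design.

```python
def trimf(x, abc):
    """Triangular membership function, as in Matlab's trimf."""
    a, b, c = abc
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Assumed breakpoints [m] for the positive error labels over the 0-10 m tank height.
ERROR_MF = {"Bajo(+)": (0.0, 2.0, 4.0),
            "Medio(+)": (2.0, 5.0, 8.0),
            "Alto(+)": (5.0, 10.0, 15.0)}

# Representative flow [m3/s] for each valve-opening label, assuming a 40 m3/s maximum.
FLOW = {"Fully closed": 0.0, "Little open": 10.0, "Half-open": 20.0, "Fully open": 40.0}

# Rule equivalences of Table 4 for the positive error labels.
RULES = {"Bajo(+)": "Little open", "Medio(+)": "Half-open", "Alto(+)": "Fully open"}

def fuzzy_flow(error):
    """Proportional fuzzy controller: error = Hreferencia - Hsalida -> total flow F."""
    if error <= 0.0:          # Bajo(-) and Cero: keep the inlet fully closed
        return 0.0
    w = {label: trimf(error, mf) for label, mf in ERROR_MF.items()}
    num = sum(w[label] * FLOW[RULES[label]] for label in w)
    den = sum(w.values())
    return num / den if den > 0.0 else 0.0

print(fuzzy_flow(5.0))   # an error of 5 m maps to the half-open flow (20 m3/s)
```

Coupling this controller with the plant sketch shown earlier, and splitting the total flow between the two inlets according to Eq. (12), reproduces the kind of closed-loop height and flow curves plotted by the Guide interface.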
3 Results

Alternative 1: Variation in the Number of Rules Used and Negative Errors. If we initially consider the following rules:
1. If (errorH is Alto(−)) then (F2(0.75) + F1(0.25) is TotlC) (1)
2. If (errorH is Bajo(−)) then (F2(0.75) + F1(0.25) is TotlC) (1)
3. If (errorH is Med(−)) then (F2(0.75) + F1(0.25) is TotlC) (1)
4. If (errorH is Cero) then (F2(0.75) + F1(0.25) is TotlC) (1)
5. If (errorH is Bajo(+)) then (F2(0.75) + F1(0.25) is MedioAb) (1)
6. If (errorH is Med(+)) then (F2(0.75) + F1(0.25) is TotlAb) (1)
7. If (errorH is Alto(+)) then (F2(0.75) + F1(0.25) is TotlAb) (1)
• Figure 9 shows that for a 5 m setpoint, the system stabilizes correctly at 35 s.
• If the setpoint is changed to 2 m, rules 5 and 6 of the table show that the system will continue to fill up, so the tank will tend to overflow.
Fig. 9. Result of the variation of the rules used and the negative errors (Source: Authors).
• Therefore, separate negative-error rules are not justified in the design of this controller.
Alternative 2: Variation of the Flow Range. If the maximum flow range is from 0 to 20 (Figs. 10 and 11), the stabilization time is too long, so the tank takes much longer to fill, slowing down the plant.
Fig. 10. Fuzzy interface for flow ranges 0 to 20 (Source: Authors).
If a range of 0 to 60 is used (Figs. 12 and 13), the process stabilizes in less time, so the plant will work faster.
Alternative 3: Change of the Membership Function. If the membership function is changed from the triangular type (trimf) to a Gaussian type (gaussmf), the results of Figs. 14 and 15 are obtained (a small sketch of both membership types is given below).
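For reference, the two membership shapes compared in this alternative can be written as follows; this is a small Python sketch, and the breakpoint and width values are illustrative, not the ones used in the authors' .fis files.

```python
import numpy as np

def trimf(x, a, b, c):
    """Triangular membership function (Matlab's trimf)."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def gaussmf(x, sigma, c):
    """Gaussian membership function (Matlab's gaussmf)."""
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

error = np.linspace(0.0, 10.0, 101)          # positive height errors [m]
mu_tri = trimf(error, 2.0, 5.0, 8.0)         # a triangular label centred at 5 m
mu_gauss = gaussmf(error, sigma=1.5, c=5.0)  # the equivalent Gaussian label
```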
Fig. 11. Result of the flow variation range 0 to 20 (Source: Authors).
Fig. 12. Fuzzy interface for flow range 0 to 60 (Source: Authors).
This last alternative does not provide an adequate solution: although the system stabilizes in approximately 45 s thanks to the implemented proportional controller, it exceeds the setpoint value when there are positive error values, so it is discarded as a solution.
Fig. 13. Result of flow variation range 0 to 60 (Source: Authors).
Fig. 14. The fuzzy interface of the analysis with positive error (Source: Authors).
Fig. 15. Result of the height and flow with membership function change (Source: Authors).
4 Conclusions • A tank-filling system exhibits a slow dynamic response; it is verified that the greater the inlet flow, the shorter the stabilization time. • The designed controller does not present large oscillations around the setpoint and works correctly; the result is achieved by increasing or decreasing the flow from a high value in smaller and smaller amounts, according to the designed defuzzification stage. • It is necessary to know the operation of the system to be controlled before designing a fuzzy controller. This controller requires a lot of experience and knowledge of the plant. • The input values should be defined with the most representative states and the appropriate linguistic labels in order to achieve better control. • The output values of the flow must represent, within what is consistent, the achievable states of the actuator. • If more rigorous control is required, linguistic labels can be added at the input and output of the system to increase the number of control rules. • The controller can be improved by implementing a multivariable control since, according to the complete plant model, the analyzed variable depends not only on F1 and F2 but also on the F3 response.
Acknowledgments. The authors thank the Escuela Superior Politécnica del Litoral of Ecuador for its documentary support of this research, and the Hydrocarbon Regulation and Control Agency of Ecuador for the statistical assistance used to establish the initial objectives of this work.
References 1. ARCH-Ecuador: Lubricants industry bulletin, Boletín del sector Lubricantes. Homepage. http://apel.ec/biblioteca/boletin-del-sector-lubricantes-junio-2019. Accessed 17 Jun 2020
2. Ricaurte, L.: Design and simulation of a 10,000 gallon mixing tank for the production of lube oils, Diseño y simulación de un tanque mezclador de 10000 gal para la elaboración de aceites lubricantes. Thesis, Faculty of Mechanical Engineering and Production Sciences, Escuela Superior Politécnica del Litoral, Guayaquil, Ecuador (2016) 3. Prusty, S., Pati, U.: Implementation of fuzzy-PID controller to liquid level system using LabView. In: Proceedings CIEC, Calcutta, India, pp. 365–440 (2014) 4. Krivic, S., Hudjur, M., Mrzic, A., Konjicija, S.: Design and implementation of fuzzy controller on embedded computer for water level control. In: Proceedings MIPRO, Opatija, Croatia, pp 1747–1751 (2012) 5. Castro-Montoya, A.J., Vera, F., Quintana, J.: Fuzzy fluid flow control in a laboratory station, Control difuso de flujo de fluidos en una estación de laboratorio. Inf. Tecnol., 15(3) (2004). https://doi.org/10.4067/s0718-07642004000300007 6. ENGINEERING Homepage. http://www.engineeringpage.com/technology/thermal. Accessed 17 Jun 2020 7. Matía, F.: Practice script Artificial Intelligence, Guión de Prácticas Inteligencia Artificial, Programa de Posgrado en Automática y Robótica, Universidad Politécnica de Madrid, no published (2013) 8. Cengel, Y.: Mass and energy analysis in control volumes, Análisis de masa y energía en volúmenes de control, Termodinámica, 8th edn., pp. 230–255. McGraww Hill, México D.F. (2014) 9. Wark, K.: Energy analysis of control volumes, Análisis energético de volúmenes de control, Termodinámica, 6th edn., pp. 179–213. McGraww Hill, Madrid (2004) 10. Jimenez, A., Matía, F.: Fuzzy control: current status and applications. Control Fuzzy: Estado actual y aplicaciones, AADECA, Buenos Aires, Argentina, 2(6), 17–28 (1995) 11. Chmielowski, W.: Fuzzy logic toolbox, matlab, simulink, an application of descriptive fuzzy control in environmental engineering. In: Studies in Systems, Decision, and Control, Varsovia, Polonia, pp. 37–80. Springer (2016). https://link.springer.com/content/pdf/10.1007%2F9783-319-19261-1.pdf. Accessed 17 Jun 2020
Intelligent Systems
Analysis of Anthropometric Measurements Using Receiver Operating Characteristic Curve for Impaired Waist to Height Ratio Detection
Erika Severeyn1(B), Alexandra La Cruz2, Sara Wong3, and Gilberto Perpiñan4
1 Department of Thermodynamics and Transfer Phenomena, Universidad Simón Bolívar, Caracas, Venezuela [email protected]
2 Faculty of Engineering, Universidad de Ibagué, Ibagué, Tolima, Colombia [email protected]
3 Department of Electronics and Circuits, Universidad Simón Bolívar, Caracas, Venezuela [email protected]
4 Faculty of Electronic and Biomedical Engineering, Universidad Antonio Nariño, Cartagena, Colombia [email protected]
Abstract. Metabolic dysfunctions such as obesity, insulin resistance, metabolic syndrome, and glucose tolerance are strongly related to each other. The presence of any of them in a person translates into a high risk of diseases such as diabetes, heart failure, and cardiovascular disease. Anthropometric measurements such as body circumferences and body folds, and anthropometric indices such as the waist-height ratio (WHtR) and body mass index (BMI), have been widely used in the study of metabolic diseases. This study aims to look for relationships between the WHtR and anthropometric measurements such as subcutaneous folds and body circumferences. For this purpose, a database of 1863 subjects was used; 16 anthropometric variables were measured for each participant in the database and the BMI was calculated. Receiver operating characteristic (ROC) curves were used to assess the ability of the BMI and of each anthropometric measurement to diagnose WHtR impairment. The findings reported in this research strongly suggest that the diagnosis of WHtR impairment can be made from circumferences, skinfolds, and BMI. In this study, the anthropometric measures that best detected subjects with impaired WHtR were BMI, subscapular skinfold, suprailiac skinfold, and arm circumference, with a high probability of detecting normal WHtR subjects. Abdominal circumference is one of the areas that have the most direct relationship with cardio-metabolic risk; however, the findings of this study open the possibility of studying accumulated fat tissue in the arms and back as areas that could also indicate a risk of metabolic dysfunction. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 167–178, 2021. https://doi.org/10.1007/978-3-030-63665-4_13
Keywords: Receiver operating characteristic curve · Waist to height ratio · Body mass index

1 Introduction
Metabolic dysfunctions such as obesity, insulin resistance, metabolic syndrome, and glucose tolerance are strongly related to each other, and the presence of any of them in a person translates into a high risk of diseases such as diabetes [39], atherosclerosis [27], heart failure [38] and cardiovascular diseases [26,34]. The increase in the incidence and prevalence of obesity and overweight worldwide has become a serious and growing public health problem, reaching alarming proportions in some countries. The World Health Organization (WHO) estimated in 2014 that worldwide 52% of adults and 30% of children are overweight [33]. This disease is characterized by excess body fat, associated with the development of multiple metabolic disorders, which in turn cause other health problems [22].

There is an essential relationship between metabolic dysfunctions and the mechanisms of adipose tissue deposition [1]. The functions of adipose tissue are not limited to the storage of energy in the form of fats; it also plays a very important role in the regulation of energy metabolism [23]. Adipose tissue is an endocrine and paracrine organ that secretes a large number and diversity of cytokines and bioactive mediators that influence homeostasis, body weight, inflammation, coagulation, etc. The alteration of the homeostasis of these cells can occur as a consequence of obesity, which generates a metabolic debacle that facilitates the appearance of dysglycemia in particular, and of the metabolic syndrome in general.

Body fat distribution is clinically meaningful [30]. An increase in abdominal fat (central or visceral) requires an active search for the biochemical and clinical disorders that can appear as a result of it, among which are: resistance to the action of insulin [29,46,48], metabolic syndrome [11,37,47], type 2 diabetes [40], cancer [43], high blood pressure, and ischemic heart disease, among other health problems [13].

Anthropometric measurements such as body circumferences and body folds, and anthropometric indices such as the waist-to-height ratio (WHtR) and body mass index (BMI), have been widely used in the study of metabolic diseases [10,18,36]. Waist circumference and BMI are commonly described in the literature as predictors of adiposity, indicators of overweight and obesity, and risk factors for chronic and degenerative diseases and metabolic syndrome. The body mass index is used to diagnose obesity; according to the WHO, overweight is diagnosed when BMI ≥ 25 Kg/m2 [33]. Waist circumference, the waist-hip ratio, or the WHtR are recommended to predict cardiovascular risk and coronary heart disease [8,25]. Waist circumference has been widely used as a predictor of metabolic syndrome and cardiovascular disease, since having high fat deposits in the abdominal area is closely related to the risk of developing diseases such as diabetes and insulin resistance [3]. However, it is not easy to establish measurements of the abdominal circumference that can be applied to all populations, and it is common to find that the cut-off values do not apply to all ethnic groups [5,15].
In this sense, the WHtR is a proportional index that indirectly measures the fat deposited on the waist by comparing it with the height, which makes its limits applicable to all ethnic groups [17,50]. In [4,9], a WHtR above 0.5 is established as the abnormal limit for patients between 18 and 60 years of age. In the case of people over 60 years old, a WHtR above 0.6 is considered to be the limit of dysfunctionality, as stated in [2,44]. As indicated in the previous paragraphs, studies have found directly proportional relationships between subcutaneous folds, BMI, and WHtR [21]. BMI, WHtR, and the subscapular, biceps, triceps, and suprailiac skinfolds have been associated with high blood pressure, obesity, and cardiovascular disease [21,45]. However, no study has been conducted in which the alteration of WHtR values is diagnosed from BMI and anthropometric measurements (such as circumferences and folds). This could be useful to understand which other parts of the body, besides the abdominal circumference, accumulate fat tissue when there is a metabolic dysfunction that may represent a risk of developing more severe diseases. This study aims to look for relationships between anthropometric measurements, such as subcutaneous folds and body circumferences, and the WHtR. A database of 1863 subjects was used for this purpose, in which each anthropometric variable was measured for each participant. Receiver operating characteristic (ROC) curves were used to assess the ability of the BMI and of each anthropometric measurement to diagnose WHtR impairment. In the following section, the methodology used in this research will be explained in detail. In Sect. 3, the results obtained will be displayed; in Sect. 4, the results will be discussed; and, finally, in Sect. 5, the conclusions and future work will be indicated.
2 Methodology

2.1 Database
At the Nutritional Evaluation Laboratory of the Simón Bolívar University of Venezuela, the database was collected between 2004 and 2012 [16]; 1863 (male = 678) adult men and women were recruited from the Capital District of Venezuela (soon to be available to the general public). Anthropometric measurements were made on each subject; they were: height, weight, arm circumferences, bent arm circumferences, waist circumference, hip circumferences, subscapular skinfolds, suprailiac skinfolds, abdominal skinfolds, and the diameters of the epicondyle of the humerus and femur. Table 1 shows the characteristics of the database used for subjects with normal and impaired WHtR. The diagnosis of impaired WHtR was made using [4,9], which state that impaired WHtR is diagnosed when the WHtR is greater than or equal to 0.5 [9] for persons aged 18 to 60 years, and greater than or equal to 0.6 for persons aged 60 years or older [4]. The WHtR was calculated according to Eq. (1):

WHtR = W / H   (1)
where W is the waist circumference of the subject measured in centimeters, and H is the height of the subject measured in centimeters. All research subjects accepted the study by signing an informed consent form. All procedures carried out in the study were in accordance with the ethical standards of the Bioethics Committee of the Simón Bolívar University, and with the 1964 Declaration of Helsinki and its subsequent amendments or comparable ethical standards.

2.2 Calculation of Receiver Operating Characteristic Curves
The accuracy of a diagnostic test has been evaluated based on two characteristics: sensitivity (SEN) and specificity (SPE). The SEN of a diagnostic test is the probability of a positive result when the individual has the disease; it measures the ability to detect the disease when it is present. Sensitivity is calculated using Eq. (2):

SEN = TP / (TP + FN)   (2)

where TP is the number of true positive classifications and FN is the number of false negative classifications [35]. The SPE of a test indicates the probability of a negative result when the individual does not have the disease; it measures the ability of the test to discard the disease when it is not present. The SPE is calculated from Eq. (3):

SPE = TN / (TN + FP)   (3)

where TN is the number of true negative classifications and FP is the number of false positive classifications. Another way to evaluate a diagnostic test is through the positive predictive value (PPV) and the negative predictive value (NPV). The PPV is the ratio of true positive results to the total number of positive test results; the NPV is the ratio of true negative results to the total number of negative test results. PPV and NPV are calculated using Eq. (4) and Eq. (5):

PPV = TP / (TP + FP)   (4)

NPV = TN / (TN + FN)   (5)

A more global way of assessing the quality of the test across the spectrum of cut-off points is the use of ROC curves, which are a fundamental and unifying tool in the evaluation and use of diagnostic tests. In this research, the ROC curves of the anthropometric variables were calculated using the WHtR as the diagnostic variable. The area under the ROC curve (AUC_ROC) was calculated using the trapezoidal method. An AUC_ROC ≥ 0.70 was considered an acceptable predictive value [41]. The optimal cut-off point was defined as the shortest distance between the ROC curve and the coordinate point (0, 1) of the ROC curve trace. SEN, SPE, NPV, and PPV were calculated for all anthropometric variables at the optimal cut-off point. A sketch of this computation is given below.
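A minimal Python sketch of this analysis (not the software actually used by the authors) is given below; the helper that labels impaired WHtR follows the 0.5/0.6 age-dependent cut-offs of Sect. 2.1, and higher values of the anthropometric variable are assumed to indicate higher risk.

```python
import numpy as np

def impaired_whtr(waist_cm, height_cm, age_years):
    """Impaired-WHtR label: WHtR >= 0.5 for ages 18-60, >= 0.6 for 60 or older (Eq. 1)."""
    return (waist_cm / height_cm) >= (0.6 if age_years >= 60 else 0.5)

def roc_analysis(values, labels):
    """ROC analysis of one anthropometric variable against the impaired-WHtR labels.
    Returns the trapezoidal AUC, the optimal cut-off (point closest to (0, 1)),
    and SEN, SPE, PPV, NPV at that cut-off."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    thresholds = np.r_[np.unique(values), np.inf]
    sen, spe = [], []
    for t in thresholds:
        pred = values >= t
        tp, fn = np.sum(pred & labels), np.sum(~pred & labels)
        tn, fp = np.sum(~pred & ~labels), np.sum(pred & ~labels)
        sen.append(tp / (tp + fn))                  # Eq. (2)
        spe.append(tn / (tn + fp))                  # Eq. (3)
    sen, spe = np.array(sen), np.array(spe)
    fpr = 1.0 - spe
    order = np.argsort(fpr)
    auc = np.trapz(sen[order], fpr[order])          # trapezoidal AUC
    best = np.argmin(fpr ** 2 + (1.0 - sen) ** 2)   # shortest distance to (0, 1)
    t = thresholds[best]
    pred = values >= t
    tp, fn = np.sum(pred & labels), np.sum(~pred & labels)
    tn, fp = np.sum(~pred & ~labels), np.sum(pred & ~labels)
    ppv, npv = tp / (tp + fp), tn / (tn + fn)       # Eqs. (4) and (5)
    return auc, t, sen[best], spe[best], ppv, npv
```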
2.3 Statistical Analysis
The samples were unpaired and did not follow a normal distribution; therefore, the non-parametric Mann-Whitney U-test was used, considering a p-value lower than 5% as statistically significant [28]. The data in Table 1 are presented as mean and standard deviation values.

Table 1. Anthropometric variables characteristics for normal and impaired WHtR.
Anthropometric variable | Normal WHtR (Male = 533, n = 1502) | Impaired WHtR (Male = 145, n = 361)
Age (a) [years] | 42.96 ± 28.29 (b) | 56.01 ± 27.74
Weight (a) [Kg] | 56.37 ± 9.86 | 68.56 ± 12.93
Height [cm] | 160.60 ± 9.20 | 156.98 ± 11.41
Right arm circumference (a) [cm] | 25.79 ± 2.91 | 30.22 ± 3.67
Left arm circumference (a) [cm] | 25.62 ± 2.90 | 30.14 ± 3.69
Right flexed arm circumference (a) [cm] | 26.68 ± 3.06 | 30.89 ± 3.80
Left flexed arm circumference (a) [cm] | 26.41 ± 3.29 | 30.65 ± 3.80
Waist circumference (a) [cm] | 91.07 ± 6.62 | 96.00 ± 9.90
Hip circumference (a) [cm] | 74.53 ± 9.51 | 101.13 ± 9.60
Right subscapular skinfold (a) [mm] | 12.97 ± 4.79 | 21.69 ± 7.01
Left subscapular skinfold (a) [mm] | 13.13 ± 4.78 | 21.92 ± 7.04
Right suprailiac skinfold (a) [mm] | 12.88 ± 5.97 | 22.73 ± 6.93
Left suprailiac skinfold (a) [mm] | 12.95 ± 6.00 | 22.53 ± 6.92
Right abdominal skinfold (a) [mm] | 21.59 ± 6.74 | 30.72 ± 6.91
Left abdominal skinfold (a) [mm] | 22.05 ± 6.84 | 30.94 ± 7.08
Right humerus diameter epicondylar (a) [cm] | 6.15 ± 0.66 | 6.47 ± 0.62
Left humerus diameter epicondylar (a) [cm] | 6.13 ± 0.65 | 6.45 ± 0.60
Right femur diameter epicondylar (a) [cm] | 9.04 ± 0.74 | 9.74 ± 0.83
Left femur diameter epicondylar (a) [cm] | 9.03 ± 0.74 | 9.71 ± 0.82
BMI (a) [Kg/m2] | 21.80 ± 3.03 | 27.77 ± 4.19
Waist to height ratio (a) [dimensionless] | 0.46 ± 0.06 | 0.61 ± 0.07
(a) Statistically significant difference (p-value < 0.05) between normal and impaired WHtR. (b) Average and standard deviation.
Results
Table 1 reports the anthropometrics values of normal and impaired WHtR subjects. The database consists of 1863 subjects, 80.62% belongs to the control group, and 19.38% present an impaired WHtR. The classification were made according to [4,9], where, WHtR ≥ 0.5 were classified as impaired WHtR, in the case of having between 18–60 years old, and WHtR ≥ 0.6, in the case of having more than 60 years old. Figure 1 shows the ROC curves of the sixteen (16) anthropometric parameters for the diagnosis of impaired WHtR. Table 2
172
E. Severeyn et al.
shows the AUC_ROC, the sensitivity, specificity, positive predictive value, and negative predictive value at the optimal detection cut-off of the sixteen (16) anthropometric parameters for the diagnosis of impaired WHtR.

Table 2. SEN, SPE, PPV, NPV, AUC_ROC, and the optimal cut-off point for the evaluation of anthropometric variables for impaired WHtR diagnosis.
Variable | AUC_ROC | SEN | SPE | PPV | NPV | Optimal cut-off point
Right arm circumference [cm] | 0.84 | 0.77 | 0.75 | 0.43 | 0.93 | 27.50
Left arm circumference [cm] | 0.84 | 0.80 | 0.74 | 0.43 | 0.94 | 27.20
Right flexed arm circumference [cm] | 0.82 | 0.71 | 0.76 | 0.41 | 0.92 | 28.60
Left flexed arm circumference [cm] | 0.82 | 0.76 | 0.74 | 0.41 | 0.93 | 28.00
Hip circumference [cm] | 0.82 | 0.74 | 0.79 | 0.46 | 0.93 | 96.00
Right subscapular fold [mm] | 0.85 | 0.79 | 0.74 | 0.42 | 0.94 | 15.50
Left subscapular fold [mm] | 0.85 | 0.75 | 0.78 | 0.45 | 0.93 | 16.70
Right suprailiac fold [mm] | 0.86 | 0.77 | 0.76 | 0.43 | 0.93 | 16.50
Left suprailiac fold [mm] | 0.85 | 0.73 | 0.80 | 0.46 | 0.93 | 17.50
Right abdominal fold [mm] | 0.82 | 0.76 | 0.71 | 0.39 | 0.93 | 25.00
Left abdominal fold [mm] | 0.81 | 0.72 | 0.74 | 0.40 | 0.92 | 26.00
Right humerus diameter epicondylar [cm] | 0.63 | 0.55 | 0.62 | 0.26 | 0.85 | 6.30
Left humerus diameter epicondylar [cm] | 0.64 | 0.58 | 0.60 | 0.26 | 0.86 | 6.20
Right femur diameter epicondylar [cm] | 0.73 | 0.64 | 0.72 | 0.35 | 0.89 | 9.40
Left femur diameter epicondylar [cm] | 0.73 | 0.64 | 0.71 | 0.35 | 0.89 | 9.40
BMI [Kg/m2] | 0.89 | 0.83 | 0.80 | 0.50 | 0.95 | 24.04
4 Discussion
In Table 1, it can be seen that significant differences were found between the impaired and normal WHtR groups in all variables except height. It can be verified that the age of the subjects with normal WHtR is lower (p < 0.05) than that of the subjects with impaired WHtR. This corroborates several studies indicating that age increases the incidence of cardiovascular risk factors [32], type 2 diabetes [49], and obesity [31], due to the metabolic slowdown that follows the natural loss of muscle tissue [19]. On the other hand, Table 1 shows that all the anthropometric measurements of the subjects with impaired WHtR are greater (p < 0.05) than those of the subjects with normal WHtR. This suggests that subjects with impaired WHtR have a greater accumulation of fat in all folds compared to subjects with normal WHtR.
Fig. 1. The area under the ROC curves for the impaired WHtR.
Furthermore, the results shown in Table 1 corroborate that subjects with impaired WHtR have BMI ≥ 25 Kg/m2, indicating that subjects with impaired WHtR are also overweight. This is consistent with work documenting a directly proportional relationship between overweight and waist circumference [12], especially in subjects suffering from insulin resistance [24], which tends to appear in overweight people, concentrating the greatest amount of body fat in the abdominal area [20]. The epicondylar diameters also showed significant differences between subjects with normal and impaired WHtR: subjects with impaired WHtR have larger epicondylar diameters of the femur and humerus than subjects with normal WHtR. This may be because the body areas where these measurements are taken carry fatty accumulations that distort the bone measurements [16]. Table 2 shows that the variables with an AUC_ROC > 0.80 are three of the circumference measurements (arm, flexed arm, and hip), three of the skinfolds (subscapular, suprailiac, and abdominal), and the body mass index. The body mass index was the variable with the highest predictive capacity for the diagnosis of impaired WHtR, with AUC_ROC = 0.89, SEN = 0.83, and SPE = 0.80, a high probability of detecting subjects with normal WHtR (NPV = 0.95), and a cut-off point of 24.04, which indicates that subjects with a BMI above 24.04 could also have an impaired WHtR. These findings are consistent with studies that have reported a directly proportional relationship between BMI and WHtR [6,7,14]. The subscapular folds (AUC_ROC = 0.85), suprailiac folds (AUC_ROC = 0.86), and arm circumference (AUC_ROC = 0.84) obtained the best ability to detect subjects with impaired WHtR after BMI. This suggests that these folds and circumferences have a directly proportional relationship with the WHtR; therefore, they could indicate a particular accumulation of fatty tissue in the areas of the arms, back, and lower abdomen when there is a picture of cardio-metabolic risk, and their study would be interesting [42].
5 Conclusions
The findings reported in this research strongly suggest that the diagnosis of impaired WHtR can be made from circumferences, skinfolds, and BMI. The anthropometric measurements that best detected impaired WHtR subjects were the BMI, the subscapular skinfold, the suprailiac skinfold, and the arm circumference, with a high probability of detecting normal WHtR subjects. The abdominal circumference is one of the areas with the most direct relationship to cardio-metabolic risk; however, the findings of this research open up the possibility of studying the accumulated fat tissue in the arms and back as areas that could also indicate a risk of suffering metabolic dysfunctions. As future work, tests will be performed with machine learning techniques such as k-means, neural networks, or support vector machines for detecting cardio-metabolic risk. Similarly, we also intend to compare other metrics, such as BMI, with different classification techniques.
Acknowledgment. This work was funded by the Research and Development Deanery of the Simón Bolívar University (DID) and the Research Direction of the Ibagué University. Full acknowledgment is given to Rajul Parikh, author of "Understanding and using sensitivity, specificity, and predictive values" (BioInfo Publications™).
References 1. Ansaldo, A.M., Montecucco, F., Sahebkar, A., Dallegri, F., Carbone, F.: Epicardial adipose tissue and cardiovascular diseases. Int. J. Cardiol. 278, 254–260 (2019) 2. Ashwell, M., Gunn, P., Gibson, S.: Waist-to-height ratio is a better screening tool than waist circumference and BMI for adult cardiometabolic risk factors: systematic review and meta-analysis. Obes. Rev. 13(3), 275–286 (2012) 3. Barroso, T.A., Marins, L.B., Alves, R., Gon¸calves, A.C.S., Barroso, S.G., Rocha, G.D.S.: Association of central obesity with the incidence of cardiovascular diseases and risk factors. Int. J. Cardiovasc. Sci. 30(5), 416–424 (2017) 4. Browning, L.M., Hsieh, S.D., Ashwell, M.: A systematic review of waist-to-height ratio as a screening tool for the prediction of cardiovascular disease and diabetes: 0.5 could be a suitable global boundary value. Nutr. Res. Rev. 23(2), 247–269 (2010) 5. Cardinal, T.R., Vigo, A., Duncan, B.B., Matos, S.M.A., da Fonseca, M.D.J.M., Barreto, S.M., Schmidt, M.I.: Optimal cut-off points for waist circumference in the definition of metabolic syndrome in brazilian adults: baseline analyses of the longitudinal study of adult health (Elsa-Brasil). Diabetology Metab. Syndr. 10(1), 1–9 (2018) 6. Chen, X., Liu, Y., Sun, X., Yin, Z., Li, H., Deng, K., Cheng, C., Liu, L., Luo, X., Zhang, R., et al.: Comparison of body mass index, waist circumference, conicity index, and waist-to-height ratio for predicting incidence of hypertension: the rural Chinese cohort study. J. Hum. Hypertens. 32(3), 228–235 (2018) 7. Chung, I.H., Park, S., Park, M.J., Yoo, E.G.: Waist-to-height ratio as an index for cardiometabolic risk in adolescents: results from the 1998–2008 knhanes. Yonsei Med. J. 57(3), 658–663 (2016) 8. Correa, M.M., Thume, E., De Oliveira, E.R.A., Tomasi, E.: Performance of the waist-to-height ratio in identifying obesity and predicting non-communicable diseases in the elderly population: a systematic literature review. Arch. Gerontol. Geriatr. 65, 174–182 (2016) 9. Dong, J., Wang, S.S., Chu, X., Zhao, J., Liang, Y.Z., Yang, Y.B., Yan, Y.X.: Optimal cut-off point of waist to height ratio in Beijing and its association with clusters of metabolic risk factors. Curr. Med. Sci. 39(2), 330–336 (2019) 10. Ehrampoush, E., Arasteh, P., Homayounfar, R., Cheraghpour, M., Alipour, M., Naghizadeh, M.M., Davoodi, S.H., Askari, A., Razaz, J.M., et al.: New anthropometric indices or old ones: which is the better predictor of body fat? Diabetes Metab. Syndr.: Clinic. Res. Rev. 11(4), 257–263 (2017) 11. Farina, P.V.R., Severeyn, E., Wong, S., Turiel, J.P.: Study of cardiac repolarization during oral glucose tolerance test in metabolic syndrome patients. In: 2012 Computing in Cardiology, pp. 429–432. IEEE (2012)
12. Gonzalez-Cantero, J., Martin-Rodriguez, J.L., Gonzalez-Cantero, A., Arrebola, J.P., Gonzalez-Calvin, J.L.: Insulin resistance in lean and overweight non-diabetic caucasian adults: study of its relationship with liver triglyceride content, waist circumference and BMI. PLoS One 13(2), e0192663 (2018) 13. Greenfield, J.R., Campbell, L.V.: Relationship between inflammation, insulin resistance and type 2 diabetes:’cause or effect? Curr. Diab. Rev. 2(2), 195–211 (2006) 14. Guan, X., Sun, G., Zheng, L., Hu, W., Li, W., Sun, Y.: Associations between metabolic risk factors and body mass index, waist circumference, waist-to-height ratio and waist-to-hip ratio in a chinese rural population. J. Diab. Investig. 7(4), 601–606 (2016) 15. He, J., Ma, R., Liu, J., Zhang, M., Ding, Y., Guo, H., Mu, L., Zhang, J., Wei, B., Yan, Y., et al.: The optimal ethnic-specific waist-circumference cut-off points of metabolic syndrome among low-income rural Uyghur adults in far western China and implications in preventive public health. Int. J. Environ. Res. Public Health 14(2), 158 (2017) 16. Herrera, H., Rebato, E., Arechabaleta, G., Lagrange, H., Salces, I., Susanne, C.: Body mass index and energy intake in Venezuelan university students. Nutr. Res. 23(3), 389–400 (2003) 17. Hetherington-Rauth, M., Bea, J.W., Lee, V.R., Blew, R.M., Funk, J., Lohman, T.G., Going, S.B.: Comparison of direct measures of adiposity with indirect measures for assessing cardiometabolic risk factors in preadolescent girls. Nutr. J. 16(1), 15 (2017) 18. Ho, S.Y., Lam, T.H., Janus, E.D.: Waist to stature ratio is more strongly associated with cardiovascular risk factors than other simple anthropometric indices. Ann. Epidemiol. 13(10), 683–691 (2003) 19. Janssen, I., Ross, R.: Linking age-related changes in skeletal muscle mass and composition with metabolism and disease. J. Nutr. Health Aging 9(6), 408 (2005) 20. Ji, B., Qu, H., Wang, H., Wei, H., Deng, H.: Association between the visceral adiposity index and homeostatic model assessment of insulin resistance in participants with normal waist circumference. Angiol. 68(8), 716–721 (2017) 21. Kaur, N., Barna, B., et al.: Inter-relationship of waist-to-hip ratio (WHR), body mass index (BMI) and subcutaneous fat with blood pressure among universitygoing Punjabi sikh and hindu females. Int. J. Med. Med. Sci. 2(1), 005–011 (2010) 22. Kopelman, P.G.: Obesity as a medical problem. Nature 404(6778), 635–643 (2000) 23. Kumari, M., Heeren, J., Scheja, L.: Regulation of immunometabolism in adipose tissue. Semin Immunopathol 40(2), 189–202 (2018) 24. Kurniawan, L.B., Bahrun, U., Hatta, M., Arif, M.: Body mass, total body fat percentage, and visceral fat level predict insulin resistance better than waist circumference and body mass index in healthy young male adults in Indonesia. J. Clin. Med. 7(5), 96 (2018) 25. Lanas, F., Ser´ on, P., Munoz, S., Margozzini, P., Puig, T.: Latin American clinical epidemiology network series-paper 7: central obesity measurements better identified risk factors for coronary heart disease risk in the chilean national health survey (2009–2010). J. Clin. Epidemiol. 86, 111–116 (2017) 26. Lee, C.H., Lam, K.S.: Obesity-induced insulin resistance and macrophage infiltration of the adipose tissue: a vicious cycle. J. Diab. Investig. 10(1), 29 (2019) 27. Luna-Luna, M., Medina-Urrutia, A., Vargas-Alarc´ on, G., Coss-Rovirosa, F., Vargas-Barr´ on, J., Perez-Mendez, O.: Adipose tissue in metabolic syndrome: onset and progression of atherosclerosis. Arch. Med. Res. 
46(5), 392–407 (2015)
28. Marusteri, M., Bacarea, V.: Comparing groups for statistical differences: how to choose the right statistical test? Biochemia Medica: Biochemia Medica 20(1), 15– 32 (2010) 29. Miyazaki, Y., Mahankali, A., Matsuda, M., Mahankali, S., Hardies, J., Cusi, K., Mandarino, L.J., DeFronzo, R.A.: Effect of pioglitazone on abdominal fat distribution and insulin sensitivity in type 2 diabetic patients. J. Clin. Endocrinol. Metab. 87(6), 2784–2791 (2002) 30. Mohamed-Ali, V., Pinkney, J., Coppack, S.: Adipose tissue as an endocrine and paracrine organ. Int. J. Obes. 22(12), 1145–1158 (1998) 31. Moretto, M.C., Fontaine, A.M., Garcia, C.D.A.M.S., Neri, A.L., Guariento, M.E.: Association between race, obesity and diabetes in elderly community dwellers: data from the fibra study. Cadernos de saude publica, 32, e00081315 (2016) 32. Mortensen, M.B., Fuster, V., Muntendam, P., Mehran, R., Baber, U., Sartori, S., Falk, E.: Negative risk markers for cardiovascular events in the elderly. J. Am. Coll. Cardiol. 74(1), 1–11 (2019) 33. World Health Organization., et al.: Global status report on non communicable diseases 2014. World Health Organization, vol. 2014 (2014) ¨ ¨ S¨ 34. Ozer, S., Kazanc, N.O., onmezg¨ oz, E., Karaaslan, E., Altunta, B., Kuyucu, Y.E., et al.: Higher HDL levels are a preventive factor for metabolic syndrome in obese Turkish children. Nutricion hospitalaria 31(1), 307–312 (2015) 35. Parikh, R., Mathai, A., Parikh, S., Sekhar, G.C., Thomas, R.: Understanding and using sensitivity, specificity and predictive values. Ind. J. Ophthalmol. 56(1), 45 (2008) 36. Park, S.H., Choi, S.J., Lee, K.S., Park, H.Y.: Waist circumference and waist-toheight ratio as predictors of cardiovascular disease risk in Korean adults. Circ. J. 73(9), 1643–1650 (2009) 37. Perpi˜ nan, G., Severeyn, E., Altuve, M., Wong, S.: Classification of metabolic syndrome subjects and marathon runners with the k-means algorithm using heart rate variability features. In: 2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA), pp. 1–6. IEEE (2016) 38. Perrone-Filardi, P., Paolillo, S., Costanzo, P., Savarese, G., Trimarco, B., Bonow, R.O.: The role of metabolic syndrome in heart failure. Eur. Heart J. 36(39), 2630– 2634 (2015) 39. Polsky, S., Ellis, S.L.: Obesity, insulin resistance, and type 1 diabetes mellitus. Curr. Opinion Endocrinol. Diab. Obes. 22(4), 277–282 (2015) 40. Raz, I., Eldor, R., Cernea, S., Shafrir, E.: Diabetes: insulin resistance and derangements in lipid metabolism. Cure through intervention in fat transport and storage. Diab. Metab. Res. Rev. 21(1), 3–14 (2005) 41. Reitsma, J.B., Glas, A.S., Rutjes, A.W., Scholten, R.J., Bossuyt, P.M., Zwinderman, A.H.: Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J. Clin. Epidemiol. 58(10), 982–990 (2005) 42. R¨ onnecke, E., Vogel, M., Bussler, S., Grafe, N., Jurkutat, A., Schlingmann, M., Koerner, A., Kiess, W.: Age-and sex-related percentiles of skinfold thickness, waist and hip circumference, waist-to-hip ratio and waist-to-height ratio: results from a population-based pediatric cohort in Germany (life child). Obes. Facts 12(1), 25–39 (2019) 43. Rose, D., Komninou, D., Stephenson, G.: Obesity, adipocytokines, and insulin resistance in breast cancer. Obes. Rev. 5(3), 153–165 (2004)
44. Schneider, H.J., Friedrich, N., Klotsche, J., Pieper, L., Nauck, M., John, U., Dorr, M., Felix, S., Lehnert, H., Pittrow, D., et al.: The predictive value of different measures of obesity for incident cardiovascular events and mortality. J. Clin. Endocrinol. Metab. 95(4), 1777–1785 (2010) 45. Sj¨ ostr¨ om, C.D., H˚ akang˚ ard, A.C., Lissner, L., Sj¨ ostr¨ om, L.: Body compartment and subcutaneous adipose tissue distribution-risk factor patterns in obese subjects. Obes. Res. 3(1), 9–22 (1995) 46. Vel´ asquez, J., Severeyn, E., Herrera, H., Encalada, L., Wong, S.: Anthropometric index for insulin sensitivity assessment in older adults from Ecuadorian highlands. In: 12th International Symposium on Medical Information Processing and Analysis, vol. 10160, p. 101600S. International Society for Optics and Photonics (2017) 47. Vel´ asquez, J., Herrera, H., Encalada, L., Wong, S., Severeyn, E.: An´ alisis dimensional de variables antropom´etricas y bioqu´ımicas para diagnosticar el s´ındrome metab´ olico. Maskana 8, 57–67 (2017) 48. Vintimilla, C., Wong, S., Astudillo-Salinas, F., Encalada, L., Severeyn, E.: An AIDE diagnosis system based on k-means for insulin resistance assessment in eldery people from the Ecuadorian highlands. In: 2017 IEEE Second Ecuador Technical Chapters Meeting (ETCM), pp. 1–6. IEEE (2017) 49. Wei, Y., Wang, J., Han, X., Yu, C., Wang, F., Yuan, J., Miao, X., Yao, P., Wei, S., Wang, Y., et al.: Metabolically healthy obesity increased diabetes incidence in a middle-aged and elderly Chinese population. Diab./Metab. Res. Rev. 36(1), e3202 (2020) 50. Zhou, C., Zhan, L., Yuan, J., Tong, X., Peng, Y., Zha, Y.: Comparison of visceral, general and central obesity indices in the prediction of metabolic syndrome in maintenance hemodialysis patients, pp. 1–8. Eating and Weight Disorders-Studies on Anorexia, Bulimia and Obesity (2019)
Classification of Impaired Waist to Height Ratio Using Machine Learning Technique
Alexandra La Cruz1(B), Erika Severeyn2, Sara Wong3, and Gilberto Perpiñan4
1 Faculty of Engineering, Universidad de Ibagué, Ibagué, Tolima, Colombia [email protected]
2 Department of Thermodynamics and Transfer Phenomena, Universidad Simón Bolívar, Caracas, Venezuela [email protected]
3 Department of Electronics and Circuits, Universidad Simón Bolívar, Caracas, Venezuela [email protected]
4 Faculty of Electronic and Biomedical Engineering, Universidad Antonio Nariño, Cartagena, Colombia [email protected]
Abstract. Metabolic dysfunctions are a set of metabolic risk factors that include abdominal obesity, dyslipidemia, insulin resistance, among others. Individuals with any of these metabolic dysfunctions are at high risk of developing type 2 diabetes and cardiovascular disease. Several parameters and anthropometric indices are used to detect metabolic dysfunctions, such as waist circumference and waist-height ratio (WHtR). The WHtR has an advantage over the body mass index (BMI) since the WHtR provides information on the distribution of body fat, particularly abdominal fat. Central fat distribution is associated with more significant cardio-metabolic health risks than total body fat. Machine learning techniques involve algorithms capable of predicting and analyzing data, increasing our understanding of the events being studied. k-means is a clustering algorithm that has been used in the detection of obesity. This research aims to apply the k-means grouping algorithm to study its capability as an impaired WHtR classifier. Accuracy (Acc), recall (Rec), and precision (P) were calculated. A database of 1863 subjects was used; the database consists of fifteen (15) anthropometric variables and two (2) indices; each anthropometric variable was measured for each participant. The results reported in this research suggest that the k-means clustering algorithm is an acceptable classifier of impaired WHtR subjects (Acc = 0.81, P = 0.83, and Rec = 0.73). Besides, the k-means algorithm was able to detect subjects with overweight and fatty tissue deposits in the back and arm areas, suggesting that fat accumulation in these areas is directly related to abdominal fat accumulation.
Keywords: Waist to height ratio (WHtR) · k-means clustering algorithm · Anthropometric measurements
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 179–190, 2021. https://doi.org/10.1007/978-3-030-63665-4_14
1 Introduction
Metabolic dysfunctions are a set of metabolic risk factors that include abdominal obesity, hypertension, hyperglycemia, dyslipidemia, insulin resistance, metabolic syndrome [11], among others [7,16,18]. Individuals with any of these metabolic dysfunctions have a higher risk of developing type 2 diabetes and cardiovascular disease, and a higher risk of mortality, than those who are not obese [14]. Alarmingly, the prevalence of metabolic dysfunctions is increasing worldwide, and it is a significant public health concern [32,33,49]. The prevalence of the diseases that are a consequence of metabolic dysfunctions is overwhelming. For instance, cardiovascular diseases are the leading cause of death globally, claiming an estimated 17.9 million lives each year, with about 80% of the burden occurring in countries with low- and middle-income populations [13,24]. On the other hand, in 2017 there were 451 million people with diabetes worldwide between 18 and 99 years of age, a figure expected to increase to 693 million by 2045, and 49.7% of those living with diabetes are undiagnosed [5,8]. By 2016, more than 1.9 billion adults, 18 years of age or older, were overweight; of these, more than 34% were obese [31]. Several anthropometric parameters and indices are used to detect metabolic dysfunctions [40,41,47]. Different measures are used to detect obesity, such as waist circumference and the waist to height ratio (WHtR) [3,38,46]. The WHtR has an advantage over body mass index (BMI), since the WHtR provides information about body fat distribution [50]. Central fat distribution is associated with greater cardio-metabolic health risks than total body fat [15]. Depending on the type of obesity, there are other indices that help evaluate where the fatty tissue of the body accumulates; for example, skinfold thickness and waist circumference indicate abdominal fat deposit [42]. Cardiovascular diseases have also been predicted from abdominal obesity indices such as waist circumference, WHtR, and the waist-to-hip ratio [17]. On the other hand, waist circumference, insulin resistance, and dyslipidemia have been used as predictors of metabolic syndrome [1]. The WHtR is closely linked to cardiovascular risk, and analysis of the receiver operating characteristic (ROC) curve indicated a cut-off point of less than 0.5 as the normal limit for non-obese, normal-weight adults aged between 18 and 60 [6]. Meanwhile, the ROC analysis showed a cut-off point of less than 0.6 for adults over 60 years old [10]. Studies show that, while waist circumference is strongly correlated with BMI, the association of diabetes with waist circumference is greater than its association with BMI, which is a measure of overall fat [25]. Similarly, waist circumference exhibits a greater ability to predict abnormal fasting blood glucose compared to BMI [19,34]. However, the cut-off points for BMI and waist circumference depend on the ethnicity of the populations studied [21]. This is not the case with the WHtR, which is a dimensionless index expressing fat accumulation in the abdominal area. Abdominal fat tissue is related to the reduction of insulin activity through increased fatty acids [28]. Among many anthropometric parameters, the WHtR and waist circumference are better measurements of abdominal fat distribution [44].
Besides, these measures are better than BMI for predicting cardiovascular diseases [26]. Machine learning techniques consist of a set of algorithms that can analyze data, expanding our understanding of phenomena [9]. There have been applications of machine learning to metabolic studies: the k-means clustering method, a type of machine learning technique, has been applied to classify overweight and obese patients using oral glucose tolerance test values [27]. Likewise, the k-means method has been used to detect subjects with metabolic syndrome using fasting glucose and insulin values [48], and cardiovascular parameters [35]. Satisfactory results have been found in these studies, suggesting that the method is well suited to this type of research. This research aims to apply the k-means clustering technique to study the classification capability of the WHtR. For this purpose, a database of 1863 subjects was used. The database consists of fifteen (15) anthropometric variables and two (2) indices; each anthropometric variable was measured for each participant. The following sections will describe the methodology used in this research. Section 3 will show the results obtained; in Sect. 4, the results will be discussed; and finally, in Sect. 5, the conclusions and future work will be presented.
2 Methodology

2.1 Database
The database used in this research was collected at the Nutritional Assessment Laboratory of the Simón Bolívar University in Venezuela; 1863 (male = 678) adult men and women were recruited in the Capital District of Venezuela [22] (soon to be opened to the public). Fifteen (15) anthropometric measurements were made on each subject; they were: height, weight, arm circumferences, bent arm circumferences, waist circumference, hip circumferences, subscapular skinfolds, suprailiac skinfolds, abdominal skinfolds, and humerus and femur epicondyle diameters. Table 2 shows the characteristics of the database used for normal and impaired WHtR subjects. The diagnosis of impaired WHtR was done using [6,10], which state that impaired WHtR is diagnosed when the WHtR is greater than or equal to 0.5 [10] for persons aged 18 to 60 years, and greater than or equal to 0.6 for persons aged 60 years or older [6]. The WHtR was calculated using the following equation (Eq. 1):

WHtR = W / H   (1)
where W is the waist perimeter of the subject measured in centimeters and H is the height of the subject, also measured in centimeters. For this research, all subjects signed an informed consent form. All the procedures carried out in the study followed the ethical standards of the Bioethics Committee of the Simón Bolívar University, and the 1964 Declaration of Helsinki and its subsequent amendments or comparable ethical standards.
2.2 K-Means Implemented
The k-means algorithm is one of the most widespread partitioning methods. Among the origins of this algorithm, MacQueen (1967) is mentioned as its precursor [29]. Since then, variations have been presented, such as the proposals of Anderberg (1973) [2] and Hartigan (1975) [20]. Given an initial number of k clusters, the goal of the k-means algorithm is to minimize the Euclidean distance of the elements within each cluster with respect to its center. The similarity of the objects within each cluster is measured with respect to their mean vector, usually called the centroid. The stop criterion is the mean square error. In this study, the k-means method was applied to the WHtR vector with k = 2 clusters. The distance between observations and the centroid was calculated using the squared Euclidean distance, and the clustering was repeated ten (10) times to avoid a local minimum. The silhouette coefficient (SC) was used to evaluate the assignment of the data set to the clusters [39]. A sketch of this setup is given below.
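A minimal Python sketch of this clustering step (using scikit-learn rather than whatever software the authors used; the random seed is an added assumption) could look like this:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def cluster_whtr(whtr):
    """k-means with k = 2 on the WHtR vector, with 10 restarts to avoid a local
    minimum. Returns the cluster label of each subject, the two centroids, and
    the mean silhouette coefficient used to evaluate the assignment."""
    X = np.asarray(whtr, dtype=float).reshape(-1, 1)
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    return km.labels_, km.cluster_centers_.ravel(), silhouette_score(X, km.labels_)
```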
2.3 Metrics (Accuracy, Precision, and Recall)
A confusion matrix was created from the result of the classifier [12,36], and several metrics were then calculated; these metrics allow evaluating the ability of the k-means method to detect impaired WHtR subjects. The confusion matrix is a table showing the classification discrepancies between the clustering model and the known classes. Table 1 is a representation of a confusion matrix: each column is one class (Class1, ..., Classn), and the rows represent the experimental classification. The numbers on the main diagonal (A11, ..., Ann) are the correct classifications made by the k-means method, and n is the total number of classes. In this study, the confusion matrix has two classes (n = 2), since the goal is to separate impaired WHtR from normal WHtR subjects.

Table 1. Confusion matrix.
True/Predicted | Class1 | Class2 | ... | Classn
Class1 | A11 | A12 | ... | A1n
Class2 | A21 | A22 | ... | A2n
... | ... | ... | ... | ...
Classn | An1 | An2 | ... | Ann
The accuracy (Acc) [36] represents the proportion of correctly classified instances out of the total. It is calculated from Eq. 2, where Aij is an instance for i = 1, ..., n and j = 1, ..., n, with n the total number of classes:

Acc = (Σi Aii) / (Σi Σj Aij)   (2)

In Eq. 3, the precision formula (Pi) is shown [36]. The Pi for a given Classi represents the proportion of correctly classified instances of Classi (Aii) out of the total classifications that were assigned to Classi (Aji). The precision reported in this research is the mean of the precision over all classes:

Pi = Aii / (Σj Aji)   (3)

In Eq. 4, the formula for the recall (Ri) [12] is shown. The Ri of a Classi is the rate of correctly classified instances of Classi (Aii) out of the total number of cases with Classi as the correct label (Aij). The recall reported in this study is the mean of the recall over all classes:

Ri = Aii / (Σj Aij)   (4)

A sketch of these computations is given below.
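The following Python sketch computes the confusion matrix and Eqs. (2)–(4) for the two-class case used here; the mapping of a cluster index to the impaired class is an assumption of the example (in practice it is the cluster whose centroid has the higher WHtR).

```python
import numpy as np

def clustering_metrics(true_impaired, cluster_labels, impaired_cluster):
    """Confusion matrix, accuracy (Eq. 2), and class-averaged precision (Eq. 3)
    and recall (Eq. 4) for a two-class k-means result."""
    truth = np.asarray(true_impaired, dtype=bool)
    pred = np.asarray(cluster_labels) == impaired_cluster
    # rows = true class (normal, impaired), columns = predicted class
    cm = np.array([[np.sum(~truth & ~pred), np.sum(~truth & pred)],
                   [np.sum(truth & ~pred),  np.sum(truth & pred)]])
    acc = np.trace(cm) / cm.sum()                       # Eq. (2)
    precision = np.mean(np.diag(cm) / cm.sum(axis=0))   # Eq. (3), mean over classes
    recall = np.mean(np.diag(cm) / cm.sum(axis=1))      # Eq. (4), mean over classes
    return cm, acc, precision, recall
```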
2.4 Statistical Analysis
The samples were unpaired and did not follow a normal distribution; therefore, the non-parametric Mann-Whitney U test was used to determine the differences between pairs of groups, and a p-value of less than 5% was considered statistically significant [30]. The data in Table 2 are presented as mean and standard deviation values. A minimal example of this test is given below.
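As an illustration, the comparison of one anthropometric variable between the two groups can be done with SciPy's implementation of the test (a sketch; the arrays are placeholders, not the real measurements):

```python
from scipy.stats import mannwhitneyu

# e.g. BMI values of the normal-WHtR group vs. the impaired-WHtR group
bmi_normal = [21.4, 22.0, 20.8, 23.1]      # placeholder data
bmi_impaired = [27.9, 26.5, 28.3, 25.7]    # placeholder data

stat, p_value = mannwhitneyu(bmi_normal, bmi_impaired, alternative="two-sided")
significant = p_value < 0.05               # 5% significance level used in the paper
```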
3 Results
Table 2 summarizes the anthropometric values of normal and impaired WHtR subjects. The database consists of 1863 subjects; 80.62% belong to the control group and 19.38% have an impaired WHtR. The classification was made according to [6,10]: subjects with WHtR ≥ 0.5 were classified as impaired WHtR in the case of adults between 18 and 60 years old, and with WHtR ≥ 0.6 in the case of adults over 60 years old. Figure 1 shows the silhouette of the k-means clustering (k = 2) for impaired WHtR diagnosis. Table 3 shows the confusion matrix, centroids, silhouette coefficient, and metrics (accuracy, precision, and recall). Table 4 shows the anthropometric characteristics of the two clusters created by applying the k-means clustering technique to the WHtR variable.
4 Discussion
Table 2 shows the anthropometric characteristics of the subjects with normal and impaired WHtR. It can be seen that there are statistically significant differences (p < 0.05) in all variables except height. Variables such as folds, circumferences, and BMI show higher values in subjects with impaired WHtR compared to subjects with normal WHtR [4]. This could indicate that subjects with impaired WHtR have larger fat deposits, which are reflected not only in the
Table 2. Anthropometric variables characteristics for normal and impaired WHtR.
Anthropometric variable | Normal WHtR (Male = 533, n = 1502) | Impaired WHtR (Male = 145, n = 362)
Age (a) [years] | 42.96 ± 28.29 (b) | 56.01 ± 27.74
Weight (a) [Kg] | 56.37 ± 9.86 | 68.56 ± 12.93
Height [cm] | 160.60 ± 9.20 | 156.98 ± 11.41
Right arm circumference (a) [cm] | 25.79 ± 2.91 | 30.22 ± 3.67
Left arm circumference (a) [cm] | 25.62 ± 2.90 | 30.14 ± 3.69
Right flexed arm circumference (a) [cm] | 26.68 ± 3.06 | 30.89 ± 3.80
Left flexed arm circumference (a) [cm] | 26.41 ± 3.29 | 30.65 ± 3.80
Waist circumference (a) [cm] | 91.07 ± 6.62 | 96.00 ± 9.90
Hip circumference (a) [cm] | 74.53 ± 9.51 | 101.13 ± 9.60
Right subscapular skin fold (a) [mm] | 12.97 ± 4.79 | 21.69 ± 7.01
Left subscapular skin fold (a) [mm] | 13.13 ± 4.78 | 21.92 ± 7.04
Right suprailiac skin fold (a) [mm] | 12.88 ± 5.97 | 22.73 ± 6.93
Left suprailiac skin fold (a) [mm] | 12.95 ± 6.00 | 22.53 ± 6.92
Right abdominal skin fold (a) [mm] | 21.59 ± 6.74 | 30.72 ± 6.91
Left abdominal skin fold (a) [mm] | 22.05 ± 6.84 | 30.94 ± 7.08
Right humerus diameter epicondylar (a) [cm] | 6.15 ± 0.66 | 6.47 ± 0.62
Left humerus diameter epicondylar (a) [cm] | 6.13 ± 0.65 | 6.45 ± 0.60
Right femur diameter epicondylar (a) [cm] | 9.04 ± 0.74 | 9.74 ± 0.83
Left femur diameter epicondylar (a) [cm] | 9.03 ± 0.74 | 9.71 ± 0.82
BMI (a) [Kg/m2] | 21.80 ± 3.03 | 27.77 ± 4.19
Waist to height ratio (a) [dimensionless] | 0.46 ± 0.06 | 0.61 ± 0.07
(a) Statistically significant difference (p-value < 0.05) between normal WHtR and impaired WHtR. (b) Average and standard deviation.
abdominal area but also in the arm and back areas [23]. Significant differences in age are also observed between both groups: subjects with impaired WHtR have a higher average age than subjects with normal WHtR. This could be because there is a directly proportional relationship between increasing age and increasing fat tissue deposits [45]. Table 3 shows that the unsupervised k-means classification method can classify impaired WHtR subjects with an accuracy of 0.81, a precision of 0.83 (showing that it has a high probability of classifying normal WHtR subjects), and a recall of 0.73 (meaning that it is capable of detecting impaired WHtR subjects) [36]. Cluster 1 concentrates 80.02% of the normal WHtR subjects and Cluster 2 concentrates 86.15% of the impaired WHtR subjects. The silhouette coefficient is greater than 0.5, meaning that all subjects were assigned to a cluster. In the k-means classification, the age of Cluster 2 is higher (p < 0.05) than that of Cluster 1; this
difference was also observed in Table 2. These findings corroborate studies that affirm the relationship between increased WHtR and cardio-metabolic risk with increasing age [43]. Suprailiac, subscapular, and abdominal folds have higher values (p < 0.05) for subjects in Cluster 2 than for subjects in Cluster 1 (Table 4). This indicates that the WHtR may discriminate subjects with accumulations of fatty tissue in the back and abdomen. Similarly, for abdominal circumference, arm circumference, and flexed arm circumference, Cluster 2 subjects showed higher values (p < 0.05) than Cluster 1 subjects, suggesting that the WHtR also discriminates subjects with accumulations of fatty tissue in the arms. All these findings suggest that the proliferation of fatty tissue in the arms and back is related to accumulations of fatty tissue in the abdominal area [37].

Fig. 1. Silhouette of k-means clustering in WHtR.
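Tables 2 and 4 flag the variables with statistically significant differences (p < 0.05), but the excerpt does not state which statistical test was used (see [30] for guidance on choosing one). Purely as a hedged illustration, a per-variable comparison between the two clusters could be run with a non-parametric test such as Mann–Whitney U; the synthetic samples below are placeholders shaped like the BMI row of Table 4.

```python
import numpy as np
from scipy import stats

def clusters_differ(values_c1, values_c2, alpha=0.05):
    """Return (p_value, significant) for one anthropometric variable.
    Mann-Whitney U is only one reasonable choice; the paper does not state
    which test produced the p-values reported in Tables 2 and 4."""
    _, p_value = stats.mannwhitneyu(values_c1, values_c2, alternative="two-sided")
    return p_value, p_value < alpha

# Hypothetical usage with synthetic BMI samples (means/SDs taken from Table 4):
rng = np.random.default_rng(0)
bmi_c1 = rng.normal(21.43, 2.85, size=1252)
bmi_c2 = rng.normal(26.09, 4.31, size=611)
p, significant = clusters_differ(bmi_c1, bmi_c2)
```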
Table 3. Confusion matrix, centroids, metrics, and silhouette coefficient of the k-means clustering method.

| True/Predicted | Normal WHtR | Impaired WHtR | Centroid |
|---|---|---|---|
| Cluster 1 | 1202 | 50 | 0.444 |
| Cluster 2 | 300 | 311 | 0.596 |

| Metrics | Values |
|---|---|
| Accuracy | 0.81 |
| Precision | 0.83 |
| Recall | 0.73 |
| Silhouette coefficient | 0.63 ± 0.19 |
Table 4. Anthropometric variable characteristics of Cluster 1 and Cluster 2 of the k-means algorithm applied to WHtR.

| Anthropometric variables | Cluster 1 (Male = 398, n = 1252) | Cluster 2 (Male = 280, n = 611) |
|---|---|---|
| Age^a [years] | 33.92 ± 24.47^b | 69.21 ± 20.96 |
| Weight^a [kg] | 56.04 ± 9.95 | 64.26 ± 10.55 |
| Height [cm] | 161.42 ± 8.99 | 156.79 ± 11.41 |
| Right arm circumference^a [cm] | 25.62 ± 2.88 | 28.76 ± 3.81 |
| Left arm circumference^a [cm] | 25.48 ± 2.89 | 28.58 ± 3.85 |
| Right flexed arm circumference^a [cm] | 26.57 ± 3.07 | 29.38 ± 3.91 |
| Left flexed arm circumference^a [cm] | 26.31 ± 3.36 | 29.10 ± 3.92 |
| Waist circumference^a [cm] | 91.04 ± 6.76 | 97.07 ± 9.62 |
| Hip circumference^a [cm] | 71.62 ± 7.13 | 93.18 ± 9.13 |
| Right subscapular skin fold^a [mm] | 12.60 ± 4.57 | 18.88 ± 7.24 |
| Left subscapular skin fold^a [mm] | 12.72 ± 4.60 | 19.15 ± 7.16 |
| Right suprailiac skin fold^a [mm] | 12.15 ± 5.61 | 20.20 ± 7.36 |
| Left suprailiac skin fold^a [mm] | 12.24 ± 5.64 | 20.07 ± 7.36 |
| Right abdominal skin fold^a [mm] | 21.24 ± 6.40 | 27.70 ± 8.22 |
| Left abdominal skin fold^a [mm] | 21.83 ± 6.59 | 27.76 ± 8.34 |
| Right humerus diameter epicondylar^a [cm] | 6.06 ± 0.64 | 6.52 ± 0.61 |
| Left humerus diameter epicondylar^a [cm] | 6.05 ± 0.63 | 6.47 ± 0.59 |
| Right femur diameter epicondylar^a [cm] | 8.95 ± 0.71 | 9.65 ± 0.80 |
| Left femur diameter epicondylar^a [cm] | 8.94 ± 0.70 | 9.62 ± 0.79 |
| BMI^a [kg/m²] | 21.43 ± 2.85 | 26.09 ± 4.31 |
| Waist-to-height ratio [dimensionless]^a | 0.44 ± 0.04 | 0.60 ± 0.06 |

^a Statistically significant difference (p-value < 0.05) between Cluster 1 and Cluster 2. ^b Average and standard deviation.
5 Conclusions
The findings reported in this research suggest that the diagnosis of an impaired waist-to-height ratio can be carried out by using the k-means clustering algorithm as a classifier. Moreover, the k-means algorithm detected overweight subjects with deposits of fatty tissue in the back and arm areas, suggesting that the back and arm areas have a direct relation with abdominal fat accumulation. As future work, other machine learning techniques, such as neural networks and support vector machines, will be used. An interesting study would be to compare the results using other metrics, such as BMI.

Acknowledgment. This work was funded by the Research and Development Deanery of the Simón Bolívar University (DID) and the Research Direction of the Ibagué University. Full acknowledgment is given to David Powers, author of “Evaluation: From Precision, Recall and F-Factor to ROC, Informedness, Markedness & Correlation” (BioInfo Publications™).
References 1. Abdelsalam, H.A., Fayed, S.B., Mohamed, A.G.: Homeostasis model assessment of insulin resistance in obese children as a predictor for metabolic syndrome. Egypt. J. Hosp. Med. 77(6), 5817–5824 (2019) 2. Anderberg, M.R.: Cluster Analysis for Applications: Probability and Mathematical Statistics: A Series of Monographs and Textbooks. Academic Press, Cambridge (2014) 3. Ashtary-Larky, D., Daneghian, S., Alipour, M., Rafiei, H., Ghanavati, M., Mohammadpour, R., Kooti, W., Ashtary-Larky, P., Afrisham, R.: Waist circumference to height ratio: better correlation with fat mass than other anthropometric indices during dietary weight loss in different rates. Int. J. Endocrinol. Metab., 16(4) (2018) 4. Ashwell, M., Gunn, P., Gibson, S.: Waist-to-height ratio is a better screening tool than waist circumference and BMI for adult cardiometabolic risk factors: systematic review and meta-analysis. Obes. Rev. 13(3), 275–286 (2012) 5. Basu, S., Yudkin, J.S., Kehlenbrink, S., Davies, J.I., Wild, S.H., Lipska, K.J., Sussman, J.B., Beran, D.: Estimation of global insulin use for type 2 diabetes, 2018–30: a microsimulation analysis. Lancet Diab. Endocrinol. 7(1), 25–33 (2019) 6. Browning, L.M., Hsieh, S.D., Ashwell, M.: A systematic review of waist-to-height ratio as a screening tool for the prediction of cardiovascular disease and diabetes: 0.5 could be a suitable global boundary value. Nutr. Res. Rev. 23(2), 247–269 (2010) 7. Carr, M.C., Brunzell, J.D.: Abdominal obesity and dyslipidemia in the metabolic syndrome: importance of type 2 diabetes and familial combined hyperlipidemia in coronary artery disease risk. J. Clin. Endocrinol. Metab. 89(6), 2601–2607 (2004) 8. Cho, N., Shaw, J., Karuranga, S., Huang, Y., da Rocha Fernandes, J., Ohlrogge, A., Malanda, B.: IDF diabetes atlas: global estimates of diabetes prevalence for 2017 and projections for 2045. Diab. Res. Clin. Pract. 138, 271–281 (2018) 9. Costello, Z., Martin, H.G.: A machine learning approach to predict metabolic pathway dynamics from time-series multiomics data. NPJ Syst. Biol. Appl. 4(1), 1–14 (2018)
10. Dong, J., Wang, S.S., Chu, X., Zhao, J., Liang, Y.Z., Yang, Y.B., Yan, Y.X.: Optimal cut-off point of waist to height ratio in Beijing and its association with clusters of metabolic risk factors. Curr. Med. Sci. 39(2), 330–336 (2019) 11. Farina, P.V.R., Severeyn, E., Wong, S., Turiel, J.P.: Study of cardiac repolarization during oral glucose tolerance test in metabolic syndrome patients. In: 2012 Computing in Cardiology, pp. 429–432. IEEE (2012) 12. Fawcett, T.: An introduction to ROC analysis. Pattern Recogn. Lett. 27(8), 861– 874 (2006) 13. Fitzmaurice, C., Allen, C., Barber, R.M., Barregard, L., Bhutta, Z.A., Brenner, H., Dicker, D.J., Chimed-Orchir, O., Dandona, R., Dandona, L., et al.: Global, regional, and national cancer incidence, mortality, years of life lost, years lived with disability, and disability-adjusted life-years for 32 cancer groups, 1990 to 2015: a systematic analysis for the global burden of disease study. JAMA Oncol. 3(4), 524–548 (2017) 14. Ginsberg, H.N., MacCallum, P.R.: The obesity, metabolic syndrome, and type 2 diabetes mellitus pandemic: Part i. increased cardiovascular disease risk and the importance of atherogenic dyslipidemia in persons with the metabolic syndrome and type 2 diabetes mellitus. J. Cardiometabolic Syndr. 4(2), 113–119 (2009) 15. Goossens, G.H.: The metabolic phenotype in obesity: fat mass, body fat distribution, and adipose tissue function. Obes. Facts 10(3), 207–215 (2017) 16. Grundy, S.M., Cleeman, J.I., Daniels, S.R., Donato, K.A., Eckel, R.H., Franklin, B.A., Gordon, D.J., Krauss, R.M., Savage, P.J., Smith Jr., S.C., et al.: Diagnosis and management of the metabolic syndrome: an American heart association/national heart, lung, and blood institute scientific statement. Circulation 112(17), 2735–2752 (2005) 17. Hajian-Tilaki, K., Heidari, B.: Comparison of abdominal obesity measures in predicting of 10-year cardiovascular risk in an Iranian adult population using ACC/AHA risk model: a population based cross sectional study. Diab. Metab. Syndr.: Clin. Res. Rev. 12(6), 991–997 (2018) 18. Hansel, B., Giral, P., Nobecourt, E., Chantepie, S., Bruckert, E., Chapman, M.J., Kontush, A.: Metabolic syndrome is associated with elevated oxidative stress and dysfunctional dense high-density lipoprotein particles displaying impaired antioxidative activity. J. Clin. Endocrinol. Metab. 89(10), 4963–4971 (2004) 19. Hardiman, S.L., Bernanthus, I.N., Rustati, P.K., Susiyanti, E.: Waist circumference as a predictor for blood glucose levels in adults. Universa Medicina 28(2), 77–82 (2016) 20. Hartigan, J.A., Wong, M.A.: Algorithm as 136: a k-means clustering algorithm. J. Royal Stat. Soc. Ser. c (Appl. Stat.) 28(1), 100–108 (1979) 21. He, J., Ma, R., Liu, J., Zhang, M., Ding, Y., Guo, H., Mu, L., Zhang, J., Wei, B., Yan, Y., et al.: The optimal ethnic-specific waist-circumference cut-off points of metabolic syndrome among low-income rural Uyghur adults in far western China and implications in preventive public health. Int. J. Environ. Res. Public Health 14(2), 158 (2017) 22. Herrera, H., Rebato, E., Arechabaleta, G., Lagrange, H., Salces, I., Susanne, C.: Body mass index and energy intake in Venezuelan university students. Nutr. Res. 23(3), 389–400 (2003) 23. Hubert, H., Guinhouya, C.B., Allard, L., Durocher, A.: Comparison of the diagnostic quality of body mass index, waist circumference and waist-to-height ratio in screening skinfold-determined obesity among children. J. Sci. Med. Sport 12(4), 449–451 (2009)
24. Kontis, V., Cobb, L.K., Mathers, C.D., Frieden, T.R., Ezzati, M., Danaei, G.: Three public health interventions could save 94 million lives in 25 years: global impact assessment analysis. Circulation 140(9), 715–725 (2019) 25. Kuciene, R., Dulskiene, V.: Associations between body mass index, waist circumference, waist-to-height ratio, and high blood pressure among adolescents: a crosssectional study. Sci. Rep. 9(1), 1–11 (2019) 26. Lee, J.E.: Simply the best: anthropometric indices for predicting cardiovascular disease. Diab. Metab. J. 43(2), 156–157 (2019) 27. Li, L., Song, Q., Yang, X.: Categorization of β-cell capacity in patients with obesity via ogtt using k-means clustering. Endocr. Connections 9(2), 135–143 (2020) 28. Lopes, H.F., Corrˆea-Giannella, M.L., Consolim-Colombo, F.M., Egan, B.M.: Visceral adiposity syndrome. Diabetology Metab. Syndr. 8(1), 40 (2016) 29. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Statistics, vol. 1, pp. 281–297. University of California Press, Berkeley, Calif. (1967). https://projecteuclid.org/euclid.bsmsp/1200512992 30. Marusteri, M., Bacarea, V.: Comparing groups for statistical differences: how to choose the right statistical test? Biochemia Medica: Biochemia Medica 20(1), 15– 32 (2010) 31. Moussa, H.N., Alrais, M.A., Leon, M.G., Abbas, E.L., Sibai, B.M.: Obesity epidemic: impact from preconception to postpartum. Future Sci. OA, 2(3), FSO137 (2016) 32. Niehues, J.R., Gonzales, A.I., Lemos, R.R., Bezerra, P.P., Haas, P.: Prevalence of overweight and obesity in children and adolescents from the age range of 2 to 19 years old in Brazil. Int. J. Pediatr. 2014, (2014) 33. Organization, W.H.: Obesity: preventing and managing the global epidemic. World Health Organization (2000) 34. Pang, S.J., Man, Q.Q., Song, S., Song, P.K., Liu, Z., Li, Y.Q., Jia, S.S., Wang, J.Z., Zhao, W.H., Zhang, J.: Relationships of insulin action to age, gender, body mass index, and waist circumference present diversely in different glycemic statuses among Chinese population. J. Diab. Res. 2018, (2018) 35. Perpi˜ nan, G., Severeyn, E., Altuve, M., Wong, S.: Classification of metabolic syndrome subjects and marathon runners with the k-means algorithm using heart rate variability features. In: 2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA), pp. 1–6. IEEE (2016) 36. Powers, D.M.W.: Evaluation: from precision, recall and f-measure to roc, informedness, markedness and correlation. J. Mach. Learn. Technol. 2(1), 37–63 (2011) 37. Ram´ırez-V´elez, R., L´ opez-Cifuentes, M.F., Correa-Bautista, J.E., Gonz´ alez-Ru´ız, K., Gonz´ alez-Jim´enez, E., C´ ordoba-Rodr´ıguez, D.P., Vivas, A., Triana-Reina, H.R., Schmidt-RioValle, J.: Triceps and subscapular skinfold thickness percentiles and cut-offs for overweight and obesity in a population-based sample of schoolchildren and adolescents in Bogota. Colombia. Nutrients 8(10), 595 (2016) 38. Rivera-Soto, W.T., Rodr´ıguez-Figueroa, L.: Is waist-to-height ratio a better obesity risk-factor indicator for puerto rican children than is BMI or waist circumference? Puerto Rico Health Sci. J. 35(1) (2016) 39. Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987) 40. 
Sgambat, K., Clauss, S., Moudgil, A.: Comparison of BMI, waist circumference, and waist-to-height ratio for identification of subclinical cardiovascular risk in pediatric kidney transplant recipients. Pediatr. Transplant. 22(8), e13300 (2018)
41. Sgambat, K., Roem, J., Mitsnefes, M., Portale, A.A., Furth, S., Warady, B., Moudgil, A.: Waist-to-height ratio, body mass index, and cardiovascular risk profile in children with chronic kidney disease. Pediatr. Nephrol. 33(9), 1577–1583 (2018) 42. Shen, F., Wang, Y., Sun, H., Zhang, D., Yu, F., Yu, S., Han, H., Wang, J., Ba, Y., Wang, C., et al.: Vitamin D receptor gene polymorphisms are associated with triceps skin fold thickness and body fat percentage but not with body mass index or waist circumference in Han Chinese. Lipids Health Dis. 18(1), 97 (2019) 43. Stocks, T., Lukanova, A., Bjørge, T., Ulmer, H., Manjer, J., Almquist, M., Concin, H., Engeland, A., Hallmans, G., Nagel, G., et al.: Metabolic factors and the risk of colorectal cancer in 580,000 men and women in the metabolic syndrome and cancer project (me-can). Cancer 117(11), 2398–2407 (2011) 44. Swainson, M.G., Batterham, A.M., Tsakirides, C., Rutherford, Z.H., Hind, K.: Prediction of whole-body fat percentage and visceral adipose tissue mass from five anthropometric variables. PloS One 12(5), e0177175 (2017) 45. Tchkonia, T., Morbeck, D.E., Von Zglinicki, T., Van Deursen, J., Lustgarten, J., Scrable, H., Khosla, S., Jensen, M.D., Kirkland, J.L.: Fat tissue, aging, and cellular senescence. Aging Cell 9(5), 667–684 (2010) 46. Vel´ asquez, J., Severeyn, E., Herrera, H., Encalada, L., Wong, S.: Anthropometric index for insulin sensitivity assessment in older adults from Ecuadorian highlands. In: 12th International Symposium on Medical Information Processing and Analysis, vol. 10160, p. 101600S. International Society for Optics and Photonics (2017) 47. Vel´ asquez, J., Herrera, H., Encalada, L., Wong, S., Severeyn, E.: An´ alisis dimensional de variables antropom´etricas y bioqu´ımicas para diagnosticar el s´ındrome metab´ olico. Maskana 8, 57–67 (2017) 48. Vintimilla, C., Wong, S., Astudillo-Salinas, F., Encalada, L., Severeyn, E.: An aide diagnosis system based on k-means for insulin resistance assessment in eldery people from the Ecuadorian highlands. In: 2017 IEEE Second Ecuador Technical Chapters Meeting (ETCM), pp. 1–6. IEEE (2017) 49. Visscher, T.L., Seidell, J.C.: The public health impact of obesity. Ann. Rev. Public Health 22(1), 355–375 (2001) 50. Zerf, M., Atouti, N., Ben, F.A.: Abdominal obesity and their association with total body: fat distribution and composition. case of Algerian teenager male high school students. Phys. Educ. Stud. 21(3), 146–151 (2017)
Machine Vision
Creating Shapefile Files in ArcMap from KML File Generated in My Maps

Graciela Flores(B) and Cristian Gallardo

Tomsk Polytechnic University, 30 Lenin Avenue, Tomsk, Russia
{yolandagrasiela1,kristianmaurisio1}@tpu.ru
https://www.tpu.ru/
Abstract. This document describes a process of digitizing boundaries: a KML file is generated in My Maps, imported into ArcMap 10.3, and finally converted into shapefile files. The processes carried out within My Maps are detailed: delimitation of the study area, digitization of geographic boundaries, and export of KML files. Likewise, the processes carried out within ArcMap are detailed: import of the KML file and conversion to a shapefile. In addition, two possible cases of digitization of the geographical limits within My Maps are shown, along with solutions to potential errors that may arise during the process; finally, an experimental result is presented in which the generated shapefile files are used to make cuts and analyze information. Compared to digitizing geographic boundaries in ArcMap 10.3, working with My Maps provides better image resolution and a better understanding of the map’s features, which is advantageous when digitizing.

Keywords: ArcGIS · ArcMap · Shapefile · My maps · Keyhole markup language (KML)

1 Introduction
Geographic information systems (GIS) were developed in the early 1960s by geographer Roger Tomlinson, known as the father of GIS, who proposed the term “geographic information system”. Tomlinson led the creation of a database framework with the capacity to store and analyze large amounts of data for the Canadian government, implementing this project in the National Land-Use Management Program [1,2]. GIS allow us to capture, store, filter, analyze, and visualize geospatial data (geodata) by integrating various layers of information [3–6]. GIS are made up of hardware, software, and data.

Geodata is characterized as information on geographic locations stored in a vector file or a raster file. It can be visualized in two ways: by geographic coordinates or as points on a map [7]. Vector files are made up of vertices and paths, with their three basic shapes being points, lines, and polygons. Raster files are made up of pixels, each with associated values.
Several GIS applications can be found on the market; the most popular are ArcGIS (commercial software) and QGIS (free software). ArcGIS is an integrated platform that allows developers to create, manipulate, share, and analyze geodata. It is made up of several components that communicate with each other and can run on desktops, mobiles, servers, and online services [8]. Geodata analysis supports decision-making in various areas, such as medicine, engineering, environment, military and defense, business, and public safety [3, 6, 9–11]. Geodata has been applied in studies on patterns of human activity and environmental impact, such as fertility, mortality, disease, crop yields, roads, cities, administrative divisions, planning, and response to natural disasters [12].

To carry out a correct analysis of the geodata, the delimitation of the study area must be adequate. Within ArcMap, one of the most used methods is digitization using raster files. However, in this method, the most important point is the resolution of the image, since the use of low-resolution images can make it difficult to identify the entities contained in them (streets, rivers, buildings, etc.). Given the importance of GIS outlined above, and to verify that the delimitation used in our project coincides with the delimitation established in official documents, this article presents an alternative to the boundary digitization process. A study is carried out on the procedures for searching geographic boundaries in official documents, digitizing these boundaries, exporting Keyhole Markup Language (KML) files using My Maps, and importing and creating a shapefile in ArcMap 10.3. An experimental case of the process applied to the city of Tomsk, Russian Federation, is presented.

This document is divided into 5 sections, including the introduction. Section 2 details the process of creating a shapefile. Then Sect. 3 presents a case study of digitization of limits. The results of the case study are presented in Sect. 4. Finally, the conclusions are given in Sect. 5.
2 Shapefile Files Creation Process
The process of creating a shapefile begins with the need to study a specific area. Borders and limits have been approached in multiple ways throughout history. A boundary refers to the physical delimitation of a territory. Such delimitations have different connotations depending on the scope in which they are applied, differing, for example, between the political-administrative and the demographic-economic domains [3]. The delimitation of a territory can be found in official documents obtained from state, municipal, or international organizations, among others. Based on the official limits, the limits are digitized.
2.1 Shapefile Format
The shapefile format, developed by the Environmental Systems Research Institute (ESRI), is the most widespread and popular within the GIS community. The supported features are points, lines, and polygons. It uses the dBASE file format (file.dbf) to store attributes and non-topological geometry. It requires little disk space and is easy to read and write. Although this type of format is widely used by many institutions, mainly due to its ease of conversion to other formats, it is important to take its disadvantages into account [13,14].

2.2 Raster Format
Raster files are made up of pixels, also known as cell meshes, that associate values to each pixel. Digital aerial photographs, satellite images, digital images, or scanned maps are examples of raster files [15]. These images can offer information regarding the elevation of the terrain, temperature, vegetation cover, or in turn, allow the analysis of light of a specific wavelength. There are three ways to work with raster data in ArcGIS: as a raster dataset, as a raster type, or as a raster product. Different raster and image data formats can be part of one or more of the job categories (see Fig. 1). The ESRI website offers a complete list of supported raster and image data formats [16].
Fig. 1. Image data formats and raster data supported by ArcGIS
2.3 Digitizing and Obtaining Shapefile Files
The digitization of the area boundaries can be done in two ways: within the ArcMap application or from third-party applications. The proposal focuses on creating shapefile files using Google and ArcMap components.
Digitization using Google Tools. The architecture for the storage and use of information, both images and geodata, constitutes a modular structure with specific functions that often lead to confusion between its products (Google Earth, Google Maps, and My Maps). Each of these products needs to be defined to understand their interactions and to define the way forward to perform geographic boundary digitization and KML file export (see Fig. 2).
Fig. 2. Google Corporation tools for geodata management - Process of digitizing and obtaining shapefile files
Google Earth. It is an application that allows the visualization of multiple cartographies based on superimposed satellite or aerial images. It consists of two parts: image management and the use of information from Google Maps.

Google Maps. Created in 2007, it uses vehicles equipped with lasers, cameras, and GPS systems to photograph and geo-reference the entire planet. Among its components is “Street View”, which specializes in capturing 3D photographs of all routes and also performs optical character recognition (OCR) to read the information around it, such as signage, business names, street names, and avenues. In addition to its various layers of information, it allows users to upload subjective information such as reviews, answers to questions, photos, and place ratings [17–20].

My Maps. It is a Google Maps tool launched in 2007 that allows the user to create custom maps using points, lines, shapes, and images in a layer located on Google Maps. It should be noted that currently both Google Maps and My Maps have access to the aerial and satellite photographs of Google Earth.

Keyhole Markup Language (KML). It was developed in 2008 by Google Earth. It is an XML-based language that represents geographic data in three dimensions. KML files can be made up of entity or raster elements, and, as graphic data, drawings, attributes, and HyperText Markup Language (HTML). KML data can be converted to ArcGIS data and vice versa [21].
3 Boundary Digitization Case Study
The process of obtaining a shapefile in ArcMap 10.3 from a KML file exported from My Maps is presented through an experimental case: the delimitation of the city of Tomsk, located in the region that bears the same name, in Siberia, Russian Federation. The resulting shapefile will be used to perform the analysis of the areas prone to flooding in the face of an increase in the flow of the Tom River.

3.1 Study Area
The City of Tomsk is located in the east of Western Siberia, 3.5 thousand km from Moscow, on the banks of the Tom River. Generally known as the city of students, it has an approximate population of 580,000 inhabitants, a fifth of which are students. It is the oldest scientific, educational and innovative center in Siberia, with 8 universities and no less than 15 scientific research organizations.

3.2 Delimitation of the Study Area
The massive use of geographic information systems generates the need to adequately define precise geographic boundaries. On many occasions, different maps show discrepancies in the limitation of a territory, that causes problems of territorial administration and management [22]. Below is a contrast between
information on the city limits of Tomsk obtained from Google Maps, and from official websites of the city administration. The official sites consulted for the delimitation of the city were https://admin.tomsk.ru/pgs/2ro and https://map.admtomsk.ru; on these pages, both documentary information and an atlas of the city are shown (Fig. 3).
Fig. 3. Contrast between the limits regarding Google Maps and the official boundaries of the city: a) Tomsk city limit according to Google Maps; b) Official limits according to the Tomsk city administration
As can be seen in Fig. 3, the limits offered by third-party applications can be inexact and inconsistent, so it is necessary to resort to official information and carry out the digitization of the study area.

3.3 Digitization of Limits in My Maps
Once the study area is located, lines or shapes can be used within My Maps to delimit the area. It should be noted that there is a limited number of lines or shapes per map: up to 10,000 lines, shapes, or places; up to 50,000 total points (in lines and shapes); and up to 20,000 data table cells. There is also a per-layer limit of 2,000 lines, shapes, or places [23]. These limitations of the My Maps tool can be handled in two ways: carrying out the delimitation using several lines, or carrying out the delimitation using a single line. Use of Multiple Lines. When the maximum number of allowed points is reached, editing of the line is blocked and no further modifications are possible, so it is necessary to continue delimiting by adding a new line. When having multiple lines, it can happen that they do not have continuity, either due to union or line errors. Stroke errors are the most frequent when reaching the limit of allowed points (see Fig. 4).
Fig. 4. Frequent digitalization errors in My Maps
Fig. 5. Process of activating the editor mode by editing the KML file: a) The point limit is reached; b) Activation of editor mode by editing KML file; c) Aggregation of additional points
Remark 1. Binding and stroke errors will be resolved later within the ArcMap application. Use of a Single Line. When the limit of allowed points to draw a line is reached, it is possible to activate the edit mode again and then add more points. To do this, it is necessary to export the project as a KML file, open it in a text editor, delete some coordinates, and save the edited KML file for later import into a new map within My Maps. After this modification, the editing mode is activated again, allowing points to be added and the delimitation of the study area to continue (see Fig. 5). Remark 2. This method may affect the performance of My Maps.
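As an alternative to deleting coordinates by hand in a text editor, the trimming of the exported KML could be scripted. The sketch below is an assumption-laden illustration, not part of the original workflow: it presumes the digitized line is stored in standard KML 2.2 <coordinates> elements and simply drops the last points so that My Maps re-enables editing when the file is imported into a new map.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)  # keep the default KML namespace on output

def trim_kml_lines(src_path, dst_path, drop_last=50):
    """Drop the last `drop_last` points of every <coordinates> element so the
    digitized line falls back under the My Maps point limit and becomes
    editable again after re-import."""
    tree = ET.parse(src_path)
    for coords in tree.getroot().iter(f"{{{KML_NS}}}coordinates"):
        points = coords.text.split()
        if len(points) > drop_last:
            coords.text = " ".join(points[:-drop_last])
    tree.write(dst_path, encoding="utf-8", xml_declaration=True)

# trim_kml_lines("tomsk_boundary.kml", "tomsk_boundary_trimmed.kml")
```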
3.4 KML File Import and Processing in ArcMap 10.3
To work in ArcMap 10.3 with the KML file exported from My Maps, it is necessary to convert it into a layer; this procedure is performed through the ArcToolbox “Conversion Tools” toolbox. This step is necessary for all KML files regardless of their content (see Fig. 6).
Fig. 6. Importing and viewing KML files in ArcMap 10.3. a) Import of KML file formed by several lines; b) Import of KML file consisting of a single line
If the study area has been digitized using several lines, it is necessary to correct the union or trace errors by forming several contiguous lines, which will later become a single line that forms the outline of the study area (see Fig. 7). Once the error correction is completed and a single contour line is obtained, the result will be similar to importing a KML file consisting of a single line. Remark 3. For those KML files consisting of a single line, no correction is required, so the polygon shapefile files are created directly.
Fig. 7. Joining and drawing error correction process: a) Digitalization process binding error; b) Line join error correction; c) Digitalization process drawing error; d) Elimination of wrong points
Obtaining Shapefile Files. Once the layer formed by a single contour line has been obtained, with the help of the “Feature to polygon” tool located within ArcToolbox/Data Management Tools, this layer will be converted to a polygon. Once this procedure is completed, the shapefile of the study area will be obtained (see Fig. 8a).
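For reference, the two geoprocessing steps described above (KML To Layer followed by Feature To Polygon) could also be scripted with ArcPy, the Python site package that ships with ArcMap. This is a sketch under assumptions: the paths, output names, and the location of the line feature class inside the generated geodatabase are illustrative and depend on the contents of the KML.

```python
import arcpy

# Step 1: KML To Layer (ArcToolbox > Conversion Tools > From KML).
# Produces a file geodatabase and a layer file inside the output folder.
arcpy.KMLToLayer_conversion(r"C:\data\tomsk_boundary.kml", r"C:\data\out", "tomsk_boundary")

# Step 2: Feature To Polygon (ArcToolbox > Data Management Tools > Features).
# The line feature class is typically created under a "Placemarks" dataset;
# adjust the path to whatever the previous step actually generated.
contour_line = r"C:\data\out\tomsk_boundary.gdb\Placemarks\Polylines"
arcpy.FeatureToPolygon_management(contour_line, r"C:\data\out\tomsk_area.shp")
```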
4 Experimental Results
To show the correct operation of the shapefile files, cuts made to raster files of the city of Tomsk are presented. The process described in this document was carried out on a laptop with a 2.50 GHz Intel Core i5-7200U processor, 8 GB RAM, and a 1TB 5400 rpm hard disk on which Windows 10 pro X64 has been installed. The GIS used was ArcGIS 10.3 X86. The imported KML file corresponds to the delimitation of the city of Tomsk using two broken lines, which were joined within ArcMap 10.3 to later create the shapefile files. My Maps has been executed within the Google Chrome browser version 84.0.4147.105 (Official Build) X64 using the satellite base map. Figure 8b shows the cut made on a ground elevation raster file. This raster allowed determining the altitude of areas prone to flooding against an increase in the flow of the Tom River.
Fig. 8. Obtaining and using shapefile files. a) Shapefile obtained based on the KML file. b) Cut on the raster file.
Fig. 9. Flood zone analysis performed on a TIFF raster file.
Figure 9 shows the cut made on a raster file in Tagged Image File Format (TIFF). An analysis of the affected populated areas against possible floods has
been performed on this image. This process is illustrated on the map with a light blue color. Table 1 shows the numerical results of the flood areas for estimated elevations of the water level under the following growth percentage parameters (5%, 10%, 20%, 30%, 40%, and 50%).

Table 1. Flood area analysis.

| Estimated increase in water levels | Height above sea level (m) | Flood area (km²) | Volume (m³) |
|---|---|---|---|
| 5% | 65 | 0.275 | 0.065 |
| 10% | 66 | 0.680 | 0.068 |
| 20% | 68 | 2.120 | 0.076 |
| 30% | 70 | 3.514 | 0.084 |
| 40% | 72 | 14.321 | 0.091 |
| 50% | 74 | 29.723 | 0.099 |
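The figures in Table 1 are obtained in ArcMap from the clipped elevation raster. Purely as an illustration of the underlying calculation, a flooded-area estimate for a given water level could be computed from a GeoTIFF with rasterio and NumPy as sketched below; the file name, band layout, and no-data handling are assumptions, not the authors' actual procedure.

```python
import numpy as np
import rasterio

def flood_area_km2(dem_path, water_level_m):
    """Approximate flooded area: count DEM cells at or below the given water
    level and multiply by the cell area (assumes a projected CRS in meters)."""
    with rasterio.open(dem_path) as src:
        elevation = src.read(1, masked=True)          # band 1 with no-data masked
        cell_area_m2 = abs(src.res[0] * src.res[1])
    flooded = np.ma.filled(elevation <= water_level_m, False)
    return flooded.sum() * cell_area_m2 / 1e6

# for level in (65, 66, 68, 70, 72, 74):
#     print(level, flood_area_km2("tomsk_dem_clip.tif", level))
```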
5 Conclusions
The method presented in this paper has proved to be effective and functional for creating shapefile files. The superimposition of satellite images offered by My Maps allowed better visualization of the constituent entities of the study area, so the process could be carried out with greater precision. Despite the limitations of My Maps, the solution alternatives presented were found to be favorable; however, it is recommended to carry out the process of digitizing limits using multiple lines, thus avoiding affecting the performance of the tool. The experimental results showed the correct operation of the proposal made.
References 1. GIS, U.F.E.: Roger tomlinson (2017). https://www.urisa.org/awards/rogertomlinson/ 2. Tomlinson, R.F.: An introduction to the geo-information system of the Canada land inventory. techreport, Department of forestry and rural development (1967) 3. Dai, F., Lee, C., Zhang, X.: Gis-based geo-environmental evaluation for urban land-use planning: a case study. Eng. Geol. 61(4), 257–271 (2001). http://www. sciencedirect.com/science/article/pii/S001379520100028X 4. Tomlinson, R.F.: A geographic information system for regional planning. Tech. rep, Department of Forestry and Rural Development, Government of Canada (1968) 5. Goodchild, M.F.: Geographic information systems. Prog. Hum. Geogr. 15(2), 194– 200 (1991). https://doi.org/10.1177/030913259101500205
6. Vida, M., Vytautas, G., Vytautas, P., Sam, G.: Geographic information system: old principles with new capabilities. Urban Des. Int. 16(1), 1–6 (2011) 7. ESRI: What is geodata? (2016). https://desktop.arcgis.com/en/arcmap/10.3/ main/manage-data/what-is-geodata.htm 8. ESRI: What is arcgis? (2020). https://developers.arcgis.com/labs/what-is-arcgis/ 9. Brody, H., Rip, M.R., Vinten-Johansen, P., Paneth, N., Rachman, S.: Mapmaking and myth-making in broad street: the london cholera epidemic, 1854. Lancet 356(9223), 64–68 (2000). http://www.sciencedirect.com/science/article/ pii/S0140673600024429 10. Tang, K.X., Waters, N.M.: The internet, GIS and public participation in transportation planning. Prog. Plann. 64(1), 7–62 (2005). http://www.sciencedirect. com/science/article/pii/S0305900605000358 11. Wei, W.: Research on the application of geographic information system in tourism management. Procedia Environ. Sci. 12, 1104–1109 (2012). http://www. sciencedirect.com/science/article/pii/S1878029612003957, 2011 International Conference of Environmental Science and Engineering 12. Wood, W.B.: Gis as a tool for territorial negotiations. In: IBRU Boundary and Security Bulletin, vol. 8, pp. 72–79 (2000) 13. ESRI: Geoprocessing considerations for shapefile output, July 2020. https:// desktop.arcgis.com/en/arcmap/latest/manage-data/shapefiles/geoprocessingconsiderations-for-shapefile-output.htm 14. ESRI: What is a shapefile? (2016). https://desktop.arcgis.com/en/arcmap/10.3/ manage-data/shapefiles/what-is-a-shapefile.htm 15. ESRI: What is raster data? (2016). https://desktop.arcgis.com/en/arcmap/10.3/ manage-data/raster-and-images/what-is-raster-data.htm 16. ESRI: List of supported raster and image data formats (2016). https://desktop. arcgis.com/en/arcmap/10.3/manage-data/raster-and-images/list-of-supportedraster-and-image-formats.htm 17. Zamir, A.R., Shah, M.: Accurate image localization based on google maps street view. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) Computer Vision - ECCV 2010, pp. 255–268. Springer, Berlin Heidelberg (2010) 18. Brown, M.C.: Hacking Google Maps and Google Earth. Wiley Publishing Inc., United States (2006) 19. Gibson, R., Erle, S.: Google Maps Hacks. O’Reilly, United States (2006) 20. Ciepluch, B., Jacob, R., Mooney, P., Winstanley, A.C.: Comparison of the accuracy of openstreetmap for ireland with google maps and bing maps. In: Proceedings of the Ninth International Symposium on Spatial Accuracy Assessment in Natural Resuorces and Enviromental Sciences, 20-23rd July 2010, p. 337, July 2010. http:// mural.maynoothuniversity.ie/2476/ 21. ESRI: What is kml? (2016). https://desktop.arcgis.com/en/arcmap/10.3/managedata/kml/what-is-kml-.htm 22. Saura, M.T.: Boundaries, field work and GIS management. In: XXII International Cartographic Conference (ICC2005). Institut Cartogr` afic de Catalunya, The International Cartographic Association (ICA-ACI), July 2005. https://www.icgc. cat/en/content/download/3584/11642/version/1/file/boundaries field work and gis management.pdf 23. Corporation, G.: Draw lines & shapes in my maps (2020). https://support.google. com/mymaps/answer/3433053
Dynamic Simulation and Kinematic Control for Autonomous Driving in Automobile Robots

Danny J. Zea1(B), Bryan S. Guevara2, Luis F. Recalde2, and Víctor H. Andaluz2

1 Escuela Superior Politécnica de Chimborazo, Riobamba, Ecuador
[email protected]
2 Universidad de las Fuerzas Armadas ESPE, Sangolquí, Ecuador
{bsguevara,lfrecalde1,vhandaluz1}@espe.edu.ec
Abstract. This article proposes a simple simulation methodology for experimenting with the dynamic behavior of vehicles and applying control laws that enable automated driving in unstructured environments, using the robotic application simulation software Webots. This approach shortens the experimentation times of autonomous driving systems and helps overcome the limitations that this type of project involves, such as high infrastructure costs, the difficulty of validating the system, the need for qualified personnel, and the participation of various elements with different dynamic behaviors that make up a real traffic space. The use of the virtualized model of a BMW X5 vehicle and the instrumentation of the multiple sensors necessary for its operation are presented. Finally, a path tracking algorithm is developed using the Matlab scientific programming software, and the dynamic behavior of the vehicle is evaluated through the response curves of the robot states; these results represent the starting point for future research and the implementation of physical tests.

Keywords: Autonomous driving · Car-Like · Dynamic simulation · Kinematic control · Robotics
1 Introduction

Research on automated driving systems began in the 1950s with the aim of solving traffic problems, so autonomous car navigation has been a subject of great interest to the scientific community and the automotive industry [1]. Autonomous driving systems can be classified as autonomous and collaborative. Autonomous systems are characterized by the ability to provide a certain type of intelligence on board the vehicle; in this type of system, sensors that report the state of the environment are used, such as GPS, digital road maps, and artificial vision, among others, without the need for any kind of special equipment on the road. On the other hand, the cooperative system is defined as a synergistic system between installed road intelligence, such as inductive cables and magnetic markers, and the intelligence on board [2]. Another definition provides the
scope of automation [3]: a) Automated driving: enhanced driving by dedicated control and autonomous systems that support driving, taking into account that control can be recovered at all times and that the driver remains legally responsible for carrying out the driving task; and b) Autonomous driving: refers to the end result of driving automation, in which, in principle, no human driver needs to be active in the operation of the vehicle, although a driver may still be present but need not be on site.

In today's cars, electronics plays an important role because cars increasingly resemble robots, since many mechanical elements are replaced or electronically controlled, such that a vehicle can be considered a mobile robot [4], keeping in mind that both automated driving and robots require a high degree of safety, high reliability, and robust, fault-tolerant features. The objectives of automated driving systems seek to improve safety (active and preventive) and traffic efficiency, reflected in the improvement of the fluidity of vehicular traffic, better prevention and reduction of congestion, and the possibility of an increase in the number of lanes, while protecting the integrity of the passenger against other vehicles coexisting in the environment [5, 6]; therefore, it contributes greatly to the safety of road traffic by acting on decision-making, perception, the elimination of delays and uncertainty, and human error [7]. Among the most relevant applications of autonomous driving are adaptive cruise control, emergency braking, precise steering control, autonomous parking, precision approach and narrow-lane driving for transit buses, door-to-door transport for the elderly, automated heavy-truck platooning, and automatic driving in emergencies [8, 9]. These applications represent a skill set that enables safe driving along a lane on the road. In addition, autonomous driving systems reduce the fatigue of drivers exposed to long hours on the road [10].

This article addresses the study of autonomous driving and the solution of the route tracking problem, in which decision making about the vehicle's mechanisms and electronics is carried out by a control algorithm based on the system's mathematical model. The objective is to provide a certain degree of intelligence on board the vehicle, in favor of intelligent transport systems (ITS). In this context, a route tracking controller is proposed through the recreation of an urban road system. The dynamic properties of the vehicle are established in the robotic applications simulator Webots, which has a specialized complement for the study of autonomous driving, as well as digitized models of commercial vehicles and the configuration of multiple sensors that are used in the real vehicle [11]. It is necessary to mention that the proposed controller does not consider dynamic compensation, which is directly related to the dynamic model of the system and its physical parameters; however, the behavior and evolution of the movement are taken into account thanks to the physics engine provided by the simulator. Finally, the experimental results of the proposed controller are included and discussed.

This document is divided into 6 sections, including the Introduction. Section 2 formulates the control problem. In Sect. 3, the kinematic modeling of the car-like robot and the design of the controllers are presented, while the general structure is presented in Sect. 4. In Sect.
5 the simulator proposal is shown, which includes the digitized stage, the vehicle and the implemented communication channel. In Sect. 6 the experimental
results of autonomous driving control are discussed. Finally, the conclusions are shown in Sect. 7.
2 Problem Formulation

Although autonomous driving technologies have advanced so much that in the near future they may be feasible on public roads, one of the remaining technological problems is the shortcomings in the design of highly reliable elements, controllers, and systems that are fault tolerant and robust. However, it is not ruled out that fully autonomous and automated driving on high-traffic roads and high-velocity highways will be substantially possible in the coming years [12]. Another obstacle to the introduction of autonomous driving is the legal and institutional problems [3]. The development of introduction and deployment scenarios, and the clarification of liability for accidents, should be discussed against the background of advanced technologies.

The need to continually improve these autonomous driving systems makes it essential to use highly expensive equipment, such as the vehicle, sensors, and spare parts, and involves other factors such as the necessary physical space and the integrity of equipment exposed to failures in experimental tests. In this context, this work proposes a simple simulation methodology that allows experimenting with the dynamic behavior of vehicles, applying control laws that allow autonomous driving in unstructured scenarios. Webots is one of the main simulation software packages for robotic applications; it has a physics engine (gravity, collisions, disturbances, deformations, among others) that allows virtual analysis of the dynamic behavior of robotic systems, just like in the real world [13]. Simulation is a tool of high analytical value that allows the anticipation of unexpected situations prior to implementation on physical devices.
3 Kinematic Models and Controller Design

The simplified kinematic model shown in [2, 14, 15] for car-like mobile robots uses the Ackermann steering of the front wheels to create a center of rotation somewhere along the axis of the non-steered rear wheels. Figure 1 shows the geometric representation of the simplified kinematic model. The car-like robot's non-holonomic velocity restriction determines that it can only move perpendicular to the axis of the rear wheels,

$$-\dot{x}\sin\phi + \dot{y}\cos\phi - a\,\dot{\phi} = 0 \qquad (1)$$

Therefore, the kinematic model of the robotic system configuration is defined by:

$$\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\phi} \end{bmatrix} = \begin{bmatrix} \cos\phi & -a\sin\phi \\ \sin\phi & a\cos\phi \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \mu \\ \omega \end{bmatrix} \qquad (2)$$

Equation (2) can be written in its compact form as:

$$\dot{\mathbf{h}}(t) = \mathbf{J}(\phi)\,\mathbf{v}(t) \qquad (3)$$
Fig. 1. Simplified kinematic structure of a car-like robot.
where $\dot{\mathbf{h}}(t) = [\dot{x}\ \ \dot{y}]^{T} \in \mathbb{R}^{2}$ represents the velocity vector at the point of interest, $\mathbf{J}(\phi) \in \mathbb{R}^{2\times 2}$ is the Jacobian matrix, and $\mathbf{v}(t) = [\mu\ \ \omega]^{T}$ is the maneuverability control vector, in which $\mu$ and $\omega$ represent the linear and angular velocity of the car-like robot, respectively. The steering angle $\psi$ of the front wheels is defined by:

$$\psi = \tan^{-1}\!\left(\frac{D\,\omega}{\mu}\right) \qquad (4)$$

A control system based on the kinematics (2) of the system is proposed; therefore, the control vector can be expressed as

$$\mathbf{v}_{c}(t) = \mathbf{J}^{-1}(\phi)\,\dot{\mathbf{h}}(t) \qquad (5)$$

Therefore, the proposed control law that allows the tracking of the desired trajectory is

$$\mathbf{v}_{c}(t) = \mathbf{J}^{-1}(\phi)\left(\dot{\mathbf{h}}_{d}(t) + \mathbf{K}_{2}\tanh\!\left(\mathbf{K}_{2}^{-1}\mathbf{K}_{1}\,\tilde{\mathbf{h}}(t)\right)\right) \qquad (6)$$

where $\mathbf{J}^{-1}(\phi)$ is the inverse of the Jacobian matrix, $\dot{\mathbf{h}}_{d} = [\dot{x}_{d}\ \ \dot{y}_{d}]^{T}$ is the velocity vector of the desired trajectory, $\tilde{\mathbf{h}} = [\tilde{h}_{x}\ \ \tilde{h}_{y}]^{T}$ is the position error vector, obtained from $\tilde{\mathbf{h}} = \mathbf{h}_{d} - \mathbf{h}$, $\mathbf{K}_{1}$ is a diagonal matrix that weights the position errors, and $\mathbf{K}_{2}$ is a diagonal matrix used to bound the control actions.
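A compact sketch of how the steering angle (4) and the control law (5)–(6) can be evaluated at each sampling instant is given below. It is written in Python/NumPy as an illustration rather than the authors' Matlab script, and it assumes a = D = 2.995 m and K1 = K2 = diag(1, 1), following the values reported later in Sect. 6.

```python
import numpy as np

K1 = np.diag([1.0, 1.0])   # position-error gain (Sect. 6)
K2 = np.diag([1.0, 1.0])   # saturation gain (Sect. 6)

def car_like_control(h, hd, hd_dot, phi, a=2.995, D=2.995):
    """Evaluate the tracking law (6) and the steering angle (4) at one instant.

    h, hd, hd_dot : current position, desired position, desired velocity [x, y]
    phi           : current vehicle orientation
    Returns (mu, omega, psi)."""
    J = np.array([[np.cos(phi), -a * np.sin(phi)],
                  [np.sin(phi),  a * np.cos(phi)]])          # 2x2 Jacobian of (3)
    h_err = np.asarray(hd) - np.asarray(h)                    # position error
    rhs = np.asarray(hd_dot) + K2 @ np.tanh(np.linalg.solve(K2, K1 @ h_err))
    mu, omega = np.linalg.solve(J, rhs)                       # v_c = J^{-1}(...), Eq. (6)
    psi = np.arctan2(D * omega, mu)                           # atan2 form of Eq. (4)
    return mu, omega, psi
```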
4 System Structure

The proposed dynamic simulator and the solution of the autonomous route tracking problem are implemented considering the scheme shown in Fig. 2, made up of the following blocks: i) Webots, free software that allows recreating the dynamic behavior of robotic systems [16], which replicates the urban infrastructure scenarios necessary for this project and provides modules dedicated to autonomous driving applications [17, 18]; it also integrates its own libraries that emulate sensors [11], such as GPS, Compass, IMU, Lidar, and others; ii) Matlab, the scientific programming software that receives the robot state data in real time; this information is processed by the control law and the maneuverability actions are generated; and iii) Shared Memories, which allow the transfer of data packets between the two software packages through dynamic memory space reservation.
Fig. 2. Structure of the dynamic simulation system for autonomous driving.
5 Simulator

The virtual stage is developed through the modeling and inclusion of structures made in CAD sketch modeling software, in which the elements of the scene are generated and later imported into the Webots graphics engine. The stage consists of two-way streets that form closed circuits and intersect at three intersections; horizontal and vertical traffic signage and urban infrastructure are also implemented, giving realism to the scene where the experiment takes place, as shown in Fig. 3.
Fig. 3. 3D simulation scenario: a) aerial view, b) isometric view and c) robot Car-Like BMW model X5.
The physics engine is implemented in the simulator, which comprises a collision system, the force of gravity, and friction coefficients between contact surfaces. In addition, the lighting and shadow system is configured, which is rendered at every moment of time (Fig. 4).
Fig. 4. Webots – MATLAB interaction.
5.1 Robot Car-Like

The CAD model of the car-like robot is imported using the Webots Autonomous Driving Module, which will be used in the experiment. The model of a BMW X5 car is chosen taking into account that, in the year 2000, this vehicle obtained a higher safety rating than the other SUVs tested by the Insurance Institute for Highway Safety of the United States and was named the "Best Choice" among midsize SUVs [19], in addition to having the same configuration as a conventional vehicle. Reading of the velocity states, front wheel angle, brake, and other actions on the vehicle is carried out through dedicated Webots libraries that can be programmed in a controller. The location and orientation of the vehicle on the simulated stage are obtained through feedback from the GPS sensor and the Compass sensor. The axis directions of the vehicle's global and local reference systems are transformed with the sensors to maintain the right-hand rule configuration.

5.2 Bilateral Communication Matlab – Webots

The process for exchanging information between Webots and MATLAB allows bilateral communication to carry out route tracking control for a car-like vehicle. Bidirectional communication is done through a dynamic link library (DLL) [20], which reserves a memory space in RAM. The use and handling of the DLL between Webots and MATLAB has three phases: (i) start phase, which manages space and labeling in RAM; (ii) execution phase, in which read/write permissions are defined in RAM; and (iii) closing phase, in which the reserved memory spaces are freed when the application is finished. Figure 6 details the communication process using shared memories (DLL) for bilateral data exchange between subsystems. The exchange of information to run the dynamic simulator uses eight reserved memory spaces, in which robot states and maneuverability velocities are shared bilaterally. For internal vehicle control on the developed stage, the front wheel angle (ψ) determined by (4) is used.
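The actual exchange is implemented as a C dynamic link library shared by the Webots controller and MATLAB, which is not reproduced here. Purely to illustrate the idea of eight bilaterally shared slots and the three phases described above, a memory-mapped sketch in Python might look as follows; the slot layout, file name, and synchronization details are invented for the example.

```python
import mmap
import struct

SLOT_FMT = "8d"                         # eight float64 slots: robot states + commands
SIZE = struct.calcsize(SLOT_FMT)

# (i) start phase: reserve and label the shared region (here a file-backed map).
with open("webots_matlab.shm", "wb") as f:
    f.write(b"\x00" * SIZE)

def write_slots(values):
    """(ii) execution phase, writer side: publish the eight shared values."""
    with open("webots_matlab.shm", "r+b") as f, mmap.mmap(f.fileno(), SIZE) as shm:
        shm[:SIZE] = struct.pack(SLOT_FMT, *values)

def read_slots():
    """(ii) execution phase, reader side: fetch the eight shared values."""
    with open("webots_matlab.shm", "r+b") as f, mmap.mmap(f.fileno(), SIZE) as shm:
        return struct.unpack(SLOT_FMT, shm[:SIZE])

# (iii) closing phase: remove the backing file once the application finishes.
```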
6 Simulation Results and Discussion

In this section, the evolution of the dynamic behavior of the car-like robot is evaluated using the Webots robotic application simulation software, through the control law proposed in (6), where the controller parameters are defined by: h_d = [x_d y_d]^T, gain matrices K_1 = diag[1 1] and K_2 = diag[1 1], and the initial conditions of the robot h(0) = [6 −37 0.524]^T. The distance that defines the control point is a = 2.995 [m]. A sampling time of 100 [ms] is used during an 80 [s] simulation. The 3D model of the vehicle in Webots is represented at real scale 1:1, obtained from the module for autonomous driving applications. The setting of the stage in the scene tree is detailed in Table 1.

Table 1. World info configuration

| Field scenario | Values | Description |
|---|---|---|
| Gravity (x,y,z) | 0, −9.81, 0 | Defines the gravitational acceleration |
| Constraint Force Mixing (CFM) | 0.00001 | Defines the Constraint Force Mixing used by ODE to manage contact joints |
| Error reduction parameter (ERP) | 0.6 | Defines the Error Reduction Parameter used by ODE to manage contact joints |
| basicTimeStep | 100 ms | Defines the duration of the simulation step executed by Webots |
| Frames per second (FPS) | 60 | Represents the maximum rate at which the 3D display is refreshed |
| physicsDisableTime | 1 s | Determines the amount of simulation time (in seconds) before idle solids are automatically disabled from the physics computation |
| physicsDisableLinearThreshold | 0.01 m/s | Determines the solid's linear velocity threshold (in meters/second) for automatic disabling |
| physicsDisableAngularThreshold | 0.01 rad/s | Determines the solid's angular velocity threshold (in radians/second) for automatic disabling |
| northDirection (x,y,z) | 0, 0, 1 | Indicates the direction of the virtual north and is used by Compass and InertialUnit nodes |
| gpsCoordinateSystem | "local" | Indicates in which coordinate system the coordinates returned by the GPS devices are expressed |
| gpsReference (x,y,z) | 0, 0, 0 | Specifies the GPS reference point |
| lineScale | 1 m | Allows the user to control the size of the optionally rendered arbitrary-sized lines or objects |
Table 2 illustrates the main fields that define the characteristics of the PROTO BMW X5 vehicle node used in this experiment.

Table 2. PROTO BMW X5 node configuration

| Field proto | Values | Description |
|---|---|---|
| TrackFront | 1.628 | Defines the front distances between right and left wheels |
| TrackRear | 1.628 | Defines the rear distances between right and left wheels |
| WheelBase | 2.995 | Defines the distance between the front and the rear wheel axes |
| BrakeCoefficient | 500 | Defines the maximum dampingConstant applied by the Brake on the wheels joint |
| EngineMaxTorque | 250 | Defines the maximum torque of the motor in Nm used to compute the electric engine torque |
| EngineMinRPM | 1000 | Defines the working range of the engine |
| EngineMaxRPM | 4500 | Defines the working range of the engine |
| engineMaxPower | 50000 | Defines the maximum power of the motor in W used to compute the electric engine torque |
| MinSteeringAngle | −1 | Defines the minimum steering angle of the front wheels |
| MaxSteeringAngle | 1 | Defines the maximum steering angle of the front wheels |
| time0To100 | 10 | Defines the time to accelerate from 0 to 100 km/h in seconds; this value is used to compute the wheels' acceleration when controlling the car at cruising velocity through the driver |
| suspensionFrontSpringConstant | 100000 | Defines the characteristics of the suspension |
| suspensionFrontDampingConstant | 4000 | Defines the characteristics of the suspension |
| suspensionRearSpringConstant | 100000 | Defines the characteristics of the suspension |
| suspensionRearDampingConstant | 4000 | Defines the characteristics of the suspension |
| wheelsDampingConstant | 5 | Defines the dampingConstant of each wheel joint used to simulate the frictions of the vehicle |
Table 3 shows the characteristics of the virtual sensors configured in the scene tree, which are based on real sensor specifications.

Table 3. Sensors equipped on the vehicle.

| Position on the vehicle | Sensor | Model | Specifications/Inputs | Values |
|---|---|---|---|---|
| Sensors slot rear | GPS | ProPak-V3™ | Horizontal position accuracy | 0.25 m |
| | | | Resolution | 0.04 m |
| | | | Velocity resolution | 0.03 m/s |
| | | | Velocity maximum | 515 m/s |
| Sensors slot rear | Compass | NEO-M8T | xAxis | True |
| | | | yAxis | True |
| | | | zAxis | True |
| | | | Resolution | 5 milli-gauss |
Figure 5a shows the desired trajectory, obtained through a nonlinear regression using real data referenced to the coordinate system of the developed scenario, which determines the values of the parameters associated with the best-fit curve and defines the vector p_d. Figure 5b shows the Cartesian positions of the vehicle (x, y) while tracking the trajectory over time.
Fig. 5. Definition of the trajectory
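The regression model actually fitted to the map data is not given in the excerpt. As an illustration only, a desired path h_d and a finite-difference estimate of its velocity ḣ_d, as required by the controller of Sect. 3, could be produced with a polynomial fit in NumPy; the reference points and the time parametrization below are placeholders.

```python
import numpy as np

# Placeholder reference points of the desired path (the paper fits real map data).
x_ref = np.array([6.0, 10.0, 20.0, 35.0, 50.0])
y_ref = np.array([-37.0, -30.0, -18.0, -5.0, 4.0])
coeffs = np.polyfit(x_ref, y_ref, deg=3)            # best-fit curve parameters (p_d)

def desired_state(t, T=80.0, ts=0.1):
    """Sample the fitted path at time t and estimate hd_dot by finite differences."""
    def point(tau):
        xd = np.interp(tau, [0.0, T], [x_ref[0], x_ref[-1]])   # simple time law
        return np.array([xd, np.polyval(coeffs, xd)])
    hd = point(t)
    hd_dot = (point(t + ts) - hd) / ts
    return hd, hd_dot
```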
Figure 6 shows the maneuverability actions sent from Matlab (μc ,ωc ,ψc ) in relation to the real actions (μ,ω,ψ) that the vehicle performs dynamically in Webots.
Fig. 6. Handling speeds
Figure 7 shows the evolution of position errors, which tend to converge to zero, according to the path defined in Fig. 5.a.
Fig. 7. Control errors
7 Conclusion

This article has presented a control algorithm for tracking the path of a car-like mobile robot, which allows analyzing the evolution, behavior, and dynamics of the autonomous driving of a BMW X5 vehicle equipped with a GPS sensor and a Compass sensor that provide the position and orientation, respectively. The sensors described in Table 3 are commercial sensors that have been validated in the Webots simulation software and are the inputs to the control algorithm; this implies that this proposal can be implemented in a real vehicle with the proper instrumentation of an electromechanical system that receives the control actions. The evaluation of the control algorithm for the tracking of a BMW X5 vehicle was demonstrated in the results of the dynamic simulation.

Acknowledgements. The authors would like to thank the Escuela Superior Politécnica de Chimborazo ESPOCH for the support to develop the research project "Análisis, diseño e implementación de algoritmos de control inteligente en controladores con una red de sensores IoT en vehículos para mejorar la seguridad vial"; also to Universidad de las Fuerzas Armadas ESPE and the Research Groups GITEA and ARSI, for the support for the development of this work.
References 1. Kümmerle, R., Hähnel, D., Dolgov, D., Thrun, S., Burgard, W.: Autonomous driving in a multi-level parking structure. In: Proceedings - IEEE International Conference on Robotics and Automation, pp. 3395–3400 (2009) 2. Toibero, M., Roberti, F., Herrera, D., Amicarelli, A., Carelli, R.: Controlador Estable para Seguimiento de Caminos en Robots Car-Like basado en Espacio Nulo. In: 2015 XVI Work. Inf. Process. Control (RPIC), Co-sponsored by IEEE, no. April 2016 (2015) 3. Mcdonald, M., Jeffery, D., Piao Tecnalia, J., Sanchez, J.: Definition of necessary vehicle and infrastructure systems for automated driving. base.connectedautomateddriving.eu (2011) 4. Lenain, R., Thuilot, B., Cariou, C., Martinet, P.: Mobile robot control in presence of sliding: application to agricultural vehicle path tracking. In: Proceedings of the IEEE Conference on Decision and Control, pp. 6004–6009 (2006) 5. Gruyer, D., Magnier, V., Hamdi, K., Claussmann, L., Orfila, O., Rakotonirainy, A.: Perception, information processing and modeling: critical stages for autonomous driving applications. Ann. Rev. Control 44, 323–341 (2017) 6. Choi, J., Chun, D., Kim, H., Lee, H.-J.: Gaussian YOLOv3: an accurate and fast object detector using localization uncertainty for autonomous driving (2019) 7. Brumitt, B., Stentz, A., Hebert, M., Group, T.C.U.: Autonomous driving with concurrent goals and multiple vehicles: mission planning and architecture. Auton. Robots 11(2), 103–115 (2001) 8. Vieira, B., Severino, R., Filho, E.V., Koubaa, A., Tovar, E.: COPADRIVe - a realistic simulation framework for cooperative autonomous driving applications. In: 2019 8th IEEE International Conference on Connected Vehicles and Expo, ICCVE 2019 - Proceedings (2019) 9. Talpaert, V., et al.: Exploring applications of deep reinforcement learning for real-world autonomous driving systems. In: VISIGRAPP 2019 - Proceeding 14th International Jt. Conference Computer Vision, Imaging Computer Graphical Theory Application, vol. 5, pp. 564–572, January 2019
10. Tsugawa, S.: Automated driving systems: common ground of automobiles and robots. Int. J. Humanoid Robot. 8(1), 1–12 (2011) 11. Castaño, F., Beruvides, G., Villalonga, A., Haber, R.E.: Self-tuning method for increased obstacle detection reliability based on internet of things LiDAR sensor models. Sensors (Switzerland) 18(5), 1–16 (2018) 12. Cachumba, S.J., Briceño, P.A., Andaluz, V.H., Erazo, G.: Autonomous driver assistant for collision prevention. In: ACM International Conference Proceeding Series, pp. 327–332 (2019) 13. Karoui, O., et al.: Performance evaluation of vehicular platoons using Webots. IET Intell. Transp. Syst. 11(8), 441–449 (2017) 14. Stotsky, A., Hu, X.: Control of car-like robots using sliding observers for steering angle estimation. In: Proceeding of IEEE Conference Decision Control, vol. 4, no. December, pp. 3648–3653 (1997) 15. Lee, S., Kim, M., Youm, Y., Chung, W.: Control of a car-like mobile robot for parking problem. In: Proceeding - IEEE International Conference on Robotics and Automation, vol. 1, no. May, pp. 1–6 (1999) 16. Michel, O.: Webots: professional mobile robot simulation. Int. J. Adv. Robot. Syst. 1(1), 39–42 (2004) 17. Ma, X., Mu, C., Wang, X., Chen, J.: Projective geometry model for lane departure warning system in webots. In: 2019 5th International Conference on Control, Automation Robotics ICCAR 2019, pp. 689–695 (2019) 18. Tuncali, C.E., Fainekos, G., Prokhorov, D., Ito, H., Kapinski, J.: Requirements-driven test generation for autonomous vehicles with machine learning components. IEEE Trans. Intell. Veh. 5(2), 265–280 (2020) 19. colaboradores de Wikipedia, “BMW X5,” (2020) 20. B, C.S., Meissgeier, F., Bruegge, B., Verclas, S.: System : a cyber-physical human system, vol. 1, no. ii, pp. 267–280 (2016)
Analysis of RGB Images to Identify Local Lesions in Rosa sp. cv. Brighton Leaflets Caused by Sphaerotheca Pannosa in Laboratory Conditions William Javier Cuervo-Bejarano(B)
and Jeisson Andres Lopez-Espinosa
Corporación Universitaria Minuto de Dios – UNIMINUTO, Zipaquirá, Colombia [email protected], [email protected]
Abstract. Fresh cut rose flowers are an item of considerable economic importance in Colombia. However, factors such as pests and diseases cause significant yield losses. Powdery mildew (Sphaerotheca pannosa) can cause losses of up to 30% of the total production. Hence, it is necessary to implement techniques that allow an early and objective detection of symptoms in crop fields in order to make accurate decisions. The use of RGB and multispectral images for the detection of pests and diseases has been successfully consolidated in several agroecosystems, with detection capabilities of up to 95%. In a commercial rose crop on the Bogotá plateau, leaflets with and without visible signs of S. pannosa were collected from rose plants cv. "Brighton"; then, in laboratory conditions, leaflet images were acquired, pre-processed, and transformed to the HSV color space. Three classes corresponding to the region of interest extracted from a non-lesioned area (NL), a visibly lesioned area (L), and an area adjacent to the lesion (AL) of each leaflet were compared to determine pixel intensity differences. A Welch's ANOVA indicated a significant difference (P < 0.0001) in the leaflet reflectance between leaf tissue samples with and without visible signs of lesions caused by S. pannosa. Further experiments in field conditions can be conducted to help understand the reflectance profiles of leaflets affected by S. pannosa and to provide diagnostic tools for more accurate decisions. Keywords: Plant pathogens · Ornamentals · Images · Remote monitoring
1 Introduction The Netherlands is the world's leading producer and exporter of roses, followed by Colombia, Ecuador, Kenya, and Belgium [1]. Colombia has a 15% share of the international market, while the Netherlands holds 37.1% [2]. The main buyers of Colombian cut flower production are the United States with 78%, followed by Japan with 3.9% and the UK with 3.3% [3]. By 2019 the cut-flower industry contributed USD$ 107.7 million to the total exports of Colombia [4], and the rose crop contributed 20.5% of that figure [5]. Rose cultivation is seriously affected by Sphaerotheca pannosa, a plant pathogen that can cause productivity losses of up to 30% [6, 7], affecting mainly leaflets and peduncles [8]. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 217–225, 2021. https://doi.org/10.1007/978-3-030-63665-4_17
Direct observation to identify symptoms of the disease is the rule of thumb; however, it is a time-consuming and subjective labor which delays decision making, increasing the risk of disease spread and the resulting excess of fungicide applications, exposing workers to more harmful conditions and contributing to soil, water, and atmosphere pollution [9, 10]. Analysis of spectral reflectance based on RGB, multispectral, hyperspectral, and thermographic images represents an interesting technique aimed at biotic and abiotic stress detection in plants, making it possible to cover large crop areas rapidly with minimal interpretation skewness [11–16], providing detection precisions of up to 95% [17], minimizing production costs [18], and increasing control efficiency through localized pesticide spraying [19]. Using reflectance analysis, it is possible to measure and determine the infected and infested areas in a crop [20]. For example, diseases such as Venturia inaequalis in Malus domestica [21], Phytophthora infestans in Solanum lycopersicum [22], Magnaporthe grisea in Oryza sativa [23], and fungal diseases in Rosa sp. [24] have been detected. For pest detection, effectiveness of up to 93% for Ceratitis capitata affecting Ziziphus jujuba [25] and 85% for Callosobruchus maculatus in Vigna radiata beans [26] has been achieved. Besides, it is a useful tool for estimating some chemical properties such as total soluble solids, total titratable acidity, and secondary metabolites in harvested fruits [27–30]. In Colombia, spectral reflectance analysis has been used to estimate maturity stages in fruits [31] and for monitoring ecosystems [32, 33], which indicates that this technology has been relatively unexplored, and even less so in protected agriculture. In this study, RGB images of rose leaflets with and without visible lesions caused by S. pannosa were analyzed in laboratory conditions to determine a reflectance pattern that can be used as an easy method to detect lesions in a crop field.
2 Materials and Methods 2.1 Experimental Site and Plant Material Plant material was collected from a rose crop system grown on soil under greenhouse conditions located at the Circasia SAS facilities (5°03'23.6"N, 73°55'45.6"W), Cogua, Cundinamarca, Colombia (mean temperature and relative humidity of 24 °C and 30%, respectively). From two-year-old rose plants cv. "Brighton", 630 leaflets with a visible lesion (maximum 1 cm diameter) caused by S. pannosa and 630 leaflets without visible lesions were randomly selected. The leaflets were stored in plastic bags and transported in refrigerated conditions to the Physics laboratory of UNIMINUTO Centro Regional Sabana Centro y Ubaté. The leaflets were classified into three classes according to the region of interest (ROI) extracted from the digital image and based on the S. pannosa lesions; class 1: ROI from an area without visible lesions (NL), class 2: ROI from an area with visible lesions (L), and class 3: ROI from an area adjacent to the visible lesions (AL).
2.2 Image Acquisition and Processing A 16 Mpx RGB camera was mounted 50 cm above the samples inside a white light box of 60 × 40 × 30 (L × W × H) illuminated with two warm white 2700 K LED lamps, each positioned at a 45° angle (Fig. 1). Each leaflet was placed at the bottom, and the images were captured and then transferred to a laptop. The processing was carried out on a laptop (Intel Core i7, 8th Gen, 8 GB RAM) using OpenCV-Python [34, 35]. The image pre-processing consisted of noise filtering using a mask to remove the background and a thresholding segmentation to uniformize the image [36]. Then, the images were transformed into the HSV color space to maximize the contrast of the infected areas [37]. An ROI, according to the three mentioned classes, was extracted to calculate the pixel intensity of each class.
Fig. 1. Image acquisition module for picture recording in a lighting-controlled environment.
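A minimal OpenCV-Python sketch of the processing pipeline described above (background masking, thresholding, HSV conversion, and mean pixel intensity of an ROI) is shown below; the file name, threshold value, ROI coordinates, and the use of the V channel as the intensity measure are assumptions made for illustration, not values reported in the paper.

```python
import cv2
import numpy as np

img = cv2.imread("leaflet.jpg")                      # placeholder file name

# Background removal: threshold a grayscale version and use it as a mask
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, mask = cv2.threshold(gray, 120, 255, cv2.THRESH_BINARY)   # threshold chosen for illustration
segmented = cv2.bitwise_and(img, img, mask=mask)

# Transform to the HSV color space to increase the contrast of infected areas
hsv = cv2.cvtColor(segmented, cv2.COLOR_BGR2HSV)

# Region of interest (placeholder coordinates) and its mean pixel intensity
x, y, w, h = 100, 100, 40, 40
roi = hsv[y:y + h, x:x + w]
mean_intensity = float(np.mean(roi[:, :, 2]))        # V (value) channel used as one possible intensity measure
print(mean_intensity)
```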
2.3 Data Analysis The pixel intensity between the three classes was compared using ANOVA and Tukey's post hoc test; when ANOVA's assumptions were violated, a Welch's ANOVA and the Games-Howell post hoc test (0.05 significance level) were used instead [38, 39]. The statistical analysis was done using Python 3.
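Since the paper cites the Pingouin library [38] and Python 3 for the statistics, the comparison might be coded roughly as follows; the DataFrame layout, column names, and values are assumptions for illustration only.

```python
import pandas as pd
import pingouin as pg

# df is assumed to hold one row per ROI with its class label and mean pixel intensity
df = pd.DataFrame({
    "cls": ["NL", "L", "AL"] * 20,                    # placeholder labels
    "intensity": pd.Series(range(60), dtype=float),   # placeholder values
})

welch = pg.welch_anova(dv="intensity", between="cls", data=df)              # Welch's ANOVA
posthoc = pg.pairwise_gameshowell(dv="intensity", between="cls", data=df)   # Games-Howell test
print(welch)
print(posthoc)
```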
3 Results and Discussion Image transformation to the HSV color space allows rapid data processing (Fig. 2). Welch's ANOVA showed that the pixel intensity was affected by the class, F(2, 629) = 505.186, P < 0.0001. The Games-Howell post hoc test indicated that the pixel intensity for the L class was significantly higher than for the NL and AL classes, but no difference was detected between the NL and AL classes (P < 0.05) (Fig. 3). Differences in pixel intensity in infected tissues indicate absorbance and reflectance changes due to variations in water content and degradation of proteins and chlorophylls
Fig. 2. Processing steps for Rose cv. "Brighton" leaflet images acquired under laboratory conditions. The lesions correspond to visible symptoms caused by S. pannosa. a cropped raw image, b masking, c image segmentation, d image color space transformation to HSV and ROI selection. Red square: ROI selection on the visibly lesioned area; red circle: ROI selection on an area adjacent to the lesion. The image processing was carried out using OpenCV-Python.
Fig. 3. Pixel intensity from the regions of interest (ROI) extracted from three areas of Rose cv. “Brighton” leaflets with or without an apparent lesion caused by S. pannosa. L: lesioned area, NL: non-lesioned area, and AL: adjacent to the lesioned area. Different letters denote statistical differences between the classes mean values (Games-Howell, P < 0.05). Dispersion plots as suggested by [61].
at the cellular level [40–42]. In this way, spectral reflectance analysis is a promising method to estimate biotic stress caused by plant pathogens, as reported by [43, 44]. Specifically, for powdery mildew detection, a high correlation between the 550–680 nm spectral range extracted from pseudo-RGB images and the symptoms was found in barley [45]; in wheat, the 490–780 nm spectral range showed the most sensitive response to damage [46]; in betel plant, it was possible to classify infected and non-infected leaves [47]; and in Rose cv. "Lipstick", the technique was able to detect S. pannosa [24]; these results support this research. Recently, many studies have recognized remote sensors, including RGB image analysis, as efficient tools for pest and disease detection [48–51]. The creation of a spectral reflection signature or the calculation of an index for specific diseases [52, 53] will help to identify and track fungal disease symptoms in field conditions, where the monitoring cost is limiting [54]. However, different pathogenic fungi infections and structures could
exhibit variable patterns [55] and cannot be extrapolated for the estimation of damage from other biotic and abiotic stresses [56, 57]. The absence of detected differences between the NL and AL classes in this experiment could indicate that asymptomatic infected tissues do not show reflectance changes detectable through HSV-transformed RGB images; possibly, these changes could be sensed at wavelengths other than R, G, and B, such as IR, NIR, or LWIR [58–60].
4 Conclusions No reports about image analysis for disease detection in the cut-flower industry in Colombia have been published, which means that research in this area is necessary to support crop decisions. The image analysis procedure carried out in laboratory conditions was able to detect a relationship between the leaflet reflectance and the leaflet health status for Rose cv. "Brighton" plants, which are grown in greenhouses under Bogotá plateau conditions, where the photoperiod is longer than in seasonal climates, affecting light quantity and quality and, in consequence, the reflectance patterns and the pest and disease incidence. The HSV transformation allows rapid processing of the data, useful for a system that can predict infection signs in tissues in near real-time. Further analysis under greenhouse conditions, where lighting conditions are variable and multiple background objects are present, is required, as well as model training based on several images from different disease development stages. In addition, pixel intensities or stress signatures for fungicides, pesticide residues, and other phytopathogenic fungal structures should be considered as potential noise in the model. Multispectral or RGB-based remote diagnosis used for estimating the probability of disease spreading in crop fields seems to be a promising tool for protected agricultural systems.
References 1. Cuestas, A.: Análisis de las ventajas competitivas del sector floricultor de Colombia y Holanda en periodo 2012–2017. https://hdl.handle.net/20.500.11839/7171. Accessed 18 Feb 2020 2. Global Leaders In Cut Flower Exports. https://www.worldatlas.com/articles/global-leadersin-cut-flower-exports.html. Accessed 18 Feb 2020 3. Flower export Colombia. Colombian flowers growing in popularity. https://www.flowercom panies.com/blog/flower-export-colombia. Accessed 18 Feb 2020 4. Departamento Administrativo Nacional de Estadística. Boletín Técnico. https://www.dane. gov.co/files/investigaciones/boletines/exportaciones/bol_exp_jul19.pdf. Accessed 18 Feb 2020 5. Asociación Colombiana de Exportadores de flores. https://asocolflores.org/es/documentos/. Accessed 18 Feb 2020 6. Seddigh, S., Kiani, L.: Evaluation of different types of compost tea to control rose powdery mildew (Sphaerotheca pannosa var. rosa). Int. J. Pest Manage. 64(2), 178–184 (2017). https:// doi.org/10.1080/09670874.2017.1361050 7. Sriram, S.: Comparative efficacy of in vitro methods to culture rose powdery mildew (Podosphaera pannosa (Wallr.:Fr.) de Bary 1870). Pest Manage. Hortic. Ecosyst. 23(1), 80–85 (2017)
8. Debener, T., Byrne, D.H.: Disease resistance breeding in rose: current status and potential of biotechnological tools. Plant Sci. 228, 107–117 (2014). https://doi.org/10.1016/j.plantsci. 2014.04.005 9. Domínguez-Serrano, D., García-Velasco, R., Mora-Herrera, M., Salgado-Siclan, M., González-Díaz, J.: La cenicilla del rosal (Podosphaera pannosa). Agrociencia 50(7), 901–917 (2016) 10. West, J.S., Bravo, C., Oberti, R., Lemaire, D., Moshou, D., McCartney, H.A.: The potential of optical canopy measurement for targeted control of field crop diseases. Annu. Rev. Phytopathol. 41(1), 593–614 (2003). https://doi.org/10.1146/annurev.phyto.41.121702. 103726 11. Fahrentrapp, J., Ria, F., Geilhausen, M., Panassiti, B.: Detection of gray mold leaf infections prior to visual symptom appearance using a five-band multispectral sensor. Front. Plant Sci. 10(628), 1–14 (2019). https://doi.org/10.3389/fpls.2019.00628 12. Su, W., Sun, D.: multispectral imaging for plant food quality analysis and visualization. Compr. Rev. Food Sci. Food Saf. 17(1), 220–239 (2018). https://doi.org/10.1111/1541-4337. 12317 13. Zhang, D., Xu, C., Liang, D., Zhou, X., Lan, Y., Zhang, J.: Detection of rice sheath blight using an unmanned aerial system with high-resolution color and multispectral imaging. PLoS ONE 13(5), 1–15 (2018). https://doi.org/10.1371/journal.pone.0187470 14. Mahlein, A.: Plant disease detection by imaging sensors-parallels and specific demands for precision agriculture and plant phenotyping. Plant Dis. 100(2), 241–251 (2016). https://doi. org/10.1094/pdis-03-15-0340-fe 15. Cao, X., Luo, Y., Zhou, Y., Duan, X., Cheng, D.: Detection of powdery mildew in two winter wheat cultivars using canopy hyperspectral reflectance. Crop Prot. 45, 124–131 (2013). https:// doi.org/10.1016/j.cropro.2012.12.002 16. Sankaran, S., Mishra, A., Ehsani, R., Davis, C.: A review of advanced techniques for detecting plant diseases. Comput. Electron. Agric. 72(1), 1–13 (2010). https://doi.org/10.1016/j.com pag.2010.02.007 17. Shah, M., Khan, A.: Imaging techniques for the detection of stored product pests. Appl. Entomol. Zool. 49(2), 201–212 (2014). https://doi.org/10.1007/s13355-014-0254-2 18. Martinelli, F., Scalenghe, R., Davino, S., Panno, S., Scuderi, G., Ruisi, P., Davis, C.E.: Advanced methods of plant disease detection: a review. Agron. Sustain. Dev. 35(1), 1–25 (2014). https://doi.org/10.1007/s13593-014-0246-1 19. Franke, J., Menz, G.: Multi-temporal wheat disease detection by multi-spectral remote sensing. Precision Agric. 8, 161–172 (2007). https://doi.org/10.1007/s11119-007-9036-y 20. Ghaiwa, S.N., Arora, P.: Detection and classification of plant leaf diseases using image processing techniques: a review. Int. J. Recent Adv. Eng. Technol. 2(3), 1–7 (2014). https://doi. org/10.1007/s10658-015-0781-x 21. Delalieux, S., Van Aardt, J.A.N., Keulemans, W., Schrevens, E., Coppin, P.: Detection of biotic stress (Venturia inaequalis) in apple trees using hyperspectral data: non-parametric statistical approaches and physiological implications. Eur. J. Agron. 27(1), 130–143 (2007). https://doi.org/10.1016/j.eja.2007.02.005 22. Zhang, M., Qin, Z., Liu, X., Ustin, S.L.: Detection of stress in tomatoes induced by late blight disease in California, USA, using hyperspectral remote sensing. Int. J. Appl. Earth Obs. Geoinf. 4(4), 295–310 (2003). https://doi.org/10.1016/s0303-2434(03)00008-4 23. 
Kobayashi, T., Kanda, E., Kitada, K., Ishiguro, K., Torigoe, Y.: Detection of rice panicle blast with multispectral radiometer and the potential of using airborne multispectral scanners. Phytopathology 91(3), 316–323 (2001). https://doi.org/10.1094/phyto.2001.91.3.316 24. Velázquez-López, N., Sasaki, Y., Nakano, K., Mejía-Muñoz, J.M., Romanchik, Kriuchkova E.: Detección de cenicilla en rosa usando procesamiento de imágenes por computadora. Revista Chapingo. Serie horticultura 17(2), 151–160 (2011)
25. Liu, G., He, J., Wang, S., Luo, Y., Wang, W., Wu, L., He, X.: Application of near-infrared hyperspectral imaging for detection of external insect infestations on jujube fruit. Int. J. Food Prop. 19(1), 41–52 (2016). https://doi.org/10.1080/10942912.2014.923439 26. Kaliramesh, S., Chelladurai, V., Jayas, D.S., Alagusundaram, K., White, N.D.G., Fields, P.G.: Detection of infestation by Callosobruchus maculatus in mung bean using near-infrared hyperspectral imaging. J. Stored Prod. Res. 52, 107–111 (2013). https://doi.org/10.1016/j. jspr.2012.12.005 27. Wang, H., Peng, J., Xie, C., Bao, Y., He, Y.: Fruit quality evaluation using spectroscopy technology: a review. Sensors 15, 11889–11927 (2015). https://doi.org/10.3390/s150511889 28. López, S.D.L., Trejo, M.T.: Análisis del estado de madurez de mango (Mangifera indica) mediante espectroscopía UV-VIS-NIR. Jóvenes en la ciencia 1(2), 1206–1210 (2015) 29. ElMasry, G., Wang, N., ElSayed, A., Ngadi, M.: Hyperspectral imaging for nondestructive determination of some quality attributes for strawberry. J. Food Eng. 81(1), 98–107 (2007). https://doi.org/10.1016/j.jfoodeng.2006.10.016 30. Walsh, K.N., Blasco, J., Zude-Sasse, M., Sun, X.: Visible-NIR ‘point’ spectroscopy in postharvest fruit and vegetable assessment: the science behind three decades of commercial use. Postharvest Biol. Technol. 168, 1–17 (2020). https://doi.org/10.1016/j.postharvbio. 2020.111246 31. Castro, J., Cerquera, M., Gutiérrez, N.: Determinación del color del exocarpio como indicador de desarrollo fisiológico y madurez en la guayaba (Psidium guajava cv. Guayaba pera) utilizando técnicas de procesamiento digital de imágenes. Revista EIA 10(19), 79–89 (2003) 32. Monterroso-Tobar, M.F., Londoño-Bonilla, J.M., Samsonov, S.: Estimación del retroceso glaciar en los volcanes Nevado del Ruiz, Tolima y Santa Isabel, Colombia a través de imágenes ópticas y Din-SAR. Dyna 85(206), 329–337 (2018). https://doi.org/10.15446/dyna.v85n206. 66570 33. Moroni, M., Lupo, E., Marra, E., Cenedese, A.: Hyperspectral image analysis in environmental monitoring: setup of a new tunable filter platform. Procedia Environ. Sci. 19, 885–894 (2013). https://doi.org/10.1016/j.proenv.2013.06.098 34. Howse, J.: OpenCV Computer Vision with Python. Packt Publishing, Birmingham, United Kingdom (2013) 35. Mustaffa, I.B., Khairul, S.F.B.M.: Identification of fruit size and maturity through fruit images using opencv-python and raspberry pi. In: 2017 International Conference on Robotics, Automation and Sciences (ICORAS), pp. 1–3. IEEE, Malaysia. (2017). https://doi.org/10. 1109/icoras.2017.8308068 36. Barbedo, J.G.: Digital image processing techniques for detecting, quantifying, and classifying plant diseases. Springerplus 2(1), 1–12 (2013). https://doi.org/10.1186/2193-1801-2-660 37. Naik, M.R., Sivappagari, C.M.R.: Plant leaf and disease detection by using HSV features and SVM classifier. Int. J. Eng. Sci. 6(12), 3794–3797 (2016) 38. Vallat, R.: Pingouin: statistics in Python. J. Open Source Softw. 3(31), 1026 (2018). https:// doi.org/10.21105/joss.01026 39. Schlegel, A.: Games-Howell post-hoc Test. https://rpubs.com/aaronsc32/games-howell-test. Accessed 18 Feb 2020 40. Díaz, D.: Estudio de índices de vegetación a partir de imágenes aéreas tomadas desde UAS/RPAS y aplicaciones de estos a la agricultura de precision. http://eprints.ucm.es/31423/ 1/TFM_Juan_Diaz_Cervignon. Accessed 21 Mar 2020 41. 
West, J.S., Bravo, C., Oberti, R., Moshou, D., Ramon, H., McCartney, H.: Detection of fungal diseases optically and pathogen inoculum by air sampling. In: Oerke, E.C., Gerhards, R., Menz, G., Sikora R. (eds.) Precision Crop Protection-the Challenge and Use of Heterogeneity, pp. 135–149. Springer, Dordrecht (2010). https://doi.org/10.1007/978-90-481-9277-9_9
42. Linde, M., Shishkoff, N.: Reference module in life sciences. Powdery Mildew Encyclopedia of Rose Science, 158–165. Elsevier (2003). https://doi.org/10.1016/b978–0-12-809633-8.050 26-3 43. Mahlein, A., Oerke, E., Steiner, U., Dehne, H.: Recent advances in sensing plant diseases for precision crop protection. Eur. J. Plant Pathol. 133(1), 197–209 (2012) 44. Singh, V., Misra, A.: Detection of plant leaf diseases using image segmentation and soft computing techniques. Inf. Process. Agric. 4(1), 41–49 (2017). https://doi.org/10.1016/j.inpa. 2016.10.005 45. Thomas, S., Wahabzada, M., Kuska, M., Thomas, R.U., Mahlein, A.-K.: Observation of plant– pathogen interaction by simultaneous hyperspectral imaging reflection and transmission measurements. Funct. Plant Biol. 44, 23–34 (2016). https://doi.org/10.1071/fp16127 46. Graeff, S., Link, J., Claupein, W.: Identification of powdery mildew (Erysiphe graminis sp. tritici) and take-all disease (Gaeumannomyces graminis sp. tritici) in wheat (Triticum aestivum L.) by means of leaf reflectance measurements. Open Life Sci. 1(2), 275–288 (2006) 47. Jane, S.N., Deshmkh, A.P.: Study and identification of powdery mildew disease for betelvine plant using digital image processing with high resolution digital camera. Int. J. Eng. Res. Technol. 2(3), 74–79 (2015) 48. Zhang, J., Huang, Y., Pu, R., Gonzalez-Moreno, P., Yuan, L., Wu, K., Huang, W.: Monitoring plant diseases and pests through remote sensing technology: a review. Comput. Electron. Agric. 165, 1–14 (2019). https://doi.org/10.1016/j.compag.2019.104943 49. Martinelli, F., Scalenghe, R., Davino, S., Panno, S., Scuderi, G., Ruisi, P., Davis, C.E.: Advanced methods of plant disease detection. a review. Agron. Sustain Dev. 35(1), 1–25 (2015) 50. Camargo, A., Smith, J.S.: Image pattern classification for the identification of disease causing agents in plants. Comput. Electron. Agric. 66, 121–125 (2009). https://doi.org/10.1016/j.com pag.2009.01.003 51. Bock, C.H., Poole, G.H., Parker, P.E., Gottwald, T.R.: Plant disease severity estimated visually, by digital photography and image analysis, and by hyperspectral imaging. Crit. Rev. Plant Sci. 29, 59–107 (2010). https://doi.org/10.1080/07352681003617285 52. Gröll, K., Graeff, S., Claupein, W.: Use of vegetation indices to detect plant diseases. In: Böttinger, S., Theuvsen, L., Rank, S., Morgenstern, M.: (eds) Agrarinformatik im Spannungsfeld zwischen Regionalisierung und globalen Wertschöpfungsketten - Referate der 27. GIL Jahrestagung, pp. 91–94. Gesellschaft für Informatik e. V., Bonn (2007) 53. Mahlein, A.K., Rumpf, T., Welke, P., Dehne, H.W., Plümer, L., Steiner, U., Oerke, E.C.: Development of spectral indices for detecting and identifying plant diseases. Remote Sens. Environ. 128, 21–30 (2013). https://doi.org/10.1016/j.rse.2012.09.019 54. Veys, C., Chatziavgerinos, F., AlSuwaidi, A., Hibbert, J., Hansen, M., Bernotas, G., Grieve, B.: Multispectral imaging for presymptomatic analysis of light leaf spot in oilseed rape. Plant Methods 15(1), 2 (2019). https://doi.org/10.1186/s13007-019-0389-9 55. Leucker, M., Wahabzada, M., Kersting, K., Madlaina, P., Werner, B., Ulrike, S., Mahlein, A.-K., Oerke, E.-C.: Hyperspectral imaging reveals the effect of sugar beet quantitative trait loci on Cercospora leaf spot resistance. Funct. Plant Biol. 44, 1–9 (2017). https://doi.org/10. 1071/fp16121 56. Lorenzo, G., Mascarini, L., Gonzalez, M.: Dosis de N sobre reflectancia espectral, contenido de clorofila y nutrientes en plantas de gerbera. 
Horticultura Brasileira 35(2), 278–285 (2017). https://doi.org/10.1590/s0102-053620170220 57. Hu, H., Liu, H.Q., Zhang, H., Zhu, J.H., Yao, X., Zhang, B., Zheng, F.: Assessment of chlorophyll content based on image color analysis, comparison with SPAD-502. In: 2010 2nd international conference on information engineering and computer science, pp. 1–3. IEEE, Wuhan (2010). https://doi.org/10.1109/iciecs.2010.5678413
58. Lowe, A., Harrison, N., French, A.: Hyperspectral image analysis techniques for the detection and classification of the early onset of plant disease and stress. Plant Methods 13(80), 1–12 (2017). https://doi.org/10.1186/s13007-017-0233-z 59. Yang, C.M.: Assessment of the severity of bacterial leaf blight in rice using canopy hyperspectral reflectance. Precision Agric. 11, 61–81 (2010). https://doi.org/10.1007/s11119-0099122-4 60. Pujari, J.D., Yakkundimath, R., Byadgi, A.S.: Image processing based detection of fungal diseases in plants. Procedia Comput. Sci. 46, 1802–1808 (2015). https://doi.org/10.1016/j. procs.2015.02.137 61. Weissgerber, T., Milic, N., Winham, S.J., Garovic, V.D.: Beyond bar and line graphs: time for a new data presentation paradigm. PLoS Biol. 13(4) (2015). https://doi.org/10.1371/jou rnal.pbio.1002128
EEG-Based BCI Emotion Recognition Using the Stock-Emotion Dataset
Edgar P. Torres P.1, Edgar Torres H.2, Myriam Hernández-Álvarez1(B), and Sang Guun Yoo1
1 Escuela Politécnica Nacional, Facultad de Ingeniería de Sistemas, Quito, Ecuador
[email protected] 2 INVEX S.A., Mexico, Mexico
Abstract. We present a novel method of emotion elicitation through stock trading activities in a competitive market for automatic emotion recognition. We prepared eight participants, who placed buy and sell orders for a specific and very volatile security in simulated trading with real-time market data. A "carrot and stick" approach was implemented to elicit emotions efficiently, which added risks and rewards to participants' trading decisions. Consequently, key trading emotions were triggered: 1) fear, 2) sorrow, 3) hope, and 4) relaxed (or "focused," which is the optimal emotional state). We gathered our data (Stock_Emotion) through an EEG-based BCI device. The dataset was pre-processed, features were extracted, and kNN, MLP, and Random Forest algorithms were applied for emotion classification in the valence-arousal space. Our accuracy results were satisfactory. Keywords: Emotion recognition · BCI · EEG · Machine learning · Feature extraction
1 Introduction Emotions are affective states that influence our behavior. They play an essential role in decision making, and managing them is related to emotional intelligence [1]. Moreover, understanding emotions is part of rational and intelligent behavior and can also improve communication between individuals, and between individuals and machines. Automatic emotion recognition would allow detecting these affective states in order to understand and control them. Affective computing is an area of computer science that expands human-computer interactions, including emotional communications [2]. In recent years, affective computing has developed rapidly, with multiple research works that use Electroencephalography (EEG) based Brain-Computer Interfaces (BCIs). EEG-based BCIs are affordable, noninvasive tools with a high temporal resolution for sensing the electrical signals generated by cerebral activity. These devices can be applied to emotion recognition. It has been proved that there is an association between emotions and the brain's electrical signals, their location, and their functional relationships [3]. Emotion recognition can use several emotion-provocation methods. For instance, passive approaches consist of recalling past experiences that caused feelings, watching © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 226–235, 2021. https://doi.org/10.1007/978-3-030-63665-4_18
music videos and film clips, listening to music, viewing images, or dyadic interactions [4]. There are also active procedures such as video games, flight simulators, and virtual reality immersion [5–7]. These methods vary in their emotion generation reliability. None of these approaches is similar to a "real-life" situation or a "work" environment. This last scenario could recreate a situation in which participants need to recognize their emotions and auto-regulate them to achieve optimal performance. Our research uses a novel type of elicitation of affective states that has been proved to be effective at emotion provocation. Our area of interest is emotion detection in people who trade in a competitive stock market. In this situation, every decision has significant consequences. This way of provoking emotions differs from other elicitation methods because stock market trading consistently induces feelings in participants due to its "money" component. Naturally, "simulated" trading is not as emotionally intense as "real" trading, because money has a strong emotional connotation for most humans. Money represents variables such as resources, lifestyle, survival odds, value, status, health, and even the likelihood of leaving offspring, to name a few [8]. Thus, we believe that participants have strong emotions tied to money, which makes "real" and even simulated trading a powerful emotional stimulus. Research suggests that emotions play a role in stock market trading because they influence decisions. Traders cannot ignore such emotions and merely focus on their work procedures when money is at risk. Emotional states manifest through changes in confidence and risk tolerance [9]. If the trader has a constructive mindset, performance will likely increase. Likewise, if emotions are too intense, then trading performance will probably worsen since stock market trading requires continuous decisions that involve objectively analyzing odds versus risk and reward payouts. Emotions allow humans to respond to stimuli quickly and without rational thought. Moreover, emotions can generate biases that influence decision making, often unbeknownst to humans [10]. These emotional processes had evolutionary purposes. It is likely that the brain's parts that create emotions evolved first and have a direct connection with the body, unlike the prefrontal cortex that has several neuronal layers of separation [11]. Thus, emotions are often harder to curtail through rational thought, but they can be managed through recognition. Traders often report making decisions that "feel" right at first but are regretted shortly after. An excellent example of emotional trading decisions is "panic selling" and "fear of missing out," which cause traders to sell at market bottoms or buy market tops. These trading decisions occur due to emotions, expressed as the avoidance of the psychological pain of losing money. However, such decisions are often inadequate after logical scrutiny. For example, buying oversold and undervalued assets probably has a positive long-term expectancy. However, doing so is typically emotionally difficult for traders due to the previously explained dynamics. Therefore, we believe that it is vital for traders to recognize the effect of their emotions on their trading. This way, traders will use feelings to their advantage, but not allow them to take over their systematic processes and trading strategy.
Thus, our approach aims to understand how traders can get the “best of both worlds” in trading through emotions and rational thought. We believe that enhancing trading performance is supportive of a more efficient market, which is generally positive for society.
We see a similar situation in other peak performance activities, such as playing chess. For instance, chess grandmasters are reportedly capable of observing a chess position and instantly “feeling” who is winning or losing, without much need for calculating further moves. This example shows the need for intuition (emotion or “feeling”) through experience, combined with a disciplined, rational process (i.e., calculating chess moves). We built an EEG dataset with participants executing stock trading activities and annotating their feelings using the self-assessment manikin [12] tool in a Valence-Arousal space [13]. This dataset was used to develop an emotion recognition system that validates the effectiveness of this novel emotion elicitation method. This article is organized as follows: Sect. 2 presents Materials and methods, Sect. 3 exposes Results, Sect. 4 includes Discussion and Future Work.
2 Materials and Methods The dataset was constructed with the involvement of eight healthy individuals, four males and four females, 25 to 60 years old. This dataset was processed to obtain features in the time domain, and then a classification process was executed using the kNN, MLP, and Random Forest machine learning algorithms. 2.1 Dataset Construction Before the experiments, the participants had a 1-h seminar regarding general trading principles, the trading platform, and how to interpret stock charts. Each participant learned the basics of trading, which included a general thinking framework for making discretionary trading decisions. The trading experiment was simulated, i.e., no real money was at risk. However, we used live market data. Simulated trading is as emotionally charged as real trading because money has a strong emotional connotation for most humans [14]. Additionally, a carrot-and-stick approach was implemented to elicit emotions efficiently. We aimed to enforce a sense of risk and reward within trading decisions. The top participants received a monetary prize, and the bottom ones suffered a penalty. Before reading the EEG signals that correspond to trading activities, participants were asked to relax for two minutes to capture a calm state corresponding to a baseline. The experiment used a volatile security that provided ample trading opportunities. Subjects could select one of two possible trading actions: 1) long/bullish or 2) short/bearish. Subjects made at least four trades in a session. The experiment documented the participants' EEG signals while they were trading, and recordings were made with an Open-BCI helmet. In particular, stock market trading triggers three key emotions: 1) fear, 2) hope, and 3) regret [15]. We believe that these emotions have a direct effect on the participant's trading and that these effects are detectable through EEG readings [16]. The application of the present research results may improve trading performance and profits because they are likely linked to emotional management and peak performance techniques. Figure 1 shows the Valence-Arousal space with a representation of the three key emotions that influence trading activities and the location of the relaxed state. We noticed that our dataset did not have HVLA entries, except in the initial baseline. This finding is quite
impressive, given the nature of our experiment. After all, the “HVLA” corresponds to “High Valence,” but “Low Arousal.”
Fig. 1. Valence-Arousal space
Consequently, our results indicate that our subjects never felt something resembling a "meditation" state during trading. Financial literature often cites that the optimal state of mind is something similar to this "meditative" state or being "in the zone." Since our test subjects had no prior experience with trading (simulated or otherwise), it is not surprising to see that none of them ever achieved the "ideal" trading state of mind. Thus, we believe our dataset's results are congruent and, to some extent, validate these empirical trading and peak performance maxims. We called this dataset Stock_Emotion, and we made it available for public use. 2.2 EEG Channels We used a brain-computer interface (BCI) Ultracortex Mark IV EEG headset to record the EEG signal from the studied individuals (Fig. 2.a, 2.b). It was equipped with 8-channel dry electrodes to record brain signals. To collect the EEG signal, we used a Cyton Bio-sensing board with an 8-channel neural interface and a 32-bit processor. The board communicates wirelessly with a computer using a USB dongle. We applied the 10–20 system with channels 1–8 of the OpenBCI default setting. Figure 2.c and Table 1 show the electrode locations.
Fig. 2. Open-BCI helmet
Fig. 3. EEG-based BCI emotion recognition system
Table 1. Cyton's channels and eight electrode positions.
Channel | Electrode position
1 | FP1
2 | FP2
3 | C3
4 | C4
5 | P7
6 | P8
7 | O1
8 | O2
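As an illustration of how the 8-channel Cyton stream could be acquired in Python, the following sketch uses the BrainFlow library; the library choice and the serial port are assumptions, since the paper does not state which acquisition software was used.

```python
from brainflow.board_shim import BoardShim, BrainFlowInputParams, BoardIds

params = BrainFlowInputParams()
params.serial_port = "/dev/ttyUSB0"        # port of the USB dongle (assumed)

board_id = BoardIds.CYTON_BOARD.value
board = BoardShim(board_id, params)
board.prepare_session()
board.start_stream()
# ... record while the participant trades ...
data = board.get_board_data()              # rows = board channels, columns = samples
board.stop_stream()
board.release_session()

eeg_rows = BoardShim.get_eeg_channels(board_id)
eeg = data[eeg_rows, :]                    # the 8 EEG channels (FP1 ... O2)
```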
3 Results In the following subsections, we present an EEG-based BCI emotion recognition system (Fig. 3) that uses time- and frequency-domain features as inputs to three types of classifiers.
3.1 Pre-processing The acquired EEG signals were pre-processed to eliminate body-generated artifacts such as eye blinks and heartbeats. For this, we used two zero-phase Butterworth IIR filters to keep frequencies between 1 Hz and 40 Hz. Also, we eliminated electrical noise using a notch filter to remove the 60 Hz components (a filtering sketch is given after Table 2).
Table 2. Frequency band associations
Band | State association
Gamma rhythm (above 30 Hz) | Positive valence. Arousal increases with high-intensity visual stimuli
Beta (13 to 30 Hz) | Related to visually self-induced positive and negative emotions
Alpha (8 to 13 Hz) | Linked to relaxed and wakeful states. It is thought that these waves appear in relaxation periods with eyes shut but still awake. They represent the activity of the visual cortex in a repose state [17]
Theta (4 to 7 Hz) | Appear in relaxation states and, in those cases, allow better concentration. These waves also correlate with anxious feelings. Also, individuals that experience higher emotional arousal in a reward situation reveal an increase of theta waves in their EEG [18]
Delta (0 to 4 Hz) | Present in the deep NREM 3 sleep stage. Since adolescence, their presence during sleep declines with advancing age. These waves have also been found in continuous attention tasks [19]
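A possible implementation of this filtering stage with SciPy is sketched below; the filter order, the sampling rate, and the notch quality factor are assumptions, since the paper only states the pass band (1–40 Hz), the zero-phase requirement, and the 60 Hz notch.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, iirnotch, filtfilt

fs = 250.0  # Cyton default sampling rate in Hz (assumed)

def preprocess(eeg_channel: np.ndarray) -> np.ndarray:
    # Zero-phase band-pass between 1 Hz and 40 Hz (4th order chosen for illustration)
    sos = butter(4, [1.0, 40.0], btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, eeg_channel)

    # 60 Hz notch to remove power-line interference (Q = 30 assumed)
    b, a = iirnotch(60.0, 30.0, fs)
    return filtfilt(b, a, filtered)
```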
3.2 Feature Extraction In the time domain, statistical characteristics and Higher-Order Crossings (HOC) were extracted as features. Statistical features: in the time domain, information about EEG signals can be extracted using statistical calculations [20]. We selected essential features that contain the most relevant characteristics of the signal, the mean and the variance, because the mean is the average amplitude of the signal and the variance is a measure of its dispersion. These features are defined for a raw signal (X(n), n = 1, …, N) as follows. Mean of the raw signal:
\mu_x = \frac{1}{N} \sum_{n=1}^{N} X(n) \quad (1)
Variance of the raw signal:
\sigma_x^2 = \frac{1}{N} \sum_{n=1}^{N} \left( X(n) - \mu_x \right)^2 \quad (2)
HOC: Higher-Order Crossings are computed as the number of oscillations through zero (zero-crossings). This feature is useful to indicate the pattern of periodic change of the EEG signal. To compute it, a filter is applied to the time series and the number of zero-crossings is counted in the filtered and the original signals. HOC counts the zero-crossings of the successive high-pass filtered time series obtained with the difference operator D. The HOC features are defined by the sequence of D as follows:
\mathrm{HOC} = [D_1, D_2, \ldots, D_M] \quad (3)
where M is the maximum order calculated; we used M = 5.
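A small NumPy sketch of these time-domain features (mean, variance, and the HOC sequence with M = 5) could look as follows; it is an illustrative reading of the definitions above, not the authors' code.

```python
import numpy as np

def time_domain_features(x: np.ndarray, max_order: int = 5) -> np.ndarray:
    mu = np.mean(x)                      # Eq. (1): mean of the raw signal
    var = np.var(x)                      # Eq. (2): variance of the raw signal

    hoc = []
    d = x - mu                           # work on the zero-mean signal
    for _ in range(max_order):
        d = np.diff(d)                   # difference operator D applied repeatedly
        zero_crossings = int(np.sum(np.abs(np.diff(np.sign(d))) > 0))
        hoc.append(zero_crossings)       # Eq. (3): D1 ... DM zero-crossing counts

    return np.concatenate(([mu, var], hoc))
```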
3.3 Classification The feature vector was composed of the already described five frequency bands, the statistical measures, and the HOC series. The classifier inputs were normalized using feature scaling; this process helped the algorithms converge more efficiently. To cover a wide range of algorithm types, we used nearest-neighbor, neural-network, and ensemble classifiers, namely kNN, MLP, and Random Forest, respectively. These specific algorithms were chosen because they had the best performance compared with other functions available in the Sklearn Python library. We applied a 20/80 split for the test and training datasets. We obtained the parameters experimentally for the best performance of each algorithm. kNN chosen parameters: three neighbors, uniform weights, KDTree search, leaf_size = 30, and the Minkowski distance metric with p = 2 (Euclidean distance), defined as
d(x, y) = \left( \sum_{i} |x_i - y_i|^p \right)^{1/p} \quad (4)
MLP chosen parameters: hidden layer sizes = 100, the activation function for the hidden layers is a rectified linear unit, the solver is adam, which refers to a stochastic gradient-based optimizer [21], alpha = 0.0001, learning rate = constant, and maximum number of iterations = 200. The parameters for Random Forest were: number of trees = 100, the criterion to estimate the quality of a split is Gini impurity [22], and maximum depth of the tree = none, so the nodes are expanded until all leaves are pure or until all leaves contain less than two split samples. Table 3 shows the accuracy results.
Table 3. Classification accuracy of the evaluated algorithms.
Dataset | Classifier | Accuracy (%)
Stock-Emotion | kNN | 75
Stock-Emotion | MLP | 60
Stock-Emotion | Random Forest | 60
DEAP | kNN | 60
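With the parameters listed above, the training and evaluation stage might be assembled with scikit-learn roughly as follows; X and y stand for the feature matrix and emotion labels of the Stock-Emotion dataset and are placeholders here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier

X = np.random.rand(96, 20)                 # placeholder feature vectors
y = np.random.randint(0, 3, size=96)       # placeholder labels (3 classes)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
scaler = StandardScaler().fit(X_train)     # feature scaling of the classifier inputs
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

classifiers = {
    "kNN": KNeighborsClassifier(n_neighbors=3, weights="uniform", algorithm="kd_tree",
                                leaf_size=30, metric="minkowski", p=2),
    "MLP": MLPClassifier(hidden_layer_sizes=(100,), activation="relu", solver="adam",
                         alpha=1e-4, learning_rate="constant", max_iter=200),
    "Random Forest": RandomForestClassifier(n_estimators=100, criterion="gini", max_depth=None),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, clf.score(X_test, y_test))
```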
The kNN algorithm had the best classification accuracy, 75%. MLP and Random Forest both had the same accuracy of 60%. The model trained on our Stock-Emotion dataset was later applied to DEAP, a publicly available dataset, using the same kNN classifier. Table 3 presents our results. As shown, our model developed on the Stock-Emotion dataset has higher accuracy than when executed on the DEAP dataset. Nevertheless, the model's metrics on the DEAP dataset are also satisfactory for classification with four categories.
4 Discussion and Future Work The contributions of the present work are mainly related to 1) the generation of the Stock-Emotion dataset, which will be publicly available, 2) the proposed emotion elicitation method using trading with real market data and simulated money, and 3) the use of features in the time domain that could be promptly computed for a real-time application. Ours is a small dataset composed of EEG signals from eight subjects, with 12 emotion tags in each experiment. However, the classification results were satisfactory. For context, a random classification result would have been 33% (3 available classification categories), and the kNN algorithm performed about 40 percentage points better than that. The MLP and Random Forest algorithms also exceeded random classification by roughly 30 percentage points. Thus, our results are consistent with the algorithms capturing signal, and not mere random noise. The method to provoke emotions makes the participants interact with a paper trading platform. Therefore, the setting of the experiment is more like an everyday working environment with highly competitive goals. The results may be applied to emotion recognition in trading and other similar activities. We must note that our results had three classification classes: High Valence – High Arousal, Low Valence – High Arousal, and Low Valence – Low Arousal. Each class is related to one fundamental emotion associated with stock trading: hope, fear, and regret, respectively. High Valence – Low Arousal was never tagged for the participants during trading because they were all beginners in stock trading, and none of them was in a focused or relaxed emotional state. Figure 4 shows that the Stock-Emotion database does not have High Valence – Low Arousal (relaxed state) labels, except for the first two-minute baseline (not included in the figure). This finding contrasts with the DEAP dataset, which has 21% of its labels in HVLA because it uses a different emotion elicitation method (music videos). These results show that our elicitation method is efficient in generating the three critical emotions related to stock trading in competitive markets. As future work, we plan to extend our primary dataset for further testing of emotion elicitation methods.
Fig. 4. Valence – Arousal results stock-emotions vs. DEAP results
References 1. Mayer, J.D.: UNH Personal. Lab 1 (2004) 2. Picard, R.W.: Proceedings 8th HCI Int. Human-Computer Interact. Ergon. User Interfaces p. 829 (1999) 3. Cacioppo, J.T.: Biol. Psychol. 67, 235 (2004) 4. Uhrig, M.K., Trautmann, N., Baumgärtner, U., Treede, R.-D., Henrich, F., Hiller, W., Marschall, S.: Front. Psychol. 7 (2016) 5. Possler, D., Klimmt, C., Raney, A.A.: In: Res. Symp. – Video Games A Mediu. That Demands Our Atten. Symp. 2017, 74–91 (2017) 6. Roza, V.C.C., Postolache, O.A.: Sensors (Switzerland), 19 (2019) 7. Susindar, S., Sadeghi, M., Huntington, L., Singer, A., Ferris, T.K.: Proceedings Human Factors Ergon. Social Annual Meetings. 63, p. 252 (2019) 8. Mulder, M.B., Beheim, B.A.: Philos. Trans. R. Soc. B Biol. Sci. 366, 344 (2011) 9. Butler, D.J., Cheung, S.L.: Mind, Body, Bubble! Psychological and Biophysical Dimensions of Behavior in Experimental Asset Markets. In: Biophysical Measurement in Experimental Social Science Research. Academic Press, pp. 167–196 (2019) 10. Fenton-O’Creevy, M., Lins, J.T., Vohra, S., Richards, D.W., Davies, G., Schaaff, K.: J. Neurosci. Psychol. Econ. 5, p. 227 (2012) 11. Rolls, E.T.: Behav. Brain Sci. 23, 219 (2000) 12. Bradley, M.M., Lang, P.J.: J. Behav. Ther. Exp. Psychiatry 25, 49 (1994) 13. Russell, J.A.: J. Pers. Soc. Psychol. 39, 1161 (1980) 14. Johnson, M.W., Bickel, W.K.: J. Exp. Anal. Behav. 77, 129 (2002) 15. Strahilevitz, M.A., Odean, T., Barber, B.M.: J. Mark. Res. 48, 102 (2011) 16. Torres, E.P., Torres, E.A., Hernández-Álvarez, M., Yoo, S.G.: In: Adv. Artif. Intell. Softw. Syst. Eng. (Springer Nature), pp. 53–60 (2021) 17. Harald, R.: Ensemble 15, 250 (2013) 18. Knyazev, G.G., Slobodskoj-Plusnin, J.Y.: Pers. Individ. Dif. 42, 49 (2007)
19. Kirmizi-Alsan, E., Bayraktaroglu, Z., Gurvit, H., Keskin, Y.H., Emre, M., Demiralp, T.: Brain Res. 1104, 114 (2006) 20. Takahashi, K.: Proc. IEEE Int. Conf. Ind. Technol. 3, 1138 (2004) 21. Kingma, D.P., Ba, J.L.: 3rd Int. Conf. Learn. Represent. ICLR 2015 - Conf. Track Proc. 1 (2015) 22. Suthaharan, S.: Machine Learning Models and Algorithms for Big Data Classification (2016)
Security
Guidelines and Their Challenges in Implementing CSIRT in Ecuador Fernando Vela Espín(B) Universidad UTE, Quito, Ecuador [email protected]
Abstract. The increase in cyber threats in Ecuador and worldwide motivates the search for cybersecurity mechanisms that support a prompt and timely cybersecurity incident response, so that the operations of the organization are not affected. One primary strategy is to establish Cybersecurity Incident Response Teams (CSIRTs) to face cybersecurity attacks; organizations like ENISA and NIST have developed the set of steps necessary to establish a CSIRT. However, in Ecuador there are only a few CSIRTs, found in the academic, financial, military, or government fields, and many of them have been operating for only a few years. The purpose of this study is to develop an analysis of the management and barriers that have been present in the implementation of CSIRTs in Ecuador, and to identify possible actions to face these barriers and improve the cybersecurity incident response in the country. By defining a guideline to avoid nonconformities when implementing a CSIRT, national security can be improved. Keywords: CSIRT · Computer security · Incident handling · Security models
1 Introduction Digitalization of critical infrastructure has increased security problems in cyberspace; under this context, countries around the world have developed cybersecurity strategies to reduce the impact of cyber-attacks [1]. The growth of cybersecurity science, which is trying to establish procedures, mechanisms, and techniques to face cyber-attacks, shows the relevance of cybersecurity incident response teams (CSIRTs). The first CSIRT was established by Carnegie Mellon University in 1988 after the Morris worm attack that affected several computers attached to the Internet; 32 years later, there are still organizations in the world that do not have CSIRTs to face cyber-attacks and, more notably, there are countries that do not even have their own CSIRT. In 2014 the National CSIRT was implemented in Ecuador, and finally, in 2019, academic, government, and private organizations showed greater interest in building CSIRTs to protect their organizations. Some research questions arise in this context: What are the challenges for organizations that implement a CSIRT? What aspects need to be considered to implement a CSIRT? How can organizations accelerate the implementation of a CSIRT? How can organizations measure their CSIRT's capabilities to face cyber-attacks? © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 239–251, 2021. https://doi.org/10.1007/978-3-030-63665-4_19
This study aims to identify some key aspects to find possible answers to these research questions and, afterward, with this background, to establish some guidelines to improve the implementation process of incident response operations in organizations. To develop this study, we selected a grounded theory analysis based on the observation of three established CSIRTs in Ecuador. The CSIRTs chosen in this study cover different scopes (government, academic, and private) to set up a first general mapping of their experience, barriers, challenges, and strategies in the domain of cybersecurity incident response. The three CSIRTs have been evaluated with the SIM3 maturity measurement methodology; this evaluation tries to identify possible correlations among their operations. The rest of this study has been organized as follows. Section 2 presents a background on types of CSIRTs, guidelines, and their operations. Section 3 shows the research methodology used in this study. Section 4 reveals the main findings of this research.
2 Background 2.1 Types of CSIRTs There are different types of CSIRT; they can be classified by scope or size. Shierka et al. [2] identified the following types of CSIRT. The National one: the goal of this type of CSIRT is to coordinate the cybersecurity incident response across the whole country. The Government one: provides support to citizens using personalized alert and warning services, fosters awareness of internet security, and coordinates with national CSIRTs. The Commercial one: delivers paid services to organizations with SLA management, generates alerts and notices for handling security incidents, and supports forensic investigations. The Academic one: provides services in the educational sector, universities, or research centers, managing its coverage on the campus to which it belongs. The Military one: is developed within a country's own security structures, managing the parameters of each military unit under the so-called principle of protection through darkness. 2.2 Guidelines for CSIRTs Operations According to the World Economic Forum (WEF), cyber-attacks are among the top risks for the economy around the world. Under this context, countries have adopted strategies for fighting against cyber-attacks. In some countries, the procedure was to build specialized government organizations related to cybersecurity operations. However, cyber-attacks are still growing, and these technical organizations in the states need to enhance their operations to reduce the impact of cyber-attacks. This context raised the first research question proposed in this study: "How can organizations determine whether their CSIRT has the capabilities to face cyber-attacks?". Two essential guidelines for cybersecurity operations worldwide are proposed by the National Institute of Standards and Technology (NIST) and The European Union
Agency for Cybersecurity (ENISA); the first one has more relevance in the North American continent and the second one in Europe. NIST. The National Institute of Standards and Technology (NIST) developed the guide NIST SP 800-61r2 [3], Computer Security Incident Handling Guide, to manage incidents. The guide defines the main phases used to mitigate attacks: preparation; detection and analysis; containment, eradication, and recovery; and post-incident activity (see Fig. 1).
Fig. 1. Incident response life cycle
This guide seeks to assist IT Security Incident Response Teams (CSIRTs), network administrators, Chief Information Security Officers (CISOs), Chief Information Officers (CIOs), security personnel, and other responsible of preparing a responding to incidents of security. ENISA. The European Union Agency for Network and Information Security (ENISA) [4] is an expertise center in networks and information security for the EU, its member states, the private sector, and European citizens. ENISA works with these groups to develop advice and recommendations on acceptable practices in information security. ENISA proposes a guide for handling security incidents on the implementation of CSIRTS, that provides a global vision of all the implications (not only technological) through the Security Incident Management Maturity Model (SIM3), it allows contextualization for an exemplary implementation of a CSIRT. 2.3 Cyber Security Maturity Model (SIM3) The SIM3 maturity model was based on three fundamental levels: 1) Quadrants of maturity 2) Maturity parameters 3) Maturity levels These pillars were built under the context described in Table 1:
Table 1. The SIM3 maturity model.
Quadrants of maturity: determined in 4 fundamentals: O-Organization, H-Human, T-Tools, P-Processes.

Maturity parameters: determined in more than 40 independent guidelines based on the pillars of the maturity quadrants, as detailed below.

Organization parameters: O-1 Mandate; O-2 Constituency; O-3 Authority; O-4 Responsibility; O-5 Description of the service; O-6 (intentionally left blank, not included in scoring); O-7 Description of the service level; O-8 Incident classification; O-9 Participation frameworks; O-10 Organizational framework; O-11 Security policy.

Human parameters: H-1 Code of conduct/practice/ethics; H-2 Personal resilience; H-3 Description of skill set; H-4 Internal training; H-5 External technical training; H-6 External communication training; H-7 External networks.

Tool parameters: T-1 IT resource list; T-2 List of information sources; T-3 Consolidated email system; T-4 Incident tracking system; T-5 Rugged phone; T-6 Resilient email; T-7 Resilient Internet access; T-8 Incident prevention toolkit; T-9 Incident detection toolkit; T-10 Incident resolution toolkit.

Process parameters: P-1 Government level escalation; P-2 Press escalation; P-3 Legal escalation; P-4 Incident prevention process; P-5 Incident detection process; P-6 Incident resolution process; P-7 Specific incident processes; P-8 Process of active evaluation of the team by senior management; P-9 Emergency accessibility process; P-10 Common mailbox names; P-11 Secure information management process; P-12 Information sources process; P-13 Disclosure process; P-14 Reporting process; P-15 Statistical process; P-16 Meeting process; P-17 Peer-to-peer process.

Maturity levels: 0 = not available/undefined/unaware; 1 = implicit (known/considered but not written); 2 = explicit, internal (written but not formalized in any way); 3 = explicit, formalized under the authority of the CSIRT head (with seal, signed, or published); 4 = explicit, audited at levels of governance authority above the head of the CSIRT.
Based on the SIM3 Security Incident Management Maturity Model, we can verify that each level imposes its own obligations for a CSIRT to move up or to remain at that level, as defined in Table 2.

Table 2. Mandatory requirements to validate the maturity level of a CSIRT.

Requirements: Organizational parameters O-1/O-2/O-3/O-9 = (3); Human parameters H-1/H-6/H-7 = (1)

Basic level: Organizational parameters O-7/O-10 = (3); Tool parameters T-1/T-8/T-9 = (1); Process parameters P-1 = (3), P-10 = (2)

Intermediate level: Organizational parameters O-1/O-2/O-3/O-4/O-5/O-9 = (4); Human parameters H-1/H-2/H-7 = (3); Tool parameters T-2/T-3/T-4/T-5/T-6/T-8 = (2); Process parameters P-4/P-5/P-6 = (2), P-9/P-11 = (3)

Advanced level: Organizational parameters O-8/O-11 = (3); Human parameters H-3/H-4/H-5/H-6 = (3); Tool parameters T-2/T-3/T-4/T-5/T-6/T-7 = (3), T-10 = (2); Process parameters P-16/P-17 = (2), P-2/P-3/P-7/P-12/P-13/P-15 = (3), P-8/P-14 = (4)
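To make the use of Table 2 concrete, the short Python sketch below checks a set of SIM3 parameter scores against a minimum-requirement profile. It is illustrative only: the profile is taken from the Basic level row as reconstructed above, and the self-assessment scores are hypothetical, so this is not an official ENISA scoring tool.

```python
# Minimum scores required for the Basic level, following Table 2 as reconstructed above.
BASIC_LEVEL = {"O-7": 3, "O-10": 3, "T-1": 1, "T-8": 1, "T-9": 1, "P-1": 3, "P-10": 2}

def gaps_to_profile(scores: dict[str, int], profile: dict[str, int]) -> list[str]:
    """Return the parameters whose score is below the required minimum."""
    return [p for p, minimum in profile.items() if scores.get(p, 0) < minimum]

# Hypothetical self-assessment of a CSIRT on the 0-4 scale from Table 1.
csirt_scores = {"O-7": 3, "O-10": 2, "T-1": 1, "T-8": 0, "T-9": 1, "P-1": 3, "P-10": 2}

gaps = gaps_to_profile(csirt_scores, BASIC_LEVEL)
print("Basic level reached" if not gaps else f"Gaps to close: {gaps}")
# -> Gaps to close: ['O-10', 'T-8']
```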
3 Research Methodology

To understand the context of CSIRTs in Ecuador, we designed our research under a grounded theory methodology, intending to understand the most relevant aspects that could act as drivers of, or barriers to, building a Computer Security Incident Response Team (CSIRT). We established three steps in our methodology:

Step 1. We conducted a systematic literature review of publications related to the evaluation of CSIRTs' capabilities in order to obtain the key aspects needed to operate the cybersecurity incident response process. We used the terms "CSIRT", "cybersecurity incident response", "challenges", "barriers", and "maturity" to search for publications in the scientific databases Scopus, IEEE, and ACM, which are representative of the technology field. We limited the search period to the last five years (2015-2020), an adequate period of time to evaluate changes in the technological and economic fields.

Step 2. We applied a survey on CSIRT development areas, based on the SIM3 model, to three organizations in Ecuador that had implemented a CSIRT. This survey enabled us to obtain a perspective on their level of maturity in the four quadrants proposed by SIM3. We interviewed representatives of the three selected CSIRTs and asked them about the organizational, human, technological, and process aspects considered in the SIM3 model.

Step 3. We administered an open-question survey to cybersecurity experts familiar with SIM3 to obtain the essential aspects that could act as barriers to, or accelerators of, the cybersecurity incident response process. The questions were defined based on the factors considered in the SIM3 model. This survey was meant to capture the opinions of cybersecurity experts on the management issues or barriers that they think may affect cybersecurity incident response operations (see Table 3).

Table 3. Research methodology to understand our research questions.
Research question | Research method | Data collection instrument
What are the challenges for organizations to implement a CSIRT? | Survey | Literature review, questionnaire, and interview
What aspects need to be considered to implement a CSIRT? | Survey | Literature review, questionnaire, and interview
How can organizations accelerate the implementation of a CSIRT? | Survey | Literature review, questionnaire, and interview
How can organizations measure whether the CSIRT has the capabilities to face cyber-attacks? | Survey | Literature review, questionnaire, and interview
4 Research Findings

We applied the SIM3 maturity model promoted by ENISA to three CSIRTs in Ecuador, referred to in this study as Company A, Company B, and Company C in order to anonymize the real names of the CSIRTs. Figure 2 shows the maturity level of Company A: the green line represents the "advanced" level of maturity, the blue line the intermediate level, the yellow line the low level, and the purple line the current situation of the organization. Based on the graph and the maturity model, the data analysis shows that the company lacks the organizational, human, tool, and process parameters needed to reach the basic level; for this reason, the project for the implementation of a CSIRT in the company failed, as shown in the following graph.
Fig. 2. SIM3 evaluation for Company A.
Figure 3 shows the maturity level of Company B, where the green line represents the "advanced" level of maturity, the blue line the intermediate level, the yellow line the low level, and the red line the current situation of the organization. Based on the graph and the maturity model, it can be determined that the company lacks the organizational, human, and process parameters needed to reach the basic level. The main reason for failing to implement a CSIRT in Company B was on the administrative side, because it was not clear what, or whom, the CSIRT would protect. When the plan for the implementation of the CSIRT was conceived, the technical experts were not clear about either the mission or the vision: they relied on the approaches of the CSIRTs of the Americas and ENISA and did not define the real Ecuadorian problem, the viability of the project, or the actual scope in terms of maturity and the technological, human, organizational, and process capabilities needed to detect and respond to security incidents.
Fig. 3. SIM3 evaluation for Company B.
Figure 4 shows the maturity level of Company C, where the green line represents the "advanced" level of maturity, the blue line the intermediate level, the yellow line the low level, and the black line the current situation of the organization. Based on the graph and the maturity model, it can be determined that the company lacks the organizational and human parameters, and about 50% of the process parameters, needed to reach the basic level; for this reason, the CSIRT project could not be implemented in Company C. Although the parameters are on track, there is still no support from the organization.
Fig. 4. SIM3 evaluation for Company C.
Figure 5 shows a comparative analysis of the three companies. We can observe that the main barriers to the implementation of a CSIRT come from the organizational vision of the business: the return on investment (ROI) is the main concern in each organization, and information security projects are still treated as an expense rather than an investment. The lack of experience of technicians and information security specialists regarding what a CSIRT is, how it works, how it would improve security, and what the organization gains explains why the implementation projects are not considered viable. After the detailed analysis, we can conclude that these are the main reasons why CSIRTs are not being implemented in the region.
Fig. 5. Comparative analysis of the degree of maturity of the three companies.
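As a purely illustrative sketch of how such a comparison can be tabulated (the scores below are hypothetical placeholders, not the values measured in this study for Companies A, B, and C), the per-quadrant SIM3 averages can be computed as follows:

```python
# Hypothetical SIM3 scores per quadrant (0-4 scale); not the values behind Fig. 5.
companies = {
    "Company A": {"O": [1, 0, 1, 2], "H": [0, 1, 0], "T": [1, 1, 0], "P": [0, 0, 1]},
    "Company B": {"O": [2, 1, 1, 2], "H": [1, 1, 0], "T": [2, 2, 1], "P": [1, 1, 1]},
    "Company C": {"O": [2, 2, 1, 2], "H": [1, 1, 1], "T": [2, 2, 2], "P": [2, 1, 2]},
}

def quadrant_average(scores: dict[str, list[int]]) -> dict[str, float]:
    """Average maturity per SIM3 quadrant (O, H, T, P)."""
    return {q: round(sum(values) / len(values), 2) for q, values in scores.items()}

for name, scores in companies.items():
    print(name, quadrant_average(scores))
```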
4.1 Barriers for Incident Response

Catota et al. [5] identified the following barriers at Ecuadorian financial institutions: small team size; lack of visibility; inadequate internal coordination; lack of awareness; lack of training; technology updating; lack of budget; slow response time; inadequate security controls; and lack of technology. Wiik et al. [6] point out that a fundamental problem for a CSIRT is to balance a growing workload with limited human resources. On the other hand, Nyre-Yu et al. [7] mention that structure and policy within an organization may considerably affect a group's ability to perform.

In this study, we developed a qualitative analysis based on open-question surveys categorized into four axes: People, Organization, Process, and Tools. According to the data analysis of the open-question survey, the drivers and barriers identified by cybersecurity experts in Ecuador are shown in Table 4.

In Fig. 6, regarding the organization axis, we can see that security experts think the main issues are lack of budget, lack of specialized training for staff, and lack of cybersecurity policies. Security experts mention that more commitment from senior management is needed, and that the lack of knowledge about cybersecurity issues hinders the development of adequate policies for cybersecurity operations in the organization.

Figure 7 shows the analysis of the tools axis. Cybersecurity experts consider email the most widely used tool for cybersecurity incident response, so they believe it is essential to improve the communication skills of the organization's members. Incident response requires reports that include events, mechanisms, and possible decisions about cybersecurity incidents; these reports need to be structured clearly and written with the readers in mind, since heavily technical wording can confuse people who do not work in the technical department. Adopting new security tools for incident response is also needed to improve incident response [8].

Figure 8 shows that, on the human axis, cybersecurity experts think a symbiosis between academia and government is needed to build an adequate incident response structure. Improving incident response requires trained people, so this training must be considered in the budget [9].
Table 4. Barriers that affect cybersecurity incident response operations.
Type | Question | Key aspect
Organization | How do authorities see cybersecurity? | As spending
Organization | What restrictions exist to implement a CSIRT? | Lack of top management support and low budget
Organization | What limitations exist to implement a CSIRT? | Lack of knowledge
Organization | How mature is the organization? | Low levels, and this is considered a limiting factor
Human | What is the role of the CSIRT? | To improve cybersecurity incident response
Human | Is there a symbiosis between industry and academia? | There is no symbiosis
Tool | How is technological development in the country? | Lack of tools is a limitation
Process | Is staff rotation a risk or a limitation? | Lack of process is a limitation
Process | How does the organization view the CSIRT? | As ROI in the medium and long term
Fig. 6. Organizational factors related to CSIRT operation.
Finally, Fig. 9 shows the issues related to the process axis. Cybersecurity experts think there is an important number of guidelines and standards, such as NIST SP 800-61r2, RFC 2350, and ISO 27035 [10]. However, they are not yet fully applied in organizations; this affects the incident response process and may be the reason why organizations sometimes do not plan training or buy the necessary tools. It is necessary to work on a strategic plan for incident response in conjunction with senior management. CSIRTs also need to find new ways to automate some of their processes [11] (Fig. 7).
Fig. 7. Tools related to CSIRT operation.
Fig. 8. Human factors related to CSIRT operation.
Fig. 9. Process factors related to CSIRT operation.
5 Conclusions

At the end of the analysis, we identified several barriers to taking advantage of cybersecurity incident response teams (CSIRTs) in Ecuador. We can define the following guidelines for
implementing a CSIRT in the country, organized into three levels: strategic, tactical, and operational (see Table 5).

Table 5. Guide to implementing a CSIRT.

Strategic level: What is the ROI for the organization? Who are the actors that intervene on behalf of the organization? Who will be protected? In which of the institution's offices will the CSIRT be located and managed? Is the technology infrastructure that the organization operates known? Manage open source or proprietary tools for IR analysis. Take stock of the organization's information assets. Have information about security policies.

Tactical level: Obtain the support of senior management. Define the mission and vision of the CSIRT, being very clear about which assets are going to be protected in order to understand which incidents can be analyzed. Form a multidisciplinary team with specialists in security, communication, and law. Define roles and responsibilities. Define the physical space in which to implement the CSIRT.

Operational level: Know the technology infrastructure that the organization manages. Manage tools for IR analysis. Establish the organization's information assets. Define and verify information security policies.
References 1. Karabulut, Y., Boylu, G., Kucuksille, E.U., Yalçınkaya, M.: Characteristics of cyber incident response teams in the world and recommendations for turkey. Balkan J. Electr. Comput. Eng. 3 (2015) https://doi.org/10.17694/bajece.93521 2. Skierka, I., Morgus, R., Hohmann, M., Maurer, T.: CSIRT Basics for Policy-Makers (2015) 3. Thompson, E.C.: Incident response frameworks. In: Cybersecurity Incident Response. Apress, Berkeley, CA. (2018) https://doi.org/10.1007/978-1-4842-3870-7_3 4. ENISA Maturity Evaluation Methodology for CSIRTs. https://www.enisa.europa.eu/public ations/study-on-csirt-maturity-evaluation-process 5. Catota, F., Morgan, M., Sicker, D.: Cyber security incident response capabilities in the Ecuadorian financial sector. J Cyber Secur. 4 (2018) https://doi.org/10.1093/cybsec/tyy002 6. Wiik, J., Gonzalez, J., Kossakowski, K-P.: Limits to effectiveness in computer security incident response teams. Boston, Massachusetts: Twenty Third International Conference of the System Dynamics Society (2005) 7. Nyre-Yu, M., Gutzwiller, R., Caldwell, B.: Observing cyber security incident response: qualitative themes from field research. Proceedings of the Human Factors and Ergonomics Society Annual Meeting. 63, 437–441 (2019) https://doi.org/10.1177/1071181319631016
8. Andrade, R.O., Cazares, M.F., Tello-Oquendo, L., Fuertes, W., Samaniego, N., Cadena, S., Tapia, F.: From cognitive skills to automated cyber security (2018) 9. Andrade, R., Torres, J.: Enhancing intelligence SOC with big data tools. In: 2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), pp. 1076–1080. IEEE (2018) 10. Cadena, A., Gualoto, F., Fuertes, W., Tello-Oquendo, L., Andrade, R., Tapia, F., Torres, J.: Metrics and indicators of information security incident management: a systematic mapping study. In: Developments and Advances in Defense and Security, pp. 507–519. Springer (2020) 11. Andrade, R., Torres, J., Cadena, S.: Cognitive security for incident management process. In: International Conference on Information Technology & Systems, pp. 612–621. Springer, Cham (2019)
A Blockchain-Based Approach to Supporting Reinsurance Contracting Julio C. Mendoza-Tello1(B) and Xavier Calderón-Hinojosa2 1 Facultad de Ingeniería y Ciencias Aplicadas, Universidad Central del Ecuador,
170129 Quito, Ecuador [email protected] 2 Departamento de Electrónica, Telecomunicaciones y Redes de Información, Escuela Politécnica Nacional, Quito, Ecuador [email protected]
Abstract. Reinsurance is a strategy used by the insurer to protect themselves from underwriting risk. In this line, an insurer acquires an insurance policy from another called reinsurer. In this way, the contractual responsibility that must be fulfilled between these two entities is specified. Thus, collaborative and transparent efforts are required in this business. In addition, automatic processing and the integrity of the information flow must be guaranteed. Faced with these challenges, blockchain is an innovative technology that allows registering and automating operations using smart contracts and complex cryptographic mechanisms that provide trust and integrity for transactions, without the need for third-party supervision. With these characteristics, this research explores how blockchain technology can become a collaborative and supportive tool in reinsurance contracts. For this, a layered overview and model are designed. Contributions and conclusions are provided at the end of the document. Keywords: Blockchain · Smart contracts · Insurance · Reinsurance
1 Introduction An insurance policy constitutes a contractual agreement between two entities (or participating users): insured and insurer. By paying a security (called a premium), an insured party transfers part (or all of their risk) to an insurer. Thus, an insured party is protected against possible catastrophes and unexpected incidents. Similarly, an insurer needs to preserve its assets and businesses through an activity called reinsurance. Consequently, reinsurance allows one insurer (called a cedant or direct insurer) to transfer part (or all) of its risk to another insurer (called reinsurer). In this sense, the insurer agrees to pay a value (called a reinsurance premium) with the promise that the reinsurer will commit to providing risk coverage. In turn, a reinsurer that accepts the risk of another is called a retrocessionaire. In other words, reinsurance is the insurance of insurance companies. Thus, it becomes an umbrella for protection from underwriting risk and ruin.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 252–263, 2021. https://doi.org/10.1007/978-3-030-63665-4_20
An efficient reinsurance process requires consideration of three main aspects. Firstly, collaboration and transparency between insurer and reinsurer. Secondly, automatic processing of settlements, contracts, and claims. In this line, smooth communication with no delays in confirmations is desirable. Thirdly, an auditable and immutable record that guarantees the integrity and confidentiality of the flow of information generated. Currently, reinsurance is a B2B business with a high intermediation component that involves complex manual processes. Faced with this challenge, blockchain guarantees the mutual trust and strict compliance of the participants through the implementation of smart contracts at the top of its structure, without the need for intermediaries. Taking these considerations, this article examines how blockchain features can support reinsurance contracting. The sections of this document are structured as follows. In section two, the methodological aspects and background are pointed out. Section three explains how the inherent characteristics of the blockchain can support reinsurance contracting. Taking these considerations into account, a layered overview and model are described. Finally, conclusions and future research are presented in section four.
2 Methodology and Background 2.1 Methodology In this research, two main fields of study are identified (Fig. 1):
Fig. 1. Implications of blockchain features in reinsurance contracting
– Blockchain technology, upon which smart contracts are implemented to provide the model with the following characteristics, namely: automation, digital authentication, access rules, decentralization, shared intelligence, transparency, and immutability of data. – Reinsurance, which identifies two main types of insurance contracts: facultative and treaty. In this line, a literature review is described. That is, a notion of the different modalities of reinsurance contracts is presented. In addition, it explains how blockchain implements
smart contracts to work together. In this context, a description of the works related to this research is described. From this perspective, a layered model is designed to explain how blockchain features involve and support reinsurance contracting. Consequently, a reinsurance procedure based on smart contracts is described. 2.2 Reinsurance Contracts No insurance company has the financial capacity to cover the coverage of all the risks assumed by its business activity. In this sense, reinsurance becomes an umbrella of contingencies at the service of the insurer. To do this, various reinsurance mechanisms were developed as alternatives and strategies to diversify risk amongst many companies. These methods are implemented within contracts, which are adjusted according to the requirements of each company and the scope of the reinsurance business. Along these lines, two main categories of reinsurance contracts are identified: facultative and treaty. – Facultative. This is a reinsurance contract where the risks are analyzed individually because they are outside the normal conditions of a portfolio. To do this, the transferor company negotiates a contract for each policy, and the reinsurer chooses each of the risks it wishes to accept in an adverse manner. Each risk is negotiated separately. For example, within the fire portfolio, there are homogeneous conditions for houses and buildings. However, the reinsurance contract for a hydroelectric plant departs from the homogeneous characteristics and insured amounts of the aforementioned portfolio. So, the reinsurance of a hydroelectric plant must be specified within a facultative contract. – Treaty. This is a reinsurance contract where risks are analyzed by portfolio, rather than individually by each policy. In this sense, an insurer is obliged to assign all the risks of a portfolio and a reinsurer is obliged to accept all those risks. In other words, the risks of a portfolio are negotiated together. Treaty reinsurance contracts are given by two types. Firstly, proportional contracts where risks, sums insured, losses, and premiums are shared proportionally between insurer and reinsurer, according to fixed percentages (called quota share) or variable percentages (called surplus share). In the event that the insured sum of the risk is beyond the limit of the proportional contract, the risk must be transferred optionally. Secondly, non-proportional contracts whereby the reinsurer only pays the insurer for the following issues: per risk excess, per occurrence excess, and aggregate excess [1]. 2.3 How Do Blockchain and Smart Contract Work Together? Blockchain is a distributed ledger with the versatility to register any digital asset in a P2P network. This innovation is characterized in that it uses complex cryptographic operations to achieve unanimous consensus throughout the network of participants, without the need for a central supervisory authority. Thus, this innovation becomes a disruptive technology for society and information technology management. In its early days, blockchain stood out because it is the underlying technology of cryptocurrencies (like bitcoin). Each bitcoin represents a digital asset or token that can
be transferred from one address (controlled by a sending user A) to another address (controlled by a receiving user B). To perform this transfer of ownership, the use of asymmetric cryptographic method keys is required. These keys allow the user to demonstrate the ownership of a certain token (cryptocurrency). Blockchain is a structure of sequential and time-bound blocks. Each block is made up of a header and a set of transactions. The immutability of recorded events is guaranteed due to two mechanisms. Firstly, previous block header hash. This operation allows linking a later block with the previous block. Secondly, the Merkle tree. This procedure performs a dual grouping of transactions through hash functions. In this way, all transactional history is certified by means of a timestamp provided by the same decentralized consensus mechanism, and any attempt at falsification will be rejected by the entire P2P network. In this scheme, there are two main types of participants or nodes. Firstly, light nodes that transact through a simplified payment method. Secondly, mining nodes with special functionalities that allow the validation, authentication, and registration of transactions made by light nodes, as well as the grouping of these transactions within blocks [2]. A relevant aspect of the evolution of blockchain technology is the ability to implement smart contracts as digital tokens embedded at the top of its structure. A smart contract is a set of encodings that allows the registration of a transaction according to an established condition [3]. A smart contract is implemented through the publication of a transaction on the blockchain and can be invoked by another contract or by an external entity (user or applications). Each smart contract is made up of three fields. Firstly, a program code that represents the defined rules, and that is developed in a Turing Complete language. Each encoding, once implemented cannot be changed, that is, it is immutable. When the smart contract is invoked, its coding is validated by the mining community through the consensus mechanism. Secondly, account balance. All blockchain has monetary property; that is, it can send or receive electronic currency. Thirdly, storage through the registration of a transaction on the blockchain. Figure 2 shows how blockchain and a smart contract work together. 2.4 Related Works Information Review A document review protocol was used according to Kitchenham’s recommendations [4]. For this, Scopus is a suitable tool to select studies from the following bibliographic bases: IEEE Xplore, Elsevier, ACM digital, Springer, and Emerald Insight. In this line, four tasks were executed. Firstly, the search strategy. Keywords were identified: “blockchain”, “smart contract” and “insurance”. In a first search, 214 results were found. However, most documents did not correspond to investigation central focus. In this situation, the search was refined by document title, and a sample of 25 results was obtained. Additionally, a manual search was performed in order to add complete studies. Secondly, identification of exclusion and inclusion criteria. Only articles from journals, proceedings, and book chapters were considered. The language of the bibliographic references is English. Only post-2009 research was reviewed (i.e., studies conducted after the creation of blockchain and bitcoin by Satoshi Nakamoto [5]).
Fig. 2. How do blockchain and a smart contract work together?
Thirdly, study selection, and quality evaluation. Only complete studies were considered, that is, investigations that describe analysis, methodology, and data collection method, if applicable. In this evaluation, 17 documents were eliminated, and thus, 8 documents remained. Fourthly, data extraction, and synthesis. In this line, Mendeley and MS Excel were used. This systematic review culminated on July 10, 2020. Review Synthesis Blockchain is a nascent technology, which causes concerns for the insurance market due to the promise of reducing costs and making operations transparent, without the need for a trusted third party. Recent research uses SWOT analysis to assess the maturity of the blockchain to be applied in insurance [6]. In this line, a methodology was described to implement the rules and wording of a legal insurance contract within a smart contract [7]. Even the willingness of consumers to pay an additional premium for a blockchain-based smart contract was analyzed in some business scenarios [8]. In this context, usage cases were designed in insurance products: travel [9] and car. For this, research based on IoT, homomorphic encryption, and storage system was carried out [10]. However, few blockchain investigations focus on the different insurance activities, namely: underwriting, sales, and billing, reserve calculation, fraud detection, reinsurance, incident management, claims management. Along these lines, this research proposes a blockchain-based model to support reinsurance contracting.
3 Results 3.1 Blockchain Features to Support Reinsurance Contracting A reinsurance contract transfers the risk from an insurer to a reinsurer, maximizing the probability of survival in the market and minimizing the probability of ruin [11, 12]. In this line, this section explains how blockchain features can support this activity. Automation using Smart Contract Blockchain has the ability to implement smart contracts at the top of its structure. This adhesion enables the reinsurance activity to be supported in two ways. Firstly, a smart contract allows the definition and verification of contractual terms of reinsurance. In this line, mathematical formulas based on facultative and treaty contracts are coded. Secondly, a smart contract can work with various data sources and automate payment. With these considerations, the reinsurance process is simplified, and the immediate settlement is between insurer and reinsurer. Digital Authentication and Access Rules Blockchain uses keys based on asymmetric cryptography methods. In this context, blockchain provides security in several ways. Firstly, cryptographic functions are implemented to improve security. The keys are stored in digital wallets and allow access to digital token addresses such as reinsurance contracts and cryptocurrencies. The exchange of messages is signed with a private key. Thus, the authenticity and ownership of a digital asset are validated, as well as the integrity of the messages. Secondly, a digital authentication procedure is provided to validate the identity and registration of the data of the entities participating in the blockchain. This encrypted data can be sent from the blockchain to other computer systems. In this way, the incorporation of new participants is expedited, and identity falsification is eliminated. In this situation, blockchain transparency can favor cooperation ties between insurers, which need to strengthen the Know Your Customer (KYC) and Anti-money Laundering (AML) processes. Thirdly, access rules incorporated in smart contracts. According to this authorization, access privileges and administrative roles are granted to a digital identity that accesses a digital resource (e.g., insurance policy contract) for a specified time. That is, various users accessing a public address (which identifies a resource), according to a specified permission within a smart contract. Thus, blockchain provides confidence in the authentication process, avoiding unwanted intrusions. Shared Intelligence Sharing safe and valid information through a distributed ledger, improves the efficiency of negotiation between the participants. Due to this feature, the insurance market may benefit from some profits. Firstly, blockchain as a tool to support risk management. The implementation of blockchain (on P2P platforms) allows having several points of collaboration between entities, which need to share information about underwriting and premium calculation methods. Each transactional log point becomes a source of data about insurer and reinsurer behavior. These actions are immutable evidence of claims and events, which are linked to a reinsurance contract. In this line, the participants have the knowledge of:
risk profiles, experience in the insurance market, and cost factors. Thus, the insurer can assess the ability of a reinsurer to insure events and face claims payments. Likewise, the reinsurer will be able to assess the insurer’s payment capacity and niche market. Consequently, through collaborative intelligence efforts amongst insurers, they will enjoy the immediate availability of information, which favors: negotiation between companies, the management of the risk of fraud, the reduction of the probability that an insured event will occur, and monitoring of products, goods, and services. Secondly, blockchain allows a reduction in costs and search times. It is essential that a risk portfolio be transferred to a reinsurer. Along these lines, the effort to find a suitable reinsurer is a long and overwhelming task. Also, many companies carry out this activity through intermediaries or trusted third parties. In this situation, blockchain is a technology that proposes the elimination of intermediation in the business value chain. This innovation makes it possible to encrypt all the information in a distributed and immutable ledger, which enables knowing the clients and the adequate coverage for their risks, without the need for third parties. Thus, each participant configures the level of access and confidentiality of the information, which would be made immediately available to the participants for review. In this way, response times regarding the various risks and financial coverage amongst participants are streamlined. In addition, possible errors can be corrected in less time. All these activities are implemented in P2P environments. Consequently, brokerage costs can be decreased or eliminated. Thirdly, blockchain is a marketing business driver and promotes the collaborative economy through smart contracts and virtual currencies [13, 14]. Informed purchases favor decision making. In this vein, consumers can expand their knowledge about multiple insurance options, risk coverage, promotion of offers, and prices. Furthermore, a customer’s preferences and needs may be known by a reinsurer. In this sense, the marketing activity favors the personalization and adjustments of the reinsurance offer. Transparency and Immutability of Data Blockchain facilitates the visibility of reinsurance operations. This technology uses exhaustive cryptographic tests and hash functions that allow the transaction blocks to be linked sequentially. Any change in the transactional history is rejected by the participants of the P2P network. Thus, the immutability and integrity of the data are guaranteed. Due to these characteristics, the participating entities obtain some benefits. Firstly, a signed reinsurance contract cannot be tampered with or removed. The risks of loss are eliminated since the entire contract is encrypted and distributed securely over the network. That is, the contractual conditions of a reinsurance (implemented on the blockchain) may not be adulterated. This provides a guarantee of liability and compliance in legal disputes. Secondly, the events of a claim will be clear and processed in real time using blockchain. Each action will be registered, and non-repudiation will be guaranteed. Reinsurance claims will be optimized, and the insurance activity will be reliable. In addition, the insurance company will be able to track and audit incidents, recover lost values, and protect itself from fraud due to diversion of funds. 
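To make the role of hash functions in this transparency and immutability argument concrete, the minimal sketch below (illustrative only, not the authors' implementation) chains reinsurance records with SHA-256 so that altering any earlier record invalidates every later one; the record contents are hypothetical.

```python
import hashlib
import json

def record_hash(record: dict, previous_hash: str) -> str:
    """SHA-256 over the serialized record plus the previous hash (the chaining link)."""
    payload = json.dumps(record, sort_keys=True) + previous_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Hypothetical ledger of reinsurance events.
chain = []
prev = "0" * 64  # genesis value
for event in [{"contract": "RC-001", "event": "signed"},
              {"contract": "RC-001", "event": "claim registered"}]:
    h = record_hash(event, prev)
    chain.append({"data": event, "prev": prev, "hash": h})
    prev = h

# Verification: recompute every link; any tampering breaks the chain from that point on.
chain_valid = all(block["hash"] == record_hash(block["data"], block["prev"]) for block in chain)
print("chain valid:", chain_valid)
```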
Decentralization Blockchain is a disruptive innovation because it secures electronic transactions between unknown entities, without the need for trusted third parties or any supervisory authority.
In other words, this technology defines a new paradigm: decentralized trust through cryptographic principles, without the intervention of a control authority [15]. In this way, blockchain provides several considerations to support reinsurance activity. Firstly, decentralized storage through P2P networks. The data is not stored on central servers; instead, all the information is replicated to the network nodes. Thus, the monopoly of information control is eliminated, and the failure of a single node (in the network) is minimized. Thus, service failure attacks are mitigated. Transparency and non-repudiation are increased because all encrypted data is shared with all participating entities. Behavior patterns and detection alerts can be implemented as a trigger on the blockchain. In this vein, encrypted data on contracts, claims, and premium calculation methods are encrypted for the entire network to be aware of. Each entity may know information about the payment capacity and compliance of the participating entities (insurer and reinsurer). Each claim process is known and cryptographically validated throughout the network. Each entity can generate a copy of the payment receipt and may automatically reject double-spend attacks (same payment for different insured events). Thus, the network is able to link the behavior to a digital identity, which allows taking measures against counterfeiting and corruption. Secondly, reliability and fault tolerance. Blockchain is prepared to take on problems related to communication availability. Failure of a node on the network does not alter the performance of operations. Any peer node can replace it, and in this way, the availability of the system is guaranteed. By using a blockchain-based smart contract, policyholders and reinsurers can benefit from automatic payment settlement, without delays, guaranteeing the authenticity of the invoices issued. In this way, the availability of blockchain guarantees the correctness of the operations and implements connectivity of the user interfaces. Thirdly, support to reputation score systems. Through blockchain interactions, reputation scores can be collected based on the behavior and compliance of the participants. In this way, metrics are provided to assess trust and fidelity. The insured can benefit because they obtain quality service scores from the reinsurer. At the same time, insurers will have valuable information for underwriting insurance activities. Thus, the measurements are transparent, free from interference, and private interests. In turn, fraud and professional malpractice are evidence for the reinsurers and insurers to be aware of. 3.2 Blockchain-Based Model to Support the Contracting of Reinsurance In this section, a blockchain-based model to support reinsurance is designed. In this context, Fig. 3, a layered overview is described. Firstly, P2P mining pool layer. This is made up of mining nodes that define and execute the consensus mechanism in the network. That is, they execute the protocol to ensure the integrity and reliability of system operations. Secondly, the blockchain layer. This is made up of blocks generated through the consensus process. Each block contains a header, and a set of transactions. Within its header each transaction can carry digital tokens such as smart contracts and digital currencies (for monetary exchange). Each block implements verification fields (e.g., previous block header hash, Merkle hash tree, amongst others) that guarantee the integrity
Fig. 3. A layered overview to supporting reinsurance contracting
and immutability of transactions over time. The correctness of the chain generation depends on the P2P mining pool layer. Thirdly, the smart contract layer. Functional specifications implemented as triggers that autonomously store and execute encodings. For this purpose, three smart contracts are defined: – Authorization rules and authentication. This represents the access door to the scheme, validates the digital identity, and the authorization of access to the resources. As the blockchain storage capacity can be saturated with large amounts of information, users encrypt and record most of their information in distributed (external) databases, and only one 256-bit hash (which serve as pointers to the data) is stored on the blockchain. In a transparent way, each user creates a smart contract with part of their personal information; in turn, it defines authorization rules, and grants read permission. Any user accesses the scheme through password-based authentication using hashes and public key values. This validation allows you to enable user permissions, and you can retrieve information according to the rules previously defined in the smart contract. In this way, unknown users can communicate with each other and work collaboratively, replacing the intervention of a trusted third party with the automation of smart contracts implemented on blockchain. – Reinsurance methods. In order to calculate the value of a policy premium, methods were coded into smart contracts. In turn, these were defined based on the reinsurance literature: facultative and treaty. In the case of facultative reinsurance, the calculation of the policy is carried out individually and negotiated separately from the other components. In the case of treaty reinsurance, contracts are analyzed under two usage
cases: proportional (quota share, surplus share) and non-proportional (per risk excess, per occurrence excess, and aggregate excess). In both cases, the insurer and reinsurer can validate and monitor the results obtained at each stage of the process. – Reinsurance contract. The contractual commitment (between insurer and reinsurer) is recorded. The identifier of the contract, the term of the contract, the limitations and exclusions, the type of reinsurance, and premium value are all detailed. However, this represents too much information that could saturate the blockchain storage capacity. For this purpose, a scheme similar to the authentication process is proposed. In other words, most of the information is encrypted and stored in external and distributed databases. Then only the contract identifier (represented by a 256-bit hash) is stored on the blockchain. This hash serves as a data pointer that links the policy identifier with the other information. Due to the immutability of the blockchain, this identifier cannot be modified, and through it, only the status of the reinsurance contract can be updated, and the historical detail of renewals is preserved over time. Fourthly, user layer. It is made up of the final entities of the scheme: insurer and reinsurer. An insurer requests a reinsurance quote for an insured event and/or portfolio. According to this premise, tasks are executed in the following sequence. First, they both authenticate and interact with smart contracts. Second, smart policy-related contracts and reinsurance methods are enabled. Any of the entities can launch a policy proposal (through a smart contract). Third, test cases are run to obtain premium quotes based on the requested policy detail. Taking these considerations into account, an information flow is generated and repeated until the insurer and reinsurer reach an agreement through the signing of the reinsurance smart contract; otherwise, the negotiation is abandoned. Fourth, if there is a mutual agreement, the policy is digitally signed and registered on the blockchain. Finally, much of the information is encrypted and stored in other distributed databases transparently for the user. In order to do this, each policy identifier will be represented by a 256-bit hash pointer, which links the data stored in the blockchain and other external databases. Figure 4 describes the automated reinsurance process within blockchain.
Fig. 4. Blockchain-based model to supporting reinsurance contracting
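As a sketch of the kind of logic the "Reinsurance methods" smart contract described above could encode for the proportional (quota share) case, the following function splits premium and losses by a fixed cession percentage. The figures are illustrative assumptions, not values from the paper, and the encoding of such rules in an actual smart contract language is left aside here.

```python
def quota_share(premium: float, losses: float, cession: float) -> dict:
    """Proportional (quota share) split: the reinsurer takes a fixed share of premium and losses."""
    if not 0.0 <= cession <= 1.0:
        raise ValueError("cession must be a fraction between 0 and 1")
    return {
        "reinsurer_premium": premium * cession,
        "reinsurer_losses": losses * cession,
        "insurer_premium": premium * (1 - cession),
        "insurer_losses": losses * (1 - cession),
    }

# Example: a 40% quota share on a 100,000 premium with 30,000 in losses.
print(quota_share(100_000, 30_000, 0.40))
# -> the reinsurer receives 40,000 of the premium and pays 12,000 of the losses
```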
4 Conclusions

This document explains how blockchain features can support reinsurance tasks, namely: automation, digital authentication and access rules, shared intelligence, transparency and immutability, and decentralization. Based on these aspects, a model is designed from two perspectives. Firstly, a bottom-up scheme based on four layers: P2P mining pool, blockchain, smart contract, and user. This overview explains the relationship and collaboration of the components that make up the system. Secondly, a blockchain-based model describes the automated reinsurance process within the blockchain. Thus, contractual rules, terms, and conditions are incorporated into smart contracts, which provide an audit trail. All communications are sent and validated throughout the network, without the need to send information separately. Consequently, this platform provides transparency in the calculation of premiums using the various reinsurance trading methods. In addition, payment settlement is immediate, non-repudiation of liabilities is guaranteed, and the problem of information asymmetry is reduced. In this way, the aim is to bring insurer and reinsurer closer together so that they establish commercial relationships through blockchain-based smart contracts. According to these considerations, the benefits of the blockchain-based model are evident. However, there are challenges that need to be addressed. First, blockchain is in the early stages of innovation, which raises many concerns, especially in risk management; for this reason, it is necessary to emphasize a constructivist approach that helps participants learn about the technology and its link with other business marketing perspectives. Finally, legal and regulatory efforts that allow the use and exploitation of this technology in the insurance market are essential.
References 1. Outreville, J.F.: Reinsurance. In: Theory and Practice of Insurance. Springer, Boston, MA, pp. 263–288 (1998) 2. Antonopoulos, A.: Mastering Bitcoin: Unlocking Digital Crypto-Currencies. O’Reilly Media Inc, Sebastopol, CA (2015) 3. Szabo, N.: Formalizing and securing relationships on public networks. First Monday. 2 (1997) 4. Kitchenham, B.: Guidelines for Performing Systematic Literature Reviews in Software Engineering. Durham, UK (2007) 5. Nakamoto, S.: Bitcoin: A Peer-to-Peer Electronic Cash System, https://bitcoin.org/bitcoin. pdf 6. Gatteschi, V., Lamberti, F., Demartini, C., Pranteda, C., Santamaría, V.: Blockchain and smart contracts for insurance: is the technology mature enough? Futur. Internet. 10, 8–13 (2018). https://doi.org/10.3390/fi10020020 7. Pagano, A.J., Romagnoli, F.: Implementation of blockchain technology in insurance contracts against natural hazards: a methodological multi-disciplinary approach. Environ. Clim. Technol. (2019). https://doi.org/10.2478/rtuect-2019-0091 8. Nam, S.O.: How much are insurance consumers willing to pay for blockchain and smart contracts? A Contingent Valuation Study. Sustain. 10, 1–11 (2018). https://doi.org/10.3390/ su10114332
9. Jia-lan, L., Wan-jun, Y., Huai-lin, Z., Xiao-yu, W., Zi-chen, W., Nai-meng, C.: Research and design of travel insurance system based on blockchain. In: 2019 International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS). IEEE, Shangia, China, pp. 121– 124 (2019) 10. Xiao, L., Deng, H., Tan, M., Xiao, W.: Insurance block: a blockchain credit transaction authentication scheme based on homomorphic encryption. In: Zheng, Z., Dai, H.-N., Tang, M., Chen, X. (eds.) BlockSys 2019. CCIS, vol. 1156, pp. 747–751. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-2777-7_61 11. Huang, Y., Yin, C.: A unifying approach to constrained and unconstrained optimal reinsurance. J. Comput. Appl. Math. 360, 1–17 (2019). https://doi.org/10.1016/j.cam.2019.03.046 12. Zhang, N., Jin, Z., Qian, L., Wang, R.: Optimal quota-share reinsurance based on the mutual benefit of insurer and reinsurer. J. Comput. Appl. Math. 342, 337–351 (2018). https://doi.org/ 10.1016/j.cam.2018.04.030 13. Mora, H., Pujol-López, F.A., Mendoza-Tello, J.C., Morales-Morales, M.R.: An educationbased approach for enabling the sustainable development gear. Comput. Hum. Behav. (2018). https://doi.org/10.1016/j.chb.2018.11.004 14. Mora, H., Sirvent, R.M.: Analysis of virtual currencies as driver of business marketing. In: Visvizi, Anna, Lytras, M.D. (eds.) RIIFORUM 2019. SPC, pp. 525–533. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30809-4_48 15. Mendoza-Tello, J.C., Mora, H., Pujol, F., Lytras, M.D.: Disruptive innovation of cryptocurrencies in consumer acceptance and trust. Inf. Syst. E-bus. Manag. pp. 195–222 (2019). https:// doi.org/10.1007/s10257-019-00415-w
Implementation of an Information Security Management System Based on the ISO/IEC 27001: 2013 Standard for the Information Technology Division Mario Aquino Cruz1(B) , Jessica Noralinda Huallpa Laguna1 , Herwin Alayn Huillcen Baca2 , Edgar Eloy Carpio Vargas3 , and Flor de Luz Palomino Valdivia2 1 Micaela Bastidas National University of Apurímac, Apurímac, Perú
[email protected], [email protected] 2 José María Arguedas National University, Apurímac, Perú {hhuillcen,fpalomino}@unajma.edu.pe 3 National University of the Altiplano, Puno, Perú [email protected]
Abstract. The Information Technologies Directorate of the Micaela Bastidas de Apurímac National University (UNAMBA) (UNAMBA Rationalization Office, «Organization and Functions Manual,» Abancay, 2018.), is in charge of managing the technologies and information; it also has to safeguard the computer assets under its responsibility. However, it does not have any plan, standard, or directive that allows correctly protecting the information. For this reason, our research aimed to contribute to improving the level of information security in the Information Technology Directorate (DTI) of UNAMBA, implementing the Information Security Management System based on the standard ISO/IEC 27001: 2013 (ISO 27001 - ISO 27001 Management Systems Software, «ISO Software,» 2018. [Online]. Available: https://www.isotools.org/normas/riesgos-y-seguridad/ iso-27001/. [Last access: July 21, 2018].). This standard allows the assurance of the confidentiality, availability, and integrity of information and information systems. Regarding the research methodology, the type of research is applied with a pre-experimental research design. As a methodology for the design of the Information Security Management System, the Deming PDCA method («PDCA Cycle (Plan, Do, Check and Act): Deming’s circle of continuous improvement | PDCA Home,» Pdcahome.com, 2018. [Online]. Available: https://www.pdcahome.com/ 5202/ciclo-pdca/. [Last access: July 25, 2018].) (Plan, Do, Check, Act) was used, which consists of identifying the computer assets, performing the analysis and risk management, to then establish response actions (controls) and mitigate the associated risks. Also, carry out information security policies. For the analysis and risk management, the MAGERIT III methodology (General Directorate for Administrative Modernization, Procedures and Promotion of Electronic Administration, MAGERIT - version 3.0. Information Systems Risk Analysis and Management Methodology., Madrid: Ministry of Finance and Public Administrations, 2012.) was used. The sample is n = 20 people. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 264–272, 2021. https://doi.org/10.1007/978-3-030-63665-4_21
As a result, the level of security risk before the implementation of the controls was 86.15% and dropped to 11.15% afterwards, a decrease of 75 percentage points. Likewise, there was an increase in security controls: before carrying out the risk treatment plan only 18 controls (15.78%) were in place, and afterwards 65 controls (57.01%), an increase of 41.23 percentage points. There was also an improvement in the level of information security training among DTI users: before the implementation of the ISMS only 48% of the respondents had knowledge about information security, whereas afterwards the figure rose to 95%. Keywords: Information security management system · ISO/IEC 27001 · ISO/IEC 27002 · Information assets
1 Introduction Organizations, whether public or private, small or large, generate all kinds of information of importance to them. This information, in most cases, is stored on different media, both physical and electronic, which is why they face, more frequently, risks and insecurities from a wide variety of sources, such as computer-based fraud, espionage, viruses, intrusion attacks, and denial of services, as well as sabotage, vandalism, fires, and floods. Information security [5] is achieved with the implementation of a set of controls, which can take the form of policies, procedures, practices, organizational structures, and software and hardware functions. This group of controls needs to be established, implemented, monitored, reviewed and improved to meet the specific security and business objectives of the organization. In Peru, the organizational culture in the field of information security is not abundant, and much less in the case of universities. Therefore, they do not take into account administrative aspects such as the modeling of their processes, which is important for carrying out their activities, as they are daily work activities, it is not considered useful, presenting situations in which they let the years pass and fail to evolve in this regard [6]. Currently, the Information Technology Directorate (DTI) of the National University Micaela Bastidas of Apurímac [1], is in charge of managing technologies and information through the Web and Intranet, analyzes and develops software and information systems, and implements networks and connectivity, oriented to support UNAMBA technology, also provides technical support service to users; in this way, the DTI fulfills the functions established in the current organization and functions manual. The DTI does not have an ISMS, so there are problems related to information security such as loss of information assets and disorder in the DTI, because administrative staff, teachers, and students frequently request support personnel since there is no incident control. There is no anti-virus license, so the PCs of administrative staff and laboratories of professional schools are infected with viruses and do not allow the normal development of their activities. Deficient Internet, since anyone can connect to the network, causing slowness in the operation of the information systems of UNAMBA. Lack of a process manual, generating disorder, and lack of coordination among staff when carrying out an activity,
task, or procedure. There is no confidentiality agreement in place for new members joining the DTI team, which could lead to disgruntled staff revealing confidential information. If this situation continues, the physical facilities, hardware, information systems, and the information contained in them would be seriously affected, which would disturb the normal development of activities within the university. For the reasons above, it is proposed to implement an information security management system based on the ISO/IEC 27001: 2013 standard, which will allow the protection of the information assets and the information managed within the flow of important processes. Section II describes the related works, Section III describes the methodology and materials used for the research, Section IV presents the results of the investigation after the implementation of the Information Security Management System based on the ISO/IEC 27001: 2013 standard, and Section V presents the conclusions of the research.
2 Related Works
Doria [7], in his research "Design of an Information Security Management System through the application of ISO 27001:2013 in the office of information systems and telecommunications at the University of Córdoba", had as its general objective: "Design an Information Security Management System by applying the international standard ISO/IEC 27001:2013 for the Systems and Telecommunications office of the University of Córdoba." The PDCA methodology was used, considering only its first two phases (Plan and Do), and the documentation produced according to the ISO 27001:2013 standard was the following: Support and Approval by Management, Scope of the Information Security Management System, Control Objectives and Security Controls, Information Security Policies, Risk Analysis, Risk Assessment and Risk Assessment Report, Statement of Applicability and Risk Treatment Plan, and Business Continuity Plan. It is concluded that the Information Security Management System based on the ISO 27001:2013 standard made it possible to determine the current status of the objective domains and security controls through a differential analysis, to classify the information assets, and to determine the potential risk level of each of them, identifying the most critical assets that require the most attention and security controls, given the high impact they have on the provision of services and the optimal operation of the institution's processes.
Guerrero [8], in his research "Information Security Management System (ISMS) based on ISO 27001 and 27002 for the Computer and Telecommunications Unit of the University of Nariño", had as its general objective: "Improve the management of information security through the application of the risk analysis process and the verification of security controls that allows the structuring of an Information Security Management System (ISMS) based on ISO 27001 and 27002 for the Computer and Telecommunications Unit of the University of Nariño". The PDCA methodology was used for the ISMS and MAGERIT for risk analysis and management. It is concluded that the Information Security Management System achieves a high level of quality in the information security processes.
Martinez [9], in his research "Management System to Improve Information Security in the Institution Industrial Services of the Navy", had as its general objective: "Improve Information Security in the Institution of Industrial Services of the Navy", and hypothesizes: "The Management System improves Information Security in the Institution of Industrial Services of the Navy". The research is applied, its level is descriptive, and it uses a single-group time-series design. It is concluded that the Management System improved information security in the Institution of Industrial Services of the Navy by 58% through the implementation of the Information Security Management System processes.
Alcantara [10], in his research "Security implementation guide based on the ISO/IEC 27001 standard, to support security in the computer systems of the PNP northern police station in the city of Chiclayo", had as its objective: "To contribute to improving the level of information security, supported by the ISO/IEC 27001 standard, at the Police Commissioner del Norte - Chiclayo institution".
Likewise, he formulates the hypothesis: "An Information Security Implementation Guide based on the ISO/IEC 27001 standard will support the improvement of security in the computer applications of the Police Commissioner del Norte - Chiclayo". The research is of an applied technological type. Finally, it is concluded that, with the implementation guide, it was possible to increase the level of security in
the computer applications of the police institution, which was reflected in the new security policies put in place, which benefited the institution and helped to increase its level of security.
Aliaga [11], in his research "Design of an information security management system for an educational institute", had as its general objective: "To design an Information Security Management System (ISMS) based on the international standards ISO/IEC 27001:2005 and ISO/IEC 27002:2005, adopting the current version of COBIT as the business framework". The PDCA methodology is also used. Finally, it is concluded that all the documentation required by the ISO/IEC 27001 standard was produced, which means that the general objective of designing an information security management system for an educational institute was met.
Zeña [12], in his research "International standard ISO 27001 for the management of information security in the central office of informatics of the UNPRG", has as its general objective: "To apply the International Standard ISO 27001 for the Management of Information Security in the IT Support process at the UNPRG Central IT Office". He also formulates the hypothesis: "If the International Standard ISO 27001 is applied, it effectively and efficiently improves the Management of Information Security in the Central Office of Informatics of the Pedro Ruiz Gallo National University". The research is of a technological-formal type and the methodology used is PDCA. Finally, it is concluded that, with the Information Security Management System in the IT Support Process of the UNPRG Central IT Office, the average level of risk is reduced from 6 to 4.4, that is, by 26.67%. This is achieved after applying the risk analysis and evaluation methodology, ending with the implementation of the ISO 27001 controls.
3 Methodology and Materials
3.1 Methodology
The research is applied, since the study draws on theories, basic research principles, and information security management; its level is explanatory, with a pre-experimental design (pre-test and post-test with a single group). The research used non-probability convenience sampling, since only those responsible for the areas and/or offices with the greatest relevance and information handling at the university were interviewed and surveyed (Table 1). The investigation procedure was carried out as follows:
A. Collection of information before the implementation of the ISMS.
B. Implementation of the ISMS based on the ISO/IEC 27001:2013 standard under the PDCA methodology.
C. Collection of information after the implementation of the ISMS.
D. Evaluation of the results obtained in the investigation.
3.2 Materials
The instruments support the techniques of the data collection process. The instruments used in the investigation are the following:
Table 1. Sampling offices

Office | Offices sampled | Quantity
Rectorate | Rectorate, Institutional Image Office, General Secretary Office, etc. | 06
Academic Vice-Rector's Office | Academic Vice-Rector's Office, Academic Services Directorate, University Welfare Directorate, etc. | 01
Vice-Rector's Office for Research | Vice-Rector's Office for Research, General Directorate for Research Institutes, Units and Centers, etc. | 01
General Directorate of Administration | Treasury Office, Human Resources Directorate, etc. | 04
Graduate School | Graduate School | 01
Faculties | Direction of professional academic schools, deanship, etc. | 07
TOTAL | | 20

• Notebook
• Tablet
• Photographic camera
• Records
• Questionnaires
• Check list

Likewise, the data collection techniques used are the following:

• Polls
• Interview
• Observation
4 Results
A pre-test and a post-test were carried out, that is, first in the traditional way and then with the implementation of the Information Security Management System based on the ISO/IEC 27001:2013 standard. The results obtained are explained below.
Objective: Reduce the levels of security risks in the DTI of UNAMBA. To verify that there was indeed a decrease in the risks to which the information assets of the Information Technology Directorate were exposed, 13 risks with Very High and High impact were considered (shown in Table 2), and a before-and-after analysis of the controls was performed. Figure 1 shows that, before the implementation of the controls, the 13 risks had Very High (100) or High (70) impact, but afterwards the risk level falls to more acceptable levels such as Medium, Low, and Very Low.
Fig. 1. Comparison of risk levels
Table 2. Results of numerical data pre-test and post-test.

N° Risk | Risk code | Asset
Risk 1 | R_D_CR | Backup copies
Risk 2 | R_S_WWW | Web page
Risk 3 | R_SW_EST | Standard software
Risk 4 | R_SW_BDS | Database managers
Risk 5 | R_HW_HOS | Servers
Risk 6 | R_HW_ROU | Router
Risk 7 | R_HW_SWH | Switch
Risk 8 | R_COM_INT | Internet
Risk 9 | R_COM_LAN | Local area network
Risk 10 | R_L_SIT | Information technology directorate
Risk 11 | R_P_ADM | Systems administrator
Risk 12 | R_P_COM | Communication manager
Risk 13 | R_P_DBA | Database administrator
Likewise, the general comparison in Fig. 2 reiterates this finding, showing a decrease of 75 percentage points. This allows us to affirm and verify that the objective of reducing the levels of security risks in the DTI of UNAMBA was indeed achieved.
Objective: Increase security controls at the UNAMBA DTI. To verify the increase in applied controls, a questionnaire covering the 114 controls was applied before and after the risk treatment plan was carried out. It
Fig. 2. General comparison of pre-test and post-test risk level
Fig. 3. Pre-test and post-test comparison of security controls level
should be noted that increasing the security controls is very important, since they reduce the risks to which the computer assets are exposed. Figure 3 shows that the implemented security controls increase from 15.78% to 57.01% and, consequently, the nonexistent controls decrease.
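To make the reported figures easier to retrace, the short sketch below reproduces the arithmetic behind them. Reading the 75% reduction as a percentage-point difference, and taking 114 as the number of reference controls (the Annex A controls of ISO/IEC 27001:2013), are assumptions drawn from the values quoted above rather than details given by the study.

```python
# Minimal sketch of the arithmetic behind the reported figures (assumptions noted above).
TOTAL_CONTROLS = 114  # ISO/IEC 27001:2013 Annex A reference controls

pre_risk, post_risk = 86.15, 11.15  # risk level (%) before / after the controls
print(f"Risk reduction: {pre_risk - post_risk:.2f} percentage points")  # 75.00

pre_applied, post_applied = 18, 65  # controls in place before / after the plan
pre_cov = 100 * pre_applied / TOTAL_CONTROLS    # ~15.79% (the paper reports 15.78%)
post_cov = 100 * post_applied / TOTAL_CONTROLS  # ~57.02% (the paper reports 57.01%)
print(f"Control coverage: {pre_cov:.2f}% -> {post_cov:.2f}% (+{post_cov - pre_cov:.2f} points)")
```

The small differences in the last decimal suggest the paper truncates rather than rounds its percentages; the reported 41.23% increase is simply 57.01% minus 15.78%.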
5 Conclusions
With the implementation of the Information Security Management System for the Information Technology Directorate (DTI) of UNAMBA, the main objective was achieved, which was
to improve the level of information security, thus ensuring the confidentiality, integrity, and availability of the information assets. The analysis and risk management carried out with the MAGERIT methodology, which includes the identification, design, and implementation of controls for the most critical risks, contributed to reducing the security risks to which the information assets of the Information Technology Directorate were exposed. The risk analysis and management work allowed the number of security controls to increase, which indicates that the High and Very High risks are now controlled and at an acceptable level. Training and education in information security for DTI staff and users is of vital importance, since the ISMS could later be implemented throughout the institution. After in-person training and the delivery of information material both physically and by institutional e-mail, the workers are aware that the information they handle is confidential and that they must ensure its integrity. Training in information security topics strengthens the implemented system, generates continuous improvement, and also made the implementation of the ISMS easier.
References
1. UNAMBA Rationalization Office: Organization and Functions Manual, Abancay (2018)
2. ISO 27001 - ISO 27001 Management Systems Software: ISO Software (2018). https://www.isotools.org/normas/riesgos-y-seguridad/iso-27001/. Accessed 21 July 2018
3. PDCA Cycle (Plan, Do, Check and Act): Deming's circle of continuous improvement | PDCA Home, Pdcahome.com (2018). https://www.pdcahome.com/5202/ciclo-pdca/. Accessed 25 July 2018
4. General Directorate for Administrative Modernization, Procedures and Promotion of Electronic Administration: MAGERIT - version 3.0. Information Systems Risk Analysis and Management Methodology. Ministry of Finance and Public Administrations, Madrid (2012)
5. Bertolín, J.A.: Information security. Networks, computing and information systems. Editorial Paraninfo (2008)
6. Alcantara Ramirez, M.A.: Strategy for adapting a university information security management system to cloud computing (2019)
7. Doria Corcho, A.F.: Design of an Information Security Management System by applying the ISO 27001:2013 standard in the office of information systems and telecommunications at the University of Córdoba, Montería (2015)
8. Guerrero Angulo, Y.C.: Information Security Management System (ISMS) based on ISO 27001 and 27002 for the computer and Telecommunications unit of the University of Nariño, Pasto (2014)
9. Martinez Ramos, J.: Management System to Improve Information Security in the Institution Industrial Services of the Navy, Nuevo Chimbote (2014)
10. Alcantara Flores, J.C.: Security implementation guide based on the ISO/IEC 27001 standard, to support security in the computer systems of the PNP northern police station in the city of Chiclayo, Chiclayo (2015)
11. Aliaga Flores, L.C.: Design of an information security management system for an educational institute, Lima (2013)
12. Zeña Ortiz, V.E.: International standard ISO 27001 for the management of information security in the central office of informatics of the UNPRG, Lambayeque (2015)
Technology Trends
Industry 4.0 Technologies in the Control of Covid-19 in Peru

Jessica Acevedo-Flores1(B), John Morillo1, and Carlos Neyra-Rivera2

1 Escuela Profesional de Medicina Humana, Universidad Privada San Juan Bautista, Lima, Peru
{jessica.acevedo,john.morillo}@upsjb.edu.pe
2 Universidad Científica del Sur, Lima, Peru
[email protected]
Abstract. Introduction: COVID-19 is a disease caused by the new coronavirus SARS-CoV-2, which has spread rapidly around the world. In this paper we first define the Industry 4.0 technologies used to control the pandemic. Objective: To describe the use of technology in the control of COVID-19 in Peru. Materials and methods: Systematic review of the technologies applied for COVID-19 management in Peru. The review was limited to material published in Spanish and English since January 2020, preferably from indexed sources and official websites. Results: Different technological solutions have been implemented in different countries for prevention and containment; some of them have been adopted by the Peruvian government, according to the country's level of technological and digital development. Discussion and Conclusion: This article reviews the main Industry 4.0 technological solutions implemented around the world to appropriately manage the global health emergency. Some of these solutions, applied for pandemic control, are used in the strategies designed by the Peruvian Government; however, other technologies identified as appropriate are not yet in use. In addition, some improvements in the solutions and policies are discussed, concluding that even if technology plays an important role in COVID-19 control, it is not the only factor affecting the results, as the human approach and state policies are crucial for better outcomes. Keywords: Coronavirus · COVID-19 · Industry 4.0 · Pandemic · Technology
1 Introduction
During the last weeks of 2019, a viral pneumonia outbreak emerged in Wuhan, China, rapidly spreading around the world. The virus was identified as a new version of SARS-CoV, belonging to the family Coronaviridae. Coronaviruses may cause respiratory infections in humans that range from the common cold to more serious illnesses such as Middle East respiratory syndrome (MERS) and severe acute respiratory syndrome (SARS) [1]. Only a few weeks after its appearance, SARS-CoV-2 had already become a global concern because it is highly contagious, and because Wuhan has a population of more than 11 million inhabitants and a high-traffic
international airport with a large number of connections, thus lending itself to a high potential for exporting the virus. On January 30, 2020, the new coronavirus - responsible for the so-called infectious disease COVID-19 - was declared a Public Health Emergency of International Concern by the World Health Organization (WHO). Along with the announcement, all the nations of the world were urged to take immediate sanitary measures to prevent coronavirus propagation. Despite the measures taken by many of the governments around the world, some of them later than others, the virus spread rapidly; 15 weeks after the outbreak, COVID-19 had already spread to around 184 regions of the world, with millions of confirmed cases and hundreds of thousands of deaths on every continent. The first positive COVID-19 case in Peru was announced in the first week of March. Ten days later, given the increasing propagation and the growing number of infected citizens (71 by that date), the Peruvian government declared a state of national emergency with mandatory social isolation (general quarantine), closing the borders and suspending international and interregional transportation for 15 days; the social isolation mandates were later repeatedly extended. Now, more than eight months after the Coronavirus outbreak and with a third of the planet's inhabitants in quarantine, the fundamental role that technology is playing as an ally of healthcare professionals and the scientific community to contain and analyze the Coronavirus is undeniable. The technological solutions used to overcome the health crisis belong to what is commonly known as the fourth industrial revolution or Industry 4.0 [2], e.g., the Internet of Things, artificial intelligence, and others shown in Table 1. Many governments, mainly in Asia (China, South Korea, Singapore, Taiwan) and in Western countries (USA, Euro-region), have followed a strong technological determinism, put into practice in the last decade by implementing plans towards technological advancement, focused mainly on building "smart cities" [3]. According to the Global Competitiveness Report presented by the World Economic Forum [4], the technological and digital development of Peru is not at the same level as that of many countries in Latin America. Furthermore, it does not correspond to the country's excellent macroeconomic progress (Global Competitiveness Index 2019: position 65 out of 141, two positions lower than in 2018); some results are shown in Fig. 1. However, since technological determinism has not been a priority in Peruvian government policies, Peruvian technological development is inferior to that of the countries that we will use as a reference in this review. It is important to note that each country has had a different approach regarding the use of technology for managing the crisis caused by the pandemic, based on factors such as available data infrastructure, law, government administration, social and political context, and, mainly, its level of technological development. As a result of this heterogeneity, the results obtained in the management of the health emergency are also different in every country. This article describes the use of technology for COVID-19 control in Peru; however, only technological solutions already implemented and in use have been considered, not innovations still under research for future development.
In the following section we compare these solutions with the ones used in other countries to suggest improvements; our main motivation is to show how technology may help to quickly implement strategies for better crisis management.
Table 1. Industry 4.0 technologies.

Technology: Internet of Things (IoT)
Description: Automated solution consisting in collecting, transferring, analyzing, and storing data. The collection process is done using sensors built into mobile phones, robots, and other devices that, after collecting, send the data to the central cloud server for analysis and decision-making.
Example of technological solution: Drones with an integrated infrared thermal imaging lens that measures temperature, helping in the diagnosis of people; medical drones [5], surveillance drones [6], delivery and multipurpose drones. Wearables: smart thermometers and smart helmets [7].

Technology: Big Data
Description: Analytical techniques capable of storing a large amount of data; they provide the basis for evaluation processes, rapid decision-making - almost in real time - and, in general, for implementing pandemic control strategies.
Example of technological solution: Dashboards based on ArcGIS technology (software in the field of Geographic Information Systems - GIS) to track the growth of COVID-19 cases in real time [8].

Technology: Artificial Intelligence (AI)
Description: Application that can instruct computers to use big data-based models to recognize, explain, and predict a pattern.
Example of technological solution: Chatbots or conversational robots [9], to provide accurate information (symptoms, transmission, preventive measures, healthcare, and travel announcements) and to answer user questions [10]. The information is provided to the chatbot, trained to respond with information from official sources or scientific articles from indexed databases, thus avoiding infodemics. These platforms represent a unique investment that allows hospitals to reduce costs in staff providing information; instead, professionals are engaged in the most urgent healthcare tasks.

Technology: Fifth Generation Mobile Telephony, 5G
Description: It is purely an evolution, and therefore an improvement, of 4G cellular service technology, which includes voice, text, and high-speed internet services. With new frequency bands, 5G achieves higher transmission speed and a greater number of connected devices, such as cell phones, "things" within the IoT such as sensors, appliances, robots, household water or electricity consumption readers, vehicles, and access to machines (M2M - Machine to Machine).
Example of technological solution: 5G is used to transmit computed tomography (CT) diagnostics.

Technology: Cloud Computing
Description: It includes online resources such as storage servers, databases, networks, and more.
Example of technological solution: Services such as Google Cloud [11] and Microsoft Azure [12] are currently used in hospitals.

Technology: Virtual Reality (VR) and Augmented Reality (AR)
Description: VR replaces the real world with a fully virtual world created by computer; AR does not replace physical reality; it overlaps virtual information by adding that digital information in the context of an existing reality.
Example of technological solution: Scientists use Augmented Reality to study the molecular structure of the virus, collaborating in a virtual environment shared among researchers. Telehealth services for diagnosis and medical treatment [13].

Technology: Robotics
Description: Autonomous robots perform tasks without the control of any external agency.
Example of technological solution: Robots are currently used to clean, sterilize places, distribute food, medicines, and supplies, or perform other tasks in hospitals, restaurants, and airports to reduce human contact and prevent the spread of the disease.
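As a toy illustration of the sensing-to-decision flow described in the IoT row of Table 1, the sketch below flags temperature readings above a fever threshold before they would be forwarded to a central server; the threshold, record format, and device names are illustrative assumptions, not part of any deployed system mentioned in this paper.

```python
# Toy illustration of an IoT-style screening step: flag readings above a
# fever threshold before forwarding them to a central server for follow-up.
# The threshold, record format, and device names are illustrative only.
FEVER_THRESHOLD_C = 38.0  # assumed cut-off, in degrees Celsius

readings = [
    {"device": "thermal-cam-01", "person": "P-001", "temp_c": 36.7},
    {"device": "thermal-cam-01", "person": "P-002", "temp_c": 38.4},
    {"device": "smart-helmet-02", "person": "P-003", "temp_c": 37.1},
]

alerts = [r for r in readings if r["temp_c"] >= FEVER_THRESHOLD_C]
for alert in alerts:
    # In a real deployment this is where the reading would be sent to the
    # cloud back end for analysis and decision-making, as described in Table 1.
    print(f"ALERT: {alert['person']} at {alert['temp_c']} C via {alert['device']}")
```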
Fig. 1. Digital diagnosis of Peru
2 Materials and Methods
A systematic search for information was carried out on the different technologies that have been used around the world, contrasting them with the Peruvian case, that is, with the technological solutions implemented in Peru. The search was carried out in the Scopus, Elsevier, IEEE, SciELO, and MEDLINE databases, as well as in national and international media and official websites of Peruvian public offices, in both Spanish and English. The inclusion criteria were that the technologies had already been implemented and that they mostly belong to Industry 4.0, excluding those that are still under research or development. A combined search was performed using the descriptors technology, Industry 4.0, Coronavirus, COVID-19, and pandemic.
3 Results
3.1 Technologies Used for COVID-19 Control and Management
3.1.1 IoT, Drones and Thermal Cameras
"Science and technology are the most powerful weapon in humanity's battle against disease," said Chinese President Xi Jinping, emphasizing that scientific development and
technological innovation are very useful tools for tackling today's global health crisis. China's approach is oriented to limiting human contact, using drones for mandatory temperature measurement of potentially infected citizens, disinfection of public spaces, safe transportation of tests to hospitals and laboratories, and broadcasting announcements to the community about hygiene measures and social distancing [14]. In Europe, police drones are used to monitor citizens who do not respect social distancing. At Shanghai Airport in China, special thermal cameras are being used to measure the temperature of several people simultaneously in order to detect passengers with symptoms of the disease (high temperature). These cameras are designed to perceive and analyze radiation to determine people's temperature; they convert caloric energy into an image visible to the human eye. In April 2020, the Peruvian Army announced the use of Matrice 200 modular drones. Through an integrated thermal camera, these devices detect the temperature of people who are within the inspection range of the lens [15]. Additionally, these drones have been used to detect the temperature of people in populous districts of Lima and Callao, as well as to disinfect markets identified as hubs of contagion. Huawei made an important donation of five thermal cameras to the Peruvian government [16]. These systems are able to measure the temperature of up to 15 people simultaneously and - as officially announced by the government - they would be located at high-traffic points. In China and Dubai, police agents use "smart" helmets, equipped with two cameras that provide infrared thermal images from five meters away, being able to measure the temperature of up to 200 people per minute and show the results in real time [7]. In May 2020, the airport in Rome began using this technology to detect passengers with high temperatures and restrict their transit. In Peru, mandatory temperature checks were implemented at the entry points of workplaces, airports, and terminals, among other highly transited places; however, measurements are made individually with digital thermometers.
3.1.2 Big Data and Dashboards
Since the beginning of the outbreak, the Environmental Systems Research Institute (Esri) has provided Johns Hopkins University with ArcGIS technology to create a dashboard with a map that tracks, almost in real time, the growth of SARS-CoV-2 cases worldwide. Similarly, since 26 January 2020, the WHO has enabled its official COVID-19 Dashboard based on the same technology [17]. Telemática, Esri's partner in Peru, has implemented the COVID-19 Cases Dashboard [18], including data provided by official public sources such as the Ministry of Health (MINSA) [19]. This data is classified by regions and demographic groups, and includes daily information on confirmed, discarded, and recovered cases, the number of hospitalized patients, the death toll, and the availability of Intensive Care Units (ICU). Taiwan's response to the pandemic consisted of a comprehensive strategy, with measures implemented at the beginning of the outbreak, such as location tracking of inhabitants highly likely to be carriers of the virus, based on their history of international travel from countries with high contagion rates. These early measures allowed the government to track, map, and contain the spread of the disease; more than five months
after the outbreak, Taiwan had recorded fewer than ten deaths from COVID-19. The Taiwanese government integrated the National Health Insurance Database with the Immigration and Customs Database to create Big Data analytics [20]; based on citizens' travel history and symptoms, the system generates real-time alerts to identify positive cases. Complementing Big Data, Quick Response (QR) code scanning and online reports with patient information were also included to classify travelers at high risk of infection. Furthermore, to ensure the virus's containment, high-risk or infected patients were isolated and monitored by the government, which tracked them through their mobile phones to ensure compliance with quarantine and imposed high fines in cases of non-compliance. In contrast, the former Peruvian Head of MINSA announced that the immigration authorities would list and track travelers arriving from countries considered to be at high risk of infection; thus, no digital database was implemented before patient zero [21]. After the arrival of the virus in Peru, the Intelligence and Data Analysis Unit of the Social Security System (EsSalud) began to implement a heat map based on forecast services, developed to identify the evolution of confirmed COVID-19 cases and the concentration of cases by area or district of Lima [22]. This heat map combines three databases: that of the National Institute of Health (INS), which provides the results of COVID-19 tests; data from the National Registry of Identification and Civil Status (RENIEC); and, finally, Google Maps.
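As an illustration of the kind of data fusion such a heat map requires, the sketch below joins positive test results to registered districts and aggregates the counts per district; the field names, sample records, and aggregation step are purely hypothetical and are not taken from the actual EsSalud pipeline.

```python
# Hypothetical sketch of the join behind a district-level heat map:
# positive test results (INS) are linked to registered addresses (RENIEC)
# and aggregated per district before being rendered on a map layer.
# All field names and records are illustrative, not the real EsSalud pipeline.
from collections import Counter

ins_tests = [  # (national ID, test result)
    ("40112233", "positive"),
    ("40112234", "negative"),
    ("40112235", "positive"),
]

reniec_registry = {  # national ID -> registered district
    "40112233": "La Victoria",
    "40112234": "Breña",
    "40112235": "La Victoria",
}

cases_per_district = Counter(
    reniec_registry[person_id]
    for person_id, result in ins_tests
    if result == "positive" and person_id in reniec_registry
)

# The per-district counts would then be geocoded (e.g. via a maps service)
# and drawn as heat-map intensities.
for district, cases in cases_per_district.most_common():
    print(district, cases)
```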
with research professors from Universitat Jaume I, created the “inclusive” COVID-19 chatbot, which ensures the availability of digital information about the pandemic for people with visual disabilities and older adults [28]. The Peruvian health system is just taking the first steps towards digital transformation. EsSalud implemented an application that allows patients to make appointments online, and provides other services such as digital medical history, telehealth, and drug-supply monitoring [29]. The Peruvian Government presented botPE in May 2020; this official chatbot provides up-to-date information concerning COVID-19 such as symptoms, and preventive measures among others [30]. 3.1.4 5G Technology The Fifth Generation of Cellular Telephony, 5G, has also been used as a solution against COVID-19; Sichuan University has adopted 5G to remotely perform computed tomography (CT) scanning and diagnostics, having already diagnosed 106 people in China. Scans are highly effective in diagnosing asymptomatic patients, allowing their identification and treatment at an early stage of infection thus preventing further spread of the disease [14]. Another hospital in China began using scanning machines with a sophisticated AI-based system to analyze CT images of COVID-19 patients in 20 s with 96% accuracy. Huawei with its Cloud service provides the EIHealth platform [31], which includes services such as genome detection, silicon antiviral drug detection and AIassisted computed tomography (CT) analysis; the latter has been successfully used as a COVID-19 Rapid Diagnostic System in many healthcare centers in Ecuador, Colombia, Panama and Honduras. In Peru, 5G technology is still in the implementation phase, with the promise of integrating mobile devices, smart devices at home, autonomous systems, and wearables, all reinforced with AI and the virtual cloud. However, according to the latest report by the Supervisory Authority for Private Investment in Telecommunications (Osiptel), there are only about 20,630 antennas implemented in Peruvian territory, which are insufficient to implement 5G, and it is estimated that it would require more than 200,000 antennas, representing an investment of US$ 27 billion [32]. 3.1.5 Mobile Contact Tracking Apps Contact tracking and contact tracing are perhaps the most common uses of technology worldwide to identify people (“contacts”) who may have been in contact with a person infected with COVID-19. Many proposals have been developed, usually with the support of the government, and with different characteristics, such as the level of intrusion depending on whether Bluetooth signals, Global Positioning System (GPS) or antenna triangulation are used to record proximity between mobile phones (their users). The difference between GPS and Bluetooth is that the latter offers greater accuracy in identifying nearby devices, outdoors and indoors efficiency, low coverage in buildings or underground parking lots, less power consumption, and with no network connection required; additionally, geographic location or movement information from mobile carrier users are not stored; that is, it guarantees personal information privacy.
In China, tech giants Alibaba and Tencent, allied with the Chinese government, developed the HealthCode application, which was quickly implemented in about 200 cities and allowed strict citizen surveillance [33]. In South Korea, the Corona app sends an alarm when the owner of the mobile phone comes within 100 m of a confirmed infected patient. Similarly, in Singapore, the contact tracking app TraceTogether has been developed to identify whether the cellphone user has been close (2-5 m distance) to a COVID-19 infected person for at least 30 min. The app also notifies people who might have been infected with COVID-19 to ensure they follow quarantine orders and have their medical diagnostic tests performed [34]. Forty days after COVID-19 was declared a pandemic, 29 countries worldwide were already using digital contact tracing as a containment technique, including Peru. A team of researchers from the University of Oxford carried out simulations with this type of app, verifying that, in order to substantially reduce the number of new infections with the virus, at least 60% of the population needs to use the digital contact tracking application [35]. It is important to consider that coronavirus transmission occurs before symptoms appear; therefore, this type of technological solution should be used by 80% of all mobile phone users, requiring that smartphone owners download, enable, and permanently use the app [36]. Concerns regarding personal data privacy are stronger in Europe and the USA than in Asia. The European Commission considered a solution that did not compromise the privacy of citizens, using anonymous and aggregated data. The EU supports initiatives that combine Big Data with AI to contain the pandemic, and citizens' data are collected only because it is an exceptional situation of risk for public health [37]. The Massachusetts Institute of Technology (MIT), Harvard University, and Facebook are developing Safepaths [38], an app similar to those in Asia but designed with individual rights and users' privacy in mind. Other countries using digital contact tracking or tracing apps are Australia, Colombia, the Czech Republic, Hungary, India, Israel, Malaysia, Norway, Saudi Arabia, France, Finland, Italy, and Russia, among others. In the first week of April 2020, the Peruvian government presented the new geolocation strategy "Peru in your hands" (Peruentusmanos app) [39]. The official app, available in the Play Store, first requires the user's personal identification, followed by questions regarding the symptoms of COVID-19, and then accurately geolocates the risk zones in real time through virtual maps marked with red or orange circles in areas of higher or lower risk, respectively. Months after its release, the app had been downloaded by about 1.5 million people, representing only 5% of the Peruvian population; thus, there are concerns about the effectiveness of the technological solution, given the roughly 80% adoption required for it to be really effective (as previously explained). Nonetheless, this concern is widespread: in France adoption only reached 5%, and 20% in Germany; Iceland, Ireland, and Singapore reach less than 40%; in the latter, for example, although it is a society with greater digital development and high-tech education, only 30% of the population uses the TraceTogether app.
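To make the proximity-plus-duration rule concrete, the sketch below flags contacts whose cumulative close-range exposure reaches a threshold, in the spirit of the 2-5 m / 30 min criterion cited above for TraceTogether; the thresholds, record format, and identifiers are illustrative assumptions, not the actual protocol of any of the apps mentioned.

```python
# Illustrative sketch of the exposure rule described above: flag an
# encounter when two devices stay within a distance threshold for a
# minimum cumulative duration. Thresholds and the log format are
# assumptions for illustration, not TraceTogether's actual protocol.
from dataclasses import dataclass

@dataclass
class ProximityRecord:
    other_device: str      # anonymised identifier broadcast over Bluetooth
    minutes: int           # duration of the sighting
    est_distance_m: float  # distance estimated from signal strength

DISTANCE_THRESHOLD_M = 5.0  # "close contact" distance (text cites 2-5 m)
MIN_EXPOSURE_MINUTES = 30   # minimum cumulative contact time (text cites 30 min)

def exposed_contacts(log: list[ProximityRecord]) -> set[str]:
    """Return device IDs whose close-range contact time reaches the threshold."""
    minutes_near: dict[str, int] = {}
    for rec in log:
        if rec.est_distance_m <= DISTANCE_THRESHOLD_M:
            minutes_near[rec.other_device] = minutes_near.get(rec.other_device, 0) + rec.minutes
    return {dev for dev, total in minutes_near.items() if total >= MIN_EXPOSURE_MINUTES}

log = [
    ProximityRecord("device-A", 20, 3.0),
    ProximityRecord("device-A", 15, 4.5),
    ProximityRecord("device-B", 10, 2.0),
]
print(exposed_contacts(log))  # {'device-A'}
```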
3.1.6 Robotics and Electronic Devices
Pudu Technology from China, which generally uses robots in the catering industry, is now using its robots in more than 40 hospitals across the country [40]. In addition, Blue Ocean
Robotics has deployed its autonomous robots to eliminate viruses and bacteria by using ultraviolet light. "Robot dogs" are used on the streets of Singapore to strengthen security and to verify that the social distancing measures intended to prevent transmission of the virus are observed by the population. A robot dog called "Spot", created by Boston Dynamics, has a built-in camera capable of detecting the number of people around it. These robots are commonly found patrolling Bishan-Ang Mo Kio Park, broadcasting recorded messages to warn passers-by [41]. Everett Regional Medical Center in Washington has implemented a system using a robot called Vici that features a screen, speakers, and a microphone, allowing healthcare staff to communicate with patients [42]. The robot, developed by InTouch Health, resembles a tablet on wheels, has an integrated stethoscope, and allows basic tests such as temperature measurement. In Hong Kong, airport officers provide electronic bracelets to travelers arriving in the region; their location is then checked to ensure that they are following the mandatory two-week quarantine. In case of non-compliance with social isolation, the bracelet wearer is notified and persuaded; this surveillance technique was an attempt to contain the number of imported cases in Hong Kong [43]. In other countries such as Taiwan, this control was carried out without a bracelet, directly through the telephone service providers and synchronized with the quarantined citizen's mobile phone.
3.1.7 Other Technologies
Besides the technologies mentioned above, there are other innovations based on Virtual Reality and Augmented Reality, Cloud Computing, 3D Scanning, and 3D Printing, which have been used in countries such as the United Kingdom or the USA, for example:
• Oxford Nanopore Technologies has provided a set of sequencing products to assist with the research and genetic epidemiology of the SARS-CoV-2 virus.
• Scientists from the University of California in San Francisco used Augmented Reality to study the molecular structure of the virus, collaborating in a virtual environment shared with other researchers.
• Medopad and physicians from Imperial College and Johns Hopkins University are working to develop a remote platform for monitoring chronic and vulnerable patients.
• 3D printers are used to produce ventilators and biosecurity protectors. In Peru, some entities are using 3D printing to satisfy personal protective equipment requirements, thus trying to cope with the shortage of biosecurity material.
4 Discussion
This paper identifies technologies used in the strategies to control the Coronavirus pandemic, providing a description of the technological solutions applied by different countries around the world in contrast to the Peruvian management of the health crisis. This document studies intelligent diagnostic and surveillance systems, and technological tools
for isolation and treatment, including some complex solutions that combine the use of Artificial Intelligence, Big Data, the Internet of Things, and robotics, among others. From the description given in the previous section, it is clear that the countries that in recent years have invested more in technology (China, South Korea, Singapore, Taiwan, and Japan), in developing the human talent to use it, and in R + D + I (Research, Development and Innovation) are the ones currently obtaining better results in this crisis, and they are expected to emerge from the pandemic as the most strengthened countries. The strategies these countries are using are reinforced by technological systems and solutions belonging to the fourth industrial revolution, or Industry 4.0, technologies. Based on the experience of other countries, there are several benefits of using strategies that include Industry 4.0 technological solutions to mitigate the pandemic, such as better planning and coordination of activities, government integration, comprehensive monitoring of epidemiological growth, surveillance and strict control of restrictions, innovation in the production of medical equipment and in the manufacture of protective elements for healthcare staff, virtual reality for educational and research purposes, augmented reality to improve the work experience and follow protocols, digital technologies to help the academic community, robotics for the treatment of infected patients and to reduce the risk of contagion, digital information through chatbots to ensure transparency and up-to-date information while avoiding putting the mental health of the population at risk with fake news or online scams, and smart tracking systems to detect contagions and plan smart fence strategies, among others. Regarding the Peruvian case, although some of the technologies used in the rest of the world have been deployed, it is essential to improve the implementation of some of these solutions or to integrate them with others to enhance the strategy and obtain better results. For example, the Peruentusmanos application has not been exploited to its full potential, so it is important to publicize its advantages so that its use becomes massive. This is not the only concern regarding the app [44]: despite requesting Bluetooth permissions, contact tracing technology has not yet been implemented, and GPS tracking could have been done without an app, only by requesting information from telecommunications companies - similar to the European Commission's approach - requiring only aggregated and anonymous data and thus complying with the Peruvian legislation on the Protection of Personal Data. Despite initial errors and inaccuracies in the terms and conditions of use of the app (they were updated twice), the Peruvian government announced future versions with features such as Bluetooth-based contact tracing and alert messages in the event of having been near a confirmed infected person. Furthermore, a fourth version of the app will include interpretation of the collected statistics for making strategic decisions in the fight against the pandemic [45]. On the other hand, to ensure efficiency in the use of acquired or donated equipment (thermal cameras, drones, diagnostic tests), the Peruvian government should design a strategy for their proper deployment; for example, thermal cameras to measure temperature and drones should be used in high-traffic places such as metro stations, markets, bank branches, and drugstores.
Another strategy that the Peruvian government could imitate from other countries concerns transparency and access to information related to COVID-19. In an analysis published on May 13 by the Chilean NGO Ciudadanía Inteligente, based on data from 20
countries, the results show a high level of information transparency in Colombia, Mexico, Chile, and Peru, which occupy the first positions regarding public access to "basic information" [46]. According to the results, Latin American governments do not publish enough information in times of pandemic: "We see that very few countries exceed 50% in our index. No countries reach 100% and only two (Colombia and Mexico) are near 80%". To improve Peru's score, one or two chatbots could be implemented to provide up-to-date and official replies to enquiries on COVID-19. This technological solution is particularly relevant in times of uncertainty, as it is important to preserve and strengthen the population's mental health by mitigating the negative effects of the infodemic (an excessive volume of information about the pandemic, including fake news and misinformation) [47]. Applications based on virtual reality and augmented reality can be used to develop inductions on biosecurity protocols for healthcare staff. In addition, Industry 4.0 technologies could be applied to guarantee the supply chain, looking for decentralized alternative solutions to prevent crowding in markets through systematized control with robots, drones, and Radio Frequency Identification (RFID). Peru's main limitations arise because its governments did not pursue technological determinism in past decades: its development in this matter is still low, there are no appropriately connected "smart cities", the data infrastructure is not robust, modernization and innovation in government are still in progress, the national health system is fragmented, and telehealth services are far from fully implemented. Finally, it is important to note that the technological factor is not the only one that shapes the results; the human approach has been critical in this health crisis, including the early response (that is, the quick and smart decisions made by organizations and governments), the capacity of national healthcare systems, the level of military involvement and civil society commitment, geographical density and the level of household overcrowding, social interaction habits, and family logistics.
5 Conclusion
This paper describes and contrasts the technologies used in Peru and other countries of the world to control the COVID-19 pandemic. Based on the reviewed information, several technological solutions have been successfully applied in Peru, as in the rest of the world, for pandemic control, and there are other digital tools, not yet in use, that offer great potential. We strongly recommend considering all the available technology when designing better strategies, as the results in Peru have not been as expected.
Contribution. The main contribution of this review is to provide observations and recommendations that could be considered in the decision-making processes of the Peruvian government and of the organizations responsible for designing strategies to obtain better results in pandemic control. Additionally, this paper shows how important technology is in crisis management, and therefore that Latin American countries, especially Peru, need to invest more in research and innovation, digital evolution, and telecommunications infrastructure.
References 1. World Health Organization. https://www.who.int/health-topics/coronavirus#tab=tab_1. Accessed 09 June 2020 2. Javaid, M., Haleem, A., Vaishya, R., Bahl, S., Suman, R., Vaish, A.: Industry 4.0 technologies and their applications in fighting COVID-19 pandemic. Diab. Metabolic Syndrome Clin. Res. Rev. 14(4), 419–422 (2020). https://doi.org/10.1016/j.dsx.2020.04.032 3. Kummitha, R.K.R.: Smart technologies for fighting pandemics: the techno- and human- driven approaches in controlling the virus transmission. Newcastle Business School, Northumbria University, Newcastle, UK (2020). https://doi.org/10.1016/j.giq.2020.101481 4. World Economic Forum. The Global Competitiveness Report 2019, pp. 458–461. https:// www3.weforum.org/docs/WEF_TheGlobalCompetitivenessReport2019.pdf. Accessed 02 Aug 2020 5. Zema, N.R., Natalizio, E., Ruggeri, G., Poss, M., Molinaro, A.: Medrone: on the use of a medical drone to heal a sensor network infected by a malicious epidemic. Ad Hoc Netw. 50, 115–127 (2016). https://doi.org/10.1016/j.adhoc.2016.06.008 6. Ding, G., Wu, Q., Zhang, L., Lin, Y., Tsiftsis, T.A., Yao, Y.D.: An amateur drone surveillance system based on the cognitive internet of things. IEEE Commun. Mag. 56(1), 29–35 (2018). https://doi.org/10.1109/MCOM.2017.1700452 7. Abdulrazaq, M., Zuhriyah, H., Al-Zubaidi, S., Karim, S., Ramli, R., Yusuf, E.: Novel covid19 detection and diagnosis system using IoT based smart helmet. Int. J. Psychosoc. Rehabi. 24(7), 2296–2303 (2020). https://doi.org/10.37200/IJPR/V24I7/PR270221 8. Shah, P., Patel, C.R.: Prevention is better than cure: an application of big data and geospatial technology in mitigating pandemic. Trans. Indian Natl. Acad. Eng. 5(2), 187–192 (2020). https://doi.org/10.1007/s41403-020-00120-y 9. Ting, D., Carin, L., Dzau, V., Wong, T.Y.: Digital technology and COVID-19. Nat. Med. 26(4), 459–461 (2020). https://doi.org/10.1038/s41591-020-0824-5 10. Chiuchisan, I., Chiuchisan, I., Dimian, M.: Internet of things for e-health: an approach to medical applications. In: International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM), pp. 1–5 (2015). https://doi.org/10.1109/IWCIM.2015.7347091. 11. Google. Rapid Response Virtual Agent. Google Cloud (2020). https://cloud.google.com/sol utions/contact-center/covid19-rapid-response. Accessed 06 Aug 2020 12. Microsof. Azure (2020). https://azure.microsoft.com/es-mx/overview/covid-19/. Accessed 06 Aug 2020 13. Tavakoli, M., Carriere, J., Torabi, A.: Robotics, smart wearable technologies, and autonomous intelligent systems for healthcare during the covid-19 pandemic: an analysis of the state of the art and future vision. Adv. Intell. Syst. 2(7), 1–7 (2020). https://doi.org/10.1002/aisy.202 000071 14. Miaomiao, M.: When Life Becomes Sci-F. Advanced technology contributes to the battle against epidemic. Beijing Review. https://www.bjreview.com/Nation/202003/t20200306_800 195892.html. Accessed 06 Mar 2020. 15. Plataforma Digital Única del Estado Peruano. COVID-19: drones del Ejército detectan personas con fiebre. TV Peru, 4 April 2020. https://www.tvperu.gob.pe/noticias/locales/covid19-drones-del-ejercito-detectan-personas-con-fiebre. Accessed 18 July 2020 16. Plataforma Digital Única del Estado Peruano. Ministerio del Interior recibe importante donación para apoyar contención del Covid-19. El Peruano, 18 April 2020. https://www.gob. pe/institucion/mininter/noticias/126135-ministerio-del-interior-recibe-importante-donacionpara-apoyar-contencion-del-covid-19. Accessed 18 June 2020
17. Kamel, M.N., Geraghty, E.M.: Geographical tracking and mapping of coronavirus disease COVID-19/severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemic and associated events around the world: how 21st century GIS technologies are supporting the global fight against outbreaks and epidemics. Int. J. Health Geographics 19(1), 8 (2020). https://doi.org/10.1186/s12942-020-00202-8 18. Telemática S.A. (2020). https://www.telematica.com.pe/usando-datos-para-mapear-infecc iones-camas-de-hospital-y-mas/?fbclid=IwAR0Sm4PtjLsyiRr8XZN5YRUmuQEYEyprmw iBL8n2IR-c6dubC2r4NxMzRs8. Accessed 02 Aug 2020 19. Plataforma Digital Única del Estado Peruano. Sala Situacional COVID-19 Perú. MINSA, 31 July 2020. https://covid19.minsa.gob.pe/sala_situacional.asp. Accessed 09 June 2020. 20. Wang, C.J., Ng, C.Y., Brook, R.H.: Response to COVID-19 in Taiwan big data analytics, new technology, and proactive testing. Natl. Libr. Med. 323(14), 1341–1342 (2020). https://www. ncbi.nlm.nih.gov/pubmed/32125371. Accessed 03 Mar 2020 21. Plataforma Digital Única del Estado Peruano. El Peruano, March 2020. https://elperuano. pe/noticia-ministra-salud-aislamiento-domiciliario-es-una-de-estrategias-permite-un-mejorcontrol-90907.aspx. Accessed 09 June 2020 22. Plataforma Digital Única del Estado Peruano. EsSalud: Seguro Social de Salud del Perú (2020). https://noticias.essalud.gob.pe/?inno-noticia=mapa-de-calor-de-essalud-senala-quedistritos-de-la-victoria-brena-y-carmen-de-la-legua-presentan-mayr-indice-de-contagiospor-km2-de-covid-19. Accessed 12 July 2020 23. The New York Times. https://www.nytimes.com/2020/04/30/technology/coronavirus-treatm ent-benevolentai-baricitinib.html. Accessed 30 June 2020 24. Intelligent Ultrasound. USA. https://www.intelligentultrasound.com/bodyworks-eve-covid19-module-used-to-train-front-line-clinicians-in-nyc/. Accessed 09 June 2020 25. United Nation. https://www.un.org/es/coronavirus/articles/onu-une-fuerzas-con-nuevos-soc ios-para-compartir-informacion. Accessed 09 June 2020 26. Government of India (2020). #IndiaFightsCorona COVID-19. https://www.mygov.in/cov id-19. Accessed 09 Aug 2020 27. Gobierno de la Ciudad dhttps://covid19.cdmx.gob.mx/comunicacion/nota/lanzan-gobiernode-ciudad-de-mexico-y-facebook-chatbot-victoria-whataapp-para-informar-medidas-sobrecovid-19e Mexico Homepage. Lanzan Gobierno de Ciudad de México y Facebook chatbot “Victoria” en WhatsApp para informar medidas sobre COVID-19, 30 June 2020. . Accessed 09 Aug 2020 28. Universitat Jaime I. Semanticbots crea un chatbot inclusivo y accesible sobre el COVID-19. Fundación General FUGEN, 20 Apr 2020. https://espaitec.uji.es/semanticbots-crea-un-cha tbot-inclusivo-y-accesible-sobre-el-covid-19/. Accessed 09 Aug 2020 29. RPP Noticias. Perú. https://rpp.pe/tecnologia/apps/asi-funciona-el-aplicativo-de-citas-de-ess alud-noticia-1137163. Accessed 17 July 2018 30. Plataforma Digital Única del Estado Peruano. El Peruano. PCM y sector privado apuestan por lo digital. Más de 30 empresas y universidades articulan y desarrollan soluciones, 18 May 2020. https://elperuano.pe/noticia-pcm-y-sector-privado-apuestan-por-dig ital-95991.aspx. Accessed 17 July 2018 31. DPL: Digital Policy & Law Group News. https://digitalpolicylaw.com/con-inteligencia-artifi cial-y-la-nube-huawei-aporta-en-la-lucha-contra-el-coronavirus-2/. Accessed 09 Aug 2020 32. Diario Gestión. Perú. https://gestion.pe/economia/telefonia-movil-cuantas-antenas-necesi tara-el-peru-para-la-tecnologia-5g-noticia/. Accessed 16 July 2020 33. CNN en español. CNN Business. 
https://cnnespanol.cnn.com/2020/04/16/china-estaluchando-contra-el-coronavirus-con-un-codigo-qr-digital-asi-funciona/. Accessed 16 July 2020ss 34. TechwireAsia. https://techwireasia.com/2020/03/the-tech-leading-asias-fight-against-thecoronavirus/. Accessed 24 Aug 2020.
Industry 4.0 Technologies in the Control of Covid-19 in Peru
289
35. Armstrong, S.: Covid-19: deadline for roll out of UK’s tracing app will be missed. BMJ 369, m2085 (2020). https://doi.org/10.1136/bmj.m2085 36. Servick, K.: COVID-19 contact tracing apps are coming to a phone near you. How will we know whether they work? American Association for the Advancement of Science. (2020). https://www.sciencemag.org/news/2020/05/countries-around-world-are-rol ling-out-contact-tracing-apps-contain-coronavirus-how. https://doi.org/10.1126/science.abc 9379. Accessed 24 Aug 2020 37. Parlamento Europeo. COVID-19: cómo garantizar la privacidad y la protección de datos en aplicaciones móviles. Noticias, sección Sociedad, 6 May 2020. updated on July 16. https://www.europarl.europa.eu/news/es/headlines/society/20190410STO36615/est adisticas-sobre-la-mortalidad-en-las-carreteras-europeas-infografia. Accessed 24 Aug 2020 38. Massachusetts Institute of Technology MIT Safepaths Homepage. https://safepaths.mit.edu. Accessed 09 June 2020 39. Plataforma Digital Única del Estado Peruano. Gobierno recomienda usar aplicativo “Perú en tus manos” para saber todo sobre covid-19. Andina, 27 May 2020. https://www.gob.pe/ins titucion/pcm/noticias/111820-gobierno-implementa-aplicativo-para-identificar-situacionesde-riesgo-y-detener-cadena-de-contagio-por-covid19. Accessed 09 June 2020 40. IA Observatorio. https://observatorio-ia.com/inteligencia-artificial-big-data-y-tecnologiapara-combatir-el-coronavirus. Accessed 16 July 2020 41. CNN New. https://edition.cnn.com/2020/05/08/tech/singapore-coronavirus-social-distan cing-robot-intl-hnk/index.html. Accessed 08 Aug 2020 42. The Guardian. https://www.theguardian.com/us-news/2020/jan/22/coronavirus-doctors-userobot-to-treat-first-known-us-patient. Accessed 23 July 2020 43. The New York Time. https://www.nytimes.com/2020/04/08/world/asia/hong-kong-corona virus-quarantine-wristband.html. Accessed 08 Aug 2020. 44. Hiperderech. https://hiperderecho.org/2020/04/asi-funciona-la-aplicacion-movil-peru-entus-manos-tecnica-y-legalmente/. Accessed 04 Aug 2020 45. Plataforma Digital Única del Estado Peruano. Andina, 27 May 2020. https://andina.pe/age ncia/noticia-coronavirus-proyecto-para-prediagnostico-covid19-usa-inteligencia-artificial794607.aspx. Accessed 26 July 2020 46. ONG Ciudadanía Inteligent. Chile. https://ciudadaniai.org/campaigns/covid19. Accessed 13 Aug 2020 47. World Health Organization. Immunizing the public against misinformation (2020). https:// www.who.int/news-room/feature-stories/detail/immunizing-the-public-against-misinform ation. Accessed 13 Aug 2020
Proposal for a Software Development Project Management Model for Public Institutions in Ecuador: A Case Study José Antonio Quiña-Mera1,2(B) , Luis Germán Correa Real1 , Ana Gabriela Jácome Orozco1 , Pablo Andrés Landeta-López1 , and Cathy Pamela Guevara-Vega1,2 1 Universidad Técnica del Norte, Facultad de Ingeniería en Ciencias Aplicadas, Ibarra, Ecuador
[email protected], {lgcorrea,palandeta,cguevara}@utn.edu.ec, [email protected] 2 Network Science Research Group e-CIER, Ibarra, Ecuador
Abstract. Many software projects worldwide fail due to a lack of quality in two aspects: the development process and the resulting software product. To face this problem, the scientific community and the industry have developed and made available methods and best practices to control these two aspects. However, each country's regulations must also be considered, and these may modify or complement such methods. This research seeks to define a software development project management model based on the Government by Results (GBR) framework, complemented by methodologies and best practices, to guarantee quality and provide a comprehensive software development and maintenance service for internal and external clients of public institutions in Ecuador. Action research was applied to carry out the study, structured as follows: 1. Research design, 2. Theoretical foundation, 3. Development of the management model, 4. Evaluation of the management model (case study carried out in the public institution Yachay E.P.), 5. Results. As a result, an artifact framework for the management of software development projects was obtained. The case study (software project) shows that for every dollar invested in the project, $1.64 is recovered during the year after its execution. Besides, the management model evaluation rated the project with a medium level of satisfaction, since it optimizes the development area's critical resources; it also found that the perception of the service should be improved. Keywords: Management model · PMI · GBR · ITIL · COBIT · AGILE · Software engineering · Software quality
1 Introduction
The current market is very competitive; it is not enough to produce and distribute products and/or services; it is necessary to do so with quality to achieve customer acceptance. According to Moreno, quality has no single definition; it is only recognized [1]. However, quality in software is a complex concept that is not directly comparable to that of a product
Proposal for a Software Development Project Management Model
291
[2]. For this reason, finding quality in software has become one of the main strategic objectives for organizations, since their most important processes, and often their survival, depend on the flawless operation of the software [3]. In the software development process, it is essential to establish and implement a reference framework or methodologies that allow quality to be measured throughout the software product's construction [4]. Currently, most software projects are governed by one or more methods, and the use of these depends on the context in which the project is developed. The existing methodologies in the market are of various types (traditional, agile, real-time, among others) but are always governed by processes and aimed at the control and assurance of the quality of the software product [5]. In Ecuador, in addition to using the methodologies above, state regulation applies to software projects carried out in state institutions, which must be framed within the Government by Results (GBR) regulations [6]. This research aims to define a software development project management model that ensures product quality and complies with the GBR regulations in public institutions in Ecuador. The management model's design is carried out by combining various methodologies and good development practices complemented by the GBR regulations. The model is applied in the context of a case study in the public institution Yachay E.P. This work is structured as follows: Sect. 1, Introduction, where the problem, justification, and objective of the investigation are defined; Sect. 2, Materials and Methods, which describes the research design and methodology, the conceptual basis, the development of the management model (research proposal), and the case study in which it was implemented; Sect. 3, Results, which shows the results obtained from the case study; Sect. 4, Discussion, with similar works; and finally, Sect. 5, Conclusions and recommendations.
2 Materials and Methods
This section defines the research methodology phases, the participants, and the specific objectives of the project (see Table 1).
2.1 Research Design
Research Type. In this work, field action research was chosen, with the researcher as an active actor within the context of public institutions in Ecuador, to find solutions to software development problems. The study was carried out in the software development area of Yachay E.P., Ecuador.
Research Method. The chosen method to evaluate the research and the proposed management model was the case study. This method was applied to a software development project that automates the document storage and archiving process of Yachay E.P.
Population and Sample. The study population comprises a work team made up of seven developers, two coordinators, and a director of the software development area. Because the investigated population does not exceed 100 elements, we used the entire universe without drawing a representative sample.
Table 1. Research methodology phases
First phase, Research design: (i) Define the research type, (ii) Define the research method, (iii) Population and sample.
Second phase, Theoretical foundation: (i) Management models, project management, frameworks, IT governance, (ii) Government by Results (GBR), (iii) COBIT 5 – IT, (iv) Project Management Institute (PMI), (v) Information Technology Infrastructure Library (ITIL), (vi) Agile philosophy.
Third phase, Management model design: (i) Study and analysis of needs, (ii) Theoretical foundation structure, (iii) Relationship of frameworks, (iv) Establishing the management model architecture, (v) Management model framework (policies, norms, processes, procedures, indicators).
Fourth phase, Management model evaluation: (i) Case study context, (ii) Summary of the software project development (case study).
Fifth phase, Results: (i) Management model result, (ii) Case study results, (iii) Satisfaction survey.
2.2 Theoretical Foundation
This section conceptualizes the topics, standards, methodologies, and best practices used in developing the software development management model for public institutions in Ecuador.
Management Model. A management model comprises a set of administrative activities such as planning, coordination, measurement, monitoring, control, and reporting that ensures systemic, quantifiable, and disciplined project management [7]. Management models are work schemes for the administration of entities concerning strategy, processes, people, and results to achieve excellence. They allow organizations to govern, order, direct, organize, and arrange a set of actions to administer or manage a project [8]. The management model pillars must be aligned with the mission, vision, and values of the organization. The three pillars are processes, technology, and people, which achieve the proposed objectives and ensure optimal results; these pillars can be reviewed at the following weblink1 [9].
Software Development Project Management. It consists of the prior estimation of the cost, time, money, resources, tasks, and quality of a software development project. These activities are distributed throughout the project, correcting for changes over time and preventing risks [10].
1 Figure of the pillars of the management model aligned to the institution: https://bit.ly/2yoA5NA.
Furthermore, it aims to improve the project so that it is successful [11].
Framework. A framework is a set of tools, procedures, techniques, and documentary support that allows structuring, planning, and controlling the information systems development process [12]. The role of frameworks is to improve software processes and to determine the capability and performance of those processes and of the organization [13]. They are defined as quality standards that are developed as the processes mature.
IT Governance. Governments depend on information technologies (IT) for their proper functioning and development [14]. Investments in technology are made to make institutions more efficient, safer, and, mainly, to comply with their strategic planning [15].
Government by Results (GBR). "Results-based governance is associated with the scorecard or command board, which is a management system that links the achievement of long-term strategic goals with the daily operations of an organization, clarifying the vision and strategy to translate it into action, through the redesign of internal business processes as well as external results to improve strategic performance and results continuously" [16]. The balanced scorecard suggests that organizations be evaluated and managed from four perspectives; these can be reviewed at the following weblink2 [16].
COBIT. Its mission is to develop, research, publish, and promote an updated and accepted IT governance control framework for adoption and daily use in companies by IT professionals, assurance professionals, and business managers [15]. The four management domains are aligned with the Plan, Build, Run, and Monitor (PBRM) areas of responsibility. The COBIT 5 process reference model can be reviewed at the following weblink3 [17].
Project Management Institute (PMI) and PMBOK®. The PMI is a non-profit institution for social purposes, founded in the United States in 1969. Its main objective is the professionalization of project management [18]. The fifth version of PMBOK® defines project management as the use of skills, techniques, tools, and knowledge in the execution of project activities to satisfy project requirements [18]; project management is achieved through the application and integration of the processes of initiation, planning, execution, monitoring and control, and closure. The PMBOK® process groups can be reviewed at the following weblink4 [18].
Information Technology Infrastructure Library (ITIL). ITIL is a set of documents detailing processes to manage IT services efficiently and effectively: standards and best practices for efficient management and design within the organization [19]. ITIL's main objective is to minimize the cost of supporting IT services while increasing reliability, consistency, and quality in meeting information requirements [20]. Considering ITIL's areas of influence in the management and operation of public institutions, the distribution of services can be reviewed at the following weblink5 [21].
2 Figure of the balanced scorecard perspectives: https://bit.ly/2SKh6nZ.
3 Figure of the COBIT® 5 process reference model: https://bit.ly/2YqVi4c.
4 Figure of the PMBOK® process groups: https://bit.ly/3bTzz8W.
5 Figure of ITIL in public institutions: https://bit.ly/2KTSBAe.
Agile Philosophy. The agile philosophy emphasizes value delivery. The important thing is to deliver a first functional version so that the end-user can work with it and provide feedback to the work team, and so that the needs of the business can be better established. Cockburn, in his work "The Heart of Agile," encourages agility and emphasizes four essential aspects: collaboration, delivery, reflection, and improvement [22].
2.3 Management Model Development
Analysis. The management model responds to the study and prioritization of internal and external needs in the context of public institutions, the regulations to be followed before the control entities, and the guidelines and best practices of the reference frameworks. For the development of the management model, the GBR balanced scorecard and PMI's best practices were considered. However, to complement the model, it was necessary to use the COBIT 5 IT governance reference framework, which allows establishing the levels of governance and administration of the operation in the software development and maintenance areas of public institutions. Considering that software development is a computer service, it was also necessary to include ITIL best practices and the agile philosophy to design processes oriented to service and internal customer satisfaction. Finally, to give transparency and follow-up to the procedures carried out in the model, it was complemented with metrics based on the balanced scorecard. The design of the model was carried out by the technology manager, the director, an architect, and three software analysts of Yachay E.P. The structure of the theoretical foundation for the development of the management model is shown in Fig. 1.
Fig. 1. Theoretical foundation structure.
Relationship between frameworks: To create the IT governance and develop the policies, rules, processes, and procedures that govern the model, the relationships outlined in the COBIT 5 framework were used as a basis, which allows governance within the development area. After studying and analyzing the reference frameworks (GBR, COBIT, PMI, ITIL, and agile philosophy), the relationships between them were established, as shown in Table 2 and Table 3. After the analysis, theoretical structuring, and relationship of the reference frameworks, the architecture of the management model was established, which results in an artifacts framework consisting of policies, standards, processes, and procedures for the development and maintenance of software in public institutions in Ecuador (see Fig. 2).
Table 2. Policy-driven relationships.
IT Governance "COBIT" (evaluate, direct, and monitor) | GBR | PMI
Ensures it fits within the IT management framework | Strategic management, Growth | PMI process
Ensures value delivery | Client, Processes, Financial | Tracing
Ensures risk optimization | Process | Control
Ensures resources optimization | Process, Human talent, Financial | Planning
Ensures stakeholders transparency | Client, Process | Planning
Table 3. Relationships directed to standards and the balanced scorecard.
COBIT (IT Governance) | PMI | ITIL | Agile philosophy | GBR
Align, Plan, Organize | Planning | - | Sprint | -
Build, Acquire, Implement | Execution | - | User stories | -
Deliver, Serve, Give support | Monitoring and control | ITIL | - | -
Monitor, Evaluate, Value | Monitoring | - | - | Balanced Scorecard
Fig. 2. Management model proposal for software development.
Management Model Framework. The framework has one policy guide, one norm guide, one process, and three procedures. Besides, the documentation has 20 artifacts needed to carry out the following phases of software development and maintenance: start, planning, execution, monitoring, and closing of software projects. The templates of the components of the management model framework are available at the following weblink6. To establish the balanced scorecard, the model measures and evaluates the productivity and performance of the software development processes in order to control, manage, and continuously improve the methods and thus achieve the strategic indicator and the mission of the institution's development area. For this, the Balanced Scorecard of Kaplan and Norton [23] was used, which evaluates the management of the development area from four perspectives: financial, satisfaction of delivery, internal processes, and development and learning of team members. The indicators used are as follows (see Table 4).
Table 4. Balanced scorecard indicators, recovered from Kaplan and Norton (1992).
Financial indicators: 1. Return of investment (ROI)
Delivery satisfaction indicators: 2. Delivery satisfaction
Internal process indicators: 3. Number of incidents found in the testing and certification stage; 4. Number of incidents found in production; 5. Hours earned per project; 6. Productivity per member of the project team
Indicators of development and learning: 7. Project research percentage; 8. Project innovation percentage
Each indicator has a technical and methodological sheet that shows how it is calculated. The technical and methodological guides can be downloaded at the following weblink7 .
2.4 Management Model Evaluation - Case Study The evaluation of the management model was carried out with a case study about implementing a software development project for the public institution Yachay E.P., considering the functional requirements [24]. The context of the case study is described below. See Table 5.
6 Components templates of the management model Framework: https://bit.ly/2yV0QcF. 7 Technical and methodological guide: https://bit.ly/2VQsv7F.
Table 5. Case study
Software development project: Document management automation of the public institution Yachay E.P.
Applicant: Documentary Management
Functional requirements: 13 user stories
Application date: 10/10/2017
Start date: 16/10/2017
Execution date: 11/05/2018
Deadline: 12/06/2018
Training date: 15/06/2018
Work team: One manager, one development director, one project leader, two developers
Sprint number: 11
Number of hours per sprint: 40 h
Smoke tests: 50
Application tests: 20
Load and stress tests: 9
Number of incidents after delivery: 4
We developed the project with the artifacts described in the Management Model Framework (Table 3) and used several of the model framework's tools. The activities carried out, the tools used, and the documents generated cover all the project development phases. The complete software development project (case study) can be downloaded at the following weblink8.
3 Results
The result of the management model for software projects is a framework (see Table 3) made up of one policy guide, one norm guide, one process, three procedures, and 20 artifacts for documentation in the phases of start, planning, execution, monitoring, and closing of software projects. The artifacts to use will depend on the technology and type of project (software implementation or maintenance). The case study results are shown in the project's integral scorecard (see Table 6). The balanced scorecard results show that the financial perspective was beneficial for the institution because the estimated benefits were higher than the project's cost, which allowed investing that difference in tasks of operational value. The cost of the project will be recovered within the first year of using the system. The model implemented a prior Quality Assurance (QA) review in each phase, which allowed reducing
8 Software development project related to the case study: https://bit.ly/2KJKKFj.
Table 6. Result of the balanced scorecard
Indicator | Calculation factors | Result
Return of investment (ROI) | Benefits estimation: $7,670.40; Project cost: $4,684.23 | $1.64
Customer satisfaction | Satisfaction survey's weighted average: 4.3 | 4.3
Number of incidents found | Testing stage: 48; Certification stage: 16; Production stage: 4 | 48 / 16 / 4
Hours gained from the project | Estimated hours: 470; Effective hours: 500; Total requirements: 20 | −1.5
Team productivity | Effective productivity: 0.94; Estimated productivity: 1 | 0.94
Project investigation | Research hours: 80; Total project hours: 500 | 16.00%
Project innovation | Hours of innovation: 120; Total project hours: 500 | 24.00%
incidents and reprocessing, and maximizing team members' knowledge. The low number of incidents in the production stage strengthened the perception of the customer, who received a stable system (retrieved from the satisfaction survey). The project had a 6% deviation in planning, which means that each requirement was underestimated by 1.5 h, causing a non-significant late delivery of the software product; this was quickly flagged in the project implementation stages as part of risk management and timely communication with the requesting area. For the evaluation of team productivity, the average of the team members' productivity was considered, which showed a slight gap in their learning curve because of the change in project management and operation. The first assessment of the adoption of the method reflects the initial state of the team; in subsequent evaluations, the project manager should consider the actions necessary to strengthen each member's skills and reduce the learning gap. In the context of the project's implementation, the team invested 16% of its effort in research and 24% in tasks that generated operational innovation and knowledge management for the institution. The satisfaction survey was carried out on nine requesting users of the project to assess customer satisfaction, and 15 days were given to fill it out after delivering the project. The survey generates quality metrics: 75.56% usability, 81.11% performance, 74.81% efficiency, and 77.78% effectiveness. The results are shown as a percentage reached with respect to the expected value. The average satisfaction of the software project is 77.31%, equivalent to 3.51 on the Likert scale. This result, validated in the balanced
scorecard’s technical and methodological guide, classifies the project with a medium level of satisfaction. The result suggests the opportunity for improvement in the fourquality metrics, depending on the needs of the project stakeholders.
4 Discussion
To organize business processes, several authors have generated proposals that combine frameworks and methodologies to optimize current approaches and provide a better service. Alfaraj and Qin [25] report a proposal to implement an integrated model between CoBIT and CMMI that, using plugins of the ITIL, CoBIT, and ISO/IEC 22007 frameworks, generated a template for integrating frameworks. The authors clarify that the model will provide benefits to the organizations that implement it; they can reduce the complexity of the decision-making process and of the determination, identification, validation, and description of the area where they use it. We agree with Alfaraj and Qin's approach of combining several frameworks; in our case, CoBIT, ITIL, PMI, the agile philosophy, and GBR were combined to improve a public institution's software development process and meet the Government by Results requirements requested by external government entities. Our proposal was complemented by evaluating the model through a case study. In the execution and development of the case study, it was observed that the institution's improvement began once a process was formally established with the necessary instruments for its operation and monitoring. Furthermore, we complemented the model with measurement through an integrated scorecard of indicators covering the financial perspective, the satisfaction of delivery, the processes, and the learning of the project's work team. When executing the case study, it was observed that the model promotes the establishment of more realistic objectives in the projects and reduces the incidence of errors, thereby optimizing resources and reducing software costs, and it also increases the confidence of the management of the software development area. On many occasions, projects that require changes are not well received by all users; this directly influences the software product's quality. It was evident that developing and implementing the management model (proposal) exposed a deficit of trained personnel, which requires a training plan for several team members. It was also evident that senior management's support and commitment is essential to implement projects of this type, because it requires effort in terms of training, organization, and change in business culture.
5 Conclusions
In this research, we were able to study and integrate several methodologies and frameworks to propose a software development project management model that ensures the quality of the software product and complies with the Government by Results (GBR) regulations that apply to public institutions in Ecuador. The proposed model integrated the governance and IT administration processes recommended by COBIT 5; the IT balanced scorecard was aligned with Government by Results (GBR). The project management structure was established with PMBOK® (PMI) practices to manage internal and external applicants' software development needs. The agile philosophy applied in the model helped to organize the effort and relationships
of the work team. The model was based on ITIL to establish management processes for software maintenance. Finally, the documentation proposed by the model met the parameters of internal and external audits of the public institution without falling into excessive and unnecessary documentation; on the contrary, the submitted documentation was a source of knowledge and experience that will serve for subsequent software projects. We evaluated the model through a case study applied in a public institution in Ecuador. We found that the application of the model improved the management, monitoring, and quality control of the software development process and the software product, and improved the direction of the institution's software maintenance. The results of applying the model in the case study were shown through the comprehensive scorecard in four perspectives: financial, satisfaction in delivery, internal processes, and development and learning of team members. The improvements obtained are: 1) traceability of the software development area's work, where there was a favorable change in process and product quality; 2) organization and clarity of the project team's functions, which allowed establishing a stable communication flow to achieve the project's objective; 3) diagnosis of the productivity of the development area members individually and as a team, which allowed the establishment of professional development plans for each of them; 4) development of software features focused on customer needs and refined at each stage of development, which reduced the repetition of tasks and issued early warnings of changes to the original scope; 5) timely identification of risks and problems, allowing measures to minimize and mitigate the impact on the project; 6) accessible and public documentation of the project's execution that will serve as a reference for future projects, which can take advantage of the knowledge of the institution's software development area. Finally, it was possible to have a tool for project evaluation and prioritization based on estimated financial indicators. Despite the positive results of the model's application, it must be taken into account that such models are subject to variations in government regulations.
References
1. Moreno, J.: Programación Orientada a Objetos. Ra-Ma (2015)
2. Harter, D.E., Krishnan, M.S., Slaughter, S.A.: Effects of process maturity on quality, cycle time, and effort in software product development. Manage. Sci. 46(4), 451–467 (2000)
3. Pino, F.J., García, F., Piattini, M.: Software process improvement in small and medium software enterprises: a systematic review. Softw. Qual. J. 16(2), 237–261 (2008). https://doi.org/10.1007/s11219-007-9038-z
4. Fernández, L., Piattini, M., Calvomanzano, J., Cervera, J.: Análisis y Diseño Detallado de Aplicaciones Informáticas de Gestión. Ra-Ma (2007)
5. Gutierrez, D.: Métodos de Desarrollo de Software (2011)
6. SENPLADES: Sistema Nacional de Información. SENPLADES (2017). http://goo.gl/DqH4SV
7. Salamanca, Y.T., Del Río Cortina, A., Ríos, D.G.: Modelo de gestión organizacional basado en el logro de objetivos. Suma Negocios 5(11), 70–77 (2014). https://doi.org/10.1016/s2215-910x(14)70021-7
8. Goodstein, L., Nolan, T., Pfeiffer, W.: Planeación Estratégica Aplicada. Mc Graw Hill Interamericana, Santa Fé, Bogotá (1998)
9. López Valerio, C.: Modelo de 8 pilares para las Pymes de TIC's, una mirada en retrospectiva. III Congr. Int. Cienc. y Tecnol. para el Desarro. Sostenible, Chiriquí, Panamá, p. 11 (2018)
10. Thayer, R.: Software Engineering Project Management. Wiley-IEEE Computer Society Press, New York (1997)
11. MacGregor, K.: Software development project management: a literature review. In: 26th ASEM National Conference Proceedings, October 2005 (2005)
12. Mon, A., Estayno, M., Scalzone, P.: Definición de un marco de trabajo para la implementación inicial de un Modelo de Proceso Software. Aplicación de un caso de estudio. In: XII Congreso Argentino de Ciencias de la Computación (2006)
13. Reifer, D.: Soft Management. Wiley-IEEE Computer Society Press, New York (2002)
14. Hamzane, I., Belangour, A.: A multi-criteria analysis and advanced comparative study between IT governance references. Adv. Intell. Syst. Comput. 1105, 39–52 (2020). https://doi.org/10.1007/978-3-030-36674-2_5
15. IT Governance Institute: Cobit 4.1 (2007)
16. Kaplan, R., Norton, D.: The Balanced Scorecard: Translating Strategy into Action. Harvard Business School Press (1996)
17. ISACA®: COBIT® 5. A Business Framework for the Governance and Management of Enterprise IT (2012)
18. Project Management Institute Inc.: Project Management Institute (PMI) (2013). https://www.pmi.org. Accessed 01 Apr 2020
19. Weill, P., Ross, J.: IT Governance: How Top Performers Manage IT Decision Rights for Superior Results. Harvard Business School Press, Boston (2004)
20. Amón-Salinas, J., Zhindón-Mora, M.: Modelo de Gobierno y Gestión de TI, basado en COBIT 2019 e ITIL 4, para la Universidad Católica de Cuenca. Rev. Científica FIPCAEC 5(16), 218–239 (2020). https://doi.org/10.23857/fipcaec.v5i14.168
21. Bon, J., et al.: Foundations of IT Service Management based on ITIL. Van Haren Publishing, 36 (2005)
22. Cockburn, A.: Heart of Agile (2016)
23. Kaplan, R., Norton, D.: The Balanced Scorecard - Measures That Drive Performance. Harvard Business Review (1992)
24. Guevara-Vega, C.P., Guzmán-Chamorro, E.D., Guevara-Vega, V.A., Andrade, A.V.B., Quiña-Mera, J.A.: Functional requirement management automation and the impact on software projects: case study in Ecuador. Adv. Intell. Syst. Comput. 918, 317–324 (2019). https://doi.org/10.1007/978-3-030-11890-7_31
25. Alfaraj, H., Qin, S.: Operationalising CMMI: integrating CMMI and CoBIT perspective. J. Eng. Des. Technol. 9(3), 323–335 (2011). https://doi.org/10.1108/17260531111179933
Development of a Linguistic Conversion System from Audio to Hologram María Rodríguez , Michael Rivera(B)
, William Oñate , and Gustavo Caiza
Universidad Politécnica Salesiana, Quito, Ecuador {mrodrigueza4,mriveram2}@est.ups.edu.ec, {wonate,gcaiza}@ups.edu.ec
Abstract. Sign language is the means by which the deaf community expresses ideas in society; still, conflict arises when speakers do not understand this form of communication, generating isolation between the parties. Therefore, this paper presents the development of a linguistic conversion system from audio to hologram for the Ecuadorian sign language; it allows a person to learn sign language in a friendly way: the user says a word into a microphone and, through an algorithm programmed on a microcomputer, a holographic reproduction is activated to visualize how certain essential terms of the Spanish language, defined in a database, are expressed in signs. In this way, a system that supports the learning of Ecuadorian sign language is obtained thanks to technological resources. For the design of the application, three-dimensional images were generated, and the effectiveness of the voice-to-hologram translation was measured at 96.09%. Keywords: Holography · Sign language · Speech recognition
1 Introduction
There are approximately 360 million hard of hearing people globally, representing more than 5% of the world's population [1]. According to the National Council for Equality of Disabilities (CONADIS), as of March 2020 Ecuador registers 485,325 people with varying degrees of disability of different types, such as intellectual, visual, and auditory; the latter represents 14% of the total with 67,929 individuals, divided into 45.25% women, 54.74% men, and 0.01% belonging to the LGBTI community [2]. Sign language is the means through which people with hearing impairments can communicate, and it is essential to mention that this linguistic form consists not only of hand gestures but also of the set of body movements and facial expressions that allow transmitting ideas, thoughts, and emotions in a conversation [3]. In general terms, sign language is thought to be standardized, but the truth is that people with hearing impairments around the world have created different types of signs according to their region, identity, and culture; yet the best known among the deaf community is American Sign Language (ASL) [4]. The Ecuadorian Sign Language (LSEC)
has been recognized in Ecuador since 1998 as the right of people with deafness to communicate. In turn, in 2012, the first Official Dictionary of the Ecuadorian Sign Language was published, in which two dialects (Costa and Sierra) are recognized, composed of 50% native signs, 30% from ASL, and 20% from Spanish sign language [5, 6]. Sign language communication is still a challenge for hard of hearing people, since deaf communities are still struggling to achieve their rights to communicate and to access quality education, health, and work, because the current environment does not offer enough communication inclusion for people with hearing impairments. A clear example is that deaf children still cannot fully access a regular education system, because there are not enough people trained in this type of language. The health system has similar exclusion situations because of the lack of information [7]. There are educational centers in Ecuador for people with and without hearing disabilities that aim to strengthen communication between deaf people and listeners, seeking equal opportunities as a means of social integration [8, 9]. Sign language learning media have now been supplemented by Information and Communication Technologies (ICT) to address specific cognitive gaps, which is why holography can help in the learning process as interactive audiovisual content directly related to augmented reality [10]. As part of an inclusion project, a café-restaurant was created in Quito for people with hearing impairment; this place allows interaction with people with hearing loss who are part of the team. "Café en señas" offers the option to live the experience of learning and getting to know the Ecuadorian Sign Language (LSEC). This place provides inclusion for the hearing impaired, and there are texts and menus for the LSEC exhibition [11]. From this experience, the idea of creating an electronic voice-hologram system that allows digital learning in the cafeteria originated. The article is organized as follows: Sect. 2 presents the frame of reference, Sect. 3 the design and the results obtained, and Sect. 4 the conclusions.
2 Frame of Reference
Communication is an essential part of society since it allows interaction between individuals, for which different languages are used: verbal and nonverbal. Sign language belongs to the nonverbal languages and is mostly used by people with hearing or speech impairments [12]. It is now evident that many listeners are unaware of this type of language, thus generating mutual isolation. The most affected are people with hearing deficits, because they cannot relate with the same ease as listeners [7].
2.1 Hearing Impairment
Hearing impairment refers to a lack or decrease in hearing caused by some damage to the hearing function [13], resulting in some impairment in learning and interpersonal relationships. A person is considered to have a hearing impairment when he or she has some degree of hearing loss: mild hearing loss (20 to 40 dB), medium hearing loss (40 to 70 dB), severe hearing loss (70 to 90 dB), or profound hearing loss
(>90 dB). The term hearing impairment may be qualified according to the location of the injury or the age of onset [14]. It should be emphasized that people with such disabilities face various consequences of hearing loss: functional consequences, which are clinical signs and symptoms; emotional and psychic consequences, due to problems of interaction with society; and social consequences, generated by difficult participation in work, family, and other settings; the latter create an economic crisis for those affected, because sources of employment are limited and their needs are broad.
2.2 Sign Language
Sign language represents how people with hearing loss interact through gesture-visual tools, because their interaction with the world cannot take place through speech; for this reason, this language becomes their first language, also known as a natural language, which complies with linguistic standards. The integration of listeners into sign language learning is necessary to generate inclusive bonds between the deaf community and the listening community and to avoid isolation [16]. Globally, there are more than 300 sign languages, because each country has developed its own linguistics, with regionalisms included for each expression [17]. In Ecuador, the natural language for hard of hearing people is the LSEC, endorsed by article 70 of the Organic Law on Disabilities, which recognizes it as a form of alternative communication for people with hearing loss [18].
2.3 Holography
In recent years, the term holography has been conceptualized in various ways, taking into account that it is directly related to augmented reality, since its principle is based on the perception of real or imaginary objects from a set point, thus obtaining a virtual image or video [10]. There are different types of holograms, and they can be created in different ways, but they always comply with the same principle of generating realistic perspectives to the human eye. The most relevant and best-known holograms can be classified by geometric shape, as on-axis or off-axis, and in other cases by transmission or reflection according to the direction of the beams of light [19]. Holography has been used in marketing and education due to its unique form and its impact on society. On this point, visual tools attract more attention from students in the learning process. For example, in medicine, these tools have been used for projecting human anatomy. It is worth mentioning that this medium applies to all levels of study and branches, taking into account the content and environment [20].
3 Design Currently, there are various ways of viewing holograms due to the range of devices for holographic projections such as: smoke holograms, propeller holograms, pyramid holograms, holograms in glass and with holographic lamps, see Fig. 1.
Fig. 1. General diagram of the prototype.
3.1 Holographic Pyramid
The inverted pyramid hologram is used as the means of projection in this research, since it is easy to construct and has a low price compared to the other forms of holographic visualization. The inverted pyramid hologram involves Pepper's Ghost technique, an optical illusion in which a hidden image matched to a glass medium yields a reflected image. The law of reflection of light plays a vital role in the visualization of the inverted pyramid hologram, bearing in mind that it describes the change of direction of an incident beam that strikes a reflective surface, generating a reflected ray [21]. For the design, it is necessary to find the inclination angle of the holographic pyramid (α) so that the reflected ray is horizontal to the observer, taking into account that the incident ray, the reflected ray, and the normal lie on the same plane, and that the angle of incidence (θi) is equal to the angle of reflection (θr), as presented in Eq. (1).

θi = θr  (1)

A trigonometric analysis of the reflective plane results in:

β + θr + θi + γ = 180°  (2)

where:
β: internal angle between the reflected ray and the reflective object.
γ: internal angle between the incident beam and the reflective object.
θr: internal angle between the reflected ray and the normal.
θi: internal angle between the incident beam and the normal.
To obtain a reflected ray horizontal to the observer, it is necessary that:

β = α  (3)
γ + α = 90°  (4)
β + θr = 90°  (5)

Replacing Eq. (1) and Eq. (3) in Eq. (2), the following is obtained:

α + 2θr + γ = 180°  (6)

Solving for α from Eq. (4) and replacing it in Eq. (6) gives the value of θr:

(90° − γ) + 2θr + γ = 180°
90° + 2θr = 180°
2θr = 90°
θr = 45°  (7)

With the value of θr from Eq. (7), β is determined through Eq. (5):

β + 45° = 90°
β = 45°  (8)

The inclination angle of the holographic pyramid α results from Eq. (3) and the value of β:

α = 45°  (9)

The dimensions of the sides of the truncated pyramid are then calculated to generate the holographic effect. The base of 360 mm is determined by taking into account that the screen has a projection area of 345 mm × 195 mm; with this data, the apothem (x) and the height of the pyramid (h) are determined. To obtain h, Eq. (9) and the geometry of the right triangle are used:

tan α = h / 170 mm
h = 170 mm · tan(45°)
h = 170 mm  (10)

The apothem (x) is calculated from the value of h and the inclination angle:

cos α = 170 mm / x
x = 170 mm / cos α
x = 240.42 mm  (11)

The value of x in Eq. (11) represents the height of the side face of the truncated pyramid and guarantees the tilt value of the holographic pyramid. The angle of inclination of the side face of the truncated pyramid is calculated as follows:

tan α = 240.42 mm / 170 mm
α = 54.74°  (12)
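The derivation above can be restated numerically. The following Python sketch is only an illustration of the calculation, using the 170 mm dimension and the 45° reflection condition given in the text (the interpretation of 170 mm as the relevant half-width is an assumption):

```python
import math

half_width_mm = 170.0   # dimension used in Eqs. (10)-(12) of the text
alpha_deg = 45.0        # pyramid inclination that makes the reflected ray horizontal (Eq. 9)

height_mm = half_width_mm * math.tan(math.radians(alpha_deg))         # h = 170 mm (Eq. 10)
apothem_mm = half_width_mm / math.cos(math.radians(alpha_deg))        # x ~ 240.42 mm (Eq. 11)
face_angle_deg = math.degrees(math.atan(apothem_mm / half_width_mm))  # ~ 54.74 deg (Eq. 12)

print(round(height_mm, 2), round(apothem_mm, 2), round(face_angle_deg, 2))
```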
After obtaining the corresponding calculations, the pyramid is designed in the AutoCAD software, in conjunction with a black glass box that serves as the base of the screen and provides the internal opacity needed for the projection of the image; the resulting computer-aided design is displayed in Fig. 2.
Fig. 2. Holographic pyramid computer-assisted design.
For the construction of the holographic pyramid, a 4 mm dark glass was used, taking into account the strength of the material and its relatively low cost.
3.2 Hologram
To create the hologram, the Adobe After Effects graphic design software is used to edit videos collected from the Ecuadorian Sign Language Dictionary. First, a simple H.264 video with dimensions of 466 px × 360 px is generated, in which simple editing of the collected video is made, consisting of a change of background color, so that in a second process it can be presented in four views: top, bottom, right side, and left side. The final hologram video has dimensions of 1020 px × 670 px in H.264 format and includes the previously edited simple video, varying its characteristics in terms of size, rotation, and position on the x and y axes, as displayed in Table 1.
Table 1. Quantitative characteristics of the animations in the simple video for the hologram.
Variable | Superior animation | Inferior animation | Right lateral animation | Left lateral animation
Dimension (px) | 162 × 296 | 162 × 296 | 266 × 468 | 266 × 468
Rotation (°) | 0 | 180 | 270 | 90
X position (px) | 510 | 510 | 782 | 238
Y position (px) | 179 | 489 | 335 | 335
The general idea of creating a video for an inverted pyramidal hologram is mainly based on the darkness needed for the projection, for this reason the edits have a black background both in the simple video and in the final process, and it is essential to correctly adjust the simple videos in terms of positions within the playback area. With all this, the hologram video in the Adobe After Effects space is obtained, and it is presented in Fig. 3.
Fig. 3. Video for the pyramidal hologram in Adobe after effects.
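The layout described in Table 1 can also be reproduced programmatically. The sketch below is only an illustration of that composition, not the authors' After Effects project; it assumes OpenCV and NumPy are available, treats the Table 1 positions as layer centres (as in After Effects) and the rotations as 90-degree turns, and takes frame to be one 466 × 360 frame of the edited sign video.

```python
import cv2
import numpy as np

def compose_hologram_frame(frame):
    """Place four rotated copies of a sign-video frame on a black 1020x670 canvas,
    following the sizes, rotations, and positions listed in Table 1."""
    canvas = np.zeros((670, 1020, 3), dtype=np.uint8)   # black background (height, width)
    # (width, height, quarter turns, centre x, centre y)
    layout = [
        (162, 296, 0, 510, 179),   # superior animation (0 degrees)
        (162, 296, 2, 510, 489),   # inferior animation (180 degrees)
        (266, 468, 3, 782, 335),   # right lateral animation (270 degrees)
        (266, 468, 1, 238, 335),   # left lateral animation (90 degrees)
    ]
    for width, height, quarter_turns, cx, cy in layout:
        view = cv2.resize(frame, (width, height))
        view = np.rot90(view, quarter_turns)             # rotate the copy in 90-degree steps
        vh, vw = view.shape[:2]
        x0, y0 = cx - vw // 2, cy - vh // 2
        canvas[y0:y0 + vh, x0:x0 + vw] = view            # paste centred at (cx, cy)
    return canvas
```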
For the generation of the algorithm, the Python programming language was used on a Raspberry Pi 3 B+; this development board is the means of control for receiving and publishing data [22]. The main idea is to establish the voice link with the previously defined hologram by using the SpeechRecognition library, which works with Google's services to detect words; for this reason, the system needs a good Internet connection. The system requires a USB microphone connected to the Raspberry Pi 3 B+, which needs individual configuration in the LX Terminal for its proper functioning: the sensitivity of the microphone is adjusted with AlsaMixer, and the Python code is set up to recognize the Spanish language. Once voice recognition is obtained, it is necessary to create a database of the variables that trigger the generation of the hologram. For this, certain words that are basic to the Spanish language were taken into account and organized into four categories: good manners, animals, colors, and fruits, which are presented in Table 2. To display the hologram, the OpenCV library is used to create the window in which the playback is performed. This window is created to occupy most of the screen so as to generate a visible hologram of the right size.
Table 2. Commands of the variables that generate a holographic reproduction.
Good manners: Good morning, Good afternoon, Good evening, Excuse me
Animals: Crocodile, Elephant, Cat, Hippopotamus, Lion, Dog, Rat
Colors: Yellow, Blue, Purple, Orange, White, Black, Pink
Fruits: Peach, Strawberry, Mango, Apple, Banana, Watermelon
To implement the voice-hologram link, the flowchart shown in Fig. 4 was followed.
Fig. 4. Flow diagram of the programming algorithm.
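To make the flow in Fig. 4 concrete, the following minimal Python sketch shows one way the SpeechRecognition and OpenCV libraries described above can be linked. It is an illustration only: the word-to-video mapping, the file names, and the window handling are hypothetical, not the authors' implementation.

```python
import speech_recognition as sr
import cv2

# Hypothetical mapping from recognized Spanish words to pre-rendered hologram videos.
VIDEOS = {"hola": "videos/hola.mp4", "gato": "videos/gato.mp4", "azul": "videos/azul.mp4"}

recognizer = sr.Recognizer()

def listen_for_word():
    """Capture audio from the USB microphone and return the recognized text, if any."""
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    try:
        # Google's online service, forced to Spanish (requires an Internet connection).
        return recognizer.recognize_google(audio, language="es-EC").lower()
    except (sr.UnknownValueError, sr.RequestError):
        return None

def play_hologram(path):
    """Play the four-view hologram video in a full-screen OpenCV window."""
    cap = cv2.VideoCapture(path)
    cv2.namedWindow("hologram", cv2.WND_PROP_FULLSCREEN)
    cv2.setWindowProperty("hologram", cv2.WND_PROP_FULLSCREEN, cv2.WINDOW_FULLSCREEN)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imshow("hologram", frame)
        if cv2.waitKey(33) & 0xFF == 27:   # ~30 fps; Esc aborts playback
            break
    cap.release()
    cv2.destroyWindow("hologram")

if __name__ == "__main__":
    while True:
        word = listen_for_word()
        if word in VIDEOS:
            play_hologram(VIDEOS[word])
```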
After connecting the peripherals to the Raspberry Pi 3 B+ development board, including the necessary programming resources, and adding the inverted pyramid for the projection of the hologram, the final prototype of the audio-to-hologram linguistic conversion system for the Ecuadorian sign language was obtained. To verify the final result and validate the prototype, the results are based on two fundamental aspects: the percentage of effectiveness obtained in the speech recognition and the waiting time for the holographic projection, i.e., the delay from the voice command to the hologram. For the percentage of effectiveness of the speech recognition, i.e., how often the word mentioned into the microphone is recognized correctly, it is necessary to take into account the Internet connection, since an online library is used in the programming algorithm. For this sample, the network speed was 6.9 Mbps; the better the prototype's connection, the better the results achieved. Considering that each term was uttered ten times, the number of correctly recognized words was obtained and then grouped into categories, yielding the percentage of effectiveness of the speech recognition presented in Fig. 5.
Fig. 5. Voice recognition effectiveness.
The errors observed were mostly due to poor vocalization of the word or, in turn, to the microphone being far from the speaker. Additionally, across all the words mentioned during voice recognition, out of the 230 terms that were vocalized, 221 were right and nine wrong, which corresponds to an error of 3.91%, as indicated in Eq. (13).

% error = (|221 − 230| / 230) × 100%
% error = 3.91%  (13)
When vocalizing the word in the microphone, there is a time between the speech recognition and the holographic projection, which can vary according to the speech recognition response and the internet speed.
Because of the latter, a sample was made with the reaction times of the program that were grouped by categories, evaluating ten utterances for each word to determine how long the speaker should wait to have a holographic visualization of what is desired; the results can be observed in Fig. 6.
Fig. 6. Waiting time for the holographic projection.
The variation in time that can be observed is mostly due to the Internet connection, but the standard waiting time for the translation with holographic projection is 3.91 s. On the other hand, the speech recognition service detects all the languages of the world, so it is necessary to set Spanish as the default language in the program to improve recognition based on pronunciation; speech must be produced with at most 2 cm of separation from the microphone for the program to achieve correct detection of the vocalized words. The measurements used for the videos that generate the holographic effect, displayed in Table 1, are the recommended measures for the projection area of the device used. It is worth mentioning that these values could change for larger or smaller screens, where in turn the quality of the video allows a better perception of the hologram. In Fig. 7, the finished prototype can be observed. By generating a voice command with one of the words in the database, in this example
Fig. 7. Holographic visualization in Ecuadorian sign language of the word “Hello.”
is “hello,” the corresponding hologram is projected in Ecuadorian sign language on one of the faces of the pyramid.
4 Conclusions
Sign language is not a standardized language; instead, it changes according to the region in which it is used. In the case of Ecuador, it is established by the LSEC. In terms of the hologram, this project was based on the holographic pyramid, which requires several mathematical calculations for the accurate reflection of light in the projection of the hologram. All this information was used for the development of the prototype. Three-dimensional images were generated in the Adobe After Effects program by editing the videos for the translation of words into signs, which were obtained from the CONADIS platform and modified to obtain the holographic projection and perform different tests; in these tests, the effectiveness of the voice-to-hologram translation was measured at 96.09%. The remaining value, corresponding to the error, could be attributed to a greater than recommended distance from the microphone or to weak vocalization of the words; also, the waiting time for the holographic projection (voice-to-hologram translation) was measured with an average of 3.23 s, which varies due to the network connection. A structured algorithm was implemented based especially on two libraries necessary for the project: SpeechRecognition, which performs the speech recognition, and OpenCV, for the playback of the video; in this way, the voice-hologram link was obtained. It must be taken into account that the Internet connection in this prototype plays a vital role because an online library is used; for this reason, the network speed determines how fast the audio-to-hologram conversion is performed. In the tests performed in this research, the speed was 6.9 Mbps, obtaining an error rate in the effectiveness of 3.91%, which can vary.
References
1. El Telégrafo: La discapacidad auditiva afecta a 360 millones de personas en el mundo. Redacción web, 28 septiembre (2017)
2. Consejo Nacional para la Igualdad de Discapacidades, marzo 2020. https://www.consejodiscapacidades.gob.ec/estadisticas-de-discapacidad/. Accessed 22 Apr 2020
3. Human Rights Watch, 23 septiembre 2018. https://www.hrw.org/es/news/2018/09/23/el-lenguaje-de-senas-un-componente-clave-para-los-derechos-de-las-personas-sordas. Accessed 12 May 2020
4. Chacón, E., Aguilar, D., Sáenz, F.: Desarrollo de una Interfaz para el Reconocimiento Automático del Lenguaje de Signos. Quito (2013)
5. Quiroz Moreira, M.I.: Manual de gramática funcional de la lengua de signos ecuatoriana como estrategia de motivación a las y los estudiantes Sordos hacia la lectura. Loja (2014)
6. Ortiz Yépez, M.D.: Alfabetización del niño sordo en español escrito como segunda lengua: propuesta de estrategias metodológicas para alumnos del segundo año de educación básica del Inal. Quito (2018)
7. Heredia, V.: El lenguaje de señas, vital para la inclusión de sordos. El Comercio, 23 septiembre 2019
8. Consejo Nacional para la Igualdad de Discapacidades: Normas Jurídicas en Discapacidad Ecuador. Imprenta Don Bosco, Quito (2014)
9. Oviedo, A., Carrera, X., Cabezas, R.: Cultura Sorda (2015). https://cultura-sorda.org/ecuador-atlas-sordo/. Accessed 24 Apr 2020
10. Ochoa Peláez, V.: Técnicas holográficas aplicadas a la educación. Burgos (2018)
11. La Hora: 'Cafetería en señas', un espacio que fomenta la cultura sorda de Quito. La Hora, 28 agosto 2018
12. Fransesc, E. Oliveras: P & A Group, octubre 2016. https://blog.grupo-pya.com/comunicacion-verbal-no-verbal-diferencias-bases/. Accessed 19 May 2020
13. Diaz, P.S.: Calidad de vida y discapacidad auditiva en Chile. Chile (2016)
14. Montaño Prado, M.D.P.: Capacitación en lengua de señas a los estudiantes de ciencias de la educación de la Pontificia Universidad Católica del Ecuador, Sede Esmeraldas. Esmeraldas (2014)
15. Faletty, P.: La importancia de la detección temprana de la hipocausia. Revista Médica Clínica Las Condes, pp. 745–752 (2016)
16. De Arado Pérez, B.: Cultura Sorda (2011). https://cultura-sorda.org/lengua-de-senas/. Accessed 06 May 2020
17. Federación Mundial de Sordos (2016). http://wfdeaf.org/our-work/. Accessed 05 May 2020
18. Conadis, Cordicom y Fenasec: Manual práctico para intérpretes en lengua de señas ecuatoriana. Quito (2020)
19. Pailiacho, G.S., Guevara, B.D., Villegas Villacís: Diseño y construcción de una pirámide de proyección basada en la técnica "Fantasma de Pepper" para un sistema de realidad aumentada. Quito (2019)
20. Rondal Vásconez, F.E.: Diseño de un modelo holográfico en el ciclo del agua para atraer la atención y mejorar el aprendizaje de los alumnos de tercer año de educación general básica en la unidad educativa Pichincha en el año lectivo 2017–2018. Quito (2019)
21. García Paez, L.F., García Paez, M.J., Moreno Hernández, V., Pérez Hernández, K.A.: Fantasma de Pepper. XXVII Congreso de Investigación Cuam-ACMor (2017)
22. Vaca, D., Jácome, C., Saeteros, M., Caiza, G.: Braille grade 1 learning and monitoring system. In: 2018 IEEE 2nd Colombian Conference on Robotics and Automation (CCRA), Barranquilla, pp. 1–5 (2018). https://doi.org/10.1109/ccra.2018.8588144
Exploring Learning in Near-Field Communication-Based Serious Games in Children Diagnosed with ADHD Diego Avila-Pesantez1,2(B) , Sandra Santillán Guadalupe1 , Nelly Padilla Padilla1 , L. Miriam Avila3 , and Alberto Arellano-Aucancela1 1 Escuela Superior Politécnica de Chimborazo, Facultad de Informática Y Electrónica,
Riobamba, Ecuador [email protected] 2 Universidad Nacional Mayor de San Marcos, Facultad de Ingeniería En Sistemas E Informática, Lima, Peru 3 Escuela Superior Politécnica de Chimborazo, Facultad de Ciencias, Riobamba, Ecuador
Abstract. Attention deficit hyperactivity disorder (ADHD) is a common disorder, affecting more than 5% of children worldwide. This disorder includes symptoms such as hyperactivity, inattention, and impulsivity. Technological advances benefit the learning of this group of children, creating tools that strengthen their abilities and skills and support cognitive therapies. In particular, Serious Games (SG) are well-structured videogames for learning in entertainment settings. With this background, CIUDAD PUZZLE was designed with mini-puzzle games using Near Field Communication (NFC) technology. This SG helps strengthen the treatment of 20 children from 7 to 10 years old diagnosed with ADHD. For the experimentation, the children’s attention level was measured before and after using the SG, for which the Perception of differences test (faces-R) was used. As a result, a mean increase of 2.9 in the learning performance score was obtained. With this case study, it was possible to establish an improvement in the organization and completion of the children’s school assignments, with the children paying greater attention to the instructions of their teachers and parents. Keywords: Attention deficit hyperactivity disorder (ADHD) · CIUDAD PUZZLE · Serious Games · Near-Field Communication (NFC)
1 Introduction Attention deficit hyperactivity disorder (ADHD) affects more than 5% of children globally [1], which has a higher prevalence in boys than in girls [2]. This disorder has three core symptoms: inattention, hyperactivity, and impulsivity, which manifest to a greater or lesser degree depending on the subtype. The inattentive subtype predominates the attention deficit. It is the most frequent among children and significantly impacts the school level. Children suffering from the impulsive subtype tend to be more hyperactive and aggressive. The combined subtype is the most prevalent of all and has an impact © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 314–324, 2021. https://doi.org/10.1007/978-3-030-63665-4_25
on overall performance. For the symptoms of inattention, hyperactivity, and impulsivity to be associated with ADHD. Certain conditions must occur: they must appear before the age of 7 and remain in time for at least 6 months. They must affect two or more areas of the child’s life and have a significant impact, for instance, deteriorating their performance significantly. Early diagnosis and adequate treatment have been shown to allow a positive course of the disorder [3–7]. ADHD is a disorder that requires early intervention since it can negatively impact a child’s life until adulthood [8]. For this reason, it is necessary to look for technological alternatives that help with their treatment. In this sense, NFC technology and Serious Games (SG) can be considered tools that help to catch their attention, enhance motivation, and prevent boredom in traditional treatments. SG is a well-structured videogame designed for an educational purpose [9]. On the other hand, NFC is a radio-frequency technology that permits exchanging information wirelessly across a short-range (4–10 cm) [10]. These technologies have been implemented through multiple platforms and emerging new exciting applications in healthcare and education. Consequently, several researchers have developed SG, specifically to train cognitive areas and improve attention, memory, and task planning [11–14]. With this background, it is proposed to carry out a prototype of SG with the use of Near- Field Communication (NFC) technology, aimed to be a low-cost support tool in cognitive therapies and the learning process of children with ADHD, from the therapeutic center Lizarzaburu located in the city of Riobamba, Ecuador. This paper is presented with the following structure: The first section includes detailed information regarding ADHD and SG insight. The second section provides information about related works. Section 3 details the stages of development and operation of CIUDAD PUZZLE. Section 4 carries out a statistical analysis of the results gathered. The last section displays the conclusions of the work.
2 Related Works With the advancement in technology and SG development, several educational games have been implemented, which represent a valuable tool in education and rehabilitation [15]. Therefore, levels of difficulty, learning objectives, neutral user interfaces, and devices have been evaluated on different platforms. For instance, “Antonyms” is an SG that supports the control of patients’ impulsivity. It is based on a neuropsychological model, which causes self-regulation and stimulates metacognition [13]. The work developed by [16] proposes that ICT technologies like NFC could enhance psychopedagogical methodologies, improving the capacity for interaction, and integrate digital and physical educational games. Also, Ou and colleges developed an immersive virtual reality exercise game to enhance attention, abstract reasoning, cognitive ability, and complex information processing of children with ADHD with positive results [17]. The CogMed program, based on computer videogame improves the performance of working memory [18]. Regarding more recent evidence, ATHYNOS was developed with Augmented Reality technology and natural user interfaces with motion sensors. It allowed having the participants motivated, achieving a better level of concentration and mastery of their social skills [11]. Besides, in [19], they conclude that gamification and
the use of Serious Games can help in the learning process, enhancing executive functions and working memory and generating greater motivation in children and teenagers. As shown in the former proposals, most SGs interact with children through the keyboard/mouse, touchscreen, or movement sensors. This encouraged the design and development of the game “CIUDAD PUZZLE” using low-cost NFC technology with tags for this purpose.
3 Serious Game “CIUDAD PUZZLE” In order to develop a prototype of SG that allows acceptability, interactivity, and practical use by those children diagnosed with ADHD, the conceptual model for the SG design proposed by [20] was used. It is divided into four iterative phases (Analysis, Design, Development, and Evaluation), which are executed sequentially, except for the evaluation phase, carried out throughout the project. Phase 1 Analysis: The main objective is to define aspects of the SG, such as the concept, genre, user profile, elements of the game, and technical elements, complemented by the pedagogical agenda. These elements are defined through the specific requirements document (see Table 1). Table 1. Main components to requirements analysis Components • Concept/Game Idea (name, user profile, genre, game platform, player type, tech spec) • Pedagogical Agenda (educational and learning goals, conceptual knowledge, models or learning theories) • Teamwork • Budget estimate
Phase 2 Design: Here, it is established the environment, game narrative based on the educational objectives, scenarios, game objects, learning system, and architecture & technical spec, as shown in Fig. 1. Phase 3 Development: The coding of the SG is carried out iteratively, thus achieving an executable version of the SG using the software and hardware tools necessary for development, based on the best practices of programming. Phase 4 Evaluation: This phase evaluates and adjusts different aspects of the Serious Game, eliminating the number of detected errors. In this way, versions of the game are released to be verified. This stage validates each phase of the SG implementation and ends when the educational objective set in the project is reached.
Fig. 1. Components for the design of “CIUDAD PUZZLE” [20].
3.1 Ciudad Puzzle Gameplay CIUDAD PUZZLE is an SG intended to help treat children with ADHD between 7–10 years, strengthening attention, working memory, and concentration, complementing traditional psycho-pedagogical therapy. The central theme of the game is related to the most emblematic and tourist towns in Riobamba city. The gameplay consists of assembling puzzles with different numbers of pieces based on the game’s instructions. The interaction with the application is carried out through the NFC tag, which is affixed to each puzzle element, read through a mobile device. It allows the correct location to be identified within the scenario of the SG. It will depend on the participant’s solving capacity to advance to the next level (beginner, intermediate, and advanced). As a reward mechanism, the player will be awarded a diploma when all levels are completed. Figure 2 shows the general outline of the SG. The sketches of the characters and pieces were made in the analysis phase. For their design, the Adobe Illustrator tool was used, which facilitated 2D modeling. The C# language and the Unity 3D video game engine were used for programming, which enabled its development. In Tables 2 and 3, you can see several sketches and the selected characters. 3.2 Use of the NFC Tag and Tangible Interfaces The NFC Tag is placed in the physical piece of the puzzle (tangible interface), which will be identified with a code, through a mobile application (tag reader), before establishing a connection between them. It is essential to activate the NFC option on the smartphone to read the tag, which through a contactless system, identifies the code of each piece
Fig. 2. Flowchart of “CIUDAD PUZZLE”

Table 2. Examples of some sketches chosen for the creation of the puzzle (piece and description; the piece sketches are images in the original).
– The Cathedral: The facade is one of the historical relics rescued from the ruins of the old Riobamba, built in white calcareous stone.
– The train station: This construction was designed during the government of Eloy Alfaro (1925). It has a large plaza outside for tourists and local people to enjoy.
– Sucre Park: It was named like that in tribute to Marshal Antonio José de Sucre, who was the mentor of the battle of Tapi. It has the shape of a nautical rose, whose axis is the pool with the god Neptuno in the center.
Table 3. Sketches of the main characters of the SG (character and description).
– Psychologist: This character coordinates the activities inside the SG.
– Boy or Girl: These characters adopt the appearance of a caricaturized boy or girl between 7 and 10 years old. Through the avatar, the participants can choose their gender.
Fig. 3. Each piece has an attached/integrated NFC tag, which identifies it. An NFC device reads the tag and transmits the information to CIUDAD PUZZLE over a Wi-Fi network
of the puzzle. The code is transferred over a Wi-Fi network to the CIUDAD PUZZLE game, hosted on a PC or laptop. The SG recognizes and visualizes each of the pieces on the computer screen, analyzing their correct location within the puzzle scenario. This process is evidenced in Figs. 3 and 4.
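To make this interaction concrete, the following is a minimal Python sketch of the game-side check described above. It is only an illustration: the actual game was built in C# with Unity 3D, and the port number, tag codes, and slot coordinates used here are hypothetical.

```python
import socket

# Hypothetical mapping of NFC tag codes to their correct slot in the puzzle grid.
PIECE_SLOTS = {
    "CATHEDRAL_01": (0, 0),
    "CATHEDRAL_02": (0, 1),
    "TRAIN_STATION_01": (1, 0),
}

def listen_for_pieces(port=5005):
    """Receive piece codes sent by the NFC reader app over the Wi-Fi network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    while True:
        data, _addr = sock.recvfrom(1024)
        code = data.decode("utf-8").strip()
        slot = PIECE_SLOTS.get(code)
        if slot is None:
            print(f"Unknown piece: {code}")
        else:
            print(f"Piece {code} belongs in slot {slot}")
```

In this sketch the mobile reader app is assumed to send the tag code as a plain UTF-8 string over UDP; any equivalent transport over the Wi-Fi network would play the same role.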
Fig. 4. A screenshot of the Parque Sucre puzzle interacting with participant Irvin from the PC.
Fig. 5. Physical pieces of the puzzle containing the NFC tag, which interacts with the SG
4 Research Methodology This study’s development is supported by field and application research, which allowed us to identify the activities performed in therapies by professionals in psychology with children diagnosed with ADHD. The information gathered allowed us to base the development of the SG prototype. For evaluating the experiment, the Perception of differences Test (faces-R) [21] was used. It considers the attention and the aptitude to perceive, quickly and correctly, similarities, differences, and patterns, using 60 schematic face items. The test consists of determining which of the three faces is different from the other two. It can be applied individually or collectively in a time of approximately 3 min. Due to its playful and straightforward nature, this turns out to be a very well accepted task by the participants of this study [22, 23]. An example of the test is presented in Fig. 5.
Fig. 6. Screenshot of Perception of differences Test (faces-R) applied to measure attention [24]
4.1 Participants This study’s participants are children diagnosed with ADHD from the “Lizarzaburu” therapeutic center located in Riobamba city. For this activity, meetings were held with psychologists and therapists to inform them of the research’s purpose, who agreed to participate in the study. Furthermore, they selected the participants based on their experience. A total of 20 participants were chosen to validate the prototype, which had the parents’ consent. Of this set, 70% were male and 30% female. The age range of the participants was 7–10 years, with an average of 8.2. 40% of the participants were 7 years old; the rest were equally divided by the ages of 8, 9, and 10. 4.2 Procedure A quasi-experimental Pre-test/Post-test design was implemented with the group of participants, chosen at random. During therapy, the children carried out traditional activities to improve attention, using manual puzzles (pre-test). The SG prototype was used by the experimental group (post-test), which aims to reinforce the activity applied in therapy more entertainingly, making use of NFC technology, through tangible interfaces. Participants attended 2 weekly sessions of 15 min, for 3 months. A Control Group (CG) was established, who performed the therapies using the traditional method, and the Experimental Group (EG) used the “Ciudad Puzzle.” The groups were made up of 10 participants each. The pre-test and post-test evaluations were carried out using the survey technique for data collection, applying the Perception of differences Test (faces-R) as an instrument. 4.3 Results Cronbach’s alpha coefficient was used to verify the reliability of the collected data, which allowed measuring the internal consistency of the answers given by the participants
through the test. The coefficient result (0.908) shows a high degree of consistency and reliability. After that, the data obtained in the pre-test and post-test were analyzed using the Shapiro-Wilk test, demonstrating that the data have a normal distribution (pre-test p-value = 0.935 and post-test p-value = 0.917). The result establishes that the participants in the pre-test obtained an average score of 16/20 points, unlike the post-test, where the average score was 18.90/20 points. Consequently, the difference between both scores was 2.90 points, equivalent to 14.50%. These results are shown in Table 4.

Table 4. Statistical descriptive data obtained in software R

Statistics              Pre-test   Post-test
N (valid)               10         10
N (lost)                0          0
Mean                    16.00      18.90
Median                  16.00      19.00
Mode                    16         19
Std. deviation          1.944      1.449
Asymmetry               0.340      0.214
Asymmetry std. error    0.687      0.687
Kurtosis                −0.333     −0.987
Kurtosis std. error     1.334      1.334
Range                   6          4
Minimum                 13         17
Maximum                 19         21
Percentile 25           14.75      17.75
Percentile 50           16.00      19.00
Percentile 75           17.50      20.25
The t-test was chosen to contrast the means of the data obtained. This process raised the research question, RQ: Is there a significant difference between the data obtained in the pre-test versus the post-test in developing the level of concentration in children diagnosed with ADHD? Two hypotheses derive from this question: H0: There are no significant differences between the level of concentration in children diagnosed with ADHD using traditional therapy versus the application of the game CIUDAD PUZZLE. H1: There are significant differences between the level of concentration in children diagnosed with ADHD using traditional therapy versus the application of the game CIUDAD PUZZLE.
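As a reference for how this sequence of checks can be reproduced, the sketch below runs the normality test and the t-test in Python with SciPy. The authors report using the R software, and the score vectors shown here are invented solely to illustrate the procedure.

```python
from scipy import stats

# Hypothetical pre-test and post-test scores for the 10 participants.
pre = [13, 14, 15, 16, 16, 16, 17, 17, 18, 19]
post = [17, 17, 18, 19, 19, 19, 20, 20, 21, 21]

# Normality check for each set of scores (Shapiro-Wilk).
print(stats.shapiro(pre))
print(stats.shapiro(post))

# Paired t-test contrasting the pre-test and post-test means.
print(stats.ttest_rel(pre, post))
```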
The results obtained are shown in Fig. 6, highlighting the value of p = 0.000067 (where p < 0.05); therefore, the null hypothesis (H0) was rejected. Consequently, it is concluded that the therapeutic sessions supported by the CIUDAD PUZZLE game improve the concentration level of the participants, generating greater motivation.
5 Conclusions The Serious Game CIUDAD PUZZLE has proven to be a useful low-cost tool for cognitive therapy applied to children with ADHD, thanks to the intervention of tangible interfaces together with NFC technology. It allowed creating motivation and engagement to significantly improve attention, measured through the Perception of differences Test (faces-R). Besides, the SG contributed to know the city’s representative places, reflected in the designed puzzles, which motivated the participant’s interest in solving the SG’s beginner, intermediate, and advanced levels of difficulty. In addition, the therapists and teachers observed an improvement in the organization and completion of school assignments for the children who participated in this study, paying greater attention to the orders of their teachers and parents. However, it must be considered that each child has different behaviors, which implies disturbances typical of the disorder, which repress their actions and diminishes their performance in daily activities.
References 1. Sayal, K., Prasad, V., Daley, D., Ford, T., Coghill, D.: ADHD in children and young people: prevalence, care pathways, and service provision. JTLP, 5(2), 175–186 (2018) 2. Canals, J., Morales-Hidalgo, P., Jané, M.C., Domènech, E.: ADHD prevalence in Spanish preschoolers: comorbidity, socio-demographic factors, and functional consequences. JOAD, 22(2), 143–153 (2018) 3. Association, A.P.: Diagnostic and statistical manual of mental disorders (DSM-5®) (American Psychiatric Pub, 2013) (2013) 4. Danielson, M.L., Bitsko, R.H., Ghandour, R.M., Holbrook, J.R., Kogan, M.D., Blumberg, S.J.: Prevalence of parent-reported ADHD diagnosis and associated treatment among US children and adolescents, 2016. JOCCAP, 47(2), 199–212 (2018) 5. Giacobini, M., Medin, E., Ahnemark, E., Russo, L.J., Carlqvist, P.: Prevalence, patient characteristics, and pharmacological treatment of children, adolescents, and adults diagnosed with ADHD in Sweden. JOAD, 22(1), 3–13 (2018) 6. Thomas, R., Sanders, S., Doust, J., Beller, E., Glasziou, P.: Prevalence of attentiondeficit/hyperactivity disorder: a systematic review and meta-analysis. JP, 135(4), e994–e1001 (2015) 7. Wolraich, M.L., McKeown, R.E., Visser, S.N., Bard, D., Cuffe, S., Neas, B., Geryk, L.L., Doffing, M., Bottai, M., Abramowitz, A.: The prevalence of ADHD: its diagnosis and treatment in four school districts across two states. JOAD, 18(7), 563–575 (2014) 8. Storebø, O.J., Andersen, M.E., Skoog, M., Hansen, S.J., Simonsen, E., Pedersen, N., Tendal, B., Callesen, H.E., Faltinsen, E., Gluud, C.: Social skills training for Attention Deficit Hyperactivity Disorder (ADHD) in children aged 5 to 18 years. JCDOSR, 6 (2019) 9. Nazry, N.N.M., Romano, D.M.: Mood and learning in navigation-based serious games. JCIHB, 73, 596–604 (2017) 10. Singh, N.K.: Near-field Communication (NFC), JITAL, 39(2) (2020)
11. Avila-Pesantez, D., Rivera, L.A., Vaca-Cardenas, L., Aguayo, S., Zuñiga, L. (eds.): Towards the improvement of ADHD children through augmented reality serious games: Preliminary results. In: Book Towards the improvement of ADHD children through augmented reality serious games: Preliminary results (2018, edn.), pp. 843–848. IEEE (2018) 12. Bul, K.C., Franken, I.H., Van der Oord, S., Kato, P.M., Danckaerts, M., Vreeke, L.J., Willems, A., Van Oers, H.J., Van den Heuvel, R., Van Slagmaat, R.: Development and user satisfaction of “Plan-It Commander,” a serious game for children with ADHD. JGFHJ, 4(6), 502–512 (2015) 13. Colombo, V., Baldassini, D., Mottura, S., Sacco, M., Crepaldi, M., Antonietti, A. (eds.): Antonyms: a serious game for enhancing inhibition mechanisms in children with Attention Deficit/Hyperactivity Disorder (ADHD). In: Book Antonyms: A Serious Game for Enhancing Inhibition Mechanisms in Children with Attention Deficit/Hyperactivity Disorder (ADHD) (edn.), pp. 1–2. IEEE (2017) 14. Durkin, K., Boyle, J., Hunter, S., Conti-Ramsden, G.: Video games for children and adolescents with special educational needs. JZFP (2015) 15. Gaggioli, A., Ferscha, A., Riva, G., Dunne, S., Viaud-Delmon, I.: Human Computer Confluence: Transforming Human Experience Through Symbiotic Technologies (De Gruyter Open, 2016) (2016) 16. Miglino, O., Di Ferdinando, A., Di Fuccio, R., Rega, A., Ricci, C.K.: Bridging digital and physical educational games using RFID/NFC technologies. JOE-LAS, 10(3) (2014) 17. Ou, Y.-K., Wang, Y.-L., Chang, H.-C., Yen, S.-Y., Zheng, Y.-H., Lee, B.-O.: Development of virtual reality rehabilitation games for children with attention-deficit hyperactivity disorder. JOAIAC, 1–8 (2020) 18. Green, C.T., Long, D.L., Green, D., Iosif, A.-M., Dixon, J.F., Miller, M.R., Fassbender, C., Schweitzer, J.B.: Will working memory training generalize to improve off-task behavior in children with attention-deficit/hyperactivity disorder?. JN, 9(3), 639–648 (2012) 19. Crepaldi, M., Colombo, V., Mottura, S., Baldassini, D., Sacco, M., Antonietti, A.: Antonyms: a Computer Game to Improve Inhibitory Control of Impulsivity in Children with Attention Deficit/Hyperactivity Disorder (ADHD). JI, 11(4), 230 (2020) 20. Avila-Pesantez, D., Delgadillo, R., Rivera, L.A.: Proposal of a conceptual model for serious games design: a case study in children with learning disabilities. JIA, 7, 161017–161033 (2019) 21. Thurstone, L., Yela, M. (eds.): CARAS-R-Percepción de diferencias [FACES-Perception of differences]. Madrid: TEA Ediciones. In: Book CARAS-R-Percepción de diferencias [FACES-Perception of differences]. Madrid: TEA Ediciones (2012, edn.) (2012) 22. Monteoliva, J.M., Carrada, M.A., Ison, M.S.: Test de percepción de diferencias: estudio normativo del desempeño atencional en escolares argentinos (2017) 23. Zulueta, A., Iriarte, Y., Orueta, U.D., Climent, G.: AULA NESPLORA: avance en la evaluación de los procesos atencionales. Estudio de la validez convergente con el test de percepción de diferencias” caras”(versión ampliada). JIS, 4, 4–10 (2013) 24. http://web.teaediciones.com/CARAS-R-Test-de-Percepcion-de-Diferencias—Revisado. aspx
“Women Are Less Anxious in Systems Engineering”: A Comparative Study in Two Engineering Careers Ling Katterin Huaman Sarmiento1 , Claudia Mego Sanchez2 , Ivan Iraola-Real3(B) , Mawly Latisha Huaman Sarmiento2 , and Héctor David Mego Sanchez2 1 Universidad Tecnológica del Perú, Lima 15046, Peru
[email protected] 2 Universidad Nacional Mayor de San Marcos, Lima 15081, Peru
[email protected], [email protected], [email protected] 3 Universidad de Ciencias y Humanidades, Lima 15314, Peru [email protected]
Abstract. Various educational studies confirm that test anxiety is higher in women than in men, especially in engineering careers, which raises the need to also analyze differences according to career. For this reason, the objective of the present research was to identify the test anxiety levels and academic performance of female and male students in two engineering careers at two public universities. 144 students between 16 and 22 years of age were polled (M age = 17.78, SD = 1.40). Of these, 70 (47.9%) were students of the Systems Engineering career (45 males (64.3%) and 25 females (35.7%)), and 74 (52.1%) were from Civil Engineering (51 males (68.9%) and 23 females (31.1%)). Finally, it was concluded that women in Civil Engineering show higher anxiety levels before examinations, whereas in Systems Engineering they do not. Also, in both careers, there were no significant differences in academic performance. Keywords: Test anxiety · Academic performance · Gender differences · Engineering career
1 Introduction Engineers have an essential role in society, fulfilling different functions; for example, they intervene in the design and construction of infrastructures or design, programming, and administration of networks and computer technologies, which contribute in various areas such as health, education, environment, transport, among others. They are professionals with unique characteristics who are attributed to the ability to build and fix things [1]. For these reasons, it is considered a career of increased industrial demand associated with high market wages; therefore, for many young people, making the decision to study engineering is essential [2]. According to the study titled “The Reform of the Peruvian © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 325–334, 2021. https://doi.org/10.1007/978-3-030-63665-4_26
University System: Internationalization, Progress, Challenges and Opportunities” by the British Council [3]. It shows that the Engineering career is located in the fourth group with the highest demand from employers, showing a growth of 3% from 1996 to 2010. Its demand and acceptance were confirmed in the “First National Survey of Young Peruvians - 2011” in which the engineering career had a 17.6% preference among the adolescent public [4]. However, when choosing the Peruvian university to study in, this population opts for the National University of Engineering (UNI) or the National University of San Marcos (UNMSM) because they are public universities. And also, because according to Quacquarelli Symonds (QS) [5], the UNMSM is in position 68, and the UNI is ranked 119 out of the 400 best universities in Latin America, which is why they are considered the most prestigious Peruvian public universities. That is why, in the Peruvian case, studying Civil Engineering leads to having one of the best-paid careers in the field of employability, and one of the reasons why it continues with its specialization lies in generating a more excellent individual economy to satisfy its needs [6]. Thus, studying engineering is often the best choice to perform in a professional life [7]. Additionally, these careers offer significant contributions to the population, thanks to their role in constructing buildings, parks, tracks and sidewalks, sanitation of essential services, energy supervision, network administration and communication systems, development of information, among others. The engineering career allows the knowledge of the natural sciences, of the numerical areas, that of the skills acquired throughout life, as well as the praxis, besides, to the tangible resources existing in the world are taken advantage, to contribute positively in live people. For these reasons, engineers are needed, in areas of high vitality, in industrial optimization, and the life models of the inhabitants [8]. 1.1 Gender Differences in Engineering However, for a long time in the engineering field, it was the men who took all the praise, achieving an excellent social and economic status. And regarding Civil Engineering, it is a profession that has been characterized as a career “for men” due to the gender stereotypes that accompany it [9]; the fact that influences the misconception that there are careers for men and others for women, which from childhood [10], where children are provided with “toys for men”, and girls “women’s toys”. These facts are linked to manhood because people think wrongly that an engineering professional must possess strength, energy, agility, and a lot of courage to face situations that require greater rigor within the field of work. What would be a motive why women demonstrate insecurity in their abilities [11]. Besides, the results of the Programme International Student Assessment (PISA) of the Organization for Economic Cooperation and Development (OECD) showed higher scores for men [12], who generally outperform women in academic terms [13]. Being the reasons for which the women, assume a doubtful attitude regarding the disciplines related to careers of science, technology, engineering, and mathematics (Science, Technology, Engineering and Mathematics, STEM), where males exert a higher position the hierarchy, such is the case of engineering [11], since various stereotypes such as that of the weaker
sex, unable to carry out some activities, are fostered from childhood, leaving careers related to humanities such as psychology, education, medicine, among others. However, the evolution of women is evident through their inclusion in engineering careers [2]. Thanks to the inclusion processes, women have chosen jobs previously to their displeasure, since, for the female sex, engineering was not attractive to study. According these stereotypes, according to the men’s point of view, women seem too feminine to dedicate themselves to science [14], in addition to being more anxious and stressed than men [15]. 1.2 Test Anxiety and Academic Performance It is essential to consider that, in the engineering world, women have shown higher levels of anxiety [15], specifically higher levels of stress when faced with evaluations [16]; which, has generated lower ratings than men [17]. And although knowledge is available, fear encompasses a set of cases that the individual considers as risk whenever they question their abilities [18]. Anxiety involves emotionality and worry; the first, emotionality, refers to how the subject experiences her affection and the needs that her body presents as a consequence of anxiety; and the second, concern, is expressed through constant ideas regarding the events that could arise, such as, for example, obtaining unfavorable ratings [19]. Currently, thanks to the pioneers in engineering, there are already women in that field, such as Systems Engineering and Civil Engineering, during their academic training, situations arise that generate pressure, insecurity, panic, among others. emotions. Thus, mild fear and discomfort are expressed through anxiety [20]. In this context, it should be noted that careers such as Engineering, being demanding, generate levels of tension and anxiety in students. Thus, the possibility that students have of achieving good performance is affected by anxiety when faced with these tests [21]. However, it should be clarified that, for a woman, completing higher studies requires overcoming more challenging obstacles, compared to men [22], and the fact of facing evaluations could trigger in anxiety [16], lower grades [17], which could demonstrate poor educational abilities [23]. The previously analyzed allows us to understand that to carry out university studies, various psychosocial factors are required, such as motivation, family support, and emotional balance related to exam anxiety [24]. This type of anxiety is characterized by being harmful due to concern and cognitive disability [19, 25]. It is for these reasons that anxiety about exams is a psychological aspect of interest, since for Peruvian youths, undertaking university studies demands high levels of preparation, academic performance, and emotional control; being a type of anxiety, which unfortunately is more intense in women because they are studying very competitive careers [16]. Thus, according to the arguments previously presented, it is necessary to reflect on the academic problems of women in engineering careers [17], as well as on the various psychosocial factors that influence their emotions [24] and that, unlike men, they increase their levels of anxiety before the evaluations [16, 17], damaging their academic performance [19]. For these reasons, the objective of this research was to identify the predictive relationship between gender, test anxiety levels, and academic performance
in female and male students from two engineering majors at two prestigious public universities in Lima-Peru.
2 Methodology The methodology used is quantitative, so the attitudes in degrees of agreement and disagreement of the students were analyzed in self-report surveys through which subsequent statistical analyzes [26] of correlation of variables [27] and the predictive analysis using regressions [28]. 2.1 Sample From a population of 151 students, was selected a sample of 144 students (95.36%) from the first cycles of two engineering careers, who ranged in age from 16 to 22 years (M age = 17.78, SD = 1.40). 70 (47.9%) students from the Systems Engineering career participated (45 males (64.3%) and 25 females (35.7%)). And 74 (52.1%) were from Civil Engineering (51 men (68.9%) and 23 women (31.1%)) from two prestigious public universities in Lima - Peru. 2.2 Measures Sociodemographic Evaluation Registry: It consisted of identifying data such as age, gender, and place of origin (if they had been born in towns in the provinces or the capital city of Peru). Self-Assessment Test Anxiety Inventory (IDASE) [29]: The purpose of this inventory is to assess levels of concern and emotional reactions during evaluative situations. It has 20 items with four response dimensions on a Likert scale (Never to Always). For the present study, validity was assessed with the exploratory factorial analysis (EFA), and the sample fit test of Kaiser Meyer and Olkin (KMO), obtaining a score of .88, and Bartlett’s Sphericity Test was significant (χ 2 = 351,995, gl = 45, p < .001), proving to be a valid one-dimensional scale [30]. And finally, the reliability analysis with the internal consistency method reflected a Cronbach’s alpha coefficient of .93, demonstrating adequate reliability [31]. 2.3 Methodological Procedures The instruments were applied according to ethical research criteria, respecting the voluntary and anonymous participation of the participants [32]. Because in the sample there were participants of 16 and 17 years old, considered minors in Peru, they proceeded to act as established by Law No. 27337 of the Civil Code of the Child and Adolescent of Peru, which establishes that adolescents from the 16 years are considered people with all the mental faculties to decide freely [33]. The surveys were transcribed into a database designed with the Statistical Package for the Social Sciences software (SPSS, version 23) [34]. Subsequently, the validity analyzes of the IDASE instrument were performed applying the exploratory factor analysis (EFA) with the KMO test hoping to obtain a
score greater than or equal to .50 for it to be valid and for the Bartlett Sphericity Test to be significant (p < .001) [30]. Then, for the reliability analysis of, consistency was analyzed with Cronbach’s Alpha, hoping to reach a coefficient of at least .70 for the scale to be reliable [31]. When confirming the validity and reliability, we performed comparative analyzes career’s differences [35]. Then, the relationships between variables were analized to identify significant relationships [27]. And finally, linear regression analyzes were performed to identify the predictive power between them [28].
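For illustration, Cronbach's alpha can be computed directly from the item-response matrix, as in the following sketch. The analyses in this study were run in SPSS; the response matrix shown here is invented and much smaller than the 20-item IDASE data.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array with one row per respondent and one column per item."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Example with a small invented response matrix (4 respondents x 3 items).
responses = np.array([[1, 2, 2],
                      [3, 3, 4],
                      [2, 2, 3],
                      [4, 4, 4]])
print(round(cronbach_alpha(responses), 3))
```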
3 Results Once the psychometric properties (validity and reliability) had been analyzed, the analysis of results proceeded according to career (Systems Engineering and Civil Engineering). 3.1 Differences According to the Careers Given the number of participants in both samples, the normality analysis was performed with the Kolmogorov-Smirnov (KS) test [35]. Thus, in the Systems Engineering career, the analysis by gender identified that the variables do not have a normal distribution, and the two analysis variables proved to be significant (p < .05). Table 1 shows the Mann-Whitney test by gender (women and men), in which the significance values (asymptotic sig. p > .05) indicate that no study variable (academic performance and test anxiety) showed significant differences between women and men in the Systems Engineering career.

Table 1. Mann-Whitney test (Systems Engineering). Grouping variable: gender
Test statistics                Test anxiety   Academic performance
Mann-Whitney U                 1123.000       1110.000
Wilcoxon W                     1311.000       2215.000
Z                              −1.003         −.033
Asymptotic sig. (bilateral)    .248           .705
Then, in the Civil Engineering career, the Kolmogorov-Smirnov analysis by gender identified that the variables also do not have a normal distribution, and the two analysis variables proved to be significant (p < .05). Table 2 shows the Mann-Whitney test by gender (women and men), in which the significance value (asymptotic sig. p < .05) of the test anxiety variable confirmed a significant difference, with higher levels in females. In academic performance, men and women did not show a significance value indicating relevant differences (p > .05).
Table 2. Mann-Whitney test (Civil Engineering). Grouping variable: gender

Test statistics                Test anxiety   Academic performance
Mann-Whitney U                 1331.000       1213.000
Wilcoxon W                     1415.000       1750.000
Z                              −1.226         −.043
Asymptotic sig. (bilateral)    .000           .399
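As a reference, the gender comparison reported in Tables 1 and 2 can be reproduced with SciPy as sketched below; the authors used SPSS, and the score lists are placeholders rather than the study data.

```python
from scipy.stats import mannwhitneyu

# Hypothetical test-anxiety scores split by gender within one career.
anxiety_women = [52, 60, 58, 63, 55, 61, 59, 64]
anxiety_men = [48, 50, 53, 47, 55, 51, 49, 52]

u, p = mannwhitneyu(anxiety_women, anxiety_men, alternative="two-sided")
print(f"U = {u}, p = {p:.3f}")
```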
These results allow us to identify that, in the Civil Engineering career, women present higher levels of anxiety before the evaluations than men, unlike in the Systems Engineering career. 3.2 Correlations Between Variables in Systems Engineering The relationships between the study variables were analyzed, considering some demographic variables such as age, gender, and place of origin in order to explore the samples studied in more depth. For the present correlation analysis, the Cohen criteria [36] were used, which specify that correlations are slight when the coefficient is .10 to .23, moderate when it is .24 to .36, and strong when it is .37 or more. According to these criteria, Table 3 shows that there was no significant correlation between the variables studied in the sample of Systems Engineering students (p > .05).

Table 3. Correlations between variables in Systems Engineering

Variables                  1     2     3     4
1 Age                      –
2 Gender                   .25   –
3 Origin                   .04   .18   –
4 Test anxiety             .25   .21   .07   –
5 Academic performance     .11   .05   .03   .30

Note. Asterisks (*, **, ***) show significant relationships. * p < .05, ** p < .01, *** p < .001 (bilateral).
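For illustration, a correlation matrix like the one in Table 3 can be obtained in a few lines with pandas, as in the sketch below. The values are invented, and gender and origin are dummy-coded, so the gender–anxiety coefficient is effectively a point-biserial correlation.

```python
import pandas as pd

# Invented records: age, gender (0 = male, 1 = female), origin (0 = capital,
# 1 = province), test-anxiety score, and academic performance (grade).
df = pd.DataFrame({
    "age": [17, 18, 17, 19, 20, 18],
    "gender": [0, 1, 0, 1, 0, 1],
    "origin": [0, 0, 1, 1, 0, 1],
    "test_anxiety": [45, 62, 50, 66, 48, 58],
    "performance": [13, 14, 12, 15, 13, 14],
})

print(df.corr(method="pearson").round(2))
```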
3.3 Correlations Between Variables in Civil Engineering Then, in Table 4, it is observed that, in the student sample of the Civil Engineering career, test anxiety obtained a positive, significant, and strong relationship with gender (r = .39*; p < .05). It is important to note that men were assigned the lowest value and women the highest value in the database.

Table 4. Correlations between variables in Civil Engineering

Variables                  1     2     3     4
1 Age                      –
2 Gender                   −.18  –
3 Origin                   .06   .07   –
4 Test anxiety             .08   .39*  .26   –
5 Academic performance     .14   .06   −.13  .05

Note. Asterisks (*, **, ***) show significant relationships. * p < .05, ** p < .01, *** p < .001 (bilateral).
3.4 Regression Analysis in Civil Engineering A final linear regression analysis was performed to identify the predictive power between the variables [28] gender and test anxiety in Civil Engineering students. Table 5 shows that gender positively and significantly predicts exam anxiety (β = .39, p < .05), and the model explains 13% of the variance (R² = .13). The ANOVA for the model (dependent variable: test anxiety; predictor variable: gender) yielded a sum of squares of 1.800, gl = 1, F = 6.619, p < .05. This result shows that, unlike in the Systems Engineering career, among Civil Engineering students being female predicts more anxiety about exams, and being male predicts less anxiety about evaluations.

Table 5. Regression analysis in Civil Engineering

Predictors   Dependent variable: Test anxiety (R² = .13)
Gender       β = .39**, p < .05

Note. *p < .05, **p < .01, ***p < .001.
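As a reference, a regression with a single dummy-coded predictor of this kind can be sketched with SciPy as follows; the data are invented, and the authors worked in SPSS. With a single binary predictor, the standardized coefficient β equals the point-biserial correlation between gender and test anxiety.

```python
from scipy.stats import linregress

# Invented data: gender coded 0 = male, 1 = female; test-anxiety scores.
gender = [0, 0, 0, 0, 1, 1, 1, 1]
anxiety = [44, 50, 47, 52, 58, 63, 60, 66]

result = linregress(gender, anxiety)
print(f"slope = {result.slope:.2f}, r = {result.rvalue:.2f}, "
      f"R^2 = {result.rvalue**2:.2f}, p = {result.pvalue:.4f}")
```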
4 Discussion Some careers were stereotyped as careers assigned to each gender [10]; And one of those careers in Civil Engineering, often characterized by being a career for men [9], due to its association with physical work and physical strength related to the image of masculinity. While women feel insecure about performing in engineering careers for perceiving them as male-dominated careers [11]. The situation that is sometimes complicated by the levels of anxiety that women suffer [15, 20], and with the mistaken idea of being less competent than men [17]; further, the condition of mothers and the little family support which culminates in discouraging many women from studying engineering [37]. In the present study, it is explained that these are the reasons why significant differences are found regarding the levels of insecurity and anxiety of women in the Civil Engineering career and not in the Systems Engineering career. The gender stereotype could constitute one of the psychological factors that in the present study seem to be involved with the students [10, 24], confirming that specifically, gender is related predictively with anxiety before evaluations; being women the ones that present higher levels of test anxiety than men [16, 17] which is also related to the low self-confidence they have, mainly in the Civil Engineering career. What could prove, being more associated with male virility, generates higher levels of insecurity or anxiety in women [11], since here they come to carve out gender stereotypes that end up making people believe that because of their feminine appearance must choose careers in which they will not make a physical effort because such are only for men, which makes them suppose that the profession of Systems Engineering being more associated with the field of computing, not related to physical strength, it allows women to feel more secure when studying it and also when it was assessed. Finally, it is essential to mention that in both samples, no significant differences in academic performance were observed between women and men. This was followed in both the Civil Engineering and Systems Engineering samples; A result that could indicate that perhaps women, despite having or not presenting levels of test anxiety in the various careers, achieve the same levels of learning expressed in their academic performance [38], because unlike men they are more willing to accept that they experience anxiety and also that their scores depend on several factors other than gender differences [17]. This finding does not correspond to the OECD studies [12] regarding the higher level of academic performance of men; and also disagreeing with the statements that affirm that men generally outperform women in academic qualifications [13].
5 Conclusions and Future Works As analyzed, it can be concluded that in the Systems Engineering career, women and men present the same levels of anxiety before exams and the same level of academic performance. But in the Civil Engineering career, women give higher levels of anxiety before exams. Finally, with this finding, it was possible to determine the predictive relationship that allows us to anticipate that being female is associated with a higher level of anxiety before examinations in students of the Civil Engineering career. These findings invite further studies related to the gender factor in some engineering careers (perhaps
stereotyped as men’s career) and how they influence some academic variables related to student performance. According to this conclusion, studies could be carried in professions such as Mining Engineering, Mechanics, Electronics, Petroleum, Petrochemicals, etc. Analyzing various psychological variables such as motivation, self-efficacy, academic engagement, burnout, perseverance, academic procrastination, interests, and vocational skills, etc. To later implement measures that influence better conditions in the learning climate to receive more women in engineering.
References 1. Montoya, L., Serna, E.: Ingeniería: Realidad de una disciplina. Medellín. Editorial Instituto Antiqueño de Investigación, Antioquia (2018) 2. Panaia, M.: La inclusión de la mujer en la profesión de ingeniería. Rev. Virajes 16(1), 19–43 (2014) 3. British Council: The Reform of the Peruvian University System: Internationalization, progress, challenges, and opportunities (2016). https://www.britishcouncil.pe/sites/default/ files/the_reform_of_the_peruvian_university_system_interactive_version_23_02_2017.pdf. Accessed 25 May 2020 4. Instituto Nacional de Estadística e Informática del Perú: Primera Encuesta Nacional de la Juventud Peruana, Lima (2011) 5. Quacquarelli Symonds (QS): QS University Ranking: Latin América (2020). https://www.top universities.com/university-rankings/latin-american-university-rankings/2020. Accessed 25 May 2020 6. Cienfuegos, B.: Análisis y prospectiva de la mujer peruana en las ciencias y la ingeniería (2012). https://www.researchgate.net/publication/234077363_Analisis_y_prospec tiva_de_la_mujer_peruana_en_las_ciencias_y_la_ingenieria. Accessed 08 June 2020 7. Nava, J.: Estudiar ingenierías: implicaciones desde las narrativas de un grupo de estudiantes. Rev. Invest. Educativa (26), 87–113 (2018) 8. Olivos, E.: Significado de los intervalos de confianza para los estudiantes de Ingeniería en México. Tesis doctoral en Didáctica de la Matemática, Departamento de Didáctica de la Matemática, Universidad de Granada, Granada (2008) 9. López, A., Gálvez, J.: Trayectoria escolar y género en ingeniería civil, el caso de la UAEMéx. Rev. Cient. CIENCIA ergo-sum 17(1), 89–96 (2009) 10. Rosselló, M.: Estereotipación de profesiones en la infancia. Master thesis, Universitat de les Illes Balears, España (2014) 11. Tellhed, U., Backstrom, M., Bjorklund, M.: Will i fit in and do well? The importance of social belongingness and self-efficacy for explaining gender differences in interest in STEM and HEED majors. Sex Roles (77), 86–96 (2016). https://doi.org/10.1007/s11199-016-0694-y 12. Organization for Economic Cooperation and Development (OCDE): PISA 2015 results in focus. Programme for International Student Assessment – Organization for Economic Cooperation and Development, Paris (2018) 13. Arias, N., Rincón, W., Cruz, J.: Desempeño de mujeres y hombres en educación superior presencial, virtual y a distancia en Colombia. Rev. Panorama 12(22), 58–69 (2018). http:// dx.doi.org/10.15765/pnrm.v12i22.1142 14. Banchefsky, S., Westfall, J., Park, B., Judd, M.: But you don’t look like a scientist! Women scientists with feminine appearance, are deemed less likely to be scientists. Sex Roles 75(3–4), 95–109 (2016). http://dx.doi.org/10.1007/s11199-016-0586-1 15. Nguyen, D.: Employing relaxation techniques for female STEM majors to reduce anxiety. Saint Mary’s College of California (2016)
16. Chávez, H., Chávez, J., Ruelas, S., Gómez, M.: Ansiedad ante los exámenes en los estudiantes del Centro Preuniversitario – UNMSM 2012-I. Rev. de Invest. Psicología 17(2), 187–201 (2014) 17. Harris, R., Grunspan, D., Pelch, M.A., Fernandes, G., Ramirez, G., Freeman, S.: Can test anxiety interventions alleviate a gender gap in an undergraduate STEM course? CBE: Life Sci. Educ. 18(35), 1–9 (2019). https://www.lifescied.org/doi/pdf/10.1187/cbe.18-05-0083. Accessed 25 May 2020 18. Álvarez, E.: Hipnosis en un caso de ansiedad. Rev. Horizonte de Enfermería 19(1), 57–62 (2020). https://doi.org/10.7764/Horiz_Enferm.19.1.57 19. Dominguez, S.: Test anxiety inventory-state: preliminary analysis of validity and reliability in psychology college. Liberabit 22(2), 219–228 (2016) 20. Joselovsky, A.: El origen de la ansiedad como frenar el síntoma frente a la ansiedad, 1st edn. EDIC B, Barcelona (2016) 21. Onyeizugbo, E.: Auto-eficacia, sexo y rasgo de ansiedad como moderadores de la ansiedad ante exámenes. Electron. J. Res. Educ. Psychol. 8(1), 299–312 (2010) 22. Instituto Nacional de Estadística, Geografía e Informática (INEGI): Estadísticas a propósito del día de la madre, datos nacionales, México (2008) 23. Furlan, L., Rosas, J., Heredia, D., Piemontesi, S., Illbele, A.: Estrategias de aprendizaje y ansiedad ante los exámenes en estudiantes universitarios. Rev. Pensamiento Psicológico 5(12), 117–124 (2010) 24. León, J., Sugimaru, C.: Entre el estudio y el trabajo: Las decisiones de los jóvenes peruanos después concluir la educación básica regular. GRADE, Lima (2013) 25. Serrano, L., Escolar, M.: Description of the general procedure of a stress inoculation program to cope with test anxiety. Psychology 5, 956–965 (2014) 26. Leavy, P.: Research Design: Quantitative, Qualitative, Mixed Methods, Arts-Based, and Community-Based Participatory Research Approaches. The Guilford Press, London (2017) 27. Creswell, J.: Research Design: Qualitative, Quantitative, and Mixed Methods Approaches, 4th edn. Sage, Los Ángeles (2014) 28. Bingham, N., Fry, J.: Regression: Linear Models in Statistics. Springer, New York (2010) 29. Bauermeister, J.: Estrés de evaluación y reacciones de ansiedad ante la situación de examen. Avances en Psicología Clínica Latinoamericana 2, 69–88 (1989) 30. Field, A.: Discovering Statistics Using SPSS, 3rd edn. Sage Publications, Lóndres (2009) 31. Aiken, R.: Psychological Testing and Assessment, 11th edn. Allyn & Bacon, Boston (2002) 32. Goodwin, J.: Research in Psychology: Methods and Design. Brujas, Córdova (2010) 33. Congreso de la República: Código del Niño y del Adolescente. Ley N°. 27337, Lima, Perú (2000) 34. IBM Knowledge Center. Ibm.com (2016). http://www.ibm.com/support/knowledgecenter/ SSLVMB_23.0.0/spss/product_landing.dita. Accessed 10 July 2020 35. Guillaume, F.: The signed Kolmogorov-Smirnov test: why it should not be used. GigaScience 4(1) (2015). https://doi.org/10.1186/s13742-015-0048-7 36. Cohen, J.: A power primer. Psychol. Bull. 112(1), 155–159 (1992) 37. Rodríguez, O.: Why women do not want to be engineers? Case: female students of industrial engineering in UPTC. Master thesis, Universidad Politécnica de Cartagena, España (2016) 38. Peters, M.: Examining the relationships among classroom climate, self-efficacy, and achievement in undergraduate mathematics: a multi-level analysis. Int. J. Sci. Math. Educ. 11(2), 459–480 (2013). https://doi.org/10.1007/s10763-012-9347-y
Sorting Algorithms and Their Execution Times an Empirical Evaluation Guillermo O. Pizarro-Vasquez1(B) , Fabiola Mejia Morales1 , Pierina Galvez Minervini1 , and Miguel Botto-Tobar2,3 1 Salesian Polytechnic University, Guayaquil, Ecuador
[email protected] 2 Universidad de Guayaquil, Guayaquil, Ecuador 3 Eindhoven University of Technology, Eindhoven, The Netherlands
Abstract. One of the main topics in computer science is how to perform data sorting without requiring plenty of resources and time. The sorting algorithms Quicksort, Mergesort, Timsort, Heapsort, Bubblesort, Insertion Sort, Selection Sort, Tree Sort, Shell Sort, Radix Sort, and Counting Sort are among the most recognized and used. The existence of different sorting algorithm options led us to ask: What is the algorithm that gives us the better execution times? In this context, it was necessary to understand the various sorting algorithms in the C and Python programming languages in order to evaluate them and determine which one has the shortest execution time. We implemented algorithms that help create four types of integer arrays (random, almost ordered, inverted, and few unique). We implemented eleven sorting algorithms to record each execution time, using different numbers of elements and iterations to verify the accuracy. We carried out the research using the integrated development environments Dev-C++ 5.11 and Sublime Text 3. The results allow us to identify different situations in which each algorithm shows better execution times. Keywords: Sorting · Sorting algorithms · Standard dataset · Integrated development environment · Execution time
1 Introduction One of the fundamental issues related to computer science is how to perform data sorting without requiring a lot of resources and time. We can define sorting as organizing a disordered collection of items to increase or decrease order [1]. Sorting and, by extension, sorting algorithms are critical to several tasks. Sorting algorithms can help remove or merge data through sorting by the primary uniqueness criterion; they are also useful in finding out where two broad sets of elements differ. By the same logic, sorting algorithms can also determine which data appears in both datasets. Over time sorting algorithms have been implemented in almost all programming languages; therefore, they combine multiple language components with helping new programmers learn how to code. Additionally, it is nearly impossible to discuss sorting without mentioning performance. Performance is the key for all systems to function © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 335–348, 2021. https://doi.org/10.1007/978-3-030-63665-4_27
efficiently; thus, we can claim that sorting helps us understand performance, which leads to an improvement in software structures and designs. Since the early days of computer science, the sorting problem has been a prevalent research topic due to the difficulty of sorting efficiently with precise and straightforward code. One of the purposes of sorting is to minimize the execution time of a group of tasks. Hence, multiple algorithms have been developed and improved to sort faster, and for this, it is necessary to know the computer specifications, the program design methodology, and the software architecture [2]. Some algorithms can be very complex depending on their execution; as an example, we have Bubblesort, which has been a subject of study since 1956 [3] and can be very complex compared to Shell Sort, which requires less execution time to perform a sort [4]. We selected the sorting algorithms Quicksort, Mergesort, Timsort, Heapsort, Bubblesort, Insertion Sort, Selection Sort, Tree Sort, Shell Sort, Radix Sort, and Counting Sort because they are the most recognized and widely used. In this research, it is necessary to stipulate that we do not implement memory management algorithms, as we only measure the performance of the sorting algorithms without additional code. There are two types of sorting: internal sorting and external sorting. Internal sorting methods store the values to sort in main memory; therefore, we assume that the time required to access any item is the same. On the other hand, external sorting methods store the values to sort in secondary memory, so the time required to access any item depends on the last position accessed. Sorting algorithms fall into two classes: comparative and non-comparative [5]. In a comparison-based sorting algorithm, the disordered data are sorted by repeatedly comparing pairs of items; if a pair is out of order, the two items are interchanged [6]. This exchange operation is known as a comparison exchange. Non-comparison sorting algorithms classify data using specific, well-established properties of the data, such as their distribution or binary representation [7]. Four parameters characterize a sorting algorithm: stability, adaptability, time complexity, and space complexity [8]. Comparison-based sorting algorithms generally fall into two subdivisions: complexity O(n²) and complexity O(n log n). In general, the O(n²) sorting algorithms run slower than the O(n log n) algorithms; despite this, the O(n²) sorting algorithms are still fundamental in computer science. One benefit of the O(n²) algorithms is that they are non-recursive, requiring much less RAM. Another application of the O(n²) sorting algorithms is sorting small arrays: because the O(n log n) sorting algorithms are recursive, they are inappropriate for small arrays, where they perform poorly (Table 1). We can highlight that, among the O(n²) sorting algorithms, Selection Sort and Insertion Sort are the best-performing algorithms on general data distributions [9]. Several authors have carried out experiments to determine which of these algorithms has the best execution times; most of them indicate that Quicksort is the ideal one; however, the authors of these experiments only venture to compare a maximum of 9 sorting algorithms at a time [10–13].
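To make the idea of a comparison exchange concrete, the following short Python sketch shows the classic O(n²) pattern in which adjacent pairs are compared and swapped when out of order; it is a textbook illustration, not the exact code used in the experiment.

```python
def bubble_sort(values):
    """Comparison-exchange sort on a copy; returns the sorted list and the swap count."""
    data = list(values)
    swaps = 0
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            if data[i] > data[i + 1]:                        # comparison
                data[i], data[i + 1] = data[i + 1], data[i]  # exchange
                swaps += 1
    return data, swaps

print(bubble_sort([5, 1, 4, 2, 8]))  # ([1, 2, 4, 5, 8], 4)
```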
Table 1. Sorting algorithms used and their complexity.

Sorting algorithm      Complexity   Memory     Method
Bubblesort [BS]        O(n^2)       O(1)       Exchanging
Insertion sort [IS]    O(n^2)       O(1)       Insertion
Counting sort [CS]     O(n + k)     O(n + k)   Non-comparison
Mergesort [MS]         O(n log n)   O(n)       Merging
Tree sort [TrS]        O(n log n)   O(n)       Insertion
Radix sort [RS]        O(nk)        O(n)       Non-comparison
Shell sort [ST]        O(n^1.25)    O(1)       Insertion
Selection sort [SS]    O(n^2)       O(1)       Selection
Heapsort [HS]          O(n log n)   O(1)       Selection
Quicksort [QS]         O(n log n)   O(log n)   Partitioning
Timsort [TS]           O(n log n)   O(n)       Insertion & Merging
Also, it is necessary to mention that response times may vary depending on the CPU, RAM, and other specifications of the computer on which the algorithms are run. In this context, obtaining concrete results requires executing the sorting with different amounts of data and repeating the process several times to verify the integrity of the results. Consequently, the variety of available sorting algorithms leads us to ask: Which algorithm gives the best execution times? Does the programming language have any impact on the performance of the sorting algorithms? Furthermore, how can the integrity of the results be verified? For these reasons, it is necessary to carry out a complete experiment that indicates which sorting algorithm has the best execution times. Thus, the research objective was to implement the various existing sorting algorithms in the C and Python programming languages and evaluate them. The research consists of three main steps. First, generator algorithms create four types of integer arrays (random data, nearly sorted data, reverse-sorted data, and random data grouped into a few categories), which form the standard datasets. Second, the eleven sorting algorithms are implemented and run on the standard datasets, recording each algorithm's times for different numbers of elements and iterations to check the integrity of the results. Finally, the obtained results are analyzed. In the following sections we describe the methodology of the experiment in more detail, the steps followed, the tools used, and the results obtained at each stage.
2 Materials and Methods

Runtimes may vary depending on the characteristics of the CPU, RAM, and other specifications of the computer on which they are measured. Therefore, it is necessary to indicate the
specifications of the equipment used. In this research, we used a 64-bit computer with an Intel i5 processor, 8 GB of RAM, and a Windows 10 operating system. To start the investigation, we studied the code of the algorithms that generate the integer arrays, considering that the sorting algorithms would be used to sort four different types of integer arrays in ascending order. The algorithms used to generate the collections are [14]:

• Random – generates random numbers with uniform distribution.
• Nearly Sorted – generates an array of numbers sorted in ascending order and then introduces some randomness: 20% of the items, at no specific positions, are replaced with other random values.
• Reversed – generates an array of numbers sorted in descending order.
• Few Unique – generates an array with a fixed number of categories m (in this case, five). For an array of size N, m random numbers are drawn and each of them is repeated M = N/m times until the array of size N is complete; no sorting is applied.

A minimal sketch of these four generators is given after Table 2. Once we implemented these algorithms, each array was stored in a text file (.txt) so that every sorting algorithm would consume the same dataset. The intention was to generate datasets with 100; 1,000; 10,000; 100,000; and 1,000,000 elements for each generator. Still, some of the generators threw errors when trying to produce integer arrays of more than 100,000 items, in both Dev-C++ and Python (Table 2).

Table 2. List of files with data generated by the generator algorithms

Algorithm for generating integer arrays   100   1,000   10,000   100,000   1,000,000
Random                                    ✓     ✓       ✓        ✓         X
Nearly sorted                             ✓     ✓       ✓        ✓         X
Reversed                                  ✓     ✓       ✓        ✓         X
Few unique                                ✓     ✓       ✓        ✓         ✓
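As a reference for the four generators described above, here is a minimal Python sketch (function names, value ranges, and the use of Python's random module are our own assumptions; the original generator code is the one referenced in [14]):

import random

def gen_random(n, low=0, high=10**6):
    # Random: uniformly distributed random integers.
    return [random.randint(low, high) for _ in range(n)]

def gen_nearly_sorted(n, fraction=0.2, low=0, high=10**6):
    # Nearly Sorted: ascending array with about 20% of positions replaced by random values.
    data = sorted(gen_random(n, low, high))
    for _ in range(int(n * fraction)):
        data[random.randrange(n)] = random.randint(low, high)
    return data

def gen_reversed(n, low=0, high=10**6):
    # Reversed: array sorted in descending order.
    return sorted(gen_random(n, low, high), reverse=True)

def gen_few_unique(n, m=5, low=0, high=10**6):
    # Few Unique: m random values, each repeated M = n // m times; no sorting.
    values = [random.randint(low, high) for _ in range(m)]
    return [v for v in values for _ in range(n // m)]

if __name__ == "__main__":
    generators = {"random": gen_random, "nearly_sorted": gen_nearly_sorted,
                  "reversed": gen_reversed, "few_unique": gen_few_unique}
    for name, gen in generators.items():
        with open(f"{name}_1000.txt", "w") as f:   # one dataset per text file
            f.write(" ".join(map(str, gen(1000))))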
Since creating datasets with 1,000,000 items was not possible with all the generators, we decided to use datasets of up to 100,000 items. We then obtained the code of the sorting algorithms (the references for the code are in [15]) and adapted it in Dev-C++ and Python so that each implementation consumes the previously generated data files and runs for a given number of iterations (Fig. 1). The process is carried out with 1, 10, 100, 1,000, and 10,000 iterations, recording the execution times in a spreadsheet file for later analysis. It is essential to mention that the execution times of the algorithms are reported in milliseconds.
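The following Python sketch illustrates this measurement step (a simplified harness of our own, not the study's exact code; the dataset file name is a placeholder and the built-in sorted stands in for any of the eleven sorters):

import time

def load_dataset(path):
    # Datasets are stored as whitespace-separated integers in a text file.
    with open(path) as f:
        return [int(x) for x in f.read().split()]

def average_time_ms(sort_fn, data, iterations):
    # Run sort_fn `iterations` times on a fresh copy of the data and
    # return the average execution time in milliseconds.
    total = 0.0
    for _ in range(iterations):
        copy = list(data)             # every run sorts the same unsorted input
        start = time.perf_counter()
        sort_fn(copy)
        total += time.perf_counter() - start
    return (total / iterations) * 1000.0

if __name__ == "__main__":
    data = load_dataset("random_1000.txt")              # placeholder file name
    for iterations in (1, 10, 100, 1000, 10000):
        ms = average_time_ms(sorted, data, iterations)  # `sorted` is a stand-in sorter
        print(f"{iterations:>6} iterations: {ms:.4f} ms")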
Fig. 1. Experiment process diagram
As a result of the analysis of the execution times, we obtained four tables, one for each type of integer-array generator, with the average execution times of each sorting algorithm grouped by the number of iterations. Furthermore, from these files, the graphs and figures were produced using the R programming language. The tables and figures are shown and explained in the next section.
3 Results

The eight tables with the execution times were summarized in two tables to facilitate the comprehension and analysis of the results. Appendix 1 shows the execution-time averages for the Random, Nearly Sorted, Few Unique, and Reversed data in C. For most algorithms, the execution and sorting were satisfactory, but in some cases the consumption of a resource was so high that the program returned an error message. Appendix 2 shows the execution-time averages for the Random, Nearly Sorted, Few Unique, and Reversed data in Python. In Python, an error caused by excessive memory consumption prevented the complete execution of the sorting with the 10,000- and 100,000-element datasets.

Some algorithms have a much wider range of execution times than others. Therefore, to organize the results more clearly, we divide the sorting algorithms into two categories: "efficient algorithms," with a standard range of execution times, and "inefficient algorithms," with a much wider range of execution times. The "inefficient algorithms" are those whose execution times exceed those of the remaining algorithms by a margin of at least 100 ms, namely Bubblesort, Insertion sort, and Selection sort; the remaining algorithms are considered "efficient algorithms." There are differences between C and Python; one of them is that the range of Python runtimes is much wider. Considering the average execution times for sorting random data among the "efficient algorithms," Heapsort is the one with the highest execution times. On the other hand,
Timsort is the sorting algorithm with the lowest execution times in Python; in C, Tree sort is the most efficient one. For the average execution times when sorting nearly sorted data among the "efficient algorithms," Heapsort has the highest execution times in C, but only up to 1,000 iterations; when the number of iterations surpasses 1,000, Radix sort becomes the worst one. In Python, Tree sort is the one with the worst execution times. In all cases involving the "inefficient algorithms," Bubblesort obtained the worst execution times.

Table 3. Most efficient & least efficient algorithms

Efficient algorithms
Data type       Most efficient (C)   Most efficient (Python)   Least efficient (C)   Least efficient (Python)
Random          Tree sort            Timsort                   Heapsort              Heapsort
Nearly Sorted   Tree sort            Timsort                   Radix sort            Tree sort
Reversed        Tree sort            Timsort                   Heapsort              Tree sort
Few Unique      Tree sort            Timsort                   Heapsort              Heapsort

Inefficient algorithms
Data type       Most efficient (C)   Most efficient (Python)   Least efficient (C)   Least efficient (Python)
Random          Insertion sort       Selection sort            Bubblesort            Bubblesort
Nearly Sorted   Insertion sort       Insertion sort            Bubblesort            Bubblesort
Reversed        Insertion sort       Selection sort            Bubblesort            Bubblesort
Few Unique      Insertion sort       Selection sort            Bubblesort            Bubblesort
Table 3 specifies the final results for each type of integer array according to the programming language and the efficiency of the sorting algorithms. Figure 2 shows that Tree sort is the most efficient sorting algorithm in C; however, it has higher execution times than Timsort in Python. It is crucial to point out that Timsort and Counting sort in C would have been the most efficient ones, but they failed to sort more than 1,000 elements. Thus, we can still say that Timsort and Counting sort are the most efficient algorithms in C when sorting only a few items. Contrary to Fig. 2, Fig. 3 shows that in both programming languages Bubblesort is the most inefficient algorithm, and that its execution times are longer in Python than in C.
Fig. 2. Most efficient sorting algorithms
Fig. 3. Most inefficient sorting algorithms
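Since Tree sort [TrS] stands out as the most efficient algorithm in C, the following minimal Python sketch illustrates the idea behind it: insert every element into a binary search tree and read it back with an in-order traversal. This is our own unbalanced, recursive illustration; the code actually benchmarked is the one referenced in [15] and may differ, for example by balancing the tree.

class _Node:
    # Minimal binary-search-tree node.
    __slots__ = ("key", "left", "right")
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def _insert(node, key):
    # Unbalanced BST insertion; equal keys go to the right subtree.
    if node is None:
        return _Node(key)
    if key < node.key:
        node.left = _insert(node.left, key)
    else:
        node.right = _insert(node.right, key)
    return node

def _inorder(node, out):
    # In-order traversal emits the keys in ascending order.
    if node is not None:
        _inorder(node.left, out)
        out.append(node.key)
        _inorder(node.right, out)

def tree_sort(items):
    root = None
    for x in items:
        root = _insert(root, x)
    out = []
    _inorder(root, out)
    return out

print(tree_sort([7, 2, 9, 2, 5]))   # [2, 2, 5, 7, 9]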
With everything established above, we now know which algorithms give the best execution times, regardless of the number of elements and iterations.
4 Discussion

This research found that the programming language significantly impacts how the sorting algorithms behave and how much data they can sort. An example of this is
that Timsort is the most efficient sorting algorithm in Python, while Tree sort is the best one in C, even though both have the same complexity, O(n log n). Also, Bubblesort, with complexity O(n^2), is the one with the worst execution times, no matter the number of elements or iterations.

In the paper "Analysis and Review of Sorting Algorithms" [1], the authors used only five sorting algorithms, among them Bubblesort, Insertion Sort, Selection Sort, and Quicksort. Their research also concluded that Bubblesort is only suitable for small lists or arrays because it has the worst performance. Another paper with similar results is "Analysis and Testing of Sorting Algorithms on a Standard Dataset" [10], in which, once more, Bubblesort had the worst execution times. In their case, the authors worked with nine sorting algorithms programmed in C++. They also used different datasets, and the best algorithm in their case was Counting sort, which also gave us good execution times but did not run with our larger arrays. Timsort and Tree sort do not appear in that paper. In "Experimental study on the five sort algorithms," the authors demonstrated that the number of items in the dataset or array has a considerable impact on a sorting algorithm's performance, and that each sorting algorithm is suitable for a specific situation: if patterns or rules are found in the input sequence, Insertion sort and Bubblesort are suitable options, whereas when the input scale is large, Merge sort and Quicksort are the main choices [12].

Timsort was created in 2002 by Tim Peters [16] for use in the Python language. It is a hybrid sorting algorithm based on Insertion Sort and Merge Sort: it works on blocks that are sorted one by one using insertion, and the sorted blocks are then merged using the merge operation of Merge Sort [17]. Its popularity has increased, which has opened the way to a series of investigations into its behavior, such as "Monte Carlo simulation of polymerization reactions: optimization of the computational time," in which the authors analyzed the Monte Carlo simulation of a steady-state polymerization process to reduce the overall computational time. They compared four sorting algorithms, Timsort, Bubblesort, Insertion Sort, and Selection Sort, and found that Timsort was the most efficient algorithm in that implementation and Bubblesort the one with the worst time [18].

The authors of "Binary Tree Sort is More Robust Than Quick Sort in Average Case" [19] explained that Binary Tree sort can be used even when the elements to be sorted do not follow a uniform distribution. They showed that the robustness of Tree sort is a decisive factor, rather than focusing only on the algorithm's complexity. This makes it easier to explain why Tree sort performed better with the larger arrays in C. "Best sorting algorithm for nearly sorted lists" compares five algorithms, Insertion Sort, Shell sort, Merge Sort, Quicksort, and Heapsort, on nearly sorted lists. Their test results showed that Insertion Sort is best for small or very nearly sorted lists and that Quicksort is better otherwise [20]; Insertion Sort was also the best "inefficient algorithm" in our experiment. They concluded that no single sorting method is best for every situation. For that reason, it is necessary to keep experimenting with and comparing new sorting algorithms, which is why we used a larger set of sorting algorithms.
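To illustrate the run-then-merge idea behind Timsort described above, here is a deliberately simplified Python sketch (our own illustration with a fixed run size; real Timsort, as implemented in CPython, additionally detects natural runs and uses galloping and other optimizations not shown here):

RUN = 32   # fixed block ("run") size used by this simplified sketch

def _insertion_sort(a, lo, hi):
    # Sort the slice a[lo:hi + 1] in place by insertion.
    for i in range(lo + 1, hi + 1):
        key, j = a[i], i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def _merge(a, lo, mid, hi):
    # Merge the two sorted slices a[lo:mid + 1] and a[mid + 1:hi + 1].
    left, right = a[lo:mid + 1], a[mid + 1:hi + 1]
    i = j = 0
    k = lo
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1
        k += 1
    a[k:hi + 1] = left[i:] + right[j:]   # copy whichever side still has elements

def simple_timsort(a):
    n = len(a)
    # 1) Sort fixed-size blocks with insertion sort.
    for lo in range(0, n, RUN):
        _insertion_sort(a, lo, min(lo + RUN - 1, n - 1))
    # 2) Repeatedly merge neighbouring blocks, doubling the block size.
    size = RUN
    while size < n:
        for lo in range(0, n, 2 * size):
            mid = min(lo + size - 1, n - 1)
            hi = min(lo + 2 * size - 1, n - 1)
            if mid < hi:
                _merge(a, lo, mid, hi)
        size *= 2
    return a

print(simple_timsort([9, 4, 7, 1, 3, 8, 2, 6, 5, 0]))   # [0, 1, 2, ..., 9]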
With quantum computing, multiple implementations can be explored, such as quantum treemaps used to visualize large hierarchical datasets. In an application that uses a recursive technique motivated by the Quicksort algorithm, these algorithms
offer a compromise, producing partially sorted layouts that are reasonably stable and have relatively low aspect ratios [21]. However, the question often arises: how fast can quantum computers sort? A quantum computer needs only about O(0.526 n log2 n) + O(n) comparisons. Improving the lower bound to Ω(n log n), it follows that the best comparison-based quantum sorting algorithm can be at most a constant factor faster than the best classical algorithm [22]. We encourage experimenting with questions similar to those addressed in this research article and implementing quantum sorting methods in future work.
5 Conclusion

The primary purpose of this research was to evaluate and analyze variations in the execution times of sorting algorithms written in C and Python. We were interested in the execution times obtained when a sorting algorithm is run multiple times on a given dataset, and in whether those times differ depending on the programming language. The experiment covered eleven sorting algorithms: Quicksort, Mergesort, Timsort, Heapsort, Bubblesort, Insertion Sort, Selection Sort, Tree Sort, Shell Sort, Radix Sort, and Counting Sort. For each sorting algorithm, a range of array sizes was created and then examined. One of the main results is that the distributions of execution times were discrete, with relatively few distinct values. Another important finding is that the execution time increased with the array size for all sorting algorithms. Also, the organization of the data affects execution times, showing that some algorithms are better at sorting random data than reversed data, and so on. Finally, the programming language has a significant impact on how the sorting algorithms behave and how much data they can sort without the code throwing an error message. A concrete example is that Timsort is the most efficient sorting algorithm in Python, while Tree sort is the best one in C, even though both have the same complexity, O(n log n). In both languages, Bubblesort, with complexity O(n^2), has the worst execution times, no matter the number of elements or iterations.
Appendix

Appendix 1. Execution Time Averages in C

Random (ms) Data 100
1000
Iter
QS 1
MS
TS
HS
BS
IS
SS
TrS
0.0149
0.0102
0.0057
0.0121
0.0284
0.0074
0.0163
0.0140
ST
RS
CS
0.0075
0.0037
0.0024
10
0.0035
0.0069
0.0039
0.0072
0.0229
0.0070
0.0142
0.0036
0.0057
0.0034
0.0020
100
0.0026
0.0061
0.0032
0.0051
0.0220
0.0072
0.0141
0.0026
0.0045
0.0034
0.0017
1000
0.0023
0.0053
0.0027
0.0057
0.0189
0.0046
0.0136
0.0042
0.0040
0.0096
error
10000
0.0021
0.0053
0.0022
0.0054
0.0157
0.0023
0.0133
0.0041
0.0029
0.0146
error
1
0.0705
0.1129
0.2143
0.2210
2.1418
0.6657
1.2248
0.0635
0.1102
0.0471
0.0071
10
0.0650
0.1081
0.0762
0.1509
2.1651
0.6505
1.2236
0.0401
0.1153
0.0514
0.0071 (continued)
(continued) Random (ms) Data
Iter
QS
100
0.0635
MS
TS
HS
BS
IS
SS
TrS
0.1005
0.0712
0.1514
2.5847
0.6459
1.2322
0.0379
ST 0.1129
RS
CS
0.0470
0.0072
1000
0.0494
0.0869
0.0507
0.1325
1.6873
0.3392
1.2153
0.0460
0.0712
0.0883
error
10000
0.0287
0.0681
0.0317
0.0816
1.2030
0.0388
1.2111
0.0470
0.0296
0.1677
error 0.0718
10000
1
0.7475
1.2682
error
1.8875
298.9984
64.8266
121.1839
0.1820
1.5020
0.4770
10
0.7458
1.2530
error
1.9217
295.5248
66.9882
127.1354
0.1408
1.4965
0.4644
error
100
0.7383
1.3059
error
1.9203
292.9857
65.0310
123.7623
0.1393
1.5423
0.4886
error error
1000
0.5518
1.0522
error
1.6540
205.6387
32.9524
117.5058
0.1454
0.9564
0.9526
10000
0.3743
0.8348
error
1.2170
120.8390
3.3560
117.3990
0.1475
0.4116
1.3971
error
1
7.5794
14.6408
error
21.4637
33941.2179
6434.9664
11729.1623
0.6891
17.9443
4.8676
0.5679
10
8.0029
14.6789
error
22.8292
33618.3597
6464.3184
11766.5191
0.6403
17.6360
4.7118
error
100
7.6986
14.3785
error
21.3103
33650.1493
6421.5479
11690.4792
0.6535
18.2611
4.6397
error
1000
6.0036
12.0452
error
19.1913
22651.4197
3362.4318
11858.7491
0.6524
11.0385
0.9526
error
10000
4.4945
9.9346
error
17.0932
12265.4345
335.2610
11444.6697
0.6552
5.0353
14.5989
error
TS
HS
BS
IS
SS
100000
Nearly sorted (ms) Data
Iter
100
QS 1
0.0065
MS
TrS
0.0084
0.0036
0.0591
0.0157
0.0026
0.0498
0.0152
ST 0.0053
RS
CS
0.0054
0.0023
10
0.0031
0.0072
0.0024
0.0076
0.0211
0.0021
0.0145
0.0039
0.0040
0.0142
0.0020
100
0.0022
0.0051
0.0023
0.0055
0.0140
0.0019
0.0143
0.0028
0.0043
0.0051
0.0059
1000
0.0023
0.0056
0.0023
0.0056
0.0148
0.0021
0.0170
0.0044
0.0032
0.0118
error
10000
0.0022
0.0057
0.0022
0.0058
0.0150
0.0021
0.0132
0.0043
0.0030
0.0175
error 0.0167
1000
1
0.0664
0.0804
0.0436
0.1232
1.4216
1.0850
5.0960
0.0648
0.1264
0.0635
10
0.0587
0.0746
0.0473
0.1278
1.5113
0.2016
1.2170
0.0368
0.1130
0.1570
error
100
0.0566
0.0725
0.0385
0.1294
1.4778
0.2065
1.2323
0.0335
0.1099
0.0614
error
1000
0.0461
0.0707
0.0343
0.1209
1.2948
0.1138
1.2140
0.0436
0.0700
0.1040
error
10000
0.0282
0.0658
0.0303
0.0820
1.1436
0.0170
1.1859
0.0486
0.0293
0.1695
error 0.1365
10000
1
0.6643
0.9435
error
1.5351
208.4337
19.6579
119.1224
0.2130
1.7079
0.6386
10
0.6865
0.9634
error
1.6610
193.4145
20.0551
121.1661
0.1514
1.7112
0.6426
error
100
0.7111
0.9634
error
1.5482
190.4780
20.0197
118.8711
0.1548
1.7194
0.7056
error error
1000
0.5480
0.8773
error
1.4625
152.2325
10.1305
118.5112
0.1528
1.0949
1.1911
10000
0.3759
0.8214
error
1.2002
116.1112
1.0735
120.0545
0.1482
0.4238
1.4127
error
1
5.6883
11.0523
error
17.2255
14051.2450
501.8177
11832.1875
0.8658
10.1685
9.4685
1.2621
100000
10
5.5417
11.5409
error
17.9486
14027.0252
495.1589
11814.0204
0.6930
10.4666
9.4292
error
100
5.6053
10.7023
error
17.3267
14028.2937
496.3286
11930.1854
0.6568
10.2494
9.2739
error
1000
5.4688
10.2111
error
17.0733
12630.6195
255.7339
11939.0698
0.6492
8.6424
9.8717
error
10000
4.5268
9.7984
error
16.0976
11489.6494
27.4709
12013.5563
0.6699
5.0404
17.1336
error
TS
HS
BS
Few unique (ms) Data 100
1000
Iter
QS 1
0.0202
MS
IS
SS
TrS
0.0102
0.0036
0.0122
0.0990
0.0082
0.0159
0.0133
ST 0.0071
RS
CS
0.0053
0.0026
10
0.0035
0.0080
0.0024
0.0067
0.0245
0.0074
0.0144
0.0042
0.0061
0.0062
0.0017
100
0.0023
0.0057
0.0023
0.0050
0.0227
0.0073
0.0138
0.0029
0.0047
0.0051
0.0017
1000
0.0024
0.0058
0.0023
0.0054
0.0197
0.0050
0.0138
0.0049
0.0063
0.0093
error
10000
0.0021
0.0055
0.0023
0.0054
0.0165
0.0027
0.0135
0.0056
0.0028
0.0175
error 0.0092
1
0.0707
0.1079
0.0431
0.1654
2.3080
0.6995
1.8619
0.0681
0.1057
0.0459
10
0.0678
0.1103
0.1216
0.1584
2.2054
0.6461
1.2374
0.0417
0.1082
0.0480
error
100
0.0655
0.1042
0.0395
0.1496
2.3190
0.6789
1.2425
0.0410
0.1055
0.0464
error
1000
0.0473
0.0930
0.0343
0.1322
1.6716
0.3478
1.2311
0.0485
0.0677
0.0959
error
(continued) Few unique (ms) Data
Iter
QS
RS
CS
10000
0.0282
0.0693
0.0304
0.0809
1.1960
0.0392
1.2142
0.0560
0.0291
0.1670
error 0.0673
10000
MS
TS
HS
BS
IS
SS
TrS
ST
1
0.7346
1.3134
error
1.9435
303.0240
64.6072
120.0242
0.1886
1.4950
0.4800
10
0.7173
1.2638
error
1.8604
291.4519
65.2157
121.5685
0.1542
1.5188
0.4765
error
100
0.7241
1.2929
error
1.8939
288.5536
64.3683
120.4502
0.3016
1.5972
0.4880
error error
1000
0.5528
1.0575
error
1.5950
202.7150
32.8495
120.1730
0.1609
0.9759
1.1033
10000
0.3741
0.8454
error
1.1478
128.2706
3.3390
120.0205
0.1595
0.4123
1.3959
error
1
7.5547
14.7736
error
20.7879
35097.0922
6494.1620
12027.4861
0.6967
17.5160
4.9567
0.6649
100000
10
7.6227
15.0838
error
22.5534
34684.1799
6477.7336
12157.8741
0.7044
17.8177
4.7438
error
100
7.6268
14.5629
error
20.9330
33167.3059
6477.3688
12295.0094
0.6874
17.2473
4.6473
error
1000
6.0191
12.1864
error
18.7028
22996.9892
3293.4073
11399.2589
0.6687
10.9321
9.6269
error
10000
4.3826
10.4619
error
14.6550
12252.0630
328.5804
11635.6127
0.6777
5.0300
8.0059
error
QS
MS
TS
HS
BS
Reversed (ms) Data
Iter
100
1
0.0027
IS
SS
0.0065
0.0063
0.0099
0.0284
0.0143
0.0135
TrS
ST
RS
CS
0.0331
0.0037
0.0053
0.0013
10
0.0017
0.0053
0.0068
0.0197
0.0276
0.0140
0.0130
0.0041
0.0075
0.0050
error
100
0.0078
0.0050
0.0053
0.0187
0.0300
0.0137
0.0140
0.0024
0.0029
0.0052
error
1000
0.0022
0.0051
0.0039
0.0055
0.0226
0.0083
0.0132
0.0041
0.0031
error
error
10000
0.0019
0.0052
0.0023
0.0055
0.0159
0.0030
0.0130
0.0038
0.0029
error
error 0.0127
1000
1
0.0226
0.1109
0.0647
0.2971
2.7470
1.3273
1.1790
0.0615
0.0480
0.0618
10
0.0237
0.0673
0.0667
0.5297
2.8306
1.3362
1.1401
0.0409
0.0568
0.0625
error
100
0.0611
0.0655
0.0657
0.2134
2.8199
1.3104
1.1621
0.0382
0.0472
0.0630
error
1000
0.0277
0.0665
0.0481
0.1321
1.9682
0.6799
1.1688
0.0458
0.0379
error
error
10000
0.0267
0.0656
0.0314
0.0832
1.2151
0.0732
1.1825
0.0467
0.0259
error
error
10000
1
0.3014
0.8716
error
1.5572
276.1865
130.9751
112.9974
0.2141
2.7750
0.7980
0.0938
10
0.3137
0.8425
error
1.4730
272.3418
132.0642
113.7755
0.1464
0.0266
0.7612
error
100
0.3342
0.9739
error
1.4701
271.2186
130.0438
112.7412
0.1429
0.0322
0.8062
error
1000
0.3610
0.8166
error
1.3609
193.0997
66.9404
114.9121
0.1435
0.0781
error
error
10000
0.3577
0.8106
error
1.1514
120.3492
6.6871
116.7057
0.1478
0.3679
error
error
1
4.3624
10.2862
error
17.1909
27093.0421
13115.8318
11257.0213
0.6813
8.5972
9.3526
0.9130
100000
10
4.1092
10.0653
error
17.2984
27077.2859
13081.0810
11274.6340
0.7093
8.4274
9.2778
error
100
4.0350
9.7103
error
17.3921
27186.5398
14016.1420
11416.3516
0.6549
8.4440
9.1002
error
1000
4.7252
9.6941
error
17.0638
19120.7744
6722.9272
11126.5332
0.6694
8.1070
error
error
10000
4.4514
9.5313
error
14.1699
11716.5482
701.1256
11351.2102
0.6577
5.0600
error
error
Appendix 2. Average Execution Times in Python

Random (ms) Data 100
1000
Iter
QS
MS
TS
HS
BS
IS
SS
TrS
ST
RS
CS
1
0.3253
0.3447
0.0591
0.4410
1.1796
0.5673
0.5574
0.3174
0.1840
0.2072
0.1802
10
0.3017
0.3551
0.0669
0.4495
1.2176
0.6299
0.5537
0.3325
0.1903
0.1953
0.1933
100
0.3021
0.3713
0.0619
0.4433
1.2246
0.5906
0.5450
0.3358
0.1988
0.1963
0.1975
1000
0.2894
0.3552
0.0587
0.4239
1.1696
0.5829
0.5387
0.3265
0.1901
0.1884
0.1887
10000
0.2854
0.3510
0.0574
0.4171
1.1606
0.5755
0.5348
0.3225
0.1871
0.1862
0.1857
1
6.8990
5.7842
3.2887
27.7787
126.4296
60.8604
54.0734
4.4141
3.6280
3.6659
3.5635 (continued)
(continued) Random (ms) Data
Iter
QS
MS
TS
HS
BS
IS
SS
TrS
ST
RS
CS
10
5.2033
4.8432
0.0599
6.8569
117.3365
67.3682
57.1244
4.5659
3.6835
3.7556
3.8659
100
5.1562
4.9112
0.0632
6.7423
118.1921
62.6277
55.6794
4.6610
3.7779
3.8228
3.8511
1000
5.2211
5.0086
0.0634
6.8367
118.8780
62.8142
55.6440
4.6790
3.8310
3.8301
3.8293
10000
5.1168
4.8983
0.0618
6.6996
116.7939
61.8774
54.6691
4.5610
3.7491
3.7441
3.7489
10000
1
65.7770
64.1239
0.0826
92.0450
12220.2270
6697.4468
5814.6917
66.2586
61.9572
61.7803
59.2173
10
65.4447
63.9647
0.0557
91.5492
11902.5500
6492.1609
5677.3800
66.9012
59.5979
59.4189
59.5188
100
65.5597
63.9213
0.0570
91.6201
11895.0791
6488.8347
5668.4324
66.8955
59.7671
59.8399
59.7862
1000
64.8800
63.1248
0.0558
90.4387
11782.7659
6437.5941
5604.8044
66.4650
59.1533
59.1234
59.2495
QS
MS
TS
HS
IS
SS
TrS
ST
RS
CS
Nearly sorted (ms) Data
Iter
100
BS
1
1.1844
1.0452
0.0676
1.3179
1.6930
0.2420
1.3455
2.1746
0.3287
0.3209
0.3215
10
0.7243
0.5087
0.0345
0.6955
0.9980
0.1390
0.7778
1.2902
0.2029
0.2069
0.2066
100
0.4552
0.3148
0.0179
0.4435
0.7052
0.1023
0.5591
0.9060
0.1393
0.1369
0.1355
1000
0.4654
0.3222
0.0186
0.4527
0.7142
0.1019
0.5728
0.9260
0.1412
0.1382
0.1362
10000
0.4716
0.3275
0.0189
0.4556
0.7184
0.1029
0.5726
0.9268
0.1412
0.1406
0.1389
1
10.8394
4.6520
0.0205
7.0100
69.8064
6.8512
56.4958
10.9039
2.8341
2.8221
2.7761 2.8254
1000
10
10.9654
4.7921
0.0189
7.1691
70.2443
6.8958
56.7438
11.1606
2.8516
2.8236
100
11.4077
4.9532
0.0211
7.3079
72.9078
7.2542
58.1838
11.5669
2.9910
3.5081
2.9799
1000
11.1879
4.8518
0.0200
7.3080
72.8122
7.0869
57.4603
11.3708
3.0061
3.0101
3.0027
10000
10.9778
4.7643
0.0199
7.1918
71.3234
6.9853
56.8876
11.2388
2.9574
2.9469
2.9455
1
114.5071
66.4808
0.0300
96.9319
8102.2887
1585.7202
6222.0002
167.6363
53.5893
53.5187
52.6985
10000
10
117.0330
68.2153
0.0593
103.4408
8083.3021
1580.7920
6064.5560
172.4053
53.5670
52.6310
53.3516
100
120.6905
68.6117
0.0331
104.5879
8640.4002
1626.8096
5884.9889
171.6287
59.5668
58.7390
58.7437
1000
123.7589
70.8472
0.0357
107.8657
8954.4318
1656.2141
5880.9480
174.2174
61.6672
61.6568
61.9305
QS
MS
TS
HS
IS
SS
TrS
ST
RS
CS
Few unique (ms) Data
Iter
100
BS
1
0.3721
0.3560
0.0556
0.4137
1.4227
0.5369
0.5403
0.3509
0.1649
0.1860
0.1669
10
0.4057
0.3991
0.0680
0.4757
1.3225
0.5262
0.5530
0.3265
0.1851
0.1822
0.2084
100
0.5676
0.6119
0.0696
0.4768
1.3199
0.8597
1.0294
0.5607
0.2395
0.2119
0.1994
1000
0.5602
0.5160
0.0752
0.6533
1.7388
0.5955
0.7250
0.4020
0.2350
0.2094
0.2768
10000
0.5746
0.5548
0.0908
0.6689
1.6950
0.7248
0.7463
0.4634
0.2872
0.2697
0.2813
1
5.7344
5.0763
0.0744
7.5250
205.9111
84.5285
80.0704
5.6699
4.2742
13.8807
4.8759
10
7.9406
7.4175
0.1013
13.6953
207.4326
99.4368
83.0363
6.7028
5.5434
9.5328
6.5981
100
8.3593
8.7713
0.1213
10.8606
202.8462
99.3244
77.9542
7.5102
6.0344
5.8915
5.8995
1000
6.9183
6.8396
0.0989
9.2002
170.5048
84.4819
65.6271
6.3029
4.9429
5.0778
5.0393
10000
6.6683
6.6184
0.0998
9.0623
162.9227
81.4049
64.7323
6.0444
4.7744
4.7598
4.6863
1000
10000
1
80.3642
67.9960
0.0608
9.1993
14215.9962
7218.1729
5719.4402
74.5561
59.7071
71.6940
61.3581
10
80.9916
69.8744
0.0727
99.7600
13450.4785
6995.5644
5610.6605
77.7963
64.7814
64.0847
59.8606
100
82.1668
68.2175
0.0671
98.8513
13436.4983
6998.2024
5619.0589
78.1411
65.0833
64.8308
64.8462
1000
82.4485
69.0838
0.0701
99.6724
13432.9928
6984.6595
5626.1705
78.1879
64.5584
64.7239
65.0979
TS
HS
BS
SS
TrS
Reversed (ms) Data 100
1000
Iter
QS
MS
IS
ST
RS
CS
1
0.9847
0.3123
0.1072
0.4099
1.7469
1.1418
0.5919
1.5278
0.1836
0.1529
0.1860
10
1.0336
0.3278
0.1047
0.3778
1.7754
1.1482
0.5783
1.5313
0.1573
0.1600
0.1901
100
0.9726
0.3119
0.1100
0.3828
1.6976
1.1333
0.5612
1.5143
0.1626
0.1708
0.1569
1000
1.0845
0.3518
0.1158
0.4133
1.9028
1.2136
0.5934
1.6322
0.1764
0.2057
0.2171
10000
1.3242
0.4352
0.1520
0.5311
2.2667
1.3632
0.7002
1.9623
0.2422
0.2267
0.2269
1
75.7457
5.6057
0.1055
11.1159
270.5590
235.2450
106.6613
163.7858
3.6045
3.6041
4.3029 (continued)
(continued) Reversed (ms) Data
10000
Iter
TS
HS
10
QS 94.1954
MS 7.9541
0.1553
10.7789
BS 280.5937
IS 179.2550
SS 75.0042
TrS 154.2006
ST 4.5140
RS 4.4940
CS 5.0139
100
95.9501
7.1001
0.1571
10.0475
297.5478
198.9861
85.0806
160.2506
5.7619
5.0435
5.7760
1000
89.6183
6.3754
0.2016
9.4727
266.8225
175.1879
74.5492
141.6040
5.0087
4.9125
4.8975
10000
70.1469
4.8713
0.1340
7.3407
209.4537
138.6884
60.2487
111.9173
3.7716
3.7280
3.7273
1
986.7406
56.0200
0.0809
99.4892
20500.1853
14292.7860
5826.8927
1665.4201
39.2121
43.4552
56.8798
10
1015.5719
56.0925
0.0862
91.3458
19832.2244
13878.7156
5695.0407
1675.0777
47.4680
48.8717
45.0589
100
1013.4446
57.3001
0.0939
94.1872
19825.9403
13865.9834
5687.1142
1676.9498
46.1496
47.3867
47.0581
1000
1016.1134
58.0542
0.0951
94.4595
19894.5795
13881.4600
5685.3320
1680.7228
47.5433
47.6109
47.5726
References

1. Kocher, G., Agrawal, N.: Analysis and review of sorting algorithms. Int. J. Sci. Eng. Res. 2(3), 81–84 (2014)
2. Mittermair, D., Puschner, P.: Which sorting algorithms to choose for hard real-time applications. In: Proceedings of the Ninth Euromicro Workshop on Real Time Systems, Toledo (2002)
3. Astrachan, O.: Bubble sort: an archaeological algorithmic analysis. SIGCSE Bull. (2003). (Association for Computing Machinery, Special Interest Group on Computer Science Education)
4. Khamitkar, S., Bhalchandra, P., Lokhande, S., Deshmukh, N.: The folklore of sorting algorithms. Int. J. Comput. Sci. Issues (IJCSI) 4(2), 25–30 (2009)
5. Downey, R.G., Fellows, M.R.: Parameterized Complexity, vol. 3. Springer, Heidelberg (1999)
6. Salaria, R.S.: Data Structures & Algorithms Using C. Khanna Book Publishing Co. (P) Ltd. (2004)
7. Knuth, D.E.: The Art of Computer Programming. Volume 4A: Combinatorial Algorithms, Part 1. Pearson Education India (2011)
8. Pahuja, S.: A Practical Approach to Data Structures and Algorithms. New Age International (2007)
9. Bansal, V.K., Srivastava, R., Pooja: Indexed array algorithm for sorting. In: 2009 International Conference on Advances in Computing, Control, and Telecommunication Technologies, Trivandrum, Kerala, pp. 34–36 (2009)
10. Faujdar, N., Ghrera, S.P.: Analysis and testing of sorting algorithms on a standard dataset. In: 2015 Fifth International Conference on Communication Systems and Network Technologies, Gwalior, pp. 962–967 (2015)
11. Edjlal, R., Edjlal, A., Moradi, T.: A sort implementation comparing with bubble sort and selection sort. In: 2011 3rd International Conference on Computer Research and Development, Shanghai, pp. 380–381 (2011)
12. Yang, Y., Yu, P., Gan, Y.: Experimental study on the five sort algorithms. In: 2011 Second International Conference on Mechanic Automation and Control Engineering, Hohhot, pp. 1314–1317 (2011)
13. McMaster, K., Sambasivam, S., Rague, B., Wolthuis, S.: Distribution of execution times for sorting algorithms implemented in Java. In: Proceedings of Informing Science & IT Education Conference (InSITE), pp. 269–283 (2015)
14. GeeksforGeeks: Sorting Algorithms. https://www.geeksforgeeks.org/sorting-algorithms/. Accessed 21 Feb 2020
15. Toptal: Sorting Algorithms Animations. https://www.toptal.com/developers/sorting-algorithms. Accessed 21 Feb 2020
16. Peters, T.: Timsort description. http://svn.python.org/projects/python/trunk/Objects/listsort.txt. Accessed June 2015
17. Devi, O.R.: Int. J. Adv. Trends Comput. Sci. Eng. 4, 15–21 (2015)
18. Amaral, A., Serpa, A., Rego, D.C., Valim, S., Soares, J.: Monte Carlo simulation of polymerization reactions: optimization of the computational time (2019)
19. Chakraborty, S., Sarkar, R.: Binary tree sort is more robust than quick sort in average case. Int. J. Comput. Sci. Eng. Appl. 2, 115–123 (2012)
20. Cook, C.R., Kim, D.J.: Best sorting algorithm for nearly sorted lists. Commun. ACM 23(11), 620–624 (1980)
21. Odeh, A., Elleithy, K., Almasri, M., Alajlan, A.: Sorting N elements using quantum entanglement sets. In: 3rd International Conference on Innovative Computing Technology, INTECH 2013, no. 1, pp. 213–216 (2013)
22. Shi, Y.: Quantum lower bound for sorting, pp. 1–11. arXiv preprint (2000). http://arxiv.org/abs/quant-ph/0009091. Accessed 21 Feb 2020
Author Index
A Acevedo-Flores, Jessica, 275 Andaluz, Víctor H., 205 Andersson, Christina, 127 Aquino Cruz, Mario, 264 Arcos-Medina, Gloria, 75 Arellano-Aucancela, Alberto, 314 Avila-Pesantez, Diego, 314 B Botto-Tobar, Miguel, 335 Bravo-Quezada, Omar G., 29 C Cacha-Nuñez, Yesbany, 91 Caiza, Gustavo, 302 Calderón-Hinojosa, Xavier, 252 Carpio Vargas, Edgar Eloy, 264 Chafla, Juan, 58 Chancay-García, Leonardo, 3 Cisneros, David, 42 Conde, Angel Alberto Silva, 151 Costales, Patricio Moreno, 103, 114 Correa Real, Luis Germán, 290 Cordovez Machado, Sonia Patricia, 114 Cuervo-Bejarano, William Javier, 217 E Espín, Fernando Vela, 239
F Flores, Graciela, 193 G Gallardo, Cristian, 193 Gallegos-Segovia, Pablo L., 29 Galvez Minervini, Pierina, 335 Gamboa, Silvana, 16 Gómez, Omar S., 75 Gonzales-Macavilca, Milton, 91 Guaiña Yungán, Jonny, 103, 114 Guevara, Bryan S., 205 Guevara-Vega, Cathy Pamela, 290 H Hernández-Álvarez, Myriam, 226 Hidalgo O., Angel G., 139 Huallpa Laguna, Jessica Noralinda, 264 Huaman Sarmiento, Ling Katterin, 325 Huaman Sarmiento, Mawly Latisha, 325 Huillcen Baca, Herwin Alayn, 264 I Iraola-Real, Ivan, 91, 127, 325 J Jácome Orozco, Ana Gabriela, 290 L La Cruz, Alexandra, 167, 179 Landeta-López, Pablo Andrés, 290
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. Botto-Tobar et al. (Eds.): ICAETT 2020, AISC 1302, pp. 349–350, 2021. https://doi.org/10.1007/978-3-030-63665-4
350 León-Paredes, Gabriel A., 29 Lopez-Espinosa, Jeisson Andres, 217 Luna Encalada, Washington, 103, 114 M Maldonado, Violeta, 16 Mego Sanchez, Claudia, 325 Mego Sanchez, Héctor David, 325 Mejia Morales, Fabiola, 335 Mendoza-Tello, Julio C., 252 Miriam Avila, L., 314 Moreira-Zambrano, Cesar, 3 Morillo, John, 275 N Neyra-Rivera, Carlos, 275 O Oñate, Alejandra, 42, 75 Oñate, William, 302 P Padilla, Nelly Padilla, 314 Palaguachi, Sonia, 29 Palomino Valdivia, Flor de Luz, 264 Pástor, Danilo, 75 Perpiñan, Gilberto, 167, 179 Perez Bayas, Miguel Ángel, 151 Pizarro-Vasquez, Guillermo O., 335 Portero A., Flavio J., 139 Q Quimbiamba C., Jorge V., 139 Quimiz-Moreira, Mauricio, 3 Quiña-Mera, José Antonio, 290 Quishpe Estrada, Fátima de los Ángeles, 151
Author Index R Recalde, Luis F., 205 Rivera, Michael, 302 Rodas, Ana, 16 Rodríguez, María, 302 Rodriguez-F, Ivonne E., 75 S Sacoto Cabrera, Erwin J., 29 Sanchez, Claudia Mego, 127 Santillán Guadalupe, Sandra, 314 Severeyn, Erika, 167, 179 T Toapanta, Angel, 58 Toapanta, Roberto, 58 Torres H., Edgar, 226 Torres P., Edgar P., 226 Trujillo, María Fernanda, 16 V Vargas C., Ramiro S., 139 W Wong, Sara, 167, 179 Y Yoo, Sang Guun, 226 Z Zabala, Mónica, 42 Zambrano-Romero, Walter, 3 Zamora, Darwin Loor, 3 Zea, Danny J., 205 Zuñiga-Quispe, Roxana, 91